torero

distributed clustering in torero for network automation scalability

Wyatt Sullivan

Technical Marketing Engineer ‐ Itential

Posted on September 11, 2024

The recent release of torero 1.1 introduces clustering. With that, a couple of immediate questions come to mind. Why do I need clustering? And how would I do it?

The why and even some basics of the how are covered by Peter in his announcement blog. But today, let’s dive a little deeper into where this is useful and a couple of implementation methods to use in the real world.

>_ when to cluster

Clustering isn’t for everyone. I, personally, do just fine sharing my scripts with my small team using a simple t2.micro on EC2. Sure, I can’t do any major scripting, and some of the scripts run a bit slower than they could. However, I can go ahead and ask for that big Christmas bonus because I didn’t spend $100k on servers.

Where I *would* cluster is any environment where script run times matter, more than one person is running scripts, and we need consistent responses. Environments such as:

>_  CI/CD Pipelines: multiple builds happen at any given time.
>_  Support Ops: uptime is key.
>_  Monitoring tooling.
>_  Deployment tooling.
>_  CIO is watching.


Any of these situations will have important markers that need the network scripting to consistently work in a known, reliable amount of time. These things snowball when multiple automations start relying on each other, which is where we want to be as automation evolves toward orchestration (or at least some form of reliable stepping that doesn’t require me to be on call or troubleshoot a script that isn’t working right).

Nothing worse than troubleshooting some weird dependency problem in a script that’s causing an outage when the boss is watching.

>_ where to cluster

Now this depends on your environment. This is one of the fantastic features of torero. It’s a single binary image — it can essentially run anywhere. Have EC2? Just run an OpenTofu plan and deploy runners as needed.

repo link: https://github.com/torerodev/singlehost-ec2-torero

Here, I’ve set up a small plan to deploy torero as a server. It’s trivial to change the config file to point to a torero controller and turn this into a runner. If you want to make this interesting, set up a torero service to deploy more runners. In fact, that’s probably the best plan for quick scaling.

Have Docker containers? Easy.

services: 
  server: 
    container_name: torero-server 
    build: 
      context: .. 
      dockerfile: ./Containerfile 
    entrypoint:
      - torero
      - server
    environment: 
      TORERO_APPLICATION_WORKING_DIR: "/opt/torero" 
      TORERO_APPLICATION_AUTO_ACCEPT_EULA: "true" 
      TORERO_SERVER_USE_TLS: "false" 
      TORERO_STORE_BACKEND: "etcd" 
      TORERO_APPLICATION_MODE: "server" 
      TORERO_SERVER_LISTEN_ADDRESS: "0.0.0.0" 
      TORERO_SERVER_DISTRIBUTED_EXECUTION: "true" 
      TORERO_STORE_ETCD_HOSTS: "etcd-1:2379" 
      TORERO_STORE_ETCD_USE_TLS: "false" 
      TORERO_LOG_LEVEL: "TRACE" 
    ports: 
      - "127.0.0.1:50051:50051" 
    depends_on: 
      etcd-1: 
        condition: service_started 
  runner1: 
    container_name: torero-runner-1 
    build: 
      context: .. 
      dockerfile: ./Containerfile 
    entrypoint: 
      - torero 
      - runner 
    environment: 
      TORERO_APPLICATION_WORKING_DIR: "/opt/torero" 
      TORERO_APPLICATION_AUTO_ACCEPT_EULA: "true" 
      TORERO_RUNNER_LISTEN_ADDRESS: "0.0.0.0" 
      TORERO_RUNNER_USE_TLS: "false" 
      TORERO_STORE_BACKEND: "etcd" 
      TORERO_APPLICATION_MODE: "runner" 
      TORERO_STORE_ETCD_HOSTS: "etcd-1:2379" 
      TORERO_STORE_ETCD_USE_TLS: "false" 
      TORERO_LOG_LEVEL: "TRACE" 
    depends_on: 
      etcd-1: 
        condition: service_started 
 

Here’s the Containerfile:

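FROM docker.io/alpine

LABEL maintainer="itential.com"

# Install the SSH client, Python 3, and git
RUN apk add --no-cache openssh-client python3 git

# Install OpenTofu with its installer script (apk method), then remove the script
RUN wget https://get.opentofu.org/install-opentofu.sh -O /tmp/install-opentofu.sh && \
    chmod +x /tmp/install-opentofu.sh && \
    /tmp/install-opentofu.sh --install-method apk && \
    rm /tmp/install-opentofu.sh

# Make /etc/ssh writable by all users, including the non-root user created below
RUN chmod -R a+w /etc/ssh

# Create an unprivileged itential user and group, plus the directories it owns
RUN addgroup -S itential && adduser -S itential -G itential && \
    mkdir -p /opt/torero && chown -R itential:itential /opt/torero && \
    mkdir -p /var/log/torero && chown -R itential:itential /var/log/torero && \
    mkdir -p /etc/torero && chown -R itential:itential /etc/torero

# Download the torero 1.1.0 binary and move it onto the PATH
RUN wget https://download.torero.dev/torero-v1.1.0-linux-amd64.tar.gz && \
    tar xvzf torero-v1.1.0-linux-amd64.tar.gz && \
    rm torero-v1.1.0-linux-amd64.tar.gz && \
    mv torero /usr/local/bin

# Run as the unprivileged user
USER itential

# Default to server mode; the compose file above sets its own entrypoint per service
CMD ["/usr/local/bin/torero", "server"]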

 

Note: I’m leaving out the docker compose stuff for etcd, so you’ll need to add that if you want to use this.
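If you just want something to test against, a minimal single-node etcd service might look like the sketch below. This is an assumption on my part, not the etcd setup from my demo: the image tag is one I picked for illustration, there is no TLS or authentication, and the data is not persisted. The service name `etcd-1` and port 2379 are the only parts taken from the compose file above. It goes under the same `services:` key as the server and runner.

```yaml
services:
  etcd-1:
    container_name: etcd-1
    # Assumption: any recent etcd v3 image should work; pin whichever tag you trust
    image: quay.io/coreos/etcd:v3.5.15
    command:
      - etcd
      - --name=etcd-1
      # Listen on all interfaces so the server and runner containers can connect
      - --listen-client-urls=http://0.0.0.0:2379
      # Must match the hostname used in TORERO_STORE_ETCD_HOSTS
      - --advertise-client-urls=http://etcd-1:2379
```

A single etcd node is fine for a lab, but it is also a single point of failure, so a production cluster would want more than one member.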

Have Kubernetes? Create a manifest from the docker compose file.
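As a rough, untested sketch of what that translation could look like for the runners: the environment variables below are copied from the compose file, while the image name, labels, and replica count are hypothetical placeholders. It also assumes a Kubernetes Service named `etcd-1` already exposes etcd on port 2379 in the same namespace.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: torero-runner
spec:
  replicas: 3  # assumption: scale to however many runners you need
  selector:
    matchLabels:
      app: torero-runner
  template:
    metadata:
      labels:
        app: torero-runner
    spec:
      containers:
        - name: torero-runner
          # Hypothetical image: build the Containerfile and push it to your own registry
          image: registry.example.com/torero:1.1.0
          # Same entrypoint as the runner1 service in the compose file
          command: ["torero", "runner"]
          env:
            - name: TORERO_APPLICATION_WORKING_DIR
              value: "/opt/torero"
            - name: TORERO_APPLICATION_AUTO_ACCEPT_EULA
              value: "true"
            - name: TORERO_APPLICATION_MODE
              value: "runner"
            - name: TORERO_RUNNER_LISTEN_ADDRESS
              value: "0.0.0.0"
            - name: TORERO_RUNNER_USE_TLS
              value: "false"
            - name: TORERO_STORE_BACKEND
              value: "etcd"
            - name: TORERO_STORE_ETCD_HOSTS
              value: "etcd-1:2379"  # assumption: a Service named etcd-1 exists
            - name: TORERO_STORE_ETCD_USE_TLS
              value: "false"
```

The server would get a similar Deployment with `torero server` as the command, the server variables from the compose file, and a Service in front of port 50051.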

VMs? Just create torero as a service and have it start up on boot. Since it’s a single file, and you can use either environment variables or a config file, it’s pretty easy to set up.

Essentially, clustering may look complicated, but with torero it’s VERY easy. So easy that I’d recommend that any production environment deploy it, as it creates a reliable foundation for your automation scripts.

>_ what to cluster

Once you start using torero in your daily life, it becomes a question of “why did this not exist before?” I now have a collection of scripts and anytime someone on my team wants to do something more than once, I create a script to do it. Yeah, it takes me a bit more time the first time, but when you can just tell someone to run “3node-iap in torero”, it’s amazing. My script database is just beginning, but torero takes my unorganized mess of scripts that no one can use and turns it into an easy list that anyone I allow can use.

>_  torero1.1-demo torero get services
NAME                TYPE               CREATED
10node-ha-iap       opentofu-plan      2024-09-06T03:18:04Z
3node-iap           opentofu-plan      2024-09-06T03:12:25Z
hello-ansible       ansible-playbook   2024-09-06T03:40:26Z
hello-python        python-script      2024-09-06T03:57:29Z
hello-tofu-github   opentofu-plan      2024-09-06T02:49:16Z
netbox-2-zabba      python-script      2024-09-06T15:28:45Z
plant-destruct      opentofu-plan      2024-09-06T16:46:31Z
tofu-demo           opentofu-plan      2024-09-04T16:12:27Z

At the end of the day, torero is that tool that I didn’t know I needed, that I now can’t live without.

As for what to cluster? Clustering is easy. If I’m sharing it with anyone outside my local team and I don’t want to get called in the middle of the night, I’m clustering. The small cost of building a 4 or 5 node cluster on k8s or in EC2 is trivial compared to my good night’s sleep.

You can watch how I do it here.


Wyatt Sullivan, CCIE 18027, has been pushing packets for nearly 25 years. Yes, he’s old. He has accidentally taken down an entire datacenter, bricked network devices, created a loop in a network that caused a 120-minute brownout well outside of the local domain, cleared an entire VTP domain by adding a new device, and once requested a raise. Due to sheer laziness, his first automations were designed in the 2000s with Excel spreadsheets and bash scripts. By the time he was a Chief Network Architect, he was building scalable web portals for the ops teams to locate devices on the network because he was sick of finding those devices for them. Once he realized he was no longer valuable to real companies, he moved into the vendor space where he has been shilling, automating, and helping network engineers blame other departments for the past decade.
