
distributed clustering in torero for network automation scalability

Peter Sprygada

Vice President, Product Management ‐ Itential

Posted on September 11, 2024

The recent torero 1.1 release introduces clustering. With that, a couple of immediate questions come to mind: why do I need clustering, and how would I do it?

The why and even some basics of the how are covered by Peter in his announcement blog. But today, let’s dive a little deeper into where this is useful and a couple of implementation methods to use in the real world.

>_ when to cluster

Clustering isn’t for everyone. I, personally, do just fine sharing my scripts with my small team using a simple t2.micro on EC2. Sure, I can’t do any major scripting, and some of the scripts run a bit slower than they could. However, I can go ahead and ask for that big Christmas bonus because I didn’t spend $100k on servers.

Where I *would* cluster is any environment where script run times matter, more than one person is running scripts, and we need consistent results. Environments such as:

>_  CI/CD Pipelines: multiple builds happen at any given time.
>_  Support Ops: uptime is key.
>_  Monitoring tooling.
>_  Deployment tooling.
>_  CIO is watching.


Any of these situations has hard requirements: the network scripting needs to work consistently and finish in a known, reliable amount of time. The stakes snowball when multiple automations start relying on each other, which is where we want to be as automation evolves toward orchestration (or at least some form of reliable sequencing that doesn't require me to be on call, troubleshooting a script that isn't working right).

There's nothing worse than troubleshooting some weird dependency problem in a script that's causing an outage while the boss is watching.

>_ where to cluster

Now this depends on your environment, and it's where one of torero's best features shines: it's a single binary, so it can run essentially anywhere. Have EC2? Just run an OpenTofu plan and deploy runners as needed.

repo link: https://github.com/torerodev/singlehost-ec2-torero

Here, I’ve set up a small plan to deploy torero as a server. It’s trivial to change the config file to point to a torero controller and turn this into a runner. If you want to make this interesting, set up a torero service to deploy more runners. In fact, that’s probably the best plan for quick scaling.
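The exact config-file keys aren’t shown here, but the compose example later in this post does the same job with environment variables, so a runner’s settings might look roughly like this (the etcd hostname is a placeholder, not part of my setup):

# illustrative runner settings, mirroring the variables from the compose example below
# the etcd host is a placeholder for your own store endpoint
export TORERO_APPLICATION_MODE="runner"
export TORERO_APPLICATION_AUTO_ACCEPT_EULA="true"
export TORERO_RUNNER_LISTEN_ADDRESS="0.0.0.0"
export TORERO_STORE_BACKEND="etcd"
export TORERO_STORE_ETCD_HOSTS="etcd.example.internal:2379"
export TORERO_STORE_ETCD_USE_TLS="false"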

Have Docker containers? Easy.

services: 
  server: 
    container_name: torero-server 
    build: 
      context: .. 
      dockerfile: ./Containerfile 
    entrypoint: 
      - torero
      - server
    environment: 
      TORERO_APPLICATION_WORKING_DIR: "/opt/torero" 
      TORERO_APPLICATION_AUTO_ACCEPT_EULA: "true" 
      TORERO_SERVER_USE_TLS: "false" 
      TORERO_STORE_BACKEND: "etcd" 
      TORERO_APPLICATION_MODE: "server" 
      TORERO_SERVER_LISTEN_ADDRESS: "0.0.0.0" 
      TORERO_SERVER_DISTRIBUTED_EXECUTION: "true" 
      TORERO_STORE_ETCD_HOSTS: "etcd-1:2379" 
      TORERO_STORE_ETCD_USE_TLS: "false" 
      TORERO_LOG_LEVEL: "TRACE" 
    ports: 
      - "127.0.0.1:50051:50051" 
    depends_on: 
      etcd-1: 
        condition: service_started 
  runner1: 
    container_name: torero-runner-1 
    build: 
      context: .. 
      dockerfile: ./Containerfile 
    entrypoint: 
      - torero 
      - runner 
    environment: 
      TORERO_APPLICATION_WORKING_DIR: "/opt/torero" 
      TORERO_APPLICATION_AUTO_ACCEPT_EULA: "true" 
      TORERO_RUNNER_LISTEN_ADDRESS: "0.0.0.0" 
      TORERO_RUNNER_USE_TLS: "false" 
      TORERO_STORE_BACKEND: "etcd" 
      TORERO_APPLICATION_MODE: "runner" 
      TORERO_STORE_ETCD_HOSTS: "etcd-1:2379" 
      TORERO_STORE_ETCD_USE_TLS: "false" 
      TORERO_LOG_LEVEL: "TRACE" 
    depends_on: 
      etcd-1: 
        condition: service_started 
 

Here’s the Containerfile:

FROM docker.io/alpine 
 
LABEL maintainer="itential.com" 
 
RUN apk add openssh-client python3 git 
 
RUN wget https://get.opentofu.org/install-opentofu.sh -O /tmp/install-opentofu.sh && \ 
    chmod +x /tmp/install-opentofu.sh && \ 
    /tmp/install-opentofu.sh --install-method apk && \ 
    rm /tmp/install-opentofu.sh 
 
RUN chmod -R a+w /etc/ssh 
 
RUN addgroup -S itential && adduser -S itential -G itential && \ 
    mkdir -p /opt/torero && chown -R itential:itential /opt/torero && \ 
    mkdir -p /var/log/torero && chown -R itential:itential /var/log/torero && \ 
    mkdir -p /etc/torero && chown -R itential:itential /etc/torero 
 
RUN wget https://download.torero.dev/torero-v1.1.0-linux-amd64.tar.gz && \ 
    tar xvzf torero-v1.1.0-linux-amd64.tar.gz && \ 
    rm torero-v1.1.0-linux-amd64.tar.gz && \ 
    mv torero /usr/local/bin 

USER itential 
 
CMD ["/usr/local/bin/torero", "server"] 

 

Note: I’m leaving out the docker compose stuff for etcd, so you’ll need to add that if you want to use this.
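If you want a quick single-node etcd to test against, a sketch of that service could look like the following; the image tag and flags are my assumptions, not part of the original setup:

  etcd-1:
    container_name: etcd-1
    # illustrative single-node etcd; any v3 deployment reachable at etcd-1:2379 works
    image: quay.io/coreos/etcd:v3.5.15
    command:
      - etcd
      - --name=etcd-1
      - --data-dir=/etcd-data
      - --listen-client-urls=http://0.0.0.0:2379
      - --advertise-client-urls=http://etcd-1:2379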

Have Kubernetes? Create a manifest from the docker compose file.
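As a rough sketch of what that translation could look like for a runner (the image reference is a placeholder for wherever you push the image built from the Containerfile above, and TORERO_STORE_ETCD_HOSTS should point at however you expose etcd in the cluster):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: torero-runner
spec:
  replicas: 2
  selector:
    matchLabels:
      app: torero-runner
  template:
    metadata:
      labels:
        app: torero-runner
    spec:
      containers:
        - name: torero-runner
          # placeholder image; push the Containerfile build to your own registry
          image: registry.example.com/torero:1.1.0
          command: ["torero", "runner"]
          env:
            - name: TORERO_APPLICATION_MODE
              value: "runner"
            - name: TORERO_APPLICATION_AUTO_ACCEPT_EULA
              value: "true"
            - name: TORERO_RUNNER_LISTEN_ADDRESS
              value: "0.0.0.0"
            - name: TORERO_STORE_BACKEND
              value: "etcd"
            - name: TORERO_STORE_ETCD_HOSTS
              value: "etcd-1:2379"
            - name: TORERO_STORE_ETCD_USE_TLS
              value: "false"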

VMs? Just create torero as a service and have it start up on boot. Since it’s a single file, and you can use either environment variables or a config file, it’s pretty easy to set up.
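On a systemd-based VM, a minimal unit file sketch might look like this; the environment file path and service user are assumptions on my part:

[Unit]
Description=torero server
After=network-online.target
Wants=network-online.target

[Service]
# hypothetical env file holding the TORERO_* variables shown above
EnvironmentFile=-/etc/torero/torero.env
ExecStart=/usr/local/bin/torero server
Restart=on-failure
User=itential

[Install]
WantedBy=multi-user.target

Then systemctl enable --now torero starts it immediately and on every boot.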

Essentially, clustering may look complicated, but with torero it’s VERY easy. So easy that I’d recommend that any production environment deploy it, as it creates a reliable foundation for your automation scripts.

>_ what to cluster

Once you start using torero in your daily life, it becomes a question of “why did this not exist before?” I now have a collection of scripts, and anytime someone on my team wants to do something more than once, I create a script for it. Yeah, it takes me a bit more time the first time, but when you can just tell someone to run 3node-iap in torero, it’s amazing. My script database is just beginning, but torero takes my unorganized mess of scripts that no one can use and turns it into an easy list that anyone I allow can use.

>_  torero1.1-demo torero get services
NAME                TYPE               CREATED
10node-ha-iap       opentofu-plan      2024-09-06T03:18:04Z
3node-iap           opentofu-plan      2024-09-06T03:12:25Z
hello-ansible       ansible-playbook   2024-09-06T03:40:26Z
hello-python        python-script      2024-09-06T03:57:29Z
hello-tofu-github   opentofu-plan      2024-09-06T02:49:16Z
netbox-2-zabba      python-script      2024-09-06T15:28:45Z
plant-destruct      opentofu-plan      2024-09-06T16:46:31Z
tofu-demo           opentofu-plan      2024-09-04T16:12:27Z

At the end of the day, torero is that tool that I didn’t know I needed, that I now can’t live without.

As for what to cluster? Clustering is easy. If I’m sharing it with anyone outside my local team and I don’t want to get called in the middle of the night, I’m clustering. The small cost to build a four- or five-node cluster on k8s or in EC2 is trivial compared to my good night’s sleep.

You can watch how I do it here.

Peter Sprygada

Vice President, Product Management ‐ Itential

Peter Sprygada serves as the Vice President, Product Management at Itential after serving as the Chief Technology Officer at Pureport where he was responsible for their multi-cloud network as a service interconnect platform. Prior to Pureport, Sprygada was a Distinguished Engineer for Red Hat, where he played the role of Chief Architect for the Ansible Automation Platform. Sprygada also held senior technical and leadership positions at Arista and Cisco, as well as several networking startups.
