Ansible for Infrastructure Automation: Playbooks That Scale

Ansible is the automation layer underneath most 47Network Studio hardware engagements. It's not the flashiest tool (Terraform and Pulumi get more conference talks), but for configuring real servers that already exist, it's hard to beat. There is no agent to install, playbooks are readable YAML, and idempotent tasks mean you can rerun the same playbook on a server you set up six months ago and it will converge to the declared state rather than reapplying every change. This post covers the patterns that hold up across dozens of server configurations: project structure, inventory layout that scales from 5 to 500 hosts, roles, Ansible Vault for secrets, idempotency guards, and rolling updates.

Project structure that doesn't collapse

ansible/
  inventories/
    production/
      hosts.yml         # Production inventory
      group_vars/
        all.yml         # Variables for all hosts
        webservers.yml  # Variables for webserver group
      host_vars/
        server-01.yml   # Variables for a specific host
    staging/
      hosts.yml
      group_vars/
  roles/
    common/             # Every server gets this role
      tasks/main.yml
      handlers/main.yml
      defaults/main.yml
      templates/
      files/
    nginx/
    postgresql/
    monitoring/
  playbooks/
    site.yml            # Full site playbook (runs all roles)
    webservers.yml      # Role-specific playbooks
    rolling-update.yml  # Zero-downtime update playbook
  ansible.cfg
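
As a rough illustration of how the pieces in this tree connect, site.yml could map each inventory group to its roles along the lines below. This is a sketch, not the exact playbook: the group-to-role mapping is an assumption based on the names in the tree, and it assumes ansible.cfg points roles_path at the roles/ directory so that playbooks under playbooks/ can find them.

```yaml
# playbooks/site.yml (sketch; the group-to-role mapping is an assumption)
---
- name: Baseline for every host
  hosts: all
  become: true
  roles:
    - role: common

- name: Web tier
  hosts: webservers
  become: true
  roles:
    - role: nginx

- name: Database tier
  hosts: databases
  become: true
  roles:
    - role: postgresql

- name: Monitoring stack
  hosts: monitoring
  become: true
  roles:
    - role: monitoring
```

Pointing -i at inventories/staging/hosts.yml or inventories/production/hosts.yml then selects the environment without changing the playbook.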

Inventory management with group_vars

A grouped inventory with group_vars means you set a variable once for all production webservers, not once per host. Use the YAML inventory format, which is more readable than INI and handles nested groups cleanly:

# inventories/production/hosts.yml
all:
  children:
    webservers:
      hosts:
        web-01.internal:
          ansible_host: 10.0.1.11
        web-02.internal:
          ansible_host: 10.0.1.12
    databases:
      hosts:
        db-primary.internal:
          ansible_host: 10.0.2.10
          postgres_role: primary
        db-replica.internal:
          ansible_host: 10.0.2.11
          postgres_role: replica
    monitoring:
      hosts:
        grafana.internal:
          ansible_host: 10.0.3.10
  vars:
    ansible_user: deploy
    ansible_ssh_private_key_file: ~/.ssh/deploy_ed25519
    ansible_python_interpreter: /usr/bin/python3

# inventories/production/group_vars/webservers.yml
nginx_worker_processes: auto
nginx_worker_connections: 4096
certbot_email: ops@example.com
certbot_domains:
  - example.com
  - www.example.com

# inventories/production/group_vars/all.yml
ntp_servers:
  - 0.ro.pool.ntp.org
  - 1.ro.pool.ntp.org
ssh_allowed_users:
  - deploy
  - ops-user
unattended_upgrades_enabled: true
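
Nested groups are where the YAML format pays off. The sketch below adds a parent group whose children are existing groups; the group name app_tier and the variable backup_window are hypothetical, chosen only to show the shape.

```yaml
# Hypothetical addition to inventories/production/hosts.yml
all:
  children:
    app_tier:              # hypothetical parent group
      children:
        webservers:        # existing groups, referenced by name
        databases:

# Hypothetical inventories/production/group_vars/app_tier.yml
# (a separate file, shown in the same listing for brevity)
# backup_window: "02:00-04:00"
```

Variables in group_vars/app_tier.yml would apply to every host in webservers and databases, and a child group's own group_vars file still overrides the parent's value for the same variable.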

Roles: reusable, testable, shareable

A role packages everything needed to configure a specific service. The common role runs on every host and handles the baseline: SSH hardening, unattended upgrades, NTP, fail2ban, and the monitoring agent. Service-specific roles layer on top:

# roles/common/tasks/main.yml
---
- name: Ensure SSH is hardened
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "{{ item.regexp }}"
    line: "{{ item.line }}"
    state: present
    validate: /usr/sbin/sshd -t -f %s   # Reject the edit if sshd cannot parse the resulting config
  loop:
    - { regexp: '^#?PermitRootLogin',     line: 'PermitRootLogin no' }
    - { regexp: '^#?PasswordAuthentication', line: 'PasswordAuthentication no' }
    - { regexp: '^#?X11Forwarding',       line: 'X11Forwarding no' }
    - { regexp: '^#?AllowTcpForwarding',  line: 'AllowTcpForwarding no' }
  notify: Restart SSH

- name: Install fail2ban
  ansible.builtin.apt:
    name: fail2ban
    state: present
    update_cache: true

- name: Deploy fail2ban jail config
  ansible.builtin.template:
    src: jail.local.j2
    dest: /etc/fail2ban/jail.local
    mode: '0644'
  notify: Restart fail2ban

- name: Enable unattended-upgrades
  ansible.builtin.debconf:
    name: unattended-upgrades
    question: unattended-upgrades/enable_auto_updates
    value: 'true'
    vtype: boolean
  when: unattended_upgrades_enabled | bool

# roles/common/handlers/main.yml
---
- name: Restart SSH
  ansible.builtin.service:
    name: ssh   # Debian/Ubuntu unit name (these tasks use apt); RHEL-family hosts call it sshd
    state: restarted

- name: Restart fail2ban
  ansible.builtin.service:
    name: fail2ban
    state: restarted
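
The role's defaults/main.yml is where these variables get a safe fallback. A minimal sketch follows; the fail2ban_* names and all values are hypothetical, and jail.local.j2 would need to reference them for them to have any effect.

```yaml
# roles/common/defaults/main.yml (sketch; fail2ban_* names and values are hypothetical)
---
unattended_upgrades_enabled: true
ssh_allowed_users: []
fail2ban_bantime: 3600     # seconds
fail2ban_maxretry: 5
```

Role defaults have the lowest variable precedence in Ansible, so anything set in group_vars or host_vars overrides them without touching the role.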

Ansible Vault: secrets in version control

Ansible Vault encrypts sensitive variables so they can be committed to version control safely. Use encrypt_string for individual values rather than encrypting entire files — it's easier to diff and review:

# Encrypt a single value (the literal ends up in shell history; --stdin-name 'db_password' reads it from stdin instead)
ansible-vault encrypt_string 'your-db-password' --name 'db_password'

# Result — paste this into your vars file:
db_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  3961623366393733353837643132636264383834...

# Run playbook with vault password
ansible-playbook playbooks/site.yml --vault-password-file ~/.vault-pass
# Or use an environment variable:
ANSIBLE_VAULT_PASSWORD_FILE=~/.vault-pass ansible-playbook playbooks/site.yml

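Never commit .vault-pass to version control; add it to .gitignore before the first commit. Distribute the vault password through a secrets manager (HashiCorp Vault, AWS Secrets Manager) or share it out-of-band with each team member. In CI/CD, inject it from your secrets store as an environment variable and expose it to Ansible through the password file: --vault-password-file takes a path, not the password itself, and if that file is executable Ansible runs it and uses its output as the password.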
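
Encrypting a value at rest does not stop it from leaking into task output once it is decrypted at runtime. One way to consume db_password is sketched below; the template name and destination path are hypothetical.

```yaml
- name: Deploy application environment file
  ansible.builtin.template:
    src: app.env.j2            # hypothetical template that references {{ db_password }}
    dest: /etc/myapp/app.env   # hypothetical path
    owner: root
    group: root
    mode: '0600'               # readable by root only
  no_log: true                 # keep the rendered secret out of logs and --diff output
```

The trade-off with no_log is that a failure in this task is also hidden, so it is usually worth limiting it to the tasks that actually handle secrets.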

Idempotency: the most important property

An idempotent playbook produces the same result whether it's run once or ten times. Ansible modules are generally idempotent by design — apt: state=present won't reinstall a package that's already installed, lineinfile won't add a duplicate line. Where idempotency breaks down is with command and shell tasks. Always add a guard:

# BAD — runs every time
- name: Initialise database
  ansible.builtin.command: psql -U postgres -c "CREATE DATABASE myapp"

# GOOD — only runs if the database doesn't exist
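- name: Check if database exists
  # Exact-match query; a substring test on the `psql -l` listing would also match "myapp_test"
  ansible.builtin.command: psql -U postgres -tAc "SELECT 1 FROM pg_database WHERE datname = 'myapp'"
  register: pg_database_check
  changed_when: false   # Read-only query, so it never reports as "changed"

- name: Initialise database
  ansible.builtin.command: psql -U postgres -c "CREATE DATABASE myapp"
  when: pg_database_check.stdout | trim != '1'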
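
Two other guards are worth knowing; both are sketched here under assumptions. The first uses the command module's creates argument with a hypothetical bootstrap script that writes a marker file when it finishes. The second replaces the command with a purpose-built module, which assumes the community.postgresql collection is installed on the control node and a PostgreSQL Python driver (psycopg2) is available on the target.

```yaml
- name: Run one-time schema bootstrap
  ansible.builtin.command: /opt/myapp/bin/bootstrap-schema   # hypothetical script
  args:
    creates: /var/lib/myapp/.schema-bootstrapped   # task is skipped once this file exists

- name: Ensure application database exists
  community.postgresql.postgresql_db:
    name: myapp
    state: present
  become: true
  become_user: postgres
```

When a module exists for the job, it is usually the better choice: it reports changed only when it actually changes something, so no separate check task is needed.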

Rolling updates without downtime

# playbooks/rolling-update.yml
---
- name: Rolling update — webservers
  hosts: webservers
  serial: 1            # Update one server at a time
  max_fail_percentage: 0  # Abort if any server fails

  pre_tasks:
    - name: Remove from load balancer
      ansible.builtin.uri:
        url: "http://{{ lb_host }}/api/drain/{{ inventory_hostname }}"
        method: POST
      delegate_to: localhost

    - name: Wait for connections to drain
      ansible.builtin.wait_for:
        timeout: 30

  roles:
    - role: nginx
    - role: app

  post_tasks:
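    - name: Health check before re-adding to LB
      ansible.builtin.uri:
        url: "http://{{ ansible_host }}/health"
        status_code: 200
      register: health_check
      until: health_check is succeeded   # Older Ansible releases ignore retries without an until condition
      retries: 5
      delay: 5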

    - name: Re-add to load balancer
      ansible.builtin.uri:
        url: "http://{{ lb_host }}/api/enable/{{ inventory_hostname }}"
        method: POST
      delegate_to: localhost
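
With serial: 1, the run time grows linearly with the number of hosts. For larger groups, serial also accepts a list of batch sizes, so a single canary host can go first and the rest follow in bigger batches. The sketch below is a variation on the play above with the drain and health-check tasks omitted for brevity; the 25% batch size is an arbitrary illustration, not a recommendation.

```yaml
- name: Rolling update with a canary batch
  hosts: webservers
  serial:
    - 1        # canary: one host first
    - "25%"    # then a quarter of the group per batch, repeated until all hosts are done
  max_fail_percentage: 0   # abort the run if any host in a batch fails
  roles:
    - role: nginx
    - role: app
```

Batch size has to stay below the capacity headroom behind the load balancer, since every host in a batch is out of rotation at the same time.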

In 47Network Studio hardware engagements, Ansible manages the full server lifecycle: initial provisioning from our hardened base image, ongoing configuration drift correction, and software updates. The same playbooks that set up a new server are used for day-2 operations — if a configuration drifts (a sysadmin made a manual change), the next playbook run corrects it. The law firm hardware engagement uses this approach for all 6 servers in the rack.
