Ansible fundamentals
Configuration management tools fall into two camps: agent-based and agentless. Chef and Puppet require a daemon running on every managed host. Ansible takes the opposite approach. It uses SSH. Nothing gets installed on the target machine beyond Python, which virtually every Linux server already has. This makes Ansible trivially easy to adopt. You write YAML, run a command, and your servers converge to the desired state.
Agentless architecture
Ansible runs on a control node, which is your laptop, a CI runner, or a dedicated management server. When you execute a playbook, Ansible opens SSH connections to the target hosts, copies small Python scripts called modules to those hosts, executes them, captures the output, and removes the scripts. The entire cycle happens over SSH.
graph LR
    A[Control Node] -->|SSH| B[Web Server 1]
    A -->|SSH| C[Web Server 2]
    A -->|SSH| D[DB Server]
    B -->|Returns JSON| A
    C -->|Returns JSON| A
    D -->|Returns JSON| A
Ansible pushes modules over SSH and collects JSON results. No agent process runs on managed hosts.
This architecture has real consequences. There is no central server to maintain. There is no agent to update across hundreds of machines. There is no open port for an agent to listen on. The attack surface stays minimal.
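The push cycle is easy to see with an ad-hoc command, which runs a single module without a playbook. A minimal sketch, assuming the inventory file shown later in this article:

```shell
# Run the ping module on every host: Ansible connects over SSH,
# copies the module, executes it, and prints the JSON result.
ansible all -i inventory/hosts.ini -m ansible.builtin.ping

# Run an arbitrary command on one group via the command module.
ansible webservers -i inventory/hosts.ini -m ansible.builtin.command -a "uptime"
```

If the ping returns "pong" for every host, SSH connectivity and the remote Python interpreter are both working.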
Inventory
Ansible needs to know which hosts to manage. That information lives in an inventory file.
Static inventory
The simplest form is an INI file:
# inventory/hosts.ini
[webservers]
web1.example.com ansible_user=deploy
web2.example.com ansible_user=deploy
[databases]
db1.example.com ansible_user=deploy ansible_port=2222
[all:vars]
ansible_python_interpreter=/usr/bin/python3
Groups organize hosts by role. The [all:vars] section sets variables for every host. You can also use YAML format for the same inventory:
# inventory/hosts.yml
all:
  vars:
    ansible_python_interpreter: /usr/bin/python3
  children:
    webservers:
      hosts:
        web1.example.com:
          ansible_user: deploy
        web2.example.com:
          ansible_user: deploy
    databases:
      hosts:
        db1.example.com:
          ansible_user: deploy
          ansible_port: 2222
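Either format can be inspected before use. The ansible-inventory command renders what Ansible actually parsed:

```shell
# Show the group/host tree parsed from the inventory.
ansible-inventory -i inventory/hosts.yml --graph

# Dump every host with its resolved variables as JSON.
ansible-inventory -i inventory/hosts.yml --list
```

This is the quickest way to confirm that group membership and variables resolve the way you intended.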
Dynamic inventory
Static files break down when your infrastructure changes constantly. Cloud environments spin up and tear down instances throughout the day. Dynamic inventory solves this by querying an API at runtime.
Ansible ships with inventory plugins for AWS, GCP, Azure, and many other providers. For AWS you configure a file ending in aws_ec2.yml:
# inventory/aws_ec2.yml
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
filters:
  tag:Environment: production
keyed_groups:
  - key: tags.Role
    prefix: role
Running ansible-inventory -i inventory/aws_ec2.yml --list queries the EC2 API and returns every matching instance grouped by the Role tag.
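Those keyed groups can then be targeted like any static group. A sketch, assuming instances carry a Role tag with the value web (with the default separator, the resulting group is named role_web); the playbook itself is hypothetical:

```yaml
# site-aws.yml -- target a group built at runtime from EC2 tags
- name: Configure dynamically discovered web servers
  hosts: role_web
  become: true
  tasks:
    - name: Install Nginx
      ansible.builtin.apt:
        name: nginx
        state: present
```

The play needs no changes when instances come and go; each run queries the API and targets whatever currently matches.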
Playbooks, plays, and tasks
A playbook is a YAML file containing one or more plays. Each play targets a group of hosts and runs a sequence of tasks. Each task calls a module.
---
# site.yml
- name: Configure web servers
  hosts: webservers
  become: true
  tasks:
    - name: Install Nginx
      ansible.builtin.apt:
        name: nginx
        state: present
        update_cache: true

    - name: Start Nginx
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
The become: true directive tells Ansible to escalate privileges on the target hosts, using sudo by default. Each task has a human-readable name and a module call with parameters. Run the playbook with:
ansible-playbook -i inventory/hosts.ini site.yml
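A few standard ansible-playbook flags help narrow or inspect a run before committing to it:

```shell
# Restrict the run to one host from the targeted group.
ansible-playbook -i inventory/hosts.ini site.yml --limit web1.example.com

# Increase verbosity to see module arguments and results.
ansible-playbook -i inventory/hosts.ini site.yml -v

# List the tasks that would run, without executing anything.
ansible-playbook -i inventory/hosts.ini site.yml --list-tasks
```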
Modules
Modules are the units of work in Ansible. The apt module manages Debian packages. The service module controls systemd services. The copy module transfers files. The template module renders Jinja2 templates and transfers the result. Ansible ships with thousands of modules organized into collections.
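As a sketch of the template module (the file names here are hypothetical), the task below renders a local Jinja2 template with the play's variables and writes the result to the managed host:

```yaml
- name: Render application config from a Jinja2 template
  ansible.builtin.template:
    src: app.conf.j2           # local template, relative to the playbook
    dest: /etc/myapp/app.conf  # rendered result on the managed host
    owner: root
    group: root
    mode: "0644"
```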
Every module is designed to be idempotent. Calling apt with state: present when the package is already installed produces no change. The module checks current state, compares it to desired state, and only acts when they differ.
Variables and facts
Variables come from many sources: inventory files, playbook vars, role defaults, command-line overrides, and facts. Facts are variables that Ansible discovers automatically by running the setup module on each host at the start of a play.
- name: Show OS information
  hosts: all
  tasks:
    - name: Print distribution
      ansible.builtin.debug:
        msg: "This host runs {{ ansible_distribution }} {{ ansible_distribution_version }}"
You can define variables directly in a playbook:
- name: Deploy application
  hosts: webservers
  become: true
  vars:
    app_port: 8080
    deploy_dir: /var/www/myapp
  tasks:
    - name: Create deploy directory
      ansible.builtin.file:
        path: "{{ deploy_dir }}"
        state: directory
        owner: www-data
        group: www-data
        mode: "0755"
Variables also live in separate files loaded with vars_files or placed in group_vars/ and host_vars/ directories alongside your inventory.
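For example, variables for every host in the webservers group can live in a file next to the inventory, picked up automatically by name (the values here are hypothetical):

```yaml
# inventory/group_vars/webservers.yml
# Applied automatically to every host in the webservers group.
app_port: 8080
deploy_dir: /var/www/myapp
```

A matching host_vars/web1.example.com.yml would override these for that single host, since host variables take precedence over group variables.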
Handlers
Some actions should only happen when something changes. Restarting Nginx after updating its configuration makes sense. Restarting it when the configuration has not changed wastes time and drops connections. Handlers solve this.
tasks:
  - name: Deploy Nginx config
    ansible.builtin.template:
      src: nginx.conf.j2
      dest: /etc/nginx/sites-available/default
    notify: Restart Nginx

handlers:
  - name: Restart Nginx
    ansible.builtin.service:
      name: nginx
      state: restarted
The notify directive triggers the handler only when the task reports a change. Handlers run once at the end of the play, regardless of how many tasks notify them.
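When a handler must run before the play ends, for instance because later tasks depend on the restarted service, the meta module can flush the queue early. A minimal sketch:

```yaml
tasks:
  - name: Deploy Nginx config
    ansible.builtin.template:
      src: nginx.conf.j2
      dest: /etc/nginx/sites-available/default
    notify: Restart Nginx

  - name: Run queued handlers now instead of at the end of the play
    ansible.builtin.meta: flush_handlers

  - name: Smoke-test the restarted service
    ansible.builtin.uri:
      url: http://localhost/
      status_code: 200
```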
Idempotency in practice
Idempotency means running a playbook once or ten times produces the same result. The first run makes changes. Subsequent runs report “ok” for every task because the system already matches the desired state. This property is what makes Ansible safe to run repeatedly in CI pipelines.
flowchart TD
A[Task executes] --> B{Current state matches desired?}
B -->|Yes| C[Report OK, no change]
B -->|No| D[Apply change]
D --> E[Report Changed]
E --> F{Handler notified?}
F -->|Yes| G[Queue handler]
F -->|No| H[Next task]
C --> H
G --> H
Every module checks current state before acting. This is the core of idempotency.
Not every module is automatically idempotent. The shell and command modules run arbitrary commands and cannot know whether the command needs to run again. Use creates or when conditions to make them idempotent:
- name: Build application
  ansible.builtin.shell: make build
  args:
    chdir: /opt/myapp
    creates: /opt/myapp/bin/server
The creates parameter tells Ansible to skip the task if the file already exists.
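When no single file marks completion, a when condition based on a registered check achieves the same effect. A sketch using the stat module:

```yaml
- name: Check whether the binary already exists
  ansible.builtin.stat:
    path: /opt/myapp/bin/server
  register: server_binary

- name: Build application
  ansible.builtin.command: make build
  args:
    chdir: /opt/myapp
  when: not server_binary.stat.exists
```

The second task is skipped entirely when the condition is false, so repeated runs stay idempotent.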
Full example: Nginx with a static site
Here is a complete playbook that installs Nginx on Ubuntu, deploys a static site, and configures a virtual host.
---
# deploy-static-site.yml
- name: Deploy static website on Nginx
  hosts: webservers
  become: true
  vars:
    site_domain: mysite.example.com
    site_root: /var/www/mysite
  tasks:
    - name: Update apt cache
      ansible.builtin.apt:
        update_cache: true
        cache_valid_time: 3600

    - name: Install Nginx
      ansible.builtin.apt:
        name: nginx
        state: present

    - name: Create site root directory
      ansible.builtin.file:
        path: "{{ site_root }}"
        state: directory
        owner: www-data
        group: www-data
        mode: "0755"

    - name: Deploy index.html
      ansible.builtin.copy:
        dest: "{{ site_root }}/index.html"
        owner: www-data
        group: www-data
        mode: "0644"
        content: |
          <!DOCTYPE html>
          <html lang="en">
          <head><meta charset="utf-8"><title>{{ site_domain }}</title></head>
          <body>
            <h1>Welcome to {{ site_domain }}</h1>
            <p>Deployed by Ansible</p>
          </body>
          </html>

    - name: Deploy Nginx virtual host
      ansible.builtin.copy:
        dest: /etc/nginx/sites-available/{{ site_domain }}
        owner: root
        group: root
        mode: "0644"
        content: |
          server {
              listen 80;
              server_name {{ site_domain }};
              root {{ site_root }};
              index index.html;

              location / {
                  try_files $uri $uri/ =404;
              }

              access_log /var/log/nginx/{{ site_domain }}.access.log;
              error_log /var/log/nginx/{{ site_domain }}.error.log;
          }
      notify: Reload Nginx

    - name: Enable site by creating symlink
      ansible.builtin.file:
        src: /etc/nginx/sites-available/{{ site_domain }}
        dest: /etc/nginx/sites-enabled/{{ site_domain }}
        state: link
      notify: Reload Nginx

    - name: Remove default site
      ansible.builtin.file:
        path: /etc/nginx/sites-enabled/default
        state: absent
      notify: Reload Nginx

    - name: Ensure Nginx is started and enabled
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

  handlers:
    - name: Reload Nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded
Run it:
ansible-playbook -i inventory/hosts.ini deploy-static-site.yml
The first run installs Nginx, creates directories, drops the HTML file, writes the virtual host config, symlinks it, removes the default site, and reloads Nginx. The second run reports “ok” on every task and skips the handler because nothing changed.
Verifying your playbook
Check mode runs the playbook without making changes:
ansible-playbook -i inventory/hosts.ini deploy-static-site.yml --check --diff
The --diff flag shows what would change in files. Use this before applying changes to production systems.
Syntax checking catches YAML errors before execution:
ansible-playbook deploy-static-site.yml --syntax-check
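Beyond the built-in checks, the separately installed ansible-lint tool flags style and correctness problems that a syntax check cannot catch:

```shell
# ansible-lint is a separate package (for example: pip install ansible-lint)
ansible-lint deploy-static-site.yml
```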
Execution flow summary
sequenceDiagram
participant U as Operator
participant C as Control Node
participant H as Managed Host
U->>C: ansible-playbook site.yml
C->>H: SSH: gather facts
H-->>C: JSON facts
loop Each task
C->>H: SSH: copy module + args
H->>H: Execute module
H-->>C: JSON result (ok/changed/failed)
end
C->>H: SSH: run queued handlers
H-->>C: Handler results
C-->>U: Play recap
Ansible gathers facts, runs tasks sequentially, and fires handlers at the end of the play.
What comes next
You now understand inventories, playbooks, modules, variables, handlers, and idempotency. These are the building blocks. Real projects grow beyond a single playbook file. The next article covers Ansible roles and best practices, where you will learn to organize playbooks into reusable, testable components using role directory structures, Ansible Galaxy, Jinja2 templates, and encrypted secrets with Vault.