Ansible: The Automation Engine

Note: This post is over two years old and so the information contained here might be out of date. If you do spot something please leave a comment and we will endeavour to correct.

23rd December 2013 - 16 minutes read time

Ansible is an automation and provisioning tool that makes it easy to configure systems with the software, configuration options and even content they need. It is a command line tool, written in Python, that runs its actions over SSH. This means that all you need is a working SSH connection to a machine, and Ansible can run whatever actions you want on it. Ansible can either run single commands or use what is called a playbook to run several. Playbooks are written in YAML, which makes them quite easy to read.

I have tried to use other provisioning tools like Puppet and Chef in the past, but these have always been difficult to get to grips with. When I started using Ansible it wasn't more than 20 minutes before I was installing and configuring software on a server. The YAML files that Ansible uses make it easy to see what is going on, and they have enough features to allow for some quite complex configurations.

Installing Ansible

Ansible can be installed via the apt or yum package managers on Linux, although you'll need to set up some of the source repositories first. If you are using yum then you'll need the EPEL repository enabled. To install via apt the following commands can be used.

sudo add-apt-repository ppa:rquillo/ansible
sudo apt-get update
sudo apt-get install ansible

However, the easiest way to install Ansible is using the Python easy_install and pip tools. Most systems with Python should have the easy_install tool, but if it isn't installed then you can install it with the python-setuptools package.

sudo apt-get install python-setuptools

Ansible itself should be installed with the pip package manager rather than with easy_install directly. If pip isn't already available in your version of Python, you can get it by using the following command.

sudo easy_install pip

Before you install Ansible you'll probably need to install a couple of dependencies, also available via pip.

sudo pip install paramiko PyYAML jinja2 httplib2 markupsafe

Ansible can then be installed using the following command.

sudo pip install ansible

You might see something like the following error when trying to install Ansible or any of the dependencies.

src/MD2.c:31:20: fatal error: Python.h: No such file or directory
 #include "Python.h"
                    ^
compilation terminated.

To solve this you need to install the python-dev package. Python.h is a C header file that gcc needs in order to compile the C extensions used by some of the dependencies.

sudo apt-get install python-dev

See the Ansible installation documentation for more information about installing Ansible.

Inventory Files

An inventory file tells Ansible what hosts it needs to look at when running commands. This file contains a list of hosts that can be split into groups, allowing certain actions to be performed to certain servers. A typical inventory file would look like this.

[webservers]
web1.example.com
web2.example.com

[dbservers]
db.example.com

Ansible has a global inventory file, typically stored at /etc/ansible/hosts, that is used by default for all requests. Using the default file can lead to some confusion, and in my experience it is best to ignore it and keep an inventory file with each project. You can tell Ansible which inventory file to use when running commands, so create a file called hosts.ini and save it in your Ansible project. I will come onto running Ansible commands in the Running Ansible section.

Inventory files are described in detail in the Ansible inventory documentation.

Giving Ansible Access

By default Ansible will attempt to connect to your hosts using OpenSSH. Because of this, you will need to ensure that access keys (or usernames and passwords) exist for all of the hosts in your inventory files. The most secure way of giving Ansible access is to use SSH keys. These can either be picked up from the SSH configuration on your local machine, or set per host in the inventory file. To set them per host, use ansible_ssh_user for the remote user and ansible_ssh_private_key_file for the path to the private key file.

[webservers]
web1.example.com ansible_ssh_user=remoteuser ansible_ssh_private_key_file=~/.ssh/privatekey

It is good practice to check that the ssh keys allow you access before using them in your inventory files.
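One way to run that check is sketched below. The helper name check_ssh_access is made up for this example and is not part of Ansible; the key path, user and host are the placeholders from the inventory line above. BatchMode=yes tells ssh to fail rather than prompt for a password, so a zero exit status suggests the key alone was enough to log in.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ansible): succeeds only if key-based login works.
check_ssh_access() {
    key_file=$1   # path to the private key
    user=$2       # remote user name
    host=$3       # host name or IP address
    # BatchMode=yes disables password prompts; ConnectTimeout avoids long hangs.
    # The remote command "true" does nothing and exits with status 0.
    ssh -i "$key_file" -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true
}

# Usage with the placeholder values from the inventory example:
#   check_ssh_access ~/.ssh/privatekey remoteuser web1.example.com && echo "key accepted"
```

The function is only defined here, not called, because the host names in this post are placeholders.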

If you absolutely have to, you can also put the user name and password in the inventory file using the ansible_ssh_user and ansible_ssh_pass parameters. This stores the password in plain text and is potentially a big security problem, so you shouldn't do it unless you have no other choice.

[webservers]
web1.example.com ansible_ssh_user=root ansible_ssh_pass=password

Running Ansible

The simplest thing to do in Ansible is to use the 'ping' module. This isn't the usual ICMP ping that you would use to check whether a host is reachable over the network. Instead, Ansible logs into the remote machine over SSH, uploads the ping module and executes it. The module returns the string 'pong', which is the output of the module run on the remote server. If 'pong' comes back then the host is viable and passes the test. Running ping is a good way of checking that a host is active and that your Ansible setup has access to it. If ping fails then Ansible probably doesn't have access to the host and you need to recheck your connection.

To run the ping module on all hosts in your hosts.ini file use the following command.

ansible --inventory-file=hosts.ini all -m ping

The command has the following components:

--inventory-file=hosts.ini : This tells Ansible which hosts file to use. In this case we are referencing a file in the current directory called 'hosts.ini'.
all : The hosts.ini file can be split into groups, and 'all' is a way of referencing every host in the file, regardless of which group it has been put into. Passing 'webservers' or 'dbservers' here would reference only the servers in that group.
-m ping : The -m option tells Ansible which module to run on these hosts; in this case we pass the module name 'ping'.
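To see which hosts a group name or 'all' resolves to without connecting to anything, the --list-hosts option can be used. The following is a sketch under a few assumptions: the directory name ansible-demo is made up, the inventory is the example from earlier in this post, and -i is the short form of the inventory option (as used with ansible-playbook later on). The Ansible calls are guarded so the script still runs on a machine where Ansible isn't installed.

```shell
#!/bin/sh
# Recreate the example inventory in a scratch directory (hypothetical name).
mkdir -p ansible-demo
cat > ansible-demo/hosts.ini <<'EOF'
[webservers]
web1.example.com
web2.example.com

[dbservers]
db.example.com
EOF

# --list-hosts prints the hosts a pattern matches without connecting to them.
if command -v ansible >/dev/null 2>&1; then
    # Only the hosts in the [webservers] group.
    ansible -i ansible-demo/hosts.ini webservers --list-hosts
    # Every host in the file, regardless of group.
    ansible -i ansible-demo/hosts.ini all --list-hosts
fi
```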

To run 'ls -la' on the hosts (in order to get a directory listing), change the module from ping to command and pass the command line itself with the -a option.

ansible --inventory-file=hosts.ini all -m command -a "ls -la"

If you try to use an inventory file in this way, you might get an error similar to the following.

ERROR: problem running /path/to/hosts.ini --list ([Errno 8] Exec format error)

This happens when the inventory file is marked as executable: Ansible treats an executable inventory as a script and tries to run it rather than read it. Removing the executable status of the file fixes this.

chmod -x hosts.ini

This solved the error I was getting and allowed me to continue.

Running commands through Ansible in this way is called 'ad-hoc' as you can only run one command at a time. The best way to issue commands to a server is to use playbooks.

Playbooks

Ansible playbooks are a much more powerful (and repeatable) way of running commands on remote servers. The basic idea is that you write one or more playbooks, and Ansible reads them and performs the actions they describe. Playbooks are written in YAML, and the syntax is quite easy to pick up.

To repeat the ping for all hosts using a playbook, create a playbook file (with a .yml extension) and place the following into it. This runs everything under tasks on all of the given hosts and is about as simple as a playbook can get.

---
- hosts: all
  tasks:
  - name: ping all hosts
    action: ping

To run Ansible playbooks you need to use the ansible-playbook command, passing it the playbook to be used and the inventory file containing your hosts.

ansible-playbook playbook.yml -i hosts.ini

The anatomy of a playbook YAML file is pretty simple. It starts with '---', which is the YAML marker for the start of a document. After this, you state which hosts are to be dealt with. I have used 'all' in the above example so that every server in the inventory file is included, but you can include one or more groups or even a host pattern here. Host patterns are ways of referencing one or more host machines by matching their name or IP address, using wildcards such as 'web*.example.com' or, when the pattern is prefixed with '~', a regular expression.
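As a sketch of a host pattern in the hosts line, the script below writes a playbook that targets the wildcard pattern 'web*.example.com' and asks ansible-playbook which hosts it would match. The directory name ansible-demo and the file name ping-web.yml are made up for this example, and the inventory is the one from earlier in this post. The pattern is quoted in the YAML so that the asterisk is read as plain text.

```shell
#!/bin/sh
mkdir -p ansible-demo

# Inventory from earlier in the post.
cat > ansible-demo/hosts.ini <<'EOF'
[webservers]
web1.example.com
web2.example.com

[dbservers]
db.example.com
EOF

# Playbook restricted to hosts whose names match a wildcard pattern.
cat > ansible-demo/ping-web.yml <<'EOF'
---
- hosts: "web*.example.com"
  tasks:
  - name: ping the matching hosts
    action: ping
EOF

# --list-hosts shows the matched hosts without running any tasks.
if command -v ansible-playbook >/dev/null 2>&1; then
    ansible-playbook ansible-demo/ping-web.yml -i ansible-demo/hosts.ini --list-hosts
fi
```

With this inventory the pattern should match the two web hosts and leave db.example.com out.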

There are many modules available for use in Ansible, with ping being perhaps the simplest. I tend to add ping into many of my playbooks as a simple connection check for the server before trying to do anything else. For a full list of the modules available take a look at the Ansible modules list in the Ansible documentation.

A good example of a playbook is one that sets up an Apache host. This is quite easy to do, but obviously needs a few steps in order to get things up and running. The following is an example of a playbook that installs Apache on a Debian/Ubuntu based system.

---
- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  tasks:
  - name: ensure apache is at the latest version
    apt: pkg=apache2 state=latest
  - name: ensure apache is running
    service: name=apache2 state=started
  - name: write the apache config file
    template: src=httpd.j2 dest=/etc/apache2/httpd.conf
    notify:
    - restart apache
  handlers:
  - name: restart apache
    service: name=apache2 state=restarted

The above playbook will do the following:

The first step is to find all servers in the inventory file in the group 'webservers' using the hosts section. The rest of the document is part of this hosts section and so all commands, variables and other actions are run only on these servers.
Some variables for the Apache server configuration are set in the 'vars' section.
A set of tasks is defined under a section called 'tasks'. This has the following actions:
- Install Apache2 using the apt module.
- Ensure that the Apache2 server is running.
- Copy a template (called httpd.j2 and found in the current directory) to the server configuration area. This is an example of using Jinja2, the Python templating library that Ansible uses for templates. Any variable defined in Ansible can be included in the template by referencing it within double curly braces. For example, to include the port number you would write '{{ http_port }}'. The handler 'restart apache' is notified if this task changes the file.
A handler is defined at the bottom of the playbook that restarts Apache2. Any task can trigger it through a notify entry, and a notified handler runs once, after the tasks have finished.
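The post doesn't show the contents of httpd.j2, so the following is only an illustrative guess at what such a template might contain. It uses the two variables from the 'vars' section with the Apache directives Listen and MaxClients; a real Apache configuration would need a good deal more. The directory name ansible-demo is made up for this example.

```shell
#!/bin/sh
mkdir -p ansible-demo

# The quoted 'EOF' stops the shell from expanding anything in the template text.
cat > ansible-demo/httpd.j2 <<'EOF'
# Illustrative template content only; a real Apache configuration needs more.
Listen {{ http_port }}
MaxClients {{ max_clients }}
EOF

# Show the template as it would be stored alongside the playbook.
cat ansible-demo/httpd.j2
```

With the values from the playbook above, the two directives would be rendered as 'Listen 80' and 'MaxClients 200' when the template task runs.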

Playbooks can obviously get quite complex if you continue the above example to include things like PHP and MySQL. It is possible to separate playbooks into sections using include statements, but the best way to go about organising playbooks is by using roles and tags. Roles and tags allow you to separate your playbooks into discrete sections of configuration and functionality and make managing different software requirements on different hosts a lot easier.

To change the above example to a role based approach we first need to set up the correct directory structure. Roles are kept in a directory called 'roles', with each role having its own directory within that. Each role is split into separate directories, and Ansible will look for a file called 'main.yml' within the tasks and handlers directories. Using the above configuration, we split the Apache2 configuration steps in the playbook.yml file into separate components and arrange them like this.

playbook.yml
roles/
   apache2/
     templates/httpd.j2
     tasks/main.yml
     handlers/main.yml
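A quick way to create that layout from the command line is sketched here. The top-level directory name ansible-demo is made up for this example, and the files are created empty as placeholders to be filled in afterwards.

```shell
#!/bin/sh
# Create the role layout shown above under a scratch directory (hypothetical name).
mkdir -p ansible-demo/roles/apache2/templates \
         ansible-demo/roles/apache2/tasks \
         ansible-demo/roles/apache2/handlers

# Empty placeholders; fill them with the template, tasks and handlers.
touch ansible-demo/playbook.yml \
      ansible-demo/roles/apache2/templates/httpd.j2 \
      ansible-demo/roles/apache2/tasks/main.yml \
      ansible-demo/roles/apache2/handlers/main.yml

# List everything that was created under roles/.
find ansible-demo/roles | sort
```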

As an example the tasks/main.yml file contains the following.

---
- name: ensure apache is at the latest version
  apt: pkg=apache2 state=latest
- name: ensure apache is running
  service: name=apache2 state=started
- name: write the apache config file
  template: src=httpd.j2 dest=/etc/apache2/httpd.conf
  notify:
  - restart apache
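The matching handlers/main.yml file isn't shown in this post. A plausible version, sketched below, is simply the handler from the original playbook moved into the role. It assumes the service is called apache2, as it is on Debian/Ubuntu, and the handler name has to match the 'restart apache' entry under notify in the tasks file. The directory name ansible-demo is made up for this example.

```shell
#!/bin/sh
mkdir -p ansible-demo/roles/apache2/handlers

# Handler moved out of the playbook and into the role.
# The name must match the "restart apache" entry under notify in tasks/main.yml.
cat > ansible-demo/roles/apache2/handlers/main.yml <<'EOF'
---
- name: restart apache
  service: name=apache2 state=restarted
EOF
```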

In order to include a role in our playbook we need to tweak the playbook.yml file slightly.

---
- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  roles:
    - apache2

More roles can be added to that server by creating the role configuration files and then adding them to the set of roles underneath the roles declaration in the main playbook file.
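As a sketch of what that might look like, the script below writes a playbook with three roles, each given a tag. The php and mysql roles and the tag names 'web' and 'db' are hypothetical; they would need their own directories under roles/ before the playbook could run. The directory name ansible-demo is also made up for this example.

```shell
#!/bin/sh
mkdir -p ansible-demo

# Playbook with several roles; php and mysql are hypothetical role names.
cat > ansible-demo/playbook.yml <<'EOF'
---
- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  roles:
    - { role: apache2, tags: ['web'] }
    - { role: php, tags: ['web'] }
    - { role: mysql, tags: ['db'] }
EOF

# To run only the roles tagged 'web' (once the role directories exist):
#   ansible-playbook ansible-demo/playbook.yml -i hosts.ini --tags web
```

Tagging roles in this way means a single playbook can describe the whole server while still letting you apply one part of it at a time.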

See the Ansible playbooks documentation for more information on playbooks.
