Ansible: everything you need to know about set_fact

If you have seen a lot of Ansible playbook examples, you will have noticed that the set_fact module appears in most of them. Let us dive a little deeper to see what it is and how it can help you write dynamic playbooks.

The jargon `set_fact` :

If you just read set_fact in an Ansible playbook, it is hard to tell what it really means. You may think it sets some kind of facts, but what facts? I had the same doubt and was a little overwhelmed by the terminology. In generic terms :

The set_fact module sets variables once their values become known at run time, and optionally only when a condition you specify is met.

You can set simple variables in Ansible using vars, vars_files or include_vars, but in those cases you know the values beforehand. With set_fact, you set variables on the fly depending on the results of earlier tasks. These variables are then available to subsequent tasks, and to later plays that target the same host, for the rest of the playbook run.
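To make the difference concrete, here is a minimal sketch that mixes a variable known beforehand with one that only exists after a task has run. The names app_name, date_result and release_tag are made up for this illustration, and the play assumes a Unix-like control machine where the date command is available.

```yaml
# Minimal sketch: a known variable (vars) next to one set on the fly (set_fact).
- name: Known variables versus variables set on the fly
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    app_name: demo            # known before the playbook runs
  tasks:
    - name: Run a command whose output is only known at run time
      command: date +%Y%m%d
      register: date_result   # hypothetical variable name
      changed_when: false

    - name: Build a new variable from the task result
      set_fact:
        release_tag: "{{ app_name }}-{{ date_result.stdout }}"

    - name: Use the variable in a later task
      debug:
        msg: "Release tag is {{ release_tag }}"
```

The value of release_tag cannot be written into a vars file, because it depends on when the playbook runs. That is exactly the kind of variable set_fact is meant for.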

Let's see a real life use case :

Before diving into an actual playbook and the overall syntax, let's first take a real life use case, which will help us connect the dots.

Use case : We need to spin up an instance on AWS and then add it into an existing AWS Target Group.

Known Variables : We will have some known variables like the instance AMI id and the instance type.

Unknown Variables : To add an instance to a target group, we need the instance id. However, until that instance is spun up, we will not have the instance id. It will be set on the fly once the instance is spun up.

In this case, we will use the known variables to spin up an instance. Once that task is done, we will get an instance id from AWS. That is a fact we have learned from that task. We will store it in a variable using the set_fact module, and it can then be used later to add the instance to an existing AWS Target Group.

Register and set_fact go hand in hand :

Until you have received some factual information, or to be specific a task result, you have no facts to set with the set_fact module. This is why I always feel that the register keyword and the set_fact module go hand in hand.

Please note that register is not the only way to get factual information from a task, although it is one of the most common. Some modules return their findings directly under ansible_facts; for example, package_facts collects package information from a host. There are many more possibilities.
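As a rough illustration of a source of facts other than register, the sketch below lets package_facts fill ansible_facts.packages and then derives a variable from it. The variable name nginx_installed is made up for this example, and the play assumes it runs against a host with a package manager that package_facts supports.

```yaml
# Sketch: facts returned by a module (no register) feeding set_fact.
- name: Use gathered package facts as a source for set_fact
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Collect installed packages into ansible_facts.packages
      package_facts:
        manager: auto

    - name: Remember whether nginx is already installed
      set_fact:
        nginx_installed: "{{ 'nginx' in ansible_facts.packages }}"   # hypothetical name

    - name: Show the result
      debug:
        var: nginx_installed
```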

set_fact is host specific :

A very important thing to note is that when you set a fact using the set_fact module, it is specific to the host on which the task is currently running. As the documentation says : Variables are set on a host-by-host basis just like facts discovered by the setup module. If your playbook has multiple hosts, a fact set with set_fact on one host is not automatically defined on the others; another host can only read it explicitly through hostvars.
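The following sketch shows what host specific means in practice. The inventory host names web1 and web2 and the variable build_id are hypothetical; the only assumption is that both hosts exist in your inventory.

```yaml
# Sketch: a fact set on web1 is not defined on web2 unless read via hostvars.
- name: Set a fact on the first host only
  hosts: web1                 # hypothetical inventory host
  gather_facts: false
  tasks:
    - name: Set build_id on web1
      set_fact:
        build_id: "abc123"    # hypothetical value

- name: Read that fact from another host
  hosts: web2                 # hypothetical inventory host
  gather_facts: false
  tasks:
    - name: build_id is not defined directly on web2
      debug:
        msg: "build_id defined here: {{ build_id is defined }}"

    - name: It can still be looked up explicitly through hostvars
      debug:
        msg: "build_id from web1: {{ hostvars['web1'].build_id }}"
```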

Diving into an example :

Let's take the use case we discussed earlier and make a simple playbook for it. We will have the following structure :

ReleaseAMIUpdates/
├── config.yml
├── env.yml
├── playbook.yml
└── setup.sh

Please note that you may have a more detailed structure based on your preferences. This article is about exploring set_fact, so we will focus on its implementation.

config.yml :

This file will have the configuration variables which rarely change.

vpc_id: vpc-12345678
ec2_iam_role: ec2-iam-role-name
instance_type: t2.micro
instance_volume_in_gb: 30
instance_security_group: ec2-security-group-name
ami_id: ami-12345678
instance_key_name: ansible-instance.pem

env.yml :

This file will have the configurations which are sensitive or may change.

region: us-west-1
aws_access_key: your_aws_access_key
aws_secret_key: your_aws_secret_key
target_group: Test Ansible Target Group

setup.sh :

This file will have any user-data bootstrap commands you need to run as soon as the new instance is spun up. We will use it to install nginx so that we can serve web traffic with a simple web page.

#!/bin/bash

# Update Package Lists
apt-get update

# Install add-apt-repository dependencies
apt-get install software-properties-common -y
apt-get install python-software-properties -y

# Update Package Lists
apt-get update -y

# Install nginx 
apt-get -y install nginx

Now that you have the above yml files set up, they will act as variable files. We will refer to the configurations specified above as variables in our playbook.

playbook.yml :

This file will contain all playbook tasks.

# create a launch configuration using an AMI image and instance type as a basis
- name: Launch new AMI Release
  hosts: localhost
  connection: local
  vars_files:
    - ./env.yml
    - ./config.yml

  tasks:

  #  Get VPC public subnet details as it will be needed later while launching the sandbox instance
  - name: Get VPC Subnet Details
    ec2_vpc_subnet_facts:
      aws_access_key: "{{ aws_access_key }}"
      aws_secret_key: "{{ aws_secret_key }}"
      region: "{{ region }}"
      filters:
        vpc-id: "{{ vpc_id }}"
        "tag:Availability": "Public"
    # Save the result json in variable subnet_facts_public
    register: subnet_facts_public

  - name: Get VPC Subnet ids which are available and public
    set_fact:
      vpc_subnet_id_public: "{{ subnet_facts_public.subnets|selectattr('state', 'equalto', 'available')|map(attribute='id')|list|random }}"

  # Launch instance with required settings
  - name: Launch new instance
    ec2:
      key_name: "{{ instance_key_name }}"
      aws_access_key: "{{ aws_access_key }}"
      aws_secret_key: "{{ aws_secret_key }}"
      region: "{{ region }}"
      image: "{{ ami_id }}"
      instance_profile_name: "{{ ec2_iam_role }}"
      vpc_subnet_id: "{{ vpc_subnet_id_public }}"
      instance_type: "{{ instance_type }}"
      group: "{{ instance_security_group }}"
      assign_public_ip: False
      # All commands specified in below will run as soon as instance is launched
      user_data: "{{ lookup('file', 'setup.sh') }}"
      wait: True
      wait_timeout: 500
      volumes: 
        - device_name: /dev/sda1
          volume_size: "{{ instance_volume_in_gb }}"
          volume_type: gp2
          encrypted: True
          delete_on_termination: True
      instance_tags:
        Name: Ansible-Test-Instance
    register: ec2

  # Get instance id from registered facts in ec2
  - name: Get instance id from registered facts in ec2
    set_fact:
      new_instance_id: "{{ ec2.instance_ids[0] }}"

  # Add newly created instance into target group
  - name: Add newly created instance into target group
    elb_target_group:
      name: "{{ target_group }}"
      aws_access_key: "{{ aws_access_key }}"
      aws_secret_key: "{{ aws_secret_key }}"
      region: "{{ region }}"
      target_type: instance
      health_check_interval: 30
      health_check_path: /health
      health_check_protocol: http
      health_check_timeout: 15
      healthy_threshold_count: 2
      unhealthy_threshold_count: 2
      protocol: http
      port: 80
      vpc_id: "{{ vpc_id }}"
      successful_response_codes: "200"
      targets:
        - Id: "{{ new_instance_id }}"
          Port: 80
      state: present

Let's walk through each task in the above playbook.yml before we run it :

We already know our VPC id, but to spin up an instance we also need a subnet id. We use the Ansible module ec2_vpc_subnet_facts to get the public subnets and register the result in the subnet_facts_public variable.
We parse the content of subnet_facts_public, keep only the subnets whose state is available, and pick one of their ids at random. The filters used for this come from Jinja2, the templating language built into Ansible. Once we have that fact, we set it on the fly with set_fact as the vpc_subnet_id_public variable.
We launch the new instance and register the information returned about it in the ec2 variable.
We parse the content of the ec2 variable to get the instance id of the newly spun up instance. Once we have that fact, we set it on the fly with set_fact as the new_instance_id variable.
Finally, we update our target group and add this instance to its targets.
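If you want to see the two facts while the playbook runs, a debug task like the sketch below could be added to the tasks list in playbook.yml after the instance id has been set. It is only an optional aid for following the walkthrough, it is not part of the playbook above, and it uses nothing beyond the variable names already defined there.

```yaml
  # Optional sketch: print the facts set earlier in the same play.
  - name: Show the facts that were set on the fly
    debug:
      msg: "Chosen subnet is {{ vpc_subnet_id_public }}, new instance is {{ new_instance_id }}"
```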

Conditionally set facts :

You might need to set a fact with set_fact only when another variable or registered result holds a particular value. In that case you can use the when conditional.

Example :

- name: Get VPC Subnet ids which are available and public
  set_fact:
    vpc_subnet_id_public: "{{ subnet_facts_public.subnets|selectattr('state', 'equalto', 'available')|map(attribute='id')|list|random }}"
  when: region == "us-west-2"
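The condition can also depend on a registered result rather than on a plain variable. Here is a small self-contained sketch of that idea. The marker file path and the names marker and deploy_mode are made up for this illustration.

```yaml
# Sketch: set a fact only when a registered result says so, with a fallback.
- name: Conditionally set a fact from a registered result
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Check whether a marker file exists
      stat:
        path: /tmp/deploy.marker      # hypothetical path
      register: marker

    - name: Set deploy_mode only when the marker file is present
      set_fact:
        deploy_mode: incremental
      when: marker.stat.exists

    - name: Fall back to a default when the fact was not set
      set_fact:
        deploy_mode: full
      when: deploy_mode is not defined

    - name: Show which mode was chosen
      debug:
        var: deploy_mode
```

When the condition is false, the set_fact task is skipped and the variable stays undefined, which is why the fallback task checks deploy_mode is not defined.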

Caching a set fact :

You can cache a fact set by the set_fact module so that it can be retrieved from the cache the next time you execute your playbook. Set cacheable to yes to store the variable across playbook executions using a fact cache; this only takes effect if fact caching is enabled in your Ansible configuration. Cached facts are evaluated with a different precedence than ordinary set_fact variables, so you may need to look into the variable precedence rules in the Ansible documentation.
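A minimal sketch of a cacheable fact could look like the one below. The variable name last_release and its value are made up for this example, and the sketch assumes a fact cache plugin has already been enabled in ansible.cfg (for instance the jsonfile plugin through the fact_caching and fact_caching_connection settings). Without such a cache the variable behaves like any other fact set with set_fact.

```yaml
# Sketch: mark a fact as cacheable so a fact cache plugin can persist it.
- name: Store a fact in the fact cache
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Set a fact and mark it as cacheable
      set_fact:
        last_release: "2024-01-01"    # hypothetical value
        cacheable: yes

    - name: Use the fact in the current run
      debug:
        var: last_release
```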

By : Mihir Bhende Categories : aws, ansible Tags : aws, ansible, devops, automation, set_facts, set, facts, variables, onthefly, fly, dynamically, dynamic
