Dan Macpherson 229b138357 Redis backup, restore, and validation tasks
This adds task files for the backup, restore, and validation
for Redis. It also incorporates some of the foundational changes
from https://review.openstack.org/#/c/635506/

Change-Id: Idd50a6b53a22bc6b23df776cff9537e3c47618f3
2019-03-21 16:33:21 +00:00

Backup and Restore Operations

The openstack-operations role includes some foundational backup and restore Ansible tasks to help automate backing up and restoring OpenStack services. The services currently available to back up and restore include:

  • MySQL on a galera cluster
  • Redis

Scenarios tested:

  • TripleO, 1 Controller, 1 Compute, backup to the undercloud
  • TripleO, 1 Controller, 1 Compute, backup to remote server
  • TripleO, 3 Controllers, 1 Compute, backup to the undercloud
  • TripleO, 3 Controllers, 1 Compute, backup to remote server

Architecture

The architecture uses three main host types:

  • Target Hosts - The OpenStack nodes with data to back up. For example, this would be any nodes with database servers running.
  • Backup Host - The destination to store the backup.
  • Control Host - The host that executes the playbook. For example, this would be the undercloud on TripleO.

You can also unify the Backup Host and Control Host onto a single host. For example, a host that runs playbooks AND stores the backup data.

Requirements

General Requirements:

  • Backup Host needs access to the rsync package. A task in initialize_backup_host.yml will attempt to install it.

MySQL/Galera

  • Target Hosts need access to the mysql package. Tasks in the backup and restore files will attempt to install it.
  • When restoring to Galera, the Control Host requires the pacemaker_resource module. You can obtain this module from the ansible-pacemaker RPM. If your operating system does not have access to this package, you can clone the ansible-pacemaker git repo. When running a restore playbook, include the ansible-pacemaker module using the -M option (e.g. ansible-playbook -M /usr/share/ansible-modules ...)

Filesystem

  • No special requirements. The filesystem backup tasks only use the tar command.

Redis

  • Target Hosts need access to the redis package. Tasks in the backup and restore files will attempt to install it.
  • When restoring Redis, the Control Host requires the pacemaker_resource module. You can obtain this module from the ansible-pacemaker RPM. If your operating system does not have access to this package, you can clone the ansible-pacemaker git repo. When running a restore playbook, include the ansible-pacemaker module using the -M option (e.g. ansible-playbook -M /usr/share/ansible-modules ...)

Task Files

The following is a list of the task files used in the backup and restore process.

Initialization Tasks:

  • initialize_backup_host.yml - Makes sure the Backup Host (destination) has an SSH key pair and rsync installed.
  • enable_ssh.yml - Enables SSH access from the Backup Host to the Target Hosts. This is so rsync can pull the backed up data and push the data during a restore.
  • disable_ssh.yml - Disables SSH access from the Backup Host to the Target Hosts. This ensures that access is granted only for the duration of the backup or restore.
  • set_bootstrap.yml - In situations with high availability, some restore tasks (such as Pacemaker functions) only need to be carried out by one of the Target Hosts. The tasks in set_bootstrap.yml set a "bootstrap" node to help execute single tasks on only one Target Host. This is usually the first node in your list of targets.

Backup Tasks:

  • backup_mysql.yml - Performs a backup of the OpenStack MySQL data and grants, archives them, and sends them to the desired backup host.
  • backup_filesystem.yml - Creates a tar file from a given list of files and directories and sends it to the desired backup host.
  • backup_redis.yml - Performs a backup of Redis data from one node, archives them, and sends them to the desired backup host.

Restore Tasks:

  • restore_galera.yml - Performs a restore of the OpenStack MySQL data and grants on a containerized galera cluster. This involves shutting down the current galera cluster, creating a brand new MySQL database, then importing the data and grants from the archive. In addition, the playbook saves a copy of the old data in case the restore process fails.
  • restore_redis.yml - Performs a restore of Redis data from one node to all nodes and resets the permissions using a redis container.

Validation Tasks:

  • validate_galera.yml - Performs the equivalent of clustercheck i.e. checks the wsrep_local_state is 4 ("Synced").
  • validate_redis.yml - Performs a Redis check with redis-cli ping.
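
The validation tasks can be run in a play of their own, for example after a restore. The following is a minimal sketch, not a tested playbook: it assumes the galera validation runs against the same mysql host group used in the examples later in this document, and that it needs no variables beyond those the restore tasks already use.

```yaml
---
# Sketch only: the host group name and the play layout are assumptions.
- name: Validate galera cluster after a restore
  hosts: "{{ target_hosts | default('mysql') }}"
  tasks:
    # Runs the tasks listed above as validate_galera.yml
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: validate_galera
```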

Variables

Use the following variables to customize how you want to run these tasks.

Variables for all backup tasks:

  • backup_directory - The location on the backup host to rsync archives. If unset, defaults to the home directory of the chosen inventory user for the Backup Host. If you aim to have recurring backup jobs and store multiple iterations of the backup, you should set this to a dynamic value such as a timestamp or UUID.
  • backup_server_hostgroup - The name of the host group containing the backup server. Ideally, this host group only contains the Backup Host. If more than one host exists in this group, the tasks pick the first host in the group. Note the following:
    • The chosen Backup Host Group must be in your inventory.
    • The Backup Host must be initialized using the initialize_backup_host.yml tasks. You can do this by placing the Backup Host in a single host group called backup and referring to it with hosts: backup[0] in a play that runs the initialize_backup_host tasks.
    • You can only use one Backup Host. This is because the delegation for the synchronize module allows only one host.
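
As an illustration of a dynamic backup_directory, the sketch below builds a dated path in a variables file. The file name and the backup_root helper variable are hypothetical and not part of the role. The pipe lookup runs date on the Control Host each time the variable is evaluated, so a day-level stamp is used here to keep the value stable across the plays of one run.

```yaml
# group_vars/all.yml (hypothetical file name and location)

# backup_root is a helper variable introduced for this sketch only.
backup_root: /home/backup

# Produces a path such as /home/backup/2019-03-21/ on the Backup Host.
backup_directory: "{{ backup_root }}/{{ lookup('pipe', 'date +%Y-%m-%d') }}/"

# Name of the inventory group that contains the Backup Host.
backup_server_hostgroup: backup
```

If you need a finer-grained timestamp or a UUID, passing the value once on the command line as an extra variable avoids it changing between plays.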

MySQL and galera backup and restore variables:

  • kolla_path - The location of the configuration for Kolla containers. Defaults to /var/lib/config-data/puppet-generated.
  • mysql_bind_host - The IP address for database server access. The tasks place a temporary firewall block on this IP address to prevent services writing to the database during the restore.
  • mysql_root_password - The original root password to access the database. If unset, it checks the Puppet hieradata for the password.
  • mysql_clustercheck_password - The original password for the clustercheck user. If unset, it checks the Puppet hieradata for the password.
  • galera_container_image - The image to use for the temporary container to restore the galera database. If unset, it tries to determine the image from the existing galera container.
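
A restore play might set some of these variables explicitly rather than relying on the hieradata lookups. This is a sketch under assumptions: the IP address is a placeholder, and vault_mysql_root_password is a hypothetical variable that you would define yourself, for example in an Ansible Vault file.

```yaml
---
- name: Restore MySQL database on galera cluster
  hosts: mysql
  vars:
    backup_server_hostgroup: backup
    # Same value as the documented default, shown for illustration
    kolla_path: /var/lib/config-data/puppet-generated
    # Placeholder address: use the IP your database server listens on
    mysql_bind_host: 192.0.2.10
    # Hypothetical vaulted variable, not provided by the role
    mysql_root_password: "{{ vault_mysql_root_password }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: set_bootstrap
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: restore_galera
```

Variables left out of the play, such as galera_container_image here, fall back to the behavior described in the list above.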

Filesystem backup variables:

  • backup_dirs - List of the files and directories to back up.
  • backup_exclude - List of the files to exclude from the backup.
  • backup_file - The end of the backup file name.

Redis backup and restore variables:

  • redis_vip - The VIP address of the Redis cluster. If unset, it checks the Puppet hieradata for the VIP.
  • redis_masterauth_password - The master password for the Redis cluster. If unset, it checks the Puppet hieradata for the password.
  • redis_container_image - The image to use for the temporary container that restores the permissions to the Redis data directory. If unset, it tries to determine the image from the existing redis container.
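
One way to supply the Redis variables is a group variables file for the Redis host group, so that every play targeting that group picks them up. The sketch below assumes an inventory group named redis, as in the examples later in this document. The VIP and image reference are placeholders, not values taken from a real deployment.

```yaml
# group_vars/redis.yml (hypothetical file; applies to hosts in the "redis" group)

# Placeholder VIP: omit this to let the tasks read it from the Puppet hieradata.
redis_vip: 192.0.2.20

# Placeholder image reference: omit this to let the tasks detect the image
# from the existing redis container.
redis_container_image: "registry.example.com/openstack/redis:latest"
```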

Inventory and Playbooks

You ultimately define how to use the tasks with your own playbooks and inventory. The inventory should include the host groups and users to access each host type. For example:

[my_backup_host]
192.0.2.200 ansible_user=backup

[my_target_host]
192.0.2.101 ansible_user=openstack
192.0.2.102 ansible_user=openstack
192.0.2.103 ansible_user=openstack

[all:vars]
backup_directory="/home/backup/my-backup-folder/"
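
The same inventory can also be written in Ansible's YAML inventory format, which some teams prefer for larger inventories. This sketch is a direct translation of the INI example above. The group names, addresses, and users are the same illustrative values.

```yaml
# inventory.yml (hypothetical file name)
all:
  vars:
    backup_directory: /home/backup/my-backup-folder/
  children:
    my_backup_host:
      hosts:
        192.0.2.200:
          ansible_user: backup
    my_target_host:
      hosts:
        192.0.2.101:
          ansible_user: openstack
        192.0.2.102:
          ansible_user: openstack
        192.0.2.103:
          ansible_user: openstack
```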

The process for your playbook depends largely on whether you want to back up or restore. However, the general process usually follows:

  1. Initialize the backup host
  2. Ensure SSH access from the backup host to your OpenStack nodes
  3. Perform the backup or restore. If some tasks must run on only a single Target Host, set a bootstrap node first.
  4. (Optional) If using a separate Backup Host (i.e. not the Control Host), disable SSH access from the backup host to your OpenStack nodes.

Examples

The following examples show how to use the backup and restore tasks.

Backup and restore galera and redis to a remote backup server

This example shows how to back up data to the root user on a remote backup server, and then restore it. The inventory file is the same for both operations:

[backup]
192.0.2.250 ansible_user=root

[mysql]
192.0.2.101 ansible_user=heat-admin
192.0.2.102 ansible_user=heat-admin
192.0.2.103 ansible_user=heat-admin

[redis]
192.0.2.101 ansible_user=heat-admin
192.0.2.102 ansible_user=heat-admin
192.0.2.103 ansible_user=heat-admin

[all:vars]
backup_directory="/root/backup-test/"

Backup Playbook:

---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Backup MySQL database
  hosts: "{{ target_hosts | default('mysql') }}[0]"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: backup_mysql
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: disable_ssh

- name: Backup Redis database
  hosts: "{{ target_hosts | default('redis') }}[0]"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: backup_redis
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: disable_ssh

We do not need to include the bootstrap tasks with the backup since all tasks are performed by one of the Target Hosts.

Restore Playbook:

---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Restore MySQL database on galera cluster
  hosts: "{{ target_hosts | default('mysql') }}"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: set_bootstrap
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: restore_galera
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: disable_ssh

- name: Restore Redis data
  hosts: "{{ target_hosts | default('redis') }}"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: set_bootstrap
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: restore_redis
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: disable_ssh

We include the bootstrap tasks with the restore since all Target Hosts are required for the restore but only certain operations are performed on one of the hosts.

Backup and restore galera and redis to a combined control/backup host

This example shows how to back up to a directory on the Control Host using the same user. In this case, we use the stack user for both Ansible and rsync operations. We also use the heat-admin user to access the OpenStack nodes. Both the backup and restore operations use the same inventory file:

[backup]
localhost ansible_user=stack

[mysql]
192.0.2.101 ansible_user=heat-admin
192.0.2.102 ansible_user=heat-admin
192.0.2.103 ansible_user=heat-admin

[redis]
192.0.2.101 ansible_user=heat-admin
192.0.2.102 ansible_user=heat-admin
192.0.2.103 ansible_user=heat-admin

[all:vars]
backup_directory="/home/stack/backup-test/"

Backup Playbook:

---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Backup MySQL database
  hosts: "{{ target_hosts | default('mysql') }}[0]"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: backup_mysql

- name: Backup Redis database
  hosts: "{{ target_hosts | default('redis') }}[0]"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: backup_redis

Restore Playbook:

---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Restore MySQL database on galera cluster
  hosts: "{{ target_hosts | default('mysql') }}"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: set_bootstrap
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: restore_galera

- name: Restore Redis data
  hosts: "{{ target_hosts | default('redis') }}"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: set_bootstrap
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: restore_redis

In this situation, we do not include the disable_ssh tasks since this would disable access from the Control Host to the OpenStack nodes for future Ansible operations.

Backup filesystem from controller

Inventory file:

[backup]
undercloud-0 ansible_connection=local

[filesystem]
controller-0 ansible_user=heat-admin ansible_host=192.168.24.6
controller-1 ansible_user=heat-admin ansible_host=192.168.24.20
controller-2 ansible_user=heat-admin ansible_host=192.168.24.8

[all:vars]
backup_directory="/var/tmp/backup"
ansible_ssh_common_args='-o StrictHostKeyChecking=no'

Filesystem Backup Playbook:

---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Backup Filesystem
  hosts: "{{ target_hosts | default('filesystem') }}"
  become: yes
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
    backup_file: "filesystem.bck.tar"
    backup_dirs:
      - /etc
      - /var/lib/nova
      - /var/lib/glance
      - /var/lib/heat-config
      - /var/lib/heat-cfntools
      - /var/lib/openvswitch
      - /var/lib/config-data
      - /var/lib/tripleo-config
      - /srv/node
      - /usr/libexec/os-apply-config/
      - /root
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: backup_filesystem