Merge "Add borg-backup roles"

Zuul 2020-10-01 07:36:47 +00:00 committed by Gerrit Code Review
commit 083e8b43ea
18 changed files with 434 additions and 46 deletions

@ -162,12 +162,14 @@ you can inspect any of the clouds with::
Backups
=======
Infra uses the `bup <https://bup.github.io>`__ tool for backups.
Infra uses the `borg <https://borgbackup.readthedocs.io>`__ backup
tool.
Hosts in the ``backup`` Ansible inventory group will be backed up to
servers in the ``backup-server`` group with ``bup``. The
``playbooks/roles/backup`` and ``playbooks/roles/backup-server`` roles
implement the required setup.
Hosts in the ``borg-backup`` Ansible inventory group will be backed up
to servers in the ``borg-backup-server`` group with ``borg``. The
``playbooks/roles/borg-backup`` and
``playbooks/roles/borg-backup-server`` roles implement the required
setup.
The backup server has a unique Unix user for each host to be backed
up. The roles will set up required users, their home directories in
@ -181,52 +183,27 @@ key setup just for backup communication (see ``/root/.ssh/config``).
Restore from Backup
-------------------
On the server that needs items restored from backup become root, start a
screen session as restoring can take a while, and create a working
directory to restore the backups into. This allows us to be selective in
how we restore content from backups::
``borg`` has many options for restoring, but a basic way to dump a host
at a particular time is to:
sudo su -
screen
mkdir /root/backup-restore-$DATE
cd /root/backup-restore-$DATE
Root uses a separate ssh key and remote user to communicate with the
backup server(s); the username and key to use for backup should be
automatically configured in ``/root/.ssh/config``. The backup server
hostname can be taken from there.
At this point we can join the tar that was split by the backup cron::
bup join -r backup.x.y.opendev.org: root > backup.tar
At this point you may need to wait a while. These backups are stored on
servers geographically distant from our normal servers resulting in less
network throughput between servers than we are used to.
Once the ``bup join`` is complete you will have a tar archive of that
backup. It may be useful to list the files in the backup
``tar -tf backup.tar`` to get an idea of what things are available. At
this point you will probably either want to extract the entire backup::
tar -xvf backup.tar
ls -al
Or selectively extract files::
# path/to/file needs to match the output given by tar -t
tar -xvf backup.tar path/to/file
Note if you created your working directory in a path that is not
excluded by bup you will want to remove that directory when your work is
done. /root/backup-restore-* is excluded so the path above is safe.
* log into the backup server
* use ``sudo su - <username>`` to switch to the backup user for the host to be restored
* you will now be in the home directory of that user
* run ``/opt/borg/bin/borg list ./backup`` to list the archives available
* these should look like ``hostname-YYYY-MM-DDTHH:MM:SS``
* move to a working directory
* extract one of the appropriate archives with ``/opt/borg/bin/borg extract ~/backup::<archive-tag>``
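The list and extract steps above can be sketched as a small helper that builds the two command lines. This is only an illustration: the function names are hypothetical, the binary path is the one the ``install-borg`` role uses, and the ``REPO::ARCHIVE`` form is how borg 1.x addresses a single archive.

```python
import os

BORG = "/opt/borg/bin/borg"  # path installed by the install-borg role


def list_command(repo):
    """Return the argv that lists all archives in a repository."""
    return [BORG, "list", repo]


def extract_command(repo, archive):
    """Return the argv that extracts one archive into the current directory."""
    return [BORG, "extract", f"{repo}::{archive}"]


if __name__ == "__main__":
    repo = os.path.expanduser("~/backup")
    print(" ".join(list_command(repo)))
    # The archive name is a made-up example of the hostname-timestamp form.
    print(" ".join(extract_command(repo, "myhost-2020-10-01T05:17:00")))
```

Because ``borg extract`` writes into the current directory, the "move to a working directory" step matters: run the extract command from wherever the restored files should land.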
Rotating backup storage
-----------------------
Since ``bup`` only stores differences, it does not have an effective
way to prune old backups. The easiest way is to simply periodically
start the backups fresh.
We run ``borg`` in append-only mode, so that clients cannot remove
old backups on the server.
TODO(ianw) : Write instructions on how to prune server side. We
should monitor growth to see if automatic pruning would be
appropriate, or periodic manual pruning, or something similar to this
existing system where we keep a historic archive and start fresh.
The backup server keeps an active volume and the previously rotated
volume. Each consists of 3 x 1TiB volumes grouped with LVM. The

@ -0,0 +1,15 @@
Setup backup server
This role configures backup server(s) in the ``borg-backup-server`` group
to accept backups from remote hosts.
Note that the ``borg-backup`` role must have run on each host in the
``borg-backup`` group before this role. That role will create a
``borg_user`` tuple in the hostvars for each host consisting of
the required username and public key.
Each required user gets a separate home directory in ``/opt/backups``.
Their ``authorized_keys`` file is configured with the public key to
allow the remote host to log in and only run ``borg`` in server mode.
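As a rough sketch of what ends up in each user's ``authorized_keys``, the snippet below assembles one entry from a username and public key. The function name is hypothetical and the key in the usage line is a placeholder. The forced command mirrors the ``key_options`` value used by the role's tasks.

```python
def authorized_keys_line(user_name, public_key, borg="/opt/borg/bin/borg"):
    """Build an authorized_keys entry that only allows borg in server mode."""
    repo = f"/opt/backups/{user_name}/backup"
    # The forced command replaces whatever the client asks to run, and
    # --restrict-to-path confines it to that user's own repository.
    forced = f"{borg} serve --append-only --restrict-to-path {repo}"
    # "restrict" disables port forwarding, pty allocation and similar.
    return f'command="{forced}",restrict {public_key}'


if __name__ == "__main__":
    print(authorized_keys_line("borg-example", "ssh-ed25519 AAAA... root@example"))
```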
**Role Variables**

@ -0,0 +1 @@
borg_users: []

@ -0,0 +1,19 @@
- name: Create backup directory
  file:
    state: directory
    path: /opt/backups

- name: Install borg
  include_role:
    name: install-borg

# Append each backup host's [username, public key] pair to the list.
- name: Build all borg users from backup hosts
  set_fact:
    borg_users: '{{ borg_users + [ hostvars[item]["borg_user"] ] }}'
  with_inventory_hostnames: 'borg-backup:!disabled'

- name: Create borg users
  include_tasks: user.yaml
  loop: '{{ borg_users }}'
  loop_control:
    loop_var: borg_user
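The accumulation done by the ``set_fact`` loop above can be pictured in plain Python. This is a sketch under assumptions, not how Ansible evaluates it: ``hostvars`` is modelled as a plain dict, and the function name and ``disabled`` argument are hypothetical stand-ins for the ``borg-backup:!disabled`` host pattern.

```python
def collect_borg_users(hostvars, backup_hosts, disabled=()):
    """Gather (username, public_key) pairs exported by each backup host."""
    users = []
    for host in backup_hosts:
        if host in disabled:
            # Mirrors the "!disabled" exclusion in the inventory pattern.
            continue
        # Each host must already have run the borg-backup role, which is
        # what sets the borg_user fact looked up here.
        users.append(tuple(hostvars[host]["borg_user"]))
    return users


if __name__ == "__main__":
    hostvars = {
        "a.example.org": {"borg_user": ["borg-a", "ssh-ed25519 KEYA"]},
        "b.example.org": {"borg_user": ["borg-b", "ssh-ed25519 KEYB"]},
    }
    print(collect_borg_users(hostvars, ["a.example.org", "b.example.org"]))
```

The lookup fails with a ``KeyError`` if a host has no ``borg_user`` fact, which is the Python analogue of why the ``borg-backup`` role has to run first.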

@ -0,0 +1,31 @@
# note borg_user is the parent loop variable name; this works on each
# element from the borg_users global.
- name: Set variables
  set_fact:
    user_name: '{{ borg_user[0] }}'
    user_key: '{{ borg_user[1] }}'

- name: Create borg user
  user:
    name: '{{ user_name }}'
    comment: 'Backup user'
    shell: /bin/bash
    home: '/opt/backups/{{ user_name }}'
    create_home: yes
  register: homedir

- name: Create borg user authorized key
  authorized_key:
    user: '{{ user_name }}'
    state: present
    key: '{{ user_key }}'
    key_options: 'command="/opt/borg/bin/borg serve --append-only --restrict-to-path /opt/backups/{{ user_name }}/backup",restrict'

# ansible-lint wants this in a handler, it should be done here and
# now; this isn't like a service restart where multiple things might
# call it.
- name: Initialise borg
  command: /opt/borg/bin/borg init --encryption=none /opt/backups/{{ user_name }}/backup
  become: yes
  become_user: '{{ user_name }}'
  when: homedir.changed

@ -0,0 +1,36 @@
Configure a host to be backed up
This role sets up a host to use ``borg`` for backup to any hosts in the
``borg-backup-server`` group.
A separate ssh key will be generated for root to connect to the backup
server(s) and the host keys of the backup servers will be added to the
host's known hosts.
The ``borg`` tool is installed and a cron job is set up to run the
backup periodically.
Note the ``borg-backup-server`` role must run after this to create the user
correctly on the backup server. This role sets a tuple ``borg_user``
with the username and public key; the ``borg-backup-server`` role uses this
variable for each host in the ``borg-backup`` group to initialise users.
**Role Variables**

.. zuul:rolevar:: borg_username

   The username to connect to the backup server. If this is left
   undefined, it will be automatically set to ``borg-$(hostname)``.

.. zuul:rolevar:: borg_backup_excludes_extra
   :default: []

   A list of extra items to pass as ``--exclude`` arguments to borg.
   Appended to the global default list of excludes set with
   ``borg_backup_excludes``.

.. zuul:rolevar:: borg_backup_dirs_extra
   :default: []

   A list of extra directories to back up. Appended to the global
   default list of directories set with ``borg_backup_dirs``.
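To make the defaults concrete, here is a small sketch of how the username and the effective lists are derived. The function names are hypothetical. The username rule follows the role's ``set_fact`` task, which uses the first label of the inventory hostname.

```python
def default_borg_username(inventory_hostname):
    """Derive the default backup username from the short hostname."""
    return "borg-" + inventory_hostname.split(".", 1)[0]


def effective_list(defaults, extra):
    """Return the global defaults followed by any per-host extras."""
    return list(defaults) + list(extra)


if __name__ == "__main__":
    print(default_borg_username("borg-backup-test01.opendev.org"))
    print(effective_list(["/etc", "/home", "/root", "/var"], ["/opt/data"]))
```

Note that the extras are appended and never replace the defaults, so a host cannot drop a default directory through ``borg_backup_dirs_extra`` alone.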

@ -0,0 +1,13 @@
borg_backup_excludes:
  - '/home/*.cache/*'
  - '/var/cache/*'
  - '/var/tmp/*'
borg_backup_excludes_extra: []
borg_backup_dirs:
  - /etc
  - /home
  - /root
  - /var
borg_backup_dirs_extra: []

@ -0,0 +1,63 @@
- name: Generate borg username for this host
  set_fact:
    borg_username: 'borg-{{ inventory_hostname.split(".", 1)[0] }}'
  when: borg_username is not defined

- debug:
    var: borg_username

- name: Install borg
  include_role:
    name: install-borg

- name: Install backup script
  template:
    src: borg-backup.j2
    dest: /usr/local/bin/borg-backup
    mode: 0755

- name: Generate keypair for backups
  openssh_keypair:
    path: /root/.ssh/id_borg_backup_ed25519
    type: ed25519
  register: borg_keypair

- name: Configure ssh for backup server
  blockinfile:
    path: /root/.ssh/config
    create: true
    block: |
      # {{ item }} backup server
      Host {{ item }}
        HostName {{ item }}
        IdentityFile /root/.ssh/id_borg_backup_ed25519
        User {{ borg_username }}
    mode: 0600
  with_inventory_hostnames: borg-backup-server

- name: Generate borg_user info tuple
  set_fact:
    borg_user: '{{ [ borg_username, borg_keypair["public_key"] ] }}'

- name: Accept hostkey of backup server
  known_hosts:
    state: present
    key: '{{ item }} ssh-ed25519 {{ hostvars[item]["ansible_ssh_host_key_ed25519_public"] }}'
    name: '{{ item }}'
  with_inventory_hostnames: borg-backup-server

- name: Install backup cron job
  cron:
    name: "Run borg backup"
    job: "/usr/local/bin/borg-backup {{ item }} 2>> /var/log/borg-backup-{{ item }}.log"
    user: root
    hour: '5'
    minute: '{{ 59|random(seed=item) }}'
  with_inventory_hostnames: borg-backup-server

# Rotate the same .log file that the cron job above appends to.
- name: Install logrotate rules
  include_role:
    name: logrotate
  vars:
    logrotate_file_name: '/var/log/borg-backup-{{ item }}.log'
  with_inventory_hostnames: borg-backup-server
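The cron task's ``59|random(seed=item)`` picks a minute that is stable from run to run, because the random generator is seeded with the backup server name. The sketch below shows the same idea with the standard library. The function name is hypothetical, and the values are not claimed to match what Ansible itself produces.

```python
import random


def cron_minute(server, upper=59):
    """Pick a repeatable pseudo-random minute in [0, upper) for a server."""
    # Seeding a private generator with the server name means every run
    # of the playbook computes the same minute, so the cron entry does
    # not change between runs.
    return random.Random(server).randrange(upper)


if __name__ == "__main__":
    print(cron_minute("borg-backup01.region.provider.opendev.org"))
```

Since the seed is the server name rather than the client name, every client would compute the same minute for a given server under this scheme.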

@ -0,0 +1,53 @@
#!/bin/bash

# Flags based on
# https://borgbackup.readthedocs.io/en/stable/quickstart.html

if [ -z "$1" ]; then
    echo "Must specify backup host"
    exit 1
fi

BORG="/opt/borg/bin/borg"

# Setting this, so the repo does not need to be given on the commandline:
export BORG_REPO="ssh://{{ borg_username }}@${1}/opt/backups/{{ borg_username }}/backup"

# some helpers and error handling:
info() { printf "\n%s %s\n\n" "$( date )" "$*" >&2; }
trap 'echo $( date ) Backup interrupted >&2; exit 2' INT TERM

info "Starting backup"

# This avoids UI prompts when first accessing the remote repository
export BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK=1

# Backup the most important directories into an archive named after
# the machine this script is currently running on:
${BORG} create \
    --verbose \
    --filter AME \
    --list \
    --stats \
    --show-rc \
    --compression lz4 \
    --exclude-caches \
    {% for item in borg_backup_excludes + borg_backup_excludes_extra -%}
    --exclude '{{ item }}' \
    {% endfor -%}
    \
    ::'{hostname}-{now}' \
    {% for item in borg_backup_dirs + borg_backup_dirs_extra -%}
    {{ item }} {{ '\\' if not loop.last }}
    {% endfor -%}

backup_exit=$?

if [ ${backup_exit} -eq 0 ]; then
    info "Backup finished successfully"
else
    info "Backup finished with errors"
fi

exit ${backup_exit}
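For readers tracing what the template renders to, the following sketch assembles the same ``borg create`` argument list from the exclude and directory lists. The function name is hypothetical and it only builds the argv, it does not run borg. The ``::{hostname}-{now}`` placeholders are expanded by borg itself, with the repository taken from ``BORG_REPO``.

```python
def borg_create_argv(excludes, dirs, borg="/opt/borg/bin/borg"):
    """Build the borg create argv that the backup script template renders."""
    argv = [
        borg, "create",
        "--verbose",
        "--filter", "AME",
        "--list",
        "--stats",
        "--show-rc",
        "--compression", "lz4",
        "--exclude-caches",
    ]
    for pattern in excludes:
        argv += ["--exclude", pattern]
    # Archive name; the repository part before "::" comes from BORG_REPO.
    argv.append("::{hostname}-{now}")
    argv += list(dirs)
    return argv


if __name__ == "__main__":
    print(borg_create_argv(["/var/cache/*", "/var/tmp/*"],
                           ["/etc", "/home", "/root", "/var"]))
```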

@ -0,0 +1,11 @@
Install borg backup tool to /opt/borg
Install borg to a virtualenv; the binary will be available at
``/opt/borg/bin/borg``.
**Role Variables**

.. zuul:rolevar:: borg_version

   The version of ``borg`` to install. This should likely be pinned
   to be the same between server and client.
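One way to check the pin on a host is to compare the output of ``borg --version`` against the expected version. The sketch below assumes the output has the form ``borg <version>``. The function name is hypothetical. It compares the version token exactly, so ``1.1.13`` does not also match ``1.1.130`` as a plain substring test would.

```python
def borg_version_matches(version_output, pinned):
    """Return True if `borg --version` output reports exactly `pinned`."""
    parts = version_output.split()
    return len(parts) >= 2 and parts[0] == "borg" and parts[1] == pinned


if __name__ == "__main__":
    print(borg_version_matches("borg 1.1.13\n", "1.1.13"))
```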

@ -0,0 +1 @@
borg_version: 1.1.13

@ -0,0 +1,24 @@
# We install into a virtualenv here for two reasons; we want a
# specific version pinned between server and client -- borg has had
# updates that required transitions so we don't want to use system
# packages where things might get out of sync. Secondly we want to
# keep as few things as possible to go wrong when running backups.
- name: Install build deps
  package:
    name:
      - python3-dev
      - libssl-dev
      - openssl
      - libacl1-dev
      - libacl1
      - build-essential

- name: Install borg
  pip:
    # borg build deps are a little ... interesting, it needs cython
    # but the requirements don't bring it in.
    name:
      - cython
      - 'borgbackup=={{ borg_version }}'
    virtualenv: /opt/borg
    virtualenv_command: /usr/bin/python3 -m venv

@ -0,0 +1,12 @@
# This needs to happen in order. Backup hosts export their username/key
# combos which are installed onto the backup server
- hosts: "borg-backup:!disabled"
  name: "Base: Generate borg backup users and keys"
  roles:
    - iptables
    - borg-backup

- hosts: "borg-backup-server:!disabled"
  name: "Generate borg configuration"
  roles:
    - iptables
    - borg-backup-server

@ -20,3 +20,10 @@ groups:
backup:
- backup-test01.opendev.org
- backup-test02.opendev.org
borg-backup-server:
- borg-backup01.region.provider.opendev.org
borg-backup:
- borg-backup-test01.opendev.org
- borg-backup-test02.opendev.org

@ -0,0 +1,77 @@
# Copyright 2019 Red Hat, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import os.path
import pytest

testinfra_hosts = ['borg-backup01.region.provider.opendev.org',
                   'borg-backup-test01.opendev.org',
                   'borg-backup-test02.opendev.org']


def test_borg_installed(host):
    f = host.file('/opt/borg/bin/borg')
    assert f.exists

    cmd = host.run('/opt/borg/bin/borg --version')
    assert cmd.succeeded
    # NOTE(ianw): deliberately pinned; we want to be careful if we
    # update that the new version is compatible with old repos.
    assert '1.1.13' in cmd.stdout


def test_borg_server_users(host):
    hostname = host.backend.get_hostname()
    if hostname.startswith('borg-backup-test'):
        pytest.skip()

    for username in 'borg-borg-backup-test01', 'borg-borg-backup-test02':
        homedir = os.path.join('/opt/backups/', username)
        borg_repo = os.path.join(homedir, 'backup')
        authorized_keys = os.path.join(homedir, '.ssh', 'authorized_keys')

        user = host.user(username)
        assert user.exists
        assert user.home == homedir

        f = host.file(authorized_keys)
        assert f.exists
        assert f.contains("ssh-ed25519")

        f = host.file(borg_repo)
        assert f.exists


def test_borg_backup_host_config(host):
    hostname = host.backend.get_hostname()
    if hostname == 'borg-backup01.region.provider.opendev.org':
        pytest.skip()

    f = host.file('/usr/local/bin/borg-backup')
    assert f.exists

    f = host.file('/root/.ssh/id_borg_backup_ed25519')
    assert f.exists

    f = host.file('/root/.ssh/config')
    assert f.exists
    assert f.contains('Host borg-backup01.region.provider.opendev.org')


def test_borg_backup(host):
    hostname = host.backend.get_hostname()
    if hostname == 'borg-backup01.region.provider.opendev.org':
        pytest.skip()

    cmd = host.run(
        '/usr/local/bin/borg-backup borg-backup01.region.provider.opendev.org 2>> '
        '/var/log/borg-backup-borg-backup01.region.provider.opendev.org.log')
    assert cmd.succeeded

@ -287,6 +287,19 @@
- playbooks/roles/backup-server/
- playbooks/roles/iptables/
- job:
name: infra-prod-service-borg-backup
parent: infra-prod-service-base
description: Run service-borg-backup.yaml playbook.
vars:
playbook_name: service-borg-backup.yaml
files:
- inventory/
- playbooks/service-borg-backup.yaml
- playbooks/roles/borg-backup/
- playbooks/roles/borg-backup-server/
- playbooks/roles/iptables/
- job:
name: infra-prod-service-registry
parent: infra-prod-service-base

@ -13,6 +13,7 @@
- system-config-run-base-ansible-devel:
voting: false
- system-config-run-backup
- system-config-run-borg-backup
- system-config-run-dns
- system-config-run-eavesdrop:
dependencies:
@ -235,6 +236,7 @@
- infra-prod-service-mirror
- infra-prod-service-static
- infra-prod-service-backup
- infra-prod-service-borg-backup
- infra-prod-service-registry
- infra-prod-service-zookeeper
- infra-prod-service-zuul
@ -276,6 +278,7 @@
- infra-prod-service-mirror-update
- infra-prod-service-mirror
- infra-prod-service-static
- infra-prod-service-borg-backup
- infra-prod-service-backup
- infra-prod-service-zookeeper
- infra-prod-service-review

@ -356,6 +356,38 @@
- playbooks/zuul/templates/host_vars/backup
- testinfra/test_backups.py
- job:
name: system-config-run-borg-backup
parent: system-config-run
description: |
Run the playbook for borg backup configuration
nodeset:
nodes:
- name: bridge.openstack.org
label: ubuntu-bionic
- name: borg-backup01.region.provider.opendev.org
label: ubuntu-focal
- name: borg-backup-test01.opendev.org
label: ubuntu-focal
- name: borg-backup-test02.opendev.org
label: ubuntu-bionic
vars:
run_playbooks:
- playbooks/service-borg-backup.yaml
files:
- playbooks/install-ansible.yaml
- playbooks/roles/borg-backup
- playbooks/zuul/templates/host_vars/borg-backup
- testinfra/test_borg_backups.py
host-vars:
borg-backup-test01.opendev.org:
host_copy_output:
'/var/log/borg-backup-borg-backup01.region.provider.opendev.org.log': logs
borg-backup-test02.opendev.org:
host_copy_output:
'/var/log/borg-backup-borg-backup01.region.provider.opendev.org.log': logs
- job:
name: system-config-run-mirror-base
parent: system-config-run