doc: update backup instructions

Update the backup instructions for some recent changes. Make a note of
the streaming backup method, discuss some caveats with append-only mode,
and discuss the pruning scripts and when to run them (cf.
I9559bb8aeeef06b95fb9e172a2c5bfb5be5b480e,
I250d84c4a9f707e63fef6f70cfdcc1fb7807d3a7).

Change-Id: Idb04ebfa5666cd3c20bc0132683d187e705da3f1
parent 62801d8a93
commit 116a2ca4a4
@@ -240,6 +240,31 @@ individual host to be backed up. The host to be backed up initiates
 the backup process to the remote backup server(s) using a separate ssh
 key setup just for backup communication (see ``/root/.ssh/config``).

+Setting up hosts for backup
+---------------------------
+
+To set up a host for backup, put it in the ``borg-backup`` group.
+
+Hosts can specify ``borg_backup_excludes_extra`` and
+``borg_backup_dirs_extra`` to exclude or include specific directories
+as required (see role documentation for more details).
+
+``borg`` splits backup data into chunks and de-duplicates as much as
+possible. For backing up large items, particularly things like
+database dumps, we want to give ``borg`` as much chance to
+de-duplicate as possible. Approaches such as dumping to compressed
+files on disk defeat de-duplication because all the data changes for
+each dump.
+
+For dumping large data, hosts should put a file into
+``/etc/borg-streams`` that performs the dump in an uncompressed manner
+to stdout. The backup scripts will create a separate archive for each
+stream defined here. For more details, see the ``backup`` role
+documentation. These streams should attempt to be as friendly to
+de-duplication as possible; see some of the examples of ``mysqldump``
+to find arguments that help keep the output data more stable (and
+hence more easily de-duplicated).
+
 Restore from Backup
 -------------------

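As a rough illustration of the stream files described in this hunk, the sketch below shows what a file under ``/etc/borg-streams`` might contain. The file name, the exact format the ``backup`` role expects, and the choice of ``mysqldump`` arguments are all assumptions made for illustration; the role documentation remains the authority. The point is only that the dump goes to stdout uncompressed, with options that keep the output stable between runs.

```shell
#!/bin/bash
# Hypothetical stream file, e.g. /etc/borg-streams/mysql (name is illustrative).
# It writes an uncompressed dump to stdout for the backup script to read.

# Arguments are an assumption, chosen to keep the output stable between runs:
#   --single-transaction    consistent snapshot without locking InnoDB tables
#   --skip-extended-insert  one INSERT per row, so a changed row changes one line
#   --skip-dump-date        omit the completion timestamp comment from the dump
DUMP_ARGS=(--all-databases --single-transaction --skip-extended-insert --skip-dump-date)

# Print the command that would be run; handy for checking the arguments.
dump_command() {
    echo "mysqldump ${DUMP_ARGS[*]}"
}

# Run the dump itself. Note there is no gzip or other compression in the pipe.
run_dump() {
    mysqldump "${DUMP_ARGS[@]}"
}
```

A real stream file would end by calling ``run_dump`` (or simply running the ``mysqldump`` command directly); ``dump_command`` is only there so the arguments can be inspected without touching the database.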
@@ -255,109 +280,32 @@ time is to
 * sudo ``su -`` to switch to the backup user for the host to be restored
 * you will now be in the home directory of that user
 * run ``/opt/borg/bin/borg list ./backup`` to list the archives available
-* these should look like ``hostname-YYYY-MM-DDTHH:MM:SS``
+* these should look like ``<hostname>-<stream>-YYYY-MM-DDTHH:MM:SS``
 * move to working directory
 * extract one of the appropriate archives with ``/opt/borg/bin/borg extract ~/backup <archive-tag>``

-Rotating backup storage
+Managing backup storage
 -----------------------

-We run ``borg`` in append-only mode, so that clients can not remove
-old backups on the server.
+We run ``borg`` in append-only mode. This means clients cannot
+remove old backups on the server.

-TODO(ianw) : Write instructions on how to prune server side. We
-should monitor growth to see if automatic pruning would be
-appropriate, or periodic manual pruning, or something similar to this
-existing system where we keep a historic archive and start fresh.
-
-The backup server keeps an active volume and the previously rotated
-volume. Each consists of 3 x 1TiB volumes grouped with LVM. The
-volumes are mounted at ``/opt/backups-YYYYMM`` for the date it was
-created; ``/opt/backups`` is a symlink to the latest volume.
-Periodically we rotate the active volume for a fresh one. Follow this
-procedure:
-
-#. Create the new volumes via API (on ``bridge.o.o``). Create 3
-   volumes, named for the server with the year and date added::
-
-     DATE=$(date +%Y%m)
-     OS_VOLUME_API_VERSION=1
-     OS_CMD="./env/bin/openstack --os-cloud openstackci-rax --os-region=ORD"
-     SERVER="backup01.ord.rax.ci.openstack.org"
-     ${OS_CMD} volume create --size 1024 ${SERVER}/main01-${DATE}
-     ${OS_CMD} volume create --size 1024 ${SERVER}/main02-${DATE}
-     ${OS_CMD} volume create --size 1024 ${SERVER}/main03-${DATE}
-
-#. Attach the volumes to the backup server::
-
-     ${OS_CMD} server add volume ${SERVER} ${SERVER}/main01-${DATE}
-     ${OS_CMD} server add volume ${SERVER} ${SERVER}/main02-${DATE}
-     ${OS_CMD} server add volume ${SERVER} ${SERVER}/main03-${DATE}
-
-#. Now on the backup server, create the new backup LVM volume (get the
-   device names from ``dmesg`` when they were attached). For
-   simplicity we create a new volume group for each backup series, and
-   a single logical volume on top::
-
-     DATE=$(date +%Y%m)
-     pvcreate /dev/xvd<DRIVE1> /dev/xvd<DRIVE2> /dev/xvd<DRIVE3>
-     vgcreate main-${DATE} /dev/xvd<DRIVE1> /dev/xvd<DRIVE2> /dev/xvd<DRIVE3>
-     lvcreate -l 100%FREE -n backups-${DATE} main-${DATE}
-
-     mkfs.ext4 -m 0 -j -L "backups-${DATE}" /dev/main-${DATE}/backups-${DATE}
-     tune2fs -i 0 -c 0 /dev/main-${DATE}/backups-${DATE}
-
-     mkdir /opt/backups-${DATE}
-     # manually add mount details to /etc/fstab
-     mount /opt/backups-${DATE}
-
-#. Making sure there are no backups currently running you can now
-   begin to switch the backups (you can stop the ssh service, but be
-   careful not to then drop your connection and lock yourself out; you
-   can always reboot via the API if you do). Firstly, edit
-   ``/etc/fstab`` and make the current (soon to be *old*) backup
-   volume mount read-only. Unmount the old volume and then remount it
-   (now as read-only). This should prevent any accidental removal of
-   the existing backups during the following procedures.
-
-#. Pre-seed the new backup directory (same terminal as above). This
-   will copy all the directories and authentication details (but none
-   of the actual backups) and initialise for fresh backups::
-
-     cd /opt/backups-${DATE}
-     rsync -avz --exclude '.bup' /opt/backups/ .
-     for dir in bup-*; do su $dir -c "BUP_DIR=/opt/backups-${DATE}/$dir/.bup bup init"; done
-
-#. The ``/opt/backups`` symlink can now be switched to the new
-   volume::
-
-     ln -sf /opt/backups-${DATE} /opt/backups
-
-#. ssh can be re-enabled and the new backup volume is effectively
-   active.
-
-#. Now run a test backup from a server manually. Choose one, get the
-   backup command from cron and run it manually in a screen (it might
-   take a while), ensuring everything seems to be writing correctly to
-   the new volume.
-
-#. You can now clean up the oldest backups (the one *before* the one
-   you just rotated). Remove the mount from fstab, unmount the volume
-   and cleanup the LVM components::
-
-     DATE=<INSERT OLD DATE CODE HERE>
-     umount /opt/backups-${DATE}
-     lvremove /dev/main-${DATE}/backups-${DATE}
-     vgremove main-${DATE}
-     # pvremove the volumes; they will have PFree @ 1024.00g as
-     # they are now not assigned to anything
-     pvremove /dev/xvd<DRIVE1>
-     pvremove /dev/xvd<DRIVE2>
-     pvremove /dev/xvd<DRIVE3>
-
-#. Remove volumes via API (opposite of adding above with ``server
-   remove volume`` then ``volume delete``).
-
-#. Done! Come back and rotate it again next year.
+However, due to the way borg works, append-only mode plays all client
+transactions into a transaction log until a read-write operation
+occurs. When examined, the repository will appear to have all these
+transactions applied (e.g. pruned archives will not appear, even if
+they have not actually been pruned from disk). If you have reason
+not to trust the state of the backup, you should *not* run any read-write
+operations. You will need to manually examine the transaction log and
+roll back to a known good state; see
+`<https://borgbackup.readthedocs.io/en/stable/usage/notes.html#append-only-mode>`__.
+
+We have limited backup space, however. Each backup server has a
+script ``/usr/local/bin/prune-borg-backups`` which can be run to
+reclaim space. This will keep the last 7 days of backups, then
+monthly backups for 1 year and yearly backups for each archive. The
+backup servers will send a warning when backup volume usage is high,
+at which point the script can be run manually.

 .. _force-merging-a-change:

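To make the retention policy in this hunk concrete, here is a minimal sketch of how it might map onto ``borg prune``. It is not the contents of ``/usr/local/bin/prune-borg-backups``: the helper names are hypothetical, and reading "yearly backups" as "keep every yearly archive" (``--keep-yearly -1``) is an assumption. Because archives are named ``<hostname>-<stream>-YYYY-MM-DDTHH:MM:SS``, each host/stream series has to be pruned separately, so the sketch derives one prefix per series. It only prints commands, and those carry ``--dry-run``, since a real prune is a read-write operation and the append-only caveat applies.

```shell
#!/bin/bash
# Sketch: print one "borg prune" command per archive series.
# Helper names and the retention flags are illustrative assumptions.

# Strip the trailing YYYY-MM-DDTHH:MM:SS timestamp from an archive name,
# leaving the "<hostname>-<stream>-" prefix.
archive_prefix() {
    printf '%s\n' "$1" |
        sed -E 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}$//'
}

# Read archive names on stdin and print a dry-run prune command for each
# unique prefix. $1 is the repository path.
prune_commands() {
    local repo="$1" name prefix
    while read -r name; do
        [ -n "$name" ] || continue   # skip blank lines
        archive_prefix "$name"
    done | sort -u | while read -r prefix; do
        echo "/opt/borg/bin/borg prune --dry-run --list" \
            "--glob-archives '${prefix}*'" \
            "--keep-daily 7 --keep-monthly 12 --keep-yearly -1 ${repo}"
    done
}
```

As the backup user for a host, something like ``/opt/borg/bin/borg list --short ./backup | prune_commands ./backup`` would print the commands for review before anything is run.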