6 Commits

Author SHA1 Message Date
Ian Wienand
e5a2354451 borg-backup-server: fix verification run
&>> is a bashism and is not supported by sh, which is the shell cron
runs jobs under.  Use >> instead.

Change-Id: I8e67f466887070fb1dedc403c53227c3ce1b2f1d
2021-03-17 15:09:57 +11:00
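
For illustration only, the redirection issue in the commit above can be reproduced outside cron. This is a minimal sketch, not the deployed cron entry: the helper name `run_and_append` is hypothetical. It runs a command under /bin/sh and appends both stdout and stderr to a log using the POSIX form `>> file 2>&1`. Note that the commit itself switched to plain `>>`, which appends stdout only.

```python
import shlex
import subprocess


def run_and_append(command: str, log_path: str) -> int:
    """Run `command` under /bin/sh, appending stdout and stderr to log_path.

    `cmd >> log 2>&1` is the portable spelling of bash's `cmd &>> log`.
    The braces group the command so the redirection covers all of it.
    """
    full = f"{{ {command}; }} >> {shlex.quote(log_path)} 2>&1"
    # shell=True uses /bin/sh on POSIX systems, the same shell cron uses.
    return subprocess.run(full, shell=True).returncode
```

Calling `run_and_append("echo out; echo err >&2", "/tmp/verify.log")` twice leaves both runs in the log, since `>>` appends rather than truncates.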
Ian Wienand
ece90fb7f7 borg-backup-server: make sure to append verification logs
We don't want to overwrite the log on every run, but rather append to
it.

Change-Id: I304caedecbf6a9552f314636ca82a543ef16a8b6
2021-02-15 14:45:03 +11:00
Ian Wienand
0d01d941b1 borg-backup-server: run a weekly backup verification
This checks the backup archives and alerts us if anything seems wrong.
This will take a few hours, so we run it once a week.

Change-Id: I832c0d29a37df94d4bf2704c59bb3f8d855c3cc8
2021-02-11 00:43:16 +00:00
Ian Wienand
62801d8a93 borg-backup-server: volume space monitor
Due to backups running in append-only mode, we do not have a way to
safely prune backups automatically.  To reduce the likelihood that we
forget about backups and end up with failing jobs, add a cron job to
send an email to infra-root if the backup partition goes over 90%
usage.  At this point a manual prune should be run
(I9559bb8aeeef06b95fb9e172a2c5bfb5be5b480e).

Change-Id: I250d84c4a9f707e63fef6f70cfdcc1fb7807d3a7
2021-02-09 11:31:02 +11:00
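
The threshold check described in the commit above could be sketched roughly as follows. This is an assumption-laden illustration, not the script that was deployed: the function names are hypothetical, and sending the message (for example via cron's mail handling) is left to the caller. Only the 90% figure comes from the commit message.

```python
import shutil
from typing import Optional

THRESHOLD_PERCENT = 90.0  # figure taken from the commit message


def warning_for(path: str, used: int, total: int,
                threshold: float = THRESHOLD_PERCENT) -> Optional[str]:
    """Return a warning string if usage exceeds the threshold, else None."""
    percent = 100.0 * used / total
    if percent > threshold:
        return (f"Backup volume {path} is at {percent:.0f}% usage; "
                f"a manual prune should be run.")
    return None


def check_volume(path: str,
                 threshold: float = THRESHOLD_PERCENT) -> Optional[str]:
    """Check the filesystem containing `path`.

    Uses used/total, which can differ slightly from the percentage
    `df` prints because df excludes root-reserved blocks.
    """
    usage = shutil.disk_usage(path)
    return warning_for(path, usage.used, usage.total, threshold)
```

A cron wrapper would only need to print the returned string when it is not `None`, so that a quiet run produces no mail.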
Ian Wienand
4f0bfa6d9d borg-backup-server: add script for pruning borg backups
This adds a script that performs a manual pruning of backup
directories.

Change-Id: I9559bb8aeeef06b95fb9e172a2c5bfb5be5b480e
2021-02-09 11:29:46 +11:00
Ian Wienand
028d655375 Add borg-backup roles
This adds roles to implement backup with borg [1].

Our current tool "bup" has no Python 3 support and is not packaged for
Ubuntu Focal.  This means it is effectively end-of-life.  borg fits
our model of servers backing themselves up to a central location, is
well documented and seems well supported.  It also has the clarkb seal
of approval :)

As mentioned, borg works in the same manner as bup by doing an
efficient backup over ssh to a remote server.  The core of these
roles is the same as the bup-based ones in terms of creating a
separate user for each host and deploying keys and ssh config.

This chooses to install borg in a virtualenv on /opt.  This was chosen
for a number of reasons.  Firstly, reading the history of borg, there
have been incompatible updates (although they provide a tool to update
repository formats), so it seems important that we both pin the
version we are using and keep clients and server in sync.  Since we
have a heterogeneous distribution collection we don't want to rely on
the packaged tools, which may differ.  I don't feel like this is a
great application for a container; we actually don't want it that
isolated from the base system because its goal is to read the base
system and copy it offsite with as little chance of things going wrong
as possible.

Borg has a lot of support for encrypting the data at rest in various
ways.  However, that introduces the possibility we could lose both the
key and the backup data.  Really the only thing stopping this is key
management, and if we want to go down this path we can do it as a
follow-on.

The remote end server is configured via ssh command rules to run in
append-only mode.  This means a misbehaving client can't delete its
old backups.  In theory we can prune backups on the server side --
something we could not do with bup.  The documentation has been
updated but is vague on this part; I think we should get some hosts in
operation, see how the de-duplication is working out and then decide
how we want to manage things long term.

Testing is added; a focal and bionic host both run a full backup of
themselves to the backup server.  Pretty cool, the logs are in
/var/log/borg-backup-<host>.log.

No hosts are currently in the borg groups, so this can be applied
without affecting production.  I'd suggest the next steps are to bring
up a borg-based backup server and put a few hosts into this.  After
running for a while, we can add all hosts, and then deprecate the
current bup-based backup server in vexxhost and replace that with a
borg-based one; giving us dual offsite backups.

[1] https://borgbackup.readthedocs.io/en/stable/

Change-Id: I2a125f2fac11d8e3a3279eb7fa7adb33a3acaa4e
2020-07-21 17:36:50 +10:00
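
As a rough illustration of the "ssh command rules" mentioned in the commit above, a server-side `authorized_keys` entry that forces append-only mode might be rendered as below. This is a sketch under assumptions, not the role's actual template: the function name, the `borg_bin` default, and the example paths are hypothetical. The `borg serve` options `--append-only` and `--restrict-to-path` and the OpenSSH `command=` and `restrict` key options do exist.

```python
def append_only_key_line(repo_path: str, public_key: str,
                         borg_bin: str = "borg") -> str:
    """Build an authorized_keys line that pins a client to append-only mode.

    borg_bin is a placeholder; a virtualenv install would need the full
    path to the borg executable inside that virtualenv.
    """
    for value in (repo_path, borg_bin):
        if '"' in value:
            raise ValueError("double quotes are not allowed in paths")
    # The forced command overrides whatever the client asks sshd to run.
    command = f"{borg_bin} serve --append-only --restrict-to-path {repo_path}"
    # "restrict" disables port, agent and X11 forwarding and PTY allocation.
    return f'command="{command}",restrict {public_key.strip()}'
```

With an entry like this, whatever command the client requests over ssh is replaced by the forced `borg serve` invocation, so a misbehaving client cannot delete earlier archives through that key.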