system-config/doc/source/reprepro.rst
Ian Wienand 3620dd24ef Add reprepo documentation
Add some documentation on reprepo, including some recovery techniques
gleaned from the battlefield.

Change-Id: I3368dedc1b9a769a1c8f5e8fe831d059ff23875b
2017-10-18 11:01:31 +11:00

Reprepro

Debian package mirroring tool

At a Glance

Hosts
Puppet
  • modules/openstack_project/manifests/mirror_update.pp
Projects
Documentation
Bugs

Overview

reprepro is the tool we use to mirror Debian repositories (including Ubuntu) to the AFS mirrors.

Repository signing

Note that our repositories are not signed. Clients must disable signature verification to use the OpenStack mirrors, for example with debootstrap's --no-check-gpg option or the [trusted=yes] option on the apt source entry.

Normal operation

Repository syncs are driven from cron on the mirror-update.openstack.org host using the /usr/local/bin/reprepro-mirror-update script. Each run updates the repository, removes old references and performs the vos release.

Advanced Recovery Techniques

For a small repository, corruption is probably best handled by removing the entire repository and re-syncing. This is undesirable for larger repositories, however.

Note

Be careful with vos release, which is done as part of /usr/local/bin/reprepro-mirror-update, to avoid inadvertently releasing in-progress work. Also be aware that the commands in that script run under timeout by default, which you may not want during recovery.

Corrupt reprepro databases will halt mirroring, often with obscure symptoms. For example, this has been seen in production with reprepro hung in a silent infinite loop. In that case, strace revealed that the last operation was on a file descriptor for a .db file, which gave a clue that the databases were corrupt. Other failure modes are possible, of course.
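The same clue can be gathered without attaching a tracer by looking at which files the hung process holds open. A minimal sketch, assuming a Linux host with /proc mounted; open_db_files is a hypothetical helper name, not part of reprepro or the mirror tooling:

```shell
# Print the .db files a process currently holds open, one per line.
# Usage: open_db_files PID
open_db_files() {
    local pid="$1" fd target
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file; skip any that vanish
        # or cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *.db) printf '%s\n' "$target" ;;
        esac
    done
}
```

For example, open_db_files "$(pgrep -o reprepro)" lists the database files in use by the oldest running reprepro process; strace -p on the same PID then shows whether it is still making system calls.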

The following assumes you have a root shell with the correct AFS permissions for the mirror volumes. To get one, drop into something like:

k5start -t -f /etc/reprepro.keytab service/reprepro -- bash

We will use the Ubuntu repository as an example below.

In a crisis, you want to stop the cron job that updates the repo from running. You can either edit it out with crontab -e and put the host in the emergency file (so puppet doesn't replace it) or, in a pinch, take the lock in an infinite loop like:

flock -n /var/run/reprepro/ubuntu.lock bash -c "while true; do sleep 1000; done"
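Because flock -n gives up immediately if cron already holds the lock, it is worth confirming from a second shell that the lock really is taken before starting recovery. A small sketch; lock_is_held is a hypothetical helper, and it relies only on flock -n returning non-zero when the lock cannot be acquired:

```shell
# Return 0 if another process holds an exclusive lock on the file,
# 1 if the lock is free. When the lock is free it is taken only for the
# duration of "true", so the check itself does not block updates.
# Caveat: any other flock failure (e.g. permissions) also reports "held".
lock_is_held() {
    if flock -n "$1" true; then
        return 1    # we obtained the lock, so nobody else had it
    fi
    return 0
}
```

For example, lock_is_held /var/run/reprepro/ubuntu.lock && echo "updates are blocked".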

First, check dmesg for AFS-related errors. It is quite likely that any corruption was caused by issues at this layer, so ensure stability here before continuing with further recovery.

The databases are in the db directory in the mirror:

# ls /afs/.openstack.org/mirror/ubuntu/db
checksums.db  contents.cache.db  packages.db  references.db  release.caches.db  version

It is best to make backup copies before any recovery operations. Although AFS should keep up, do any recovery of the db directory on a local copy so that intermittent AFS issues cannot corrupt the database further, then copy the updated files back when complete:

# cp -r /afs/.openstack.org/mirror/ubuntu/db ~/db
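Since the local copy exists to sidestep AFS hiccups, it may be worth verifying it before working on it. A minimal sketch using diff -r; copy_and_verify is a hypothetical helper, not part of the mirror tooling:

```shell
# Copy a database directory and confirm every file matches the source.
# Usage: copy_and_verify SRC_DIR DEST_DIR   (DEST_DIR must not exist yet)
copy_and_verify() {
    local src="$1" dest="$2"
    cp -r "$src" "$dest" || return 1
    # diff -r exits non-zero if any file differs or is missing.
    diff -r "$src" "$dest" >/dev/null
}
```

For example, copy_and_verify /afs/.openstack.org/mirror/ubuntu/db ~/db && echo "copy verified". If the destination already exists, cp places the copy inside it and the comparison fails, which is the safer outcome.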

For convenience, set up the common reprepro options for verbose logging, the configuration directory and the local database:

# export REPREPRO="reprepro -VVV --confdir /etc/reprepro/ubuntu --dbdir $HOME/db"

From the upstream recovery document, the references.db can be removed and recreated quickly with:

$REPREPRO rereference

The checksums.db can also be recreated. You can rebuild with:

cd /afs/.openstack.org/mirror/ubuntu
find pool -type f -printf "pool/%P\n" > /tmp/file-list
$REPREPRO -b . _detect < /tmp/file-list
  • This will take several hours (~6 hours in 2017) as it touches all the repo files.
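Since _detect runs for hours, a quick sanity check of the file list beforehand can save a wasted run. The sketch below assumes the list is meant to be relative to the repository base directory, so every entry should begin with pool/ and name an existing file; check_file_list is a hypothetical helper:

```shell
# Verify that every line of a file list is a pool/ path that exists
# under the given base directory. Prints offending lines and returns 1
# if any are found, 0 otherwise.
# Usage: check_file_list BASEDIR LIST
check_file_list() {
    local basedir="$1" list="$2" bad=0 key
    while IFS= read -r key; do
        case "$key" in
            pool/*) [ -f "$basedir/$key" ] || { echo "missing: $key"; bad=1; } ;;
            *)      echo "not under pool/: $key"; bad=1 ;;
        esac
    done < "$list"
    return "$bad"
}
```

For example, check_file_list /afs/.openstack.org/mirror/ubuntu /tmp/file-list. This only checks that each file exists, so it is far cheaper than the checksum pass itself.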

Note that if the .deb files on disk are corrupt, updates may fail with errors about checksums that do not match those stored in the database. You will likely want to remove these files both from disk and from the checksums database, with commands similar to:

$REPREPRO _forget pool/main/p/package/the_package_1.2.3.deb
rm pool/main/p/package/the_package_1.2.3.deb
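When several packages are affected, the two commands above can be wrapped in a loop. A sketch, assuming $REPREPRO is set as described earlier and that the current directory is the repository base; the helper name forget_and_remove and the list format (one pool/ path per line) are assumptions for illustration:

```shell
# For each path listed in FILE, drop it from checksums.db and delete it.
# A file is only removed from disk if reprepro accepted the _forget.
# Usage: forget_and_remove FILE
forget_and_remove() {
    local key
    while IFS= read -r key; do
        [ -n "$key" ] || continue          # skip blank lines
        # $REPREPRO is intentionally unquoted so its options split into
        # words; stdin is redirected so it cannot consume the list.
        if $REPREPRO _forget "$key" < /dev/null; then
            rm -f -- "$key"
        else
            echo "failed to forget $key, leaving it on disk" >&2
        fi
    done < "$1"
}
```

For example, after saving the paths reported by a failed update to /tmp/corrupt-list, run forget_and_remove /tmp/corrupt-list from /afs/.openstack.org/mirror/ubuntu.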

They should come back with the next update.

In some situations where things are very out of sync, it may be easier to remove and replace an entire section of the repository. For example, if during updates files within xenial-security are seen to be corrupt, you can remove xenial-security from /etc/reprepro/ubuntu/distributions and run the following:

# remove old
$REPREPRO --delete clearvanished
# run an update
$REPREPRO update

You can then re-add the entries and run another update, which should resync everything from fresh.
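Because clearvanished deletes everything belonging to distributions that are no longer defined, it can help to confirm exactly which codenames the configuration still lists before running it, and again after re-adding the entries. A minimal sketch, assuming the conventional "Codename:" field capitalisation in the distributions file; has_codename is a hypothetical helper:

```shell
# Return 0 if the distributions file defines the given codename.
# The whole line must match, so "xenial" does not match "xenial-security".
# Usage: has_codename DISTRIBUTIONS_FILE CODENAME
has_codename() {
    grep -qx "Codename:[[:space:]]*$2[[:space:]]*" "$1"
}
```

For example, has_codename /etc/reprepro/ubuntu/distributions xenial-security should fail before the clearvanished step and succeed once the stanza has been restored. Keeping a copy of the original distributions file before editing makes the restore trivial.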

Remember to put the databases back in place:

# mv /afs/.openstack.org/mirror/ubuntu/db /afs/.openstack.org/mirror/ubuntu/db.old
# cp -r ~/db /afs/.openstack.org/mirror/ubuntu/
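One pitfall with the commands above: if a db.old directory is left over from a previous recovery, mv will move db inside it rather than rename it. A sketch that avoids this by using a timestamped name; install_db is a hypothetical helper:

```shell
# Replace MIRROR/db with a copy of NEW_DB, keeping the previous database
# under a timestamped name so repeated recoveries do not collide.
# Usage: install_db NEW_DB MIRROR
install_db() {
    local new="$1" mirror="$2" stamp
    stamp=$(date +%Y%m%d-%H%M%S)
    mv "$mirror/db" "$mirror/db.old-$stamp" || return 1
    # MIRROR/db no longer exists, so cp creates it as a copy of NEW_DB.
    cp -r "$new" "$mirror/db"
}
```

For example, install_db ~/db /afs/.openstack.org/mirror/ubuntu. The old copies can be removed once the mirror is confirmed healthy.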

To stage a recovery prior to release, you can modify the mirror_root argument in openstack_project::mirror puppet to point to the RW mirror /afs/.openstack.org/mirror (rather than the released RO /afs/openstack.org/mirror). This way you can switch back quickly if things don't work.
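The RW and RO paths differ only in the leading dot on the cell name, which makes it easy to compare the staged state against what is currently released. A small sketch; ro_path is a hypothetical helper that only rewrites the path string:

```shell
# Map a read-write AFS path to its released read-only counterpart by
# dropping the leading dot from the cell name. Other paths pass through
# unchanged.
# Usage: ro_path /afs/.openstack.org/mirror/...
ro_path() {
    printf '%s\n' "$1" | sed 's#^/afs/\.openstack\.org/#/afs/openstack.org/#'
}
```

For example, with rw=/afs/.openstack.org/mirror/ubuntu/dists/xenial/Release, running cmp "$rw" "$(ro_path "$rw")" shows whether that index has changed since the last vos release.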

You can then either manually run vos release, or restore cron and let the next reprepro-mirror-update run do it.