960 Commits

Author SHA1 Message Date
Ian Wienand
028d655375 Add borg-backup roles
This adds roles to implement backup with borg [1].

Our current tool "bup" has no Python 3 support and is not packaged for
Ubuntu Focal.  This means it is effectively end-of-life.  borg fits
our model of servers backing themselves up to a central location, is
well documented and seems well supported.  It also has the clarkb seal
of approval :)

As mentioned, borg works in the same manner as bup by doing an
efficient back up over ssh to a remote server.  The core of these
roles is the same as the bup-based ones, in terms of creating a
separate user for each host and deploying keys and ssh config.

This chooses to install borg in a virtualenv on /opt.  This was chosen
for a number of reasons; firstly reading the history of borg there
have been incompatible updates (although they provide a tool to update
repository formats); it seems important that we both pin the version
we are using and keep clients and server in sync.  Since we have a
heterogeneous distribution collection we don't want to rely on the
packaged tools which may differ.  I don't feel like this is a great
application for a container; we actually don't want it that isolated
from the base system because its goal is to read and copy it offsite
with as little chance of things going wrong as possible.

Borg has a lot of support for encrypting the data at rest in various
ways.  However, that introduces the possibility we could lose both the
key and the backup data.  Really the only thing stopping this is key
management, and if we want to go down this path we can do it as a
follow-on.

The remote end server is configured via ssh command rules to run in
append-only mode.  This means a misbehaving client can't delete its
old backups.  In theory we can prune backups on the server side --
something we could not do with bup.  The documentation has been
updated but is vague on this part; I think we should get some hosts in
operation, see how the de-duplication is working out and then decide
how we want to manage things long term.
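
As a rough illustration of the ssh command rule described above, the
following Python sketch builds an authorized_keys entry that forces the
client into "borg serve --append-only".  The function name, repository
path and key below are hypothetical placeholders; the roles themselves
may template this differently.

```python
def borg_authorized_keys_line(public_key, repo_path):
    """Build an authorized_keys entry that forces append-only borg access.

    Assumption: the forced command is borg's "serve" subcommand with
    --append-only and --restrict-to-path; repo_path is a made-up example.
    """
    forced = "borg serve --append-only --restrict-to-path {}".format(repo_path)
    # "restrict" turns off forwarding and pty allocation for this key.
    return 'command="{}",restrict {}'.format(forced, public_key)


if __name__ == "__main__":
    # Hypothetical key and path, for illustration only.
    print(borg_authorized_keys_line(
        "ssh-ed25519 AAAA... borg-client",
        "/opt/backups/example-host",
    ))
```

With a forced command like this, whatever the client asks sshd to run
is ignored, so a misbehaving client cannot request a non-append-only
session.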

Testing is added; a focal and bionic host both run a full backup of
themselves to the backup server.  Pretty cool, the logs are in
/var/log/borg-backup-<host>.log.

No hosts are currently in the borg groups, so this can be applied
without affecting production.  I'd suggest the next steps are to bring
up a borg-based backup server and put a few hosts into this.  After
running for a while, we can add all hosts, and then deprecate the
current bup-based backup server in vexxhost and replace that with a
borg-based one; giving us dual offsite backups.

[1] https://borgbackup.readthedocs.io/en/stable/

Change-Id: I2a125f2fac11d8e3a3279eb7fa7adb33a3acaa4e
2020-07-21 17:36:50 +10:00
Ian Wienand
b146181174 Grafana container deployment
This uses the Grafana container created with
Iddfafe852166fe95b3e433420e2e2a4a6380fc64 to run the
grafana.opendev.org service.

We retain the old model of an Apache reverse-proxy; it's well tested
and understood, it's much easier than trying to map all the SSL
termination/renewal/etc. into the Grafana container and we don't have
to convince ourselves the container is safe to be directly web-facing.

Otherwise this is a fairly straightforward deployment of the
container.  As before, it uses the graph configuration kept in
project-config which is loaded in with grafyaml, which is included in
the container.

One nice advantage is that it makes it quite easy to develop graphs
locally, using the container which can talk to the public graphite
instance.  The documentation has been updated with a reference on how
to do this.

Change-Id: I0cc76d29b6911aecfebc71e5fdfe7cf4fcd071a4
2020-07-03 07:17:22 +10:00
Zuul
c3f5a87a5e Merge "Update refstack reference after rename" 2020-06-19 16:15:34 +00:00
Ian Wienand
ceb711e6d9 Swap mirror-update01 for mirror-update02
This is a new Focal based host, which we want for its more recent
rsync which hopefully causes fewer issues resyncing things to AFS
volumes.

See 4918594aa472010a8a112f5f4ed0a471a3351a91 for discussion of the
original issues; we have found that without "-t" all new data seems to
be copied continuously.  Empirical testing shows later rsync doesn't
have this issue.

Depends-On: https://review.opendev.org/736859
Change-Id: Iebfffdf8aea6f123e36f264c87d6775771ce2dd8
2020-06-19 08:41:44 +10:00
Clark Boylan
3beb50a3b3 Add bit more info on disabling ansible runs
We've got a section on using the emergency file and disabled ansible
group. Add info about the special DISABLE-ANSIBLE file there to help
make that info easier to find.

Change-Id: I2e750b9b87ca7a4f800d3ac161a195d49543a7da
2020-06-15 14:41:51 -05:00
Andreas Jaeger
41b863344b Update refstack reference after rename
refstack was moved to osf namespace, update documentation.

Refstack also uses storyboard, update the link.

Change-Id: Id47017eee7a4d1e864c2f0369c73f7047c2df6cd
2020-06-13 16:08:15 +02:00
Zuul
d97a6e2d5e Merge "Add utility script to disable ansible" 2020-06-13 00:16:38 +00:00
Monty Taylor
dea12612c5 Add utility script to disable ansible
Touching the file works, but it's easy to misspell.

Change-Id: I4980ac2c290abd6cda39846e651fb490bfafe96f
2020-06-12 18:34:29 -05:00
Jeremy Stanley
9bac4659e2 Correct "ansbile" typos
Two mistypings of the string "ansible" (case insensitive) as
"ansbile" appear in our documentation. One of them is in a sample
command, which is particularly dangerous. Correct both.

Change-Id: Ib644f57060f467d4bfd70be60225e39385d38737
2020-06-12 22:41:54 +00:00
Ian Wienand
d19e567576 AFS: add note on volume creation servers
The inline note describes the problem we hit recently creating wheel
volumes.

Change-Id: I58064288c5cf21342b73e5ceb6aed685b3014578
2020-06-12 16:38:10 +10:00
Monty Taylor
96364a11d9 Stop cloning a bunch of puppet modules we don't use
We've stopped using many of these, but we never got around to
removing them from lists.

Also, we should probably retire the repos.

Depends-On: https://review.opendev.org/717620
Depends-On: https://review.opendev.org/720527
Change-Id: I8e012c5bfa48d274dbd7f5484a9e75fee080cb5e
2020-06-05 08:42:47 -05:00
Monty Taylor
83ced7f6e6 Split inventory into multiple dirs and move hostvars
Make inventory/service for service-specific things, including the
groups.yaml group definitions, and inventory/base for hostvars
related to the base system, including the list of hosts.

Move the existing host_vars into inventory/service, since most of
them are likely service-specific. Move group_vars/all.yaml into
base/group_vars as almost all of it is related to base things,
with the exception of the gerrit public key.

A followup patch will move host-specific values into equivalent
files in inventory/base.

This should let us override hostvars in gate jobs. It should also
allow us to do better file matchers - and to be able to organize
our playbooks more if we want to.

Depends-On: https://review.opendev.org/731583
Change-Id: Iddf57b5be47c2e9de16b83a1bc83bee25db995cf
2020-06-04 07:44:36 -05:00
Monty Taylor
f27c170d01 Rename service-letsencrypt to just letsencrypt
This isn't a service, it's a meta thing that we run for different
hosts at different times.

Change-Id: Ib65665c98afb3ddb94b15346931be88a4b1757d8
2020-06-04 07:44:36 -05:00
Dr. Jens Harbott
46b4053a0a Document the need to use sudo in order to access OSC
Change-Id: I9e80f0b57bc9758e6b0458428315b1087856ddec
2020-05-19 10:09:23 +00:00
Ian Wienand
f204337268 Add nb01/nb02 opendev servers
These are replacements for the nb01/02.openstack.org puppet servers

Change-Id: I376d70ee375289b004fb859751743c6fafa21411
2020-05-07 09:10:26 +10:00
Clark Boylan
6b1feb8ae6 Add logo file to docs
We are trying to use this file in our docs config but the file was
mistakenly not added in that change. Add it now.

Change-Id: I8f5f9d62f96d8532477c42a7076c57aa6548c9cf
2020-04-30 13:37:55 -07:00
Clark Boylan
2e2ee170f8 Fix rooted path to docker-compose
In places like crontab entries we use full paths to executables because
PATH is different under cron. Unfortunately, this meant we broke
docker-compose commands using /usr/bin/docker-compose when we installed
it under /usr/local/bin/docker-compose. In particular this impacted
database backups on gitea nodes and etherpad.

Update these paths so that everything is happy again.
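
A minimal sketch of how this kind of breakage could be checked for,
assuming a fixed search path similar to what cron provides.  The helper
name and the default search path are illustrative assumptions and are
not part of this change.

```python
import shutil


def resolve_executable(name, search_path="/usr/local/bin:/usr/bin:/bin"):
    """Return the absolute path of `name` on `search_path`, or None.

    Assumption: search_path stands in for the reduced PATH a cron job
    sees; adjust it to match the host being checked.
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Prints the location that a crontab entry should reference, or
    # None if docker-compose is not installed in any listed directory.
    print(resolve_executable("docker-compose"))
```

Comparing the result against the path hard-coded in a crontab entry
would show a mismatch such as /usr/bin versus /usr/local/bin.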

Change-Id: Ib001baab419325ef1a43ac8e3364e755a6655617
2020-04-22 14:09:46 -07:00
Monty Taylor
5468f49254 Remove unused gerrit puppet things
We ain't using em.

Change-Id: I4ce9188a6b6a7e6a670e61bb17ab07e890faebcf
2020-04-19 10:59:25 -05:00
Monty Taylor
711295e918 Remove old etherpad.openstack.org
Once the DNS is swapped over to point at etherpad.opendev.org
we can delete the old stuff.

Change-Id: I626dd22b22a23619fcf460533336f1ddfec615d9
2020-04-19 10:58:46 -05:00
Zuul
4a9e839dd0 Merge "Remove puppet and cron mentions from docs" 2020-04-16 21:18:08 +00:00
Zuul
e3ad9e79eb Merge "Get rid of all-clouds.yaml" 2020-04-16 15:41:55 +00:00
Monty Taylor
cba5129465 Remove puppet and cron mentions from docs
We've got some old out of date docs in some places. This isn't even
a full reworking, but at least tries to remove some of the more
egregiously wrong things.

Change-Id: I9033acb9572e1ce1b3e4426564b92706a4385dcb
2020-04-16 07:04:14 -07:00
Monty Taylor
ebae022d07 Use project-config from zuul instead of direct clones
We use project-config for gerrit, gitea and nodepool config. That's
cool, because we can clone that from zuul too and make sure that each
prod run we're doing runs with the contents of the patch in question.

Introduce a flag file that can be touched in /home/zuulcd that will
block zuul from running prod playbooks. By default, if the file is
there, zuul will wait for an hour before giving up.
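
The wait-then-give-up behaviour described above can be sketched roughly
as follows.  This is an illustration only: the function name, the
polling interval and the example path are assumptions, and the real
check may be implemented differently (for example as an Ansible task).

```python
import os
import time


def wait_for_flag_removal(flag_path, timeout=3600, poll_interval=10):
    """Block while flag_path exists, for at most `timeout` seconds.

    Returns True if the flag file is absent (safe to proceed) and False
    if it was still present once the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while os.path.exists(flag_path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


if __name__ == "__main__":
    # Hypothetical flag file location, for illustration only.
    if not wait_for_flag_removal("/home/zuul/example-flag-file"):
        raise SystemExit("flag file still present, giving up")
```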

Rename zuulcd to zuul

To better align prod and test, name the zuul user zuul.

Change-Id: I83c38c9c430218059579f3763e02d6b9f40c7b89
2020-04-15 12:29:33 -05:00
Monty Taylor
8af7b47812 Get rid of all-clouds.yaml
We had the clouds split from back when we used the openstack
dynamic inventory plugin. We don't use that anymore, so we don't
need these to be split. Any other usage we have directly references
a cloud.

Change-Id: I5d95bf910fb8e2cbca64f92c6ad4acd3aaeed1a3
2020-04-09 16:44:20 -05:00
Clark Boylan
d07025f43f Switch documentation to alabaster theme
These are OpenDev docs now so the OpenStack theming doesn't quite fit.
Switch to Alabaster + OpenDev logo which is what we did with
infra-manual.

Change-Id: Id211e8e0b4dab7282fb5ca5fce494a028a826fba
2020-04-09 13:22:43 -07:00
Zuul
e71221ea33 Merge "Add a note about rename files to project renames doc" 2020-04-09 14:30:24 +00:00
Jeremy Stanley
8641302459 Mention new mailing lists
The OpenDev community is moving its discussions off the old
openstack-infra mailing list, so make sure to refer to the correct
new address(es).

Change-Id: I558b60ea0aa3421285d46be449f04198441cf285
2020-04-06 18:19:28 +00:00
Zuul
b474879c03 Merge "Correct launch readme link" 2020-04-04 19:46:53 +00:00
James E. Blair
06d5ce1423 Correct launch readme link
This has a .rst extension now.

Change-Id: Icafdb12f91315f5c37f95755034d216bc4a5c837
2020-03-27 09:45:42 -07:00
Jeremy Stanley
8da233817b Re-add secret decrypting docs
These are useful for the times when a secret needs to be decrypted
for debugging but seem to have been deleted when we did the zuulv3
migration removal.

Change-Id: Ib1544d9032df9bd25c50eeca032f643e40f035b0
2020-03-23 13:16:05 -05:00
Zuul
2c89ce1807 Merge "Split gitea and gerrit services from manage-projects" 2020-03-23 14:28:40 +00:00
Andreas Jaeger
62e76b5177 Docs: Update main page for OpenDev
Update conf.py and index.rst for OpenDev.

Use newer openstackdocstheme and update conf.py for this.

Change-Id: I62312ca1d3fda9221660b7bb664c8ea55dac68a4
2020-03-22 19:14:51 +01:00
Monty Taylor
86542eb9ba Split gitea and gerrit services from manage-projects
There are two different concerns here. One is configuring the gitea
and gerrit services. This is independent from the management of
projects running inside them.

Make a manage-projects playbook which currently runs gitea-git-repos
but will also get a gerrit-git-repos role in a bit. Make a
service-gitea playbook for deploying gitea itself and update
run_all to take all of that into account. This should make our
future world of turning these into zuul jobs easier.

Add several missing files to the files matchers for run-gitea
and run-review.

Also - nothing about this has anything to do with puppet.

Change-Id: I5eaf75129d76138c61013a3a7ed7c381d567bb8b
2020-03-21 11:34:19 -05:00
Andreas Jaeger
2c0b82e5e8 Update infra-manual location
The infra-manual now lives on docs.opendev.org, update links.

New location is: https://docs.opendev.org/opendev/infra-manual/latest

Change-Id: I7716c68cbff4f3a640d7161f59cfc034a7ccca52
2020-03-20 22:03:09 +01:00
James E. Blair
fc2a742b24 Add a note about rename files to project renames doc
We keep track of these files now in the opendev/project-config repo,
so make sure that they are committed there.

Change-Id: Icf4b4e32ac4f209811ba8361bbb9d8458c79251a
2020-03-20 07:09:56 -07:00
Zuul
a54baada30 Merge "Make Advisory Board a proper noun" 2020-03-19 01:02:19 +00:00
Zuul
e3f7c8cee8 Merge "Update references to IRC channels" 2020-03-18 18:55:57 +00:00
Dr. Jens Harbott
c86525ccd3 Update references to IRC channels
With the move from OpenStack governance to our own OpenDev team, we
should also move to use the #opendev IRC channel in preference to
the #openstack-infra channel which will remain in use for OpenStack
specific discussions.

Update the references in our docs accordingly.

Change-Id: I448704f5d2664fd233a69a2ad12578ca24d9878a
2020-03-18 17:33:08 +01:00
Zuul
8e45f95748 Merge "Update project doc to reflect OpenDev changes" 2020-03-17 20:22:45 +00:00
Clark Boylan
08e2418e58 Make Advisory Board a proper noun
This fixes a small nit on the prior docs change.

Change-Id: Id408cf410e7fc50d418cc701d3b195ebcffd1b85
2020-03-17 13:03:37 -07:00
Ian Wienand
288e516ace letsencrypt: add note on manual refresh of certificates
Add a note on how to manually refresh the certificates if required.

Change-Id: Ie5f494e3769b7b878c2d1b03836d436dd845e5d9
2020-03-05 21:50:29 +00:00
Sorin Sbarnea
f861cda57c Improve 3rd-party logging guidelines
Based on #openstack-infra talks from Feb 17th, I am proposing some
clarifications regarding how logging should be done by 3rd party CI.

These should help 3rd party integrators create a better
experience for developers, making logs more accessible.

Change-Id: I2ebc788505ba1319afc038d0aa1406da3823a911
2020-02-18 09:29:24 +00:00
Zuul
1f67b8ed37 Merge "Add docs for deleting an AFS volume" 2020-02-10 17:09:04 +00:00
Clark Boylan
95e8c8edde Update project doc to reflect OpenDev changes
This change effectively converts the OpenStack Infra project description
into an OpenDev project description in our documentation. Since OpenDev
is largely an evolution of the preexisting infra team much of the
content remains the same. I have added a section on governance as we'll
not be able to run off of the OpenStack governance any longer.

Note this leaves what becomes the OpenStack Infra project without a
project document. However, the remaining scope of that OpenStack project
will be small and I don't think it will need the same level of team
organization. I think we can get by with OpenStack's default governance
for its teams there. Then should we need something more explicit or
different we can write that up within openstack itself.

Depends-On: https://review.opendev.org/#/c/703134/
Change-Id: I56aab771510768211386325e6466d2f94fe298fb
2020-02-05 14:59:39 +00:00
James E. Blair
cfc1841c06 Add warning about kerberos key rotation
Change-Id: I9e4caf8feeb775c02208a5e5f1627f03a90e4211
2020-01-31 16:22:52 -08:00
James E. Blair
255f996916 Add docs for deleting an AFS volume
Change-Id: I1763eb2bf580591b68bf4e2853378331b8261293
2020-01-20 09:43:34 -08:00
Zuul
44935bca39 Merge "Add notes on manual host configuration runs" 2020-01-16 22:53:05 +00:00
Ian Wienand
4bb7746347 Update gitea docs
Give the location of the database backups, and update the replication
section.

Change-Id: Ic687ab3bab1a1534cdd26d357f729db054e8b60e
2019-11-15 10:21:51 +11:00
James E. Blair
87fccc8e9b Add docs for recovering an OpenAFS fileserver
This should be a smooth recovery process.

Change-Id: I3c68b077e38a88160286d94e71676c0c4dbb6a51
2019-09-13 10:42:17 -07:00
Zuul
1b14855a45 Merge "AFS server restart and audit logging : helper script" 2019-08-29 21:03:09 +00:00