This has our change to open etherpad on join, so we should no longer need
to run a fork of the web server. Switch to the upstream container image
and stop building our own.
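A minimal sketch of what the docker-compose change implies (the image
name and tag here are assumptions, not the exact production values):

  services:
    etherpad:
      # Pull upstream directly instead of building our own image.
      image: docker.io/etherpad/etherpad:latest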
Change-Id: I3e8da211c78b6486a3dcbd362ae7eb03cc9f5a48
All hosts are now running their backups via borg to servers in
vexxhost and rax.ord.
For reference, the servers being backed up at this time are:
borg-ask01
borg-ethercalc02
borg-etherpad01
borg-gitea01
borg-lists
borg-review-dev01
borg-review01
borg-storyboard01
borg-translate01
borg-wiki-update-test
borg-zuul01
This removes the old bup backup hosts, the no-longer-used ansible
roles for the bup backup server and client, and any remaining
bup-related configuration.
For simplicity, we will remove any remaining bup cron jobs on the
above servers manually after this merges.
Change-Id: I32554ca857a81ae8a250ce082421a7ede460ea3c
We need to depend on the buildset registry as we are building this image
in a separate job. We also don't need to depend on the build job in
gate; we only need the upload job.
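As a sketch, the project-pipeline config ends up shaped like this
(the image and run job names are illustrative):

  - project:
      check:
        jobs:
          - opendev-buildset-registry
          - system-config-run-foo:
              dependencies:
                - opendev-buildset-registry
                - system-config-build-image-foo
      gate:
        jobs:
          - system-config-run-foo:
              dependencies:
                - opendev-buildset-registry
                - system-config-upload-image-foo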
Change-Id: Ie7c2ed29c028f8c23d67ad38edbe04b12e22d026
This change splits our existing system-config-run-review job into two
jobs, one for gerrit 3.2 and another for 3.3. The biggest change is that
we use a var called zuul_test_gerrit_version to select which version we
want and that ends up in the fake group file written out by Zuul for the
nested ansible run. The nested ansible run will then populate the
docker-compose file with the appropriate version for us.
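The split looks roughly like this (the exact job names are
illustrative; the var is the one described above):

  - job:
      name: system-config-run-review-3.2
      vars:
        zuul_test_gerrit_version: '3.2'

  - job:
      name: system-config-run-review-3.3
      vars:
        zuul_test_gerrit_version: '3.3'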
Change-Id: I00b52c0f4aa8df3ecface964007fcf5724887e5e
This adds a dockerfile to build an opendevorg/refstack image as well as
the jobs to build and publish it.
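As a sketch, the job pair follows our usual image pattern (the
parents and build context shown are assumptions):

  - job:
      name: system-config-build-image-refstack
      parent: system-config-build-image
      vars: &refstack_image_vars
        docker_images:
          - context: docker/refstack
            repository: opendevorg/refstack

  - job:
      name: system-config-upload-image-refstack
      parent: system-config-upload-image
      vars: *refstack_image_vars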
Change-Id: Icade6c713fa9bf6ab508fd4d8d65debada2ddb30
This starts migrating OpenAFS server setup to Ansible.
Firstly we split up the groups and explicitly name hosts, as we will
be migrating each one step-by-step. We split out 1.8 hosts into a new
afs-1.8 group; the first host is afs01.ord.openstack.org which already
has openafs 1.8 installed manually.
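As a sketch, the group split looks something like this (format
abbreviated; only the host named above is certain):

  afs-1.8:
    hosts:
      - afs01.ord.openstack.org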
An openafs-server role is introduced that does the same setup as the
extant puppet.
The AFS job is renamed to infra-prod-afs as the puppet component will
eventually disappear. Otherwise it runs in the same way, but also
runs the openafs-server role for the 1.8 servers.
Once this is merged, we can run it against afs01.ord.openstack.org to
ensure it works and is idempotent. We can then take on upgrading the
other file servers, and work further on the database servers.
Change-Id: I7998af43961999412f58a78214f4b5387713d30e
Having upgraded to 3.2, we don't need these versions any more.
Change-Id: Ifc37a75aa62b2498e649a4c81b589a04c794184a
Depends-On: https://review.opendev.org/763617
The hound project has undergone a small re-birth and moved to
https://github.com/hound-search/hound
which has broken our deployment. We've talked about leaving
codesearch up to gitea, but it's not quite there yet. There seems to
be no point working on the puppet now.
This builds a container that runs houndd. It's an opendev specific
container; the config is pulled from project-config directly.
There are some custom scripts that drive things. Some points for
reviewers:
- update-hound-config.sh uses "create-hound-config" (which is in
jeepyb for historical reasons) to generate the config file. It
grabs the latest projects.yaml from project-config and exits with a
return code to indicate if things changed.
- when the container starts, it runs update-hound-config.sh to
populate the initial config. There is a testing environment flag
and a small config so it doesn't have to clone the entire opendev for
functional testing.
- it runs under supervisord so we can restart the daemon when
projects are updated. Unlike earlier versions that didn't start
listening till indexing was done, this version now puts up a "Hound
is not ready yet" message while it is working; so we can drop
all the magic we were doing to probe if hound is listening via
netstat and making Apache redirect to a status page.
- resync-hound.sh is run from an external cron job daily, and does
this update and restart check. Since it only reloads if changes
are made, this should be relatively rare anyway.
- There is a PR to monitor the config file
(https://github.com/hound-search/hound/pull/357) which would mean
the restart is unnecessary. This would be good in the near term and
we could remove the cron job.
- playbooks/roles/codesearch is unexciting and deploys the container,
certificates and an apache proxy back to localhost:6080 where hound
is listening.
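As a rough sketch, the role's compose file looks something like this
(the image tag and networking details are assumptions):

  services:
    hound:
      image: docker.io/opendevorg/hound:latest
      network_mode: host   # houndd listens on localhost:6080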
I've combined removal of the old puppet bits here as the "-codesearch"
namespace was already being used.
Change-Id: I8c773b5ea6b87e8f7dfd8db2556626f7b2500473
This will scale up our meetpad install by 50% giving us more capacity
for PTG sessions.
We also increase the tox linters job timeout as it is slow to pip
install and then slow to run ansible-lint. Do this until we can sort
out why it is slow.
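The timeout bump itself is a one-liner of this shape (job name and
value illustrative):

  - job:
      name: tox-linters
      timeout: 3600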
Change-Id: Ieceafefa27266f0bc0f427af790f920a8c44326c
Now that gerritbot is deployed from containers on eavesdrop we want to
run the infra-prod-service-eavesdrop job hourly to ensure that we keep
the docker image up to date there.
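The hourly entry is a small addition of this shape (pipeline name
illustrative):

  - project:
      opendev-prod-hourly:
        jobs:
          - infra-prod-service-eavesdrop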
We haven't added the service-eavesdrop job to a deploy pipeline in
gerritbot because that would require us to add gerritbot's project ssh
key to bridge.
Change-Id: I5aba91f2ae5c018ee9b2d0481a53b630fc5d1ab7
This adds roles to implement backup with borg [1].
Our current tool "bup" has no Python 3 support and is not packaged for
Ubuntu Focal. This means it is effectively end-of-life. borg fits
our model of servers backing themselves up to a central location, is
well documented and seems well supported. It also has the clarkb seal
of approval :)
As mentioned, borg works in the same manner as bup by doing an
efficient backup over ssh to a remote server. The core of these
roles is the same as the bup-based ones, in terms of creating a
separate user for each host and deploying keys and ssh config.
This chooses to install borg in a virtualenv under /opt. This was
chosen for a number of reasons. Firstly, reading the history of
borg, there have been incompatible updates (although they provide a
tool to update repository formats); it seems important that we both
pin the version we are using and keep clients and server in sync.
Since we have a heterogeneous distribution collection, we don't want
to rely on the packaged tools, which may differ. I don't feel like
this is a great application for a container; we actually don't want
it that isolated from the base system, because its goal is to read
the system and copy it offsite with as little chance of things going
wrong as possible.
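A minimal sketch of the install step, assuming Ansible's pip module
(the version pin and paths are illustrative):

  - name: Install borg into a virtualenv under /opt
    pip:
      name: borgbackup==1.1.14   # pin illustrative; keep client/server in sync
      virtualenv: /opt/borg
      virtualenv_command: python3 -m venv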
Borg has a lot of support for encrypting the data at rest in various
ways. However, that introduces the possibility we could lose both the
key and the backup data. Really the only thing stopping this is key
management, and if we want to go down this path we can do it as a
follow-on.
The remote end server is configured via ssh command rules to run in
append-only mode. This means a misbehaving client can't delete its
old backups. In theory we can prune backups on the server side --
something we could not do with bup. The documentation has been
updated but is vague on this part; I think we should get some hosts in
operation, see how the de-duplication is working out and then decide
how we want to manage things long term.
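The append-only restriction is applied via authorized_keys options; a
rough sketch with Ansible's authorized_key module (user names, paths
and the variable are illustrative):

  - name: Restrict backup client to append-only borg serve
    authorized_key:
      user: borg-example01              # illustrative per-host backup user
      key: "{{ borg_client_pubkey }}"   # assumed variable
      key_options: >-
        command="/opt/borg/bin/borg serve --append-only
        --restrict-to-path /opt/backups/borg-example01",restrict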
Testing is added; a focal and bionic host both run a full backup of
themselves to the backup server. Pretty cool, the logs are in
/var/log/borg-backup-<host>.log.
No hosts are currently in the borg groups, so this can be applied
without affecting production. I'd suggest the next steps are to bring
up a borg-based backup server and put a few hosts into this. After
running for a while, we can add all hosts, and then deprecate the
current bup-based backup server in vexxhost and replace that with a
borg-based one; giving us dual offsite backups.
[1] https://borgbackup.readthedocs.io/en/stable/
Change-Id: I2a125f2fac11d8e3a3279eb7fa7adb33a3acaa4e
There is a new release; update the base container. Add the promote
job that was forgotten with the original commit
Iddfafe852166fe95b3e433420e2e2a4a6380fc64.
Change-Id: Ie0d7febd2686d267903b29dfeda54e7cd6ad77a3
This deploys graphite from the upstream container.
We override the statsd configuration to have it listen on ipv6.
Similarly we override the nginx config to listen on ipv6, enable ssl,
forward port 80 to 443, and block the /admin page (we don't use it).
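A hypothetical docker-compose excerpt showing the overrides (the
in-container paths are assumptions):

  services:
    graphite:
      image: docker.io/graphiteapp/graphite-statsd:latest
      volumes:
        - /opt/graphite/storage:/opt/graphite/storage
        - ./statsd.config.js:/opt/statsd/config/udp.js
        - ./graphite.conf:/etc/nginx/sites-enabled/graphite.conf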
For production we will just want to put some cinder storage in
/opt/graphite/storage on the production host and figure out how to
migrate the old stats. There is also a bit of cleanup that will follow,
because we half-converted grafana01.opendev.org -- so everything can't
be in the same group till that is gone.
Testing has been added to push some stats and ensure they are seen.
Change-Id: Ie843b3d90a72564ef90805f820c8abc61a71017d
This uses the Grafana container created with
Iddfafe852166fe95b3e433420e2e2a4a6380fc64 to run the
grafana.opendev.org service.
We retain the old model of an Apache reverse-proxy; it's well tested
and understood, it's much easier than trying to map all the SSL
termination/renewal/etc. into the Grafana container and we don't have
to convince ourselves the container is safe to be directly web-facing.
Otherwise this is a fairly straightforward deployment of the
container. As before, it uses the graph configuration kept in
project-config which is loaded in with grafyaml, which is included in
the container.
One nice advantage is that it makes it quite easy to develop graphs
locally, using the container which can talk to the public graphite
instance. The documentation has been updated with a reference on how
to do this.
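For example, a local instance can be pointed at the public graphite
via standard Grafana datasource provisioning (the URL is assumed):

  apiVersion: 1
  datasources:
    - name: graphite
      type: graphite
      access: proxy
      url: https://graphite.opendev.org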
Change-Id: I0cc76d29b6911aecfebc71e5fdfe7cf4fcd071a4
This is a docker image based on the latest upstream Grafana with
grafyaml also installed inside. It includes a small script to run a
refresh of the dashboards.
Change-Id: Iddfafe852166fe95b3e433420e2e2a4a6380fc64
Make inventory/service for service-specific things, including the
groups.yaml group definitions, and inventory/base for hostvars
related to the base system, including the list of hosts.
Move the existing host_vars into inventory/service, since most of
them are likely service-specific. Move group_vars/all.yaml into
base/group_vars as almost all of it is related to base things,
with the exception of the gerrit public key.
A followup patch will move host-specific values into equivalent
files in inventory/base.
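The resulting layout looks roughly like this (some file names are
assumptions):

  inventory/
    base/
      hosts.yaml            # the list of hosts
      group_vars/
        all.yaml            # base-system vars
    service/
      groups.yaml           # group definitions
      host_vars/            # service-specific host vars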
This should let us override hostvars in gate jobs. It should also
allow us to do better file matchers, and to be able to organize
our playbooks more if we want to.
Depends-On: https://review.opendev.org/731583
Change-Id: Iddf57b5be47c2e9de16b83a1bc83bee25db995cf
It's the only part of base that's important to run when we run a
service. Run it in the service playbooks and get rid of the
dependency on infra-prod-base.
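The pattern in each service playbook is simply (names illustrative):

  - hosts: mirror
    roles:
      - iptables
      - mirror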
Continue running it in base so that new nodes are brought up
with iptables in place.
Bump the timeout for the mirror job, because the iptables addition
seems to have just bumped it over the edge.
Change-Id: I4608216f7a59cfa96d3bdb191edd9bc7bb9cca39
This is required by the accessbot job which is in periodic. We
moved it to hourly so that ptgbot could be updated more often, but
without it being in periodic, no periodic jobs are running, and that
seems more critical at the moment.
Change-Id: I0c7dbc0db77f295820302441e495fe4e9ea7d726
Since changes to some services on eavesdrop, for example ptgbot, may
need to take effect fairly quickly, run the playbook hourly rather
than daily. We can't easily trigger on changes merging to the ptgbot
repo in the future when it's in a different Zuul tenant from
system-config.
Change-Id: I90ddc555ded0ac1d3134fd075d816155a475c6d2
We already run accessbot in project-config when the accessbot
script changes. We don't need to run it whenever any of the puppet
or other config on eavesdrop runs, nor do we need to run it
hourly. Just run it nightly and on changes to the actual
accessbot config.
Change-Id: Idd47f7c96f677fd1e1c8da3be262a52a70646acd
Our .zuul.yaml file has grown quite large. Try to make this more
manageable by splitting it into a zuul.d/ directory with jobs organized by
function.
Change-Id: I0739eb1e2bc64dcacebf92e25503f67302f7c882