In what looks like a typo, we are overriding the bridge node for this
test to a Bionic host. Remove this. This was detected when testing an
upgraded Ansible, which wouldn't install on the older Python on
Bionic.
Change-Id: Ie3e754598c6da1812e74afa914f50d91972012cd
These images have a number of issues we've identified and worked
around. The current iteration of this change is essentially identical
to upstream, with a minor tweak to allow the latest mailman version
and adjusted paths for the hyperkitty and postorius URLs to match
those in the upstream mailman-web codebase; it doesn't try to address
the other items. However, we should consider
moving our fixes from ansible into the docker images where possible
and upstream those updates.
Unfortunately upstream hasn't been super responsive so far, hence this
fork. For tracking purposes here are the issues/PRs we've already filed
upstream:
https://github.com/maxking/docker-mailman/pull/552
https://github.com/maxking/docker-mailman/issues/548
https://github.com/maxking/docker-mailman/issues/549
https://github.com/maxking/docker-mailman/issues/550
Change-Id: I3314037d46c2ef2086a06dea0321d9f8cdd35c73
Grab the make logs from the dkms directory. This is helpful if the
modules are failing to build.
The /var/lib/dkms directory contains all the source and object files,
etc., which seems unnecessary to store in general. Thus we just trim
this to the log directory.
Change-Id: I9b5abc9cf4cd59305470a04dda487dfdfd1b395a
This should now be a largely functional deployment of mailman 3. There
are still some bits that need testing but we'll use followup changes to
force failure and hold nodes.
This deployment of mailman3 uses upstream docker container images. We
currently hack up uids and gids to accommodate that. We also hack up the
settings file and bind mount it over the upstream file in order to use
host networking. We override the hyperkitty index type to xapian. All
list domains are hosted in a single installation and we use native
vhosting to handle that.
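As a rough sketch of the sort of override described above (the service
name, host path, and container path here are assumptions, not the
exact production values):

  services:
    mailman-web:
      network_mode: host   # host networking as described above
      volumes:
        # bind our customised settings (including the xapian index
        # override) over the image's copy
        - /etc/mailman-compose/web-settings.py:/opt/mailman-web/settings.py:ro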
We'll deploy this to a new server and migrate one mailing list domain at
a time. This will allow us to start with lists.opendev.org and test
things like dmarc settings before expanding to the remaining lists.
A migration script is also included, which has seen extensive
testing on held nodes for importing copies of the production data
sets.
Change-Id: Ic9bf5cfaf0b87c100a6ce003a6645010a7b50358
These were forgotten in I137ab824b9a09ccb067b8d5f0bb2896192291883
when we switched the testing bridge host to bridge99.
Change-Id: I742965c61ed00be05f1daea2d6110413cff99e2a
Gerrit made new releases and we should update to them. Release notes can
be found here:
https://www.gerritcodereview.com/3.5.html#354
https://www.gerritcodereview.com/3.6.html#363
The main improvement for us is likely to be the copy approvals
performance boosts and error handling. We still need to run that prior
to our 3.6 upgrade.
Note we currently only run 3.5 in production but we test the 3.6 upgrade
from our current production version so it makes sense to update the 3.6
image as well.
Change-Id: Idf9a16b443907a2d0c19c1b6ec016f5d16583ad2
In thinking harder about the bootstrap process, it struck me that the
"bastion" group we have is two separate ideas that become a bit
confusing because they share a name.
We have the testing and production paths that need to find a single
bridge node so they can run their nested Ansible. We've recently
merged changes to the setup playbooks to not hard-code the bridge node
and they now use groups["bastion"][0] to find the bastion host -- but
this group is actually orthogonal to the group of the same name
defined in inventory/service/groups.yaml.
The testing and production paths are running on the executor, and, as
mentioned, need to know the bridge node to log into. For the testing
path this is happening via the group created in the job definition
from zuul.d/system-config-run.yaml. For the production jobs, this
group is populated via the add-bastion-host role which dynamically
adds the bridge host and group.
Only the *nested* Ansible running on the bastion host reads
s-c:inventory/service/groups.yaml. None of the nested-ansible
playbooks need to target only the currently active bastion host. For
example, we can define as many bridge nodes as we like in the
inventory and run service-bridge.yaml against them. It won't matter
because the production jobs know the host that is the currently active
bridge as described above.
So, instead of using the same group name in two contexts, rename the
testing/production group "prod_bastion". groups["prod_bastion"][0]
will be the host that the testing/production jobs use as the bastion
host -- references are updated in this change (i.e. the two places
this group is defined -- the group name in the system-config-run jobs,
and add-bastion-host for production).
We then can return the "bastion" group match to bridge*.opendev.org in
inventory/service/groups.yaml.
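As a rough sketch of the two definitions (node names, labels, and
exact file layout here are illustrative rather than copied verbatim
from the repo):

  # zuul.d/system-config-run.yaml -- the testing path
  - job:
      name: system-config-run-base
      nodeset:
        nodes:
          - name: bridge99.opendev.org
            label: ubuntu-jammy
        groups:
          - name: prod_bastion
            nodes:
              - bridge99.opendev.org

  # inventory/service/groups.yaml -- read only by the nested Ansible
  groups:
    bastion:
      - bridge*.opendev.org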
This fixes a bootstrapping problem -- if you launch, say,
bridge03.opendev.org the launch node script will now apply the
base.yaml playbook against it, and correctly apply all variables from
the "bastion" group which now matches this new host. This is what we
want to ensure, e.g. the zuul user and keys are correctly populated.
The other thing we can do here is change the testing path
"prod_bastion" hostname to "bridge99.opendev.org". By doing this we
ensure we're not hard-coding for the production bridge host in any way
(since if both testing and production are called bridge01.opendev.org
we can hide problems). This is a big advantage when we want to rotate
the production bridge host, as we can be certain there are no hidden
dependencies.
Change-Id: I137ab824b9a09ccb067b8d5f0bb2896192291883
Python 3.11 has been released. Once the parent commit of this commit
lands we will have removed our python3.8 images making room for
python3.11 in our image list. Add these new images which will make way
for running and testing our software on this new version of python.
Change-Id: Idcea3d6fa22839390f63cd1722bc4cb46a6ccd53
This switches the bridge name to bridge01.opendev.org.
The testing path is updated along with some final references still in
testinfra.
The production jobs are updated in add-bastion-host, and will have the
correct setup on the new host after the dependent change.
Everything else is abstracted behind the "bastion" group; the entry is
changed here which will make all the relevant playbooks run on the new
host.
Depends-On: https://review.opendev.org/c/opendev/base-jobs/+/862551
Change-Id: I21df81e45a57f1a4aa5bc290e9884e6dc9b4ca13
Run a base test against a Bionic bridge to ensure we don't break
things on the current production host as we move to a new Focal-based
environment.
Change-Id: I1f745a06c4428cf31a166b3d53dd6321bfd41ebc
Following-on from Iffb462371939989b03e5d6ac6c5df63aa7708513, instead
of directly referring to a hostname when adding the bastion host to
the inventory for the production playbooks, this finds it from the
first element of the "bastion" group.
As we do this twice for the run and post playbooks, abstract it into a
role.
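A minimal sketch of what such a role might contain (task names and any
extra connection variables are assumptions; the real role may differ):

  # roles/add-bastion-host/tasks/main.yaml (sketch)
  - name: Find the bastion host
    set_fact:
      bastion_host: "{{ groups['bastion'][0] }}"

  - name: Add bastion host to the inventory
    add_host:
      name: "{{ bastion_host }}"
      groups: bastion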
The host value is currently "bridge.openstack.org" -- as is the
existing hard-coding -- thus this is intended to be a no-op change.
It is setting the foundation to make replacing the bastion host a
simpler process in the future.
Change-Id: I286796ebd71173019a627f8fe8d9a25d0bfc575a
This replaces hard-coding of the host "bridge.openstack.org" with
hard-coding of the first (and only) host in the group "bastion".
The idea here is that we can, as much as possible, simply switch one
place to an alternative hostname for the bastion such as
"bridge.opendev.org" when we upgrade. This is just the testing path,
for now; a follow-on will modify the production path (which doesn't
really get speculatively tested).
This needs to be defined in two places:
1) We need to define this in the run jobs for Zuul to use in the
playbooks/zuul/run-*.yaml playbooks, as it sets up and collects
logs from the testing bastion host.
2) The nested Ansible run will then use the inventory in
inventory/service/groups.yaml
Various other places are updated to use this abstracted group as the
bastion host.
Variables are moved into the bastion group (which only has one host --
the actual bastion host) which means we only have to update the group
mapping to the new host.
This is intended to be a no-op change; all the jobs should work the
same, but just using the new abstractions.
Change-Id: Iffb462371939989b03e5d6ac6c5df63aa7708513
Now that all the bridge nodes are Jammy (3.10), we can uncap this
dependency, which will bring in the latest selenium. Unfortunately,
after investigation, the simplification I had hoped this would allow
doesn't work; comments are added along with small updates for the new
API.
Update the users file-match so they run too.
Change-Id: I6a9d02bfc79b90417b1f5b3d9431f4305864869c
In preparation for upgrading this host, run jobs with a Jammy-based
bridge.openstack.org.
Since this has a much later Python, it brings in a later version of
selenium when testing (used for screenshots) which has dropped some of
the APIs we use. Pin it to the old version; we will fix this in a
follow-on just to address one thing at a time
(I6a9d02bfc79b90417b1f5b3d9431f4305864869c).
Change-Id: If53286c284f8d25248abf4a1b2edd6951437dec2
In discussion of other changes, I realised that the bridge bootstrap
job is running via zuul/run-production-playbook.yaml. This means it
uses the Ansible installed on bridge to run against itself -- which
isn't much of a bootstrap.
What should happen is that the bootstrap-bridge.yaml playbook, which
sets up ansible and keys on the bridge node, should run directly from
the executor against the bridge node.
To achieve this we reparent the job to opendev-infra-prod-setup-keys,
which sets up the executor to be able to log into the bridge node. We
then add the host dynamically and run the bootstrap-bridge.yaml
playbook against it.
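Roughly, the production-path playbook ends up shaped something like
this (a sketch under assumptions, not the literal playbook):

  # run from the executor, not via the nested Ansible on bridge
  - hosts: localhost
    roles:
      - add-bastion-host                 # dynamically add the bridge node

  - import_playbook: bootstrap-bridge.yaml   # then run directly against it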
This is similar to the gate testing path; where bootstrap-bridge.yaml
is run from the executor against the ephemeral bridge testing node
before the nested-Ansible is used.
The root key deployment is updated to use the nested Ansible directly,
so that it can read the variable from the on-host secrets.
Change-Id: Iebaeed5028050d890ab541818f405978afd60124
This was missed in the effort to push out Gerrit 3.5.3 as well as the
ssh rsa sha2 fixes. That said it should be mostly fine as all of the
plugins tagged 3.5.2 have tagged the same commit with 3.5.3, making
this largely a bookkeeping change.
There is one bit that isn't strictly bookkeeping and that is the
plugins/its-base checkout. Against gerrit 3.5 we convert from a master
checkout [0] to a stable-3.5 [1] checkout as this branch exists now.
Against gerrit 3.6 we convert from a stable-3.6 checkout to a master
checkout. I suspect that a stable-3.6 branch existed for a short period
of time and was cleaned up and zuul is using an old cached state.
The change for its-base on gerrit 3.5 does represent a reversion of
three commits but they all seem related to gerrit 3.6 so I expect this
is fine.
[0] https://gerrit.googlesource.com/plugins/its-base/+log/refs/heads/master
[1] https://gerrit.googlesource.com/plugins/its-base/+log/refs/heads/stable-3.5
Change-Id: I619b28fe642ca8b57eb533157ec0a441f6b66890
This adds our first Jammy production server to the mix. We update the
gitea load balancer as it is a fairly simple service which will allow us
to focus on Jammy updates and not various server updates.
We update testing to shift testing to a jammy node as well. We don't
remove gitea-lb01 yet as this will happen after we switch DNS over to
the new server and are happy with it.
Change-Id: I8fb992e23abf9e97756a3cfef996be4c85da9e6f
For some reason this is failing in the gate -- exactly what that
reason is remains hard to determine at the moment. Log the exception.
Change-Id: I13c60c5dfc4ab19d8dec589c96338adc7461c992
I'm not sure why I used this tag; I probably copied it from [1] at the
time? Let's just try latest.
Update matchers so the screenshot jobs run
[1] https://github.com/SeleniumHQ/docker-selenium
Change-Id: I8ea7981dac54883822f3b6076b6f0f564571f018
We want to ensure that the logging apache does for us is sufficient to
trace requests from the load balancer to apache to gitea. To do that we
need to gather the logs and look at them.
Change-Id: I468d37709c1a3c2255b1bfcf38a23bb1a2a75899
Zuul is removing support for old ansible versions. Remove our pin to
old ansible. There shouldn't be any reason for these pins at this point.
Change-Id: I0e0998e0d29d55695c6cd92b10feeb910b086d0a
It is a good idea to periodically update our base python images. Now is
a good time to do it as we've got debian bullseye updates and python
minor releases. The bullseye updates fix a glibc bug that was affecting
Ansible in the zuul images. With this update we'll be able to remove the
workaround for that issue.
We also update the builder image's apt-get process to include a clean to
match the base image. This is more for consistency than anything else.
Finally update job timeouts for builds as it seems we occasionally need
more time particularly for emulated arm64 builds.
Change-Id: I31483ff434f19f408aef3b63cb2cd24044a8bf29
We must have missed this; I noticed when it didn't run on the gate job
for I949c40e9046008d4f442b322a267ce0c967a99dc
Change-Id: I62c5c0f262d9bd53580367dc9f1ad00fe7b6f6f2
We still have some Ubuntu Xenial servers, so cap the max usable pip
and setuptools versions in their venvs like we already do for
Bionic, in order to avoid broken installations. Switch our
conditionals from release name comparisons to version numbers in
order to more cleanly support ranges. Also make sure the borg run
test is triggered by changes to the create-venv role.
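The caps and conditional below illustrate the shape of the change
rather than the exact role contents (the version numbers and variable
names are assumptions):

  - name: Create venv with capped pip/setuptools on Xenial and older
    pip:
      name:
        - 'pip<21'          # illustrative caps; real values may differ
        - 'setuptools<51'
      virtualenv: "{{ venv_path }}"
    when: ansible_distribution_version is version('16.04', '<=')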
Change-Id: I5dd064c37786c47099bf2da66b907facb517c92a
Many of our tests are actually running with a timeout of 3600; I think
this came about through a combination of bumping timeouts for failures
and copy-pasting jobs.
We are seeing frequent timeouts of other jobs without this,
particularly on OVH GRA1. Let's bump the base timeout to 3600 to
account for this. The only job that overrides this now is gitea,
which runs for 4800 due to its long import process.
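In Zuul terms this is just the job timeout attribute, something like
the following (job names illustrative):

  - job:
      name: system-config-run
      timeout: 3600

  - job:
      name: system-config-run-gitea
      timeout: 4800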
Change-Id: I762f0f7c7a53a456d9269530c9ae5a9c85903c9c
Keeping the testing nodes at the other end of the namespace separates
them from production hosts. This one isn't really referencing itself
in testing like many others, but move it anyway.
Change-Id: I2130829a5f913f8c7ecd8b8dfd0a11da3ce245a9
Similar to Id98768e29a06cebaf645eb75b39e4dc5adb8830d, move the
certificate variables to the group definition file, so that we don't
have to duplicate handlers or definitions for the testing host.
Change-Id: I6650f5621a4969582f40700232a596d84e2b4a06
Currently we define the letsencrypt certs for each host in its
individual host variables.
With recent work we have a trusted CA and SAN names set up in
our testing environment, introducing the possibility that we could
accidentally reference the production host during testing (both have
valid certs, as far as the testing hosts are concerned).
To avoid this, we can use our naming scheme to move our testing hosts
to "99" and avoid collision with the production hosts. As a bonus,
this really makes you think more about your group/host split to get
things right and keep the environment as abstract as possible.
One example of this is that with letsencrypt certificates defined in
host vars, testing and production need to use the same hostname to get
the right certificates created. Really, this should be group-level
information so it applies equally to host01 and host99. To cover
"hostXX.opendev.org" as a SAN we can include the inventory_hostname in
the group variables.
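A hedged sketch of group-level certificate data of this kind (the
variable layout and file path are illustrative, not the repo's exact
contents):

  # inventory/service/group_vars/static.yaml (illustrative)
  letsencrypt_certs:
    static-main:
      - static.opendev.org
      - "{{ inventory_hostname }}"

This way both a host01 and a host99 request a certificate covering the
service name plus their own hostname.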
This updates one of the more tricky hosts, static, as a proof of
concept. We rename the handlers to be generic, and update the testing
targets.
Change-Id: Id98768e29a06cebaf645eb75b39e4dc5adb8830d
I've seen a couple of jobs timeout on this for no apparent reason.
Loading all the repos just seems to take a long time. Looking at the
logs [1], depending on the cloud taking 55m - 1h is not terribly
uncommon. Increase the timeout on this by 20 minutes to give it
enough headroom over an hour.
[1] https://zuul.opendev.org/t/openstack/builds?job_name=system-config-run-gitea&project=opendev%2Fsystem-config
Change-Id: I51080820bae35ac615a3b8b7ee1b8890e0df8410