We've been running against the dev branch of acme.sh since the initial
commit of the letsencrypt work -- at the time, I believe, there were
things we needed that weren't in a release. There is now an issue
that causes ECC certificates to be created, which then fail to renew
[1], and which we can't work around.
Pin this to the current release. It would probably be good to pin
this to the "latest" release, so that we never forget to bump it and
end up with even harder-to-debug bit-rot.
[1] https://github.com/acmesh-official/acme.sh/issues/4416
Change-Id: I0d07ba1b5ab77e07c67ad990e7bc78a9f90005a4
In what looks like a typo, we are overriding the bridge node for this
test to a Bionic host. Remove the override. This was detected while
testing an upgraded Ansible, which wouldn't install on the older
Python available on Bionic.
Change-Id: Ie3e754598c6da1812e74afa914f50d91972012cd
These images have a number of issues we've identified and worked
around. The current iteration of this change is essentially
identical to upstream, with a minor tweak to allow the latest
mailman version and adjusted paths for the hyperkitty and postorius
URLs to match those in the upstream mailman-web codebase. It doesn't
try to address the other items. However, we should consider moving
our fixes from ansible into the docker images where possible and
upstreaming those updates.
Unfortunately, upstream hasn't been very responsive so far, hence this
fork. For tracking purposes, here are the issues/PRs we've already filed
upstream:
https://github.com/maxking/docker-mailman/pull/552
https://github.com/maxking/docker-mailman/issues/548
https://github.com/maxking/docker-mailman/issues/549
https://github.com/maxking/docker-mailman/issues/550
Change-Id: I3314037d46c2ef2086a06dea0321d9f8cdd35c73
This turns launch-node into an installable package. This is not meant
for distribution; we just encapsulate the installation in a virtualenv
on the bastion host. Small updates to documentation and simple
testing are added (this also removes some spaces to make
test_bridge.py consistent).
Change-Id: Ibcb4774114d73600753ca155ed277d775964bc79
It looks like at some point the RAX bind output changed format
slightly, which messed up our backup script. Rework it to parse the
current output.
This parsing is obviously a little fragile, but it is nice to have the
output sorted and lined up nicely (like our manually maintained
opendev.org bind files). If the format changes again and this
becomes a problem, maybe we switch to dumping the RAX output directly
and forget about formatting it nicely.
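As an illustration of the sort-and-align step described above, here
is a minimal sketch. It is not the backup script itself: the function
name and the assumed record layout (name, ttl, class, type, data,
separated by whitespace) are hypothetical, and the real RAX output
may differ.

```python
def format_records(lines):
    """Sort whitespace-separated zone records and pad columns to line up."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and zone-file style comments.
        if not line or line.startswith(";"):
            continue
        # Assumed layout: name, ttl, class, type, then free-form data.
        # A line with fewer fields raises ValueError, which is exactly
        # the kind of fragility the commit message mentions.
        name, ttl, rclass, rtype, data = line.split(None, 4)
        rows.append((name, ttl, rclass, rtype, data))
    if not rows:
        return []
    # Sort by name, then record type, then data.
    rows.sort(key=lambda r: (r[0], r[3], r[4]))
    # Pad the first four columns to the widest entry in each column.
    widths = [max(len(r[i]) for r in rows) for i in range(4)]
    return [
        " ".join(r[i].ljust(widths[i]) for i in range(4)) + " " + r[4]
        for r in rows
    ]
```

Dumping the raw output instead would remove this parsing entirely, at
the cost of less readable diffs between backups.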
Change-Id: I742dd6ef9ffdb377274b384b847625c98dd5ff16
Grab the make logs from the dkms directory. This is helpful if the
modules are failing to build.
The /var/lib/dkms directory contains all the source and object files,
etc., which seems unnecessary to store in general. Thus we just trim
this to the log directory.
Change-Id: I9b5abc9cf4cd59305470a04dda487dfdfd1b395a
This was missed in I137ab824b9a09ccb067b8d5f0bb2896192291883.
The called bootstrap playbook runs on prod_bastion[0], but we were
still calling the constructed gate group "bastion" (see note below on
what it's doing).
We don't notice because the multi-node setup is already making it so
the nodes can log into each other. But it means we're not exercising
the root key addition role, which we should be doing in the gate.
Change-Id: I8238fc11a055c6d926b58df93c48a47121c0fde1
Add the host keys to the inventory. This will allow us to populate
the known_hosts on the bastion host from system-config.
Change-Id: I4863425d5b784d0cdf118e1252414ca78fd24179
At some point we shifted from doing this task using the web UI to
primarily using ssh-only admin accounts. The docs ended up in a slightly
confusing place, with steps that only make sense when you interact with
the web UI. Update the force merge docs to assume ssh-only access, which
is far better aligned with our admin account expectations.
Change-Id: Ia99afe7ee10927765733891f72bd428e52fa2225
This adds links to @opendevinfra; the Mastodon one allows us to have a
"green" certified link to opendev.org in our Mastodon profile.
Change-Id: Ic127ceb4abd2d89cd6155e8831145fa3b3705664
The dependent change allows us to also post to mastodon. Configure
this to point to fosstodon where we have an opendevinfra account.
Change-Id: Iafa8074a439315f3db74b6372c1c3181a159a474
Depends-On: https://review.opendev.org/c/opendev/statusbot/+/864586
This should now be a largely functional deployment of mailman 3. There
are still some bits that need testing but we'll use followup changes to
force failure and hold nodes.
This deployment of mailman3 uses upstream docker container images. We
currently hack up uids and gids to accommodate that. We also hack up the
settings file and bind mount it over the upstream file in order to use
host networking. We override the hyperkitty index type to xapian. All
list domains are hosted in a single installation and we use native
vhosting to handle that.
We'll deploy this to a new server and migrate one mailing list domain at
a time. This will allow us to start with lists.opendev.org and test
things like dmarc settings before expanding to the remaining lists.
A migration script is also included, which has seen extensive
testing on held nodes for importing copies of the production data
sets.
Change-Id: Ic9bf5cfaf0b87c100a6ce003a6645010a7b50358
These were forgotten in I137ab824b9a09ccb067b8d5f0bb2896192291883
when we switched the testing bridge host to bridge99.
Change-Id: I742965c61ed00be05f1daea2d6110413cff99e2a
Gerrit made new releases and we should update to them. Release notes can
be found here:
https://www.gerritcodereview.com/3.5.html#354
https://www.gerritcodereview.com/3.6.html#363
The main improvement for us is likely to be the performance boosts
and error handling for copy approvals, which we still need to run
prior to our 3.6 upgrade.
Note that we currently only run 3.5 in production, but we test the 3.6
upgrade from our current production version, so it makes sense to update
the 3.6 image as well.
Change-Id: Idf9a16b443907a2d0c19c1b6ec016f5d16583ad2
This adds optional SSL support to zookeeper-statsd. This could
come in handy if we ever decide to turn off the plaintext
localhost-only port.
This also corrects the type handling for the latency value, which
can be a floating point.
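To illustrate the type handling, here is a minimal sketch of parsing
statistic values that may be either integers or floats. This is not
the zookeeper-statsd code; the function names, the tab-separated
"key<TAB>value" layout and the sample keys are assumptions made for
the example.

```python
def parse_value(raw):
    """Return an int where possible, otherwise a float, otherwise None."""
    try:
        return int(raw)
    except ValueError:
        pass
    try:
        # Values such as an average latency may be reported as "0.25".
        return float(raw)
    except ValueError:
        # Non-numeric values (for example version strings) are ignored.
        return None


def parse_stats(text):
    """Parse assumed 'key<TAB>value' lines into a dict of numeric stats."""
    stats = {}
    for line in text.splitlines():
        key, sep, raw = line.partition("\t")
        if not sep:
            continue
        value = parse_value(raw.strip())
        if value is not None:
            stats[key] = value
    return stats
```

Trying int() first keeps counters as integers, so only values that
really are fractional end up as floats.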
Change-Id: Id39fc8bd924eda528723c40d2e7e24993a60d6a5
The OpenInfra Summit organizers have decided they're going back to
using the term "track chairs" instead of "programming committee" and
would like to switch to a new mailing list name in order to
coordinate things for the upcoming conference. Remove the old list
from our configuration when adding the new one, and set up a
forwarding alias for the old list's address so that replies to
previous messages will end up in the right place.
Change-Id: I8060b78b74f66dd8eb95d83659cc92b3186f573e
A recent change in pip wheel cache behavior had upstream pip indicating
that we really should be using pip wheel instead. The reason we weren't
using pip wheel appears to be that we wanted to infer what top level
wheel to install via contents of a dir separate from our wheel output
dir/wheel cache. Using pip wheel implies everything gets flattened into
one location. We deal with this by having the build tool write all of
the top level wheels we care about into a separate location. Later we
can install all of the top level wheels while pointing find links at the
larger set of deps in the dir created by pip wheel.
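The two-directory layout described above can be sketched roughly as
follows. This is not the build tool's actual code: the function
names and directory arguments are hypothetical, and building the top
level wheel with --no-deps is only one assumed way to keep it
separate. The sketch only constructs the pip command lines; it does
not run them.

```python
import glob
import os
import sys


def build_commands(src_dir, top_dir, deps_dir):
    """Return pip command lines for the assumed two-directory layout."""
    pip = [sys.executable, "-m", "pip"]
    return [
        # Only the top level wheel goes into its own directory.
        pip + ["wheel", "--no-deps", "--wheel-dir", top_dir, src_dir],
        # The package and all of its dependencies are flattened into
        # one wheel directory.
        pip + ["wheel", "--wheel-dir", deps_dir, src_dir],
    ]


def install_command(top_dir, deps_dir):
    """Install every top level wheel, resolving deps via --find-links."""
    wheels = sorted(glob.glob(os.path.join(top_dir, "*.whl")))
    pip = [sys.executable, "-m", "pip"]
    return pip + ["install", "--find-links", deps_dir] + wheels
```

The install step infers what to install purely from the contents of
the top level directory, which is the property the old layout
provided.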
Change-Id: Id9c674c1ec6fe5e72534549082e3adda9e286fd5
The recent uwsgi 2.0.21 release claims to have fixed issues building
uwsgi that required us to increase pip verbosity and reduce concurrency.
Remove those hacky workarounds in order to simplify our image.
Change-Id: I8b81bc3a5e6977ba8cd296708f356bc6db030fc2
Redirect etherpad container logs via rsyslogd to /var/log/containers,
which is rotated by default. This avoids some issues we've seen with
the journal becoming too big.
Change-Id: Id557b9265e30acdb2ca09631dbedf034f85a700f
This job is special in that we want it to install only on the
production bastion host. Pin it directly to the current host, and
leave a note about changing it when the bridge node is updated.
Change-Id: I15303daedef62d3002f0126c7782c59cc6ad2a8e
In thinking harder about the bootstrap process, it struck me that the
"bastion" group we have is two separate ideas that become a bit
confusing because they share a name.
We have the testing and production paths that need to find a single
bridge node so they can run their nested Ansible. We've recently
merged changes to the setup playbooks to not hard-code the bridge node
and they now use groups["bastion"][0] to find the bastion host -- but
this group is actually orthogonal to the group of the same name
defined in inventory/service/groups.yaml.
The testing and production paths are running on the executor, and, as
mentioned, need to know the bridge node to log into. For the testing
path this is happening via the group created in the job definition
from zuul.d/system-config-run.yaml. For the production jobs, this
group is populated via the add-bastion-host role which dynamically
adds the bridge host and group.
Only the *nested* Ansible running on the bastion host reads
s-c:inventory/service/groups.yaml. None of the nested-ansible
playbooks need to target only the currently active bastion host. For
example, we can define as many bridge nodes as we like in the
inventory and run service-bridge.yaml against them. It won't matter
because the production jobs know the host that is the currently active
bridge as described above.
So, instead of using the same group name in two contexts, rename the
testing/production group "prod_bastion". groups["prod_bastion"][0]
will be the host that the testing/production jobs use as the bastion
host -- references are updated in this change (i.e. the two places
this group is defined -- the group name in the system-config-run jobs,
and add-bastion-host for production).
We can then return the "bastion" group match to bridge*.opendev.org in
inventory/service/groups.yaml.
This fixes a bootstrapping problem -- if you launch, say,
bridge03.opendev.org the launch node script will now apply the
base.yaml playbook against it, and correctly apply all variables from
the "bastion" group, which now matches this new host. This is what we
want, to ensure that e.g. the zuul user and keys are correctly populated.
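The effect of the glob match can be illustrated with a small sketch.
This is not the inventory plugin itself; the GROUPS mapping and the
groups_for helper are hypothetical, and shell-style matching via
fnmatch is an assumption about how the pattern is evaluated.

```python
from fnmatch import fnmatch

# Hypothetical stand-in for the relevant part of
# inventory/service/groups.yaml.
GROUPS = {
    "bastion": ["bridge*.opendev.org"],
}


def groups_for(host, groups=GROUPS):
    """Return the sorted names of all groups whose patterns match host."""
    return sorted(
        name
        for name, patterns in groups.items()
        if any(fnmatch(host, pattern) for pattern in patterns)
    )
```

With this pattern a newly launched bridge03.opendev.org lands in the
"bastion" group, and so does the bridge99.opendev.org name used on
the testing path, without any host being listed explicitly.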
The other thing we can do here is change the testing path
"prod_bastion" hostname to "bridge99.opendev.org". By doing this we
ensure we're not hard-coding for the production bridge host in any way
(since if both testing and production are called bridge01.opendev.org
we can hide problems). This is a big advantage when we want to rotate
the production bridge host, as we can be certain there are no hidden
dependencies.
Change-Id: I137ab824b9a09ccb067b8d5f0bb2896192291883