This is a first step toward making smaller playbooks which can be
run by Zuul in CD.
Zuul should be able to handle missing projects now, so remove it
from the puppet_git playbook and into puppet.
Make the base playbook be merely the base roles.
Make service playbooks for each service.
Remove the run-docker job because it's covered by service jobs.
Stop testing that puppet is installed in testinfra. It's accidentally
working due to the selection of non-puppeted hosts only being on
bionic nodes and not installing puppet on bionic. Instead, we can now
rely on actually *running* puppet when it's important, such as in the
eavesdrop job. Also remove the installation of puppet on the nodes in
the base job, since it's only useful to test that a synthetic test
of installing puppet on nodes we don't use works.
Don't run remote_puppet_git on gitea for now - it's too slow. A
followup patch will rework gitea project creation to not take hours.
Change-Id: Ibb78341c2c6be28005cea73542e829d8f7cfab08
This change contains the roles and testing for deploying certificates
on hosts using letsencrypt with domain authentication.
From a top level, the process is implemented in the roles as follows:
1) letsencrypt-acme-sh-install
This role installs the acme.sh tool on hosts in the letsencrypt
group, along with a small custom driver script to help parse output
that is used by later roles.
2) letsencrypt-request-certs
This role runs on each host, and reads a host variable describing
the certificates required. It uses the acme.sh tool (via the
driver) to request the certificates from letsencrypt. It populates
a global Ansible variable with the authentication TXT records
required.
If the certificate exists on the host and is not within the renewal
period, it should do nothing.
3) letsencrypt-install-txt-record
This role runs on the adns server. It installs the TXT records
generated in step 2 to the acme.opendev.org domain and then
refreshes the server. Hosts wanting certificates will have
pre-provisioned CNAME records for _acme-challenge.host.opendev.org
pointing to acme.opendev.org.
4) letsencrypt-create-certs
This role runs on each host, reading the same variable as in step
2. However this time the acme.sh tool is run to authenticate and
create the certificates, which should now work correctly via the
TXT records from step 3. After this, the host will have the
full certificate material.
Testing is added via testinfra. For testing purposes requests are
made to the staging letsencrypt servers and a self-signed certificate
is provisioned in step 4 (as the authentication is not available
during CI). We test that the DNS TXT records are created locally on
the CI adns server, however.
Related-Spec: https://review.openstack.org/587283
Change-Id: I1f66da614751a29cc565b37cdc9ff34d70fdfd3f
Although we only have adns01, for testing purposes it would be handy
to have another adns server in testinfra (this way, we can write tests
for letsencrypt paths that don't try and execute on the existing dns
testing paths).
Change-Id: Ie1968660c110bdb626df637f182f1f39598e59ac
This runs an haproxy which is strikingly similar to the one we
currently run for git.openstack.org, but it is run in a docker
container.
Change-Id: I647ae8c02eb2cd4f3db2b203d61a181f7eb632d2
We want our base ansible roles to run on these nodes. However,
k8s-on-openstack manages firewall rules via openstack security
groups, so we don't want to run those there.
There was a discussion about making a minimal set of roles that
are run by default and then a group containing servers that got
the full set ... but that would require a duplicate entry for 99%
of our servers in the inventory, while the "only run a subset" is
the exception case.
Change-Id: I2cbf364305f758cecf11df41398d3d2c05222fda
Add the gitea k8s cluster to root's .kube/config file on bridge.
The default context does not exist in order to force us to explicitly
specify a context for all commands (so that we do not inadvertently
deploy something on the wrong k8s cluster).
Change-Id: I53368c76e6f5b3ab45b1982e9a977f9ce9f08581
This is a role for installing docker on our control-plane servers.
It is based on install-docker from zuul-jobs.
Basic testinfra tests are added; because docker fiddles the iptables
rules in magic ways, the firewall testing is moved out of the base
tests and modified to partially match our base firewall configuration.
Change-Id: Ia4de5032789ff0f2b07d4f93c0c52cf94aa9c25c
This adds connection information for an experimental kubernetes
cluster hosted in vexxhost-sjc1 to the nodepool servers.
Change-Id: Ie7aad841df1779ddba69315ddd9e0ae96a1c8c53
Let's abandon the idea that we'll treat the backup server specially.
As long as we allow *any* automated remote access via ansible, we
have opened the door to potential compromise of the backup systems
if bridge is compromised. Rather than pretending that this separation
gives us any benefit, remove it.
Change-Id: I751060dc05918c440374e80ffb483d948f048f36
In run_all, we start a bunch of plays in sequence, but it's difficult
to tell what they're doing until you see the tasks. Name the plays
themselves to produce a better narrative structure.
Change-Id: I0597eab2c06c6963601dec689714c38101a4d470
This manages the clouds.yaml files in ansible so that we can get them
updated automatically on bridge.openstack.org (which does not puppet).
Co-Authored-By: James E. Blair <jeblair@redhat.com>
Depends-On: https://review.openstack.org/598378
Change-Id: I2071f2593f57024bc985e18eaf1ffbf6f3d38140
In order to talk to limestone clouds we need to configure a custom CA.
Do this in ansible instead of puppet.
A followup should add writing out clouds.yaml files.
Change-Id: I355df1efb31feb31e039040da4ca6088ea632b7e
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
This role manages puppet on the host and should be applied to all
puppet hosts.
The dependent change is required to get the "puppet" group into the
generated inventory for the system-config-run-base test.
Change-Id: I0e18c53836baca743d32abf1bb4b7a3f63c025bb
Depends-On: https://review.openstack.org/596994
Contains a handler to restart crond when tz is changed. Cron service
name differs across distros.
Removes the puppet-timezone usage.
Change-Id: I4e45d0e0ed37214ac491f373ff2d37750e720718
Co-Authored-By: James E. Blair <corvus@inaugust.com>
Change-Id: Id8b347483affd710759f9b225bfadb3ce851333c
Depends-On: https://review.openstack.org/596503
Now that we've got base server stuff rewritten in ansible, remove the
old puppet versions.
Depends-On: https://review.openstack.org/588326
Change-Id: I5c82fe6fd25b9ddaa77747db377ffa7e8bf23c7b
In order to get puppet out of the business of mucking with exim and
fighting ansible, finish moving the config to ansible.
This introduces a storyboard group that we can use to apply the exim
config across both servers. It also splits the base playbook so that we
can avoid running exim on the backup servers. And we set
purge_apt_sources the same as was set in puppet. We should probably
remove it though, since none of us have any clue why it's here.
Change-Id: I43ee891a9c1beead7f97808208829b01a0a7ced6
The mailing list servers have a more complex exim config. Put the
routers and transports into ansible variables.
While we're doing it, role variables with an exim_ prefix - since 'routers'
as a global variable might be a little broad.
iteritems isn't a thing in python3, only items.
We need to escape the exim config with ${if or{{ - because of the {{
which looks like jinja. Wrap it in a {% raw %} block.
Getting the yaml indentation right for things here is non-trivial. Make
them strings instead.
Add a README.rst file - and use the zuul:rolevar construct in it,
because it's nice.
Change-Id: Ieccfce99a1d278440c5baa207479a1887898298e
We want email to work.
Add a default value so that integration tests work - and update the
template so that if the value in the alias mapping is empty we don't
write out a half-formed alias.
Enable the epel repo on CentOS nodes in base-repos. This is done in
install_puppet.sh, but install_puppet.sh doesn't get run on ansible-only
nodes.
Change-Id: I68ad9f66c3b8672d9642c7764e50adac9cafdaf9
Split base playbook into two plays
The update apt-cache handler from base-repos needs to fire before we run
base-server. Split into two plays so that the handler will fire.
Fix use of first_found
For include_vars, using the lookup version of first_found requires being
explicit about the path to search in as well. We also need to use query
together with loop to get skip to work right.
Extract the list of file locations we look for for distro and platform
specific variables into a variable so that we can reuse it instead of
copy-pasta.
The vim package is vim-nox on ubuntu and vim-minimal on debian.
ntpdate only needs to be enabled on boot, it does not need to be
immediately started. At least, that's what the old puppet was doing and
trying to start it immediately breaks centos integration tests.
emacs-nox is emacs23-nox on trusty.
Change-Id: If3db276a5f6a8f76d7ce8635da8d2cbc316af341
Depends-On: https://review.openstack.org/588326
We want to launch a new bastion host to run ansible on. Because we're
working on the transition to ansible, it seems like being able to do
that without needing puppet would be nice. This gets user management,
base repo setup and whatnot installed. It doesn't remove them from the
existing puppet, nor does it change the way we're calling anything that
currently exists.
Add bridge.openstack.org to the disabled group so that we don't try to
run puppet on it.
Change-Id: I3165423753009c639d9d2e2ed7d9adbe70360932