32 Commits

Author SHA1 Message Date
Clark Boylan
506a11f9d2 Add ansible role to manage gerritbot
This new ansible role deploys gerritbot with docker-compose on
eavesdrop.openstack.org. This way we can run it where the other bots
live.

Testing is rudimentary for now as we don't really want to connect to a
production gerrit and freenode. We check things the best we can.

We will want to coordinate deployment of this change with disabling the
running service on the gerrit server.

Depends-On: https://review.opendev.org/745240
Change-Id: I008992978791ff0a38f92fb4bc529ff643f01dd6
2020-08-07 13:20:18 -07:00
Ian Wienand
ba45f251d1 Fix junit error, add HTML report
Specifying the family stops a deprecation warning being output.

Add an HTML report and report it as an artifact as well; this is easier
to read.

Change-Id: I2bd6505c19cee2d51e9af27e9344cfe2e1110572
2020-07-15 07:03:22 +10:00
Ian Wienand
a020568ee5 Copy generated inventory to bridge logs
This is the inventory generated and used by bridge; copy it into the
logs as well.

Change-Id: I15d0ddc4c8340735c0332139ddedc06fc05b8269
2020-07-15 07:03:22 +10:00
Ian Wienand
185797a0e5 Graphite container deployment
This deploys graphite from the upstream container.

We override the statsd configuration to have it listen on ipv6.
Similarly we override the nginx config to listen on ipv6, enable ssl,
forward port 80 to 443, and block the /admin page (we don't use it).

For production we will just want to put some cinder storage in
/opt/graphite/storage on the production host and figure out how to
migrate the old stats.  There is also a bit of cleanup that will follow,
because we half-converted grafana01.opendev.org -- so everything can't
be in the same group till that is gone.

Testing has been added to push some stats and ensure they are seen.

Change-Id: Ie843b3d90a72564ef90805f820c8abc61a71017d
2020-07-03 07:17:28 +10:00
Ian Wienand
b146181174 Grafana container deployment
This uses the Grafana container created with
Iddfafe852166fe95b3e433420e2e2a4a6380fc64 to run the
grafana.opendev.org service.

We retain the old model of an Apache reverse-proxy; it's well tested
and understood, it's much easier than trying to map all the SSL
termination/renewal/etc. into the Grafana container and we don't have
to convince ourselves the container is safe to be directly web-facing.

Otherwise this is a fairly straightforward deployment of the
container.  As before, it uses the graph configuration kept in
project-config which is loaded in with grafyaml, which is included in
the container.

One nice advantage is that it makes it quite easy to develop graphs
locally, using the container which can talk to the public graphite
instance.  The documentation has been updated with a reference on how
to do this.

Change-Id: I0cc76d29b6911aecfebc71e5fdfe7cf4fcd071a4
2020-07-03 07:17:22 +10:00
Clark Boylan
9b5e5d3c57 Deal with gitea pagination of repo lists
We list gitea repos to determine if we need to create a repo. If the
repo isn't listed by gitea we create it. New gitea paginates these
listings so we were only getting 30 repos listed when we had far more.
This resulted in us trying to create repos which already exist, which
gitea rejects with an HTTP 409 error.

Fix this by paging through the listings until we've seen all the
repos. This should give us a complete listing.

To test this we run our manage-projects playbook twice in the
system-config-run-gitea job. The first pass creates all the new
projects. Then the second pass should noop cleanly.

Change-Id: I73b77b9ddaa0106d4dc0a49c4d4b7751a39a16f9
Co-Authored-By: Jeremy Stanley <fungi@yuggoth.org>
2020-06-25 13:51:27 -07:00
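
The paging fix described in the commit above can be sketched roughly as follows. This is a minimal illustration and not the code from the change: `fetch_page` is a hypothetical callable standing in for one HTTP request to a paginated listing, `make_fake_listing` is an in-memory stand-in for the server, and the page size of 30 is taken from the commit message.

```python
from typing import Callable, List


def list_all_repos(fetch_page: Callable[[int], List[str]]) -> List[str]:
    """Collect every repo name by requesting pages until one comes back empty.

    fetch_page(page) is a hypothetical helper returning the repo names on
    the given 1-indexed page, or an empty list past the last page.
    """
    repos: List[str] = []
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        repos.extend(batch)
        page += 1
    return repos


def make_fake_listing(names: List[str], page_size: int = 30) -> Callable[[int], List[str]]:
    """Build an in-memory stand-in for a paginated listing endpoint."""
    def fetch_page(page: int) -> List[str]:
        start = (page - 1) * page_size
        return names[start:start + page_size]
    return fetch_page


existing = [f"org/repo-{i}" for i in range(75)]
fetch = make_fake_listing(existing)

first_page_only = fetch(1)         # what a client that ignores paging sees
all_repos = list_all_repos(fetch)  # what a client that pages through sees

# Repos beyond the first page look "missing" to the non-paging client,
# which would then try to create them and hit a conflict.
missing_without_paging = [r for r in existing if r not in first_page_only]
```

Running the same comparison a second time against `all_repos` finds nothing to create, which mirrors the "second pass should noop" test described in the message.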
Ian Wienand
8acd503692 mirror-update: update to focal
We want a more recent version of rsync, and upgrading to focal is one
easy way to get it, and to also have a base OS with a longer support
period.  Test it in the gate.

Change-Id: I1edf074e5fe788ef75693d2cd172370c05bf4732
2020-06-18 14:24:27 +10:00
Monty Taylor
9b28b8864a Run restart playbooks to test they work
This runs our zuul and nodepool restart playbooks after the initial
service installs in the system-config-run jobs. This will help ensure
that they work consistently over time.

Fix nodepool restart playbook

Change-Id: I953e7222218c5555bb44fccd913eaa5e9374c669
2020-06-16 12:03:00 -05:00
James E. Blair
ac5fc652f4 Merge "Fake zuul_connections for gate"
2020-06-15 21:47:49 +00:00
James E. Blair
e989281e02 Merge "Stop using backend hostname in zuul testinfra tests"
2020-06-15 21:47:43 +00:00
James E. Blair
7f7c155555 Fake zuul_connections for gate
We can't establish Gerrit or Github connections in the gate, so
Zuul fails to start.  Reducing the set of connections in the gate
to just smtp should allow it to start (albeit with tenant loading
errors).  But that should let us test basic system setup and
internal connectivity.

Change-Id: I39d648ac5dd6ee3e9bfbc026cd6d7142461c418c
2020-06-15 09:57:39 -07:00
Zuul
7c913ab48b Merge "Test etherpad with testinfra"
2020-06-12 00:03:54 +00:00
Clark Boylan
7caf3a6c6d Test etherpad with testinfra
This adds simple testing of the etherpad service to testinfra.

Change-Id: I3c89a0a92a41cf69d075d6cef99fa12db68b17c6
2020-06-11 10:24:39 -07:00
James E. Blair
3d6cefe9dd Stop using backend hostname in zuul testinfra tests
Tests that call host.backend.get_hostname() to switch on test
assertions are likely to fail open.  Stop using this in zuul tests
and instead add new files for each of the types of zuul hosts
where we want to do additional verification.

Share the iptables related code between all the tests that perform
iptables checks.

Also, some extra merger tests and some negative assertions are added.

Move multi-node-hosts-file to after set-hostname. multi-node-hosts-file
is designed to append, and set-hostname is designed to write.

When we write the gate version of the inventory, map the nodepool
private_ipv4 address as the public_v4 address of the inventory host
since that's what is written to /etc/hosts, and is therefore, in the
context of a gate job, the "public" address.

Change-Id: Id2dad08176865169272a8c135d232c2b58a7a2c1
2020-06-10 14:48:40 -07:00
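
To illustrate why a hostname-switched test can "fail open", here is a small self-contained sketch. It does not use testinfra itself: hosts are modelled as plain dictionaries, and the hostnames, the `services` key, and both check functions are hypothetical names chosen for this example only.

```python
def switched_check(host: dict) -> int:
    """Hostname-switched style: assertions only run when the name matches.

    Returns the number of assertions that actually ran.
    """
    checked = 0
    if host["hostname"] == "ze01.example.org":  # hypothetical production name
        assert "zuul-executor" in host["services"]
        checked += 1
    return checked


def executor_check(host: dict) -> int:
    """Per-host-type style: the assertion always runs for hosts it is given."""
    assert "zuul-executor" in host["services"]
    return 1


# A broken host whose hostname in the test environment differs from the
# name the switch expects, and whose service is missing.
broken_gate_host = {"hostname": "ze01", "services": set()}

# The switched check runs zero assertions and therefore "passes".
silently_passed = switched_check(broken_gate_host) == 0

# The dedicated check notices the missing service.
try:
    executor_check(broken_gate_host)
    caught = False
except AssertionError:
    caught = True
```

With one test file per host type, selecting which hosts a file applies to is left to the test runner rather than to an `if` inside the test. testinfra supports this through a module-level `testinfra_hosts` list, which is one way such per-type files can be scoped.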
Monty Taylor
83ced7f6e6 Split inventory into multiple dirs and move hostvars
Make inventory/service for service-specific things, including the
groups.yaml group definitions, and inventory/base for hostvars
related to the base system, including the list of hosts.

Move the existing host_vars into inventory/service, since most of
them are likely service-specific. Move group_vars/all.yaml into
base/group_vars as almost all of it is related to base things,
with the exception of the gerrit public key.

A followup patch will move host-specific values into equivalent
files in inventory/base.

This should let us override hostvars in gate jobs. It should also
allow us to do better file matchers - and to be able to organize
our playbooks more if we want to.

Depends-On: https://review.opendev.org/731583
Change-Id: Iddf57b5be47c2e9de16b83a1bc83bee25db995cf
2020-06-04 07:44:36 -05:00
Monty Taylor
f27c170d01 Rename service-letsencrypt to just letsencrypt
This isn't a service, it's a meta thing that we run for different
hosts at different times.

Change-Id: Ib65665c98afb3ddb94b15346931be88a4b1757d8
2020-06-04 07:44:36 -05:00
Monty Taylor
d93a661ae4 Run iptables in service playbooks instead of base
It's the only part of base that's important to run when we run a
service. Run it in the service playbooks and get rid of the
dependency on infra-prod-base.

Continue running it in base so that new nodes are brought up
with iptables in place.

Bump the timeout for the mirror job, because the iptables addition
seems to have just bumped it over the edge.

Change-Id: I4608216f7a59cfa96d3bdb191edd9bc7bb9cca39
2020-06-04 07:44:22 -05:00
Zuul
3f61433c59 Merge "Generate ssl check list directly from letsencrypt variables"
2020-05-28 23:31:11 +00:00
Monty Taylor
e8716e742e Move base roles into a base subdir
If we move these into a subdir, it cleans up the number of things
we have to match files on.

Stop running disable-puppet-agent in base. We run it in run-puppet
which should be fine.

Change-Id: Ia16adb96b11d25a097490882c4c59a50a0b7b23d
2020-05-27 16:28:37 -05:00
Clark Boylan
eb22e01f31 Add support for multiple jvbs behind meetpad
The jitsi video bridge (jvb) appears to be the main component we'll need
to scale up to handle more users on meetpad. Start preliminary
ansiblification of scale out jvb hosts.

Note this requires each new jvb to run on a separate host as the jvb
docker images seem to rely on $HOSTNAME to uniquely identify each jvb.

Change-Id: If6d055b6ec163d4a9d912bee9a9912f5a7b58125
2020-05-20 13:41:30 -07:00
James E. Blair
085856e318 Add iptables_extra_allowed_groups
This adds a new variable for the iptables role that allows us to
indicate all members of an ansible inventory group should have
iptables rules added.

It also removes the unused zuul-executor-opendev group, and some
unused variables related to the snmp rule.

Also, collect the generated iptables rules for debugging.

Change-Id: I48746a6527848a45a4debf62fd833527cc392398
Depends-On: https://review.opendev.org/728952
2020-05-20 13:18:29 -07:00
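
As a rough sketch of the idea behind `iptables_extra_allowed_groups`, the snippet below expands group-based entries into one rule per group member. Everything except the variable name is an assumption made for illustration: the `protocol`/`port`/`group` keys, the shape of the inventory data, the addresses, and the exact rule text are not taken from the change.

```python
from typing import Dict, List


def rules_for_groups(
    allowed: List[Dict[str, str]],
    groups: Dict[str, List[str]],
    addresses: Dict[str, str],
) -> List[str]:
    """Expand each group-based entry into one ACCEPT rule per group member."""
    rules = []
    for entry in allowed:
        # An unknown or empty group simply contributes no rules.
        for host in groups.get(entry["group"], []):
            rules.append(
                f"-A INPUT -p {entry['protocol']} -s {addresses[host]} "
                f"--dport {entry['port']} -j ACCEPT"
            )
    return rules


# Hypothetical inventory data (documentation address range).
groups = {"zuul-executor": ["ze01.example.org", "ze02.example.org"]}
addresses = {
    "ze01.example.org": "203.0.113.10",
    "ze02.example.org": "203.0.113.11",
}

# Assumed shape of the variable: one entry per protocol/port/group.
iptables_extra_allowed_groups = [
    {"protocol": "tcp", "port": "4730", "group": "zuul-executor"},
]

rules = rules_for_groups(iptables_extra_allowed_groups, groups, addresses)
```

The appeal of this approach is that adding a host to the inventory group is enough to open the port for it, with no per-host rule to maintain.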
James E. Blair
b173fcb1d9 Vendor the apt repo gpg keys used for Zuul
We use several PPAs on the Zuul servers, and today the Ubuntu keyring
servers are frequently failing.  Rather than rely on them, store the
GPG keys in this repo and install the files "manually" rather than
using the apt_repo module.

Change-Id: I009a1a38d3a5864a8d5b0d8f8be24a83d1924292
2020-05-20 13:17:09 -07:00
James E. Blair
7a63dad5c1 Save zuul and nodepool logs from gate test jobs
Let's save our debug logs so we can better observe the system in
the gate.

Change-Id: Ic80b646e0407d27e43cdb10cb573551999dd01d4
2020-05-20 13:17:08 -07:00
Ian Wienand
c9215801f0 Generate ssl check list directly from letsencrypt variables
This autogenerates the list of ssl domains for the ssl-cert-check tool
directly from the letsencrypt list.

The first step is the install-certcheck role that replaces the
puppet-ssl_cert_check module that does the same.  The reason for this
is so that during gate testing we can test this on the test
bridge.openstack.org server, and avoid adding another node as a
requirement for this test.

letsencrypt-request-certs is updated to set a fact
letsencrypt_certcheck_domains for each host that is generating a
certificate.  As described in the comments, this defaults to the first
host specified for the certificate and the listening port can be
indicated (if set, this new port value is stripped when generating
certs as it is not necessary for certificate generation).

The new letsencrypt-config-certcheck role runs and iterates all
letsencrypt hosts to build the final list of domains that should be
checked.  This is then extended with the
letsencrypt_certcheck_additional_domains value that covers any hosts
using certificates not provisioned by letsencrypt using this
mechanism.

These additional domains are pre-populated from the openstack.org
domains in the extant check file, minus those openstack.org domain
certificates we are generating via letsencrypt (see
letsencrypt-create-certs/handlers/main.yaml).  Additionally, we
update some of the certificate variables in host_vars that are
listening on port .

As mentioned, bridge.openstack.org is placed in the new certcheck
group for gate testing, so the tool and config file will be deployed
to it.  For production, cacti is added to the group, which is where
the tool currently runs.  The extant puppet installation is disabled,
pending removal in a follow-on change.

Change-Id: Idbe084f13f3684021e8efd9ac69b63fe31484606
2020-05-20 14:27:14 +10:00
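
The flow described in the commit above (strip any port for certificate generation, take the first name of each certificate as the check target, then append the additional domains) might be sketched like this. It is an illustration under stated assumptions rather than the roles' actual logic: the data layout, the helper names, the example hosts, the default port of 443, and the "host port" output line format are all assumed here.

```python
from typing import Dict, List, Tuple


def split_port(name: str, default_port: int = 443) -> Tuple[str, int]:
    """Split an optional ':port' suffix off a hostname (default port assumed)."""
    host, sep, port = name.partition(":")
    return host, int(port) if sep else default_port


def cert_domains(names: List[str]) -> List[str]:
    """Names as used for certificate generation: any port is stripped."""
    return [split_port(n)[0] for n in names]


def certcheck_entry(names: List[str]) -> str:
    """Check target for one certificate: its first name plus listening port."""
    host, port = split_port(names[0])
    return f"{host} {port}"


def build_check_list(
    certs_by_host: Dict[str, Dict[str, List[str]]],
    additional: List[str],
) -> List[str]:
    """Gather one entry per certificate across all hosts, then add extras."""
    entries = []
    for host in sorted(certs_by_host):
        for names in certs_by_host[host].values():
            entries.append(certcheck_entry(names))
    return entries + list(additional)


# Hypothetical per-host certificate definitions.
certs_by_host = {
    "mirror01.example.org": {
        "mirror01-main": ["mirror01.example.org", "mirror.example.org"],
    },
    "gitea01.example.org": {
        "gitea01-main": ["gitea01.example.org:3000", "git.example.org"],
    },
}
# Stand-in for letsencrypt_certcheck_additional_domains.
additional_domains = ["legacy.example.org 443"]

check_list = build_check_list(certs_by_host, additional_domains)
gitea_cert_names = cert_domains(certs_by_host["gitea01.example.org"]["gitea01-main"])
```

Deriving the list from the same variables that drive certificate issuance means a newly added certificate is monitored without a second manual edit.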
Zuul
728f8a9ee5 Merge "Enable ssl on all mirror vhosts"
2020-05-19 21:38:12 +00:00
Clark Boylan
79ff2afb87 Enable ssl on all mirror vhosts
Previously we had enabled SSL on our main vhost for the mirrors. Do
similar for all of the proxy cache vhosts for docker and other external
resources.

As part of this change we improve the testing to ensure that the new
vhosts are working as expected. One testing specific change to note is
the testinfra node names did not match our existing system-config-run
job nodenames. This has been corrected.

Additionally RHRegistryMirror and QuayMirror may not be working and
fixing those is left as a followup.

Change-Id: I9dbbd4080c3a2cce4acc39d63244f7a645503553
2020-05-19 11:52:20 -07:00
Ian Wienand
45201f3d66 Remove puppet mirror support
Remove the separate "mirror_opendev" group and rename it to just
"mirror".  Update various parts to reflect that change.

We no longer deploy any mirror hosts with puppet, so remove the various
configuration files.

Depends-On: https://review.opendev.org/728345
Change-Id: Ia982fe9cb4357447989664f033df976b528aaf84
2020-05-16 10:14:25 +10:00
Ian Wienand
7b8b788ce2 Add focal testing for mirror nodes
Change-Id: I64de9a61c5044b93f6ce7e2d31cf51d78fd4ec16
2020-05-13 05:32:54 +10:00
Zuul
03cc87dd5a Merge "Add focal to system-config base job"
2020-05-11 23:16:31 +00:00
Clark Boylan
f0352e31e1 Run prod test jobs when docker images update
We build our own docker images for several services, but weren't
triggering production test runs when the docker images are updated. Fix
this and force changes to dockerfiles, which produce new images, to
trigger production test runs of those services.

Change-Id: I18962663f168bbf6380e315d96b18751a46ceb58
2020-05-08 09:05:23 -07:00
Ian Wienand
877d0bf525 Add focal to system-config base job
Follow-on to addition for executors in
I0126f7c77d92deb91711f38a19384a9319955cf5 to keep the base roles focal
clean.

Change-Id: I40fdf1b2b7d012f4ce5d013528f4460d277e44d4
2020-05-07 17:30:48 -05:00
Clark Boylan
cfc83807b7 Organize zuul jobs in zuul.d/ dir
Our .zuul.yaml file has grown quite large. Try to make this more
manageable by splitting it into zuul.d/ directory with jobs organized by
function.

Change-Id: I0739eb1e2bc64dcacebf92e25503f67302f7c882
2020-05-07 17:30:48 -05:00