Both the fileservers and db servers have common key material deployed
by the openafs-server-config role. Put both types of server in a new
group "afs-server-common" so we can define this key material in just
one group file on bridge.
Then separate out the two into afs-<file|db>-server groups for
consistent naming.
Rename afs-admin for consistent naming.
The service file is updated to reflect the new groups.
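For illustration, the matching might look something like this in the
yamlgroup inventory (host patterns here are examples, not the full
production list):

  groups:
    afs-server-common:
      - afs*
    afs-file-server:
      - afs0*
    afs-db-server:
      - afsdb*

The shared key material then lives in a single
group_vars/afs-server-common.yaml on bridge.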
Change-Id: Ifa5f251fdfb8de737ad2ed96491d45294ce23a0c
With all AFS file-servers upgraded to 1.8, we can move afs01.dfw back
and rename the group to just "afs".
Change-Id: Ib31bde124e01cd07d6ff7eb31679c55728b95222
Backups have been going well on ethercalc02, so add borg backup runs
to all backed-up servers. Port in some additional excludes for Zuul
and slightly modify the /var/ matching.
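To give the flavor of it, the exclude list is just a host/group
variable along these lines (variable name and paths illustrative, not
the role's actual defaults):

  borg_backup_excludes:
    - /var/cache
    - /var/tmp
    # hypothetical path for Zuul build workspaces
    - /var/lib/zuul/builds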
Change-Id: Ic3adfd162fa9bedd84402e3c25b5c1bebb21f3cb
This deploys graphite from the upstream container.
We override the statsd configuration to have it listen on ipv6.
Similarly we override the nginx config to listen on ipv6, enable ssl,
redirect port 80 to 443, and block the /admin page (we don't use it).
For production we will just want to put some cinder storage in
/opt/graphite/storage on the production host and figure out how to
migrate the old stats. There is also a bit of cleanup that will
follow, because we half-converted grafana01.opendev.org -- so
everything can't be in the same group until that host is gone.
Testing has been added to push some stats and ensure they are seen.
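For reference, the overrides are bind-mounted over the image defaults,
roughly like this (image tag and in-container paths are assumptions):

  services:
    graphite:
      image: docker.io/graphiteapp/graphite-statsd
      network_mode: host
      volumes:
        - /opt/graphite/storage:/opt/graphite/storage
        # our ipv6/ssl/admin-block overrides
        - /etc/graphite/statsd-udp.js:/opt/statsd/config/udp.js
        - /etc/graphite/nginx.conf:/etc/nginx/sites-enabled/graphite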
Change-Id: Ie843b3d90a72564ef90805f820c8abc61a71017d
Remove the separate "mirror_opendev" group, renaming it to just
"mirror". Update various parts to reflect that change.
We no longer deploy any mirror hosts with puppet, so remove the
various configuration files.
Depends-On: https://review.opendev.org/728345
Change-Id: Ia982fe9cb4357447989664f033df976b528aaf84
Zuul is publishing lovely container images, so we should
go ahead and start using them.
We can't use containers for zuul-executor because of the
docker->bubblewrap->AFS issue, so install from pip there.
Don't start any of the containers by default, which should
let us safely roll this out and then do a rolling restart.
For things (like web or mergers) where it's safe to do so,
a followup change will swap the flag.
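The don't-start behaviour is just a per-service boolean, along these
hypothetical lines:

  # illustrative variable names; a followup flips these to true
  # where a rolling restart is safe (e.g. web, mergers)
  zuul_web_start: false
  zuul_merger_start: false
  zuul_scheduler_start: false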
Change-Id: I37dcce3a67477ad3b2c36f2fd3657af18bc25c40
Migration plan:
* add zk* to emergency
* copy data files on each node to a safe place for DR backup
* make a json data backup: zk-shell localhost:2181 --run-once 'mirror / json://!tmp!zookeeper-backup.json/'
* manually run a modified playbook to set up the docker infra without starting containers
* rolling restart; for each node:
* stop zk
* split data and log files and move them to new locations
* remove zk packages
* start zk containers (see the compose sketch after this list)
* remove from emergency; land this change.
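For the container step, a minimal compose sketch (image tag and host
paths illustrative; the upstream zookeeper image keeps snapshots in
/data and txn logs in /datalog, matching the data/log split above):

  services:
    zk:
      image: docker.io/zookeeper:3.5
      network_mode: host
      volumes:
        - /var/zookeeper/data:/data
        - /var/zookeeper/datalog:/datalog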
Change-Id: Ic06c9cf9604402aa8eb4bb79238021c14c5d9563
This should be mostly a no-op, but we will need to do a shutdown
in emergency mode.
Tell the gerrit role to not run compose up when run as part of
remote_puppet_git.
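Something along these lines in the role (flag name hypothetical):

  - name: Start gerrit containers
    command: docker-compose up -d
    args:
      chdir: /etc/gerrit-compose
    when: gerrit_run_compose_up | default(true)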
Change-Id: Id45376c2697656a12afeacf317b6f26c85c08dad
We have LE DNS entries for review.o.o, but we're not actually
requesting the cert. Go ahead and request it -- it'll make the
apache config easier to sort out.
Get the openstack.org certs for review-dev while we're at it.
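The request itself is a small host_vars addition along these lines
(cert name and variable layout illustrative):

  letsencrypt_certs:
    review-opendev-org-main:
      - review.opendev.org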
Change-Id: I91d06c97993ba37204bd1fc326ae823e1b9c0c1a
Depends-On: https://review.opendev.org/707267
Depends-On: https://review.opendev.org/707255
We ended up running into a problem with nodepool-built control plane
images (boot-from-volume does not allow us to delete images that are
in use by a nova instance). We have decided to clean this up and go
back to not doing this until we can do it more properly.
Note this isn't a revert, because having a group for access to
control plane clouds does seem like a good idea in general, and I
believe there have been changes we'd have to resolve in the
clouds.yaml files anyway.
Depends-On: https://review.opendev.org/#/c/665012/
Change-Id: I5e72928ec2dec37afa9c8567eff30eb6e9c04f1d
In order to have nodepool build images and upload them to control
plane clouds, add them to the clouds.yaml on the nodepool-builder
hosts. Keep them out of the launcher configs by splitting the config
templates. So that we can keep our copies of things to a minimum,
create a group called "control-plane-clouds" and put bridge and nb0*
in it.
There are mentions of clouds in here that we no longer use; a
followup patch will clean those up.
NOTE: Requires shifting the clouds config dict from
host_vars/bridge.openstack.org.yaml to group_vars/control-plane-clouds.yaml
in the secrets on bridge.
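The group itself is a simple pattern match, something like (patterns
illustrative):

  groups:
    control-plane-clouds:
      - bridge*.openstack.org
      - nb0*.openstack.org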
Needed-By: https://review.opendev.org/640044
Change-Id: Id1161bca8f23129202599dba299c288a6aa29212
The server has been removed, remove it from inventory.
While we're here, s/graphite.openstack.org/graphite.opendev.org/
... it's just a CNAME, but we might as well clean up.
Change-Id: I36c951c85316cd65dde748b1e50ffa2e058c9a88
This leaves ask.o.o and lists.o.o, which are still running Trusty, and
the cgit servers, which are likely to be decommissioned soon.
Change-Id: I78e7fd9e3079cc760da0aad955f6eeb32d442fc3
Two related changes that need to go together because we test with the
production groups.yaml.
Confusingly, there are arm64 PC1 puppet repos, but it turns out they
contain only the common java parts. The puppet-agent package is not
available, and it doesn't seem like it will be [1]. I think this
means we cannot run puppet4 on our arm64 xenial ci hosts.
The problem is that the mirrors have been updated to puppet4 -- runs
are now breaking on the arm mirrors because they don't have
puppet-agent packages. It seems all we can really do at this point
is continue to run them on puppet3.
This is hard (impossible?) to express with an fnmatch in the
existing yamlgroups syntax. We could do something like list all the
mirror hosts and use anchors etc, but we would have to keep that
maintained. Add a feature to the inventory plugin: if a list entry
starts with a ^ it is considered a full regex and passed to
re.match. This allows us to write more complex matchers where
required -- in this case the arm64 ci mirror hosts are excluded
from the puppet4 group.
Testing is updated.
[1] https://groups.google.com/forum/#!msg/puppet-dev/iBMYJpvhaWM/WTGmJvXxAgAJ
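With this, the exclusion can be written as a single regex entry in
the group config, for example (hostnames illustrative):

  groups:
    puppet4:
      # plain fnmatch entries still work
      - ask*.openstack.org
      # leading ^ switches to re.match: mirrors except arm64 ones
      - '^mirror\d+\.(?!.*arm64).*\.openstack\.org$'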
Change-Id: I828e0c524f8d5ca866786978486bc04829464b47
In roughly lexicographical order, upgrade a batch of servers to
puppet 4. We skip ask-staging because, although it is in the
futureparser group, it was temporarily disabled in puppet and so
hasn't actually gone through the futureparser validation stage yet.
Depends-On: https://review.openstack.org/643465
Change-Id: I3971ffb9800e95aaaba0076ec3bd6a05cd92a750
Remove the puppetry for managing nameservers, as we now configure
the name servers with ansible rather than puppet.
We will need to follow this up with deletion of the existing
ns*.openstack.org and adns1.openstack.org servers.
Change-Id: Id7ec8fa58c9e37ce94ec71e4562607914e5c3ea4
An upcoming change will remove review.openstack.org and
puppetmaster.openstack.org from our hostgroups, since these servers
have been deleted from the provider already. We were explicitly
testing the hostgroup membership for the former, so replace that
with a couple of new ones which should provide more stable coverage
going forward.
Change-Id: Ida28b65e9f1dc01f233cc9bff4ce32aef70e347a
We removed the mirror nodes from the webservers group to fix
iptables rule application on those nodes. Unfortunately we didn't
update our test that asserts mirrors should be in the webservers
group. Update the test results fixture to remove webservers as a
valid group for a mirror node.
Change-Id: Iba18e54f4df4a36c0247f65642faacca9d195769
This mocks out enough of the Ansible inventory framework so we can
test the group matching against a range of corner cases as present in
the results.yaml file.
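The fixture pairs each host with the exact set of groups it should
land in, shaped roughly like this (structure and hostnames
illustrative):

  results:
    mirror01.region.example.openstack.org:
      - mirror
      - afs-client
    review01.openstack.org:
      - review
      - gerrit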
Change-Id: I05114d9aae6f149122da20f239c8b3546bc140bc