104 Commits

Author SHA1 Message Date
Monty Taylor
8d7075b02f Run zookeeper cluster in nodepool jobs
Rather than running a local zookeeper, just run a real zookeeper.
Also, get rid of nb01-test and just use nb04 - what could possibly
go wrong?

Dynamically write zookeeper host information to nodepool.yaml

So that we can run an actual zk using the new zk role on hosts in
ansible inventory, we need to write out the ip addresses of the
hosts that we build in zuul. This means having the info baked in
to the file in project-config isn't going to work.

We can do this in prod too, it shouldn't hurt anything.

Increase timeout for run-service-nodepool

We need to fix the playbook, but we'll do that after we get the
puppet gone.

Change-Id: Ib01d461ae2c5cec3c31ec5105a41b1a99ff9d84a
2020-04-29 16:18:25 -05:00
Zuul
5e4901b7c6 Merge "Install docker-compose from pypi" 2020-04-17 19:11:19 +00:00
Zuul
b0ab2f37c5 Merge "Run ZK from containers" 2020-04-17 16:49:34 +00:00
James E. Blair
42574b2b37 Run ZK from containers
Migration plan:
* add zk* to emergency
* copy data files on each node to a safe place for DR backup
* make a json data backup: zk-shell localhost:2181 --run-once 'mirror / json://!tmp!zookeeper-backup.json/'
* manually run a modified playbook to set up the docker infra without starting containers
* rolling restart; for each node:
  * stop zk
  * split data and log files and move them to new locations
  * remove zk packages
  * start zk containers
* remove from emergency; land this change.

Change-Id: Ic06c9cf9604402aa8eb4bb79238021c14c5d9563
2020-04-17 08:43:09 -07:00
Clark Boylan
8eb981b47f Install docker-compose from pypi
We want to use stop_grace_period to manage gerrit service stops. This
feature was added in docker-compose 1.10 but the distro provides 1.5.
Work around this by installing docker-compose from pypi.

This seems like a useful feature and we want to manage docker-compose
the same way globally so move docker-compose installation into the
install-docker role.

New docker-compose has slightly different output that we must check for
in the gitea start/stop machinery. We also need to check for different
container name formatting in our test cases. We should pause here and
consider if this has any upgrade implications for our existing services.

Change-Id: Ia8249a2b84a2ef167ee4ffd66d7a7e7cff8e21fb
2020-04-16 12:08:00 -07:00
Zuul
e3ad9e79eb Merge "Get rid of all-clouds.yaml" 2020-04-16 15:41:55 +00:00
Monty Taylor
ebae022d07 Use project-config from zuul instead of direct clones
We use project-config for gerrit, gitea and nodepool config. That's
cool, because can clone that from zuul too and make sure that each
prod run we're doing runs with the contents of the patch in question.

Introduce a flag file that can be touched in /home/zuulcd that will
block zuul from running prod playbooks. By default, if the file is
there, zuul will wait for an hour before giving up.

Rename zuulcd to zuul

To better align prod and test, name the zuul user zuul.

Change-Id: I83c38c9c430218059579f3763e02d6b9f40c7b89
2020-04-15 12:29:33 -05:00
Monty Taylor
8af7b47812 Get rid of all-clouds.yaml
We had the clouds split from back when we used the openstack
dynamic inventory plugin. We don't use that anymore, so we don't
need these to be split. Any other usage we have directly references
a cloud.

Change-Id: I5d95bf910fb8e2cbca64f92c6ad4acd3aaeed1a3
2020-04-09 16:44:20 -05:00
Monty Taylor
589521fd18 Remove run_all.sh and ansible cron job
Remove the script and the cronjob on bridge that runs it.

Change-Id: I45e4d9713f3ba4760ba384d13487c6214d068800
2020-04-08 10:46:55 -05:00
James E. Blair
6ee1cfa736 Use our jitsi-meet image for meetpad
Change-Id: I0be6adf3852c7ab475ab6b4cfc8e53e207c382d1
2020-03-27 09:43:41 -07:00
James E. Blair
8b093dacd5 Add meetpad server
Depends-On: https://review.opendev.org/714189
Change-Id: I5863aaa805a18f9085ee01c3205b0f9ad602922d
2020-03-25 07:44:24 -07:00
Zuul
a31bae50a3 Merge "Add a new docs.airshipit.org vhost on static01" 2020-03-20 22:07:40 +00:00
Jeremy Stanley
abcae98b8e Add a new docs.airshipit.org vhost on static01
The Airship project is continuously publishing documentation to AFS,
so serve that volume with a corresponding vhost on the static01
server. Also add it to the list of volumes for periodic vos release.

Change-Id: I718963533d9e8596d44d451b5e930314d699fa28
Depends-On: https://review.opendev.org/706599
2020-03-20 19:09:13 +00:00
Zuul
b0f81dc7b9 Merge "Update git.starlingx/git.airship redirects" 2020-03-19 01:21:21 +00:00
Zuul
51a5f5488f Merge "Update git.zuul-ci.org redirects" 2020-03-19 01:21:20 +00:00
Andreas Jaeger
eecf3e71fc Update git.starlingx/git.airship redirects
After the big OpenDev rename, these repos got renamed again. Update the
redirects for git.airshipit.org and git.starlingx.io to point to the
current location.

Update test_static.py for this, change the test repo since
airship-in-a-bottle was first renamed to in-a-bottle and later to
airship-in-a-bottle.

Change-Id: I71b786cd528aac9ae68464618db02e22cd4c0b5b
2020-03-18 18:39:48 +01:00
Andreas Jaeger
a6480bcefb Update git.zuul-ci.org redirects
zuul and nodepool now life in opendev, avoid double redirects and
redirect directly to final location.

Change-Id: Ia55d76b24f07ec64cb55055955c4549f3706a95b
2020-03-18 18:28:42 +01:00
Zuul
87db9b6ac6 Merge "nodepool-builder: put container configs in /etc" 2020-03-17 17:50:12 +00:00
Ian Wienand
b967495dc3 nodepool-builder: put container configs in /etc
Currently we deploy the openstacksdk config into ~nodepool/.config on
the container, and then map this directory back to /etc/openstack in
the docker-compose.  The config-file still hard-codes the
limestone.pem file to ~nodepool/.config.

Switch the nodepool-builder_opendev group to install to
/etc/openstack, and update the nodepool config file template to use
the configured directory for the .pem path.

Also update the testing paths.

Story: #2007407
Task: #39015
Change-Id: I9ca77927046e2b2e3cee9a642d0bc566e3871515
2020-03-17 07:37:00 +11:00
Monty Taylor
e5e925d715 Switch back to docker for gerrit and nodepool-builder
We rolled out review-dev with podman and it worked fine for us. It
worked less fine for nodepool-builder, although we still might be
able to solve it. Maybe right now isn't the time to do this switch.
Gitea, gitea-lb and zuul-registry all use docker instead of podman.

The only thing running with podman right now is review-dev. We can
do a manual cleanup of podman there before runnign this to keep
things simple:

  - stop gerrit service
  - uninstall podman and podman-compose
  - uninstall podman ppa config
  - uninstall pip3

Then let ansible install docker and docker compose up.

Story: #2007407
Task: #39062
Change-Id: I9bf99b18559d49d11ba99a96f02a4a45a4f65a86
2020-03-15 23:26:49 +00:00
Ian Wienand
b1bfee423b nodepool-builder: Add webserver
This adds the webserver that serves the logs and generated images.

Change-Id: I230f5291e0bd928af2e00966d76c3f385b749cb6
2020-03-11 09:16:31 +11:00
Ian Wienand
1979d6b160 nodepool-builder: deploy from container
This deploys the nodepool-builder container and verifies it has
started in testinfra.

Change-Id: I8a717d06f1291a4112b2753641ff88f074cf0b31
2020-03-11 09:16:24 +11:00
Ian Wienand
3aaf87ee6d letsencrypt: Register email with accounts
Currently we don't set a contact email with our accounts.  This is an
optional feature, but would be helpful for things like [1] where we
would be notified of certificates affected by bugs, etc.

Setup the email address in the acme.sh config which will apply with
any new accounts created.  To update all the existing hosts, we see if
the account email is added/modified in the config *and* if we have
existing account details; if so we need a manual update call.

For anyone who might be poking here, we also add a note on sharing an
account based on some broadly agreed upon discussion in IRC.

[1] https://community.letsencrypt.org/t/revoking-certain-certificates-on-march-4/114864

Change-Id: Ib4dc3e179010419a1b18f355d13b62c6cc4bc7e8
2020-03-05 12:25:56 +11:00
Andreas Jaeger
e47de667d5 Kill qa.o.o
This site was never used nor published, it can be killed according to QA
PTL.

codesearch returns no matches for it in any docs.

Keep the occurence in manifests/static.pp, this will get deleted
as part of https://review.opendev.org/710388.

Change-Id: I3c0d3b567a3eccb959dc903f169197e4581f1e13
2020-02-28 09:30:27 +01:00
Andreas Jaeger
03276e0c93 Update redirects for legacy sides
The content for many projects has moved but the legacy redirects were
not updated, update to current location.

Change-Id: I7030ad35378085b0c45429c272dc24f00d33b2d2
2020-02-27 10:06:25 +01:00
Ian Wienand
d961b6d0d4 static: implement legacy redirect sites
This is a slight divergence from the accepted spec, where we were
going to implement these redirects via a new haproxy instance
(I961456d44a56f2334d3c94ef27e408f27409cd65).  We've decided it's
easier to keep them on static.opendev.org

The following sites are configured to redirect to whatever they are
redirecting to now on static.opendev.org:

 * devstack.org
 * www.devstack.org
 * ci.openstack.org
 * cinder.openstack.org
 * glance.openstack.org
 * horizon.openstack.org
 * keystone.openstack.org
 * nova.openstack.org
 * qa.openstack.org
 * summit.openstack.org
 * swift.openstack.org

As a bonus, they all get a https instance too, which they didn't have
before.

testinfra coverage should be total for this change.  I have created
the _acme-challange CNAME records for all the above.

Story: #2006598
Task: #38881

Change-Id: I3f1fc108e7bb1c9500ad4d1a51df13bb4ae00cb9
2020-02-27 16:25:39 +11:00
Ian Wienand
d78f3fa8f3 static: fix git raw file redirect
When converting this from a htaccess file to run in the virtualhost
context, one instance of '^cgit' -> '^/cgit' was missed.  Fix it, and
add a coverage test for it to testinfra.

Change-Id: Icc1dae6dce232e69c5cd1cf98b594f562c60d3f2
2020-02-27 09:50:34 +11:00
Ian Wienand
b5266ea20c static: provide git services
This creates the redirect sites

 git.airshipit.org
 git.openstack.org
 git.starlingx.io
 git.zuul-ci.org

The htaccess rules are put into the main configuration file to avoid
having to create a directory and manage another file.  We use a macro
to duplicate the rules and retain the old semantics of the http site
redirecting directly (as opposed to doing a extra 301 to
https://git.openstack.org first).  This required adding "/" to the "^"
matches as it now runs in VirtualHost context; no functional change is
intended over the old sites.

This will require _acme-challenge CNAMEs to acme.opendev.org before
being merged.

testinfra is updated to exercise some redirects matching against the
results of the extant sites.

Change-Id: Iaa9d5dc2af3f5f8abc11c2312e4308b50f5fcd2b
2020-02-26 12:27:13 +11:00
Ian Wienand
56509e83a4 static: add static.openstack.org/files.openstack.org
files.openstack.org serves a view of /afs/openstack.org/, which is the
same as static.opendev.org.  Add a serveralias for it and certificate.

Make static.openstack.org be consistent with opendev by showing the
same thing.

Change-Id: I4c492e3b02554a7c736c015790bd4cd5bb435a43
2020-02-26 10:39:50 +11:00
Ian Wienand
74005bb29a static: add a periodic 404 checker
This is an alternative to Iccf24a72cf82592bae8c699f9f857aa54fc74f10
which removes the 404 scraping tool.  It creates a zuul user and
enables login via the system-config per-project ssh key, and then runs
the 404 scraping script against it periodically.

Change-Id: I30467d791a7877b5469b173926216615eb57d035
2020-02-26 09:05:31 +11:00
Ian Wienand
3206fd02b8 static: move afs sites from files.openstack.org to static.opendev.org
This creates sites to serve

 developer.openstack.org
 docs.openstack.org
 docs.opendev.org
 docs.starlingx.io

which are all just static directories underneath /afs/openstack.org/.

This is currently done by files02.openstack.org, but will be better
served in the future by consolidating in ansible configuration on
static.opendev.org.

The following dns entries need to be made before merging to ensure the
certificates are provisioned

 _acme-challenge.developer.openstack.org
 _acme-challenge.docs.openstack.org
 _acme-challenge.docs.opendev.org
 _acme-challenge.docs.starlingx.io

Once done, we can merge and then cut-over the main DNS entries as we
like.

Since there are some follow-ons, I have not removed the puppet
configuration from files02.openstack.org.  I think it's best we
migrate everything away from that and remove it in one lot.

Change-Id: I459a36f823a8868e6cc09e2b0d85f2fe05d69002
2020-02-21 17:59:14 +01:00
Ian Wienand
047eae459d static: add releases.openstack.org site
This adds the site to publish from

 /afs/openstack.org/project/releases.openstack.org

Change-Id: Ia91deb9a51441ac9974137ed39fc5a185689a11c
Task: #37724
Story: #2006598
2020-02-21 14:35:35 +11:00
Ian Wienand
b2eb6c7003 static: add default site
If you currently hit https://static.opendev.org you get redirected to
the default site, which is just the first site in alphabetical order,
which happens to be governance.openstack.org.

Add a 00-static.opendev.org.conf file so this is the default site.  It
will just serve up the top-level afs directory.

Change-Id: Icdcee962b76545c12e84d4cadb0b60a68cabe38b
2020-02-21 09:15:47 +11:00
Ian Wienand
2f1b2f3eae static: Add service-types.openstack.org
Publishing changes done with https://review.opendev.org/#/c/708518/

Change-Id: I13934473aa85fce17a269f81f67c6332d51a9ab1
Story: #2006598
Task: #37723
2020-02-20 11:09:28 +11:00
Ian Wienand
738468b6ad Add specs.openstack.org
Old content is rsynced and publishing to be switched with
https://review.opendev.org/#/c/708500/

Change-Id: I797bb51970d9e7cd3ee5c2635bb5045c618b9d2c
Story: #2006598
Task: # 37721
2020-02-20 10:37:45 +11:00
Ian Wienand
97c4735129 Move afsmon to mirror-update.opendev.org
This migrates the afsmon script from puppet deploying on
mirror-update.openstack.org to ansible deploying on
mirror-update.opendev.org.

There is nothing particularly special and this just a straight install
with some minor dependencies.  Since we have log publishing running on
the opendev.org server, we publish the update logs alongside the
others.

Change-Id: Ifa3b4d59f8d0fc23a4492e50348bab30766d5779
2020-02-12 14:38:48 +11:00
Ian Wienand
fbb9790d49 Allow for periodic afs releases from mirror-update
This is a migration of the current periodic "vos release" script to
mirror-update.opendev.org.

The current script is deployed by puppet and run by a cron job on
afsdb01.dfw.openstack.org.

My initial motivation for this was wanting to better track our release
of these various volumes.  With tarballs and releases moving to AFS
publishing, we are going to want to track the release process more
carefully.

Initially, I wanted to send timing statistics to graphite so we could
build a dashboard and track the release times of all volumes.  Because
this requires an additional libraries and since we are deprecating
puppet, further development there is unappealing and it would better
live in ansible.

Since I6c96f89c6f113362e6085febca70d58176f678e7 we have the ability to
call "vos release" with "-localauth" permissions via ssh on
mirror-update; this avoids various timeout issues (see the changelog
comment there for more details).  So we do not need to run this script
directly on the afsdb server.

We are alreadying publishing mirror update logs from mirror-update,
and it would be good to also publish these release logs so anyone can
see if there are problems.

All this points to mirror-update.opendev.org being a good future home
for this script.

The script has been refactored some to

 - have a no-op mode
 - send timing stats for each volume release
 - call "vos release" via the ssh mecahnism we created
 - use an advisory lock to avoid running over itself

It runs from a virtualenv and it's logs are published via the same
mechanism as the mirror logs (slightly misnamed now).

Note this script is currently a no-op to test the deployment, running
and log publishing.  A follow-up will disable the old job and make
this active.

Change-Id: I62ae941e70c7d58e00bc663a50d52e79dfa5a684
2020-02-11 08:52:01 +11:00
Ian Wienand
3fd6e16077 Add tarballs.<openstack|opendev>.org to static.opendev.org
Add these hosts to static.opendev.org, serving from AFS.  Note that
tarballs.openstack.org just redirects to static.opendev.org/openstack.

This should have no effect currently, it will only become live when we
switch DNS.

For more details see the thread at:

 http://lists.openstack.org/pipermail/openstack-infra/2020-January/006584.html

Change-Id: Ie56fac17ffaa91ee55be986de636485a58125a02
2020-02-06 08:24:16 +11:00
Monty Taylor
cc619fe589 Add review-dev01.opendev.org
Add a new review-dev server on the opendev domain with LE support
enabled.

Depends-On: https://review.opendev.org/705661
Change-Id: Ie32124cd617e9986602301f230e83bb138524fdf
2020-02-05 09:58:25 -06:00
Ian Wienand
f5b5ee9336 Add roles for a basic static server
Basic implementation of the opendev static server, described in

 https://docs.opendev.org/opendev/infra-specs/latest/specs/retire-static.html

Change-Id: Ie1b92f06b71aa6069fe831b26ba1cc272ce4562c
Story: #2006598
Task:  #37757
2020-01-16 14:10:08 +11:00
Monty Taylor
1d37be64b4 Add service playbook and test run for prod gerrit
We need to test this against production variables too.

Change-Id: I7813787506e3b70ef0960ce85dccca4eb9ec7a3f
2019-12-17 08:13:34 -05:00
Zuul
29019411eb Merge "Run a gerrit container on review-dev01" 2019-12-15 19:00:21 +00:00
James E. Blair
4f9720e76e Run a gerrit container on review-dev01
This runs gerrit in a container on review-dev01 using podman.

Remove an unused web_server.py file that we found from copying it
from puppet to ansible.

Change-Id: I399d3cf8471bc8063022b0db0ff81718b2ee2941
2019-10-29 08:29:17 +09:00
Zuul
34d2dcc928 Merge "base-server: disable install of suggests and recommends packages" 2019-10-24 01:03:29 +00:00
Zuul
309daf9482 Merge "backup: minor fixes" 2019-08-12 23:43:29 +00:00
Ian Wienand
445eb7a7b2 backup: minor fixes
The ssh config file is /.ssh/config (not ssh_config)

We are accepting the ed25519 key, not the ecdsa key, so fix that in
the known_hosts stanza.

Change-Id: If3a42a7872f5d5e7a2bf9c3b5184fb14d43e6a1a
2019-08-09 14:11:41 +10:00
Clark Boylan
05e0ffdebc Collect gitea sshd logs
Currently we don't have any logs from our gitea sshd processes because
sshd logs to syslog by default and /dev/log isn't in our containers. You
can ask sshd nicely to log to stderr instead with the -e flag which
docker will pick up and store for us.

Update the sshd command to include -e then use testinfra to check we
collect logs and they are accssible from docker.

Change-Id: Ib7d6d405554c3c30be410bc08c6fee7d4363b096
2019-08-06 13:42:25 -07:00
Ian Wienand
814e4be128 Ansible roles for backup
This introduces two new roles for managing the backup-server and hosts
that we wish to back up.

Firstly the "backup" role runs on hosts we wish to backup.  This
generates and configures a separate ssh key for running bup and
installs the appropriate cron job to run the backup daily.

The "backup-server" job runs on the backup server (or, indeed
servers).  It creates users for each backup host, accepts the remote
keys mentioned above and initalises bup.  It is then ready to receive
backups from the remote hosts.

This eliminates a fairly long-standing requirement for manual setup of
the backup server users and keys; this section is removed from the
documentation.

testinfra coverage is added.

Change-Id: I9bf74df351e056791ed817180436617048224d2c
2019-08-05 16:59:57 +10:00
Ian Wienand
d232403e79 base-server: disable install of suggests and recommends packages
The options to disable installing suggests and recommended packages
has been in diskimage-builder based images for a long time [1].
However we have no setting for it in our base-server role, meaning
that when launching nodes from cloud-provider images we can be out of
sync on this option.

I6d69ac0bd2ade95fede33c5f82e7df218da9458b is an example where packages
pulled in by suggestions can fail (arguably a packaging issue, but
anyway...)

By enabling this here, we make our control plane servers homogenous
with our diskimage-builder based testing nodes, which is better for
general sanity.  Overall it gives us more control over what's
installed.

[1] https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/dpkg/pre-install.d/00-disable-apt-recommends

As I6d69ac0bd2ade95fede33c5f82e7df218da9458b showed, installing
suggested or recommended packages might result in

Change-Id: Id6dcc158944a46fc0ae03b6f1ff372dacd67c2e6
2019-07-31 16:21:08 +10:00
Jeremy Stanley
5587c299ea Re-add gitea01 replacement to inventory
Add new IP addresses to inventory for the rebuild, but don't
reactivate it in the haproxy pools yet.

Note this switches the gitea testing to use a host called gitea99 so
that it doesn't conflict with our changes of the production hosts.

Change-Id: I9779e16cca423bcf514dd3a8d9f14e91d43f1ca3
2019-07-23 16:17:41 -07:00