63 Commits

Author SHA1 Message Date
Ian Wienand
376915e17a run_all.sh : add backup playbook
The backup roles have been debugged and are ready to run.

A note is added about having the backup server in a default disabled
state.  This was discussed at an infra meeting where consensus was to
keep it disabled [1].

[1] http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-06-11-19.01.log.html#l-184

Change-Id: I2a3d2d08a9d1514bf6bdcf15bc5bc95689f3020f
2019-08-09 16:43:55 +10:00
Ian Wienand
a595d1d1d0 Add mirror-update to run_all.sh
It looks like I forgot to add this in
I525ac18b55f0e11b0a541b51fa97ee5d6512bf70 so the mirror-update
specific roles aren't running automatically.

Change-Id: Iee60906c367c9dec1143ee5ce2735ed72160e13d
2019-07-16 10:04:15 +10:00
Clark Boylan
cd9f3cfdad Apply service-bridge.yaml in run_all.sh
Prior to https://review.opendev.org/#/c/656871/ this code was executed
by run_all.sh in every pass but seems to have been missed as part of
656871's base.yaml split up.

Add service-bridge.yaml to run_all.sh to get these updates applying to
bridge again. In particular things like clouds.yaml updates are missing
otherwise.

Note I've not merged bridge.yaml and service-bridge.yaml as it appears
we want all of the service stuff to happen after base.yaml but
bridge.yaml needs to happen before. I think this is why they were split
in the first place.

Change-Id: I0a7ce1a65cd19459bbaf244b94a23ddde360da1a
2019-07-02 15:04:55 -07:00
James E. Blair
a92ac59e15 Fix new mirror system errors
Fix the reported stat name for the mirror playbook.

Run the mirror job in gate.

Set follow=false so that we're telling Ansible to set the perms
on the link rather than the target (which is the default).

Change-Id: Id594cf3f7ab1dacae423cd2b7e158a701d086af6
2019-05-24 09:42:38 -07:00
Clark Boylan
926ba11184 Cleanup bashate errors to make them easier to understand
We ignore E006, which is line length longer than 79 characters; we don't
actually care about that. Fix E042 in run_all.sh, which represents a
potential real issue in bash as it will hide errors.

This makes the bashate output much cleaner which should make it easier
for people to understand why it fails when it fails in check.

Change-Id: I2249b76e33003b57a1d2ab5fcdb17eda4e5cd7ad
2019-05-23 14:00:37 -07:00
Ian Wienand
670107045a Create opendev mirrors
This implements mirrors to live in the opendev.org namespace.  The
implementation is Ansible native for deployment on a Bionic node.

The hostname prefix remains the same (mirrorXX.region.provider.) but
the groups.yaml splits the opendev.org mirrors into a separate group.
The matches in the puppet group are also updated so as not to run puppet
on the hosts.

The kerberos and openafs client parts do not need any updating and
work on the Bionic host.

The hosts are set up to provision certificates for themselves from
letsencrypt.  Note we've added a new handler for mirror nodes to use
that restarts apache on certificate issue/renewal.

The new "mirror" role is a port of the existing puppet mirror.pp.  It
installs apache, sets up some modules, makes some symlinks, sets up a
cleanup cron job and installs the apache vhost configuration.

The vhost configuration is also ported from the extant puppet.  It is
simplified somewhat; but the biggest change is that we have extracted
the main port 80 configuration into a macro which is applied to both
port 80 and 443; i.e. the host will have SSL support.  The other ports
are left alone for now, but can be updated in due course.

Thus we should be able to CNAME the existing mirrors to new nodes, and
any existing http access can continue.  We can update our mirror setup
scripts to point to https resources as appropriate.

Change-Id: Iec576d631dd5b02f6b9fb445ee600be060f9cf1e
2019-05-21 11:08:25 +10:00
Zuul
2c5847dad9 Merge "Split the base playbook into services" 2019-05-20 10:04:40 +00:00
James E. Blair
8ad300927e Split the base playbook into services
This is a first step toward making smaller playbooks which can be
run by Zuul in CD.

Zuul should be able to handle missing projects now, so move it
from the puppet_git playbook into puppet.

Make the base playbook be merely the base roles.

Make service playbooks for each service.

Remove the run-docker job because it's covered by service jobs.

Stop testing that puppet is installed in testinfra. It's accidentally
working due to the selection of non-puppeted hosts only being on
bionic nodes and not installing puppet on bionic. Instead, we can now
rely on actually *running* puppet when it's important, such as in the
eavesdrop job. Also remove the installation of puppet on the nodes in
the base job, since it's only useful to test that a synthetic test
of installing puppet on nodes we don't use works.

Don't run remote_puppet_git on gitea for now - it's too slow. A
followup patch will rework gitea project creation to not take hours.

Change-Id: Ibb78341c2c6be28005cea73542e829d8f7cfab08
2019-05-19 07:31:00 -05:00
Ian Wienand
d5b321b074 Handle moved puppet repos
As per [1], it seems puppet has "cleaned up" most of the packages we
are using to install.

Install the puppet-agent packages directly as puppet's archive location
is not a valid repo. With puppet 4 at least these packages should bundle
everything we need including ruby.

[1] https://groups.google.com/forum/#!msg/puppet-users/cCsGWKunBe4/OdG0T7LeDAAJ

Depends-On: https://review.opendev.org/659384
Depends-On: https://review.opendev.org/659395
Change-Id: Ie9e2b79b42f397bddd960ccdc303b536155ce123
2019-05-15 16:03:07 -07:00
Zuul
a3dac3913b Merge "Stop running gitea k8s cluster playbooks" 2019-05-08 01:06:50 +00:00
James E. Blair
08c8b2df09 Stop running gitea k8s cluster playbooks
The gitea k8s cluster is not currently in use; don't run playbooks
relating to it.

Change-Id: I87c0dd71b2284ea5e9b580999242e901a8fee235
2019-05-07 16:05:21 -07:00
Ian Wienand
2acfc176b0 Remove graphite.openstack.org
The server has been removed, remove it from inventory.

While we're here, s/graphite.openstack.org/graphite.opendev.org/
... it's a CNAME redirect but we might as well clean up.

Change-Id: I36c951c85316cd65dde748b1e50ffa2e058c9a88
2019-05-08 05:55:33 +10:00
Clark Boylan
c74a4da06e Fix puppet 4 installations
Our old puppet 4 process was to run the install_puppet.sh script to
transition from puppet 3 to puppet 4 but this ran after base.yaml which
enforces a puppet version.

Unfortunately we were enforcing puppet version 3 in the base.yaml
playbook via the puppet-install role which meant base would install
puppet 3 and our upgrade playbook would install puppet 4 in a loop.
Thankfully we run puppet after the upgrade so we were using the puppet
version we wanted.

To fix this needless reinstall loop we do two things. We move the
upgrade playbook before base.yaml so that we upgrade before we enforce a
version. Then we update group vars for the puppet4 group to enforce the
puppet 4 version.

Change-Id: I97ca81ed5331e664f8e2e65b283793f0919f6033
2019-03-08 14:18:28 -08:00
James E. Blair
f363ed6dc0 Reduce timeouts in run_all.sh
Most of these playbooks finish much faster than 2 hours.  Set
timeouts which are approximately 3x as long as they are currently
running, rounded to the nearest 10m.

Emit the name of the timer to the log at the end of each run so
that it's more clear which playbook just finished.

Correct the timer name for one of the playbooks.

The k8s cluster deployment playbooks are not yet functional --
run times for those are still unknown.

Change-Id: I43a06baaec908cba7d88c4b0932dcc95f1a9a108
2019-02-13 14:52:59 -08:00
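
The timeout rule described in f363ed6dc0 (roughly three times the observed runtime, rounded to the nearest 10 minutes) can be sketched as follows. This is an illustrative Python helper, not code from run_all.sh; the function name, its parameters and the sample runtimes are hypothetical.

```python
def suggested_timeout_minutes(observed_minutes, factor=3.0, step=10):
    """Return factor * observed runtime, rounded to the nearest `step` minutes.

    Hypothetical helper: never returns less than one step, so a very fast
    playbook still gets a non-zero timeout.
    """
    scaled = observed_minutes * factor
    # int(x + 0.5) rounds halves up; the built-in round() rounds halves to even.
    rounded = int(scaled / step + 0.5) * step
    return max(step, rounded)


# A playbook that currently takes about 12 minutes would get a 40 minute limit.
print(suggested_timeout_minutes(12))
```

The floor of one step is an assumption added for the sketch; the commit message only states the 3x and nearest-10m parts.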
Zuul
be5b02d08f Merge "Fix gitea playbooks" 2019-02-12 23:19:27 +00:00
Zuul
d3e554e306 Merge "Stop running k8s-on-openstack nested" 2019-02-12 22:22:06 +00:00
James E. Blair
1f1f358c03 Fix gitea playbooks
First, we need an @ before the extra vars files.  Why?  Because
an @ is needed.

Second, the rook playbook was stringing all 4 commands on to one
exec call which was working poorly.  Instead, make 4 tasks so that
it's slightly better represented in ansible output, each of which
has a (presumably) valid command.

Change-Id: I30efe84d2041237a00da0c0aac02afa92d29c0fb
2019-02-12 14:20:02 -08:00
Monty Taylor
0c4a981f73 Stop running k8s-on-openstack nested
The current code runs k8s-on-openstack's ansible in an ansible
task. This makes debugging failures especially difficult.

Instead, move the prep task to update-system-config, which will
ensure the repo is cloned, and move the post task to its own
playbook. The cinder storage class k8s action can be removed from
this completely as it's handled in the rook playbook.

Then just run the k8s-on-openstack playbook as usual, but without
the cd first so that our normal ansible.cfg works.

Change-Id: I6015e58daa940914d46602a2cb64ecac5d59fa2e
2019-02-12 18:17:46 +00:00
James E. Blair
ff4532789c Add gitea-cluster extra vars
Since the gitea cluster doesn't appear in any ansible inventory,
we need to create a dedicated file to hold the extra variables.

Change-Id: Ib2365c9204bff549fdc0116243376d6e895f2296
2019-02-11 11:11:46 -08:00
James E. Blair
0e7d6a507c Run the gitea k8s playbooks
We have playbooks to manage the resources in the gitea k8s, run them
from run_all.sh.

Change-Id: If4c8e6d87995d466505e7b78c7d8eb04d17318de
2019-02-06 09:29:39 -08:00
Monty Taylor
9cac3c6b63 Run k8s-on-openstack to manage k8s control plane
The k8s-on-openstack project produces an opinionated kubernetes
that is correctly set up to be integrated with OpenStack. All of the
patches we've submitted to update it for our environment have been
landed upstream, so just consume it directly.

It's possible we might want to take a more hands-on forky approach in
the future, but for now it seems fairly stable.

Change-Id: I4ff605b6a947ab9b9f3d0a73852dde74c705979f
2019-02-05 18:50:31 +00:00
Ian Wienand
97a3ab9bf3 Add statsd metrics for ansible runs
Add some coarse-grained statsd tracking for the global ansible runs.
Adds a timer for each step, along with an overall timer.

This adds a single argument so that we only try to run stats when
running from the cron job (so if we're debugging by hand or something,
this doesn't trigger).  Graphite also needs to accept stats from
bridge.o.o.  The plan is to present this via a simple grafana
dashboard.

Change-Id: I299c0ab5dc3dea4841e560d8fb95b8f3e7df89f2
2018-09-10 14:49:45 +10:00
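
A minimal sketch of the per-step timer idea in 97a3ab9bf3, assuming the plain-text statsd timing format (`name:value|ms`) sent over UDP. The metric name, host and `enabled` flag are hypothetical stand-ins; the real script's argument and stat names are not shown in this log.

```python
import socket
import time
from contextlib import contextmanager


def format_timer(name, elapsed_ms):
    """Build a statsd timing datagram, e.g. b'some.timer:1234|ms'."""
    return f"{name}:{elapsed_ms}|ms".encode("ascii")


@contextmanager
def statsd_timer(name, host="localhost", port=8125, enabled=True):
    """Time the wrapped block and report it to statsd when enabled.

    `enabled` mirrors the idea of only emitting stats for cron runs,
    so a manual debugging run stays silent.
    """
    start = time.monotonic()
    try:
        yield
    finally:
        if enabled:
            elapsed_ms = int((time.monotonic() - start) * 1000)
            try:
                with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
                    sock.sendto(format_timer(name, elapsed_ms), (host, port))
            except OSError:
                # Stats are best effort; never fail the run because of them.
                pass


# Hypothetical usage: one timer per step, stats switched off for a manual run.
with statsd_timer("example.ansible.runtime.base", enabled=False):
    time.sleep(0.01)
```

An overall timer would simply wrap all of the per-step blocks in one outer `statsd_timer`.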
James E. Blair
008d0044e8 Increase forks to 50
In run_all.sh, increase the number of ansible forks to 50 for most
playbooks in an attempt to speed up the process.

Change-Id: I487605fd3b2d20d7b1f19c40d22018deeae9c112
2018-09-07 10:53:20 -07:00
James E. Blair
89252d9285 Revert "Fix ansible forks env variable"
And revert "Set Ansible forks to 50"

This doesn't seem to have helped, and may have made the run longer.
I suspect a problem with the env var, but let's revert back to the
old value and mechanism (cli flag) to re-establish a baseline,
then we'll change the value of the cli flag.

This reverts commit 84199095716da416849ed4a2649ec8a2c878609d.
This reverts commit 97d8f9d0bfaec24413f134fe252d4011fe9e36d4.

Change-Id: I825b2b3db26ce6dd7d70fcc8b33e70b511eb52db
2018-09-06 09:07:36 -07:00
James E. Blair
97d8f9d0bf Fix ansible forks env variable
This is how bash works.

Change-Id: I362ea15e44f8086464fb3aa42c41a51d222391c4
2018-09-05 14:17:17 -07:00
James E. Blair
8419909571 Set Ansible forks to 50
20 is working fine with plenty of ram/cpu to spare, increase to 50
to attempt to speed up the runtime.

The environment variable should be used by default, but the "-f"
option will override that, in the one case where we need it.

Change-Id: Ie6a1d991a346702ec58cd716b0b94af5c93554ac
2018-09-04 14:15:48 -07:00
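
The interaction described in 8419909571 (a default set through the environment, with `-f` overriding it for one playbook) can be sketched like this. `ANSIBLE_FORKS` and `-f` are real Ansible settings; the helper name and the playbook path are hypothetical, and this is not how run_all.sh itself is written.

```python
import os


def ansible_invocation(playbook, default_forks=50, forks_override=None):
    """Return (argv, env) for an ansible-playbook run.

    The default fork count travels in the ANSIBLE_FORKS environment
    variable; an explicit -f on the command line takes precedence.
    """
    env = dict(os.environ, ANSIBLE_FORKS=str(default_forks))
    argv = ["ansible-playbook"]
    if forks_override is not None:
        argv += ["-f", str(forks_override)]
    argv.append(playbook)
    return argv, env


# Hypothetical playbook path; only this run is limited to 10 forks.
argv, env = ansible_invocation("playbooks/example.yaml", forks_override=10)
print(argv, env["ANSIBLE_FORKS"])
```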
Monty Taylor
4f9ab4eeb2 Increase the run_all forks count to 20
In testing this on bridge, 20 forks did not appreciably increase the
load average.

Change-Id: Ib571dec0f07e031273dc76a9f364478183b8f578
2018-08-22 01:42:43 +00:00
Ian Wienand
64ac47cca9 run_all.sh: add timestamp
When cron runs this we don't get any delineation between runs in the
output log file.  Add begin and end markers.

Change-Id: I4d73d7a8943a302e229517bc717175cda260282c
2018-08-22 09:34:39 +10:00
Monty Taylor
1e223dbd4c Run base and bridge playbooks in run_all.sh
We need to run base before we run the other things.

Change-Id: Iaa8525946e5f09df842ef141213b7ddcb63dfea1
2018-08-20 11:26:42 -05:00
Monty Taylor
dd4b26903b Use ansible.cfg for ansible logging config
We have an ansible logging location defined in ansible.cfg. We don't
need to override it in run_all.sh.

Change-Id: I7f0a8b70a1ccd7a43ce47a3f452b6d0d5c57e96a
2018-08-19 10:26:10 -05:00
Monty Taylor
1a8c2f66da Move /opt/system-config/production to /opt/system-config
The production directory is a relic from the puppet environment concept,
which we do not use. Remove it.

The puppet apply tests run puppet locally, where the production
environment is still needed, so don't update the paths in the
tools/prep-apply.sh.

Depends-On: https://review.openstack.org/592946
Change-Id: I82572cc616e3c994eab38b0de8c3c72cb5ec5413
2018-08-17 09:41:02 -05:00
Monty Taylor
bab6fcad3c Remove base.yaml things from openstack_project::server
Now that we've got base server stuff rewritten in ansible, remove the
old puppet versions.

Depends-On: https://review.openstack.org/588326
Change-Id: I5c82fe6fd25b9ddaa77747db377ffa7e8bf23c7b
2018-08-16 17:25:10 -05:00
Monty Taylor
815355bc83 Rename update_puppet to update-system-config
The purpose of the playbook is to update the system-config checkout, as
well as installing puppet modules and ansible roles.

Rename it, so that it's clearer what it does. Also, clean it up a bit.
We've gotten better at playbooks since we originally wrote this.

Change-Id: I793914ca3fc7f89cf019cf4cdf52acb7e0c93e60
2018-08-03 09:05:13 -05:00
Colleen Murphy
d4d0ae0e40 Add playbook to upgrade puppet
Add a playbook to rerun install_puppet.sh with PUPPET_VERSION=4. Also
make the install_modules.sh script smarter about figuring out the puppet
version so that the update_puppet.yaml playbook, which updates the
puppet config and puppet modules but not the puppet package, does not
need to be changed.

When we're ready to start upgrading nodes, we'll add them to the puppet4
group in `modules/openstack_project/files/puppetmaster/groups.txt`.

Change-Id: Ic41d277b2d70e7c25669e0c07e668fb9479b8abf
2018-06-05 00:25:21 +02:00
Monty Taylor
e043e6e4bc Add zuul scheduler to the git/gerrit puppet sequence
We have a race condition on project creation otherwise.

Change-Id: Ia5741d69194ec6a3fcba6ca58552ce021c6aaa1f
2017-12-18 09:46:36 -06:00
Monty Taylor
b02c411166 Run puppet on infracloud in a different cron
It takes too long to run puppet on infracloud and it's blocking our
other servers.

Change-Id: I7202617acc5a04e18672b217db53510167d597bd
2016-08-31 14:39:53 +00:00
Spencer Krum
ec4d6cfbeb Run ansible-playbook in timeout
We need this in case it is oomkilled

Change-Id: Ia405dd800850ad46e3e11696079012dfc34a06ea
2016-05-02 18:48:38 -07:00
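
The idea in ec4d6cfbeb, bounding each run so a stuck or killed process cannot block the next one, can be sketched in Python with `subprocess.run` and its `timeout` argument. This is an analogue for illustration only; the helper name and the demo command are hypothetical. Exit status 124 is the value GNU `timeout` reports when its limit is reached.

```python
import subprocess
import sys

TIMEOUT_EXIT_CODE = 124  # status GNU timeout(1) returns when the limit is hit

# Hypothetical stand-in for a real ansible-playbook command line.
DEMO_CMD = [sys.executable, "-c", "print('ok')"]


def run_with_timeout(cmd, seconds):
    """Run cmd, returning its exit status, or 124 if it exceeds `seconds`.

    subprocess.run() kills the child itself before raising TimeoutExpired.
    """
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return TIMEOUT_EXIT_CODE


print(run_with_timeout(DEMO_CMD, 30))
```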
Spencer Krum
e84b1ddb2b Run less parallelism in ansible
We are running into memory contention and ooming out on
ansible-playbook. Fewer workers = more RAM, hopefully.

We can also move puppetmaster.o.o to a host with more ram (it only has
2G right now.) We can also disable the apache/passenger/puppet that is
running on the host.

Change-Id: Id5ade889748d5e8f65a8ea68cc64b0c071c6a627
2016-04-11 13:18:50 -07:00
Colleen Murphy
32f956f268 Add infracloud playbook
Add separate playbook for infracloud nodes to ensure they run in the
correct order - baremetal -> controller -> compute.

Baremetal is intentionally left out, it is not ready yet.

All 'disabled' flags on infracloud hosts are turned off. This patch
landing turns on management of the infracloud.

Co-Authored-By: Yolanda Robla <info@ysoft.biz>
Co-Authored-By: Spencer Krum <nibz@spencerkrum.com>
Change-Id: Ieeda072d45f7454d6412295c2c6a0cf7ce61d952
2016-02-08 18:03:02 -08:00
Monty Taylor
f1b9b864f7 Translate the rest of run_all.sh to ansible
There are a few things that are run as part of run_all.sh that are
not logged into puppet_run_all.log - namely git cloning, module installation
and ansible role installation. Let's go ahead and do those in a playbook
so that we can see their output while we're watching the log file.

Change-Id: I6982452f1e572b7bc5a7b7d167c1ccc159c94e66
2016-01-10 12:38:22 -05:00
Monty Taylor
8ff794f599 Copy system-config and puppet modules everywhere
If we're going to run puppet apply on all of our nodes, they need
the puppet modules installed on them first.

Change-Id: I84b80818fa54d1ddc4d46fead663ed4212bb6ff3
2015-11-24 16:32:00 -05:00
Monty Taylor
d039a62045 Move playbooks out of the puppet module
/etc/ansible/playbooks isn't actually a thing, it was just a convenient
place to put things. However, to enable puppet apply, we're going to
want a group_vars directory adjacent to the playbooks, so having them be
a subdirectory of the puppet module and installed by it is just extra
complexity. Also, if we run out of system-config, then it'll be easier
to work with things like what we do with puppet environments for testing
things.

Change-Id: I947521a73051a44036e7f4c45ce74a79637f5a8b
2015-10-30 11:31:05 +09:00
AzherKhan
ece86b546b Setting ansible playbooks path variable
Created a variable to manage the ansible
playbooks directory path.

Change-Id: Iabb74e9f1aa95828c01b1957849e2b68164d7d20
2015-10-01 12:07:38 +05:30
Clark Boylan
5e283fd6cc Run more puppet agents at a time with ansible
Our current puppet run_all.sh script takes almost 45 minutes to run
puppet agent on all of our nodes. We are using the default concurrency
of 5. Our puppet master should be able to handle a bit more than that.

Run the git/gerrit playbook with a concurrency of 10 and everything else
with a concurrency of 20.

Change-Id: Ia09abb6fa8c699e156aed38d86ce6fd193f3a42d
2015-04-23 09:48:24 -07:00
Clark Boylan
e13b91ee03 Need to force ansible role installs
Ansible galaxy will not overwrite a role that already exists by default.
To keep our ansible puppet role up to date force its installation.

Change-Id: I75eda8600f666895f9be8711d089615e57b3f3c5
2015-03-03 17:20:25 -08:00
Jenkins
ac7a36db8a Merge "Rename roles.yml to roles.yaml" 2015-03-04 00:44:05 +00:00
Jenkins
f41251935d Merge "Install standalone ansible roles" 2015-03-04 00:43:22 +00:00
James E. Blair
7faa62efb1 Rename roles.yml to roles.yaml
All our other YAML files end with .yaml and also:

  http://www.yaml.org/faq.html

Change-Id: I2ecf2e715f704d92861d34db1479fdd29ff816d8
2015-02-26 15:20:38 -08:00
Monty Taylor
b7cfc00620 Install standalone ansible roles
Similar to how we install puppet modules from standalone repos, start
using the ansible-galaxy command to install roles from standalone role
repos.

Change-Id: Iae7d8e4626479e565bc194496de289027a4668ed
Depends-On: I76d5cab55942beaff44ea5f289f93ff6ce772c5f
2015-02-25 20:07:16 -05:00
Gregory Haynes
b9945fccec Use ansible logging during puppet run_all.sh
When ansible-playbook outputs to stdout it does not include timestamps,
but ansible logging does.

Change-Id: Ifb63d34d1dcc7931d734d08dc31223b531d65aa2
2015-02-12 22:55:01 +00:00