We have been having memory leak issues with zuul-web on our move from
running on the host with python3.5 to running in containers with
python3.7 and python3.8. One other thing that changed was we added
LD_PRELOAD settings to use jemalloc instead of the normal libc-provided
malloc. In an effort to rule this out, disable jemalloc in the zuul-web
containers.
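Roughly, the change looks like the following docker-compose sketch; the
service name, library path, and the idea of blanking the variable are
assumptions here, not the exact contents of our compose file:

  services:
    zuul-web:
      environment:
        # Previously pointed at the jemalloc shared object, e.g.:
        # LD_PRELOAD: /usr/lib/x86_64-linux-gnu/libjemalloc.so.1
        # Leave it empty so the container falls back to glibc malloc.
        LD_PRELOAD: ""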
Change-Id: Icf03b60266f876dd7c322e8c8f7c207b692d3ad7
The command is broken (an extra ", missing redirection) and duplicates
content from accessbot.sh. Call accessbot.sh directly and use that
instead.
Change-Id: Ieb530ef27e5995f2848a3c23a6c04a0717716e14
We are trying to use this file in our docs config, but the file was
mistakenly not added in that change. Add it now.
Change-Id: I8f5f9d62f96d8532477c42a7076c57aa6548c9cf
Rather than running a local zookeeper, just run a real zookeeper.
Also, get rid of nb01-test and just use nb04 - what could possibly
go wrong?
Dynamically write zookeeper host information to nodepool.yaml
So that we can run an actual zk using the new zk role on hosts in
the ansible inventory, we need to write out the IP addresses of the
hosts that we build in zuul. This means having the info baked into
the file in project-config isn't going to work.
We can do this in prod too; it shouldn't hurt anything.
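A rough sketch of the approach, assuming an Ansible template task and a
'zookeeper' inventory group (the task name, paths, and template contents
are assumptions, not the actual role):

  - name: Write nodepool.yaml with zookeeper hosts from the inventory
    template:
      src: nodepool.yaml.j2
      dest: /etc/nodepool/nodepool.yaml

  # nodepool.yaml.j2 would then render something along these lines:
  #   zookeeper-servers:
  #   {% for host in groups['zookeeper'] %}
  #     - host: {{ hostvars[host]['ansible_host'] }}
  #       port: 2181
  #   {% endfor %}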
Increase timeout for run-service-nodepool
We need to fix the playbook, but we'll do that after we get rid of
the puppet.
Change-Id: Ib01d461ae2c5cec3c31ec5105a41b1a99ff9d84a
This sets up a robots.txt on our lists servers. To start, this file
prevents the SEMrush bot from indexing our lists, as that has been
causing lists.openstack.org to OOM with many listinfo processes started
by Apache.
We've avoided this OOM by manually configuring this robots.txt. Other
causes we have ruled out are bup and incoming email making qrunners
grow unexpectedly large. We are fairly confident this bot is the
trigger.
Note this fixes testing by adding 'hieradata' to set the listpassword
var.
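As a minimal sketch, assuming the file is managed with an Ansible copy
task (the task name and destination path are assumptions):

  - name: Install robots.txt blocking the SEMrush bot
    copy:
      dest: /srv/mailman/web/robots.txt
      content: |
        User-agent: SemrushBot
        Disallow: /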
Depends-On: https://review.opendev.org/724389
Change-Id: Id4f6739a8cf6a01f9796fa54c86ba1af3e31fecf
As we add jobs that have more nodes in them, we need to make
sure we're running ansible with enough forks that the jobs
don't take forever.
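For illustration, the sort of change meant here, expressed as an Ansible
task (the config path and fork count are assumptions):

  - name: Raise the number of ansible forks
    ini_file:
      path: /etc/ansible/ansible.cfg
      section: defaults
      option: forks
      value: "50"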
Change-Id: I2b5bf55bd65eaf0fc2671f5379bd0cb5c3696f87
This job compiles openafs with dkms, among other things that cause it
to occasionally run over the default half-hour timeout. Bump the timeout
to an hour to deal with that.
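In zuul terms this amounts to something like the following (the job name
below is a placeholder, not the real job; timeout is in seconds):

  - job:
      name: the-affected-openafs-job
      timeout: 3600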
Change-Id: I8a56a7f42ce2ee8331befb45aceb1d511a33d9e6
We used to replicate every openstack/* project to GitHub
through a global replication at the Gerrit level. Now that the job for
granular replication is in place, we can stop the global replication,
so that only active/official repositories are synced.
Depends-On: https://review.opendev.org/724310
Change-Id: Ibba02e626e33aba9779f771d5ae49920bac86b19
We use the zuul_scheduler_start flag to determine if we want to start
the zuul-scheduler when new containers show up. Unfortunately we weren't
setting zuul_scheduler_start in prod so we failed with this error:
  error while evaluating conditional (zuul_scheduler_start | bool): 'zuul_scheduler_start' is undefined
Fix this by treating an unset var as equivalent to a set truthy var
value. We do this instead of always setting the var to false in prod as
it simplifies testing.
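The fix boils down to a default() on the conditional, roughly like this
sketch (the task body, compose invocation, and directory are
assumptions; the default-to-true behaviour is the point):

  - name: Start zuul-scheduler
    command: docker-compose up -d
    args:
      chdir: /etc/zuul-scheduler
    when: zuul_scheduler_start | default(true) | bool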
Change-Id: I1f1a86e80199601646c7f2dec2a91c5d65d77231
The intent of the periodic jobs is to run with the latest master. If
they get enqueued and then other patches land, they'll still run with
the value of the zuul ref from when they were enqueued. That's not
what we want for prod, as it can lead to running old versions of
config.
We don't usually like doing this, but in this case, rather than
making us remember to add a flag every time a prod job gets added
to a periodic pipeline, how about we just calculate it.
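One way to express that calculation, as a hypothetical illustration (the
variable name is made up; zuul.pipeline and zuul.ref are standard zuul
job variables):

  vars:
    prod_config_version: "{{ 'master' if 'periodic' in zuul.pipeline else zuul.ref }}"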
Change-Id: Ib999731fe132b1e9f197e51d74066fa75cb6c69b
We get deprecation warnings from ansible about use
of python2 on xenial hosts. Rather than setting
ansible_python_interpreter to python3 on a host-by-host
basis, set it globally to python3.
Set it to python for the one host that's too old,
refstack.openstack.org, which is running on trusty
and only has python3.4.
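A sketch of the split, assuming the values live in inventory variable
files (the file paths are assumptions):

  # group_vars/all.yaml
  ansible_python_interpreter: python3

  # host_vars/refstack.openstack.org.yaml
  ansible_python_interpreter: python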
Change-Id: I4965d950c13efad80d72912911bc7099e9da1659
Compress css and javascript content as they can be quite large for zuul.
Also, cache status json results when using the non-whitelabeled API
paths for zuul.opendev.org. This should improve performance for those
status files.
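For illustration only, the kind of Apache directives involved, expressed
as an Ansible task (the module choice, vhost path, and cached URL prefix
are assumptions):

  - name: Enable compression and status caching in the zuul vhost
    blockinfile:
      path: /etc/apache2/sites-enabled/zuul.opendev.org.conf
      block: |
        AddOutputFilterByType DEFLATE text/css application/javascript application/json
        CacheEnable disk /api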
Change-Id: I7b965b27a88d5fda4d43be31c39989994334989c
If we need to start and stop, it's best to use playbooks.
We already have tasks files with start commands in each role,
so put the stop commands into similar task files.
Make the restart playbook import_playbook the stop and start
playbooks to reduce divergence.
Use the graceful shutdown pattern from the gerrit docker-compose
to stop the zuul scheduler.
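A sketch of the resulting restart playbook shape (file names here are
assumptions):

  # playbooks/zuul_restart.yaml
  - import_playbook: zuul_stop.yaml
  - import_playbook: zuul_start.yaml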
Change-Id: Ia20124553821f4b41186bce6ba2bff6ca2333a99
Due to a configuration issue, zuul.openstack.org is currently throwing
SSL validation errors. Update the link on status.openstack.org to point
directly at the canonical OpenStack tenant page.
Change-Id: Idf08e140de11126061cb6f9783d13dc64fefff60
We don't want to HUP all the processes in the container; we just
want zuul to reconfigure. Use the smart-reconfigure command.
Also - start the scheduler in the gate job.
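A sketch of the invocation, assuming the scheduler runs under
docker-compose (the service name and directory are assumptions; the
command name comes from this change):

  - name: Smart-reconfigure the zuul scheduler
    command: docker-compose exec -T scheduler zuul-scheduler smart-reconfigure
    args:
      chdir: /etc/zuul-scheduler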
Change-Id: I66754ed168165d2444930ab1110e95316f7307a7
This adds a necessary newline, removes port numbers, and sets the
executor ssh key to the correct path.
Change-Id: I6b4afa876b6cd7d8f87cc35bc51b4e9d6e31ee2b
When we install packages on ubuntu, we should use their actual
package names rather than incorrect or otherwise fictional
package names.
Also, fix the hostname in the test job - because when we don't
do that, we don't run all of the roles, and thus we don't
catch these things.
Change-Id: I18e676ef0fe343513db4c8ad7e340ee45092c0a3
The executors aren't using docker container images yet due to conflicts
with bubblewrap. This means we are still installing it directly on the
host using pip. Unfortunately we were using `pip` before, which may
install into python2, and zuul doesn't run under python2. Address this
by explicitly using pip3.
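A sketch with the Ansible pip module, assuming zuul is installed on the
executor from PyPI (the package spec is an assumption; the executable
choice is the point):

  - name: Install zuul on the executor with python3's pip
    pip:
      name: zuul
      executable: pip3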
Change-Id: I2ec551e8207e29ca420b09b8818154b9c32b47cf