17876 Commits

Author SHA1 Message Date
Zuul
3b123e2726 Merge "gitea: set custom avatars for orgs" 2022-03-18 18:28:30 +00:00
Zuul
7e76a78e60 Merge "Add firewall behavior assertions to testinfra testing" 2022-03-18 17:12:00 +00:00
Ian Wienand
2d9c8b620f gitea: set custom avatars for orgs
Over a few upgrades, we've managed to break some of the default avatar
logos you see when browsing code on opendev.org.

After investigating ways to fix this up, we established that there
isn't an exposed API for setting these, but we can do a simple query
to point to logo files on disk.  This implements that.

One caveat is that the logos should be PNG files; particularly we
note that SVG files don't work reliably because they don't get served
with the image/svg+xml mime-type.

Change-Id: Ie6799de2fb27e09f936c488258dc1bd1c638c370
2022-03-18 11:06:09 +11:00
Clark Boylan
67b6b6c237 Test gitea 1.16 partial clones
Gitea 1.16 enabled clone filters by default. Unfortunately pip passes
--filter=blob:none when fetching git resources and the new gitea support
for filters breaks against that filter. We are working around this by
restoring the 1.15 behavior of not supporting filters and this change
will test the behavior is as expected.

Change-Id: I13d57e3cc7e135058ff320b3bd9bea76fb178064
2022-03-17 11:07:57 -07:00
Jeremy Stanley
4863b1200c Disable partial clone feature in Gitea
Gitea 1.16 added partial clone support, but the clone filters pip
tries to apply (--filter=blob:none) don't work well when combined
with older cgit clients and lead to errors like "Server does not
allow request for unadvertised object" or "protocol error: bad pack
header".

Explicitly disable this feature server-side for now, so that clients
will fall back to making full clones.

Change-Id: Ia86394d5176c28567bf67b60578aadde6629c775
Depends-On: https://review.opendev.org/834196
2022-03-17 16:18:21 +00:00
Jeremy Stanley
7af66f25c4 Stop checking the OpenStackID HTTPS cert
None of the services we operate rely on openstackid.org any longer,
so we can drop our monitoring of its cert expiration safely (which
is currently complaining). We're already monitoring its successor,
id.openinfra.dev.

Change-Id: I059ef0492f05137fa542c819b64427bd9ef0eb0c
2022-03-17 02:51:39 +00:00
Zuul
e9e63f1d52 Merge "Clean up Gerrit image builds" 2022-03-16 20:19:20 +00:00
Zuul
0525c5d896 Merge "Update Gitea to 1.16.4" 2022-03-16 16:26:46 +00:00
wangxiyuan
a6a5988f8a Fix openEuler mirror problem
The openEuler yum mirror in Russia is down. This patch changes the
rsync URL to the official Hong Kong one.

This patch also fixes a nit in the openEuler mirror URL.

Change-Id: Ifb930e34fd7f16f77ba55bc489e5389c641139de
2022-03-16 12:05:06 +08:00
Zuul
8a0a0040e3 Merge "grafana: proxy websockets" 2022-03-15 20:10:40 +00:00
Clark Boylan
dd0a3374d2 Update Gitea to 1.16.4
Gitea 1.16.4 is now available. Note that this update includes the
changes from 1.16.0-1.16.3 as well since we are upgrading from
1.15.x. The changelog can be found at:

  https://github.com/go-gitea/gitea/blob/v1.16.4/CHANGELOG.md

In particular this calls out:

  https://github.com/go-gitea/gitea/pull/17846

as a potentially breaking change that may impact our use of ssh. We
attempt to update our Dockerfile to use the correct gitea command script
to address this but we should likely test replication before landing
this update.

The changelog is quite large and I haven't been able to fully examine it
for impacts. Reviewers are encouraged to look it over and find items we
should address. Additionally once this is reliably building we should
hold a node and inspect it directly.

Change-Id: I0bf7400d43583a8e8b54581225c70cba53007876
2022-03-14 14:57:00 -07:00
Zuul
4c34ff6bb1 Merge "prod-playbook: use job name for stats" 2022-03-10 20:02:20 +00:00
Zuul
decddbe23f Merge "docs: reorganise around a open infrastructure overview" 2022-03-10 05:45:08 +00:00
Ian Wienand
d87ce0e35f prod-playbook: use job name for stats
Because "." is a field separator for graphite, we're incorrectly
nesting the results.

A better idea seems to be to store these stats under the job name.
That's going to be more helpful when looking up in Zuul build results
anyway.
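
As a rough illustration of the nesting problem (not the code in this
change), the sketch below counts how many path components Graphite
would see for a stat keyed on a playbook file name versus one keyed on
a job name.  The prefix, playbook name and job name are hypothetical.

```python
def graphite_leaf(name: str) -> str:
    """Replace '.' so the name stays a single Graphite path component."""
    return name.replace(".", "_")


def path_depth(metric_path: str) -> int:
    """Number of dot-separated components Graphite would see."""
    return len(metric_path.split("."))


# Hypothetical prefix and names, for illustration only.
prefix = "example.prod-playbook"
playbook = "service-bridge.yaml"    # contains a '.', so it nests
job = "infra-prod-service-bridge"   # no '.', stays one component

nested = f"{prefix}.{playbook}.runtime"
flat = f"{prefix}.{job}.runtime"

print(path_depth(nested), path_depth(flat))
```

Keying on the job name avoids the extra level without having to
rewrite the name; graphite_leaf() shows the alternative of escaping
the separator instead.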

Follow-on to I90dfb7a25cb5ab08403c89ef59ea21972cf2aae2

Change-Id: Icbb57fd23d8b90f52bc7a0ea5fa80f389ab3892e
2022-03-10 16:41:59 +11:00
Ian Wienand
7745bf16f3 grafana: proxy websockets
If you watch the web console of your browser in a grafana page, it
constantly tries to hit /api/live/ws which is currently giving an
error.

Following some combination of [1], [2], [3] and some trial-and-error,
this appears to let apache proxy through the requests.

[1] https://github.com/grafana/grafana/issues/36929
[2] https://github.com/grafana/grafana/issues/34537
[3] https://grafana.com/tutorials/run-grafana-behind-a-proxy/

Change-Id: I6c5ba71a1c0feab36b4df56f80271fa52f6354de
2022-03-10 12:49:56 +11:00
Ian Wienand
b7cdaa7fce prod-playbook : send playbook runtime/status to graphite
We used to track the runtime with the old cron-based system
(I299c0ab5dc3dea4841e560d8fb95b8f3e7df89f2) and had a dashboard view,
which was often helpful to see at a glance what might be going wrong.

Restore this for Zuul CD by simply sending the nested-Ansible task
time-delta and status to graphite.  bridge.openstack.org is still
allowed to send stats to graphite from this prior work, so no ports
need to be opened.

Change-Id: I90dfb7a25cb5ab08403c89ef59ea21972cf2aae2
2022-03-09 16:51:07 +11:00
Jeremy Stanley
c43289b75a Correct Apache restart for vexxhost-sjc1 mirror
This typo has apparently been causing occasional deploy job failures
for almost two years.

Change-Id: Ic74fa9241a70c120fc496c4e7461e7c899de90d2
2022-03-08 23:49:48 +00:00
Zuul
74eb280b71 Merge "Add check to remainder of balance_git_https" 2022-03-08 22:30:49 +00:00
Ian Wienand
29202eba1a zuul-lb : issue HEAD / checks
As found in Ie5d55b2a2d96a78b34d23cc6fbac62900a23fc37, the default for
this is to issue "OPTIONS /" which is kind of a weird request.  The
Zuul hosts currently seem to return the main page content in response
to an OPTIONS request, which probably isn't right.

Make this more robust by just using a "HEAD /" request.

Change-Id: Ibbd32ae744af9c33aedd087a8146195844814b3f
2022-03-08 10:24:03 +11:00
Ian Wienand
c9b580cc0d gitea-haproxy: issue liveness check to HEAD /
By default this sends OPTIONS /, which apache rejects with an error.
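
To show why the probe method matters, here is a minimal sketch using
only the Python standard library.  It is a toy stand-in, not apache or
haproxy: the handler answers HEAD but has no OPTIONS handler, so the
same liveness probe succeeds or fails depending on the method used.
The class and function names are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeadOnlyHandler(BaseHTTPRequestHandler):
    """Toy backend: answers HEAD, has no OPTIONS handler."""

    def do_HEAD(self):
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def probe(host: str, port: int, method: str) -> int:
    """Send one request for '/' and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(method, "/")
        resp = conn.getresponse()
        resp.read()
        return resp.status
    finally:
        conn.close()


def run_demo() -> dict:
    """Start the toy backend on a free local port and probe it twice."""
    server = HTTPServer(("127.0.0.1", 0), HeadOnlyHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return {m: probe("127.0.0.1", port, m) for m in ("HEAD", "OPTIONS")}
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

A health checker that treats only 2xx/3xx as healthy would mark this
backend down when probing with OPTIONS and up when probing with HEAD.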

Change-Id: Ie5d55b2a2d96a78b34d23cc6fbac62900a23fc37
2022-03-08 09:46:59 +11:00
Jeremy Stanley
f5268d956c Add check to remainder of balance_git_https
Now that we can confirm this hasn't broken for gitea01, set check on
all the remaining server lines as well.

Change-Id: I11f1f15210dafed66e1209329ddf7f3838592881
2022-03-07 18:14:19 +00:00
Jeremy Stanley
4061acd3e7 Add check keyword to balance_zuul_https servers
Apparently the check-ssl option only modifies check behavior, but
does not actually turn it on. The check option also needs to be set
in order to activate checks of the server. See §5.2 of the haproxy
docs for details:
https://git.haproxy.org/?p=haproxy-2.5.git;a=blob;f=doc/configuration.txt;h=e3949d1eebe171920c451b4cad1d5fcd07d0bfb5;hb=HEAD#l14396
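
A small lint helper can catch this class of mistake.  The sketch below
is an assumption-laden illustration, not part of this change: it scans
haproxy "server" lines and reports those that set check-ssl without
the bare check keyword.  The section name, server names and addresses
in the sample are hypothetical.

```python
def servers_missing_check(config_text: str) -> list:
    """Return names of server lines with check-ssl but no 'check' keyword."""
    missing = []
    for line in config_text.splitlines():
        words = line.split()
        # Expected shape: server <name> <address> [options...]
        if len(words) < 3 or words[0] != "server":
            continue
        options = words[3:]
        if "check-ssl" in options and "check" not in options:
            missing.append(words[1])
    return missing


# Hypothetical configuration fragment, for illustration only.
SAMPLE = """
listen balance_example_https
    server backend01 192.0.2.10:443 check-ssl verify none
    server backend02 192.0.2.11:443 check check-ssl verify none
"""

print(servers_missing_check(SAMPLE))
```

Here backend01 would never be health checked even though check-ssl is
present, which is the situation described above.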

Turn it on for all of our balance_zuul_https server entries.

Also set this on the gitea01 server entry in balance_git_https, so
we can make sure it's still seen as "up" once this change takes
effect. A follow-up change will turn it on for the other
balance_git_https servers out of an abundance of caution around that
service.

Change-Id: I4018507f6e0ee1b5c30139de301e09b3ec6fc494
2022-03-07 18:11:46 +00:00
Zuul
1807c07533 Merge "grafana: set custom home dashboard" 2022-03-07 13:55:02 +00:00
Zuul
53fbf72fdd Merge "Allow zuul-lb to send stats to graphite" 2022-03-07 05:23:29 +00:00
Zuul
b8576c09c0 Merge "Don't run infra-prod-run-refstack on all group var updates" 2022-03-07 05:13:56 +00:00
Ian Wienand
50600f49a2 grafana: set custom home dashboard
Set a home dashboard with a little logo, link to the source files and
a plain list of dashboards.

Change-Id: Ifa9373695c1edb7de83b342948d46a816702ee10
2022-03-07 12:45:03 +11:00
Clark Boylan
e2442eeaf0 Don't run infra-prod-run-refstack on all group var updates
This was running on all group var updates but we only need to run it
when refstack group vars update. Change the file requirements to match
the refstack.yaml group file to address this.
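
The effect of narrowing the file requirement can be sketched as
follows.  This uses Python's re.match as a stand-in for whatever
matching Zuul performs, and the paths are hypothetical, so treat it as
an illustration of the idea rather than the job definition itself.

```python
import re


def job_should_run(file_patterns, changed_files) -> bool:
    """True if any changed file matches any of the job's file patterns."""
    return any(re.match(p, f) for p in file_patterns for f in changed_files)


# Hypothetical patterns and paths, for illustration only.
BROAD = [r"inventory/service/group_vars/"]
NARROW = [r"inventory/service/group_vars/refstack\.yaml"]

unrelated_change = ["inventory/service/group_vars/gitea.yaml"]
refstack_change = ["inventory/service/group_vars/refstack.yaml"]

print(job_should_run(BROAD, unrelated_change))   # broad pattern triggers
print(job_should_run(NARROW, unrelated_change))  # narrow pattern does not
print(job_should_run(NARROW, refstack_change))   # still runs when needed
```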

Change-Id: Id5ed4b65c1ed6566696fea9a33db27e9318af1a6
2022-03-04 15:30:47 -08:00
James E. Blair
e97efd5fd5 Allow zuul-lb to send stats to graphite
Change-Id: Ib6bcd21555d34f80e1ace58cbd1cc7f479f92f7a
2022-03-04 15:26:21 -08:00
Clark Boylan
f24bbf97a7 Do more robust checks against zuul-web with haproxy
Switch the port 80 and 443 endpoints over to doing http checks instead
of tcp checks. This ensures that both apache and the zuul-web backend
are functional before balancing to them.
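
The difference between the two check types can be shown with a toy
example.  The sketch below is not haproxy; it assumes a front end that
still accepts connections while its backend is down and therefore
returns 503.  A TCP check passes, an HTTP check (treating 2xx/3xx as
healthy) fails.  All names are made up for this example.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class BrokenBackendHandler(BaseHTTPRequestHandler):
    """Toy front end whose backend is down: accepts connections, returns 503."""

    def do_GET(self):
        self.send_response(503)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def tcp_check(host: str, port: int) -> bool:
    """Healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=5):
            return True
    except OSError:
        return False


def http_check(host: str, port: int) -> bool:
    """Healthy only if GET / returns a 2xx or 3xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        resp.read()
        return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def run_demo() -> tuple:
    """Run both checks against the toy front end on a free local port."""
    server = HTTPServer(("127.0.0.1", 0), BrokenBackendHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return tcp_check("127.0.0.1", port), http_check("127.0.0.1", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```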

The fingergw remains a tcp check.

Change-Id: Iabe2d7822c9ef7e4514b9a0eb627f15b93ad48e2
2022-03-04 14:17:51 -08:00
James E. Blair
3f8acefbe1 Run zuul-web on zuul01 and add to load balancer
Change-Id: Ia8b10338fa3a1876993404276e0759f4b10d6b54
2022-03-04 13:11:09 -08:00
James E. Blair
bd70f7b9d5 Add zuul-lb01 to inventory
Change-Id: I1dd8ee520aba2c5c47801e751b5ed492f0efddd5
2022-03-04 11:17:02 -08:00
Zuul
4570e3064e Merge "Adds support for running zuul-registry as a non-root user" 2022-03-04 17:16:02 +00:00
Ian Wienand
4c86706e5e docs: reorganise around a open infrastructure overview
This introduces an "Open Infrastructure" page which is designed for a
moderately experienced developer with some understanding of Zuul,
Ansible and basic Linux admin skills to have an entrypoint to
navigating the system-config and related repositories.

It is designed to reinforce the idea of open infrastructure, and
explain how development, testing and production come together at a
level high enough to be understood, but with links or descriptions of
specific places in the code to get started.

It moves a little of what was in the sysadmin page into this, and
leaves that page as more low-level descriptions of various tasks.

Change-Id: I60a9299df455b98ad549ac0075a59d381722bc06
2022-03-04 12:18:42 +11:00
Zuul
c5b95b55fa Merge "Block access to Gitiles" 2022-03-03 22:22:09 +00:00
Clark Boylan
47c242ff21 Pull gerrit/plugins/gitiles from stable branch not tag
This plugin was updated to accommodate the ${hash} substitution in gerrit
gitweb weblinks. We now need this updated version to build Gerrit
successfully but there is no tag for it yet. Just use the branch to
address this.

Change-Id: I4b0fd4ac845cc4289f78aacfa536db4185f12d38
2022-03-03 10:48:03 -08:00
Jack Morgan
ded27cbb5d Adds support for running zuul-registry as a non-root user
Signed-off-by: Jack Morgan <jack@jento.io>
Change-Id: I89594affb04639b49b409a569036d6afac997251
2022-03-03 09:06:51 -08:00
Clark Boylan
6f178c2737 Add docs on restoring a gitea repository
We have discovered that it is possible for a gitea repository to become
corrupted. Since gitea is not the source of truth the easiest way to
handle this is to replace the repo with a new empty repository and have
Gerrit replicate back to it. This adds documentation that walks through
the process of doing this.

Change-Id: Ief990adaaf3cbb3c748bc9ee6ceb466a1104915a
2022-03-02 12:03:01 -08:00
Ian Wienand
93e2b84df0 zuul run-base: make sure we catch failures when teeing to logs
Change I5b9f9dd53eb896bb542652e8175c570877842584 introduced this tee
to capture and encrypt the logs.  However, we should make sure to fail
if the ansible runs fail.  Switch on pipefail, which will exit with an
error if the earlier parts of the pipeline fail.  Also make sure we
run under bash.
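
The behaviour is easy to reproduce in isolation.  In this minimal
sketch "false" stands in for a failing ansible run and /dev/null for
the log file; it assumes bash is available on the PATH and is not the
script from this change.

```python
import subprocess


def run_pipeline(script: str) -> int:
    """Run a shell snippet under bash and return its exit status."""
    return subprocess.run(["bash", "-c", script]).returncode


# 'false' stands in for a failing ansible-playbook run.
without_pipefail = run_pipeline("false | tee /dev/null")
with_pipefail = run_pipeline("set -o pipefail; false | tee /dev/null")

print(without_pipefail, with_pipefail)
```

Without pipefail the pipeline reports tee's exit status, so the
failure is hidden; with pipefail the failing command's status is
propagated.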

Change-Id: I2c4cb9aec3d4f8bb5bb93e2d2c20168dc64e78cb
2022-03-02 13:42:13 +11:00
Zuul
bb93b17c05 Merge "Remove airship-citycloud resources" 2022-03-01 22:02:53 +00:00
Clark Boylan
b7ccc12a6b Remove airship-citycloud resources
We've been told these resources are going away. Trying to remove them
gracefully from nodepool. Once that is done we can remove our configs
here.

Depends-On: https://review.opendev.org/c/openstack/project-config/+/831398
Change-Id: I396ca49ab33c09622dd398012528fe7172c39fe8
2022-03-01 11:39:53 -08:00
Zuul
012ba26d38 Merge "encrypt-logs: turn on for all prod playbooks" 2022-03-01 19:37:05 +00:00
Zuul
1b8fdec20e Merge "Remove Gerrit's JVM GC logs" 2022-02-28 03:28:49 +00:00
Zuul
6463275bf2 Merge "hound: enable detect-ref" 2022-02-27 21:51:39 +00:00
Zuul
36ceb62e51 Merge "Restore is:mergeable predicate in Gerrit" 2022-02-25 20:42:35 +00:00
Zuul
6d76448298 Merge "Update Etherpad to 1.8.17" 2022-02-25 17:09:14 +00:00
Ian Wienand
25f7403e2a hound: enable detect-ref
The dependent change enables the "detect-ref" option of hound, which
looks at the remote origin HEAD and indexes on that.  That should
allow indexing of our mixed repos that have a mix of "master" and
"main".

Add cirros to the test, which should exercise this path, and take some
screenshots because this is a js/react app and just a "curl" doesn't
help.
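
For reference, one way to discover which branch a remote HEAD points
at is to parse the output of "git ls-remote --symref <url> HEAD".  The
sketch below shows that parsing step only; it is an illustration of
the idea behind detect-ref, not hound's implementation, and the
function name and sample data are made up for this example.

```python
from typing import Optional


def default_branch(ls_remote_output: str) -> Optional[str]:
    """Extract the branch name from 'git ls-remote --symref <url> HEAD' output."""
    prefix = "ref: refs/heads/"
    for line in ls_remote_output.splitlines():
        fields = line.split("\t")
        if len(fields) == 2 and fields[1] == "HEAD" and fields[0].startswith(prefix):
            return fields[0][len(prefix):]
    return None


# Sample output shape; the object id is a placeholder.
SAMPLE_MAIN = (
    "ref: refs/heads/main\tHEAD\n"
    "0123456789abcdef0123456789abcdef01234567\tHEAD\n"
)

print(default_branch(SAMPLE_MAIN))
```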

Change-Id: I1850577c63566b594f9730f5b8f0bc10b07ff7e4
Depends-On: https://review.opendev.org/c/opendev/jeepyb/+/830919
2022-02-25 17:27:35 +11:00
Clark Boylan
7c9d9d7993 Remove Gerrit's JVM GC logs
These were added when we faced significant memory pressure on the old
server. That is no longer a problem and there is an issue with the
specification that breaks file compression due to destination files
already existing. It seems like the log specification is only able to
rotate once then it cannot keep moving files aside because they already
exist as eg jvm_gc.log.0.gz. This results in annoying errors in the
Gerrit error_log.

Note that it doesn't appear sufficient to remove this log specification;
we also need to move the existing jvm_gc.log* files aside or delete
them. This was tested on a held zuul node and I stopped gerrit, updated
the docker-compose file, moved the files aside, then started gerrit and
that got rid of the startup errors in error_log. Merely updating
docker-compose resulted in the same errors on startup.

Change-Id: Ied1464c57b2e8331b9bdf7cbc9ad74f92dea2dfd
2022-02-24 14:41:17 -08:00
Jeremy Stanley
9a25740961 Clean up two retired mailing lists
The enterprise-wg and product-wg lists were deleted from the
openstack site per the announcement[*] on 2022-02-01, but I
neglected to push a change to remove them from our configuration
management, so Ansible helpfully recreated them for me. Clean this
up so I can re-remove the lists once and for all.

[*] http://lists.openinfra.dev/pipermail/foundation/2022-February/003048.html

Change-Id: Iddcb5cbac68d426e0ad13dd41541ad1371366bb1
2022-02-24 19:34:19 +00:00
Clark Boylan
48d8f27101 Update Etherpad to 1.8.17
This brings bug fixes and performance improvements.

Change-Id: I9df00f68ea4062f318affc7cb73f5a20d2db46d8
2022-02-24 09:09:51 -08:00
Ian Wienand
3f6cd427d7 encrypt-logs: turn on for all prod playbooks
We have validated that the log encryption/export path is working, so
turn it on for all prod jobs.

Change-Id: Ic04d5b6e716dffedc925cb799e3630027183d890
2022-02-24 09:57:55 +11:00