Correct (again) how ansible-galaxy proxy is configured

The mix of <Location> and ProxyPass [path] <target> led to some issues.
This patch corrects them and makes the config more consistent.

Until now, the last URI actually returned an error page from the main
galaxy website. With this change, we now hit the S3 bucket as we should,
allowing ansible-galaxy to download the archive, validate its checksum,
and install the intended collection/role.
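
The redirect handling can be illustrated with a small Python sketch. It
only mimics the effect of the `Header edit Location` rule from the diff;
it is not how httpd implements it, and the sample redirect target is
hypothetical.

```python
import re

# Pattern and replacement copied from the `Header edit Location` rule in the diff.
S3_PREFIX = re.compile(r"^https://ansible-galaxy.s3.amazonaws.com/")
LOCAL_PREFIX = "/galaxy-s3/"


def rewrite_location(location: str) -> str:
    """Point an S3 redirect at the local mirror; leave other values untouched."""
    return S3_PREFIX.sub(LOCAL_PREFIX, location, count=1)


if __name__ == "__main__":
    # Hypothetical redirect target, for illustration only.
    print(rewrite_location("https://ansible-galaxy.s3.amazonaws.com/artifact/example.tar.gz"))
```

Because the rewritten value is a path under /galaxy-s3/, the client's
next request goes back through the mirror instead of straight to S3.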

This patch was fully tested locally using the httpd container image, a
minimal configuration adding only the needed modules and the
ansible-galaxy vhost/proxy, and running ansible-galaxy directly.

In addition, this patch improves the testing of the proxy: cURL now
follows the whole chain until the file is actually downloaded.
Since ansible galaxy will provide a file under any condition, we also
assert that the downloaded file is really what it should be: a plain
archive. A miss on the S3 side would yield JSON, and an answer from the
ansible galaxy website would yield HTML.
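
As a rough illustration of the three outcomes described above, the
following Python sketch classifies a downloaded payload. The function
name and the returned labels are assumptions made for this example; the
actual test relies on the `file` command instead.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def classify_payload(data: bytes) -> str:
    """Return 'archive', 'json', 'html' or 'unknown' for a downloaded body."""
    if data.startswith(GZIP_MAGIC):
        return "archive"
    stripped = data.lstrip()
    if stripped[:1] in (b"{", b"["):
        try:
            json.loads(stripped)
            return "json"
        except ValueError:  # covers JSON and UTF-8 decoding errors
            pass
    head = stripped[:256].lower()
    if head.startswith(b"<!doctype html") or b"<html" in head:
        return "html"
    return "unknown"


if __name__ == "__main__":
    # Sample bodies are made up for illustration.
    print(classify_payload(gzip.compress(b"collection content")))
    print(classify_payload(b'{"error": "not found"}'))
    print(classify_payload(b"<!DOCTYPE html><html><body>Galaxy</body></html>"))
```

Only the first case means the proxy chain reached the real artifact.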

The following commands were used:
Run the container:
podman run --rm --security-opt label=disable \
        -v ./httpd.conf:/usr/local/apache2/conf/httpd.conf:ro \
        -p 8080:8080 httpd:2.4

Run ansible-galaxy while ensuring we don't rely on its own internal
cache:
rm -rf operator ~/.ansible/galaxy_cache/api.json
ansible-galaxy collection download -vvvvvvv \
        -s http://localhost:8080/ -p ./operator tripleo.operator

Then, the following URIs were shown in the ansible-galaxy log:

http://localhost:8080/
http://localhost:8080/api
http://localhost:8080/api/v2/collections/tripleo/operator/
http://localhost:8080/api/v2/collections/tripleo/operator/versions/?page_size=100
http://localhost:8080/api/v2/collections/tripleo/operator/versions/0.9.0/

Then, the actual download:
http://localhost:8080/download/tripleo-operator-0.9.0.tar.gz

Then the checksum validation, and eventually it ended with:
Collection 'tripleo.operator:0.9.0' was downloaded successfully
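
For readers who want to reproduce the checksum step by hand, here is a
minimal Python sketch. It assumes the expected value is a SHA-256 hex
digest; this commit message does not state which algorithm
ansible-galaxy uses, and the function names are hypothetical.

```python
import hashlib
import io


def sha256_of(stream, chunk_size: int = 65536) -> str:
    """Hash a binary stream in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_archive(stream, expected_sha256: str) -> bool:
    """Compare the computed digest with the expected one, ignoring case."""
    return sha256_of(stream) == expected_sha256.strip().lower()


if __name__ == "__main__":
    # In-memory stand-in for a downloaded archive, for illustration only.
    body = b"not a real archive"
    expected = hashlib.sha256(body).hexdigest()
    print(verify_archive(io.BytesIO(body), expected))
```

With a real download, the stream would be the archive opened in binary
mode and the expected digest would come from the API answer.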

Change-Id: Ibfe846b59bf987df3f533802cb329e15ce83500b
Cédric Jeanneret 2023-01-11 13:35:51 +01:00
parent 515c295721
commit e7e5504a9e
3 changed files with 25 additions and 6 deletions
playbooks/roles/mirror
testinfra

@@ -54,6 +54,11 @@
state: present
name: ssl
- name: Apache headers module
apache2_module:
state: present
name: headers
- name: Apache webroot
file:
path: '{{ www_base }}'

@@ -567,8 +567,13 @@ ErrorLogFormat "[%{cu}t] [%-m:%l] [pid %P:tid %T] %7F: %E: [client\ %a] %M% , \
# 5GiB
CacheMaxFileSize 5368709120
CacheStoreExpired On
CacheIgnoreQueryString On
CacheDefaultExpire 86400
CacheIgnoreCacheControl On
CacheStorePrivate On
<Location "/">
CacheEnable disk
ProxyPass "https://galaxy.ansible.com/" ttl=120 keepalive=On retry=0
ProxyPassReverse "https://galaxy.ansible.com/"
SetOutputFilter INFLATE;SUBSTITUTE;DEFLATE
@@ -577,15 +582,19 @@ ErrorLogFormat "[%{cu}t] [%-m:%l] [pid %P:tid %T] %7F: %E: [client\ %a] %M% , \
# of the REQUEST_SCHEME. Note that mod_substitute can't use parameters...
<If "-T %{HTTPS}">
Substitute "s|https://galaxy.ansible.com/|https://{{ apache_server_name }}:$port/|ni"
Substitute "s|https://ansible-galaxy.s3.amazonaws.com/|https://{{ apache_server_name }}:$port/galaxy-s3/|ni"
</If>
<If "! -T %{HTTPS}">
Substitute "s|https://galaxy.ansible.com/|http://{{ apache_server_name }}:$port/|ni"
Substitute "s|https://ansible-galaxy.s3.amazonaws.com/|http://{{ apache_server_name }}:$port/galaxy-s3/|ni"
</If>
# Substitute doesn't edit headers - ansible-galaxy sets a Location header for the final link
# to S3 bucket and content. Let's override it in order to point to our local endpoint
Header edit Location "^https://ansible-galaxy.s3.amazonaws.com/" "/galaxy-s3/"
</Location>
<Location "/galaxy-s3/">
CacheEnable disk
ProxyPass "https://ansible-galaxy.s3.amazonaws.com/" ttl=120 keepalive=On retry=0
ProxyPassReverse "https://ansible-galaxy.s3.amazonaws.com/"
</Location>
ProxyPass "/galaxy-s3/" "https://ansible-galaxy.s3.amazonaws.com/" ttl=120 keepalive=On retry=0
ProxyPassReverse "/galaxy-s3/" "https://ansible-galaxy.s3.amazonaws.com/"
ErrorLog /var/log/apache2/proxy_$port_error.log
LogLevel warn

@@ -23,9 +23,9 @@ def test_apache(host):
apache = host.service('apache2')
assert apache.is_running
def _run_cmd(host, port, scheme='https', url=''):
def _run_cmd(host, port, scheme='https', url='', curl_opt=''):
hostname = host.backend.get_hostname()
return f'curl --resolve {hostname}:127.0.0.1 {scheme}://{hostname}:{port}{url}'
return f'curl {curl_opt} --resolve {hostname}:127.0.0.1 {scheme}://{hostname}:{port}{url}'
def test_base_mirror(host):
# base mirror
@@ -87,3 +87,8 @@ def test_galaxy_mirror(host):
answer = json.loads(cmd.stdout)
download_uri = answer['download_url']
assert download_uri.startswith('https://{}:4448/download/community-general'.format(hostname))
# Download a file and check we get an actual archive
download_uri = download_uri.replace('https://{}:4448'.format(hostname), '')
host.run(_run_cmd(host, 4448, url=download_uri, curl_opt='-sL --output /tmp/output.tar.gz'))
check_file = host.run('file /tmp/output.tar.gz')
assert 'gzip compressed data' in check_file.stdout
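
The `file`-based assertion above only confirms gzip compression. A
stricter variant, sketched here as an assumption and not part of this
patch, would also check that the payload opens as a tar archive. The
helper names are hypothetical.

```python
import io
import tarfile
import zlib


def is_gzipped_tar(data: bytes) -> bool:
    """Return True only if the bytes open as a gzip-compressed tar archive."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
            archive.getmembers()  # force a full read of the member headers
        return True
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False


def make_sample_archive() -> bytes:
    """Build a tiny in-memory .tar.gz, used here only as demo input."""
    payload = b"{}"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo("MANIFEST.json")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    print(is_gzipped_tar(make_sample_archive()))
    print(is_gzipped_tar(b'{"error": "not found"}'))
```

In the test suite this would mean reading /tmp/output.tar.gz back from
the host, so the simpler `file` check may well be the better trade-off.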