promote-container-image: add promote_container_image_method

After recent conversations, we've come to the conclusion that it will
be good to have two models of promotion:

 - using tags, where gate directly uploads to the final repository and
   promote retags the image.

 - from an intermediate registry, where upload stores the built image
   in an intermediate registry and the promote step uploads to the
   final registry.

To facilitate this, we add a "promote_container_image_method" flag to
the promote roles.

The documentation is expanded to explain how all this is intended to
work together.

These roles haven't been publicised yet, but this should be a no-op as
it defaults to tags, which is the current operation.

c.f. Ia24bbd101e01ab371ceacfed006b5ff806418a97

Change-Id: I1c25f60f835b1cab983bcdd169eeffc0e250a56c
Ian Wienand 2023-03-30 09:14:26 +11:00
parent d7e5559e58
commit 0a64d51c3d
5 changed files with 193 additions and 62 deletions
playbooks/container-image
roles
build-container-image
promote-container-image


@ -6,29 +6,126 @@ context:
* :zuul:job:`upload-container-image`: Build and stage the images in a registry.
* :zuul:job:`promote-container-image`: Promote previously uploaded images.
The :zuul:job:`build-container-image` job is designed to be used in
a `check` pipeline and simply builds the images to verify that
the build functions.
The :zuul:job:`upload-container-image` job builds and uploads the
images to a registry, but only with a single tag corresponding to the
change ID. This job is designed in a `gate` pipeline so that the
build produced by the gate is staged and can later be promoted to
production if the change is successful.
The :zuul:job:`promote-container-image` job is designed to be
used in a `promote` pipeline. It requires no nodes and runs very
quickly on the Zuul executor. It simply re-tags a previously uploaded
image for a change with whatever tags are supplied by
:zuul:jobvar:`build-container-image.container_images.tags`.
It also removes the change ID tag from the repository in the registry.
If any changes fail to merge, this cleanup will not run and those tags
will need to be deleted manually.
The jobs can work in multiple modes depending on your requirements.
They all accept the same input data, principally a list of
dictionaries representing the images to build. YAML anchors_ can be
used to supply the same data to all three jobs.
*Promotion via tags*
The :zuul:job:`build-container-image` job runs in the `check` pipeline
to validate the change.
The :zuul:job:`upload-container-image` job runs in the `gate` pipeline
and builds and uploads the images to a remote registry, but only with
a single temporary tag corresponding to the change ID. This is a
*speculative* upload; the change is not "live" (the main tag is not
updated) and other gate jobs may fail and the change may not merge,
effectively invalidating the upload.
The :zuul:job:`promote-container-image` job runs in a post-merge
`promote` pipeline. It requires no nodes and runs very quickly on the
Zuul executor. It simply re-tags a previously uploaded image for a
change with whatever tags are supplied by
:zuul:jobvar:`build-container-image.container_images.tags` after the
code has merged. It also cleans up and removes the change ID tag from
the repository in the registry. If any changes fail to merge, this
cleanup will not run and those tags will need to be deleted manually.
The advantage of this method is that it minimises the window in which
the published image differs from the merged code. There are some
caveats to be aware of. `gate` failures may mean that unused layers
and tags are present in the remote repository, which need to be
cleaned up. Removing registry tags is not a generic option; you will
need to check the promote role documentation to ensure you are passing
the right registry details so tags can be cleaned up.
In the `tag` and `release` pipelines there is no need for a
speculative upload (the tagged/released change is committed code and
has already passed gate tests). In this case, the
:zuul:job:`upload-container-image` job is run with the flag
``upload_container_image_promote: false`` to directly build and push
with the final tags.
Summary:
* :zuul:job:`build-container-image` in `check`
* :zuul:job:`upload-container-image` in `gate`
* :zuul:job:`promote-container-image` in `promote` with
``promote_container_image_method: tag``
* :zuul:job:`upload-container-image` with
``upload_container_image_promote: false`` in `tag` and `release`
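
As a rough sketch of the summary above, the jobs might be wired into a
project as follows. The ``example-*`` job names, the registry host and
the repository are hypothetical, the ``context`` key is assumed from
the build role's image dictionary, and the registry credential secrets
are omitted.

.. code-block:: yaml

   # Hypothetical child jobs sharing one image list via a YAML anchor.
   - job:
       name: example-build-image
       parent: build-container-image
       vars: &example_image_vars
         container_images:
           - context: .                      # assumed key
             registry: registry.example.org  # hypothetical host
             repository: example/app         # hypothetical repository
             tags:
               - latest

   - job:
       name: example-upload-image
       parent: upload-container-image
       vars: *example_image_vars

   - job:
       name: example-promote-image
       parent: promote-container-image
       vars: *example_image_vars

   - project:
       check:
         jobs:
           - example-build-image
       gate:
         jobs:
           # Speculative upload under the temporary change ID tag.
           - example-upload-image
       promote:
         jobs:
           # Re-tag the speculative upload with the final tags.
           - example-promote-image:
               vars:
                 promote_container_image_method: tag
       tag:
         jobs:
           # No speculative step: build and push the final tags directly.
           - example-upload-image:
               vars:
                 upload_container_image_promote: false
       release:
         jobs:
           - example-upload-image:
               vars:
                 upload_container_image_promote: false

Because ``tag`` is the default method, setting it explicitly in the
`promote` pipeline is optional; it is shown here only to make the
chosen mode visible.
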
*Promotion via intermediate registry*
Note that as of 2023-03, this path is not fully implemented. It is
documented here for completeness.
The :zuul:job:`build-container-image` job runs in the `check` pipeline,
and also in the `gate` pipeline. Usually in both cases the job builds
and uploads the images to an intermediate registry; at a minimum, the
`gate` pipeline job must do so.
The :zuul:job:`promote-container-image` job is designed to be used in
a post-merge `promote` pipeline. It requires no nodes and runs on the
Zuul executor. It inspects the artifacts of the gate job to find the
correct tags to pull from the intermediate registry. It then uploads
this image from the intermediate registry to the remote registry with
the final tags supplied by
:zuul:jobvar:`build-container-image.container_images.tags`.
In the `tag` and `release` pipelines the
:zuul:job:`upload-container-image` job is run with the flag
``upload_container_image_promote: false`` to directly build and push
with the final tags.
The advantage of this method is that no partial or unused images will
ever be present in the final repository. Copying from the
intermediate registry effectively caches the expensive build process.
This means that although the window that the production tags are
out-of-sync with the merged code is larger than when using speculative
uploads, it is smaller than having to rebuild *and* upload the image.
Copying is a generic operation, so it should work with any registry.
The layer upload has more exposure to transient errors than the
``tag`` promotion step, so needs to be monitored more carefully. You
also must manage an external intermediate registry to hold the image
between upload and promote steps in this model.
Summary:
* :zuul:job:`build-container-image` in `check`
* :zuul:job:`build-container-image` in `gate`. This must push to an
intermediate registry.
* :zuul:job:`promote-container-image` in `promote` with
``promote_container_image_method: intermediate-registry``
* :zuul:job:`upload-container-image` with
``upload_container_image_promote: false`` in `tag` and `release`
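
A project stanza for this mode might look roughly like the sketch
below. It assumes hypothetical jobs ``example-build-image``,
``example-upload-image`` and ``example-promote-image``, parented to
:zuul:job:`build-container-image`, :zuul:job:`upload-container-image`
and :zuul:job:`promote-container-image` respectively and sharing the
same image list. How the build job is pointed at the intermediate
registry is not shown, and the promote step does not yet complete in
this mode.

.. code-block:: yaml

   - project:
       check:
         jobs:
           - example-build-image
       gate:
         jobs:
           # This run must push its result to the intermediate registry
           # (configuration not shown).
           - example-build-image
       promote:
         jobs:
           # Copy the gate-built image to the final registry.
           - example-promote-image:
               vars:
                 promote_container_image_method: intermediate-registry
       tag:
         jobs:
           - example-upload-image:
               vars:
                 upload_container_image_promote: false
       release:
         jobs:
           - example-upload-image:
               vars:
                 upload_container_image_promote: false
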
*Publish via full release*
The :zuul:job:`build-container-image` job runs in the `check` pipeline
to validate the change.
The :zuul:job:`build-container-image` job also runs in the `gate`
pipeline to validate the change before merge.
Once the change has merged, the :zuul:job:`upload-container-image` job
is run with the flag ``upload_container_image_promote: false`` to
directly build and push with the final tags. This is also run in the
`tag` and `release` pipelines in the same way.
The advantage of this mode is that it requires no external
dependencies or management of speculative uploads. The disadvantage
is that it has the longest window in which the published image is out
of sync with the merged code, as the post-merge release process must
rebuild the entire container image and upload it.
Summary:
* :zuul:job:`build-container-image` in `check`
* :zuul:job:`build-container-image` in `gate`
* :zuul:job:`upload-container-image` with
``upload_container_image_promote: false`` after code merge, and in
the `tag` and `release` pipelines.
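
A minimal sketch of this mode follows. It assumes hypothetical jobs
``example-build-image`` and ``example-upload-image`` parented to
:zuul:job:`build-container-image` and
:zuul:job:`upload-container-image`, and assumes the post-merge pipeline
is named ``post``; substitute whatever post-merge pipeline your Zuul
tenant defines.

.. code-block:: yaml

   - project:
       check:
         jobs:
           - example-build-image
       gate:
         jobs:
           - example-build-image
       post:  # assumed name of the post-merge pipeline
         jobs:
           # Full rebuild and upload with the final tags after merge.
           - example-upload-image:
               vars:
                 upload_container_image_promote: false
       tag:
         jobs:
           - example-upload-image:
               vars:
                 upload_container_image_promote: false
       release:
         jobs:
           - example-upload-image:
               vars:
                 upload_container_image_promote: false
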
**Job Variables**
.. zuul:jobvar:: zuul_work_dir


@ -22,9 +22,9 @@ use of subsequent roles to upload the images to a registry.
The :zuul:role:`upload-container-image` role uploads the images to a
registry. It can be used in one of two modes:
1. The default mode is as part of a two-step `promote` pipeline. This
mode is designed to minimize the time the published registry tag is
out of sync with the changes Zuul has merged to the underlying code
1. Using tags as part of a two-step `promote` pipeline. This mode is
designed to minimize the time the published registry tag is out of
sync with the changes Zuul has merged to the underlying code
repository.
In this mode, the role is intended to run in the `gate` pipeline.
@ -45,13 +45,23 @@ registry. It can be used in one of two modes:
to by ``<tag>`` will now reflect the underlying code closing the
out-of-sync window.
2. The other mode allows for use of this job in a `release` pipeline
to directly upload a release build with the final set of tags.
2. The second mode allows for use of this job in `release` and `tag`
pipelines to directly upload a release build with the final set of
tags.
In this mode, the completion of the `gate` jobs will have merged
the code changes, and the role will now have to build and upload
the resulting image to the remote repository. Once uploaded, the
tags will be updated.
In this mode, ``upload_container_image_promote: false`` should be
set. The role will build and upload the resulting image to the
remote repository with the final tags.
This should be used with `tag` and `release` pipelines, where
committed code has been tagged for publishing. The tagged commit
is "known good" thanks to gating, so the build and upload process
is expected to work unconditionally.
This can be used in a post-commit pipeline, with the caveat that it
has a much longer window where published code is out of sync with
the published image, as the image must be completely rebuilt and
uploaded after code merge in the `gate` job.
The alternative `promote` method can be thought of as a
"speculative" upload. There is a possibility the `gate` job
@ -77,9 +87,11 @@ registry. It can be used in one of two modes:
*Promoting*
As discussed above, the :zuul:role:`promote-container-image` role is
designed to be used in a `promote` pipeline. It re-tags a previously
uploaded image by copying the temporary change-id based tags made
during upload to the final production tags supplied by
designed to be used in a `promote` pipeline.
In ``tag`` mode, it re-tags a previously uploaded image by copying the
temporary change-id based tags made during upload to the final
production tags supplied by
:zuul:rolevar:`build-container-image.container_images.tags`. It is
intended to run very quickly and with no dependencies, so it can run
directly on the Zuul executor.
@ -90,6 +102,11 @@ the registry, and removes any similar change-ids tags. This keeps the
repository tidy in the case that gated changes fail to merge after
uploading their staged images.
In ``intermediate-registry`` mode, this role queries Zuul to find the
build performed by the build role in the ``gate`` pipeline. It then
copies this image from the intermediate registry to the final location
in the remote registry.
*Dependencies*
Use the :zuul:role:`ensure-skopeo` role as well as the


@ -1,3 +1,12 @@
Promote one or more previously uploaded container images.
.. include:: ../../roles/build-container-image/common.rst
.. zuul:rolevar:: promote_container_image_method
:type: string
:default: tag
If ``tag`` (the default), then this role will update tags created
by the upload-container-image role. Set to
``intermediate-registry`` to have this role copy an image created
and pushed to an intermediate registry by the build-container-image
role.
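
For illustration, a playbook could select the method explicitly when
invoking the role. This is a sketch only: the ``localhost`` play target
is an assumption based on the role running on the Zuul executor, and
the image list and registry credentials are expected to arrive through
job variables and secrets.

.. code-block:: yaml

   - hosts: localhost
     roles:
       - role: promote-container-image
         # "tag" is the default; "intermediate-registry" currently
         # stops with a "not yet complete" failure message.
         promote_container_image_method: tag
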


@ -1,32 +1,8 @@
- name: Verify repository names
when: |
container_registry_credentials is defined
and zj_image.registry not in container_registry_credentials
loop: "{{ container_images }}"
loop_control:
loop_var: zj_image
- name: Promote container image with tags
when: promote_container_image_method|default('tag') == 'tag'
include_tasks: promote-from-tag.yaml
- name: Promote container image with intermediate registry
when: promote_container_image_method|default('tag') == 'intermediate-registry'
fail:
msg: "{{ zj_image.registry }} credentials not found"
- name: Verify repository permission
when: |
container_registry_credentials[zj_image.registry].repository is defined and
not zj_image.repository | regex_search(container_registry_credentials[zj_image.registry].repository)
loop: "{{ container_images }}"
loop_control:
loop_var: zj_image
fail:
msg: "{{ zj_image.repository }} not permitted by {{ container_registry_credentials[zj_image.registry].repository }}"
- name: Promote image
loop: "{{ container_images }}"
loop_control:
loop_var: zj_image
include_tasks: promote-retag.yaml
# The docker roles prune obsolete tags here, but that relies on a
# timestamp to make sure we're not deleting in-progress tags (that the
# gate pipeline may be uploading at the same time we're promoting).
# That timestamp is not available with skopeo list-tags, so some other
# mechanism will need to be devised to clean them up. In the
# meantime, we hope that the cleanup in promote-retag succeeds.
msg: 'The intermediate-registry promote role is not yet complete'


@ -0,0 +1,32 @@
- name: Verify repository names
when: |
container_registry_credentials is defined
and zj_image.registry not in container_registry_credentials
loop: "{{ container_images }}"
loop_control:
loop_var: zj_image
fail:
msg: "{{ zj_image.registry }} credentials not found"
- name: Verify repository permission
when: |
container_registry_credentials[zj_image.registry].repository is defined and
not zj_image.repository | regex_search(container_registry_credentials[zj_image.registry].repository)
loop: "{{ container_images }}"
loop_control:
loop_var: zj_image
fail:
msg: "{{ zj_image.repository }} not permitted by {{ container_registry_credentials[zj_image.registry].repository }}"
- name: Promote image
loop: "{{ container_images }}"
loop_control:
loop_var: zj_image
include_tasks: promote-retag.yaml
# The docker roles prune obsolete tags here, but that relies on a
# timestamp to make sure we're not deleting in-progress tags (that the
# gate pipeline may be uploading at the same time we're promoting).
# That timestamp is not available with skopeo list-tags, so some other
# mechanism will need to be devised to clean them up. In the
# meantime, we hope that the cleanup in promote-retag succeeds.