From 0a64d51c3d39d197c33e30dbb4f94d2a5db4dc6f Mon Sep 17 00:00:00 2001
From: Ian Wienand
Date: Thu, 30 Mar 2023 09:14:26 +1100
Subject: [PATCH] promote-container-image: add promote_container_image_method

After recent conversations, we've come to the conclusion it will be
good to have two models of promotion

 - using tags, where gate directly uploads to the final repository and
   promote retags the image.

 - from an intermediate-registry, where upload stores the built image
   in an i-r and the promote step uploads to the final registry.

To facilitate this, we add a "promote_container_image_method" flag to
the promote roles. The documentation is expanded to explain how all
this is intended to work together.

These roles haven't been publicised yet, but this should be a no-op as
it defaults to tags, which is the current operation.

c.f. Ia24bbd101e01ab371ceacfed006b5ff806418a97
Change-Id: I1c25f60f835b1cab983bcdd169eeffc0e250a56c
---
 playbooks/container-image/README.rst          | 135 +++++++++++++++---
 roles/build-container-image/common.rst        |  41 ++++--
 roles/promote-container-image/README.rst      |   9 ++
 roles/promote-container-image/tasks/main.yaml |  38 +----
 .../tasks/promote-from-tag.yaml               |  32 +++++
 5 files changed, 193 insertions(+), 62 deletions(-)
 create mode 100644 roles/promote-container-image/tasks/promote-from-tag.yaml

diff --git a/playbooks/container-image/README.rst b/playbooks/container-image/README.rst
index 5eeea6444..b788e0da2 100644
--- a/playbooks/container-image/README.rst
+++ b/playbooks/container-image/README.rst
@@ -6,29 +6,126 @@ context:
 * :zuul:job:`upload-container-image`: Build and stage the images in a registry.
 * :zuul:job:`promote-container-image`: Promote previously uploaded images.
 
-The :zuul:job:`build-container-image` job is designed to be used in
-a `check` pipeline and simply builds the images to verify that
-the build functions.
-
-The :zuul:job:`upload-container-image` job builds and uploads the
-images to a registry, but only with a single tag corresponding to the
-change ID. This job is designed in a `gate` pipeline so that the
-build produced by the gate is staged and can later be promoted to
-production if the change is successful.
-
-The :zuul:job:`promote-container-image` job is designed to be
-used in a `promote` pipeline. It requires no nodes and runs very
-quickly on the Zuul executor. It simply re-tags a previously uploaded
-image for a change with whatever tags are supplied by
-:zuul:jobvar:`build-container-image.container_images.tags`.
-It also removes the change ID tag from the repository in the registry.
-If any changes fail to merge, this cleanup will not run and those tags
-will need to be deleted manually.
-
+The jobs can work in multiple modes depending on your requirements.
 They all accept the same input data, principally a list of
 dictionaries representing the images to build. YAML anchors_ can be
 used to supply the same data to all three jobs.
 
+*Promotion via tags*
+
+The :zuul:job:`build-container-image` job runs in the `check` pipeline
+to validate the change.
+
+The :zuul:job:`upload-container-image` job runs in the `gate` pipeline
+and builds and uploads the images to a remote registry, but only with
+a single temporary tag corresponding to the change ID. This is a
+*speculative* upload; the change is not "live" (the main tag is not
+updated) and other gate jobs may fail and the change may not merge,
+effectively invalidating the upload.
+
+The :zuul:job:`promote-container-image` job runs in a post-merge
+`promote` pipeline. It requires no nodes and runs very quickly on the
+Zuul executor. It simply re-tags a previously uploaded image for a
+change with whatever tags are supplied by
+:zuul:jobvar:`build-container-image.container_images.tags` after the
+code has merged. It also cleans up and removes the change ID tag from
+the repository in the registry. If any changes fail to merge, this
+cleanup will not run and those tags will need to be deleted manually.
+
+The advantage of this method is that it minimises the window in which
+the published image differs from the merged code. There are some
+caveats to be aware of. `gate` failures may mean that unused layers
+and tags are present in the remote repository, which need to be
+cleaned up. Removing registry tags is not a generic option; you will
+need to check the promote role documentation to ensure you are passing
+the right registry details so tags can be cleaned up.
+
+In the `tag` and `release` pipelines there is no need for a
+speculative upload (the tagged/released change is committed code and
+has already passed gate tests). In this case, the
+:zuul:job:`upload-container-image` job is run with the flag
+``upload_container_image_promote: false`` to directly build and push
+with the final tags.
+
+Summary:
+
+* :zuul:job:`build-container-image` in `check`
+* :zuul:job:`upload-container-image` in `gate`
+* :zuul:job:`promote-container-image` in `promote` with
+  ``promote_container_image_method: tag``
+* :zuul:job:`upload-container-image` with
+  ``upload_container_image_promote: false`` in `tag` and `release`
+
+*Promotion via intermediate registry*
+
+Note that as of 2023-03, this path is not fully implemented. It is
+documented here for completeness.
+
+The :zuul:job:`build-container-image` job runs in the `check` pipeline,
+but also in the `gate` pipeline. Usually in both cases the job builds
+and uploads the images to an intermediate registry; but at least the
+`gate` pipeline job must.
+
+The :zuul:job:`promote-container-image` job is designed to be used in
+a post-merge `promote` pipeline. It requires no nodes and runs on the
+Zuul executor. It inspects the artifacts of the gate job to find the
+correct tags to pull from the intermediate registry. It then uploads
+this image from the intermediate registry to the remote registry with
+the final tags supplied by
+:zuul:jobvar:`build-container-image.container_images.tags`.
+
+In the `tag` and `release` pipelines the
+:zuul:job:`upload-container-image` job is run with the flag
+``upload_container_image_promote: false`` to directly build and push
+with the final tags.
+
+The advantage of this method is that no partial or unused images will
+ever be present in the final repository. Copying from the
+intermediate registry effectively caches the expensive build process.
+This means that although the window in which the production tags are
+out-of-sync with the merged code is larger than when using speculative
+uploads, it is smaller than having to rebuild *and* upload the image.
+Copying is a generic operation, so it should work with any registry.
+The layer upload has more exposure to transient errors than the
+``tag`` promotion step, so needs to be monitored more carefully. You
+also must manage an external intermediate registry to hold the image
+between upload and promote steps in this model.
+
+Summary:
+
+* :zuul:job:`build-container-image` in `check`
+* :zuul:job:`build-container-image` in `gate`. This must push to an
+  intermediate registry.
+* :zuul:job:`promote-container-image` in `promote` with
+  ``promote_container_image_method: intermediate-registry``
+* :zuul:job:`upload-container-image` with
+  ``upload_container_image_promote: false`` in `tag` and `release`
+
+*Publish via full release*
+
+The :zuul:job:`build-container-image` job runs in the `check` pipeline
+to validate the change.
+
+The :zuul:job:`build-container-image` job also runs in the `gate`
+pipeline to validate the change before merge.
+
+Once the change has merged, the :zuul:job:`upload-container-image` job
+is run with the flag ``upload_container_image_promote: false`` to
+directly build and push with the final tags. This is also run in the
+`tag` and `release` pipelines in the same way.
+
+The advantage of this mode is that it requires no external
+dependencies or management of speculative uploads. The disadvantage
+is that it has the longest window where the published image is out of
+sync with the merged code, as the post-merge release process must
+re-build the entire container and upload it.
+
+* :zuul:job:`build-container-image` in `check`
+* :zuul:job:`build-container-image` in `gate`
+* :zuul:job:`upload-container-image` with
+  ``upload_container_image_promote: false`` after code merge, and in
+  the `tag` and `release` pipelines.
+
 **Job Variables**
 
 .. zuul:jobvar:: zuul_work_dir
diff --git a/roles/build-container-image/common.rst b/roles/build-container-image/common.rst
index 52583594e..737887001 100644
--- a/roles/build-container-image/common.rst
+++ b/roles/build-container-image/common.rst
@@ -22,9 +22,9 @@ use of subsequent roles to upload the images to a registry.
 The :zuul:role:`upload-container-image` role uploads the images to a
 registry. It can be used in one of two modes:
 
-1. The default mode is as part of a two-step `promote` pipeline. This
-   mode is designed to minimize the time the published registry tag is
-   out of sync with the changes Zuul has merged to the underlying code
+1. Using tags as part of a two-step `promote` pipeline. This mode is
+   designed to minimize the time the published registry tag is out of
+   sync with the changes Zuul has merged to the underlying code
    repository.
 
    In this mode, the role is intended to run in the `gate` pipeline.
@@ -45,13 +45,23 @@ registry. It can be used in one of two modes:
    to by ```` will now reflect the underlying code closing the
    out-of-sync window.
 
-2. The other mode allows for use of this job in a `release` pipeline
-   to directly upload a release build with the final set of tags.
+2. The second mode allows for use of this job in `release` and `tag`
+   pipelines to directly upload a release build with the final set of
+   tags.
 
-   In this mode, the completion of the `gate` jobs will have merged
-   the code changes, and the role will now have to build and upload
-   the resulting image to the remote repository. Once uploaded, the
-   tags will be updated.
+   In this mode, ``upload_container_image_promote: false`` should be
+   set. The role will build and upload the resulting image to the
+   remote repository with the final tags.
+
+   This should be used with `tag` and `release` pipelines, where
+   committed code has been tagged for publishing. The tagged commit
+   is "known good" thanks to gating, so the build and upload process
+   is expected to work unconditionally.
+
+   This can be used in a post-commit pipeline, with the caveat that it
+   has a much longer window where published code is out of sync with
+   the published image, as the image must be completely rebuilt and
+   uploaded after code merge in the `gate` job.
 
    The alternative `promote` method can be thought of as a
    "speculative" upload. There is a possibility the `gate` job
@@ -77,9 +87,11 @@ registry. It can be used in one of two modes:
 *Promoting*
 
 As discussed above, the :zuul:role:`promote-container-image` role is
-designed to be used in a `promote` pipeline. It re-tags a previously
-uploaded image by copying the temporary change-id based tags made
-during upload to the final production tags supplied by
+designed to be used in a `promote` pipeline.
+
+In ``tag`` mode, it re-tags a previously uploaded image by copying the
+temporary change-id based tags made during upload to the final
+production tags supplied by
 :zuul:rolevar:`build-container-image.container_images.tags`. It is
 intended to run very quickly and with no dependencies, so it can run
 directly on the Zuul executor.
@@ -90,6 +102,11 @@ the registry, and removes any similar change-ids tags. This keeps the
 repository tidy in the case that gated changes fail to merge after
 uploading their staged images.
 
+In ``intermediate-registry`` mode, this role queries Zuul to find the
+build performed by the build role in the ``gate``. It then copies
+this image from the intermediate-registry to the final location in the
+remote registry.
+
 *Dependencies*
 
 Use the :zuul:role:`ensure-skopeo` role as well as the
diff --git a/roles/promote-container-image/README.rst b/roles/promote-container-image/README.rst
index de99c32e8..dc97afd8d 100644
--- a/roles/promote-container-image/README.rst
+++ b/roles/promote-container-image/README.rst
@@ -1,3 +1,12 @@
 Promote one or more previously uploaded container images.
 
 .. include:: ../../roles/build-container-image/common.rst
+
+.. zuul:rolevar:: promote_container_image_method
+   :type: string
+   :default: tag
+
+   If ``tag`` (the default), then this role will update tags created
+   by the upload-container-image role. Set to
+   ``intermediate-registry`` to have this role copy an image created
+   and pushed to an intermediate registry by the build-container-image role.
diff --git a/roles/promote-container-image/tasks/main.yaml b/roles/promote-container-image/tasks/main.yaml
index 2a14f2e15..aea1b0a22 100644
--- a/roles/promote-container-image/tasks/main.yaml
+++ b/roles/promote-container-image/tasks/main.yaml
@@ -1,32 +1,8 @@
-- name: Verify repository names
-  when: |
-    container_registry_credentials is defined
-    and zj_image.registry not in container_registry_credentials
-  loop: "{{ container_images }}"
-  loop_control:
-    loop_var: zj_image
+- name: Promote container image with tags
+  when: promote_container_image_method|default('tag') == 'tag'
+  include_tasks: promote-from-tag.yaml
+
+- name: Promote container image with intermediate registry
+  when: promote_container_image_method|default('tag') == 'intermediate-registry'
   fail:
-    msg: "{{ zj_image.registry }} credentials not found"
-
-- name: Verify repository permission
-  when: |
-    container_registry_credentials[zj_image.registry].repository is defined and
-    not zj_image.repository | regex_search(container_registry_credentials[zj_image.registry].repository)
-  loop: "{{ container_images }}"
-  loop_control:
-    loop_var: zj_image
-  fail:
-    msg: "{{ zj_image.repository }} not permitted by {{ container_registry_credentials[zj_image.registry].repository }}"
-
-- name: Promote image
-  loop: "{{ container_images }}"
-  loop_control:
-    loop_var: zj_image
-  include_tasks: promote-retag.yaml
-
-# The docker roles prune obsolete tags here, but that relies on a
-# timestamp to make sure we're not deleting in-progress tags (that the
-# gate pipeline may be uploading at the same time we're promoting).
-# That timestamp is not available with skopeo list-tags, so some other
-# mechanism will need to be devised to clean them up. In the
-# meantime, we hope that the cleanup in promote-retag succeeds.
+    msg: 'The intermediate-registry promote role is not yet complete'
diff --git a/roles/promote-container-image/tasks/promote-from-tag.yaml b/roles/promote-container-image/tasks/promote-from-tag.yaml
new file mode 100644
index 000000000..2a14f2e15
--- /dev/null
+++ b/roles/promote-container-image/tasks/promote-from-tag.yaml
@@ -0,0 +1,32 @@
+- name: Verify repository names
+  when: |
+    container_registry_credentials is defined
+    and zj_image.registry not in container_registry_credentials
+  loop: "{{ container_images }}"
+  loop_control:
+    loop_var: zj_image
+  fail:
+    msg: "{{ zj_image.registry }} credentials not found"
+
+- name: Verify repository permission
+  when: |
+    container_registry_credentials[zj_image.registry].repository is defined and
+    not zj_image.repository | regex_search(container_registry_credentials[zj_image.registry].repository)
+  loop: "{{ container_images }}"
+  loop_control:
+    loop_var: zj_image
+  fail:
+    msg: "{{ zj_image.repository }} not permitted by {{ container_registry_credentials[zj_image.registry].repository }}"
+
+- name: Promote image
+  loop: "{{ container_images }}"
+  loop_control:
+    loop_var: zj_image
+  include_tasks: promote-retag.yaml
+
+# The docker roles prune obsolete tags here, but that relies on a
+# timestamp to make sure we're not deleting in-progress tags (that the
+# gate pipeline may be uploading at the same time we're promoting).
+# That timestamp is not available with skopeo list-tags, so some other
+# mechanism will need to be devised to clean them up. In the
+# meantime, we hope that the cleanup in promote-retag succeeds.
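
Usage note (not part of the change itself): the "promotion via tags" layout described in the README hunk could be wired into a project roughly as sketched below. This is a minimal sketch under stated assumptions, not the configuration used by any real deployment. The registry host, repository name, and tag are hypothetical placeholders. Registry credentials are omitted, and in practice the upload and promote jobs are usually wrapped in child jobs that attach a secret. The `release` pipeline name is taken from the README text.

```yaml
# Hypothetical Zuul project stanza; all image details are placeholders.
- project:
    check:
      jobs:
        - build-container-image:
            vars:
              # Anchor the image list so every job receives the same data.
              container_images: &container_images
                - context: .
                  registry: registry.example.org   # placeholder host
                  repository: example/app          # placeholder repository
                  tags:
                    - latest
    gate:
      jobs:
        # Speculative upload under a temporary change-ID tag.
        - upload-container-image:
            vars:
              container_images: *container_images
    promote:
      jobs:
        - promote-container-image:
            vars:
              container_images: *container_images
              # "tag" is the default; spelled out here for clarity.
              promote_container_image_method: tag
    release:
      jobs:
        # No speculative step: build and push with the final tags.
        - upload-container-image:
            vars:
              container_images: *container_images
              upload_container_image_promote: false
```

Leaving `promote_container_image_method` unset should behave the same way, since `tasks/main.yaml` applies `default('tag')`. Setting it to `intermediate-registry` currently hits the `fail` task added by this change.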