46f8129894
Make upload workers process image layers only once (best effort)

This also reworks and simplifies lock management for individual tasks, which is now handled in the PythonImageUploader class namespace only.

When fetching a source layer, cross-link it for the target local image whenever that source already exists. When pushing a layer to a target registry, do not transfer the same data again if it was already pushed earlier for another image.

The first time a layer gets uploaded or fetched for an image, that image and its known path (local or remote) become a reference for future cross-referencing by other images. Store this information about already processed layers in a global view shared by all workers to speed up the data transfer jobs they execute.

With that global view, uploading the first image in the task list as a separate (non-concurrent) job becomes redundant; it is now executed concurrently with the other images.

Based on the dynamically picked multi-worker mode, provide the global view as a graph with its MP/MT state synchronization as follows:

* Use globally shared locking info that also contains the global layers view for MP workers. With the shared global view state, we can no longer use local locking objects individual to each task.
* If multi-process workers cannot be used, such as when executing via Mistral with monkey-patched eventlet greenthreads, choose threadinglock and a multi-thread-safe standard dictionary in the shared class namespace to store the global view.
* If MP is available, pick processlock, which also contains a data-race-safe Manager().dict() as the global view shared among cooperating OS processes.
* Use that global view transparently, via a special classmethod that proxies access to the internal state shared by workers.

Ultimately, all of this optimizes:

* completion time
* re-fetching of already processed layers
* local deduplication of layers
* the number of outbound HTTP requests to registries
* if-layer-exists and other internal logic checks, which are executed against the in-memory cache first.

As layer locking and unlocking becomes a frequent action, reduce the noise of the debug messages it produces.

Closes-bug: #1847225
Related-bug: #1844446
Change-Id: Ie5ef4045b7e22c06551e886f9f9b6f22c8d4bd21
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
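The choice between the multi-thread and multi-process modes described in the commit message could look roughly like the sketch below. It is not the code from this change: the names `ThreadingLockInfo`, `ProcessLockInfo`, `Uploader`, `init_global_state`, `record_layer` and the sample layer values are hypothetical and used only for illustration. The sketch assumes the global view maps a layer ID to a per-scope dictionary holding `ref` and `path`.

```python
import multiprocessing
import threading


class ThreadingLockInfo(object):
    """Multi-thread mode: a thread lock plus a plain dict as the view."""

    def __init__(self):
        self._lock = threading.Lock()
        self._global_view = {}

    def get_lock(self):
        return self._lock

    def objects(self):
        return self._global_view


class ProcessLockInfo(object):
    """Multi-process mode: the lock and the view are Manager() proxies."""

    def __init__(self):
        self._manager = multiprocessing.Manager()
        self._lock = self._manager.Lock()
        self._global_view = self._manager.dict()

    def get_lock(self):
        return self._lock

    def objects(self):
        return self._global_view


class Uploader(object):
    """Hypothetical stand-in for the class that owns the shared state."""

    lock_info = None

    @classmethod
    def init_global_state(cls, use_processes):
        # Pick the synchronization primitives once, in the class namespace.
        if cls.lock_info is None:
            cls.lock_info = (ProcessLockInfo() if use_processes
                             else ThreadingLockInfo())

    @classmethod
    def uploaded_layers(cls):
        # Proxy access to the shared view, whichever mode was picked.
        return cls.lock_info.objects()

    @classmethod
    def record_layer(cls, layer, scope, ref, path):
        with cls.lock_info.get_lock():
            view = cls.uploaded_layers()
            entry = dict(view.get(layer, {}))
            entry[scope] = {'ref': ref, 'path': path}
            # Reassign the whole entry: a Manager().dict() proxy does not
            # see in-place changes made to nested plain dicts.
            view[layer] = entry


# Multi-thread mode, as when multi-process workers cannot be used.
Uploader.init_global_state(use_processes=False)
Uploader.record_layer('sha256:aaa', 'remote', 'app:v1', '/v2/app/blobs/aaa')
```

Callers only go through the classmethods, so they do not need to know which mode was picked. Passing `use_processes=True` would swap in the `Manager()`-backed lock and dictionary without changing the calling code.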
27 lines
991 B
Python
# Copyright 2019 Red Hat, Inc.
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.


def uploaded_layers_details(uploaded_layers, layer, scope):
    """Return the known path and reference image for a processed layer.

    Looks up ``layer`` in ``uploaded_layers`` under the given ``scope`` and
    returns a ``(known_path, image)`` tuple; a value is None when unknown.
    """
    known_path = None
    known_layer = None
    image = None
    if layer:
        known_layer = uploaded_layers.get(layer, None)
        if known_layer and scope in known_layer:
            known_path = known_layer[scope].get('path', None)
            image = known_layer[scope].get('ref', None)
    return (known_path, image)
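A short usage sketch may help show the data shape the function expects. The function body is repeated only so the example runs on its own; the layer IDs, image reference and path are made-up values, and the nested layout (layer ID, then scope, then `ref` and `path`) is inferred from the lookups the function performs.

```python
def uploaded_layers_details(uploaded_layers, layer, scope):
    known_path = None
    known_layer = None
    image = None
    if layer:
        known_layer = uploaded_layers.get(layer, None)
        if known_layer and scope in known_layer:
            known_path = known_layer[scope].get('path', None)
            image = known_layer[scope].get('ref', None)
    return (known_path, image)


# Hypothetical global view: one layer already pushed to a remote registry.
view = {
    'sha256:aaa': {
        'remote': {
            'ref': 'registry.example/app:v1',
            'path': '/v2/app/blobs/sha256:aaa',
        },
    },
}

# Known layer in a known scope: both values are returned.
hit = uploaded_layers_details(view, 'sha256:aaa', 'remote')

# Known layer, but nothing recorded for the 'local' scope.
other_scope = uploaded_layers_details(view, 'sha256:aaa', 'local')

# Unknown layer, and a missing layer ID.
unknown = uploaded_layers_details(view, 'sha256:zzz', 'remote')
no_layer = uploaded_layers_details(view, None, 'remote')
```

Here `hit` is `('/v2/app/blobs/sha256:aaa', 'registry.example/app:v1')`, while the other three calls return `(None, None)`, so a caller can treat a `None` path as "not processed yet in this scope".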