jenkins-pipelines/dockerfiles/coreutils
Davlet Panech fe5793b71d archive-dir: binary search + parallelism
Performance enhancements for archive-dir:

* While searching for old checksums, use BSD look [1] (binary search),
  rather than grep (linear). This requires a docker image with that
  utility installed. A Dockerfile is included and is meant to be built
  and pushed to Docker Hub manually as needed. Image name:
  starlingx/jenkins-pipelines-coreutils:TIMESTAMP .

* Process all files in parallel. Previously we only calculated checksums
  in parallel.
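
The two changes above can be illustrated with a minimal Python sketch. It is
not the archive-dir implementation, which is a shell script calling look(1).
The index format ("<sha256> <path>" per line, sorted), the helper names
look_prefix, classify and classify_all, and the link-or-copy rule are all
assumptions made for illustration.

```python
import bisect
import hashlib
from concurrent.futures import ThreadPoolExecutor


def look_prefix(sorted_lines, prefix):
    """Return all lines starting with prefix, roughly what look(1) does.

    sorted_lines must already be sorted; the first candidate is found by
    binary search (O(log n)) instead of a linear scan like grep.
    """
    i = bisect.bisect_left(sorted_lines, prefix)
    matches = []
    while i < len(sorted_lines) and sorted_lines[i].startswith(prefix):
        matches.append(sorted_lines[i])
        i += 1
    return matches


def classify(content, old_index):
    """Checksum one file and decide what to do with it (hypothetical rule):
    "link" if the checksum is already in the old index, else "copy"."""
    digest = hashlib.sha256(content).hexdigest()
    return "link" if look_prefix(old_index, digest + " ") else "copy"


def classify_all(files, old_index, jobs=4):
    """Run the whole per-file step (checksum + lookup) in parallel,
    not only the checksum calculation. files maps name -> bytes."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        actions = pool.map(lambda item: classify(item[1], old_index),
                           files.items())
        return dict(zip(files, actions))
```

With ~300K old checksums, each lookup needs about 18 comparisons in the
sorted index rather than a scan over every line, which is where most of the
per-file saving would be expected to come from.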

Timings before & after the patch, using a build with ~100K files and
~300K old checksums (docker + aptly + mirrors):

* before patch with JOBS=4: 2 hrs 7 min
* this patch with JOBS=4: 26 min
* this patch with JOBS=1: 1 hr 10 min
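
For reference, the speedups implied by these timings can be worked out with
a few lines of Python. The numbers are taken from the list above; the
variable names are only illustrative.

```python
# Timings from the list above, converted to minutes.
before_jobs4 = 2 * 60 + 7   # 127 min
after_jobs4 = 26
after_jobs1 = 1 * 60 + 10   # 70 min

# Overall speedup at the same parallelism level.
print(round(before_jobs4 / after_jobs4, 1))  # 4.9

# Even single-threaded, the patched version beats the old JOBS=4 run.
print(round(before_jobs4 / after_jobs1, 1))  # 1.8

# Gain from going from 1 to 4 jobs with the patch applied.
print(round(after_jobs1 / after_jobs4, 1))   # 2.7
```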

[1] https://man.openbsd.org/look.1

TESTS
=======================
Run "archive-misc" and make sure it copies/links the same files as
before the patch.

Story: 2010226
Task: 48184

Signed-off-by: Davlet Panech <davlet.panech@windriver.com>
Change-Id: I2ad271be673e8499c17a87e9d52864b40e217fc7
2023-06-06 15:48:11 -04:00
.dockerignore archive-dir: binary search + parallelism 2023-06-06 15:48:11 -04:00
Dockerfile archive-dir: binary search + parallelism 2023-06-06 15:48:11 -04:00
build.sh archive-dir: binary search + parallelism 2023-06-06 15:48:11 -04:00
push.sh archive-dir: binary search + parallelism 2023-06-06 15:48:11 -04:00