Support overlapping repos and a flat workspace scheme

This adds the concept of a 'scheme' to the merger.  Up to this point,
the merger has used the 'golang' scheme in all cases.  However it is
possible with Gerrit to create a set of git repositories which collide
with each other using that scheme:

  root/example.com/component
  root/example.com/component/subcomponent

The users which brought this to our attention intend to use their repos
in a flat layout, like:

  root/component
  root/subcomponent

To resolve this we need to do two things: avoid collisions in all cases
in the internal git repo caches of the mergers and executors, and give
users options to resolve collisions in workspace checkouts.

In this change, mergers are updated to support three schemes:

  * golang (the current behavior)
  * flat (new behavior described above)
  * unique

The unique scheme is not intended to be user-visible.  It produces a
truly unique and non-conflicting name by using urllib.quote_plus.  It
sacrifices legibility in order to obtain uniqueness.

The mergers and executors are updated to use the unique scheme in their
internal repo caches.

A new job attribute, 'workspace-scheme' is added to allow the user to
select between 'golang' and 'flat' when Zuul prepares the repos for
checkout.

There is one more kind of repo that Zuul prepares: the playbook repo.
Each project that supplies a playbook to a job gets a copy of its repo
checked out into a dedicated directory (with no sibling repos).  In that
case there is no risk of collision, and so we retain the current behavior
of using the golang scheme for these checkouts.  This allows the playbook
paths to continue to be self-explanatory.  For example:

  trusted/project_0/example.com/org/project/playbooks/run.yaml

Documentation and a release note are added as well.

Change-Id: I3fa1fd3c04626bfb7159aefce0f4dcb10bbaf5d9
This commit is contained in:
James E. Blair 2021-04-21 09:16:03 -07:00
parent e0bb00ff0b
commit b9a6190a45
13 changed files with 433 additions and 25 deletions

View File

@ -866,3 +866,36 @@ Here is an example of two job definitions:
This means that changes to jobs with file matchers will be
self-testing without requiring that the file matchers include
the Zuul configuration file defining the job.
.. attr:: workspace-scheme
:default: golang
The scheme to use when placing git repositories in the
workspace.
.. value:: golang
This writes the repository into a directory based on the
canonical hostname and the full name of the repository. For
example::
src/example.com/organization/project
This is the default and, despite the name, is suitable and
recommended for any language.
.. value:: flat
This writes the repository into a directory based only on the
last component of the name. For example::
src/project
In some cases the ``golang`` scheme can produce collisions
(conisder the projects `component` and
`component/subcomponent`). In this case it may be preferable
to use the ``flat`` scheme (which would produce repositories
at `component` and `subcomponent`).
Note, however, that this scheme may produce collisions with
`component` and `component/component`.

View File

@ -0,0 +1,15 @@
---
features:
- |
Jobs may now specify an alternate scheme to use when preparing
repositories in the workspace. The default remains the same
golang-style, but an alternate scheme called `flat` is now
available. See :attr:`job.workspace-scheme` for more details.
upgrade:
- |
The internal git repo caches maintained by the mergers and
executors now use a new naming scheme in order to avoid
collisions. When existing executors and mergers are restarted,
they will remove their git repo caches and re-clone repos using
the new scheme. Jobs may be slow to start until the caches are
warmed again.

View File

@ -0,0 +1,65 @@
- pipeline:
name: check
manager: independent
trigger:
gerrit:
- event: patchset-created
success:
gerrit:
Verified: 1
failure:
gerrit:
Verified: -1
- pipeline:
name: gate
manager: dependent
success-message: Build succeeded (gate).
trigger:
gerrit:
- event: comment-added
approval:
- Approved: 1
success:
gerrit:
Verified: 2
submit: true
failure:
gerrit:
Verified: -2
start:
gerrit:
Verified: 0
precedence: high
- pipeline:
name: post
manager: independent
trigger:
gerrit:
- event: ref-updated
ref: ^(?!refs/).*$
- job:
name: base
parent: null
run: playbooks/base.yaml
- job:
name: test-job
workspace-scheme: flat
required-projects:
- component/subcomponent
- component
- project:
name: component/subcomponent
check:
jobs:
- test-job
- project:
name: component
check:
jobs:
- test-job

View File

@ -20,9 +20,9 @@ import os
import git
import testtools
from zuul.merger.merger import Repo
from zuul.model import MERGER_MERGE_RESOLVE
from tests.base import ZuulTestCase, FIXTURE_DIR, simple_layout
from zuul.merger.merger import MergerTree, Repo
import zuul.model
from tests.base import BaseTestCase, ZuulTestCase, FIXTURE_DIR, simple_layout
class TestMergerRepo(ZuulTestCase):
@ -701,7 +701,7 @@ class TestMerger(ZuulTestCase):
branch=fake_change.branch,
project=fake_change.project,
buildset_uuid='fake-uuid',
merge_mode=MERGER_MERGE_RESOLVE,
merge_mode=zuul.model.MERGER_MERGE_RESOLVE,
)
def test_merge_multiple_items(self):
@ -913,7 +913,7 @@ class TestMerger(ZuulTestCase):
# Add an index.lock file
fpath = os.path.join(self.merger_src_root, 'review.example.com',
'org', 'project1', '.git', 'index.lock')
'org', 'org%2Fproject1', '.git', 'index.lock')
with open(fpath, 'w'):
pass
self.assertTrue(os.path.exists(fpath))
@ -1003,3 +1003,163 @@ class TestMerger(ZuulTestCase):
zuul_repo.commit('refs/heads/master').hexsha)
self.assertEqual(upstream_repo.commit(change_ref).hexsha,
zuul_repo.commit('HEAD').hexsha)
class TestMergerTree(BaseTestCase):
def test_tree(self):
t = MergerTree()
t.add('/root/component')
t.add('/root/component2')
with testtools.ExpectedException(Exception):
t.add('/root/component/subcomponent')
t.add('/root/foo/bar/baz')
with testtools.ExpectedException(Exception):
t.add('/root/foo')
class TestMergerSchemes(ZuulTestCase):
tenant_config_file = 'config/single-tenant/main.yaml'
def setUp(self):
super().setUp()
self.work_root = os.path.join(self.test_root, 'workspace')
self.cache_root = os.path.join(self.test_root, 'cache')
def _getMerger(self, work_root=None, cache_root=None, scheme=None):
work_root = work_root or self.work_root
return self.executor_server._getMerger(
work_root, cache_root=cache_root, scheme=scheme)
def _assertScheme(self, root, scheme):
if scheme == 'unique':
self.assertTrue(os.path.exists(
os.path.join(root, 'review.example.com',
'org/org%2Fproject1')))
else:
self.assertFalse(os.path.exists(
os.path.join(root, 'review.example.com',
'org/org%2Fproject1')))
if scheme == 'golang':
self.assertTrue(os.path.exists(
os.path.join(root, 'review.example.com',
'org/project1')))
else:
self.assertFalse(os.path.exists(
os.path.join(root, 'review.example.com',
'org/project1')))
if scheme == 'flat':
self.assertTrue(os.path.exists(
os.path.join(root, 'project1')))
else:
self.assertFalse(os.path.exists(
os.path.join(root, 'project1')))
def test_unique_scheme(self):
merger = self._getMerger(scheme=zuul.model.SCHEME_UNIQUE)
merger.getRepo('gerrit', 'org/project1')
self._assertScheme(self.work_root, 'unique')
def test_golang_scheme(self):
cache_merger = self._getMerger(work_root=self.cache_root)
cache_merger.updateRepo('gerrit', 'org/project1')
self._assertScheme(self.cache_root, 'unique')
merger = self._getMerger(
cache_root=self.cache_root,
scheme=zuul.model.SCHEME_GOLANG)
merger.getRepo('gerrit', 'org/project1')
self._assertScheme(self.work_root, 'golang')
def test_flat_scheme(self):
cache_merger = self._getMerger(work_root=self.cache_root)
cache_merger.updateRepo('gerrit', 'org/project1')
self._assertScheme(self.cache_root, 'unique')
merger = self._getMerger(
cache_root=self.cache_root,
scheme=zuul.model.SCHEME_FLAT)
merger.getRepo('gerrit', 'org/project1')
self._assertScheme(self.work_root, 'flat')
@simple_layout('layouts/overlapping-repos.yaml')
def test_golang_collision(self):
merger = self._getMerger(scheme=zuul.model.SCHEME_GOLANG)
repo = merger.getRepo('gerrit', 'component')
self.assertIsNotNone(repo)
repo = merger.getRepo('gerrit', 'component/subcomponent')
self.assertIsNone(repo)
@simple_layout('layouts/overlapping-repos.yaml')
def test_flat_collision(self):
merger = self._getMerger(scheme=zuul.model.SCHEME_FLAT)
repo = merger.getRepo('gerrit', 'component')
self.assertIsNotNone(repo)
repo = merger.getRepo('gerrit', 'component/component')
self.assertIsNone(repo)
class TestOverlappingRepos(ZuulTestCase):
@simple_layout('layouts/overlapping-repos.yaml')
def test_overlapping_repos(self):
self.executor_server.keep_jobdir = True
A = self.fake_gerrit.addFakeChange('component', 'master', 'A')
self.fake_gerrit.addEvent(A.getPatchsetCreatedEvent(1))
self.waitUntilSettled()
self.assertHistory([
dict(name='test-job', result='SUCCESS', changes='1,1')],
ordered=False)
build = self.getJobFromHistory('test-job')
jobdir_git_dir = os.path.join(build.jobdir.src_root,
'component', '.git')
self.assertTrue(os.path.exists(jobdir_git_dir))
jobdir_git_dir = os.path.join(build.jobdir.src_root,
'subcomponent', '.git')
self.assertTrue(os.path.exists(jobdir_git_dir))
class TestMergerUpgrade(ZuulTestCase):
tenant_config_file = 'config/single-tenant/main.yaml'
def test_merger_upgrade(self):
work_root = os.path.join(self.test_root, 'workspace')
# Simulate existing repos
org_project = os.path.join(work_root, 'review.example.com', 'org',
'project', '.git')
os.makedirs(org_project)
scheme_file = os.path.join(work_root, '.zuul_merger_scheme')
# Verify that an executor merger doesn't "upgrade" or write a
# scheme file.
self.executor_server._getMerger(
work_root, cache_root=None, scheme=zuul.model.SCHEME_FLAT)
self.assertTrue(os.path.exists(org_project))
self.assertFalse(os.path.exists(scheme_file))
# Verify that a "real" merger does upgrade.
self.executor_server._getMerger(
work_root, cache_root=None,
execution_context=False)
self.assertFalse(os.path.exists(org_project))
self.assertTrue(os.path.exists(scheme_file))
with open(scheme_file) as f:
self.assertEqual(f.read().strip(), 'unique')
# Verify that the next time it starts, we don't upgrade again.
flag_dir = os.path.join(work_root, 'flag')
os.makedirs(flag_dir)
self.executor_server._getMerger(
work_root, cache_root=None,
execution_context=False)
self.assertFalse(os.path.exists(org_project))
self.assertTrue(os.path.exists(scheme_file))
self.assertTrue(os.path.exists(flag_dir))

View File

@ -68,7 +68,8 @@ class TestOpenStack(AnsibleZuulTestCase):
# Check that a change to nova triggered a keystone clone
executor_git_dir = os.path.join(self.executor_src_root,
'review.example.com',
'openstack', 'keystone', '.git')
'openstack', 'openstack%2Fkeystone',
'.git')
self.assertTrue(os.path.exists(executor_git_dir),
msg='openstack/keystone should be cloned.')
@ -91,7 +92,8 @@ class TestOpenStack(AnsibleZuulTestCase):
# Check that a change to keystone triggered a nova clone
executor_git_dir = os.path.join(self.executor_src_root,
'review.example.com',
'openstack', 'nova', '.git')
'openstack', 'openstack%2Fnova',
'.git')
self.assertTrue(os.path.exists(executor_git_dir),
msg='openstack/nova should be cloned.')

View File

@ -374,7 +374,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': '',
'voting': True
'voting': True,
'workspace_scheme': 'golang'
}, {
'name': 'project-test1',
'abstract': False,
@ -419,7 +420,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': 'stable',
'voting': True
'voting': True,
'workspace_scheme': 'golang'
}], data)
data = self.get_url('api/tenant/tenant-one/job/test-job').json()
@ -462,7 +464,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': '',
'voting': True
'voting': True,
'workspace_scheme': 'golang'
}], data)
def test_find_job_complete_playbooks(self):
@ -584,7 +587,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': '',
'voting': True}],
'voting': True,
'workspace_scheme': 'golang'}],
[{'abstract': False,
'ansible_version': None,
'attempts': 3,
@ -622,7 +626,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': '',
'voting': True}],
'voting': True,
'workspace_scheme': 'golang'}],
[{'abstract': False,
'ansible_version': None,
'attempts': 3,
@ -660,7 +665,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': '',
'voting': True}],
'voting': True,
'workspace_scheme': 'golang'}],
[{'abstract': False,
'ansible_version': None,
'attempts': 3,
@ -698,7 +704,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': '',
'voting': True}]]
'voting': True,
'workspace_scheme': 'golang'}]]
self.assertEqual(
{
@ -756,7 +763,8 @@ class TestWeb(BaseTestWeb):
'group_variables': {},
'host_variables': {},
'variant_description': '',
'voting': True}
'voting': True,
'workspace_scheme': 'golang'}
]],
}
]
@ -1075,6 +1083,7 @@ class TestWeb(BaseTestWeb):
'child_jobs': [],
'event_id': None,
},
'workspace_scheme': 'golang',
}
self.assertEqual(job_params, resp.json())

View File

@ -650,6 +650,7 @@ class JobParser(object):
'variant-description': str,
'post-review': bool,
'match-on-config-updates': bool,
'workspace-scheme': vs.Any('golang', 'flat'),
}
job_name = {vs.Required('name'): str}
@ -676,6 +677,7 @@ class JobParser(object):
'override-branch',
'override-checkout',
'match-on-config-updates',
'workspace-scheme',
]
def __init__(self, pcontext):

View File

@ -14,6 +14,8 @@
import os
from zuul.lib import strings
def construct_gearman_params(uuid, sched, nodeset, job, item, pipeline,
dependent_changes=[], merger_items=[],
@ -32,7 +34,11 @@ def construct_gearman_params(uuid, sched, nodeset, job, item, pipeline,
short_name=item.change.project.name.split('/')[-1],
canonical_hostname=item.change.project.canonical_hostname,
canonical_name=item.change.project.canonical_name,
src_dir=os.path.join('src', item.change.project.canonical_name),
src_dir=os.path.join('src',
strings.workspace_project_path(
item.change.project.canonical_hostname,
item.change.project.name,
job.workspace_scheme)),
)
zuul_params = dict(
@ -90,6 +96,7 @@ def construct_gearman_params(uuid, sched, nodeset, job, item, pipeline,
params['override_checkout'] = job.override_checkout
params['repo_state'] = item.current_build_set.repo_state
params['ansible_version'] = job.ansible_version
params['workspace_scheme'] = job.workspace_scheme
def make_playbook(playbook):
d = playbook.toDict(redact_secrets=redact_secrets_and_keys)
@ -187,7 +194,11 @@ def construct_gearman_params(uuid, sched, nodeset, job, item, pipeline,
# project.values() is easier for callers
canonical_name=p.canonical_name,
canonical_hostname=p.canonical_hostname,
src_dir=os.path.join('src', p.canonical_name),
src_dir=os.path.join('src',
strings.workspace_project_path(
p.canonical_hostname,
p.name,
job.workspace_scheme)),
required=(p in required_projects),
))

View File

@ -60,6 +60,7 @@ from zuul.merger.server import BaseMergeServer, RepoLocks
from zuul.model import (
BuildCompletedEvent, BuildPausedEvent, BuildStartedEvent, BuildStatusEvent
)
import zuul.model
from zuul.zk.event_queues import PipelineResultEventQueue
BUFFER_LINES_FOR_SYNTAX = 200
@ -823,6 +824,10 @@ class AnsibleJob(object):
self.zuul_event_id = self.arguments.get('zuul_event_id')
# Record ansible version being used for the cleanup phase
self.ansible_version = self.arguments.get('ansible_version')
# TODO(corvus): Remove default setting after 4.3.0; this is to handle
# scheduler/executor version skew.
self.scheme = self.arguments.get('workspace_scheme',
zuul.model.SCHEME_GOLANG)
self.log = get_annotated_logger(
logger, self.zuul_event_id, build=job.unique)
self.executor_server = executor_server
@ -1083,7 +1088,8 @@ class AnsibleJob(object):
merger = self.executor_server._getMerger(
self.jobdir.src_root,
self.executor_server.merge_root,
self.log)
logger=self.log,
scheme=self.scheme)
repos = {}
for project in args['projects']:
self.log.debug("Cloning %s/%s" % (project['connection'],
@ -1776,10 +1782,15 @@ class AnsibleJob(object):
branch)
self.log.debug("Cloning %s@%s into new trusted space %s",
project, branch, root)
# We always use the golang scheme for playbook checkouts
# (so that the path indicates the canonical repo name for
# easy debugging; there are no concerns with collisions
# since we only have one repo in the working dir).
merger = self.executor_server._getMerger(
root,
self.executor_server.merge_root,
self.log)
logger=self.log,
scheme=zuul.model.SCHEME_GOLANG)
merger.checkoutBranch(
project.connection_name, project.name,
branch,
@ -1804,6 +1815,11 @@ class AnsibleJob(object):
# If the project is in the dependency chain, clone from
# there so we pick up any speculative changes, otherwise,
# clone from the cache.
#
# We always use the golang scheme for playbook checkouts
# (so that the path indicates the canonical repo name for
# easy debugging; there are no concerns with collisions
# since we only have one repo in the working dir).
merger = None
for p in args['projects']:
if (p['connection'] == project.connection_name and
@ -1813,7 +1829,9 @@ class AnsibleJob(object):
merger = self.executor_server._getMerger(
root,
self.jobdir.src_root,
self.log)
logger=self.log,
scheme=zuul.model.SCHEME_GOLANG,
cache_scheme=self.scheme)
break
repo_state = None
@ -1821,7 +1839,8 @@ class AnsibleJob(object):
merger = self.executor_server._getMerger(
root,
self.executor_server.merge_root,
self.log)
logger=self.log,
scheme=zuul.model.SCHEME_GOLANG)
# If we don't have this repo yet prepared we need to restore
# the repo state. Otherwise we have speculative merges in the

View File

@ -12,8 +12,11 @@
# License for the specific language governing permissions and limitations
# under the License.
import os.path
from urllib.parse import quote_plus
import zuul.model
def unique_project_name(project_name):
parts = project_name.split('/')
@ -21,3 +24,15 @@ def unique_project_name(project_name):
name = quote_plus(project_name)
return f'{prefix}/{name}'
def workspace_project_path(hostname, project_name, scheme):
"""Return the project path based on the specified scheme"""
if scheme == zuul.model.SCHEME_UNIQUE:
project_name = unique_project_name(project_name)
return os.path.join(hostname, project_name)
elif scheme == zuul.model.SCHEME_GOLANG:
return os.path.join(hostname, project_name)
elif scheme == zuul.model.SCHEME_FLAT:
parts = project_name.split('/')
return os.path.join(parts[-1])

View File

@ -28,6 +28,7 @@ import git
import gitdb
import paramiko
from zuul.zk import ZooKeeperClient
from zuul.lib import strings
import zuul.model
@ -723,7 +724,36 @@ class Repo(object):
return branches
class MergerTree:
"""
A tree structure for quickly determining if a repo collides with
another in the same merger workspace.
Each node is a dictionary represents a path element. The keys are
child path elements and their values are either another dictionary
for another node, or None if the child node is a git repo.
"""
def __init__(self):
self.root = {}
def add(self, path):
parts = path.split('/')
root = self.root
for i, part in enumerate(parts[:-1]):
root = root.setdefault(part, {})
if root is None:
other = '/'.join(parts[:i])
raise Exception(f"Repo {path} collides with {other}")
last = parts[-1]
if last in root:
raise Exception(f"Repo {path} collides with "
"an existing repo with a longer path")
root[last] = None
class Merger(object):
def __init__(
self,
working_root: str,
@ -737,6 +767,8 @@ class Merger(object):
logger: Optional[Logger] = None,
execution_context: bool = False,
git_timeout: int = 300,
scheme: str = None,
cache_scheme: str = None,
):
self.logger = logger
if logger is None:
@ -754,20 +786,55 @@ class Merger(object):
self.speed_time = speed_time
self.git_timeout = git_timeout
self.cache_root = cache_root
self.scheme = scheme or zuul.model.SCHEME_UNIQUE
self.cache_scheme = cache_scheme or zuul.model.SCHEME_UNIQUE
# Flag to determine if the merger is used for preparing repositories
# for job execution. This flag can be used to enable executor specific
# behavior e.g. to keep the 'origin' remote intact.
self.execution_context = execution_context
# A tree of repos in our working root for detecting collisions
self.repo_roots = MergerTree()
# If we're not in an execution context, then check to see if
# our working root needs a "schema" upgrade.
if not execution_context:
self._upgradeRootScheme()
def _upgradeRootScheme(self):
# If the scheme for the root directory has changed, delete all
# the repos so they can be re-cloned.
try:
with open(os.path.join(self.working_root,
'.zuul_merger_scheme')) as f:
scheme = f.read().strip()
except FileNotFoundError:
# The previous default was golang
scheme = zuul.model.SCHEME_GOLANG
if scheme == self.scheme:
return
if os.listdir(self.working_root):
self.log.info(f"Existing merger scheme {scheme} does not match "
f"requested scheme {self.scheme}, deleting merger "
"root (repositories will be re-cloned).")
shutil.rmtree(self.working_root)
os.makedirs(self.working_root)
with open(os.path.join(self.working_root,
'.zuul_merger_scheme'), 'w') as f:
f.write(self.scheme)
def _addProject(self, hostname, connection_name, project_name, url, sshkey,
zuul_event_id, process_worker=None):
repo = None
key = '/'.join([hostname, project_name])
try:
path = os.path.join(self.working_root, hostname, project_name)
path = os.path.join(self.working_root,
strings.workspace_project_path(
hostname, project_name, self.scheme))
self.repo_roots.add(path)
if self.cache_root:
cache_path = os.path.join(self.cache_root, hostname,
project_name)
cache_path = os.path.join(
self.cache_root,
strings.workspace_project_path(
hostname, project_name, self.cache_scheme))
else:
cache_path = None
repo = Repo(

View File

@ -104,7 +104,8 @@ class BaseMergeServer(metaclass=ABCMeta):
self.merger_jobs)
def _getMerger(self, root, cache_root, logger=None,
execution_context=True):
execution_context=True, scheme=None,
cache_scheme=None):
return merger.Merger(
root,
self.connections,
@ -117,6 +118,8 @@ class BaseMergeServer(metaclass=ABCMeta):
logger,
execution_context=execution_context,
git_timeout=self.git_timeout,
scheme=scheme,
cache_scheme=cache_scheme,
)
def _repoLock(self, connection_name, project_name):

View File

@ -92,6 +92,11 @@ NODE_STATES = set([STATE_BUILDING,
STATE_HOLD,
STATE_DELETING])
# Workspace scheme
SCHEME_GOLANG = 'golang'
SCHEME_FLAT = 'flat'
SCHEME_UNIQUE = 'unique' # Internal use only
class ConfigurationErrorKey(object):
"""A class which attempts to uniquely identify configuration errors
@ -1242,6 +1247,7 @@ class Job(ConfigObject):
override_branch=None,
override_checkout=None,
post_review=None,
workspace_scheme=SCHEME_GOLANG,
)
# These are generally internal attributes which are not
@ -1340,6 +1346,7 @@ class Job(ConfigObject):
d['ansible_version'] = self.ansible_version
else:
d['ansible_version'] = None
d['workspace_scheme'] = self.workspace_scheme
return d
def __ne__(self, other):