..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Semver Utility for DB storage
==========================================

https://blueprints.launchpad.net/glance/+spec/semver-support

Working with versions of various objects is a common problem that already has
a number of widely adopted solutions. One of these is Semantic Versioning
(SemVer), a system of rules and requirements for assigning version numbers
to software components and other objects. The SemVer specification is freely
available at `semver.org <http://semver.org>`_.

It is proposed to add support for semantic versioning to Glance, according to
version 2.0.0 of the specification [1], so that Glance objects (starting with
Artifacts, and possibly including Images in the future) may be properly
versioned.

Problem description
===================

Semantic versioning provides the ability to compare two or more objects based
on the versions with which they are associated. According to the specification
[1], objects are compared first by their major versions, then by minor
versions, then by patch versions. There is also a concept of "pre-release"
versions (alphas, betas, release candidates (RC), etc.), which must always be
considered lower than the released version with the same numeric components.

For example, 1.2.2 < 1.2.3-beta < 1.2.3

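The precedence rules above can be sketched in plain Python as a sort key. This
is only an illustration under simplifying assumptions: the helper name
``precedence_key`` is hypothetical, the input is assumed to be a valid semver
string, and no validation is performed.

```python
def precedence_key(version):
    """Return a sort key that orders valid semver strings by precedence."""
    # Build metadata (after '+') does not take part in precedence.
    version = version.split("+", 1)[0]
    numeric, _, label = version.partition("-")
    major, minor, patch = (int(part) for part in numeric.split("."))
    if not label:
        # A release sorts after every pre-release of the same numeric version.
        return (major, minor, patch, 1, ())
    components = []
    for item in label.split("."):
        if item.isdigit():
            # Numeric identifiers compare as integers and rank below text ones.
            components.append((0, int(item), ""))
        else:
            components.append((1, 0, item))
    return (major, minor, patch, 0, tuple(components))


versions = ["1.2.3", "1.2.2", "1.2.3-beta"]
print(sorted(versions, key=precedence_key))
# ['1.2.2', '1.2.3-beta', '1.2.3']
```

Such a key works only in memory; it does not help when the sorting has to be
done by the database.
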
If we want to store versioned objects in the catalog (this may apply to
images, artifacts and other entities), then we need to be able to perform this
kind of semantic comparison on large numbers of entities.

So the comparison has to be made not only in memory but in the database as
well, and modern RDBMSs have no generic data type for storing this kind of
versioning information. A method for storing easily sortable version
identifiers therefore has to be introduced.

Proposed change
===============

First of all, Glance has to adopt a utility to parse strings which contain
version information, verify their compliance with the specification and
properly process the version objects in memory. There are a number of mature
libraries which provide this functionality, so there is no need to
re-implement it.
After some research, it is suggested to use the "semantic_version" library,
which is available on PyPI [2]. This library is not present in the OpenStack
global requirements, so a patchset [3] has been submitted to add it there.

To be able to sort these version objects in the database, they have to be
converted into some generic comparable data type. Because version information
consists of fixed numeric components (the major-minor-patch part) and
arbitrary sequences of alphanumeric strings (the pre-release and build
metadata labels), it is suggested to store them separately as three database
fields: one for the numeric part, another for the pre-release label and the
last one for build metadata.

The three numeric components (major, minor and patch) may be converted into a
single unsigned 64-bit integer: the first (most significant) 16 bits of this
number will store the major revision, the next 16 bits the minor revision, and
the next 16 bits the patch revision. The remaining 16 bits will be used to
store the release type flag (to make sure that the final release has higher
precedence than pre-releases) and may be reserved for future improvements,
such as storing additional information which is irrelevant for semantic
versioning but may have some other meaning (see the Alternatives section below
for more details).

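The bit layout described above could be sketched as follows. This is an
illustration rather than the proposed implementation: the names
``pack_version`` and ``unpack_version`` and the concrete flag values are
hypothetical, and the major revision is assumed to occupy the most significant
bits, which integer ordering requires.

```python
COMPONENT_BITS = 16
MAX_COMPONENT = (1 << COMPONENT_BITS) - 1  # 65535

# Hypothetical flag values: any pair works as long as FINAL > PRE_RELEASE.
PRE_RELEASE = 0
FINAL_RELEASE = 1


def pack_version(major, minor, patch, is_final):
    """Pack the numeric part of a version into one sortable integer."""
    for value in (major, minor, patch):
        if not 0 <= value <= MAX_COMPONENT:
            raise ValueError("component %d does not fit in 16 bits" % value)
    flag = FINAL_RELEASE if is_final else PRE_RELEASE
    return (major << 48) | (minor << 32) | (patch << 16) | flag


def unpack_version(packed):
    """Split a packed integer back into (major, minor, patch, is_final)."""
    major = (packed >> 48) & MAX_COMPONENT
    minor = (packed >> 32) & MAX_COMPONENT
    patch = (packed >> 16) & MAX_COMPONENT
    is_final = (packed & MAX_COMPONENT) == FINAL_RELEASE
    return major, minor, patch, is_final


print(hex(pack_version(1, 2, 3, True)))  # 0x1000200030001
```

With this layout, plain integer comparison gives 1.2.2 < 1.2.3 (pre-release)
< 1.2.3 (final), and each component is limited to the range 0..65535.
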
The pre-release label should be stored independently of the numeric part as a
regular string because, according to the semver spec, labels are compared
alphanumerically only if the numeric parts of the versions are identical.

So these two values, the long number and the string, may be combined into a
single composite index in the database, which makes it efficient to sort and
filter objects by their assigned versions.

However, there is one important difference between the semver requirement and
a simple comparison of alphanumeric strings: semver requires that labels are
compared "per component" (where a "component" is a dot-separated part of the
label), and components which consist only of digits are to be treated as
integers rather than ASCII strings. For example, version "1.0.0-alpha.4.foo"
should have lower precedence than "1.0.0-alpha.10.bar", because their numeric
components are equal and the labels have an identical first component
("alpha") but differ in the second ("4" vs "10"), and 4 is less than 10.
But if the labels are compared as string database fields, the precedence will
be wrong, as "alpha.4.foo" is lexically greater than "alpha.10.bar" (due to
"4" being greater than "1").

To solve this problem it is suggested to add one constraint to this semver
implementation: limit the maximum length of numeric components in the
pre-release label to a reasonably low value (say, 6 characters) and add
leading zeros to these components when saving them to the database.

In this case the "alpha.4.foo" label from the example above will become
"alpha.000004.foo", and "alpha.10.bar" will become "alpha.000010.bar".
ASCII-based comparison of these strings gives results which are consistent
with the requirements of semver. Later, when these values are read from the
database, the leading zeros may be removed so that the labels look as they did
originally.

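A minimal sketch of this padding, assuming the 6-character limit mentioned
above. The helper names ``pad_label`` and ``unpad_label`` are hypothetical and
the label is assumed to be a valid semver pre-release label.

```python
import re

PAD_WIDTH = 6  # the "reasonably low" limit suggested in the text
_DIGITS = re.compile(r"[0-9]+")


def pad_label(label):
    """Zero-pad all-digit components of a pre-release label for storage."""
    padded = []
    for component in label.split("."):
        if _DIGITS.fullmatch(component):
            if len(component) > PAD_WIDTH:
                raise ValueError("numeric component too long: %s" % component)
            component = component.zfill(PAD_WIDTH)
        padded.append(component)
    return ".".join(padded)


def unpad_label(stored):
    """Strip the padding added by pad_label when reading from storage."""
    restored = []
    for component in stored.split("."):
        if _DIGITS.fullmatch(component):
            component = str(int(component))
        restored.append(component)
    return ".".join(restored)


print("alpha.4.foo" < "alpha.10.bar")                        # False
print(pad_label("alpha.4.foo") < pad_label("alpha.10.bar"))  # True
```

The round trip is lossless because semver does not allow leading zeros in
numeric identifiers, so stripping them on read cannot alter a valid label.
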
This applies only to the pre-release label. Build metadata (the part which is
separated by the '+' character) does not take part in precedence resolution,
so it neither has to be part of the database index nor has to be pre-processed
in any way.

It is suggested to create a custom composite field for SQLAlchemy which will
encapsulate the logic described above (converting a semantic version into
three database-friendly values and back) and will be usable for building
version-aware model classes.

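The conversion to three database-friendly values and back can be illustrated
end to end. The sketch below deliberately uses the standard library
``sqlite3`` module instead of the SQLAlchemy composite field proposed here, so
that it stays self-contained. The table, column and function names are
hypothetical, and the input is assumed to be valid semver.

```python
import sqlite3


def to_columns(version):
    """Split a semver string into (packed number, padded label, metadata)."""
    version, _, metadata = version.partition("+")
    numeric, _, label = version.partition("-")
    major, minor, patch = (int(part) for part in numeric.split("."))
    # Lowest 16 bits hold the release flag: 1 for final, 0 for pre-release.
    flag = 0 if label else 1
    packed = (major << 48) | (minor << 32) | (patch << 16) | flag
    padded = ".".join(
        part.zfill(6) if part.isdigit() else part
        for part in label.split(".")
    )
    return packed, padded, metadata


def from_columns(packed, padded, metadata):
    """Rebuild the version string from the three stored values."""
    major = (packed >> 48) & 0xFFFF
    minor = (packed >> 32) & 0xFFFF
    patch = (packed >> 16) & 0xFFFF
    text = "%d.%d.%d" % (major, minor, patch)
    if padded:
        text += "-" + ".".join(
            str(int(part)) if part.isdigit() else part
            for part in padded.split(".")
        )
    if metadata:
        text += "+" + metadata
    return text


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE artifacts ("
    " version_num INTEGER NOT NULL,"
    " version_label TEXT NOT NULL,"
    " version_meta TEXT NOT NULL)"
)
# The composite index covers only the two values that affect precedence.
connection.execute(
    "CREATE INDEX ix_artifact_version"
    " ON artifacts (version_num, version_label)"
)
samples = [
    "1.2.3",
    "1.0.0-alpha.10.bar",
    "1.2.3-beta",
    "1.0.0-alpha.4.foo+build.7",
    "1.2.2",
]
connection.executemany(
    "INSERT INTO artifacts VALUES (?, ?, ?)",
    [to_columns(sample) for sample in samples],
)
rows = connection.execute(
    "SELECT version_num, version_label, version_meta FROM artifacts"
    " ORDER BY version_num, version_label"
).fetchall()
ordered = [from_columns(*row) for row in rows]
print(ordered)
```

The query returns the samples as 1.0.0-alpha.4.foo+build.7,
1.0.0-alpha.10.bar, 1.2.2, 1.2.3-beta, 1.2.3. Note that SQLite integers are
signed 64-bit values, so in this sketch a major revision of 32768 or more
would not fit; a real implementation has to account for the integer type of
the target database.
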
Alternatives
------------

Semantic Versioning is not the only specification which defines a format for
version strings.
There is another standard, PEP 440, which describes a scheme for identifying
versions of Python software distributions [4]. It shares some common features
with Semantic Versioning but has a different and somewhat more complicated
notation.

Besides a slightly different syntax (it simply appends pre-release segments to
the release number, while semver separates them with a dash), it puts extra
constraints on what the pre-release label may contain. In semver, the
pre-release label may contain arbitrary alphanumeric characters, while in
PEP 440 it may only be 'a', 'b' or 'rc' followed by a number. This could
theoretically make it possible to store the pre-release component as part of
the same 64-bit database field which is used to store the release number (e.g.
the release type flag takes 2 bits, and the remaining 14 bits are left for the
number of the pre-release build). However, this significantly decreases the
flexibility of the pre-release version field.

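For comparison, the 2-bit flag plus 14-bit number layout mentioned above could
look roughly like this. It is a sketch of the rejected alternative only; the
function name and the numeric codes are hypothetical.

```python
# Hypothetical 2-bit codes, ordered so that a < b < rc < final release.
RELEASE_TYPES = {"a": 0, "b": 1, "rc": 2, "final": 3}
MAX_PRE_NUMBER = (1 << 14) - 1  # 16383


def pack_release_type(kind, number=0):
    """Encode a PEP 440 style pre-release marker into 16 bits."""
    if not 0 <= number <= MAX_PRE_NUMBER:
        raise ValueError("pre-release number does not fit in 14 bits")
    return (RELEASE_TYPES[kind] << 14) | number


print(pack_release_type("a", 2) < pack_release_type("b", 1))  # True
```

The resulting value would occupy the lowest 16 bits of the packed version
number, which is exactly why arbitrary semver labels cannot be stored this way.
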
PEP 440 also adds further entities: development builds (an additional special
segment which goes after the pre-release segments), epochs (which precede the
release number), local versions (which are similar to the build metadata of
semantic versioning but have a different purpose and also take part in
precedence resolution, following roughly the same rules as the arbitrary
pre-release labels of semantic versioning), and so on. Also, unlike semantic
versioning, PEP 440 does not limit the number of numeric components in the
release number: it may be anything from a simple "1" to
"1.2.3.4.5.6.7.8.9.10" and beyond. This, of course, gives more flexibility
and power, but may not be easily mapped to efficient database storage.

More importantly, PEP 440 is a standard which is native to the Python world
but is not well known outside it, while the purpose of Glance Artifacts is to
be as generic as possible in terms of the nature of its objects. This means
that the users of artifacts are not restricted to Python developers: they may
not be developers at all. So following a simpler and more generic standard
seems preferable.

There is one more standard which stands between semver and PEP 440. It is
called "Linux Compatible Semantic Versioning 3.0.0"; it is a fork of regular
semver (its 2.0 version) and is developed within the OpenStack community [5].
It tries to blend regular semver with the versions of Linux distribution
packages and uses some concepts of PEP 440 for that.

This notation is easier to map to a database type; however, it is still local
to a relatively small community of developers (OpenStack developers in this
case), so a more generic and widely adopted standard such as semver seems
preferable.

However, we are not limited to a single versioning notation. In the future we
may add support for extra schemes, including some subset of PEP 440 or
Linux Compatible Semantic Versioning. This may be implemented as part of the
further Artifact Repository roadmap or other activities. This particular spec
leaves that out of scope and focuses only on the semver implementation.

Data model impact
-----------------

None: this spec does not cover any actual database changes. It only describes
the utility which makes it possible to operate on semver objects and convert
them to data suitable for DB storage, and back.

REST API impact
---------------

None

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

The proposed change does not affect existing code in any way.

Other deployer impact
---------------------

This spec assumes that [3] is merged, i.e. the semantic_version library is
added to the global requirements.

Developer impact
----------------

The usage of the library should be documented for developers, so that they can
use it efficiently in their code.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  ativelkov

Reviewers
---------

Core reviewer(s):
  jokke

Other reviewer(s):
  ivasilevskaya
  mfedosin
  travis-tripp
  icordasc

Work Items
----------

The initial implementation of the feature may be done in a single changeset.
However, it seems preferable to add this support to the semantic_version
library [2] and remove it from the Glance codebase afterwards.
If the maintainer of the library does not accept this functionality (or if we
decide to add support for more versioning notations later), then this code may
be transferred to some common OpenStack library, such as Oslo.

After this feature is implemented we should continue the work to add support
for other versioning schemes, such as PEP 440, Linux Compatible Semantic
Versioning and others. These should be added as independent features covered
by separate specs.

Dependencies
============

None

Testing
=======

A unit test should be added demonstrating the data structure usage,
comparison, string parsing and conversion to the DB type (long).

Documentation Impact
====================

The developers' guide has to be updated to explain how to properly use the
library in their code.

References
==========

[1] http://semver.org
[2] https://pypi.python.org/pypi/semantic_version/
[3] https://review.openstack.org/#/c/151466/
[4] https://www.python.org/dev/peps/pep-0440/