The retry loop covers the case where more than one thread of
activity is trying to add the same aggregate. The limit was initially
set at five, but perfload results [1] show that this is not high
enough. While it is true that perfload exercises writing to
placement in ways that are unusually concurrent, we want to
be robust, and enlarging the loop size doesn't impact the common
case, so it has been raised to ten.
[1] http://logs.openstack.org/99/632599/1/check/placement-perfload/8c2a0ad/logs/
Change-Id: I288fc07df30adb044e0ce1614519b3391cf9bb54
The 0.2.0 release of os-resource-classes has happened, adding the
'PCPU' class. It's already updated in global upper-constraints,
so update it in requirements.txt and lower-constraints and change
the gabbi tests which count resource classes.
Change-Id: I4a189aca59485d65ad8a7c9bfbeca7ac995ed336
https://review.openstack.org/#/c/632599/ added a new check for missing
root provider ids, but did not update the placement-status-checks
history in the doc, so this patch updates it.
Change-Id: I8f5dc7b25ded628772d3863e4ccecb14b5224c58
https://review.openstack.org/#/c/631671/ added a new CLI command:
`placement-status upgrade check`
However, it fails, logging that it finds no database connection info.
This patch fixes it by calling placement.db_api.configure() before
connecting to the database.
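A minimal sketch of the ordering the fix enforces; db_api.configure()
is named above, while the check method and the engine accessor shown
here are illustrative assumptions rather than the actual code:

    from placement import db_api

    def _check_database(self):
        # Register and configure the database options before any
        # connection is attempted; without this the check logs that it
        # finds no database connection info.
        db_api.configure(self.config)
        engine = db_api.get_placement_engine()  # assumed accessor
        with engine.connect():
            ...  # the checks can now query the placement database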
Change-Id: Icf0be55ef9aa8f858eb386a9b69cbb7e2bf81b07
Closes-Bug: #1812829
Since the create_incomplete_consumers online data migration
was copied from nova to placement, and will eventually be
removed from nova (and the nova online data migration won't
help once the placement data is copied over to extracted placement
on upgrade anyway), this adds an upgrade check to make sure
operators completed that online data migration.
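A hedged sketch of what such a check can look like, assuming the
oslo.upgradecheck machinery; the check name and the helper that counts
allocations lacking consumer records are hypothetical:

    from oslo_upgradecheck import upgradecheck

    def _check_incomplete_consumers(self):
        # Hypothetical helper: count allocations without consumer records.
        missing = self._count_missing_consumers()
        if missing:
            return upgradecheck.Result(
                upgradecheck.Code.FAILURE,
                'There are %d allocations without consumer records. Run '
                '"placement-manage db online_data_migrations".' % missing)
        return upgradecheck.Result(upgradecheck.Code.SUCCESS)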
Normally we wouldn't add upgrade checks for online data migrations,
which should get run automatically by deployment tools, but since
extracted placement is very new, it's nice to provide tooling that
lets operators confirm not only that they have properly migrated
their data to extracted placement but also that their upgrade
homework is done.
Change-Id: I7f3ba20153a4c1dc8f5b209024edb882fcb726ef
Change Id609789ef6b4a4c745550cde80dd49cabe03869a in nova added
online data migration code to create missing consumer records
for old allocations. That was added to nova in Rocky. The
incomplete consumers are already migrated in the REST API when
showing allocations for a given consumer or listing allocations
for a given resource provider.
This adds the online data migration command line entry point
for "placement-manage db online_data_migrations" so that
incomplete consumer records can be migrated on-demand in batches
by the operator.
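A hedged sketch of the shape of that entry point; the module path and
the helper it delegates to are assumptions based on the description
above:

    from placement.objects import consumer as consumer_obj

    def create_incomplete_consumers(ctxt, max_count):
        # Returns a (found, done) style tuple so the CLI can report
        # progress and keep batching until nothing is left to migrate.
        return consumer_obj.create_incomplete_consumers(ctxt, max_count)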
The nova change had no direct testing of the CLI entry point since
it's just a call to the same code that the API uses, which is already
tested in CreateIncompleteConsumersTestCase, so no explicit CLI
unit test is added here as it would be redundant.
This is part of the placement extraction effort; in the case that
a deployment migrates to extracted placement before completing
the online data migration in nova, this allows them to still complete
the migration within extracted placement.
Change-Id: If5babb29b13a3e8c26ac04ecee02f4d3d5404263
When the nested resource provider feature was added in Rocky, the
root_provider_id column, which should have a non-None value, was added
to the resource provider DB.
However, the online data migration is only done implicitly via listing or
showing resource providers. With this patch, executing the CLI command
`placement-manage db online_data_migrations`
makes sure all the resource providers are ready for the nested provider
feature, that is, that all the root_provider_id values in the DB are
non-None.
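A minimal sketch of the kind of backfill such a migration performs,
assuming a SQLAlchemy table object for resource_providers; providers
created before the nested feature have no parent, so a NULL root is
filled with the provider's own id:

    import sqlalchemy as sa

    def set_root_provider_ids(engine, rp_table, batch_size):
        with engine.begin() as conn:
            # Find a batch of providers that still have a NULL root.
            ids = [row[0] for row in conn.execute(
                sa.select([rp_table.c.id])
                .where(rp_table.c.root_provider_id.is_(None))
                .limit(batch_size))]
            if ids:
                # A provider with no parent is its own root.
                conn.execute(rp_table.update()
                             .where(rp_table.c.id.in_(ids))
                             .values(root_provider_id=rp_table.c.id))
            return len(ids), len(ids)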
Change-Id: I42a1afa69f379b095417f5eb106fe52ebff15017
Related-Bug: #1803925
Placement uses alembic to manage the DB version for schema changes.
However, changes that manipulate data should be separated from schema
changes, since tables can be locked and, in the worst case, backward
incompatible changes can break the service.
We could handle them as a task performed during service downtime.
However, to minimize downtime, it is better to have the concept of
online data migrations, which has been the traditional way to handle
such data manipulation changes in nova.
This patch adds an online data migration command to placement to enable
operators to manipulate DB data while the service is running:
placement-manage db online_data_migrations [--max-count]
where --max-count controls the maximum number of objects to migrate
in a given call. If not specified, migration will occur in batches
of 50 until fully complete.
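A minimal sketch of the batching behaviour described above; error
handling, progress reporting and exit-status handling are omitted, and
the names here are assumptions:

    def run_online_data_migrations(migrations, ctxt, max_count=None):
        batch = max_count if max_count is not None else 50
        while True:
            any_found = False
            for migrate in migrations:
                found, done = migrate(ctxt, batch)
                any_found = any_found or bool(found)
            if max_count is not None:
                # A single bounded pass when --max-count is given.
                return
            if not any_found:
                # Nothing left to migrate; fully complete.
                return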
Change-Id: I9cef6829513d9a54d110426baf6bcc312554e3e7
This is a first pass at creating the install documentation for
placement by copying the nova install documentation and removing the
nova parts, leaving the placement parts in place.
The expectation here is that this provides a starting point which
can be iteratively improved, perfection being the enemy of the
done and all that.
The from-pypi and verify pages are currently left as TODO and not
included in the visible table of contents. They are present in
the hidden table of contents.
Change-Id: I2f7bcd8efabc628bd27e3a9ce74e277a9e37fb69
This removes all the releasenotes related to changes already released
with nova in Rocky and prior. They were carried over into placement
as we weren't sure how we were going to manage release notes. In
a meeting it was decided that we would start the release notes fresh
with "since-extraction".
Therefore this change removes all the releasenotes that are already
present in nova, keeping anything new.
The index.rst file is already set to say "look at nova" for older
release notes.
Change-Id: If9255212acf21ed69abbed0601c783ad01133f72
To aid debuggability, the provider UUID was added to the error message
for the generation conflict exception in nova's incarnation of the
reshaper handler via [1]. This patch syncs the change to the placement
repository.
[1] https://review.openstack.org/#/c/615695/
Change-Id: I606e883983da65a2253be23ee1786d10ee53680c
The link was to nova, to documents that are no longer there, so
update it to the new location. A change will also be made in nova to
add a redirect.
Change-Id: Ibfe016f25a29b6810ea09c5d03a01dbf3c53371f
os-resource-classes is a python library in which the standardized
resource classes are maintained. It is done as a library so that
multiple services (e.g., placement and nova) can use the same stuff.
It is used and managed here in the same way the os-traits library is
used: At system start up we compare the contents of the
resource_classes table with the classes in the library and add any
that are missing. CUSTOM resource classes are added with a high id
(and always were, even before this change). Because we need to
insert standard resource classes with an id of zero, we need to
protect against mysql thinking 0 on a primary key id is "generate
the next one". We don't need a similar thing in os-traits because
we don't care about the ids there. And we don't need to guard
against postgresql or sqlite at this point because they do not have
the same behavior.
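A hedged sketch of the start-up comparison described above;
os_resource_classes is imported as orc (as noted later in this
message), while the table handling and helper name are assumptions:

    import os_resource_classes as orc
    import sqlalchemy as sa

    def ensure_resource_classes(conn, rc_table):
        # Names already present in the resource_classes table.
        existing = {row[0] for row in
                    conn.execute(sa.select([rc_table.c.name]))}
        for rc_id, name in enumerate(orc.STANDARDS):
            if name not in existing:
                # Standard classes use the library index as their id.
                # Index 0 is valid, so mysql must not treat an explicit
                # 0 on the primary key as "generate the next one".
                conn.execute(rc_table.insert().values(id=rc_id, name=name))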
The resource_class_cache of id to string and string to id mappings
continues to be maintained, but now it looks solely in the database.
As part of confirming that code, it was discovered that the reader
context manager was being entered twice; this has been fixed.
Locking around every access to the resource class cache is fairly
expensive (changes the perfload job from <2s to >5s). Prior to this
change we would only go to the cache if the resource classes in the
query were not standard ones. Now we always look at the cache, so rather
than locking around reads and writes we only lock around writes.
This should be okay because, as long as we do a get (instead of the
previous two separate accesses) on the cache's dict, that operation
is safe, and if it misses (because something else destroyed the
cache) the fall-through is to refresh the cache, which still takes the
lock.
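A minimal sketch of the lock-around-writes pattern described above;
attribute and method names are assumptions:

    import threading

    class ResourceClassCache:
        def __init__(self):
            self._lock = threading.Lock()
            self._id_cache = {}

        def _refresh_from_db(self):
            ...  # rebuild the id<->string mappings from resource_classes

        def id_from_string(self, rc_name):
            rc_id = self._id_cache.get(rc_name)  # unlocked dict get is safe
            if rc_id is not None:
                return rc_id
            with self._lock:  # only the refresh (write) path takes the lock
                self._refresh_from_db()
                return self._id_cache[rc_name]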
While updating the database fixture to ensure that the resource
classes are synched properly, it was discovered that addCleanup was
being called twice with the same args. That has been fixed.
In objects/resource_provider.py the ResourceClass field is changed to a
StringField. The field definition was in rc_field and we simply don't
need it anymore. This is satisfactory because we don't do any validation
on the field internal to the objects (but do elsewhere).
Based on initial feedback, 'os_resource_classes' is imported as 'orc'
throughout to avoid unwieldy length.
Change-Id: Ib7e8081519c3b310cd526284db28c623c8410fbe
Add a section to the api-ref describing the error codes that some
responses produce.
Note in the contributor docs that this should be updated when one is
added.
The reshaper docs are adjusted so a ref can be made to them from the
errors. The implicit link to the header that would be the norm there
doesn't work because there are two headers named "Reshaper".
Change-Id: I89bbd383ba102fdd707ccc9f2fc973c6dd841fa8
Closes-Bug: #1794712
We don't want perfload to run on docs, tox, or test changes, but
because it is based off a very basic job (to avoid weight) there was
no built-in irrelevant-files list. This adds one to ensure that the
job only runs on code-related changes.
We need a slightly different list of files from the list used by
integrated gate jobs so do not use the changes made in
Iae26251cb455640fbc74dad326666d2015e7a46c .
Change-Id: I092b5ed19ef51c42a714984127e2fdc91d5c6c70
The existing nova-api to placement database migration shell scripts
don't include the post task to stamp the placement database to the
(b4ed3a175331) version. Without that stamping, syncing the DB to the
current version fails on having the initial upgrade: (initial) ->
(b4ed3a175331). This is because it assumes the initial DB has no
contents and tries to create the tables again.
This patch changes the shell scripts to include that stamping task
rather than leaving it as operators' manual duties to safely bring
the placement DB under alembic version control.
With this change, the shell scripts will need to be executed under the
following conditions:
- Placement is already installed, so that `placement-manage *` can be executed
- The placement CLI can access placement's database, for example,
by referring to the `[placement_database]` section in
`placement.conf`.
Depends-On: https://review.openstack.org/620485
Change-Id: I75926b0efb3983d62603f2fd30b5a8cc30203d46
Use the oslo_db provided wrap_db_retry to retry _ensure_aggregates a
maximum of 5 times when DBDuplicateEntry happens.
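A hedged sketch of how that looks; wrap_db_retry, max_retries and
exception_checker are real oslo.db parameters, but the exact arguments
used in placement are assumptions:

    from oslo_db import api as oslo_db_api
    from oslo_db import exception as db_exc

    @oslo_db_api.wrap_db_retry(
        max_retries=5,
        exception_checker=lambda exc: isinstance(
            exc, db_exc.DBDuplicateEntry))
    def _ensure_aggregates(context, agg_uuids):
        ...  # insert any aggregates that don't exist yet, return their ids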
This replaces the previous looping, which had no guard to prevent
unlimited looping if for some reason DBDuplicateEntry kept happening.
This is possible, rarely, when there is a small number of threads talking
to the same database, and a small number of aggregates being ensured a
huge number of times (thousands) in a very small amount of time. Under
those circumstances a maximum recursion error was possible.
The test of the new behavior is shamefully over-mocked but manages to
confirm the behavior against the real database, thus is in the
functional tree.
Change-Id: I67ab2b9a44264c9fd3a8e69a6fa17466473326f1
Closes-Bug: #1804453
Coverage testing revealed that a few methods in
objects/resource_provider.py had no coverage. It was initially assumed
this was an artifact of the extraction of placement from nova.
It turns out that the code is not used. It was added in
Ib45fde56706a861df0fc048bbec8a568fd26d85d, but then refactoring in
Ib1738fb4a4664aa7b78398655fd23159a54f5f69 made it redundant and the
unused code was never removed.
Change-Id: Id2f0a85b58f191886a2edea8fc195bc2289cbcec
Closes-Bug: #1805858
python3.5 was the only supported python3 version on Xenial. Now that
we have Bionic nodes supporting python3.6 and python3.7, let's also
test with python3.7. Eventually we'd like to remove python 3.5, but not
yet.
See ML discussion here [1] for context.
[1] http://lists.openstack.org/pipermail/openstack-dev/2018-October/135626.html
Change-Id: I78517d8a873ed3d41a8e06b4193fdb677f0705d7
Story: #2004073
Task: #27433
We are already in a writer context via _set_aggregates, so we don't
need to ask for it again.
This is in preparation for changing the way the retry is done in the
_ensure_aggregates method in subsequent patches.
Change-Id: I1c5c831ae27c3a1936e4782ceccb4d4b39c31b65
Related-Bug: #1804453
With the merge of Iefa8ad22dcb6a128293ea71ab77c377db56e8d70 placement
can run without a config file, so in this change we remove the
creation of an empty one. All the relevant config is managed by
environment variables, as provided by oslo.config 6.7.0.
Change-Id: Ibf285e1da57be57f8f66f3c20d5631d07098ec1c
The openstack-dev mailing list has been replaced with
the openstack-discuss mailing list (*).
So replace the openstack-dev mailing list with
the openstack-discuss mailing list in setup.cfg.
*: http://lists.openstack.org/pipermail/openstack-dev/2018-September/134911.html
Change-Id: I1138d32462a7f1531588a7c4321086c9abb22d49
Now we have the `placement-manage db sync` CLI, which upgrades the
placement DB to the current version using alembic. However, if you
have already created tables in placement, the command fails when applying
the initial upgrade: (initial) -> (b4ed3a175331). This is because it
assumes the initial DB has no contents and tries to create the tables.
Since this can be a problem for the nova-api -> placement migration case,
where we expect the placement DB to have tables before starting alembic
version management via the migration script in `tools/*-migrate-db.sh`,
this patch provides a way to stamp the current version via the CLI:
`placement-manage db stamp <version>`.
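A minimal sketch of what the stamp sub-command does under the hood;
alembic.command.stamp is the real alembic API, while the wrapper name
is an assumption:

    from alembic import command as alembic_command

    def stamp(alembic_config, version):
        # Record 'version' in the alembic_version table without running
        # any migrations, so a pre-existing schema (e.g. one copied from
        # nova-api) can join alembic version control.
        alembic_command.stamp(alembic_config, version)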
Change-Id: I65fa8fd6e2479224f1b25cd62ca15a90d5948424
Converts the plain and migration fixtures to use oslo_db
test_fixtures.
The one snag that can be improved on the oslo_db side is
that the generate_schema hooks called by test_fixtures.SimpleDBFixture
are called before the enginefacades have been connected with the
provisioned engine, so an additional patch is applied here
as placement's database setup relies upon enginefacade-enabled
methods to insert DB data.
The WalkVersionsMixin here is necessary as the one in
oslo.db is specific to SQLAlchemy-migrate, so oslo.db should
also include this alembic-capable WalkVersionsMixin within
its own modules.
The normal provisioning system of generating anonymously-named
databases within MySQL and Postgresql is in place, which allows
tests to run concurrently in multiple processes without conflicts.
Change-Id: Icf2ea4a5f51dce412c6e93fda3950d0c9f17aa9f
When I initially wrote the goal, I figured it would be ages before we'd
do it, but the difficulties with the functional test changes in nova [1]
meant we did it sooner than expected.
[1] Idaed39629095f86d24a54334c699a26c218c6593
Change-Id: Ia48431503ce19feb0ac7a16cc65f699c58a57af2
In this job we install placement by hand, based on the
instructions in
https://docs.openstack.org/placement/latest/contributor/quick-dev.html
and run the placeload command against it. This avoids a lot of node
setup time.
* mysql is installed, placement is installed, uwsgi is installed
* the database is synced
* the service is started via uwsgi, which runs with 5 processes each
with 25 threads, otherwise writing the resource providers is
very slow and causes errors in placeload. It's an 8-core VM.
* placeload is called
A post.yaml is added to get the generated logs back to zuul.
Change-Id: I93875e3ce1f77fdb237e339b7b3e38abe3dad8f7
This adds the placeload perf output as its own job, using a
very basic devstack setup. It is non-voting. If it reports as
failing it means it was unable to generate the correct number
of resource providers against which to test.
It ought to be possible to do this without devstack, and thus speed
things up, but some more digging in existing zuul playbooks is
needed first, and having some up to date performance info is useful
now.
Change-Id: Ic1a3dc510caf2655eebffa61e03f137cc09cf098
This change was driven out of trying to get nova functional tests
working with an extracted placement, starting with getting the
database fixture cleaner.
Perhaps not surprisingly, trying to share the same 'cfg.CONF' between
two services is rather fraught. Rather than trying to tease out all the
individual issues, which is a very time consuming effort for not much
gain, a different time consuming effort with great gain was tried
instead.
This patch removes the use of the default global cfg.CONF that
oslo_config (optionally) provides and instead ensures that at the
various entry points to placement (wsgi, cli, tests) the config is
generated and managed in a more explicit fashion.
Unfortunately this is a large change, but there's no easy way to do it
in incremental chunks while keeping tests passing without getting very
confused. There are a few classes of changes here, surrounded by various
cleanups to address their addition. Quite a few holes were found in how
config is managed, especially in tests where often we were getting what
we wanted pretty much by accident.
The big changes:
* Importing placement.conf does not automatically register options
with the global conf. Instead there is now a register_opts method
to which a ConfigOpts() instance must be passed (see the sketch after
this list of changes).
* Because of policy enforcement wanting access to conf, a convenient way
of having the config pass through context.can() was needed. At
the start of PlacementHandler (the main dispatch routine) the
current config (provided to the PlacementHandler at application
configuration time) is set as an attribute on the RequestContext.
This is also used where CONF is required in the objects, such as
randomizing the limited allocation candidates.
* Passing in config to PlacementHandler changes the way the gabbi fixture
loads the WSGI application. To work around a shortcoming in gabbi
the fixture needs a CONF global. This is _not_ an
oslo_config.cfg.CONF global, but something used locally in the fixture
to set a different config per gabbi test suite.
* The --sql command for alembic commands has been disabled. We don't
really need that and it would require some messing about with config.
The command lets you dump raw sql instead of running the migrations.
* PlacementFixture, for use by nova, has been expanded to create and
manage its config, database and policy requirements using non-global
config. It can also accept a previously prepared config.
* The Database fixture calls 'reset()' in both setUp and cleanUp to be
certain we are both starting and ending in a known state that will
not disturb or be disturbed by other tests. This adds confidence (but
not a guarantee) that in tests that run with eventlet (as in nova)
things are in a more consistent state.
* Configuring the db in the Database fixture is moved into setUp where
it should have been all along, but it is important that this happens
_after_ 'reset()'.
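A minimal sketch of the explicit pattern the register_opts bullet above
refers to; the exact call sites in placement's entry points are
assumptions:

    from oslo_config import cfg

    from placement import conf as placement_conf

    config = cfg.ConfigOpts()             # no global cfg.CONF involved
    placement_conf.register_opts(config)  # options registered explicitly
    config([], project='placement')
    # 'config' is then handed to the application (e.g. PlacementHandler)
    # at configuration time rather than imported from a global.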
These of course cascade into other changes all over the place, especially
the need to manually call register_opts. There are probably refactorings that can
be done or base classes that can be removed.
Command line tools (e.g. status) which are mostly based on external
libraries continue to use config in the pre-existing way.
A lock fixture for the opportunistic migration tests has been added.
There was a lock fixture previously, provided by oslo_concurrency, but
it, as far as I can tell, requires global config. We don't want that.
Things that will need to be changed as a result of these changes:
* The goals doc at https://review.openstack.org/#/c/618811/ will
need to be changed to say "keep it this way" rather than "get it
this way".
Change-Id: Icd629d7cd6d68ca08f9f3b4f0465c3d9a1efeb22