This is essentially a duplicate of the cycle highlights provided
with the release, plus a warning about needing to upgrade to extracted
placement in Stein before the Train upgrade.
Few links are provided because the rest of the release notes are very
short and will be presented adjacent to the prelude, and they have
links.
Change-Id: I872271688fbee69ff35bf44ef82fad9ab34f5229
In the document source, we had a hidden toctree to avoid an error
when creating an html document. However, the contents of the hidden
toctree appear explicitly as actual subsections. This resulted in
a wrong subsection structure when creating a pdf document.
This patch fixes it by adding explicit sections for the contents
of the (previously) hidden toctree.
Change-Id: I8420835c19856953c65513914362bf402ff0f08b
Story: 2006110
Task: 35398
This follows the instructions [1] in an attempt to build pdf docs.
Several doc/source/conf.py changes are required to get this to work.
The most important one is
'maxlistdepth': '10',
which prevents the build process from stalling out and dropping the
caller into an interactive session.
[1] https://etherpad.openstack.org/p/train-pdf-support-goal
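For illustration, that setting lives in the latex_elements dict in
doc/source/conf.py; the snippet below is only a minimal sketch of that
shape, not the full set of changes in the real file:

    # doc/source/conf.py (sketch): raise the LaTeX list-nesting limit so
    # deeply nested lists do not abort the pdf build.
    latex_elements = {
        'maxlistdepth': '10',
    }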
Change-Id: Icf7c22bf9d1de6fb2a74a756c370930d4c00b0b9
Story: 2006110
Task: 35398
jsonschema < 3.0.0 does not support Python 3.6 or 3.7, which will cause
placement to fail to install on instances using 3.6 (Ubuntu Bionic) or
3.7.
Global-requirements removed this cap months ago in
828484138c
Change-Id: I49e628c87276b5495bca01842f6228742ed49765
The [placement]/policy_file option was necessary when placement
was in nova since nova uses the standard [oslo_policy]/policy_file
option for defining custom policy rules. Now that placement is
extracted (+1 release) we can deprecate the placement-specific
option and use the standard [oslo_policy]/policy_file option as well.
The tricky thing with this is that both options define a default value
but those values are different, and neither file needs to exist (or a
file can exist but be empty), in which case we use policy defaults from
code. So some logic is necessary to detect which option we should pass
to the oslo.policy Enforcer object. We prefer [oslo_policy]/policy_file
if it exists but will fall back to [placement]/policy_file for
backward compatibility. We also check for a couple of edge cases to
try to detect misconfiguration and usage of the deprecated option.
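A rough sketch of that selection logic follows; the function and
attribute names here are illustrative, not the actual placement code:

    import os

    def pick_policy_file(conf):
        # Prefer the standard oslo.policy option when its file exists.
        oslo_file = conf.oslo_policy.policy_file
        if oslo_file and os.path.exists(oslo_file):
            return oslo_file
        # Fall back to the deprecated placement-specific option.
        placement_file = conf.placement.policy_file
        if placement_file and os.path.exists(placement_file):
            return placement_file
        # Neither file exists; the Enforcer will use defaults from code.
        return oslo_file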
The config generation docs are updated to include the [oslo_policy]
options as well as registering the options from that library for
runtime code.
Change-Id: Ifb14d2c14b17fc5bcdf7d124744ac2e1b58fd063
Story: #2005171
Task: #29913
root_member_of was not implemented (as we thought it might not be).
The spec is updated to make this explicit.
Change-Id: I2a60bff04b0837bcac7037d766c675ef3896692f
The assertion failure message in a microversion sequence test
had a typo. This was not noticed because the test usually
passes and the message is not produced under normal circumstances.
Change-Id: I76cca613e0d89e235f7fc23740dda93bda4ffd15
The migration scripts have moved, gitea doesn't handle the redirects
properly, so update the links to point to the correct locations
directly.
Change-Id: I73a47c862606fa27158e0b7af9f111b5df8a065d
WSGI middleware is a confusing topic for people who are not
familiar with it, so make the NOTE that describes how the
ordering works clearer.
Change-Id: I21680298f98f83d47e8319f01ad714cd96418c26
This is done to ensure that both the request id and the request log
middleware are in the outermost piece of middleware so that we
generate and log the
global request id (if any) as early as possible, and
use it, and the local request id, throughout the entire
middleware stack.
This turned out to be somewhat more complex than desired
because:
* The request id middleware from oslo_middleware is a webob
wsgi app, so we need to map the request log middleware to
that style (meaning more changes than strictly necessary);
a sketch of the webob style follows this list.
* The __call__ in the request id middleware is not composed
in a way that makes calling it via super() clean, so instead
of doing that we are copying code, which implies some risks
for which we may wish to consider workarounds.
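A minimal illustration of the webob middleware style mentioned in the
first bullet; the class here is made up for the example and is not the
placement middleware:

    import webob.dec

    class RequestLog(object):
        def __init__(self, application):
            self.application = application

        @webob.dec.wsgify
        def __call__(self, req):
            # The real middleware logs the request, including the global
            # and local request ids, around this call.
            return req.get_response(self.application)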
This is an alternative to I5a80056dd88836a4e79a649fa02d36dc7e75eee4
Change-Id: I7e24b54bfcd296f13ea4f65dbb10ba63679c05b1
Because exclude_nested_providers() had several iteration loops that
were redundant, this patch refactors it.
Change-Id: I3e7b222c8e6c278d1b66a1068a9f05b70c41428a
For performance optimization purposes, we tracked which providers were
already processed both in extend_usages_by_provider_tree() and in
_build_provider_summaries(), and we iterated over all usage information
in _build_provider_summaries(), which could be costly even when only
one request group has been given.
This patch refactors it and moves get_usages_by_provider_trees()
into _build_provider_summaries() to combine the exploration parts
into one.
Change-Id: Ic36d453adf92b1f475960eb1b2796919d635da10
Sphinx 2.2.0 gets upset when a directory it is configured for does
not exist.
The _static directory is only used for automatically generated
configuration and policy sample files.
Change-Id: I5ee07a2cb118e5c9b16aefee73c9274ecace1d44
In testing with the nested-perfload topology containing 7000
providers, _merge_candidates had been returning 21000 ProviderSummary
objects which were not trimmed until serialization processing in
handlers/allocation_candidate.py. That situation has been present
at least since we added same_subtree [1], so it is not simply
because of the addition of rw_ctx.summaries_by_id [2].
Teasing out fixing this exposed a lot of opportunities for
shrinking the amount of data being inspected and processed.
In no particular order the changes are:
* Return set() from _alloc_candidates_single_provider and
_alloc_candidates_multiple_providers instead of list. It doesn't need
to be a list and the cast to list costs.
* Because we have rw_ctx.summaries_by_id we don't need to return
summaries from _get_by_one_request, they're in the rw_ctx.
* The candidates provided to _merge_candidates is no longer a dict of
tuples of allocation requests and summaries (_all_ the summaries).
It's just a dict of allocation requests.
* At the end of _merge_candidates we need to winnow the contents of
rw_ctx.summaries_by_id to only those that still matter (a sketch of
this follows the references below).
[1] I7fdeac24606359d37f1a7405d22c5797840e1a9e
[2] I43ae1118421366336b4e96738c2981e07caebec8
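A rough sketch of the winnowing step; the object and attribute names
are assumptions for illustration, not the exact placement internals:

    def winnow_summaries(rw_ctx, alloc_requests):
        # Keep only the summaries for providers that still appear in one
        # of the surviving allocation requests.
        used_rp_ids = set()
        for areq in alloc_requests:
            for arr in areq.resource_requests:
                used_rp_ids.add(arr.resource_provider.id)
        rw_ctx.summaries_by_id = {
            rp_id: summary
            for rp_id, summary in rw_ctx.summaries_by_id.items()
            if rp_id in used_rp_ids}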
Change-Id: Ibccaed40f1eac9c244cf70654f6be1d72f7a6054
In _merge_candidates we need a final _exceeds_capacity check to
make sure a multi-part granular request has not exceeded capacity
limits. This uses a dict that maps resource provider uuid and resource
class name to a ProviderSummaryResource.
Rather than creating this dict in _merge_candidates, we now have it
as a member on the RequestWideSearchContext and add to it when
creating the ProviderSummaryResource.
This allows us to remove some looping in _merge_candidates and is
also a bit more tidy. This is only possible with the advent of
the RequestWideSearchContext, which is newer than _merge_candidates.
_exceeds_capacity is moved to the RequestWideSearchContext as
exceeds_capacity as it now makes sense as a method.
Because _rp_rc_key ended up being used from two different modules and for
two different purposes, the method is instead removed and callers
create their own keys directly.
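A sketch of the resulting check, with all names assumed for
illustration rather than taken from the real code:

    class RequestWideSearchContext(object):
        def __init__(self):
            # (provider uuid, resource class name) -> ProviderSummaryResource,
            # populated as each summary resource is created.
            self.psum_res_by_rp_rc = {}

        def exceeds_capacity(self, allocation_request):
            # True if any consolidated resource request asks for more
            # than the matching provider has available.
            for arr in allocation_request.resource_requests:
                psum_res = self.psum_res_by_rp_rc[
                    (arr.resource_provider.uuid, arr.resource_class)]
                if arr.amount > psum_res.capacity - psum_res.used:
                    return True
            return False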
Change-Id: Id74c01215956fc998f2dadcd9d84801ac58c2d3d
This is the tiniest of memory and hashing optimizations but is
also for subsequent changes which will change when and where a
key is created and used in the management of psum_res_by_rp_rc.
All the current callers have an rp with id set when they use this.
Change-Id: I9d8f4ed1c32deafe90cd4c4ddf7c36f11a56b6b5
In _merge_candidates a dict of rp uuid to parent uuid mappings
is used to determine if providers are in the same subtree.
In this change the creation of that dict is moved from _merge_candidates
to _build_provider_summaries because we have established that all
relevant providers will be visited by the sum of calls to that method
and we do not want to have to rely on visiting all providers in
_merge_candidates, to allow for the possibility of short circuiting
some loops.
The signature on _satisfies_same_subtree is updated to accept an
rw_ctx, rather than two separate members of the object.
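A rough sketch of how such a mapping supports the subtree check; the
names are assumed for illustration, not the actual placement code:

    def ancestry(parent_uuid_by_rp_uuid, rp_uuid):
        # Walk up the parent mapping collected while building provider
        # summaries; the group satisfies same_subtree when one provider's
        # uuid appears in the ancestry of all the others.
        uuids = {rp_uuid}
        parent = parent_uuid_by_rp_uuid.get(rp_uuid)
        while parent is not None:
            uuids.add(parent)
            parent = parent_uuid_by_rp_uuid.get(parent)
        return uuids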
Change-Id: Iaba4a098847dc4a6fc1bdaee44971dd4f0e051ae
Modify the NUMANetworkFixture so we make a subclass
that pre-creates cn1 and cn2 and puts them under distinct parents and a
shared grandparent. This helps to model a deeply nested but sparse
topology where providers that contribute to the solution are low down
in the tree.
This provides a deeper topology to test the same_subtree and
_merge_candidates changes that come in the following patches.
The added gabbit intentionally uses yaml anchors to indicate that the
expected response on each test is the same.
Understanding the structure of the available resource providers and
the UUIDs in use can be super painful, but osc_placement_tree [1] can
help. In my own testing I added the following to the end of
make_entities to trigger it to dump:
    from osc_placement_tree import utils as placement_visual
    from placement import direct
    import time

    with direct.PlacementDirect(
            self.conf_fixture.conf, latest_microversion=True) as client:
        placement_visual.dump_placement_db_to_dot(
            placement_visual.PlacementDirectAsClientWrapper(client),
            '/tmp/dump.%s.dot' % time.time(),
            hidden_fields=['inventories', 'generation', 'aggregates',
                           'resource_provider_generation'])
[1] https://pypi.org/project/osc-placement-tree/
Change-Id: I2aedac0ce3a4f5a40de796bb9f74824541a95a65
Based on the analysis at [1], we need to continue to use fully-populated
ancestry data when calculating same_subtree-ness. This commit introduces
a gabbi test that covers a particular case that can be broken if we
don't do that: where a provider involved in the request has a broken
ancestry path to another provider involved in the request.
[1] https://review.opendev.org/#/c/675606/2/placement/objects/allocation_candidate.py@908
Change-Id: I5062e725d6f29cd997b974a0dad7dc4d0ed2900f
While doing something else, I noticed a bug in the NUMANetworkFixture:
we were trying to consume two VCPU from numa0 and three from cn2; but
our second call to tb.set_allocation() was using the same consumer as
the first, so numa0 was losing its allocation.
This commit uses a new consumer for the cn2 allocation, and tweaks an
existing gabbi test to prove the fix.
Change-Id: I893b27b7a07d20ef245c6056cde0e2c5bc5f4daa
When there are many root_ids, the in_ in get_traits_by_provider_tree
can be expensive (according to profiling data), so we switch it
here to an expanding bindparam, which has proven elsewhere to help.
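The general shape of the change, sketched against a stand-in table
rather than the actual placement query:

    import sqlalchemy as sa

    traits = sa.table('traits', sa.column('id'), sa.column('name'))

    # Instead of .in_(literal_list), bind an expanding parameter so
    # SQLAlchemy renders one bound value per element at execution time.
    query = sa.select([traits.c.name]).where(
        traits.c.id.in_(sa.bindparam('root_ids', expanding=True)))
    # conn.execute(query, {'root_ids': list_of_root_ids})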
Change-Id: I86e96b877945a81c10ccbb2437da4213a3d06844
Profiling revealed that copying AllocationRequestResource in
_consolidate_allocation_requests was expensive.
As an optimization, this patch adds logic to only do that copy when it's
necessary.
And it's only necessary for an ARR whose resource class was requested
from multiple groups, when group_policy=none. In that case, we're
constructing multiple results with the same ARR. For example, consider a
host with two PFs with VF inventory. A request like:
    ?resources1=VF:1
    &resources2=VF:1
    &group_policy=none
...will yield results like:
    (1) PF1{VF:1},PF2{VF:1}
    (2) PF1{VF:2}
    (3) PF2{VF:2}
The AllocationRequestResource representing PF1{VF:1} gets used in both
result (1) and result (2). When producing result (2), we add the
AllocationRequestResource.amounts together. If we were reusing the same
ARR instance from result (1), result (1) would end up looking like:
    (1) PF1{VF:2},PF2{VF:1}
...which is wrong. So we have to do the copy.
With group_policy=isolate, or if we've only requested the resource class
from one group, we're only using that ARR once in the results, so we can
just use it without copying.
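A minimal illustration of the aliasing problem the copy avoids; the
class here is a simplified stand-in, not the real
AllocationRequestResource:

    class ARR(object):
        def __init__(self, amount):
            self.amount = amount

    shared = ARR(1)              # represents PF1{VF:1}
    result_1 = [shared]          # used as-is in result (1)
    merged = shared              # reused without copying for result (2)
    merged.amount += 1           # consolidate the two VF:1 groups
    assert result_1[0].amount == 2   # result (1) silently became PF1{VF:2}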
Change-Id: Idb98611f549b628d273c43f07d137fcf9c73314c
It turns out that when Python runs copy.copy on an object it does
a lot of unnecessary work that can be avoided by implementing a
__copy__ method that essentially does a clone(), creating a new
object with the same values. This is considered the idiomatic form
of clone.
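A hedged illustration of the idiom; the attribute names are
assumptions for the example, not the placement class:

    import copy

    class AllocationRequestResource(object):
        def __init__(self, resource_provider, resource_class, amount):
            self.resource_provider = resource_provider
            self.resource_class = resource_class
            self.amount = amount

        def __copy__(self):
            # A direct clone; copy.copy() dispatches here and skips the
            # generic (and slower) __reduce_ex__ machinery.
            return AllocationRequestResource(
                self.resource_provider, self.resource_class, self.amount)

    arr = AllocationRequestResource('rp-uuid', 'VF', 1)
    arr_clone = copy.copy(arr)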
That unnecessary work shows up quite significantly in the
_consolidate_allocation_requests method in allocation_candidate.py where
AllocationRequestResource objects get copied, a lot. The use of __copy__
more than halves the time consumed by _consolidate_allocation_requests.
There is another copy.copy, in _alloc_candidates_single_provider in the
same file, this time copying AllocationRequest objects. This is only
done when a shared provider is involved, and we haven't been doing scale
testing for those (yet) but since we're in that domain I've gone ahead
and pre-emptively added a __copy__ to that object too, to prepare for
the eventual work of many shared disks.
Change-Id: I3e8167e5d09aeb2ae68282bc0378bee6d956a286
We updated the SQL when getting rid of null root provider
protections [1] but did not update the docstring. This does.
I confirmed it was accurate by printing the generated SQL
and doing a compare.
[1] I84baff29505550f3f20069ad5817784d0d1aaea6
Change-Id: Iefbb38f1f328b4d7a51cfec360c81369a977461e
We had already [1] changed one in_ in _get_usages_by_provider_trees to
be an expanding bindparam. This changes the other based on new profiling
results. The same value is used for the param.
[1] I121e5ac6e8202467bf72d9de7ac452d5b5804da1h
Change-Id: If9e6693da9ea09966b4282369cd99516658507b6
Right now the only consumer of provider_ids_from_rp_ids is
_build_summaries, which is private to allocation_candidates.
Since provider_ids_from_rp_ids has evolved to have a fairly
specific behavior, with some tricky considerations for how
it is used, this patch moves it to allocation_candidate.py
and makes it private to make it a bit more clear that it is
special.
If/when it gains more uses we can reconsider.
This change also restores the expanding bindparam functionality
added in [1] and removed in [2] as part of a rebasing mistake.
See above about "tricky".
[1] Ic4e0cdd87f8f2d76b921059ac4bf16a838913abf
[2] If937053a8af3f0eecdd90aa807be0abc316d8c4f
Change-Id: Ief3e86ca0d722035809115818d38fb14a89a7a39