In microversion 1.34, add a 'mappings' key to each allocation
request. Its value is a dict keyed by resource group suffix; each
value is a list of the resource providers satisfying that group.
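As a rough illustration of the shape described above, the sketch
below groups providers by the suffix of the request group they
satisfy. The helper name build_mappings and the provider names are
hypothetical, and using the empty string for the unsuffixed group is
an assumption of this sketch; this is not placement's code.

```python
import collections


def build_mappings(satisfied):
    """Group provider identifiers by the request group suffix they satisfy.

    `satisfied` is an iterable of (suffix, provider) pairs. The empty
    string stands for the unsuffixed group (an assumption of this sketch).
    """
    mappings = collections.defaultdict(list)
    for suffix, provider in satisfied:
        # Keep each provider once per group, preserving first-seen order.
        if provider not in mappings[suffix]:
            mappings[suffix].append(provider)
    return dict(mappings)


example = build_mappings([
    ('', 'cn1'),
    ('_NET1', 'cn1_numa0_pf0'),
    ('_NET2', 'cn1_numa1_pf1'),
])
```

Each value is a list because more than one provider may contribute to
satisfying a single group.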
To preserve symmetry, the mappings key may be sent back when
writing allocations, so the schemas for POST and PUT allocations
and POST /reshaper are updated.
API history, api-ref and a reno are added.
Change-Id: Ie78ed7e050416d4ccb62697ba608131038bb4303
Story: 2005575
Task: 33536
We need to carry a 'suffix' through the process of generating
allocation requests in order to be able to present those suffixes
as a 'mappings' key in a forthcoming microversion.
In this patch, the suffix is tracked, but the data is not used
when presenting results.
A test is added to validate results. Existing tests which
check the count of results are adjusted to reflect new results:
the number of possible combinations is now larger because we are
accounting for the suffix as a differentiator.
Change-Id: I3fdd46a0a92bf9666696a1c5f98afc402cf43b33
Story: 2005575
Task: 33536
Remove _create_incomplete_consumers_for_provider and
_create_incomplete_consumer from placement.objects.allocation as
we now have a blocker migration which means there should no longer
be any incomplete consumers.
Tests are updated, with one test that was testing both the online and
inline migrations modified to simply confirm the online migration.
Change-Id: I5a29685d1e22c66e4d09f04198c011e71340a5d0
There are several TODOs throughout the code about calls that
ensure consumer records exist. They can be removed once there is
a blocker migration that will fail if there are missing
consumer records.
This patch adds that blocker migration. Subsequent patches
will remove the redundant code.
Change-Id: I6029c5095ed1e6ff7c46d480454db1382073bd57
When making a GET /allocation_candidates request with the 'limit'
parameter, include all the providers in the same tree as the providers
mentioned in the allocation requests. That is, in a nested situation,
if we limit the allocation requests to 10, we would usually expect
considerably more than 10 providers in provider summaries, to include
any non-contributing providers in the same tree.
Accomplish this by filtering provider summaries on root_provider_uuid
instead of individual provider uuid.
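A minimal sketch of the filtering idea, assuming simplified data
shapes: allocation requests are lists of provider uuids and summaries
map a provider uuid to its root provider uuid. The function name
limit_candidates is hypothetical.

```python
def limit_candidates(alloc_requests, summaries, limit):
    """Keep `limit` allocation requests and every summary in their trees.

    alloc_requests: list of lists of provider uuids used by each request.
    summaries: dict of provider uuid -> root provider uuid.
    """
    kept_requests = alloc_requests[:limit]
    # Collect the roots of the trees that the kept requests touch.
    kept_roots = {
        summaries[uuid] for request in kept_requests for uuid in request
    }
    # Filter on root, not on the individual provider, so that
    # non-contributing providers in the same tree survive.
    kept_summaries = {
        uuid: root for uuid, root in summaries.items() if root in kept_roots
    }
    return kept_requests, kept_summaries
```

With a limit of 1, a request that uses only one NUMA node still
brings along the summaries of its root and sibling providers.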
A gabbi test is added which exercises this in a nested situation.
An existing test is updated to reflect the new filtering style. It
does not exercise the nested scenario.
A reno is added indicating the fix.
Change-Id: I136bd7cd89f1bd54f0d1691268545850af234f18
Story: 2005859
Task: 33654
Adds an APIFixture representing compute hosts with characteristics such
as:
* A root compute node provider with no resources (VCPU/MEMORY_MB in NUMA
providers, DISK_GB provided by a sharing provider).
* NUMA nodes, providing VCPU and MEMORY_MB resources (with interesting
min_unit and step_size values), both decorated with the HW_NUMA_ROOT
trait, one decorated with a CUSTOM trait.
* Each NUMA node is associated with some devices.
* Two network agents, themselves devoid of resources, parenting
different kinds of network devices.
* A more "normal" compute node provider with VCPU/MEMORY_MB/DISK_GB
resources.
* Some NIC subtree roots, themselves devoid of resources, decorated with
the HW_NIC_ROOT trait, parenting PF providers, on different physical
networks, with VF resources.
* A sharing DISK_GB provider associated with both compute nodes. (It is
the only thing that can provide DISK_GB to the first compute, but the
second compute also has DISK_GB of its own.)
This will be used by subsequent patches to create test cases including
things like subtree affinity, resourceless request groups / providers,
can_split, etc.
Also includes a couple of tweaks to the base test helpers to make the
fixture easier to set up.
Change-Id: I65dbbff1d58a486941f085e89c8e32fc071b6d4b
Story: #2005575
A set of AllocationRequest objects is extended in _merge_candidates,
controlled by the hash of the AllocationRequest, which is based on
a list of AllocationRequestResource objects.
It turns out that this hash can be different depending on the ordering
of the members of the list, with different behaviors in Python 2 and 3.
Therefore, a stable sort is needed to ensure the stability of the hash.
It's quite likely that a sort could have been done elsewhere, but there
are at least two places where resource_requests are added, with
different attributes available for sorting, so this contained
implementation was done instead.
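The idea can be sketched as follows, using simplified stand-ins for
AllocationRequest and AllocationRequestResource. The class names
Request and RequestResource and the sort key are assumptions made for
illustration; the real objects carry more attributes.

```python
class RequestResource:
    """Simplified stand-in for one provider/resource class/amount entry."""

    def __init__(self, provider, resource_class, amount):
        self.provider = provider
        self.resource_class = resource_class
        self.amount = amount

    def sort_key(self):
        return (self.provider, self.resource_class, self.amount)


class Request:
    """Simplified stand-in whose hash ignores the order of its members."""

    def __init__(self, resources):
        self.resources = list(resources)

    def _key(self):
        # Sort before building the key so member ordering does not matter.
        return tuple(sorted(r.sort_key() for r in self.resources))

    def __eq__(self, other):
        return isinstance(other, Request) and self._key() == other._key()

    def __hash__(self):
        return hash(self._key())
```

Two requests built from the same members in a different order now
collapse to a single entry when added to a set.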
Tests are added that confirm the expected behavior.
Note that the expected behavior will change when we add suffix
mappings to allocation requests. Whereas now
'test_nested_result_count_isolate' gets winnowed to 2 results because
the following are treated as duplicates
[('cn1', orc.VCPU, 2),
('cn1_numa0_pf0', orc.SRIOV_NET_VF, 1),
('cn1_numa1_pf1', orc.SRIOV_NET_VF, 1)],
[('cn1', orc.VCPU, 2),
('cn1_numa0_pf0', orc.SRIOV_NET_VF, 1),
('cn1_numa1_pf1', orc.SRIOV_NET_VF, 1)],
they are in fact slightly different: one set has pf0 satisfying group
_NET1 and pf1 _NET2, the other vice versa. With mappings available
we presumably want to expose this difference, resulting in more...
results.
An interesting note: The original bug report was that python 2.7 was
varying across 3 values, sometimes right, but mostly wrong and 3.7
was always right. Turns out that 3.7 was consistently wrong, and 2.7
was mostly right, but not always.
Change-Id: I4fce243659d5c429dc3d48f07888e38bd0aed5d4
Story: 2005822
Task: 33579
One of the needs we've discussed for perfload is making sure
it is measuring when some inventory has been used.
Here, we change the perfload job so that it creates the 1000 providers,
measures getting allocation_candidates and then, in a loop of 99, gets
a limited set of candidates and writes the first one back as an
allocation for a random consumer, project and user. At each iteration
it measures again.
This will make the log file a lot longer, but that's not a significant
issue: the numbers that matter will either be near the top or near the
end. If they are weird, looking in the middle will be informative. We
can tweak it.
This, as usual, is one of many ways to accomplish gathering some data.
Other options might include parallelizing the writes, but in this case
we are trying to see the impact of code on a single request, not on
concurrency.
At some point we will want to add nested and sharing into this mix.
Change-Id: I74b64a25f2be8fbbd01b3a3b438bba68de04b269
os-traits 0.14.0 hit upper-constraints [1] so placement's lower bounds
and canary test need to be updated accordingly.
[1] https://review.opendev.org/663507
Change-Id: Ib6f08a681162101f91c0de4ce1d2eed35975ef96
This is the code used to run placement with the wsgi
profiling described in a blog post [1]. It has proven
useful enough that we may wish to include it in the
released code. It is added in a way that is off by
default and makes no changes to requirements.
Brief docs are included in testing.rst. They are
brief because it's assumed that someone who wants to
do this already mostly knows what they want to do and
merely needs the specifics on how to do it in this
environment.
[1] https://anticdent.org/profiling-wsgi-apps.html
Change-Id: I342512732b94bc19bd711684ba3ec9480cc51f81
To support QoS minimum bandwidth policy during server scheduling
Neutron needs to know which resource provider provides the bandwidth
resource for each port in the server create request.
Story: 2005575
Task: 30804
Change-Id: Iafdb0eab9b41f4c34c93cada08a4da27cf4b1499
Getting sharing providers in _get_providers_with_shared_capacity() was
heavier than the main function, _get_providers_with_resource(), that
searches for the requested resource, yet the two used similar
algorithms and were somewhat redundant.
This patch changes _get_providers_with_shared_capacity() to reuse the
result of _get_providers_with_resource() for performance optimization.
Change-Id: I190c13c61314da172623c7f9fded7e4ed3d83f80
Story: 2005712
Task: 31038
This patch moves the resource look up to `RequestGroupSearchContext`
initialization. This enables us to reuse the results for
_get_providers_with_shared_capacity(). That optimization is coming
in a following patch.
Change-Id: I2796c3e93317edd3ae8b02fb038cc39f23b41840
Story: 2005712
Task: 31038
_normalize_trait_map() was used in `GET /resource_providers` to
translate trait names to ids, but the parent patch
https://review.opendev.org/#/c/660049/ changed it to call the
translation, trait_obj.ids_from_names(), on demand.
Since it is no longer used, this patch removes that
_normalize_trait_map function.
Change-Id: I0c8d61bab848b3833752b213806c494891bf329c
When getting allocation candidates for one request group, placement
internally calls provider_ids_matching_aggregates() several times with
the same arguments. This is redundant and can degrade performance.
This patch calls it once, on initialization of the
`RequestGroupSearchContext` object, and caches the results so that we
can refer to them later on demand.
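A minimal sketch of the caching pattern, assuming a hypothetical
SearchContext class and a fake lookup function in place of the real
database query. Neither name comes from placement.

```python
class SearchContext:
    """Hypothetical stand-in for a per-request-group search context."""

    def __init__(self, member_of, lookup):
        self.member_of = member_of
        # Run the aggregate lookup once, up front, and keep the result.
        self.provider_ids = lookup(member_of)


calls = []


def fake_lookup(member_of):
    # Record each invocation so repeated queries would be visible.
    calls.append(member_of)
    return {1, 2, 3}


ctx = SearchContext([['agg1']], fake_lookup)
first = ctx.provider_ids
second = ctx.provider_ids
```

Both reads return the same cached set, and the lookup has run only
once, no matter how many callers consult the context afterwards.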
Change-Id: I93e1fa81a7345b651ea2d58dfa0d5508e726f43f
Story: 2005712
Task: 31038
To keep the `resource_provider.py` file as a pure module for the resource
provider object and to avoid a circular import that can occur in the
following refactoring, this patch moves the functions that search resource
providers for specific conditions from `resource_provider.py` to the
new file, `research_context.py`.
No functional change or optimization is included in this patch.
Change-Id: I7b217cae6db967b1cc7f1885fff67e4148893fc6
Story: 2005712
Task: 31038
There are cases where exactly the same database query is issued several
times in a request group, such as provider_ids_matching_aggregates().
This patch creates a new object, `RequestGroupSearchContext`, to cache
the results so that the same query is not issued again within a
request group.
The main refactoring to reduce the number of repeated queries will be
done in following patches.
This patch brings a slight optimization: it skips retrieving sharing
providers in the `use_same_provider=True` case.
Change-Id: Ifbaac9861f86d85a5bff58573c30a4cf957503d8
Story: 2005712
Task: 31038
In the early days of placement, the CORS middleware's config
options were registered out of band with other options, probably
due to efforts to use non-global config and avoid conflicts with
nova's use of the same settings. Now that placement always uses
non-global config, we can change setup back to something more
normal.
At the same time a gap in header-visibility defaults is corrected.
The updated test confirms header visibility. When the 'allow_headers'
configuration setting in fixtures/gabbits.py is removed, it fails.
Change-Id: I8481b3016de19e1617810cb3d3efa092560dfbb7
CORS configuration settings are available to placement, so we
should include them in the sample and docs.
Change-Id: I15587af6a302f87b4159c819a8046ab489b684ea
To further refine the tests being run during the nova functional
test job, remove the functional.db tests. These shouldn't be
interacting with placement and, because some of them use live
mysql or postgresql services for the migration tests, they are more
susceptible to intermittent failures when nodes are being slow.
Change-Id: I2cfc6cb74d3acf78043a20cc7254d1315d6fabf0
When ProviderSummary objects are created and then later serialized
to JSON, the traits attribute has been a list of Trait objects.
At no point in the processing of that attribute does any caller want
it to be a Trait. They are stored as Traits but used as strings.
Therefore, this change makes it so it is a list of strings, avoiding
a few different instances of translating one way or the other, keeping
the data in the format the system wants.
Change-Id: Ia9d81ce87111ec3496d10ed254f069c04b9bdf3c
While doing other work, I noticed that _check_traits_for_alloc_request
traverses a list of summaries multiple times to find the "right one".
This is a classic case of a list being the wrong data structure for
the job.
Instead, use the id of the request resource's resource provider to key
into the summaries dict directly to get the relevant summary.
Then, since we know the summary has the trait info we need, we don't
need to pass prov_traits to the method.
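The difference can be sketched with plain dicts standing in for the
summary objects. The function name has_required_traits and the data
layout are assumptions for illustration only.

```python
# Summaries keyed by resource provider id (assumed layout).
summaries = {
    1: {'name': 'cn1', 'traits': ['HW_CPU_X86_AVX2']},
    2: {'name': 'cn1_numa0', 'traits': ['HW_NUMA_ROOT']},
}


def has_required_traits(provider_id, required, summaries):
    """Check required traits against the provider's own summary."""
    # One dict lookup replaces a scan over a list of summaries.
    summary = summaries[provider_id]
    return set(required) <= set(summary['traits'])
```

Because the summary already carries the traits, no separate mapping
of provider traits has to be passed alongside it.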
Change-Id: Ic2f8f7bdc2984db2011cec329fb6f5b9efec5a0f
Since [1] we have decided to pin placement to the latest versions of
os-resource-classes and os-traits. Turns out we had gabbi tests that
would fail as soon as the os-resource-classes upper constraint was
bumped, alerting us that it was time to sync that requirement. This
commit adds a similar test for os-traits.
[1] https://review.opendev.org/#/c/658419/
Change-Id: I57e834311c73282a2e7a00b14823812cb6f3e1e6
In the review of change I84baff29505550f3f20069ad5817784d0d1aaea6,
two details that had been missed were identified:
* a writer context that could be a reader now that the changes
are done
* a database return being sufficiently "dict-ish" to use directly
when creating a named tuple.
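The second point can be sketched like this, with a plain dict
standing in for a database row that supports the mapping protocol.
The tuple name ProviderIds and its fields are hypothetical.

```python
import collections

ProviderIds = collections.namedtuple(
    'ProviderIds', 'id uuid parent_id root_id')

# A plain dict stands in for a database row that supports the mapping
# protocol (keys() and item access).
row = {'id': 2, 'uuid': 'numa0-uuid', 'parent_id': 1, 'root_id': 1}

# Unpack the row directly instead of copying each field by hand.
provider_ids = ProviderIds(**row)
```

This only works when the row's keys match the tuple's field names
exactly.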
Change-Id: Id07eee0b427e0d12fc429fd02e17706fb4447418
When root providers were added, a series of TODOs were
left in allocation candidate and resource provider data
handling. These TODOs should be removed once we are assured
that there are no null roots. That is the case now, given
that we had an online migration and a status check and now
(prior to this patch) there is a blocker migration and the
status check is an error instead of warning.
The release note for the change of the status check should
suffice for covering this. That is, this change makes what the
note says true.
Story: 2005613
Task: 30860
Change-Id: I84baff29505550f3f20069ad5817784d0d1aaea6
This adds an alembic migration script which will block a db_sync
if the online data migration has not been run.
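The check at the heart of such a blocker can be sketched with the
standard library's sqlite3 module. This is not the alembic script
itself: the exception name MigrationBlocked and the function name are
hypothetical, and the table and column names are assumed for the
sketch.

```python
import sqlite3


class MigrationBlocked(Exception):
    """Raised when un-migrated rows remain (hypothetical name)."""


def ensure_no_null_roots(conn):
    """Raise if any provider still lacks a root_provider_id."""
    count = conn.execute(
        'SELECT COUNT(*) FROM resource_providers '
        'WHERE root_provider_id IS NULL'
    ).fetchone()[0]
    if count:
        raise MigrationBlocked(
            '%d resource providers have no root; run the online data '
            'migration before db_sync' % count)
```

Raising from the upgrade step is what stops the sync, leaving the
schema at the previous revision until the data has been migrated.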
Change-Id: I74c49d286dfc62f49af24303ed1cb18489e7e89d
Story: 2005613
Task: 30921
We want to drop the REST API compatibility code for resource
providers with no root_provider_id in Train, so to start
we should make the related upgrade check a failure rather
than a warning.
Change-Id: Ifd3c84ea3348fc9e6653838d6fba4a5eb864f01e
Story: 2005613
Task: 30921
Add a 1.33 microversion to move from numeric suffixes to string
suffixes that can be up to 64 characters long, made from '-', '_', and
mixed-case alphanumerics. The format is shared between schema
and RequestGroup parsing.
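A sketch of a validator for the suffix format described above. The
pattern is an assumption derived from that description (1 to 64
characters from letters, digits, '-' and '_'), not a copy of the
shared definition; the names SUFFIX_PATTERN and is_valid_suffix are
hypothetical.

```python
import re

# Assumed pattern: 1 to 64 characters drawn from letters, digits,
# '-' and '_'.
SUFFIX_PATTERN = re.compile(r'[a-zA-Z0-9_-]{1,64}')


def is_valid_suffix(suffix):
    """Return True if the whole string matches the assumed format."""
    # fullmatch avoids accepting a trailing newline, which '$' would allow.
    return SUFFIX_PATTERN.fullmatch(suffix) is not None
```

Keeping the pattern in one place lets both schema validation and
query string parsing agree on what a suffix is.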
Docs, api-ref, api history and microversion upper limit are updated
to indicate the new form in the new microversion.
A release note is added.
Story: 2005575
Task: 30781
Change-Id: Ia44b0922d151695d406883262e891bd932536f38
Since some people may be looking for docs-related work,
add a link to the docs worklist. Stories in the placement
project group show up there if they have been tagged 'docs'.
Change-Id: Ie9ff8c3527c145b4cbc356444acc587931398d00
There is reasonably good debug logging when filtering results to
create a set of allocation candidates, but the latter half of the
processing which merges candidates, removes nested providers and
limits results is not well logged.
This change adds some debug logs in that handling to provide more
information. While the logs do report the sizes of the entities
involved, another important aspect of the logging is the timestamps,
which indicate the time spent between each message.
Change-Id: I98ce4cade9acd64285c5a65bd439e37cb6a308f3
Story: 2005647
Task: 30927
In the Rocky cycle, 'GET /allocation_candidates' became aware of
nested providers from microversion 1.29: the allocation requests can
contain multiple allocations from multiple resource providers in the
same tree.
To keep the behavior of microversions before 1.29, a filter was added
to exclude nested providers for requests unaware of the nested
architecture. However, that function, "_exclude_nested_providers()", is
very heavy and is executed when microversion < 1.29 even if there is no
nested provider in the environment.
This patch skips it if there is no nested provider.
Since _exclude_nested_providers() should be done before limiting
the candidates, this patch also moves it from the handler file to a
deeper layer.
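The short-circuit can be sketched as follows. The function names, the
want_nested flag and the parents mapping (provider name to parent
name, None for a root) are assumptions of this sketch, and the
expensive exclusion is passed in as a callable rather than
reimplemented.

```python
def has_nested_providers(parents):
    """Return True if any provider has a parent.

    `parents` maps a provider name to its parent name, or None for a root.
    """
    return any(parent is not None for parent in parents.values())


def filter_candidates(candidates, parents, want_nested, exclude_nested):
    """Apply the expensive exclusion only when it can change the result."""
    if want_nested or not has_nested_providers(parents):
        return candidates
    return exclude_nested(candidates, parents)
```

In an environment with only root providers the cheap check returns
early and the heavy function is never called.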
Change-Id: I4efdc65395e69a6d33fba927018d003cce26fa68
Story: 2005669
Task: 30980
Because os-traits and os-resource-classes are libraries that
provide enumerations, we usually want whatever the latest
version is, even in lower-constraints.txt. We haven't, however,
been good about keeping those up to date.
This updates both requirements.txt and lower-constraints.txt to
the latest versions of both os-traits and os-resource-classes.
No change is required in global upper-constraints. That already
has the latest.
Change-Id: I0bd0e8e071d6275c3e21556018d476c97e8533ae
Story: 2005651
Task: 30939