11321 Commits

Author SHA1 Message Date
Chris Dent
d38844e390 Implement allocation candidate mappings
In microversion 1.34 add a 'mappings' key to each allocation
request. Its value is a dict keyed by resource group suffixes
whose values are lists of the resource providers satisfying that
group.

To preserve symmetry, the mappings key may be sent back when
writing allocations so the schema for POST and PUT allocations
and POST /reshaper are updated.

api history, api-ref and reno are added

Change-Id: Ie78ed7e050416d4ccb62697ba608131038bb4303
Story: 2005575
Task: 33536
2019-06-12 21:19:14 +00:00
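
A minimal sketch of the 'mappings' structure described in d38844e390, assuming the satisfied groups arrive as (suffix, provider) pairs. The helper name `build_mappings`, the input shape, the provider names and the layout of the 'allocations' key are illustrative assumptions, not placement's code.

```python
from collections import defaultdict


def build_mappings(satisfied_by):
    """Group provider identifiers by the request group suffix they satisfy.

    `satisfied_by` is a hypothetical iterable of (suffix, provider) pairs.
    """
    mappings = defaultdict(list)
    for suffix, provider in satisfied_by:
        # Avoid listing the same provider twice under one suffix.
        if provider not in mappings[suffix]:
            mappings[suffix].append(provider)
    return dict(mappings)


# Hypothetical data: the unsuffixed group is keyed by the empty string.
pairs = [
    ("", "cn1"),
    ("_NET1", "cn1_numa0_pf0"),
    ("_NET2", "cn1_numa1_pf1"),
]

allocation_request = {
    # Assumed shape of the allocations part, for illustration only.
    "allocations": {
        "cn1": {"resources": {"VCPU": 2}},
        "cn1_numa0_pf0": {"resources": {"SRIOV_NET_VF": 1}},
        "cn1_numa1_pf1": {"resources": {"SRIOV_NET_VF": 1}},
    },
    "mappings": build_mappings(pairs),
}
```

Keeping the suffix as the key makes it possible to tell which provider satisfied which group even when two groups request the same resource class.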
Chris Dent
eb07913442 Prepare objects for allocation request mappings
We need to carry a 'suffix' through the process of generating
allocation requests in order to be able to present those suffixes
as a 'mappings' key in a forthcoming microversion.

In this patch, the suffix is tracked, but the data is not used
when presenting results.

A test is added to validate results. Existing tests which
check the count of results are adjusted to reflect the new results:
the number of possible combinations has increased because we are
accounting for the suffix as a differentiator.

Change-Id: I3fdd46a0a92bf9666696a1c5f98afc402cf43b33
Story: 2005575
Task: 33536
2019-06-12 21:19:04 +00:00
Zuul
b04a15cac7 Merge "Add NUMANetworkFixture for gabbits" 2019-06-12 14:58:01 +00:00
Chris Dent
01e699152b Remove incomplete consumer inline migrations
Remove _create_incomplete_consumers_for_provider and
_create_incomplete_consumer from placement.objects.allocation as
we now have a blocker migration which means there should no longer
be any incomplete consumers.

Tests are updated, with one test that was testing both the online and
inline migrations modified to simply confirm the online migration.

Change-Id: I5a29685d1e22c66e4d09f04198c011e71340a5d0
2019-06-12 12:09:13 +01:00
Chris Dent
221c65a701 Add a blocker migration for missing consumer records
There are several TODOs throughout the code about calls that
ensure consumer records exist. Those calls can be removed once
there is a blocker migration that will fail if there are missing
consumer records.

This patch adds that blocker migration. Subsequent patches
will remove the redundant code.

Change-Id: I6029c5095ed1e6ff7c46d480454db1382073bd57
2019-06-12 12:08:55 +01:00
Chris Dent
e1783b0087 Correctly limit provider summaries when nested
When making a GET /allocation_candidates request with the 'limit'
parameter, include all the providers in the same tree as the providers
mentioned in the allocation requests. That is, in a nested situation,
if we limit the allocation requests to 10, we would usually expect
considerably more than 10 providers in provider summaries, to include
any non-contributing providers in the same tree.

Accomplish this by filtering provider summaries on root_provider_uuid
instead of individual provider uuid.

A gabbi test is added which exercises this in a nested situation.

An existing test is updated to reflect the new filtering style. It
does not exercise the nested scenario.

A reno is added indicating the fix.

Change-Id: I136bd7cd89f1bd54f0d1691268545850af234f18
Story: 2005859
Task: 33654
2019-06-11 14:40:25 +01:00
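
A minimal sketch of the filtering idea in e1783b0087: once the allocation requests have been limited, keep every provider summary whose root matches the root of a provider that is still mentioned. The `Summary` tuple, the function name `limit_summaries` and the sample data are hypothetical.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a provider summary.
Summary = namedtuple("Summary", ["uuid", "root_provider_uuid"])


def limit_summaries(summaries, kept_provider_uuids):
    """Keep every summary sharing a tree with a provider that was kept."""
    by_uuid = {s.uuid: s for s in summaries}
    kept_roots = {by_uuid[u].root_provider_uuid for u in kept_provider_uuids}
    # Filter on the root, not on the individual provider uuid.
    return [s for s in summaries if s.root_provider_uuid in kept_roots]


summaries = [
    Summary("cn1", "cn1"),
    Summary("cn1_numa0", "cn1"),
    Summary("cn1_numa1", "cn1"),
    Summary("cn2", "cn2"),
]

# Only cn1_numa0 appears in the limited allocation requests.
kept = limit_summaries(summaries, {"cn1_numa0"})
print([s.uuid for s in kept])  # ['cn1', 'cn1_numa0', 'cn1_numa1']
```

Filtering on the individual uuid would have returned only cn1_numa0, dropping the non-contributing providers in the same tree.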
Eric Fried
88b6c816a8 Add NUMANetworkFixture for gabbits
Adds an APIFixture representing compute hosts with characteristics such
as:

* A root compute node provider with no resources (VCPU/MEMORY_MB in NUMA
  providers, DISK_GB provided by a sharing provider).
* NUMA nodes, providing VCPU and MEMORY_MB resources (with interesting
  min_unit and step_size values), both decorated with the HW_NUMA_ROOT
  trait, one decorated with a CUSTOM trait.
* Each NUMA node is associated with some devices.
* Two network agents, themselves devoid of resources, parenting
  different kinds of network devices.

* A more "normal" compute node provider with VCPU/MEMORY_MB/DISK_GB
  resources.
* Some NIC subtree roots, themselves devoid of resources, decorated with
  the HW_NIC_ROOT trait, parenting PF providers, on different physical
  networks, with VF resources.

* A sharing DISK_GB provider associated with both compute nodes. (It is
  the only thing that can provide DISK_GB to the first compute, but the
  second compute also has DISK_GB of its own.)

This will be used by subsequent patches to create test cases including
things like subtree affinity, resourceless request groups / providers,
can_split, etc.

Also includes a couple of tweaks to the base test helpers to make the
fixture easier to set up.

Change-Id: I65dbbff1d58a486941f085e89c8e32fc071b6d4b
Story: #2005575
2019-06-11 12:30:30 +01:00
Zuul
c919b1f568 Merge "Modernize CORS config and setup" 2019-06-10 17:12:41 +00:00
Zuul
f443cb902f Merge "Add oslo.middleware.cors to conf generator" 2019-06-10 11:40:07 +00:00
Zuul
9bb68ff1fc Merge "Stabilize AllocationRequest hash" 2019-06-08 03:12:25 +00:00
Chris Dent
81937773f5 Stabilize AllocationRequest hash
A set of AllocationRequest objects is extended in _merge_candidates,
controlled by the hash of the AllocationRequest, which is based on
a list of AllocationRequestResource objects.

It turns out that this hash can be different depending on the ordering
of the members of the list, with different behaviors in Python 2 and 3.
Therefore, a stable sort is needed to ensure the stability of the hash.

It's quite likely that a sort could have been done elsewhere, but there
are at least two places where resource_requests are added, with
different attributes available for sorting, so this contained
implementation was done instead.

Tests are added that confirm the expected behavior.

Note that the expected behavior will change when we add suffix
mappings to allocation requests. Whereas now
'test_nested_result_count_isolate' gets winnowed to 2 results because
the following are treated as duplicates

   [('cn1', orc.VCPU, 2),
    ('cn1_numa0_pf0', orc.SRIOV_NET_VF, 1),
    ('cn1_numa1_pf1', orc.SRIOV_NET_VF, 1)],
   [('cn1', orc.VCPU, 2),
    ('cn1_numa0_pf0', orc.SRIOV_NET_VF, 1),
    ('cn1_numa1_pf1', orc.SRIOV_NET_VF, 1)],

they are in fact slightly different: one set has pf0 satisfying group
_NET1 and pf1 _NET2, the other vice versa. With mappings available
we presumably want to expose this difference, resulting in more...
results.

An interesting note: The original bug report was that python 2.7 was
varying across 3 values, sometimes right, but mostly wrong and 3.7
was always right. Turns out that 3.7 was consistently wrong, and 2.7
was mostly right, but not always.

Change-Id: I4fce243659d5c429dc3d48f07888e38bd0aed5d4
Story: 2005822
Task: 33579
2019-06-07 15:01:38 +00:00
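
A minimal sketch of the problem and fix described in 81937773f5, assuming each resource request can be reduced to a sortable tuple. `ToyAllocationRequest` is a hypothetical stand-in, not the placement class: it sorts its members before hashing so that two requests holding the same members in a different order collapse to one entry in a set.

```python
class ToyAllocationRequest:
    """Hypothetical stand-in whose hash ignores the order of its members."""

    def __init__(self, resource_requests):
        # Each entry is a (provider, resource_class, amount) tuple.
        self.resource_requests = list(resource_requests)

    def _key(self):
        # Sorting first makes the key independent of insertion order.
        return tuple(sorted(self.resource_requests))

    def __eq__(self, other):
        return (isinstance(other, ToyAllocationRequest)
                and self._key() == other._key())

    def __hash__(self):
        return hash(self._key())


a = ToyAllocationRequest([
    ("cn1", "VCPU", 2),
    ("cn1_numa0_pf0", "SRIOV_NET_VF", 1),
])
b = ToyAllocationRequest([
    ("cn1_numa0_pf0", "SRIOV_NET_VF", 1),
    ("cn1", "VCPU", 2),
])

unique = {a, b}
print(len(unique))  # 1
```

Without the sort, the two tuples differ and the set would hold both objects, which is the kind of order-dependent result the commit message describes.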
Chris Dent
910b466c50 perfload with written allocations
One of the needs we've discussed for perfload is making sure
it is measuring when some inventory has been used.

Here, we change the perfload job so that it creates the 1000 providers,
measures getting allocation_candidates and then, in a loop of 99, gets
a limited set of candidates, writes the first one back as an allocation
for a random consumer, project and user. At each iteration it measures
again.

This will make the log file a lot longer, but that's not a significant
issue: the numbers that matter will either be near the top or near the
end. If they are weird, looking in the middle will be informative. We
can tweak it.

This, as usual, is one of many ways to accomplish gathering some data.
Other options might include parallelizing the writes, but in this case
we are trying to see the impact of code on a single request, not on
concurrency.

At some point we will want to add nested and sharing into this mix.

Change-Id: I74b64a25f2be8fbbd01b3a3b438bba68de04b269
2019-06-07 14:27:04 +00:00
Eric Fried
4cca0ee13c Bump os-traits to latest release (0.14.0)
os-traits 0.14.0 hit upper-constraints [1] so placement's lower bounds
and canary test need to be updated accordingly.

[1] https://review.opendev.org/663507

Change-Id: Ib6f08a681162101f91c0de4ce1d2eed35975ef96
2019-06-06 12:58:57 -05:00
Chris Dent
88c6ad9cb4 Optionally run a wsgi profiler when asked
This is the code used to run placement with the wsgi
profiling described in a blog post [1]. It has proven
useful enough that we may wish to include it in the
released code. It is added in a way that is off by
default and makes no changes to requirements.

Brief docs are included in testing.rst. They are
brief because it's assumed that someone who wants to
do this already mostly knows what they want to do and
merely needs the specifics on how to do it in this
environment.

[1] https://anticdent.org/profiling-wsgi-apps.html

Change-Id: I342512732b94bc19bd711684ba3ec9480cc51f81
2019-06-05 19:41:18 +00:00
Tetsuro Nakamura
7db53444fb Bump os-traits requirements
This patch bumps os-traits since os-traits 0.13.0 was released.

Change-Id: Idb972d9d905f2891cd412c0abefb7bff5524d04f
2019-06-02 21:09:14 +00:00
Zuul
fb57f23bf7 Merge "Resource provider - request group mapping in allocation candidate" 2019-06-04 03:24:37 +00:00
Balazs Gibizer
7049e40786 Resource provider - request group mapping in allocation candidate
To support QoS minimum bandwidth policy during server scheduling
Neutron needs to know which resource provider provides the bandwidth
resource for each port in the server create request.

Story: 2005575
Task: 30804

Change-Id: Iafdb0eab9b41f4c34c93cada08a4da27cf4b1499
2019-06-03 13:52:21 -05:00
Eric Fried
289130fc3d Bump openstackdocstheme to 1.30.0
...to pick up many improvements, including the return of table borders.

Change-Id: Ic3a43e2e35f361234f081c0e1f34729cf6560828
2019-05-30 18:08:16 -05:00
Tetsuro Nakamura
7b8e2a8a6b Reuse cache result for sharing providers capacity
Getting sharing providers in _get_providers_with_shared_capacity() was
heavier than the main function, _get_providers_with_resource(), which
searches for the requested resource, yet the two used similar algorithms
and were somewhat redundant.

This patch changes _get_providers_with_shared_capacity() to reuse the
result of _get_providers_with_resource() for performance optimization.

Change-Id: I190c13c61314da172623c7f9fded7e4ed3d83f80
Story: 2005712
Task: 31038
2019-05-29 15:22:38 -05:00
Tetsuro Nakamura
f8bbda15e7 Move seek providers with resource to context
This patch moves the resource look up to `RequestGroupSearchContext`
initialization. This enables us to reuse the results for
_get_providers_with_shared_capacity(). That optimization is coming
in a following patch.

Change-Id: I2796c3e93317edd3ae8b02fb038cc39f23b41840
Story: 2005712
Task: 31038
2019-05-29 15:22:38 -05:00
Tetsuro Nakamura
d75bdbff41 Remove normalize trait map func
_normalize_trait_map() was used in `GET /resource_providers` to
translate trait names to ids, but the parent patch
https://review.opendev.org/#/c/660049/ changed it to call the
translation, trait_obj.ids_from_names(), on demand.

Since it is no longer used, this patch removes that
_normalize_trait_map function.

Change-Id: I0c8d61bab848b3833752b213806c494891bf329c
2019-05-29 15:22:38 -05:00
Tetsuro Nakamura
7f4b79b7e1 Cache provider ids in requested aggregates
Getting allocation candidates for one request group, placement
internally calls provider_ids_matching_aggregates() several times with
the same arguments. This is redundant and can degrade performance.

This patch changes placement to call it once, on initialization of the
`RequestGroupSearchContext` object, and cache the results so that we can
refer to them later on demand.

Change-Id: I93e1fa81a7345b651ea2d58dfa0d5508e726f43f
Story: 2005712
Task: 31038
2019-05-29 15:22:38 -05:00
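
A minimal sketch of the caching approach in 7f4b79b7e1, assuming the expensive lookup can be passed in as a callable. `ToySearchContext` and `fake_lookup` are hypothetical names; the real `RequestGroupSearchContext` and provider_ids_matching_aggregates() have their own signatures, which are not reproduced here.

```python
class ToySearchContext:
    """Hypothetical per-request-group context that caches one lookup."""

    def __init__(self, member_of, lookup):
        self.member_of = member_of
        # Run the potentially expensive lookup once, at initialization.
        self.provider_ids_in_aggregates = lookup(member_of)


calls = []


def fake_lookup(member_of):
    """Stand-in for a database query; records how often it is called."""
    calls.append(member_of)
    return {1, 2, 3}


ctx = ToySearchContext(("agg1",), fake_lookup)

# Later code reads the cached attribute instead of querying again.
for _ in range(3):
    ids = ctx.provider_ids_in_aggregates

print(len(calls))  # 1
```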
Tetsuro Nakamura
fb71a6ab71 Move search functions to the research context file
To keep `resource_provider.py` file as a pure module for resource
provider object and to avoid circular import that can occur in the
following refactoring, this patch moves functions to search resource
providers for specific conditions from `resource_provider.py` to the
new file, `research_context.py`.

No functional change or optimization is included in this patch.

Change-Id: I7b217cae6db967b1cc7f1885fff67e4148893fc6
Story: 2005712
Task: 31038
2019-05-29 15:22:38 -05:00
Tetsuro Nakamura
daf7285a74 Add RequestGroupSearchContext class
There are cases where exactly the same database query is issued several
times in a request group, such as provider_ids_matching_aggregates().

This patch creates a new object, `RequestGroupSearchContext`, to cache
the results so that the same query is not issued repeatedly in a
request group.

The main refactoring to reduce the number of identical queries will be
done in following patches.

This patch brings a slight optimization: it skips retrieving sharing
providers in the `use_same_provider=True` case.

Change-Id: Ifbaac9861f86d85a5bff58573c30a4cf957503d8
Story: 2005712
Task: 31038
2019-05-29 15:22:38 -05:00
Chris Dent
fc35e3112e Modernize CORS config and setup
In the early days of placement, the CORS middleware's config
options were registered out of band with other options, probably
due to efforts to use non-global config and avoid conflicts with
nova's use of the same settings. Now that placement always uses
non-global config, we can change setup back to something more
normal.

At the same time a gap in header-visibility defaults is corrected.

The updated test confirms header visibility. When the 'allow_headers'
configuration setting in fixtures/gabbits.py is removed, it fails.

Change-Id: I8481b3016de19e1617810cb3d3efa092560dfbb7
2019-05-29 10:31:14 +01:00
Zuul
c81f62d501 Merge "Use trait strings in ProviderSummary objects" 2019-05-29 02:53:43 +00:00
Zuul
d1abad4e48 Merge "Avoid traversing summaries in _check_traits_for_alloc_request" 2019-05-29 02:53:42 +00:00
Zuul
b0d1bb61f7 Merge "Allow [a-zA-Z0-9_-]{1,64} for request group suffix" 2019-05-29 02:53:34 +00:00
Chris Dent
7db2e29325 Add oslo.middleware.cors to conf generator
CORS configuration settings are available to placement, so we
should include them in the sample and docs.

Change-Id: I15587af6a302f87b4159c819a8046ab489b684ea
2019-05-28 15:29:55 +01:00
Zuul
f517aeeb51 Merge "Don't run functional.db tests in nova functional run" 2019-05-24 19:13:59 +00:00
Chris Dent
e0b35bc3b8 Don't run functional.db tests in nova functional run
To further refine the tests being run during the nova functional
test job, remove the functional.db tests. These shouldn't be
interacting with placement and because some of them use live
mysql or postgresql services for the migration tests, are more
susceptible to intermittent failures when nodes are being slow.

Change-Id: I2cfc6cb74d3acf78043a20cc7254d1315d6fabf0
2019-05-24 14:00:58 +00:00
Tetsuro Nakamura
aeb65d4cd9 Trivial: Fix comment for LEFT join
This patch corrects some comments on SQL queries,
changing `LEFT` to `INNER`.

Change-Id: Ida78cfabe59a03d65f50545e0a6859abaa0e5d40
2019-05-24 14:00:20 +00:00
Chris Dent
5f4da5e040 Use trait strings in ProviderSummary objects
When ProviderSummary objects are created and then later serialized
to JSON the traits attribute has been a list of Trait objects.

At no point in the processing of that attribute does any caller want
it to be a Trait. They are stored as Trait objects but used as strings.

Therefore, this change makes it so it is a list of strings, avoiding
a few different instances of translating one way or the other, keeping
the data in the format the system wants.

Change-Id: Ia9d81ce87111ec3496d10ed254f069c04b9bdf3c
2019-05-24 13:50:26 +00:00
Chris Dent
346509f8fa Avoid traversing summaries in _check_traits_for_alloc_request
While doing other work, I noticed that _check_traits_for_alloc_request
traverses a list of summaries multiple times to find the "right one".
This is a classic case of a list being the wrong data structure for
the job.

Instead, use the id of the request resource's resource provider to key
into the summaries dict directly to get the relevant summary.

Then, since we know the summary has the trait info we need, we don't
need to pass prov_traits to the method.

Change-Id: Ic2f8f7bdc2984db2011cec329fb6f5b9efec5a0f
2019-05-24 13:50:16 +00:00
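
A minimal sketch of the data structure change in 346509f8fa, assuming summaries are stored in a dict keyed by provider id and carry their traits as strings. The function name `required_traits_present` and the sample data are hypothetical and much simpler than the real _check_traits_for_alloc_request.

```python
def required_traits_present(summaries_by_id, provider_id, required_traits):
    """Check traits via a direct dict lookup rather than a list scan."""
    summary = summaries_by_id[provider_id]
    return set(required_traits) <= set(summary["traits"])


# Hypothetical summaries keyed by resource provider id.
summaries_by_id = {
    1: {"traits": ["HW_NUMA_ROOT"]},
    2: {"traits": ["HW_NIC_ROOT", "CUSTOM_FOO"]},
}

print(required_traits_present(summaries_by_id, 2, ["CUSTOM_FOO"]))  # True
```

Because the summary already holds the trait information, no separate provider-to-traits argument is needed in this sketch.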
Zuul
d878d82b6b Merge "Canary test for os-traits version" 2019-05-24 13:04:50 +00:00
Eric Fried
7d39ee720c Canary test for os-traits version
Since [1] we have decided to pin placement to the latest versions of
os-resource-classes and os-traits. Turns out we had gabbi tests that
would fail as soon as the os-resource-classes upper constraint was
bumped, alerting us that it was time to sync that requirement. This
commit adds a similar test for os-traits.

[1] https://review.opendev.org/#/c/658419/

Change-Id: I57e834311c73282a2e7a00b14823812cb6f3e1e6
2019-05-23 18:00:44 -05:00
EdLeafe
cb28c8dda7 Fix typo in usage.yaml and usage-policy.yaml
I am confused as to how this test was passing. It was failing on my
machine.

Change-Id: I029fe0c84f5f91443aa1aa4c0b6c0b56465f92b7
2019-05-23 22:56:11 +00:00
Eric Fried
4bfffd7f0b Bump os-resource-classes requirements
os-resource-classes 0.4.0 was recently released [1] and upper
constraints were bumped [2]. Since [3] we have decided to always pin
placement to the latest versions of os-resource-classes and os-traits.
This patch makes it so, again.

[1] https://review.opendev.org/#/c/660419/
[2] https://review.opendev.org/#/c/660899/
[3] https://review.opendev.org/#/c/658419/

Change-Id: I1c02a926478190c8695746ed539fb6cc97a49945
2019-05-23 17:52:43 -05:00
Chris Dent
4d05bb8f0b Fixups from removing null provider protections
In the review of change I84baff29505550f3f20069ad5817784d0d1aaea6,
two details that had been missed were identified:

* a writer context that could be a reader now that the changes
  are done

* a database return being sufficiently "dict-ish" to use directly
  when creating a named tuple.

Change-Id: Id07eee0b427e0d12fc429fd02e17706fb4447418
2019-05-23 10:14:19 +01:00
Chris Dent
e0efa65e29 Remove null root provider protections
When root providers were added, a series of TODOs were
left in allocation candidate and resource provider data
handling. These TODOs should be removed once we are assured
that there are no null roots. That is the case now, given
that we had an online migration and a status check and now
(prior to this patch) there is a blocker migration and the
status check is an error instead of warning.

The release note for the change of the status check should
suffice for covering this. That is: it makes what it says,
True.

Story: 2005613
Task: 30860
Change-Id: I84baff29505550f3f20069ad5817784d0d1aaea6
2019-05-22 15:18:27 -04:00
Matt Riedemann
4606e55d19 Add blocker alembic migration for null root_provider_ids
This adds an alembic migration script which will block a db_sync
if the online data migration has not been run.

Change-Id: I74c49d286dfc62f49af24303ed1cb18489e7e89d
Story: 2005613
Task: 30921
2019-05-22 15:17:42 -04:00
Matt Riedemann
4af1df9408 Change "Missing Root Provider IDs" upgrade check to a failure
We want to drop the REST API compatibility code for resource
providers with no root_provider_id in Train, so to start
we should make the related upgrade check a failure rather
than a warning.

Change-Id: Ifd3c84ea3348fc9e6653838d6fba4a5eb864f01e
Story: 2005613
Task: 30921
2019-05-22 14:45:36 -04:00
Chris Dent
fb0f6f2608 Allow [a-zA-Z0-9_-]{1,64} for request group suffix
Add a 1.33 microversion to move from numeric suffixes to string
suffixes that can be up to 64 characters long, made from '-', '_', and
mixed-case alphanumeric. The format is shared between schema
and RequestGroup parsing.

Docs, api-ref, api history and microversion upper limit are updated
to indicate the new form in the new microversion.

A release note is added.

Story: 2005575
Task: 30781
Change-Id: Ia44b0922d151695d406883262e891bd932536f38
2019-05-21 11:07:38 +01:00
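
A minimal sketch of validating a suffix against the pattern named in fb0f6f2608, using Python's standard `re` module. The helper name `is_valid_suffix` is hypothetical; how placement shares the format between its schema and RequestGroup parsing is not shown here.

```python
import re

# The pattern from the commit title, matched against the whole string.
SUFFIX_RE = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def is_valid_suffix(suffix):
    """Return True if the suffix is 1 to 64 allowed characters."""
    return SUFFIX_RE.fullmatch(suffix) is not None


print(is_valid_suffix("_NET1"))       # True
print(is_valid_suffix("1"))           # True
print(is_valid_suffix("bad.suffix"))  # False
print(is_valid_suffix("a" * 65))      # False
```

Using `fullmatch` rather than `match` matters here: `match` would accept a string whose valid prefix is followed by disallowed characters.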
Chris Dent
e98b5df3d9 Add 'docs' worklist to worklist table
Since some people may be looking for docs-related work,
add a link to the docs worklist. Stories in the placement
project group show up there if they have been tagged 'docs'.

Change-Id: Ie9ff8c3527c145b4cbc356444acc587931398d00
2019-05-20 11:33:10 +01:00
Zuul
0f1a5dddf8 Merge "Enhance debug logging in allocation candidate handling" 2019-05-20 09:35:40 +00:00
Qiu Fossen
43c859b421 Cap sphinx for py2 to match global requirements
Change-Id: I7d95fedf44dd0504a57ffa4c4ecd54fccf7a0230
2019-05-17 05:06:10 -04:00
Chris Dent
73b29cd6e4 Enhance debug logging in allocation candidate handling
There is reasonably good debug logging when filtering results to
create a set of allocation candidates, but the latter half of the
processing which merges candidates, removes nested providers and
limits results is not well logged.

This change adds some debug logs in that handling to provide more
information. While the logs do report the sizes of the entities
involved another important factor of the logging is the timestamps
that indicate time spent between each message.

Change-Id: I98ce4cade9acd64285c5a65bd439e37cb6a308f3
Story: 2005647
Task: 30927
2019-05-16 11:45:37 +01:00
Zuul
1281806c99 Merge "Skip _exclude_nested_providers() if not nested" 2019-05-14 20:04:42 +00:00
Tetsuro Nakamura
727fb88dcc Skip _exclude_nested_providers() if not nested
In the Rocky cycle, 'GET /allocation_candidates' started to be aware of
nested providers from microversion 1.29, namely, it can have multiple
allocations from multiple resource providers in the same tree in the
allocation requests.

To keep the behavior of microversions before 1.29, it added a filter
to exclude nested providers for callers unaware of the nested
architecture. However, that function, "_exclude_nested_providers()",
is very heavy and is executed when microversion < 1.29 even if there
is no nested provider in the environment.

This patch changes it to skip it if there is no nested provider.

Since _exclude_nested_providers() should be done before limiting
the candidates, this patch also moves it from the handler file to the
deeper layer.

Change-Id: I4efdc65395e69a6d33fba927018d003cce26fa68
Story: 2005669
Task: 30980
2019-05-13 12:38:28 +00:00
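
A minimal sketch of the short-circuit described in 727fb88dcc, assuming a mapping from each provider to its root is available. The function names and the simplified filter rule (drop any request that uses a non-root provider) are assumptions made for illustration; the real _exclude_nested_providers() is more involved.

```python
def exclude_nested(requests, root_of):
    """Simplified filter: drop requests that use any non-root provider."""
    return [r for r in requests if all(root_of[p] == p for p in r)]


def filter_for_microversion(requests, root_of, nested_aware):
    """Run the expensive filter only when it can change the result."""
    has_nested = any(root != p for p, root in root_of.items())
    if nested_aware or not has_nested:
        # Nothing to exclude, so skip the heavy step entirely.
        return requests
    return exclude_nested(requests, root_of)


# No nested providers: the input is returned untouched.
flat_roots = {"cn1": "cn1", "cn2": "cn2"}
flat_requests = [["cn1"], ["cn2"]]
print(filter_for_microversion(flat_requests, flat_roots, False))

# A nested provider exists and the caller is not nested-aware.
nested_roots = {"cn1": "cn1", "cn1_numa0": "cn1"}
nested_requests = [["cn1"], ["cn1", "cn1_numa0"]]
print(filter_for_microversion(nested_requests, nested_roots, False))
```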
Chris Dent
fea9bad74d Raise os-traits os-resource-classes constraints
Because os-traits and os-resource-classes are libraries that
provide enumerations, we usually want whatever the latest
version is, even in lower-constraints.txt. We haven't, however,
been good about keeping those up to date.

This updates both requirements.txt and lower-constraints.txt to
the latest versions of both os-traits and os-resource-classes.

No change is required in global upper-constraints, which already
has the latest.

Change-Id: I0bd0e8e071d6275c3e21556018d476c97e8533ae
Story: 2005651
Task: 30939
2019-05-10 09:40:23 -07:00