This spec proposes to support negative filtering by the underlying resource provider's membership. Change-Id: I12d36993de58d75c639fc4eb3c9de6ad51d62b09 Blueprint: negative-aggregate-membership
13 KiB
Support filtering by forbidden aggregate membership
https://blueprints.launchpad.net/nova/+spec/negative-aggregate-membership
This blueprint proposes to support for negative filtering by the underlying resource provider's aggregate membership.
Problem description
Placement currently supports member_of query parameters
for the GET /resource_providers and
GET /allocation_candidates endpoints. This parameter is
either "a string representing an aggregate uuid" or "the prefix
in: followed by a comma-separated list of strings
representing aggregate uuids".
For example,
&member_of=in:<agg1>,<agg2>&member_of=<agg3>
would translate logically to:
Candidate resource providers should be in either agg1 or agg2, but definitely in agg3. (See alloc-candidates-member-of spec for details)
However, there is no expression for forbidden aggregates in the API. In other words, we have no way to say "don't use resource providers in this special aggregate for non-special workloads".
Use Cases
This feature is useful to save special resources for specific users.
Use Case 1
Some of the compute host are Licensed Windows Compute Host, meaning any VMs booted on this compute host will be considered as licensed Windows image and depending on the usage of VM, operator will charge it to the end-users. As an operator, I want to avoid booting images/volumes other than Windows OS on Licensed Windows Compute Host.
Use Case 2
Reservation projects like blazar would like to have its own aggregate for host reservation in order to have consumers without any reservations to be scheduled outside of that aggregate in order to save the reserved resources.
Proposed change
Adjust the handling of the member_of parameter so that
aggregates can be expressed as forbidden. Forbidden aggregates are
prefixed with a !.
In the following example,
&member_of=!<agg1>
would translate logically to:
Candidate resource providers should not be in agg1.
This negative expression can also be used in multiple
member_of parameters:
&member_of=in:<agg1>,<agg2>&member_of=<agg3>&member_of=!<agg4>
would translate logically to:
Candidate resource providers must be at least one of agg1 or agg2, definitely in agg3 and definitely not in agg4.
Note that we don't support ! in the in:
prefix:
&member_of=in:<agg1>,<agg2>,!<agg3>
would result in HTTP 400 Bad Request error.
Instead, we support !in: prefix:
&member_of=!in:<agg1>,<agg2>,<agg3>
which is equivalent to
member_of=!<agg1>&member_of=!<agg2>&member_of=!<agg3>
Nested resource providers
For nested resource providers, an aggregate on a root provider automatically spans the whole tree. When a root provider is in forbidden aggregates, the child providers can't be a candidate even if the child provider belongs to no (or another different) aggregate.
In the following environments, for example,
+-----------------------+
| sharing storage (ss1) |
| agg: [aggB] |
+-----------+-----------+
| aggB
+------------------------------+ +--------------|--------------+
| +--------------------------+ | | +------------+------------+ |
| | compute node (cn1) | | | |compute node (cn2) | |
| | agg: [aggA] | | | | agg: [aggB] | |
| +-----+-------------+------+ | | +----+-------------+------+ |
| | parent | parent | | | parent | parent |
| +-----+------+ +----+------+ | | +----+------+ +----+------+ |
| | numa1_1 | | numa1_2 | | | | numa2_1 | | numa2_2 | |
| | agg:[aggC]| | agg:[] | | | | agg:[] | | agg:[] | |
| +-----+------+ +-----------+ | | +-----------+ +-----------+ |
+-------|----------------------+ +-----------------------------+
| aggC
+-----+-----------------+
| sharing storage (ss2) |
| agg: [aggC] |
+-----------------------+
the exclusion constraint is as follows:
member_of=!<aggA>excludes "cn1", "numa1_1" and "numa1_2".member_of=!<aggB>excludes "cn2", "numa2_1", "numa2_2", and "ss1".member_of=!<aggC>excludes "numa1_1" and "ss2".
Note that this spanning doesn't happen on numbered
member_of parameters, which is used for the granular
request:
member_of<N>=!<aggA>excludes "cn1"member_of<N>=!<aggB>excludes "cn2" and "ss1"member_of<N>=!<aggC>excludes "numa1_1" and "ss2".
See granular-resource-request spec for details.
Alternatives
We can use forbidden traits to exclude specific resource providers, but if we use traits, then we should put Blazar or windows license trait not only on root providers but also on every resource providers in the tree, so we don't take this way.
We can also create nova scheduler filters to do post-processing of compute hosts by looking at host aggregate relationships just as BlazarFilter does today. However, this is inefficient and we don't want to develop/use another filter for the windows license use case.
Data model impact
None.
REST API impact
A new microversion will be created which will update the validation
for the member_of parameter on
GET /allocation_candidates and
GET /resource_providers to accept ! both as a
prefix on aggregate uuid and as a prefix on in: prefix to
express that the prefixed aggregate (or the aggregates) is required to
be excluded in the results.
Security impact
None.
Notifications impact
None.
Other end user impact
None.
Performance Impact
Queries to the database will see a moderate increase in complexity but existing table indexes should handle this with aplomb.
Other deployer impact
None.
Developer impact
This helps us to develop a simple reservation mechanism without having a specific nova filter, for example, via the following flow:
- Operator who wants to enable blazar sets default forbidden and
required membership key in the
nova.conf.- The parameter key in the configuration file is something like
[scheduler]/placement_req_default_forbidden_member_prefixand the value is set by the operator toreservation:. - The parameter key in the configuration file is something like
[scheduler]/placement_req_required_member_prefixand the value would is set by the operator toreservation:.
- The parameter key in the configuration file is something like
- Operator starts up the service and makes a host-pool for reservation
via blazar API
- Blazar makes an nova aggregate with
reservation:<random_id>metadata on initialization as a blazar's free pool - Blazar puts hosts specified by the operator into the free pool aggregate on demand
- Blazar makes an nova aggregate with
- User uses blazar to make a host reservation and to get the
reservation id
- Blazar picks up a host from the blazar's free pool
- Blazar creates a new nova aggregate for that reservation and set
that aggregate's metadata key to
reservation:<resv_id>and puts the reserved host into that aggregate
- User creates a VM with a flavor/image with
reservation:<resv_id>meta_data/extra_specs to consume the reservationNova finds in the flavor that the extra_spec has a key which starts with what is set in
[scheduler]/placement_req_required_member_prefix, and looks up the table for aggregates which has the specified metadata:required_prefix = CONF.scheduler.placement_req_required_member_prefix # required_prefix = 'reservation:' required_meta_data = get_flavor_extra_spec_starts_with(required_prefix) # required_meta_data = 'reservation:<resv_id>' required_aggs = aggs_whose_metadata_is(required_meta_data) # required_aggs = [<An aggregate for the reservation>]Nova finds out that the default forbidden aggregate metadata prefix, which is set in
[scheduler]/placement_req_default_forbidden_member_prefix, is explicitly via the flavor, so skip:default_forbidden_prefix = CONF.scheduler.placement_req_default_forbidden_member_prefix # default_forbidden_prefix = ['reservation:'] forbidden_aggs = set() if not get_flavor_extra_spec_starts_with(default_forbidden_prefix): # this is skipped because 'reservation:' is in the flavor in this case forbidden_aggs = aggs_whose_metadata_starts_with(default_forbidden_prefix)Nova calls placement with required and forbidden aggregates:
# We don't have forbidden aggregates in this case ?member_of=<required_aggs>
- User creates a VM with a flavor/image with no reservation, that is,
without
reservation:<resv_id>meta_data/extra_specs.Nova finds in the flavor that the extra_spec has no key which starts with what is set in
[scheduler]/placement_req_required_member_prefix, so no required aggregate is obtained:required_prefix = CONF.scheduler.placement_req_required_member_prefix # required_prefix = 'reservation:' required_meta_data = get_flavor_extra_spec_starts_with(required_prefix) # required_meta_data = '' required_aggs = aggs_whose_metadata_is(required_meta_data) # required_aggs = set()Nova looks up the table for default forbidden aggregates whose metadata starts with what is set in
[scheduler]/placement_req_default_forbidden_member_prefix:default_forbidden_prefix = CONF.scheduler.placement_req_default_forbidden_member_prefix # default_forbidden_prefix = ['reservation:'] forbidden_aggs = set() if not get_flavor_extra_spec_starts_with(default_forbidden_prefix): # This is not skipped now forbidden_aggs = aggs_whose_metadata_starts_with(default_forbidden_prefix) # forbidden_aggs = <blazar's free pool aggregates and the other reservation aggs>Nova calls placement with required and forbidden aggregates:
# We don't have required aggregates in this case ?member_of=!in:<forbidden_aggs>
Note that the change in the nova configuration file and change in the request filter is an example and out of the scope of this spec. An alternative for this is to let placement be aware of the default forbidden traits/aggregates (See the Bi-directional enforcement of traits spec). But we agreed that it is not placement but nova which is responsible for what traits/aggregate is forbidden/required for the instance.
Upgrade impact
None.
Implementation
Assignee(s)
- Primary assignee:
-
Tetsuro Nakamura (nakamura.tetsuro@lab.ntt.co.jp)
Work Items
- Update the
ResourceProviderList.get_all_by_filtersandAllocationCandidates.get_by_requestsmethods to change the database queries to filter on "not this aggregate". - Update the placement API handlers for
GET /resource_providersandGET /allocation_candidatesin a new microversion to pass the negative aggregates to the methods changed in the steps above, including input validation adjustments. - Add functional tests of the modified database queries.
- Add gabbi tests that express the new queries, both successful queries and those that should cause a 400 response.
- Release note for the API change.
- Update the microversion documents to indicate the new version.
- Update placement-api-ref to show the new query handling.
Dependencies
None.
Testing
Normal functional and unit testing.
Documentation Impact
Document the REST API microversion in the appropriate reference docs.
References
- alloc-candidates-member-of feature
- granular-resource-request feature