[placement] Enable limiting GET /allocation_candidates

This adds a 'limit' query parameter to GET /allocation_candidates, for
example: GET /allocation_candidates?resources=VCPU:1&limit=5

A 'limit' filter is added to the AllocationCandidates. If set, after
the database query has been run to create the allocation requests and
provider summaries, a slice or sample of the allocation requests is
taken to limit the results. The summaries are then filtered to only
include those in the allocation requests.

This approach avoids changing the generated SQL, the creation of which
is fairly complex, or the database tables. The amount of data queried
is still high in the extreme case, but the amount of data sent over the
wire (as JSON) is reduced. This is a trade-off that was discussed in
the spec and during its review. If it turns out that server-side memory
use is an issue we can investigate changing the SQL.

A configuration setting, [placement]/randomize_allocation_candidates,
is added to allow deployers to declare whether they want the results
returned in whatever order the database chooses or in a random order.
The default is "False", which is expected to preserve existing behavior
and impose a packing placement strategy.

When the config setting is combined with the limit parameter: if
"True", the limited results are a random sampling from the full
results; if "False", they are a slice from the front.

This is done as a new microversion, 1.16, with updates to docs, a reno
and adjustments to the api history doc.

Change-Id: I5f3d4f49c34fd3cd6b9d2e12b3c3c4cdcb409bec
Implements: bp allocation-candidates-limit
Chris Dent 2017-11-19 12:09:38 +00:00
parent 609ddc2244
commit 4a97bbd824
12 changed files with 235 additions and 27 deletions


@@ -194,12 +194,21 @@ def list_allocation_candidates(req):
     """
     context = req.environ['placement.context']
     want_version = req.environ[microversion.MICROVERSION_ENVIRON]
-    util.validate_query_params(req, schema.GET_SCHEMA_1_10)
+    get_schema = schema.GET_SCHEMA_1_10
+    if want_version.matches((1, 16)):
+        get_schema = schema.GET_SCHEMA_1_16
+    util.validate_query_params(req, get_schema)
     requests = util.parse_qs_request_groups(req.GET)
+    limit = req.GET.getall('limit')
+    # JSONschema has already confirmed that limit has the form
+    # of an integer.
+    if limit:
+        limit = int(limit[0])

     try:
-        cands = rp_obj.AllocationCandidates.get_by_requests(context, requests)
+        cands = rp_obj.AllocationCandidates.get_by_requests(context, requests,
+                                                            limit)
     except exception.ResourceClassNotFound as exc:
         raise webob.exc.HTTPBadRequest(
             _('Invalid resource class in resources parameter: %(error)s') %


@@ -57,6 +57,7 @@ VERSIONS = [
     '1.14',  # Adds parent and root provider UUID on resource provider
              # representation and 'in_tree' filter on GET /resource_providers
     '1.15',  # Include last-modified and cache-control headers
+    '1.16',  # Add 'limit' query parameter to GET /allocation_candidates
 ]


@@ -207,3 +207,10 @@ actual last modified time of the most recently modified associated database
 entity or the current time if there is no direct mapping to the database. In
 addition, 'cache-control: no-cache' headers are added where the 'last-modified'
 header has been added to prevent inadvertent caching of resources.
+
+1.16 Limit allocation candidates
+--------------------------------
+
+Add support for a ``limit`` query parameter when making a
+``GET /allocation_candidates`` request. The parameter accepts an integer
+value, `N`, which limits the maximum number of candidates returned.
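
As a hypothetical illustration of calling this API (not part of the
change), the sketch below issues a limited request with the Python
requests library. The endpoint URL and token are placeholder values; a
real deployment would obtain both from keystone.

import requests

PLACEMENT = 'http://placement.example.com/placement'  # assumed endpoint
HEADERS = {
    'x-auth-token': 'ADMIN_TOKEN',  # assumed credential
    'openstack-api-version': 'placement 1.16',
    'accept': 'application/json',
}

resp = requests.get(
    PLACEMENT + '/allocation_candidates',
    params={'resources': 'VCPU:1,MEMORY_MB:1024,DISK_GB:100', 'limit': 5},
    headers=HEADERS)
body = resp.json()
# At most 5 allocation requests come back, and provider_summaries only
# describes providers that appear in those requests.
print(len(body['allocation_requests']), len(body['provider_summaries']))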


@@ -11,6 +11,9 @@
 # under the License.
 """Placement API schemas for getting allocation candidates."""

+import copy
+
+
 # Represents the allowed query string parameters to the GET
 # /allocation_candidates API call
 GET_SCHEMA_1_10 = {
@@ -25,3 +28,15 @@ GET_SCHEMA_1_10 = {
     ],
     "additionalProperties": False,
 }
+
+
+# Add limit query parameter.
+GET_SCHEMA_1_16 = copy.deepcopy(GET_SCHEMA_1_10)
+GET_SCHEMA_1_16['properties']['limit'] = {
+    # A query parameter is always a string in webOb, but
+    # we'll handle integer here as well.
+    "type": ["integer", "string"],
+    "pattern": "^[1-9][0-9]*$",
+    "minimum": 1,
+    "minLength": 1
+}
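
The effect of that property definition can be checked on its own with
the jsonschema library (assumed to be installed). Query string values
always arrive as strings via WebOb, so the pattern/minLength branch is
the one normally exercised; the schema fragment below is copied from
the new GET_SCHEMA_1_16 entry above.

import jsonschema

limit_schema = {
    "type": ["integer", "string"],
    "pattern": "^[1-9][0-9]*$",
    "minimum": 1,
    "minLength": 1,
}

for value in ("5", "0", "-99", "cow"):
    try:
        jsonschema.validate(value, limit_schema)
        print(value, "-> accepted")
    except jsonschema.ValidationError:
        print(value, "-> rejected")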


@@ -49,7 +49,17 @@ Possible values:
                help="""
 Endpoint interface for this node. This is used when picking the URL in the
 service catalog.
-""")
+"""),
+    cfg.BoolOpt('randomize_allocation_candidates',
+                default=False,
+                help="""
+If True, when limiting allocation candidate results, the results will be
+a random sampling of the full result set. If False, allocation candidates
+are returned in a deterministic but undefined order. That is, all things
+being equal, two requests for allocation candidates will return the same
+results in the same order; but no guarantees are made as to how that order
+is determined.
+"""),
 ]

 deprecated_opts = {


@@ -13,6 +13,7 @@
 import collections
 import copy
 import itertools
+import random

 # NOTE(cdent): The resource provider objects are designed to never be
 # used over RPC. Remote manipulation is done with the placement HTTP
@@ -22,6 +23,7 @@ import itertools
 import os_traits
 from oslo_concurrency import lockutils
+from oslo_config import cfg
 from oslo_db import api as oslo_db_api
 from oslo_db import exception as db_exc
 from oslo_log import log as logging
@@ -56,6 +58,7 @@ _RC_CACHE = None
 _TRAIT_LOCK = 'trait_sync'
 _TRAITS_SYNCED = False

+CONF = cfg.CONF
 LOG = logging.getLogger(__name__)
@@ -3266,18 +3269,29 @@ class AllocationCandidates(base.NovaObject):
     }

     @classmethod
-    def get_by_requests(cls, context, requests):
+    def get_by_requests(cls, context, requests, limit=None):
         """Returns an AllocationCandidates object containing all resource
         providers matching a set of supplied resource constraints, with a set
         of allocation requests constructed from that list of resource
-        providers.
+        providers. If CONF.placement.randomize_allocation_candidates is True
+        (default is False) then the order of the allocation requests will
+        be randomized.

         :param requests: List of nova.api.openstack.placement.util.RequestGroup
+        :param limit: An integer, N, representing the maximum number of
+                      allocation candidates to return. If
+                      CONF.placement.randomize_allocation_candidates is True
+                      this will be a random sampling of N of the available
+                      results. If False then the first N results, in whatever
+                      order the database picked them, will be returned. In
+                      either case if there are fewer than N total results,
+                      all the results will be returned.
         """
         _ensure_rc_cache(context)
         _ensure_trait_sync(context)
         alloc_reqs, provider_summaries = cls._get_by_requests(context,
-                                                              requests)
+                                                              requests,
+                                                              limit)
         return cls(
             context,
             allocation_requests=alloc_reqs,
@@ -3286,7 +3300,7 @@ class AllocationCandidates(base.NovaObject):

     @staticmethod
     @db_api.api_context_manager.reader
-    def _get_by_requests(context, requests):
+    def _get_by_requests(context, requests, limit=None):
         # We first get the list of "root providers" that either have the
         # requested resources or are associated with the providers that
         # share one or more of the requested resource(s)
@@ -3339,22 +3353,58 @@
             # IDs.
             rp_ids = _get_provider_ids_matching_all(context, resources,
                                                     trait_map)
-            return _alloc_candidates_no_shared(context, resources, rp_ids)
+            alloc_request_objs, summary_objs = _alloc_candidates_no_shared(
+                context, resources, rp_ids)
+        else:
+            if trait_map:
+                trait_rps = _get_provider_ids_having_any_trait(context,
+                                                               trait_map)
+                if not trait_rps:
+                    # If there aren't any providers that have any of the
+                    # required traits, just exit early...
+                    return [], []
-        if trait_map:
-            trait_rps = _get_provider_ids_having_any_trait(context, trait_map)
-            if not trait_rps:
-                # If there aren't any providers that have any of the required
-                # traits, just exit early...
-                return [], []
+            # rp_ids contains a list of resource provider IDs that EITHER have
+            # all the requested resources themselves OR have some resources
+            # and are related to a provider that is sharing some resources
+            # with it. In other words, this is the list of resource provider
+            # IDs that are NOT sharing resources.
+            rps = _get_all_with_shared(context, resources)
+            rp_ids = set([r[0] for r in rps])
+            alloc_request_objs, summary_objs = _alloc_candidates_with_shared(
+                context, resources, trait_map, rp_ids, sharing_providers)
-        # rp_ids contains a list of resource provider IDs that EITHER have all
-        # the requested resources themselves OR have some resources and are
-        # related to a provider that is sharing some resources with it. In
-        # other words, this is the list of resource provider IDs that are NOT
-        # sharing resources.
-        rps = _get_all_with_shared(context, resources)
-        rp_ids = set([r[0] for r in rps])
+        # Limit the number of allocation request objects. We do this after
+        # creating all of them so that we can do a random slice without
+        # needing to mess with the complex sql above or add additional
+        # columns to the DB.
-        return _alloc_candidates_with_shared(context, resources, trait_map,
-                                             rp_ids, sharing_providers)
+        # Track the resource provider uuids that we have chosen so that
+        # we can pull out their summaries below.
+        alloc_req_rp_uuids = set()
+        if limit and limit <= len(alloc_request_objs):
+            if CONF.placement.randomize_allocation_candidates:
+                alloc_request_objs = random.sample(alloc_request_objs, limit)
+            else:
+                alloc_request_objs = alloc_request_objs[:limit]
+            # Extract resource provider uuids from the resource requests.
+            for aro in alloc_request_objs:
+                for arr in aro.resource_requests:
+                    alloc_req_rp_uuids.add(arr.resource_provider.uuid)
+        elif CONF.placement.randomize_allocation_candidates:
+            random.shuffle(alloc_request_objs)
+
+        # Limit summaries to only those mentioned in the allocation requests.
+        if limit and limit <= len(alloc_request_objs):
+            kept_summary_objs = []
+            for summary in summary_objs:
+                rp_uuid = summary.resource_provider.uuid
+                # Skip a summary if we are limiting and haven't selected an
+                # allocation request that uses the resource provider.
+                if rp_uuid not in alloc_req_rp_uuids:
+                    continue
+                kept_summary_objs.append(summary)
+        else:
+            kept_summary_objs = summary_objs
+
+        return alloc_request_objs, kept_summary_objs


@@ -43,6 +43,42 @@ tests:
   response_strings:
       - Invalid resource class in resources parameter

+- name: get bad limit microversion
+  GET: /allocation_candidates?resources=VCPU:1&limit=5
+  request_headers:
+      openstack-api-version: placement 1.15
+  status: 400
+  response_strings:
+      - Invalid query string parameters
+      - "'limit' was unexpected"
+
+- name: get bad limit type
+  GET: /allocation_candidates?resources=VCPU:1&limit=cow
+  request_headers:
+      openstack-api-version: placement 1.16
+  status: 400
+  response_strings:
+      - Invalid query string parameters
+      - "Failed validating 'pattern'"
+
+- name: get bad limit value negative
+  GET: /allocation_candidates?resources=VCPU:1&limit=-99
+  request_headers:
+      openstack-api-version: placement 1.16
+  status: 400
+  response_strings:
+      - Invalid query string parameters
+      - "Failed validating 'pattern'"
+
+- name: get bad limit value zero
+  GET: /allocation_candidates?resources=VCPU:1&limit=0
+  request_headers:
+      openstack-api-version: placement 1.16
+  status: 400
+  response_strings:
+      - Invalid query string parameters
+      - "Failed validating 'pattern'"
+
 - name: get allocation candidates no allocations yet
   GET: /allocation_candidates?resources=VCPU:1,MEMORY_MB:1024,DISK_GB:100
   status: 200
@@ -126,3 +162,11 @@ tests:
       cache-control: no-cache
       # Does last-modified look like a legit timestamp?
      last-modified: /^\w+, \d+ \w+ \d{4} [\d:]+ GMT$/
+
+- name: get allocation candidates with limit
+  GET: /allocation_candidates?resources=VCPU:1,MEMORY_MB:1024,DISK_GB:100&limit=1
+  status: 200
+  request_headers:
+      openstack-api-version: placement 1.16
+  response_json_paths:
+      $.allocation_requests.`len`: 1


@@ -39,13 +39,13 @@ tests:
   response_json_paths:
       $.errors[0].title: Not Acceptable

-- name: latest microversion is 1.15
+- name: latest microversion is 1.16
   GET: /
   request_headers:
       openstack-api-version: placement latest
   response_headers:
       vary: /OpenStack-API-Version/
-      openstack-api-version: placement 1.15
+      openstack-api-version: placement 1.16

 - name: other accept header bad version
   GET: /


@@ -335,12 +335,13 @@ class AllocationCandidatesTestCase(ProviderDBBase):
         # _validate_allocation_requests to make failure results more readable.
         self.rp_uuid_to_name = {}

-    def _get_allocation_candidates(self, requests=None):
+    def _get_allocation_candidates(self, requests=None, limit=None):
         if requests is None:
             requests = [placement_lib.RequestGroup(
                 use_same_provider=False,
                 resources=self.requested_resources)]
-        return rp_obj.AllocationCandidates.get_by_requests(self.ctx, requests)
+        return rp_obj.AllocationCandidates.get_by_requests(self.ctx, requests,
+                                                           limit)

     def _validate_allocation_requests(self, expected, candidates):
         """Assert correctness of allocation requests in allocation candidates.
@@ -508,6 +509,57 @@ class AllocationCandidatesTestCase(ProviderDBBase):
         self.assertEqual(1, len(cn2_p_sum.traits))
         self.assertEqual(os_traits.HW_CPU_X86_AVX2, cn2_p_sum.traits[0].name)

+    def test_all_local_limit(self):
+        """Create some resource providers that can satisfy the request for
+        resources with local (non-shared) resources, limit them, and verify
+        that the allocation requests returned by AllocationCandidates
+        correspond with each of these resource providers.
+        """
+        # Create three compute node providers with VCPU, RAM and local disk
+        for name in ('cn1', 'cn2', 'cn3'):
+            cn = self._create_provider(name)
+            _add_inventory(cn, fields.ResourceClass.VCPU, 24,
+                           allocation_ratio=16.0)
+            _add_inventory(cn, fields.ResourceClass.MEMORY_MB, 32768,
+                           min_unit=64, step_size=64, allocation_ratio=1.5)
+            total_gb = 1000 if name == 'cn3' else 2000
+            _add_inventory(cn, fields.ResourceClass.DISK_GB, total_gb,
+                           reserved=100, min_unit=10, step_size=10,
+                           allocation_ratio=1.0)
+
+        # Ask for just one candidate.
+        limit = 1
+        alloc_cands = self._get_allocation_candidates(limit=limit)
+        allocation_requests = alloc_cands.allocation_requests
+        self.assertEqual(limit, len(allocation_requests))
+
+        # provider summaries should have only one rp
+        self.assertEqual(limit, len(alloc_cands.provider_summaries))
+
+        # Do it again, with conf set to randomize. We can't confirm the
+        # random-ness but we can be sure the code path doesn't explode.
+        self.flags(randomize_allocation_candidates=True, group='placement')
+
+        # Ask for two candidates.
+        limit = 2
+        alloc_cands = self._get_allocation_candidates(limit=limit)
+        allocation_requests = alloc_cands.allocation_requests
+        self.assertEqual(limit, len(allocation_requests))
+
+        # provider summaries should have two rps
+        self.assertEqual(limit, len(alloc_cands.provider_summaries))
+
+        # Do it again, asking for more than are available.
+        limit = 5
+        # We still only expect 2 because cn3 does not match default requests.
+        expected_length = 2
+        alloc_cands = self._get_allocation_candidates(limit=limit)
+        allocation_requests = alloc_cands.allocation_requests
+        self.assertEqual(expected_length, len(allocation_requests))
+
+        # provider summaries should have two rps
+        self.assertEqual(expected_length, len(alloc_cands.provider_summaries))
+
     def test_local_with_shared_disk(self):
         """Create some resource providers that can satisfy the request for
         resources with local VCPU and MEMORY_MB but rely on a shared storage


@@ -30,6 +30,7 @@ Request
 .. rest_parameters:: parameters.yaml

   - resources: resources_query_required
+  - limit: allocation_candidates_limit

 Response (microversions 1.12 - )
 --------------------------------


@@ -42,6 +42,14 @@ trait_name:
     The name of a trait.

 # variables in query
+allocation_candidates_limit:
+  type: integer
+  in: query
+  required: false
+  min_version: 1.16
+  description: >
+    A positive integer used to limit the maximum number of allocation
+    candidates returned in the response.
 member_of:
   type: string
   in: query


@@ -0,0 +1,11 @@
+---
+features:
+  - |
+    Add support, in new placement microversion 1.16, for a ``limit`` query
+    parameter when making a ``GET /allocation_candidates`` request. The
+    parameter accepts an integer value, `N`, which limits the number of
+    candidates returned. A new configuration item
+    ``[placement]/randomize_allocation_candidates``, defaulting to `False`,
+    controls how the limited results are chosen. If `True`, a random sampling
+    of the entire result set is taken, otherwise the first N results are
+    returned.