Provide a direct interface to placement

This uses wsgi-intercept to provide a context manager that allows
talking to placement over HTTP (via the requests library), but
without a network. It is a quick and dirty way to talk to, and make
changes in, the placement database, where the only network traffic
is the connection to that database.

This is expected to be useful in the creation of tools for
performing fast forward upgrades, where each compute node may need
to "migrate" its resource providers, inventory and allocations in
the face of changing representations of hardware (for example
pre-existing VGPUs being represented as nested providers), but would
like to do so while all non-database services are stopped. A system
like this would allow code on the compute node to update the
placement database, using well-known HTTP interactions, without the
placement service being up.

The basic idea is that we spin up the WSGI stack with no auth,
configured using whatever already-loaded CONF we happen to have
available. That CONF points to the placement database and all the
usual stuff. The context manager provides a keystoneauth1 Adapter
instance that operates as a client for accessing placement. The full
WSGI stack is brought up because we need the various bits of
middleware to ensure that policy calls don't explode and that JSON
validation is in place.
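
As a rough sketch of that flow (assuming CONF has already been
loaded by the caller and points at the placement database; the GET
mirrors the version check in test_direct):

    from oslo_config import cfg

    from nova.api.openstack.placement import direct

    CONF = cfg.CONF  # assumed to be configured by the caller

    # The Adapter yielded here behaves like any keystoneauth1 client,
    # but requests are served by the in-process placement WSGI stack.
    with direct.PlacementDirect(CONF) as client:
        resp = client.get('/')
        print(resp.json()['versions'][0]['id'])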

In this model everything else is left up to the caller: constructing
the JSON, choosing which URIs to call with what methods (see
test_direct for minimal examples that ought to give an idea of what
real callers could expect).
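
For example, creating and then listing a resource provider looks
roughly like this (a sketch reusing the CONF and import from the
snippet above; it mirrors test_create_resource_provider and the
provider name is arbitrary):

    with direct.PlacementDirect(CONF) as client:
        # The caller builds the body and picks the URI and method itself.
        resp = client.post('/resource_providers', json={'name': 'fake'},
                           raise_exc=False)
        if not resp:
            raise RuntimeError('create failed: %s' % resp.text)
        providers = client.get('/resource_providers').json()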

To make things friendly in the nova context and to ease creation of
fast forward upgrade tools, SchedulerReportClient is tweaked to take
an optional adapter kwarg on construction. If specified, it is used
instead of creating an Adapter with get_ksa_adapter() from the
settings in the [placement] conf section.
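
A fast forward upgrade tool could therefore reuse the full report
client against the direct interface, roughly like this (a sketch:
the adapter kwarg and PlacementDirect are from this change, the
surrounding code is illustrative):

    from nova.api.openstack.placement import direct
    from nova.scheduler.client import report

    with direct.PlacementDirect(CONF, latest_microversion=True) as adap:
        # The adapter kwarg bypasses get_ksa_adapter() and [placement] conf.
        client = report.SchedulerReportClient(adapter=adap)
        # Normal report client calls now hit the placement database
        # directly, with no placement service running.
        resp = client.get('/resource_providers')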

Doing things in this way draws a clear line between the placement parts
and the nova parts while keeping the nova parts straightforward.

NoAuthReportClient is replaced with a base test class,
test_report_client.SchedulerReportClientTestBase. This provides an
_interceptor() context manager that wraps PlacementDirect but,
instead of producing an Adapter, produces a SchedulerReportClient
(which has been passed the Adapter provided by PlacementDirect).
test_resource_tracker and test_report_client are updated
accordingly.
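
A test built on the new base class ends up looking roughly like the
following (an illustrative sketch, not code from this change):

    from nova.tests.functional import test_report_client as test_base


    class MyReportClientTest(test_base.SchedulerReportClientTestBase):

        def _set_client(self, client):
            # Hook invoked by _interceptor(): point whatever is under test
            # at the direct SchedulerReportClient. Here we just keep a ref.
            self.client = client

        def test_listing_providers_works(self):
            with self._interceptor() as client:
                # client is a SchedulerReportClient whose Adapter talks to
                # the in-process placement stack from PlacementDirect.
                resp = client.get('/resource_providers')
                self.assertEqual(200, resp.status_code)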

Caveats to be aware of:

* This is (intentionally) set up to circumvent authentication and
  authorization. If you have access to the necessary database
  connection string, then you are good to go. That's what we want,
  right?

* CONF construction is left up to the caller on purpose: right now
  placement itself is not super flexible in this area, and
  flexibility is desired here.

This is not (by a long shot) the only way to do this. Other options
include:

* Constructing a WSGI environ that has all the necessary bits to
  allow calling the methods in the handlers directly (as python
  commands).  This would duplicate a fair bit of the middleware and
  seems error prone, because it's hard to discern what parts of the
  environ need to be filled. It's also weird for data input: we need
  to use a BytesIO to pass in data on PUTs and POSTs.

* Using either the WSGI environ or the wsgi-intercept model, but
  wrapping it with a pythonic library that exposes a "pretty"
  interface to callers. Something like:

      placement.direct.allocations.update(consumer_uuid, {data})

* Creating a python library that assembles the necessary data for
  calling the methods on the resource provider objects, and exposing
  that to:
  a) the callers who want this direct stuff
  b) the existing handlers in placement (which remain responsible
     for JSON manipulation, validation and microversion handling,
     and marshal data appropriately for the python lib)

I've chosen the simplest thing as a starting point because it gives
us something to talk over and could solve the immediate problem. If
we were to eventually pursue the last option (the python library
over the resource provider objects), I would hope that we had some
significant discussion before doing so, as I think it is a) harder
than it might seem at first glance, and b) likely to lead to many
asking "why bother with the HTTP interface at all?". Both points
require thought.

Partially implements blueprint reshape-provider-tree
Co-Authored-By: Eric Fried <efried@us.ibm.com>
Change-Id: I075785abcd4f4a8e180959daeadf215b9cd175c8


@ -0,0 +1,96 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
"""Call any URI in the placement service directly without real HTTP.
This is useful for those cases where processes wish to manipulate the
Placement datastore but do not want to run Placement as a long running
service. A PlacementDirect context manager is provided. Within that
HTTP requests may be made as normal but they will not actually traverse
a real socket.
"""
from keystoneauth1 import adapter
from keystoneauth1 import session
import mock
from oslo_utils import uuidutils
import requests
from wsgi_intercept import interceptor
from nova.api.openstack.placement import deploy
class PlacementDirect(interceptor.RequestsInterceptor):
"""Provide access to the placement service without real HTTP.
wsgi-intercept is used to provide a keystoneauth1 Adapter that has access
to an in-process placement service. This provides access to making changes
to the placement database without requiring HTTP over the network - it
remains in-process.
Authentication to the service is turned off; admin access is assumed.
Access is provided via a context manager which is responsible for
turning the wsgi-intercept on and off, and setting and removing
mocks required to keystoneauth1 to work around endpoint discovery.
Example::
with PlacementDirect(cfg.CONF, latest_microversion=True) as client:
allocations = client.get('/allocations/%s' % consumer)
:param conf: An oslo config with the options used to configure
the placement service (notably database connection
string).
:param latest_microversion: If True, API requests will use the latest
microversion if not otherwise specified. If
False (the default), the base microversion is
the default.
"""
def __init__(self, conf, latest_microversion=False):
conf.set_override('auth_strategy', 'noauth2', group='api')
app = lambda: deploy.loadapp(conf)
self.url = 'http://%s/placement' % str(uuidutils.generate_uuid())
# Supply our own session so the wsgi-intercept can intercept
# the right thing.
request_session = requests.Session()
headers = {
'x-auth-token': 'admin',
}
# TODO(efried): See below
if latest_microversion:
headers['OpenStack-API-Version'] = 'placement latest'
# TODO(efried): Set raise_exc globally when
# https://review.openstack.org/#/c/574784/ is released.
self.adapter = adapter.Adapter(
session.Session(auth=None, session=request_session,
additional_headers=headers),
service_type='placement')
# TODO(efried): Figure out why this isn't working:
# default_microversion='latest' if latest_microversion else None)
self._mocked_endpoint = mock.patch(
'keystoneauth1.session.Session.get_endpoint',
new=mock.Mock(return_value=self.url))
super(PlacementDirect, self).__init__(app, url=self.url)
def __enter__(self):
"""Start the wsgi-intercept interceptor and keystone endpoint mock.
A no auth ksa Adapter is provided to the context being managed.
"""
super(PlacementDirect, self).__enter__()
self._mocked_endpoint.start()
return self.adapter
def __exit__(self, *exc):
self._mocked_endpoint.stop()
return super(PlacementDirect, self).__exit__(*exc)
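
Tying the module above back to the upgrade use case in the commit
message: a reshape-style tool could use it to create nested providers
(the microversion 1.14 behaviour exercised in test_direct below)
while placement itself is down. A sketch only; the provider name and
parent UUID are invented, and CONF setup is left to the calling tool
per the caveats above:

    from oslo_config import cfg

    from nova.api.openstack.placement import direct

    CONF = cfg.CONF  # assumed to be configured by the calling tool


    def create_child_provider(conf, parent_uuid, name):
        """Create a nested (e.g. VGPU) provider under an existing one."""
        with direct.PlacementDirect(conf, latest_microversion=True) as client:
            body = {'name': name, 'parent_provider_uuid': parent_uuid}
            # raise_exc=False so the failure body can be reported directly.
            resp = client.post('/resource_providers', json=body,
                               raise_exc=False)
            if not resp:
                raise RuntimeError('creating %s failed: %s' % (name,
                                                               resp.text))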


@ -244,7 +244,14 @@ def get_placement_request_id(response):
 class SchedulerReportClient(object):
     """Client class for updating the scheduler."""
 
-    def __init__(self):
+    def __init__(self, adapter=None):
+        """Initialize the report client.
+
+        :param adapter: A prepared keystoneauth1 Adapter for API communication.
+                If unspecified, one is created based on config options in the
+                [placement] section.
+        """
+        self._adapter = adapter
         # An object that contains a nova-compute-side cache of resource
         # provider and inventory information
         self._provider_tree = provider_tree.ProviderTree()
@ -260,7 +267,7 @@ class SchedulerReportClient(object):
         # Flush provider tree and associations so we start from a clean slate.
         self._provider_tree = provider_tree.ProviderTree()
         self._association_refresh_time = {}
-        client = utils.get_ksa_adapter('placement')
+        client = self._adapter or utils.get_ksa_adapter('placement')
         # Set accept header on every request to ensure we notify placement
         # service of our response body media type preferences.
         client.additional_headers = {'accept': 'application/json'}


@ -0,0 +1,98 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
from oslo_config import cfg

from nova.api.openstack.placement import direct
from nova.api.openstack.placement.objects import resource_provider
from nova import context
from nova import test
from nova.tests import fixtures
from nova.tests import uuidsentinel

CONF = cfg.CONF


# FIXME(cdent): some dupes with db/test_base.py
class TestDirect(test.NoDBTestCase):
    USES_DB_SELF = True

    def setUp(self):
        super(TestDirect, self).setUp()
        self.api_db = self.useFixture(fixtures.Database(database='api'))
        self._reset_traits_synced()
        self.context = context.get_admin_context()
        self.addCleanup(self._reset_traits_synced)

    @staticmethod
    def _reset_traits_synced():
        """Reset the _TRAITS_SYNCED boolean to base state."""
        resource_provider._TRAITS_SYNCED = False

    def test_direct_is_there(self):
        with direct.PlacementDirect(CONF) as client:
            resp = client.get('/')
            self.assertTrue(resp)
            data = resp.json()
            self.assertEqual('v1.0', data['versions'][0]['id'])

    def test_get_resource_providers(self):
        with direct.PlacementDirect(CONF) as client:
            resp = client.get('/resource_providers')
            self.assertTrue(resp)
            data = resp.json()
            self.assertEqual([], data['resource_providers'])

    def test_create_resource_provider(self):
        data = {'name': 'fake'}
        with direct.PlacementDirect(CONF) as client:
            resp = client.post('/resource_providers', json=data)
            self.assertTrue(resp)
            resp = client.get('/resource_providers')
            self.assertTrue(resp)
            data = resp.json()
            self.assertEqual(1, len(data['resource_providers']))

    def test_json_validation_happens(self):
        data = {'name': 'fake', 'cowsay': 'moo'}
        with direct.PlacementDirect(CONF) as client:
            # TODO(efried): Set raise_exc globally when
            # https://review.openstack.org/#/c/574784/ is released.
            resp = client.post('/resource_providers', json=data,
                               raise_exc=False)
            self.assertFalse(resp)
            self.assertEqual(400, resp.status_code)

    def test_microversion_handling(self):
        with direct.PlacementDirect(CONF) as client:
            # create parent
            parent_data = {'name': uuidsentinel.p_rp,
                           'uuid': uuidsentinel.p_rp}
            resp = client.post('/resource_providers', json=parent_data)
            self.assertTrue(resp, resp.text)

            # attempt to create child
            data = {'name': 'child', 'parent_provider_uuid': uuidsentinel.p_rp}

            # no microversion, 400
            resp = client.post('/resource_providers', json=data,
                               raise_exc=False)
            self.assertFalse(resp)
            self.assertEqual(400, resp.status_code)
            # low microversion, 400
            resp = client.post('/resource_providers', json=data,
                               raise_exc=False, microversion='1.13')
            self.assertFalse(resp)
            self.assertEqual(400, resp.status_code)
            resp = client.post('/resource_providers', json=data,
                               microversion='1.14')
            self.assertTrue(resp, resp.text)


@ -11,9 +11,7 @@
 # under the License.
 
 import mock
-from wsgi_intercept import interceptor
 
-from nova.api.openstack.placement import deploy
 from nova.compute import power_state
 from nova.compute import resource_tracker
 from nova.compute import task_states
@ -22,8 +20,7 @@ from nova import conf
 from nova import context
 from nova import objects
 from nova import rc_fields as fields
-from nova import test
-from nova.tests.functional import test_report_client
+from nova.tests.functional import test_report_client as test_base
 from nova.tests import uuidsentinel as uuids
 from nova.virt import driver as virt_driver
 
@ -34,7 +31,7 @@ DISK_GB = fields.ResourceClass.DISK_GB
 COMPUTE_HOST = 'compute-host'
 
 
-class IronicResourceTrackerTest(test.TestCase):
+class IronicResourceTrackerTest(test_base.SchedulerReportClientTestBase):
     """Tests the behaviour of the resource tracker with regards to the
     transitional period between adding support for custom resource classes in
     the placement API and integrating inventory and allocation records for
@ -130,9 +127,16 @@ class IronicResourceTrackerTest(test.TestCase):
         ),
     }
 
+    def _set_client(self, client):
+        """Set up embedded report clients to use the direct one from the
+        interceptor.
+        """
+        self.report_client = client
+        self.rt.scheduler_client.reportclient = client
+        self.rt.reportclient = client
+
     def setUp(self):
         super(IronicResourceTrackerTest, self).setUp()
-        self.flags(auth_strategy='noauth2', group='api')
         self.flags(
             reserved_host_memory_mb=0,
             cpu_allocation_ratio=1.0,
@ -141,17 +145,12 @@ class IronicResourceTrackerTest(test.TestCase):
         )
 
         self.ctx = context.RequestContext('user', 'project')
 
-        self.app = lambda: deploy.loadapp(CONF)
-        self.report_client = test_report_client.NoAuthReportClient()
         driver = mock.MagicMock(autospec=virt_driver.ComputeDriver)
         driver.node_is_available.return_value = True
         driver.update_provider_tree.side_effect = NotImplementedError
         self.driver_mock = driver
         self.rt = resource_tracker.ResourceTracker(COMPUTE_HOST, driver)
-        self.rt.scheduler_client.reportclient = self.report_client
-        self.rt.reportclient = self.report_client
-        self.url = 'http://localhost/placement'
         self.create_fixtures()
 
     def create_fixtures(self):
def create_fixtures(self):
@ -198,22 +197,16 @@ class IronicResourceTrackerTest(test.TestCase):
                     if rc['name'] not in fields.ResourceClass.STANDARD]
 
     @mock.patch('nova.compute.utils.is_volume_backed_instance',
-                return_value=False)
-    @mock.patch('nova.objects.compute_node.ComputeNode.save')
-    @mock.patch('keystoneauth1.session.Session.get_auth_headers',
-                return_value={'x-auth-token': 'admin'})
-    @mock.patch('keystoneauth1.session.Session.get_endpoint',
-                return_value='http://localhost/placement')
-    def test_ironic_ocata_to_pike(self, mock_vbi, mock_endpoint, mock_auth,
-                                  mock_cn):
+                new=mock.Mock(return_value=False))
+    @mock.patch('nova.objects.compute_node.ComputeNode.save', new=mock.Mock())
+    def test_ironic_ocata_to_pike(self):
         """Check that when going from an Ocata installation with Ironic having
         node's resource class attributes set, that we properly "auto-heal" the
         inventory and allocation records in the placement API to account for
         both the old-style VCPU/MEMORY_MB/DISK_GB resources as well as the new
         custom resource class from Ironic's node.resource_class attribute.
         """
-        with interceptor.RequestsInterceptor(
-                app=self.app, url=self.url):
+        with self._interceptor():
             # Before the resource tracker is "initialized", we shouldn't have
             # any compute nodes in the RT's cache...
             self.assertEqual(0, len(self.rt.compute_nodes))


@ -11,17 +11,9 @@
 # License for the specific language governing permissions and limitations
 # under the License.
 
-from keystoneauth1 import adapter
-from keystoneauth1 import session
 import mock
-import requests
-from wsgi_intercept import interceptor
 
-# NOTE(cdent): When placement is extracted, placement will need to
-# expose a fixture of some kind which operates as deploy does here,
-# providing a WSGI application to be intercepted, but it will also
-# need to be responsible for having a reasonable persistence layer
-from nova.api.openstack.placement import deploy
+from nova.api.openstack.placement import direct
 from nova.compute import provider_tree
 from nova import conf
 from nova import context
@ -39,45 +31,54 @@ from nova.tests import uuidsentinel as uuids
 CONF = conf.CONF
 
 
-class NoAuthReportClient(report.SchedulerReportClient):
-    """A SchedulerReportClient that avoids keystone."""
+class SchedulerReportClientTestBase(test.TestCase):
 
-    def __init__(self):
-        super(NoAuthReportClient, self).__init__()
-        # Supply our own session so the wsgi-intercept can intercept
-        # the right thing. Another option would be to use the direct
-        # urllib3 interceptor.
-        request_session = requests.Session()
-        headers = {
-            'x-auth-token': 'admin',
-            'OpenStack-API-Version': 'placement latest',
-        }
-        self._client = adapter.Adapter(
-            session.Session(auth=None, session=request_session,
-                            additional_headers=headers),
-            service_type='placement')
+    def _interceptor(self, app=None):
+        """Set up an intercepted placement API to test against.
+
+        Use as e.g.
+
+        with interceptor() as client:
+            ret = client.get_provider_tree_and_ensure_root(...)
+
+        :param app: An optional wsgi app loader.
+        :return: Context manager, which in turn returns a direct
+                 SchedulerReportClient.
+        """
+        class ReportClientInterceptor(direct.PlacementDirect):
+            """A shim around PlacementDirect that wraps the Adapter in a
+            SchedulerReportClient.
+            """
+
+            def __enter__(inner_self):
+                adap = super(ReportClientInterceptor, inner_self).__enter__()
+                client = report.SchedulerReportClient(adapter=adap)
+                # NOTE(efried): This `self` is the TestCase!
+                self._set_client(client)
+                return client
+
+        interceptor = ReportClientInterceptor(CONF, latest_microversion=True)
+        if app:
+            interceptor.app = app
+        return interceptor
+
+    def _set_client(self, client):
+        """Set report client attributes on the TestCase instance.
+
+        Override this to do things like:
+
+        self.mocked_thingy.report_client = client
+
+        :param client: A direct SchedulerReportClient.
+        """
+        pass
 
 
 @mock.patch('nova.compute.utils.is_volume_backed_instance',
             new=mock.Mock(return_value=False))
 @mock.patch('nova.objects.compute_node.ComputeNode.save', new=mock.Mock())
-@mock.patch('keystoneauth1.session.Session.get_auth_headers',
-            new=mock.Mock(return_value={'x-auth-token': 'admin'}))
-@mock.patch('keystoneauth1.session.Session.get_endpoint',
-            new=mock.Mock(return_value='http://localhost:80/placement'))
-class SchedulerReportClientTests(test.TestCase):
-    """Set up an intercepted placement API to test against."""
+class SchedulerReportClientTests(SchedulerReportClientTestBase):
 
     def setUp(self):
         super(SchedulerReportClientTests, self).setUp()
-        self.flags(auth_strategy='noauth2', group='api')
-        self.app = lambda: deploy.loadapp(CONF)
-        self.client = NoAuthReportClient()
-        # TODO(cdent): Port required here to deal with a bug
-        # in wsgi-intercept:
-        # https://github.com/cdent/wsgi-intercept/issues/41
-        self.url = 'http://localhost:80/placement'
         self.compute_uuid = uuids.compute_node
         self.compute_name = 'computehost'
         self.compute_node = objects.ComputeNode(
@ -103,9 +104,9 @@ class SchedulerReportClientTests(test.TestCase):
                 extra_specs={}))
         self.context = context.get_admin_context()
 
-    def _interceptor(self):
-        # Isolate this initialization for maintainability.
-        return interceptor.RequestsInterceptor(app=self.app, url=self.url)
+    def _set_client(self, client):
+        # TODO(efried): Rip this out and just use `as client` throughout.
+        self.client = client
 
     def test_client_report_smoke(self):
         """Check things go as expected when doing the right things."""
def test_client_report_smoke(self):
"""Check things go as expected when doing the right things."""
@ -233,10 +234,6 @@ class SchedulerReportClientTests(test.TestCase):
     @mock.patch('nova.compute.utils.is_volume_backed_instance',
                 new=mock.Mock(return_value=False))
     @mock.patch('nova.objects.compute_node.ComputeNode.save', new=mock.Mock())
-    @mock.patch('keystoneauth1.session.Session.get_auth_headers',
-                new=mock.Mock(return_value={'x-auth-token': 'admin'}))
-    @mock.patch('keystoneauth1.session.Session.get_endpoint',
-                new=mock.Mock(return_value='http://localhost:80/placement'))
     def test_ensure_standard_resource_class(self):
         """Test case for bug #1746615: If placement is running a newer version
         of code than compute, it may have new standard resource classes we
@ -286,14 +283,12 @@ class SchedulerReportClientTests(test.TestCase):
                 'allocation_ratio': 8.0,
             },
         }
-        with interceptor.RequestsInterceptor(app=self.app, url=self.url):
+        with self._interceptor():
             self.client.update_compute_node(self.context, self.compute_node)
             self.client.set_inventory_for_provider(
                 self.context, self.compute_uuid, self.compute_name, inv)
 
-    @mock.patch('keystoneauth1.session.Session.get_endpoint',
-                return_value='http://localhost:80/placement')
-    def test_global_request_id(self, mock_endpoint):
+    def test_global_request_id(self):
         global_request_id = 'req-%s' % uuids.global_request_id
 
         def assert_app(environ, start_response):
@ -304,8 +299,7 @@ class SchedulerReportClientTests(test.TestCase):
             start_response('204 OK', [])
             return []
 
-        with interceptor.RequestsInterceptor(
-                app=lambda: assert_app, url=self.url):
+        with self._interceptor(app=lambda: assert_app):
             self.client._delete_provider(self.compute_uuid,
                                          global_request_id=global_request_id)
             payload = {
payload = {
@ -719,12 +713,13 @@ class SchedulerReportClientTests(test.TestCase):
                 self.assertFalse(
                     new_tree.have_aggregates_changed(uuid, cdata.aggregates))
 
-        # To begin with, the cache should be empty
-        self.assertEqual([], self.client._provider_tree.get_provider_uuids())
-        # When new_tree is empty, it's a no-op.
-        # Do this outside the interceptor to prove no API calls are made.
-        self.client.update_from_provider_tree(self.context, new_tree)
-        assert_ptrees_equal()
+        # Do these with a failing interceptor to prove no API calls are made.
+        with self._interceptor(app=lambda: 'nuke') as client:
+            # To begin with, the cache should be empty
+            self.assertEqual([], client._provider_tree.get_provider_uuids())
+            # When new_tree is empty, it's a no-op.
+            client.update_from_provider_tree(self.context, new_tree)
+            assert_ptrees_equal()
 
         with self._interceptor():
             # Populate with a provider with no inventories, aggregates, traits