19 Commits (master)

Author SHA1 Message Date
Eric Fried 4c34ab574e Rip out the SchedulerClient
Since all remaining SchedulerClient methods were direct passthroughs to
the SchedulerQueryClient, and nothing special is being done anymore to
instantiate that client, the SchedulerClient is no longer necessary. All
references to it are replaced with direct references to
SchedulerQueryClient.

Change-Id: I57dd199a7c5e762d97a600307aa13a7aeb62d2b2
4 years ago
Eric Fried 86e94a11f7 Rip the report client out of SchedulerClient
A step toward getting rid of the SchedulerClient intermediary, this
patch removes the reportclient member from SchedulerClient, instead
instantiating SchedulerReportClient directly wherever it's needed.

Change-Id: I14d1a648843c6311a962aaf99a47bb1bebf7f5ea
4 years ago
Eric Fried 570ad36992 Commonize _update code path
There were a bunch of report client methods around updating inventory to
placement which were only being used in the non-update_provider_tree
code paths of the resource tracker's update routine. Those code paths
had already been retrofitted to produce a placement-shaped inventory
object.

update_from_provider_tree gives us another way to flush these inventory
changes.

This patch simply takes the inventory object produced by the
get_inventory() and update_compute_node() code paths and updates the
provider tree object in the same fashion as update_provider_tree does.
So now all three code paths can commonly invoke
update_from_provider_tree.

And we can get rid of a ton of redundant code in the report client.

This includes the former incarnation of set_inventory_for_provider; so
we rename the artist formerly known as _set_inventory_for_provider to
match its brethren, set_traits_for_provider and
set_aggregates_for_provider.

Change-Id: I1a305847f0310c8d4babd5a625e4cc7bffe5b086
4 years ago
Matt Riedemann 5af632e9ca Use long_rpc_timeout in select_destinations RPC call
During server create, conductor makes an RPC call to the scheduler
to get hosts. In a multi-create request with a lot of servers and
the default rpc_response_timeout, that call can trigger a
MessagingTimeout. Due to the old retry_select_destinations
decorator, conductor will retry the select_destinations RPC call
up to max_attempts times, so thrice by default. This can clobber
the scheduler and placement while the initial scheduler worker is
still trying to process the beefy request and allocate resources
in placement.

This has been recreated in a devstack test patch [1] and
shown to fail with 1000 instances in a single request with
the default rpc_response_timeout of 60 seconds. Changing the
rpc_response_timeout to 300 avoids the MessagingTimeout and
retry loop.

Since Rocky we have the long_rpc_timeout config option which
defaults to 1800 seconds. The RPC client can thus be changed
to heartbeat the scheduler service during the RPC call every
$rpc_response_timeout seconds with a hard timeout of
$long_rpc_timeout. That change is made here.

As a result, the problematic retry_select_destinations
decorator is no longer necessary and is removed here. That
decorator was added in I2b891bf6d0a3d8f45fd98ca54a665ae78eab78b3
as a hack for scheduler high availability: a MessagingTimeout
was assumed to mean the scheduler service had died, so retrying
the request to hit another scheduler worker was reasonable.
That is clearly not sufficient in the large multi-create case,
and heartbeating the scheduler service with long_rpc_timeout is
a better fit for that HA scenario.

[1] https://review.openstack.org/507918/

Change-Id: I87d89967bbc5fbf59cf44d9a63eb6e9d477ac1f3
Closes-Bug: #1795992
5 years ago
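
The timeout arithmetic described in 5af632e9ca can be illustrated with a toy model. This is a sketch only, not nova or oslo.messaging code: `call_with_heartbeat`, its parameters, and the `MessagingTimeout` stand-in are hypothetical names, and time is simulated rather than measured. It assumes the caller gives up either when heartbeats stop for one monitor interval or when the hard timeout expires.

```python
class MessagingTimeout(Exception):
    """Stand-in for the RPC layer's timeout error (hypothetical here)."""


def call_with_heartbeat(work_seconds, monitor_timeout, hard_timeout,
                        server_dies_at=None):
    """Return the simulated seconds waited for a reply, or raise.

    work_seconds: how long the server needs to finish the request.
    monitor_timeout: longest gap allowed between heartbeats.
    hard_timeout: absolute limit on the whole call.
    server_dies_at: simulated time at which heartbeats stop, if any.
    """
    if server_dies_at is not None and server_dies_at < work_seconds:
        # Heartbeats stop before the reply arrives; the caller notices
        # one monitor interval later, or at the hard limit if sooner.
        waited = min(server_dies_at + monitor_timeout, hard_timeout)
        raise MessagingTimeout("no heartbeat after %d s" % waited)
    if work_seconds > hard_timeout:
        raise MessagingTimeout("hard timeout after %d s" % hard_timeout)
    return work_seconds


# A 200 s request completes with a 60 s heartbeat and an 1800 s hard limit.
print(call_with_heartbeat(200, monitor_timeout=60, hard_timeout=1800))
```

In this model a slow but live scheduler is allowed to finish, while a dead one is still detected within roughly one monitor interval, which is the trade-off the commit message argues for.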
Eric Fried 5d1a500185 Remove LazyLoad of Scheduler Clients
Things have changed here on Walton's Mountain since LazyLoad was
introduced [1]. It seems to have been created to avoid a circular
import, but as this patch should attest, that's no longer an issue.

Why change this now, besides removing weird and complicated code?
Because a subsequent patch needs to access a *property* of the report
client from the compute manager. As written, LazyLoad only lets you get
at *methods* (as functools.partial). There are other ways to solve that
while preserving the deferred import, but if this works, it's the better
solution.

[1] Ie5732baf9709cd0cb951eae4638910372c79e5f1

Change-Id: I1f97d00fb7633f173370ed6787c9a71ecd8106d5
5 years ago
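
The limitation mentioned in 5d1a500185 (only methods, not properties, are reachable through a loader that hands back `functools.partial` objects) can be reproduced with a small sketch. The `LazyLoader` and `ReportClient` classes below are illustrative assumptions written for this example; they are not copied from nova.

```python
import functools


class LazyLoader:
    """Defer building `klass` until one of its methods is first called."""

    def __init__(self, klass, *args, **kwargs):
        self._klass = klass
        self._args = args
        self._kwargs = kwargs
        self._instance = None

    def __getattr__(self, name):
        # Every unknown attribute lookup yields a callable wrapper, so the
        # real object is only built when that wrapper is invoked.
        return functools.partial(self._run_method, name)

    def _run_method(self, name, *args, **kwargs):
        if self._instance is None:
            self._instance = self._klass(*self._args, **self._kwargs)
        return getattr(self._instance, name)(*args, **kwargs)


class ReportClient:
    """Hypothetical client with one method and one property."""

    @property
    def endpoint(self):
        return "placement"

    def ping(self):
        return "pong"


lazy = LazyLoader(ReportClient)
print(lazy.ping())  # the method call goes through the wrapper
# The property comes back as a wrapper, not as its value.
print(isinstance(lazy.endpoint, functools.partial))
```

Calling `lazy.endpoint()` does not help either: the wrapper fetches the property value and then tries to call it, which fails for a non-callable value such as a string.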
Takashi NATSUME a326b03339 [placement] Add sending global request ID in delete (3)
Add the 'X-Openstack-Request-Id' header to DELETE requests.
The header is added when deleting resource provider inventories.

Subsequent patches will add the header in the other cases.

Change-Id: I1dac3d340fe7077095d68f803cf5335ffd5b3364
Partial-Bug: #1734625
5 years ago
Zuul ced5d0f323 Merge "Scheduler set_inventory_for_provider does nested"
6 years ago
Ed Leafe e7152eef7b Modify select_destinations() to return objects and alts
This changes select_destinations() on the scheduler side to optionally
return Selection objects and alternates. The RPC signature doesn't
change in this patch, though, so everything on the conductor side
remains unchanged. The next patch in the series will make the actual RPC
change.

Blueprint: return-alternate-hosts

Change-Id: I03b95a2106624c2ea24835814ca38e954ec7a997
6 years ago
Eric Fried b75b35f482 Scheduler set_inventory_for_provider does nested
SchedulerReportClient.set_inventory_for_provider and its SchedulerClient
wrapper now accept a parent_provider_uuid kwarg, which must be specified
for any provider that isn't a root.  If the method winds up creating the
provider, and parent_provider_uuid is None (the default), the provider
is created as a root - this is the previous behavior.  If
parent_provider_uuid is specified, and the method winds up creating the
provider, it is created as a child of the provider indicated.

Change-Id: I7cfbc80a9a41e97623950deaab9a7b0604fa487d
blueprint: nested-resource-providers
6 years ago
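
The create-as-root versus create-as-child rule described in b75b35f482 can be sketched with an in-memory stand-in. `ProviderTree` here is a hypothetical toy class for illustration, not the report client or placement itself; only the `parent_provider_uuid` keyword and its default of None come from the commit message.

```python
class ProviderTree:
    """Toy in-memory stand-in for a set of resource providers."""

    def __init__(self):
        self.parents = {}      # provider uuid -> parent uuid (None = root)
        self.inventories = {}  # provider uuid -> inventory dict

    def set_inventory_for_provider(self, uuid, inventory,
                                   parent_provider_uuid=None):
        if uuid not in self.parents:
            # The provider does not exist yet, so create it: as a root
            # when no parent is given, otherwise as a child of the parent.
            self.parents[uuid] = parent_provider_uuid
        self.inventories[uuid] = dict(inventory)


tree = ProviderTree()
tree.set_inventory_for_provider("compute1", {"VCPU": 8})
tree.set_inventory_for_provider("numa0", {"VCPU": 4},
                                parent_provider_uuid="compute1")
print(tree.parents)
```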
EdLeafe 8c86547fa0 Pass a list of instance UUIDs to scheduler
When the RequestSpec object was created, it was assumed that when the
request was for more than one instance the scheduler would not need to
know the UUIDs for the individual instances and so it was agreed to
only pass one instance UUID. If, however, we want the scheduler to be
able to claim resources on the selected host, we will need to know the
instance UUID, which will be the consumer_id of the allocation.

This patch adds a new RPC parameter 'instance_uuids' that will be passed
to the scheduler. The next patch in the series adds the logic to the
scheduler to use this new field when selecting hosts.

Co-Authored-By: Sylvain Bauza <sbauza@redhat.com>
Partially-Implements: blueprint placement-claims

Change-Id: I44ebdb3e29db950bf2ad0e6b1dbfdecd1ca03530
6 years ago
Jay Pipes fbc5a67de9 virt: add get_inventory() virt driver API method
Adds a new get_inventory() method to the virt driver API for returning a
dict of inventory records in a format that the placement API
understands.

We also move the ComputeNode.save() call out of the scheduler reporting
client and into the resource tracker. The resource tracker's _update()
method now attempts to call the new get_inventory() virt driver method
and falls back on the existing update_resource_stats() (renamed to
update_compute_node() in this patch) method when get_inventory() is not
implemented.

The next patch implements get_inventory() for the Ironic virt driver.

Change-Id: I921daea7f6d5776b19561f0ca457e604a372eb9e
blueprint: custom-resource-classes-pike
6 years ago
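
The try-then-fall-back flow that fbc5a67de9 describes for the resource tracker's _update() can be sketched as follows. The driver classes, the `update` helper, and the "set_inventory" label are hypothetical; the sketch also assumes an unimplemented get_inventory() signals itself by raising NotImplementedError.

```python
class LegacyDriver:
    """Hypothetical driver that has not implemented get_inventory()."""

    def get_inventory(self, nodename):
        raise NotImplementedError()


class InventoryDriver:
    """Hypothetical driver that reports placement-style inventory."""

    def get_inventory(self, nodename):
        return {"VCPU": {"total": 8}, "MEMORY_MB": {"total": 16384}}


def update(driver, nodename, calls):
    """Record in `calls` which reporting path was taken."""
    try:
        inventory = driver.get_inventory(nodename)
    except NotImplementedError:
        # Fall back on the older compute-node based reporting path.
        calls.append("update_compute_node")
    else:
        calls.append(("set_inventory", inventory))


calls = []
update(LegacyDriver(), "node1", calls)
update(InventoryDriver(), "node2", calls)
print(calls)
```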
Sylvain Bauza b67d62399b Modify conductor to use RequestSpec object
We can now hydrate the RequestSpec object directly in the conductor and modify
the Scheduler client to convert it to a primitive before calling the RPC API.

Later changes will focus on modifying the scheduler.utils methods to work with
the RequestSpec object (and possibly kick build_request_spec), but that cleanup
can be done in separate changes.

NOTE: There is an ugly hack hidden in that change because of a bug
in oslo.messaging which doesn't accept datetime type for values. That will be
removed right in the next patch of the branch. Yeah, I know it's bad but it's
very temporary.

Change-Id: I916707d4dee66608e0bf7cd2d0784dafbfecda38
Partially-Implements: blueprint request-spec-object-mitaka
8 years ago
Paul Murray 2ef014eb31 Convert RT compute_node to be a ComputeNode object
This patch converts the ResourceTracker compute_node property
to be a ComputeNode object. A number of fields automatically
take care of mapping their values to a db format, so some of the
code creating json strings goes away with this change.

The scheduler client report code is simplified by the change
to use a ComputeNode object.

Note that this change naturally required modification to a
number of tests in test_tracker, test_resource_tracker and
test_client to cater for objects instead of dicts. Some of these
tests were using incorrect values or arbitrary key names that do
not exist as ComputeNode fields, so they had to be corrected to
conform to the type checking of the ComputeNode object.

part of blueprint make-resource-tracker-use-objects

Change-Id: I2279f01ad55083c31c663242a2a60a48191e88c3
8 years ago
EdLeafe b6cd8c7dde Add the RPC calls for instance updates.
The Scheduler needs to receive updates from compute whenever there are
changes to any instance so that it can update its view of the instances
on each compute node. This adds the required RPC methods for updates
(new and resized instances) and deletes, as well as a method for sending
sync information to verify that the scheduler and compute views of the
instances are the same.

Partially-Implements: blueprint isolate-scheduler-db
(https://blueprints.launchpad.net/nova/+spec/isolate-scheduler-db)

Change-Id: I4c16f3080c1f2ed5ef0b9e50a46b7f4280f09a16
8 years ago
Sylvain Bauza 05743b3cd5 Create Scheduler client methods for aggregates
Now that we provide a Scheduler RPC API method for updating and deleting
aggregates, we can add the methods to the Scheduler client too.

Partially-Implements: blueprint isolate-scheduler-db

Change-Id: Ib747e2a1a63ed3ee573e1c63583581b6720695e8
8 years ago
Davanum Srinivas af2d6c9576 Switch to using oslo_* instead of oslo.*
The oslo team recommends that everyone switch to the
non-namespaced versions of the libraries. The hacking rule
is updated to include a check that prevents oslo.* imports
from creeping back in.

This commit includes:
- using oslo_utils instead of oslo.utils
- using oslo_serialization instead of oslo.serialization
- using oslo_db instead of oslo.db
- using oslo_i18n instead of oslo.i18n
- using oslo_middleware instead of oslo.middleware
- using oslo_config instead of oslo.config
- using oslo_messaging instead of "from oslo import messaging"
- using oslo_vmware instead of oslo.vmware

Change-Id: I3e2eb147b321ce3e928817b62abcb7d023c5f13f
8 years ago
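
A check of the kind af2d6c9576 mentions (rejecting namespaced oslo.* imports) could look roughly like the sketch below. The regular expression and the `check_oslo_namespace_import` function are assumptions made for illustration; they are not the rule that was actually added to nova's hacking checks.

```python
import re

# Matches "import oslo.foo", "from oslo.foo import bar" and
# "from oslo import foo"; "oslo_foo" imports do not match.
OSLO_NAMESPACE_RE = re.compile(
    r"^\s*(?:import\s+oslo\.|from\s+oslo(?:\.\S+)?\s+import\s)")


def check_oslo_namespace_import(logical_line):
    """Return an error string for a namespaced oslo import, else None."""
    if OSLO_NAMESPACE_RE.match(logical_line):
        return "use oslo_<lib> instead of the oslo.<lib> namespace: %s" % (
            logical_line.strip())
    return None


for line in ("from oslo.config import cfg",
             "from oslo import messaging",
             "from oslo_config import cfg"):
    print(line, "->", check_oslo_namespace_import(line))
```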
Grzegorz Grasza 8ba2d8ea52 Reschedule queries to nova-scheduler after a timeout occurs
In case the nova-scheduler service dies, try sending the message
once again. The message will be picked up by another server or
the same one when it restarts.

Change-Id: I2b891bf6d0a3d8f45fd98ca54a665ae78eab78b3
Closes-Bug: 1402574
8 years ago
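
The resend-on-timeout idea in 8ba2d8ea52 can be sketched as a decorator. Everything below is a hypothetical illustration: `retry_on_timeout`, the `MessagingTimeout` stand-in, and the simulated `select_destinations` are not the code from that commit.

```python
import functools


class MessagingTimeout(Exception):
    """Stand-in for the RPC layer's timeout error (hypothetical here)."""


def retry_on_timeout(max_attempts):
    """Re-send the call when it times out, up to max_attempts tries."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except MessagingTimeout:
                    if attempt == max_attempts:
                        raise  # out of attempts, surface the timeout
        return wrapper
    return decorator


sent = []  # every message that reaches the (simulated) scheduler


@retry_on_timeout(max_attempts=3)
def select_destinations(request):
    sent.append(request)
    if len(sent) < 2:
        # First attempt: pretend the scheduler died mid-request.
        raise MessagingTimeout()
    return "host-a"


print(select_destinations("boot one server"), len(sent))
```

Note that each retry delivers the full request again, which is harmless when the first scheduler really died but duplicates work when it was merely slow.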
Davanum Srinivas 323fa6fef7 Use oslo.utils
The oslo.utils library now provides the functionality previously in
oslo-incubator's excutils, importutils, network_utils, strutils,
timeutils, units, etc. Some modules already moved to oslo.utils
will still be around, since other code in nova/openstack/common/
is using them; they will be removed in a subsequent commit.

Change-Id: Idc716342535fdfa680963e0e073ddb46f5f1eb34
9 years ago
Sylvain Bauza b16cd4548d Add support for select_destinations in Scheduler client
The spec defined that the scheduler will provide a clear
interface for all scheduler APIs, so select_destinations() needs
to be added to the client library.

Implements blueprint scheduler-lib

Change-Id: Ie5732baf9709cd0cb951eae4638910372c79e5f1
9 years ago