Add prep-for-network-aware-scheduling-pike spec
Change how we talk to neutron to allow us in the future to implement network aware scheduling. This is continuation of the work started during newton. This is a important, because this blocks routed networks, which is blocking the removal of cells v1. blueprint prep-for-network-aware-scheduling-pike Previously-Approved: Ocata Change-Id: I4d87fd0cc8148aa90a1968064378b11a0d7b65c1
This commit is contained in:
committed by
John Garbutt
parent
3c48c2295b
commit
04f25414c2
219
specs/pike/approved/prep-for-network-aware-scheduling-pike.rst
Normal file
219
specs/pike/approved/prep-for-network-aware-scheduling-pike.rst
Normal file
@@ -0,0 +1,219 @@
|
|||||||
|
..
|
||||||
|
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||||
|
License.
|
||||||
|
|
||||||
|
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||||
|
|
||||||
|
==============================================
|
||||||
|
Prep work for Network aware scheduling (Pike)
|
||||||
|
==============================================
|
||||||
|
|
||||||
|
https://blueprints.launchpad.net/nova/+spec/prep-for-network-aware-scheduling-pike
|
||||||
|
|
||||||
|
Change how we talk to neutron to allow us in the future to implement
|
||||||
|
network aware scheduling.
|
||||||
|
|
||||||
|
This continues on from the work started in Newton:
|
||||||
|
http://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/prep-for-network-aware-scheduling.html
|
||||||
|
|
||||||
|
Problem description
|
||||||
|
===================
|
||||||
|
|
||||||
|
Some IP subnets can be restricted to a subset of hosts, due to an operators
|
||||||
|
network configuration. In this environment, it means you could build somewhere
|
||||||
|
that has no public IPs available. The ability to manage IP addresses in this
|
||||||
|
way is being added into Neutron by the Routed Networks feature:
|
||||||
|
|
||||||
|
* http://specs.openstack.org/openstack/neutron-specs/specs/newton/routed-networks.html
|
||||||
|
* https://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/neutron-routed-networks.html
|
||||||
|
|
||||||
|
To make this possible, we need to know the details of all the user's requested
|
||||||
|
ports, and what resources are required, before asking the scheduler for a
|
||||||
|
host. In addition, after picking a location, we should check that there is
|
||||||
|
an IP available before continuing with the rest of the build process.
|
||||||
|
|
||||||
|
As an aside, the allocate_for_instance call currently contains both
|
||||||
|
parts of that operation and has proved very difficult to maintain and evolve.
|
||||||
|
In newton, we changed the code to separate the update and create operations,
|
||||||
|
so we are now able to look at moving where those operations happen.
|
||||||
|
|
||||||
|
Use Cases
|
||||||
|
---------
|
||||||
|
|
||||||
|
This is largely a code refactor.
|
||||||
|
|
||||||
|
Proposed change
|
||||||
|
===============
|
||||||
|
|
||||||
|
In newton we have changed the code inside allocate_for_instance into clear
|
||||||
|
get/create and update phase. We need to complete the split, by ensuring the
|
||||||
|
network info cache contains all the info required to be shared between the
|
||||||
|
two phases of the operation (get/create ports and update ports).
|
||||||
|
Should a build request fail, and the build is retried on a different host,
|
||||||
|
the ports that Nova creates should be re-used for the new build attempt,
|
||||||
|
just like the ports that are passed into Nova. This bug fix requires the data
|
||||||
|
to be correctly passed in a very similar way, so will be the initial focus of
|
||||||
|
this effort.
|
||||||
|
|
||||||
|
Second, we want to move the get/create ports before the scheduler is called.
|
||||||
|
In terms of upgrades, we need to ensure old compute nodes don't re-create
|
||||||
|
ports that the conductor has already created. Similarly, when deleting an
|
||||||
|
instance, the old node should still correctly know which ports were created
|
||||||
|
by Nova and can be deleted when the instance is deleted, in the usual way.
|
||||||
|
For nova-network users, the get/create ports can be a noop.
|
||||||
|
|
||||||
|
To avoid problems across upgrades, the early creating of ports is not allowed
|
||||||
|
until all nova-compute nodes are upgraded to the version that understands if
|
||||||
|
a port has been created or not. Once all are upgraded, and credentials are
|
||||||
|
available on the nova-conductor node, ports will be created before calling the
|
||||||
|
scheduler.
|
||||||
|
|
||||||
|
The third step is to move the port update into the conductor, right after
|
||||||
|
the scheduler has picked an appropriate host. We will not be able to run
|
||||||
|
this code until all compute nodes have been upgraded to the newest version.
|
||||||
|
Until all nodes have been upgraded, the new nodes will still have to run this
|
||||||
|
code on the Compute node. While annoying, this move is only to help with
|
||||||
|
faster retries, and as such, should not block any progress. Note this port
|
||||||
|
update step includes setting the host on the port, and in the future will
|
||||||
|
be the point an IP is assigned, if the port does not yet have an IP.
|
||||||
|
|
||||||
|
It is useful to update the port bindings in the conductor, so any failure in
|
||||||
|
the port binding for a specific host can quickly trigger a retry. This is
|
||||||
|
particularly a problem when you have routed networks, and segments can run out
|
||||||
|
of IP addresses independently.
|
||||||
|
|
||||||
|
For nova-network, we can run the existing allocate-for-instance logic in the
|
||||||
|
conductor, after the scheduler is called. For cells v1 users, this should
|
||||||
|
correctly be in the child cell conductor, because each cells v1 cell has its
|
||||||
|
own separate nova-network instance with a different set of IP addresses.
|
||||||
|
(For cells v2 users, the networking is global to the nova deployment, so its
|
||||||
|
doesn't matter where that happens.)
|
||||||
|
|
||||||
|
Alternatives
|
||||||
|
------------
|
||||||
|
|
||||||
|
We could attempt to add more complexity into the existing
|
||||||
|
allocate_for_instance code. But history has shown that is likely to create
|
||||||
|
many regressions.
|
||||||
|
|
||||||
|
Data model impact
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
REST API impact
|
||||||
|
---------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
Security impact
|
||||||
|
---------------
|
||||||
|
|
||||||
|
Eventually it could mean we don't need any neutron related credentials on
|
||||||
|
any of the compute nodes. This work will not achieve that goal, but it is a
|
||||||
|
step in the right direction.
|
||||||
|
|
||||||
|
Notifications impact
|
||||||
|
--------------------
|
||||||
|
|
||||||
|
Notifications may now have a different host and service, but they should
|
||||||
|
be otherwise identical.
|
||||||
|
|
||||||
|
Other end user impact
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
Performance Impact
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Currently the neutron port binding is done in parallel with other long running
|
||||||
|
tasks that the compute node performs during the boot process. This moves the
|
||||||
|
port creation and binding into the critical path of the boot process.
|
||||||
|
|
||||||
|
When Nova is creating ports for users, instead of just calling port create
|
||||||
|
with all the parameters, will now first create the port and later update the
|
||||||
|
port. This will slightly increase the load on the Neutron API during the boot
|
||||||
|
process. However this should be minimal, as we are not duplicating any of
|
||||||
|
the expensive parts of the process, such as port binding and IP allocation.
|
||||||
|
|
||||||
|
This also generally moves more load into the nova-conductor nodes, but on the
|
||||||
|
upside this reduces the load on the nova-compute nodes.
|
||||||
|
|
||||||
|
Other deployer impact
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
We will need the neutron credentials on nova-conductor, which may not
|
||||||
|
currently have been happening.
|
||||||
|
|
||||||
|
To maintain our upgrade promise, we will fall back to the old behaviour for
|
||||||
|
one cycle to give deployers a warning about the missing credentials. The
|
||||||
|
following cycle will require the credentials to be present on nova-conductor.
|
||||||
|
|
||||||
|
Developer impact
|
||||||
|
----------------
|
||||||
|
|
||||||
|
Improved ability to understand allocate_for_instance, and its replacements.
|
||||||
|
|
||||||
|
Implementation
|
||||||
|
==============
|
||||||
|
|
||||||
|
Assignee(s)
|
||||||
|
-----------
|
||||||
|
|
||||||
|
Primary assignee:
|
||||||
|
John Garbutt (IRC: johnthetubaguy)
|
||||||
|
|
||||||
|
Work Items
|
||||||
|
----------
|
||||||
|
|
||||||
|
* Split allocate_for_instance into two functions
|
||||||
|
* Move create/get port call into the conductor, before calling the scheduler,
|
||||||
|
such that allocate_for_instance no longer creates ports, no op for nova-net.
|
||||||
|
This is likely to be achieved by adding a new method into the network API
|
||||||
|
for both neutron and nova-net.
|
||||||
|
* Move the remainder of allocate_for_instance call into conductor, for both
|
||||||
|
nova-net and neutron
|
||||||
|
|
||||||
|
Dependencies
|
||||||
|
============
|
||||||
|
|
||||||
|
None (however, several things depend on this work)
|
||||||
|
|
||||||
|
Testing
|
||||||
|
=======
|
||||||
|
|
||||||
|
Grenade + neutron should ensure the pre-upgrade flow is covered, the regular
|
||||||
|
gate tests should ensure the post-upgrade flow is covered.
|
||||||
|
|
||||||
|
We should add functional tests to test the re-schedule flow. We might also
|
||||||
|
need functional tests to check the transition between the pre and post upgrade
|
||||||
|
flows.
|
||||||
|
|
||||||
|
Documentation Impact
|
||||||
|
====================
|
||||||
|
|
||||||
|
Need to describe the transition in the release notes, and release specific
|
||||||
|
upgrade documentation, at a minimum.
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
* Previous work: http://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/prep-for-network-aware-scheduling.html
|
||||||
|
* Neutron Routed network spec: http://specs.openstack.org/openstack/neutron-specs/specs/newton/routed-networks.html
|
||||||
|
* Nova Routed network spec: https://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/neutron-routed-networks.html
|
||||||
|
|
||||||
|
History
|
||||||
|
=======
|
||||||
|
|
||||||
|
.. list-table:: Revisions
|
||||||
|
:header-rows: 1
|
||||||
|
|
||||||
|
* - Release Name
|
||||||
|
- Description
|
||||||
|
* - Newton
|
||||||
|
- Introduced
|
||||||
|
* - Ocata
|
||||||
|
- Continued
|
||||||
|
* - Pike
|
||||||
|
- Reproposed
|
||||||
Reference in New Issue
Block a user