Add prep-for-network-aware-scheduling-pike spec
Change how we talk to Neutron to allow us, in the future, to implement
network aware scheduling. This is a continuation of the work started
during Newton. It is important because it blocks routed networks, which
in turn blocks the removal of cells v1.

blueprint prep-for-network-aware-scheduling-pike
Previously-Approved: Ocata
Change-Id: I4d87fd0cc8148aa90a1968064378b11a0d7b65c1

specs/pike/approved/prep-for-network-aware-scheduling-pike.rst (new file, 219 lines):

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==============================================
Prep work for Network aware scheduling (Pike)
==============================================

https://blueprints.launchpad.net/nova/+spec/prep-for-network-aware-scheduling-pike

Change how we talk to Neutron to allow us, in the future, to implement
network aware scheduling.

This continues the work started in Newton:
http://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/prep-for-network-aware-scheduling.html

Problem description
===================

Some IP subnets can be restricted to a subset of hosts by an operator's
network configuration. In such an environment, an instance could be
built somewhere that has no public IPs available. The ability to manage
IP addresses in this way is being added to Neutron by the Routed
Networks feature:

* http://specs.openstack.org/openstack/neutron-specs/specs/newton/routed-networks.html
* https://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/neutron-routed-networks.html

To make this possible, we need to know the details of all the user's
requested ports, and what resources they require, before asking the
scheduler for a host. In addition, after picking a location, we should
check that an IP is available before continuing with the rest of the
build process.

As an aside, the allocate_for_instance call currently contains both
parts of that operation and has proved very difficult to maintain and
evolve. In Newton, we changed the code to separate the update and
create operations, so we are now able to look at moving where those
operations happen.

Use Cases
---------

This is largely a code refactor.

Proposed change
===============

In Newton we split the code inside allocate_for_instance into clear
get/create and update phases. We need to complete the split by ensuring
the network info cache contains all the information that must be shared
between the two phases of the operation (get/create ports and update
ports). Should a build request fail and the build be retried on a
different host, the ports that Nova created should be reused for the
new build attempt, just like the ports that are passed into Nova. This
bug fix requires the data to be passed correctly in a very similar way,
so it will be the initial focus of this effort.
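
A rough sketch of the intended split follows; this is not the actual
Nova code: the function names, the ``requested_ports`` info-cache field
and the dict shapes are hypothetical, and ``neutron`` is assumed to be
a python-neutronclient ``Client``::

    def create_or_reuse_ports(neutron, instance_uuid, requested_networks,
                              info_cache):
        """Phase 1: look up user-supplied ports and create missing ones."""
        ports = []
        for request in requested_networks:
            if request.get('port_id'):
                # User-supplied port: record it, never delete it later.
                port_id, nova_created = request['port_id'], False
            elif request.get('preexisting_port_id'):
                # A previous, failed build attempt already created this
                # port, so reuse it rather than creating a duplicate.
                port_id, nova_created = request['preexisting_port_id'], True
            else:
                port = neutron.create_port(
                    {'port': {'network_id': request['network_id'],
                              'device_id': instance_uuid}})['port']
                port_id, nova_created = port['id'], True
            ports.append({'id': port_id, 'nova_created': nova_created})
        # Everything the update phase needs travels in the info cache.
        info_cache['requested_ports'] = ports
        return ports


    def update_ports(neutron, host, info_cache):
        """Phase 2: bind every port from phase 1 to the chosen host."""
        for entry in info_cache['requested_ports']:
            neutron.update_port(
                entry['id'], {'port': {'binding:host_id': host}})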

Second, we want to move the get/create ports step to before the
scheduler is called. In terms of upgrades, we need to ensure old
compute nodes don't re-create ports that the conductor has already
created. Similarly, when deleting an instance, an old node should still
correctly know which ports were created by Nova, so they can be deleted
with the instance in the usual way. For nova-network users, the
get/create ports step can be a no-op.
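
To make the delete path concrete, here is a hedged sketch (again with
hypothetical names, building on the ``requested_ports`` structure shown
above) of how the "created by Nova" flag keeps instance delete correct
on both old and new nodes::

    def deallocate_ports(neutron, info_cache):
        """On instance delete, only remove ports Nova itself created.

        User-supplied ports are unbound but left in place, exactly as
        today.
        """
        for entry in info_cache['requested_ports']:
            if entry['nova_created']:
                neutron.delete_port(entry['id'])
            else:
                # Detach the user's port without deleting it.
                neutron.update_port(
                    entry['id'], {'port': {'binding:host_id': None,
                                           'device_id': ''}})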

To avoid problems across upgrades, the early creation of ports is not
allowed until all nova-compute nodes have been upgraded to a version
that understands whether a port has already been created. Once all are
upgraded, and credentials are available on the nova-conductor node,
ports will be created before calling the scheduler.
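
That gate could be expressed roughly as follows. This is a sketch only:
the version constant and helper arguments are hypothetical, and the
real code would use Nova's service version machinery::

    # Hypothetical minimum nova-compute service version that understands
    # the "port already created" marker.
    PORT_AWARE_COMPUTE_VERSION = 16


    def can_create_ports_in_conductor(min_compute_service_version,
                                      have_neutron_credentials):
        """Both conditions must hold before the new flow is enabled.

        Otherwise we fall back to the old behaviour, creating ports on
        the compute node as before.
        """
        return (min_compute_service_version >= PORT_AWARE_COMPUTE_VERSION
                and have_neutron_credentials)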

The third step is to move the port update into the conductor, right
after the scheduler has picked an appropriate host. We will not be able
to run this code until all compute nodes have been upgraded to the
newest version. Until then, even new nodes will still have to run this
code on the compute node. While annoying, this move is only to help
with faster retries, and as such should not block any progress. Note
that this port update step includes setting the host on the port, and
in the future it will be the point at which an IP is assigned, if the
port does not yet have one.
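
For a routed network port created with deferred IP allocation, that
update is also what triggers IP assignment. For example (illustrative
only; ``neutron`` is again assumed to be a python-neutronclient
``Client``)::

    def bind_and_get_ips(neutron, port_id, host):
        """Set the host on the port after the scheduler has chosen it.

        For a deferred-allocation port the IP is only assigned once the
        host, and therefore the segment, is known.
        """
        port = neutron.update_port(
            port_id, {'port': {'binding:host_id': host}})['port']
        # Empty before binding on a deferred-allocation port, populated
        # after a successful bind.
        return port['fixed_ips']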

It is useful to update the port bindings in the conductor so that any
failure to bind a port to a specific host can quickly trigger a retry.
This is particularly a problem with routed networks, where segments can
run out of IP addresses independently of each other.
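
A hedged sketch of that fast retry loop (the scheduler interface shown
here is hypothetical; the ``binding_failed`` vif type is how Neutron
reports a failed binding)::

    def schedule_and_bind(scheduler, neutron, port_id, max_attempts=3):
        """Pick a host, try to bind, and re-schedule on binding failure,
        all without ever touching a compute node.
        """
        excluded = []
        for _ in range(max_attempts):
            host = scheduler.select_host(ignore_hosts=excluded)
            port = neutron.update_port(
                port_id, {'port': {'binding:host_id': host}})['port']
            if port.get('binding:vif_type') != 'binding_failed':
                return host
            # e.g. the segment reachable from this host is out of IPs.
            excluded.append(host)
        raise RuntimeError('port binding failed on all candidate hosts')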

For nova-network, we can run the existing allocate_for_instance logic
in the conductor, after the scheduler is called. For cells v1 users,
this should correctly happen in the child cell conductor, because each
cells v1 cell has its own separate nova-network instance with a
different set of IP addresses. (For cells v2 users, the networking is
global to the Nova deployment, so it doesn't matter where this
happens.)

Alternatives
------------

We could attempt to add more complexity to the existing
allocate_for_instance code, but history has shown that is likely to
create many regressions.

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

Eventually this could mean we don't need any Neutron-related
credentials on any of the compute nodes. This work will not achieve
that goal, but it is a step in the right direction.

Notifications impact
--------------------

Notifications may now come from a different host and service, but they
should be otherwise identical.

Other end user impact
---------------------

None

Performance Impact
------------------

Currently, Neutron port binding is done in parallel with other
long-running tasks that the compute node performs during the boot
process. This change moves port creation and binding into the critical
path of the boot process.

When Nova is creating ports for users, instead of a single port-create
call with all the parameters, it will now first create the port and
later update it. This slightly increases the load on the Neutron API
during the boot process. However, the increase should be minimal, as we
are not duplicating any of the expensive parts of the process, such as
port binding and IP allocation.

This also generally moves more load onto the nova-conductor nodes, but
on the upside it reduces the load on the nova-compute nodes.

Other deployer impact
---------------------

We will need Neutron credentials on nova-conductor, which may not be
configured in existing deployments.

To maintain our upgrade promise, we will fall back to the old behaviour
for one cycle, to give deployers a warning about the missing
credentials. The following cycle will require the credentials to be
present on nova-conductor.

Developer impact
----------------

Improved ability to understand allocate_for_instance and its
replacements.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  John Garbutt (IRC: johnthetubaguy)

Work Items
----------

* Split allocate_for_instance into two functions.
* Move the get/create port call into the conductor, before calling the
  scheduler, such that allocate_for_instance no longer creates ports
  (a no-op for nova-network). This is likely to be achieved by adding a
  new method to the network API for both neutron and nova-network; see
  the sketch after this list.
* Move the remainder of the allocate_for_instance call into the
  conductor, for both nova-network and neutron.
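
A hedged sketch of what that new network API method might look like
(the method name and signature are hypothetical)::

    import abc


    class NetworkAPI(metaclass=abc.ABCMeta):

        @abc.abstractmethod
        def create_ports_for_instance(self, context, instance,
                                      requested_networks):
            """Create (or look up) ports before the scheduler runs."""


    class NeutronAPI(NetworkAPI):
        def create_ports_for_instance(self, context, instance,
                                      requested_networks):
            # Phase one of the split allocate_for_instance: get the
            # user's existing ports and create any missing ones.
            ...


    class NovaNetAPI(NetworkAPI):
        def create_ports_for_instance(self, context, instance,
                                      requested_networks):
            # nova-network has no separate port-create step, so this is
            # a no-op; allocation happens later, after scheduling.
            pass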

Dependencies
============

None (however, several things depend on this work).

Testing
=======

Grenade with Neutron should ensure the pre-upgrade flow is covered; the
regular gate tests should ensure the post-upgrade flow is covered.

We should add functional tests for the re-schedule flow. We might also
need functional tests to check the transition between the pre- and
post-upgrade flows.
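
The core of such a re-schedule test could look roughly like this (a
self-contained sketch with a tiny stand-in for the phase-one logic, not
the real Nova functional test harness)::

    import unittest
    from unittest import mock


    def create_or_reuse_port(neutron, preexisting_port_id, network_id):
        """Stand-in for the phase-one logic (illustrative only)."""
        if preexisting_port_id:
            return preexisting_port_id
        port = neutron.create_port(
            {'port': {'network_id': network_id}})['port']
        return port['id']


    class TestRescheduleReusesPorts(unittest.TestCase):
        def test_retry_does_not_create_a_second_port(self):
            neutron = mock.Mock()
            neutron.create_port.return_value = {'port': {'id': 'port-1'}}

            # The first build attempt creates the port...
            port_id = create_or_reuse_port(neutron, None, 'net-1')
            # ...and the retry on another host must reuse it.
            reused = create_or_reuse_port(neutron, port_id, 'net-1')

            self.assertEqual('port-1', reused)
            neutron.create_port.assert_called_once()


    if __name__ == '__main__':
        unittest.main()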

Documentation Impact
====================

We need to describe the transition in the release notes, and in the
release-specific upgrade documentation, at a minimum.

References
==========

* Previous work: http://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/prep-for-network-aware-scheduling.html
* Neutron routed networks spec: http://specs.openstack.org/openstack/neutron-specs/specs/newton/routed-networks.html
* Nova routed networks spec: https://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/neutron-routed-networks.html

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - Newton
     - Introduced
   * - Ocata
     - Continued
   * - Pike
     - Reproposed