diff --git a/specs/ocata/approved/cells-scheduling-interaction.rst b/specs/ocata/approved/cells-scheduling-interaction.rst new file mode 100644 index 000000000..6ee5c4a08 --- /dev/null +++ b/specs/ocata/approved/cells-scheduling-interaction.rst @@ -0,0 +1,220 @@ +.. + This work is licensed under a Creative Commons Attribution 3.0 Unported + License. + + http://creativecommons.org/licenses/by/3.0/legalcode + +================================ +Scheduling interaction for cells +================================ + +https://blueprints.launchpad.net/nova/+spec/cells-scheduling-interaction + +In order to schedule instance builds to compute hosts Nova and the scheduler +will need to take into account that hosts are grouped into cells. It is not +necessary that this is apparent when Nova is requesting a placement decision. + + +Problem description +=================== + +In order to partition Nova into cells the scheduler will need to be involved +earlier in the build process. + +Use Cases +---------- + +* Operators want to partition their deployments into cells for scaling, failure + domain, and buildout reasons. When partitioned, we need to have flexible + api level scheduling that can make decisions on cells and hosts. + + +Proposed change +=============== + +The scheduler will be queried by Nova at the api level so that it knows which +cell to pass the build to. The instance table exists in a cell and not in the +api database, so to create an instance we will first need to know which cell to +create it in. + +The scheduler will continue to return a (host, node) tuple and the calling +service will look up the host in a mapping table to determine which cell it is +in. This means the current select_destinations interface will not need to +change. Querying the scheduler will take place after the API has returned a +response so the most reasonable thing to do is pass the build request to a +conductor operating outside of any cell. The conductor will call the scheduler +and then create the instance in the cell and pass the build request to it. + +While much of Nova is being split between being api-level or cell-level the +scheduler remains outside of either distinction. It can be thought of as a +separate service in the same way that Cinder/Neutron are. As a result the +scheduler will require knowledge of all hosts in all cells. This is different +from the cellsv1 architecture and may be a surprise for those familiar with +that setup. A separate effort can address scalability for the scheduler and the +potential for partitioning it along some boundary, which may be a cell. + +Because the existing build_instances conductor method assumes that the instance +already exists within a cell database and makes some assumptions about cleaning +up resources on failure we will not complicate that method with an alternate +path. Instead a new conductor method will be added which can take the task of +querying the scheduler and then creating the instance and associated resources +within the cell indicated by the scheduler. + +The new boot workflow would look like the following: + + - nova-api creates and persists a BuildRequest object, not an Instance. + - Cast to the api level conductor to execute the new method proposed here. The + api level conductor is whatever is configured in DEFAULT/transport_url. + - Conductor will call the scheduler once. + - Conductor will create the instance in the proper cell + - Conductor will cast to the proper nova-compute to continue the build + process. This cast will be the same as what is currently done in the + conductor build_instances method. + - In the event of a reschedulable build failure nova-compute will cast to a + cell conductor to execute the current build_instances method just as it's + currently done. + +Rescheduling will still take place within a cell via the normal +compute->conductor loop, using the conductors within the cell. Adding +rescheduling at a cell level will be a later effort. + +Information about cells will need to be fed into the scheduler in order for it +to account for that during its reschedule/migration placement decisions, but +that is outside the scope of this spec. + + +Alternatives +------------ + +We could query the scheduler at two points like in cellsv1. This creates more +deployment complexity and creates an unnecessary coupling between the +architecture of Nova and the scheduler. + +A reschedule could recreate a BuildRequest object, delete the instance and any +associated resources in a cell, and then let the scheduler pick a new host in +any cell. However there is a fair bit of complexity in doing that cleanup and +resource tracking and I believe that this is an effort best left for a later +time. It should also be noted that any time a build crosses a cell boundary +there are potential races with deletion code so it should be done as little as +possible. + +Rather than requiring an api level set of conductor nodes nova-api could +communicate with the conductors within a cell thus simplifying deployment. This +is probably worth doing for the single cell case so another spec will propose +doing this. + +Data model impact +----------------- + +None + +REST API impact +--------------- + +None + +Security impact +--------------- + +None + +Notifications impact +-------------------- + +None + +Other end user impact +--------------------- + +None + +Performance Impact +------------------ + +None + +Other deployer impact +--------------------- + +Nova-conductor will need to be deployed for use by nova-api. + +Developer impact +---------------- + +None + + +Implementation +============== + +Assignee(s) +----------- + +Primary assignee: + alaski + +Other contributors: + None + +Work Items +---------- + +* Add a conductor method to call the scheduler, create an instance in the db of + the cell scheduled to, then cast to the selected compute host to proceed with + the build. + +* Update the compute api to not create the instance in the db during a build + request, and change it to cast to the new scheduler method. + +* Ensure devstack is configured to that nova-api shares the cell level + conductors. This makes the single cell setup as simple as possible. A later + effort can investigate making this configurable in devstack for multiple cell + setups. + + +Dependencies +============ + +None + + +Testing +======= + +Since this is designed to be an internal re-architecting of Nova with no user +visible changes the current suite of Tempest or functional tests should +suffice. At some point we will want to look at how to test multiple cells or +potentially exposing the concept of a cell in the API and we will tackle +testing requirements then. + + +Documentation Impact +==================== + +Documentation will be written describing the flow of an instance build and how +and where scheduling decisions are made. + + +References +========== + +``https://etherpad.openstack.org/p/kilo-nova-cells`` +``https://etherpad.openstack.org/p/nova-cells-scheduling-requirements`` + + +History +======= + +.. list-table:: Revisions + :header-rows: 1 + + * - Release Name + - Description + * - Liberty + - Introduced + * - Mitaka + - Re-proposed; partially implemented. + * - Newton + - Re-proposed; partially implemented. + * - Ocata + - Re-proposed.