Merge "Re-propose Ironic: multiple compute host support"

specs/newton/approved/ironic-multiple-compute-hosts.rst (new file, 218 lines)

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=====================================
Ironic: Multiple compute host support
=====================================

https://blueprints.launchpad.net/nova/+spec/ironic-multiple-compute-hosts

Today, the Ironic virt driver only supports a single nova-compute service.
This is clearly not viable for an environment of any interesting scale;
there's no HA, and everything fails if the compute service goes down. Let's
fix that.


Problem description
===================

Computers are horrible things. They die sometimes. They crash processes at
random. Solar flares can make bad things happen. And so on and so forth.

Running only one instance of nova-compute for an entire Ironic environment
is going to be a bad time. The Ironic virt driver currently assumes that only
one nova-compute process can run at once. It exposes all resources from an
Ironic installation to the resource tracker, without the ability to split
those resources out into many compute services.

Use Cases
----------

This allows operators to avoid having a single nova-compute service for an
Ironic deployment, so that the deployment may continue to function if a
compute service goes down. Note that this assumes a single Ironic cluster
per Nova deployment; this is not unreasonable in most cases, as Ironic should
be able to scale to 10^5 nodes.


Proposed change
===============

In general, a nova-compute running the Ironic virt driver should only
register as a single row in the compute_nodes table, rather than many rows.
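
As a rough illustration only (not actual driver code; the class and method
below are simplified stand-ins), the driver's node enumeration would shrink
from "every Ironic node" to "just this compute host", something like::

    # Hypothetical sketch: one compute_nodes row per nova-compute service,
    # keyed on the service host rather than on each Ironic node UUID.
    import socket


    class IronicDriver(object):
        def get_available_nodes(self, refresh=False):
            # Report a single "node" named after this compute host instead
            # of returning the UUID of every node in the Ironic cluster.
            return [socket.gethostname()]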

Nova's scheduler should schedule only to a nova-compute host; the host will
choose an Ironic node itself, from the nodes that match the request (explained
further below). Once an instance is placed on a given nova-compute service
host, that host will always manage other requests for that instance (delete,
etc). So the instance count scheduler filter can just be used here to
distribute instances equally between compute hosts. This reduces the failure
domain: if a compute host fails, only actions for the existing instances on
that host fail.
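
The exact filter and weigher configuration is a deployment detail; as a toy
illustration of the spreading behaviour only (this is not Nova's scheduler
code, and ``num_instances`` is just an assumed attribute), picking the least
loaded host looks like::

    def pick_host(candidate_hosts):
        """Choose the compute host that currently manages the fewest
        instances, so that instances spread evenly across hosts.

        candidate_hosts: iterable of objects with a num_instances attribute.
        """
        return min(candidate_hosts, key=lambda host: host.num_instances)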

The Ironic virt driver should be modified to call an Ironic endpoint with
the request spec for the instance. This endpoint will reserve a node, and
return that node. The virt driver will then deploy the instance to this node.
When the instance is destroyed, the reservation should also be destroyed.

This endpoint will take parameters related to the request spec, and is being
worked on on the Ironic side [0]. This has not yet been solidified, but it
will have, at a minimum, all fields in the flavor object. This might look
something like::

    {
        "memory_mb": 1024,
        "vcpus": 8,
        "vcpu_weight": null,
        "root_gb": 20,
        "ephemeral_gb": 10,
        "swap": 2,
        "rxtx_factor": 1.0,
        "extra_specs": {
            "capabilities": "supports_uefi,has_gpu",
        },
        "image": {
            "id": "some-uuid",
            "properties": {...},
        },
    }
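
To make the proposed deploy/destroy flow concrete, here is a minimal,
self-contained sketch; every name in it is a placeholder (the reservation API
is still under review in Ironic [0]), so it only illustrates the shape of the
interaction, not the real client calls::

    class FakeReservationAPI(object):
        """Stand-in for the not-yet-final Ironic reservation endpoint."""

        def __init__(self, free_nodes):
            self.free_nodes = list(free_nodes)
            self.reservations = {}

        def reserve(self, request_spec):
            # Ironic picks a node matching the request spec and reserves it.
            node = self.free_nodes.pop()
            self.reservations[node] = request_spec
            return node

        def release(self, node):
            # Destroying the instance also destroys the reservation.
            del self.reservations[node]
            self.free_nodes.append(node)


    ironic = FakeReservationAPI(['node-a', 'node-b'])

    # Spawn: reserve a node for this request spec, then deploy to it.
    node = ironic.reserve({'memory_mb': 1024, 'vcpus': 8, 'root_gb': 20})
    # ... the driver would deploy the instance to `node` here ...

    # Destroy: tear the instance down and free the reservation.
    ironic.release(node)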

We will report (total ironic capacity) into the resource tracker for each
compute host. This will end up over-reporting total available capacity to
Nova, but it is the least wrong option here (a short numeric sketch follows
the list below). Other (worse) options are:

* Report (total ironic capacity)/(number of compute hosts) from each compute
  host. This is more "right", but has the possibility of a compute host
  reporting (usage) > (max capacity), and therefore becoming unable to perform
  new build actions.

* Report some arbitrary incorrect number for total capacity, and try to make
  the scheduler ignore it. This reports numbers more incorrectly, and also
  takes more code and has more room for error.
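
The numbers below are made up purely to illustrate the trade-off between the
proposed approach and the first rejected option::

    # Illustrative arithmetic only; none of this lands in Nova.
    total_nodes = 1000     # nodes in the Ironic deployment (assumed)
    compute_hosts = 4      # nova-compute services (assumed)

    # Proposed: every host reports the whole cluster's capacity, so the
    # scheduler sees 4 * 1000 = 4000 nodes' worth of resources in total.
    reported_per_host = total_nodes

    # Rejected: divide capacity evenly between hosts.
    divided_per_host = total_nodes // compute_hosts  # 250
    # If one host ends up managing 300 instances, its usage (300) exceeds
    # its reported capacity (250) and it can no longer accept new builds,
    # even though the Ironic cluster still has free nodes.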

Alternatives
------------

Do what we do today, with active/passive failover.

Doing active/passive failover well is not an easy task, and doesn't account for
all possible failures. This also does not follow Nova's prescribed model for
compute failure. Furthermore, the resource tracker initialization is slow
with many Ironic nodes, and so a cold failover could take minutes.

Resource providers [1] may be another viable alternative, but we shouldn't
have a hard dependency on that.

Data model impact
-----------------

None.

REST API impact
---------------

None.

Security impact
---------------

None.

Notifications impact
--------------------

None.

Other end user impact
---------------------

None.

Performance Impact
------------------

This should improve performance a bit. Currently the resource tracker is
responsible for every node in an Ironic deployment. This will make that group
smaller and improve the performance of the resource tracker loop.

Other deployer impact
---------------------

A version of Ironic that supports the reservation endpoint must be deployed
before a version of Nova with this change is deployed. If this is not the
case, the previous behavior should be used. We'll need to properly deprecate
the old behavior behind a config option, as deployers will need to configure
different scheduler filters and host managers than the current recommendation
for this to work correctly. We should investigate whether this can be done
gracefully without a new config option; however, I'm not sure it's possible.
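
If a new option does turn out to be necessary, it would likely be a small
boolean switch along these lines; the option name, group, and default are
purely illustrative and not part of this spec::

    # Illustrative only; no option name or default has been decided.
    from oslo_config import cfg

    illustrative_opts = [
        cfg.BoolOpt('use_ironic_node_reservations',
                    default=False,
                    help='When true, ask Ironic to reserve a node matching '
                         'the request spec instead of having this compute '
                         'service track every Ironic node.'),
    ]

    cfg.CONF.register_opts(illustrative_opts, group='ironic')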

Developer impact
----------------

None, though Ironic driver developers should be aware of the situation.


Implementation
==============

Assignee(s)
-----------

Primary assignee:
  jim-rollenhagen (jroll)

Other contributors:
  devananda
  jaypipes

Work Items
----------

* Change the Ironic driver to be a 1:1 host:node mapping.

* Change the Ironic driver to get reservations from Ironic.


Dependencies
============

This depends on a new endpoint in Ironic. [0]


Testing
=======

This should be tested by being the default configuration.


Documentation Impact
====================

Deployer documentation will need updates to specify how this works, since it
is different from most drivers.


References
==========

[0] https://review.openstack.org/#/c/204641/

[1] https://review.openstack.org/#/c/225546/


History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - Mitaka
     - Introduced but no changes merged.
   * - Newton
     - Re-proposed.