diff --git a/specs/stein/nova-less-deploy.rst b/specs/stein/nova-less-deploy.rst
new file mode 100644
index 00000000..01e315c5
--- /dev/null
+++ b/specs/stein/nova-less-deploy.rst
@@ -0,0 +1,638 @@
..
   This work is licensed under a Creative Commons Attribution 3.0 Unported
   License.

   http://creativecommons.org/licenses/by/3.0/legalcode

=======================================
Provision nodes without Nova and Glance
=======================================

https://blueprints.launchpad.net/tripleo/+spec/nova-less-deploy

Currently the TripleO undercloud uses Heat, Nova, Glance, Neutron and Ironic
for provisioning bare metal machines. This blueprint proposes excluding Heat,
Nova and Glance from this flow, removing Nova and Glance completely from the
undercloud.

Problem Description
===================

Making TripleO workflows use Ironic directly to provision nodes has quite a
few benefits:

#. First and foremost, getting rid of the horrible "no valid hosts found"
   exception. The scheduling will be much simpler and the errors will be
   clearer.

   .. note::
      This and many other problems with using Nova in the undercloud come
      from the fact that Nova is cloud-oriented software, while the
      undercloud is more of a traditional installer. In the "pets vs cattle"
      metaphor, Nova handles the "cattle" case, while the undercloud is the
      "pet" case.

#. Also important for the generic provisioner case, we'll be able to get rid
   of Nova and Glance, reducing the memory footprint.

#. We'll get rid of pre-deploy validations that currently try to guess what
   the Nova scheduler will expect.

#. We'll be able to combine nodes deployed by Ironic with pre-deployed
   servers.

#. We'll be in charge of building the configdrive, potentially putting more
   useful things there.

#. Hopefully, scale-up will be less error-prone.

In the future we may also be able to:

#. Integrate things like building RAID on demand much more easily.

#. Use introspection data in scheduling and provisioning decisions.
   Particularly, we can automate handling root device hints.

#. Make Neutron optional and use static DHCP and/or *os-net-config*.

Proposed Change
===============

Overview
--------

This blueprint proposes replacing the Heat-Nova-Glance triad with Ironic
driven directly by Mistral. To avoid placing Ironic-specific code into
tripleo-common, a new library metalsmith_ has been developed and accepted
under Ironic governance.

As part of the implementation, this blueprint proposes completely separating
the bare metal provisioning process from software configuration, including at
the CLI level. This has two benefits:

#. Having a clear separation between two error-prone processes simplifies
   debugging for operators.

#. Reusing the existing *deployed-server* workflow simplifies the
   implementation.

In the distant future, the functionality of metalsmith_ may be moved into the
Ironic API itself. In that case metalsmith_ will be phased out, while the
same Mistral workflows will be kept.

Operator workflow
-----------------

As noted in Overview_, the CLI/GUI workflow will be split into hardware
provisioning and software configuration parts (the former being optional).

#. In addition to the existing Heat templates, a new file
   baremetal_deployment.yaml_ will be populated by an operator with the bare
   metal provisioning information.
#. Bare metal deployment will be conducted by a new CLI command or GUI
   operation using the new `deploy_roles workflow`_::

       openstack overcloud node provision \
           -o baremetal_environment.yaml baremetal_deployment.yaml

   This command will take the input from baremetal_deployment.yaml_,
   provision the requested bare metal machines and output a Heat environment
   file baremetal_environment.yaml_ to use with the *deployed-server*
   feature.

#. Finally, the regular deployment is done, including the generated file::

       openstack overcloud deploy \
           <other arguments> \
           -e baremetal_environment.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-bootstrap-environment-centos.yaml \
           -r /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-server-roles-data.yaml

For simplicity, the two commands can be combined::

    openstack overcloud deploy \
        <other arguments> \
        -b baremetal_deployment.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-bootstrap-environment-centos.yaml \
        -r /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-server-roles-data.yaml

The new argument ``--baremetal-deployment``/``-b`` will accept the
baremetal_deployment.yaml_ file and run the bare metal provisioning
automatically.

Breakdown of the changes
------------------------

This section describes the required changes in depth.

Image upload
~~~~~~~~~~~~

As Glance will no longer be used, images will have to be served from other
sources. Ironic supports HTTP and local file sources for its images. For the
undercloud case, the file source seems to be the most straightforward, while
the *Edge* case may require using HTTP images.

To make both cases possible, the ``openstack overcloud image upload`` command
will now copy the three overcloud images (``overcloud-full.qcow2``,
``overcloud-full.kernel`` and ``overcloud-full.ramdisk``) to
``/var/lib/ironic/httpboot/overcloud-images``. This will allow referring to
images both via ``file:///var/lib/ironic/httpboot/overcloud-images/...`` and
``http(s)://<host>:<port>/overcloud-images/...``.

Finally, a checksum file will be generated for the copied images using::

    cd /var/lib/ironic/httpboot/overcloud-images
    md5sum overcloud-full.* > MD5SUMS

This is required since the checksums will no longer come from Glance.
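To illustrate the new behavior, here is a minimal sketch of the copy and
checksum steps, assuming the paths from this section (the helper name and
structure are illustrative, not the actual tripleoclient code):

.. code-block:: python

    import hashlib
    import os
    import shutil

    # The three images named above and the HTTP root served by Ironic.
    IMAGES = ('overcloud-full.qcow2', 'overcloud-full.kernel',
              'overcloud-full.ramdisk')
    DEST = '/var/lib/ironic/httpboot/overcloud-images'


    def copy_images_and_checksums(source_dir):
        """Copy the overcloud images and generate the MD5SUMS file."""
        os.makedirs(DEST, exist_ok=True)
        for image in IMAGES:
            shutil.copy2(os.path.join(source_dir, image), DEST)

        # Same format as ``md5sum overcloud-full.* > MD5SUMS``.
        with open(os.path.join(DEST, 'MD5SUMS'), 'w') as checksums:
            for image in IMAGES:
                md5 = hashlib.md5()
                with open(os.path.join(DEST, image), 'rb') as f:
                    for chunk in iter(lambda: f.read(1 << 20), b''):
                        md5.update(chunk)
                checksums.write('%s  %s\n' % (md5.hexdigest(), image))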
baremetal_deployment.yaml
~~~~~~~~~~~~~~~~~~~~~~~~~

This file will describe the bare metal provisioning parameters. It will
provide the information that is currently implicitly deduced from the Heat
templates.

.. note::
   We could continue extracting it from the templates as well. However, a
   separate file avoids a dependency on any Heat-specific logic, potentially
   benefiting standalone installer cases. It also provides the operators with
   more control over the provisioning process.

The format of this file resembles that of the ``roles_data`` file. It
describes the deployment parameters for each role. The file contains a list
of roles, each with a ``name``. Other accepted parameters are:

``count``
    number of machines to deploy for this role. Defaults to 1.
``profile``
    profile (``compute``, ``control``, etc) to use for this role. Roughly
    corresponds to a flavor name in a Nova based deployment. Defaults to no
    profile (any node can be picked).
``hostname_format``
    a template for generating host names. This is similar to
    ``HostnameFormatDefault`` of a ``roles_data`` file and should use
    ``%index%`` to number the nodes. The default is
    ``%stackname%-<role name in lower case>-%index%``.
``instances``
    list of instances in the format accepted by the `deploy_instances
    workflow`_. This allows tuning parameters per instance.

Examples
^^^^^^^^

Deploy one compute and one controller with any profile:

.. code-block:: yaml

    - name: Compute
    - name: Controller

HA deployment with two computes, three controllers and profile matching:

.. code-block:: yaml

    - name: Compute
      count: 2
      profile: compute
      hostname_format: compute-%index%.example.com
    - name: Controller
      count: 3
      profile: control
      hostname_format: controller-%index%.example.com

Advanced deployment with custom hostnames and parameters set per instance:

.. code-block:: yaml

    - name: Compute
      profile: compute
      instances:
      - hostname: compute-05.us-west.example.com
        nics:
        - network: ctlplane
          fixed_ip: 10.0.2.5
        traits:
        - HW_CPU_X86_VMX
      - hostname: compute-06.us-west.example.com
        nics:
        - network: ctlplane
          fixed_ip: 10.0.2.6
        traits:
        - HW_CPU_X86_VMX
    - name: Controller
      profile: control
      instances:
      - hostname: controller-1.us-west.example.com
        swap_size_mb: 4096
      - hostname: controller-2.us-west.example.com
        swap_size_mb: 4096
      - hostname: controller-3.us-west.example.com
        swap_size_mb: 4096

deploy_roles workflow
~~~~~~~~~~~~~~~~~~~~~

The workflow ``tripleo.baremetal_deploy.v1.deploy_roles`` will accept the
information from baremetal_deployment.yaml_, convert it into the low-level
format accepted by the `deploy_instances workflow`_ and call that workflow
with the result.

It will accept the following mandatory input:

``roles``
    parsed baremetal_deployment.yaml_ file.

It will accept one optional input:

``plan``
    plan/stack name, used for templating. Defaults to ``overcloud``.

It will return the same output as the `deploy_instances workflow`_ plus:

``environment``
    the content of the generated baremetal_environment.yaml_ file.

Examples
^^^^^^^^

The examples from baremetal_deployment.yaml_ will be converted to:

.. code-block:: yaml

    - hostname: overcloud-compute-0
    - hostname: overcloud-controller-0

.. code-block:: yaml

    - hostname: compute-0.example.com
      profile: compute
    - hostname: compute-1.example.com
      profile: compute
    - hostname: controller-0.example.com
      profile: control
    - hostname: controller-1.example.com
      profile: control
    - hostname: controller-2.example.com
      profile: control

.. code-block:: yaml

    - hostname: compute-05.us-west.example.com
      nics:
      - network: ctlplane
        fixed_ip: 10.0.2.5
      profile: compute
      traits:
      - HW_CPU_X86_VMX
    - hostname: compute-06.us-west.example.com
      nics:
      - network: ctlplane
        fixed_ip: 10.0.2.6
      profile: compute
      traits:
      - HW_CPU_X86_VMX
    - hostname: controller-1.us-west.example.com
      profile: control
      swap_size_mb: 4096
    - hostname: controller-2.us-west.example.com
      profile: control
      swap_size_mb: 4096
    - hostname: controller-3.us-west.example.com
      profile: control
      swap_size_mb: 4096
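The conversion itself is mostly mechanical. Below is a simplified,
illustrative sketch of the expansion logic (the real implementation will live
in tripleo-common; the function name is hypothetical):

.. code-block:: python

    def expand_roles(roles, stack_name='overcloud'):
        """Expand parsed baremetal_deployment.yaml into instance records."""
        instances = []
        for role in roles:
            hostname_format = role.get(
                'hostname_format',
                '%stackname%-{}-%index%'.format(role['name'].lower()))

            if 'instances' in role:
                # Explicit per-instance parameters take precedence.
                requests = role['instances']
            else:
                requests = [{} for _ in range(role.get('count', 1))]

            for index, request in enumerate(requests):
                instance = dict(request)
                if 'profile' in role:
                    instance.setdefault('profile', role['profile'])
                if 'hostname' not in instance:
                    instance['hostname'] = (
                        hostname_format
                        .replace('%index%', str(index))
                        .replace('%stackname%', stack_name))
                instances.append(instance)
        return instances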
deploy_instances workflow
~~~~~~~~~~~~~~~~~~~~~~~~~

The workflow ``tripleo.baremetal_deploy.v1.deploy_instances`` is a thin
wrapper around the corresponding metalsmith_ calls; a sketch is included at
the end of this section.

The following inputs are mandatory:

``instances``
    list of requested instances in the format described in `Instance
    format`_.
``ssh_keys``
    list of SSH public key contents to put on the machines.

The following inputs are optional:

``ssh_user_name``
    SSH user name to create, defaults to ``heat-admin`` for compatibility.
``timeout``
    deployment timeout, defaults to 3600 seconds.
``concurrency``
    deployment concurrency, i.e. how many nodes to deploy at the same time.
    Defaults to 20, which matches introspection.

Instance format
^^^^^^^^^^^^^^^

The instance record format closely follows that of the `metalsmith ansible
role`_, with only a few TripleO-specific additions and changed defaults.

Either or both of the following fields must be present:

``hostname``
    requested hostname. It is used to identify the deployed instance later
    on. Defaults to ``name``.
``name``
    name of the node to deploy on. If ``hostname`` is not provided, ``name``
    is also used as the hostname.

The following fields will be supported:

``capabilities``
    requested node capabilities (except for ``profile`` and ``boot_option``).
``conductor_group``
    requested node's conductor group. This is primarily for the *Edge* case,
    when nodes managed by the same Ironic can be physically separated.
``nics``
    list of requested NICs, see the metalsmith_ documentation for details.
    Defaults to ``{"network": "ctlplane"}``, which requests creation of a
    port on the ``ctlplane`` network.
``profile``
    profile to use (e.g. ``compute``, ``control``, etc).
``resource_class``
    requested node's resource class, defaults to ``baremetal``.
``root_size_gb``
    size of the root partition in GiB, defaults to 49.
``swap_size_mb``
    size of the swap partition in MiB, if needed.
``traits``
    list of requested node traits.
``whole_disk_image``
    boolean, whether to treat the image (``overcloud-full.qcow2`` or the one
    provided through the ``image`` field) as a whole disk image. Defaults to
    false.

The following fields will also be supported, but the defaults should work for
all but the most extreme cases:

``image``
    file or HTTP URL of the root partition or whole disk image.
``image_kernel``
    file or HTTP URL of the kernel image (partition images only).
``image_ramdisk``
    file or HTTP URL of the ramdisk image (partition images only).
``image_checksum``
    checksum or URL of a checksum file for the root partition or whole disk
    image.

Certificate authority configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If TLS is used in the undercloud, we need to make the nodes trust the
Certificate Authority (CA) that signed the TLS certificates. If
``/etc/pki/ca-trust/source/anchors/cm-local-ca.pem`` exists, it will be
included in the generated configdrive, so that the file is copied into the
same location on the target systems.

Outputs
^^^^^^^

The workflow will provide the following outputs:

``ctlplane_ips``
    mapping of host names to their respective IP addresses on the
    ``ctlplane`` network.
``instances``
    mapping of host names to full instance representations with fields:

    ``node``
        Ironic node representation.
    ``ip_addresses``
        mapping of network names to lists of IP addresses on them.
    ``hostname``
        instance hostname.
    ``state``
        `metalsmith instance state`_.
    ``uuid``
        Ironic node uuid.

Additionally, two subdicts of ``instances`` are provided:

``existing_instances``
    only the instances that already existed.
``new_instances``
    only the instances that were deployed.

.. note::
   Instances are distinguished by their hostnames.
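To make the "thin wrapper" claim concrete, here is a rough sketch of how the
workflow could drive metalsmith_ for a single instance record, using its
``Provisioner`` API (``reserve_node``, ``provision_node``,
``wait_for_provisioning``). The connection setup, image locations and the
omission of profile matching, concurrency and error handling are
simplifications; exact source-class arguments should be checked against the
metalsmith_ documentation:

.. code-block:: python

    import openstack

    from metalsmith import Provisioner
    from metalsmith import instance_config
    from metalsmith import sources

    # Credentials are taken from the standard openstacksdk configuration.
    provisioner = Provisioner(cloud_region=openstack.connect().config)

    IMAGES = '/var/lib/ironic/httpboot/overcloud-images'


    def deploy_instance(request, ssh_keys, timeout=3600):
        """Reserve a node matching one instance record and provision it."""
        node = provisioner.reserve_node(
            resource_class=request.get('resource_class', 'baremetal'),
            conductor_group=request.get('conductor_group'),
            capabilities=request.get('capabilities'),
            traits=request.get('traits'),
            candidates=[request['name']] if request.get('name') else None)

        instance = provisioner.provision_node(
            node,
            # Partition image layout from the Image upload section.
            image=sources.FilePartitionImage(
                IMAGES + '/overcloud-full.qcow2',
                kernel_location=IMAGES + '/overcloud-full.kernel',
                ramdisk_location=IMAGES + '/overcloud-full.ramdisk'),
            nics=request.get('nics', [{'network': 'ctlplane'}]),
            hostname=request.get('hostname'),
            root_size_gb=request.get('root_size_gb', 49),
            swap_size_mb=request.get('swap_size_mb'),
            config=instance_config.CloudInitConfig(ssh_keys=ssh_keys))
        provisioner.wait_for_provisioning([instance.uuid], timeout=timeout)
        return instance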
baremetal_environment.yaml
~~~~~~~~~~~~~~~~~~~~~~~~~~

This file will serve as an output of the bare metal provisioning process. It
will be fed into the overcloud deployment command. Its goal is to provide
information for the *deployed-server* workflow.

The file will contain the ``HostnameMap`` generated from the role names and
hostnames, e.g.:

.. code-block:: yaml

    parameter_defaults:
      HostnameMap:
        overcloud-controller-0: controller-1.us-west.example.com
        overcloud-controller-1: controller-2.us-west.example.com
        overcloud-controller-2: controller-3.us-west.example.com
        overcloud-novacompute-0: compute-05.us-west.example.com
        overcloud-novacompute-1: compute-06.us-west.example.com

undeploy_instances workflow
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The workflow ``tripleo.baremetal_deploy.v1.undeploy_instances`` will take a
list of hostnames and undeploy the corresponding nodes.
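A minimal sketch of such an action on top of metalsmith_, assuming its
``show_instances`` and ``unprovision_node`` calls and reusing the
``provisioner`` object from the previous sketch (the function name is
illustrative):

.. code-block:: python

    def undeploy_instances(provisioner, hostnames, timeout=1800):
        """Undeploy the nodes backing the given hostnames."""
        # metalsmith accepts hostnames as instance identifiers.
        for instance in provisioner.show_instances(hostnames):
            provisioner.unprovision_node(instance.node, wait=timeout)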
Novajoin replacement
--------------------

The *novajoin* service is currently used to enroll nodes into IPA and provide
them with TLS certificates. Unfortunately, it has hard dependencies on Nova,
Glance and the metadata API, even though the information could be provided
via other means. Moreover, the metadata API cannot always be provided with
Ironic (notably, it may not be available when using isolated provisioning
networks).

A potential solution is to provide the required information via a configdrive
and make the nodes register themselves.

Alternatives
------------

* Do nothing, continue to rely on Nova and work around the cases where it
  does not match our goals well. See `Problem Description`_ for why this is
  not desired.

* Avoid metalsmith_, use the OpenStack Ansible modules or Bifrost. They
  currently lack features (such as a VIF attach/detach API) and do not have
  any notion of scheduling. Implementing sophisticated enough scheduling in
  pure Ansible seems a serious undertaking.

* Avoid Mistral, drive metalsmith_ via Ansible. This is a potential future
  direction of this work, but currently it seems much simpler to call the
  metalsmith_ Python API from Mistral actions. We would need Mistral (or
  Ansible Tower) to drive Ansible anyway, because we need some API level.

* Remove Neutron in the same change. This would reduce the footprint even
  further, but some operators may find the presence of an IPAM desirable.
  Also, setting up static DHCP would increase the scope of the implementation
  substantially and complicate the upgrade even further.

* Keep Glance but remove Nova. This does not make much sense, since Glance is
  only a requirement because of Nova. Ironic can deploy from HTTP or local
  file locations just as well.

Security Impact
---------------

* Overcloud images will be exposed to unauthenticated users via HTTP. We need
  to communicate clearly that secrets must not be built into images in plain
  text and should be delivered via the *configdrive* instead. If this proves
  to be a problem, we can limit ourselves to providing images via local
  files.

  .. note::
     This issue exists today, as images are transferred via an insecure
     medium in all supported deploy methods.

* Removing two services from the undercloud will reduce the potential attack
  surface and simplify audit.

Upgrade Impact
--------------

The initial version of this feature will be enabled for new deployments only.

The upgrade procedure will happen within a release, not between releases.
It will go roughly as follows:

#. Upgrade to a release where an undercloud without Nova and Glance is
   supported.

#. Make a full backup of the undercloud.

#. Run ``openstack overcloud image upload`` to ensure that the
   ``overcloud-full`` images are available via HTTP(s).

The next steps will probably be automated via an Ansible playbook or a
Mistral workflow:

#. Mark the deployed nodes *protected* in Ironic to prevent undeploying them
   by mistake.

#. Run a Heat stack update replacing references to Nova servers with
   references to deployed servers. This will require telling Heat not to
   remove the instances.

#. Mark the nodes as managed by *metalsmith* (optional, but simplifies
   troubleshooting).

#. Update the nodes' ``instance_info`` to refer to images over HTTP(s).

   .. note:: This may require temporarily moving nodes to maintenance.

#. Run an undercloud update removing Nova and Glance.

Other End User Impact
---------------------

* The Nova CLI will no longer be available for troubleshooting. This should
  not be a big problem in practice, as most of the problems it is used for
  are caused by using Nova itself.

  metalsmith_ provides a CLI tool for troubleshooting and advanced users. We
  will document using it for tasks like determining IP addresses of nodes.

* It will no longer be possible to update images via the Glance API, e.g.
  from the GUI. This should not be a big issue, as most users use pre-built
  images. Advanced operators are likely to resort to the CLI anyway.

* The *no valid host found* error will no longer be seen by operators.
  metalsmith_ provides more detailed errors, and is less likely to fail
  because its scheduling approach works better for the undercloud case.

Performance Impact
------------------

* A substantial speed-up is expected for deployments because of the removal
  of several layers of indirection. The new deployment process will also fail
  faster if the scheduling request cannot be satisfied.

* Providing images via local files will remove the step of downloading them
  from Glance, providing even more speed-up for larger images.

* An operator will be able to tune the concurrency of a deployment via CLI
  arguments or GUI parameters, rather than via ``nova.conf``.

Other Deployer Impact
---------------------

None.

Developer Impact
----------------

New features for bare metal provisioning will have to be developed with this
work in mind. This may mean implementing something in metalsmith_ code
instead of relying on Nova servers or flavors, or Glance images.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
    Dmitry Tantsur, IRC: dtantsur, LP: divius

Work Items
----------

Phase 1 (Stein, technical preview):

#. Update ``openstack overcloud image upload`` to copy images into the HTTP
   location and generate checksums.

#. Implement the `deploy_instances workflow`_ and the `undeploy_instances
   workflow`_.

#. Update validations to not fail if Nova and/or Glance are not present.

#. Implement the `deploy_roles workflow`_.

#. Provide CLI commands for the created workflows.

#. Provide an experimental OVB CI job exercising the new approach.

Phase 2 (T+, fully supported):

#. Update ``openstack overcloud deploy`` to support the new workflow.

#. Support scaling down.

#. Provide a `Novajoin replacement`_.

#. Provide an upgrade workflow.

#. Consider deprecating provisioning with Nova and Glance.

Dependencies
============

* The metalsmith_ library will be used for easier access to the Ironic and
  Neutron APIs.
Testing
=======

Since testing this feature requires bare metal provisioning, a new OVB job
will be created for it. Initially it will be experimental, and will move to
the check queue before the feature is considered fully supported.

Documentation Impact
====================

Documentation will have to be reworked to explain the new deployment
approach. Troubleshooting documentation will have to be updated.

References
==========

.. _metalsmith: https://docs.openstack.org/metalsmith/latest/
.. _metalsmith ansible role: https://docs.openstack.org/metalsmith/latest/user/ansible.html#instance
.. _metalsmith instance state: https://docs.openstack.org/metalsmith/latest/reference/api/metalsmith.html#metalsmith.Instance.state