
Ironic
Introduction
The ironic hypervisor driver wraps the Bare Metal (ironic) API, enabling Nova to provision baremetal resources using the same user-facing API as for server management.
This is the only driver in nova where one compute service can map to many hosts, meaning a nova-compute service can manage multiple ComputeNodes. An ironic driver managed compute service uses the ironic node uuid for the compute node hypervisor_hostname (nodename) and uuid fields. The relationship of instance:compute node:ironic node is 1:1:1.
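One way to see this mapping is to list the hypervisors known to Nova; with the ironic driver each enrolled node appears as its own entry, named after the ironic node UUID:

    # Each ironic node managed by a nova-compute service shows up as a
    # separate hypervisor of type "ironic", whose hypervisor hostname is
    # the ironic node UUID.
    openstack hypervisor list

    # Inspect a single compute node record by that UUID.
    openstack hypervisor show <ironic-node-uuid>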
Scheduling of bare metal nodes is based on custom resource classes, specified via the resource_class property on a node and a corresponding resource property on a flavor (see the flavor documentation </install/configure-nova-flavors.html>). The RAM and CPU settings on a flavor are ignored, and the disk is only used to determine the root partition size when a partition image is used (see the image documentation </latest/install/configure-glance-images.html>).
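As a sketch of how this fits together (the node UUID variable, the resource class name baremetal.example, and the flavor name bm.example are placeholders), the node's resource class is mirrored onto the flavor as a CUSTOM_* resource, and the standard resource classes are zeroed out so scheduling is driven by the custom class alone:

    # Tag the ironic node with a resource class (placeholder name).
    openstack baremetal node set $NODE_UUID --resource-class baremetal.example

    # Create a matching flavor; the RAM and CPU values are ignored for
    # bare metal scheduling.
    openstack flavor create --ram 1024 --disk 20 --vcpus 1 bm.example

    # Request exactly one unit of the corresponding custom resource class
    # (the name is upper-cased with punctuation turned into underscores)...
    openstack flavor set --property resources:CUSTOM_BAREMETAL_EXAMPLE=1 bm.example

    # ...and zero out the standard resource classes so they do not
    # influence placement.
    openstack flavor set \
      --property resources:VCPU=0 \
      --property resources:MEMORY_MB=0 \
      --property resources:DISK_GB=0 \
      bm.example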
Configuration
- Configure the Compute service to use the Bare Metal service </latest/install/configure-compute.html>.
- Create flavors for use with the Bare Metal service </latest/install/configure-nova-flavors.html>.
- Conductor Groups </admin/conductor-groups.html>.
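As a minimal sketch of the first step (the identity endpoint, project, and credentials below are placeholders for your deployment), nova.conf on the host running nova-compute selects the ironic driver and points the [ironic] section at the Bare Metal service:

    [DEFAULT]
    compute_driver = ironic.IronicDriver

    [ironic]
    # Placeholder keystone credentials for the Bare Metal service user.
    auth_type = password
    auth_url = http://IDENTITY_IP:5000/v3
    project_name = service
    username = ironic
    password = IRONIC_PASSWORD
    project_domain_name = Default
    user_domain_name = Default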
Scaling and performance issues
- It is typical for a single nova-compute process to support several hundred Ironic nodes. There are known issues when you attempt to support more than 1000 Ironic nodes associated with a single nova-compute process, even though Ironic is able to scale out a single conductor group to much larger sizes. There are many other factors that can affect the maximum practical size of a conductor group within your deployment.
- The update_available_resource periodic task reports all the resources managed by Ironic. Depending on the number of nodes, it can take a lot of time, and the nova-compute process will not perform any other operations while this task is running. You can use conductor groups to help shard your deployment between multiple nova-compute processes by setting the [ironic] conductor_group configuration option (see the sketch after this list).
- The nova-compute process using the Ironic driver can be moved between different physical servers using active/passive failover. But when doing this failover, you must ensure the [DEFAULT] host configuration option is the same no matter where the nova-compute process is running. Similarly, you must ensure there is at most one nova-compute process running for each conductor group.
- Running multiple nova-compute processes that point at the same conductor group is now deprecated. Please never have more than one host in the [ironic] peer_list configuration option.
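As a minimal sketch of the sharding approach referenced above (the conductor group name rack-a and the host value are placeholders), each nova-compute process keeps a stable host identity and serves a single conductor group, with the matching ironic nodes assigned to that group on the Ironic side:

    # nova.conf for the nova-compute process that serves the placeholder
    # conductor group "rack-a"; keep "host" identical across any
    # active/passive failover of this process.
    [DEFAULT]
    host = ironic-shard-rack-a

    [ironic]
    conductor_group = rack-a

    # On the Ironic side, assign nodes to the same conductor group, e.g.:
    #   openstack baremetal node set <node-uuid> --conductor-group rack-a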
Known limitations / Missing features
- Migrate
- Resize
- Snapshot
- Pause
- Shelve
- Evacuate