460 lines
17 KiB
ReStructuredText
460 lines
17 KiB
ReStructuredText
Hardware introspection for OpenStack Bare Metal
|
|
===============================================
|
|
|
|
This is an auxiliary service for discovering hardware properties for a
|
|
node managed by `Ironic`_. Hardware introspection or hardware
|
|
properties discovery is a process of getting hardware parameters required for
|
|
scheduling from a bare metal node, given it's power management credentials
|
|
(e.g. IPMI address, user name and password).
|
|
|
|
A special ramdisk is required to collect the information on a
|
|
node. The default one can be built using diskimage-builder_ and
|
|
`ironic-discoverd-ramdisk element`_ (see Configuration_ below).
|
|
|
|
* Free software: Apache license
|
|
* Source: http://git.openstack.org/cgit/openstack/ironic-inspector
|
|
* Bugs: http://bugs.launchpad.net/ironic-inspector
|
|
* Blueprints: https://blueprints.launchpad.net/ironic-inspector
|
|
* Downloads: https://pypi.python.org/pypi/ironic-inspector
|
|
* Python client library and CLI tool: `python-ironic-inspector-client
|
|
<https://pypi.python.org/pypi/python-ironic-inspector-client>`_.
|
|
|
|
Refer to CONTRIBUTING.rst_ for instructions on how to contribute.
|
|
|
|
.. _Ironic: https://wiki.openstack.org/wiki/Ironic
|
|
.. _PyPI: https://pypi.python.org/pypi/ironic-inspector
|
|
.. _CONTRIBUTING.rst: https://github.com/openstack/ironic-inspector/blob/master/CONTRIBUTING.rst
|
|
|
|
.. note::
|
|
**ironic-inspector** was called *ironic-discoverd* before version 2.0.0.
|
|
|
|
Version Support Matrix
|
|
----------------------
|
|
|
|
**ironic-inspector** currently requires bare metal API version ``1.6`` to be
|
|
provided by Ironic. This version is available starting with Ironic Kilo
|
|
release.
|
|
|
|
Here is a mapping between Ironic versions and supported **ironic-inspector**
|
|
versions. The Standalone column shows which **ironic-inspector** versions can
|
|
be used in standalone mode with each Ironic version. The Inspection Interface
|
|
column shows which **ironic-inspector** versions can be used with the Ironic
|
|
inspection interface in each version of Ironic.
|
|
|
|
+--------------+-------------------------------+
|
|
|Ironic Version| Inspector (Discoverd) Version |
|
|
| +----------+--------------------+
|
|
| |Standalone|Inspection Interface|
|
|
+==============+==========+====================+
|
|
|Juno |1.0 |N/A |
|
|
+--------------+----------+--------------------+
|
|
|Kilo |1.0 - 2.2 |1.0 - 1.1 |
|
|
+--------------+----------+--------------------+
|
|
|Liberty |1.1 - 2.X |2.0 - 2.X |
|
|
+--------------+----------+--------------------+
|
|
|
|
.. note::
|
|
``2.X`` means we don't have specific plans on deprecating support for this
|
|
Ironic version. This does not imply that we'll support it forever though.
|
|
|
|
Workflow
|
|
--------
|
|
|
|
Usual hardware introspection flow is as follows:
|
|
|
|
* Operator enrolls nodes into Ironic_ e.g. via ironic CLI command.
|
|
Power management credentials should be provided to Ironic at this step.
|
|
|
|
* Nodes are put in the correct state for introspection as described in
|
|
`Node States`_.
|
|
|
|
* Operator sends nodes on introspection using **ironic-inspector** API or CLI
|
|
(see Usage_).
|
|
|
|
* On receiving node UUID **ironic-inspector**:
|
|
|
|
* validates node power credentials, current power and provisioning states,
|
|
* allows firewall access to PXE boot service for the nodes,
|
|
* issues reboot command for the nodes, so that they boot the ramdisk.
|
|
|
|
* The ramdisk collects the required information and posts it back to
|
|
**ironic-inspector**.
|
|
|
|
* On receiving data from the ramdisk, **ironic-inspector**:
|
|
|
|
* validates received data,
|
|
* finds the node in Ironic database using it's BMC address (MAC address in
|
|
case of SSH driver),
|
|
* fills missing node properties with received data and creates missing ports.
|
|
|
|
.. note::
|
|
**ironic-inspector** is responsible to create Ironic ports for some or all
|
|
NIC's found on the node. **ironic-inspector** is also capable of
|
|
deleting ports that should not be present. There are two important
|
|
configuration options that affect this behavior: ``add_ports`` and
|
|
``keep_ports`` (please refer to ``example.conf`` for detailed explanation).
|
|
|
|
Default values as of **ironic-inspector** 1.1.0 are ``add_ports=pxe``,
|
|
``keep_ports=all``, which means that only one port will be added, which is
|
|
associated with NIC the ramdisk PXE booted from. No ports will be deleted.
|
|
This setting ensures that deploying on introspected nodes will succeed
|
|
despite `Ironic bug 1405131
|
|
<https://bugs.launchpad.net/ironic/+bug/1405131>`_.
|
|
|
|
Ironic inspection feature by default requires different settings:
|
|
``add_ports=all``, ``keep_ports=present``, which means that ports will be
|
|
created for all detected NIC's, and all other ports will be deleted.
|
|
Refer to the `Ironic inspection documentation`_ for details.
|
|
|
|
* Separate API (see Usage_) can be used to query introspection results
|
|
for a given node.
|
|
|
|
* Nodes are put in the correct state for deploying as described in
|
|
`Node States`_.
|
|
|
|
Starting DHCP server and configuring PXE boot environment is not part of this
|
|
package and should be done separately.
|
|
|
|
.. _instack-undercloud: https://www.rdoproject.org/Deploying_an_RDO_Undercloud_with_Instack
|
|
.. _Ironic inspection documentation: http://docs.openstack.org/developer/ironic/deploy/install-guide.html#hardware-inspection
|
|
|
|
Installation
|
|
------------
|
|
|
|
Install from PyPI_ (you may want to use virtualenv to isolate your
|
|
environment)::
|
|
|
|
pip install ironic-inspector
|
|
|
|
Also there is a `DevStack <http://docs.openstack.org/developer/devstack/>`_
|
|
plugin for **ironic-inspector** - see CONTRIBUTING.rst_ for the current status.
|
|
|
|
Finally, some distributions (e.g. Fedora) provide **ironic-inspector**
|
|
packaged, some of them - under its old name *ironic-discoverd*.
|
|
|
|
Configuration
|
|
~~~~~~~~~~~~~
|
|
|
|
Copy ``example.conf`` to some permanent place
|
|
(e.g. ``/etc/ironic-inspector/inspector.conf``).
|
|
Fill in at least these configuration values:
|
|
|
|
* ``os_username``, ``os_password``, ``os_tenant_name`` - Keystone credentials
|
|
to use when accessing other services and check client authentication tokens;
|
|
|
|
* ``os_auth_url``, ``identity_uri`` - Keystone endpoints for validating
|
|
authentication tokens and checking user roles;
|
|
|
|
* ``database`` - where you want **ironic-inspector** SQLite database
|
|
to be placed;
|
|
|
|
* ``dnsmasq_interface`` - interface on which ``dnsmasq`` (or another DHCP
|
|
service) listens for PXE boot requests (defaults to ``br-ctlplane`` which is
|
|
a sane default for TripleO-based installations but is unlikely to work for
|
|
other cases).
|
|
|
|
See comments inside `example.conf
|
|
<https://github.com/openstack/ironic-inspector/blob/master/example.conf>`_
|
|
for the other possible configuration options.
|
|
|
|
.. note::
|
|
Configuration file contains a password and thus should be owned by ``root``
|
|
and should have access rights like ``0600``.
|
|
|
|
As for PXE boot environment, you'll need:
|
|
|
|
* TFTP server running and accessible (see below for using *dnsmasq*).
|
|
Ensure ``pxelinux.0`` is present in the TFTP root.
|
|
|
|
* Build and put into your TFTP directory kernel and ramdisk from the
|
|
diskimage-builder_ `ironic-discoverd-ramdisk element`_::
|
|
|
|
ramdisk-image-create -o discovery fedora ironic-discoverd-ramdisk
|
|
|
|
You need diskimage-builder_ 0.1.38 or newer to do it (using the latest one
|
|
is always advised).
|
|
|
|
* You need PXE boot server (e.g. *dnsmasq*) running on **the same** machine as
|
|
**ironic-inspector**. Don't do any firewall configuration:
|
|
**ironic-inspector** will handle it for you. In **ironic-inspector**
|
|
configuration file set ``dnsmasq_interface`` to the interface your
|
|
PXE boot server listens on. Here is an example *dnsmasq.conf*::
|
|
|
|
port=0
|
|
interface={INTERFACE}
|
|
bind-interfaces
|
|
dhcp-range={DHCP IP RANGE, e.g. 192.168.0.50,192.168.0.150}
|
|
enable-tftp
|
|
tftp-root={TFTP ROOT, e.g. /tftpboot}
|
|
dhcp-boot=pxelinux.0
|
|
|
|
* Configure your ``$TFTPROOT/pxelinux.cfg/default`` with something like::
|
|
|
|
default discover
|
|
|
|
label discover
|
|
kernel discovery.kernel
|
|
append initrd=discovery.initramfs discoverd_callback_url=http://{IP}:5050/v1/continue
|
|
|
|
ipappend 3
|
|
|
|
Replace ``{IP}`` with IP of the machine (do not use loopback interface, it
|
|
will be accessed by ramdisk on a booting machine).
|
|
|
|
.. note::
|
|
There are some prebuilt images which use obsolete ``ironic_callback_url``
|
|
instead of ``discoverd_callback_url``. Modify ``pxelinux.cfg/default``
|
|
accordingly if you have one of these.
|
|
|
|
Here is *inspector.conf* you may end up with::
|
|
|
|
[DEFAULT]
|
|
debug = false
|
|
[ironic]
|
|
identity_uri = http://127.0.0.1:35357
|
|
os_auth_url = http://127.0.0.1:5000/v2.0
|
|
os_username = admin
|
|
os_password = password
|
|
os_tenant_name = admin
|
|
[firewall]
|
|
dnsmasq_interface = br-ctlplane
|
|
|
|
.. note::
|
|
Set ``debug = true`` if you want to see complete logs.
|
|
|
|
.. _diskimage-builder: https://github.com/openstack/diskimage-builder
|
|
.. _ironic-discoverd-ramdisk element: https://github.com/openstack/diskimage-builder/tree/master/elements/ironic-discoverd-ramdisk
|
|
|
|
Running
|
|
~~~~~~~
|
|
|
|
Run as ``root``::
|
|
|
|
ironic-inspector --config-file /etc/ironic-inspector/inspector.conf
|
|
|
|
.. note::
|
|
Running as ``root`` is not required if **ironic-inspector** does not
|
|
manage the firewall (i.e. ``manage_firewall`` is set to ``false`` in the
|
|
configuration file).
|
|
|
|
A good starting point for writing your own *systemd* unit should be `one used
|
|
in Fedora <http://pkgs.fedoraproject.org/cgit/openstack-ironic-discoverd.git/plain/openstack-ironic-discoverd.service>`_
|
|
(note usage of old name).
|
|
|
|
Usage
|
|
-----
|
|
|
|
Refer to HTTP-API.rst_ for information on the HTTP API.
|
|
Refer to the `client page`_ for information on how to use CLI and Python
|
|
library.
|
|
|
|
.. _HTTP-API.rst: https://github.com/openstack/ironic-inspector/blob/master/HTTP-API.rst
|
|
.. _HTTP API: https://github.com/openstack/ironic-inspector/blob/master/HTTP-API.rst
|
|
.. _client page: https://pypi.python.org/pypi/python-ironic-inspector-client
|
|
|
|
Using from Ironic API
|
|
~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Ironic Kilo introduced support for hardware introspection under name of
|
|
"inspection". **ironic-inspector** introspection is supported for some generic
|
|
drivers, please refer to `Ironic inspection documentation`_ for details.
|
|
|
|
Node States
|
|
~~~~~~~~~~~
|
|
|
|
* As of Ironic Kilo release the nodes should be moved to ``MANAGEABLE``
|
|
provision state before introspection (requires *python-ironicclient*
|
|
of version 0.5.0 or newer)::
|
|
|
|
ironic node-set-provision-state <UUID> manage
|
|
|
|
With Juno release and/or older *python-ironicclient* it's recommended
|
|
to set maintenance mode, so that nodes are not taken by Nova for deploying::
|
|
|
|
ironic node-update <UUID> replace maintenance=true
|
|
|
|
* After successful introspection and before deploying nodes should be made
|
|
available to Nova, either by moving them to ``AVAILABLE`` state (Kilo)::
|
|
|
|
ironic node-set-provision-state <UUID> provide
|
|
|
|
or by removing maintenance mode (Juno and/or older client)::
|
|
|
|
ironic node-update <UUID> replace maintenance=false
|
|
|
|
.. note::
|
|
Due to how Nova interacts with Ironic driver, you should wait 1 minute
|
|
before Nova becomes aware of available nodes after issuing these commands.
|
|
|
|
Setting IPMI Credentials
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
If you have physical access to your nodes, you can use **ironic-inspector** to
|
|
set IPMI credentials for them without knowing the original ones. The workflow
|
|
is as follows:
|
|
|
|
* Ensure nodes will PXE boot on the right network by default.
|
|
|
|
* Set ``enable_setting_ipmi_credentials = true`` in the **ironic-inspector**
|
|
configuration file.
|
|
|
|
* Enroll nodes in Ironic with setting their ``ipmi_address`` only. This step
|
|
allows **ironic-inspector** to distinguish nodes.
|
|
|
|
* Set maintenance mode on nodes. That's an important step, otherwise Ironic
|
|
might interfere with introspection process.
|
|
|
|
* Start introspection with providing additional parameters:
|
|
|
|
* ``new_ipmi_password`` IPMI password to set,
|
|
* ``new_ipmi_username`` IPMI user name to set, defaults to one in node
|
|
driver_info.
|
|
|
|
* Manually power on the nodes and wait.
|
|
|
|
* After introspection is finished (watch nodes power state or use
|
|
**ironic-inspector** status API) you can turn maintenance mode off.
|
|
|
|
Note that due to various limitations on password value in different BMC,
|
|
**ironic-inspector** will only accept passwords with length between 1 and 20
|
|
consisting only of letters and numbers.
|
|
|
|
Plugins
|
|
~~~~~~~
|
|
|
|
**ironic-inspector** heavily relies on plugins for data processing. Even the
|
|
standard functionality is largely based on plugins. Set ``processing_hooks``
|
|
option in the configuration file to change the set of plugins to be run on
|
|
introspection data. Note that order does matter in this option.
|
|
|
|
These are plugins that are enabled by default and should not be disabled,
|
|
unless you understand what you're doing:
|
|
|
|
``ramdisk_error``
|
|
reports error, if ``error`` field is set by the ramdisk, also optionally
|
|
stores logs from ``logs`` field, see `HTTP API`_ for details.
|
|
``scheduler``
|
|
validates and updates basic hardware scheduling properties: CPU number and
|
|
architecture, memory and disk size.
|
|
``validate_interfaces``
|
|
validates network interfaces information.
|
|
|
|
Here are some plugins that can be additionally enabled:
|
|
|
|
``example``
|
|
example plugin logging it's input and output.
|
|
``raid_device`` (deprecated name ``root_device_hint``)
|
|
gathers block devices from ramdisk and exposes root device in multiple
|
|
runs.
|
|
``extra_hardware``
|
|
stores the value of the 'data' key returned by the ramdisk as a JSON
|
|
encoded string in a Swift object.
|
|
|
|
Refer to CONTRIBUTING.rst_ for information on how to write your own plugin.
|
|
|
|
Troubleshooting
|
|
---------------
|
|
|
|
Errors when starting introspection
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
* *Refusing to introspect node <UUID> with provision state "available"
|
|
and maintenance mode off*
|
|
|
|
In Kilo release with *python-ironicclient* 0.5.0 or newer Ironic
|
|
defaults to reporting provision state ``AVAILABLE`` for newly enrolled
|
|
nodes. **ironic-inspector** will refuse to conduct introspection in
|
|
this state, as such nodes are supposed to be used by Nova for scheduling.
|
|
See `Node States`_ for instructions on how to put nodes into
|
|
the correct state.
|
|
|
|
Introspection times out
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
There may be 3 reasons why introspection can time out after some time
|
|
(defaulting to 60 minutes, altered by ``timeout`` configuration option):
|
|
|
|
#. Fatal failure in processing chain before node was found in the local cache.
|
|
See `Troubleshooting data processing`_ for the hints.
|
|
|
|
#. Failure to load the ramdisk on the target node. See `Troubleshooting
|
|
PXE boot`_ for the hints.
|
|
|
|
#. Failure during ramdisk run. See `Troubleshooting ramdisk run`_ for the
|
|
hints.
|
|
|
|
Troubleshooting data processing
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
In this case **ironic-inspector** logs should give a good idea what went wrong.
|
|
E.g. for RDO or Fedora the following command will output the full log::
|
|
|
|
sudo journalctl -u openstack-ironic-inspector
|
|
|
|
(use ``openstack-ironic-discoverd`` for version < 2.0.0).
|
|
|
|
.. note::
|
|
Service name and specific command might be different for other Linux
|
|
distributions (and for old version of **ironic-inspector**).
|
|
|
|
If ``ramdisk_error`` plugin is enabled and ``ramdisk_logs_dir`` configuration
|
|
option is set, **ironic-inspector** will store logs received from the ramdisk
|
|
to the ``ramdisk_logs_dir`` directory. This depends, however, on the ramdisk
|
|
implementation.
|
|
|
|
Troubleshooting PXE boot
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
PXE booting most often becomes a problem for bare metal environments with
|
|
several physical networks. If the hardware vendor provides a remote console
|
|
(e.g. iDRAC for DELL), use it to connect to the machine and see what is going
|
|
on. You may need to restart introspection.
|
|
|
|
Another source of information is DHCP and TFTP server logs. Their location
|
|
depends on how the servers were installed and run. For RDO or Fedora use::
|
|
|
|
$ sudo journalctl -u openstack-ironic-inspector-dnsmasq
|
|
|
|
(use ``openstack-ironic-discoverd-dnsmasq`` for version < 2.0.0).
|
|
|
|
The last resort is ``tcpdump`` utility. Use something like
|
|
::
|
|
|
|
$ sudo tcpdump -i any port 67 or port 68 or port 69
|
|
|
|
to watch both DHCP and TFTP traffic going through your machine. Replace
|
|
``any`` with a specific network interface to check that DHCP and TFTP
|
|
requests really reach it.
|
|
|
|
If you see node not attempting PXE boot or attempting PXE boot on the wrong
|
|
network, reboot the machine into BIOS settings and make sure that only one
|
|
relevant NIC is allowed to PXE boot.
|
|
|
|
If you see node attempting PXE boot using the correct NIC but failing, make
|
|
sure that:
|
|
|
|
#. network switches configuration does not prevent PXE boot requests from
|
|
propagating,
|
|
|
|
#. there is no additional firewall rules preventing access to port 67 on the
|
|
machine where *ironic-inspector* and its DHCP server are installed.
|
|
|
|
If you see node receiving DHCP address and then failing to get kernel and/or
|
|
ramdisk or to boot them, make sure that:
|
|
|
|
#. TFTP server is running and accessible (use ``tftp`` utility to verify),
|
|
|
|
#. no firewall rules prevent access to TFTP port,
|
|
|
|
#. DHCP server is correctly set to point to the TFTP server,
|
|
|
|
#. ``pxelinux.cfg/default`` within TFTP root contains correct reference to the
|
|
kernel and ramdisk.
|
|
|
|
Troubleshooting ramdisk run
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Connect to the remote console as described in `Troubleshooting PXE boot`_ to
|
|
see what is going on with the ramdisk. The ramdisk drops into emergency shell
|
|
on failure, which you can use to look around. There should be file called
|
|
``logs`` with the current ramdisk logs.
|