280 Commits

Author SHA1 Message Date
Jenkins
031593614e Merge "Add boot information into the inventory" 2016-06-02 19:11:35 +00:00
Jenkins
928b10cbd3 Merge "Returns CPU flags in the CPU inventory" 2016-06-01 18:21:21 +00:00
Dmitry Tantsur
6670da4ed1 Returns CPU flags in the CPU inventory
These flags will be processed in a new ironic-inspector plugin
to support setting capabilities like cpu_vt (virtualization enabled).

Change-Id: I5fe9310c316841eabdd2d5e2ef2ae30afa03d29a
Partial-Bug: #1571580
2016-06-01 16:12:32 +02:00
Dmitry Tantsur
53b187a4c3 Add boot information into the inventory
Adds a new BootInfo object with 2 fields:

* current_boot_mode - bios or uefi, detected from presence of /sys/firmware/efi
  as per the following answer: http://askubuntu.com/a/162896
  This field will be used for setting the boot_mode capability in ironic-inspector.
* pxe_interface - PXE booting interface, if it can be detected.
  This field is already used by ironic-inspector, added here for consistency.

Change-Id: Ib36b592ffaba3bfa055d65c9526607867d302584
Partial-Bug: #1571580
2016-05-26 17:05:11 +02:00
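
The boot mode detection described in 53b187a4c3 can be sketched roughly as below. The function name and the injectable path argument are hypothetical, added only so the check can be exercised without real firmware; only the /sys/firmware/efi test comes from the commit message.

```python
import os


def detect_boot_mode(efi_path="/sys/firmware/efi"):
    """Return 'uefi' if the EFI sysfs directory exists, otherwise 'bios'.

    The path parameter is an assumption made for testability; the commit
    only states that the presence of /sys/firmware/efi is checked.
    """
    return "uefi" if os.path.isdir(efi_path) else "bios"
```

On a machine booted through UEFI the kernel exposes that directory, so calling detect_boot_mode() with no arguments should report 'uefi' there and 'bios' on a legacy boot.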
Lucas Alvares Gomes
33535cd572 Get root device hints from the node object
In order to support a more complex syntax for root device hints (e.g.
operators: greater than, less than, in, etc...) we need to stop relying
on the kernel command line for passing the root device hints. This patch
changes this approach by getting the root device hints from a cached
node object that was set in the hardware module.

Two new functions: "cache_node" and "get_cached_node" were added to the
hardware module. The idea is to facilitate the access to a node object
representation from the hardware extension methods without changing
method signatures, which would break compatibility with out-of-tree
hardware managers.

Note that the new "get_cached_node" is just a guard function to
facilitate the tests for the code.

The function parse_root_device_hints() and its tests were removed since
it's not used/needed anymore.

Partial-Bug: #1561137
Change-Id: I830fe7da1a59b46e348213b6f451c2ee55f6008c
2016-05-26 14:52:15 +01:00
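
A minimal sketch of the caching approach described in 33535cd572. Only the names cache_node and get_cached_node come from the commit message; the module-level variable, the helper get_root_device_hints and the node layout (hints stored under properties/root_device) are assumptions made for illustration.

```python
# Module-level cache; the variable name is hypothetical.
_NODE = None


def cache_node(node):
    """Remember the node so hardware manager methods can consult it later."""
    global _NODE
    _NODE = node


def get_cached_node():
    """Guard function: a single place to patch in tests."""
    return _NODE


def get_root_device_hints():
    """Read hints from the cached node (assumed dict layout)."""
    node = get_cached_node()
    if node is None:
        return {}
    return node.get("properties", {}).get("root_device", {})
```

Because the hints are read from shared state, existing hardware manager method signatures stay untouched, which is the compatibility concern the commit raises.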
Dmitry Tantsur
c15ed6a48e Wait for at least one suitable disk to appear on start up
Some kernel modules take substantial time to initialize. For example,
with mpt2sas RAID driver inspection and deployment randomly fail
due to IPA starting before the driver finishes initialization.

This problem is probably impossible to solve in the generic case, as
modern Linux environments do not have a notion of a "hardware is fully
initialized" moment. All hardware is essentially hotplug.

To solve it at least for the simplest case, this patch adds a wait loop
on start up waiting for at least one suitable disk to appear in inventory.
Note that root device hints are not considered, as the node might not
be known at that moment yet.

Change-Id: Id163ca28f7c140c302ea04947ded3f3c58b284de
Partial-Bug: #1582797
2016-05-24 10:36:45 +02:00
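
The wait loop from c15ed6a48e might look something like the sketch below. The function name, the attempt count and the delay are assumptions; the commit message only says that start up waits until at least one suitable disk appears in the inventory.

```python
import time


def wait_for_disks(list_disks, attempts=10, delay=3.0, sleep=time.sleep):
    """Poll list_disks() until it returns a non-empty result.

    list_disks is any callable returning the suitable disks found so far.
    Returns True as soon as a disk shows up, False once attempts run out.
    """
    for attempt in range(attempts):
        if list_disks():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give slow drivers more time to initialize
    return False
```

Root device hints are deliberately absent from the check, matching the note in the commit that the node may not be known yet at that point.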
Jay Faulkner
5a1a1ca61c Revert "Add hardware manager interface for hardware initialization"
I would've voted -1 on the patch in question had I reviewed it, and per
standard OpenStack/Ironic procedure, I'm reverting it for re-review and
discussion.

In this case, I don't think the new method in the HWM interface is
needed; evaluate_hardware_support() is intended to handle the cases
it covers.

This reverts commit 0962cae1da69a1a2981d5950ad741d91115dac06.

Change-Id: Ic08e44bdf116403444b257ee9f4e5b906f5eac53
2016-05-23 17:41:29 +00:00
Dmitry Tantsur
0962cae1da Add hardware manager interface for hardware initialization
Some kernel modules take substantial time to initialize. For example,
with mpt2sas RAID driver inspection and deployment randomly fail
due to IPA starting before the driver finishes initialization.

Add a new hardware manager method initialize_hardware, which gets
run on start up before other hardware manager method invocations.

The generic implementation is to call udev settle and wait for
at least one suitable disk device to appear with the hardcoded
timeout of 15 seconds. Also preload the IPMI modules instead of
calling modprobe every time the inventory is requested.

Change-Id: If7758bb6e3faac7d05451baa3a26adb8ab9953d5
Partial-Bug: #1582797
2016-05-20 15:38:53 +02:00
Mathieu Mitchell
1c9ecbd8cb Allow shred zeroize option to be configured
Introduce a new parameter in driver_internal_info called
agent_erase_devices_zeroize to control the behavior of shred. This
parameter controls the --zero argument used when invoking shred.
Setting this to false disables the final pass of zeroes, leaving the
device with random data.

Change-Id: I7053034f5b5bc6737b535ee601e6fb71284d4a83
Partial-bug: #1568811
Depends-On: Ia7ea8d909df9ae86a6dbd68ba94746b171535eb8
2016-04-18 16:59:32 -04:00
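
As an illustration of 1c9ecbd8cb, the shred command line could be assembled as follows. The helper name, the argument order and the default of True for zeroize are assumptions; --force, --zero, --verbose and --iterations are standard GNU shred options. The iteration count is converted to a string because process execution helpers generally expect string arguments.

```python
def build_shred_command(device, iterations=1, zeroize=True):
    """Build the argument list for a shred invocation.

    zeroize would come from driver_internal_info's
    agent_erase_devices_zeroize; its default here is an assumption.
    """
    cmd = ["shred", "--force"]
    if zeroize:
        cmd.append("--zero")  # final overwrite with zeroes
    cmd.extend(["--verbose", "--iterations", str(iterations), device])
    return cmd
```

With zeroize set to False the list simply omits --zero, so the device is left holding the random data from the last pass.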
Jenkins
a630a921d3 Merge "Provide fallback from ATA erase to shredding" 2016-04-08 22:21:40 +00:00
Julia Kreger
ed74a062c1 Provide fallback from ATA erase to shredding
Presently, should the ATA erasure operation fail, IPA halts the
cleaning process and the node goes to CLEANFAIL state as a result.

This failure could be the result of a previous cleaning failure
that left drive security enabled. Code has been added to address
this case by attempting to unlock the drive.

In the event that an operator wishes to automatically fall back to
disk scrubbing operations, the capability has been added through
a driver_internal_info field "agent_continue_if_ata_erase_failed"
that can be set to True. It defaults to False, keeping the
same behavior that IPA presently exhibits in the event of ATA
erase operations failing.

Partial-Bug: #1536695
Change-Id: I88edd9477f4f05aa55b2fe8efa4bbff1c5573bb1
2016-04-08 15:55:06 -04:00
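
The fallback in ed74a062c1 can be pictured with the sketch below. Only the driver_internal_info field name is taken from the commit message; the exception class, the function name and the injected ata_erase and shred callables are hypothetical stand-ins.

```python
class EraseError(Exception):
    """Hypothetical error raised when an erase method fails."""


def erase_device(node, device, ata_erase, shred):
    """Try ATA secure erase first; optionally fall back to shredding.

    Returns the name of the method that was actually used.
    """
    info = node.get("driver_internal_info", {})
    try:
        ata_erase(device)
        return "ata"
    except EraseError:
        # Default False keeps the old behavior: the failure propagates.
        if not info.get("agent_continue_if_ata_erase_failed", False):
            raise
    shred(device)
    return "shred"
```

Keeping the opt-in on the node means an operator can enable the fallback per node without changing how other nodes fail.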
Dmitry Tantsur
3deb25a3ce Wait for the interfaces to get IP addresses before inspection
In the DIB build the DHCP code (provided by the dhcp-all-interfaces element)
races with the service starting IPA. It does not matter for deployment itself,
as we're waiting for the route to the Ironic API to appear. However, for
inspection it may result in reporting back all NICs without IP addresses.
Inspection fails in this case.

This change makes inspection wait for *all* NICs to get their IP addresses up
to a small timeout. The timeout is 60 seconds by default and can be changed
via the new ipa-inspection-dhcp-wait-timeout kernel option (0 to not wait).

After the wait, inspection proceeds in any case, so the worst downside
is making inspection 60 seconds longer.

To avoid waiting for NICs that are not even connected, this change extends the
NetworkInterface class with 'has_carrier' field.

Closes-Bug: #1564954
Change-Id: I5bf14de4c1c622f4bf6e3eadbe20c44759da5d66
2016-04-05 20:03:33 +02:00
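
The DHCP wait from 3deb25a3ce could be sketched as below. The function name, the polling interval and the dict-based interface records are assumptions; has_carrier and ip4_address are the NetworkInterface field names mentioned in this log, and 60 seconds is the default timeout stated in the commit.

```python
import time


def wait_for_dhcp(list_interfaces, timeout=60, interval=1,
                  clock=time.monotonic, sleep=time.sleep):
    """Wait until every connected NIC has an IPv4 address or time runs out.

    list_interfaces returns dicts with 'has_carrier' and 'ip4_address'.
    Returns True if all connected NICs got an address, False on timeout.
    A timeout of 0 performs a single check and never sleeps.
    """
    deadline = clock() + timeout
    while True:
        pending = [nic for nic in list_interfaces()
                   if nic["has_carrier"] and not nic["ip4_address"]]
        if not pending:
            return True
        if clock() >= deadline:
            return False  # inspection proceeds anyway after the wait
        sleep(interval)
```

NICs without a carrier are skipped, so an unplugged port does not hold inspection back for the full timeout.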
Jenkins
dcd1c8f19b Merge "Document hardware inventory sent to lookup and inspection" 2016-03-15 17:04:25 +00:00
Lucas Alvares Gomes
055998c812 Wait for udev to settle before listing the block devices
This patch makes the list_all_block_devices() method wait for
udev to settle its event queue prior to listing the devices.

Sometimes the ironic-python-agent service may start before all devices
were detected and end up erroring out because it couldn't find a
suitable disk for deployment.

Closes-Bug: #1551300
Change-Id: I1ae2062a711115a1ea14b79ae9ace7ddd2fff9d5
2016-03-02 10:54:51 +00:00
twm2016
d66fa523bf Reduced restriction of parsing for dmidecode output
Changed implementation to strip tokens up until the first 'Size: '
string. This will allow for fewer parsing errors in the first
six lines of the following output:
"dmidecode --type 17 | grep Size" returns:
        Maximum Memory Module Size: 4096 MB
        Maximum Total Memory Size: 8192 MB
        Size: 2048 MB
        Size: 2048 MB

Added a condition in the exception handling to address the
issue of the bug on other outputs like:
        Installed Size: Not Installed
        Enabled Size: Not Installed
        Size: No Module Installed
        Size: 1024 MB

Common strings like "No Module Installed" and "Not Installed" are
normal. These two strings are hard-coded in the aforementioned
comparison and, when found, are logged as warnings instead of errors.

Change-Id: If3475afcebfc7af7e9256b99924919557c4d909c
Closes-Bug: #1521202
2016-02-22 22:46:28 +00:00
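
A rough sketch of the per-line parsing described in d66fa523bf. The function name and the restriction to values expressed in MB are assumptions made to keep the example short; the two hard-coded strings and the warning-versus-error distinction come from the commit message.

```python
import logging

LOG = logging.getLogger(__name__)

# Values that are normal for empty memory slots.
NOT_INSTALLED = ("No Module Installed", "Not Installed")


def parse_size_mb(line):
    """Return the size in MB from one dmidecode line, or None."""
    if "Size: " not in line:
        return None
    # Drop everything up to and including the first 'Size: '.
    value = line.split("Size: ", 1)[1].strip()
    try:
        number, unit = value.split()
        if unit != "MB":
            raise ValueError(value)
        return int(number)
    except ValueError:
        if value in NOT_INSTALLED:
            LOG.warning("Memory slot is empty: %s", line.strip())
        else:
            LOG.error("Cannot parse memory size: %s", line.strip())
        return None
```

Splitting on the first 'Size: ' means a prefix such as "Installed" or "Maximum Total Memory" no longer breaks the parse, which is the relaxation the commit describes.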
Dmitry Tantsur
c9674da220 Document hardware inventory sent to lookup and inspection
Also add a missing docstring to HardwareManager.list_hardware_info.

Change-Id: Iee3584320f0591398e7761513ff588efeb62886d
2016-02-18 13:32:43 +01:00
Jenkins
b5008dcb31 Merge "Add 'system_vendor' information to data" 2016-02-16 18:41:09 +00:00
Yuiko Takada
3823a53040 Add 'system_vendor' information to data
This patch set adds hardware vendor information to the data.
This data can provide hints for detecting the driver.

Change-Id: I39385fd5d616edfad719c255f22642f215bfb532
2016-02-15 10:19:17 +09:00
Lucas Alvares Gomes
6752ce8032 Extend root device hints to support device name
This patch is extending the root device hints to also look at the device
name. This patch also refactors the tests for root device hints making
it easier to test a different hint per test.

Change-Id: I48d6456c75bbe6ddf16ac6561e5461ca51eb9c37
Partial-Bug: #1526732
2016-02-02 10:32:39 +00:00
Josh Gachnang
61b4387b95 Allow hardware managers to override clean step priority
If two hardware managers have the same clean step, for example
'erase_devices' in the GenericHardwareManager and a custom manager,
IPA must determine which step should be kept and which should be run
in order to prevent running the step multiple times.

This patch uses the following filtering logic to decide which step
"wins":
- Keep the step that belongs to HardwareManager with highest
  HardwareSupport (larger int) value.
- If equal support level, keep the step with the higher defined
  priority (larger int).
- If equal support level and priority, keep the step associated with
  the HardwareManager whose name comes earlier in the alphabet.

Other than individual step priority, picking which step to keep does
not actually impact the cleaning run. However, in order to make
testing easier, this change ensures deterministic, predictable
results.

Co-Authored-By: Mario Villaplana <mario.villaplana@gmail.com>
Co-Authored-By: Jay Faulkner <jay@jvf.cc>
Co-Authored-By: Brad Morgan <brad@morgabra.com>
Change-Id: Iaeea4200c38ee22cab72ba81c1dbae3389e675e4
2016-01-14 13:12:52 -08:00
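
The filtering rules listed in 61b4387b95 can be expressed as a single sort key, as in the sketch below. The dict-based candidate records and both function names are hypothetical; the three tie-breaking rules follow the commit message.

```python
def _rank(candidate):
    """Smaller tuples win: higher support, then higher priority,
    then the manager name that comes earlier in the alphabet."""
    return (-candidate["support"], -candidate["priority"],
            candidate["manager"])


def dedupe_clean_steps(candidates):
    """Keep one winning candidate per clean step name.

    Each candidate is a dict with keys 'step', 'priority', 'manager'
    and 'support' (the manager's hardware support level).
    """
    winners = {}
    for cand in candidates:
        name = cand["step"]
        if name not in winners or _rank(cand) < _rank(winners[name]):
            winners[name] = cand
    return winners
```

Because the key is a total order over candidates with distinct manager names, the result does not depend on the order in which managers are loaded, which gives the deterministic behavior the commit is after.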
Dmitry Tantsur
2fc6ce22f8 pyudev exception has changed for from_device_file
Now pyudev raises DeviceNotFoundByFileError which does not inherit
from EnvironmentError, so our 'except' block in hardware.py no longer
catches the exception. It broke unit tests, but it can also potentially
break the deploy.

This patch updates hardware.py to catch both old and new exceptions.

Change-Id: Iaefd6089f6f766a241054d8e132b2f3098c8130d
Closes-Bug: #1522756
2015-12-07 16:47:29 +01:00
Lucas Alvares Gomes
c21409e98b Follow up patch for da9c3b0adc67efa916fc534d975823c0a45948a1
This is a follow-up patch fixing some nits left from the review of
da9c3b0adc67efa916fc534d975823c0a45948a1. It adds the
wwn_with_extension and wwn_vendor_extension root device hints to the
"serializable_fields" list attribute of the BlockDevice class and fixes
some tests.

Change-Id: I6039be535988319276f9ac355c80997d34328ce8
2015-11-18 09:56:09 +00:00
Lucas Alvares Gomes
da9c3b0adc Extend root device hints for different types of WWN
This patch is extending the root device hints to also look at
ID_WWN_WITH_EXTENSION and ID_WWN_VENDOR_EXTENSION from udev.

Prior to this patch the IPA ramdisk only cared about ID_WWN, but on
some platforms with a RAID controller this ID can be the same
even for different disks (see bug 1516641).

Closes-Bug: #1516641
Change-Id: Ic3e9a1111dfcc99702190c173562a0dccf5f94c4
2015-11-16 14:58:24 +00:00
Zhenguo Niu
18d5d6aba3 Replace deprecated LOG.warn with LOG.warning
Change-Id: Ib3d566f6e608ee453659e15cabcf8e9332aedc52
Closes-Bug: #1508442
2015-10-22 14:42:57 +08:00
John L. Villalovos
deb50ac5a8 Add LOG.debug() if requested device type not found
This is a follow-up patch for commit
3af9ab36bfae3a369fdb3d2b6d02ac803c39ee17

The review requested that a LOG.debug() message be added.

Change-Id: I36fbd4269c948812f4bee66d0130150afd0c0279
2015-10-07 17:07:23 -07:00
Jenkins
aa908205c6 Merge "Refactor list_all_block_devices & add block_type param" 2015-10-07 02:42:07 +00:00
John L. Villalovos
dcbba2b121 Enforce all flake8 rules except E129
Bring ironic-python-agent in line with the other ironic projects.

Stop ignoring all E12* errors except E129
Stop ignoring E711

Change-Id: Icb9bc198473d1b5e807c20869eb2af7f4d7ac360
2015-10-02 10:01:00 -07:00
Jenkins
378197caee Merge "Make the erase_devices clean step abortable" 2015-09-26 22:25:17 +00:00
Lucas Alvares Gomes
cd70f514d6 Make the erase_devices clean step abortable
This patch updates the get_clean_steps() method to make the
erase_devices step abortable. Erasing devices is something that can be
cancelled without damaging the machine.

When a clean step is aborted, the provision state of the Ironic node
will go to CLEANFAIL. The operator can then do what is needed to
fix the problem (e.g. network booting issues) and restart the cleaning
later on.

Partial-Bug: #1455825
Change-Id: Ic181ac3712810c6f6925e8b627ee79e77ecf4d83
2015-09-26 19:01:26 +00:00
John L. Villalovos
3af9ab36bf Refactor list_all_block_devices & add block_type param
Put the columns to retrieve from lsblk into a list so that future
modifications to columns will require fewer code changes.

Also add a 'block_type' parameter, which defaults to 'disk', to make the
function more flexible for callers that want a different block type.

Update and add unit tests

Change-Id: If06460e13a5b56dc8d6efca9ff5b58ac6ba1f357
2015-09-24 15:31:13 -07:00
Dmitry Tantsur
b569e37d06 Expose serial, wwn and vendor on the BlockDevice object
Currently we only use these disk properties for root device hints.
However, they'll be really useful for inspector, especially for
implementing root device hints there as well.

Change-Id: I48aa6b6d2d198d16f2f8e387970f7230066cf8a2
2015-09-21 13:17:20 +02:00
John L. Villalovos
1285baee1c Create a SerializableComparable class
Create a SerializableComparable class derived from the Serializable
class.

Added the following functions to the SerializableComparable class:
  '__eq__'
  '__ne__'

Disable the '__hash__' function in the SerializableComparable class as
some derived classes are mutable.

Use the SerializableComparable class in hardware.py and
extensions/base.py

This should make unit testing users of the class easier when doing a
self.assertEqual() or self.assertNotEqual()

Added some initial unit testing for encoding.py

Change-Id: If0f14b3bfe7f1391f65dd730a16a534afed0da82
2015-09-11 13:44:09 -07:00
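
A minimal sketch of the class described in 1285baee1c. The Serializable base shown here is a simplified stand-in, and comparing by serialized fields is an assumption about how equality is defined; the commit message only names the added methods and the disabled hash.

```python
class Serializable:
    """Simplified stand-in for the base class; fields are listed by name."""
    serializable_fields = ()

    def serialize(self):
        return {f: getattr(self, f) for f in self.serializable_fields}


class SerializableComparable(Serializable):
    """Adds equality based on the serialized representation."""

    # Instances of some subclasses are mutable, so hashing is disabled.
    __hash__ = None

    def __eq__(self, other):
        if not isinstance(other, Serializable):
            return NotImplemented
        return self.serialize() == other.serialize()

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result
```

With this in place a unit test can call self.assertEqual() on two objects built from the same values instead of comparing attribute by attribute.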
Dmitry Tantsur
096830414b Add support for inspection using ironic-inspector
Adds a new module ironic_python_agent.inspector and new entry point
for extensions, which will allow vendor-specific inspection.

Inspection is run on service start up just before the lookup.
Due to this early start, and due to the fact we don't even know the
MAC addresses of nodes on inspection (to say nothing about IP addresses),
exception handling is a bit different from other agent features:
we try hard not to error out until we send at least something to inspector.

Change-Id: I00932463d41819fd0a050782e2c88eddf6fc08c6
2015-09-07 18:22:54 +02:00
Jenkins
a417baf25c Merge "Fix get_os_install_device()" 2015-09-06 13:35:12 +00:00
Jenkins
87223a0f94 Merge "Refactor list_block_devices to its own function" 2015-09-02 22:17:44 +00:00
Pavlo Shchelokovskyy
19575d026a Fix get_os_install_device()
Instead of silently failing, raise DeviceNotFound when no root device
hints were provided and all found block devices are smaller than 4GB.

Change-Id: Idd2e2c5905adf847f00ad15a84a817c3715225dd
Closes-Bug: #1490761
2015-09-02 17:13:07 +00:00
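
The behavior fixed in 19575d026a, for the case without root device hints, might be sketched as follows. The function name, the tuple-based device records, the choice of the smallest qualifying device and the reading of "4GB" as 4 GiB are all assumptions; raising DeviceNotFound instead of failing silently is what the commit describes.

```python
class DeviceNotFound(Exception):
    """Stand-in for the agent's error type of the same name."""


MIN_SIZE_BYTES = 4 * 1024 ** 3  # assumed interpretation of "4GB"


def pick_install_device(devices):
    """Choose an install device from (name, size_in_bytes) tuples.

    Raises DeviceNotFound when every device is below the minimum size.
    """
    big_enough = [d for d in devices if d[1] >= MIN_SIZE_BYTES]
    if not big_enough:
        raise DeviceNotFound(
            "No block device of at least %d bytes found" % MIN_SIZE_BYTES)
    # Assumption: prefer the smallest device that still qualifies.
    return min(big_enough, key=lambda d: d[1])[0]
```

Raising early gives the caller a clear error to report rather than an undefined device name further down the deployment path.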
Jenkins
41caebed8c Merge "Dispatch the call to erase_block_device" 2015-09-01 03:10:51 +00:00
Jay Faulkner
09221f98cb Load Hardware Managers at runtime
Hardware managers should load at runtime. This will ensure the agent is
ready to respond to API calls before it begins heartbeating. Also, it
means in case of a syntax or other error in a HardwareManager, the agent
will crash before it heartbeats, which is better than it working until a
hardware manager method is needed.

Change-Id: I9403ce7bedc8d5af20b6d84371367253b26b74c2
Closes-bug: 1490008
2015-08-28 14:15:31 -07:00
Josh Gachnang
cd6f15dffe Dispatch the call to erase_block_device
There is no way for two hardware managers to handle erasing two disks
in two different ways. dispatch_to_managers was designed specifically
for this case, and the default behavior will remain the same for the
GenericHardwareManager (erase_block_device will pick up each disk).

Also return the result of the dispatch calls, so they'll be logged by
Ironic and give more cleaning insight.

Change-Id: I19e9dc8539a0729fbb96cae92fe633e24608fc68
2015-08-28 18:08:56 +00:00
Josh Gachnang
c014549804 Refactor list_block_devices to its own function
This function is useful in any HardwareManager that interacts with
disks. Subclassing GenericHardwareManager is not ideal for any
hardware manager that interacts with only specific devices.

Change-Id: Ib20e68a8916590513c0a825e44407a110cfbb441
2015-08-28 10:03:40 -07:00
Dmitry Tantsur
17c7e05235 Extend hardware manager with data needed for inspector
* Added NetworkInterface.ip4_address
* Added HardwareManager.get_bmc_address()
* Added Memory.physical_mb

  This is total memory as reported by dmidecode, and yes,
  it's different from total, as it includes kernel reserved space.

* Added CPU.architecture

  As a side effect, get_cpus was switched to lscpu.
  Also fixes a problem where get_cpus reported the current frequency
  instead of the maximum one.

Change-Id: I4080d4d551eb0bb995a94ef9a300351910c09fb9
2015-08-21 16:25:04 +02:00
Jenkins
bde6ed5570 Merge "Improve IPA logging and exception handling" 2015-08-03 16:31:47 +00:00
Josh Gachnang
9f2ea824ec Add node param to base erase_block_device
The param was added to the GenericHardwareManager but it wasn't added
to the base class.

This is a breaking API change for the hardware managers.

Change-Id: Ia73fe14308986496e3a4f8d71bc2298a9130cffa
2015-07-28 16:57:42 -07:00
Josh Gachnang
59281ecda8 Improve IPA logging and exception handling
Debugging the agent is a huge pain point. Tracebacks are rarely logged,
error messages are often only returned via the API, and lack of
info logging makes it hard to determine where some failures occur.
Some errors only return a 500 with no error message or logs.

Change-Id: I0a127de6e4abf62e20d5c5ad583ba46738604d2d
2015-07-28 09:37:43 -07:00
Jacob McCann
c0769691bd Convert Int to String for shred execute
Was running into 'expected string, int found' when calling
shred with an Int for iterations.

Change-Id: Iffce247caba5b0d62ac89b6411402c8d975cfd2f
Closes-Bug: #1469838
2015-07-01 15:08:18 +00:00
Anusha Ramineni
02f78453b2 IPA:'shred' utility to use configured iterations
Today there is no option to configure the number of iterations that
shred performs when erasing a block device; it defaults to 1. This patch
adds a configuration option to change the number of passes used
to erase a block device.

Change-Id: I1921d33a6b364c4682b6c9baaf61ac092cfa11d7
Partial-Bug:#1465130
2015-06-18 09:26:36 +00:00
Anusha Ramineni
8cef029d0d Fix error in in-band disk erase using shred
In-band disk erase using shred fails with error "'module' object has no
attribute 'ProcessExecutionError'". This commit fixes the issue.

Change-Id: Ia0c426074b2f0e9d534ed96a3e213933160edc61
Closes-Bug:#144799
2015-05-08 15:02:36 +05:30
Anusha Ramineni
efba46a8a2 Fix inband disk erase using agent_ilo driver
In-band disk erase using shred fails for the agent_ilo driver as it tries to
erase the attached virtual floppy device. This fix skips the virtual
media devices and continues with other disks.

Change-Id: I26745985382d440f7d4b3fbfffb14545067fcca6
Closes-Bug:#1450298
2015-05-07 09:51:11 +05:30
Jenkins
a1c87672ea Merge "Fix Sphinx Autodoc WARNING/ERROR in docs build" 2015-04-01 00:39:21 +00:00
Jay Faulkner
8bad5bbac3 Fix Sphinx Autodoc WARNING/ERROR in docs build
The docstrings here were all giving WARNINGs or ERRORs during the docs
build, and were generally making unappealing looking developer
documentation. I corrected the syntax and did what was necessary to
make the build come out clean.

Change-Id: I74b00a7f125770b0468cff3bdf26d0d52cd054d7
(cherry picked from commit c0921cdff372ce1fd6df1c4ab4eb5463e2cba0e4)
2015-03-31 16:22:57 -07:00