The log extension is responsible for retrieving logs from the system,
if journalctl is present the logs will come from it, otherwise we
fallback to getting the logs from the /var/log directory + dmesg logs.
In the coreos ramdisk, we need to bind mount /run/log in the container
so the IPA service can have access to the journal.
For the tinyIPA ramdisk, the logs from IPA are now being redirected to
/var/logs/ironic-python-agent.log instead of only going to the default
stdout.
Inspector now shares the same method of collecting logs, extending its
capabilities for non-systemd systems.
Partial-Bug: #1587143
Change-Id: Ie507e2e5c58cffa255bbfb2fa5ffb95cb98ed8c4
Adds a new collector, which gathers list of PCI devices.
Each entry is a dictionary containing 2 keys:
- vendor-id
- product-id
Such information can then be used by the inspector to distinguish
appropriate PCI devices.
Change-Id: Id7521d66410e7d408d7eada692b6123e769ce084
Partial-Bug: #1580893
This replaces dict.get(key) with dict[key] in the unit tests,
since we should be explicit about whether we expect the key to
exist or not. (These are tests, after all.)
Change-Id: Ide98ceba325214e55cab2063d9ace150d63f9617
Closes-Bug: #1590808
To support multi-tenant networking in Ironic we need to be able to
discover not just the NICs a baremetal machine has but also the physical
connectivity to switches in the network.
This patch collects LLDP (Link Layer Discovery Protocol) data as part of
the list interfaces stage of the generic hardware manager. This
information can then be processed by the ironic inspector to populate
the local link information on each ironic port.
The processing done on this data in ironic python agent is limited, this
is to allow for server side processing hooks to process as much or as
little of the data as they want. This is to allow for multi-vendor
environments that might use different parts of the LLDP packet to use a
generic ramdisk and configure the processing server side using inspector
plugins.
Reserved fields switch_port_descr and switch_chassis_descr have been
deprecated for removal in Ocata in favor of passing the whole packet.
Change-Id: Idae9b1ede1797029da1bd521501b121957ca1f1a
Partial-Bug: #1526403
The functional tests in IPA were broken because the configuration
options weren't loaded prior to starting the service. This patch does
now register the configuration option at the base class.
Closes-Bug: #1595145
Change-Id: Iaaa16fddd093075e7f995fb82ad3abb64e8e5794
This patch replace assertRaisesRegexp with
assertRaisesRegex which is deprecated in python3
https://docs.python.org/3.2/library/unittest.html
Also it update the base tests to be oslotest
BaseTestCase for python2.7 and python3 compatibility
Change-Id: I02571946f0643247e208d98dc91ea78cd9d351ee
https://review.openstack.org/#/c/320295/ introduced two internal
variables: _DISK_WAIT_ATTEMPTS and _DISK_WAIT_DELAY. These values are
hardcoded. This patch adds configuration options for these so
that an operator can change them based on their own needs/fleet of
hardware.
Change-Id: I2ba97669ec710fb4a435307466cd8add9c2293ba
Closes-Bug: #1585663
Passing mutable objects as default args is a known Python pitfall.
We'd better avoid this. This commit changes mutable default args with
None, 'arg = [] if arg is None else arg'.
TrivialFix
Change-Id: I384b24e81543999a8b873e3223cd409ed799ffa0
Tests should use:
self.assertIn(value, list)
self.assertNotIn(value, list)
instead of:
self.assertTrue(value in list)
self.assertFalse(value in list)
because assertIn and assertNotIn raise more meaningful errors:
self.assertIn(3, [1, 2]
>>> MismatchError: 3 not in [1, 2]
self.assertTrue(3 in [1, 2])
>>> AssertionError: False is not true
Change-Id: I2d1c78fe71fe03e350b1035123b0a48b7186a6ec
Closes-Bug: #1510007
During node look up sometimes IPA sends hardware inventory information before
the interfaces are up, resulting in empty interfaces information being sent
to the conductor. This causes conductor to throw exceptions and polluting
the log until the real interfaces information is populated. This change makes
IPA to wait for at least one interface to be up before trying for node lookup.
Change-Id: Ifdb91298eaa5c725f108fa722263ed925691ecda
Closes-Bug: #1562786
Every other Ironic python agent kernel parameter is prefixed with "ipa-".
This patch allows users to use the old "lldp-timeout" parameter or the new
"ipa-lldp-timeout" parameter. Warning message is logged if "lldp-timeout"
parameter is used.
(Also fixed typo while I'm at it.)
Change-Id: Icc05ead31506628e4926be6549916a19cad48db3
Closes-Bug: #1588325
These flags will be processed in a new ironic-inspector plugin
to support setting capabilities like cpu_vt (virtualization enabled).
Change-Id: I5fe9310c316841eabdd2d5e2ef2ae30afa03d29a
Partial-Bug: #1571580
This patch moves the IPA oslo configs out of the agent cmd into their
own module so that it is safe to import them from other places in the
application without causing circular imports.
Change-Id: I100792bd0d1f369763afaa6f93e144e9967c3048
utils.SUPPORTED_ROOT_DEVICE_HINTS is no longer being used so
delete it. (The method that used it was removed in
33535cd572b90a817b9c96b42a0d2fe54751351d).
Change-Id: Ibb675d4496d7814778f3bab9c161734013479116
Adds a new BootInfo object with 2 fields:
* current_boot_mode - bios or uefi, detected from presence of /sys/firmware/efi
as per the following answer: http://askubuntu.com/a/162896
This field will be used for setting the boot_mode capability in ironic-inspector
* pxe_interface - PXE booting interface, if it can be detected.
This fields is already used by ironic-inspector, added here for consistency.
Change-Id: Ib36b592ffaba3bfa055d65c9526607867d302584
Partial-Bug: #1571580
In order to support a more complex syntax for root device hints (e.g
operators: greater than, less than, in, etc...) we need to stop relying
on the kernel command line for passing the root device hints. This patch
changes this approach by getting the root device hints from a cached
node object that was set in the hardware module.
Two new functions: "cache_node" and "get_cached_node" were added to the
hardware module. The idea is to facilitate the access to a node object
representation from the hardware extension methods without changing
method signatures, which would break compatibility with out-of-tree
hardware managers.
Note that the new "get_cached_node" is just a guard function to
facilitate the tests for the code.
The function parse_root_device_hints() and its tests were removed since
it's not used/needed anymore.
Partial-Bug: #1561137
Change-Id: I830fe7da1a59b46e348213b6f451c2ee55f6008c
Some kernel modules take substantial time to initialize. For example,
with mpt2sas RAID driver inspection and deployment randomly fail
due to IPA starting before the driver finishes initialization.
As much as I hate it, the only way to guarantee that the hardware is
truely initalized is to wait for it. Apparently all hardware in Linux
is treated as hotplugged, so there is no such thing as "hardware
initialization is finished". Operators can add a sleep based on their
knowledge of their hardware.
The default behaviour remains the same.
Change-Id: I0446ae81d760dacaf31eea6ad9f9eaa098cf5e93
Partial-Bug: #1582797
Some kernel modules take substantial time to initialize. For example,
with mpt2sas RAID driver inspection and deployment randomly fail
due to IPA starting before the driver finishes initialization.
This problem is probably impossible to solve in a generic case, as
modern Linux environment do not have a notion of "hardware is fully
initialized" moment. All hardware is essentially hotplug.
To solve it at least for the simplest case, this patch adds a wait loop
on start up waiting for at least one suitable disk to appear in inventory.
Note that root device hints are not considered, as the node might not
be known at that moment yet.
Change-Id: Id163ca28f7c140c302ea04947ded3f3c58b284de
Partial-Bug: #1582797
I would've voted -1 on the patch in question had I reviewed it, and per
standard OpenStack/Ironic procedure, I'm reverting it for re-review and
discussion.
In this case; I don't think the new method in the HWM interface is
needed, and that evaluate_hardware_support() is intended to handle the
cases handled.
This reverts commit 0962cae1da69a1a2981d5950ad741d91115dac06.
Change-Id: Ic08e44bdf116403444b257ee9f4e5b906f5eac53
Some kernel modules take substantial time to initialize. For example,
with mpt2sas RAID driver inspection and deployment randomly fail
due to IPA starting before the driver finishes initialization.
Add a new hardware manager method initialize_hardware, which gets
run on start up before other hardware manager method invocations.
The generic implementation is to call udev settle and wait for
at least one suitable disk device to appear with the hardcoded
timeout of 15 seconds. Also preload the IPMI modules instead of
calling modprobe every time the inventory is requested.
Change-Id: If7758bb6e3faac7d05451baa3a26adb8ab9953d5
Partial-Bug: #1582797
The script splits 'sfdisk' output wrongly in blank spaces leading to
failures when partition id is greater than 10, e.g.:
"/dev/sda10: start= ...".
This fix splits the output on ":" and trims trailing space when
partition id is less than 10, e.g.: "/dev/sda1 : start= ...".
Change-Id: I23f4b747fc0a86713cb912afc5d193398e6a597b
Closes-Bug: 1581699
This patch adds code into the tinyipa build process and IPA itself to
allow the required python code to be PYTHONOPTIMIZE precompiled into
pyo files, this speeds up IPA startup time in a nested virt by 50%.
Change-Id: Ib60c420719ea52a602c1752b572d3b217c2cefc7
We hoped that checking /sys/class/net/XXX/carrier will allow us
to not wait for interfaces that are not connected at all.
In reality this field turned out to be unreliable. For example, it is
also set to 0 when interface is down or is being configured.
The bug https://bugzilla.redhat.com/show_bug.cgi?id=1327255 shows
the case when carrier is 0 for all interfaces, including one that is
used to post back data, which is obvious non-sense.
This change removes check on carrier for the loop. To avoid 60 seconds
wait for people with several NIC's, it's changed to only wait for the
PXE booting NIC, which obviously must get an IP address.
This makes IP addresses in the inspection data for other NIC's somewhat
unreliable. A new option inspection_dhcp_all_interfaces is introduced
to allow waiting for all NIC's to get IP addresses.
This change should finally fix bug 1564954.
Change-Id: I8b04bf726980fdcf6bd536c6bb28e30ac50658fb
Related-Bug: #1564954
Introduce a new parameter in driver_internal_info called
agent_erase_devices_zeroize to control the behavior of shred. This
parameter controls the --zero argument used when invoking shred.
Configuring this to false disabled the last pass of zeroes, leaving the
device with random data.
Change-Id: I7053034f5b5bc6737b535ee601e6fb71284d4a83
Partial-bug: #1568811
Depends-On: Ia7ea8d909df9ae86a6dbd68ba94746b171535eb8
Create a helper function: create_hdparm_info() and use it in the unit
tests.
Also fix mispelling of 'indicative'
Change-Id: Ifa15fde72bc0ca6d925408c1dcafce85c192abb7
Presently should the ATA erasure operation fails, IPA halts the
cleaning process and the node goes to CLEANFAIL state as a result.
This failure could be the result of a previous cleaning failure
that left drive security enabled, for which code has been added
in an attempt to address this case by attempting to unlock the
the drive.
In the event that an operator wishes to automatically fallback to
disk scrubbing operations, the capability has been added through
a driver_internal_info field "agent_continue_if_ata_erase_failed"
that can be set to True, however defaults to False keeping the
same behavior that IPA presently exhibits in the event of ATA
erase operations failing.
Partial-Bug: #1536695
Change-Id: I88edd9477f4f05aa55b2fe8efa4bbff1c5573bb1
In the DIB build the DHCP code (provided by the dhcp-all-interfaces element)
races with the service starting IPA. It does not matter for deployment itself,
as we're waiting for the route to the Ironic API to appear. However, for
inspection it may result in reporting back all NIC's without IP addresses.
Inspection fails in this case.
This change makes inspection wait for *all* NIC's to get their IP addresses up
to a small timeout. The timeout is 60 seconds by default and can be changed
via the new ipa-inspection-dhcp-wait-timeout kernel option (0 to not wait).
After the wait inspection proceedes in any case, so the worst downside
is making inspection 60 seconds longer.
To avoid waiting for NIC's that are not even connected, this change extends the
NetworkInterface class with 'has_carrier' field.
Closes-Bug: #1564954
Change-Id: I5bf14de4c1c622f4bf6e3eadbe20c44759da5d66
If we do not set this explicitly, tar will warn "journal: implausibly
old time stamp" when the user tries to untar the log files.
Change-Id: I4a5a1ffd4eeca9697cdcf16e02d3ff3c22d7132c