30 Commits

Author SHA1 Message Date
Dmitry Tantsur
62672de131 Reduce the duration of retries in the inspector tests
Currently the test takes 5*5=25 seconds. Re-arrange the code so
that it's possible to change the retry delay in tests.

Change-Id: Ia559dad4bc656f8ad6b2cb8cb0137a97e2614db7
2020-10-07 12:39:01 +02:00
Julia Kreger
bb27badf76 Add basic retries for inspection
A transitory connection failure, such as one caused by
a port being held down for traffic forwarding, can experience
intermittent connectivity failures which result in failed
introspections.

Now the agent retries.

Change-Id: I72c5e3aca000d3854a17f8a461b1a2935e5c0d9b
2020-09-14 22:38:18 +00:00
Kaifeng Wang
b424fbfa35 Extends pci devices metrics
Collects PCI class, revision, and bus information for the pci-devices
collector, these metrics as well as vendor id and device id are
components which can be used to construct device information like
lspci output, which is how cyborg agent collects accelerator devices.

Accelerator device based scheduling is possible after ironic has such
information in place.

Change-Id: I6c37c554f37dd5f1d21c8fd4fad2a4f44a3c75d7
Story: 2007971
Task: 40474
2020-08-04 23:32:37 +08:00
Julia Kreger
c76b8b2c21 Limit Inspection->Lookup->Heartbeat lag
Caches hardware information collected during inspection
so that the initial lookup can occur without any delay.

Also adds logging to track how long inventory collection takes.

Co-Authored-By: Dmitry Tantsur <dtantsur@protonmail.com>
Change-Id: I3e0d237d37219e783d81913fa6cc490492b3f96a
2020-07-03 10:32:26 +02:00
Dmitry Tantsur
31b73b4984 Expose collector and hardware manager names via introspection data
This change adds a new introspection data field 'configuration'
with two lists: managers and collectors.

Change-Id: Ice0d7e6ecff3f319bc3a4f41617059fd6914e31c
2020-01-22 11:15:38 +01:00
Julia Kreger
696606f682 manual introspection trigger command
Change-Id: I64e66682c1e54f6edc260a22f46f5f6df8e85af1
Story: 2005896
Task: 33756
2019-07-08 07:43:40 -07:00
Dmitry Tantsur
5c5328ccaa Supports fetching API endpoints from mDNS
This change enables IPA to receive API endpoints and configuration
via multicast DNS.

Story: #2005393
Task: #30382
Change-Id: Ibbf07052bea8f5c0305dda098b2879bcbc2fece5
2019-05-29 16:58:24 +02:00
Doug Hellmann
a2d25de639 show inspection callback url in error messages
We have seen issues in misconfigured systems where messages from IPA
to ironic-inspector fail with 404 or other standard HTTP error
codes. Although there is an info message reporting the URL, the error
messages do not include the URL of the service IPA is trying to talk
to, which makes debugging the configuration difficult. This change
adds the URL to the error already being reported to improve that
situation.

Change-Id: Ib092ac690d29c385c0564c086ad2fec3df0fb2e0
Signed-off-by: Doug Hellmann <doug@doughellmann.com>
2019-05-13 18:33:49 -04:00
Dmitry Tantsur
f153a741e1 Clean up deprecated items in the inspection code
* Remove support for setting IPMI credentials (removed from inspector in Pike)
* Stop sending the ipmi_address field (bmc_address is used instead since Pike)

Change-Id: I1696041db62ba27e5d31e8481cb225a43d7e2a46
Closes-Bug: #1654318
2017-09-19 14:05:13 +02:00
Jenkins
fd7f10b993 Merge "Configure and use SSL-related requests options" 2017-02-07 09:57:49 +00:00
Pavlo Shchelokovskyy
fdd11b54a5 Configure and use SSL-related requests options
This patch adds standard SSL options to IPA config and makes use of them
when making HTTP requests.

For now, a single set of certificates is used when needed.
In the future configuration can be expanded to allow per-service
certificates.

Besides, the 'insecure' option (defaults to False) can be overridden
through kernel command line parameter 'ipa-insecure'.
This will allow running IPA in CI-like environments with self-signed SSL
certificates.

Change-Id: I259d9b3caa9ba1dc3d7382f375b8e086a5348d80
Closes-Bug: #1642515
2017-01-13 11:33:44 +02:00
Dmitry Tantsur
10bff0a518 Remove compatibility with old bash-based introspection ramdisk
Inspector is using inventory directly, so we can remove the bits sending
processed network and scheduling properties.

Change-Id: I6c58bc3c5ea78fd2dbda82b38515f332ce2e8d4a
2017-01-09 14:10:47 +01:00
Luong Anh Tuan
ab41106cf6 Python 3 Compatible JSON
In order to be really python3 compatible, the json lib was replaced
with oslo.serialization(1.10 or newer) module jsontuils since it's
the recommended migration to python3 guide.

https://wiki.openstack.org/wiki/Python3#Serialization:_base64.2C_JSON.2C_etc.

Change-Id: I2d8b62e642aba4ccd1b70be7e9b3784a95a6743d
Closes-Bug: #1629068
2016-11-16 08:19:51 +00:00
Jim Rollenhagen
62743fec6a Update to work with latest stevedore
stevedore no longer raises a KeyError when there's an entrypoint
missing. Instead we must supply a callback that handles that error.
Update inspection code to work with this.

Also bumps stevedore minimum to 1.16 ahead of the global-requirements
bot.

Closes-Bug: #1603542
Change-Id: I12af23f2525ac90e577bdd10bbfbbd9788e9551c
Depends-On: I8aa1ee52ff7de50488acb86e8920da89ddb05771
2016-07-17 14:38:00 -04:00
Lucas Alvares Gomes
af81914ce7 Add a log extension
The log extension is responsible for retrieving logs from the system,
if journalctl is present the logs will come from it, otherwise we
fallback to getting the logs from the /var/log directory + dmesg logs.

In the coreos ramdisk, we need to bind mount /run/log in the container
so the IPA service can have access to the journal.

For the tinyIPA ramdisk, the logs from IPA are now being redirected to
/var/logs/ironic-python-agent.log instead of only going to the default
stdout.

Inspector now shares the same method of collecting logs, extending its
capabilities for non-systemd systems.

Partial-Bug: #1587143
Change-Id: Ie507e2e5c58cffa255bbfb2fa5ffb95cb98ed8c4
2016-06-28 17:02:11 +01:00
Szymon Borkowski
f7e080c8bf Add PCI devices collector to inspector
Adds a new collector, which gathers list of PCI devices.
Each entry is a dictionary containing 2 keys:
- vendor-id
- product-id
Such information can then be used by the inspector to distinguish
appropriate PCI devices.

Change-Id: Id7521d66410e7d408d7eada692b6123e769ce084
Partial-Bug: #1580893
2016-06-24 14:50:58 +02:00
Dmitry Tantsur
53b187a4c3 Add boot information into the inventory
Adds a new BootInfo object with 2 fields:

* current_boot_mode - bios or uefi, detected from presence of /sys/firmware/efi
  as per the following answer: http://askubuntu.com/a/162896
  This field will be used for setting the boot_mode capability in ironic-inspector
* pxe_interface - PXE booting interface, if it can be detected.
  This fields is already used by ironic-inspector, added here for consistency.

Change-Id: Ib36b592ffaba3bfa055d65c9526607867d302584
Partial-Bug: #1571580
2016-05-26 17:05:11 +02:00
Dmitry Tantsur
6da6ace384 [inspection] wait for the PXE DHCP by default and remove the carrier check
We hoped that checking /sys/class/net/XXX/carrier will allow us
to not wait for interfaces that are not connected at all.
In reality this field turned out to be unreliable. For example, it is
also set to 0 when interface is down or is being configured.
The bug https://bugzilla.redhat.com/show_bug.cgi?id=1327255 shows
the case when carrier is 0 for all interfaces, including one that is
used to post back data, which is obvious non-sense.

This change removes check on carrier for the loop. To avoid 60 seconds
wait for people with several NIC's, it's changed to only wait for the
PXE booting NIC, which obviously must get an IP address.

This makes IP addresses in the inspection data for other NIC's somewhat
unreliable. A new option inspection_dhcp_all_interfaces is introduced
to allow waiting for all NIC's to get IP addresses.

This change should finally fix bug 1564954.

Change-Id: I8b04bf726980fdcf6bd536c6bb28e30ac50658fb
Related-Bug: #1564954
2016-05-10 18:12:46 +02:00
Jenkins
2d8e139f03 Merge "Set modification time in tarfile of ramdisk logs" 2016-04-08 12:41:28 +00:00
Dmitry Tantsur
3deb25a3ce Wait for the interfaces to get IP addresses before inspection
In the DIB build the DHCP code (provided by the dhcp-all-interfaces element)
races with the service starting IPA. It does not matter for deployment itself,
as we're waiting for the route to the Ironic API to appear. However, for
inspection it may result in reporting back all NIC's without IP addresses.
Inspection fails in this case.

This change makes inspection wait for *all* NIC's to get their IP addresses up
to a small timeout. The timeout is 60 seconds by default and can be changed
via the new ipa-inspection-dhcp-wait-timeout kernel option (0 to not wait).

After the wait inspection proceedes in any case, so the worst downside
is making inspection 60 seconds longer.

To avoid waiting for NIC's that are not even connected, this change extends the
NetworkInterface class with 'has_carrier' field.

Closes-Bug: #1564954
Change-Id: I5bf14de4c1c622f4bf6e3eadbe20c44759da5d66
2016-04-05 20:03:33 +02:00
Miles Gould
3f715a20fd Set modification time in tarfile of ramdisk logs
If we do not set this explicitly, tar will warn "journal: implausibly
old time stamp" when the user tries to untar the log files.

Change-Id: I4a5a1ffd4eeca9697cdcf16e02d3ff3c22d7132c
2016-04-04 17:29:16 +01:00
Dmitry Tantsur
58f86d0353 Stop trying to log stdout when fetching logs during inspection
Logging the whole journalctl output is not the best idea. Fortunately,
it does not work right now and fails with a traceback :)

This change adds a new log_stdout argument to utils.execute() and uses it in
the "logs" inspection collector.

Also do not log the logs while logging the collected data.

Change-Id: Ibc726ac2c4f5eb06c73ac4765bb400077b84a6cc
2016-03-08 16:31:18 +01:00
Dmitry Tantsur
5fa258b708 Fix "logs" inspection collector when logs contain non-ascii symbols
Somehow it didn't pop earlier. Updated tests to contain some creepy
russian letters :)

Closes-Bug: #1517913
Change-Id: I4c6712ea1e813d1f0f0d0aedaccfa1187526e0ec
2015-12-08 14:32:16 +01:00
Jenkins
2bce5f6065 Merge "Use oslo.log instead of original logging" 2015-11-02 17:44:19 +00:00
ZhiQiang Fan
9e75ba5460 Use oslo.log instead of original logging
We are using oslo.log now, but some of the modules still use logging.
We should use oslo.log to keep consistency, besides, oslo.log can
provide fine wrapper for OpenStack projects.

Change-Id: Ibe57e503b88b39e284a9e4b11a1886cd4e8d4ccf
2015-10-24 03:22:36 -06:00
Zhenguo Niu
18d5d6aba3 Replace deprecated LOG.warn with LOG.warning
Change-Id: Ib3d566f6e608ee453659e15cabcf8e9332aedc52
Closes-Bug: #1508442
2015-10-22 14:42:57 +08:00
Dmitry Tantsur
9d6b0864e3 Add "logs" and "extra-hardware" inspection collectors
This is a port of downstream inspector ramdisk plugins we found helpful.
* logs - sends journald logs with inspection data.
* extra-hardware - uses hardware-detect utility to collect bigger
  hardware inventory and to run benchmarks.

Change-Id: If05402606c45185d618279eef46e68c51209f82b
2015-10-01 18:25:30 +02:00
Dmitry Tantsur
3b70647358 inspection: prepare for future deprecations
1. cleanly separate deprecated and non-deprecated properties
2. add root disk to inspection data, so that we can have a proper
   fallback when root device hints are not given.

Change-Id: Ie19b82ff2a914873ff4b2395b02643e086b934b1
2015-09-16 14:26:57 +02:00
Dmitry Tantsur
e3e6000524 Follow-up to inspection patch 096830414b
Change-Id: I7ec05e501ec40802efa14cabe14752972919c7a9
2015-09-16 10:36:33 +00:00
Dmitry Tantsur
096830414b Add support for inspection using ironic-inspector
Adds a new module ironic_python_agent.inspector and new entry point
for extensions, which will allow vendor-specific inspection.

Inspection is run on service start up just before the lookup.
Due to this early start, and due to the fact we don't even know
MAC address of nodes on inspection (to say nothing about IP addresses),
exception handling is a bit different from other agent features:
we try hard not to error out until we send at least something to inspector.

Change-Id: I00932463d41819fd0a050782e2c88eddf6fc08c6
2015-09-07 18:22:54 +02:00