97 Commits

Author SHA1 Message Date
Jay Faulkner
e303a369dc Inspect non-raw images for safety
When IPA gets a non-raw image, it performs an on-the-fly conversion
using qemu-img convert, as well as running qemu-img frequently to get
basic information about the image before validating it.

Now, we ensure that before any qemu-img calls are made, that we have
inspected the image for safety and pass through the detected format.

If given a disk_format=raw image and image streaming is enabled
(default), we retain the existing behavior of not inspecting it in
any way and streaming it bit-perfect to the device. In this case, we
never use qemu-based tools on the image at all.

If given a disk_format=raw image and image streaming is disabled, this
change fixes a bug where the image may have been converted if it was not
actually raw in the first place. We now stream these bit-perfect to the
device.

Adds two config options:
- [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
  order to disable all security features. Do not do this.
- [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
  IPA should accept.

Both of these configuration options are wired up to be set by the lookup
data returned by Ironic at lookup time.

This uses a image format inspection module imported from Nova; this
inspector will eventually live in oslo.utils, at which point we'll
migrate our usage of the inspector to it.

Closes-Bug: #2071740
Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7
2024-09-04 09:11:28 -07:00
Dmitry Tantsur
6cd36a750f
Make inspection URL optional if the collectors are provided
With the new in-band inspection, we can derive the callback URL from
the Ironic URL, there is no need to duplicate it. This change uses
the presence of collectors as a sign to run inspection.

The previous approach of setting an inspection URL, with or without
explicitly setting collectors, still works for compatibility with
ironic-inspector.

Change-Id: Ie4279ee6d2995c9686f1dcdef1d6e5dc1dd20871
2024-01-10 08:55:42 +01:00
Dmitry Tantsur
0d4ae976c2
Support several API and Inspector URLs
Allows nodes with a single IP stack to be deployed from a dual-stack
Ironic.

Detecting advertised address and usable Ironic URLs are done completely
independently which does open some space for a misconfiguration. I hope
it's not likely in the reality, especially since this feature is
targetting advanced standalone users.

Change-Id: Ifa506c58caebe00b37167d329b81c166cdb323f2
Closes-Bug: #2045548
2024-01-09 16:43:23 +01:00
Dmitry Tantsur
2ab8364649
Add a jitter to heartbeat retries
Currently, if heartbeat fails, we reschedule it after 5 seconds.
This is fine for the first retry, but it can cause a thundering herd
problem when a lot of nodes fail to heartbeat at once.

This change adds jitter to the minimum wait of 5 seconds. The jitter is
not applied for forced heartbeats: they still have a minimum wait of
exactly 5 seconds from the last heartbeat.

The code is re-ordered to move the interval calculation to one place.
Bonus: correctly logging the next interval.

The unit tests have been rewritten to test the heartbeat process step by
step and not rely on the exact sequence of the calls.

Closes-Bug: #2038438
Change-Id: I4c4207b15fb3d48b55e340b7b3b54af833f92cb5
2023-12-13 17:34:24 +01:00
Julia Kreger
e6fd7e753e Allow md5 to be disabled from the conductor
Also fixes my use of set_override, as it is not on the actual
config object. You'd think I'd remember that, since I've done
that before...

Change-Id: I4b578c4319354001cbbd3b3856af96b30fd25555
2023-05-25 07:59:07 -07:00
Julia Kreger
32df26a22a Disable MD5 image checksums
MD5 image checksums have long been supersceeded by the use of a
``os_hash_algo`` and ``os_hash_value`` field as part of the
properties of an image.

In the process of doing this, we determined that checksum via
URL usage was non-trivial and determined that an appropriate
path was to allow the checksum type to be determined as needed.

Change-Id: I26ba8f8c37d663096f558e83028ff463d31bd4e6
2023-04-24 16:54:42 -07:00
Riccardo Pittau
efbbc86f53 Increase version of hacking and pycodestyle
Fix H904 "Delay string interpolations at logging calls" errors

Change-Id: I331808d0132094faf739998a6984440787d3ebf8
2021-07-30 14:34:33 +02:00
Dmitry Tantsur
b605943796 Coalesce heartbeats
The IPA sends heartbeats to the conductor periodically and when
requested, e.g. at the end of asynchronous commands. In order
to avoid to send such notifications in too quick succession,
e.g. when two asynchronous commands finish at the same time or
when the periodic heartbeat was just sent right before a command
ended, this patch proposes to coalesce heartbeats which are
close together timewise and send only one for all of them
in a time interval of 5 seconds.

Co-Authored-By: Arne Wiebalck <arne.wiebalck@cern.ch>

Story: #2008983
Task: 42633

Change-Id: Idfbce44065e1e5a8b730b94741b2604c51f0ab14
2021-06-18 17:19:30 +02:00
Dmitry Tantsur
be3882162e Remove the iscsi extension
Change-Id: I2f0e581575112d6c7ba0d211661cab3e0b6caca6
2021-05-10 12:43:44 +02:00
Zuul
faeb9441d3 Merge "Simplify heartbeating by removing use of select()" 2020-09-29 15:47:08 +00:00
Jay Faulkner
a01646f56b Simplify heartbeating by removing use of select()
Heartbeating in IPA has used select.poll() for years to workaround
a bug where changing the time in the ramdisk could cause heartbeats
to stop and never resume.

Now that IPA syncs time at start and exit, this workaround is no
longer needed. So instead, we'll revert to using threading.Event()
in order to make the code simpler and easier to understand.

Since we need this to be an eventlet-event, and not a standard-thread
event, also monkey_patch threading.

Additionally, there were a few completely unused backoff interval
values set, that were never applied. In respect of maintaining the
5+ years old behavior of not doing error backoffs, that code was
removed instead of being made to work.

Change-Id: Ibcde99de64bb7e95d5df63a42a4ca4999f0c4c9b
2020-09-22 16:59:47 +00:00
Dmitry Tantsur
021e0a6a46 Generate a TLS certificate and send it to ironic
Adds a new flag (on by default) that enables generating a TLS
certificate and sending it to ironic via heartbeat. Whether
ironic supports auto-generated certificates is determined by
checking its API version.

Change-Id: I01f83dd04cfec2adc9e2a6b9c531391773ed36e5
Depends-On: https://review.opendev.org/747136
Depends-On: https://review.opendev.org/749975
Story: #2007214
Task: #40604
2020-09-11 17:46:52 +02:00
Julia Kreger
d3c3d4dabe Update the cache if we don't have a root device hint
Or at least try to.

Some deployments just don't use root device hints, and this is okay.

However, other deployments need root device hints, and with fast
track mode in ramdisks, we created a situation where the node cache
could be updated by a human or software between the time the agent
was started, and the deployment was requested.

As a result, the agent has been updated to check if we have a hint
and if we don't, update the cache from the node lookup endpoint.

This is not needed when the inband deploy steps are executed, as
the process of updating the steps does force the node cache to be
updated.

Change-Id: I27201319f31cdc01605a3c5ae9ef4b4218e4a3f6
Story: 2008039
Task: 40701
2020-08-25 19:34:48 +00:00
Dmitry Tantsur
353d09c3b0 Support changing the protocol part of callback_url to https
Adds a new kernel parameter for manual configuration and also creates
foundation for automatic TLS support later.

Change-Id: If341c3a8a268fc8cab6bd6be04b12ca32b31c8d8
Story: #2007214
Task: #40619
2020-08-06 15:14:31 +02:00
Zuul
9ca640a1c5 Merge "Prevent un-needed iscsi cleanup" 2020-07-25 13:54:51 +00:00
Zuul
bfb395837d Merge "Adds poll mode deployment support" 2020-07-22 19:53:31 +00:00
Julia Kreger
2a56ee03b6 Prevent un-needed iscsi cleanup
When we added software raid support, we started calling bootloader
installation. As time went on, we ehnanced that code path for non
RAID cases in order to ensure that UEFI nvram was setup
for the instance to boot properly.

Somewhere in this process, we missed a possible failure case where
the iscsi client tgtadm may return failures. Obviously, the correct
path is to not call iscsi teardown if we don't need to.

Since it was always semi-opportunistic teardown, we can't blindly
catch any error, and if we started iSCSI and failed to tear the
connection down, we might want to still fail, so this change
moves the logic over to use a flag on the agent object which
one extension to set the flag and the other to read it and take
action based upon that.

Change-Id: Id3b1ae5e59282f4109f6246d5614d44c93aefa7c
Story: 2007937
Task: 40395
2020-07-20 14:24:06 -07:00
Julia Kreger
c76b8b2c21 Limit Inspection->Lookup->Heartbeat lag
Caches hardware information collected during inspection
so that the initial lookup can occur without any delay.

Also adds logging to track how long inventory collection takes.

Co-Authored-By: Dmitry Tantsur <dtantsur@protonmail.com>
Change-Id: I3e0d237d37219e783d81913fa6cc490492b3f96a
2020-07-03 10:32:26 +02:00
Kaifeng Wang
61c95554ff Adds poll mode deployment support
Adds a new poll extension to provide get_hardware_info and get_node_info
interfaces.

get_hardware_info will be used for node validation by ironic deploy
drivers.

get_node_info will be used for sending lookup data to IPA.

standalone mode is assumed as debug only, but it's not the case
considering the poll mode will be introduced, slightly updates the
description, also prevents the mdns lookup when standalone is true.

Story: 1526486
Task: 28724

Change-Id: I5ad772a18cc4584585c5a7b6fb127547cece1998
2020-06-21 16:44:00 +08:00
Dmitry Tantsur
9d4cf5532f Add a deploy step for writing an image
The new step just invokes the appropriate method of the standby extension.

Change-Id: Ic74f83ab2b7e58f8e4b46e0abfab79e221afeb3e
Story: 2006963
2020-06-02 15:23:54 +02:00
Fedor Tarasenko
952489020e Fix an issue with high cpu usage caused by ironic-python-agent
Currently running of ipa-centos8-stable-ussuri image causes 100%
cpu usage while cleaning. Proposed change fixes this behavior and
significantly speeds up cleaning.

Change-Id: I2ba9a69f22b11830d8ff1bc346b17bf1a52f25b0
Story: #2007696
Task: #39809
2020-05-25 22:18:17 +03:00
Dmitry Tantsur
6a9a9e9e14 Fix the token logic to be compatible with older ironic
Currently we fail with HTTP 401 if both the known and the received
tokens are None. This prevents IPA from being updated before ironic.

Story: #2007557
Task: #39419
Change-Id: I80249bd3468b581dc035d72156cbfa2f5f225a1b
2020-04-15 13:01:26 +02:00
Riccardo Pittau
a332a19a57 Bump hacking to 3.0.0
Change-Id: I1032ea6a2e9d79aeaecb1458c319cbeb15ac1fff
2020-03-30 12:55:46 +02:00
Julia Kreger
c97a71d6f3 Fix agent token vmedia use
The agent token was being passed through upon class invocation
as opposed through direct configuration loading due to how IPA
is designed to load items upon start-up.

As a result, we somewhere forgot to save the token to the api
client so heartbeats were being rejected when being forced
to be enabled by default.

This has been corrected an an additional test class has been
added to help ensure that this operates as expected.

Change-Id: I1151945d3dc6a685d62ad49183919ef4a81962f8
2020-03-20 08:53:53 +00:00
Julia Kreger
af5f05a0ee Agent token support
Adds support to the agent to receive, store, and return
that token to ironic's API, when supported.

This feature allows ironic and ultimately the agent to
authenticate interactions, when supported, to prevent
malicious abuse of the API endpoint.

Sem-Ver: feature
Change-Id: I6db9117a38be946b785e6f5e75ada1bfdff560ba
2020-03-12 10:35:17 -07:00
Zuul
5521fa32f6 Merge "Add NTP time sync" 2020-03-11 19:51:24 +00:00
Julia Kreger
731bbe88d7 Log the agent version
Change-Id: If0bac7a6cd11aeac46ed2e92abc3d60493172c7e
2020-03-11 15:59:32 +00:00
Julia Kreger
cee4bfc4bc Add NTP time sync
Attempt to sync the clock and save it to the hardware clock.

This feature supports use of chrony or ntpdate.

Sem-Ver: feature
Change-Id: I178d7614429d582e742d9cba6d0fa3ae099775e3
Story: 1619054
Task: 11591
2020-03-07 09:16:19 -08:00
Dmitry Tantsur
31b73b4984 Expose collector and hardware manager names via introspection data
This change adds a new introspection data field 'configuration'
with two lists: managers and collectors.

Change-Id: Ice0d7e6ecff3f319bc3a4f41617059fd6914e31c
2020-01-22 11:15:38 +01:00
Dmitry Tantsur
f1b2df908a Replace WSME and Pecan with Werkzeug
WSME is no longer maintained and Pecan is an overkill for our (purely
internal) API. This change rewrites the API in Werkzeug (the library
underneath Flask). I don't use Flask here since it's also an overkill
for API with 4 meaningful endpoints.

Change-Id: Ifed45f70869adf00e795202a53a2a53c9c57ef30
2019-12-04 16:50:47 +01:00
Zuul
6032643a04 Merge "Stop using six library" 2019-12-03 13:09:30 +00:00
Dmitry Tantsur
4354bc04f9 Replace netaddr dependency with stdlib ipaddress
netaddr is quite a big library, and all we need is covered by
the built-in ipaddress module (available in python 3).

Also add a safeguard for invalid 'ip route' output.

Change-Id: I9d76a8d1c1b6b1585e301a9c63b37fab3b98746f
2019-12-02 12:13:04 +01:00
Riccardo Pittau
ca7a46b113 Stop using six library
Since we've dropped support for Python 2.7, it's time to look at
the bright future that Python 3.x will bring and stop forcing
compatibility with older versions.
This patch removes the six library from requirements, not
looking back.

Change-Id: I4795417aa649be75ba7162a8cf30eacbb88c7b5e
2019-11-29 10:18:14 +01:00
Dmitry Tantsur
d61887e744 Correct string formatting in logging
Change-Id: Ibc3656e32b94004da7e895a27860c1fa3ac0bbb1
2019-10-17 16:14:10 +02:00
Julia Kreger
696606f682 manual introspection trigger command
Change-Id: I64e66682c1e54f6edc260a22f46f5f6df8e85af1
Story: 2005896
Task: 33756
2019-07-08 07:43:40 -07:00
Dmitry Tantsur
5c5328ccaa Supports fetching API endpoints from mDNS
This change enables IPA to receive API endpoints and configuration
via multicast DNS.

Story: #2005393
Task: #30382
Change-Id: Ibbf07052bea8f5c0305dda098b2879bcbc2fece5
2019-05-29 16:58:24 +02:00
Doug Hellmann
21393cb1c5 report the URL when heartbeats fail
We have seen issues in misconfigured systems where messages from IPA
to ironic fail with 404 or other standard HTTP error codes. The
messages do not include the URL of the service IPA is trying to talk
to, which makes debugging the configuration difficult. This change
adds the URLs to the end of a couple of errors already being reported
to improve that situation.

Change-Id: I6a4ea8768df016e58a073d78edbe4006e217d69e
Signed-off-by: Doug Hellmann <doug@doughellmann.com>
2019-05-13 10:25:10 -04:00
Riccardo Pittau
d525f8a07f Making ironic-python-agent able to stop with python 3.x
The agent stop function will write a byte string 'a' to the pipe
as a signal for the run function to end process.
The run function is expecting a literal string.
In python 2.x the byte string will automatically be converted to
literal, while python 3.x won't do the conversion, causing the
process to never stop.
This patch will fix that behavior, allowing the IPA to correctly
stop using python 3.x.

Story: 2004928
Task: 29308
Change-Id: Iad16e8bed2436d961dea8ddaec1c2724225b4097
2019-02-04 09:55:08 +01:00
Matthew Thode
a03661c4a8
write byte objects when using os.write
Change-Id: I184a9d0bf4a0ba0776d519b3a3b9ccd39151b4ae
Story: 2002052
Task: 19713
2018-05-17 11:11:55 -05:00
Shivanand Tendulker
f08636fe8b Follow-up patch for rescue extension for CoreOS
This patch addresses few minor comments in commit
a659306272542dd38420cb118cc7b04b1e8cf377

Change-Id: Id5b48e3cc96c8807c471c947da3e233cebdf687e
Related-Bug: #1526449
2018-01-30 19:00:13 +00:00
Zuul
893c63f24a Merge "Rescue extension for CoreOS with DHCP tenant networks" 2017-12-11 21:14:09 +00:00
Derek Higgins
214790d17e Ignore IPv6 link local addresses
Prevent IPA from picking up the IPv6 link-local address
as a callback_url in cases where it gets tried before other
addressing methods havn't complete yet. In this scenario IPA
sleeps for 10 seconds and then retries giving the nic a chance to
configure its routable IP address.

Change-Id: Ic53334c630180f0d77bb0231e548d2c44bfe55ca
Closes-Bug: #1732692
2017-11-21 10:11:21 +00:00
Mario Villaplana
a659306272 Rescue extension for CoreOS with DHCP tenant networks
This patch adds support for rescue mode with DHCP tenant networks in
CoreOS. Applying network config from a configdrive is not yet supported
but will be in a future patch.

Co-Authored-By: Jay Faulkner <jay@jvf.cc>
Co-Authored-By: Taku Izumi <izumi.taku@jp.fujitsu.com>
Co-Authored-By: Annie Lezil <annie.lezil@gmail.com>
Co-Authored-By: Aparna <aparnavtce@gmail.com>
Co-Authored-By: Shivanand Tendulker <stendulker@gmail.com>
Change-Id: I7898ff22800dedba73d7fbfb3801378867abe183
Partial-Bug: 1526449
2017-11-06 04:48:58 -05:00
John L. Villalovos
949f4f509e Use flake8-import-order
Use the flake8 plugin flake8-import-order to check import ordering. It
can do it automatically and don't need reviewers to check it.

Change-Id: I946457e9079ce0b54c7fe0ad554d024a1c61dce0
2017-02-16 09:46:21 -08:00
Jenkins
fd7f10b993 Merge "Configure and use SSL-related requests options" 2017-02-07 09:57:49 +00:00
Derek Higgins
b4e41e2dd2 Agent: Listen for connections on both IPv4 and IPv6 ports
Allow connections if deploying over a IPv6 network.

Change-Id: Ied2f6be4aa4d1a70524df1df3506e596f6926e5b
Closes-Bug: #1650539
2017-01-19 15:24:11 +00:00
Pavlo Shchelokovskyy
fdd11b54a5 Configure and use SSL-related requests options
This patch adds standard SSL options to IPA config and makes use of them
when making HTTP requests.

For now, a single set of certificates is used when needed.
In the future configuration can be expanded to allow per-service
certificates.

Besides, the 'insecure' option (defaults to False) can be overridden
through kernel command line parameter 'ipa-insecure'.
This will allow running IPA in CI-like environments with self-signed SSL
certificates.

Change-Id: I259d9b3caa9ba1dc3d7382f375b8e086a5348d80
Closes-Bug: #1642515
2017-01-13 11:33:44 +02:00
Derek Higgins
9f5f664080 Advertise the correct address when using IPv6
Parse the output of "ip route get $IP" taking
IPv6 into consideration. Also wrap the IP address
in square brackets if it is IPv6.

Change-Id: Ifc44e5aa3c5b814b6ceba04461bb68fe1d75c22b
Closes-Bug: #1650533
2017-01-11 11:00:56 +00:00
Yufei
dd9253f1b6 Skip API related work if no api url configured
Currently, if IPA is booted without an ironic api url, it will default
to localhost and fail to connect. Instead, we now explicitly fail and
print a log message if no api callback url is provided.

Change-Id: I0271be94ba7febc6abd5bf3343f6fa179bc1a6a4
Closes-Bug: #1643966
2016-12-07 17:04:05 +08:00
Pavlo Shchelokovskyy
b033bfd933 Remove old lookup/heartbeat from IPA
Lookup/Heartbeat via vendor passthru was deprecated in Newton.

This patch removes the corresponding functionality from IPA,
and also removes handling of 'ipa-driver-name' kernel parameter,
as it was only used in code related to old passthru.

Change-Id: I2c7989063ab3e4c0bae33f05d6d2ed857a2d9944
Closes-Bug: #1640533
2016-11-09 16:34:44 +00:00