272 Commits

Author SHA1 Message Date
Vanou Ishii
d6dd05ab12 Enable Reuse of Zuul Job in 3rd Party CI Environment
Currently, the Zuul jobs in zuul.d/ironic-jobs.yaml list their
required-projects entries like this (without a leading hostname)

    required-projects:
      - openstack/ironic
      - openstack/ABCD

but not like this (with a leading hostname)

    required-projects:
      - opendev.org/openstack/ironic
      - opendev.org/openstack/ABCD

With the first format, if we have two openstack/ironic entries in
Zuul's tenant configuration file (the tenant config file in a 3rd
party CI environment usually has 2 entries: one to fetch upstream
code, another for the Gerrit event stream that triggers Zuul jobs),
we'll get a warning in the zuul-scheduler log

    Project name 'openstack/ironic' is ambiguous,
    please fully qualify the project with a hostname

With the second format, that warning doesn't appear, and Zuul running
in a 3rd party CI environment can reuse the jobs in
zuul.d/ironic-jobs.yaml in its own jobs.

This commit modifies all Zuul jobs in zuul.d/ironic-jobs.yaml
to use the second format.

Story: 2008724
Task: 42068
Change-Id: I85adf3c8b3deaf0d1b2d58dcd82724c7e412e2db
2021-03-17 19:01:07 +09:00
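For illustration, a job definition using the fully-qualified form might look roughly like this (the job name and project list are hypothetical, not taken from ironic-jobs.yaml):

    - job:
        name: example-third-party-ironic-job
        parent: ironic-base
        required-projects:
          # Fully qualified with the hostname so a third-party Zuul
          # with duplicate project entries resolves them unambiguously.
          - opendev.org/openstack/ironic
          - opendev.org/openstack/ironic-python-agent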
Zuul
fc7f34b2fe Merge "Prepare to use tinycore 12 for tinyipa" 2021-03-15 13:15:45 +00:00
Julia Kreger
ebaa359937 Mark multinode non-voting due to high failure rate
Change-Id: Iea8212ee69a8fe8c5f181c87271f46779e3a46b4
2021-03-11 17:05:50 -08:00
Riccardo Pittau
807a6d2bea Prepare to use tinycore 12 for tinyipa
Tinycore 12 requires some more RAM than its predecessor.

Change-Id: Ibced843f34c78af7780bfe1ade833208b458bb8b
2021-03-02 17:42:45 +01:00
Dmitry Tantsur
5533077c7d Enable swift temporary URLs in grenade and provide a good error message
The fixed configdrive_use_object_store option requires them.

Change-Id: Ie7323ae107c7f801be010353c7c4f3b8a43c3a1a
2021-02-24 13:34:17 +01:00
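A minimal sketch of enabling such a setting in a devstack-based Zuul job; the job name is hypothetical, the SWIFT_ENABLE_TEMPURLS variable is an assumption rather than quoted from this change, and the grenade job plumbing may differ:

    - job:
        name: example-ironic-devstack-job
        vars:
          devstack_localrc:
            # Assumed devstack toggle for Swift temporary URLs
            SWIFT_ENABLE_TEMPURLS: true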
Dmitry Tantsur
414f0ca24e Switch multinode jobs to 512M RAM
384M no longer works reliably with newer tinyIPA.

Change-Id: I7e48b2e682dc0d5e6109e17b0e73ee9763a29d23
2021-02-15 16:20:24 +01:00
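As a sketch, such a RAM bump is typically expressed as a devstack override in the job variables; the variable name IRONIC_VM_SPECS_RAM is an assumption based on the ironic devstack plugin, and the job name is hypothetical:

    - job:
        name: example-ironic-multinode-job
        vars:
          devstack_localrc:
            # Give each test VM 512M instead of 384M
            IRONIC_VM_SPECS_RAM: 512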
Dmitry Tantsur
7c8d1e1e7f Move the IPv6 job to the experimental pipeline
It has been broken for months and no effort is under way to fix it.

Change-Id: I88fb5733b3054c2ffa4660f3cb5bff3c852faa75
2021-02-12 17:08:47 +01:00
Julia Kreger
561ed90390 Swap Metalsmith job out for centos8-uefi
Depends-On: https://review.opendev.org/c/openstack/metalsmith/+/773701
Change-Id: Ide1a8988e12958e684670a340bf3c09d516ffa23
2021-02-02 07:01:43 -08:00
likui
378557b7f2 add openstack-python3-wallaby-jobs-arm64 job
This is a non-voting job to validate py3 unittests on ARM64

Change-Id: I7a3a783ddeb5e9b7aaad9ccfb8aeeb7fcc8a1593
Task: 41376
Story: 2007938
2020-12-31 09:06:10 +08:00
Zuul
a58b88c737 Merge "Remove lower-constraints job" 2020-12-16 15:46:46 +00:00
Riccardo Pittau
840488e595 Remove lower-constraints job
As discussed during the upstream ironic community meeting on
Monday Dec 14 2020, the lower-constraints job is being removed.

Change-Id: I116d99014a7bf77ca77b796ea3b759800dd808ce
2020-12-15 18:43:09 +01:00
Dmitry Tantsur
97ceb38a72 CI: switch the multinode job to tempest-multinode-full-base
The non-base job is designed for the integrated gate and may have
unnecessary side effects. It has recently started overriding the OVS
agent bridge settings, breaking our job.

Make the job voting again.

Change-Id: Ied8cafd32c3e634d498467ebe878a411f0b24e6d
2020-12-14 10:16:12 +01:00
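In Zuul terms the switch is essentially a parent change; a hedged sketch with a hypothetical job name:

    - job:
        name: example-ironic-multinode-job
        # Base job without the integrated-gate extras that were
        # overriding the OVS agent bridge settings.
        parent: tempest-multinode-full-base
        # No "voting: false" override here, so the job votes again.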
Iury Gregory Melo Ferreira
98732623b2 Fix lower-constraints with the new pip resolver
* move pep8 dependencies from test-requirements to tox.ini,
  they're not needed there and are hard to constrain properly.
* add oslo.cache to l-c to avoid bump of dependencies

Change-Id: Ia5330f3d5778ee62811da081c28a16965e512b55
2020-12-11 13:34:24 -08:00
Zuul
e425b6d663 Merge "CI: add a non-voting bifrost-vmedia-uefi job" 2020-12-02 09:01:44 +00:00
Riccardo Pittau
475af371dd Use openstack-tox for ironic-tox-unit-with-driver-libs
All the tox jobs are based on openstack-tox; we should convert
ironic-tox-unit-with-driver-libs too.

Change-Id: I20836d586edccfb8cd8fed1f3a89f1497ff96943
2020-11-27 12:20:53 +01:00
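A sketch of the converted job; the tox environment name is an assumption:

    - job:
        name: ironic-tox-unit-with-driver-libs
        parent: openstack-tox
        vars:
          # Assumed tox environment that installs the driver libraries
          tox_envlist: unit-with-driver-libs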
Dmitry Tantsur
9ea4142982 CI: add a non-voting bifrost-vmedia-uefi job
It provides useful coverage of e.g. fast-track with virtual media.

Change-Id: Ie09f4daced5ffd9d953b9add4d5484bbdd1ba1ac
2020-11-26 12:11:48 +01:00
Riccardo Pittau
89af9aef80 Make standalone jobs voting again
Also remove 2 non-voting jobs from gate.

Change-Id: I40574cad53de8b9f89e1ae0a033b75de39140769
2020-11-23 14:51:25 +01:00
Riccardo Pittau
1ab192beca Convert last bionic jobs to focal
And disable dstat in ironic-base for the time being.

Change-Id: Ib05a952260d027f9f1307a9948ac5691b57e96d3
2020-11-12 14:43:05 +00:00
Zuul
09e246294b Merge "CI: increase cleaning timeout and tie it to PXE boot timeout" 2020-10-30 19:19:20 +00:00
Zuul
485601be5b Merge "Mark standalone job non-voting/remove from gate" 2020-10-28 13:47:07 +00:00
Julia Kreger
9696ec9a5a Mark standalone job non-voting/remove from gate
The standalone job at present has a high chance of failure
due to two separate things occurring:

1) The deployed nodes from raid tests can be left in a dirty state
   as the raid configuration remains and is chosen as the root
   device for the next deployment. If this is chosen by any job,
   such as rescue or a deployment test that attempts to log in,
   then the job fails because it is unable to ssh. The fix for this
   is in the ironic-tempest-plugin, but we need to get other fixes
   in to stabilize the gate first.
   https://review.opendev.org/#/c/757141/
2) Long-running cleaning scenarios, such as deployment with RAID
   in the standalone suite, can encounter conditions where the
   conductor tries to send the next command before the current
   configuration command has completed. For example, the image
   download is still running while a heartbeat has occurred in
   the background, and the conductor then tries to perform a
   second action. This causes the entire deployment to fail,
   even though the condition was transitory.
   This should be a relatively easy fix.
   https://review.opendev.org/759906

Change-Id: I6b02be0fa353daac90abf2b1576800c0710f651e
2020-10-27 17:16:44 +00:00
Dmitry Tantsur
2e2b07bb91 Move the multinode grenade job to the experimental pipeline
It's hopelessly broken, let's not waste resources on it until we
get back to making it work.

Change-Id: I171fa566e36ad5ac8659ecb0578029df270497d6
2020-10-27 12:32:46 +01:00
Dmitry Tantsur
87e634dee1 CI: increase cleaning timeout and tie it to PXE boot timeout
We're seeing cases where cleaning barely manages to finish after
a 2nd PXE retry, failing a job.

Also make the PXE retry timeout consistent between the CI and
local devstack installations.

Change-Id: I6dc7a91d1a482008cf4ec855a60a95ec0a1abe28
2020-10-27 12:21:43 +01:00
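A rough sketch of how such timeouts are commonly wired into a devstack-based job via post-config overrides; the option names ([conductor]clean_callback_timeout and [pxe]boot_retry_timeout) and the values are assumptions, not quoted from this change:

    - job:
        name: example-ironic-job
        vars:
          devstack_local_conf:
            post-config:
              $IRONIC_CONF_FILE:
                conductor:
                  # Assumed: allow cleaning to outlast a 2nd PXE retry
                  clean_callback_timeout: 1800
                pxe:
                  # Assumed: keep the PXE retry timeout consistent
                  boot_retry_timeout: 1200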
Riccardo Pittau
fc2964cb75 Run bifrost integration job on focal
Part of the migration from bionic to focal as community goal.

Change-Id: I100f799efb7be4a0413a38cd0e218dce43a44573
2020-10-14 17:50:31 +02:00
Riccardo Pittau
4775288b93 migrate testing to ubuntu focal
As per the Victoria cycle testing runtime and community goal,
we need to migrate upstream CI/CD to Ubuntu Focal (20.04).

Keeping a few jobs running on the bionic nodeset until
https://storyboard.openstack.org/#!/story/2008185 is fixed,
otherwise base devstack jobs switching to Focal would block
the gate.

Change-Id: I1106c5c2b400e7db899959550eb1dc92577b319d
Story: #2007865
Task: #40188
2020-10-05 17:05:49 +02:00
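For a single-node job, the migration is essentially a nodeset change; a sketch with a hypothetical job name, assuming the standard openstack-zuul-jobs nodeset naming:

    - job:
        name: example-ironic-job
        # Moved from openstack-single-node-bionic
        nodeset: openstack-single-node-focal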
1c49b62e2f Add Python3 wallaby unit tests
This is an automatically generated patch to ensure unit testing
is in place for all of the tested runtimes for wallaby.

See also the PTI in governance [1].

[1]: https://governance.openstack.org/tc/reference/project-testing-interface.html

Change-Id: I3d01db0826babc1022a3a8aa3254ea164cd3265e
2020-10-01 19:15:32 +00:00
Zuul
28c0f38322 Merge "Deprecate the iscsi deploy interface" 2020-09-24 06:09:10 +00:00
Zuul
19d910db40 Merge "Limit inspector jobs to 1 testing VM" 2020-09-23 11:12:17 +00:00
Dmitry Tantsur
b0b71653c7 Make the standalone-redfish job voting
It was supposed to be made voting shortly after the split, but we
sort of forgot. It provides coverage for things (like ansible deploy)
that we used to have voting jobs for.

Change-Id: Id99586d5e01b940089d55c133d9181db05bfdc7e
2020-09-22 16:55:26 +02:00
Dmitry Tantsur
d8dccc8d06 Deprecate the iscsi deploy interface
This change marks the iscsi deploy interface as deprecated and
stops enabling it by default.

An online data migration is provided for iscsi->direct, provided that:
1) the direct deploy is enabled,
2) image_download_source!=swift.

The CI coverage for iscsi deploy is left only on standalone jobs.

Story: #2008114
Task: #40830
Change-Id: I4a66401b24c49c705861e0745867b7fc706a7509
2020-09-22 15:39:36 +02:00
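A sketch of how a standalone job might keep coverage for the deprecated interface by enabling it explicitly; the job name and the IRONIC_ENABLED_DEPLOY_INTERFACES variable spelling are assumptions:

    - job:
        name: example-ironic-standalone-job
        vars:
          devstack_localrc:
            # Assumed: keep iscsi enabled alongside direct for coverage
            IRONIC_ENABLED_DEPLOY_INTERFACES: "iscsi,direct"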
Julia Kreger
ed0ef6cf59 Reduce VMs for multinode and standalone jobs
The minimum amount of disk space on CI test nodes
may be approximately 60GB on /opt, now with only 1GB
of available swap space by default.

This means we're constrained on the number of VMs and
their disk storage capacity in some cases.

Change-Id: Ia6dac22081c92bbccc803f233dd53740f6b48abb
2020-09-21 15:31:07 -07:00
Julia Kreger
7d0661a1b4 Reduce grenade node count
Infra's disk/swap availability has apparently been
reduced with the new focal node sets, such that we
have ~60GB of disk space and only 1GB of swap.

If we configure more swap, then naturally that means
we take away from available VMs as well.

And as such, we should be able to complete grenade
with only four instances, I hope.

Change-Id: I36f8fc8130ed914e8a2c2a11c9679144d931ad73
2020-09-21 15:20:03 -07:00
Dmitry Tantsur
a141b6f17c Limit inspector jobs to 1 testing VM
Currently ironic-base defaults to 2 test VMs and our tests try to
introspect all of them. This puts unnecessary strain on the CI
systems, so return the number back to 1.

Change-Id: I820bba1347954b659fd7469ed542f98ef0a6eaf0
2020-09-21 19:33:26 +02:00
Iury Gregory Melo Ferreira
1f0174bb41 Native zuulv3 grenade multinode multitenant
Based on the native 'grenade-multinode' job

Change-Id: I4d0a23c371bc42c5bf18e79ea7920bd77b066154
2020-09-16 23:33:42 +02:00
Dmitry Tantsur
b5d5e5774c Change [agent]image_download_source=http
As part of the plan to deprecate the iSCSI deploy interface, changing
this option to a value that will work out of the box for more
deployments.

The standalone CI jobs are switched to http as well; the rest of the
jobs are left with swift. The explicit indirect jobs are removed.

Change-Id: Idc56a70478dfe65e9b936006a5355d6b96e536e1
Story: #2008114
Task: #40831
2020-09-08 16:28:31 +02:00
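A hedged sketch of the corresponding job override; the job name is hypothetical, while [agent]image_download_source is the option named in this change:

    - job:
        name: example-ironic-standalone-job
        vars:
          devstack_local_conf:
            post-config:
              $IRONIC_CONF_FILE:
                agent:
                  # Serve images over HTTP instead of Swift temp URLs
                  image_download_source: http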
Zuul
b6cf0432a7 Merge "Remove token-less agent support" 2020-09-07 15:07:17 +00:00
Zuul
3709cce11f Merge "ISO ramdisk virtual media test enablement" 2020-09-06 12:07:02 +00:00
Julia Kreger
5b272b0c46 Remove token-less agent support
Removes the deprecated support for token-less agents, which
better secures the ironic-python-agent<->ironic interactions
by helping ensure that heartbeat operations come from the same
node which originally checked in with Ironic, and that commands
reaching an agent originate from the same ironic deployment the
agent checked in with to begin with.

Story: 2007025
Task: 40814
Change-Id: Id7a3f402285c654bc4665dcd45bd0730128bf9b0
2020-09-04 17:09:39 +00:00
Riccardo Pittau
e9bda223f7 Increase memory of tinyipa vms
Tinyipa is not that tiny anymore and we need to increase the base
memory for VMs in jobs that use it.

Change-Id: Ibd7e87c0b5676eef94512285edaca416635a29ef
2020-08-21 14:41:15 +02:00
Julia Kreger
9e7f1cb570 ISO ramdisk virtual media test enablement
Adds the settings to enable the ramdisk ISO booting tests,
including a bootable ISO image that should boot the machine.

NB: The first depends-on is only for temporary testing of another
change which modifies the substrate ramdisk interface. Since this
change pulls in tempest testing for the ISO ramdisk and uses it, we
might as well use it to test whether that change works, as the other
two patches below are known to be in a good state.

Change-Id: I5d4213b0ba4f7884fb542e7d6680f95fc94e112e
2020-08-17 10:14:38 -07:00
Julia Kreger
ceb0d284b7 Change UEFI PXE job to use tinyipa
The kernel for the UEFI PXE job seems to download
without issue; however, the required ramdisk does not
seem to be making it.

As such, changing the job to use TinyCore to see if the smaller
image helps resolve these issues.

Change-Id: Ie248de2269a63a41b634f7205468edabccc53738
2020-08-03 11:31:52 -07:00
Julia Kreger
e144453c12 Mark IPv6 job as non-voting to unblock the gate
The default dhcp client in tinycore does not automatically trigger
IPv6 address acquisition.

This is a problem when the random spread of nodes and devstack
cause tinycore to get pulled in for the v6 job.

Change-Id: I635a69dfd7450a218474ccb7cecf1c9e29c0a43c
2020-07-28 13:53:14 +00:00
Zuul
bc60f08a59 Merge "Extend base build timeouts" 2020-07-27 14:27:57 +00:00
Zuul
8f754180e8 Merge "Remove old driver name from cross-gating job" 2020-07-23 03:14:54 +00:00
Julia Kreger
9169085db7 Extend base build timeouts
Our ramdisks have swelled, and are taking anywhere from 500-700
seconds to even reach the point where IPA is starting up.

This means that a 900 second build timeout is cutting it close,
and intermittent performance degradation in CI means that a job
may fail simply because it is colliding with the timeout.

One example I deconstructed today where a 900 second timeout was
in effect:

* 08:21:41 Tempest job starts
* 08:21:46 Nova instance requested
* Compute service requests ironic to do the thing.
* Ironic downloads IPA and stages it - ~20-30 seconds
* VM boots and loads ipxe ~30 seconds.
* 08:23:22 - ipxe downloads kernel/ramdisk (time should be completion
             unless apache has changed logging behavior for requests.)
* 08:26:28 - Kernel at 120 second marker and done decompressing
             the ramdisk.
* ~08:34:30 - Kernel itself hit the six hundred second runtime
              marker and hasn't even started IPA.
* 08:35:02 - Ironic declares the deploy failed due to wait timeout.
             ([conductor]deploy_callback_timeout hit at 700 seconds.)
* 08:35:32 - Nova fails the build saying it can't be scheduled.

(Note, I started adding times to figure out the window to myself, so
they are incomplete above.)

The time we can account for in the job is about 14 minutes or 840
seconds. As such, our existing defaults are just not enough to handle
the ramdisk size AND variance in cloud performance.

Change-Id: I4f9db300e792980059c401fce4c37a68c438d7c0
2020-07-21 19:25:15 +00:00
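One plausible way to express a larger build timeout in a devstack-based job, assuming tempest's [compute]build_timeout option is the knob in question (the value is illustrative, not taken from this change):

    - job:
        name: example-ironic-tempest-job
        vars:
          devstack_local_conf:
            test-config:
              $TEMPEST_CONFIG:
                compute:
                  # Illustrative: leave headroom beyond the ~840s
                  # accounted for in the timeline above
                  build_timeout: 1200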
Zuul
c2b8e3f80e Merge "Stop running test_schedule_to_all_nodes in the multinode job" 2020-07-21 14:45:25 +00:00
Dmitry Tantsur
1cb1df76d9 Stop running test_schedule_to_all_nodes in the multinode job
After the recent changes we're running 5 tests already, some of them
using several VMs. This should cover scheduling to different conductors
well enough; the nova test just adds random failures on top.

This allows reducing the number of test VMs to 3 per testing node
(6 in total), reducing the resource pressure and allowing us to give
each VM a bit more RAM.

Also adding missing VM_SPECS_DISK to the subnode configuration.

Change-Id: Idde2891b2f15190f327e4298131a6069c58163c0
2020-07-20 12:35:16 +02:00
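A sketch of the resulting multinode job variables; the job name, RAM and disk values are assumptions for illustration (IRONIC_VM_COUNT appears elsewhere in this log, the rest follows the ironic devstack plugin conventions):

    - job:
        name: example-ironic-multinode-job
        vars:
          devstack_localrc:
            # 3 test VMs per testing node (6 across the two nodes)
            IRONIC_VM_COUNT: 3
            # Assumed: a bit more RAM per VM, plus the disk size that
            # was missing from the subnode configuration
            IRONIC_VM_SPECS_RAM: 512
            IRONIC_VM_SPECS_DISK: 10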
Iury Gregory Melo Ferreira
dc87a189cb Update number of VM on ironic-base
Since we merged the change to have partition and wholedisk
testing on basic_ops, most of the jobs started requiring 2 VMs
to run the tempest tests.

Let's increase it on ironic-base so all jobs will default to 2.

Removing IRONIC_VM_COUNT=2 from jobs that use ironic-base as parent.

Change-Id: I13da6275c04ffc6237a7f2edf25c03b4ddee936a
2020-07-18 10:57:21 +02:00
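A minimal sketch of the ironic-base default described above (the surrounding structure is assumed):

    - job:
        name: ironic-base
        vars:
          devstack_localrc:
            # Default to 2 VMs so both partition and wholedisk
            # basic_ops tests can run
            IRONIC_VM_COUNT: 2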
Zuul
521d796037 Merge "Explicitly set jobs to ML2/OVS" 2020-07-18 03:35:08 +00:00
Zuul
0238034827 Merge "Ironic to use DevStack's neutron-legacy module" 2020-07-18 03:17:03 +00:00