There are cases where the policy file is freshly regenerated and ends
up containing only the new defaults, while the expectation is that the
old, deprecated rules keep working.
If a rule is present in the policy file, it takes priority over its
default, so either the rule should not be present in the policy file
or users need to update their token to match the overridden rule's
permission.
This issue has always existed when any policy defaults were changed
while the old defaults were still supported as deprecated; it surfaced
as a broken case now because we have changed all the policies to new
defaults.
This also adds a nova-status upgrade check to detect such policy files.
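A minimal sketch of the kind of scan such a check can do, assuming the
policy file is YAML and using a made-up set of rule names; the real
check works against the defaults registered in code via oslo.policy:

  import yaml

  # Hypothetical rule names whose defaults changed; the real check
  # derives this from the policies registered in code.
  CHANGED_DEFAULTS = {'os_compute_api:servers:create',
                      'os_compute_api:servers:show'}

  def find_overridden_rules(policy_file_path):
      """Return rule names present in the policy file that override
      the new defaults."""
      with open(policy_file_path) as f:
          file_rules = yaml.safe_load(f) or {}
      return sorted(set(file_rules) & CHANGED_DEFAULTS)

  for rule in find_overridden_rules('/etc/nova/policy.yaml'):
      print('Rule %s is overridden in the policy file' % rule)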
Related-Bug: #1875418
Change-Id: Id9cd65877e53577bff22e408ca07bbeec4407f6e
Since I537ed74503d208957f0a97af3ab754a6750dac20 had some clean-up comments,
we can just provide a follow-up change.
Change-Id: Ie8b5147322e13ad7df966b5c3c41ef0418e4f64c
Related-Bug: #1793569
There are different situations in which allocations can become
orphaned.
Adding a new nova-manage command that looks up all resource providers
and checks against the related compute nodes whether they have
orphaned allocations.
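A rough sketch of the audit logic over plain data (the input shapes
here are hypothetical, not nova's internal objects):

  def find_orphaned_allocations(providers, instances_by_host):
      """Yield (provider_uuid, consumer_uuid) for allocations whose
      consumer is not an instance known on the provider's compute host.

      providers: list of dicts like
          {'uuid': ..., 'host': ..., 'allocations': [consumer_uuid, ...]}
      instances_by_host: dict mapping host name -> set of instance uuids
      """
      for provider in providers:
          known = instances_by_host.get(provider['host'], set())
          for consumer_uuid in provider['allocations']:
              if consumer_uuid not in known:
                  yield provider['uuid'], consumer_uuid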
Change-Id: I537ed74503d208957f0a97af3ab754a6750dac20
Closes-Bug: #1793569
Placement microversion 1.35 gives us the root_required queryparam to GET
/allocation_candidates, allowing us to filter out candidates where the
*root* provider has/lacks certain traits, independent of traits
specified in any of the individual request groups.
Use it, and add an affordance for specifying such traits on the
RequestSpec, which allows us to fix up the couple of request filters
that were hacking traits into RequestSpec.flavor.
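For illustration only (the trait names and resources are placeholders),
the resulting request looks roughly like this; required root traits are
comma-separated and forbidden ones are prefixed with '!':

  from urllib.parse import urlencode

  # Placeholder values; nova's placement client builds the real request.
  params = {
      'resources': 'VCPU:1,MEMORY_MB:512',
      'root_required': 'COMPUTE_SUPPORTS_MULTIATTACH,!CUSTOM_WINDOWS_LICENSED',
  }
  url = '/allocation_candidates?' + urlencode(params)
  headers = {'OpenStack-API-Version': 'placement 1.35'}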
Change-Id: I44f02044ce178e84c23d178e5a23a3aa1208e502
This legacy service is no longer used and was deprecated during the
Stein cycle [1]. It's time to say adios and remove it in its entirety.
This is pretty straightforward, with the sole exception of the schema
for the 'remote-consoles' API, which has to continue supporting
requests for type 'xvpvnc' even if we can't fulfil those requests now.
[1] https://review.opendev.org/#/c/610076/
Part of blueprint remove-xvpvncproxy
Depends-On: https://review.opendev.org/695853
Change-Id: I2f7f2379d0cd54e4d0a91008ddb44858cfc5a4cf
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
If a nova-manage command is executed without the -h option
or a subcommand the user gets an ugly traceback. This is
easily recreated:
$ tox -e venv -- nova-manage db
Make the action argument required, so we get a helpful error message
instead.
$ nova-manage db
usage: nova-manage db [-h]
{archive_deleted_rows,ironic_flavor_migration,
null_instance_uuid_scan,online_data_migrations,
purge,sync,version}
...
nova-manage db: error: the following arguments are required: action
Note that unit tests appear to be impossible for this, since testing
it attempts to initialize an oslo.config 'CONF' singleton, which is
something we've already done in 'nova.test' and can't do again.
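The gist of the fix, as a standalone argparse sketch (nova-manage
wires its parser up differently, so this is only an illustration of
the behaviour):

  import argparse

  parser = argparse.ArgumentParser(prog='nova-manage db')
  # Marking the subcommand as required makes argparse print
  # "the following arguments are required: action" instead of the
  # traceback we used to get.
  subparsers = parser.add_subparsers(dest='action')
  subparsers.required = True
  subparsers.add_parser('sync')
  subparsers.add_parser('version')

  args = parser.parse_args()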
Change-Id: I24d03eed3aa3b882c49916938f4c25d76fd4e831
Closes-Bug: #1837199
Co-Authored-By: Stephen Finucane <stephenfin@redhat.com>
This has come up a few times via support questions from operators who
have a nova cell database out of sync with the placement database,
resulting in a mismatch of compute nodes to provider uuids, and who
just want to wipe the placement database and rebuild it from the
current data in nova. This provides a document with the high-level
steps to do that.
Change-Id: Ie4fed22615f60e132a887fe541771c447fae1082
This commit cuts us over to using placement microversion 1.34 for GET
/allocation_candidates, thereby supporting string request group suffixes
(added in 1.33) when specified in flavor extra_specs.
The mappings (added in 1.34) are not used in code yet, but a future
patch will tie the group suffixes to the RequestGroup.requester_id so
that it can be correlated after GET /a_c. This will allow us to get rid
of map_requested_resources_to_providers, which was a hack to bridge the
gap until we had mappings from placement.
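As a rough illustration (resource class, trait and suffix values are
placeholders), a string-suffixed request group and the 1.34 mappings
look something like this:

  # A granular request group with a string suffix (microversion >= 1.33).
  query = {
      'resources_PORT1': 'NET_BW_EGR_KILOBIT_PER_SEC:1000',
      'required_PORT1': 'CUSTOM_PHYSNET_PHYSNET0',
      'group_policy': 'none',
  }
  # With microversion >= 1.34 each allocation request in the response
  # also carries a mapping of group suffix to the provider(s) that
  # satisfied it, roughly of the form:
  mappings = {'_PORT1': ['<resource provider uuid>']}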
Change-Id: I52499ff6639c1a5815a8557b22dd33106dcc386b
Get excited, people. It's finally dying, for real. There is a lot more
doc work needed here, but this is a start. No need for a release note
modification since we've already said that nova-network has been
removed, so there's no point in saying that the service itself has been
removed since that's implicit.
Change-Id: I18d73212f9d98bc75974a024cf6fd872fdfb1ca4
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
There are actually a few things here that rely on a running
nova-dhcpbridge instance, but since it's not possible to start
nova-network now, that shouldn't matter.
Change-Id: I63447baeaac0be3fb7f919bfe588da50133c74d7
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
This legacy service was only compatible with the XenServer driver and
has effectively been replaced by the noVNC console proxy service. Remove
the service. The API that provided remote access to this service,
'os-consoles', was removed in a previous change. Note that
'os-remote-consoles' is unrelated and therefore is not removed, though
it will now reject requests for XVP VNC consoles.
This was previously discussed and agreed upon on openstack-dev [1] and
openstack-discuss [2].
Part of blueprint remove-xvpvncproxy
[1] http://lists.openstack.org/pipermail/openstack-dev/2018-October/135413.html
[2] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005369.html
Change-Id: Ib1ff32f04b16af7981471f67c8e0bf04e6ecb6be
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
While we do not have an automated fix for bug 1849479, this provides
a troubleshooting document for working around that issue, where
allocations from a server that was evacuated from a down host need
to be cleaned up manually in order to delete the resource provider
and associated compute node/service.
In general this is also a useful guide for linking up the various
resources and terms in nova and how they are reflected in placement,
along with the relevant commands, which is probably something we
should do more of in our docs.
Change-Id: I120e1ddd7946a371888bfc890b5979f2e19288cd
Related-Bug: #1829479
A blocker migration was added in Train [1] to force
deployments to make sure they have completed the
services.uuid online migration (added in Pike). Now
that we're in Ussuri we can drop that online data
migration code.
Note that InstanceListWithServicesTestCase is removed
because the scenario is now invalid with the blocker
DB migration.
[1] I8927b8a4513dab242d34953d13dd2cc95393dc80
Change-Id: If77702f0c3212f904443f627037782f9ad7b3b55
Document the config options for this, most of which come from
'oslo.config'. Only the '--remote_debug-host' and '--remote_debug-port'
options are not documented since they shouldn't be available for this
command (it's not a service) and will be removed in a later patch.
Change-Id: Ie321268cc56da04ff4111f7c34a29ba23d416e66
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
In change I3fd9fe0317bcd1a59c366e60154b095e8df92327, we deprecated the
'--version' option in favour of a 'VERSION' positional. This was later
removed in change I7795e308497de66329f288b43ecfbf978d67ad75. Update the
docs to reflect this. 'nova-manage api_db sync' was already corrected
in change Ibc49f93b8bd51d9a050acde5ef3dc8aad91321ca and does not need
the same fix, though a minor tweak is included.
Change-Id: I2c0fb04fbc3f6d2074596894782ed3143b0c2338
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-Bug: #1840807
This patch modifies Nova objects to allow the Destination object to
store a list of forbidden aggregates that placement should ignore in
the 'GET /allocation_candidates' API at microversion 1.32, including
the code to generate the placement query string syntax from them.
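Illustratively (the aggregate uuids are placeholders), forbidden
aggregates are expressed at 1.32 with a '!in:' prefixed member_of
query parameter:

  forbidden_aggs = ['11111111-1111-1111-1111-111111111111',
                    '22222222-2222-2222-2222-222222222222']
  querystring = 'member_of=!in:' + ','.join(forbidden_aggs)
  # i.e. GET /allocation_candidates?...&member_of=!in:<agg1>,<agg2>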
Change-Id: Ic2bcee40b41c97170a8603b27b935113f0633de7
Implements: blueprint placement-req-filter-forbidden-aggregates
The archive_deleted_rows command depends on the DB connection config
from the config file, and when using superconductor mode there are
several config files for different cells. In that case the command can
only archive rows in the cell0 DB, as it only reads nova.conf.
This patch adds an --all-cells parameter to the command, reads the
info for all cells from the api_db and then archives rows across all
cells.
The --all-cells parameter is passed on to the purge command when
archive_deleted_rows is called with both --all-cells and --purge.
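In outline, the new path iterates the cell mappings stored in the API
database and targets each cell in turn; a simplified sketch, assuming
nova is importable and configured (archive_rows_for_cell is a
hypothetical helper standing in for the existing per-DB archive code):

  from nova import context as nova_context
  from nova import objects

  objects.register_all()

  def archive_all_cells(archive_rows_for_cell, max_rows):
      ctxt = nova_context.get_admin_context()
      for cell in objects.CellMappingList.get_all(ctxt):
          # target_cell switches the DB connection to this cell's DB.
          with nova_context.target_cell(ctxt, cell) as cctxt:
              archive_rows_for_cell(cctxt, max_rows)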
Co-Authored-By: melanie witt <melwittt@gmail.com>
Change-Id: Id16c3d91d9ce5db9ffd125b59fffbfedf4a6843d
Closes-Bug: #1719487
This just changes the bullet format to a more appealing
table format and is consistent with the other commands
in here that use a table format for documenting return
codes.
Change-Id: I427e64f49f152231810ca48908f1ceff5b8a41d9
This just converts the bullet list to the prettier table
format like archive_deleted_rows and map_instances use.
Change-Id: I33b78a4d55c9a34fca6ab20b2b9e8db41d747da5
This pretties up the return code documentation for the map_instances
command and, while in here, formats the --max-count and --reset option
mentions in the description.
Change-Id: I03d67e75f5633f4b21bf753698674747a0ed06a8
If any nova-manage command fails in an unexpected way and the error
bubbles back up to main(), the return code will be 1.
There are some commands, like archive_deleted_rows, map_instances and
heal_allocations, which return 1 for flow control with automation
systems. As a result, those tools could be calling the command
repeatedly, getting rc=1 and thinking there is more work to do when
really something is failing.
This change makes the unexpected error code 255, updates the
relevant nova-manage command docs that already mention return
codes in some kind of list/table format, and adds an upgrade
release note just to cover our bases in case someone was for
some weird reason relying on 1 specifically for failures rather
than anything greater than 0.
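The shape of the change in main(), sketched and simplified from the
real dispatch code:

  import sys
  import traceback

  def main(command):
      try:
          # Commands return their own codes: 0 for success, small
          # positive integers for flow control (e.g. 1 can mean
          # "more work to do").
          return command()
      except Exception:
          traceback.print_exc()
          # Unexpected failures now map to 255 so automation can tell
          # them apart from the flow-control codes above.
          return 255

  sys.exit(main(lambda: 0))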
Change-Id: I2937c9ef00f1d1699427f9904cb86fe2f03d9205
Closes-Bug: #1840978
The archive_deleted_rows command has a specific set of return
codes for both errors and flow control so we should document
those in the CLI guide.
A FIXME is left in the code because of the odd case where you could
get 1 back meaning either "we archived some stuff" or "something
really bad happened".
Change-Id: Id10bd286249ad68a841f2bfb3a3623b429b2b3cc
Fix up some sphinx-isms in nova-manage.rst:
- Replace ``[group]/option`` with :oslo.config:option:`group.option`
- Replace ``[group]`` with :oslo.config:group:`group`
- Replace `default role used for italics` with *explicit italics*
- Fix up a mention of [placement_database] -- for which we can't use
:oslo.config:* because that section doesn't exist anymore -- to
pertain only to Rocky & Stein, where it did exist.
Change-Id: I9cfb9d0b20d4dfc18236a2d528a8f65534c9a263
CONF.api_database.connection is a variable in code, not something
an operator needs to understand, so this changes that mention in the
docs and in the error message for the nova-manage db
archive_deleted_rows command.
Change-Id: If27814e0006a6c33ae6270dff626586c41eafcad
Closes-Bug: #1839391
If the instances per host are not cached in the HostManager,
we look up the HostMapping per candidate compute node during
each scheduling request to get the CellMapping so we can target
that cell database to pull the instance uuids on the given host.
For example, if placement returned 20 compute node allocation
candidates and we don't have the instances cached for any of those,
we'll do 20 queries to the API DB to get host mappings.
We can improve this by caching the host-to-cell-uuid mapping after
the first lookup for a given host and, after that, getting the
CellMapping from the cells cache (which is a dict, keyed by cell uuid,
of the CellMapping for that cell).
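A simplified sketch of the caching idea (the attribute and helper
names are illustrative, not the exact HostManager code):

  class HostToCellCache(object):
      """Cache host name -> cell uuid so repeated scheduling requests
      do not need an API DB HostMapping query per candidate host."""

      def __init__(self, lookup_cell_uuid_for_host, cells_by_uuid):
          self._lookup_cell_uuid_for_host = lookup_cell_uuid_for_host
          self._cells_by_uuid = cells_by_uuid  # existing cells cache
          self._host_to_cell_uuid = {}

      def get_cell_for_host(self, host):
          cell_uuid = self._host_to_cell_uuid.get(host)
          if cell_uuid is None:
              # First lookup for this host: one API DB query, then cache.
              cell_uuid = self._lookup_cell_uuid_for_host(host)
              self._host_to_cell_uuid[host] = cell_uuid
          return self._cells_by_uuid[cell_uuid]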
Change-Id: Ic6b1edfad2e384eb32c6942edc522ee301123cbc
Related-Bug: #1737465
The "Request Spec Migration" upgrade status check was added
in Rocky [1] and the related online data migration was removed
in Stein [2]. Now that we're in Train we should be able to
remove the upgrade status check.
The verify install docs are updated to remove the now-removed
check along with "Console Auths" which were removed in Train [3].
[1] I1fb63765f0b0e8f35d6a66dccf9d12cc20e9c661
[2] Ib0de49b3c0d6b3d807c4eb321976848773c1a9e8
[3] I5c7e54259857d9959f5a2dfb99102602a0cf9bb7
Change-Id: I6dfa0b226ab0b99864fc27209ebb7b73e3f73512
Before I97f06d0ec34cbd75c182caaa686b8de5c777a576 it was possible to
create servers with neutron ports which had resource_request (e.g. a
port with QoS minimum bandwidth policy rule) without allocating the
requested resources in placement. So there could be servers for which
the allocation needs to be healed in placement.
This patch extends the nova-manage heal_allocation CLI to create the
missing port allocations in placement and update the port in neutron
with the resource provider uuid that is used for the allocation.
There are known limitations of this patch. It does not try to
reimplement Placement's allocation candidate functionality, therefore
it cannot handle the situation when there is more than one RP in the
compute tree which provides the required traits for a port. Deciding
which RP to use in this situation would require 1) the in_tree
allocation candidate support from placement, which is not available
yet, and 2) information about which PCI PF the VF of an SRIOV port is
allocated from and which RP represents that PCI device in placement.
This information is only available on the compute hosts.
For the unsupported cases the command will fail gracefully. As soon as
migration support for such servers is implemented in the blueprint
support-move-ops-with-qos-ports, the admin can heal the allocation of
such servers by migrating them.
During healing both placement and neutron need to be updated. If any
of those updates fails, the code tries to roll back the previous
updates for the instance to make sure that the healing can be re-run
later without issue. However, if the rollback fails, the script will
terminate with
an error message pointing to documentation that describes how to
recover from such a partially healed situation manually.
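The update-and-roll-back pattern, in outline (every helper here is a
hypothetical stand-in for the real code):

  def heal_port_allocations(instance_uuid, port_allocations,
                            update_placement, update_neutron_port,
                            rollback_placement):
      """port_allocations: dict of port_id -> (rp_uuid, resources)"""
      # 1) add the missing port resources to the instance's allocation
      update_placement(instance_uuid, port_allocations)
      try:
          # 2) record the chosen resource provider on each neutron port
          for port_id, (rp_uuid, _resources) in port_allocations.items():
              update_neutron_port(port_id, rp_uuid)
      except Exception:
          # Roll back the placement update so the command can be re-run;
          # if this also fails, the command exits with a pointer to the
          # manual recovery documentation.
          rollback_placement(instance_uuid, port_allocations)
          raise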
Closes-Bug: #1819923
Change-Id: I4b2b1688822eb2f0174df0c8c6c16d554781af85
Since cells v2 was introduced, nova operators must run two commands to
migrate the database schemas of nova's databases - nova-manage api_db
sync and nova-manage db sync. It is necessary to run them in this order,
since the db sync may depend on schema changes made to the api database
in the api_db sync. Executing the db sync first may fail, for example
with the following seen in a Queens to Rocky upgrade:
nova-manage db sync
ERROR: Could not access cell0.
Has the nova_api database been created?
Has the nova_cell0 database been created?
Has "nova-manage api_db sync" been run?
Has "nova-manage cell_v2 map_cell0" been run?
Is [api_database]/connection set in nova.conf?
Is the cell0 database connection URL correct?
Error: (pymysql.err.InternalError) (1054, u"Unknown column
'cell_mappings.disabled' in 'field list'") [SQL: u'SELECT
cell_mappings.created_at AS cell_mappings_created_at,
cell_mappings.updated_at AS cell_mappings_updated_at,
cell_mappings.id AS cell_mappings_id, cell_mappings.uuid AS
cell_mappings_uuid, cell_mappings.name AS cell_mappings_name,
cell_mappings.transport_url AS cell_mappings_transport_url,
cell_mappings.database_connection AS
cell_mappings_database_connection, cell_mappings.disabled AS
cell_mappings_disabled \nFROM cell_mappings \nWHERE
cell_mappings.uuid = %(uuid_1)s \n LIMIT %(param_1)s'] [parameters:
{u'uuid_1': '00000000-0000-0000-0000-000000000000', u'param_1': 1}]
(Background on this error at: http://sqlalche.me/e/2j85)
Despite this error, the command actually exits zero, so deployment tools
are likely to continue with the upgrade, leading to issues down the
line.
This change modifies the command to exit 1 if the cell0 sync fails.
This change also clarifies this ordering in the upgrade and nova-manage
documentation, and adds information on exit codes for the command.
Change-Id: Iff2a23e09f2c5330b8fc0e9456860b65bd6ac149
Closes-Bug: #1832860
The --before option to nova-manage db purge and archive_deleted_rows
accepts a string to be parsed by dateutil.parser.parse() with
fuzzy=True. This is fairly forgiving, but doesn't handle e.g. "now - 1
day". This commit adds some clarification to the help strings, and some
examples to the docs.
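For example, with python-dateutil (the same parser the option uses):

  import datetime
  from dateutil import parser

  # A concrete, even loosely worded, date parses fine:
  print(parser.parse('October 21, 2015', fuzzy=True))
  # Relative expressions such as 'now - 1 day' are not understood;
  # compute the timestamp yourself and pass the result instead:
  before = datetime.datetime.now() - datetime.timedelta(days=1)
  print(before.isoformat())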
Change-Id: Ib218b971784573fce16b6be4b79e0bf948371954
We're going to remove all the code, but first, remove the docs.
Part of blueprint remove-consoleauth
Change-Id: Ie96e18ea7762b93b4116b35d7ebcfcbe53c55527
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
When the 'nova-manage cellv2 discover_hosts' command is run in parallel
during a deployment, it results in simultaneous attempts to map the
same compute or service hosts at the same time, resulting in
tracebacks:
"DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate
entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\")
[SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id,
host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s,
%(host)s)'] [parameters: {'host': u'compute-0.localdomain',
'cell_id': 5, 'created_at': datetime.datetime(2019, 4, 10, 15, 20,
50, 527925), 'updated_at': None}]
This adds more information to the command help and adds a warning
message when duplicate host mappings are detected, with guidance about
how to run the command. The command will return 2 if a duplicate host
mapping is encountered and the documentation is updated to explain
this.
This also adds a warning to the scheduler periodic task to recommend
enabling the periodic on only one scheduler to prevent collisions.
We choose to warn and stop instead of ignoring DBDuplicateEntry because
there could potentially be a large number of parallel tasks competing
to insert duplicate records where only one can succeed. If we ignore
and continue to the next record, the large number of tasks will
repeatedly collide in a tight loop until all get through the entire
list of compute hosts that are being mapped. So we instead stop the
colliding task and emit a message.
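In outline, the new handling looks something like this (simplified;
the real code catches oslo.db's DBDuplicateEntry around the
HostMapping create, and the helpers here are hypothetical):

  class DuplicateEntry(Exception):
      """Stand-in for oslo_db.exception.DBDuplicateEntry."""

  def map_unmapped_hosts(unmapped_hosts, create_host_mapping, warn):
      for host in unmapped_hosts:
          try:
              create_host_mapping(host)
          except DuplicateEntry:
              warn('Host %s is already mapped; another discover_hosts '
                   'run is probably executing in parallel. Run the '
                   'command from only one place at a time.' % host)
              return 2  # new exit code for the duplicate-mapping case
      return 0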
Closes-Bug: #1824445
Change-Id: Ia7718ce099294e94309103feb9cc2397ff8f5188
Add a parameter to limit the archival of deleted rows by date. That is,
only rows related to instances deleted before the provided date will be
archived.
This option works together with --max_rows; if both are specified, both
will take effect.
Closes-Bug: #1751192
Change-Id: I408c22d8eada0518ec5d685213f250e8e3dae76e
Implements: blueprint nova-archive-before
The compute API has required cinder API >= 3.44 since Queens [1] for
working with the volume attachments API as part of the wider
volume multi-attach support.
In order to start removing the compatibility code in the compute API
this change adds an upgrade check for the minimum required cinder API
version (3.44).
[1] Ifc01dbf98545104c998ab96f65ff8623a6db0f28
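A sketch of the version comparison the check performs ('3.44' is the
documented minimum; how the reported maximum microversion is
discovered from cinder is elided here):

  MIN_CINDER_MICROVERSION = (3, 44)

  def cinder_version_ok(reported_max_version):
      """reported_max_version: a string such as '3.59' taken from
      cinder's version document."""
      major, minor = (int(p) for p in reported_max_version.split('.'))
      return (major, minor) >= MIN_CINDER_MICROVERSION

  # e.g. cinder_version_ok('3.59') -> True
  #      cinder_version_ok('3.27') -> False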
Change-Id: Ic9d1fb364e06e08250c7c5d7d4bdb956cb60e678
This patch adds the translation of `RequestGroup.in_tree` to the
actual placement query and bumps the microversion to enable it.
A release note is added for this change.
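Illustratively (the uuid is a placeholder), the translation adds an
in_tree query parameter, either unsuffixed or per request group:

  rp_uuid = '00000000-0000-0000-0000-000000000000'  # placeholder
  params = {
      'resources': 'VCPU:1,MEMORY_MB:512',
      'in_tree': rp_uuid,  # unsuffixed group
      # a granular group would use e.g. 'in_tree1': rp_uuid
  }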
Change-Id: I8ec95d576417c32a57aa0298789dac6afb0cca02
Blueprint: use-placement-in-tree
Related-Bug: #1777591
These are no longer necessary with the removal of cells v1. A check for
cells v1 in 'nova-manage cell_v2 simple_cell_setup' is also removed,
meaning this can no longer return the '2' exit code.
Part of blueprint remove-cells-v1
Change-Id: I8c2bfb31224300bc639d5089c4dfb62143d04b7f
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>