nova-specs/specs/mitaka/implemented/get-valid-server-state.rst


..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Get valid server state
==========================================

https://blueprints.launchpad.net/nova/+spec/get-valid-server-state

When a compute service fails, the power states of the hosted VMs are not
updated. A normal user querying his or her VMs gets no indication of the
failure, nor of any ongoing maintenance.
Problem description
===================
A VM query does not give the user the needed information about a compute host
that has failed or is unreachable, a nova-compute service that has failed or
stopped, or a nova-compute service that is explicitly marked as failed or
disabled. The user should get information about the nova-compute state when
querying his or her VMs, to better understand the situation.
Use Cases
---------
As a user, I want accurate VM state information even when the compute service
fails or the host is down, so I can act quickly on my VMs. Failure information
is most critical to a user with HA-type VMs who needs to switch over quickly
to keep a service running. It also lets the user or admin do something about
the VMs on the host. The action may be case- and deployment-specific, as some
admin actions can be automated by an external service and some left to the
user. Normally, a user can just delete or create a VM.

As a user, I want to get information about maintenance, so I can act on my
VMs. When the user learns that a host is in maintenance (service disabled),
the user knows to plan what to do with his or her VMs, as the host may be
rebooted soon.
Proposed change
===============
A new ``host_status`` field will be added to the ``/servers/{server_id}`` and
``/servers/detail`` endpoints. ``host_status`` will be ``UP`` if nova-compute's
state is up, ``DOWN`` if nova-compute is forced_down, ``UNKNOWN`` if
nova-compute's last_seen_up is not up-to-date, and ``MAINTENANCE`` if
nova-compute is disabled. The needed information can be retrieved through the
host API and the servicegroup API if the new policy allows. Handling of the
forced_down flag is described in this spec:
http://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/mark-host-down.html
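
The mapping described above can be sketched as a small function. This is only
an illustration, not the Nova implementation: the ``ServiceState`` class and
its field names are hypothetical, and the precedence used when several
conditions hold at once (forced_down first, then disabled, then liveness) is
an assumption that this spec does not state.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Hypothetical snapshot of a nova-compute service record."""
    disabled: bool      # service has been administratively disabled
    forced_down: bool   # service has been explicitly marked as down
    is_up: bool         # last_seen_up is recent enough to count as alive


def host_status(service: ServiceState) -> str:
    """Map service flags to the host_status values named in this spec."""
    # Assumed precedence: forced_down, then disabled, then liveness.
    if service.forced_down:
        return "DOWN"
    if service.disabled:
        return "MAINTENANCE"
    return "UP" if service.is_up else "UNKNOWN"
```

An enabled service whose heartbeat is stale maps to ``UNKNOWN`` here, which
matches the "enabled, but not reporting normally" case in the REST API
examples.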
A new policy element will be added to control access to ``host_status``. This
can be used both to prevent this host-based data from being disclosed and to
eliminate the performance impact of this feature.
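
A minimal sketch of how a policy gate could serve both purposes follows. The
names ``build_server_view``, ``is_authorized`` and ``lookup_host_status`` are
hypothetical stand-ins, not Nova APIs; only the policy name comes from this
spec. The point is that the service lookup runs only when the policy check
passes, so a denied user neither sees the field nor pays for the lookup.

```python
from typing import Callable, Dict

# Policy name taken from this spec.
HOST_STATUS_POLICY = "os_compute_api:servers:show:host_status"


def build_server_view(
    server: Dict[str, str],
    is_authorized: Callable[[str], bool],
    lookup_host_status: Callable[[str], str],
) -> Dict[str, Dict[str, str]]:
    """Build a response body, adding host_status only if policy allows."""
    view = {"id": server["id"], "name": server["name"]}
    if is_authorized(HOST_STATUS_POLICY):
        # The (possibly expensive) lookup happens only for permitted users.
        view["host_status"] = lookup_host_status(server["host"])
    return {"server": view}
```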
Alternatives
------------
When returning the VM power_state, check the service status for the host. If
the service is ``forced_down``, return ``UNKNOWN`` instead. This would be an
API-only change; it is NOT proposed that we update the DB value to
``UNKNOWN``. This means we retain a record of the VM power state independent
of the service state, which may be interesting in case the host lost network
rather than power. Community feedback indicated that as the power_state is only
true for a point in time anyway, technically the state is always ``UNKNOWN``.

``os-services/force-down`` could mark all VMs managed by the affected service
as ``UNKNOWN`` in the DB. This would sometimes be wrong, as a VM can be up even
if its host is unreachable. It would also create a need to remove this state
data when a VM is evacuated to another compute node.

A possible extension is a host ``NEEDS_MAINTENANCE`` state, which would show
that maintenance is required soon. This would allow users who monitor this info
to prepare their VMs for downtime and enter maintenance at a time convenient
for them.
An extension could be added to filter the responses of the ``/servers`` and
``/servers/detail`` endpoints by ``host_status``.
Data model impact
-----------------
None
REST API impact
---------------
GET ``/v2.1/{tenant_id}/servers/{server_id}`` and
``/v2.1/{tenant_id}/servers/detail`` will return the ``host_status`` field if
the ``os_compute_api:servers:show:host_status`` policy allows it for the user.
This will require a microversion.

Case where nova-compute enabled and reporting normally::

    GET /v2.1/{tenant_id}/servers/{server_id}
    200 OK
    {
      "server": {
        "host_status": "UP",
        ...
      }
    }

Case where nova-compute enabled, but not reporting normally::

    GET /v2.1/{tenant_id}/servers/{server_id}
    200 OK
    {
      "server": {
        "host_status": "UNKNOWN",
        ...
      }
    }

Case where nova-compute enabled, but forced_down::

    GET /v2.1/{tenant_id}/servers/{server_id}
    200 OK
    {
      "server": {
        "host_status": "DOWN",
        ...
      }
    }

Case where nova-compute disabled::

    GET /v2.1/{tenant_id}/servers/{server_id}
    200 OK
    {
      "server": {
        "host_status": "MAINTENANCE",
        ...
      }
    }

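
As an illustration of how a client might consume the responses above, the
following sketch parses a response body and suggests a reaction in line with
the use cases in this spec. The function name and the action labels are
hypothetical and chosen only for this example. The field can be absent when
the policy does not grant it, so that case is handled explicitly.

```python
import json

# Hypothetical action labels, chosen only for this example.
ACTIONS = {
    "UP": "none",
    "UNKNOWN": "investigate",
    "DOWN": "switch-over",
    "MAINTENANCE": "plan-downtime",
}


def suggest_action(response_body: str) -> str:
    """Suggest a reaction based on the host_status of one server."""
    server = json.loads(response_body)["server"]
    status = server.get("host_status")
    if status is None:
        # Field not returned, for example because policy does not allow it.
        return "not-visible"
    # Treat any unrecognized value conservatively.
    return ACTIONS.get(status, "investigate")
```

For example, ``suggest_action('{"server": {"host_status": "DOWN"}}')`` returns
``"switch-over"`` under these assumed labels.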
This may be presented by python-novaclient as::

    +-------+------+--------+------------+-------------+----------+-------------+
    | ID    | Name | Status | Task State | Power State | Networks | Host Status |
    +-------+------+--------+------------+-------------+----------+-------------+
    | 9a... | vm1  | ACTIVE | -          | RUNNING     | xnet=... | UP          |
    +-------+------+--------+------------+-------------+----------+-------------+

New policy element to be added to allow assigning permission to see
host_status:

::

    "os_compute_api:servers:show:host_status": "rule:admin_api"

Security impact
---------------
Normal users may be able to correlate host states across multiple VMs to draw
conclusions about the cloud topology. This can be prevented by not granting the
policy.
Notifications impact
--------------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
An additional database query will be required to look up the service when a
server detail request is received.
Other deployer impact
---------------------
None
Developer impact
----------------
None
Implementation
==============
Assignee(s)
-----------
Primary assignee: Tomi Juvonen
Other contributors: None
Work Items
----------
* Expose host_status as detailed.
* Update python-novaclient.
Dependencies
============
None
Testing
=======
Unit and functional test cases need to be added.
Documentation Impact
====================
API change needs to be documented:
* Compute API extensions documentation.
http://developer.openstack.org/api-ref-compute-v2.1.html
References
==========
* https://blueprints.launchpad.net/nova/+spec/mark-host-down
* OPNFV Doctor project: https://wiki.opnfv.org/doctor