..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===============================
Nova Server Count API Extension
===============================

https://blueprints.launchpad.net/nova/+spec/server-count-api

This blueprint proposes a new REST API extension that returns the number of
servers that match the specified search criteria.


Problem description
===================

There is currently no API that retrieves summary count data for servers that
match a set of search filters; for example, the total number of servers in a
given state.

Retrieving all servers and then counting them on the client does not scale
because pagination queries must be implemented (see the Alternatives section
for a detailed explanation).

The use cases that are driving this API extension are derived from a user's
experience in a GUI.

Use Case 1: A UI dashboard that shows a cloud administrator the number of
servers in various states. A new API extension is needed to retrieve the
server count data associated with various filters (for example, servers in
active state, servers in building state, or servers in error state) for the
entire cloud.

Assume that you have 5k instances in your cloud. The admin wants to see a
summary of instances in each state -- this API extension will help them
quickly determine if there is an issue that needs attention; for example, if
there are many instances in 'error'. It is likely that once the admin sees
this count they will then drill down into the data. However, without this new
API extension, the admin will not know if there is an unacceptable number of
systems in a given state without drilling down into each set.

From a deployer's perspective, creating this dashboard with the existing APIs
is very painful since pagination is required (assume more than the default of
1k items). Also, the processing time to get this data using the existing APIs
(even the non-detailed one) is slow, and the result is possibly inaccurate
(see Use Case 3), compared to the processing time to get and return a single
number.

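As a rough illustration of Use Case 1, the sketch below builds one count
request per server state for a dashboard. The URL path and the ``status`` and
``all_tenants`` parameters are the ones proposed in this spec; the helper
names, the endpoint, the tenant id and the use of ``1`` as the boolean value
are assumptions made for the example.

```python
from urllib.parse import urlencode

# Server states shown on the hypothetical dashboard.
DASHBOARD_STATES = ("ACTIVE", "BUILD", "ERROR")


def count_url(endpoint, tenant_id, **filters):
    """Build the URL of the proposed v2 count API for the given filters."""
    base = "%s/v2/%s/servers/count" % (endpoint.rstrip("/"), tenant_id)
    # Sort the filters so that the generated URL is deterministic.
    query = urlencode(sorted(filters.items()))
    return base + ("?" + query if query else "")


def dashboard_urls(endpoint, tenant_id):
    """Return one count URL per state, across all tenants (admin only)."""
    return {
        state: count_url(endpoint, tenant_id, status=state, all_tenants=1)
        for state in DASHBOARD_STATES
    }
```

Each URL would return a single number, so the dashboard needs one small
request per state instead of paging through every server.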
Use Case 2: Showing filtered data in a table in the UI. Assume that the UI
supports tables that show filtered data (e.g., a table showing only instances
in 'error' state) and uses pagination to get the data. Many users do not like
"infinite scrolling" where they have no idea how many items really are in the
list (more just show up as you scroll down or navigate to the next page).
Using this new count API, the UI table can indicate how many total items are
in the list (e.g., showing 1-20 of 1000).

Assume that you have 500 instances in error state and that you can open a UI
table showing their details -- when creating the table, assume that the UI
uses a page size of 100 and assume that there is no dashboard showing the
'error' count. In this case, the admin logs into the UI and wants to know
how many servers are in error state. In order to do this, the admin navigates
to the 'servers in error state' table -- the UI only retrieves the first 100
items so it is impossible to know if there are 101 total items or 500 total
items. As an admin, I would like to know what the total number of items in the
table is.

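A minimal sketch of how a UI table could combine its own paging position with
the value returned by the count API to produce the "showing 1-20 of 1000"
text from Use Case 2. The function name and its arguments are hypothetical.

```python
def page_label(offset, page_size, total):
    """Return text such as 'showing 1-20 of 1000' for a table footer.

    `offset` is the zero-based index of the first row on the page and
    `total` is the value returned by the count API.
    """
    if total == 0:
        return "showing 0 of 0"
    first = offset + 1
    # The last page may be shorter than page_size.
    last = min(offset + page_size, total)
    return "showing %d-%d of %d" % (first, last, total)
```

With a page size of 100 and 500 servers in error state, the first page would
be labelled "showing 1-100 of 500", which answers the admin's question
without fetching the other four pages.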
Use Case 3: Inherent timing window when adding a new item with limit/marker
processing. Assume that you are using pagination to iterate over the data to
get a count. When you are getting page n, it is possible that page n-1 has a
new item x that was just added. Due to the sorting of the data, limit/marker
will not detect that this new item was added.

While this timing window is small, it does exist, so a count obtained with
this method is not guaranteed to be accurate.

I realize that you can argue that the count API may not handle this UI use case
either. However, the count will always be accurate from the DB at the time that
the .count() function was processed -- the same claim cannot be made about
getting the count using limit/marker since multiple DB calls are being invoked
to calculate the number.


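The timing window in Use Case 3 can be reproduced with a small simulation.
This is only a sketch: the rows are plain sorted integers standing in for the
sort key, and the new row is assumed to sort before the current marker.

```python
import bisect


def get_page(rows, limit, marker=None):
    """Return up to `limit` keys that sort strictly after `marker`."""
    start = 0 if marker is None else bisect.bisect_right(rows, marker)
    return rows[start:start + limit]


def count_by_paging(rows, limit, insert_after_page=None, new_key=None):
    """Count rows page by page, optionally adding a row between two pages."""
    seen = 0
    pages_read = 0
    marker = None
    while True:
        page = get_page(rows, limit, marker)
        if not page:
            return seen
        seen += len(page)
        marker = page[-1]
        pages_read += 1
        if pages_read == insert_after_page:
            # Simulates a server created while the client is paging.
            bisect.insort(rows, new_key)


rows = [10, 20, 30, 40, 50]
paged = count_by_paging(rows, limit=2, insert_after_page=2, new_key=15)
single = len(rows)  # stands in for one count() query issued afterwards
```

Here ``paged`` is 5 while ``single`` is 6: the row with key 15 lands on a page
that was already read, so the marker-based walk never sees it.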
Proposed change
===============

The new count API extension must accept the same filter values as the
existing /servers and /servers/detail APIs and re-use the existing filter
processing (once the common parts are refactored into utility methods that
can be utilized by both paths). Once the filters are processed to create the
query object, the number of matching servers will be retrieved from the
database and returned.

The count API extension will be both per tenant and global (admin-only),
similar to the existing /servers APIs. An admin can supply the 'all_tenants'
parameter to signify that server count data should be retrieved globally.

This new flow requires new functions to retrieve the count value in the
compute API layer, in the instance layer, and in the database layers; all
functions return an integer value. The naming conventions for the functions
will follow the existing functions used for retrieving server instances, for
example:

* Compute API: get_count function

* Instance layer (InstanceList class): get_count_by_filters function

* DB layer: instance_count_by_filters function

* Sqlalchemy layer: instance_count_by_filters function

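The layering listed above could look roughly like the following. The function
and class names are the ones proposed in this spec; the signatures, the
``context`` argument, the ``API`` class name and the in-memory data are
assumptions for illustration and are not taken from the Nova code base.

```python
# In-memory stand-in for the instances table (hypothetical data).
_INSTANCES = [
    {"uuid": "a", "project_id": "p1", "vm_state": "active"},
    {"uuid": "b", "project_id": "p1", "vm_state": "error"},
    {"uuid": "c", "project_id": "p2", "vm_state": "active"},
]


def instance_count_by_filters(context, filters):
    """DB layer: return the number of rows matching every exact filter."""
    return sum(
        1 for row in _INSTANCES
        if all(row.get(key) == value for key, value in filters.items())
    )


class InstanceList(object):
    """Instance layer: thin wrapper that delegates to the DB layer."""

    @classmethod
    def get_count_by_filters(cls, context, filters):
        return instance_count_by_filters(context, filters)


class API(object):
    """Compute API layer: entry point used by the REST controller."""

    def get_count(self, context, search_opts=None):
        return InstanceList.get_count_by_filters(context, search_opts or {})
```

Every layer returns a plain integer, so nothing has to be serialized beyond
the final ``{"count": <int>}`` response.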
In the sqlalchemy DB layer, the filter processing (exact name filters, regex
filters, and tag filters) needs to be moved into a common function so that
both the new count API extension and the existing get servers APIs can
utilize it. Once the query object is created, the count() function is invoked
to retrieve the total number of matching servers for the given query.

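The idea of one shared filter function feeding both the list query and the
count query can be sketched as follows. To stay self-contained, this example
uses the standard library ``sqlite3`` module instead of SQLAlchemy and handles
exact-match filters only (regex and tag filters are left out); the table
layout, column whitelist and helper names are assumptions.

```python
import sqlite3


def build_where(filters):
    """Turn exact-match filters into a WHERE clause and bind parameters.

    Column names come from a fixed whitelist so that they can be
    interpolated into the SQL text safely; values are always bound.
    """
    allowed = ("project_id", "vm_state", "host")
    clauses, params = [], []
    for column in allowed:
        if column in filters:
            clauses.append("%s = ?" % column)
            params.append(filters[column])
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return where, params


def instance_get_all_by_filters(conn, filters):
    """Existing list path: returns the matching rows."""
    where, params = build_where(filters)
    sql = "SELECT uuid FROM instances" + where + " ORDER BY uuid"
    return [row[0] for row in conn.execute(sql, params)]


def instance_count_by_filters(conn, filters):
    """New count path: same filters, but a single COUNT(*) query."""
    where, params = build_where(filters)
    sql = "SELECT COUNT(*) FROM instances" + where
    return conn.execute(sql, params).fetchone()[0]


def make_demo_db():
    """Create an in-memory table with a few hypothetical instances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances "
        "(uuid TEXT, project_id TEXT, vm_state TEXT, host TEXT)")
    conn.executemany(
        "INSERT INTO instances VALUES (?, ?, ?, ?)",
        [("a", "p1", "active", "h1"),
         ("b", "p1", "error", "h1"),
         ("c", "p2", "active", "h2")])
    return conn
```

Because both paths call ``build_where``, a filter added for listing is
automatically honoured by the count, which is the consistency this spec asks
for.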
For the v2 API extension, the existing filtering pre-processing done in
nova.api.openstack.compute.servers.Controller._get_servers needs to be moved
into a static utility method so that the new count API extension can utilize
it; this is critical so that the filtering support for the count API matches
the filtering support for the /servers API.

For the v3 API, a new count function (similar to 'index' and 'detail') needs
to be added to nova.api.openstack.compute.plugins.v3.servers directly. Common
filter processing needs to be broken out into utility functions (same idea as
the v2 API). For v3, the 'count' GET API can be registered with the Servers
extensions.V3APIExtensionBase directly.

Alternatives
------------

Other APIs exist that return count data (quotas and limits) but they do not
accept filter values.

A user could accomplish the same result (subject to the timing window noted in
Use Case 3) by using the existing non-detailed /servers API with a filter and
then counting the results. However, the primary use case for this blueprint is
getting summary count data at scale. For example, if the total cloud has 5k
VMs then doing paginated queries to iterate over the non-detailed '/servers'
API with a filter and limit/marker is really inefficient -- the API is going
to return more data than the user cares about (and do a lot of processing to
get it). Assume that there are 2,500 instances in an active state; if the
non-detailed query (and the default limit of 1k) is used then the application
would have to make 3 separate REST API calls to get all of the VMs and, at
the DB layer, the marker processing would be used to find the correct page of
data to return. Since the user only cares about a summary count, the most
efficient mechanism to retrieve that data would be a single DB query using
the count() function.

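The arithmetic in the 2,500-instance example can be checked with a short
simulation of a client that counts by paging. The ``list_server_ids``
function is a stand-in for the non-detailed /servers call with limit/marker,
not a real client API, and the server ids are made up.

```python
def list_server_ids(all_ids, limit, marker=None):
    """Stand-in for GET /servers?limit=...&marker=... (non-detailed)."""
    start = 0 if marker is None else all_ids.index(marker) + 1
    return all_ids[start:start + limit]


def count_via_pagination(all_ids, limit=1000):
    """Return (total, number_of_calls) needed to count by paging."""
    total = calls = 0
    marker = None
    while True:
        page = list_server_ids(all_ids, limit, marker)
        calls += 1
        total += len(page)
        if len(page) < limit:  # a short page means there is no more data
            return total, calls
        marker = page[-1]


active_ids = ["server-%04d" % n for n in range(2500)]
result = count_via_pagination(active_ids)  # (2500, 3)
```

Three round trips transfer 2,500 records just to produce one integer, whereas
the proposed API needs one call and one count() query.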
Note that the maximum page size is set on the server (default of 1k);
therefore, a user MUST HANDLE pagination since the number of items being
queried may be greater than the default.

There are other options for how the v2 and v3 APIs can be registered. For v2,
the new count API could be registered by modifying the API routing in
nova.api.openstack.compute.__init__.APIRouter directly (to create the
/servers/count API just like /servers/detail). Since v3 is still experimental,
this blueprint is proposing that the count API is baked into
nova.api.openstack.compute.plugins.v3.servers directly.

I cannot think of alternative implementations. The new API needs to utilize
the same filter processing as the current /servers APIs in order to ensure
consistency and prevent dual maintenance.

Data model impact
-----------------

None

REST API impact
---------------

The response for the existing /servers and /servers/detail REST APIs will
not be affected.

* New v2 API extension:

  * Name: ServerCounts
  * Alias: os-server-counts

* NEW v2 URL: v2/{tenant_id}/servers/count

* NEW v3 URL: v3/servers/count

* Description: Get number of servers

* Method type: GET

* Normal Response Codes (same as the 'v2/{tenant_id}/servers/detail' API):

  * 200
  * 203

* Error Response Codes (same as the 'v2/{tenant_id}/servers/detail' API):

  * computeFault (400, 500, ...)
  * serviceUnavailable (503)
  * badRequest (400)
  * unauthorized (401)
  * forbidden (403)
  * badMethod (405)

* Parameters (same as the 'v2/{tenant_id}/servers' API except the 'limit' and
  'marker' parameters):

  +---------------+-------+--------------+--------------------------------------+
  | Parameter     | Style | Type         | Description                          |
  +===============+=======+==============+======================================+
  | all_tenants   | query | xsd:boolean  | Display server count information     |
  | (optional)    |       |              | from all tenants (Admin only).       |
  +---------------+-------+--------------+--------------------------------------+
  | changes-since | query | xsd:dateTime | A time/date stamp for when the       |
  | (optional)    |       |              | server last changed status.          |
  +---------------+-------+--------------+--------------------------------------+
  | image         | query | xsd:anyURI   | Name of the image in URL format.     |
  | (optional)    |       |              |                                      |
  +---------------+-------+--------------+--------------------------------------+
  | flavor        | query | xsd:anyURI   | Name of the flavor in URL format.    |
  | (optional)    |       |              |                                      |
  +---------------+-------+--------------+--------------------------------------+
  | name          | query | xsd:string   | Name of the server as a string.      |
  | (optional)    |       |              |                                      |
  +---------------+-------+--------------+--------------------------------------+
  | status        | query | csapi:Server | Value of the status of the server so |
  | (optional)    |       | Status       | that you can filter on "ACTIVE" for  |
  |               |       |              | example.                             |
  +---------------+-------+--------------+--------------------------------------+

* JSON schema definition for the body data: N/A

* JSON schema definition for the response data: {"count": <int>}

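A client consuming the response could validate its shape along these lines.
This is a sketch only: the ``{"count": <int>}`` shape comes from this spec,
while the helper name and the strictness of the checks (exactly one key,
non-negative value) are assumptions.

```python
import json


def parse_count_response(body):
    """Return the integer from a ``{"count": <int>}`` JSON response body."""
    data = json.loads(body)
    if not isinstance(data, dict) or set(data) != {"count"}:
        raise ValueError("expected a JSON object with a single 'count' key")
    count = data["count"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(count, bool) or not isinstance(count, int) or count < 0:
        raise ValueError("'count' must be a non-negative integer")
    return count
```

For instance, a request for ``v2/{tenant_id}/servers/count?status=ERROR``
that returns the body ``{"count": 42}`` (an illustrative value) would parse
to the integer 42.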
Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

None -- This new API is not introducing any new DB joins that would affect
performance.

Other deployer impact
---------------------

None

Developer impact
----------------

None


Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Steven Kaufer

Other contributors:
  <launchpad-id or None>

Work Items
----------

* Move filter processing code into utility functions at the API layer and at
  the DB sqlalchemy layer.
* Create new API functions in the various layers to get the count data.
* v2 API extension and v3 API updates to expose the new count API function.


Dependencies
============

Related (but independent) change being proposed in cinder:
https://blueprints.launchpad.net/cinder/+spec/volume-count-api


Testing
=======

Both unit and Tempest tests need to be created to ensure that the count data
is accurate for various filters.

Testing should be done against multiple backend database types.


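One possible shape for such a unit test is to assert that, for every filter,
the count equals the length of the corresponding list result. The sketch
below uses in-memory stand-ins for both paths; the data, the function names
and the test class are hypothetical and do not reflect the Nova test suite.

```python
import unittest

SERVERS = [
    {"name": "web-1", "status": "ACTIVE"},
    {"name": "web-2", "status": "ERROR"},
    {"name": "db-1", "status": "ACTIVE"},
]


def list_servers(filters):
    """Stand-in for the existing list path (exact-match filters only)."""
    return [s for s in SERVERS
            if all(s.get(k) == v for k, v in filters.items())]


def count_servers(filters):
    """Stand-in for the new count path; must agree with list_servers()."""
    return sum(1 for s in SERVERS
               if all(s.get(k) == v for k, v in filters.items()))


class ServerCountTest(unittest.TestCase):

    def test_count_matches_list_for_each_filter(self):
        for filters in ({}, {"status": "ACTIVE"}, {"status": "ERROR"},
                        {"name": "db-1"}, {"status": "BUILD"}):
            self.assertEqual(len(list_servers(filters)),
                             count_servers(filters))

    def test_known_counts(self):
        self.assertEqual(3, count_servers({}))
        self.assertEqual(2, count_servers({"status": "ACTIVE"}))
        self.assertEqual(0, count_servers({"status": "BUILD"}))
```

The same comparison could be repeated against each supported database
backend, since count() behaviour is delegated to the database.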
Documentation Impact
====================

Document the new v2 API extension and v3 API updates (see "REST API impact"
section for details).


References
==========

None