..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Simple tenant usage pagination
==========================================

https://blueprints.launchpad.net/nova/+spec/paginate-simple-tenant-usage

The blueprint aims to add optional `limit` and `marker` parameters
to the GET /os-simple-tenant-usage endpoints.

::

    GET /os-simple-tenant-usage?limit={limit}&marker={instance_uuid}
    GET /os-simple-tenant-usage/{tenant_id}?limit={limit}&marker={instance_uuid}

Problem description
===================

The simple tenant usage API can return extremely large amounts of data and
provides no way to paginate the results. Because the API does not use the
pagination code, it does not even respect the "max results" sanity limit.
Large queries also cause the API workers to grow their memory footprint to
the size of the DB result set. Since Horizon queries this API by default,
most users are affected unless their ops team is extremely diligent about
purging deleted instances (which are returned by the API by design).

Use Cases
---------

Horizon uses these endpoints to display server usage.

Proposed change
===============

Add an API microversion that allows for pagination of the simple tenant usage
results using Nova's existing approach to pagination (optional `limit` and
`marker` query parameters).

Pagination would be made available for both the "all tenants" (`index`) and
"specific tenant" (`show`) cases.

::

    List Tenant Usage For All Tenants
    /os-simple-tenant-usage?limit={limit}&marker={instance_uuid}

    Show Usage Details For Tenant
    /os-simple-tenant-usage/{tenant_id}?limit={limit}&marker={instance_uuid}

Currently, the simple tenant usage endpoints include aggregate data (like
`total_hours`) which is the sum of the `hours` for each instance in a
specific time window, grouped by tenant.

.. note:: For clarity, I've removed all other usage response fields from the
          examples.


::

    GET /os-simple-tenant-usage?detailed=1

    {
        "tenant_usages": [
            {
                "server_usages": [
                    {
                        "instance_id": "instance-uuid-1",
                        "tenant_id": "tenant-uuid-1",
                        "hours": 1
                    },
                    {
                        "instance_id": "instance-uuid-2",
                        "tenant_id": "tenant-uuid-1",
                        "hours": 1
                    },
                    {
                        "instance_id": "instance-uuid-3",
                        "tenant_id": "tenant-uuid-1",
                        "hours": 1
                    }
                ],
                "tenant_id": "tenant-uuid-1",
                "total_hours": 3
            },
            {
                "server_usages": [
                    {
                        "instance_id": "instance-uuid-4",
                        "tenant_id": "tenant-uuid-2",
                        "hours": 1
                    }
                ],
                "tenant_id": "tenant-uuid-2",
                "total_hours": 1
            }
        ]
    }

Once paging is introduced, API consumers would need to stitch together the
aggregate results if they still want totals for all instances in a specific
time window, grouped by tenant.

For example, that same data would be returned as follows if the `limit` query
parameter were set to 2. Note that the totals on the first page of results
only reflect 2 of the 3 instances for tenant-uuid-1, and that the
tenant-uuid-1 totals on the second page of results only reflect the remaining
instance for tenant-uuid-1. API consumers would need to manually add these
totals back up if they want the totals to reflect all 3 instances for
tenant-uuid-1.

::

    /os-simple-tenant-usage?detailed=1&limit=2

    {
        "tenant_usages": [
            {
                "server_usages": [
                    {
                        "instance_id": "instance-uuid-1",
                        "tenant_id": "tenant-uuid-1",
                        "hours": 1
                    },
                    {
                        "instance_id": "instance-uuid-2",
                        "tenant_id": "tenant-uuid-1",
                        "hours": 1
                    }
                ],
                "tenant_id": "tenant-uuid-1",
                "total_hours": 2
            }
        ],
        "tenant_usages_links": [
            {
                "href": "/os-simple-tenant-usage?detailed=1&limit=2&marker=instance-uuid-2",
                "rel": "next"
            }
        ]
    }

::

    /os-simple-tenant-usage?detailed=1&limit=2&marker=instance-uuid-2

    {
        "tenant_usages": [
            {
                "server_usages": [
                    {
                        "instance_id": "instance-uuid-3",
                        "tenant_id": "tenant-uuid-1",
                        "hours": 1
                    }
                ],
                "tenant_id": "tenant-uuid-1",
                "total_hours": 1
            },
            {
                "server_usages": [
                    {
                        "instance_id": "instance-uuid-4",
                        "tenant_id": "tenant-uuid-2",
                        "hours": 1
                    }
                ],
                "tenant_id": "tenant-uuid-2",
                "total_hours": 1
            }
        ]
    }

Paging is done on the inner `server_usages` list. The `marker` is the last
instance UUID in the `server_usages` list from the previous page.
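
The client-side stitching described above can be sketched as follows. This is
a minimal illustration, not part of the proposed API: the helper name
`stitch_total_hours` is hypothetical, and the two pages are the trimmed
example responses shown earlier (only `tenant_id` and `total_hours` are kept).

```python
from collections import defaultdict


def stitch_total_hours(pages):
    """Sum `total_hours` per tenant across paged `index` responses."""
    totals = defaultdict(float)
    for page in pages:
        for usage in page["tenant_usages"]:
            # A tenant can appear on more than one page, so accumulate.
            totals[usage["tenant_id"]] += usage["total_hours"]
    return dict(totals)


# Trimmed copies of the two example pages (limit=2).
page_1 = {
    "tenant_usages": [
        {"tenant_id": "tenant-uuid-1", "total_hours": 2},
    ],
}
page_2 = {
    "tenant_usages": [
        {"tenant_id": "tenant-uuid-1", "total_hours": 1},
        {"tenant_id": "tenant-uuid-2", "total_hours": 1},
    ],
}

totals = stitch_total_hours([page_1, page_2])
print(totals)
```

The same accumulation would apply to any other aggregate field a consumer
wants to total across pages.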

The simple tenant usage endpoints will also include the conventional "next"
links: `tenant_usages_links` in the case of `index` and `tenant_usage_links`
in the `show` case.

::

    /os-simple-tenant-usage?detailed=1&limit={limit}

    {
        "tenant_usages": [
            {
                "server_usages": [
                   ...
                ],
                "tenant_id": "{tenant_id}",
            }
        ],
        "tenant_usages_links": [
            {
                "href": "/os-simple-tenant-usage?detailed=1&limit={limit}&marker={marker}",
                "rel": "next"
            }
        ]
    }

::

    /os-simple-tenant-usage/{tenant_id}?detailed=1&limit={limit}

    {
        "tenant_usage": {
            "server_usages": [
               ...
            ]
        },
        "tenant_usage_links": [
            {
                "href": "/os-simple-tenant-usage/{tenant_id}?detailed=1&limit={limit}&marker={marker}",
                "rel": "next"
            }
        ]
    }


.. note:: For clarity, I omitted the additional query parameters (like start
          & end) from the next links, but they need to be preserved. An actual
          next link would look more like this.


::

    "tenant_usages_links": [
        {
            "href": "http://openstack.example.com/v2.1/6f70656e737461636b20342065766572/os-simple-tenant-usage?detailed=1&end=2016-10-12+18%3A22%3A04.868106&limit=1&marker=1f1deceb-17b5-4c04-84c7-e0d4499c8fe0&start=2016-10-12+18%3A22%3A04.868106",
            "rel": "next"
        }
    ]
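
One way to preserve the caller's query parameters when building a next link
is sketched below. This is only an illustration using the standard library;
the function name `build_next_link` and its arguments are hypothetical, and
the actual implementation may construct links differently.

```python
from urllib.parse import urlencode


def build_next_link(base_url, params, marker):
    """Return a "next" link that keeps the caller's query parameters."""
    next_params = dict(params)
    # Only the marker changes between pages; everything else is carried over.
    next_params["marker"] = marker
    # Sorting just keeps the output stable; parameter order is not significant.
    query = urlencode(sorted(next_params.items()))
    return {"href": base_url + "?" + query, "rel": "next"}


link = build_next_link(
    "/os-simple-tenant-usage",
    {
        "detailed": "1",
        "start": "2016-10-12 18:22:04.868106",
        "end": "2016-10-12 18:22:04.868106",
        "limit": "1",
    },
    "1f1deceb-17b5-4c04-84c7-e0d4499c8fe0",
)
print(link["href"])
```

The printed query string carries the same `detailed`, `end`, `limit`,
`marker` and `start` parameters as the example link above.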

Alternatives
------------

None

Data model impact
-----------------

Sorting will need to be added to the query that returns the instances in the
`server_usages` list. The sort order will need to be deterministic across
cell databases, and we may need to modify/add a new database index as a
result.
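
To illustrate why the sort order must be deterministic, here is a small
in-memory sketch of marker-based paging. It is not the database query itself.
The sort key `(created_at, uuid)` is an assumption made for this example (the
spec does not name the columns), and `paginate` is a hypothetical helper.

```python
def paginate(instances, limit, marker=None):
    """Return the page of instances that follows `marker`."""
    # Assumed sort key: the unique uuid breaks ties between instances that
    # share a created_at value, so every request sees the same order.
    ordered = sorted(instances, key=lambda i: (i["created_at"], i["uuid"]))
    start = 0
    if marker is not None:
        uuids = [i["uuid"] for i in ordered]
        if marker not in uuids:
            raise ValueError("marker not found: %s" % marker)
        # Resume immediately after the last instance of the previous page.
        start = uuids.index(marker) + 1
    return ordered[start:start + limit]


instances = [
    {"uuid": "instance-uuid-3", "created_at": 2},
    {"uuid": "instance-uuid-1", "created_at": 1},
    {"uuid": "instance-uuid-2", "created_at": 1},
    {"uuid": "instance-uuid-4", "created_at": 3},
]

page_1 = paginate(instances, limit=2)
page_2 = paginate(instances, limit=2, marker=page_1[-1]["uuid"])
print([i["uuid"] for i in page_1])
print([i["uuid"] for i in page_2])
```

Without the tie-breaker, the two instances that share a `created_at` value
could swap places between requests, and a marker could then skip or repeat an
instance.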


REST API impact
---------------

Add an API microversion that allows for pagination of the simple tenant usage
results using optional `limit` and `marker` query parameters. If `limit`
isn't provided, it will default to `CONF.osapi_max_limit` which is currently
1000.

::

    GET /os-simple-tenant-usage?limit={limit}&marker={instance_uuid}
    GET /os-simple-tenant-usage/{tenant_id}?limit={limit}&marker={instance_uuid}

Older versions of the `os-simple-tenant-usage` endpoints will not accept these
new paging query parameters, but they will start to silently limit results to
`CONF.osapi_max_limit`. This encourages adoption of the new microversion and
closes off the existing possibility of DoS-like usage requests on systems with
thousands of instances.
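
The limit handling described above might look roughly like the sketch below.
`OSAPI_MAX_LIMIT` is a stand-in for `CONF.osapi_max_limit`, the function name
is hypothetical, and clamping an oversized `limit` to the maximum is an
assumption based on how Nova's existing pagination behaves.

```python
OSAPI_MAX_LIMIT = 1000  # stand-in for CONF.osapi_max_limit


def effective_limit(requested=None, max_limit=OSAPI_MAX_LIMIT):
    """Return the number of instances a single response may contain."""
    if requested is None:
        # No limit supplied (or an older microversion): use the maximum.
        return max_limit
    if requested < 0:
        raise ValueError("limit must be non-negative")
    # Assumption: a limit above the maximum is clamped, not rejected.
    return min(requested, max_limit)


print(effective_limit())
print(effective_limit(50))
print(effective_limit(5000))
```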

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

python-novaclient should also be changed to accept `limit` and `marker`
options for simple tenant usage.

Performance Impact
------------------

Horizon consumes these API endpoints, which are currently slow and have a
large memory profile when there are many instances. Paginating the results
bounds the size of each response to at most `limit` instances.

Other deployer impact
---------------------

None

Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  diana_clarke

Other contributors:
  None

Work Items
----------

- Create a new API microversion for simple tenant usage pagination.
- Update python-novaclient to be able to take advantage of these changes.
- Communicate these changes to the Horizon team.


Dependencies
============

None

Testing
=======

Needs functional and unit tests.

Documentation Impact
====================

Update the "Usage reports" section of the compute api-ref to mention the new
microversion and optional `limit` and `marker` query parameters.

References
==========

Bug that describes the problem:

[1] https://bugs.launchpad.net/nova/+bug/1421471

Proof of concept (nova & python-novaclient):

[2] https://review.openstack.org/#/c/386093/

[3] https://review.openstack.org/#/c/394653/

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - Ocata
     - Introduced