From 13b06ebb1cea239f96a3426eb30f1ad42a0ff7ed Mon Sep 17 00:00:00 2001 From: Stephen Finucane Date: Fri, 9 Jul 2021 17:58:52 +0100 Subject: [PATCH] docs: Add a new cells v2 document We currently have three cells v2 documents in-tree: - A 'user/cellsv2-layout' document that details the structure or architecture of a cells v2 deployment (which is to say, any modern nova deployment) - A 'user/cells' document, which is written from a pre-cells v2 viewpoint and details the changes that cells v2 *will* require and the benefits it *would* bring. It also includes steps for upgrading from pre-cells v2 (that is, pre-Pike) deployment or a deployment with cells v1 (which we removed in Train and probably broke long before) - An 'admin/cells' document, which doesn't contain much other than some advice for handling down cells Clearly there's a lot of cruft to be cleared out as well as some centralization of information that's possible. As such, we combine all of these documents into one document, 'admin/cells'. This is chosen over 'users/cells' since cells are not an end-user-facing feature. References to cells v1 and details on upgrading from pre-cells v2 deployments are mostly dropped, as are some duplicated installation/configuration steps. Formatting is fixed and Sphinx-isms used to cross reference config option where possible. Finally, redirects are added so that people can continue to find the relevant resources. The result is (hopefully) a one stop shop for all things cells v2-related that operators can use to configure and understand their deployments. Change-Id: If39db50fd8b109a5a13dec70f8030f3663555065 Signed-off-by: Stephen Finucane --- doc/source/_extra/.htaccess | 8 +- doc/source/admin/cells.rst | 1094 ++++++++++++++++- .../admin/configuration/cross-cell-resize.rst | 61 +- doc/source/admin/configuring-migrations.rst | 3 +- .../affinity-policy-violated.rst | 9 +- doc/source/admin/upgrades.rst | 4 +- doc/source/index.rst | 9 +- doc/source/reference/index.rst | 4 +- doc/source/user/architecture.rst | 2 +- doc/source/user/cells.rst | 759 ------------ doc/source/user/cellsv2-layout.rst | 412 ------- doc/source/user/index.rst | 6 +- doc/test/redirect-tests.txt | 8 +- 13 files changed, 1146 insertions(+), 1233 deletions(-) delete mode 100644 doc/source/user/cells.rst delete mode 100644 doc/source/user/cellsv2-layout.rst diff --git a/doc/source/_extra/.htaccess b/doc/source/_extra/.htaccess index 5d361fc91c9c..79048f1ee869 100644 --- a/doc/source/_extra/.htaccess +++ b/doc/source/_extra/.htaccess @@ -9,12 +9,12 @@ redirectmatch 301 ^/nova/([^/]+)/api_plugins.html$ /nova/$1/contributor/api.html redirectmatch 301 ^/nova/([^/]+)/architecture.html$ /nova/$1/user/architecture.html redirectmatch 301 ^/nova/([^/]+)/block_device_mapping.html$ /nova/$1/user/block-device-mapping.html redirectmatch 301 ^/nova/([^/]+)/blueprints.html$ /nova/$1/contributor/blueprints.html -redirectmatch 301 ^/nova/([^/]+)/cells.html$ /nova/$1/user/cells.html +redirectmatch 301 ^/nova/([^/]+)/cells.html$ /nova/$1/admin/cells.html redirectmatch 301 ^/nova/([^/]+)/code-review.html$ /nova/$1/contributor/code-review.html redirectmatch 301 ^/nova/([^/]+)/conductor.html$ /nova/$1/user/conductor.html redirectmatch 301 ^/nova/([^/]+)/development.environment.html$ /nova/$1/contributor/development-environment.html redirectmatch 301 ^/nova/([^/]+)/devref/api.html /nova/$1/contributor/api.html -redirectmatch 301 ^/nova/([^/]+)/devref/cells.html /nova/$1/user/cells.html +redirectmatch 301 ^/nova/([^/]+)/devref/cells.html /nova/$1/admin/cells.html redirectmatch 301 ^/nova/([^/]+)/devref/filter_scheduler.html /nova/$1/admin/scheduling.html # catch all, if we hit something in devref assume it moved to # reference unless we have already triggered a hit above. @@ -64,7 +64,9 @@ redirectmatch 301 ^/nova/([^/]+)/testing/zero-downtime-upgrade.html$ /nova/$1/co redirectmatch 301 ^/nova/([^/]+)/threading.html$ /nova/$1/reference/threading.html redirectmatch 301 ^/nova/([^/]+)/upgrade.html$ /nova/$1/admin/upgrades.html redirectmatch 301 ^/nova/([^/]+)/user/aggregates.html$ /nova/$1/admin/aggregates.html -redirectmatch 301 ^/nova/([^/]+)/user/cellsv2_layout.html$ /nova/$1/user/cellsv2-layout.html +redirectmatch 301 ^/nova/([^/]+)/user/cells.html$ /nova/$1/admin/cells.html +redirectmatch 301 ^/nova/([^/]+)/user/cellsv2-layout.html$ /nova/$1/admin/cells.html +redirectmatch 301 ^/nova/([^/]+)/user/cellsv2_layout.html$ /nova/$1/admin/cells.html redirectmatch 301 ^/nova/([^/]+)/user/config-drive.html$ /nova/$1/user/metadata.html redirectmatch 301 ^/nova/([^/]+)/user/filter-scheduler.html$ /nova/$1/admin/scheduling.html redirectmatch 301 ^/nova/([^/]+)/user/metadata-service.html$ /nova/$1/user/metadata.html diff --git a/doc/source/admin/cells.rst b/doc/source/admin/cells.rst index 92f336a4e40f..87bf81a542a0 100644 --- a/doc/source/admin/cells.rst +++ b/doc/source/admin/cells.rst @@ -1,11 +1,929 @@ -================== -CellsV2 Management -================== +========== +Cells (v2) +========== + +.. versionadded:: 16.0.0 (Pike) + +This document describes the layout of a deployment with cells v2, including +deployment considerations for security and scale and recommended practices and +tips for running and maintaining cells v2 for admins and operators. It is +focused on code present in Pike and later, and while it is geared towards +people who want to have multiple cells for whatever reason, the nature of the +cells v2 support in Nova means that it applies in some way to all deployments. + +Before reading any further, there is a nice overview presentation_ that Andrew +Laski gave at the Austin (Newton) summit which may be worth watching. + +.. _presentation: https://www.openstack.org/videos/summits/austin-2016/nova-cells-v2-whats-going-on + +.. note:: + + Cells v2 is different to the cells feature found in earlier versions of + nova, also known as cells v1. Cells v1 was deprecated in 16.0.0 (Pike) and + removed entirely in Train (20.0.0). + + +Overview +-------- + +The purpose of the cells functionality in nova is specifically to +allow larger deployments to shard their many compute nodes into cells. +A basic Nova system consists of the following components: + +- The :program:`nova-api` service which provides the external REST API to + users. + +- The :program:`nova-scheduler` and ``placement`` services which are + responsible for tracking resources and deciding which compute node instances + should be on. + +- An "API database" that is used primarily by :program:`nova-api` and + :program:`nova-scheduler` (called *API-level services* below) to track + location information about instances, as well as a temporary location for + instances being built but not yet scheduled. + +- The :program:`nova-conductor` service which offloads long-running tasks for + the API-level service and insulates compute nodes from direct database access + +- The :program:`nova-compute` service which manages the virt driver and + hypervisor host. + +- A "cell database" which is used by API, conductor and compute + services, and which houses the majority of the information about + instances. + +- A "cell0 database" which is just like the cell database, but + contains only instances that failed to be scheduled. This database mimics a + regular cell, but has no compute nodes and is used only as a place to put + instances that fail to land on a real compute node (and thus a real cell). + +- A message queue which allows the services to communicate with each + other via RPC. + +All deployments have at least the above components. Smaller deployments +likely have a single message queue that all services share and a +single database server which hosts the API database, a single cell +database, as well as the required cell0 database. This is considered a +"single-cell deployment" because it only has one "real" cell. +However, while there will only ever be one global API database, a larger +deployments can have many cell databases (where the bulk of the instance +information lives), each with a portion of the instances for the entire +deployment within, as well as per-cell message queues. + +In these larger deployments, each of the nova services will use a cell-specific +configuration file, all of which will at a minimum specify a message queue +endpoint (i.e. :oslo.config:option:`transport_url`). Most of the services will +also contain database connection configuration information (i.e. +:oslo.config:option:`database.connection`), while API-level services that need +access to the global routing and placement information will also be configured +to reach the API database (i.e. :oslo.config:option:`api_database.connection`). + +.. note:: + + The pair of :oslo.config:option:`transport_url` and + :oslo.config:option:`database.connection` configured for a service defines + what cell a service lives in. + +API-level services need to be able to contact other services in all of +the cells. Since they only have one configured +:oslo.config:option:`transport_url` and +:oslo.config:option:`database.connection`, they look up the information for the +other cells in the API database, with records called *cell mappings*. + +.. note:: + + The API database must have cell mapping records that match + the :oslo.config:option:`transport_url` and + :oslo.config:option:`database.connection` configuration options of the + lower-level services. See the ``nova-manage`` :ref:`man-page-cells-v2` + commands for more information about how to create and examine these records. + + +Service layout +-------------- + +The services generally have a well-defined communication pattern that +dictates their layout in a deployment. In a small/simple scenario, the +rules do not have much of an impact as all the services can +communicate with each other on a single message bus and in a single +cell database. However, as the deployment grows, scaling and security +concerns may drive separation and isolation of the services. + +Single cell +~~~~~~~~~~~ + +This is a diagram of the basic services that a simple (single-cell) deployment +would have, as well as the relationships (i.e. communication paths) between +them: + +.. graphviz:: + + digraph services { + graph [pad="0.35", ranksep="0.65", nodesep="0.55", concentrate=true]; + node [fontsize=10 fontname="Monospace"]; + edge [arrowhead="normal", arrowsize="0.8"]; + labelloc=bottom; + labeljust=left; + + { rank=same + api [label="nova-api"] + apidb [label="API Database" shape="box"] + scheduler [label="nova-scheduler"] + } + { rank=same + mq [label="MQ" shape="diamond"] + conductor [label="nova-conductor"] + } + { rank=same + cell0db [label="Cell0 Database" shape="box"] + celldb [label="Cell Database" shape="box"] + compute [label="nova-compute"] + } + + api -> mq -> compute + conductor -> mq -> scheduler + + api -> apidb + api -> cell0db + api -> celldb + + conductor -> apidb + conductor -> cell0db + conductor -> celldb + } + +All of the services are configured to talk to each other over the same +message bus, and there is only one cell database where live instance +data resides. The cell0 database is present (and required) but as no +compute nodes are connected to it, this is still a "single cell" +deployment. + +Multiple cells +~~~~~~~~~~~~~~ + +In order to shard the services into multiple cells, a number of things +must happen. First, the message bus must be split into pieces along +the same lines as the cell database. Second, a dedicated conductor +must be run for the API-level services, with access to the API +database and a dedicated message queue. We call this *super conductor* +to distinguish its place and purpose from the per-cell conductor nodes. + +.. graphviz:: + + digraph services2 { + graph [pad="0.35", ranksep="0.65", nodesep="0.55", concentrate=true]; + node [fontsize=10 fontname="Monospace"]; + edge [arrowhead="normal", arrowsize="0.8"]; + labelloc=bottom; + labeljust=left; + + subgraph api { + api [label="nova-api"] + scheduler [label="nova-scheduler"] + conductor [label="super conductor"] + { rank=same + apimq [label="API MQ" shape="diamond"] + apidb [label="API Database" shape="box"] + } + + api -> apimq -> conductor + api -> apidb + conductor -> apimq -> scheduler + conductor -> apidb + } + + subgraph clustercell0 { + label="Cell 0" + color=green + cell0db [label="Cell Database" shape="box"] + } + + subgraph clustercell1 { + label="Cell 1" + color=blue + mq1 [label="Cell MQ" shape="diamond"] + cell1db [label="Cell Database" shape="box"] + conductor1 [label="nova-conductor"] + compute1 [label="nova-compute"] + + conductor1 -> mq1 -> compute1 + conductor1 -> cell1db + + } + + subgraph clustercell2 { + label="Cell 2" + color=red + mq2 [label="Cell MQ" shape="diamond"] + cell2db [label="Cell Database" shape="box"] + conductor2 [label="nova-conductor"] + compute2 [label="nova-compute"] + + conductor2 -> mq2 -> compute2 + conductor2 -> cell2db + } + + api -> mq1 -> conductor1 + api -> mq2 -> conductor2 + api -> cell0db + api -> cell1db + api -> cell2db + + conductor -> cell0db + conductor -> cell1db + conductor -> mq1 + conductor -> cell2db + conductor -> mq2 + } + +It is important to note that services in the lower cell boxes only +have the ability to call back to the placement API but cannot access +any other API-layer services via RPC, nor do they have access to the +API database for global visibility of resources across the cloud. +This is intentional and provides security and failure domain +isolation benefits, but also has impacts on some things that would +otherwise require this any-to-any communication style. Check the +release notes for the version of Nova you are using for the most +up-to-date information about any caveats that may be present due to +this limitation. + + +Database layout +--------------- + +As mentioned previously, there is a split between global data and data that is +local to a cell. + +The following is a breakdown of what data can uncontroversially considered +global versus local to a cell. Missing data will be filled in as consensus is +reached on the data that is more difficult to cleanly place. The missing data +is mostly concerned with scheduling and networking. + +.. note:: + + This list of tables is accurate as of the 15.0.0 (Pike) release. It's + possible that schema changes may have added additional tables since. + +Global (API-level) tables +~~~~~~~~~~~~~~~~~~~~~~~~~ + +- ``instance_types`` +- ``instance_type_projects`` +- ``instance_type_extra_specs`` +- ``quotas`` +- ``project_user_quotas`` +- ``quota_classes`` +- ``quota_usages`` +- ``security_groups`` +- ``security_group_rules`` +- ``security_group_default_rules`` +- ``provider_fw_rules`` +- ``key_pairs`` +- ``migrations`` +- ``networks`` +- ``tags`` + +Cell-level tables +~~~~~~~~~~~~~~~~~ + +- ``instances`` +- ``instance_info_caches`` +- ``instance_extra`` +- ``instance_metadata`` +- ``instance_system_metadata`` +- ``instance_faults`` +- ``instance_actions`` +- ``instance_actions_events`` +- ``instance_id_mappings`` +- ``pci_devices`` +- ``block_device_mapping`` +- ``virtual_interfaces`` + + +Usage +----- + +As noted previously, all deployments are in effect now cells v2 deployments. As +a result, setup of a any nova deployment - even those that intend to only have +once cell - will involve some level of cells configuration. These changes are +configuration-related, both in the main nova configuration file as well as some +extra records in the databases. + +All nova deployments must now have the following databases available +and configured: + +1. The "API" database +2. One special "cell" database called "cell0" +3. One (or eventually more) "cell" databases + +Thus, a small nova deployment will have an API database, a cell0, and +what we will call here a "cell1" database. High-level tracking +information is kept in the API database. Instances that are never +scheduled are relegated to the cell0 database, which is effectively a +graveyard of instances that failed to start. All successful/running +instances are stored in "cell1". + +.. note:: + + Since Nova services make use of both configuration file and some + databases records, starting or restarting those services with an + incomplete configuration could lead to an incorrect deployment. + Only restart the services once you are done with the described + steps below. + +.. note:: + + The following examples show the full expanded command line usage of + the setup commands. This is to make it easier to visualize which of + the various URLs are used by each of the commands. However, you should be + able to put all of that in the config file and :program:`nova-manage` will + use those values. If need be, you can create separate config files and pass + them as ``nova-manage --config-file foo.conf`` to control the behavior + without specifying things on the command lines. + +Configuring a new deployment +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you are installing Nova for the first time and have no compute hosts in the +database yet then it will be necessary to configure cell0 and at least once +additional "real" cell. To begin, ensure your API database has been created +using the :program:`nova-manage api_db sync` command. Ensure the connection +information for this database is stored in the ``nova.conf`` file using the +:oslo.config:option:`api_database.connection` config option: + +.. code-block:: ini + + [api_database] + connection = mysql+pymysql://root:secretmysql@dbserver/nova_api?charset=utf8 + +Since there may be multiple "cell" databases (and in fact everyone +will have cell0 and cell1 at a minimum), connection info for these is +stored in the API database. Thus, the API database must exist and must provide +information on how to connect to it before continuing to the steps below, so +that :program:`nova-manage` can find your other databases. + +Next, we will create the necessary records for the cell0 database. To +do that we will first use :program:`nova-manage cell_v2 map_cell0` to create +and map cell0. For example: + +.. code-block:: bash + + $ nova-manage cell_v2 map_cell0 \ + --database_connection mysql+pymysql://root:secretmysql@dbserver/nova_cell0?charset=utf8 + +.. note:: + + If you don't specify ``--database_connection`` then the commands will use + the :oslo.config:option:`database.connection` value from your config file + and mangle the database name to have a ``_cell0`` suffix + +.. warning:: + + If your databases are on separate hosts then you should specify + ``--database_connection`` or make certain that the :file:`nova.conf` + being used has the :oslo.config:option:`database.connection` value pointing + to the same user/password/host that will work for the cell0 database. + If the cell0 mapping was created incorrectly, it can be deleted + using the :program:`nova-manage cell_v2 delete_cell` command before running + :program:`nova-manage cell_v2 map_cell0` again with the proper database + connection value. + +We will then use :program:`nova-manage db sync` to apply the database schema to +this new database. For example: + +.. code-block:: bash + + $ nova-manage db sync \ + --database_connection mysql+pymysql://root:secretmysql@dbserver/nova_cell0?charset=utf8 + +Since no hosts are ever in cell0, nothing further is required for its setup. +Note that all deployments only ever have one cell0, as it is special, so once +you have done this step you never need to do it again, even if you add more +regular cells. + +Now, we must create another cell which will be our first "regular" +cell, which has actual compute hosts in it, and to which instances can +actually be scheduled. First, we create the cell record using +:program:`nova-manage cell_v2 create_cell`. For example: + +.. code-block:: bash + + $ nova-manage cell_v2 create_cell \ + --name cell1 \ + --database_connection mysql+pymysql://root:secretmysql@127.0.0.1/nova?charset=utf8 \ + --transport-url rabbit://stackrabbit:secretrabbit@mqserver:5672/ + +.. note:: + + If you don't specify the database and transport urls then + :program:`nova-manage` will use the :oslo.config:option:`transport_url` and + :oslo.config:option:`database.connection` values from the config file. + +.. note:: + + It is a good idea to specify a name for the new cell you create so you can + easily look up cell UUIDs with the :program:`nova-manage cell_v2 list_cells` + command later if needed. + +.. note:: + + The :program:`nova-manage cell_v2 create_cell` command will print the UUID + of the newly-created cell if ``--verbose`` is passed, which is useful if you + need to run commands like :program:`nova-manage cell_v2 discover_hosts` + targeted at a specific cell. + +At this point, the API database can now find the cell database, and further +commands will attempt to look inside. If this is a completely fresh database +(such as if you're adding a cell, or if this is a new deployment), then you +will need to run :program:`nova-manage db sync` on it to initialize the +schema. + +Now we have a cell, but no hosts are in it which means the scheduler will never +actually place instances there. The next step is to scan the database for +compute node records and add them into the cell we just created. For this step, +you must have had a compute node started such that it registers itself as a +running service. You can identify this using the :program:`openstack compute +service list` command: + +.. code-block:: bash + + $ openstack compute service list --service nova-compute + +Once that has happened, you can scan and add it to the cell using the +:program:`nova-manage cell_v2 discover_hosts` command: + +.. code-block:: bash + + $ nova-manage cell_v2 discover_hosts + +This command will connect to any databases for which you have created cells (as +above), look for hosts that have registered themselves there, and map those +hosts in the API database so that they are visible to the scheduler as +available targets for instances. Any time you add more compute hosts to a cell, +you need to re-run this command to map them from the top-level so they can be +utilized. You can also configure a periodic task to have Nova discover new +hosts automatically by setting the +:oslo.config:option:`scheduler.discover_hosts_in_cells_interval` to a time +interval in seconds. The periodic task is run by the :program:`nova-scheduler` +service, so you must be sure to configure it on all of your +:program:`nova-scheduler` hosts. + +.. note:: + + In the future, whenever you add new compute hosts, you will need to run the + :program:`nova-manage cell_v2 discover_hosts` command after starting them to + map them to the cell if you did not configure automatic host discovery using + :oslo.config:option:`scheduler.discover_hosts_in_cells_interval`. + +Adding a new cell to an existing deployment +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You can add additional cells to your deployment using the same steps used above +to create your first cell. We can create a new cell record using +:program:`nova-manage cell_v2 create_cell`. For example: + +.. code-block:: bash + + $ nova-manage cell_v2 create_cell \ + --name cell2 \ + --database_connection mysql+pymysql://root:secretmysql@127.0.0.1/nova?charset=utf8 \ + --transport-url rabbit://stackrabbit:secretrabbit@mqserver:5672/ + +.. note:: + + If you don't specify the database and transport urls then + :program:`nova-manage` will use the :oslo.config:option:`transport_url` and + :oslo.config:option:`database.connection` values from the config file. + +.. note:: + + It is a good idea to specify a name for the new cell you create so you can + easily look up cell UUIDs with the :program:`nova-manage cell_v2 list_cells` + command later if needed. + +.. note:: + + The :program:`nova-manage cell_v2 create_cell` command will print the UUID + of the newly-created cell if ``--verbose`` is passed, which is useful if you + need to run commands like :program:`nova-manage cell_v2 discover_hosts` + targeted at a specific cell. + +You can repeat this step for each cell you wish to add to your deployment. Your +existing cell database will be re-used - this simply informs the top-level API +database about your existing cell databases. + +Once you've created your new cell, use :program:`nova-manage cell_v2 +discover_hosts` to map compute hosts to cells. This is only necessary if you +haven't enabled automatic discovery using the +:oslo.config:option:`scheduler.discover_hosts_in_cells_interval` option. For +example: + +.. code-block:: bash + + $ nova-manage cell_v2 discover_hosts + +.. note:: + + This command will search for compute hosts in each cell database and map + them to the corresponding cell. This can be slow, particularly for larger + deployments. You may wish to specify the ``--cell_uuid`` option, which will + limit the search to a specific cell. You can use the :program:`nova-manage + cell_v2 list_cells` command to look up cell UUIDs if you are going to + specify ``--cell_uuid``. + +Finally, run the :program:`nova-manage cell_v2 map_instances` command to map +existing instances to the new cell(s). For example: + +.. code-block:: bash + + $ nova-manage cell_v2 map_instances + +.. note:: + + This command will search for instances in each cell database and map them to + the correct cell. This can be slow, particularly for larger deployments. You + may wish to specify the ``--cell_uuid`` option, which will limit the search + to a specific cell. You can use the :program:`nova-manage cell_v2 + list_cells` command to look up cell UUIDs if you are going to specify + ``--cell_uuid``. + +.. note:: + + The ``--max-count`` option can be specified if you would like to limit the + number of instances to map in a single run. If ``--max-count`` is not + specified, all instances will be mapped. Repeated runs of the command will + start from where the last run finished so it is not necessary to increase + ``--max-count`` to finish. An exit code of 0 indicates that all instances + have been mapped. An exit code of 1 indicates that there are remaining + instances that need to be mapped. + + +Template URLs in Cell Mappings +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Starting in the 18.0.0 (Rocky) release, the URLs provided in the cell mappings +for ``--database_connection`` and ``--transport-url`` can contain +variables which are evaluated each time they are loaded from the +database, and the values of which are taken from the corresponding +base options in the host's configuration file. The base URL is parsed +and the following elements may be substituted into the cell mapping +URL (using ``rabbit://bob:s3kret@myhost:123/nova?sync=true#extra``): + +.. list-table:: Cell Mapping URL Variables + :header-rows: 1 + :widths: 15, 50, 15 + + * - Variable + - Meaning + - Part of example URL + * - ``scheme`` + - The part before the `://` + - ``rabbit`` + * - ``username`` + - The username part of the credentials + - ``bob`` + * - ``password`` + - The password part of the credentials + - ``s3kret`` + * - ``hostname`` + - The hostname or address + - ``myhost`` + * - ``port`` + - The port number (must be specified) + - ``123`` + * - ``path`` + - The "path" part of the URL (without leading slash) + - ``nova`` + * - ``query`` + - The full query string arguments (without leading question mark) + - ``sync=true`` + * - ``fragment`` + - Everything after the first hash mark + - ``extra`` + +Variables are provided in curly brackets, like ``{username}``. A simple template +of ``rabbit://{username}:{password}@otherhost/{path}`` will generate a full URL +of ``rabbit://bob:s3kret@otherhost/nova`` when used with the above example. + +.. note:: + + The :oslo.config:option:`database.connection` and + :oslo.config:option:`transport_url` values are not reloaded from the + configuration file during a SIGHUP, which means that a full service restart + will be required to notice changes in a cell mapping record if variables are + changed. + +.. note:: + + The :oslo.config:option:`transport_url` option can contain an + extended syntax for the "netloc" part of the URL + (i.e. ``userA:passwordA@hostA:portA,userB:passwordB:hostB:portB``). In this + case, substitions of the form ``username1``, ``username2``, etc will be + honored and can be used in the template URL. + +The templating of these URLs may be helpful in order to provide each service host +with its own credentials for, say, the database. Without templating, all hosts +will use the same URL (and thus credentials) for accessing services like the +database and message queue. By using a URL with a template that results in the +credentials being taken from the host-local configuration file, each host will +use different values for those connections. + +Assuming you have two service hosts that are normally configured with the cell0 +database as their primary connection, their (abbreviated) configurations would +look like this: + +.. code-block:: ini + + [database] + connection = mysql+pymysql://service1:foo@myapidbhost/nova_cell0 + +and: + +.. code-block:: ini + + [database] + connection = mysql+pymysql://service2:bar@myapidbhost/nova_cell0 + +Without cell mapping template URLs, they would still use the same credentials +(as stored in the mapping) to connect to the cell databases. However, consider +template URLs like the following:: + + mysql+pymysql://{username}:{password}@mycell1dbhost/nova + +and:: + + mysql+pymysql://{username}:{password}@mycell2dbhost/nova + +Using the first service and cell1 mapping, the calculated URL that will actually +be used for connecting to that database will be:: + + mysql+pymysql://service1:foo@mycell1dbhost/nova + + +Design +------ + +Prior to the introduction of cells v2, when a request hit the Nova API for a +particular instance, the instance information was fetched from the database. +The information contained the hostname of the compute node on which the +instance was currently located. If the request needed to take action on the +instance (which it generally would), the hostname was used to calculate the +name of a queue and a message was written there which would eventually find its +way to the proper compute node. + +The meat of the cells v2 feature was to split this hostname lookup into two parts +that yielded three pieces of information instead of one. Basically, instead of +merely looking up the *name* of the compute node on which an instance was +located, we also started obtaining database and queue connection information. +Thus, when asked to take action on instance $foo, we now: + +1. Lookup the three-tuple of (database, queue, hostname) for that instance +2. Connect to that database and fetch the instance record +3. Connect to the queue and send the message to the proper hostname queue + +The above differs from the previous organization in two ways. First, we now +need to do two database lookups before we know where the instance lives. +Second, we need to demand-connect to the appropriate database and queue. Both +of these changes had performance implications, but it was possible to mitigate +them through the use of things like a memcache of instance mapping information +and pooling of connections to database and queue systems. The number of cells +will always be much smaller than the number of instances. + +There were also availability implications with the new feature since something like a +instance list which might query multiple cells could end up with a partial result +if there is a database failure in a cell. These issues can be mitigated, as +discussed in :ref:`handling-cell-failures`. A database failure within a cell +would cause larger issues than a partial list result so the expectation is that +it would be addressed quickly and cells v2 will handle it by indicating in the +response that the data may not be complete. + + +Comparison with cells v1 +------------------------ + +Prior to the introduction of cells v2, nova had a very similar feature, also +called cells or referred to as cells v1 for disambiguation. Cells v2 was an +effort to address many of the perceived shortcomings of the cell v1 feature. +Benefits of the cells v2 feature over the previous cells v1 feature include: + +- Native sharding of the database and queue as a first-class-feature in nova. + All of the code paths will go through the lookup procedure and thus we won't + have the same feature parity issues as we do with current cells. + +- No high-level replication of all the cell databases at the top. The API will + need a database of its own for things like the instance index, but it will + not need to replicate all the data at the top level. + +- It draws a clear line between global and local data elements. Things like + flavors and keypairs are clearly global concepts that need only live at the + top level. Providing this separation allows compute nodes to become even more + stateless and insulated from things like deleted/changed global data. + +- Existing non-cells users will suddenly gain the ability to spawn a new "cell" + from their existing deployment without changing their architecture. Simply + adding information about the new database and queue systems to the new index + will allow them to consume those resources. + +- Existing cells users will need to fill out the cells mapping index, shutdown + their existing cells synchronization service, and ultimately clean up their + top level database. However, since the high-level organization is not + substantially different, they will not have to re-architect their systems to + move to cells v2. + +- Adding new sets of hosts as a new "cell" allows them to be plugged into a + deployment and tested before allowing builds to be scheduled to them. + + +.. _cells-v2-caveats: + +Caveats +------- + +.. note:: + + Many of these caveats have been addressed since the introduction of cells v2 + in the 16.0.0 (Pike) release. These are called out below. + +Cross-cell move operations +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Support for cross-cell cold migration and resize was introduced in the 21.0.0 +(Ussuri) release. This is documented in +:doc:`/admin/configuration/cross-cell-resize`. Prior to this release, it was +not possible to cold migrate or resize an instance from a host in one cell to a +host in another cell. + +It is not currently possible to live migrate, evacuate or unshelve an instance +from a host in one cell to a host in another cell. + +Quota-related quirks +~~~~~~~~~~~~~~~~~~~~ + +Quotas are now calculated live at the point at which an operation +would consume more resource, instead of being kept statically in the +database. This means that a multi-cell environment may incorrectly +calculate the usage of a tenant if one of the cells is unreachable, as +those resources cannot be counted. In this case, the tenant may be +able to consume more resource from one of the available cells, putting +them far over quota when the unreachable cell returns. + +.. note:: + + Starting in the Train (20.0.0) release, it is possible to configure + counting of quota usage from the placement service and API database + to make quota usage calculations resilient to down or poor-performing + cells in a multi-cell environment. See the :doc:`quotas documentation + ` for more details. + +Performance of listing instances +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Prior to the 17.0.0 (Queens) release, the instance list operation may not sort +and paginate results properly when crossing multiple cell boundaries. Further, +the performance of a sorted list operation across multiple cells was +considerably slower than with a single cell. This was resolved as part of the +`efficient-multi-cell-instance-list-and-sort`__ spec. + +.. __: https://blueprints.launchpad.net/nova/+spec/efficient-multi-cell-instance-list-and-sort + +Notifications +~~~~~~~~~~~~~ + +With a multi-cell environment with multiple message queues, it is +likely that operators will want to configure a separate connection to +a unified queue for notifications. This can be done in the configuration file +of all nodes. Refer to the :oslo.messaging-doc:`oslo.messaging configuration +documentation +` for more +details. + +.. _cells-v2-layout-metadata-api: + +Nova Metadata API service +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Starting from the 19.0.0 (Stein) release, the :doc:`nova metadata API service +` can be run either globally or per cell using the +:oslo.config:option:`api.local_metadata_per_cell` configuration option. + +.. rubric:: Global + +If you have networks that span cells, you might need to run Nova metadata API +globally. When running globally, it should be configured as an API-level +service with access to the :oslo.config:option:`api_database.connection` +information. The nova metadata API service **must not** be run as a standalone +service, using the :program:`nova-api-metadata` service, in this case. + +.. rubric:: Local per cell + +Running Nova metadata API per cell can have better performance and data +isolation in a multi-cell deployment. If your networks are segmented along +cell boundaries, then you can run Nova metadata API service per cell. If you +choose to run it per cell, you should also configure each +:neutron-doc:`neutron-metadata-agent +` service to +point to the corresponding :program:`nova-api-metadata`. The nova metadata API +service **must** be run as a standalone service, using the +:program:`nova-api-metadata` service, in this case. + +Console proxies +~~~~~~~~~~~~~~~ + +Starting from the 18.0.0 (Rocky) release, console proxies must be run per cell +because console token authorizations are stored in cell databases. This means +that each console proxy server must have access to the +:oslo.config:option:`database.connection` information for the cell database +containing the instances for which it is proxying console access. This +functionality was added as part of the `convert-consoles-to-objects`__ spec. + +.. __: https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html + +.. _upcall: + +Operations requiring upcalls +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you deploy multiple cells with a superconductor as described above, +computes and cell-based conductors will not have the ability to speak +to the scheduler as they are not connected to the same MQ. This is by +design for isolation, but currently the processes are not in place to +implement some features without such connectivity. Thus, anything that +requires a so-called "upcall" will not function. This impacts the +following: + +#. Instance reschedules during boot and resize (part 1) + + .. note:: + + This has been resolved in the `Queens release`__. + + .. __: https://specs.openstack.org/openstack/nova-specs/specs/queens/approved/return-alternate-hosts.html + +#. Instance affinity reporting from the compute nodes to scheduler +#. The late anti-affinity check during server create and evacuate +#. Querying host aggregates from the cell + + .. note:: + + This has been resolved in the `Rocky release`__. + + .. __: https://blueprints.launchpad.net/nova/+spec/live-migration-in-xapi-pool + +#. Attaching a volume and ``[cinder] cross_az_attach = False`` +#. Instance reschedules during boot and resize (part 2) + + .. note:: This has been resolved in the `Ussuri release`__. + + .. __: https://review.opendev.org/q/topic:bug/1781286 + +The first is simple: if you boot an instance, it gets scheduled to a +compute node, fails, it would normally be re-scheduled to another +node. That requires scheduler intervention and thus it will not work +in Pike with a multi-cell layout. If you do not rely on reschedules +for covering up transient compute-node failures, then this will not +affect you. To ensure you do not make futile attempts at rescheduling, +you should set :oslo.config:option:`scheduler.max_attempts` to ``1`` in +``nova.conf``. + +The second two are related. The summary is that some of the facilities +that Nova has for ensuring that affinity/anti-affinity is preserved +between instances does not function in Pike with a multi-cell +layout. If you don't use affinity operations, then this will not +affect you. To make sure you don't make futile attempts at the +affinity check, you should set +:oslo.config:option:`workarounds.disable_group_policy_check_upcall` to ``True`` +and :oslo.config:option:`filter_scheduler.track_instance_changes` to ``False`` +in ``nova.conf``. + +The fourth was previously only a problem when performing live migrations using +the since-removed XenAPI driver and not specifying ``--block-migrate``. The +driver would attempt to figure out if block migration should be performed based +on source and destination hosts being in the same aggregate. Since aggregates +data had migrated to the API database, the cell conductor would not be able to +access the aggregate information and would fail. + +The fifth is a problem because when a volume is attached to an instance +in the *nova-compute* service, and ``[cinder]/cross_az_attach=False`` in +nova.conf, we attempt to look up the availability zone that the instance is +in which includes getting any host aggregates that the ``instance.host`` is in. +Since the aggregates are in the API database and the cell conductor cannot +access that information, so this will fail. In the future this check could be +moved to the *nova-api* service such that the availability zone between the +instance and the volume is checked before we reach the cell, except in the +case of :term:`boot from volume ` where the *nova-compute* +service itself creates the volume and must tell Cinder in which availability +zone to create the volume. Long-term, volume creation during boot from volume +should be moved to the top-level superconductor which would eliminate this AZ +up-call check problem. + +The sixth is detailed in `bug 1781286`__ and is similar to the first issue. +The issue is that servers created without a specific availability zone +will have their AZ calculated during a reschedule based on the alternate host +selected. Determining the AZ for the alternate host requires an "up call" to +the API DB. + +.. __: https://bugs.launchpad.net/nova/+bug/1781286 -This section describes the various recommended practices/tips for runnning and -maintaining CellsV2 for admins and operators. For more details regarding the -basic concept of CellsV2 and its layout please see the main :doc:`/user/cellsv2-layout` -page. .. _handling-cell-failures: @@ -13,9 +931,11 @@ Handling cell failures ---------------------- For an explanation on how ``nova-api`` handles cell failures please see the -`Handling Down Cells `__ -section of the Compute API guide. Below, you can find some recommended practices and -considerations for effectively tolerating cell failure situations. +`Handling Down Cells`__ section of the Compute API guide. Below, you can find +some recommended practices and considerations for effectively tolerating cell +failure situations. + +.. __: https://docs.openstack.org/api-guide/compute/down_cells.html Configuration considerations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -67,8 +987,8 @@ Known issues information on each of the compute service's version to determine the compute version cap to use. The current workaround is to pin the :oslo.config:option:`upgrade_levels.compute` to a particular version like - "rocky" and get the service up under such situations. See `bug 1815697 - `__ for more details. Also note + "rocky" and get the service up under such situations. See `bug 1815697`__ + for more details. Also note that in general during situations where cells are not reachable certain "slowness" may be experienced in operations requiring hitting all the cells because of the aforementioned configurable timeout/retry values. @@ -90,3 +1010,153 @@ Known issues API database to make quota usage calculations resilient to down or poor-performing cells in a multi-cell environment. See the :doc:`quotas documentation` for more details. + +.. __: https://bugs.launchpad.net/nova/+bug/1815697 + + +FAQs +---- + +- How do I find out which hosts are bound to which cell? + + There are a couple of ways to do this. + + #. Run :program:`nova-manage cell_v2 discover_hosts --verbose`. + + This does not produce a report but if you are trying to determine if a + host is in a cell you can run this and it will report any hosts that are + not yet mapped to a cell and map them. This command is idempotent. + + #. Run :program:`nova-manage cell_v2 list_hosts`. + + This will list hosts in all cells. If you want to list hosts in a + specific cell, you can use the ``--cell_uuid`` option. + +- I updated the ``database_connection`` and/or ``transport_url`` in a cell + using the ``nova-manage cell_v2 update_cell`` command but the API is still + trying to use the old settings. + + The cell mappings are cached in the :program:`nova-api` service worker so you + will need to restart the worker process to rebuild the cache. Note that there + is another global cache tied to request contexts, which is used in the + nova-conductor and nova-scheduler services, so you might need to do the same + if you are having the same issue in those services. As of the 16.0.0 (Pike) + release there is no timer on the cache or hook to refresh the cache using a + SIGHUP to the service. + +- I have upgraded from Newton to Ocata and I can list instances but I get a + HTTP 404 (NotFound) error when I try to get details on a specific instance. + + Instances need to be mapped to cells so the API knows which cell an instance + lives in. When upgrading, the :program:`nova-manage cell_v2 simple_cell_setup` + command will automatically map the instances to the single cell which is + backed by the existing nova database. If you have already upgraded and did + not use the :program:`nova-manage cell_v2 simple_cell_setup` command, you can run the + :program:`nova-manage cell_v2 map_instances` command with the ``--cell_uuid`` + option to map all instances in the given cell. See the + :ref:`man-page-cells-v2` man page for details on command usage. + +- Can I create a cell but have it disabled from scheduling? + + Yes. It is possible to create a pre-disabled cell such that it does not + become a candidate for scheduling new VMs. This can be done by running the + :program:`nova-manage cell_v2 create_cell` command with the ``--disabled`` + option. + +- How can I disable a cell so that the new server create requests do not go to + it while I perform maintenance? + + Existing cells can be disabled by running :program:`nova-manage cell_v2 + update_cell` with the ``--disable`` option and can be re-enabled once the + maintenance period is over by running this command with the ``--enable`` + option. + +- I disabled (or enabled) a cell using the :program:`nova-manage cell_v2 + update_cell` or I created a new (pre-disabled) cell(mapping) using the + :program:`nova-manage cell_v2 create_cell` command but the scheduler is still + using the old settings. + + The cell mappings are cached in the scheduler worker so you will either need + to restart the scheduler process to refresh the cache, or send a SIGHUP + signal to the scheduler by which it will automatically refresh the cells + cache and the changes will take effect. + +- Why was the cells REST API not implemented for cells v2? Why are there no + CRUD operations for cells in the API? + + One of the deployment challenges that cells v1 had was the requirement for + the API and control services to be up before a new cell could be deployed. + This was not a problem for large-scale public clouds that never shut down, + but is not a reasonable requirement for smaller clouds that do offline + upgrades and/or clouds which could be taken completely offline by something + like a power outage. Initial devstack and gate testing for cells v1 was + delayed by the need to engineer a solution for bringing the services + partially online in order to deploy the rest, and this continues to be a gap + for other deployment tools. Consider also the FFU case where the control + plane needs to be down for a multi-release upgrade window where changes to + cell records have to be made. This would be quite a bit harder if the way + those changes are made is via the API, which must remain down during the + process. + + Further, there is a long-term goal to move cell configuration (i.e. + cell_mappings and the associated URLs and credentials) into config and get + away from the need to store and provision those things in the database. + Obviously a CRUD interface in the API would prevent us from making that move. + +- Why are cells not exposed as a grouping mechanism in the API for listing + services, instances, and other resources? + + Early in the design of cells v2 we set a goal to not let the cell concept + leak out of the API, even for operators. Aggregates are the way nova supports + grouping of hosts for a variety of reasons, and aggregates can cut across + cells, and/or be aligned with them if desired. If we were to support cells as + another grouping mechanism, we would likely end up having to implement many + of the same features for them as aggregates, such as scheduler features, + metadata, and other searching/filtering operations. Since aggregates are how + Nova supports grouping, we expect operators to use aggregates any time they + need to refer to a cell as a group of hosts from the API, and leave actual + cells as a purely architectural detail. + + The need to filter instances by cell in the API can and should be solved by + adding a generic by-aggregate filter, which would allow listing instances on + hosts contained within any aggregate, including one that matches the cell + boundaries if so desired. + +- Why are the API responses for ``GET /servers``, ``GET /servers/detail``, + ``GET /servers/{server_id}`` and ``GET /os-services`` missing some + information for certain cells at certain times? Why do I see the status as + "UNKNOWN" for the servers in those cells at those times when I run + ``openstack server list`` or ``openstack server show``? + + Starting from microversion 2.69 the API responses of ``GET /servers``, ``GET + /servers/detail``, ``GET /servers/{server_id}`` and ``GET /os-services`` may + contain missing keys during down cell situations. See the `Handling Down + Cells`__ section of the Compute API guide for more information on the partial + constructs. + + For administrative considerations, see :ref:`handling-cell-failures`. + +.. __: https://docs.openstack.org/api-guide/compute/down_cells.html + +References +---------- + +A large number of cells v2-related presentations have been given at various +OpenStack and OpenInfra Summits over the years. These provide an excellent +reference on the history and development of the feature along with details from +real-world users of the feature. + +- `Newton Summit Video - Nova Cells V2: What's Going On?`__ +- `Pike Summit Video - Scaling Nova: How CellsV2 Affects Your Deployment`__ +- `Queens Summit Video - Add Cellsv2 to your existing Nova deployment`__ +- `Rocky Summit Video - Moving from CellsV1 to CellsV2 at CERN`__ +- `Stein Summit Video - Scaling Nova with CellsV2: The Nova Developer and the + CERN Operator perspective`__ +- `Ussuri Summit Video - What's new in Nova Cellsv2?`__ + +.. __: https://www.openstack.org/videos/austin-2016/nova-cells-v2-whats-going-on +.. __: https://www.openstack.org/videos/boston-2017/scaling-nova-how-cellsv2-affects-your-deployment +.. __: https://www.openstack.org/videos/sydney-2017/adding-cellsv2-to-your-existing-nova-deployment +.. __: https://www.openstack.org/videos/summits/vancouver-2018/moving-from-cellsv1-to-cellsv2-at-cern +.. __: https://www.openstack.org/videos/summits/berlin-2018/scaling-nova-with-cellsv2-the-nova-developer-and-the-cern-operator-perspective +.. __: https://www.openstack.org/videos/summits/denver-2019/whats-new-in-nova-cellsv2 diff --git a/doc/source/admin/configuration/cross-cell-resize.rst b/doc/source/admin/configuration/cross-cell-resize.rst index d17ee24109ac..e51e42577484 100644 --- a/doc/source/admin/configuration/cross-cell-resize.rst +++ b/doc/source/admin/configuration/cross-cell-resize.rst @@ -2,21 +2,25 @@ Cross-cell resize ================= -This document describes how to configure nova for cross-cell resize. -For information on :term:`same-cell resize `, refer to -:doc:`/admin/configuration/resize`. +.. note:: + + This document describes how to configure nova for cross-cell resize. + For information on :term:`same-cell resize `, refer to + :doc:`/admin/configuration/resize`. For information on the cells v2 feature, + refer to :doc:`/admin/cells`. Historically resizing and cold migrating a server has been explicitly -`restricted`_ to within the same cell in which the server already exists. +`restricted`__ to within the same cell in which the server already exists. The cross-cell resize feature allows configuring nova to allow resizing and cold migrating servers across cells. -The full design details are in the `Ussuri spec`_ and there is a `video`_ from -a summit talk with a high-level overview. +The full design details are in the `Ussuri spec`__ and there is a `video`__ +from a summit talk with a high-level overview. + +.. __: https://opendev.org/openstack/nova/src/tag/20.0.0/nova/conductor/tasks/migrate.py#L164 +.. __: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/approved/cross-cell-resize.html +.. __: https://www.openstack.org/videos/summits/denver-2019/whats-new-in-nova-cellsv2 -.. _restricted: https://opendev.org/openstack/nova/src/tag/20.0.0/nova/conductor/tasks/migrate.py#L164 -.. _Ussuri spec: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/approved/cross-cell-resize.html -.. _video: https://www.openstack.org/videos/summits/denver-2019/whats-new-in-nova-cellsv2 Use case -------- @@ -32,6 +36,7 @@ migrate their servers to the new cell with newer hardware. Administrators could also just cold migrate the servers during a maintenance window to the new cell. + Requirements ------------ @@ -70,6 +75,7 @@ Networking The networking API must expose the ``Port Bindings Extended`` API extension which was added in the 13.0.0 (Rocky) release for Neutron. + Notifications ------------- @@ -78,13 +84,14 @@ from same-cell resize is the *publisher_id* may be different in some cases since some events are sent from the conductor service rather than a compute service. For example, with same-cell resize the ``instance.resize_revert.start`` notification is sent from the source compute -host in the `finish_revert_resize`_ method but with cross-cell resize that +host in the `finish_revert_resize`__ method but with cross-cell resize that same notification is sent from the conductor service. Obviously the actual message queue sending the notifications would be different for the source and target cells assuming they use separate transports. -.. _finish_revert_resize: https://opendev.org/openstack/nova/src/tag/20.0.0/nova/compute/manager.py#L4326 +.. __: https://opendev.org/openstack/nova/src/tag/20.0.0/nova/compute/manager.py#L4326 + Instance actions ---------------- @@ -96,6 +103,7 @@ names are generated based on the compute service methods involved in the operation and there are different methods involved in a cross-cell resize. This is important for triage when a cross-cell resize operation fails. + Scheduling ---------- @@ -107,19 +115,20 @@ cell. However, this behavior is configurable using the configuration option if, for example, you want to drain old cells when resizing or cold migrating. + Code flow --------- The end user experience is meant to not change, i.e. status transitions. A successfully cross-cell resized server will go to ``VERIFY_RESIZE`` status and from there the user can either confirm or revert the resized server using -the normal `confirmResize`_ and `revertResize`_ server action APIs. +the normal `confirmResize`__ and `revertResize`__ server action APIs. Under the covers there are some differences from a traditional same-cell resize: * There is no inter-compute interaction. Everything is synchronously - `orchestrated`_ from the (super)conductor service. This uses the + `orchestrated`__ from the (super)conductor service. This uses the :oslo.config:option:`long_rpc_timeout` configuration option. * The orchestration tasks in the (super)conductor service are in charge of @@ -129,15 +138,16 @@ resize: ``instance_mappings`` table record in the API database. * Non-volume-backed servers will have their root disk uploaded to the image - service as a temporary snapshot image just like during the `shelveOffload`_ + service as a temporary snapshot image just like during the `shelveOffload`__ operation. When finishing the resize on the destination host in the target cell that snapshot image will be used to spawn the guest and then the snapshot image will be deleted. -.. _confirmResize: https://docs.openstack.org/api-ref/compute/#confirm-resized-server-confirmresize-action -.. _revertResize: https://docs.openstack.org/api-ref/compute/#revert-resized-server-revertresize-action -.. _orchestrated: https://opendev.org/openstack/nova/src/branch/master/nova/conductor/tasks/cross_cell_migrate.py -.. _shelveOffload: https://docs.openstack.org/api-ref/compute/#shelf-offload-remove-server-shelveoffload-action +.. __: https://docs.openstack.org/api-ref/compute/#confirm-resized-server-confirmresize-action +.. __: https://docs.openstack.org/api-ref/compute/#revert-resized-server-revertresize-action +.. __: https://opendev.org/openstack/nova/src/branch/master/nova/conductor/tasks/cross_cell_migrate.py +.. __: https://docs.openstack.org/api-ref/compute/#shelf-offload-remove-server-shelveoffload-action + Sequence diagram ---------------- @@ -230,6 +240,7 @@ status. } + Limitations ----------- @@ -251,19 +262,21 @@ Other limitations: * The config drive associated with the server, if there is one, will be re-generated on the destination host in the target cell. Therefore if the - server was created with `personality files`_ they will be lost. However, this - is no worse than `evacuating`_ a server that had a config drive when the + server was created with `personality files`__ they will be lost. However, this + is no worse than `evacuating`__ a server that had a config drive when the source and destination compute host are not on shared storage or when shelve offloading and unshelving a server with a config drive. If necessary, the resized server can be rebuilt to regain the personality files. + * The ``_poll_unconfirmed_resizes`` periodic task, which can be :oslo.config:option:`configured ` to automatically confirm pending resizes on the target host, *might* not support cross-cell resizes because doing so would require an :ref:`up-call ` to the API to confirm the resize and cleanup the source cell database. -.. _personality files: https://docs.openstack.org/api-guide/compute/server_concepts.html#server-personality -.. _evacuating: https://docs.openstack.org/api-ref/compute/#evacuate-server-evacuate-action +.. __: https://docs.openstack.org/api-guide/compute/server_concepts.html#server-personality +.. __: https://docs.openstack.org/api-ref/compute/#evacuate-server-evacuate-action + Troubleshooting --------------- @@ -301,9 +314,9 @@ manually recovered, for example volume attachments or port bindings, and also check the (super)conductor service logs. Assuming volume attachments and port bindings are OK (current and pointing at the correct host), then try hard rebooting the server to get it back to ``ACTIVE`` status. If that fails, you -may need to `rebuild`_ the server on the source host. Note that the guest's +may need to `rebuild`__ the server on the source host. Note that the guest's disks on the source host are not deleted until the resize is confirmed so if there is an issue prior to confirm or confirm itself fails, the guest disks should still be available for rebuilding the instance if necessary. -.. _rebuild: https://docs.openstack.org/api-ref/compute/#rebuild-server-rebuild-action +.. __: https://docs.openstack.org/api-ref/compute/#rebuild-server-rebuild-action diff --git a/doc/source/admin/configuring-migrations.rst b/doc/source/admin/configuring-migrations.rst index 63edae6e2164..4de7fe36aa5f 100644 --- a/doc/source/admin/configuring-migrations.rst +++ b/doc/source/admin/configuring-migrations.rst @@ -63,7 +63,8 @@ The migration types are: .. note:: In a multi-cell cloud, instances can be live migrated to a - different host in the same cell, but not across cells. + different host in the same cell, but not across cells. Refer to the + :ref:`cells v2 documentation `. for more information. The following sections describe how to configure your hosts for live migrations using the libvirt virt driver and KVM hypervisor. diff --git a/doc/source/admin/troubleshooting/affinity-policy-violated.rst b/doc/source/admin/troubleshooting/affinity-policy-violated.rst index a7a563491e29..f93a256660c4 100644 --- a/doc/source/admin/troubleshooting/affinity-policy-violated.rst +++ b/doc/source/admin/troubleshooting/affinity-policy-violated.rst @@ -43,9 +43,9 @@ reschedule of the server create request. Note that this means the deployment must have :oslo.config:option:`scheduler.max_attempts` set greater than ``1`` (default is ``3``) to handle races. -An ideal configuration for multiple cells will minimize `upcalls`_ from the -cells to the API database. This is how devstack, for example, is configured in -the CI gate. The cell conductors do not set +An ideal configuration for multiple cells will minimize :ref:`upcalls ` +from the cells to the API database. This is how devstack, for example, is +configured in the CI gate. The cell conductors do not set :oslo.config:option:`api_database.connection` and ``nova-compute`` sets :oslo.config:option:`workarounds.disable_group_policy_check_upcall` to ``True``. @@ -73,6 +73,3 @@ parallel server create requests are racing. Future work is needed to add anti-/affinity support to the placement service in order to eliminate the need for the late affinity check in ``nova-compute``. - -.. _upcalls: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls - diff --git a/doc/source/admin/upgrades.rst b/doc/source/admin/upgrades.rst index 75ac5a4ca56c..00a714970b20 100644 --- a/doc/source/admin/upgrades.rst +++ b/doc/source/admin/upgrades.rst @@ -53,8 +53,8 @@ same time. deployment. The `nova-conductor`` service will fail to start when a ``nova-compute`` service that is older than the previous release (N-2 or greater) is detected. Similarly, in a :doc:`deployment with multiple cells - `, neither the super conductor service nor any - per-cell conductor service will start if any other conductor service in the + `, neither the super conductor service nor any per-cell + conductor service will start if any other conductor service in the deployment is older than the previous release. #. Before maintenance window: diff --git a/doc/source/index.rst b/doc/source/index.rst index 01ed1d7a1c23..80bd8bbc3536 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -153,9 +153,10 @@ the defaults from the :doc:`install guide ` will be sufficient. * :doc:`Feature Support full list `: A detailed dive through features in each compute driver backend. -* :doc:`Cells v2 Planning `: For large deployments, Cells v2 - allows sharding of your compute environment. Upfront planning is key to a - successful Cells v2 layout. +* :doc:`Cells v2 configuration `: For large deployments, cells v2 + cells allow sharding of your compute environment. Upfront planning is key to + a successful cells v2 layout. + * :doc:`Running nova-api on wsgi `: Considerations for using a real WSGI container instead of the baked-in eventlet web server. @@ -166,7 +167,7 @@ the defaults from the :doc:`install guide ` will be sufficient. user/feature-classification user/support-matrix - user/cellsv2-layout + admin/cells user/wsgi Maintenance diff --git a/doc/source/reference/index.rst b/doc/source/reference/index.rst index 6b397f5f5cfd..d162157ac470 100644 --- a/doc/source/reference/index.rst +++ b/doc/source/reference/index.rst @@ -92,10 +92,9 @@ these documents will move into the "Internals" section. If you want to get involved in shaping the future of nova's architecture, these are a great place to start reading up on the current plans. -* :doc:`/user/cells`: How cells v2 is evolving * :doc:`/reference/policy-enforcement`: How we want policy checks on API actions to work in the future -* :doc:`/reference/stable-api`: What stable api means to nova +* :doc:`/reference/stable-api`: What stable API means to nova * :doc:`/reference/scheduler-evolution`: Motivation behind the scheduler / placement evolution @@ -104,7 +103,6 @@ these are a great place to start reading up on the current plans. .. toctree:: :hidden: - /user/cells policy-enforcement stable-api scheduler-evolution diff --git a/doc/source/user/architecture.rst b/doc/source/user/architecture.rst index c4d597b86fe6..2841cc000178 100644 --- a/doc/source/user/architecture.rst +++ b/doc/source/user/architecture.rst @@ -42,7 +42,7 @@ To make this possible nova-compute proxies DB requests over RPC to a central manager called ``nova-conductor``. To horizontally expand Nova deployments, we have a deployment sharding -concept called cells. For more information please see: :doc:`cells` +concept called cells. For more information please see: :doc:`/admin/cells` Components ---------- diff --git a/doc/source/user/cells.rst b/doc/source/user/cells.rst deleted file mode 100644 index 74d6fc1a3c0a..000000000000 --- a/doc/source/user/cells.rst +++ /dev/null @@ -1,759 +0,0 @@ -.. - Licensed under the Apache License, Version 2.0 (the "License"); you may - not use this file except in compliance with the License. You may obtain - a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, WITHOUT - WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the - License for the specific language governing permissions and limitations - under the License. - -.. _cells: - -======= - Cells -======= - -Before reading further, there is a nice overview presentation_ that -Andrew Laski gave at the Austin (Newton) summit which is worth watching. - -.. _presentation: https://www.openstack.org/videos/summits/austin-2016/nova-cells-v2-whats-going-on - -.. _cells-v2: - -Cells V2 -======== - -* `Newton Summit Video - Nova Cells V2: What's Going On? `_ -* `Pike Summit Video - Scaling Nova: How CellsV2 Affects Your Deployment `_ -* `Queens Summit Video - Add Cellsv2 to your existing Nova deployment `_ -* `Rocky Summit Video - Moving from CellsV1 to CellsV2 at CERN - `_ -* `Stein Summit Video - Scaling Nova with CellsV2: The Nova Developer and the CERN Operator perspective - `_ - -Manifesto -~~~~~~~~~ - -Proposal --------- - -Right now, when a request hits the Nova API for a particular instance, the -instance information is fetched from the database, which contains the hostname -of the compute node on which the instance currently lives. If the request needs -to take action on the instance (which is most of them), the hostname is used to -calculate the name of a queue, and a message is written there which finds its -way to the proper compute node. - -The meat of this proposal is changing the above hostname lookup into two parts -that yield three pieces of information instead of one. Basically, instead of -merely looking up the *name* of the compute node on which an instance lives, we -will also obtain database and queue connection information. Thus, when asked to -take action on instance $foo, we will: - -1. Lookup the three-tuple of (database, queue, hostname) for that instance -2. Connect to that database and fetch the instance record -3. Connect to the queue and send the message to the proper hostname queue - -The above differs from the current organization in two ways. First, we need to -do two database lookups before we know where the instance lives. Second, we -need to demand-connect to the appropriate database and queue. Both of these -have performance implications, but we believe we can mitigate the impacts -through the use of things like a memcache of instance mapping information and -pooling of connections to database and queue systems. The number of cells will -always be much smaller than the number of instances. - -There are availability implications with this change since something like a -'nova list' which might query multiple cells could end up with a partial result -if there is a database failure in a cell. See :doc:`/admin/cells` for knowing -more about the recommended practices under such situations. A database failure -within a cell would cause larger issues than a partial list result so the -expectation is that it would be addressed quickly and cellsv2 will handle it by -indicating in the response that the data may not be complete. - -Since this is very similar to what we have with current cells, in terms of -organization of resources, we have decided to call this "cellsv2" for -disambiguation. - -After this work is complete there will no longer be a "no cells" deployment. -The default installation of Nova will be a single cell setup. - -Benefits --------- - -The benefits of this new organization are: - -* Native sharding of the database and queue as a first-class-feature in nova. - All of the code paths will go through the lookup procedure and thus we won't - have the same feature parity issues as we do with current cells. - -* No high-level replication of all the cell databases at the top. The API will - need a database of its own for things like the instance index, but it will - not need to replicate all the data at the top level. - -* It draws a clear line between global and local data elements. Things like - flavors and keypairs are clearly global concepts that need only live at the - top level. Providing this separation allows compute nodes to become even more - stateless and insulated from things like deleted/changed global data. - -* Existing non-cells users will suddenly gain the ability to spawn a new "cell" - from their existing deployment without changing their architecture. Simply - adding information about the new database and queue systems to the new index - will allow them to consume those resources. - -* Existing cells users will need to fill out the cells mapping index, shutdown - their existing cells synchronization service, and ultimately clean up their - top level database. However, since the high-level organization is not - substantially different, they will not have to re-architect their systems to - move to cellsv2. - -* Adding new sets of hosts as a new "cell" allows them to be plugged into a - deployment and tested before allowing builds to be scheduled to them. - - -Database split -~~~~~~~~~~~~~~ - -As mentioned above there is a split between global data and data that is local -to a cell. - -The following is a breakdown of what data can uncontroversially considered -global versus local to a cell. Missing data will be filled in as consensus is -reached on the data that is more difficult to cleanly place. The missing data -is mostly concerned with scheduling and networking. - -Global (API-level) Tables -------------------------- - -instance_types -instance_type_projects -instance_type_extra_specs -quotas -project_user_quotas -quota_classes -quota_usages -security_groups -security_group_rules -security_group_default_rules -provider_fw_rules -key_pairs -migrations -networks -tags - -Cell-level Tables ------------------ - -instances -instance_info_caches -instance_extra -instance_metadata -instance_system_metadata -instance_faults -instance_actions -instance_actions_events -instance_id_mappings -pci_devices -block_device_mapping -virtual_interfaces - -Setup of Cells V2 -================= - -Overview -~~~~~~~~ - -As more of the CellsV2 implementation is finished, all operators are -required to make changes to their deployment. For all deployments -(even those that only intend to have one cell), these changes are -configuration-related, both in the main nova configuration file as -well as some extra records in the databases. - -All nova deployments must now have the following databases available -and configured: - -1. The "API" database -2. One special "cell" database called "cell0" -3. One (or eventually more) "cell" databases - -Thus, a small nova deployment will have an API database, a cell0, and -what we will call here a "cell1" database. High-level tracking -information is kept in the API database. Instances that are never -scheduled are relegated to the cell0 database, which is effectively a -graveyard of instances that failed to start. All successful/running -instances are stored in "cell1". - - -.. note:: Since Nova services make use of both configuration file and some - databases records, starting or restarting those services with an - incomplete configuration could lead to an incorrect deployment. - Please only restart the services once you are done with the described - steps below. - - -First Time Setup -~~~~~~~~~~~~~~~~ - -Since there is only one API database, the connection information for -it is stored in the nova.conf file. -:: - - [api_database] - connection = mysql+pymysql://root:secretmysql@dbserver/nova_api?charset=utf8 - -Since there may be multiple "cell" databases (and in fact everyone -will have cell0 and cell1 at a minimum), connection info for these is -stored in the API database. Thus, you must have connection information -in your config file for the API database before continuing to the -steps below, so that `nova-manage` can find your other databases. - -The following examples show the full expanded command line usage of -the setup commands. This is to make it easier to visualize which of -the various URLs are used by each of the commands. However, you should -be able to put all of that in the config file and `nova-manage` will -use those values. If need be, you can create separate config files and -pass them as `nova-manage --config-file foo.conf` to control the -behavior without specifying things on the command lines. - -The commands below use the API database so remember to run -`nova-manage api_db sync` first. - -First we will create the necessary records for the cell0 database. To -do that we use `nova-manage` like this:: - - nova-manage cell_v2 map_cell0 --database_connection \ - mysql+pymysql://root:secretmysql@dbserver/nova_cell0?charset=utf8 - -.. note:: If you don't specify `--database_connection` then - `nova-manage` will use the `[database]/connection` value - from your config file, and mangle the database name to have - a `_cell0` suffix. -.. warning:: If your databases are on separate hosts then you should specify - `--database_connection` or make certain that the nova.conf - being used has the `[database]/connection` value pointing to the - same user/password/host that will work for the cell0 database. - If the cell0 mapping was created incorrectly, it can be deleted - using the `nova-manage cell_v2 delete_cell` command and then run - `map_cell0` again with the proper database connection value. - -Since no hosts are ever in cell0, nothing further is required for its -setup. Note that all deployments only ever have one cell0, as it is -special, so once you have done this step you never need to do it -again, even if you add more regular cells. - -Now, we must create another cell which will be our first "regular" -cell, which has actual compute hosts in it, and to which instances can -actually be scheduled. First, we create the cell record like this:: - - nova-manage cell_v2 create_cell --verbose --name cell1 \ - --database_connection mysql+pymysql://root:secretmysql@127.0.0.1/nova?charset=utf8 - --transport-url rabbit://stackrabbit:secretrabbit@mqserver:5672/ - -.. note:: If you don't specify the database and transport urls then - `nova-manage` will use the - `[database]/connection` and `[DEFAULT]/transport_url` values - from the config file. - -.. note:: At this point, the API database can now find the cell - database, and further commands will attempt to look - inside. If this is a completely fresh database (such as if - you're adding a cell, or if this is a new deployment), then - you will need to run `nova-manage db sync` on it to - initialize the schema. - -The `nova-manage cell_v2 create_cell` command will print the UUID of the -newly-created cell if `--verbose` is passed, which is useful if you -need to run commands like `discover_hosts` targeted at a specific -cell. - -Now we have a cell, but no hosts are in it which means the scheduler -will never actually place instances there. The next step is to scan -the database for compute node records and add them into the cell we -just created. For this step, you must have had a compute node started -such that it registers itself as a running service. Once that has -happened, you can scan and add it to the cell:: - - nova-manage cell_v2 discover_hosts - -This command will connect to any databases for which you have created -cells (as above), look for hosts that have registered themselves -there, and map those hosts in the API database so that -they are visible to the scheduler as available targets for -instances. Any time you add more compute hosts to a cell, you need to -re-run this command to map them from the top-level so they can be -utilized. - -Template URLs in Cell Mappings -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Starting in the Rocky release, the URLs provided in the cell mappings -for ``--database_connection`` and ``--transport-url`` can contain -variables which are evaluated each time they are loaded from the -database, and the values of which are taken from the corresponding -base options in the host's configuration file. The base URL is parsed -and the following elements may be substituted into the cell mapping -URL (using ``rabbit://bob:s3kret@myhost:123/nova?sync=true#extra``): - -.. list-table:: Cell Mapping URL Variables - :header-rows: 1 - :widths: 15, 50, 15 - - * - Variable - - Meaning - - Part of example URL - * - ``scheme`` - - The part before the `://` - - ``rabbit`` - * - ``username`` - - The username part of the credentials - - ``bob`` - * - ``password`` - - The password part of the credentials - - ``s3kret`` - * - ``hostname`` - - The hostname or address - - ``myhost`` - * - ``port`` - - The port number (must be specified) - - ``123`` - * - ``path`` - - The "path" part of the URL (without leading slash) - - ``nova`` - * - ``query`` - - The full query string arguments (without leading question mark) - - ``sync=true`` - * - ``fragment`` - - Everything after the first hash mark - - ``extra`` - -Variables are provided in curly brackets, like ``{username}``. A simple template -of ``rabbit://{username}:{password}@otherhost/{path}`` will generate a full URL -of ``rabbit://bob:s3kret@otherhost/nova`` when used with the above example. - -.. note:: The ``[database]/connection`` and - ``[DEFAULT]/transport_url`` values are not reloaded from the - configuration file during a SIGHUP, which means that a full service - restart will be required to notice changes in a cell mapping record - if variables are changed. - -.. note:: The ``[DEFAULT]/transport_url`` option can contain an - extended syntax for the "netloc" part of the url - (i.e. `userA:passwordA@hostA:portA,userB:passwordB:hostB:portB`). In this - case, substitions of the form ``username1``, ``username2``, etc will be - honored and can be used in the template URL. - -The templating of these URLs may be helpful in order to provide each service host -with its own credentials for, say, the database. Without templating, all hosts -will use the same URL (and thus credentials) for accessing services like the -database and message queue. By using a URL with a template that results in the -credentials being taken from the host-local configuration file, each host will -use different values for those connections. - -Assuming you have two service hosts that are normally configured with the cell0 -database as their primary connection, their (abbreviated) configurations would -look like this:: - - [database] - connection = mysql+pymysql://service1:foo@myapidbhost/nova_cell0 - -and:: - - [database] - connection = mysql+pymysql://service2:bar@myapidbhost/nova_cell0 - -Without cell mapping template URLs, they would still use the same credentials -(as stored in the mapping) to connect to the cell databases. However, consider -template URLs like the following:: - - mysql+pymysql://{username}:{password}@mycell1dbhost/nova - -and:: - - mysql+pymysql://{username}:{password}@mycell2dbhost/nova - -Using the first service and cell1 mapping, the calculated URL that will actually -be used for connecting to that database will be:: - - mysql+pymysql://service1:foo@mycell1dbhost/nova - - -References -~~~~~~~~~~ - -* :doc:`nova-manage man page ` - -Step-By-Step for Common Use Cases -================================= - -The following are step-by-step examples for common use cases setting -up Cells V2. This is intended as a quick reference that puts together -everything explained in `Setup of Cells V2`_. It is assumed that you have -followed all other install steps for Nova and are setting up Cells V2 -specifically at this point. - -Fresh Install -~~~~~~~~~~~~~ - -You are installing Nova for the first time and have no compute hosts in the -database yet. This will set up a single cell Nova deployment. - -1. Reminder: You should have already created and synced the Nova API database - by creating a database, configuring its connection in the - ``[api_database]/connection`` setting in the Nova configuration file, and - running ``nova-manage api_db sync``. - -2. Create a database for cell0. If you are going to pass the database - connection url on the command line in step 3, you can name the cell0 - database whatever you want. If you are not going to pass the database url on - the command line in step 3, you need to name the cell0 database based on the - name of your existing Nova database: _cell0. For - example, if your Nova database is named ``nova``, then your cell0 database - should be named ``nova_cell0``. - -3. Run the ``map_cell0`` command to create and map cell0:: - - nova-manage cell_v2 map_cell0 \ - --database_connection - - The database connection url is generated based on the - ``[database]/connection`` setting in the Nova configuration file if not - specified on the command line. - -4. Run ``nova-manage db sync`` to populate the cell0 database with a schema. - The ``db sync`` command reads the database connection for cell0 that was - created in step 3. - -5. Run the ``create_cell`` command to create the single cell which will contain - your compute hosts:: - - nova-manage cell_v2 create_cell --name \ - --transport-url \ - --database_connection - - The transport url is taken from the ``[DEFAULT]/transport_url`` setting in - the Nova configuration file if not specified on the command line. The - database url is taken from the ``[database]/connection`` setting in the Nova - configuration file if not specified on the command line. - -6. Configure your compute host(s), making sure ``[DEFAULT]/transport_url`` - matches the transport URL for the cell created in step 5, and start the - nova-compute service. Before step 7, make sure you have compute hosts in the - database by running:: - - nova service-list --binary nova-compute - -7. Run the ``discover_hosts`` command to map compute hosts to the single cell - by running:: - - nova-manage cell_v2 discover_hosts - - The command will search for compute hosts in the database of the cell you - created in step 5 and map them to the cell. You can also configure a - periodic task to have Nova discover new hosts automatically by setting - the ``[scheduler]/discover_hosts_in_cells_interval`` to a time interval in - seconds. The periodic task is run by the nova-scheduler service, so you must - be sure to configure it on all of your nova-scheduler hosts. - -.. note:: Remember: In the future, whenever you add new compute hosts, you - will need to run the ``discover_hosts`` command after starting them - to map them to the cell if you did not configure the automatic host - discovery in step 7. - -Upgrade (minimal) -~~~~~~~~~~~~~~~~~ - -You are upgrading an existing Nova install and have compute hosts in the -database. This will set up a single cell Nova deployment. - -1. If you haven't already created a cell0 database in a prior release, - create a database for cell0 with a name based on the name of your Nova - database: _cell0. If your Nova database is named - ``nova``, then your cell0 database should be named ``nova_cell0``. - -.. warning:: In Newton, the ``simple_cell_setup`` command expects the name of - the cell0 database to be based on the name of the Nova API - database: _cell0 and the database - connection url is taken from the ``[api_database]/connection`` - setting in the Nova configuration file. - -2. Run the ``simple_cell_setup`` command to create and map cell0, create and - map the single cell, and map existing compute hosts and instances to the - single cell:: - - nova-manage cell_v2 simple_cell_setup \ - --transport-url - - The transport url is taken from the ``[DEFAULT]/transport_url`` setting in - the Nova configuration file if not specified on the command line. The - database connection url will be generated based on the - ``[database]/connection`` setting in the Nova configuration file. - -.. note:: Remember: In the future, whenever you add new compute hosts, you - will need to run the ``discover_hosts`` command after starting them - to map them to the cell. You can also configure a periodic task to - have Nova discover new hosts automatically by setting the - ``[scheduler]/discover_hosts_in_cells_interval`` to a time interval - in seconds. The periodic task is run by the nova-scheduler service, - so you must be sure to configure it on all of your nova-scheduler - hosts. - -Upgrade with Cells V1 -~~~~~~~~~~~~~~~~~~~~~ - -.. todo:: This needs to be removed but `Adding a new cell to an existing deployment`_ - is still using it. - -You are upgrading an existing Nova install that has Cells V1 enabled and have -compute hosts in your databases. This will set up a multiple cell Nova -deployment. At this time, it is recommended to keep Cells V1 enabled during and -after the upgrade as multiple Cells V2 cell support is not fully finished and -may not work properly in all scenarios. These upgrade steps will help ensure a -simple cutover from Cells V1 to Cells V2 in the future. - -.. note:: There is a Rocky summit video from CERN about how they did their - upgrade from cells v1 to v2 here: - - https://www.openstack.org/videos/vancouver-2018/moving-from-cellsv1-to-cellsv2-at-cern - -1. If you haven't already created a cell0 database in a prior release, - create a database for cell0. If you are going to pass the database - connection url on the command line in step 2, you can name the cell0 - database whatever you want. If you are not going to pass the database url on - the command line in step 2, you need to name the cell0 database based on the - name of your existing Nova database: _cell0. For - example, if your Nova database is named ``nova``, then your cell0 database - should be named ``nova_cell0``. - -2. Run the ``map_cell0`` command to create and map cell0:: - - nova-manage cell_v2 map_cell0 \ - --database_connection - - The database connection url is generated based on the - ``[database]/connection`` setting in the Nova configuration file if not - specified on the command line. - -3. Run ``nova-manage db sync`` to populate the cell0 database with a schema. - The ``db sync`` command reads the database connection for cell0 that was - created in step 2. - -4. Run the ``create_cell`` command to create cells which will contain your - compute hosts:: - - nova-manage cell_v2 create_cell --name \ - --transport-url \ - --database_connection - - You will need to repeat this step for each cell in your deployment. Your - existing cell database will be re-used -- this simply informs the top-level - API database about your existing cell databases. - - It is a good idea to specify a name for the new cell you create so you can - easily look up cell uuids with the ``list_cells`` command later if needed. - - The transport url is taken from the ``[DEFAULT]/transport_url`` setting in - the Nova configuration file if not specified on the command line. The - database url is taken from the ``[database]/connection`` setting in the Nova - configuration file if not specified on the command line. If you are not - going to specify ``--database_connection`` and ``--transport-url`` on the - command line, be sure to specify your existing cell Nova configuration - file:: - - nova-manage --config-file cell_v2 create_cell \ - --name - -5. Run the ``discover_hosts`` command to map compute hosts to cells:: - - nova-manage cell_v2 discover_hosts --cell_uuid - - You will need to repeat this step for each cell in your deployment unless - you omit the ``--cell_uuid`` option. If the cell uuid is not specified on - the command line, ``discover_hosts`` will search for compute hosts in each - cell database and map them to the corresponding cell. You can use the - ``list_cells`` command to look up cell uuids if you are going to specify - ``--cell_uuid``. - - You can also configure a periodic task to have Nova discover new hosts - automatically by setting the - ``[scheduler]/discover_hosts_in_cells_interval`` to a time interval in - seconds. The periodic task is run by the nova-scheduler service, so you must - be sure to configure it on all of your nova-scheduler hosts. - -6. Run the ``map_instances`` command to map instances to cells:: - - nova-manage cell_v2 map_instances --cell_uuid \ - --max-count - - You will need to repeat this step for each cell in your deployment. You can - use the ``list_cells`` command to look up cell uuids. - - The ``--max-count`` option can be specified if you would like to limit the - number of instances to map in a single run. If ``--max-count`` is not - specified, all instances will be mapped. Repeated runs of the command will - start from where the last run finished so it is not necessary to increase - ``--max-count`` to finish. An exit code of 0 indicates that all instances - have been mapped. An exit code of 1 indicates that there are remaining - instances that need to be mapped. - -.. note:: Remember: In the future, whenever you add new compute hosts, you - will need to run the ``discover_hosts`` command after starting them - to map them to a cell if you did not configure the automatic host - discovery in step 5. - -Adding a new cell to an existing deployment -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -To expand your deployment with a new cell, first follow the usual steps for -standing up a new Cells V1 cell. After that is finished, follow step 4 in -`Upgrade with Cells V1`_ to create a new Cells V2 cell for it. If you have -added new compute hosts for the new cell, you will also need to run the -``discover_hosts`` command after starting them to map them to the new cell if -you did not configure the automatic host discovery as described in step 5 in -`Upgrade with Cells V1`_. - -References -~~~~~~~~~~ - -* :doc:`nova-manage man page ` - -FAQs -==== - -#. How do I find out which hosts are bound to which cell? - - There are a couple of ways to do this. - - 1. Run ``nova-manage --config-file host list``. This will - only lists hosts in the provided cell nova.conf. - - .. deprecated:: 16.0.0 - The ``nova-manage host list`` command is deprecated as of the - 16.0.0 Pike release. - - 2. Run ``nova-manage cell_v2 discover_hosts --verbose``. This does not - produce a report but if you are trying to determine if a host is in a - cell you can run this and it will report any hosts that are not yet - mapped to a cell and map them. This command is idempotent. - - 3. Run ``nova-manage cell_v2 list_hosts``. This will list hosts in all - cells. If you want to list hosts in a specific cell, you can run - ``nova-manage cell_v2 list_hosts --cell_uuid ``. - -#. I updated the database_connection and/or transport_url in a cell using the - ``nova-manage cell_v2 update_cell`` command but the API is still trying to - use the old settings. - - The cell mappings are cached in the nova-api service worker so you will need - to restart the worker process to rebuild the cache. Note that there is - another global cache tied to request contexts, which is used in the - nova-conductor and nova-scheduler services, so you might need to do the same - if you are having the same issue in those services. As of the 16.0.0 Pike - release there is no timer on the cache or hook to refresh the cache using a - SIGHUP to the service. - -#. I have upgraded from Newton to Ocata and I can list instances but I get a - 404 NotFound error when I try to get details on a specific instance. - - Instances need to be mapped to cells so the API knows which cell an instance - lives in. When upgrading, the ``nova-manage cell_v2 simple_cell_setup`` - command will automatically map the instances to the single cell which is - backed by the existing nova database. If you have already upgraded - and did not use the ``simple_cell_setup`` command, you can run the - ``nova-manage cell_v2 map_instances --cell_uuid `` command to - map all instances in the given cell. See the :ref:`man-page-cells-v2` man - page for details on command usage. - -#. Should I change any of the ``[cells]`` configuration options for Cells v2? - - **NO**. Those options are for Cells v1 usage only and are not used at all - for Cells v2. That includes the ``nova-cells`` service - it has nothing - to do with Cells v2. - -#. Can I create a cell but have it disabled from scheduling? - - Yes. It is possible to create a pre-disabled cell such that it does not - become a candidate for scheduling new VMs. This can be done by running the - ``nova-manage cell_v2 create_cell`` command with the ``--disabled`` option. - -#. How can I disable a cell so that the new server create requests do not go to - it while I perform maintenance? - - Existing cells can be disabled by running ``nova-manage cell_v2 update_cell - --cell_uuid --disable`` and can be re-enabled once the - maintenance period is over by running ``nova-manage cell_v2 update_cell - --cell_uuid --enable`` - -#. I disabled (or enabled) a cell using the ``nova-manage cell_v2 update_cell`` - or I created a new (pre-disabled) cell(mapping) using the - ``nova-manage cell_v2 create_cell`` command but the scheduler is still using - the old settings. - - The cell mappings are cached in the scheduler worker so you will either need - to restart the scheduler process to refresh the cache, or send a SIGHUP - signal to the scheduler by which it will automatically refresh the cells - cache and the changes will take effect. - -#. Why was the cells REST API not implemented for CellsV2? Why are - there no CRUD operations for cells in the API? - - One of the deployment challenges that CellsV1 had was the - requirement for the API and control services to be up before a new - cell could be deployed. This was not a problem for large-scale - public clouds that never shut down, but is not a reasonable - requirement for smaller clouds that do offline upgrades and/or - clouds which could be taken completely offline by something like a - power outage. Initial devstack and gate testing for CellsV1 was - delayed by the need to engineer a solution for bringing the services - partially online in order to deploy the rest, and this continues to - be a gap for other deployment tools. Consider also the FFU case - where the control plane needs to be down for a multi-release - upgrade window where changes to cell records have to be made. This - would be quite a bit harder if the way those changes are made is - via the API, which must remain down during the process. - - Further, there is a long-term goal to move cell configuration - (i.e. cell_mappings and the associated URLs and credentials) into - config and get away from the need to store and provision those - things in the database. Obviously a CRUD interface in the API would - prevent us from making that move. - -#. Why are cells not exposed as a grouping mechanism in the API for - listing services, instances, and other resources? - - Early in the design of CellsV2 we set a goal to not let the cell - concept leak out of the API, even for operators. Aggregates are the - way nova supports grouping of hosts for a variety of reasons, and - aggregates can cut across cells, and/or be aligned with them if - desired. If we were to support cells as another grouping mechanism, - we would likely end up having to implement many of the same - features for them as aggregates, such as scheduler features, - metadata, and other searching/filtering operations. Since - aggregates are how Nova supports grouping, we expect operators to - use aggregates any time they need to refer to a cell as a group of - hosts from the API, and leave actual cells as a purely - architectural detail. - - The need to filter instances by cell in the API can and should be - solved by adding a generic by-aggregate filter, which would allow - listing instances on hosts contained within any aggregate, - including one that matches the cell boundaries if so desired. - -#. Why are the API responses for ``GET /servers``, ``GET /servers/detail``, - ``GET /servers/{server_id}`` and ``GET /os-services`` missing some - information for certain cells at certain times? Why do I see the status as - "UNKNOWN" for the servers in those cells at those times when I run - ``openstack server list`` or ``openstack server show``? - - Starting from microversion 2.69 the API responses of ``GET /servers``, - ``GET /servers/detail``, ``GET /servers/{server_id}`` and - ``GET /os-services`` may contain missing keys during down cell situations. - See the `Handling Down Cells - `__ - section of the Compute API guide for more information on the partial - constructs. - - For administrative considerations, see - :ref:`Handling cell failures `. diff --git a/doc/source/user/cellsv2-layout.rst b/doc/source/user/cellsv2-layout.rst deleted file mode 100644 index 945770c6658a..000000000000 --- a/doc/source/user/cellsv2-layout.rst +++ /dev/null @@ -1,412 +0,0 @@ -.. - Licensed under the Apache License, Version 2.0 (the "License"); you may - not use this file except in compliance with the License. You may obtain - a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, WITHOUT - WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the - License for the specific language governing permissions and limitations - under the License. - -=================== - Cells Layout (v2) -=================== - -This document describes the layout of a deployment with Cells -version 2, including deployment considerations for security and -scale. It is focused on code present in Pike and later, and while it -is geared towards people who want to have multiple cells for whatever -reason, the nature of the cellsv2 support in Nova means that it -applies in some way to all deployments. - -Concepts -======== - -A basic Nova system consists of the following components: - -* The nova-api service which provides the external REST API to users. -* The nova-scheduler and placement services which are responsible - for tracking resources and deciding which compute node instances - should be on. -* An "API database" that is used primarily by nova-api and - nova-scheduler (called *API-level services* below) to track location - information about instances, as well as a temporary location for - instances being built but not yet scheduled. -* The nova-conductor service which offloads long-running tasks for the - API-level service, as well as insulates compute nodes from direct - database access -* The nova-compute service which manages the virt driver and - hypervisor host. -* A "cell database" which is used by API, conductor and compute - services, and which houses the majority of the information about - instances. -* A "cell0 database" which is just like the cell database, but - contains only instances that failed to be scheduled. -* A message queue which allows the services to communicate with each - other via RPC. - -All deployments have at least the above components. Small deployments -likely have a single message queue that all services share, and a -single database server which hosts the API database, a single cell -database, as well as the required cell0 database. This is considered a -"single-cell deployment" because it only has one "real" cell. The -cell0 database mimics a regular cell, but has no compute nodes and is -used only as a place to put instances that fail to land on a real -compute node (and thus a real cell). - -The purpose of the cells functionality in nova is specifically to -allow larger deployments to shard their many compute nodes into cells, -each of which has a database and message queue. The API database is -always and only global, but there can be many cell databases (where -the bulk of the instance information lives), each with a portion of -the instances for the entire deployment within. - -All of the nova services use a configuration file, all of which will -at a minimum specify a message queue endpoint -(i.e. ``[DEFAULT]/transport_url``). Most of the services also require -configuration of database connection information -(i.e. ``[database]/connection``). API-level services that need access -to the global routing and placement information will also be -configured to reach the API database -(i.e. ``[api_database]/connection``). - -.. note:: The pair of ``transport_url`` and ``[database]/connection`` - configured for a service defines what cell a service lives - in. - -API-level services need to be able to contact other services in all of -the cells. Since they only have one configured ``transport_url`` and -``[database]/connection`` they look up the information for the other -cells in the API database, with records called *cell mappings*. - -.. note:: The API database must have cell mapping records that match - the ``transport_url`` and ``[database]/connection`` - configuration elements of the lower-level services. See the - ``nova-manage`` :ref:`man-page-cells-v2` commands for more - information about how to create and examine these records. - -Service Layout -============== - -The services generally have a well-defined communication pattern that -dictates their layout in a deployment. In a small/simple scenario, the -rules do not have much of an impact as all the services can -communicate with each other on a single message bus and in a single -cell database. However, as the deployment grows, scaling and security -concerns may drive separation and isolation of the services. - -Simple ------- - -This is a diagram of the basic services that a simple (single-cell) -deployment would have, as well as the relationships -(i.e. communication paths) between them: - -.. graphviz:: - - digraph services { - graph [pad="0.35", ranksep="0.65", nodesep="0.55", concentrate=true]; - node [fontsize=10 fontname="Monospace"]; - edge [arrowhead="normal", arrowsize="0.8"]; - labelloc=bottom; - labeljust=left; - - { rank=same - api [label="nova-api"] - apidb [label="API Database" shape="box"] - scheduler [label="nova-scheduler"] - } - { rank=same - mq [label="MQ" shape="diamond"] - conductor [label="nova-conductor"] - } - { rank=same - cell0db [label="Cell0 Database" shape="box"] - celldb [label="Cell Database" shape="box"] - compute [label="nova-compute"] - } - - api -> mq -> compute - conductor -> mq -> scheduler - - api -> apidb - api -> cell0db - api -> celldb - - conductor -> apidb - conductor -> cell0db - conductor -> celldb - } - -All of the services are configured to talk to each other over the same -message bus, and there is only one cell database where live instance -data resides. The cell0 database is present (and required) but as no -compute nodes are connected to it, this is still a "single cell" -deployment. - -Multiple Cells --------------- - -In order to shard the services into multiple cells, a number of things -must happen. First, the message bus must be split into pieces along -the same lines as the cell database. Second, a dedicated conductor -must be run for the API-level services, with access to the API -database and a dedicated message queue. We call this *super conductor* -to distinguish its place and purpose from the per-cell conductor nodes. - -.. graphviz:: - - digraph services2 { - graph [pad="0.35", ranksep="0.65", nodesep="0.55", concentrate=true]; - node [fontsize=10 fontname="Monospace"]; - edge [arrowhead="normal", arrowsize="0.8"]; - labelloc=bottom; - labeljust=left; - - subgraph api { - api [label="nova-api"] - scheduler [label="nova-scheduler"] - conductor [label="super conductor"] - { rank=same - apimq [label="API MQ" shape="diamond"] - apidb [label="API Database" shape="box"] - } - - api -> apimq -> conductor - api -> apidb - conductor -> apimq -> scheduler - conductor -> apidb - } - - subgraph clustercell0 { - label="Cell 0" - color=green - cell0db [label="Cell Database" shape="box"] - } - - subgraph clustercell1 { - label="Cell 1" - color=blue - mq1 [label="Cell MQ" shape="diamond"] - cell1db [label="Cell Database" shape="box"] - conductor1 [label="nova-conductor"] - compute1 [label="nova-compute"] - - conductor1 -> mq1 -> compute1 - conductor1 -> cell1db - - } - - subgraph clustercell2 { - label="Cell 2" - color=red - mq2 [label="Cell MQ" shape="diamond"] - cell2db [label="Cell Database" shape="box"] - conductor2 [label="nova-conductor"] - compute2 [label="nova-compute"] - - conductor2 -> mq2 -> compute2 - conductor2 -> cell2db - } - - api -> mq1 -> conductor1 - api -> mq2 -> conductor2 - api -> cell0db - api -> cell1db - api -> cell2db - - conductor -> cell0db - conductor -> cell1db - conductor -> mq1 - conductor -> cell2db - conductor -> mq2 - } - -It is important to note that services in the lower cell boxes only -have the ability to call back to the placement API but cannot access -any other API-layer services via RPC, nor do they have access to the -API database for global visibility of resources across the cloud. -This is intentional and provides security and failure domain -isolation benefits, but also has impacts on some things that would -otherwise require this any-to-any communication style. Check the -release notes for the version of Nova you are using for the most -up-to-date information about any caveats that may be present due to -this limitation. - -Caveats of a Multi-Cell deployment ----------------------------------- - -.. note:: This information is correct as of the Pike release. Where - improvements have been made or issues fixed, they are noted per - item. - -Cross-cell instance migrations -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Currently it is not possible to migrate an instance from a host in one -cell to a host in another cell. This may be possible in the future, -but it is currently unsupported. This impacts cold migration, -resizes, live migrations, evacuate, and unshelve operations. - -Quota-related quirks -~~~~~~~~~~~~~~~~~~~~ - -Quotas are now calculated live at the point at which an operation -would consume more resource, instead of being kept statically in the -database. This means that a multi-cell environment may incorrectly -calculate the usage of a tenant if one of the cells is unreachable, as -those resources cannot be counted. In this case, the tenant may be -able to consume more resource from one of the available cells, putting -them far over quota when the unreachable cell returns. - -.. note:: Starting in the Train (20.0.0) release, it is possible to configure - counting of quota usage from the placement service and API database - to make quota usage calculations resilient to down or poor-performing - cells in a multi-cell environment. See the - :doc:`quotas documentation` for more details. - -Performance of listing instances -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. note:: This has been resolved in the Queens release [#]_. - -With multiple cells, the instance list operation may not sort and -paginate results properly when crossing multiple cell -boundaries. Further, the performance of a sorted list operation will -be considerably slower than with a single cell. - -Notifications -~~~~~~~~~~~~~ - -With a multi-cell environment with multiple message queues, it is -likely that operators will want to configure a separate connection to -a unified queue for notifications. This can be done in the configuration file -of all nodes. Refer to the :oslo.messaging-doc:`oslo.messaging configuration -documentation -` for more -details. - -.. _cells-v2-layout-metadata-api: - -Nova Metadata API service -~~~~~~~~~~~~~~~~~~~~~~~~~ - -Starting from the Stein release, the :doc:`nova metadata API service -` can be run either globally or per cell using the -:oslo.config:option:`api.local_metadata_per_cell` configuration option. - -**Global** - -If you have networks that span cells, you might need to run Nova metadata API -globally. When running globally, it should be configured as an API-level -service with access to the :oslo.config:option:`api_database.connection` -information. The nova metadata API service **must not** be run as a standalone -service, using the :program:`nova-api-metadata` service, in this case. - -**Local per cell** - -Running Nova metadata API per cell can have better performance and data -isolation in a multi-cell deployment. If your networks are segmented along -cell boundaries, then you can run Nova metadata API service per cell. If you -choose to run it per cell, you should also configure each -:neutron-doc:`neutron-metadata-agent -` service to -point to the corresponding :program:`nova-api-metadata`. The nova metadata API -service **must** be run as a standalone service, using the -:program:`nova-api-metadata` service, in this case. - - -Console proxies -~~~~~~~~~~~~~~~ - -`Starting from the Rocky release`__, console proxies must be run per cell -because console token authorizations are stored in cell databases. This means -that each console proxy server must have access to the -:oslo.config:option:`database.connection` information for the cell database -containing the instances for which it is proxying console access. - -.. __: https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html - - -.. _upcall: - -Operations Requiring upcalls -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -If you deploy multiple cells with a superconductor as described above, -computes and cell-based conductors will not have the ability to speak -to the scheduler as they are not connected to the same MQ. This is by -design for isolation, but currently the processes are not in place to -implement some features without such connectivity. Thus, anything that -requires a so-called "upcall" will not function. This impacts the -following: - -#. Instance reschedules during boot and resize (part 1) - - .. note:: This has been resolved in the Queens release [#]_. - -#. Instance affinity reporting from the compute nodes to scheduler -#. The late anti-affinity check during server create and evacuate -#. Querying host aggregates from the cell - - .. note:: This has been resolved in the Rocky release [#]_. - -#. Attaching a volume and ``[cinder]/cross_az_attach=False`` -#. Instance reschedules during boot and resize (part 2) - - .. note:: This has been resolved in the Ussuri release [#]_ [#]_. - -The first is simple: if you boot an instance, it gets scheduled to a -compute node, fails, it would normally be re-scheduled to another -node. That requires scheduler intervention and thus it will not work -in Pike with a multi-cell layout. If you do not rely on reschedules -for covering up transient compute-node failures, then this will not -affect you. To ensure you do not make futile attempts at rescheduling, -you should set ``[scheduler]/max_attempts=1`` in ``nova.conf``. - -The second two are related. The summary is that some of the facilities -that Nova has for ensuring that affinity/anti-affinity is preserved -between instances does not function in Pike with a multi-cell -layout. If you don't use affinity operations, then this will not -affect you. To make sure you don't make futile attempts at the -affinity check, you should set -``[workarounds]/disable_group_policy_check_upcall=True`` and -``[filter_scheduler]/track_instance_changes=False`` in ``nova.conf``. - -The fourth was previously only a problem when performing live migrations using -the since-removed XenAPI driver and not specifying ``--block-migrate``. The -driver would attempt to figure out if block migration should be performed based -on source and destination hosts being in the same aggregate. Since aggregates -data had migrated to the API database, the cell conductor would not be able to -access the aggregate information and would fail. - -The fifth is a problem because when a volume is attached to an instance -in the *nova-compute* service, and ``[cinder]/cross_az_attach=False`` in -nova.conf, we attempt to look up the availability zone that the instance is -in which includes getting any host aggregates that the ``instance.host`` is in. -Since the aggregates are in the API database and the cell conductor cannot -access that information, so this will fail. In the future this check could be -moved to the *nova-api* service such that the availability zone between the -instance and the volume is checked before we reach the cell, except in the -case of :term:`boot from volume ` where the *nova-compute* -service itself creates the volume and must tell Cinder in which availability -zone to create the volume. Long-term, volume creation during boot from volume -should be moved to the top-level superconductor which would eliminate this AZ -up-call check problem. - -The sixth is detailed in `bug 1781286`_ and similar to the first issue. -The issue is that servers created without a specific availability zone -will have their AZ calculated during a reschedule based on the alternate host -selected. Determining the AZ for the alternate host requires an "up call" to -the API DB. - -.. _bug 1781286: https://bugs.launchpad.net/nova/+bug/1781286 - -.. [#] https://blueprints.launchpad.net/nova/+spec/efficient-multi-cell-instance-list-and-sort -.. [#] https://specs.openstack.org/openstack/nova-specs/specs/queens/approved/return-alternate-hosts.html -.. [#] https://blueprints.launchpad.net/nova/+spec/live-migration-in-xapi-pool -.. [#] https://review.opendev.org/686047/ -.. [#] https://review.opendev.org/686050/ diff --git a/doc/source/user/index.rst b/doc/source/user/index.rst index 5facf792ad51..743519a86ebd 100644 --- a/doc/source/user/index.rst +++ b/doc/source/user/index.rst @@ -54,9 +54,9 @@ the defaults from the :doc:`install guide ` will be sufficient. * :doc:`Feature Support full list `: A detailed dive through features in each compute driver backend. -* :doc:`Cells v2 Planning `: For large deployments, Cells v2 - allows sharding of your compute environment. Upfront planning is key to a - successful Cells v2 layout. +* :doc:`Cells v2 configuration `: For large deployments, cells v2 + cells allow sharding of your compute environment. Upfront planning is key to + a successful cells v2 layout. * :placement-doc:`Placement service <>`: Overview of the placement service, including how it fits in with the rest of nova. diff --git a/doc/test/redirect-tests.txt b/doc/test/redirect-tests.txt index 4ee7d865c990..012ddfe7bc15 100644 --- a/doc/test/redirect-tests.txt +++ b/doc/test/redirect-tests.txt @@ -9,12 +9,12 @@ /nova/latest/architecture.html 301 /nova/latest/user/architecture.html /nova/latest/block_device_mapping.html 301 /nova/latest/user/block-device-mapping.html /nova/latest/blueprints.html 301 /nova/latest/contributor/blueprints.html -/nova/latest/cells.html 301 /nova/latest/user/cells.html +/nova/latest/cells.html 301 /nova/latest/admin/cells.html /nova/latest/code-review.html 301 /nova/latest/contributor/code-review.html /nova/latest/conductor.html 301 /nova/latest/user/conductor.html /nova/latest/development.environment.html 301 /nova/latest/contributor/development-environment.html /nova/latest/devref/api.html 301 /nova/latest/contributor/api.html -/nova/latest/devref/cells.html 301 /nova/latest/user/cells.html +/nova/latest/devref/cells.html 301 /nova/latest/admin/cells.html /nova/latest/devref/filter_scheduler.html 301 /nova/latest/admin/scheduling.html # catch all, if we hit something in devref assume it moved to # reference unless we have already triggered a hit above. @@ -64,7 +64,9 @@ /nova/latest/threading.html 301 /nova/latest/reference/threading.html /nova/latest/upgrade.html 301 /nova/latest/admin/upgrades.html /nova/latest/user/aggregates.html 301 /nova/latest/admin/aggregates.html -/nova/latest/user/cellsv2_layout.html 301 /nova/latest/user/cellsv2-layout.html +/nova/latest/user/cells.html 301 /nova/latest/admin/cells.html +/nova/latest/user/cellsv2_layout.html 301 /nova/latest/admin/cells.html +/nova/latest/user/cellsv2-layout.html 301 /nova/latest/admin/cells.html /nova/latest/user/config-drive.html 301 /nova/latest/user/metadata.html /nova/latest/user/filter-scheduler.html 301 /nova/latest/admin/scheduling.html /nova/latest/user/metadata-service.html 301 /nova/latest/user/metadata.html