Merge "docs: Follow-ups for cells v2, architecture docs"

This commit is contained in:
Zuul 2022-02-07 10:27:51 +00:00 committed by Gerrit Code Review
commit b6fe7521af
5 changed files with 103 additions and 114 deletions

View File

@ -11,23 +11,26 @@ reads/writes, optionally sending RPC messages to other Nova services,
and generating responses to the REST calls.
RPC messaging is done via the **oslo.messaging** library,
an abstraction on top of message queues.
Nova uses a messaging-based, "shared nothing" architecture and most of the
major nova components can be run on multiple servers, and have a manager that
is listening for RPC messages.
The one major exception is the compute service, where a single process runs on the
hypervisor it is managing (except when using the VMware or Ironic drivers).
The manager also, optionally, has periodic tasks.
For more details on our RPC system, refer to :doc:`/reference/rpc`.
Nova uses traditional SQL databases to store information.
These are (logically) shared between multiple components.
To aid upgrade, the database is accessed through an object layer that ensures
an upgraded control plane can still communicate with compute nodes running
the previous release.
To make this possible, services running on the compute node proxy database
requests over RPC to a central manager called the conductor.
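As an illustrative sketch (the host name and credentials below are
placeholders, not Nova defaults), a compute node's ``nova.conf`` configures
the message queue it uses for RPC but no direct database connection:

.. code-block:: ini

   [DEFAULT]
   # How this node reaches the RPC message queue (placeholder host/credentials)
   transport_url = rabbit://nova:password@rabbit-host:5672/

   # Note: no [database] connection is set on compute nodes; database access
   # is proxied over RPC via the conductor.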
To horizontally expand Nova deployments, we have a deployment sharding
concept called :term:`cells <cell>`.
All deployments contain at least one cell.
For more information, refer to :doc:`/admin/cells`.
Components
@ -109,11 +112,9 @@ projects on a shared system, and role-based access assignments. Roles control
the actions that a user is allowed to perform.
Projects are isolated resource containers that form the principal
organizational structure within the Nova service. They typically consist of
networks, volumes, instances, images, keys, and users. A user can
specify the project by appending ``project_id`` to their access key.
For projects, you can use quota controls to limit the number of processor cores
and the amount of RAM that can be allocated. Other projects also allow quotas
@ -142,13 +143,14 @@ consumption across available hardware resources.
Block storage
-------------
OpenStack provides two classes of block storage: storage that is provisioned by
Nova itself, and storage that is managed by the block storage service, Cinder.
.. rubric:: Nova-provisioned block storage
Nova provides the ability to create a root disk and an optional "ephemeral"
volume. The root disk will always be present unless the instance is a
:term:`Boot From Volume` instance.
The root disk is associated with an instance and exists only for the life of
that instance. Generally, it is used to store an instance's root file
@ -156,7 +158,7 @@ system, persists across the guest operating system reboots, and is removed on
an instance deletion. The size of the root disk is defined by the flavor of an
instance.
In addition to the root volume, flavors can provide an additional
ephemeral block device. It is represented as a raw block device with no
partition table or file system. A cloud-aware operating system can discover,
format, and mount such a storage device. Nova defines the default file system
@ -171,17 +173,17 @@ is possible to configure other filesystem types.
mounts it on ``/mnt``. This is a cloud-init feature, and is not an OpenStack
mechanism. OpenStack only provisions the raw storage.
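For example, a flavor that provides a 10 GB ephemeral disk alongside a 20 GB
root disk could be created as follows (the flavor name and sizes are arbitrary
examples):

.. code-block:: console

   $ openstack flavor create --vcpus 2 --ram 4096 --disk 20 --ephemeral 10 \
       m1.ephemeral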
.. rubric:: Cinder-provisioned block storage
The OpenStack Block Storage service, Cinder, provides persistent volumes that
are represented by a persistent virtualized block device independent of any
particular instance.
Persistent volumes can be accessed by a single instance or attached to multiple
instances. Sharing a volume between multiple instances requires a traditional
network file system like NFS or CIFS, or a cluster file
system such as Ceph. These systems can be built within an OpenStack
cluster, or provisioned outside of it, but OpenStack software does not provide
these features.
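For example, a Cinder volume can be created and attached to an instance with
the standard client (the names are placeholders):

.. code-block:: console

   $ openstack volume create --size 10 my-data-volume
   $ openstack server add volume my-instance my-data-volume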
@ -194,14 +196,6 @@ if the instance is shut down. For more information about this type of
configuration, see :cinder-doc:`Introduction to the Block Storage service
<configuration/block-storage/block-storage-overview.html>`.
Building blocks
---------------
@ -245,7 +239,7 @@ The displayed image attributes are:
Virtual hardware templates are called ``flavors``. By default, these are
configurable by admin users, however, that behavior can be changed by redefining
the access controls in ``policy.yaml`` on the ``nova-api`` server. For more
information, refer to :doc:`/configuration/policy`.
For a list of flavors that are available on your system:
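.. code-block:: console

   $ openstack flavor list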

View File

@ -9,15 +9,22 @@ Availability Zones
zones, refer to the :doc:`user guide </user/availability-zones>`.
Availability Zones are an end-user visible logical abstraction for partitioning
a cloud without knowing the physical infrastructure. They can be used to
partition a cloud on arbitrary factors, such as location (country, datacenter,
rack), network layout and/or power source.
.. note::

   Availability Zones should not be assumed to map to fault domains and
   provide no intrinsic HA benefit by themselves.
Availability zones are not modeled in the database; rather, they are defined by
attaching specific metadata information to an
:doc:`aggregate </admin/aggregates>`. The addition of this specific metadata to
an aggregate makes the aggregate visible from an end-user perspective and
consequently allows users to schedule instances to a specific set of hosts, the
ones belonging to the aggregate. There are a few additional differences to note
when comparing availability zones and host aggregates:
- A host can be part of multiple aggregates but it can only be in one
availability zone.
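For example, attaching the relevant metadata to an aggregate is exposed by the
client's ``--zone`` option; the aggregate, zone, and host names below are
placeholders:

.. code-block:: console

   $ openstack aggregate create --zone az-dc1 dc1-rack1
   $ openstack aggregate add host dc1-rack1 compute01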

View File

@ -26,9 +26,13 @@ Laski gave at the Austin (Newton) summit which may be worth watching.
Overview
--------
The purpose of the cells functionality in nova is to allow larger deployments
to shard their many compute nodes into cells. All nova deployments are by
definition cells deployments, even if most will only ever have a single cell.
This means a multi-cell deployment will not be radically different from a
"standard" nova deployment.

Consider such a deployment. It will consist of the following components:
- The :program:`nova-api` service which provides the external REST API to
users.
@ -43,7 +47,7 @@ A basic Nova system consists of the following components:
instances being built but not yet scheduled.
- The :program:`nova-conductor` service which offloads long-running tasks for
the API-level services and insulates compute nodes from direct database access.
- The :program:`nova-compute` service which manages the virt driver and
hypervisor host.
@ -60,15 +64,19 @@ A basic Nova system consists of the following components:
- A message queue which allows the services to communicate with each
other via RPC.
In smaller deployments, there will typically be a single message queue that all
services share and a single database server which hosts the API database, a
single cell database, as well as the required cell0 database. Because we only
have one "real" cell, we consider this a "single-cell deployment".
In larger deployments, we can opt to shard the deployment using multiple cells.
In this configuration there will still only be one global API database but
there will be a cell database (where the bulk of the instance information
lives) for each cell, each containing a portion of the instances for the entire
deployment, as well as per-cell message queues and per-cell
:program:`nova-conductor` instances. There will also be an additional
:program:`nova-conductor` instance, known as a *super conductor*, to handle
API-level operations.
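As a sketch of the per-cell configuration this implies (host names, credentials
and database names are placeholders), the services in a cell named ``cell1``
would point at that cell's own message queue and database:

.. code-block:: ini

   [DEFAULT]
   # cell1's own message queue
   transport_url = rabbit://nova:password@cell1-rabbit:5672/

   [database]
   # cell1's own cell database
   connection = mysql+pymysql://nova:password@cell1-db/nova_cell1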
In these larger deployments, each of the nova services will use a cell-specific
configuration file, all of which will at a minimum specify a message queue
@ -98,6 +106,9 @@ other cells in the API database, with records called *cell mappings*.
lower-level services. See the ``nova-manage`` :ref:`man-page-cells-v2`
commands for more information about how to create and examine these records.
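For example, cell mappings can be examined and created as follows (the
transport URL and database connection below are placeholders):

.. code-block:: console

   $ nova-manage cell_v2 list_cells
   $ nova-manage cell_v2 create_cell --name cell2 \
       --transport-url rabbit://nova:password@cell2-rabbit:5672/ \
       --database_connection mysql+pymysql://nova:password@cell2-db/nova_cell2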
The following section goes into more detail about the difference between
single-cell and multi-cell deployments.
Service layout
--------------
@ -242,70 +253,42 @@ any other API-layer services via RPC, nor do they have access to the
API database for global visibility of resources across the cloud.
This is intentional and provides security and failure domain
isolation benefits, but also has impacts on some things that would
otherwise require this any-to-any communication style. Check :ref:`upcall`
below for the most up-to-date information about any caveats that may be present
due to this limitation.
Database layout
---------------
As mentioned previously, there is a split between global data and data that is
local to a cell. These database schemas are referred to as the *API* and *main*
database schemas, respectively.
API database
~~~~~~~~~~~~
The API database is the database used for API-level services, such as
:program:`nova-api` and, in a multi-cell deployment, the superconductor.
The models and migrations related to this database can be found in
``nova.db.api``, and the database can be managed using the
:program:`nova-manage api_db` commands.
Main (cell-level) database
~~~~~~~~~~~~~~~~~~~~~~~~~~
The main database is the database used for cell-level :program:`nova-conductor`
instances. The models and migrations related to this database can be found in
``nova.db.main``, and the database can be managed using the
:program:`nova-manage db` commands.
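For example, each schema is managed with its own set of commands, run against
a configuration file that points at the relevant database:

.. code-block:: console

   $ nova-manage api_db sync   # populate/upgrade the API database schema
   $ nova-manage db sync       # populate/upgrade a cell's main database schema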
Usage
-----
As noted previously, all deployments are in effect now cells v2 deployments. As
a result, setup of any nova deployment - even those that intend to only have
one cell - will involve some level of cells configuration. These changes are
configuration-related, affecting both the main nova configuration file and some
extra records in the databases.
@ -345,11 +328,11 @@ Configuring a new deployment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you are installing Nova for the first time and have no compute hosts in the
database yet then it will be necessary to configure cell0 and at least one
additional "real" cell. To begin, ensure your API database schema has been
populated using the :program:`nova-manage api_db sync` command. Ensure the
connection information for this database is stored in the ``nova.conf`` file
using the :oslo.config:option:`api_database.connection` config option:
.. code-block:: ini
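
   # Hypothetical example values; substitute your own API database host and
   # credentials.
   [api_database]
   connection = mysql+pymysql://nova:password@db-host/nova_api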
@ -557,7 +540,6 @@ existing instances to the new cell(s). For example:
have been mapped. An exit code of 1 indicates that there are remaining
instances that need to be mapped.
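A sketch of the command being described (the cell UUID is a placeholder):

.. code-block:: console

   $ nova-manage cell_v2 map_instances --cell_uuid <cell_uuid> --max-count 100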
Template URLs in Cell Mappings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -1152,7 +1134,7 @@ real-world users of the feature.
- `Rocky Summit Video - Moving from CellsV1 to CellsV2 at CERN`__
- `Stein Summit Video - Scaling Nova with CellsV2: The Nova Developer and the
CERN Operator perspective`__
- `Train Summit Video - What's new in Nova Cellsv2?`__
.. __: https://www.openstack.org/videos/austin-2016/nova-cells-v2-whats-going-on
.. __: https://www.openstack.org/videos/boston-2017/scaling-nova-how-cellsv2-affects-your-deployment

View File

@ -101,9 +101,7 @@ the defaults from the :doc:`install guide </install/index>` will be sufficient.
* :doc:`Availability Zones </admin/availability-zones>`: Availability Zones are
an end-user visible logical abstraction for partitioning a cloud without
knowing the physical infrastructure.
* :placement-doc:`Placement service <>`: Overview of the placement
service, including how it fits in with the rest of nova.

View File

@ -23,6 +23,14 @@ Glossary
has an empty ("") ``image`` parameter in ``GET /servers/{server_id}``
responses.
Cell
A cell is a shard or horizontal partition in a nova deployment.
A cell mostly consists of a database, a message queue, and a set of compute
nodes. All deployments will have at least one cell (and one "fake" cell).
Larger deployments can have many.
For more information, refer to :doc:`/admin/cells`.
Cross-Cell Resize
A resize (or cold migrate) operation where the source and destination
compute hosts are mapped to different cells. By default, resize and