diff --git a/deploy-guide/source/features/backends.rst b/deploy-guide/source/features/backends.rst index 47a25515..4cd7b050 100644 --- a/deploy-guide/source/features/backends.rst +++ b/deploy-guide/source/features/backends.rst @@ -10,9 +10,7 @@ OpenStack projects. deploy_manila cinder_custom_backend cinder_netapp - cephadm deployed_ceph - ceph_config ceph_external domain_specific_ldap_backends swift_external diff --git a/deploy-guide/source/features/ceph_config.rst b/deploy-guide/source/features/ceph_config.rst deleted file mode 100644 index d7dcd520..00000000 --- a/deploy-guide/source/features/ceph_config.rst +++ /dev/null @@ -1,1020 +0,0 @@ -Configuring Ceph with Custom Config Settings (via ceph-ansible or puppet-ceph) -============================================================================== - -This guide assumes that the undercloud is already installed and ready -to deploy an overcloud and that the appropriate repositories -containing Ceph packages, including ceph-ansible if applicable, have -been enabled and installed as described in -:doc:`../deployment/index`. - -.. warning:: TripleO integration with ceph-ansible became deprecated - in Wallaby and support was removed in the next release. - TripleO integration with puppet-ceph became deprecated - in Pike and support was removed in the next release. - For Wallaby and newer use :doc:`deployed_ceph` to have - TripleO deploy Ceph. - -Deploying an Overcloud with Ceph --------------------------------- - -TripleO can deploy and configure Ceph as if it was a composable -OpenStack service and configure OpenStack services like Nova, Glance, -Cinder and Cinder Backup to use it as a storage backend. - -TripleO can only deploy one Ceph cluster in the overcloud per Heat -stack. However, within that Heat stack it's possible to configure -an overcloud to communicate with multiple Ceph clusters which are -external to the overcloud. To do this, follow this document to -configure the "internal" Ceph cluster which is part of the overcloud -and also use the `CephExternalMultiConfig` parameter described in the -:doc:`ceph_external` documentation. - -Prior to Pike, TripleO deployed Ceph with `puppet-ceph`_. With the -Pike release it became possible to use TripleO to deploy Ceph with -`ceph-ansible`_ and puppet-ceph became deprecated. - -TripleO Wallaby can deploy Ceph Pacific with ceph-ansible though -Wallaby is the last release with ceph-ansible integration. -Wallaby is also able to deploy a full Ceph cluster, with RBD, RGW, -MDS, and Dashboard, using cephadm in place of ceph-ansible as -described in :doc:`cephadm`. The preferred way to deploy Ceph -with TripleO, in Wallaby and newer, is before the overcloud as -described in :doc:`deployed_ceph`. - -To deploy with Ceph include either of the appropriate environment -files. For puppet-ceph use "environments/puppet-ceph.yaml" -like the following:: - - openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph.yaml - -For ceph-ansible use "environments/ceph-ansible/ceph-ansible.yaml" -like the following:: - - openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml - -When using ceph-ansible to deploy Ceph in containers, the process -described in the :ref:`prepare-environment-containers` documentation -will configure the deployment to use the appropriate Ceph docker -image. However, it is also possible to override the default docker -image. 
For example:: - - parameter_defaults: - DockerCephDaemonImage: ceph/daemon:tag-stable-3.0-jewel-centos-7 - -In both the puppet-ceph and ceph-ansible examples above, at least one -Ceph storage node is required. The following example will configure -one Ceph storage nodes on servers matching the `ceph-storage` -profile. It will also set the default pool size, the number of times -that an object should be written for data protection, to one. These -`parameter_defaults` may be saved in an environment file -"~/my-ceph-settings.yaml" and added to the deploy commandline:: - - parameter_defaults: - OvercloudCephStorageFlavor: ceph-storage - CephStorageCount: 1 - CephDefaultPoolSize: 1 - -The values above are only appropriate for a development or POC -deployment. The default pool size is three but if there are less -than three Ceph OSDs, then the cluster will never reach status -`HEALTH_OK` because it has no place to make additional copies. -Thus, a POC deployment with less than three OSDs should override the -default default pool size. However, a production deployment should -replace both of the ones above with threes, or greater, in order to -have at least three storage nodes and at least three back up copies of -each object at minimum. - -Configuring nova-compute ephemeral backend per role ---------------------------------------------------- - -NovaEnableRdbBackend can be configured on a per-role basis allowing compute -hosts to be deployed with a subset using RBD ephemeral disk and a subset using -local ephemeral disk. - -.. note:: - - For best performance images to be deployed to RBD ephemeral computes should be in RAW format while images to be deployed to local ephemeral computes should be QCOW2 format. - -Generate roles_data including the provided ComputeLocalEphemeral and -ComputeRBDEphemeral roles as described in the :ref:`custom_roles` -documentation. - -Configure the role counts, for example "nodes.yaml":: - - parameter_defaults: - ComputeLocalEphemeralCount: 10 - ComputeRBDEphemeralCount: 10 - -Alternatively the "NovaEnableRbdBackend" parameter can be set as a role -parameter on any Compute role, for example:: - - parameter_defaults: - ComputeParameters: - NovaEnableRbdBackend: true - MyCustomComputeParameters: - NovaEnableRbdBackend: false - -If the top-level NovaEnableRbdBackend parameter is set to true, as it is in -environments/ceph-ansible/ceph-ansible.yaml, then then this will be -the default when not overridden via role parameters. - -Setting NovaEnableRbdBackend to true at the top level also enables the glance -image_conversion import plugin and show_multiple_locations option. -These parameters must be set explicitly when changing the top-level -NovaEnableRbdBackend to false:: - - parameter_defaults: - NovaEnableRbdBackend: false - GlanceShowMultipleLocations: true - GlanceImageImportPlugins: - - image_conversion - -Customizing ceph.conf with puppet-ceph --------------------------------------- - -Ceph demands for more careful configuration when deployed at scale. - -It is possible to override any of the configuration parameters supported by -`puppet-ceph`_ at deployment time via Heat environment files. For example:: - - parameter_defaults: - ExtraConfig: - ceph::profile::params::osd_journal_size: 2048 - -will customize the default `osd_journal_size` overriding any default -provided in the `ceph.yaml static hieradata`_. - -It is also possible to provide arbitrary stanza/key/value lines for `ceph.conf` -using the special `ceph::conf` configuration class. 
For example by using:: - - parameter_defaults: - ExtraConfig: - ceph::conf::args: - global/max_open_files: - value: 131072 - global/my_setting: - value: my_value - -the resulting `ceph.conf` file should be populated with the following:: - - [global] - max_open_files: 131072 - my_setting: my_value - -To specify a set of dedicated block devices to use as Ceph OSDs use -the following:: - - parameter_defaults: - ExtraConfig: - ceph::profile::params::osds: - '/dev/sdb': - journal: '/dev/sde' - '/dev/sdc': - journal: '/dev/sde' - '/dev/sdd': - journal: '/dev/sde' - -The above will produce three OSDs which run on `/dev/sdb`, `/dev/sdc`, -and `/dev/sdd` which all journal to `/dev/sde`. This same setup will -be duplicated per Ceph storage node and assumes uniform hardware. If -you do not have uniform hardware see :doc:`node_specific_hieradata`. - -The `parameter_defaults` like the above may be saved in an environment -file "~/my-ceph-settings.yaml" and added to the deploy commandline:: - - openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph.yaml -e ~/my-ceph-settings.yaml - -Customizing ceph.conf with ceph-ansible ---------------------------------------- - -The playbooks provided by `ceph-ansible` are triggered by a Mistral -workflow. A new `CephAnsibleExtraConfig` parameter has been added to -the templates and can be used to provide arbitrary config variables -consumed by `ceph-ansible`. The pre-existing template params consumed -by the TripleO Pike release to drive `puppet-ceph` continue to work -and are translated, when possible, into their equivalent -`ceph-ansible` variable. - -For example, to encrypt the data stored on OSDs use the following:: - - parameter_defaults: - CephAnsibleExtraConfig: - dmcrypt: true - -The above example may be used to change any of the defaults found in -`ceph-ansible/group_vars`_. - -If a parameter to override is not an available group variable, then -`ceph.conf` sections settings may be set directly using -`CephConfigOverrides` like the following:: - - parameter_defaults: - CephConfigOverrides: - global: - max_open_files: 131072 - osd: - osd_journal_size: 40960 - -To change the backfill and recovery operations that Ceph uses to -rebalance a cluster, use an example like the following:: - - parameter_defaults: - CephConfigOverrides: - global: - osd_recovery_op_priority: 3 - osd_recovery_max_active: 3 - osd_max_backfills: 1 - -Configuring CephX Keys ----------------------- - -TripleO will create a Ceph cluster with a CephX key file for OpenStack -RBD client connections that is shared by the Nova, Cinder, and Glance -services to read and write to their pools. Not only will the -keyfile be created but the Ceph cluster will be configured to accept -connections when the key file is used. The file will be named -`ceph.client.openstack.keyring` and it will be stored in `/etc/ceph` -within the containers, but on the container host it will be stored in -a location defined by a TripleO exposed parameter which defaults to -`/var/lib/tripleo-config/ceph`. - -.. admonition:: Wallaby and newer versions - - Prior to Wallaby the `CephConfigPath` option didn't exist and the - configuration files (keyfiles and ceph.conf) were always stored - in /etc/ceph. - Wallaby introduces a new tripleo-ansible role which is responsible - to create the keyrings and the ceph configuration file and, later - in the process, configure the clients by copying the rendered files. 
- The containers will find the Ceph related files inside /etc/ceph, - however, TripleO exposes the new parameter that can be used to - specify the location where the tripleo-ansible Ceph client role is - supposed to render the keyfiles and the ceph.conf file. - -The keyring file is created using the following defaults: - -* CephClusterName: 'ceph' -* CephClientUserName: 'openstack' -* CephClientKey: This value is randomly generated per Heat stack. If - it is overridden the recommendation is to set it to the output of - `ceph-authtool --gen-print-key`. - -If the above values are overridden, the keyring file will have a -different name and different content. E.g. if `CephClusterName` was -set to 'foo' and `CephClientUserName` was set to 'bar', then the -keyring file would be called `foo.client.bar.keyring` and it would -contain the line `[client.bar]`. - -The `CephExtraKeys` parameter may be used to generate additional key -files containing other key values and should contain a list of maps -where each map describes an additional key. The syntax of each -map must conform to what the `ceph-ansible/library/ceph_key.py` -Ansible module accepts. The `CephExtraKeys` parameter should be used -like this:: - - CephExtraKeys: - - name: "client.glance" - caps: - mgr: "allow *" - mon: "profile rbd" - osd: "profile rbd pool=images" - key: "AQBRgQ9eAAAAABAAv84zEilJYZPNuJ0Iwn9Ndg==" - mode: "0600" - -If the above is used, in addition to the -`ceph.client.openstack.keyring` file, an additional file called -`ceph.client.glance.keyring` will be created which contains:: - - [client.glance] - key = AQBRgQ9eAAAAABAAv84zEilJYZPNuJ0Iwn9Ndg== - caps mgr = "allow *" - caps mon = "profile rbd" - caps osd = "profile rbd pool=images" - -The Ceph cluster will also allow the above key file to be used to -connect to the images pool. Ceph RBD clients which are external to the -overcloud could then use this CephX key to connect to the images -pool used by Glance. The default Glance deployment defined in the Heat -stack will continue to use the `ceph.client.openstack.keyring` file -unless that Glance configuration itself is overridden. - -Tuning Ceph OSD CPU and Memory ------------------------------- - -The group variable `ceph_osd_docker_cpu_limit`, which corresponds to -``docker run ... --cpu-quota``, may be overridden depending on the -hardware configuration and the system needs. Below is an example of -setting custom values for this parameter:: - - parameter_defaults: - CephAnsibleExtraConfig: - ceph_osd_docker_cpu_limit: 1 - -.. warning:: Overriding the `ceph_osd_docker_memory_limit` variable - is not recommended. Use of ceph-ansible 3.2 or newer is - recommended as it will automatically tune this variable - based on hardware. - -.. admonition:: ceph-ansible 3.2 and newer - :class: ceph - - As of ceph-ansible 3.2, the `ceph_osd_docker_memory_limit` is set - by default to the max memory of the host in order to ensure Ceph - does not run out of resources. While it is technically possible to - override the bluestore `osd_memory_target` by setting it inside of - the `CephConfigOverrides` directive, it is better to let - ceph-ansible automatically tune this variable. Such tuning is - also influenced by the boolean `is_hci` flag. 
When collocating - Ceph OSD services on the same nodes which run Nova compute - services (also known as "hyperconverged deployments"), set - this variable as in the example below:: - - parameter_defaults: - CephAnsibleExtraConfig: - is_hci: true - - When using filestore in hyperconverged deployments, include the - "environments/tuned-ceph-filestore-hci.yaml" enviornment file to - set a :doc:`tuned profile ` designed for Ceph filestore. - Do not use this tuned profile with bluestore. - -.. admonition:: ceph-ansible 4.0 and newer - :class: ceph - - Stein's default Ceph was Nautilus, which introduced the Messenger v2 protocol. - ceph-ansible 4.0 and newer added a parameter in order to: - - * enable or disable the v1 protocol - * define the port used to bind the process - - Ceph Nautilus enables both v1 and v2 protocols by default and v1 is maintained - for backward compatibility. - To disable v1 protocol, set the variables as in the example below:: - - parameter_defaults: - CephAnsibleExtraConfig: - mon_host_v1: - enabled: False - - -Configure OSD settings with ceph-ansible ----------------------------------------- - -To specify which block devices will be used as Ceph OSDs, use a -variation of the following:: - - parameter_defaults: - CephAnsibleDisksConfig: - devices: - - /dev/sdb - - /dev/sdc - - /dev/sdd - - /dev/nvme0n1 - osd_scenario: lvm - osd_objectstore: bluestore - -Because `/dev/nvme0n1` is in a higher performing device class, e.g. -it is an SSD and the other devices are spinning HDDs, the above will -produce three OSDs which run on `/dev/sdb`, `/dev/sdc`, and -`/dev/sdd` and they will use `/dev/nvme0n1` as a bluestore WAL device. -The `ceph-volume` tool does this by using `the "batch" subcommand`_. -This same setup will be duplicated per Ceph storage node and assumes -uniform hardware. If you do not have uniform hardware see -:doc:`node_specific_hieradata`. If the bluestore WAL data will reside -on the same disks as the OSDs, then the above could be changed to the -following:: - - parameter_defaults: - CephAnsibleDisksConfig: - devices: - - /dev/sdb - - /dev/sdc - - /dev/sdd - osd_scenario: lvm - osd_objectstore: bluestore - -The example above configures the devices list using the disk -name, e.g. `/dev/sdb`, based on the `sd` driver. This method of -referring to block devices is not guaranteed to be consistent on -reboots so a disk normally identified by `/dev/sdc` may be named -`/dev/sdb` later. Another way to refer to block devices is `by-path` -which is persistent accross reboots. The `by-path` names for your -disks are in the Ironic introspection data. A utility exists to -generate a Heat environment file from Ironic introspection data -with a devices list for each of the Ceph nodes in a deployment -automatically as described in :doc:`node_specific_hieradata`. - -.. warning:: `osd_scenario: lvm` is used above to default new - deployments to bluestore as configured, by `ceph-volume`, - and is only available with ceph-ansible 3.2, or newer, - and with Luminous, or newer. The parameters to support - filestore with ceph-ansible 3.2 are backwards-compatible - so existing filestore deployments should not simply have - their `osd_objectstore` or `osd_scenario` parameters - changed without taking steps to maintain both backends. - -.. admonition:: Filestore or ceph-ansible 3.1 (or older) - :class: ceph - - Ceph Luminous supports both filestore and bluestore, but bluestore - deployments require ceph-ansible 3.2, or newer, and `ceph-volume`. 
- For older versions, if the `osd_scenario` is either `collocated` or - `non-collocated`, then ceph-ansible will use the `ceph-disk` tool, - in place of `ceph-volume`, to configure Ceph's filestore backend - in place of bluestore. A variation of the above example which uses - filestore and `ceph-disk` is the following:: - - parameter_defaults: - CephAnsibleDisksConfig: - devices: - - /dev/sdb - - /dev/sdc - - /dev/sdd - dedicated_devices: - - /dev/nvme0n1 - - /dev/nvme0n1 - - /dev/nvme0n1 - osd_scenario: non-collocated - osd_objectstore: filestore - - The above will produce three OSDs which run on `/dev/sdb`, - `/dev/sdc`, and `/dev/sdd`, and which all journal to three - partitions which will be created on `/dev/nvme0n1`. If the - journals will reside on the same disks as the OSDs, then - the above should be changed to the following:: - - parameter_defaults: - CephAnsibleDisksConfig: - devices: - - /dev/sdb - - /dev/sdc - - /dev/sdd - osd_scenario: collocated - osd_objectstore: filestore - - It is unsupported to use `osd_scenario: collocated` or - `osd_scenario: non-collocated` with `osd_objectstore: bluestore`. - -Maintaining both Bluestore and Filestore Ceph Backends ------------------------------------------------------- - -For existing Ceph deployments, it is possible to scale new Ceph -storage nodes which use bluestore while keeping the existing Ceph -storage nodes using filestore. - -In order to support both filestore and bluestore in a deployment, -the nodes which use filestore must continue to use the filestore -parameters like the following:: - - parameter_defaults: - CephAnsibleDisksConfig: - devices: - - /dev/sdb - - /dev/sdc - dedicated_devices: - - /dev/nvme0n1 - - /dev/nvme0n1 - osd_scenario: non-collocated - osd_objectstore: filestore - -While the nodes which will use bluestore, all of the new nodes, must -use bluestore parameters like the following:: - - parameter_defaults: - CephAnsibleDisksConfig: - devices: - - /dev/sdb - - /dev/sdc - - /dev/nvme0n1 - osd_scenario: lvm - osd_objectstore: bluestore - -To resolve this difference, use :doc:`node_specific_hieradata` to -map the filestore node's machine unique UUID to the filestore -parameters, so that only those nodes are passed the filestore -parmaters, and then set the default Ceph parameters, e.g. those -found in `~/my-ceph-settings.yaml`, to the bluestore parameters. - -An example of what the `~/my-node-settings.yaml` file, as described in -:doc:`node_specific_hieradata`, might look like for two nodes which -will keep using filestore is the following:: - - parameter_defaults: - NodeDataLookup: - 00000000-0000-0000-0000-0CC47A6EFDCC: - devices: - - /dev/sdb - - /dev/sdc - dedicated_devices: - - /dev/nvme0n1 - - /dev/nvme0n1 - osd_scenario: non-collocated - osd_objectstore: filestore - 00000000-0000-0000-0000-0CC47A6F13FF: - devices: - - /dev/sdb - - /dev/sdc - dedicated_devices: - - /dev/nvme0n1 - - /dev/nvme0n1 - osd_scenario: non-collocated - osd_objectstore: filestore - -Be sure to set every existing Ceph filestore server to the filestore -parameters by its machine unique UUID. If the above is not done and -the default parameter is set to `osd_scenario=lvm` for the existing -nodes which were configured with `ceph-disk`, then these OSDs will not -start after a restart of the systemd unit or a system reboot. - -The example above, makes bluestore the new default and filestore an -exception per node. 
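The machine unique UUIDs used as `NodeDataLookup` keys come from the hardware itself. As a rough sketch (the authoritative procedure is in :doc:`node_specific_hieradata`, and the node name `ceph-0` below is only an assumption), the UUID can usually be read either on the node or from its Ironic introspection data::

    # on the overcloud node itself
    sudo dmidecode -s system-uuid

    # on the undercloud, assuming introspection data was collected for a
    # node registered as ceph-0
    openstack baremetal introspection data save ceph-0 | jq .extra.system.product.uuid
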
An alternative approach is to keep the default of -filestore and `ceph-disk` and use :doc:`node_specific_hieradata` for -adding new nodes which use bluestore and `ceph-volume`. A benefit of -this is that there wouldn't be any configuration change for existing -nodes. However, every scale operation with Ceph nodes would require -the use of :doc:`node_specific_hieradata`. While the example above, -of making filestore and `ceph-disk` the per-node exception, requires -more work up front, it simplifies future scale up when completed. If -the cluster will be migrated to all bluestore, through node scale down -and scale up, then the amount of items in `~/my-node-settings.yaml` -could be reduced for each scale down and scale up operation until the -full cluster uses bluestore. - -Customize Ceph Placement Groups per OpenStack Pool --------------------------------------------------- - -The number of OSDs in a Ceph deployment should proportionally affect -the number of Ceph PGs per Pool as determined by Ceph's -`pgcalc`_. When the appropriate default pool size and PG number are -determined, the defaults should be overridden using an example like -the following:: - - parameter_defaults: - CephPoolDefaultSize: 3 - CephPoolDefaultPgNum: 128 - -In addition to setting the default PG number for each pool created, -each Ceph pool created for OpenStack can have its own PG number. -TripleO supports customization of these values by using a syntax like -the following:: - - parameter_defaults: - CephPools: - - {"name": backups, "pg_num": 512, "pgp_num": 512, "application": rbd} - - {"name": volumes, "pg_num": 1024, "pgp_num": 1024, "application": rbd, "rule_name": 'replicated_rule', "erasure_profile": '', "expected_num_objects": 6000} - - {"name": vms, "pg_num": 512, "pgp_num": 512, "application": rbd} - - {"name": images, "pg_num": 128, "pgp_num": 128, "application": rbd} - -In the above example, PG numbers for each pool differ based on the -OpenStack use case from `pgcalc`_. The example above also passes -additional options as described in the `ceph osd pool create`_ -documentation to the volumes pool used by Cinder. A TripleO validation -(described in `Validating Ceph Configuration`_) may be used to verify -that the PG numbers satisfy Ceph's PG overdose protection check before -the deployment starts. - -Customizing crushmap using device classes ------------------------------------------ - -Since Luminous, Ceph introduces a new `device classes` feature with the -purpose of automating one of the most common reasons crushmaps are -directly edited. -Device classes are a new property for OSDs visible by running `ceph osd -tree` and observing the class column, which should default correctly to -each device's hardware capability (hdd, ssd or nvme). -This feature is useful because Ceph CRUSH rules can restrict placement -to a specific device class. For example, they make it easy to create a -"fast" pool that distributes data only over SSDs. -To do this, one simply needs to specify in the pool definition which -device class should be used. -This is simpler than directly editing the CRUSH map itself. -There is no need for the operator to specify the device class for each -disk added into the cluster: with this new functionality, ceph is able -to autodetect the disk type (exposed by Linux kernel), placing it in -the right category. 
-For this reason the old way of specifying which block devices will be -used as Ceph OSDs is still valid:: - - CephAnsibleDisksConfig: - devices: - - /dev/sdb - - /dev/sdc - - /dev/sdd - osd_scenario: lvm - osd_objectstore: bluestore - -However, if the operator would like to force a specific device to -belong to a specific class, the `crush_device_class` property is -provided and the device list defined above can be changed into:: - - CephAnsibleDisksConfig: - lvm_volumes: - - data: '/dev/sdb' - crush_device_class: 'hdd' - - data: '/dev/sdc' - crush_device_class: 'sdd' - - data: '/dev/sdd' - crush_device_class: 'hdd' - osd_scenario: lvm - osd_objectstore: bluestore - -.. note:: - - crush_device_class property is optional and can be omitted. Ceph is - able to `autodect` the type of disk, so this option can be used for - advanced users or to fake/force the disk type. - -After the device list is defined, the next step is to set some additional -parameters to properly generate the ceph-ansible variables; in TripleO -there are no explicitly exposed parameters to integrate this feature, -however, the ceph-ansible expected parameters can be generated through -`CephAnsibleExtraConfig`:: - - CephAnsibleExtraConfig: - crush_rule_config: true - create_crush_tree: true - crush_rules: - - name: HDD - root: default - type: host - class: hdd - default: true - - name: SSD - root: default - type: host - class: ssd - default: false - -As seen in the example above, in order to properly generate the -crushmap hierarchy used by device classes, the `crush_rule_config` and -`create_crush_tree` booleans should be enabled. These booleans will -trigger the ceph-ansible playbook related to the crushmap customization, -and the rules associated to the device classes will be generated -according to the `crush_rules` array. This allows the ceph cluster to -build a shadow hierarchy which reflects the specified rules. -Finally, as described in the customize placement group section, TripleO -supports the customization of pools; in order to tie a specific pool to -a device class, the `rule_name` option should be added as follows:: - - CephPools: - - name: fastpool - pg_num: 8 - rule_name: SSD - application: rbd - -By adding this rule, we can make sure `fastpool` will follow the SSD -rule which is defined for the ssd device class and it can be configured -and used as a second (fast) tier to manage cinder volumes. - -Customizing crushmap using node specific overrides --------------------------------------------------- - -With device classes the ceph cluster can expose different storage -tiers with no need to manually edit the crushmap. -However, if device classes are not sufficient, the creation of a -specific crush hierarchy (e.g., host, rack, row, etc.), adding or -removing extra layers (e.g., racks) on the crushmap is still valid -and can be done via :doc:`node_specific_hieradata`. -NodeDataLookup playbook is able to generate node spec overrides using -the following syntax:: - - NodeDataLookup: {"SYSTEM_UUID": {"osd_crush_location": {"root": "$MY_ROOT", "rack": "$MY_RACK", "host": "$OVERCLOUD_NODE_HOSTNAME"}}} - -Generate NodeDataLookup manually can be error-prone. For this reason -TripleO provides the `make_ceph_disk`_ utility to build a JSON file -to get started, then it can be modified adding the `osd_crush_location` -properties dictionary with the syntax described above. - -Override Ansible run options ----------------------------- - -TripleO runs the ceph-ansible `site-docker.yml.sample` playbook by -default. 
The values in this playbook should be overridden as described -in this document and the playbooks themselves should not be modified. -However, it is possible to specify which playbook is run using the -following parameter:: - - parameter_defaults: - CephAnsiblePlaybook: /usr/share/ceph-ansible/site-docker.yml.sample - -For each TripleO Ceph deployment, the above playbook's output is logged -to `/var/log/mistral/ceph-install-workflow.log`. The default verbosity -of the playbook run is 0. The example below sets the verbosity to 3:: - - parameter_defaults: - CephAnsiblePlaybookVerbosity: 3 - -During the playbook run temporary files, like the Ansible inventory -and the ceph-ansible parameters that are passed as overrides as -described in this document, are stored on the undercloud in a -directory that matches the pattern `/tmp/ansible-mistral-action*`. -This directory is deleted at the end of each Mistral workflow which -triggers the playbook run. However, the temporary files are not -deleted when the verbosity is greater than 0. This option is helpful -when debugging. - -The Ansible environment variables may be overridden using an example -like the following:: - - parameter_defaults: - CephAnsibleEnvironmentVariables: - ANSIBLE_SSH_RETRIES: '6' - DEFAULT_FORKS: '25' - -In the above example, the number of SSH retries is increased from the -default to prevent timeouts. Ansible's fork number is automatically -limited to the number of possible hosts at runtime. TripleO uses -ceph-ansible to configure Ceph clients in addition to Ceph servers so -when deploying a large number of compute nodes ceph-ansible may -consume a lot of memory on the undercloud. Lowering the fork count -will reduce the memory footprint while the Ansible playbook is running -at the expense of the number of hosts configured in parallel. - -Applying ceph-ansible customizations to a overcloud deployment --------------------------------------------------------------- - -The desired options from the ceph-ansible examples above to customize -the ceph.conf, container, OSD or Ansible options may be combined under -one `parameter_defaults` setting and saved in an environment file -"~/my-ceph-settings.yaml" and added to the deploy commandline:: - - openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml -e ~/my-ceph-settings.yaml - -Already Deployed Servers and ceph-ansible ------------------------------------------ - -When using ceph-ansible and :doc:`deployed_server`, it is necessary -to run commands like the following from the undercloud before -deployment:: - - export OVERCLOUD_HOSTS="192.168.1.8 192.168.1.42" - bash /usr/share/openstack-tripleo-heat-templates/deployed-server/scripts/enable-ssh-admin.sh - -In the example above, the OVERCLOUD_HOSTS variable should be set to -the IPs of the overcloud hosts which will be Ceph servers or which -will host Ceph clients (e.g. Nova, Cinder, Glance Manila, etc.). The -`enable-ssh-admin.sh` script configures a user on the overcloud nodes -that Ansible uses to configure Ceph. - -.. note:: - - Both puppet-ceph and ceph-ansible do not reformat the OSD disks and - expect them to be clean to complete successfully. Consequently, when reusing - the same nodes (or disks) for new deployments, it is necessary to clean the - disks before every new attempt. One option is to enable the automated - cleanup functionality in Ironic, which will zap the disks every time that a - node is released. 
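For example, a minimal sketch of enabling that automated cleaning, assuming the nodes are managed by the undercloud's Ironic, is to set the following in `undercloud.conf` and re-run `openstack undercloud install`::

    [DEFAULT]
    clean_nodes = true
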
The same process can be executed manually or only for some - target nodes, see `cleaning instructions in the Ironic documentation`_. - -.. note:: - - The :doc:`extra_config` doc has a more details on the usage of the different - ExtraConfig interfaces. - -.. note:: - - Deployment with `ceph-ansible` requires that OSDs run on dedicated - block devices. - -.. note:: - - If the overcloud is named differently than the default ("overcloud"), - then you'll have to set the OVERCLOUD_PLAN variable as well - - -Adding Ceph Dashboard to a Overcloud deployment ------------------------------------------------- - -Starting from Ceph Nautilus the ceph dashboard component is available and -fully automated by TripleO. -To deploy the ceph dashboard include the ceph-dashboard.yaml environment -file as in the following example:: - - openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-dashboard.yaml - -The command above will include the ceph dashboard related services and -generates all the `ceph-ansible` required variables to trigger the playbook -execution for both deployment and configuration of this component. -When the deployment has been completed the Ceph dashboard containers, -including prometheus and grafana, will be running on the controller nodes -and will be accessible using the port 3100 for grafana and 9092 for prometheus; -since this service is only internal and doesn’t listen on the public vip, users -can reach both grafana and the exposed ceph dashboard using the controller -provisioning network vip on the specified port (8444 is the default for a generic -overcloud deployment). -The resulting deployment will be composed by an external stack made by grafana, -prometheus, alertmanager, node-exporter containers and the ceph dashboard mgr -module that acts as the backend for this external stack, embedding the grafana -layouts and showing the ceph cluster specific metrics coming from prometheus. -The Ceph Dashboard frontend is fully integrated with the tls-everywhere framework, -hence providing the tls environments files will trigger the certificate request for -both grafana and the ceph dashboard: the generated crt and key files are then passed -to ceph-ansible. -The Ceph Dashboard admin user role is set to `read-only` mode by default for safe -monitoring of the Ceph cluster. To permit an admin user to have elevated privileges -to alter elements of the Ceph cluster with the Dashboard, the operator can change the -default. -For this purpose, TripleO exposes a parameter that can be used to change the Ceph -Dashboard admin default mode. -Log in to the undercloud as `stack` user and create the `ceph_dashboard_admin.yaml` -environment file with the following content:: - - parameter_defaults: - CephDashboardAdminRO: false - -Run the overcloud deploy command to update the existing stack and include the environment -file created with all other environment files that are already part of the existing -deployment:: - - openstack overcloud deploy --templates -e -e ceph_dashboard_admin.yml - -The ceph dashboard will also work with composable networks. -In order to isolate the monitoring access for security purposes, operators can -take advantage of composable networks and access the dashboard through a separate -network vip. By doing this, it's not necessary to access the provisioning network -and separate authorization profiles may be implemented. 
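Once such a deployment finishes, a quick sanity check (an informal sketch, not an official validation) is to log in to one of the controller nodes and confirm the monitoring containers are running::

    sudo podman ps --format '{{.Names}}' | grep -E 'grafana|prometheus|alertmanager|node-exporter'
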
-To deploy the overcloud with the ceph dashboard composable network we need first -to generate the controller specific role created for this scenario:: - - openstack overcloud roles generate -o /home/stack/roles_data.yaml ControllerStorageDashboard Compute BlockStorage ObjectStorage CephStorage - -Finally, run the overcloud deploy command including the new generated `roles_data.yaml` -and the `network_data_dashboard.yaml` file that will trigger the generation of this -new network. -The final overcloud command must look like the following:: - - openstack overcloud deploy --templates -r /home/stack/roles_data.yaml -n /usr/share/openstack-tripleo-heat-templates/network_data_dashboard.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml -e ~/my-ceph-settings.yaml - -Using Ansible --limit with ceph-ansible ---------------------------------------- - -When using :doc:`config-download -<../deployment/ansible_config_download>` to configure Ceph, -if Ansible's `--limit` option is used, then it is passed to the -execution of ceph-ansible too. This is the case for Train and newer. - -In the previous section an example was provided where Ceph was -deployed with TripleO. The examples below show how to update the -deployment and pass the `--limit` option. - -If oc0-cephstorage-0 had a disk failure and a factory clean disk was -put in place of the failed disk, then the following could be run so -that the new disk is used to bring up the missing OSD and so that -ceph-ansible is only run on the nodes where it needs to be run. This -is useful to reduce the time it takes to update the deployment:: - - openstack overcloud deploy --templates -r /home/stack/roles_data.yaml -n /usr/share/openstack-tripleo-heat-templates/network_data_dashboard.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml -e ~/my-ceph-settings.yaml --limit oc0-controller-0:oc0-controller-2:oc0-controller-1:oc0-cephstorage-0:undercloud - -If :doc:`config-download <../deployment/ansible_config_download>` has -generated a `ansible-playbook-command.sh` script, then that script may -also be run with the `--limit` option and it will be passed to -ceph-ansible:: - - ./ansible-playbook-command.sh --limit oc0-controller-0:oc0-controller-2:oc0-controller-1:oc0-cephstorage-0:undercloud - -In the above example the controllers are included because the -Ceph Mons need Ansible to change their OSD definitions. Both commands -above would do the same thing. The former would only be needed if -there were Heat environment file updates. After either of the above -has run the -`~/config-download/config-download-latest/ceph-ansible/ceph_ansible_command.sh` -file should contain the `--limit` option. - -.. warning:: You must always include the undercloud in the limit list - or ceph-ansible will not be executed when using - `--limit`. This is necessary because the ceph-ansible - execution happens through the external_deploy_steps_tasks - playbook and that playbook only runs on the undercloud. - -Validating Ceph Configuration ------------------------------ - -The tripleo-validations framework contains validations for Ceph -which may be run before deployment to save time debugging possible -failures. 
- -Create an inventory on the undercloud which refers to itself:: - - echo "undercloud ansible_connection=local" > inventory - -Set Ansible environment variables:: - - BASE="/usr/share/ansible" - export ANSIBLE_RETRY_FILES_ENABLED=false - export ANSIBLE_KEEP_REMOTE_FILES=1 - export ANSIBLE_CALLBACK_PLUGINS="${BASE}/callback_plugins" - export ANSIBLE_ROLES_PATH="${BASE}/roles" - export ANSIBLE_LOOKUP_PLUGINS="${BASE}/lookup_plugins" - export ANSIBLE_LIBRARY="${BASE}/library" - -See what Ceph validations are available:: - - ls $BASE/validation-playbooks | grep ceph - -Run a Ceph validation with command like the following:: - - ansible-playbook -i inventory $BASE/validation-playbooks/ceph-ansible-installed.yaml - -For Stein and newer, it is possible to run validations using the -`openstack tripleo validator run` command with a syntax like the -following:: - - openstack tripleo validator run --validation ceph-ansible-installed - -The `ceph-ansible-installed` validation warns if the `ceph-ansible` -RPM is not installed on the undercloud. This validation is also run -automatically during deployment unless validations are disabled. - -.. admonition:: Ussuri and older - :class: ceph - - For Ussuri and older the base path should be set like this:: - - BASE="/usr/share/openstack-tripleo-validations" - - Also, the validation playbooks will be in $BASE/playbooks/ and not - $BASE/validation-playbooks. E.g. the ceph-pg.yaml playbook covered - in the next section would be run like this:: - - ansible-playbook -i inventory $BASE/playbooks/ceph-pg.yaml -e @ceph.yaml -e num_osds=36 - -Ceph Placement Group Validation -------------------------------- - -Ceph will refuse to take certain actions if they are harmful to the -cluster. E.g. if the placement group numbers are not correct for the -amount of available OSDs, then Ceph will refuse to create pools which -are required for OpenStack. Rather than wait for the deployment to -reach the point where Ceph is going to be configured only to find out -that the deployment failed because the parameters were not correct, -you may run a validation before deployment starts to quickly determine -if Ceph will create your OpenStack pools based on the overrides which -will be passed to the overcloud. - -.. note:: - - Unless there are at least 8 OSDs, the TripleO defaults will - cause the deployment to fail unless you modify the CephPools, - CephPoolDefaultSize, or CephPoolDefaultPgNum parameters. This - validation will help you find the appropriate values. - -To run the `ceph-pg` validation, configure your environment as -described in the previous section but also run the following -command to switch Ansible's `hash_behaviour` from `replace` -(the default) to `merge`. This is done to make Ansible behave -the same way that TripleO Heat Templates behaves when multiple -environment files are passed with the `-e @file.yaml` syntax:: - - export ANSIBLE_HASH_BEHAVIOUR=merge - -Then use a command like the following:: - - ansible-playbook -i inventory $BASE/validation-playbooks/ceph-pg.yaml -e @ceph.yaml -e num_osds=36 - -The `num_osds` parameter is required. This value should be the number -of expected OSDs that will be in the Ceph deployment. It should be -equal to the number of devices and lvm_volumes under -`CephAnsibleDisksConfig` multiplied by the number of nodes running the -`CephOSD` service (e.g. nodes in the CephStorage role, nodes in the -ComputeHCI role, and any custom roles, etc.). 
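As a worked example with uniform hardware, a deployment whose `CephAnsibleDisksConfig` lists 12 devices per node and which has 3 nodes running the `CephOSD` service would pass `num_osds=36`, which matches the value used in the commands above::

    12 devices per node * 3 CephOSD nodes = 36 expected OSDs
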
This value should also -be adjusted to compensate for the number of OSDs used by nodes with -node-specific overrides as covered earlier in this document. - -In the above example, `ceph.yaml` should be the same file passed to -the overcloud deployment, e.g. `opesntack overcloud deploy ... -e -ceph.yaml`, as covered earlier in this document. As many files as -required may be passed using `-e @file.yaml` in order to get the -following parameters passed to the `ceph-pg` validation. - -* CephPoolDefaultSize -* CephPoolDefaultPgNum -* CephPools - -If the above parameters are not passed, then the TripleO defaults will -be used for the parameters above. - -The above example is based only on Ceph pools created for RBD. If Ceph -RGW and/or Manila via NFS Ganesha is also being deployed, then simply -pass the same environment files for enabling these services you would -as if you were running `openstack overcloud deploy`. For example:: - - export THT=/usr/share/openstack-tripleo-heat-templates/ - ansible-playbook -i inventory $BASE/validation-playbooks/ceph-pg.yaml \ - -e @$THT/environments/ceph-ansible/ceph-rgw.yaml \ - -e @$THT/environments/ceph-ansible/ceph-mds.yaml \ - -e @$THT/environments/manila-cephfsganesha-config.yaml \ - -e @ceph.yaml -e num_osds=36 - -In the above example, the validation will simulate the creation of the -pools required for the RBD, RGW and MDS services and the validation -will fail if the placement group numbers are not correct. - -.. _`puppet-ceph`: https://github.com/openstack/puppet-ceph -.. _`ceph-ansible`: https://github.com/ceph/ceph-ansible -.. _`ceph.yaml static hieradata`: https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/hieradata/ceph.yaml -.. _`ceph-ansible/group_vars`: https://github.com/ceph/ceph-ansible/tree/master/group_vars -.. _`the "batch" subcommand`: http://docs.ceph.com/docs/master/ceph-volume/lvm/batch -.. _`pgcalc`: http://ceph.com/pgcalc -.. _`ceph osd pool create`: http://docs.ceph.com/docs/jewel/rados/operations/pools/#create-a-pool -.. _`cleaning instructions in the Ironic documentation`: https://docs.openstack.org/ironic/latest/admin/cleaning.html -.. _`make_ceph_disk`: https://github.com/openstack/tripleo-heat-templates/blob/master/tools/make_ceph_disk_list.py diff --git a/deploy-guide/source/features/ceph_external.rst b/deploy-guide/source/features/ceph_external.rst index fe019198..d369e1fc 100644 --- a/deploy-guide/source/features/ceph_external.rst +++ b/deploy-guide/source/features/ceph_external.rst @@ -94,7 +94,7 @@ Do not use `CephExternalMultiConfig` when configuring an overcloud to use only one external Ceph cluster. Instead follow the example in the previous section. The example in the previous section and the method of deploying an internal Ceph cluster documented in -:doc:`ceph_config` are mutually exclusive per Heat stack. The +:doc:`deployed_ceph` are mutually exclusive per Heat stack. The following scenarios are the only supported ones in which `CephExternalMultiConfig` may be used per Heat stack: @@ -102,7 +102,7 @@ following scenarios are the only supported ones in which section, in addition to multiple external Ceph clusters configured via `CephExternalMultiConfig`. -* One internal Ceph cluster, as described in :doc:`ceph_config` in +* One internal Ceph cluster, as described in :doc:`deployed_ceph` in addition to multiple external ceph clusters configured via `CephExternalMultiConfig`. 
@@ -211,7 +211,7 @@ overcloud to connect to an external ceph cluster: ceph-ansible, then the deployer of that cluster could share that map with the TripleO deployer so that it could be used as a list item of `CephExternalMultiConfig`. Similarly, the `CephExtraKeys` parameter, - described in the :doc:`ceph_config` documentation, has the same + described in the :doc:`deployed_ceph` documentation, has the same syntax. Deploying Manila with an External CephFS Service diff --git a/deploy-guide/source/features/cephadm.rst b/deploy-guide/source/features/cephadm.rst deleted file mode 100644 index 51c21a55..00000000 --- a/deploy-guide/source/features/cephadm.rst +++ /dev/null @@ -1,917 +0,0 @@ -Deploying Ceph with cephadm -=========================== - -TripleO can deploy and configure Ceph as if it was a composable -OpenStack service and configure OpenStack services like Nova, Glance, -Cinder, and Cinder Backup to use its RBD interface as a storage -backend as well as configure Ceph's RGW service as the backend for -OpenStack object storage. Both Ceph and OpenStack containers can also -run "hyperconverged" on the same container host. - -This guide assumes that the undercloud is already installed and ready -to deploy an overcloud as described in :doc:`../deployment/index`. - -Limitations ------------ - -TripleO deployments of Ceph with cephadm_ are only supported in Wallaby -or newer. The default version of Ceph deployed by TripleO in Wallaby -is Pacific, regardless of if cephadm or ceph-ansible is used to deploy -it. - -TripleO can only deploy one Ceph cluster in the overcloud per Heat -stack. However, within that Heat stack it's possible to configure -an overcloud to communicate with multiple Ceph clusters which are -external to the overcloud. To do this, follow this document to -configure the "internal" Ceph cluster which is part of the overcloud -and also use the `CephExternalMultiConfig` parameter described in the -:doc:`ceph_external` documentation. - -Prerequisite: Ensure the Ceph container is available ----------------------------------------------------- - -Before deploying Ceph follow the -:ref:`prepare-environment-containers` documentation so -the appropriate Ceph container image is used. -The output of the `openstack tripleo container image prepare` -command should contain a line like the following:: - - ContainerCephDaemonImage: undercloud.ctlplane.mydomain.tld:8787/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 - -Prerequisite: Ensure the cephadm package is installed ------------------------------------------------------ - -The `cephadm` package needs to be installed on at least one node in -the overcloud in order to bootstrap the first node of the Ceph -cluster. - -The `cephadm` package is pre-built into the overcloud-full image. -The `tripleo_cephadm` role will also use Ansible's package module -to ensure it is present. If `tripleo-repos` is passed the `ceph` -argument for Wallaby or newer, then the CentOS SIG Ceph repository -will be enabled with the appropriate version containing the `cephadm` -package, e.g. for Wallaby the ceph-pacific repository is enabled. - -Prerequisite: Ensure Disks are Clean ------------------------------------- - -cephadm does not reformat the OSD disks and expect them to be clean to -complete successfully. Consequently, when reusing the same nodes (or -disks) for new deployments, it is necessary to clean the disks before -every new attempt. 
One option is to enable the automated cleanup -functionality in Ironic, which will zap the disks every time that a -node is released. The same process can be executed manually or only -for some target nodes, see `cleaning instructions in the Ironic documentation`_. - - -Deploying Ceph During Overcloud Deployment ------------------------------------------- - -To deploy an overcloud with a Ceph include the appropriate environment -file as in the example below:: - - openstack overcloud deploy --templates \ - -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml - -If you only wish to deploy Ceph RBD without RGW then use the following -variation of the above:: - - openstack overcloud deploy --templates \ - -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm-rbd-only.yaml - -Do not directly edit the `environments/cephadm/cephadm.yaml` -or `cephadm-rbd-only.yaml` file. If you wish to override the defaults, -as described below in the sections starting with "Overriding", then -place those overrides in a separate `cephadm-overrides.yaml` file and -deploy like this:: - - openstack overcloud deploy --templates \ - -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml \ - -e cephadm-overrides.yaml - -Deploying with the commands above will result in the processes described -in the rest of this document. - -Deploying Ceph Before Overcloud Deployment ------------------------------------------- - -In Wallaby and newer it is possible to provision hardware and deploy -Ceph before deploying the overcloud on the same hardware. This feature -is called "deployed ceph" and it uses the command `openstack overcloud -ceph deploy` which executes the same Ansible roles described -below. For more details see :doc:`deployed_ceph`. - -Overview of Ceph Deployment with TripleO and cephadm ----------------------------------------------------- - -When Ceph is deployed during overcloud configuration or when Ceph is -deployed before overcloud configuration with :doc:`deployed_ceph`, -TripleO will use Ansible automate the process described in the -`cephadm`_ documentation to bootstrap a new cluster. It will -bootstrap a single Ceph monitor and manager on one server -(by default on the first Controller node) and then add the remaining -servers to the cluster by using Ceph orchestrator to apply a `Ceph -Service Specification`_. The Ceph Service Specification is generated -automatically based on TripleO composable roles and most of the -existing THT parameters remain backwards compatible. During stack -updates the same bootstrap process is not executed. - -Details of Ceph Deployment with TripleO and cephadm ---------------------------------------------------- - -After the hardware is provisioned, the user `ceph-admin` is created -on the overcloud nodes. The `ceph-admin` user has one set of public -and private SSH keys created on the undercloud (in -/home/stack/.ssh/ceph-admin-id_rsa.pub and .ssh/ceph-admin-id_rsa) -which is distributed to all overcloud nodes which host the Ceph -Mgr and Mon service; only the public key is distributed to nodes -in the Ceph cluster which do not run the Mgr or Mon service. Unlike -the `tripleo-admin` user, this allows the `ceph-admin` user to SSH -from any overcloud node hosting the Mon or Mgr service to any other -overcloud node hosting the Mon or Mgr service. By default these -services run on the controller nodes so this means by default that -Controllers can SSH to each other but other nodes, e.g. 
CephStorage -nodes, cannot SSH to Controller nodes. `cephadm`_ requires this type -of access in order to scale from more than one Ceph node. - -The deployment definition as described TripleO Heat Templates, -e.g. which servers run which services according to composable -roles, will be converted by the tripleo-ansible `ceph_spec_bootstrap`_ -module into a `Ceph Service Specification`_ file. The module has the -ability to do this based on the Ansible inventory generated by the -`tripleo-ansible-inventory`. When Ceph is deployed *during* overcloud -configuration by including the cephadm.yaml environment file, the -module uses the Ansible inventory to create the `Ceph Service -Specification`_. In this scenario the default location of the -generated Ceph Service Specification file is -`config-download//cephadm/ceph_spec.yaml`. - -The same `ceph_spec_bootstrap`_ module can also generate the Ceph -Service Specification file from a combination of a TripleO roles data -file -(e.g. /usr/share/openstack-tripleo-heat-templates/roles_data.yaml) -and the output of the command -`openstack overcloud node provision --output deployed_metal.yaml`. -When Ceph is deployed *before* overcloud configuration as described in -:doc:`deployed_ceph`, the module uses the deployed_metal.yaml and -roles_data.yaml to create the `Ceph Service Specification`_. - -After the `ceph-admin` user is created, `ceph_spec.yaml` is copied -to the bootstrap host. The bootstrap host will be the first host -in the `ceph_mons` group of the inventory generated by the -`tripleo-ansible-inventory` command. By default this is the first -controller node. - -Ansible will then interact only with the bootstrap host. It will run -the `cephadm` commands necessary to bootstrap a small Ceph cluster on -the bootstrap node and then run `ceph orch apply -i ceph_spec.yaml` -and `cephadm` will use the `ceph-admin` account and SSH keys to add -the other nodes. - -After the full Ceph cluster is running, either as a result of -:doc:`deployed_ceph` or by cephadm being triggered during the -overcloud deployment via the `cephadm.yaml` environment file, the -Ceph pools and the cephx keys to access the pools will be created as -defined or overridden as described in the Heat environment examples -below. The information necessary to configure Ceph clients will then -be extracted to `/home/stack/ceph_client.yml` on the undercloud and -passed to the as input to the tripleo-ansible role tripleo_ceph_client -which will then configure the rest of the overcloud to use the new -Ceph cluster as described in the :doc:`ceph_external` documentation. - -When `openstack overcloud deploy` is re-run in order to update -the stack, the cephadm bootstrap process is not repeated because -that process is only run if `cephadm list` returns an empty -list. Thus, configuration changes to the running Ceph cluster, outside -of scale up as described below, should be made directly with `Ceph -Orchestrator`_. - -Overriding Ceph Configuration Options during deployment -------------------------------------------------------- - -To override the keys and values of the Ceph configuration -database, which has been traditionally stored in the Ceph -configuration file, e.g. `/etc/ceph/ceph.conf`, use the -`CephConfigOverrides` parameter. 
For example, if the -`cephadm-overrides.yaml` file referenced in the example `openstack -overcloud deploy` command in the previous section looked like the -following:: - - parameter_defaults: - CephConfigOverrides: - mon: - mon_warn_on_pool_no_redundancy: false - -Then the Ceph monitors would be configured with the above parameter -and a command like the following could confirm it:: - - [stack@standalone ~]$ sudo cephadm shell -- ceph config dump | grep warn - Inferring fsid 65e8d744-eaec-4ff1-97be-2551d452426d - Inferring config /var/lib/ceph/65e8d744-eaec-4ff1-97be-2551d452426d/mon.standalone.localdomain/config - Using recent ceph image quay.ceph.io/ceph-ci/daemon@sha256:6b3c720e58ae84b502bd929d808ba63a1e9b91f710418be9df3ee566227546c0 - mon advanced mon_warn_on_pool_no_redundancy false - [stack@standalone ~]$ - -In the above example the configuration group is 'mon' for the Ceph -monitor. The supported configuration groups are 'global', 'mon', -'mgr', 'osd', 'mds', and 'client'. If no group is provided, then the -default configuration group is 'global'. - -The above does not apply to :doc:`deployed_ceph`. - -Overriding Server Configuration after deployment ------------------------------------------------- - -To make a Ceph *server* configuration change, after the cluster has -been deployed, use the `ceph config command`_. A '/etc/ceph/ceph.conf' -file is not distributed to all Ceph servers and instead `Ceph's -centralized configuration management`_ is used. - -A single '/etc/ceph/ceph.conf' file may be found on the bootstrap node. -The directives under `CephConfigOverrides` are used to create a config -file, e.g. assimilate_ceph.conf, which is passed to `cephadm bootstrap` -with `--config assimilate_ceph.conf` so that those directives are -applied to the new cluster at bootstrap. The option `--output-config -/etc/ceph/ceph.conf` is also passed to the `cephadm bootstrap` command -and that's what creates the `ceph.conf` on the bootstrap node. The -name of the file is `ceph.conf` because the `CephClusterName` -parameter defaults to "ceph". If `CephClusterName` was set to "foo", -then the file would be called `/etc/ceph/foo.conf`. - -By default the parameters in `CephConfigOverrides` are only applied to -a new Ceph server at bootstrap. They are ignored during stack updates -because `ApplyCephConfigOverridesOnUpdate` defaults to false. When -`ApplyCephConfigOverridesOnUpdate` is set to true, parameters in -`CephConfigOverrides` are put into a file, e.g. assimilate_ceph.conf, -and a command like `ceph config assimilate-conf -i -assimilate_ceph.conf` is run. - -When using :doc:`deployed_ceph` the `openstack overcloud ceph deploy` -command outputs an environment file with -`ApplyCephConfigOverridesOnUpdate` set to true so that services not -covered by deployed ceph, e.g. RGW, can have the configuration changes -that they need applied during overcloud deployment. After the deployed -ceph process has run and then after the overcloud is deployed, it is -recommended to set `ApplyCephConfigOverridesOnUpdate` to false. - - -Overriding Client Configuration after deployment ------------------------------------------------- - -To make a Ceph *client* configuration change, update the parameters in -`CephConfigOverrides` and run a stack update. This will not change the -configuration for the Ceph servers unless -`ApplyCephConfigOverridesOnUpdate` is set to true (as described in the -section above). By default it should only change configurations for -the Ceph clients. 
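A minimal sketch of such a client-only change, using ordinary Ceph RBD cache options purely as an illustration, would be to add the following to an environment file and re-run the same `openstack overcloud deploy` command used for the initial deployment::

    parameter_defaults:
      CephConfigOverrides:
        client:
          rbd_cache: true
          rbd_cache_writethrough_until_flush: true
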
Examples of Ceph clients include Nova compute -containers, Cinder volume containers, Glance image containers, etc. - -The `CephConfigOverrides` directive updates all Ceph client -configuration files on the overcloud in the `CephConfigPath` (which -defaults to /var/lib/tripleo-config/ceph). The `CephConfigPath` is -mounted on the client containers as `/etc/ceph`. The name of the -configuration file is `ceph.conf` because the `CephClusterName` -parameter defaults to "ceph". If `CephClusterName` was set to "foo", -then the file would be called `/etc/ceph/foo.conf`. - - -Overriding the Ceph Service Specification ------------------------------------------ - -All TripleO cephadm deployments rely on a valid `Ceph Service -Specification`_. It is not necessary to provide a service -specification directly as TripleO will generate one dynamically. -However, one may provide their own service specification by disabling -the dynamic spec generation and providing a path to their service -specification as shown in the following:: - - parameter_defaults: - CephDynamicSpec: false - CephSpecPath: /home/stack/cephadm_spec.yaml - -The `CephDynamicSpec` parameter defaults to true. The `CephSpecPath` -defaults to "{{ playbook_dir }}/cephadm/ceph_spec.yaml", where the -value of "{{ playbook_dir }}" is controlled by config-download. -If `CephDynamicSpec` is true and `CephSpecPath` is set to a valid -path, then the spec will be created at that path before it is used to -deploy Ceph. - -The `CephDynamicSpec` and `CephSpecPath` parameters are not available -when using "deployed ceph", but the functionality is available via -the `--ceph-spec` command line option as described in -:doc:`deployed_ceph`. - -Overriding which disks should be OSDs -------------------------------------- - -The `Advanced OSD Service Specifications`_ should be used to define -how disks are used as OSDs. - -By default all available disks (excluding the disk where the operating -system is installed) are used as OSDs. This is because the -`CephOsdSpec` parameter defaults to the following:: - - data_devices: - all: true - -In the above example, the `data_devices` key is valid for any `Ceph -Service Specification`_ whose `service_type` is "osd". Other OSD -service types, as found in the `Advanced OSD Service -Specifications`_, may be set by overriding the `CephOsdSpec` -parameter. In the example below all rotating devices will be data -devices and all non-rotating devices will be used as shared devices -(wal, db) following:: - - parameter_defaults: - CephOsdSpec: - data_devices: - rotational: 1 - db_devices: - rotational: 0 - -When the dynamic Ceph service specification is built (whenever -`CephDynamicSpec` is true) whatever is in the `CephOsdSpec` will -be appended to that section of the specification if the `service_type` -is "osd". - -If `CephDynamicSpec` is false, then the OSD definition can also be -placed directly in the `Ceph Service Specification`_ located at the -path defined by `CephSpecPath` as described in the previous section. - -The :doc:`node_specific_hieradata` feature is not supported by the -cephadm integration but the `Advanced OSD Service Specifications`_ has -a `host_pattern` parameter which specifies which host to target for -certain `data_devices` definitions, so the equivalent functionality is -available but with the new syntax. When using this option consider -setting `CephDynamicSpec` to false and defining a custom specification -which is passed to TripleO by setting the `CephSpecPath`. 
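For example, a minimal sketch of such a custom specification, saved at the
path referenced by `CephSpecPath`, might contain an OSD service entry like
the following (the `service_id` name and the `host_pattern` value are only
illustrative)::

    service_type: osd
    service_id: osd_spinning_hosts
    placement:
      host_pattern: 'ceph-*'
    data_devices:
      rotational: 1
    db_devices:
      rotational: 0

Hosts whose names do not match the pattern are not targeted by this entry,
so multiple entries with different `host_pattern` values may be combined in
one specification to vary the `data_devices` definition per group of hosts.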
- -The `CephOsdSpec` parameter is not available when using "deployed -ceph", but the same functionality is available via `--osd-spec` -command line option as described in :doc:`deployed_ceph`. - -Overriding Ceph Pools and Placement Group values during deployment ------------------------------------------------------------------- - -The default cephadm deployment as triggered by TripleO has -`Autoscaling Placement Groups`_ enabled. Thus, it is not necessary to -use `pgcalc`_ and hard code a PG number per pool. - -However, the interfaces described in the :doc:`ceph_config` -for configuring the placement groups per pool remain backwards -compatible. For example, to set the default pool size and default PG -number per pool use an example like the following:: - - parameter_defaults: - CephPoolDefaultSize: 3 - CephPoolDefaultPgNum: 128 - -In addition to setting the default PG number for each pool created, -each Ceph pool created for OpenStack can have its own PG number. -TripleO supports customization of these values by using a syntax like -the following:: - - parameter_defaults: - CephPools: - - {"name": backups, "pg_num": 512, "pgp_num": 512, "application": rbd} - - {"name": volumes, "pg_num": 1024, "pgp_num": 1024, "application": rbd} - - {"name": vms, "pg_num": 512, "pgp_num": 512, "application": rbd} - - {"name": images, "pg_num": 128, "pgp_num": 128, "application": rbd} - - -Regardless of if the :doc:`deployed_ceph` feature is used, pools will -always be created during overcloud deployment as documented above. -Additional pools may also be created directly via the Ceph command -line tools. - -Overriding CRUSH rules ----------------------- - -To deploy Ceph pools with custom `CRUSH Map Rules`_ use the -`CephCrushRules` parameter to define a list of named rules and -then associate the `rule_name` per pool with the `CephPools` -parameter:: - - parameter_defaults: - CephCrushRules: - - name: HDD - root: default - type: host - class: hdd - default: true - - name: SSD - root: default - type: host - class: ssd - default: false - CephPools: - - {'name': 'slow_pool', 'rule_name': 'HDD', 'application': 'rbd'} - - {'name': 'fast_pool', 'rule_name': 'SSD', 'application': 'rbd'} - -Regardless of if the :doc:`deployed_ceph` feature is used, custom -CRUSH rules may be created during overcloud deployment as documented -above. CRUSH rules may also be created directly via the Ceph command -line tools. - -Overriding CephX Keys ---------------------- - -TripleO will create a Ceph cluster with a CephX key file for OpenStack -RBD client connections that is shared by the Nova, Cinder, and Glance -services to read and write to their pools. Not only will the keyfile -be created but the Ceph cluster will be configured to accept -connections when the key file is used. The file will be named -`ceph.client.openstack.keyring` and it will be stored in `/etc/ceph` -within the containers, but on the container host it will be stored in -a location defined by a TripleO exposed parameter which defaults to -`/var/lib/tripleo-config/ceph`. - -The keyring file is created using the following defaults: - -* CephClusterName: 'ceph' -* CephClientUserName: 'openstack' -* CephClientKey: This value is randomly generated per Heat stack. If - it is overridden the recommendation is to set it to the output of - `ceph-authtool --gen-print-key`. - -If the above values are overridden, the keyring file will have a -different name and different content. E.g. 
if `CephClusterName` was -set to 'foo' and `CephClientUserName` was set to 'bar', then the -keyring file would be called `foo.client.bar.keyring` and it would -contain the line `[client.bar]`. - -The `CephExtraKeys` parameter may be used to generate additional key -files containing other key values and should contain a list of maps -where each map describes an additional key. The syntax of each -map must conform to what the `ceph-ansible/library/ceph_key.py` -Ansible module accepts. The `CephExtraKeys` parameter should be used -like this:: - - CephExtraKeys: - - name: "client.glance" - caps: - mgr: "allow *" - mon: "profile rbd" - osd: "profile rbd pool=images" - key: "AQBRgQ9eAAAAABAAv84zEilJYZPNuJ0Iwn9Ndg==" - mode: "0600" - -If the above is used, in addition to the -`ceph.client.openstack.keyring` file, an additional file called -`ceph.client.glance.keyring` will be created which contains:: - - [client.glance] - key = AQBRgQ9eAAAAABAAv84zEilJYZPNuJ0Iwn9Ndg== - caps mgr = "allow *" - caps mon = "profile rbd" - caps osd = "profile rbd pool=images" - -The Ceph cluster will also allow the above key file to be used to -connect to the images pool. Ceph RBD clients which are external to the -overcloud could then use this CephX key to connect to the images -pool used by Glance. The default Glance deployment defined in the Heat -stack will continue to use the `ceph.client.openstack.keyring` file -unless that Glance configuration itself is overridden. - -Regardless of if the :doc:`deployed_ceph` feature is used, CephX keys -may be created during overcloud deployment as documented above. -Additional CephX keys may also be created directly via the Ceph -command line tools. - -Enabling cephadm debug mode ---------------------------- - -TripleO can deploy the Ceph cluster enabling the cephadm backend in debug -mode; this is useful for troubleshooting purposes, and can be activated -by using a syntax like the following:: - - parameter_defaults: - CephAdmDebug: true - -After step 2, when the Ceph cluster is up and running, after SSH'ing into -one of your controller nodes run:: - - sudo cephadm shell ceph -W cephadm --watch-debug - -The command above shows a more verbose cephadm execution, and it's useful -to identify potential issues with the deployment of the Ceph cluster. - - -Accessing the Ceph Command Line -------------------------------- - -After step 2 of the overcloud deployment is completed you can login to -check the status of your Ceph cluster. By default the Ceph Monitor -containers will be running on the Controller nodes. After SSH'ing into -one of your controller nodes run `sudo cephadm shell`. 
An example of -what you might see is below:: - - [stack@standalone ~]$ sudo cephadm shell - Inferring fsid 65e8d744-eaec-4ff1-97be-2551d452426d - Inferring config /var/lib/ceph/65e8d744-eaec-4ff1-97be-2551d452426d/mon.standalone.localdomain/config - Using recent ceph image quay.ceph.io/ceph-ci/daemon@sha256:6b3c720e58ae84b502bd929d808ba63a1e9b91f710418be9df3ee566227546c0 - [ceph: root@standalone /]# ceph -s - cluster: - id: 65e8d744-eaec-4ff1-97be-2551d452426d - health: HEALTH_OK - - services: - mon: 1 daemons, quorum standalone.localdomain (age 61m) - mgr: standalone.localdomain.saojan(active, since 61m) - osd: 1 osds: 1 up (since 61m), 1 in (since 61m) - rgw: 1 daemon active (1 hosts, 1 zones) - - data: - pools: 8 pools, 201 pgs - objects: 315 objects, 24 KiB - usage: 19 MiB used, 4.6 GiB / 4.7 GiB avail - pgs: 201 active+clean - - [ceph: root@standalone /]# - -If you need to make updates to your Ceph deployment use the `Ceph -Orchestrator`_. - -Scenario: Deploy Ceph with TripleO and Metalsmith -------------------------------------------------- - -Deploy the hardware as described in :doc:`../provisioning/baremetal_provision` -and include nodes with in the `CephStorage` role. For example, the -following could be the content of ~/overcloud_baremetal_deploy.yaml:: - - - name: Controller - count: 3 - instances: - - hostname: controller-0 - name: controller-0 - - hostname: controller-1 - name: controller-1 - - hostname: controller-2 - name: controller-2 - - name: CephStorage - count: 3 - instances: - - hostname: ceph-0 - name: ceph-0 - - hostname: ceph-1 - name: ceph-2 - - hostname: ceph-2 - name: ceph-2 - - name: Compute - count: 1 - instances: - - hostname: compute-0 - name: compute-0 - -which is passed to the following command:: - - openstack overcloud node provision \ - --stack overcloud \ - --output ~/overcloud-baremetal-deployed.yaml \ - ~/overcloud_baremetal_deploy.yaml - -If desired at this stage, then Ceph may be deployed early as described -in :doc:`deployed_ceph`. Otherwise Ceph may be deployed during the -overcloud deployment. Either way, as described in -:doc:`../provisioning/baremetal_provision`, pass -~/overcloud_baremetal_deploy.yaml as input, along with -/usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml -and cephadm-overrides.yaml described above, to the `openstack overcloud -deploy` command. - -Scenario: Scale Up Ceph with TripleO and Metalsmith ---------------------------------------------------- - -Modify the ~/overcloud_baremetal_deploy.yaml file described above to -add more CephStorage nodes. In the example below the number of storage -nodes is doubled:: - - - name: CephStorage - count: 6 - instances: - - hostname: ceph-0 - name: ceph-0 - - hostname: ceph-1 - name: ceph-2 - - hostname: ceph-2 - name: ceph-2 - - hostname: ceph-3 - name: ceph-3 - - hostname: ceph-4 - name: ceph-4 - - hostname: ceph-5 - name: ceph-5 - -As described in :doc:`../provisioning/baremetal_provision`, re-run the -same `openstack overcloud node provision` command with the updated -~/overcloud_baremetal_deploy.yaml file. This will result in the three -new storage nodes being provisioned and output an updated copy of -~/overcloud-baremetal-deployed.yaml. The updated copy will have the -`CephStorageCount` changed from 3 to 6 and the `DeployedServerPortMap` -and `HostnameMap` will contain the new storage nodes. 
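The exact contents depend on the environment, but a rough sketch of the
updated portion of ~/overcloud-baremetal-deployed.yaml might look like the
following (the `HostnameMap` key format and hostnames shown are only
illustrative, and the `DeployedServerPortMap` entries are omitted)::

    parameter_defaults:
      CephStorageCount: 6
      HostnameMap:
        # new entries for the additional storage nodes; existing
        # entries for the original nodes are preserved
        overcloud-cephstorage-3: ceph-3
        overcloud-cephstorage-4: ceph-4
        overcloud-cephstorage-5: ceph-5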
- -After the three new storage nodes are deployed run the same -`openstack overcloud deploy` command as described in the previous -section with updated copy of ~/overcloud-baremetal-deployed.yaml. -The additional Ceph Storage nodes will be added to the Ceph and -the increased capacity will available. - -In particular, the following will happen as a result of running -`openstack overcloud deploy`: - -- The storage networks and firewall rules will be appropriately - configured on the new CephStorage nodes -- The ceph-admin user will be created on the new CephStorage nodes -- The ceph-admin user's public SSH key will be distributed to the new - CephStorage nodes so that cephadm can use SSH to add extra nodes -- If a new host with the Ceph Mon or Ceph Mgr service is being added, - then the private SSH key will also be added to that node. -- An updated Ceph spec will be generated and installed on the - bootstrap node, i.e. /home/ceph-admin/specs/ceph_spec.yaml on the - bootstrap node will contain new entries for the new CephStorage - nodes. -- The cephadm bootstrap process will be skipped because `cephadm ls` - will indicate that Ceph containers are already running. -- The updated spec will be applied and cephadm will schedule the new - nodes to join the cluster. - -Scenario: Scale Down Ceph with TripleO and Metalsmith ------------------------------------------------------ - -.. warning:: This procedure is only possible if the Ceph cluster has - the capacity to lose OSDs. - -Before using TripleO to remove hardware which is part of a Ceph -cluster, use Ceph orchestrator to deprovision the hardware gracefully. -This example uses commands from the `OSD Service Documentation for -cephadm`_ to remove the OSDs, and their host, before using TripleO -to scale down the Ceph storage nodes. - -Start a Ceph shell as described in "Accessing the Ceph Command Line" -above and identify the OSDs to be removed by server. In the following -example we will identify the OSDs of the host ceph-2:: - - [ceph: root@oc0-controller-0 /]# ceph osd tree - ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF - -1 0.58557 root default - ... - -7 0.19519 host ceph-2 - 5 hdd 0.04880 osd.5 up 1.00000 1.00000 - 7 hdd 0.04880 osd.7 up 1.00000 1.00000 - 9 hdd 0.04880 osd.9 up 1.00000 1.00000 - 11 hdd 0.04880 osd.11 up 1.00000 1.00000 - ... - [ceph: root@oc0-controller-0 /]# - -As per the example above the ceph-2 host has OSDs 5,7,9,11 which can -be removed by running `ceph orch osd rm 5 7 9 11`. For example:: - - [ceph: root@oc0-controller-0 /]# ceph orch osd rm 5 7 9 11 - Scheduled OSD(s) for removal - [ceph: root@oc0-controller-0 /]# ceph orch osd rm status - OSD_ID HOST STATE PG_COUNT REPLACE FORCE DRAIN_STARTED_AT - 7 ceph-2 draining 27 False False 2021-04-23 21:35:51.215361 - 9 ceph-2 draining 8 False False 2021-04-23 21:35:49.111500 - 11 ceph-2 draining 14 False False 2021-04-23 21:35:50.243762 - [ceph: root@oc0-controller-0 /]# - -Use `ceph orch osd rm status` to check the status:: - - [ceph: root@oc0-controller-0 /]# ceph orch osd rm status - OSD_ID HOST STATE PG_COUNT REPLACE FORCE DRAIN_STARTED_AT - 7 ceph-2 draining 34 False False 2021-04-23 21:35:51.215361 - 11 ceph-2 done, waiting for purge 0 False False 2021-04-23 21:35:50.243762 - [ceph: root@oc0-controller-0 /]# - -Only proceed if `ceph orch osd rm status` returns no output. - -Remove the host with `ceph orch host rm `. 
For example:: - - [ceph: root@oc0-controller-0 /]# ceph orch host rm ceph-2 - Removed host 'ceph-2' - [ceph: root@oc0-controller-0 /]# - -Now that the host and OSDs have been logically removed from the Ceph -cluster proceed to remove the host from the overcloud as described in -the "Scaling Down" section of :doc:`../provisioning/baremetal_provision`. - -Scenario: Deploy Hyperconverged Ceph ------------------------------------- - -Use a command like the following to create a `roles.yaml` file -containing a standard Controller role and a ComputeHCI role:: - - openstack overcloud roles generate Controller ComputeHCI -o ~/roles.yaml - -The ComputeHCI role is a Compute node which also runs co-located Ceph -OSD daemons. This kind of service co-location is referred to as HCI, -or hyperconverged infrastructure. See the :doc:`composable_services` -documentation for details on roles and services. - -When collocating Nova Compute and Ceph OSD services boundaries can be -set to reduce contention for CPU and Memory between the two services. -This is possible by adding parameters to `cephadm-overrides.yaml` like -the following:: - - parameter_defaults: - CephHciOsdType: hdd - CephHciOsdCount: 4 - CephConfigOverrides: - osd: - osd_memory_target_autotune: true - osd_numa_auto_affinity: true - mgr: - mgr/cephadm/autotune_memory_target_ratio: 0.2 - -The `CephHciOsdType` and `CephHciOsdCount` parameters are used by the -Derived Parameters workflow to tune the Nova scheduler to not allocate -a certain amount of memory and CPU from the hypervisor to virtual -machines so that Ceph can use them instead. See the -:doc:`derived_parameters` documentation for details. If you do not use -Derived Parameters workflow, then at least set the -`NovaReservedHostMemory` to the number of OSDs multipled by 5 GB per -OSD per host. - -The `CephConfigOverrides` map passes Ceph OSD parameters to limit the -CPU and memory used by the OSDs. - -The `osd_memory_target_autotune`_ is set to true so that the OSD -daemons will adjust their memory consumption based on the -`osd_memory_target` config option. The `autotune_memory_target_ratio` -defaults to 0.7. So 70% of the total RAM in the system is the starting -point, from which any memory consumed by non-autotuned Ceph daemons -are subtracted, and then the remaining memory is divided by the OSDs -(assuming all OSDs have `osd_memory_target_autotune` true). For HCI -deployments the `mgr/cephadm/autotune_memory_target_ratio` can be set -to 0.2 so that more memory is available for the Nova Compute -service. This has the same effect as setting the ceph-ansible `is_hci` -parameter to true. - -A two NUMA node system can host a latency sensitive Nova workload on -one NUMA node and a Ceph OSD workload on the other NUMA node. To -configure Ceph OSDs to use a specific NUMA node (and not the one being -used by the Nova Compute workload) use either of the following Ceph -OSD configurations: - -- `osd_numa_node` sets affinity to a numa node (-1 for none) -- `osd_numa_auto_affinity` automatically sets affinity to the NUMA - node where storage and network match - -If there are network interfaces on both NUMA nodes and the disk -controllers are NUMA node 0, then use a network interface on NUMA node -0 for the storage network and host the Ceph OSD workload on NUMA -node 0. Then host the Nova workload on NUMA node 1 and have it use the -network interfaces on NUMA node 1. 
Setting `osd_numa_auto_affinity`, -to true, as in the example `cephadm-overrides.yaml` file above, should -result in this configuration. Alternatively, the `osd_numa_node` could -be set directly to 0 and `osd_numa_auto_affinity` could be unset so -that it will default to false. - -When a hyperconverged cluster backfills as a result of an OSD going -offline, the backfill process can be slowed down. In exchange for a -slower recovery, the backfill activity has less of an impact on -the collocated Compute workload. Ceph Pacific has the following -defaults to control the rate of backfill activity:: - - parameter_defaults: - CephConfigOverrides: - osd: - osd_recovery_op_priority: 3 - osd_max_backfills: 1 - osd_recovery_max_active_hdd: 3 - osd_recovery_max_active_ssd: 10 - -It is not necessary to pass the above as they are the default values, -but if these values need to be deployed with different values modify -an example like the above before deployment. If the values need to be -adjusted after the deployment use `ceph config set osd `. - -Deploy the overcloud as described in "Scenario: Deploy Ceph with -TripleO and Metalsmith" but use the `-r` option to include generated -`roles.yaml` file and the `-e` option with the -`cephadm-overrides.yaml` file containing the HCI tunings described -above. - -The examples above may be used to tune a hyperconverged system during -deployment. If the values need to be changed after deployment, then -use the `ceph orchestrator` command to set them directly. - -After deployment start a Ceph shell as described in "Accessing the -Ceph Command Line" and confirm the above values were applied. For -example, to check that the NUMA and memory target auto tuning run -commands lke this:: - - [ceph: root@oc0-controller-0 /]# ceph config dump | grep numa - osd advanced osd_numa_auto_affinity true - [ceph: root@oc0-controller-0 /]# ceph config dump | grep autotune - osd advanced osd_memory_target_autotune true - [ceph: root@oc0-controller-0 /]# ceph config get mgr mgr/cephadm/autotune_memory_target_ratio - 0.200000 - [ceph: root@oc0-controller-0 /]# - -We can then confirm that a specific OSD, e.g. osd.11, inherited those -values with commands like this:: - - [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_memory_target - 4294967296 - [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_memory_target_autotune - true - [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_numa_auto_affinity - true - [ceph: root@oc0-controller-0 /]# - -To confirm that the default backfill values are set for the same -example OSD, use commands like this:: - - [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_recovery_op_priority - 3 - [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_max_backfills - 1 - [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_recovery_max_active_hdd - 3 - [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_recovery_max_active_ssd - 10 - [ceph: root@oc0-controller-0 /]# - -The above example assumes that :doc:`deployed_ceph` is not used. - -Add the Ceph Dashboard to a Overcloud deployment ------------------------------------------------- - -During the overcloud deployment most of the Ceph daemons can be added and -configured. 
-To deploy the ceph dashboard include the ceph-dashboard.yaml environment -file as in the following example:: - - openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/ceph-dashboard.yaml - -The command above will include the ceph dashboard related services and -generates all the `cephadm` required variables to render the monitoring -stack related spec that can be applied against the deployed Ceph cluster. -When the deployment has been completed the Ceph dashboard containers, -including prometheus and grafana, will be running on the controller nodes -and will be accessible using the port 3100 for grafana and 9092 for prometheus; -since this service is only internal and doesn’t listen on the public vip, users -can reach both grafana and the exposed ceph dashboard using the controller -provisioning network vip on the specified port (8444 is the default for a generic -overcloud deployment). -The resulting deployment will be composed by an external stack made by grafana, -prometheus, alertmanager, node-exporter containers and the ceph dashboard mgr -module that acts as the backend for this external stack, embedding the grafana -layouts and showing the ceph cluster specific metrics coming from prometheus. -The Ceph Dashboard backend services run on the specified `CephDashboardNetwork` -and `CephGrafanaNetwork`, while the high availability is realized by haproxy and -Pacemaker. -The Ceph Dashboard frontend is fully integrated with the tls-everywhere framework, -hence providing the tls environments files will trigger the certificate request for -both grafana and the ceph dashboard: the generated crt and key files are then -configured by cephadm, resulting in a key-value pair within the Ceph orchestrator, -which is able to mount the required files to the dashboard related containers. -The Ceph Dashboard admin user role is set to `read-only` mode by default for safe -monitoring of the Ceph cluster. To permit an admin user to have elevated privileges -to alter elements of the Ceph cluster with the Dashboard, the operator can change the -default. -For this purpose, TripleO exposes a parameter that can be used to change the Ceph -Dashboard admin default mode. -Log in to the undercloud as `stack` user and create the `ceph_dashboard_admin.yaml` -environment file with the following content:: - - parameter_defaults: - CephDashboardAdminRO: false - -Run the overcloud deploy command to update the existing stack and include the environment -file created with all other environment files that are already part of the existing -deployment:: - - openstack overcloud deploy --templates -e -e ceph_dashboard_admin.yml - -The ceph dashboard will also work with composable networks. -In order to isolate the monitoring access for security purposes, operators can -take advantage of composable networks and access the dashboard through a separate -network vip. By doing this, it's not necessary to access the provisioning network -and separate authorization profiles may be implemented. 
-To deploy the overcloud with the ceph dashboard composable network we need first -to generate the controller specific role created for this scenario:: - - openstack overcloud roles generate -o /home/stack/roles_data.yaml ControllerStorageDashboard Compute BlockStorage ObjectStorage CephStorage - -Finally, run the overcloud deploy command including the new generated `roles_data.yaml` -and the `network_data_dashboard.yaml` file that will trigger the generation of this -new network. -The final overcloud command must look like the following:: - - openstack overcloud deploy --templates -r /home/stack/roles_data.yaml -n /usr/share/openstack-tripleo-heat-templates/network_data_dashboard.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml -e ~/my-ceph-settings.yaml - -.. _`cephadm`: https://docs.ceph.com/en/latest/cephadm/index.html -.. _`cleaning instructions in the Ironic documentation`: https://docs.openstack.org/ironic/latest/admin/cleaning.html -.. _`Ceph Orchestrator`: https://docs.ceph.com/en/latest/mgr/orchestrator/ -.. _`ceph config command`: https://docs.ceph.com/en/latest/man/8/ceph/#config -.. _`Ceph's centralized configuration management`: https://ceph.io/community/new-mimic-centralized-configuration-management/ -.. _`Ceph Service Specification`: https://docs.ceph.com/en/octopus/mgr/orchestrator/#orchestrator-cli-service-spec -.. _`ceph_spec_bootstrap`: https://docs.openstack.org/tripleo-ansible/latest/modules/modules-ceph_spec_bootstrap.html -.. _`Advanced OSD Service Specifications`: https://docs.ceph.com/en/octopus/cephadm/drivegroups/ -.. _`Autoscaling Placement Groups`: https://docs.ceph.com/en/latest/rados/operations/placement-groups/ -.. _`pgcalc`: http://ceph.com/pgcalc -.. _`CRUSH Map Rules`: https://docs.ceph.com/en/latest/rados/operations/crush-map-edits/?highlight=ceph%20crush%20rules#crush-map-rules -.. _`OSD Service Documentation for cephadm`: https://docs.ceph.com/en/latest/cephadm/services/osd/ -.. _`osd_memory_target_autotune`: https://docs.ceph.com/en/latest/cephadm/services/osd/#automatically-tuning-osd-memory diff --git a/deploy-guide/source/features/deployed_ceph.rst b/deploy-guide/source/features/deployed_ceph.rst index f9bb41e2..2d29a588 100644 --- a/deploy-guide/source/features/deployed_ceph.rst +++ b/deploy-guide/source/features/deployed_ceph.rst @@ -1,15 +1,15 @@ -Deployed Ceph -============= +Deploying Ceph with TripleO +=========================== -In Wallaby and newer it is possible to provision hardware and deploy -Ceph before deploying the overcloud on the same hardware. +In Wallaby and newer it is possible to have TripleO provision hardware +and deploy Ceph before deploying the overcloud on the same hardware. Deployed Ceph Workflow ---------------------- -As described in the :doc:`../deployment/network_v2` the ``overcloud -deploy`` command was extended so that it can run all of the following -as separate steps: +As described in the :doc:`../deployment/network_v2` the ``openstack +overcloud`` command was extended so that it can run all of the +following as separate steps: #. Create Networks #. Create Virtual IPs @@ -18,8 +18,10 @@ as separate steps: #. Create the overcloud Ephemeral Heat stack #. Run Config-Download and the deploy-steps playbook -This document covers the "Deploy Ceph" step above. For details on the -other steps see :doc:`../deployment/network_v2`. +This document covers the "Deploy Ceph" step above. It also covers how +to configure the overcloud deployed in the subsequent steps to use the +Ceph cluster. 
For details on the earlier steps see +:doc:`../deployment/network_v2`. The "Provision Baremetal Instances" step outputs a YAML file describing the deployed baremetal, for example:: @@ -58,12 +60,15 @@ Deployed Ceph Scope ------------------- The "Deployed Ceph" feature deploys a Ceph cluster ready to serve RBD -by calling the same TripleO Ansible roles described in :doc:`cephadm`. -When the "Deployed Ceph" process is over you should expect to find the -following: +and CephFS by calling TripleO Ansible roles which execute the +`cephadm` command. When the "Deployed Ceph" process is over you should +expect to find the following: -- The CephMon, CephMgr, and CephOSD services are running on all nodes - which should have those services +- The CephMon, CephMgr and CephOSD services are running on all nodes + which should have those services as defined by the + :doc:`composable_services` interface +- If desired, the CephMds and CephNFS service will also be deployed + and running (this feature is not available in Wallaby however). - It's possible to SSH into a node with the CephMon service and run `sudo cephadm shell` - All OSDs should be running unless there were environmental issues @@ -75,7 +80,7 @@ following: You should not expect the following after "Deployed Ceph" has run: - No pools or cephx keys for OpenStack will be created yet -- No CephDashboard, CephRGW or CephMds services will be running yet +- No CephDashboard or CephRGW services will be running yet The above will be configured during overcloud deployment by the `openstack overcloud deploy` command as they were prior to the @@ -88,9 +93,8 @@ The above will be configured during overcloud deployment by the used so they must be in the overcloud definition. Thus, they are created during overcloud deployment -During the overcloud deployment the above resources will be -created in Ceph by the TripleO Ansible roles described in -:doc:`cephadm` using the client admin keyring file and the +During the overcloud deployment, the above resources will be created +in Ceph using the client admin keyring file and the ``~/deployed_ceph.yaml`` file output by `openstack overcloud ceph deploy`. Because these resources are created directly on the Ceph cluster with admin level access, "Deployed Ceph" is different from @@ -113,6 +117,56 @@ while `openstack overcloud deploy` (and the commands that follow) deploy OpenStack and configure that Ceph cluster to be used by OpenStack. +Multiple Ceph clusters per deployment +------------------------------------- + +TripleO can only deploy one Ceph cluster in the overcloud per Heat +stack. However, within that Heat stack it's possible to configure +an overcloud to communicate with multiple Ceph clusters which are +external to the overcloud. To do this, follow this document to +configure the "internal" Ceph cluster which is part of the overcloud +and also use the `CephExternalMultiConfig` parameter described in the +:doc:`ceph_external` documentation. + +Prerequisite: Ensure the Ceph container is available +---------------------------------------------------- + +Before deploying Ceph follow the +:ref:`prepare-environment-containers` documentation so +the appropriate Ceph container image is used. +The output of the `openstack tripleo container image prepare` +command should contain a line like the following:: + + ContainerCephDaemonImage: undercloud.ctlplane.mydomain.tld:8787/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64 + +See "Container Options" options below for more details. 
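For example, assuming the container parameters were prepared in a file named
``containers-prepare-parameter.yaml`` (the file name here is only an
assumption), one way to confirm which Ceph container image will be used is to
filter the prepared output for that parameter::

    openstack tripleo container image prepare \
      -e ~/containers-prepare-parameter.yaml | grep ContainerCephDaemonImage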
+
+Prerequisite: Ensure the cephadm package is installed
+-----------------------------------------------------
+
+The `cephadm` package needs to be installed on at least one node in
+the overcloud in order to bootstrap the first node of the Ceph
+cluster.
+
+The `cephadm` package is pre-built into the overcloud-full image.
+The `tripleo_cephadm` role will also use Ansible's package module
+to ensure it is present. If `tripleo-repos` is passed the `ceph`
+argument for Wallaby or newer, then the CentOS SIG Ceph repository
+will be enabled with the appropriate version containing the `cephadm`
+package, e.g. for Wallaby the ceph-pacific repository is enabled.
+
+Prerequisite: Ensure Disks are Clean
+------------------------------------
+
+cephadm does not reformat the OSD disks and expects them to be clean in order to
+complete successfully. Consequently, when reusing the same nodes (or
+disks) for new deployments, it is necessary to clean the disks before
+every new attempt. One option is to enable the automated cleanup
+functionality in Ironic, which will zap the disks every time that a
+node is released. The same process can be executed manually or only
+for some target nodes, see `cleaning instructions in the Ironic documentation`_.
+
+
 Deployed Ceph Command Line Interface
 ------------------------------------
@@ -338,10 +392,15 @@ through to cephadm with `openstack overcloud ceph deploy --config`::
 
     $ cat <<EOF > initial-ceph.conf
     [global]
-    osd crush chooseleaf type = 0
+    ms_cluster_mode: secure
+    ms_service_mode: secure
+    ms_client_mode: secure
     EOF
     $ openstack overcloud ceph deploy --config initial-ceph.conf ...
 
+The above example shows how to configure the messenger v2 protocol to
+use a secure mode that encrypts all data passing over the network.
+
 The `deployed_ceph.yaml` Heat environment file output by `openstack
 overcloud ceph deploy` has `ApplyCephConfigOverridesOnUpdate` set to
 true. This means that services not covered by deployed ceph, e.g. RGW,
@@ -351,8 +410,7 @@ then after the overcloud is deployed, it is recommended to update the
 `deployed_ceph.yaml` Heat environment file, or similar, to set
 `ApplyCephConfigOverridesOnUpdate` to false. Any subsequent Ceph
 configuration changes should then be made by the `ceph config
-command`_. For more information on the `CephConfigOverrides` and
-`ApplyCephConfigOverridesOnUpdate` parameters see :doc:`cephadm`.
+command`_.
 
 It is supported to pass through the `cephadm --single-host-defaults`
 option, which configures a Ceph cluster to run on a single host::
@@ -376,11 +434,29 @@ overcloud::
 
 The `--force` option is required when using `--cephadm-extra-args`
 because not all possible options ensure a functional deployment.
 
+Placement Groups (PGs)
+----------------------
+
+When Ceph is initially deployed with `openstack overcloud ceph deploy`
+the PG and replica count settings are not changed from Ceph's own
+defaults unless their parameters (osd_pool_default_size,
+osd_pool_default_pg_num, osd_pool_default_pgp_num) are included in an
+initial Ceph configuration file which can be passed with the --config
+option. These settings may also be modified after `openstack overcloud
+ceph deploy`.
+
+The deprecated Heat parameters `CephPoolDefaultSize` and
+`CephPoolDefaultPgNum` no longer have any effect as these
+configurations are not made during overcloud deployment.
+However, during overcloud deployment pools are created and
+either the target_size_ratio or pg_num per pool may be set at that
+point. See the "Ceph Pool Options" section for more details.
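For example, a minimal sketch of an initial Ceph configuration file that sets
these defaults at `openstack overcloud ceph deploy` time (the values shown
are only illustrative, not recommendations) looks like this::

    $ cat <<EOF > initial-ceph.conf
    [global]
    osd_pool_default_size = 3
    osd_pool_default_pg_num = 128
    osd_pool_default_pgp_num = 128
    EOF
    $ openstack overcloud ceph deploy --config initial-ceph.conf ...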
+ Ceph Name Options ----------------- -To use a deploy with a different cluster name than the default of -"ceph" use the ``--cluster`` option:: +To deploy with a different cluster name than the default of "ceph" use +the ``--cluster`` option:: openstack overcloud ceph deploy \ --cluster central \ @@ -483,9 +559,15 @@ this:: -o deployed_ceph.yaml \ --ceph-spec ~/ceph_spec.yaml -By default the spec instructs cephadm to use all available disks -(excluding the disk where the operating system is installed) as OSDs. -The syntax it uses to do this is the following:: +Overriding which disks should be OSDs +------------------------------------- + +The `Advanced OSD Service Specifications`_ should be used to define +how disks are used as OSDs. + +By default all available disks (excluding the disk where the operating +system is installed) are used as OSDs. This is because the default +spec has the following:: data_devices: all: true @@ -517,6 +599,59 @@ the specification if the `service_type` is "osd". The same ``--osd-spec`` is available to the `openstack overcloud ceph spec` command. +The :doc:`node_specific_hieradata` feature is not supported by the +cephadm integration but the `Advanced OSD Service Specifications`_ has +a `host_pattern` parameter which specifies which host to target for +certain `data_devices` definitions, so the equivalent functionality is +available but with the new syntax. + +Service Placement Options +------------------------- + +The Ceph services defined in the roles_data.yaml file as described in +:doc:`composable_services` determine which baremetal node runs which +service. By default the Controller role has the CephMon and CephMgr +service while the CephStorage role has the CephOSD service. Most +composable services require Heat output in order to determine how +services are configured, but not the Ceph services. Thus, the +roles_data.yaml file remains authoritative for Ceph service placement +even though the "Deployed Ceph" process happens before Heat is run. + +It is only necessary to use the `--roles-file` option if the default +roles_data.yaml file is not being used. For example if you intend to +deploy hyperconverged nodes, then you want the predeployed compute +nodes to be in the ceph spec with the "osd" label and for the +`service_type` "osd" to have a placement list containing a list of the +compute nodes. To do this generate a custom roles file as described in +:doc:`composable_services` like this:: + + openstack overcloud roles generate Controller ComputeHCI > custom_roles.yaml + +and then pass that roles file like this:: + + openstack overcloud ceph deploy \ + deployed_metal.yaml \ + -o deployed_ceph.yaml \ + --roles-data custom_roles.yaml + +After running the above the compute nodes should have running OSD +containers and when the overcloud is deployed Nova compute services +will then be set up on the same hosts. + +If you wish to generate the ceph spec with the modified placement +described above before the ceph deployment, then the same roles +file may be passed to the 'openstack overcloud ceph spec' command:: + + openstack overcloud ceph spec \ + --stack overcloud \ + --roles-data custom_roles.yaml \ + --output ceph_spec.yaml \ + deployed_metal.yaml + +In the above example the `--stack` is used in order to find the +working directory containing the Ansible inventory which was created +when `openstack overcloud node provision` was run. 
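As a rough illustration of the resulting placement, the OSD portion of such a
ceph_spec.yaml might resemble the following (the hostnames and the
`service_id` are hypothetical, and the exact layout of the generated file may
differ between releases)::

    ---
    service_type: host
    hostname: oc0-computehci-0
    labels:
      - osd
    ---
    service_type: osd
    service_id: default_drive_group
    placement:
      hosts:
        - oc0-computehci-0
        - oc0-computehci-1
    data_devices:
      all: true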
+ Ceph VIP Options ---------------- @@ -635,9 +770,8 @@ option:: Crush Hierarchy Options ----------------------- -As described in the previous section, the `ceph_spec_bootstrap`_ Ansible -module is used to generate the Ceph related spec file which is applied -using the Ceph orchestrator tool. +The `ceph_spec_bootstrap`_ Ansible module is used to generate the Ceph +related spec file which is applied using the Ceph orchestrator tool. During the Ceph OSDs deployment, a custom crush hierarchy can be defined and passed using the ``--crush-hierarchy`` option. As per `Ceph Host Management`_, by doing this the `location` attribute is @@ -691,56 +825,10 @@ Then the Ceph cluster will bootstrap with the following Ceph OSD layout:: .. note:: - Device classes are automatically detected by Ceph, but crush rules are associated to pools - and they still be defined using the CephCrushRules parameter during the overcloud deployment. - Additional details can be found in the `Overriding crush rules`_ section. - -Service Placement Options -------------------------- - -The Ceph services defined in the roles_data.yaml file as described in -:doc:`composable_services` determine which baremetal node runs which -service. By default the Controller role has the CephMon and CephMgr -service while the CephStorage role has the CephOSD service. Most -composable services require Heat output in order to determine how -services are configured, but not the Ceph services. Thus, the -roles_data.yaml file remains authoritative for Ceph service placement -even though the "Deployed Ceph" process happens before Heat is run. - -It is only necessary to use the `--roles-file` option if the default -roles_data.yaml file is not being used. For example if you intend to -deploy hyperconverged nodes, then you want the predeployed compute -nodes to be in the ceph spec with the "osd" label and for the -`service_type` "osd" to have a placement list containing a list of the -compute nodes. To do this generate a custom roles file as described in -:doc:`composable_services` like this:: - - openstack overcloud roles generate Controller ComputeHCI > custom_roles.yaml - -and then pass that roles file like this:: - - openstack overcloud ceph deploy \ - deployed_metal.yaml \ - -o deployed_ceph.yaml \ - --roles-data custom_roles.yaml - -After running the above the compute nodes should have running OSD -containers and when the overcloud is deployed Nova compute services -will then be set up on the same hosts. - -If you wish to generate the ceph spec with the modified placement -described above before the ceph deployment, then the same roles -file may be passed to the 'openstack overcloud ceph spec' command:: - - openstack overcloud ceph spec \ - --stack overcloud \ - --roles-data custom_roles.yaml \ - --output ceph_spec.yaml \ - deployed_metal.yaml - -In the above example the `--stack` is used in order to find the -working directory containing the Ansible inventory which was created -when `openstack overcloud node provision` was run. + Device classes are automatically detected by Ceph, but crush rules + are associated to pools and they still be defined using the + CephCrushRules parameter during the overcloud deployment. Additional + details can be found in the "Overriding CRUSH rules" section below. Network Options --------------- @@ -955,6 +1043,72 @@ calling `openstack overcloud ceph deploy`. See `openstack overcloud ceph user enable --help` and `openstack overcloud ceph user disable --help` for more information. 
+Container Options +----------------- + +As described in :doc:`../deployment/container_image_prepare` the +undercloud may be used as a container registry for ceph containers +and there is a supported syntax to download containers from +authenticated registries. + +By default `openstack overcloud ceph deploy` will pull the Ceph +container in the default ``container_image_prepare_defaults.yaml`` +file. If a `push_destination` is defined in this file, then the +overcloud will be configured so it can access the local registry in +order to download the Ceph container. This means that `openstack +overcloud ceph deploy` will modify the overcloud's ``/etc/hosts`` +and ``/etc/containers/registries.conf`` files; unless the +`--skip-hosts-config` and `--skip-container-registry-config` options +are used or a `push_destination` is not defined. + +The version of the Ceph used in each OpenStack release changes per +release and can be seen by running a command like this:: + + egrep "ceph_namespace|ceph_image|ceph_tag" \ + /usr/share/tripleo-common/container-images/container_image_prepare_defaults.yaml + +The `--container-image-prepare` option can be used to override which +``container_image_prepare_defaults.yaml`` file is used. If a version +of this file called ``custom_container_image_prepare.yaml`` is +modified to contain syntax like the following:: + + ContainerImageRegistryCredentials: + quay.io/ceph-ci: + quay_username: quay_password + +Then when a command like the following is run:: + + openstack overcloud ceph deploy \ + deployed_metal.yaml \ + -o deployed_ceph.yaml \ + --container-image-prepare custom_container_image_prepare.yaml + +The credentials will be extracted from the file and the tripleo +ansible role to bootstrap Ceph will be executed like this:: + + cephadm bootstrap + --registry-url quay.io/ceph-ci + --registry-username quay_username + --registry-password quay_password + ... + +The syntax of the container image prepare file can also be ignored and +instead the following command line options may be used instead:: + + --container-namespace CONTAINER_NAMESPACE + e.g. quay.io/ceph + --container-image CONTAINER_IMAGE + e.g. ceph + --container-tag CONTAINER_TAG + e.g. latest + --registry-url REGISTRY_URL + --registry-username REGISTRY_USERNAME + --registry-password REGISTRY_PASSWORD + +If a variable above is unused, then it defaults to the ones found in +the default ``container_image_prepare_defaults.yaml`` file. In other +words, the above options are overrides. + Creating Pools and CephX keys before overcloud deployment (Optional) -------------------------------------------------------------------- @@ -964,7 +1118,8 @@ deployment the pools and cephx keys are created based on which Heat environment files are passed. For most cases only pools for Cinder (volumes), Nova (vms), and Glance (images) are created but if the Heat environment file to configure additional services are passed, -e.g. cinder-backup, then the required pools are created. +e.g. cinder-backup, then the required pools are created. This is +covered in more detail in the next section of this document. It is not necessary to create pools and cephx keys before overcloud deployment but it is possible. The Ceph pools can be created when @@ -1064,74 +1219,624 @@ The CephPools Heat parameter above has always supported idempotent updates. It will be pre-populated with the pools from tripleo_cephadm_pools after Ceph is deployed. 
The deployed_ceph.yaml which is output can also be updated so that additional pools can be -created when the overcloud is deployed. +created when the overcloud is deployed. The Heat parameters above are +described in more detail in the rest of this document. -Container Options +Environment files to configure Ceph during Overcloud deployment +--------------------------------------------------------------- + +After `openstack overcloud ceph deploy` has run and output the +`deployed_ceph.yaml` file, this file and other Heat environment +files should be passed to the `openstack overcloud deploy` +command:: + + openstack overcloud deploy --templates \ + -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml \ + -e deployed_ceph.yaml + +The above will make the following modifications to the Ceph cluster +while the overcloud is being deployed: + +- Execute cephadm to add the Ceph RADOS Gateway (RGW) service +- Configure HAProxy as a front end for RGW +- Configure Keystone so RGW behaves like the OpenStack object service +- Create Pools for both RGW and RBD services +- Create an openstack client cephx keyring for Nova, Cinder, Glance to + access RBD + +The information necessary to configure Ceph clients will then +be extracted to `/home/stack/ceph_client.yml` on the undercloud and +passed to the as input to the tripleo-ansible role tripleo_ceph_client +which will then configure the rest of the overcloud to use the new +Ceph cluster as described in the :doc:`ceph_external` documentation. + +If you only wish to deploy Ceph RBD without RGW then use the following +variation of the above:: + + openstack overcloud deploy --templates \ + -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm-rbd-only.yaml \ + -e deployed_ceph.yaml + +Do not directly edit the `environments/cephadm/cephadm.yaml` +or `cephadm-rbd-only.yaml` file. If you wish to override the defaults, +as described below in the sections starting with "Overriding", then +place those overrides in a separate `cephadm-overrides.yaml` file and +deploy like this:: + + openstack overcloud deploy --templates \ + -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml \ + -e deployed_ceph.yaml \ + -e ceph-overrides.yaml + +Applying Ceph server configuration during overcloud deployment +-------------------------------------------------------------- + +The `deployed_ceph.yaml` file output by `openstack overcloud ceph deploy` +has the paramter `ApplyCephConfigOverridesOnUpdate` set to true so +that Ceph services not deployed by `openstack overcloud ceph deploy`, +e.g. RGW, can be configured during initial overcloud deployment. After +both Ceph and the overcloud have been deployed, edit the +`deployed_ceph.yaml` file and set `ApplyCephConfigOverridesOnUpdate` +to false. All Ceph server configuration changes should then be made +using `Ceph Orchestrator`_. + +It is technically possible to set `ApplyCephConfigOverridesOnUpdate` +to true and use `CephConfigOverrides` to override Ceph *server* +configurations during stack updates. When this happens, parameters in +`CephConfigOverrides` are put into a file, e.g. assimilate_ceph.conf, +and a command like `ceph config assimilate-conf -i +assimilate_ceph.conf` is run. + +Regardless of the value of the `ApplyCephConfigOverridesOnUpdate` +boolean, if `openstack overcloud deploy` is re-run in order to update +the stack, the cephadm bootstrap process is not repeated because +that process is only run if `cephadm list` returns an empty list. 
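For example, to ask the Ceph monitors to stop warning about pools with no
redundancy during a stack update, an environment file along these lines could
be included in the `openstack overcloud deploy` command (the option shown is
only an example)::

    parameter_defaults:
      ApplyCephConfigOverridesOnUpdate: true
      CephConfigOverrides:
        mon:
          mon_warn_on_pool_no_redundancy: false

Remember to set `ApplyCephConfigOverridesOnUpdate` back to false afterwards
so that later server configuration changes are made with `Ceph Orchestrator`_
as described above.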
+ +Applying Ceph client configuration during overcloud deployment +-------------------------------------------------------------- + +To make a Ceph *client* configuration change, update the parameters in +`CephConfigOverrides` and run a stack update. This will not +change the configuration for the Ceph servers unless +`ApplyCephConfigOverridesOnUpdate` is set to true (as described in the +section above). By default it should only change configurations for +the Ceph clients. Examples of Ceph clients include Nova compute +containers, Cinder volume containers, Glance image containers, etc. + +The `CephConfigOverrides` directive updates all Ceph client +configuration files on the overcloud in the `CephConfigPath` (which +defaults to /var/lib/tripleo-config/ceph). The `CephConfigPath` is +mounted on the client containers as `/etc/ceph`. The name of the +configuration file is `ceph.conf` because the `CephClusterName` +parameter defaults to "ceph". If `CephClusterName` was set to "foo", +then the file would be called `/etc/ceph/foo.conf`. + +Ceph Pool Options ----------------- -As described in :doc:`../deployment/container_image_prepare` the -undercloud may be used as a container registry for ceph containers -and there is a supported syntax to download containers from -authenticated registries. +When `openstack overcloud deploy` is run a pool is created for each +OpenStack service depending on if that service is enabled by including +its Heat environment. For example, a command like the following will +result in pools for Nova (vms), Cinder (volumes) and Glance (images) +being created:: -By default `openstack overcloud ceph deploy` will pull the Ceph -container in the default ``container_image_prepare_defaults.yaml`` -file. If a `push_destination` is defined in this file, then the -overcloud will be configured so it can access the local registry in -order to download the Ceph container. This means that `openstack -overcloud ceph deploy` will modify the overcloud's ``/etc/hosts`` -and ``/etc/containers/registries.conf`` files; unless the -`--skip-hosts-config` and `--skip-container-registry-config` options -are used or a `push_destination` is not defined. + openstack overcloud deploy --templates \ + -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm-rbd-only.yaml -The version of the Ceph used in each OpenStack release changes per -release and can be seen by running a command like this:: +If `-e environments/cinder-backup.yaml` included in the above command +then a pool called backups would also be created. - egrep "ceph_namespace|ceph_image|ceph_tag" \ - /usr/share/tripleo-common/container-images/container_image_prepare_defaults.yaml +By default each pool will have Ceph`s pg_autoscale_mode enabled so it +is not necessary to directly set a PG number per pool. However, even +with this mode enabled it is recommended to set a `target_size_ratio` +(or pg_num) per pool in order to minimize data rebalancing. For more +information on pg_autoscale_mode see `Autoscaling Placement Groups`_. -The `--container-image-prepare` option can be used to override which -``container_image_prepare_defaults.yaml`` file is used. 
If a version -of this file called ``custom_container_image_prepare.yaml`` is -modified to contain syntax like the following:: +To control the target_size_ratio per pool, create a Heat environment +file like pools.yaml with the following content and include it in the +`openstack overcloud deploy` command with a `-e pools.yaml`:: - ContainerImageRegistryCredentials: - quay.io/ceph-ci: - quay_username: quay_password + CephPools: + - name: volumes + target_size_ratio: 0.4 + application: rbd + - name: images + target_size_ratio: 0.1 + application: rbd + - name: vms + target_size_ratio: 0.3 + application: rbd -Then when a command like the following is run:: +In the above example it is assumed that the percentage of data used +per service will be Cinder volumes 40%, Glance images 10% and Nova vms +30% (with 20% of space free for other pools). It is worthwhile to set +these values based on your expected usage (e.g. maybe 40% is not right +for your usecase). If you do not override the CephPools parameter, +then each pool will have Ceph's default PG number. Though the +autoscaler will adjust this number automatically over time based on +usage, the data will be moved within the cluster as a result which +will use computational resources. - openstack overcloud ceph deploy \ - deployed_metal.yaml \ - -o deployed_ceph.yaml \ - --container-image-prepare custom_container_image_prepare.yaml +If you prefer to set a PG number instead of a target size ratio, then +replace `target_size_ratio` in the example above with ‘pg_num’ and +supply a different integer per pool (e.g. 512 for volumes, 128 for +images, etc.) based on your expected usage. -The credentials will be extracted from the file and the tripleo -ansible role to bootstrap Ceph will be executed like this:: +Overriding CRUSH rules +---------------------- - cephadm bootstrap - --registry-url quay.io/ceph-ci - --registry-username quay_username - --registry-password quay_password - ... +To deploy Ceph pools with custom CRUSH Map Rules use the +`CephCrushRules` parameter to define a list of named rules and then +associate the `rule_name` per pool with the `CephPools` parameter:: -The syntax of the container image prepare file can also be ignored and -instead the following command line options may be used instead:: + parameter_defaults: + CephCrushRules: + - name: HDD + root: default + type: host + class: hdd + default: true + - name: SSD + root: default + type: host + class: ssd + default: false + CephPools: + - {'name': 'slow_pool', 'rule_name': 'HDD', 'application': 'rbd'} + - {'name': 'fast_pool', 'rule_name': 'SSD', 'application': 'rbd'} - --container-namespace CONTAINER_NAMESPACE - e.g. quay.io/ceph - --container-image CONTAINER_IMAGE - e.g. ceph - --container-tag CONTAINER_TAG - e.g. latest - --registry-url REGISTRY_URL - --registry-username REGISTRY_USERNAME - --registry-password REGISTRY_PASSWORD +CRUSH rules may be created during overcloud deployment as documented +above. CRUSH rules may also be created directly via the Ceph command +line tools. -If a variable above is unused, then it defaults to the ones found in -the default ``container_image_prepare_defaults.yaml`` file. In other -words, the above options are overrides. +Overriding CephX Keys +--------------------- +During overcloud deployment, TripleO will create a Ceph cluster with a +CephX key file for OpenStack RBD client connections that is shared by +the Nova, Cinder, and Glance services to read and write to their +pools. 
+
+Overriding CephX Keys
+---------------------
+
+During overcloud deployment, TripleO will create a Ceph cluster with a
+CephX key file for OpenStack RBD client connections that is shared by
+the Nova, Cinder, and Glance services to read and write to their
+pools. Not only will the keyfile be created but the Ceph cluster will
+be configured to accept connections when the key file is used. The
+file will be named `ceph.client.openstack.keyring` and it will be
+stored in `/etc/ceph` within the containers, but on the container host
+it will be stored in a location defined by a TripleO exposed parameter
+which defaults to `/var/lib/tripleo-config/ceph`.
+
+The keyring file is created using the following defaults:
+
+* CephClusterName: 'ceph'
+* CephClientUserName: 'openstack'
+* CephClientKey: This value is randomly generated per Heat stack. If
+  it is overridden the recommendation is to set it to the output of
+  `ceph-authtool --gen-print-key`.
+
+If the above values are overridden, the keyring file will have a
+different name and different content. E.g. if `CephClusterName` was
+set to 'foo' and `CephClientUserName` was set to 'bar', then the
+keyring file would be called `foo.client.bar.keyring` and it would
+contain the line `[client.bar]`.
+
+The `CephExtraKeys` parameter may be used to generate additional key
+files containing other key values and should contain a list of maps
+where each map describes an additional key. The syntax of each
+map must conform to what the `ceph-ansible/library/ceph_key.py`
+Ansible module accepts. The `CephExtraKeys` parameter should be used
+like this::
+
+  CephExtraKeys:
+    - name: "client.glance"
+      caps:
+        mgr: "allow *"
+        mon: "profile rbd"
+        osd: "profile rbd pool=images"
+      key: "AQBRgQ9eAAAAABAAv84zEilJYZPNuJ0Iwn9Ndg=="
+      mode: "0600"
+
+If the above is used, in addition to the
+`ceph.client.openstack.keyring` file, an additional file called
+`ceph.client.glance.keyring` will be created which contains::
+
+  [client.glance]
+      key = AQBRgQ9eAAAAABAAv84zEilJYZPNuJ0Iwn9Ndg==
+      caps mgr = "allow *"
+      caps mon = "profile rbd"
+      caps osd = "profile rbd pool=images"
+
+The Ceph cluster will also allow the above key file to be used to
+connect to the images pool. Ceph RBD clients which are external to the
+overcloud could then use this CephX key to connect to the images
+pool used by Glance. The default Glance deployment defined in the Heat
+stack will continue to use the `ceph.client.openstack.keyring` file
+unless that Glance configuration itself is overridden.
+
+Add the Ceph Dashboard to an Overcloud deployment
+---------------------------------------------------
+
+During the overcloud deployment most of the Ceph daemons can be added
+and configured. To deploy the Ceph Dashboard, include the
+ceph-dashboard.yaml environment file as in the following example::
+
+  openstack overcloud deploy --templates \
+    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml \
+    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/ceph-dashboard.yaml
+
+The command above includes the Ceph Dashboard related services and
+generates all the `cephadm` required variables to render the monitoring
+stack related spec that can be applied against the deployed Ceph cluster.
+When the deployment has been completed the Ceph Dashboard containers,
+including prometheus and grafana, will be running on the controller nodes
+and will be accessible on port 3100 for grafana and 9092 for prometheus;
+since this service is only internal and does not listen on the public vip,
+users can reach both grafana and the exposed Ceph Dashboard using the
+controller provisioning network vip on the specified port (8444 is the
+default for a generic overcloud deployment).
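+
+If you want to quickly verify that the monitoring endpoints are
+reachable once the deployment finishes, a simple TCP check like the
+following can be run from the undercloud. This is only a sketch; the
+VIP below is a hypothetical provisioning network VIP and should be
+replaced with the value from your environment::
+
+  VIP=192.168.24.10   # hypothetical controller provisioning VIP
+  for port in 3100 9092 8444; do
+    timeout 2 bash -c "</dev/tcp/${VIP}/${port}" \
+      && echo "port ${port} open" || echo "port ${port} closed"
+  done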
+
+The resulting deployment will be composed of an external stack made of
+grafana, prometheus, alertmanager and node-exporter containers and the
+ceph dashboard mgr module that acts as the backend for this external
+stack, embedding the grafana layouts and showing the ceph cluster
+specific metrics coming from prometheus.
+The Ceph Dashboard backend services run on the specified `CephDashboardNetwork`
+and `CephGrafanaNetwork`, while the high availability is realized by haproxy and
+Pacemaker.
+
+The Ceph Dashboard frontend is fully integrated with the tls-everywhere framework,
+hence providing the tls environment files will trigger the certificate request for
+both grafana and the ceph dashboard: the generated crt and key files are then
+configured by cephadm, resulting in a key-value pair within the Ceph orchestrator,
+which is able to mount the required files to the dashboard related containers.
+The Ceph Dashboard admin user role is set to `read-only` mode by default for safe
+monitoring of the Ceph cluster. To permit an admin user to have elevated privileges
+to alter elements of the Ceph cluster with the Dashboard, the operator can change the
+default.
+
+For this purpose, TripleO exposes a parameter that can be used to change the Ceph
+Dashboard admin default mode.
+
+Log in to the undercloud as the `stack` user and create the `ceph_dashboard_admin.yaml`
+environment file with the following content::
+
+  parameter_defaults:
+    CephDashboardAdminRO: false
+
+Run the overcloud deploy command to update the existing stack and include the
+environment file created above, along with all other environment files that are
+already part of the existing deployment::
+
+  openstack overcloud deploy --templates \
+    -e <existing_overcloud_environment_files> \
+    -e ceph_dashboard_admin.yaml
+
+The Ceph Dashboard will also work with composable networks.
+In order to isolate the monitoring access for security purposes, operators can
+take advantage of composable networks and access the dashboard through a separate
+network vip. By doing this, it's not necessary to access the provisioning network
+and separate authorization profiles may be implemented.
+
+To deploy the overcloud with the Ceph Dashboard composable network we first need
+to generate the controller-specific role created for this scenario::
+
+  openstack overcloud roles generate \
+    -o /home/stack/roles_data.yaml \
+    ControllerStorageDashboard Compute \
+    BlockStorage ObjectStorage CephStorage
+
+Finally, run the overcloud deploy command including the newly generated
+`roles_data.yaml` and the `network_data_dashboard.yaml` file that will trigger
+the generation of this new network.
+
+The final overcloud command must look like the following::
+
+  openstack overcloud deploy --templates \
+    -r /home/stack/roles_data.yaml \
+    -n /usr/share/openstack-tripleo-heat-templates/network_data_dashboard.yaml \
+    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml \
+    -e ~/my-ceph-settings.yaml
+
+Scenario: Deploy Ceph with TripleO and Metalsmith and then Scale Up
+--------------------------------------------------------------------
+
+Deploy the hardware as described in :doc:`../provisioning/baremetal_provision`
+and include nodes in the `CephStorage` role. For example, the
+following could be the content of ~/overcloud_baremetal_deploy.yaml::
+
+  - name: Controller
+    count: 3
+    instances:
+      - hostname: controller-0
+        name: controller-0
+      - hostname: controller-1
+        name: controller-1
+      - hostname: controller-2
+        name: controller-2
+  - name: CephStorage
+    count: 3
+    instances:
+      - hostname: ceph-0
+        name: ceph-0
+      - hostname: ceph-1
+        name: ceph-1
+      - hostname: ceph-2
+        name: ceph-2
+  - name: Compute
+    count: 1
+    instances:
+      - hostname: compute-0
+        name: compute-0
+
+which is passed to the following command::
+
+  openstack overcloud node provision \
+    --stack overcloud \
+    --output ~/overcloud-baremetal-deployed.yaml \
+    ~/overcloud_baremetal_deploy.yaml
+
+Ceph may then be deployed with `openstack overcloud ceph deploy`.
+As described in :doc:`../provisioning/baremetal_provision`, pass
+~/overcloud_baremetal_deploy.yaml as input, along with
+/usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml
+and any Ceph Overrides described in the rest of this document, to the
+`openstack overcloud deploy` command.
+
+To scale up, modify the ~/overcloud_baremetal_deploy.yaml file
+described above to add more CephStorage nodes. In the example below
+the number of storage nodes is doubled::
+
+  - name: CephStorage
+    count: 6
+    instances:
+      - hostname: ceph-0
+        name: ceph-0
+      - hostname: ceph-1
+        name: ceph-1
+      - hostname: ceph-2
+        name: ceph-2
+      - hostname: ceph-3
+        name: ceph-3
+      - hostname: ceph-4
+        name: ceph-4
+      - hostname: ceph-5
+        name: ceph-5
+
+As described in :doc:`../provisioning/baremetal_provision`, re-run the
+same `openstack overcloud node provision` command with the updated
+~/overcloud_baremetal_deploy.yaml file. This will result in the three
+new storage nodes being provisioned and output an updated copy of
+~/overcloud-baremetal-deployed.yaml. The updated copy will have the
+`CephStorageCount` changed from 3 to 6 and the `DeployedServerPortMap`
+and `HostnameMap` will contain the new storage nodes.
+
+After the three new storage nodes are deployed run the same
+`openstack overcloud deploy` command as described in the previous
+section with the updated copy of ~/overcloud-baremetal-deployed.yaml.
+The additional Ceph Storage nodes will be added to the Ceph cluster
+and the increased capacity will be available. It is not necessary to
+run `openstack overcloud ceph deploy` to scale up.
+
+In particular, the following will happen as a result of running
+`openstack overcloud deploy`:
+
+- The storage networks and firewall rules will be appropriately
+  configured on the new CephStorage nodes
+- The ceph-admin user will be created on the new CephStorage nodes
+- The ceph-admin user's public SSH key will be distributed to the new
+  CephStorage nodes so that cephadm can use SSH to add extra nodes
+- If a new host with the Ceph Mon or Ceph Mgr service is being added,
+  then the private SSH key will also be added to that node.
+- An updated Ceph spec will be generated and installed on the
+  bootstrap node, i.e. /home/ceph-admin/specs/ceph_spec.yaml on the
+  bootstrap node will contain new entries for the new CephStorage
+  nodes.
+- The cephadm bootstrap process will be skipped because `cephadm ls`
+  will indicate that Ceph containers are already running.
+- The updated spec will be applied and cephadm will schedule the new
+  nodes to join the cluster.
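+
+Once the stack update completes, the expansion can be confirmed from
+the Ceph side with read-only commands such as the following; this is
+only a suggested check and is not required by the procedure::
+
+  [root@oc0-controller-0 ~]# cephadm shell
+  [ceph: root@oc0-controller-0 /]# ceph orch host ls
+  [ceph: root@oc0-controller-0 /]# ceph osd tree
+  [ceph: root@oc0-controller-0 /]# ceph -s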
+
+Scenario: Scale Down Ceph with TripleO and Metalsmith
+------------------------------------------------------
+
+.. warning:: This procedure is only possible if the Ceph cluster has
+   the capacity to lose OSDs.
+
+Before using TripleO to remove hardware which is part of a Ceph
+cluster, use the Ceph orchestrator to deprovision the hardware gracefully.
+This example uses commands from the `OSD Service Documentation for
+cephadm`_ to remove the OSDs, and their host, before using TripleO
+to scale down the Ceph storage nodes.
+
+Start a Ceph shell and identify the OSDs to be removed by server. In
+the following example we will identify the OSDs of the host ceph-2::
+
+  [root@oc0-controller-0 ~]# cephadm shell
+  ...
+  [ceph: root@oc0-controller-0 /]# ceph osd tree
+  ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
+  -1         0.58557  root default
+  ...
+  -7         0.19519      host ceph-2
+   5    hdd  0.04880          osd.5         up   1.00000  1.00000
+   7    hdd  0.04880          osd.7         up   1.00000  1.00000
+   9    hdd  0.04880          osd.9         up   1.00000  1.00000
+  11    hdd  0.04880          osd.11        up   1.00000  1.00000
+  ...
+  [ceph: root@oc0-controller-0 /]#
+
+As per the example above the ceph-2 host has OSDs 5,7,9,11 which can
+be removed by running `ceph orch osd rm 5 7 9 11`. For example::
+
+  [ceph: root@oc0-controller-0 /]# ceph orch osd rm 5 7 9 11
+  Scheduled OSD(s) for removal
+  [ceph: root@oc0-controller-0 /]# ceph orch osd rm status
+  OSD_ID  HOST    STATE     PG_COUNT  REPLACE  FORCE  DRAIN_STARTED_AT
+  7       ceph-2  draining  27        False    False  2021-04-23 21:35:51.215361
+  9       ceph-2  draining  8         False    False  2021-04-23 21:35:49.111500
+  11      ceph-2  draining  14        False    False  2021-04-23 21:35:50.243762
+  [ceph: root@oc0-controller-0 /]#
+
+Use `ceph orch osd rm status` to check the status::
+
+  [ceph: root@oc0-controller-0 /]# ceph orch osd rm status
+  OSD_ID  HOST    STATE                    PG_COUNT  REPLACE  FORCE  DRAIN_STARTED_AT
+  7       ceph-2  draining                 34        False    False  2021-04-23 21:35:51.215361
+  11      ceph-2  done, waiting for purge  0         False    False  2021-04-23 21:35:50.243762
+  [ceph: root@oc0-controller-0 /]#
+
+Only proceed if `ceph orch osd rm status` returns no output.
+
+Remove the host with `ceph orch host rm <host>`. For example::
+
+  [ceph: root@oc0-controller-0 /]# ceph orch host rm ceph-2
+  Removed host 'ceph-2'
+  [ceph: root@oc0-controller-0 /]#
+
+Now that the host and OSDs have been logically removed from the Ceph
+cluster proceed to remove the host from the overcloud as described in
+the "Scaling Down" section of :doc:`../provisioning/baremetal_provision`.
+
+Scenario: Deploy Hyperconverged Ceph
+-------------------------------------
+
+Use a command like the following to create a `roles.yaml` file
+containing a standard Controller role and a ComputeHCI role::
+
+  openstack overcloud roles generate Controller ComputeHCI -o ~/roles.yaml
+
+The ComputeHCI role is a Compute node which also runs co-located Ceph
+OSD daemons. This kind of service co-location is referred to as HCI,
+or hyperconverged infrastructure. See the :doc:`composable_services`
+documentation for details on roles and services.
+
+When collocating Nova Compute and Ceph OSD services, boundaries can be
+set to reduce contention for CPU and memory between the two services.
+To limit Ceph for HCI, create an initial Ceph configuration file with
+the following::
+
+  $ cat <<EOF > initial-ceph.conf
+  [osd]
+  osd_memory_target_autotune = true
+  osd_numa_auto_affinity = true
+  [mgr]
+  mgr/cephadm/autotune_memory_target_ratio = 0.2
+  EOF
+  $
+
+The `osd_memory_target_autotune`_ is set to true so that the OSD
+daemons will adjust their memory consumption based on the
+`osd_memory_target` config option. The `autotune_memory_target_ratio`
+defaults to 0.7, so 70% of the total RAM in the system is the starting
+point, from which any memory consumed by non-autotuned Ceph daemons
+is subtracted, and then the remaining memory is divided by the OSDs
+(assuming all OSDs have `osd_memory_target_autotune` true). For HCI
+deployments the `mgr/cephadm/autotune_memory_target_ratio` can be set
+to 0.2 so that more memory is available for the Nova Compute
+service. This has the same effect as setting the ceph-ansible `is_hci`
+parameter to true.
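+
+As a rough worked example (illustrative numbers only, not output from
+a real deployment): on an HCI node with 256 GiB of RAM, 12 OSDs, the
+0.2 ratio above and no memory consumed by non-autotuned daemons, each
+OSD would be assigned approximately::
+
+  osd_memory_target ~= (0.2 * 256 GiB) / 12 OSDs ~= 4.3 GiB
+
+leaving the rest of the memory available to the Nova Compute service
+and its guests.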
+
+A two NUMA node system can host a latency sensitive Nova workload on
+one NUMA node and a Ceph OSD workload on the other NUMA node. To
+configure Ceph OSDs to use a specific NUMA node (and not the one being
+used by the Nova Compute workload) use either of the following Ceph
+OSD configurations:
+
+- `osd_numa_node` sets affinity to a numa node (-1 for none)
+- `osd_numa_auto_affinity` automatically sets affinity to the NUMA
+  node where storage and network match
+
+If there are network interfaces on both NUMA nodes and the disk
+controllers are on NUMA node 0, then use a network interface on NUMA
+node 0 for the storage network and host the Ceph OSD workload on NUMA
+node 0. Then host the Nova workload on NUMA node 1 and have it use the
+network interfaces on NUMA node 1. Setting `osd_numa_auto_affinity`
+to true, as in the `initial-ceph.conf` file above, should result in
+this configuration. Alternatively, the `osd_numa_node` could be set
+directly to 0 and `osd_numa_auto_affinity` could be unset so that it
+will default to false.
+
+When a hyperconverged cluster backfills as a result of an OSD going
+offline, the backfill process can be slowed down. In exchange for a
+slower recovery, the backfill activity has less of an impact on
+the collocated Compute workload. Ceph Pacific has the following
+defaults to control the rate of backfill activity::
+
+  osd_recovery_op_priority = 3
+  osd_max_backfills = 1
+  osd_recovery_max_active_hdd = 3
+  osd_recovery_max_active_ssd = 10
+
+It is not necessary to pass the above in an initial ceph.conf as they
+are the default values, but if they need to be deployed with different
+values, modify an example like the above and add it to the initial
+Ceph configuration file before deployment. If the values need to be
+adjusted after the deployment use `ceph config set osd <key> <value>`.
+
+To limit Nova resources add parameters to `ceph-overrides.yaml`
+like the following::
+
+  parameter_defaults:
+    CephHciOsdType: hdd
+    CephHciOsdCount: 4
+
+The `CephHciOsdType` and `CephHciOsdCount` parameters are used by the
+Derived Parameters workflow to tune the Nova scheduler to not allocate
+a certain amount of memory and CPU from the hypervisor to virtual
+machines so that Ceph can use them instead. See the
+:doc:`derived_parameters` documentation for details. If you do not use
+the Derived Parameters workflow, then at least set the
+`NovaReservedHostMemory` to the number of OSDs multiplied by 5 GB per
+OSD per host.
+
+Deploy Ceph with `openstack overcloud ceph deploy` and be sure to
+pass the initial Ceph configuration file with the Ceph HCI tunings.
+Then deploy the overcloud with `openstack overcloud deploy` as
+described in "Scenario: Deploy Ceph with TripleO and Metalsmith" but
+use the `-r` option to include the generated `roles.yaml` file and the
+`-e` option with the `ceph-overrides.yaml` file containing the Nova
+HCI tunings described above.
+
+The examples above may be used to tune a hyperconverged system during
+deployment. If the values need to be changed after deployment, then
+use the `ceph config set` command from within a Ceph shell to set them
+directly.
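+
+For example, to further slow down recovery on a running cluster, a
+command like the following could be used from a cephadm shell; the
+value shown is only an illustration, not a recommendation::
+
+  [ceph: root@oc0-controller-0 /]# ceph config set osd osd_recovery_max_active_hdd 1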
+
+After deployment start a Ceph shell and confirm the above values were
+applied. For example, to check the NUMA and memory target auto
+tuning, run commands like this::
+
+  [ceph: root@oc0-controller-0 /]# ceph config dump | grep numa
+    osd   advanced  osd_numa_auto_affinity       true
+  [ceph: root@oc0-controller-0 /]# ceph config dump | grep autotune
+    osd   advanced  osd_memory_target_autotune   true
+  [ceph: root@oc0-controller-0 /]# ceph config get mgr mgr/cephadm/autotune_memory_target_ratio
+  0.200000
+  [ceph: root@oc0-controller-0 /]#
+
+We can then confirm that a specific OSD, e.g. osd.11, inherited those
+values with commands like this::
+
+  [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_memory_target
+  4294967296
+  [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_memory_target_autotune
+  true
+  [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_numa_auto_affinity
+  true
+  [ceph: root@oc0-controller-0 /]#
+
+To confirm that the default backfill values are set for the same
+example OSD, use commands like this::
+
+  [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_recovery_op_priority
+  3
+  [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_max_backfills
+  1
+  [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_recovery_max_active_hdd
+  3
+  [ceph: root@oc0-controller-0 /]# ceph config get osd.11 osd_recovery_max_active_ssd
+  10
+  [ceph: root@oc0-controller-0 /]#
+
+
+.. _`cephadm`: https://docs.ceph.com/en/latest/cephadm/index.html
+.. _`cleaning instructions in the Ironic documentation`: https://docs.openstack.org/ironic/latest/admin/cleaning.html
 .. _`ceph config command`: https://docs.ceph.com/en/latest/man/8/ceph/#config
 .. _`ceph_spec_bootstrap`: https://docs.openstack.org/tripleo-ansible/latest/modules/modules-ceph_spec_bootstrap.html
 .. _`Ceph Service Specification`: https://docs.ceph.com/en/octopus/mgr/orchestrator/#orchestrator-cli-service-spec
@@ -1139,3 +1844,7 @@ words, the above options are overrides.
 .. _`Ceph Host Management`: https://docs.ceph.com/en/latest/cephadm/host-management/#setting-the-initial-crush-location-of-host
 .. _`Overriding crush rules`: https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/cephadm.html#overriding-crush-rules
 .. _`CephIngress`: https://docs.ceph.com/en/pacific/cephadm/services/nfs/#high-availability-nfs
+.. _`Ceph Orchestrator`: https://docs.ceph.com/en/latest/mgr/orchestrator/
+.. _`Autoscaling Placement Groups`: https://docs.ceph.com/en/latest/rados/operations/placement-groups/
+.. _`OSD Service Documentation for cephadm`: https://docs.ceph.com/en/latest/cephadm/services/osd/
+.. _`osd_memory_target_autotune`: https://docs.ceph.com/en/latest/cephadm/services/osd/#automatically-tuning-osd-memory
diff --git a/deploy-guide/source/features/derived_parameters.rst b/deploy-guide/source/features/derived_parameters.rst
index 3b3d11df..e7b6a2c8 100644
--- a/deploy-guide/source/features/derived_parameters.rst
+++ b/deploy-guide/source/features/derived_parameters.rst
@@ -196,7 +196,7 @@ devices like ``CephAnsibleDisksConfig``, setting the count directly is
 necessary in order to know how much CPU/RAM to reserve. Similarly,
 because a device path is not hard coded, we cannot look up that device
 in Ironic to determine its type. For information on the
-``CephOsdSpec`` parameter see the :doc:`cephadm` documentation.
+``CephOsdSpec`` parameter see the :doc:`deployed_ceph` documentation. ``CephHciOsdType`` is the type of data_device (not db_device) used for each OSD and must be one of hdd, ssd, or nvme. These are used by diff --git a/deploy-guide/source/features/distributed_multibackend_storage.rst b/deploy-guide/source/features/distributed_multibackend_storage.rst index 476c7cd1..ff7f452a 100644 --- a/deploy-guide/source/features/distributed_multibackend_storage.rst +++ b/deploy-guide/source/features/distributed_multibackend_storage.rst @@ -115,7 +115,7 @@ Ceph Deployment Types |project| supports two types of Ceph deployments. An "internal" Ceph deployment is one where a Ceph cluster is deployed as part of the -overcloud as described in :doc:`ceph_config`. An "external" Ceph +overcloud as described in :doc:`deployed_ceph`. An "external" Ceph deployment is one where a Ceph cluster already exists and an overcloud is configured to be a client of that Ceph cluster as described in :doc:`ceph_external`. Ceph external deployments have special meaning @@ -160,7 +160,7 @@ types of deployments as described in the following sequence: as an additional RBD backend. The above sequence is possible by using the `CephExtraKeys` parameter -as described in :doc:`ceph_config` and the `CephExternalMultiConfig` +as described in :doc:`deployed_ceph` and the `CephExternalMultiConfig` parameter described in :doc:`ceph_external`. Decide which cephx key will be used to access remote Ceph clusters @@ -239,7 +239,7 @@ Ceph cluster with pools which may be accessed by the cephx user "client.external". The same parameters will be used later when the DCN overclouds are configured as external Ceph clusters. For more information on the `CephExtraKeys` parameter see the document -:doc:`ceph_config` section called `Configuring CephX Keys`. +:doc:`deployed_ceph` section called `Overriding CephX Keys`. Create control-plane roles ^^^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/deploy-guide/source/post_deployment/upgrade/fast_forward_upgrade.rst b/deploy-guide/source/post_deployment/upgrade/fast_forward_upgrade.rst index beb06534..0e17ebfe 100644 --- a/deploy-guide/source/post_deployment/upgrade/fast_forward_upgrade.rst +++ b/deploy-guide/source/post_deployment/upgrade/fast_forward_upgrade.rst @@ -641,8 +641,7 @@ Following there is a list of all the changes needed: ensure the devices list has been migrated to the format expected by ceph-ansible. It is possible to use the ``CephAnsibleExtraConfig`` and `CephAnsibleDisksConfig`` parameters to pass arbitrary variables to - ceph-ansible, like ``devices`` and ``dedicated_devices``. See the - :doc:`TripleO Ceph config guide <../../../features/ceph_config>` + ceph-ansible, like ``devices`` and ``dedicated_devices``. The other parameters (for example ``CinderRbdPoolName``, ``CephClientUserName``, ...) will behave as they used to with puppet-ceph diff --git a/deploy-guide/source/post_deployment/upgrade/major_upgrade.rst b/deploy-guide/source/post_deployment/upgrade/major_upgrade.rst index 2fad48bc..56d515a0 100644 --- a/deploy-guide/source/post_deployment/upgrade/major_upgrade.rst +++ b/deploy-guide/source/post_deployment/upgrade/major_upgrade.rst @@ -559,9 +559,7 @@ major-upgrade-composable-steps that come first, as described above. ensure the devices list has been migrated to the format expected by ceph-ansible. It is possible to use the ``CephAnsibleExtraConfig`` and ``CephAnsibleDisksConfig`` parameters to pass arbitrary variables to - ceph-ansible, like ``devices`` and ``dedicated_devices``. 
See the - `ceph-ansible scenarios`_ or the :doc:`TripleO Ceph config guide - <../../features/ceph_config>` + ceph-ansible, like ``devices`` and ``dedicated_devices``. The other parameters (for example ``CinderRbdPoolName``, ``CephClientUserName``, ...) will behave as they used to with puppet-ceph