=======================================
TripleO Ceph Ingress Daemon Integration
=======================================

Starting with the Octopus release, Ceph introduced its own day 1 tool, called cephadm, and its own day 2 tool, called orchestrator, which together replaced ceph-ansible. During the Wallaby and Xena cycles TripleO moved away from ceph-ansible and adopted cephadm [1]_, as described in [2]_. During the Xena cycle a new approach to deploying Ceph in a TripleO context was established: a Ceph cluster can now be provisioned before the overcloud is created, leaving to the overcloud deployment phase only the final configuration of the Ceph cluster, which depends on the OpenStack services enabled through the tripleo-heat-templates interface.

The goal for the next cycle is to move as many services as possible to deployed Ceph, minimizing the actions performed against the Ceph cluster during the overcloud deployment phase. As part of this effort, we should pay attention to the high-availability aspect: how it is implemented in the current release and how it should change for Ceph. This spec is a follow-up to [3]_; it defines the requirements to rely on the Ceph-provided HA daemons and describes the changes required in TripleO to meet this goal.

Problem Description
===================

In the following description we refer to the Ganesha daemon and the need for the related Ceph ingress daemon deployment, but the same applies to all existing daemons that require a high-availability configuration (e.g., RGW and, in the next Ceph release, the Ceph Dashboard). In TripleO we support deploying Ganesha both when the Ceph cluster is managed by TripleO and when it is not. When the cluster is managed by TripleO, as per [4]_, it is preferable to have cephadm manage the lifecycle of the NFS container instead of deploying it with tripleo-ansible; this is broadly covered and solved by allowing the TripleO Ceph mkspec module to support the new Ceph daemon [5]_.

The ceph-nfs daemon deployed by cephadm has its own HA mechanism, called ingress, which is based on haproxy and keepalived [6]_, so we would no longer use pcmk as the VIP owner. Note this means we would run pcmk and keepalived, in addition to one haproxy deployed by TripleO and another haproxy deployed by cephadm, on the same server (though with listeners on different ports). This approach relies entirely on Ceph components, and both the external and internal scenarios are covered. However, adopting the ingress daemon for a TripleO-deployed Ceph cluster means that we need to make the overcloud aware of the new running services: for this reason the proposed change introduces a new TripleO resource that properly handles the interface with the Ceph services and helps keep the tripleo-heat-templates roles consistent.

Proposed Change
===============

Overview
--------

The change proposed by this spec requires the introduction of a new TripleO Ceph Ingress resource that describes the ingress daemon provided by Ceph, which is made up of two components (see the sketch after this list):

  • The HAProxy container
  • The keepalived container
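
cephadm models these two components as a single ``ingress`` service. Below is a minimal sketch of such a service spec, following the upstream cephadm ingress format [6]_; the service id, placement count, ports and virtual IP are illustrative values only:

.. code-block:: yaml

   # Illustrative cephadm ingress spec fronting an existing NFS
   # (Ganesha) service; all concrete values are placeholders.
   service_type: ingress
   service_id: nfs.storage-nfs
   placement:
     count: 2
   spec:
     backend_service: nfs.storage-nfs  # the existing ceph-nfs service
     frontend_port: 2049               # port the VIP listens on (haproxy)
     monitor_port: 9049                # haproxy status page
     virtual_ip: 192.168.122.100/24    # VIP owned by keepalived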

The impact of adding a new ``OS::TripleO::Services::CephIngress`` resource can be seen in the following projects.

tripleo-common
~~~~~~~~~~~~~~

As described in Container Image Preparation [7]_, the undercloud may be used as a container registry for all the Ceph-related containers, and a new supported syntax has been introduced in deployed Ceph to download containers from authenticated registries. However, as per [8]_, the Ceph ingress daemons won't be baked into the Ceph daemon container, hence container image preparation should be executed to pull the new container images/tags into the undercloud, as is already done for the Ceph Dashboard and the regular Ceph image. Once the ingress containers are available, it is possible to deploy the daemon on top of ceph-nfs or ceph-rgw.

In particular, if this spec is implemented, deployed Ceph will be the only way of setting up this daemon through cephadm for ceph-nfs, resulting in a simplified tripleo-heat-templates interface and fewer tripleo-ansible task executions, because part of the configuration is moved to before the overcloud is deployed. As part of this effort, considering that the number of Ceph-related container images has grown over time, a new condition will be added to the tripleo-common jinja template [9]_ to avoid pulling additional Ceph images when Ceph is not deployed by TripleO. This results in an optimization for all the Ceph external cluster use cases, as well as for the existing CI jobs without Ceph.
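As an illustration, the ingress images could be made overridable through the usual ``ContainerImagePrepare`` interface. The ``ceph_haproxy_*`` and ``ceph_keepalived_*`` parameter names below are assumptions for illustration, modeled on the existing ``ceph_*`` parameters:

.. code-block:: yaml

   parameter_defaults:
     ContainerImagePrepare:
       - push_destination: true
         set:
           # existing Ceph image parameters
           ceph_namespace: quay.io/ceph
           ceph_image: ceph
           ceph_tag: v16.2
           # hypothetical ingress daemon images, pulled only when
           # TripleO deploys Ceph
           ceph_haproxy_image: haproxy
           ceph_haproxy_tag: '2.3'
           ceph_keepalived_image: keepalived
           ceph_keepalived_tag: '2.1.5'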

tripleo-heat-templates
~~~~~~~~~~~~~~~~~~~~~~

A Heat resource will be created within the cephadm space. The new resource will also be added to the existing Controller roles, and all the relevant environment files will be updated with the new reference. In addition, as described in [10]_, the pacemaker constraints for ceph-nfs and the related VIP will be removed. The Ceph mkspec tripleo-ansible module is already able to generate the spec for this kind of daemon, and it will trigger cephadm [11]_ to deploy an ingress daemon, provided that the NFS Ceph spec is applied against an existing cluster and the backend daemon is up and running. As mentioned before, the ingress daemon can also be deployed on top of an RGW instance, therefore the proposed change is valid for all the Ceph services that require an HA configuration.
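A minimal sketch of how the new resource could be wired into the resource registry follows; the template path is an assumption for illustration:

.. code-block:: yaml

   # Hypothetical environment file entry mapping the new service to
   # its cephadm-based implementation; the path is illustrative.
   resource_registry:
     OS::TripleO::Services::CephIngress: ../deployment/cephadm/ceph-ingress.yaml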

Security Impact
---------------

None; the ingress daemon applied to an existing ceph-nfs instance is managed by cephadm, resulting in a simplified model. The firewall rules are still applied and managed by TripleO and defined for each Heat resource.
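As an illustration, the rules for the ingress frontend and the haproxy monitor port could be expressed with the usual TripleO firewall syntax; the rule names and port numbers below are assumptions matching the spec sketch in the Overview:

.. code-block:: yaml

   # Hypothetical firewall rules for the ingress daemon; names and
   # ports are illustrative.
   firewall_rules:
     '100 ceph_nfs ingress frontend':
       dport:
         - 2049
     '100 ceph_nfs ingress monitor':
       dport:
         - 9049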

Upgrade Impact
--------------

The upgrade of an existing Ceph cluster is covered by [12]_.

Performance Impact
------------------

No changes.

Developer Impact
----------------

This effort can easily be extended to move the RGW service to deployed Ceph, which is out of the scope of this spec.

Implementation
==============

Deployment Flow
---------------

The deployment and configuration described in this spec will happen during ``openstack overcloud ceph deploy``, as described in [13]_. VIPs are provisioned with ``openstack overcloud network vip provision``, after ``openstack overcloud network provision`` and before ``openstack overcloud node provision``, so an ingress VIP is available in advance and can be processed by cephadm in the deployed Ceph context. Parameters will be provided to deploy this daemon, along with ceph-nfs, after the Ceph cluster is up and running. As described in the Overview section, an ingress object will be defined and deployed, and it is supposed to manage both the VIP and the HA for this component. The resulting flow is sketched below.
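A sketch of the command flow, assuming typical provisioning file names; the ``--nfs`` and ``--ingress-vip`` options are hypothetical placeholders for the parameters this spec introduces:

.. code-block:: bash

   # Provision networks first, then the VIPs (including the ingress
   # VIP), then the baremetal nodes.
   openstack overcloud network provision -o networks_deployed.yaml network_data.yaml
   openstack overcloud network vip provision -o vips_deployed.yaml vip_data.yaml
   openstack overcloud node provision -o nodes_deployed.yaml baremetal_deployment.yaml

   # Deploy the Ceph cluster, including ceph-nfs and its ingress
   # daemon, before the overcloud deploy. The last two options are
   # hypothetical placeholders for the new parameters.
   openstack overcloud ceph deploy nodes_deployed.yaml \
       -o deployed_ceph.yaml \
       --nfs --ingress-vip 192.168.122.100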

Assignee(s)
-----------

  • fmount
  • fultonj
  • gfidente

Work Items
----------

  • Create a new Ceph-prefixed Heat resource that describes the ingress daemon in the TripleO context.
  • Add both the haproxy and keepalived containers to the Ceph image list so that they can be pulled during the container image preparation phase.
  • Create a set of tasks to deploy both the nfs daemon and the related ingress daemon.
  • Deprecate the pacemaker-related configuration for ceph-nfs, including the pacemaker constraints between the manila-share service and ceph-nfs.
  • Create upgrade playbooks to transition from TripleO/pcmk-managed nfs ganesha to nfs/ingress daemons deployed by cephadm and managed by ceph orch.

Depending on the state of the directord/task-core migration we might skip the Ansible part, though we could build a PoC with it to get started, extending the existing tripleo-ansible cephadm role as in the sketch below.
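A minimal sketch of such an extension, assuming the existing ``ceph_mkspec`` module keeps its current interface; the argument names and values below are assumptions for illustration:

.. code-block:: yaml

   # Hypothetical tripleo-ansible task extending the cephadm role; the
   # ceph_mkspec arguments shown are assumptions for illustration.
   - name: Apply the ingress spec on top of the existing ceph-nfs service
     become: true
     ceph_mkspec:
       service_type: ingress
       service_id: nfs.storage-nfs
       apply: true
       render_path: /home/ceph-admin/specs
       spec:
         backend_service: nfs.storage-nfs
         frontend_port: 2049
         monitor_port: 9049
         virtual_ip: "{{ ceph_nfs_ingress_vip }}"  # VIP provisioned in advance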

Dependencies
============

This work depends on the tripleo_ceph_nfs spec [14]_, which moves from TripleO-deployed Ganesha to the cephadm approach.

Testing
=======

The NFS daemon feature can be enabled at day 1 and it will be tested against the existing TripleO scenario004 [15]_. As part of the implementation plan, updating the existing heat templates environment CI files, which contain both the Heat resources and the testing job parameters, is one of the goals of this spec.

Documentation Impact
====================

The documentation will describe the new parameters introduced in the deployed Ceph CLI to give the ability to deploy additional daemons (ceph-nfs and the related ingress daemon) as part of deployed Ceph. In addition, we should provide upgrade instructions for pre-existing environments that need to transition from TripleO/pcmk-managed nfs ganesha to nfs daemons deployed by cephadm and managed by ceph orch.

References
==========


.. [1] cephadm
.. [2] tripleo-ceph
.. [3] tripleo-nfs-spec
.. [4] tripleo-nfs-spec
.. [5] tripleo-ceph-mkspec
.. [6] cephadm-nfs-ingress
.. [7] container-image-preparation
.. [8] ceph-ingress-containers
.. [9] tripleo-common-j2
.. [10] tripleo-nfs-spec
.. [11] tripleo-ceph-mkspec
.. [12] tripleo-common-j2
.. [13] tripleo-common-j2
.. [14] tripleo-nfs-spec
.. [15] tripleo-scenario004