tripleo-heat-templates/deployment/etcd/etcd-container-puppet.yaml
Michele Baldessari 75eb5bcc3f Fix etcd/tls-e deployments
Currently etcd requires the following DNS entries in its
certificate:

  - str_replace:
      template: "{{fqdn_$NETWORK}}"
      params:
        $NETWORK: {get_param: [ServiceNetMap, EtcdNetwork]}
  - str_replace:
      template: "{{cloud_names.cloud_name_NETWORK}}"
      params:
        NETWORK: {get_param: [ServiceNetMap, EtcdNetwork]}

The problem is that the etcd tasks get invoked before anything else creates
the service principal corresponding to the VIP name, so the deployment
fails with:

Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [39933] Setting "CERTMONGER_REQ_SUBJECT" to "CN=ctrl-1-0.mainnetwork.bgp.ftw" for child.
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [39933] Setting "CERTMONGER_REQ_HOSTNAME" to "ctrl-1-0.mainnetwork.bgp.ftw
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: overcloud.main.bgp.ftw
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: " for child.
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [39933] Setting "CERTMONGER_REQ_PRINCIPAL" to "etcd/ctrl-1-0.mainnetwork.bgp.ftw@BGP.FTW

Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [39933] Running enrollment helper "/usr/libexec/certmonger/ipa-submit".
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: Submitting request to "https://freeipa-0.bgp.ftw/ipa/json".
Apr 27 18:29:17 ctrl-1-0.bgp.ftw ipa-submit[39933]: JSON-RPC error: 4001: The service principal for subject alt name overcloud.main.bgp.ftw in certificate request does not exist
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [38973] Certificate submission still ongoing.
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [38973] Certificate submission attempt complete.
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [38973] Child status = 3.
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 18:29:17 [38973] Child output:
Apr 27 18:29:17 ctrl-1-0.bgp.ftw certmonger[38973]: "Server at https://freeipa-0.bgp.ftw/ipa/json failed request, will retry: 4001 (The service principal for subject alt name overcloud.main.bgp.ftw in certificate request does not exist).

Let's make sure that a type: vip entry is present in metadata_settings.
With this change my deployment succeeded with:
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 19:19:49 [58130] Setting "CERTMONGER_REQ_SUBJECT" to "CN=ctrl-1-0.mainnetwork.bgp.ftw" for child.
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 19:19:49 [58130] Setting "CERTMONGER_REQ_HOSTNAME" to "ctrl-1-0.mainnetwork.bgp.ftw
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[38973]: overcloud.main.bgp.ftw
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[38973]: " for child.
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 19:19:49 [58130] Setting "CERTMONGER_REQ_PRINCIPAL" to "etcd/ctrl-1-0.mainnetwork.bgp.ftw@BGP.FTW
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[38973]: " for child.
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[38973]: 2021-04-27 19:19:49 [58130] Setting "CERTMONGER_OPERATION" to "SUBMIT" for child.
...
Apr 27 19:19:49 ctrl-1-0.bgp.ftw certmonger[58174]: Certificate in file "/etc/pki/tls/certs/etcd.crt" issued by CA and saved.

Tested in a couple of runs and with this patch the TLS-E deployment
proceeds.
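
The metadata_settings entries described above can be sketched as follows. This is a minimal illustration based on the template in this change, shown without the surrounding internal_tls_enabled condition. The inline comments are an interpretation, not part of the template: the assumption is that the type: vip entry is what gets the service principal for the VIP name registered before certmonger submits its request.

```yaml
metadata_settings:
  - service: etcd
    network: {get_param: [ServiceNetMap, EtcdNetwork]}
    type: vip    # assumed to cover the cloud_name (VIP) entry in the certificate
  - service: etcd
    network: {get_param: [ServiceNetMap, EtcdNetwork]}
    type: node   # assumed to cover the per-node FQDN entry
```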

Change-Id: I8c77ca4b983c8d617b3d0576877c138e75eb4530
2021-04-28 07:43:01 +02:00

heat_template_version: wallaby
description: >
  Containerized etcd services
parameters:
  ContainerEtcdImage:
    description: image
    type: string
  ContainerEtcdConfigImage:
    description: The container image to use for the etcd config_volume
    type: string
  EndpointMap:
    default: {}
    description: Mapping of service endpoint -> protocol. Typically set
                 via parameter_defaults in the resource registry.
    type: json
  ServiceData:
    default: {}
    description: Dictionary packing service data
    type: json
  ServiceNetMap:
    default: {}
    description: Mapping of service_name -> network name. Typically set
                 via parameter_defaults in the resource registry. This
                 mapping overrides those in ServiceNetMapDefaults.
    type: json
  RoleName:
    default: ''
    description: Role name on which the service is applied
    type: string
  RoleParameters:
    default: {}
    description: Parameters specific to the role
    type: json
  EtcdInitialClusterToken:
    description: Initial cluster token for the etcd cluster during bootstrap.
    type: string
    hidden: true
  MonitoringSubscriptionEtcd:
    default: 'overcloud-etcd'
    type: string
  EnableInternalTLS:
    type: boolean
    default: false
  EnableEtcdInternalTLS:
    description: Controls whether etcd and the cinder-volume service use TLS
                 for cinder's lock manager, even when the rest of the internal
                 API network is using TLS.
    type: boolean
    default: true
  InternalTLSCAFile:
    default: '/etc/ipa/ca.crt'
    type: string
    description: Specifies the default CA cert to use if TLS is used for
                 services in the internal network.
  Debug:
    default: false
    description: Set to True to enable debugging on all services.
    type: boolean
  CertificateKeySize:
    type: string
    default: '2048'
    description: Specifies the private key size used when creating the
                 certificate.
  EtcdCertificateKeySize:
    type: string
    default: ''
    description: Override the private key size used when creating the
                 certificate for this service
parameter_groups:
- label: deprecated
  description: |
    The following parameters are deprecated and will be removed. They should not
    be relied on for new deployments. If you have concerns regarding deprecated
    parameters, please contact the TripleO development team on IRC or the
    OpenStack mailing list.
  parameters:
  - EnableEtcdInternalTLS
conditions:
  internal_tls_enabled:
    and:
    - {get_param: EnableInternalTLS}
    - {get_param: EnableEtcdInternalTLS}
  key_size_override_set:
    not: {equals: [{get_param: EtcdCertificateKeySize}, '']}
resources:
  ContainersCommon:
    type: ../containers-common.yaml
outputs:
  role_data:
    description: Role data for the etcd role.
    value:
      service_name: etcd
      firewall_rules:
        '141 etcd':
          dport:
            - 2379
            - 2380
      monitoring_subscription: {get_param: MonitoringSubscriptionEtcd}
      config_settings:
        map_merge:
          - etcd::etcd_name:
              str_replace:
                template:
                  "%{hiera('fqdn_$NETWORK')}"
                params:
                  $NETWORK: {get_param: [ServiceNetMap, EtcdNetwork]}
            tripleo::profile::base::etcd::bind_ip:
              str_replace:
                template:
                  "%{hiera('$NETWORK')}"
                params:
                  $NETWORK: {get_param: [ServiceNetMap, EtcdNetwork]}
            tripleo::profile::base::etcd::client_port: '2379'
            tripleo::profile::base::etcd::peer_port: '2380'
            etcd::debug: {get_param: Debug}
            etcd::initial_cluster_token: {get_param: EtcdInitialClusterToken}
            etcd::manage_package: false
            etcd::manage_service: false
          - if:
            - internal_tls_enabled
            - tripleo::profile::base::etcd::certificate_specs:
                service_certificate: '/etc/pki/tls/certs/etcd.crt'
                service_key: '/etc/pki/tls/private/etcd.key'
              etcd::trusted_ca_file: {get_param: InternalTLSCAFile}
              etcd::peer_trusted_ca_file: {get_param: InternalTLSCAFile}
            # Ensure etcd and cinder-volume aren't configured to use TLS
            - tripleo::profile::base::etcd::enable_internal_tls: false
              tripleo::profile::base::cinder::volume::enable_internal_tls: false
      # BEGIN DOCKER SETTINGS
      puppet_config:
        config_volume: etcd
        config_image: &etcd_config_image {get_param: ContainerEtcdConfigImage}
        step_config:
          list_join:
            - "\n"
            - - "['Etcd_key'].each |String $val| { noop_resource($val) }"
              - "include tripleo::profile::base::etcd"
      kolla_config:
        /var/lib/kolla/config_files/etcd.json:
          command: /usr/bin/etcd --config-file /etc/etcd/etcd.yml
          config_files:
            - source: "/var/lib/kolla/config_files/src/*"
              dest: "/"
              merge: true
              preserve_properties: true
            - source: "/var/lib/kolla/config_files/src-tls/*"
              dest: "/"
              merge: true
              preserve_properties: true
              optional: true
          permissions:
            - path: /var/lib/etcd
              owner: etcd:etcd
              recurse: true
            - path: /etc/pki/tls/certs/etcd.crt
              owner: etcd:etcd
            - path: /etc/pki/tls/private/etcd.key
              owner: etcd:etcd
      docker_config:
        step_2:
          etcd:
            image: {get_param: ContainerEtcdImage}
            net: host
            privileged: false
            restart: always
            healthcheck:
              test: /openstack/healthcheck
            volumes:
              list_concat:
                - {get_attr: [ContainersCommon, volumes]}
                - - /var/lib/etcd:/var/lib/etcd
                  - /var/lib/kolla/config_files/etcd.json:/var/lib/kolla/config_files/config.json:ro
                  - /var/lib/config-data/puppet-generated/etcd/:/var/lib/kolla/config_files/src:ro
                - if:
                  - internal_tls_enabled
                  - - /etc/pki/tls/certs/etcd.crt:/var/lib/kolla/config_files/src-tls/etc/pki/tls/certs/etcd.crt:ro
                    - /etc/pki/tls/private/etcd.key:/var/lib/kolla/config_files/src-tls/etc/pki/tls/private/etcd.key:ro
            environment:
              KOLLA_CONFIG_STRATEGY: COPY_ALWAYS
      container_puppet_tasks:
        # Etcd keys initialization occurs only on single node
        step_2:
          config_volume: 'etcd_init_tasks'
          puppet_tags: 'etcd_key'
          step_config: |
            include tripleo::profile::base::etcd
          config_image: *etcd_config_image
          volumes:
            - /var/lib/config-data/etcd/etc/etcd/:/etc/etcd:ro
            - /var/lib/etcd:/var/lib/etcd:ro
      deploy_steps_tasks:
        if:
          - internal_tls_enabled
          - - name: Certificate generation
              when: step|int == 1
              block:
                - include_role:
                    name: linux-system-roles.certificate
                  vars:
                    certificate_requests:
                      - name: etcd
                        dns:
                          - str_replace:
                              template: "{{fqdn_$NETWORK}}"
                              params:
                                $NETWORK: {get_param: [ServiceNetMap, EtcdNetwork]}
                          - str_replace:
                              template: "{{cloud_names.cloud_name_NETWORK}}"
                              params:
                                NETWORK: {get_param: [ServiceNetMap, EtcdNetwork]}
                        principal:
                          str_replace:
                            template: "etcd/{{fqdn_$NETWORK}}@{{idm_realm}}"
                            params:
                              $NETWORK: {get_param: [ServiceNetMap, EtcdNetwork]}
                        run_after: |
                          # cinder uses etcd, so its containers also need to be refreshed
                          container_names=$({{container_cli}} ps --format=\{\{.Names\}\} | grep -E 'cinder|etcd')
                          service_crt="/etc/pki/tls/certs/etcd.crt"
                          service_key="/etc/pki/tls/private/etcd.key"
                          kolla_dir="/var/lib/kolla/config_files/src-tls"
                          # For each container, check whether the cert file needs to be updated.
                          # The check is necessary because the original THT design directly bind mounted
                          # the files to their final location, and did not copy them in via $kolla_dir.
                          # Regardless of whether the container is directly using the files, or a copy,
                          # there's no need to trigger a reload because the cert is not cached.
                          for container_name in ${container_names[*]}; do
                            {{container_cli}} exec -u root "$container_name" bash -c "
                              [[ -f ${kolla_dir}/${service_crt} ]] && cp ${kolla_dir}/${service_crt} $service_crt;
                              [[ -f ${kolla_dir}/${service_key} ]] && cp ${kolla_dir}/${service_key} $service_key;
                              true
                            "
                          done
                        key_size:
                          if:
                            - key_size_override_set
                            - {get_param: EtcdCertificateKeySize}
                            - {get_param: CertificateKeySize}
                        ca: ipa
      host_prep_tasks:
        - name: create /var/lib/etcd
          file:
            path: /var/lib/etcd
            state: directory
            setype: container_file_t
      external_deploy_tasks:
        if:
          - internal_tls_enabled
          - - name: check if ipa server has required permissions
              when: step|int == 1
              import_role:
                name: tls_everywhere
                tasks_from: ipa-server-check
      upgrade_tasks: []
      metadata_settings:
        if:
          - internal_tls_enabled
          - - service: etcd
              network: {get_param: [ServiceNetMap, EtcdNetwork]}
              type: vip
            - service: etcd
              network: {get_param: [ServiceNetMap, EtcdNetwork]}
              type: node
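# Illustrative sketch only, not part of the upstream template: a Heat
# environment file that would make the internal_tls_enabled condition above
# evaluate to true and select the EtcdCertificateKeySize override.
# The parameter names come from this template; the values are hypothetical,
# and a real TLS-everywhere deployment is assumed to need further settings
# that are not shown here.
#
# parameter_defaults:
#   EnableInternalTLS: true          # template default is false
#   EnableEtcdInternalTLS: true      # template default, listed for clarity
#   EtcdCertificateKeySize: '4096'   # non-empty, so key_size_override_set holds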