tripleo-heat-templates/deployment/haproxy/haproxy-pacemaker-puppet.yaml
Jiri Stransky 8827e4f7f1 Pacemaker resource upgrade tasks compatible with staged upgrade
Add better idempotency checks on editing the pacemaker resources and
fetching and re-tagging new images, which prevents the upgrade from
failing. The latest status after staged upgrade looks like this:

Online: [ controller-0 controller-1 controller-2 ]
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 redis-bundle-0@controller-0 redis-bundle-1@controller-1 ]

Full list of resources:

 podman container set: galera-bundle [brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhosp15/openstack-mariadb:pcmklatest]
   galera-bundle-0      (ocf:💓galera):        Master controller-0
   galera-bundle-1      (ocf:💓galera):        Master controller-1
   galera-bundle-2      (ocf:💓galera):        Master controller-2
 podman container set: rabbitmq-bundle [brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhosp15/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0    (ocf:💓rabbitmq-cluster):      Started controller-0
   rabbitmq-bundle-1    (ocf:💓rabbitmq-cluster):      Started controller-1
 podman container set: redis-bundle [brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhosp15/openstack-redis:pcmklatest]
   redis-bundle-0       (ocf:💓redis): Master controller-0
   redis-bundle-1       (ocf:💓redis): Slave controller-1
 ip-192.168.24.8        (ocf:💓IPaddr2):       Started controller-0
 ip-10.0.0.106  (ocf:💓IPaddr2):       Started controller-0
 ip-172.17.1.16 (ocf:💓IPaddr2):       Started controller-0
 ip-172.17.1.23 (ocf:💓IPaddr2):       Started controller-0
 ip-172.17.3.11 (ocf:💓IPaddr2):       Started controller-0
 ip-172.17.4.25 (ocf:💓IPaddr2):       Started controller-0
 podman container set: haproxy-bundle [brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhosp15/openstack-haproxy:pcmklatest]
   haproxy-bundle-podman-0      (ocf:💓podman):        Started controller-0
   haproxy-bundle-podman-1      (ocf:💓podman):        Started controller-1
   haproxy-bundle-podman-2      (ocf:💓podman):        Stopped
 podman container: openstack-cinder-volume [brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhosp15/openstack-cinder-volume:pcmklatest]
   openstack-cinder-volume-podman-0     (ocf:💓podman):        Started controller-1

Failed Resource Actions:
* rabbitmq_monitor_10000 on rabbitmq-bundle-0 'unknown error' (1): call=4861, status=Timed Out, exitreason='',
    last-rc-change='Mon Aug  5 10:37:51 2019', queued=0ms, exec=0ms
* rabbitmq_monitor_10000 on rabbitmq-bundle-1 'unknown error' (1): call=42, status=Timed Out, exitreason='',
    last-rc-change='Mon Aug  5 10:15:55 2019', queued=0ms, exec=0ms

This indicates that there are still issues we'll need to solve, but at
least the upgrade passes now and we can keep solving the follow-up
issues while the critical upgrade path is unblocked.

Closes-Bug: #1838971
Change-Id: I2e88dc34fa59624523de4c52a1873438c78e972f
2019-08-05 13:28:29 +02:00

566 lines
25 KiB
YAML

heat_template_version: rocky
description: >
OpenStack containerized HAproxy service for pacemaker
parameters:
ContainerHAProxyImage:
description: image
type: string
ContainerHAProxyConfigImage:
description: The container image to use for the haproxy config_volume
type: string
ServiceData:
default: {}
description: Dictionary packing service data
type: json
ServiceNetMap:
default: {}
description: Mapping of service_name -> network name. Typically set
via parameter_defaults in the resource registry. This
mapping overrides those in ServiceNetMapDefaults.
type: json
DefaultPasswords:
default: {}
type: json
EndpointMap:
default: {}
description: Mapping of service endpoint -> protocol. Typically set
via parameter_defaults in the resource registry.
type: json
SSLCertificate:
default: ''
description: >
The content of the SSL certificate (without Key) in PEM format.
type: string
PublicSSLCertificateAutogenerated:
default: false
description: >
Whether the public SSL certificate was autogenerated or not.
type: boolean
EnablePublicTLS:
default: true
description: >
Whether to enable TLS on the public interface or not.
type: boolean
DeployedSSLCertificatePath:
default: '/etc/pki/tls/private/overcloud_endpoint.pem'
description: >
The filepath of the certificate as it will be stored in the controller.
type: string
RoleName:
default: ''
description: Role name on which the service is applied
type: string
RoleParameters:
default: {}
description: Parameters specific to the role
type: json
EnableInternalTLS:
type: boolean
default: false
InternalTLSCAFile:
default: '/etc/ipa/ca.crt'
type: string
description: Specifies the default CA cert to use if TLS is used for
services in the internal network.
HAProxyInternalTLSCertsDirectory:
default: '/etc/pki/tls/certs/haproxy'
type: string
HAProxyInternalTLSKeysDirectory:
default: '/etc/pki/tls/private/haproxy'
type: string
HAProxySyslogAddress:
default: /dev/log
description: Syslog address where HAproxy will send its log
type: string
HAProxySyslogFacility:
default: local0
description: Syslog facility HAProxy will use for its logs
type: string
ConfigDebug:
default: false
description: Whether to run config management (e.g. Puppet) in debug mode.
type: boolean
PcmkConfigRestartTimeout:
default: 600
description: Time in seconds to wait for a pcmk resource to restart when
a config change is detected and the resource is being restarted
type: number
ContainerCli:
type: string
default: 'podman'
description: CLI tool used to manage containers.
constraints:
- allowed_values: ['docker', 'podman']
DeployIdentifier:
default: ''
type: string
description: >
Setting this to a unique value will re-run any deployment tasks which
perform configuration on a Heat stack-update.
conditions:
puppet_debug_enabled: {get_param: ConfigDebug}
public_tls_enabled:
and:
- {get_param: EnablePublicTLS}
- or:
- not:
equals:
- {get_param: SSLCertificate}
- ""
- equals:
- {get_param: PublicSSLCertificateAutogenerated}
- true
internal_tls_enabled: {equals: [{get_param: EnableInternalTLS}, true]}
resources:
ContainersCommon:
type: ../containers-common.yaml
HAProxyBase:
type: ./haproxy-container-puppet.yaml
properties:
ServiceData: {get_param: ServiceData}
ServiceNetMap: {get_param: ServiceNetMap}
DefaultPasswords: {get_param: DefaultPasswords}
EndpointMap: {get_param: EndpointMap}
RoleName: {get_param: RoleName}
RoleParameters: {get_param: RoleParameters}
outputs:
role_data:
description: Role data for the HAproxy role.
value:
service_name: haproxy
monitoring_subscription: {get_attr: [HAProxyBase, role_data, monitoring_subscription]}
config_settings:
map_merge:
- get_attr: [HAProxyBase, role_data, config_settings]
- tripleo::haproxy::haproxy_service_manage: false
tripleo::haproxy::mysql_clustercheck: true
tripleo::haproxy::haproxy_log_address: {get_param: HAProxySyslogAddress}
tripleo::haproxy::haproxy_log_facility: {get_param: HAProxySyslogFacility}
- haproxy_docker: true
tripleo::profile::pacemaker::haproxy_bundle::haproxy_docker_image: &haproxy_image {get_param: ContainerHAProxyImage}
tripleo::profile::pacemaker::haproxy_bundle::container_backend: {get_param: ContainerCli}
# the list of directories that contain the certs to bind mount in the countainer
# bind-mounting the directories rather than all the cert, key and pem files ensures
# that docker won't create directories on the host when then pem files do not exist
tripleo::profile::pacemaker::haproxy_bundle::tls_mapping: &tls_mapping
list_concat:
- if:
- public_tls_enabled
- - get_param: DeployedSSLCertificatePath
- null
- if:
- internal_tls_enabled
- - get_param: InternalTLSCAFile
- get_param: HAProxyInternalTLSKeysDirectory
- get_param: HAProxyInternalTLSCertsDirectory
- null
tripleo::profile::pacemaker::haproxy_bundle::internal_certs_directory: {get_param: HAProxyInternalTLSCertsDirectory}
tripleo::profile::pacemaker::haproxy_bundle::internal_keys_directory: {get_param: HAProxyInternalTLSKeysDirectory}
# disable the use CRL file until we can restart the container when the file expires
tripleo::haproxy::crl_file: null
tripleo::profile::pacemaker::haproxy_bundle::haproxy_docker_image: &haproxy_image_pcmklatest
list_join:
- ':'
- - yaql:
data: {get_param: ContainerHAProxyImage}
expression: $.data.rightSplit(separator => ":", maxSplits => 1)[0]
- 'pcmklatest'
# BEGIN DOCKER SETTINGS
puppet_config:
config_volume: haproxy
puppet_tags: haproxy_config
step_config:
list_join:
- "\n"
- - "exec {'wait-for-settle': command => '/bin/true' }"
- "class tripleo::firewall(){}; define tripleo::firewall::rule( $port = undef, $dport = undef, $sport = undef, $proto = undef, $action = undef, $state = undef, $source = undef, $iniface = undef, $chain = undef, $destination = undef, $extras = undef){}"
- "['pcmk_bundle', 'pcmk_resource', 'pcmk_property', 'pcmk_constraint', 'pcmk_resource_default'].each |String $val| { noop_resource($val) }"
- 'include ::tripleo::profile::pacemaker::haproxy_bundle'
config_image: {get_param: ContainerHAProxyConfigImage}
volumes: &deployed_cert_mount
yaql:
expression: $.data.select($+":"+$+":ro")
data: *tls_mapping
kolla_config:
/var/lib/kolla/config_files/haproxy.json:
# HAProxy 1.8 doesn't ship haproxy-systemd-wrapper, we have
# to use a new dedicated option for live config reload.
# Note: we can't use quotes in kolla command, hence the workaround
command: bash -c $* -- eval if [ -f /usr/sbin/haproxy-systemd-wrapper ]; then exec /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg; else exec /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws; fi
config_files:
- source: "/var/lib/kolla/config_files/src/*"
dest: "/"
merge: true
preserve_properties: true
optional: true
- source: "/var/lib/kolla/config_files/src-tls/*"
dest: "/"
merge: true
optional: true
preserve_properties: true
permissions:
- path: /var/lib/haproxy
owner: haproxy:haproxy
recurse: true
- path:
list_join:
- ''
- - {get_param: HAProxyInternalTLSCertsDirectory}
- '/*'
owner: haproxy:haproxy
perm: '0600'
optional: true
- path:
list_join:
- ''
- - {get_param: HAProxyInternalTLSKeysDirectory}
- '/*'
owner: haproxy:haproxy
perm: '0600'
optional: true
container_config_scripts: {get_attr: [ContainersCommon, container_config_scripts]}
docker_config:
step_2:
haproxy_restart_bundle:
start_order: 2
detach: false
net: host
ipc: host
user: root
config_volume: haproxy
environment:
- TRIPLEO_MINOR_UPDATE
command:
- '/usr/bin/bootstrap_host_exec'
- 'haproxy'
- str_replace:
template:
'if [ x"${TRIPLEO_MINOR_UPDATE,,}" != x"true" ] && /usr/sbin/pcs resource show haproxy-bundle; then /usr/sbin/pcs resource restart --wait=PCMKTIMEOUT haproxy-bundle; echo "haproxy-bundle restart invoked"; fi'
params:
PCMKTIMEOUT: {get_param: PcmkConfigRestartTimeout}
image: {get_param: ContainerHAProxyImage}
volumes:
list_concat:
- {get_attr: [ContainersCommon, volumes]}
-
- /etc/corosync/corosync.conf:/etc/corosync/corosync.conf:ro
- /var/lib/config-data/puppet-generated/haproxy/:/var/lib/kolla/config_files/src:ro
haproxy_init_bundle:
start_order: 3
detach: false
net: host
ipc: host
user: root
privileged: true
command: # '/container_puppet_apply.sh "STEP" "TAGS" "CONFIG" "DEBUG"'
list_concat:
- - '/container_puppet_apply.sh'
- '2'
- 'file,file_line,concat,augeas,pacemaker::resource::bundle,pacemaker::property,pacemaker::resource::ip,pacemaker::resource::ocf,pacemaker::constraint::order,pacemaker::constraint::colocation'
- 'include ::tripleo::profile::base::pacemaker; include ::tripleo::profile::pacemaker::haproxy_bundle'
- if:
- puppet_debug_enabled
- - '--debug'
- - ''
image: {get_param: ContainerHAProxyImage}
volumes:
list_concat:
- {get_attr: [ContainersCommon, container_puppet_apply_volumes]}
- *deployed_cert_mount
-
- /etc/corosync/corosync.conf:/etc/corosync/corosync.conf:ro
environment:
# NOTE: this should force this container to re-run on each
# update (scale-out, etc.)
- list_join:
- ''
- - 'TRIPLEO_DEPLOY_IDENTIFIER='
- {get_param: DeployIdentifier}
host_prep_tasks:
- {get_attr: [HAProxyBase, role_data, host_prep_tasks]}
- name: Check if rsyslog exists
shell: systemctl is-active rsyslog
register: rsyslog_config
- when:
- rsyslog_config is changed
- rsyslog_config.rc == 0
block:
- name: Forward logging to haproxy.log file
blockinfile:
content: |
if $syslogfacility-text == '{{facility}}' and $programname == 'haproxy' then -/var/log/containers/haproxy/haproxy.log
& stop
create: yes
path: /etc/rsyslog.d/openstack-haproxy.conf
vars:
facility: {get_param: HAProxySyslogFacility}
register: logconfig
- name: restart rsyslog service after logging conf change
service:
name: rsyslog
state: restarted
when: logconfig is changed
- name: create persistent directories
file:
path: "{{ item.path }}"
state: directory
setype: "{{ item.setype }}"
with_items:
- { 'path': /var/log/containers/haproxy, 'setype': var_log_t }
- { 'path': /var/lib/haproxy, 'setype': svirt_sandbox_file_t }
- { 'path': /var/log/haproxy, 'setype': svirt_sandbox_file_t }
- name: haproxy logs readme
copy:
dest: /var/log/haproxy/readme.txt
content: |
Log files from the haproxy containers can be found under
/var/log/containers/haproxy.
ignore_errors: true
metadata_settings:
{get_attr: [HAProxyBase, role_data, metadata_settings]}
deploy_steps_tasks:
- name: HAproxy tag container image for pacemaker
when: step|int == 1
import_role:
name: tripleo-container-tag
vars:
container_image: {get_param: ContainerHAProxyImage}
container_image_latest: *haproxy_image_pcmklatest
- name: Run puppet on the host to apply IPtables rules
when: step|int == 2
shell: |
set +e
puppet apply {{ puppet_debug }} --detailed-exitcodes --summarize --color=false \
--modulepath '{{ puppet_modulepath }}' --tags '{{ puppet_tags }}' -e '{{ puppet_execute }}'
rc=$?
set -e
set +ux
if [ $rc -eq 2 -o $rc -eq 0 ]; then
exit 0
fi
exit $rc
vars:
puppet_execute: include ::tripleo::profile::base::haproxy
puppet_tags: tripleo::firewall::rule
puppet_modulepath: /etc/puppet/modules:/opt/stack/puppet-modules:/usr/share/openstack-puppet/modules
puppet_debug:
if:
- puppet_debug_enabled
- '--debug --verbose'
- ''
update_tasks:
- name: Set HAProxy upgrade facts
block: &haproxy_update_upgrade_facts
- name: set is_haproxy_bootstrap_node fact
tags: common
set_fact: is_haproxy_bootstrap_node={{haproxy_short_bootstrap_node_name|lower == ansible_hostname|lower}}
- name: Mount TLS cert if needed
when:
- step|int == 1
- is_haproxy_bootstrap_node
block:
- name: Check haproxy public certificate configuration in pacemaker
command: cibadmin --query --xpath "//storage-mapping[@id='haproxy-cert']"
ignore_errors: true
register: haproxy_cert_mounted
- name: Disable the haproxy cluster resource
pacemaker_resource:
resource: haproxy-bundle
state: disable
wait_for_resource: true
register: output
retries: 5
until: output.rc == 0
# rc == 6 means the configuration doesn't exist in the CIB
when: haproxy_cert_mounted.rc == 6
- name: Set HAProxy public cert volume mount fact
set_fact:
haproxy_public_cert_path: {get_param: DeployedSSLCertificatePath}
haproxy_public_tls_enabled: {if: [public_tls_enabled, true, false]}
- name: Add a bind mount for public certificate in the haproxy bundle
command: pcs resource bundle update haproxy-bundle storage-map add id=haproxy-cert source-dir={{ haproxy_public_cert_path }} target-dir=/var/lib/kolla/config_files/src-tls/{{ haproxy_public_cert_path }} options=ro
when: haproxy_cert_mounted.rc == 6 and haproxy_public_tls_enabled|bool
- name: Enable the haproxy cluster resource
pacemaker_resource:
resource: haproxy-bundle
state: enable
wait_for_resource: true
register: output
retries: 5
until: output.rc == 0
when: haproxy_cert_mounted.rc == 6
- name: Haproxy fetch and retag container image for pacemaker
when: step|int == 2
block: &haproxy_fetch_retag_container_tasks
- name: Get container haproxy image
set_fact:
haproxy_image: {get_param: ContainerHAProxyImage}
haproxy_image_latest: *haproxy_image_pcmklatest
- name: Pull latest haproxy images
command: "{{container_cli}} pull {{haproxy_image}}"
- name: Get previous haproxy image id
shell: "{{container_cli}} inspect --format '{{'{{'}}.Id{{'}}'}}' {{haproxy_image_latest}}"
register: old_haproxy_image_id
failed_when: false
- name: Get new haproxy image id
shell: "{{container_cli}} inspect --format '{{'{{'}}.Id{{'}}'}}' {{haproxy_image}}"
register: new_haproxy_image_id
- name: Retag pcmklatest to latest haproxy image
include_role:
name: tripleo-container-tag
vars:
container_image: "{{haproxy_image}}"
container_image_latest: "{{haproxy_image_latest}}"
when:
- old_haproxy_image_id.stdout != new_haproxy_image_id.stdout
- block:
- name: Get a list of container using haproxy image
shell: "{{container_cli}} ps -a -q -f 'ancestor={{old_haproxy_image_id.stdout}}'"
register: haproxy_containers_to_destroy
# It will be recreated with the delpoy step.
- name: Remove any container using the same haproxy image
shell: "{{container_cli}} rm -fv {{item}}"
with_items: "{{ haproxy_containers_to_destroy.stdout_lines }}"
- name: Remove previous haproxy images
shell: "{{container_cli}} rmi -f {{old_haproxy_image_id.stdout}}"
when:
- old_haproxy_image_id.stdout != ''
- old_haproxy_image_id.stdout != new_haproxy_image_id.stdout
upgrade_tasks:
- name: Prepare switch of haproxy image name
when:
- step|int == 0
block:
- name: Get haproxy image id currently used by pacemaker
shell: "pcs resource config haproxy-bundle | grep -Eo 'image=[^ ]+' | awk -F= '{print $2;}'"
register: haproxy_image_current_res
failed_when: false
- name: Image facts for haproxy
set_fact:
haproxy_image_latest: *haproxy_image_pcmklatest
haproxy_image_current: "{{haproxy_image_current_res.stdout}}"
- name: Prepare the switch to new haproxy container image name in pacemaker
block:
- name: Temporarily tag the current haproxy image id with the upgraded image name
import_role:
name: tripleo-container-tag
vars:
container_image: "{{haproxy_image_current}}"
container_image_latest: "{{haproxy_image_latest}}"
pull_image: false
when:
- haproxy_image_current != ''
- haproxy_image_current != haproxy_image_latest
- name: Check haproxy cluster resource status
shell: pcs resource config haproxy-bundle
failed_when: false
register: haproxy_pcs_res_result
- name: Set upgrade haproxy facts
set_fact:
haproxy_pcs_res: "{{haproxy_pcs_res_result|succeeded}}"
is_haproxy_bootstrap_node: "{{haproxy_short_bootstrap_node_name|lower == ansible_hostname|lower}}"
- name: Update haproxy pcs resource bundle for new container image
when:
- step|int == 1
- is_haproxy_bootstrap_node|bool
- haproxy_pcs_res|bool
- haproxy_image_current != haproxy_image_latest
block:
- name: Disable the haproxy cluster resource before container upgrade
pacemaker_resource:
resource: haproxy-bundle
state: disable
wait_for_resource: true
register: output
retries: 5
until: output.rc == 0
- name: Expose HAProxy stats socket on the host and mount TLS cert if needed
block:
- name: Check haproxy stats socket configuration in pacemaker
command: cibadmin --query --xpath "//storage-mapping[@id='haproxy-var-lib']"
ignore_errors: true
register: haproxy_stats_exposed
- name: Check haproxy public certificate configuration in pacemaker
command: cibadmin --query --xpath "//storage-mapping[@id='haproxy-cert']"
ignore_errors: true
register: haproxy_cert_mounted
- name: Add a bind mount for stats socket in the haproxy bundle
command: pcs resource bundle update haproxy-bundle storage-map add id=haproxy-var-lib source-dir=/var/lib/haproxy target-dir=/var/lib/haproxy options=rw
# rc == 6 means the configuration doesn't exist in the CIB
when: haproxy_stats_exposed.rc == 6
- name: Set HAProxy public cert volume mount fact
set_fact:
haproxy_public_cert_path: {get_param: DeployedSSLCertificatePath}
haproxy_public_tls_enabled: {if: [public_tls_enabled, true, false]}
- name: Add a bind mount for public certificate in the haproxy bundle
command: pcs resource bundle update haproxy-bundle storage-map add id=haproxy-cert source-dir={{ haproxy_public_cert_path }} target-dir=/var/lib/kolla/config_files/src-tls/{{ haproxy_public_cert_path }} options=ro
when:
- haproxy_cert_mounted.rc == 6
- haproxy_public_tls_enabled|bool
- name: Update the haproxy bundle to use the new container image name
command: "pcs resource bundle update haproxy-bundle container image={{haproxy_image_latest}}"
- name: Enable the haproxy cluster resource
pacemaker_resource:
resource: haproxy-bundle
state: enable
wait_for_resource: true
register: output
retries: 5
until: output.rc == 0
- name: Create hiera data to upgrade haproxy in a stepwise manner.
when:
- step|int == 1
block:
- name: set haproxy upgrade node facts in a single-node environment
set_fact:
haproxy_short_node_names_upgraded: "{{ haproxy_short_node_names }}"
cacheable: no
when: groups['haproxy'] | length <= 1
- name: set haproxy upgrade node facts from the limit option
set_fact:
haproxy_short_node_names_upgraded: "{{ haproxy_short_node_names_upgraded|default([]) + [item.split('.')[0]] }}"
cacheable: no
when:
- groups['haproxy'] | length > 1
- item.split('.')[0] in ansible_limit.split(',')
loop: "{{ haproxy_short_node_names }}"
- debug:
msg: "Prepare haproxy upgrade for {{ haproxy_short_node_names_upgraded }}"
- fail:
msg: >
You can't upgrade haproxy without staged
upgrade. You need to use the limit option in order
to do so.
when: >-
haproxy_short_node_names_upgraded is not defined or
haproxy_short_node_names_upgraded | length == 0
- name: add the haproxy short name to hiera data for the upgrade.
include_role:
name: tripleo-upgrade-hiera
tasks_from: set.yml
vars:
tripleo_upgrade_key: haproxy_short_node_names_override
tripleo_upgrade_value: "{{haproxy_short_node_names_upgraded}}"
- name: remove the extra hiera data needed for the upgrade.
include_role:
name: tripleo-upgrade-hiera
tasks_from: remove.yml
vars:
tripleo_upgrade_key: haproxy_short_node_names_override
when: haproxy_short_node_names_upgraded | length == haproxy_short_node_names | length
- name: Retag the pacemaker image if containerized
when:
- step|int == 3
block: *haproxy_fetch_retag_container_tasks