ff3173bdcf
On particular role compositions, the code joining the update_tasks might order things differently than on a typical 3ctrl control plane, and the ovn-dbs tasks at step1 (which require the cluster to be up) will happen after the pacemaker task at step1 which stops the cluster. So we can observe something like the following:

    2021-09-10 10:05:13.370339 | 001c2891-506d-f833-ff5a-000000000954 | TASK | Change the bundle operation timeout
    2021-09-10 10:05:14.136798 | 001c2891-506d-f833-ff5a-000000000954 | CHANGED | Change the bundle operation timeout | ovn-db-01
    2021-09-10 10:05:14.137982 | 001c2891-506d-f833-ff5a-000000000954 | TIMING | Change the bundle operation timeout | ovn-db-01 | 0:00:54.808754 | 0.77s
    2021-09-10 10:05:14.146853 | 001c2891-506d-f833-ff5a-000000000956 | TASK | Acquire the cluster shutdown lock to stop pacemaker cluster
    2021-09-10 10:05:14.508085 | 001c2891-506d-f833-ff5a-000000000956 | CHANGED | Acquire the cluster shutdown lock to stop pacemaker cluster | ovn-db-01
    2021-09-10 10:05:14.509257 | 001c2891-506d-f833-ff5a-000000000956 | TIMING | Acquire the cluster shutdown lock to stop pacemaker cluster | ovn-db-01 | 0:00:55.180032 | 0.36s
    2021-09-10 10:05:14.518668 | 001c2891-506d-f833-ff5a-000000000957 | TASK | Stop pacemaker cluster
    2021-09-10 10:05:18.559627 | 001c2891-506d-f833-ff5a-000000000957 | CHANGED | Stop pacemaker cluster | ovn-db-01
    2021-09-10 10:05:18.560561 | 001c2891-506d-f833-ff5a-000000000957 | TIMING | Stop pacemaker cluster | ovn-db-01 | 0:00:59.231336 | 4.04s
    2021-09-10 10:05:18.569161 | 001c2891-506d-f833-ff5a-000000000958 | TASK | Start pacemaker cluster
    2021-09-10 10:05:18.627924 | 001c2891-506d-f833-ff5a-000000000958 | SKIPPED | Start pacemaker cluster | ovn-db-01
    2021-09-10 10:05:18.628678 | 001c2891-506d-f833-ff5a-000000000958 | TIMING | Start pacemaker cluster | ovn-db-01 | 0:00:59.299453 | 0.06s
    2021-09-10 10:05:18.637292 | 001c2891-506d-f833-ff5a-000000000959 | TASK | Release the cluster shutdown lock
    2021-09-10 10:05:18.694945 | 001c2891-506d-f833-ff5a-000000000959 | SKIPPED | Release the cluster shutdown lock | ovn-db-01
    2021-09-10 10:05:18.695717 | 001c2891-506d-f833-ff5a-000000000959 | TIMING | Release the cluster shutdown lock | ovn-db-01 | 0:00:59.366493 | 0.06s
    2021-09-10 10:05:18.704368 | 001c2891-506d-f833-ff5a-00000000095a | TASK | Clear ovndb cluster pacemaker error
    2021-09-10 10:05:19.368816 | 001c2891-506d-f833-ff5a-00000000095a | FATAL | Clear ovndb cluster pacemaker error | ovn-db-01 | error={"changed": true, "cmd": "pcs resource cleanup ovn-dbs-bundle", "delta": "0:00:00.399084", "end": "2021-09-10 10:05:20.044985", "msg": "non-zero return code", "rc": 1, "start": "2021-09-10 10:05:19.645901", "stderr": "Error: Unable to forget failed operations of resource: ovn-dbs-bundle\nError connecting to the CIB manager: Transport endpoint is not connected\nError performing operation: Transport endpoint is not connected", "stderr_lines": ["Error: Unable to forget failed operations of resource: ovn-dbs-bundle", "Error connecting to the CIB manager: Transport endpoint is not connected", "Error performing operation: Transport endpoint is not connected"], "stdout": "", "stdout_lines": []}

We cannot call pcs resource cleanup at step1; we must call it at step0 so we're guaranteed that the cluster is up, no matter how heat/ansible decide to order the update_tasks.

Note: This is the short-term, less-invasive fix. The mid-to-long-term fix should be to verify that we can now remove the workarounds that were implemented for OVN bugs.

Closes-Bug: #1943254
Change-Id: Idd827f72c0033978db7b9a8ea6acec2086cda961
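The ordering constraint described in this commit message can be sketched as an update_tasks fragment. This is only an illustration and not the actual change: the task name and the pcs command are taken from the log above, while the surrounding layout and the exact form of the condition are assumptions.

```yaml
# Hypothetical update_tasks fragment. Only the task name and the pcs command
# come from the log above; the layout and the condition are assumptions.
update_tasks:
  - name: Clear ovndb cluster pacemaker error
    # Run at step 0, before the step 1 task that stops the pacemaker cluster,
    # so the CIB manager is still reachable however the step 1 tasks are ordered.
    shell: pcs resource cleanup ovn-dbs-bundle
    when: step|int == 0
```

Keying the cleanup to an earlier step, rather than relying on the order of tasks within step 1, is what makes the result independent of the role composition.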
tripleo-heat-templates
Heat templates to deploy OpenStack using OpenStack.
- Free software: Apache License (2.0)
- Documentation: https://docs.openstack.org/tripleo-docs/latest/
- Source: https://opendev.org/openstack/tripleo-heat-templates
- Bugs: https://bugs.launchpad.net/tripleo
- Release notes: https://docs.openstack.org/releasenotes/tripleo-heat-templates/
Features
The ability to deploy a multi-node, role based OpenStack deployment using OpenStack Heat. Notable features include:
- Choice of deployment/configuration tooling: puppet, (soon) docker
- Role based deployment: roles for the controller, compute, ceph, swift, and cinder storage
- Physical network configuration: support for isolated networks, bonding, and standard ctlplane networking
Directories
A description of the directory layout in TripleO Heat Templates.
- environments: contains heat environment files that can be used with -e
on the command line to enable features, etc.
- extraconfig: templates used to enable 'extra' functionality. Includes
functionality for distro specific registration and upgrades.
- firstboot: example first_boot scripts that can be used when initially
creating instances.
- network: heat templates to help create isolated networks and ports
- puppet: templates mostly driven by configuration with puppet. To use these
templates you can use the overcloud-resource-registry-puppet.yaml.
- validation-scripts: validation scripts useful to all deployment
configurations
- roles: example roles that can be used with the tripleoclient to generate
a roles_data.yaml for a deployment. See the roles/README.rst for additional details.
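As a rough illustration of the kind of file the environments directory holds, the sketch below shows a minimal Heat environment file. The file name and the values are hypothetical; parameter_defaults is a standard Heat environment section, and the parameter names are assumed to be defined by the templates in use.

```yaml
# my-environment.yaml (hypothetical file name)
# Values under parameter_defaults override template parameter defaults.
parameter_defaults:
  # Assumed parameter names and example values; check the templates in use.
  NtpServer: pool.ntp.org
  TimeZone: 'UTC'
```

Such a file would typically be passed with -e on the deploy command line, for example `openstack overcloud deploy --templates -e my-environment.yaml`.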
Service testing matrix
The configuration for the CI scenarios will be defined in tripleo-heat-templates/ci/ and should be executed according to the following table:
- | scn000 | scn001 | scn002 | scn003 | scn004 | scn006 | scn007 | scn009 | scn010 | scn013 | non-ha | ovh-ha |
---|---|---|---|---|---|---|---|---|---|---|---|---|
keystone | | | | | | | | | | | | |
glance | | swift | | | | | | | | | ||
cinder | | iscsi | ||||||||||
heat | | | ||||||||||
ironic | | |||||||||||
mysql | | | | | | | | | | | | |
neutron | | | | | | | | | | | ||
neutron-bgpvpn | | |||||||||||
ovn | | |||||||||||
neutron-l2gw | | |||||||||||
om-rpc | rabbit | rabbit | | rabbit | rabbit | rabbit | rabbit | rabbit | rabbit | rabbit | ||
om-notify | rabbit | rabbit | rabbit | rabbit | rabbit | rabbit | rabbit | rabbit | rabbit | rabbit | ||
redis | | | ||||||||||
haproxy | | | | | | | | | | | ||
memcached | | | | | | | | | | | ||
pacemaker | | | | | | | | | | | ||
nova | | | | | ironic | | | | | | ||
placement | | | | | | | | | | | ||
ntp | | | | | | | | | | | | |
snmp | | | | | | | | | | | | |
timezone | | | | | | | | | | | | |
mistral | | |||||||||||
swift | | |||||||||||
aodh | | | ||||||||||
ceilometer | | | ||||||||||
gnocchi | | | ||||||||||
barbican | | |||||||||||
zaqar | | |||||||||||
cephrgw | | |||||||||||
cephmds | | |||||||||||
manila | | |||||||||||
collectd | | |||||||||||
designate | | |||||||||||
octavia | | | ||||||||||
rear | | |||||||||||
Extra Firewall | |