no-tls-endpoints-public-ip.yaml is a new file that needs to be validated
along with the other TLS environments, so we can make sure that the
EndpointMap will be constructed correctly with all needed endpoints.
Change-Id: I5e83b37d8fa757065a6dab87d6eeac1c345efd32
CephClient should be removed from the CephAll role.
The only thing it does is set the keys, which is already
handled by the ceph mon profile.
If it is not removed, it will cause: Duplicate declaration: Class[Ceph::Keys]
Change-Id: I77bbec1edd21cd6a4212a381a1a7712adc4b604f
Related-Bug: 1722633
Since old-style nic config files can no longer be used in Queens
or Rocky (the format changed in Ocata), add a check in yaml-validate
to detect that old-style files are in use and point to the conversion
script that can be used.
Made some changes to the script to ask before overwriting the nic
config file and save a datestamped copy as backup. In addition,
the script now takes an optional parameter to define location
of run-os-net-config.sh.
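For reference, a minimal sketch of the two formats (the interface entry
and file layout are illustrative, not copied from this patch); old-style
files drove os-apply-config directly, while new-style files template
run-os-net-config.sh:

  # Old style (no longer valid in Queens/Rocky):
  resources:
    OsNetConfigImpl:
      type: OS::Heat::StructuredConfig
      properties:
        group: os-apply-config
        config:
          os_net_config:
            network_config:
              - type: interface
                name: nic1
                use_dhcp: true

  # New style:
  resources:
    OsNetConfigImpl:
      type: OS::Heat::SoftwareConfig
      properties:
        group: script
        config:
          str_replace:
            template:
              get_file: network/scripts/run-os-net-config.sh
            params:
              $network_config:
                network_config:
                  - type: interface
                    name: nic1
                    use_dhcp: true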
Change-Id: Ic56c48fa35ab2f4c1762c0e370be03fbf2e7671c
Closes-Bug: 1753812
The resulting pre_upgrade_rolling_steps_playbook will be executed in a
node-by-node rolling fashion at the beginning of major upgrade
workflow (before upgrade_steps_playbook).
The current intended use case is special handling of L3 agent upgrade
when moving Neutron services into containers. Special care needs to be
taken in this case to preserve L3 connectivity of instances (with
regard to dnsmasq and keepalived sub-processes of L3 agent).
The playbook can be run before the main upgrade like this:
openstack overcloud upgrade run --roles overcloud --playbook pre_upgrade_rolling_steps_playbook.yaml
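Services hook into this playbook via pre_upgrade_rolling_tasks in their
role_data output; a minimal sketch (the task body is illustrative):

  outputs:
    role_data:
      value:
        pre_upgrade_rolling_tasks:
          - name: Stop neutron-l3-agent before containerizing it
            service:
              name: neutron-l3-agent
              state: stopped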
Partial-Bug: #1738768
Change-Id: Icb830f8500bb80fd15036e88fcd314bf2c54445d
Implements: blueprint major-upgrade-workflow
Co-Authored-By: Martin André <m.andre@redhat.com>
Co-Authored-By: Dan Prince <dprince@redhat.com>
Co-Authored-By: Emilien Macchi <emilien@redhat.com>
Partially-Implements: bp tripleo-ui-undercloud-container
Change-Id: I1109d19e586958ac4225107108ff90187da30edd
The ControlPlanePort interface changed in the deployed-server
resource. It now takes an additional fixed_ips parameter.
This adds the required parameter, which is needed for FFU testing in CI.
Adjust the validation to account for the fixed_ips discrepancies between
the interfaces of the two different templates, deployed-neutron-port and
ctlplane-port.
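As a sketch, the port template gains a parameter along these lines (the
exact schema is an assumption, mirroring the fixed_ips property of
OS::Neutron::Port):

  parameters:
    fixed_ips:
      description: A list of fixed IP assignments for the port.
      type: json
      default: []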
Change-Id: I58af23129bcba04a367d0169dcafd53d33ab42f2
Closes-Bug: #1755837
In the last step of FFU we need to switch repos before running the upgrade.
We do so by introducing post FFU steps and running the switch in
them. We also update heat agents and os-collect-config on nodes.
Change-Id: I649afc6fa384ae21edc5bc917f8bb586350e5d47
Updating OpenStack (within release) means updating ODL from v1 to v1.1.
This is done by "openstack overcloud update" which collects
update_tasks. ODL needs 2 different steps to achieve this
minor update. These are called Level1 and Level2. L1 is
simple - stop ODL, update, start. This is taken care of by paunch
and no separate implementation is needed. L2 has extra steps
which are implemented in update_tasks and post_update_tasks.
Updating ODL within the same major release (1->1.1) consists of either
L1 or L2 steps. These steps are decided from ODLUpdateLevel parameter
specified in environments/services-docker/update-odl.yaml.
Upgrading ODL to the next major release (1.1->2) requires
only the L2 steps. These are implemented as upgrade_tasks and
post_upgrade_tasks in https://review.openstack.org/489201.
Steps involved in a level 2 update are:
1. Block OVS instances to connect to ODL
2. Set ODL upgrade flag to True
3. Start ODL
4. Start Neutron re-sync and wait for it to finish
5. Delete OVS groups and ports
6. Stop OVS
7. Unblock OVS ports
8. Start OVS
9. Unset ODL upgrade flag
These steps are exactly the same as the upgrade_tasks.
The logic implemented is to follow the upgrade_tasks when
update_level == 2.
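A condensed sketch of how the update_tasks can be gated on the level
(the task bodies are illustrative, not the full implementation):

  update_tasks:
    - name: Set the ODL update level
      set_fact:
        odl_update_level: {get_param: ODLUpdateLevel}
    - name: Block OVS instances from connecting to ODL
      when:
        - step|int == 0
        - odl_update_level == 2
      iptables:
        chain: INPUT
        protocol: tcp
        destination_port: 6653
        jump: DROP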
Change-Id: Ie532800663dd24313a7350b5583a5080ddb796e7
Extra checks:
- Check that only tags=['common', 'validate',
'pre-upgrade'] are accepted.
- Fail if tags not defined.
- Fail if the 'step|int == ' condition inside 'when'
is not evaluated first.
- Suggest the use of lists to append conditions
inside 'when'.
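A task that would pass these checks might look like this (assuming the
checks apply to fast_forward_upgrade_tasks, as the accepted tags
suggest; the task body is illustrative):

  fast_forward_upgrade_tasks:
    - name: Stop the sample service
      tags: ['common']
      when:
        - step|int == 1
        - release == 'ocata'
      service:
        name: sample-service
        state: stopped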
Change-Id: I15f6d4cb6f2a13d04580779a93a02daf86f8b412
If ceph-nfs (ganesha) service is enabled, it's set up by ceph-ansible
and it can be used as a manila backend. Manila can be configured to use
ceph either directly (manila-cephfsnative-config-docker.yaml env file)
or through ganesha (environments/manila-cephfsganesha-config-docker.yaml
env file).
Change-Id: Ib408c7827e5fba0c1b01388db26363806fc64370
Partially-Implements: blueprint nfs-ganesha
This change adds a StorageNFS network. It's required by
https://review.openstack.org/#/c/471245 which implements
NFS Ganesha backend for Manila service.
To define and enable the StorageNFS network, deploy using
network_data_ganesha.yaml instead of network_data.yaml.
Aside from the former adding the StorageNFS network, the two
files are otherwise identical.
If enabled, it's also necessary to add the StorageNFSIpSubnet and
StorageNFSNetworkVlanID heat parameters to the network templates.
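A sketch of the parameters a nic config template would gain (defaults
are illustrative):

  parameters:
    StorageNFSIpSubnet:
      description: IP address/subnet on the storage_nfs network
      type: string
      default: ''
    StorageNFSNetworkVlanID:
      description: Vlan ID for the storage_nfs network traffic
      type: number
      default: 70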
Co-Authored-By: Dan Sneddon <dsneddon@redhat.com>
Change-Id: If31722d669efe91082c93ecb815e6c41676480c8
Partially-Implements: blueprint nfs-ganesha
Realtime roles could be deployed in a cluster along with the
non-realtime roles. The only difference is the image and a few
parameters. Two roles, ComputeOvsDpdkRT and ComputeSriovRT, are
added.
Change-Id: Ieb43a83af46cf5d002eab8cb9a67431fefdd8d59
This change introduces the ComputeRealTime role: an initial version
of a custom role that can be used for real-time compute nodes.
Partially Implements: blueprint tripleo-realtime
Change-Id: I935cf8a74415b28f50eac041d6d3f92dc1ec1391
A new set of tasks is being added,
the 'fast_forward_upgrade_tasks', which
share their structure with 'upgrade_tasks'; hence
the same set of validations is applied
to check for the correct task structure.
Change-Id: I26898e2df68d80d0721e0467c5220f34835f6ba4
This converts "tags: stepN" to "when: step|int == N" for the direct
execution as an ansible playbook, with a loop variable 'step'.
The tasks all include the explicit cast |int.
This also adds a set_fact task for handling the package removal
with the UpgradeRemovePackages parameter (no change to the interface).
The yaml-validate script also now checks for duplicate 'when:' statements.
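A before/after sketch of the conversion (the task itself is
illustrative):

  # Before:
  upgrade_tasks:
    - name: Stop the sample service
      tags: step2
      service:
        name: sample-service
        state: stopped

  # After:
  upgrade_tasks:
    - name: Stop the sample service
      when: step|int == 2
      service:
        name: sample-service
        state: stopped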
Q upgrade spec @ Ibde21e6efae3a7d311bee526d63c5692c4e27b28
Related Blueprint: major-upgrade-workflow
Change-Id: I6adc5619a28099f4e241351b63377f1e96933810
With the move to containers, Ceph OSDs may be combined with other
Ceph services and dedicated Ceph monitors on controllers will be
used less. Popular Ceph roles which include OSDs are Ceph file
and object storage, and nodes which can run all Ceph services. This pattern
will also apply to HCI roles. This change adds the following
pre-composed roles to make it easier for users to use these
patterns:
- CephAll: Standalone Storage Full Role (OSD + MON + RGW + MDS + MGR + RBD Mirroring)
- CephFile: Standalone Scale-out File Role (OSD + MDS)
- CephObject: Standalone Scale-out Object Role (OSD + RGW)
- HciCephAll: HCI Full Stack Role (OSD + MON + Nova + RGW + MDS + MGR + RBD Mirroring)
- HciCephFile: HCI Scale-out File Role (OSD + Nova + MDS)
- HciCephObject: HCI Scale-out Object Role (OSD + Nova + RGW)
- HciCephMon: HCI Scale-out Block Full Role (OSD + MON + MGR + Nova)
- ControllerNoCeph: OpenStack Controller without any Ceph Services
Change-Id: Idce7aa04753eadb459124d6095efd1fe2cc95c17
All SoftwareDeployment resources should use the name property when using
config-download.
This also adds a validation to check that the name property is set in
yaml-validate.py
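A minimal sketch of a deployment following this convention (resource
and parameter names are illustrative):

  resources:
    SampleDeployment:
      type: OS::Heat::SoftwareDeployment
      properties:
        name: SampleDeployment
        config: {get_resource: SampleConfig}
        server: {get_param: server}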
Change-Id: I621e282a2e2c041a0701da0296881c615f0bfda4
Closes-Bug: #1733586
The service NovaMigrationTarget is missing in the SR-IOV compute role,
but is required for migration of instances. Added the missing
service to the role, and added validation to avoid such mistakes.
Closes-Bug: #1730275
Change-Id: I49d310b0c61331eef2d2bf5fd05cf67b34095bbb
This adds another interface like external_deploy_tasks, but instead
of running on each deploy step, the tasks are run after the deploy
is completed, so it's useful for per-service bootstrapping such
as is under development for octavia in:
https://review.openstack.org/#/c/508195
https://review.openstack.org/#/c/515402/
These reviews could potentially be reworked to use this interface,
which would avoid the issue where the configuration needs to happen
after all the openstack services are deployed and configured.
As an example, here is how you could create a temp file post deploy:
  external_post_deploy_tasks:
    - name: Test something happens post-deploy
      copy:
        dest: /tmp/debugpostdeploy
        content: "done"
Change-Id: Iff3190a7d5a238c8647a4ac474821aeda5f2b1f8
Step config is only required within the puppet_config section
of docker/services/*. This patch drops the top level 'step_config'
and updates the unit tests accordingly.
Change-Id: I7dc7cfae3ef1965ec95b1d9ef23e7f162418c034
The compute service list is polled until all expected hosts are reported or a
timeout occurs (600s).
Adds a cellv2_discovery flag to puppet services. Used to generate a list of
hosts that should have cellv2 host mappings.
Adds a canonical fqdn that should match the fqdn reported by a host.
Adds the ability to upload a config script for docker config instead of using
complex bash one-liners.
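A sketch of a docker service using such a script (names, paths and the
image parameter are illustrative of the mechanism, not taken from this
patch):

  docker_config_scripts:
    nova_statedir_ownership.sh:
      mode: "0700"
      content: |
        #!/bin/bash
        chown -R nova:nova /var/lib/nova
  docker_config:
    step_4:
      nova_statedir_owner:
        image: {get_param: DockerNovaComputeImage}
        user: root
        volumes:
          - /var/lib/docker-config-scripts/nova_statedir_ownership.sh:/nova_statedir_ownership.sh:ro
          - /var/lib/nova:/var/lib/nova
        command: /nova_statedir_ownership.sh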
Closes-bug: 1720821
Change-Id: I33e2f296526c957cb5f96dff19682a4e60c6a0f0
The format which ceph-ansible uses to describe the list of pools
to be created in the cluster is different from the one which
puppet-ceph uses; this commit updates the description and
the docker templates accordingly.
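A sketch of the two formats (pool names and values are illustrative):

  # puppet-ceph style:
  CephPools:
    volumes:
      pg_num: 32
      pgp_num: 32

  # ceph-ansible style:
  CephPools:
    - name: volumes
      pg_num: 32
      rule_name: replicated_rule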
Change-Id: I1e5b2c3cbf6ae02c19a2275ca119fed6e173319d
Closes-Bug: #1720373
Services can define external_deploy_tasks, which are meant to be
executed on the undercloud node. They are step-based as the other
Ansible tasks we have, and they get executed during each deployment
step before the puppet and docker tasks.
These tasks can be used to perform complex actions from the
undercloud, such as executing nested installers like kubespray or
ceph-ansible. This should allow deploying overcloud with a single
Ansible playbook, and without creating an Ansible->Mistral->Ansible loop.
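A minimal sketch of a service using this interface (the task is
illustrative):

  external_deploy_tasks:
    - name: Run a nested installer from the undercloud
      when: step|int == 2
      command: ansible-playbook /usr/share/ceph-ansible/site-docker.yml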
Implements: blueprint ansible-config-download
Change-Id: I3dcafb96f5cea5fdcebe2b2012b61a38b0568834
Depends-On: I8491540edf78711f3229eabeda22a17cd55e99c8
This adds a simple validation that checks that the required outputs
are present in the templates for logging to stdout or files. It also
disables checking the usual required parameters (EndpointMap,
ServiceNetMap, etc.) since these are not used.
bp logging-stdout-rsyslog
Change-Id: I1d7d0faa5f9488cfba08107fc87ecd213f07d063
Presently, "openstack overcloud config download" does not support all
Deployment resources, only those included in the RoleData and are
natively of type group:ansible.
This patch adds support for also pulling all the deployment data for
OS::Heat::SoftwareDeployment (singular) resources applied to individual
servers of any group type. Those resources are mapped to a new nested
stack via the config-download-environment.yaml environment.
The nested stack has the same interface as a SoftwareDeployment but only
creates an OS::Heat::Value resource. The "config download" code will be
updated in a separate patch to read the deployment data from these Value
resources and apply them via ansible.
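A sketch of the nested stack's core (parameter list abbreviated to the
essentials; illustrative):

  parameters:
    config:
      type: string
    server:
      type: string
    name:
      type: string
      default: ''
  resources:
    Deployment:
      type: OS::Heat::Value
      properties:
        value:
          config: {get_param: config}
          server: {get_param: server}
          name: {get_param: name}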
The related tripleo-common patch (which depends on this patch) is:
I7d7f6b831b8566390d8f747fb6f45e879b0392ba
implements: blueprint ansible-config-download
Change-Id: Ic2af634403b1ab2924c383035f770453f39a2cd5
Adds update_tasks for the minor update workflow. These will be
collected into playbooks during an initial 'update init' heat
stack update and then invoked later by the operator as ansible
playbooks.
Current understanding/workflow:
Step=1: stop the cluster on the updated node
Step=2: Pull the latest image and retag it as pcmklatest
Step=3: yum upgrade happens on the host
Step=4: Restart the cluster on the node
Step=5: Verification: test pacemaker services are running.
https://etherpad.openstack.org/p/tripleo-pike-updates-upgrades
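A condensed sketch of such update_tasks (task bodies are illustrative):

  update_tasks:
    - name: Stop the cluster on the updated node
      when: step|int == 1
      pacemaker_cluster:
        state: offline
    - name: Update host packages
      when: step|int == 3
      yum:
        name: '*'
        state: latest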
Related-Bug: 1715557
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Co-Authored-By: Sofer Athlan-Guyot <sathlang@redhat.com>
Change-Id: I101e0f5d221045fbf94fb9dc11a2f30706843806
The services that the docker services depend on have logging_source and
logging_groups, but those are not set on the docker outputs, so they are
not used when the containerized services are deployed.
Added logging_source & logging_groups as docker optional parameters in
tools/yaml-validate.py
Closes-Bug: #1718110
Change-Id: I8795eaf4bd06051e9b94aa50450dee0d8761e526
Upgrades from older versions using the Management network fail.
This patch enables the management network even though it is not
enabled in any of the role definitions. This will allow upgrades
to complete using existing network environment files, without
requiring operators to switch to the new method for defining
which networks are attached to roles. Eventually these older
environment files will be removed.
Change-Id: Iadd12a559f0ad6918958a1355f189187fd327363
Closes-bug: 1717123
Using the service_ prefix seems incoherent with its use in
service_config_settings (vs config_settings).
Change-Id: Ia39f181415bee0071409dabddfa0c5c312915e1f
This adds a new config/deployment per role that will come after any
post deploy steps. It drives the same ansible config as the
upgrade_tasks but instead collects the post_upgrade_tasks for any
service in the given role.
The workflow is upgrade_tasks, then post deploy steps (either
puppet/ or docker/ depending on the env) and then the
post_upgrade_tasks added here.
This is added to the pacemaker/cinder-volume.yaml service for now;
see the bug below for more info.
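A minimal sketch of the new section in a service template (the task is
illustrative):

  post_upgrade_tasks:
    - name: Restart cinder-volume after the post deploy steps
      when: step|int == 1
      service:
        name: openstack-cinder-volume
        state: restarted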
Change-Id: Iced34fecf02ebddc91df9302de54d2f4c2cab680
Closes-Bug: 1706951
This adds the --quiet (-q) option to yaml-validate.py. With '-q',
yaml-validate.py will only print warnings or errors. With '-qq',
yaml-validate.py will print only errors. This makes it much easier to
spot problems when working on templates.
This commit does not change the default behavior of yaml-validate.py.
Change-Id: I358f1d4edde03714627d98361b44e6b90ce5e93c
In Ocata all live-migration over ssh is performed on the default ssh port (22).
In Pike the containerized live-migration over ssh is on port 2022 as the
docker host's sshd is using port 22.
To allow live migration during upgrade we need to temporarily pin the Pike
computes to port 22 and in the final converge we can switch over to port 2022.
This also changes the default port to 2022 for baremetal computes in Pike to
enable live-migration between baremetal and containerized computes.
Change-Id: Icb9bfdd9a99dc1dce28eb95c50a9a36bffa621b1
Depends-On: I0b80b81711f683be539939e7d084365ff63546d3
Closes-Bug: 1714171
In every ansible task defined within upgrade_tasks it is
necessary to specify 'tags', which are used during
the ansible execution for the upgrade_tasks serialization.
Adding the 'tags' check per upgrade_tasks step into
the YAML validation will allow us to catch if any
service upgrade task is missing this flag.
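A task carrying the expected tag (the task itself is illustrative):

  upgrade_tasks:
    - name: Check cluster status
      tags: step0,validation
      shell: pcs status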
Change-Id: I8f56a87cc2e9ffc0d827bbb729f6bc3f6ca7550b
Add the qdrouterd container as an infrastructure component
that provides a messaging backend for oslo.messaging. Currently
the qdr role aliases the rabbitmq service.
This patch:
* Add qdrouterd to docker services
* Update environments docker file
* Add global_config_settings to yaml validate
Change-Id: Ief8c09a2728b6e1a1127a53b6df2affecc0ce3c4
Use a separate config_volume for swift_ringbuilder puppet_config tasks.
This is necessary so that the swift_ringbuilder and swift-storage
services don't both rsync files to the same bind mounted directory.
The rsync command from docker-puppet.py uses --delete-after, so when
they both use the same config_volume, they can end up deleting the files
generated by the other (depending on the order of execution).
Even though a separate config_volume is used, the rings must still end up
in /etc/swift for the swift services containers. An additional
container init task is used to copy the ring files into
/var/lib/config-data/puppet-generated/swift/etc/swift so that they will
be present when the actual swift services containers are started.
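A sketch of such an init task (the image parameter and step are
illustrative):

  docker_config:
    step_3:
      swift_copy_rings:
        image: {get_param: DockerSwiftProxyImage}
        user: root
        detach: false
        volumes:
          - /var/lib/config-data/puppet-generated/swift_ringbuilder:/swift_ringbuilder:ro
          - /var/lib/config-data/puppet-generated/swift:/swift
        command: cp -a /swift_ringbuilder/etc/swift/. /swift/etc/swift/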
Change-Id: I05821e76191f64212704ca8e3b7428cda6b3a4b7
Closes-Bug: #1710952
The current check just validates that the heat template versions are using the
release alias instead of the date format.
This patch adds stricter validation to ensure the heat template versions
used are supported by the release. It also emits a warning if the most
current template version is not used.
Change-Id: I9e1d9b05fd5aa91c6a26d032043f386b6a9b072d
There is logic in nova-base.yaml that depends on the default for
this parameter being '', and the nova-compute service only needs it
set to auto during upgrade. That will be done by [1] anyway, so it
doesn't matter what the default is. It's also not clear to me that
the nova-compute task is even needed now that we're post-Ocata, but
that's not a change I feel comfortable making.
1: https://github.com/openstack/tripleo-heat-templates/blob/master/environments/major-upgrade-composable-steps.yaml
Change-Id: Iccfcb5b68e406db1b942375803cfedbb929b4307
Partial-Bug: 1700664
These are mostly the low hanging fruit that only required a few
minor changes to fix. There are more that require a lot of changes
or might be more controversial; those will be done later.
Change-Id: I55cebc92ef37a3bb167f5fae0debe77339395e62
Partial-Bug: 1700664
The key_name default is ignored because the parameter is used in
some mutually exclusive environments where the default doesn't
need to be the same.
Change-Id: I77c1a1159fae38d03b0e59b80ae6bee491d734d7
Partial-Bug: 1700664
Services that access database have to read an extra MySQL configuration file
/etc/my.cnf.d/tripleo.cnf which holds client-only settings, like client bind
address and SSL configuration. The configuration file is thus used by
containerized services, but also by non-containerized services that still
run on the host.
In order to generate that client configuration file appropriately both on the
host and for containers, 1) the MySQLClient service must be included by the
role; 2) every containerized service which uses the database must include the
mysql::client profile in the docker-puppet config generation step.
By including the mysql::client profile in each containerized service, we ensure
that any change in configuration file will be reflected in the service's
/var/lib/config-data/{service}, and that paunch will restart the service's
container automatically.
We now only rely on MySQLClient from puppet/services, to make it possible to
generate /etc/my.cnf.d/tripleo.cnf on the host, and to set the hiera keys that
drive the generation of that config file in containers via docker-puppet.
We include a new YAML validation step to ensure that any service which depends
on MySQL will initialize the mysql::client profile during the docker-puppet
step.
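A sketch of the docker-puppet hook in a containerized service (the
service shown is illustrative):

  puppet_config:
    config_volume: nova
    step_config:
      list_join:
        - "\n"
        - - include ::tripleo::profile::base::nova::api
          - include ::tripleo::profile::base::database::mysql::client
    config_image: {get_param: DockerNovaConfigImage}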
Change-Id: I0dab1dc9caef1e749f1c42cfefeba179caebc8d7
Add docker profiles to deploy Ceph in containers via ceph-ansible. This is
implemented by triggering a Mistral workflow during one of the overcloud
deployment steps, as provided by [1].
Some new service-specific parameters are available to determine the workflow to
execute and the ansible playbook to use. A new `CephAnsibleExtraConfig`
parameter can be used to provide arbitrary config variables consumed by `ceph-ansible`.
The pre-existing template params consumed up until the Pike release to
drive `puppet-ceph` continue to work and are translated, when possible, into
the equivalent `ceph-ansible` variable.
A new environment file is added to enable use of ceph-ansible;
the pre-existing puppet-ceph implementation remains unchanged and usable
for non-containerized deployments.
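A sketch of using the new parameter from an environment file (the
variables shown are illustrative ceph-ansible settings):

  parameter_defaults:
    CephAnsibleExtraConfig:
      journal_size: 512
      ceph_conf_overrides:
        global:
          osd_pool_default_size: 3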
1. https://review.openstack.org/#/c/463324/
Change-Id: I81d44a1e198c83a4ef8b109b4eb6c611555dcdc5
This patch adds parameters to configure alternative version
of the Zaqar messaging and management backends.
The intent is to make use of these settings in the
containerized undercloud to use the swift/mysql backends as defaults,
thus avoiding the dependency on MongoDB.
Change-Id: Ifd6a561737184c9322192ffc9a412c77d6eac3e9
Depends-On: Ie6a56b9163950cee2c0341afa0c0ddce665f3704
Depends-On: I3598e39c0a3cdf80b96e728d9aa8a7e6505e0690
Since these are obviously global parameters they shouldn't specify
what will be using them because they are used in multiple places.
Change-Id: I5054c2d67dffe802e37f8391dd7bad4721e29831
Partial-Bug: 1700664
It seems UpdateIdentifier is an overloaded parameter - it is used
both to trigger package updates in the minor update case as well as
to trigger the upgrade steps during a major upgrade. I'm not sure
it's appropriate to change either of the descriptions as a result,
so for the moment that is added to the exclusion list.
Change-Id: Ied36cf259f6a6e5c8cfa7a01722fb7fda6900976
Partial-Bug: 1700664
This way we have one list of problems that need to be fixed and can
enable this check to avoid adding any new ones. As parameters are
fixed they can be removed from the exclusion list.
Change-Id: Icb5fc36e2da3a3bfb7eaa8a66464c685220e527f
Makes it possible to resolve network subnets within a service
template; the data is transported into a new property ServiceData
wired into every service which hopefully is generic enough to
be extended in the future and transport more data.
Data can be consumed in service templates to set config values
which need to know the subnet where a daemon operates (for
example the Ceph Public vs Cluster network).
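A sketch of consuming it in a service template (the net_cidr_map key
is an assumption about the ServiceData layout):

  config_settings:
    ceph::profile::params::cluster_network:
      get_param: [ServiceData, net_cidr_map, storage_mgmt]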
Change-Id: I28e21c46f1ef609517175f7e7ee19e28d1c0cba2
With the merging of Iad3e9b215c6f21ba761c8360bb7ed531e34520e6 the
roles_data.yaml should be generated with tripleoclient rather than
edited. This change adds in a pep8 task to verify that the appropriate
role files in roles/ have been modified to match how our default
roles_data.yaml is constructed. Additionally, this change adds a new tox
target called 'genrolesdata' that will allow you to automatically generate
roles_data.yaml and roles_data_undercloud.yaml.
Change-Id: I5eb15443a131a122d1a4abf6fc15a3ac3e15941b
Related-Blueprint: example-custom-role-environments