Ansible fixed an "issue" where gather_facts ran with any tag defined.
This behaviour has now changed, so we need to gather facts regardless
of the provided tag. More information is available at [1]
[1] https://github.com/ansible/ansible/issues/57529
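A dedicated fact-gathering play tagged "always" is one way to make this explicit (a sketch; the play layout is illustrative, not lifted from the tree):

```yaml
# Gather facts up front so later plays have them even when the run is
# limited to a specific tag; "always" makes this play run regardless.
- name: Gather facts for all hosts
  hosts: all
  gather_facts: false
  tags:
    - always
  tasks:
    - name: Gather facts no matter which tags were supplied
      setup:
```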
Change-Id: Idc8be5f490cba79e70a45d159718ab68c78cbcee
This patch finalizes the migration to journald by cleaning up
unnecessary tasks, such as mapping log directories or launching
the rsyslog_client role.
Depends-On: https://review.opendev.org/672059
Depends-On: https://review.opendev.org/672506
Change-Id: Ib7f2fde163a3d3b965a61fee1dd456a186b84e23
In order to transition all roles to using the new venv build mechanism,
we need to make sure that the git pins override the role pins.
Change-Id: If8c8bd211e20c1f61b3ccb1237c5961843243e1e
This change resolves the deprecation warnings for the use of "include",
which has been removed upstream. Playbooks now use "import_playbook"
and tasks now use "include_tasks". While this change attempts to
resolve all of the include issues, several playbooks will need to be
refactored to resolve the use of include with variables, as this is not
a pattern supported by mainline Ansible.
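The two substitutions look roughly like this (file names are illustrative):

```yaml
# Playbook level: the removed "include" becomes "import_playbook".
- import_playbook: common-playbooks/example.yml

# Task level: the removed "include" becomes "include_tasks".
- hosts: all
  tasks:
    - include_tasks: common-tasks/example-tasks.yml
```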
Change-Id: I8fdb2f9f75f38986ba1dc9f93e274749c49e5c67
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
Now that all the MQ and database creation tasks are in the roles,
and use appropriate defaults, we can remove all the wiring from
group_vars and the tasks.
To cater to the changes in passwords, we also ensure that the
upgrade tooling renames any existing secrets.
The healthcheck-infrastructure.yml playbook is deliberately left
alone due to it being refactored anyway in
https://review.openstack.org/587408
Change-Id: Ie3960e2e2ac9c0aff0bc36f46182be2fc0a038b3
Add new 'aio_distro_basekit' jobs to test the minimal basekit deployment
using distribution packages for the OpenStack services.
We can skip all repo-* related playbooks and roles since we are not
building pip packages for OpenStack services anymore. Finally, we can
populate the utility container using the distribution packages for the
OpenStack client instead of using the wheel packages.
Change-Id: Ia8c394123b5588fff8c4acbe1532ed5a6dc7e8ec
Depends-On: https://review.openstack.org/#/c/583161/
Depends-On: https://review.openstack.org/#/c/567530/
Depends-On: https://review.openstack.org/#/c/580455/
Implements: blueprint openstack-distribution-packages
Without this patch, Ansible parses the includes for containers,
even when running on metal, and therefore even if there is no
reason to do so.
This is a problem because some of those includes rely on variables
set by our dynamic inventory, and would fail when running OSA
with a static inventory on metal.
This patch solves the problem by ensuring the task will not
run unnecessarily.
Change-Id: I101b121d3d94673597a705979c38ea207043555f
This removes, everywhere applicable, the _rabbitmq_
references, replacing them with their new oslo
counterparts. Shims for compatibility are explicitly
referenced so that we can remove them later, once
the roles have been adapted appropriately.
It explicitly puts all the oslo messaging details
in one place, as this is easier to track. It is
applied to the "all" group, as most of the time the
details are used by two or more roles.
This also updates the spice-html5 git repository URL,
as their code moved off GitHub. It also adds
some details to job log collection in order to be
able to gather more information about failures.
Depends-On: https://review.openstack.org/#/c/572413/
Depends-On: https://review.openstack.org/#/c/572565/
Depends-On: https://review.openstack.org/#/c/560574/
Change-Id: I1bab622e22d7d820fa2f774b4df60323b1eb7d1e
The rsyslog role has served us well however there's now a better way
given the ability to remotely journal. This change disables the use of
the `rsyslog_client` and `server` roles unless journal remote is
unavailable. In this change the `rsyslog_client` interaction has been
moved into an include_tasks which is conditionally loaded when the
variable `rsyslog_client_enabled` is set to true. Additionally the
`rsyslog_server` role is disabled unless `rsyslog_server_enabled`
is set true. Using these new variables, the legacy functionality can
be enabled. Disabling the general rsyslog roles lessens the
overall IO on the cloud and improves the speed of the deployment.
> NOTE: At this time there's no suitable package to install
"systemd-journal-remote" on openSUSE, so a conditional has been
added to ensure distributed log syncing remains functional on
all of our supported distros. Should a package become
available for journal remote, we can globally disable the
legacy rsyslog roles entirely by default.
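The conditional load described above looks roughly like this (the task file path and exact condition are assumptions):

```yaml
# Only pull in the legacy rsyslog client tasks when a deployer
# explicitly opts back in to the old behaviour.
- include_tasks: common-tasks/rsyslog-client.yml
  when: rsyslog_client_enabled | bool
```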
Change-Id: Ice21667c6999d0ac86b2d7bde648a0375f890210
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
The openrc role will no longer be executed on all OpenStack
service containers/hosts. Instead a single host is designated
through the use of the ``openstack_service_setup_host``
variable. The default is ``localhost`` (the deployment host).
While each playbook previously executed the role, it did not
need to as the role is already in each service role's meta
dependencies.
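A deployer wanting the setup tasks to run somewhere other than the deployment host could override the variable in user_variables.yml, for example (the group name here is an assumption):

```yaml
# Delegate all OpenStack service setup (users, endpoints, etc.)
# to the first utility container instead of localhost.
openstack_service_setup_host: "{{ groups['utility_all'][0] }}"
```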
Needed-By: https://review.openstack.org/568142
Change-Id: I7bb90264ee379db5410e7ec16c5bac1ee2c398dc
This commit introduces oslo.messaging service variables in place of
the rabbitmq server. This will enable the use of separate and alternative
messaging system backends for RPC and Notify communications with
minimal impact to the overall deployment configuration.
This patch:
* update service passwords
* add oslo-messaging to group vars
* update inventory group vars for each service
* add common task for oslo messaging vhost/user install
* update service install playbooks
Change-Id: I235bd33a52df5451a62d2c8700a65a1bc093e4d6
When the repo container is removed, the apt cache config
is still in place and causes problems. This is due to the
tasks running to remove the cache configuration not
executing early enough in the deployment.
This patch re-orders the tasks so that the removal can
be done and the cache is updated afterwards to confirm
a working state.
It also makes the configuration happen on all hosts, then
all containers, at the right timings to ensure that if the
container is removed, the proxy config will also be removed
automatically.
Change-Id: I0d1bf9617594d9a5c7c2ea3335c80cc2478df985
This option can cause silent failures which are confusing and hard to
track down. While the intention was to allow large scale
deployments to succeed in cases where a single node fails due to
transient issues, it has produced more problems in terms of confusion
than it solves. This change removes the option from all production
playbooks.
Change-Id: I1dcbbf5bc8cc66f11dd8ddc22d2a177c5c0f31f1
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
This change makes it possible for us to add future support for multiple
container technologies within a single deployment. A new variable has
been added allowing the deployer to set the container tech within a
deployment. At this point the only supported container tech is "lxc";
however, in a follow-on PR we intend to add systemd-nspawn.
The playbooks for lxc-containers-* have all been renamed so we have a
consistent experience when sourcing and executing container type plays.
To ensure this change does not break existing deployer automation links
have been created for the old playbook names. In a future release we can
remove these links.
Change-Id: I8c2f8f29a93a3212de73c74c7d1ab7d851bbd204
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
The is_metal variable is set in group_vars/all/all.yml so
it does not need to also be set in every playbook.
Change-Id: I9a2337f0e2b908a59e8d0f3cddf2b61cc48a49a8
Ansible throws a warning when we use gather_facts as a variable. This
patch changes the variable to osa_gather_facts to avoid the warnings.
Closes-Bug: 1722604
Change-Id: Ic0a46b93e1ba62bc8e99e2d0a279c361a168690b
This change modifies the unbound playbook so that it follows a JIT
pattern allowing the playbooks to install what they need when they need
it. Prior to this change the unbound playbook would iterate through all
target hosts, including containers, and attempt to install the client
resource or simply execute the resolveconf role which would end in
failure should the target resource not be available at the time of
execution. Converting this to a JIT pattern should save general time
when installing with and without unbound on initial deployments, and will
guarantee that the target hosts, including containers, have the most
up-to-date client code upon deployment, or upgrade, of a given service.
Change-Id: I829747094cabc8027bad904cb822a6d265f48d73
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
When global_environment_variables is set in user_variables.yml, this
installs environment settings in /etc/environment on all hosts and
containers. These remain in place after deployment is complete.
This patch adds a similar variable deployment_environment_variables
that defines environment strings applied only while the playbooks
are running. They leave nothing behind on the hosts or containers.
This may be used, for example, for proxy settings required only
during deployment. A simpler no_proxy setting is adequate during
deployment, so this provides a workaround to Bug #1691749.
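For example, proxy settings that exist only while the playbooks run might look like this in user_variables.yml (the proxy host and no_proxy list are illustrative):

```yaml
# Applied per-task during the run; nothing is written to /etc/environment.
deployment_environment_variables:
  http_proxy: "http://proxy.example.com:3128"
  https_proxy: "http://proxy.example.com:3128"
  no_proxy: "localhost,127.0.0.1,{{ internal_lb_vip_address }}"
```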
Change-Id: Ia15d2133c6749fa9496bbf9359b8bf075742d60e
Related-Bug: #1691749
The Ceilometer API is now deprecated in favor of using Gnocchi,
Aodh, and Panko to pull telemetry information.
Change-Id: I239c23487d28f9892591e56677976ff6a0caea9d
Partial-Bug: 1666640
Injected role vars, in cases where the vars were not generated
within the playbook run, are not ideal. In the case of our venv vars,
they remove the ability for a deployer to manipulate the venv download
tag or path using group vars.
This cleanup standardizes the venv tag and download url vars using
group vars.
I also removed several unnecessary definitions of
pip_lock_to_internal_repo in the playbooks since it is already
defined in the group_vars/all.yml.
Change-Id: Iddf27179d5babb91f4518202bdae5855f110b958
This removes the rabbitmq deterministic sort common task. Due to
upstream improvements in oslo this common task is no longer needed.
Change-Id: I97c85f0c12b877927779dafa983ec1ce2afb5821
Closes-Bug: #1632831
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
Ansible 2.1.1 introduces a regression in the way conditional
includes are handled which results in every task in the
included file being evaluated even if the condition for the
include is not met. This extends the run time significantly
for a deployment.
This patch forces all conditional includes to be dynamic.
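In the Ansible versions of this era, an include could be forced dynamic with `static: no`; a sketch (task file and condition are illustrative):

```yaml
# With "static: no" the include is resolved at run time, so the tasks
# in the file are never evaluated when the condition is false.
- include: optional-tasks.yml
  static: no
  when: some_feature_enabled | bool
```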
Change-Id: I864e5178f6cb8ca245bafbed9a94ae8d4ea29eae
Related-Bug: https://github.com/ansible/ansible/issues/17687
The numerous tags within the playbook have been condensed
to two potential tags: "$NAMESPACE-config", and "$NAMESPACE".
These tags have been chosen as they are namespaced and cover
the configuration pre-tasks as well as the option to execute
the main playbook from a higher level play. By tagging
everything in the play with "$NAMESPACE" we're ensuring that
everything covered by the playbook has a namespaced tag. Any
place using the "always" tag was left alone as the tasks being
executed must "always" execute whenever the playbook is called.
Notice: The os-swift-setup.yml playbook file has been
removed because it currently serves no purpose. While the
os-swift-sync.yml is no longer being directly called, it has been
left as it could be of use to a deployer.
Change-Id: Iebfd82ebffedc768a18d9d9be6a9e70df2ae8fc1
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
The pip_lock_to_internal_repo variable is repeated in most playbooks,
move it to all's group vars instead.
The repo_all group and lxc_hosts play are exceptions, overriding it to
False.
Change-Id: I65393b74ed00d3348424e06f4f2fd4b4687cb8d6
The rabbitmq-install play now supports a variable
inventory host group defined with *rabbitmq_host_group*.
This allows deployers to use the play to create multiple
RabbitMQ clusters using for example:
Typical infra cluster:
openstack-ansible rabbitmq-install.yml
Second cluster for telemetry:
openstack-ansible -e "rabbitmq_host_group=telemetry_rabbitmq_all" rabbitmq-install.yml
Many vars were moved from group-specific group_vars to the "all" group
as the ceilometer role/play requires the RabbitMQ details and telemetry enabled
state for all services that can potentially be monitored by ceilometer.
By default, the "telemetry cluster" and the "RPC" cluster are the same, but
deployers that wish to create multiple clusters are able to override the
*_rabbitmq_telemetry_* vars when choosing to run separate clusters.
Use case is more fully explained here:
http://trumant.github.io/openstack-ansible-multiple-rabbitmq-clusters.html
Change-Id: I737711085fc0e50c4c5a0ee7c2037927574f8448
Depends-On: Ib23c8829468bbb4ddae67e08a092240f54a6c729
Depends-On: Ifad921525897c5887f6ad6871c6d72c595d79fa6
Depends-On: I7cc5a5dac4299bd3d1bc810ea886f3888abaa2da
Depends-On: Ib3ee5b409dbc2fc2dc8d25990d4962ec49e21131
Depends-On: I397b56423d9ef6757bada27c92b7c9f0d5126cc9
Depends-On: I98cb273cbd8948072ef6e7fc5a15834093d70099
Depends-On: I89363b64ff3f41ee65d72ec51cb8062fc7bf08b0
Implements: blueprint multi-rabbitmq-clusters
Add support to all os-* for downloading venvs built/tagged with
the corresponding CPU architecture for the node deployed on.
Partially Implements: bp/multi-arch-repo
Depends-On: I31756f8383e6d69d2f80caf6a85c4c5021bfc46d
Change-Id: I045de3ac8b81cadbcb34102f1a2db5bff74c32a6
Package cache clients were configured in the repo_server role
previously. This change moves the client side configuration
of the proxy file to the service playbooks using common-tasks.
Change-Id: Icf127db9e279bd15b177347ecc4f3c8fe68b02f2
All of the common tasks shared across all of the playbooks have been
moved into "playbooks/common-tasks" as singular task files which are
simply included as needed.
* This change will assist developers adding additional playbooks, roles,
etc which may need access to common tasks.
* This change will guarantee consistency between playbooks when
executing common tasks which are generally used to setup services.
* This change greatly reduces code duplication across all plays.
* The common-task files have comments at the top for developer
instructions on how a task file can be used.
Change-Id: I399211c139d6388ab56b97b809f93d4936907c7a
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
There is no need to redefine these vars at the playbook
level when they are already defined in the group_vars for
the all group.
Change-Id: I4ef48b5cd34f5c751bf927fce4c9ffa9bd590992
When scoping pip_lock_to_internal_repo to each os_* role it was not
being applied correctly to the pip_install dependency of those roles.
Move pip_lock_to_internal_repo to the list of playbook scoped variables
so that the pip lockdown tasks are run as expected.
Change-Id: Ibd4f312b9b3174987b8bd319b3d5f15c397ec4a2
This commit adds a gather_facts variable in the playbooks, which
defaults to True. If run_playbooks.sh is used, this commit adds
"-e gather_facts=False" to the openstack-ansible CLI for the playbook
run. Everything should work fine for deployers (because there is no
change for them) and for the gate (because fact caching is enabled).
It should speed up the gate by avoiding the long "setup" task cost for each host.
Instead, the setup is done on the appropriate hosts, at the appropriate time.
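The play-level pattern this describes is roughly the following (the host group is illustrative):

```yaml
# Facts are gathered unless the CLI passes -e gather_facts=False.
- hosts: galera_all
  gather_facts: "{{ gather_facts | default(True) }}"
```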
Change-Id: I348a4b7dfe70d56a64899246daf65ea834a75d2a
Signed-off-by: Jean-Philippe Evrard <jean-philippe.evrard@rackspace.co.uk>
The pip_install and pip_lock_down roles have been merged.
Update playbooks to make use of the merged pip_install role by providing
the 'pip_lock_to_internal_repo' variable based on whether or not
'pip_links' contains any entries.
Change-Id: I59ad75ac54fd2c54172b404c53cc2659b9dfe482
Ceilometer has no hard dependency on galera so this variable is
unnecessary.
Depends-On: If8780ddc85f18f93facc9ee61e773f602f03ccc5
Change-Id: I336dea8b1f303aa0f0fc3ac45f9c7664b6566600
When RabbitMQ is not installed by OSA, because a deployer is using an
existing RabbitMQ installation, or because it is not needed
(eg. Standalone Swift), then do not setup messaging vhost and user for
the various services.
Change-Id: Ia35c877939cb3d4a6b0d792165af8729a7062a6e
The Keystone role previously migrated the messaging vhost and user setup to
a pre-task in the os-keystone-install.yml playbook. This review continues this
migration for all other roles where this is applicable.
Change-Id: I3016039692d8130654fe1bff422f24ef2afc196e
ansible_hostname is not used within any tasks and, by default, is the
same value as container_name.
ansible_ssh_host and container_address are also the same value by
default, both are assigned to hosts through the dynamic inventory script.
Additionally, overriding ansible_ssh_host through playbook vars breaks
tasks that delegate to other hosts when using Ansible 2.
Change-Id: I2525b476c9302ef37f29e2beb132856232d129e2
This change builds venvs in a single repo container and then
ships them to all targets. The built venvs will be within
the repo servers and will allow for faster deployments,
upgrades, and more consistent deployments for the life cycle
of the deployment.
This will create a versioned tarball that will allow for
greater visibility into the build process, as well as giving
deployers/developers the ability to compare a release in
place.
Change-Id: Ieef0b89ebc009d1453c99e19e53a36eb2d70edae
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
This commit conditionally allows the os_ceilometer role to
build and deploy within a venv. This is the new
default behavior of the role; however, the functionality
can be disabled.
Change-Id: Ic12d0cb2151124fc5150170205d5d226dab53e5b
Implements: blueprint enable-venv-support-within-the-roles
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
This patch reduces the time it takes for the playbooks to execute when the
container configuration has not changed, as the task to wait for a successful
ssh connection (after an initial delay) is not executed.
Change-Id: I52727d6422878c288884a70c466ccc98cd6fd0ff
This change implements a change in the file name for each service
so that the log rotate files don't collide when running on a shared
host.
Change-Id: Ia42656e4568c43667d610aa8421d2fa25437e2aa
Closes-Bug: 1499799
This commit removes the use of the net-cache flushing from all
$service plays, which ensures that the cache is not overly flushed;
excessive flushing could impact performance of services like neutron.
The lxc-container-destroy role was removed because it is not used,
and if it were ever used, its use would result in the same
situation covered by this issue.
Additionally, it was noted that on container restarts, the mac addresses
of the container interfaces change. If *no* flushing is done at all,
this results in long run times whilst the arp entry for the container IP
times out. Hence, we add a configuration option here that causes a
gratuitous arp whenever an interface has its mac set, and/or the link
comes up. Because of the way the container veths work, we can't rely
on that happening on a link up event. So we forcefully set the mac
address in the post-up hook for the interface to force the issue of the
gratuitous arp.
Co-Authored-By: Evan Callicoat <diopter@gmail.com>
Co-Authored-By: Darren Birkett <darren.birkett@gmail.com>
Change-Id: I96800b2390ffbacb8341e5538545a3c3a4308cf3
Closes-Bug: 1497433
This patch adds additional retries to the ssh wait check
tasks which were unintentionally omitted from
https://review.openstack.org/218572
Change-Id: Id8f7df5e283a9f61373f1bfb167a4c0bd098cc25
Closes-Bug: #1490142
This patch adds a wait for the container's sshd to be available
after the container's apparmor profile is updated. When the
profile is updated the container is restarted, so this wait is
essential to the success of the playbook's completion.
It also includes 3 retries which has been found to improve the
rate of success.
Due to an upstream change in behaviour with netaddr 0.7.16 we
need to pin the package to a lower version until Neutron is
adjusted and we bump the Neutron SHA.
Change-Id: I30575ee31929b0c9af6353b7255cdfb6cebd2104
Closes-Bug: #1490142
Having the lxc container create role drop the lxc-openstack apparmor
profile on all containers any time it is executed leads to the possibility
of the lxc container create task overwriting the running profile on a given
container. If this happens, it is likely to cause service interruption until the
correct profile is loaded for all containers affected by the action.
To fix this issue, the default "lxc-openstack" profile has been removed from the
lxc container create task and added to all plays that are known to be executed
within an lxc container. This will ensure that the profile is untouched on
subsequent runs of the lxc-container-create.yml play.
Change-Id: Ifa4640be60c18f1232cc7c8b281fb1dfc0119e56
Closes-Bug: 1487130
This patch adds a configurable delay time for retrying the
ssh connection when waiting for the containers to restart.
This is useful for environments where resources are constrained
and containers may take longer to restart.
Change-Id: I0383e34a273b93e1b2651460c853cf1ceba89029
Closes-Bug: #1476885
This patch adds a check for the appropriate OpenSSH Daemon
response when waiting for the container to restart. This is
an optimisation over simply waiting for the TCP port.
Change-Id: Ie25af4f57bb98fb1d846d579b58b4d479b476675
Closes-Bug: #1476885