This change adds boot priorities to the container types where it is
beneficial. LXC starts containers with higher priority values ahead of
others: the default priority is 0, and all values greater than 0 are
started in descending order.
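For example, if the play renders the priority through the standard LXC
lxc.start.order key, a container that should boot ahead of the default might
carry a line like the following in its config (the value 100 is only
illustrative):
  lxc.start.order = 100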
This change has no upgrade impact as the additional config is only used
on boot and will not require a container to be restarted.
Change-Id: I89703a24516d47d560c9c888538c384ad2228eb7
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
A previous commit added a hard-coded host group to rabbitmq-install.yml
instead of using the rabbitmq_host_group variable. This commit updates
rabbitmq-install.yml to use rabbitmq_host_group.
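A minimal sketch of the corrected play header, assuming the variable defaults
to the standard rabbitmq_all group (the default may instead live in group
vars):
  - name: Install rabbitmq server
    hosts: "{{ rabbitmq_host_group | default('rabbitmq_all') }}"
    roles:
      - rabbitmq_server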
Change-Id: I4c4f8d0b903cc56ed025b7223b52d3205d9c9d37
RabbitMQ nodes must all be stopped prior to a major/minor version
upgrade [1]. The role does this by distinguishing between the upgrader
node and the rest in separate stop and start tasks.
Upgrades can fail when more than one member of rabbitmq_all is not yet a
member of the cluster. This is due to a bug fixed for greenfield
deployments by 21c2478901. The same fix was not applied to upgrades
because major/minor upgrades require all RabbitMQ nodes to be stopped,
which is incompatible with serialising the role in isolation.
This change uses a play to stop all but one of the nodes prior to
running the rabbitmq_server role, and then serialises the role so that
nodes are upgraded one at a time. This keeps the downtime as short as
possible while still applying the role to only one node at a time.
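A sketch of the resulting play structure (play and task names here are
illustrative rather than copied from the playbook):
  - name: Stop all but one RabbitMQ node ahead of the upgrade
    hosts: rabbitmq_all[1:]
    tasks:
      - name: Stop rabbitmq-server
        service:
          name: rabbitmq-server
          state: stopped
  - name: Apply the rabbitmq_server role one node at a time
    hosts: rabbitmq_all
    serial: 1
    roles:
      - rabbitmq_server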
This change is related to an equivalent one [2] made to the test
playbook of the role openstack-ansible-rabbitmq_server.
[1] http://www.rabbitmq.com/clustering.html#upgrading
[2] https://review.openstack.org/#/c/390966/
Change-Id: Icca5cb1a96f83063223b6ddbeb02eeb562b0931b
The numerous tags within the playbook have been condensed
to two potential tags: "$NAMESPACE-config", and "$NAMESPACE".
These tags have been chosen as they are namespaced and cover
the configuration pre-tasks as well as the option to execute
the main playbook from a higher level play. By tagging
everything in the play with "$NAMESPACE" we're ensuring that
everything covered by the playbook has a namespaced tag. Any
place using the "always" tag was left alone as the tasks being
executed must "always" execute whenever the playbook is called.
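For example, with swift as the namespace a deployer can now limit a run to
the configuration pre-tasks, or target everything the play covers, along
these lines:
  openstack-ansible os-swift-install.yml --tags swift-config
  openstack-ansible os-swift-install.yml --tags swift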
Notice: The os-swift-setup.yml playbook file has been removed because it
currently serves no purpose. While os-swift-sync.yml is no longer called
directly, it has been left in place as it could still be of use to a
deployer.
Change-Id: Iebfd82ebffedc768a18d9d9be6a9e70df2ae8fc1
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
The rabbitmq-install play now supports a variable inventory host group,
defined with *rabbitmq_host_group*. This allows deployers to use the play
to create multiple RabbitMQ clusters, for example:
Typical infra cluster:
openstack-ansible rabbitmq-install.yml
Second cluster for telemetry:
openstack-ansible -e "rabbitmq_host_group=telemetry_rabbitmq_all" rabbitmq-install.yml
Many vars were moved from group-specific group_vars to the "all" group
as the ceilometer role/play requires the RabbitMQ details and telemetry enabled
state for all services that can potentially be monitored by ceilometer.
By default, the "telemetry cluster" and the "RPC" cluster are the same, but
deployers that wish to run separate clusters are able to override the
*_rabbitmq_telemetry_* vars.
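For example, a deployer running a dedicated telemetry cluster might override
entries such as the following in user_variables.yml (the variable names below
only illustrate the *_rabbitmq_telemetry_* pattern and are not an exhaustive
or verbatim list):
  nova_rabbitmq_telemetry_host_group: telemetry_rabbitmq_all
  glance_rabbitmq_telemetry_host_group: telemetry_rabbitmq_all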
Use case is more fully explained here:
http://trumant.github.io/openstack-ansible-multiple-rabbitmq-clusters.html
Change-Id: I737711085fc0e50c4c5a0ee7c2037927574f8448
Depends-On: Ib23c8829468bbb4ddae67e08a092240f54a6c729
Depends-On: Ifad921525897c5887f6ad6871c6d72c595d79fa6
Depends-On: I7cc5a5dac4299bd3d1bc810ea886f3888abaa2da
Depends-On: Ib3ee5b409dbc2fc2dc8d25990d4962ec49e21131
Depends-On: I397b56423d9ef6757bada27c92b7c9f0d5126cc9
Depends-On: I98cb273cbd8948072ef6e7fc5a15834093d70099
Depends-On: I89363b64ff3f41ee65d72ec51cb8062fc7bf08b0
Implements: blueprint multi-rabbitmq-clusters
Package cache clients were configured in the repo_server role
previously. This change moves the client side configuration
of the proxy file to the service playbooks using common-tasks.
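A sketch of the client-side wiring now carried by a service playbook (the
include path follows the common-tasks layout; the exact file name may
differ):
  pre_tasks:
    - include: common-tasks/package-cache-proxy.yml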
Change-Id: Icf127db9e279bd15b177347ecc4f3c8fe68b02f2
The RabbitMQ role no longer does any python package installation,
so the pip_install role execution is no longer required.
Change-Id: Ia05afabc756664195a382081382667a537f7ad33
All of the common tasks shared across all of the playbooks have been
moved into "playbooks/common-tasks" as singular task files which are
simply included as needed.
* This change will assist developers adding additional playbooks, roles,
etc which may need access to common tasks.
* This change will guarantee consistency between playbooks when
executing common tasks which are generally used to setup services.
* This change greatly reduces code duplication across all plays.
* The common-task files have comments at the top for developer
instructions on how a task file can be used.
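For instance, a common-task file now opens with a usage comment and is pulled
in with a plain include, along the lines of (file and variable names here are
hypothetical):
  # Usage:
  #   - include: common-tasks/example-common-task.yml
  #     vars:
  #       example_variable: some_value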
Change-Id: I399211c139d6388ab56b97b809f93d4936907c7a
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
This commit adds a gather_facts variable to the playbooks, which defaults
to True. If run_playbooks.sh is used, this commit adds
"-e gather_facts=False" to the openstack-ansible CLI for the playbook run.
Everything should work fine for deployers (because there is no change for
them) and for the gate (because fact caching is enabled).
It should speed up the gate by avoiding the long "setup" task cost for each
host; instead the setup runs on the appropriate hosts, at the appropriate time.
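A sketch of the play-level pattern described above (the host group shown is a
placeholder):
  - name: Example service play
    hosts: example_all
    gather_facts: "{{ gather_facts | default(True) }}"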
Change-Id: I348a4b7dfe70d56a64899246daf65ea834a75d2a
Signed-off-by: Jean-Philippe Evrard <jean-philippe.evrard@rackspace.co.uk>
The pip_install and pip_lock_down roles have been merged.
Update the playbooks to make use of the merged pip_install role by providing
the 'pip_lock_to_internal_repo' variable, based on whether or not
'pip_links' contains any entries.
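A minimal sketch of how a playbook can derive the value (the exact expression
used in the plays may differ):
  roles:
    - role: pip_install
      pip_lock_to_internal_repo: "{{ (pip_links | length) >= 1 }}"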
Change-Id: I59ad75ac54fd2c54172b404c53cc2659b9dfe482
Background: the bug requests the ability to access the RabbitMQ
management UI through HAProxy.
Approach:
* Add the RabbitMQ management UI port 15672 to HAProxy.
* Do NOT add a monitoring user by default; instead, key on the existence
  of rabbitmq_monitoring_userid in user_variables.yml (see the example
  below).
* Update user_variables.yml with an explanation of the above.
* Add a "monitoring" user to RabbitMQ for monitoring, with the
  "monitoring" user tag.
* Add the monitoring user's password variable to user_secrets.
* Add a features release note.
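Example user_variables.yml entry enabling the monitoring user (the password
itself belongs in user_secrets.yml):
  rabbitmq_monitoring_userid: monitoring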
Closes-Bug: 1446434
Change-Id: Idaf02cad6bb292d02f1cf6a733dbbc6ff4b4435e
In order to prevent race conditions with nodes joining the cluster
simultaneously when the cluster is first formed, we move the rabbitmq
installation play to 'serial: 1'. However, when the nodes are being upgraded
this cannot be done in serial, so in that case we set 'serial: 0'.
Some of the tasks/roles called in this playbook can still be run in parallel,
so we split the rabbitmq-server install out into a separate play so that only
the parts that need it are serialised, ensuring maximum efficiency.
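A sketch of the split described above (play names and the preparation task
are illustrative; the conditional serial expression may differ from the
playbook's):
  - name: Prepare RabbitMQ hosts (work that is safe to run in parallel)
    hosts: rabbitmq_all
    tasks:
      - name: Example preparation task
        debug:
          msg: "container and repo preparation runs unserialised"
  - name: Install RabbitMQ server
    hosts: rabbitmq_all
    serial: "{{ 0 if (rabbitmq_upgrade | default(false) | bool) else 1 }}"
    roles:
      - rabbitmq_server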
Change-Id: I97cdae27fdce4f400492c2134b6589b55fbc5a61
Fixes-Bug: #1573030
ansible_hostname is not used within any tasks and, by default, has the same
value as container_name.
ansible_ssh_host and container_address also hold the same value by default;
both are assigned to hosts through the dynamic inventory script.
Additionally, overriding ansible_ssh_host through playbook vars breaks tasks
that delegate to other hosts when using Ansible 2.
Change-Id: I2525b476c9302ef37f29e2beb132856232d129e2
This change removes the roles from this repository and points our role
requirements at the independent role repositories that are now online.
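An entry in the role requirements file now points at a standalone repository,
along these lines (the exact repository URL and version pin are illustrative):
  - name: pip_install
    scm: git
    src: https://git.openstack.org/openstack/openstack-ansible-pip_install
    version: master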
Change-Id: I16f327bcdf35d5386396f9147257b5c8aca0603f
Implements: blueprint independent-role-repositories
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
This patch reduces the time it takes for the playbooks to execute when the
container configuration has not changed, as the task that waits for a
successful ssh connection (after an initial delay) is skipped in that case.
Change-Id: I52727d6422878c288884a70c466ccc98cd6fd0ff
This change gives each service a distinct file name so that the log rotate
files don't collide when services run on a shared host.
Change-Id: Ia42656e4568c43667d610aa8421d2fa25437e2aa
Closes-Bug: 1499799
This commit removes the use of net-cache flushing from all $service plays,
which ensures that the cache is not flushed excessively, something that
could impact the performance of services like neutron.
The lxc-container-destroy role was removed because it is not used, and if
it ever were used its use would result in the same situation covered by
this issue.
Additionally it was noted that on container restarts, the mac addresses
of the container interfaces change. If *no* flushing is done at all,
this results in long run times whilst the arp entry for the container IP
times out. Hence, we add a configuration option that causes a
gratuitous arp whenever an interface has its mac set and/or the link
comes up. Because of the way the container veths work, we can't rely
on that happening on a link up event, so we forcefully set the mac
address in the post-up hook for the interface to force the
gratuitous arp.
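A sketch of the effect inside a container's interface configuration: the
gratuitous-arp-on-mac-change/link-up behaviour described above matches the
kernel's arp_notify sysctl, and the interface name and MAC address below are
placeholders (the role's actual templates may differ):
  auto eth0
  iface eth0 inet dhcp
      # enable gratuitous arp on MAC change / link up
      post-up echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_notify
      # re-assert the MAC so a gratuitous arp is sent for the container IP
      post-up ip link set dev eth0 address 00:16:3e:00:00:01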
Co-Authored-By: Evan Callicoat <diopter@gmail.com>
Co-Authored-By: Darren Birkett <darren.birkett@gmail.com>
Change-Id: I96800b2390ffbacb8341e5538545a3c3a4308cf3
Closes-Bug: 1497433
This patch adds a wait for the container's sshd to be available
after the container's apparmor profile is updated. When the
profile is updated the container is restarted, so this wait is
essential to the success of the playbook's completion.
It also includes 3 retries, which has been found to improve the success
rate.
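A sketch of the added wait, assuming illustrative task and variable names:
  - name: Wait for container ssh after the apparmor profile update
    wait_for:
      host: "{{ ansible_host | default(inventory_hostname) }}"
      port: 22
      delay: 5
    register: ssh_wait
    until: ssh_wait is success
    retries: 3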
Due to an upstream change in behaviour with netaddr 0.7.16 we
need to pin the package to a lower version until Neutron is
adjusted and we bump the Neutron SHA.
Change-Id: I30575ee31929b0c9af6353b7255cdfb6cebd2104
Closes-Bug: #1490142
Having the lxc container create role drop the lxc-openstack apparmor
profile on all containers every time it executes leads to the possibility
of the lxc container create task overwriting the running profile on a given
container. If this happens it is likely to cause a service interruption
until the correct profile is loaded for all containers affected by the
action.
To fix this issue, the default "lxc-openstack" profile has been removed from
the lxc container create task and added to all plays that are known to be
executed within an lxc container. This will ensure that the profile is
untouched on subsequent runs of the lxc-container-create.yml play.
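The underlying container configuration line is the standard LXC apparmor
key; how each play delivers it may differ from this sketch:
  lxc.aa_profile = lxc-openstack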
Change-Id: Ifa4640be60c18f1232cc7c8b281fb1dfc0119e56
Closes-Bug: 1487130
The rabbitmq playbook is designed to run in parallel across the cluster.
This causes an issue when upgrading rabbitmq to a new major or minor
version, because RabbitMQ does not support online migration of datasets
between major versions, and while a minor release can be upgraded online,
it is recommended to bring down the cluster for any upgrade actions. The
current configuration takes no account of this.
Reference:
https://www.rabbitmq.com/clustering.html#upgrading for further details.
* A new variable has been added called `rabbitmq_upgrade`. This is set to
false by default to prevent a new version being installed unintentionally.
To run the upgrade, which will shut down the cluster, the variable can be
set to true on the command line:
Example:
openstack-ansible -e rabbitmq_upgrade=true \
rabbitmq-install.yml
* A new variable has been added called `rabbitmq_ignore_version_state`,
which can be set to "true" to ignore the package and version state tasks.
This has been provided to allow a deployer to rerun the plays in an
environment where the playbooks have been upgraded and the default
version of rabbitmq has changed within the role, but the deployer has
elected to upgrade the installation at a later time. This will ensure a
deployer is able to recluster an environment as needed without
affecting the package state.
Example:
openstack-ansible -e rabbitmq_ignore_version_state=true \
rabbitmq-install.yml
* A new variable has been added `rabbitmq_primary_cluster_node` which
allows a deployer to elect / set the primary cluster node in an
environment. This variable is used to determine the restart order
of RabbitMQ nodes, i.e. this will be the last node down and the first one
up in an environment. By default this variable is set to:
rabbitmq_primary_cluster_node: "{{ groups['rabbitmq_all'][0] }}"
scripts/run-upgrade.sh has been modified to pass 'rabbitmq_upgrade=true'
on the command line so that RabbitMQ can be upgraded as part of the
upgrade between OpenStack versions.
DocImpact
Change-Id: I17d4429b9b94d47c1578dd58a2fb20698d1fe02e
Closes-bug: #1474992
Currently every host, both containers and bare metal, has a crontab
configured with the same values for minute, hour, day of week etc. This
means that there is the potential for a service interruption if, for
example, a cron job were to cause a service to restart.
This commit adds a new role which attempts to adjust the times defined
in the entries in the default /etc/crontab to reduce the overlap
between hosts.
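A sketch of the approach, assuming the role rewrites the minute field of an
existing /etc/crontab entry with a host-seeded random value (the regular
expression and filter usage here are illustrative, not the role's actual
tasks):
  - name: Give each host its own minute for the cron.hourly entry
    lineinfile:
      dest: /etc/crontab
      regexp: '^\S+(\s+\*\s+\* \* \*\s+root\s+cd / && run-parts --report /etc/cron\.hourly)$'
      line: "{{ 59 | random(seed=inventory_hostname) }}\\1"
      backrefs: yes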
Change-Id: I18bf0ac0c0610283a19c40c448ac8b6b4c8fd8f5
Closes-bug: #1424705
In order to ease the addition of external log receivers this adds an
rsyslog-client tag to the installation plays. This allows us to run
openstack-ansible setup-everything.yml --tags rsyslog-client to add
additional logging configuration.
Change-Id: If002f67a626ff5fe3dc06d77c9295ede9369b3dc
Partially-Implements: blueprint master-kilofication
This commit adds the rsyslog_client role to the general stack. This
change is part 3 of 3. The role allows rsyslog to serve as a log
shipper within a given host / container, and has been set up to
allow logs to be shipped to multiple hosts and/or other
providers, e.g. splunk, loggly, etc. All of the plays that need
to support logging have been modified to use the new rsyslog_client
role.
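A sketch of how a play wires the role in (the variable values shown are
illustrative of the pattern rather than copied from the plays; the full list
of modified plays follows below):
  roles:
    - role: rsyslog_client
      rsyslog_client_log_dir: "/var/log/example-service"
      rsyslog_client_config_name: "99-example-rsyslog-client.conf"
      tags:
        - rsyslog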
Roles added:
* rsyslog_client
Plays modified:
* playbooks/galera-install.yml
* playbooks/lxc-hosts-setup.yml
* playbooks/os-cinder-install.yml
* playbooks/os-glance-install.yml
* playbooks/os-heat-install.yml
* playbooks/os-horizon-install.yml
* playbooks/os-keystone-install.yml
* playbooks/os-neutron-install.yml
* playbooks/os-nova-install.yml
* playbooks/os-swift-install.yml
* playbooks/os-tempest-install.yml
* playbooks/rabbitmq-install.yml
* playbooks/repo-server.yml
DocImpact
Implements: blueprint rsyslog-update
Change-Id: I4028a58db3825adb8a5aa73dbaabbe353bb33046
This change implements the blueprint to convert all roles and plays into
a more generic setup, following upstream ansible best practices.
Items Changed:
* All tasks have tags.
* All roles use namespaced variables.
* All redundant tasks within a given play and role have been removed.
* All of the repetitive plays have been removed in favor of a simpler
approach. This change duplicates code within the roles but
ensures that the roles only ever run within their own scope.
* All roles have been built using an ansible galaxy syntax.
* The `*requirement.txt` files have been reformatted to follow upstream
OpenStack practices.
* Dynamically generated inventory is now more organized; this should assist
anyone who may want or need to dive into the JSON blob that is created.
In the inventory a properties field is used for items that customize containers
within the inventory.
* The environment map has been modified to support additional host groups to
enable the separation of infrastructure pieces. While the old infra_hosts group
will still work, this change allows for groups to be divided up into separate
chunks, e.g. deployment of a swift-only stack.
* The LXC logic now exists within the plays.
* etc/openstack_deploy/user_variables.yml has all password/token
variables extracted into the separate file
etc/openstack_deploy/user_secrets.yml in order to allow separate
security settings on that file.
Items Excised:
* All of the roles have had the LXC logic removed from within them which
should allow roles to be consumed outside of the `os-ansible-deployment`
reference architecture.
Note:
* the directory rpc_deployment still exists and is presently pointed at plays
containing a deprecation warning instructing the user to move to the standard
playbooks directory.
* While all of the Rackspace-specific components and variables have been
removed or refactored, the repository still relies on an upstream mirror of
OpenStack-built python files and container images. This upstream mirror is
hosted at Rackspace at "http://rpc-repo.rackspace.com", though it is
not locked or tied to Rackspace-specific installations. This repository
contains all of the needed code to create and/or clone your own mirror.
DocImpact
Co-Authored-By: Jesse Pretorius <jesse.pretorius@rackspace.co.uk>
Closes-Bug: #1403676
Implements: blueprint galaxy-roles
Change-Id: I03df3328b7655f0cc9e43ba83b02623d038d214e