Commit Graph

24 Commits

Kevin Carter
11c7b27235 Add boot priorities to containers
This change adds boot priorities to some container types where it'd be
beneficial. LXC allows containers with higher priority values to be
started ahead of others. The container default is 0, and containers with
values greater than 0 are started in descending order of priority.

This change has no upgrade impact as the additional config is only used
on boot and will not require a container to be restarted.
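
A rough illustration of the idea (the LXC key is real, but the exact
property/variable wiring used by the change is an assumption here):

  # hypothetical per-container-type property ending up in the LXC config
  properties:
    # the default priority is 0; higher values are booted ahead of it
    lxc.start.order: 100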

Change-Id: I89703a24516d47d560c9c888538c384ad2228eb7
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2016-11-07 13:48:34 +00:00
Matt Thompson
97a5c6ae3a Remove hard-coded host group
A previous commit added a hard-coded host group to
rabbitmq-install.yml, instead of using the variable rabbitmq_host_group.
This commit updates rabbitmq-install.yml to use rabbitmq_host_group
instead.
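
A minimal sketch of the variable-driven pattern this restores (the default
group name is assumed for illustration):

  # rabbitmq-install.yml
  - name: Install rabbitmq server
    hosts: "{{ rabbitmq_host_group | default('rabbitmq_all') }}"
    roles:
      - rabbitmq_server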

Change-Id: I4c4f8d0b903cc56ed025b7223b52d3205d9c9d37
2016-11-02 11:45:53 -04:00
git-harry
76639d6a86 Stop all but one RabbitMQ node prior to upgrade
RabbitMQ nodes must all be stopped prior to a major/minor version
upgrade [1]. The role does this by distinguishing between the upgrader
node and the rest in separate stop and start tasks.

Upgrades can fail when more than one member of rabbitmq_all is not a
member of the cluster. This is due to a bug fixed for greenfield
deployments by 21c2478901. The same fix
was not applied to upgrades because major/minor upgrades require all
RabbitMQ nodes to be stopped which is incompatible with serialising the
role in isolation.

This change uses a play to stop all but one of the nodes prior to
running the rabbitmq_server role, and then serialises the role so that
only one node is upgraded at a time, keeping downtime to a minimum.
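
A simplified sketch of the approach (group slicing and task wiring are
illustrative, not the playbook verbatim):

  - name: Stop RabbitMQ on all but the upgrader node
    hosts: "rabbitmq_all[1:]"
    tasks:
      - name: Stop rabbitmq-server
        service:
          name: rabbitmq-server
          state: stopped

  - name: Apply the rabbitmq_server role one node at a time
    hosts: rabbitmq_all
    serial: 1
    roles:
      - rabbitmq_server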

This change is related to an equivalent one [2] made to the test
playbook of the role openstack-ansible-rabbitmq_server.

[1] http://www.rabbitmq.com/clustering.html#upgrading
[2] https://review.openstack.org/#/c/390966/

Change-Id: Icca5cb1a96f83063223b6ddbeb02eeb562b0931b
2016-10-28 11:04:48 +01:00
Kevin Carter
ecd81b9618 Cleanup/standardize usage of tags in plays
The numerous tags within the playbook have been condensed
to two tags: "$NAMESPACE-config" and "$NAMESPACE".

These tags have been chosen as they are namespaced and cover
the configuration pre-tasks as well as the option to execute
the main playbook from a higher level play. By tagging
everything in the play with "$NAMESPACE" we're ensuring that
everything covered by the playbook has a namespaced tag. Any
place using the "always" tag was left alone as the tasks being
executed must "always" execute whenever the playbook is called.
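
As a sketch of the resulting convention, using rabbitmq as an example
namespace (the pre-task shown is purely illustrative):

  - name: Install rabbitmq server
    hosts: rabbitmq_all
    tags:
      - rabbitmq
    pre_tasks:
      - name: Create log directory
        file:
          path: "/openstack/log/{{ inventory_hostname }}-rabbitmq"
          state: directory
        tags:
          - rabbitmq-config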

Notice: The os-swift-setup.yml playbook file has been
removed because it currently serves no purpose. While the
os-swift-sync.yml playbook is no longer called directly, it has been
left in place as it could be of use to a deployer.

Change-Id: Iebfd82ebffedc768a18d9d9be6a9e70df2ae8fc1
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2016-09-15 10:08:48 +00:00
Travis Truman
75b36283fa Support multiple rabbitmq clusters
The rabbitmq-install play now supports a variable
inventory host group defined with *rabbitmq_host_group*.

This allows deployers to use the play to create multiple
RabbitMQ clusters, for example:

Typical infra cluster:

  openstack-ansible rabbitmq-install.yml

Second cluster for telemetry:

  openstack-ansible -e "rabbitmq_host_group=telemetry_rabbitmq_all" rabbitmq-install.yml

Many vars were moved from group-specific group_vars to the "all" group
as the ceilometer role/play requires the RabbitMQ details and telemetry-enabled
state for all services that can potentially be monitored by ceilometer.

By default, the "telemetry cluster" and the "RPC" cluster are the same, but
deployers that wish to create multiple clusters are able to override the
*_rabbitmq_telemetry_* vars when choosing to run separate clusters.
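
For instance, a deployer could set something like the following in
user_variables.yml (the variable name is only illustrative of the
*_rabbitmq_telemetry_* pattern, not quoted from the roles):

  # user_variables.yml
  glance_rabbitmq_telemetry_host_group: telemetry_rabbitmq_all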

Use case is more fully explained here:

http://trumant.github.io/openstack-ansible-multiple-rabbitmq-clusters.html

Change-Id: I737711085fc0e50c4c5a0ee7c2037927574f8448
Depends-On: Ib23c8829468bbb4ddae67e08a092240f54a6c729
Depends-On: Ifad921525897c5887f6ad6871c6d72c595d79fa6
Depends-On: I7cc5a5dac4299bd3d1bc810ea886f3888abaa2da
Depends-On: Ib3ee5b409dbc2fc2dc8d25990d4962ec49e21131
Depends-On: I397b56423d9ef6757bada27c92b7c9f0d5126cc9
Depends-On: I98cb273cbd8948072ef6e7fc5a15834093d70099
Depends-On: I89363b64ff3f41ee65d72ec51cb8062fc7bf08b0
Implements: blueprint multi-rabbitmq-clusters
2016-08-25 10:40:45 +00:00
Logan V
55de7dd275 Move package cache configuration to common tasks
Package cache clients were configured in the repo_server role
previously. This change moves the client-side configuration
of the proxy file to the service playbooks using common-tasks.
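
A minimal sketch of what a service play gains (the shared task file name
is an assumption here):

  - name: Install rabbitmq server
    hosts: rabbitmq_all
    pre_tasks:
      # configures the package cache/proxy client file on the target hosts
      - include: common-tasks/package-cache-proxy.yml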

Change-Id: Icf127db9e279bd15b177347ecc4f3c8fe68b02f2
2016-08-18 01:34:09 +00:00
Jesse Pretorius
e481744787 Remove pip_install role execution from RabbitMQ playbook
The RabbitMQ role no longer does any python package installation,
so the pip_install role execution is no longer required.

Change-Id: Ia05afabc756664195a382081382667a537f7ad33
2016-07-23 15:15:34 +00:00
Kevin Carter
91deb13ec2 Cleanup/standardize common tasks
All of the common tasks shared across all of the playbooks have been
moved into "playbooks/common-tasks" as singular task files which are
simply included as needed.

* This change will assist developers adding additional playbooks, roles,
  etc which may need access to common tasks.

* This change will guarantee consistency between playbooks when
  executing common tasks which are generally used to setup services.

* This change greatly reduces code duplication across all plays.

* The common-task files have comments at the top for developer
  instructions on how a task file can be used.
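
A sketch of the shape of such a singular task file, including the
developer-instruction comment it carries (file name and task contents are
illustrative):

  # common-tasks/os-log-dir-setup.yml
  # Usage: include this file from a play's pre_tasks, e.g.
  #   - include: common-tasks/os-log-dir-setup.yml
  - name: Ensure the service log directory exists
    file:
      path: "/openstack/log/{{ inventory_hostname }}"
      state: directory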

Change-Id: I399211c139d6388ab56b97b809f93d4936907c7a
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2016-07-21 17:18:26 -05:00
Jean-Philippe Evrard
e5622adc43 Speed up gate: avoid gathering facts more than necessary
This commit adds a gather_facts variable to the playbooks, which
defaults to True. If run_playbooks.sh is used, this commit adds
"-e gather_facts=False" to the openstack-ansible CLI for the playbook
run. Everything should work fine for deployers (because there is no
change for them) and for the gate (because fact caching is enabled).

It should speed up the gate by avoiding the long "setup" task cost for each host.
Instead, setup runs only for the appropriate hosts, at the appropriate time.
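
A sketch of the per-play change this describes (the play shown is
illustrative):

  - name: Install rabbitmq server
    hosts: rabbitmq_all
    gather_facts: "{{ gather_facts | default(True) }}"
    roles:
      - rabbitmq_server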

Change-Id: I348a4b7dfe70d56a64899246daf65ea834a75d2a
Signed-off-by: Jean-Philippe Evrard <jean-philippe.evrard@rackspace.co.uk>
2016-06-15 14:09:20 +00:00
Jimmy McCrory
810e0a7d32 Use combined pip_install role
The pip_install and pip_lock_down roles have been merged.

Update playbooks to make use of the merged pip_install role by providing
the 'pip_lock_to_internal_repo' variable based on whether or not
'pip_links' contains any entries.
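
A sketch of the wiring described (illustrative, not quoted from the
playbooks):

  - name: Install galera server
    hosts: galera_all
    roles:
      - role: pip_install
        pip_lock_to_internal_repo: "{{ (pip_links | length) >= 1 }}"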

Change-Id: I59ad75ac54fd2c54172b404c53cc2659b9dfe482
2016-06-02 13:27:36 -07:00
wade-holler
b5b2bb9af4 Add RabbitMQ mgmt UI through HAProxy
Background: Bug requests the ability to access the
RabbitMQ management UI through HAProxy

Approach:
--Add RabbitMQ UI port 15672 to HAProxy
--DO NOT add a monitoring user by default;
instead key on the existence of rabbitmq_monitoring_userid
in user_variables.yml
--ADD a user_variables.yml update per the above with
explanation
--Add a "monitoring" user to RabbitMQ, tagged with the
"monitoring" user tag
--Add the monitoring user password var to user_secrets
--Add a features release note
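
A sketch of the opt-in described above (the userid variable name comes from
the message; the secret variable name is an assumption):

  # user_variables.yml -- defining this enables the monitoring user
  rabbitmq_monitoring_userid: monitoring

  # user_secrets.yml
  rabbitmq_monitoring_password: "<set by the deployer>"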

Closes-Bug: 1446434

Change-Id: Idaf02cad6bb292d02f1cf6a733dbbc6ff4b4435e
2016-06-01 09:24:06 +00:00
Darren Birkett
21c2478901 install rabbitmq-server in serial
In order to prevent race conditions with nodes joining the cluster
simultaneously when the cluster is first formed, we move the rabbitmq
installation play to be 'serial: 1'. However, when the nodes are being
upgraded, it cannot be done in serial, so in this case we set 'serial: 0'.

There are some tasks/roles called in this playbook that can still be
run in parallel, so we split out the rabbitmq-server install into a
separate play so that we only serialise the parts that are necessary
to ensure maximum efficiency.
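
A simplified sketch of the serialisation described (the toggle expression is
illustrative; rabbitmq_upgrade is the variable added by the upgrade change
further down this log):

  - name: Install rabbitmq server
    hosts: rabbitmq_all
    serial: "{{ 0 if (rabbitmq_upgrade | default(false) | bool) else 1 }}"
    roles:
      - rabbitmq_server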

Change-Id: I97cdae27fdce4f400492c2134b6589b55fbc5a61
Fixes-Bug: #1573030
2016-05-06 15:12:45 +01:00
Jimmy McCrory
bcbc5b27fc Remove unneeded playbook vars
ansible_hostname is not used within any tasks and, by default, is the
same value as container_name.

ansible_ssh_host and container_address are also the same value by
default, both are assigned to hosts through the dynamic inventory script.
Additionally, overriding ansible_ssh_host through playbook vars breaks
tasks that delegate to other hosts when using Ansible 2.

Change-Id: I2525b476c9302ef37f29e2beb132856232d129e2
2016-01-25 16:21:22 -08:00
Kevin Carter
6f443d6971 IRR - Implemented for setup-infrastructure
The change removes and points our role requirements to use
the independent repos that are now online.

Change-Id: I16f327bcdf35d5386396f9147257b5c8aca0603f
Implements: blueprint independent-role-repositories
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2015-12-13 11:38:20 +00:00
Jesse Pretorius
e343d6e9fe Only wait for SSH if the container config has changed
This patch reduces the time it takes for the playbooks to execute when the
container configuration has not changed, as the task to wait for a successful
SSH connection (after an initial delay) is not executed.
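
A sketch of the conditional wait described (the registered result name and
module parameters are illustrative):

  # container_config is the registered result of the container configuration task
  - name: Wait for ssh to be available
    wait_for:
      host: "{{ ansible_ssh_host }}"
      port: 22
      delay: 10
    when: container_config | changed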

Change-Id: I52727d6422878c288884a70c466ccc98cd6fd0ff
2015-10-07 08:22:34 +01:00
Kevin Carter
2346e5ced4 Fixes log rotate issue
This change renames the logrotate file for each service
so that the logrotate files don't collide when running on a shared
host.

Change-Id: Ia42656e4568c43667d610aa8421d2fa25437e2aa
Closes-Bug: 1499799
2015-09-30 18:03:13 +00:00
Kevin Carter
4745e709f6 Removes overzealous ARP cache flushing
This commit removes the use of the net-cache flushing from all
$service plays, which ensures that the cache is not flushed excessively,
something that could impact the performance of services like neutron.

The lxc-container-destroy role was removed because it's not used,
and if it were ever used, it would result in the same
situation covered by this issue.

Additionally, it was noted that on container restarts the MAC addresses
of the container interfaces change. If *no* flushing is done at all,
this results in long run times whilst the ARP entry for the container IP
times out. Hence, we add a configuration option here that causes a
gratuitous ARP whenever an interface has its MAC set and/or the link
comes up. Because of the way the container veths work, we can't rely
on that happening on a link-up event, so we forcefully set the MAC
address in the post-up hook for the interface to force the
gratuitous ARP.

Co-Authored-By: Evan Callicoat <diopter@gmail.com>
Co-Authored-By: Darren Birkett <darren.birkett@gmail.com>

Change-Id: I96800b2390ffbacb8341e5538545a3c3a4308cf3
Closes-Bug: 1497433
2015-09-29 13:21:29 +01:00
Jesse Pretorius
a40cb58118 Wait for container ssh after apparmor profile update
This patch adds a wait for the container's sshd to be available
after the container's apparmor profile is updated. When the
profile is updated, the container is restarted, so this wait is
essential to the playbook completing successfully.

It also includes 3 retries, which has been found to improve the
rate of success.

Due to an upstream change in behaviour with netaddr 0.7.16 we
need to pin the package to a lower version until Neutron is
adjusted and we bump the Neutron SHA.

Change-Id: I30575ee31929b0c9af6353b7255cdfb6cebd2104
Closes-Bug: #1490142
2015-09-02 09:21:55 +01:00
kevin
ffb701f8a3 Removed default lxc profile on container create
Having the lxc container create role drop the lxc-openstack apparmor
profile on all containers any time it's executed leads to the possibility
of the lxc container create task overwriting the running profile on a given
container. If this happens it's likely to cause service interruption until the
correct profile is loaded for all containers affected by the action.

To fix this issue the default "lxc-openstack" profile has been removed from the
lxc container create task and added to all plays that are known to be executed
within an lxc container. This will ensure that the profile is untouched on
subsequent runs of the lxc-container-create.yml play.

Change-Id: Ifa4640be60c18f1232cc7c8b281fb1dfc0119e56
Closes-Bug: 1487130
2015-08-25 13:15:45 +00:00
git-harry
6ea86e6274 Fix rabbitmq playbook to allow upgrades
The rabbitmq playbook is designed to run in parallel across the cluster.
This causes an issue when upgrading rabbitmq to a new major or minor
version because RabbitMQ does not support doing an online migration of
datasets between major versions. While a minor release can be upgraded
online, it is recommended to bring down the cluster to do any
upgrade actions. The current configuration takes no account of this.

Reference:
https://www.rabbitmq.com/clustering.html#upgrading for further details.

* A new variable has been added called `rabbitmq_upgrade`. This is set to
  false by default to prevent a new version being installed unintentionally.
  To run the upgrade, which will shutdown the cluster, the variable can be
  set to true on the commandline:

  Example:
    openstack-ansible -e rabbitmq_upgrade=true \
    rabbitmq-install.yml

* A new variable has been added called `rabbitmq_ignore_version_state`
  which can be set "true" to ignore the package and version state tasks.
  This has been provided to allow a deployer to rerun the plays in an
  environment where the playbooks have been upgraded and the default
  version of rabbitmq has changed within the role, but the deployer has
  not elected to upgrade the installation at that time. This will ensure a
  deployer is able to recluster an environment as needed without
  affecting the package state.

  Example:
    openstack-ansible -e rabbitmq_ignore_version_state=true \
    rabbitmq-install.yml

* A new variable has been added `rabbitmq_primary_cluster_node` which
  allows a deployer to elect / set the primary cluster node in an
  environment. This variable is used to determine the restart order
  of RabbitMQ nodes, i.e. this will be the last node down and the first one
  up in an environment. By default this variable is set to:
  rabbitmq_primary_cluster_node: "{{ groups['rabbitmq_all'][0] }}"

scripts/run-upgrade.sh has been modified to pass 'rabbitmq_upgrade=true'
on the command line so that RabbitMQ can be upgraded as part of the
upgrade between OpenStack versions.

DocImpact
Change-Id: I17d4429b9b94d47c1578dd58a2fb20698d1fe02e
Closes-bug: #1474992
2015-07-21 18:32:52 -05:00
git-harry
e148635e78 Add role system-crontab-coordination
Currently every host, both containers and bare metal, has a crontab
configured with the same values for minute, hour, day of week etc. This
means that there is the potential for a service interruption if, for
example, a cron job were to cause a service to restart.

This commit adds a new role which attempts to adjust the times defined
in the entries in the default /etc/crontab to reduce the overlap
between hosts.

Change-Id: I18bf0ac0c0610283a19c40c448ac8b6b4c8fd8f5
Closes-bug: #1424705
2015-06-30 10:06:11 +01:00
d34dh0r53
31da4f0331 Adds rsyslog-client tag to install plays
In order to ease the addition of external log receivers this adds an
rsyslog-client tag to the installation plays.  This allows us to run
openstack-ansible setup-everything.yml --tags rsyslog-client to add
additional logging configuration.

Change-Id: If002f67a626ff5fe3dc06d77c9295ede9369b3dc
Partially-Implements: blueprint master-kilofication
2015-04-16 08:10:46 +00:00
Kevin Carter
5b4eee1fc1 Adds rsyslog client role and enables it in all plays
This commit adds the rsyslog_client role to the general stack. This
change is part 3 of 3; the role will allow rsyslog to serve as a log
shipper within a given host / container. The role has been set up to
allow for logs to be shipped to multiple hosts and/or other
providers, e.g. splunk, loggly, etc. All of the plays that need
to support logging have been modified to use the new rsyslog_client
role.

Roles added:
* rsyslog_client

Plays modified:
* playbooks/galera-install.yml
* playbooks/lxc-hosts-setup.yml
* playbooks/os-cinder-install.yml
* playbooks/os-glance-install.yml
* playbooks/os-heat-install.yml
* playbooks/os-horizon-install.yml
* playbooks/os-keystone-install.yml
* playbooks/os-neutron-install.yml
* playbooks/os-nova-install.yml
* playbooks/os-swift-install.yml
* playbooks/os-tempest-install.yml
* playbooks/rabbitmq-install.yml
* playbooks/repo-server.yml

DocImpact
Implements: blueprint rsyslog-update

Change-Id: I4028a58db3825adb8a5aa73dbaabbe353bb33046
2015-03-17 13:52:30 -05:00
Kevin Carter
8e6dbd01c9 Convert existing roles into galaxy roles
This change implements the blueprint to convert all roles and plays into
a more generic setup, following upstream ansible best practices.

Items Changed:
* All tasks have tags.
* All roles use namespaced variables.
* All redundant tasks within a given play and role have been removed.
* All of the repetitive plays have been removed in favor of a
  simpler approach. This change duplicates code within the roles but
  ensures that the roles only ever run within their own scope.
* All roles have been built using an ansible galaxy syntax.
* The `*requirement.txt` files have been reformatted to follow upstream
  OpenStack practices.
* Dynamically generated inventory is now more organized; this should assist
  anyone who may want or need to dive into the JSON blob that is created.
  In the inventory a properties field is used for items that customize containers
  within the inventory.
* The environment map has been modified to support additional host groups to
  enable the separation of infrastructure pieces. While the old infra_hosts group
  will still work, this change allows for groups to be divided up into separate
  chunks, e.g. deployment of a swift-only stack.
* The LXC logic now exists within the plays.
* etc/openstack_deploy/user_variables.yml has all password/token
  variables extracted into the separate file
  etc/openstack_deploy/user_secrets.yml in order to allow separate
  security settings on that file.

Items Excised:
* All of the roles have had the LXC logic removed from within them, which
  should allow roles to be consumed outside of the `os-ansible-deployment`
  reference architecture.

Note:
* the directory rpc_deployment still exists and is presently pointed at plays
  containing a deprecation warning instructing the user to move to the standard
  playbooks directory.
* While all of the Rackspace-specific components and variables have been removed
  or refactored, the repository still relies on an upstream mirror of
  OpenStack-built Python files and container images. This upstream mirror is hosted
  at Rackspace at "http://rpc-repo.rackspace.com", though it is
  not locked or tied to Rackspace-specific installations. This repository
  contains all of the needed code to create and/or clone your own mirror.

DocImpact
Co-Authored-By: Jesse Pretorius <jesse.pretorius@rackspace.co.uk>
Closes-Bug: #1403676
Implements: blueprint galaxy-roles
Change-Id: I03df3328b7655f0cc9e43ba83b02623d038d214e
2015-02-18 10:56:25 +00:00