[DOCS] Creating new folder for proposed operations guide
Moves pre-existing operations content to new folder

Change-Id: I5c177dda2bba47e835fbd77cd63df3b52864c4d4
Implements: blueprint create-ops-guide
This commit is contained in:
parent
3070530e88
commit
a7f25d8162
306 doc/source/draft-operations-guide/extending.rst Normal file
@@ -0,0 +1,306 @@
===========================
Extending OpenStack-Ansible
===========================

The OpenStack-Ansible project provides a basic OpenStack environment, but
many deployers will wish to extend the environment based on their needs. This
could include installing extra services, changing package versions, or
overriding existing variables.

Using these extension points, deployers can provide a more 'opinionated'
installation of OpenStack that may include their own software.
Including OpenStack-Ansible in your project
-------------------------------------------

Including the openstack-ansible repository within another project can be
done in several ways:

1. A git submodule pointed to a released tag (a sketch follows this
   section).
2. A script to automatically perform a git checkout of
   openstack-ansible.

When including OpenStack-Ansible in a project, consider using a parallel
directory structure as shown in the `ansible.cfg files`_ section.

Also note that copying files into directories such as `env.d`_ or
`conf.d`_ should be handled via some sort of script within the extension
project.
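For example, pinning a released tag as a git submodule might look like the
following sketch (the tag ``12.0.0`` and the surrounding layout are
illustrative, not fixed conventions):

.. code-block:: shell-session

   $ git submodule add https://git.openstack.org/openstack/openstack-ansible
   $ cd openstack-ansible
   $ git checkout 12.0.0
   $ cd ..
   $ git add .gitmodules openstack-ansible
   $ git commit -m "Pin openstack-ansible to 12.0.0"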
ansible.cfg files
-----------------

You can create your own playbook, variable, and role structure while still
including the OpenStack-Ansible roles and libraries by putting an
``ansible.cfg`` file in your ``playbooks`` directory.

The relevant options for Ansible 1.9 (included in OpenStack-Ansible)
are as follows:

``library``
    This variable should point to
    ``openstack-ansible/playbooks/library``. Doing so allows roles and
    playbooks to access OpenStack-Ansible's included Ansible modules.

``roles_path``
    This variable should point to
    ``openstack-ansible/playbooks/roles``. This allows Ansible to
    properly look up any OpenStack-Ansible roles that extension roles
    may reference.

``inventory``
    This variable should point to
    ``openstack-ansible/playbooks/inventory``. With this setting,
    extensions have access to the same dynamic inventory that
    OpenStack-Ansible uses.

Note that the paths to the ``openstack-ansible`` top level directory can be
relative in this file.
Consider this directory structure::

    my_project
    |
    |- custom_stuff
    |  |
    |  |- playbooks
    |
    |- openstack-ansible
    |  |
    |  |- playbooks

The variables in ``my_project/custom_stuff/playbooks/ansible.cfg`` would use
``../openstack-ansible/playbooks/<directory>``.
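Putting the pieces together, a minimal ``ansible.cfg`` for this layout could
look like the sketch below. The ``[defaults]`` section and these three keys
are standard Ansible configuration; the relative paths assume the directory
structure above:

.. code-block:: ini

    [defaults]
    ; Re-use OpenStack-Ansible's modules, roles, and dynamic inventory
    library = ../openstack-ansible/playbooks/library
    roles_path = ../openstack-ansible/playbooks/roles
    inventory = ../openstack-ansible/playbooks/inventory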
env.d
-----

The ``/etc/openstack_deploy/env.d`` directory sources all YAML files into the
deployed environment, allowing a deployer to define additional group mappings.

This directory is used to extend the environment skeleton, or modify the
defaults defined in the ``playbooks/inventory/env.d`` directory.

See also `Understanding Container Groups`_ in Appendix C.

.. _Understanding Container Groups: ../install-guide/app-custom-layouts.html#understanding-container-groups
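As an illustration only, a hypothetical ``env.d`` file adding a new container
group might look like the following sketch. The service and group names are
invented for the example; the ``component_skel`` and ``container_skel`` keys
follow the layout of the files shipped in ``playbooks/inventory/env.d``:

.. code-block:: yaml

    # /etc/openstack_deploy/env.d/mycustom.yml (hypothetical)
    component_skel:
      mycustom_api:
        belongs_to:
          - mycustom_all

    container_skel:
      mycustom_container:
        belongs_to:
          - infra_containers
        contains:
          - mycustom_api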
conf.d
------

Common OpenStack services and their configuration are defined by
OpenStack-Ansible in the
``/etc/openstack_deploy/openstack_user_config.yml`` settings file.

Additional services should be defined with a YAML file in
``/etc/openstack_deploy/conf.d``, in order to manage file size.

See also `Understanding Host Groups`_ in Appendix C.

.. _Understanding Host Groups: ../install-guide/app-custom-layouts.html#understanding-host-groups
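A ``conf.d`` file follows the same host-group layout as
``openstack_user_config.yml``. As a hedged sketch, with an invented group
name and a placeholder management address:

.. code-block:: yaml

    # /etc/openstack_deploy/conf.d/mycustom.yml (hypothetical)
    mycustom-infra_hosts:
      infra1:
        ip: 172.29.236.101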
user\_*.yml files
-----------------

Files in ``/etc/openstack_deploy`` beginning with ``user_`` will be
automatically sourced in any ``openstack-ansible`` command. Alternatively,
the files can be sourced with the ``-e`` parameter of the ``ansible-playbook``
command.

``user_variables.yml`` and ``user_secrets.yml`` are used directly by
OpenStack-Ansible. Adding custom variables used by your own roles and
playbooks to these files is not recommended. Doing so will complicate your
upgrade path by making comparison of your existing files with later versions
of these files more arduous. Rather, recommended practice is to place your own
variables in files named following the ``user_*.yml`` pattern so they will be
sourced alongside those used exclusively by OpenStack-Ansible.

Ordering and Precedence
+++++++++++++++++++++++

``user_*.yml`` variables are just YAML variable files. They will be sourced
in alphanumeric order by ``openstack-ansible``.
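For instance, a deployer-specific file could sit alongside the standard ones
as a plain variable file. The file name and variables here are purely
illustrative:

.. code-block:: yaml

    # /etc/openstack_deploy/user_mycustom_variables.yml (hypothetical)
    # Sourced automatically, in alphanumeric order with the other
    # user_*.yml files.
    mycustom_feature_enabled: true
    mycustom_endpoint: "https://example.com/api"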
.. _adding-galaxy-roles:

Adding Galaxy roles
-------------------

Any roles defined in ``openstack-ansible/ansible-role-requirements.yml``
will be installed by the
``openstack-ansible/scripts/bootstrap-ansible.sh`` script.
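Entries in that file are standard Ansible role-requirement items. A sketch of
one entry, with an invented role name and source URL:

.. code-block:: yaml

    # Hypothetical entry in ansible-role-requirements.yml
    - name: mycustom_role
      scm: git
      src: https://github.com/example-org/mycustom_role
      version: master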
Setting overrides in configuration files
----------------------------------------

All of the services that use YAML, JSON, or INI for configuration can receive
overrides through the use of an Ansible action plugin named ``config_template``.
The configuration template engine allows a deployer to use a simple dictionary
to modify or add items into configuration files at run time that may not have a
preset template option. All OpenStack-Ansible roles allow for this
functionality where applicable. Files available to receive overrides can be
seen in the ``defaults/main.yml`` file as standard empty dictionaries (hashes).

Practical guidance for using this feature is available in the `Install Guide`_.

This module has been `submitted for consideration`_ into Ansible Core.

.. _Install Guide: ../install-guide/app-advanced-config-override.html
.. _submitted for consideration: https://github.com/ansible/ansible/pull/12555
Build the environment with additional python packages
+++++++++++++++++++++++++++++++++++++++++++++++++++++

The system will allow you to install and build any package that is a python
installable. The repository infrastructure will look for and create any
git based or PyPi installable package. When the package is built the repo-build
role will create the sources as Python wheels to extend the base system and
requirements.

While the packages pre-built in the repository-infrastructure are
comprehensive, it may be necessary to change the source locations and versions
of packages to suit different deployment needs. Adding additional repositories
as overrides is as simple as listing entries within the variable file of your
choice. Any ``user_*.yml`` file within the ``/etc/openstack_deploy``
directory will work to facilitate the addition of new packages.
.. code-block:: yaml

    swift_git_repo: https://private-git.example.org/example-org/swift
    swift_git_install_branch: master
Additional lists of python packages can also be overridden using a
``user_*.yml`` variable file.

.. code-block:: yaml

    swift_requires_pip_packages:
      - virtualenv
      - virtualenv-tools
      - python-keystoneclient
      - NEW-SPECIAL-PACKAGE
Once the variables are set, run the ``repo-build.yml`` play to build all of
the wheels within the repository infrastructure. When ready, run the target
plays to deploy your overridden source code.
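Assuming the standard ``/opt/openstack-ansible`` checkout used elsewhere in
this guide, that sequence might look like:

.. code-block:: shell-session

    # cd /opt/openstack-ansible/playbooks
    # openstack-ansible repo-build.yml
    # openstack-ansible os-swift-install.yml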
Module documentation
++++++++++++++++++++

These are the options available as found within the virtual module
documentation section.
.. code-block:: yaml

    module: config_template
    version_added: 1.9.2
    short_description: >
        Renders template files providing a create/update override interface
    description:
      - The module contains the template functionality with the ability to
        override items in config, in transit, through the use of a simple
        dictionary without having to write out various temp files on target
        machines. The module renders all of the potential jinja a user could
        provide in both the template file and in the override dictionary which
        is ideal for deployers who may have lots of different configs using a
        similar code base.
      - The module is an extension of the **copy** module and all of the
        attributes that can be set there are available to be set here.
    options:
      src:
        description:
          - Path of a Jinja2 formatted template on the local server. This can
            be a relative or absolute path.
        required: true
        default: null
      dest:
        description:
          - Location to render the template to on the remote machine.
        required: true
        default: null
      config_overrides:
        description:
          - A dictionary used to update or override items within a
            configuration template. The dictionary data structure may be
            nested. If the target config file is an ini file the nested keys
            in the ``config_overrides`` will be used as section headers.
      config_type:
        description:
          - A string value describing the target config type.
        choices:
          - ini
          - json
          - yaml
Example task using the "config_template" module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: yaml

    - name: Run config template ini
      config_template:
        src: test.ini.j2
        dest: /tmp/test.ini
        config_overrides: "{{ test_overrides }}"
        config_type: ini
Example overrides dictionary (hash)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: yaml

    test_overrides:
      DEFAULT:
        new_item: 12345
Original template file "test.ini.j2"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: ini

    [DEFAULT]
    value1 = abc
    value2 = 123
Rendered on disk file "/tmp/test.ini"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: ini

    [DEFAULT]
    value1 = abc
    value2 = 123
    new_item = 12345
In this task the ``test.ini.j2`` file is a template which will be rendered and
written to disk at ``/tmp/test.ini``. The **config_overrides** entry is a
dictionary (hash) which allows a deployer to set arbitrary data as overrides
to be written into the configuration file at run time. The **config_type**
entry specifies the type of configuration file the module will be interacting
with; available options are "yaml", "json", and "ini".
Discovering Available Overrides
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

All of these options can be specified in any way that suits your deployment.
For ease of use and flexibility, we recommend that you define your
overrides in a user variable file such as
``/etc/openstack_deploy/user_variables.yml``.

The list of overrides available may be found by executing:

.. code-block:: bash

    find . -name "main.yml" -exec grep '_.*_overrides:' {} \; \
        | grep -v "^#" \
        | sort -u
18 doc/source/draft-operations-guide/index.rst Normal file
@@ -0,0 +1,18 @@
==================================
OpenStack-Ansible operations guide
==================================

This is a draft index page for the proposed OpenStack-Ansible
operations guide.

.. toctree::
   :maxdepth: 2

   ops-lxc-commands.rst
   ops-add-computehost.rst
   ops-remove-computehost.rst
   ops-galera.rst
   ops-tips.rst
   ops-troubleshooting.rst
   extending.rst
29 doc/source/draft-operations-guide/ops-add-computehost.rst Normal file
@@ -0,0 +1,29 @@
=====================
Adding a compute host
=====================

Use the following procedure to add a compute host to an operational
cluster.

#. Configure the host as a target host. See `Prepare target hosts
   <http://docs.openstack.org/developer/openstack-ansible/install-guide/targethosts.html>`_
   for more information.

#. Edit the ``/etc/openstack_deploy/openstack_user_config.yml`` file and
   add the host to the ``compute_hosts`` stanza (see the sketch after
   this procedure).

   If necessary, also modify the ``used_ips`` stanza.

#. If the cluster is utilizing Telemetry/Metering (Ceilometer),
   edit the ``/etc/openstack_deploy/conf.d/ceilometer.yml`` file and add the
   host to the ``metering-compute_hosts`` stanza.

#. Run the following commands to add the host. Replace
   ``NEW_HOST_NAME`` with the name of the new host.

   .. code-block:: shell-session

      # cd /opt/openstack-ansible/playbooks
      # openstack-ansible setup-hosts.yml --limit NEW_HOST_NAME
      # openstack-ansible setup-openstack.yml --skip-tags nova-key-distribute --limit NEW_HOST_NAME
      # openstack-ansible setup-openstack.yml --tags nova-key --limit compute_hosts
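The ``compute_hosts`` stanza mentioned in step 2 takes the host name and its
management address. A minimal sketch, assuming an illustrative management IP:

.. code-block:: yaml

    compute_hosts:
      NEW_HOST_NAME:
        ip: 172.29.236.200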
302 doc/source/draft-operations-guide/ops-galera-recovery.rst Normal file
@@ -0,0 +1,302 @@
=======================
Galera cluster recovery
=======================

Run the ``galera-install`` playbook using the ``galera-bootstrap`` tag
to automatically recover a node or an entire environment.

#. Run the following Ansible command to recover the failed node or
   environment:

   .. code-block:: shell-session

      # openstack-ansible galera-install.yml --tags galera-bootstrap

   The cluster comes back online after completion of this command.
Single-node failure
~~~~~~~~~~~~~~~~~~~

If a single node fails, the other nodes maintain quorum and
continue to process SQL requests.

#. Run the following Ansible command to determine the failed node:

   .. code-block:: shell-session

      # ansible galera_container -m shell -a "mysql -h localhost \
        -e 'show status like \"%wsrep_cluster_%\";'"
      node3_galera_container-3ea2cbd3 | FAILED | rc=1 >>
      ERROR 2002 (HY000): Can't connect to local MySQL server through
      socket '/var/run/mysqld/mysqld.sock' (111)

      node2_galera_container-49a47d25 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     17
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     17
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

   In this example, node 3 has failed.

#. Restart MariaDB on the failed node and verify that it rejoins the
   cluster (a restart sketch follows this procedure).

#. If MariaDB fails to start, run the ``mysqld`` command and perform
   further analysis on the output. As a last resort, rebuild the container
   for the node.
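The restart in step 2 can be driven from the deployment host. A minimal
sketch, assuming the sysvinit service name used elsewhere in this guide and
the illustrative container name from the example output:

.. code-block:: shell-session

    # ansible node3_galera_container-3ea2cbd3 -m shell \
      -a "/etc/init.d/mysql restart"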
Multi-node failure
~~~~~~~~~~~~~~~~~~

When all but one node fails, the remaining node cannot achieve quorum and
stops processing SQL requests. In this situation, failed nodes that
recover cannot join the cluster because it no longer exists.

#. Run the following Ansible command to show the failed nodes:

   .. code-block:: shell-session

      # ansible galera_container -m shell -a "mysql \
        -h localhost -e 'show status like \"%wsrep_cluster_%\";'"
      node2_galera_container-49a47d25 | FAILED | rc=1 >>
      ERROR 2002 (HY000): Can't connect to local MySQL server
      through socket '/var/run/mysqld/mysqld.sock' (111)

      node3_galera_container-3ea2cbd3 | FAILED | rc=1 >>
      ERROR 2002 (HY000): Can't connect to local MySQL server
      through socket '/var/run/mysqld/mysqld.sock' (111)

      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     18446744073709551615
      wsrep_cluster_size        1
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      non-Primary

   In this example, nodes 2 and 3 have failed. The remaining operational
   server indicates ``non-Primary`` because it cannot achieve quorum.

#. Run the following command to
   `rebootstrap <http://galeracluster.com/documentation-webpages/quorumreset.html#id1>`_
   the operational node into the cluster:

   .. code-block:: shell-session

      # mysql -e "SET GLOBAL wsrep_provider_options='pc.bootstrap=yes';"
      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     15
      wsrep_cluster_size        1
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node3_galera_container-3ea2cbd3 | FAILED | rc=1 >>
      ERROR 2002 (HY000): Can't connect to local MySQL server
      through socket '/var/run/mysqld/mysqld.sock' (111)

      node2_galera_container-49a47d25 | FAILED | rc=1 >>
      ERROR 2002 (HY000): Can't connect to local MySQL server
      through socket '/var/run/mysqld/mysqld.sock' (111)

   The remaining operational node becomes the primary node and begins
   processing SQL requests.

#. Restart MariaDB on the failed nodes and verify that they rejoin the
   cluster:

   .. code-block:: shell-session

      # ansible galera_container -m shell -a "mysql \
        -h localhost -e 'show status like \"%wsrep_cluster_%\";'"
      node3_galera_container-3ea2cbd3 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     17
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node2_galera_container-49a47d25 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     17
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     17
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

#. If MariaDB fails to start on any of the failed nodes, run the
   ``mysqld`` command and perform further analysis on the output. As a
   last resort, rebuild the container for the node.
Complete failure
~~~~~~~~~~~~~~~~

Restore from backup if all of the nodes in a Galera cluster fail (do not
shut down gracefully). Run the following command to determine if all nodes in
the cluster have failed:

.. code-block:: shell-session

   # ansible galera_container -m shell -a "cat /var/lib/mysql/grastate.dat"
   node3_galera_container-3ea2cbd3 | success | rc=0 >>
   # GALERA saved state
   version: 2.1
   uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1
   seqno: -1
   cert_index:

   node2_galera_container-49a47d25 | success | rc=0 >>
   # GALERA saved state
   version: 2.1
   uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1
   seqno: -1
   cert_index:

   node4_galera_container-76275635 | success | rc=0 >>
   # GALERA saved state
   version: 2.1
   uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1
   seqno: -1
   cert_index:

All the nodes have failed if ``mysqld`` is not running on any of the
nodes and all of the nodes contain a ``seqno`` value of -1.

If any single node has a positive ``seqno`` value, then that node can be
used to restart the cluster. However, because there is no guarantee that
each node has an identical copy of the data, we do not recommend
restarting the cluster using the ``--wsrep-new-cluster`` command on one
node.
Rebuilding a container
~~~~~~~~~~~~~~~~~~~~~~

Recovering from certain failures requires rebuilding one or more containers.

#. Disable the failed node on the load balancer.

   .. note::

      Do not rely on the load balancer health checks to disable the node.
      If the node is not disabled, the load balancer sends SQL requests
      to it before it rejoins the cluster, which can cause data
      inconsistencies.

#. Destroy the container and remove MariaDB data stored outside
   of the container:

   .. code-block:: shell-session

      # lxc-stop -n node3_galera_container-3ea2cbd3
      # lxc-destroy -n node3_galera_container-3ea2cbd3
      # rm -rf /openstack/node3_galera_container-3ea2cbd3/*

   In this example, node 3 failed.

#. Run the host setup playbook to rebuild the container on node 3:

   .. code-block:: shell-session

      # openstack-ansible setup-hosts.yml \
        -l node3,node3_galera_container-3ea2cbd3

   The playbook restarts all other containers on the node.

#. Run the infrastructure playbook to configure the container
   specifically on node 3:

   .. code-block:: shell-session

      # openstack-ansible setup-infrastructure.yml \
        -l node3_galera_container-3ea2cbd3

   .. warning::

      The new container runs a single-node Galera cluster, which is a
      dangerous state because the environment contains more than one
      active database with potentially different data.

   .. code-block:: shell-session

      # ansible galera_container -m shell -a "mysql \
        -h localhost -e 'show status like \"%wsrep_cluster_%\";'"
      node3_galera_container-3ea2cbd3 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     1
      wsrep_cluster_size        1
      wsrep_cluster_state_uuid  da078d01-29e5-11e4-a051-03d896dbdb2d
      wsrep_cluster_status      Primary

      node2_galera_container-49a47d25 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     4
      wsrep_cluster_size        2
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     4
      wsrep_cluster_size        2
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

#. Restart MariaDB in the new container and verify that it rejoins the
   cluster.

   .. note::

      In larger deployments, it may take some time for the MariaDB daemon to
      start in the new container. It will be synchronizing data from the other
      MariaDB servers during this time. You can monitor the status during this
      process by tailing the ``/var/log/mysql_logs/galera_server_error.log``
      log file.

      Lines starting with ``WSREP_SST`` will appear during the sync process
      and you should see a line with ``WSREP: SST complete, seqno: <NUMBER>``
      if the sync was successful.

   .. code-block:: shell-session

      # ansible galera_container -m shell -a "mysql \
        -h localhost -e 'show status like \"%wsrep_cluster_%\";'"
      node2_galera_container-49a47d25 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     5
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node3_galera_container-3ea2cbd3 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     5
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     5
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

#. Enable the failed node on the load balancer.
32 doc/source/draft-operations-guide/ops-galera-remove.rst Normal file
@@ -0,0 +1,32 @@
==============
Removing nodes
==============

In the following example, all but one node was shut down gracefully:

.. code-block:: shell-session

   # ansible galera_container -m shell -a "mysql -h localhost \
     -e 'show status like \"%wsrep_cluster_%\";'"
   node3_galera_container-3ea2cbd3 | FAILED | rc=1 >>
   ERROR 2002 (HY000): Can't connect to local MySQL server
   through socket '/var/run/mysqld/mysqld.sock' (2)

   node2_galera_container-49a47d25 | FAILED | rc=1 >>
   ERROR 2002 (HY000): Can't connect to local MySQL server
   through socket '/var/run/mysqld/mysqld.sock' (2)

   node4_galera_container-76275635 | success | rc=0 >>
   Variable_name             Value
   wsrep_cluster_conf_id     7
   wsrep_cluster_size        1
   wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
   wsrep_cluster_status      Primary

Compare this example output with the output from the multi-node failure
scenario where the remaining operational node is non-primary and stops
processing SQL requests. Gracefully shutting down the MariaDB service on
all but one node allows the remaining operational node to continue
processing SQL requests. When gracefully shutting down multiple nodes,
perform the actions sequentially to retain operation.
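A graceful shutdown itself can be issued per node from the deployment host.
A sketch, assuming the sysvinit service name used elsewhere in this guide and
an illustrative container name:

.. code-block:: shell-session

    # ansible node2_galera_container-49a47d25 -m shell \
      -a "/etc/init.d/mysql stop"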
88 doc/source/draft-operations-guide/ops-galera-start.rst Normal file
@@ -0,0 +1,88 @@
==================
Starting a cluster
==================

Gracefully shutting down all nodes destroys the cluster. Starting or
restarting a cluster from zero nodes requires creating a new cluster on
one of the nodes.

#. Start a new cluster on the most advanced node.
   Check the ``seqno`` value in the ``grastate.dat`` file on all of the nodes:

   .. code-block:: shell-session

      # ansible galera_container -m shell -a "cat /var/lib/mysql/grastate.dat"
      node2_galera_container-49a47d25 | success | rc=0 >>
      # GALERA saved state
      version: 2.1
      uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1
      seqno: 31
      cert_index:

      node3_galera_container-3ea2cbd3 | success | rc=0 >>
      # GALERA saved state
      version: 2.1
      uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1
      seqno: 31
      cert_index:

      node4_galera_container-76275635 | success | rc=0 >>
      # GALERA saved state
      version: 2.1
      uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1
      seqno: 31
      cert_index:

   In this example, all nodes in the cluster contain the same positive
   ``seqno`` values as they were synchronized just prior to
   graceful shutdown. If all ``seqno`` values are equal, any node can
   start the new cluster.

   .. code-block:: shell-session

      # /etc/init.d/mysql start --wsrep-new-cluster

   This command results in a cluster containing a single node. The
   ``wsrep_cluster_size`` value shows the number of nodes in the
   cluster.

   .. code-block:: shell-session

      node2_galera_container-49a47d25 | FAILED | rc=1 >>
      ERROR 2002 (HY000): Can't connect to local MySQL server
      through socket '/var/run/mysqld/mysqld.sock' (111)

      node3_galera_container-3ea2cbd3 | FAILED | rc=1 >>
      ERROR 2002 (HY000): Can't connect to local MySQL server
      through socket '/var/run/mysqld/mysqld.sock' (2)

      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     1
      wsrep_cluster_size        1
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

#. Restart MariaDB on the other nodes and verify that they rejoin the
   cluster.

   .. code-block:: shell-session

      node2_galera_container-49a47d25 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     3
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node3_galera_container-3ea2cbd3 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     3
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary

      node4_galera_container-76275635 | success | rc=0 >>
      Variable_name             Value
      wsrep_cluster_conf_id     3
      wsrep_cluster_size        3
      wsrep_cluster_state_uuid  338b06b0-2948-11e4-9d06-bef42f6c52f1
      wsrep_cluster_status      Primary
18 doc/source/draft-operations-guide/ops-galera.rst Normal file
@@ -0,0 +1,18 @@
==========================
Galera cluster maintenance
==========================

.. toctree::

   ops-galera-remove.rst
   ops-galera-start.rst
   ops-galera-recovery.rst

Routine maintenance includes gracefully adding or removing nodes from
the cluster without impacting operation and also starting a cluster
after gracefully shutting down all nodes.

MySQL instances are restarted when creating a cluster, when adding a
node, when the service is not running, or when changes are made to the
``/etc/mysql/my.cnf`` configuration file.
38 doc/source/draft-operations-guide/ops-lxc-commands.rst Normal file
@@ -0,0 +1,38 @@
========================
Linux Container commands
========================

The following are some useful commands to manage LXC:

- List containers and summary information such as operational state and
  network configuration:

  .. code-block:: shell-session

     # lxc-ls --fancy

- Show container details including operational state, resource
  utilization, and ``veth`` pairs:

  .. code-block:: shell-session

     # lxc-info --name container_name

- Start a container:

  .. code-block:: shell-session

     # lxc-start --name container_name

- Attach to a container:

  .. code-block:: shell-session

     # lxc-attach --name container_name

- Stop a container:

  .. code-block:: shell-session

     # lxc-stop --name container_name
49 doc/source/draft-operations-guide/ops-remove-computehost.rst Normal file
@@ -0,0 +1,49 @@
=======================
Removing a compute host
=======================

The `openstack-ansible-ops <https://git.openstack.org/cgit/openstack/openstack-ansible-ops>`_
repository contains a playbook for removing a compute host from an
OpenStack-Ansible (OSA) environment.
To remove a compute host, follow the procedure below.

.. note::

   This guide describes how to remove a compute node from an OSA environment
   completely. Perform these steps with caution, as the compute node will no
   longer be in service after the steps have been completed. This guide assumes
   that all data and instances have been properly migrated.

#. Disable all OpenStack services running on the compute node.
   This can include, but is not limited to, the ``nova-compute`` service
   and the neutron agent service.

   .. note::

      Ensure this step is performed first.

   .. code-block:: console

      # Run these commands on the compute node to be removed
      # stop nova-compute
      # stop neutron-linuxbridge-agent

#. Clone the ``openstack-ansible-ops`` repository to your deployment host:

   .. code-block:: console

      $ git clone https://git.openstack.org/openstack/openstack-ansible-ops \
        /opt/openstack-ansible-ops

#. Run the ``remove_compute_node.yml`` Ansible playbook with the
   ``node_to_be_removed`` user variable set:

   .. code-block:: console

      $ cd /opt/openstack-ansible-ops/ansible_tools/playbooks
      $ openstack-ansible remove_compute_node.yml \
        -e node_to_be_removed="<name-of-compute-host>"

#. After the playbook completes, remove the compute node from the
   OpenStack-Ansible configuration file in
   ``/etc/openstack_deploy/openstack_user_config.yml``.
38 doc/source/draft-operations-guide/ops-tips.rst Normal file
@@ -0,0 +1,38 @@
===============
Tips and tricks
===============

Ansible forks
~~~~~~~~~~~~~

The default MaxSessions setting for the OpenSSH Daemon is 10. Each Ansible
fork makes use of a session. By default, Ansible sets the number of forks to
5. However, you can increase the number of forks used in order to improve
deployment performance in large environments.

Note that more than 10 forks will cause issues for any playbooks
which use ``delegate_to`` or ``local_action`` in the tasks. It is
recommended that the number of forks is not raised when executing against the
Control Plane, as this is where delegation is most often used.

The number of forks used may be changed on a permanent basis by setting
``ANSIBLE_FORKS`` in your ``.bashrc`` file.
Alternatively it can be changed for a particular playbook execution by using
the ``--forks`` CLI parameter. For example, the following executes the nova
playbook against the control plane with 10 forks, then against the compute
nodes with 50 forks.

.. code-block:: shell-session

   # openstack-ansible --forks 10 os-nova-install.yml --limit compute_containers
   # openstack-ansible --forks 50 os-nova-install.yml --limit compute_hosts
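For the persistent option, a minimal sketch, assuming the
``openstack-ansible`` wrapper honors the ``ANSIBLE_FORKS`` environment
variable as described above:

.. code-block:: bash

   # Illustrative ~/.bashrc entry
   export ANSIBLE_FORKS=15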
For more information about forks, please see the following references:

* OpenStack-Ansible `Bug 1479812`_
* Ansible `forks`_ entry for ansible.cfg
* `Ansible Performance Tuning`_

.. _Bug 1479812: https://bugs.launchpad.net/openstack-ansible/+bug/1479812
.. _forks: http://docs.ansible.com/ansible/intro_configuration.html#forks
.. _Ansible Performance Tuning: https://www.ansible.com/blog/ansible-performance-tuning
125 doc/source/draft-operations-guide/ops-troubleshooting.rst Normal file
@@ -0,0 +1,125 @@
===============
Troubleshooting
===============

Host kernel upgrade from version 3.13
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Ubuntu kernel packages newer than version 3.13 contain a change in
module naming from ``nf_conntrack`` to ``br_netfilter``. After
upgrading the kernel, re-run the ``openstack-hosts-setup.yml``
playbook against those hosts. See `OSA bug 1579963`_ for more
information.

.. _OSA bug 1579963: https://bugs.launchpad.net/openstack-ansible/+bug/1579963
Container networking issues
~~~~~~~~~~~~~~~~~~~~~~~~~~~

All LXC containers on the host have two virtual Ethernet interfaces:

* `eth0` in the container connects to `lxcbr0` on the host
* `eth1` in the container connects to `br-mgmt` on the host

.. note::

   Some containers, such as ``cinder``, ``glance``, ``neutron_agents``, and
   ``swift_proxy``, have more than two interfaces to support their
   functions.
Predictable interface naming
----------------------------

On the host, all virtual Ethernet devices are named based on their
container as well as the name of the interface inside the container:

.. code-block:: shell-session

   ${CONTAINER_UNIQUE_ID}_${NETWORK_DEVICE_NAME}

As an example, an all-in-one (AIO) build might provide a utility
container called `aio1_utility_container-d13b7132`. That container
will have two network interfaces: `d13b7132_eth0` and `d13b7132_eth1`.

Another option would be to use the LXC tools to retrieve information
about the utility container:

.. code-block:: shell-session

   # lxc-info -n aio1_utility_container-d13b7132

   Name:           aio1_utility_container-d13b7132
   State:          RUNNING
   PID:            8245
   IP:             10.0.3.201
   IP:             172.29.237.204
   CPU use:        79.18 seconds
   BlkIO use:      678.26 MiB
   Memory use:     613.33 MiB
   KMem use:       0 bytes
   Link:           d13b7132_eth0
    TX bytes:      743.48 KiB
    RX bytes:      88.78 MiB
    Total bytes:   89.51 MiB
   Link:           d13b7132_eth1
    TX bytes:      412.42 KiB
    RX bytes:      17.32 MiB
    Total bytes:   17.73 MiB

The ``Link:`` lines will show the network interfaces that are attached
to the utility container.
Reviewing container networking traffic
--------------------------------------

To dump traffic on the ``br-mgmt`` bridge, use ``tcpdump`` to see all
communications between the various containers. To narrow the focus,
run ``tcpdump`` only on the desired network interface of the
containers.
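As a sketch, the first command watches everything crossing the bridge and
the second narrows to a single container interface (the interface name
reuses the illustrative utility container above):

.. code-block:: shell-session

   # tcpdump -ni br-mgmt
   # tcpdump -ni d13b7132_eth1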
Cached Ansible facts issues
~~~~~~~~~~~~~~~~~~~~~~~~~~~

At the beginning of a playbook run, information about each host is gathered.
Examples of the information gathered are:

* Linux distribution
* Kernel version
* Network interfaces

To improve performance, particularly in large deployments, you can
cache host facts and information.

OpenStack-Ansible enables fact caching by default. The facts are
cached in JSON files within ``/etc/openstack_deploy/ansible_facts``.

Fact caching can be disabled by commenting out the ``fact_caching``
parameter in ``playbooks/ansible.cfg``. Refer to the Ansible
documentation on `fact caching`_ for more details.

.. _fact caching: http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching

Forcing regeneration of cached facts
------------------------------------

Cached facts may be incorrect if the host receives a kernel upgrade or new
network interfaces. Newly created bridges also disrupt cached facts.

This can lead to unexpected errors while running playbooks, and
require that the cached facts be regenerated.

Run the following command to remove all currently cached facts for all hosts:

.. code-block:: shell-session

   # rm /etc/openstack_deploy/ansible_facts/*

New facts will be gathered and cached during the next playbook run.

To clear facts for a single host, find its file within
``/etc/openstack_deploy/ansible_facts/`` and remove it. Each host has
a JSON file that is named after its hostname. The facts for that host
will be regenerated on the next playbook run.