Update documentation to match timmy 1.26.6 code

Might have missed something though!..

Change-Id: Ia35c5398570c6c46ec9a7d2e28a1865b56b24013
Dmitry Sutyagin 2017-02-14 15:20:33 -08:00
parent 7d0c582065
commit d485bdfbb6
3 changed files with 102 additions and 58 deletions

@ -4,45 +4,46 @@ General configuration
All default configuration values are defined in ``timmy/conf.py``. Timmy works with these values if no configuration file is provided.
If a configuration file is provided via ``-c | --config`` option, it overlays the default configuration.
An example of a configuration file is ``config.yaml``.
An example of a configuration file is ``timmy_data/rq/config/example.yaml``.
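As a rough illustration of what "overlays" means here, the Python sketch below replaces default values with values from a user-provided config, key by key. The ``overlay`` helper and the values in ``DEFAULTS`` are hypothetical and are not taken from ``timmy/conf.py``; how Timmy actually merges nested settings may differ.

```python
import copy

# Illustrative defaults only; the real ones are defined in timmy/conf.py.
DEFAULTS = {"timeout": 15, "clean": True, "logs_days": 30}


def overlay(defaults, user_conf):
    """Return a new dict: defaults with top-level keys replaced by user_conf."""
    merged = copy.deepcopy(defaults)
    merged.update(user_conf)
    return merged


# A user config that only changes logs_days keeps every other default.
conf = overlay(DEFAULTS, {"logs_days": 5})
print(conf)
```

The defaults themselves are left untouched, so running without a config file still yields the built-in values.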
Some of the parameters available in configuration file:
* **ssh_opts** parameters to send to ssh command directly (recommended to leave at default), such as connection timeout, etc. See ``timmy/conf.py`` to review defaults.
* **env_vars** environment variables to pass to the commands and scripts - you can use these to expand variables in commands or scripts
* **fuel_ip** the IP address of the master node in the environment
* **fuel_user** username to use for accessing Nailgun API
* **fuel_pass** password to access Nailgun API
* **fuel_tenant** Fuel Keystone tenant to use when accessing Nailgun API
* **fuel_port** port to use when connecting to Fuel Nailgun API
* **fuel_keystone_port** port to use when getting a Keystone token to access Nailgun API
* **fuelclient** True/False - whether to use fuelclient library to access Nailgun API
* **fuel_skip_proxy** True/False - ignore ``http(s)_proxy`` environment variables when connecting to Nailgun API
* **rqdir** the path to the directory containing rqfiles, scripts to execute, and filelists to pass to rsync
* **ssh_opts** - parameters to send to ssh command directly (recommended to leave at default), such as connection timeout, etc. See ``timmy/conf.py`` to review defaults.
* **env_vars** - environment variables to pass to the commands and scripts - you can use these to expand variables in commands or scripts
* **fuel_ip** - the IP address of the master node in the environment
* **fuel_user** - username to use for accessing Nailgun API
* **fuel_pass** - password to access Nailgun API
* **fuel_tenant** - Fuel Keystone tenant to use when accessing Nailgun API
* **fuel_port** - port to use when connecting to Fuel Nailgun API
* **fuel_keystone_port** - port to use when getting a Keystone token to access Nailgun API
* **fuelclient** - True/False - whether to use fuelclient library to access Nailgun API
* **fuel_skip_proxy** - True/False - ignore ``http(s)_proxy`` environment variables when connecting to Nailgun API
* **rqdir** - the path to the directory containing rqfiles, scripts to execute, and filelists to pass to rsync
* **rqfile** - list of dicts:
* **file** - path to an rqfile containing actions and/or other configuration parameters
* **default** - should always be False, except when included default.yaml is used. This option is used to make **logs_no_default** work
* **logs_days** how many past days of logs to collect. This option will set **start** parameter for each **logs** action if not defined in it.
* **logs_speed_limit** True/False - enable speed limiting of log transfers (total transfer speed limit, not per-node)
* **logs_speed_default** Mbit/s - used when autodetect fails
* **logs_speed** Mbit/s - manually specify max bandwidth
* **logs_size_coefficient** a float value used to check local free space; 'logs size * coefficient' must be < free space; values lower than 0.3 are not recommended and will likely cause the local disk to fill up during log collection
* **do_print_results** print outputs of commands and scripts to stdout
* **clean** True/False - erase previous results in outdir and archive_dir dir, if any
* **outdir** directory to store output data. **WARNING: this directory is WIPED by default at the beginning of data collection. Be careful with what you define here.**
* **archive_dir** directory to put resulting archives into
* **timeout** timeout for SSH commands and scripts in seconds
* **default** - True/False - this option is used to make **logs_no_default** work (see below). Optional.
* **logs_no_default** - True/False - do not collect logs defined in any rqfile for which "default" is True
* **logs_days** - how many past days of logs to collect. This option will set **start** parameter for each **logs** action if not defined in it.
* **logs_speed_limit** - True/False - enable speed limiting of log transfers (total transfer speed limit, not per-node)
* **logs_speed_default** - Mbit/s - used when autodetect fails
* **logs_speed** - Mbit/s - manually specify max bandwidth
* **logs_size_coefficient** - a float value used to check local free space; 'logs size * coefficient' must be < free space; values lower than 0.3 are not recommended and will likely cause the local disk to fill up during log collection
* **do_print_results** - print outputs of commands and scripts to stdout
* **clean** - True/False - erase previous results in outdir and archive_dir, if any
* **outdir** - directory to store output data. **WARNING: this directory is WIPED by default at the beginning of data collection. Be careful with what you define here.**
* **archive_dir** - directory to put resulting archives into
* **timeout** - timeout for SSH commands and scripts in seconds
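The interaction between **logs_days** and the **start** parameter of a **logs** action can be pictured with the short Python sketch below. It is only an assumption-based illustration: ``apply_logs_days`` is a hypothetical helper, not a Timmy function, and it models each logs action as a plain dict.

```python
def apply_logs_days(logs_actions, logs_days):
    """Set 'start' to logs_days on every logs action that does not define it."""
    for action in logs_actions:
        action.setdefault("start", logs_days)
    return logs_actions


actions = [
    {"path": "/var/log", "start": "2017-02-01"},  # explicit start is kept
    {"path": "/var/log/remote"},                  # start is filled in
]
apply_logs_days(actions, 5)
print(actions)
```

An action that already defines **start** is left alone; only actions without it inherit the global number of days.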
===================
Configuring actions
===================
Actions can be configured in a separate yaml file (by default ``rq.yaml`` is used) and / or defined in the main config file or passed via command line options ``-P``, ``-C``, ``-S``, ``-G``.
Actions can be configured in a separate yaml file (by default ``timmy_data/rq/default.yaml`` is used) and / or defined in the main config file or passed via command line options ``-P``, ``-C``, ``-S``, ``-G``.
The following actions are available for definition:
* **put** - a list of tuples / 2-element lists: [source, destination]. Passed to ``scp`` like so ``scp source <node-ip>:destination``. Wildcards supported for source.
* **cmds** - a list of dicts: {'command-name':'command-string'}. Example: {'command-1': 'uptime'}. Command string is a bash string. Commands are executed in a sorted order of their names.
* **cmds** - a list of dicts: {'command-name':'command-string'}. Example: {'command-1': 'uptime'}. Command string is a bash string. Commands are executed in alphabetical order of their names.
* **scripts** - a list of elements, each of which can be a string or a dict:
* string - represents a script filename located on a local system. If filename does not contain a path separator, the script is expected to be located inside ``rqdir/scripts``. Otherwise the provided path is used to access the script. Example: ``'./my-test-script.sh'``
* dict - use this option if you need to pass variables to your script. Script parameters are not supported, but you can use env variables instead. A dict should only contain one key which is the script filename (read above), and the value is a Bash space-separated variable assignment string. Example: ``'./my-test-script.sh': 'var1=123 var2="HELLO WORLD"'``
@ -52,9 +53,9 @@ The following actions are available for definition:
* **filelists** - a list of filelist filenames located on a local system. Filelist is a text file containing files and directories to collect, passed to rsync. Does not support wildcards. If the filename does not contain path separator, the filelist is expected to be located inside ``rqdir/filelists``. Otherwise the provided path is used to read the filelist.
* **logs**
* **path** - base path to scan for logs
* **include** - regexp string to match log files against for inclusion (if not set = include all)
* **exclude** - regexp string to match log files against. Excludes matched files from collection.
* **start** - date or datetime string to collect only files modified on or after the specified time. Format - ``YYYY-MM-DD`` or ``YYYY-MM-DD HH:MM:SS`` or ``N`` where N = integer number of days (meaning last N days).
* **include** - list of regexp strings to match log files against for inclusion (if not set = include all). Optional.
* **exclude** - list of regexp strings to match log files against. Excludes matched files from collection. Optional.
* **start** - date or datetime string to collect only files modified on or after the specified time. Format - ``YYYY-MM-DD`` or ``YYYY-MM-DD HH:MM:SS`` or ``N`` where N = integer number of days (meaning last N days). Optional.
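The three accepted forms of **start** can be sketched in Python as follows. This is a minimal illustration of the documented formats, not Timmy's own parser; ``parse_start`` and its ``now`` argument are hypothetical names introduced for the example.

```python
from datetime import datetime, timedelta


def parse_start(value, now=None):
    """Turn a 'start' value into the earliest modification time to collect."""
    now = now or datetime.now()
    text = str(value).strip()
    if text.isdigit():
        # N = integer number of days, meaning "the last N days".
        return now - timedelta(days=int(text))
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unsupported start value: %r" % (value,))


reference = datetime(2017, 2, 14)
print(parse_start("2017-02-01"))
print(parse_start("2017-02-01 08:30:00"))
print(parse_start(3, now=reference))
```

Passing ``now`` explicitly makes the "last N days" case reproducible; without it the current time is used.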
===============
Filtering nodes
@ -141,7 +142,7 @@ rqfile format
by_roles:
compute: [d, e, f]
The **config** and **rqfile** definitions presented above are equivalent. It is possible to define config in a config file using the **config** format, or in an **rqfile** using **rqfile** format, linking to the **rqfile** in config with ``rqfile`` setting. It is also possible to define part here and part there. Mixing identical parameters in both places is not recommended - the results may be unpredictable (such a scenario has not been thoroughly tested). In general, **rqfile** is good for fewer settings with more parameter-based variations (``by_``), and main config for more different settings with less such variations.
The **config** and **rqfile** definitions presented above are equivalent. It is possible to define actions in a config file using the **config** format, or in an **rqfile** using **rqfile** format, linking to the **rqfile** in config with ``rqfile`` setting. It is also possible to define part here and part there. Mixing identical parameters in both places is not recommended - the results may be unpredictable (such a scenario has not been thoroughly tested). In general, **rqfile** is the preferred place to define actions.
===============================
Configuration application order

@ -6,13 +6,13 @@ OpenStack Ansible-like tool for parallel node operations: two-way data transfer,
* The tool is based on https://etherpad.openstack.org/p/openstack-diagnostics
* Should work fine in environments deployed by Fuel versions: 4.x, 5.x, 6.x, 7.0, 8.0, 9.0, 9.1
* Should work fine in environments deployed by Fuel versions: 4.x, 5.x, 6.x, 7.0, 8.0, 9.0, 9.1, 9.2
* Operates non-destructively.
* Can be launched on any host within admin network, provided the fuel node IP is specified and access to Fuel and other nodes is possible via ssh from the local system.
* Can be launched on any host within admin network, provided the fuel node IP is specified and access to Fuel and other nodes is possible via ssh from the local system.
* Parallel launch - only on the nodes that are 'online'. Some filters for nodes are also available.
* Commands (from ./cmds directory) are separated according to roles (detected automatically) by the symlinks. Thus, the command list may depend on release, roles and OS. In addition, there can be some commands that run everywhere. There are also commands that are executed only on one node according to its role, using the first node of this type they encounter.
* Modular: possible to create a special package that contains only certain required commands.
* Collects log files from the nodes using filters
* Some archives are created - general.tar.bz2 and logs-*
* Collects log files from the nodes using filename and timestamp filters
* Packs collected data
* Checks are implemented to prevent filesystem overfilling due to log collection; an appropriate error is shown.
* Can be imported into other python scripts (ex. https://github.com/f3flight/timmy-customtest) and used as a transport and structure to access node parameters known to Fuel, run commands on nodes, collect outputs, etc. with ease.

@ -4,11 +4,18 @@ Usage
**NOTICE:** Even though Timmy uses nice and ionice to limit impact on the cloud, you should still expect 1 core utilization both locally (where Timmy is launched) and on each node where commands are executed or logs collected. Additionally, if logs are collected, local disk (log destination directory) may get utilized significantly.
The easiest way to launch timmy would be running the ``timmy.py`` script.
However, you need to :doc:`configure </configuration>` it first.
**WARNING** If modifying the ``outdir`` config parameter, please first read the related warning on the :doc:`configuration </configuration>` page.
Basically, the ``timmy.py`` is a simple wrapper that launches ``cli.py``.
Full :doc:`reference </cli>` for command line interface
The easiest way to launch Timmy would be running the ``timmy.py`` script / ``timmy`` command:
* Timmy will perform all actions defined in the ``default.yaml`` rq-file. The file is located in the ``timmy_data/rq`` folder in the Python installation directory. Specifically:
* run diagnostic scripts on all nodes, including Fuel server
* collect configuration files for all nodes
* Timmy will **NOT** collect log files when executed this way.
Basically, ``timmy.py`` is a simple wrapper that launches ``cli.py``.
* This page does not list all available CLI options. See the full :doc:`reference </cli>` for the command line interface.
* You may also want to create a custom :doc:`configuration </configuration>` for Timmy, depending on your use case.
Basic parameters:
@ -16,36 +23,61 @@ Basic parameters:
* ``-l``, ``--logs`` also collect logs (logs are not collected by default due to their size)
* ``-e``, ``--env`` filter by environment ID
* ``-R``, ``--role`` filter by role
* ``-c``, ``--config`` use custom configuration file to overwrite defaults. See ``config.yaml`` as an example
* ``-c``, ``--config`` use custom configuration file to overwrite defaults. See ``timmy_data/config/example.yaml`` as an example
* ``-j``, ``--nodes-json`` use json file instead of polling Fuel (to generate json file use ``fuel node --json``) - speeds up initialization
* ``-o``, ``--dest-file`` the name/path for output archive, default is ``general.tar.gz`` and put into ``/tmp/timmy/archives``.
* ``-v``, ``--verbose`` verbose(INFO) logging
* ``-d``, ``--debug`` debug(DEBUG) logging
* ``-o``, ``--dest-file`` the name/path for the output archive; the default is ``general.tar.gz``, placed into ``/tmp/timmy/archives``. The folder will be created if it does not exist. It is not recommended to use ``/var/log`` as the destination because subsequent runs with log collection may cause Timmy to collect its own previously created files or even update them while reading from them. In general, the destination directory should have enough space to hold all collected data and should not be inside any collection path.
* ``-v``, ``--verbose`` verbose(INFO) logging. Use ``-vv`` to enable DEBUG logging.
**Shell Mode** - a mode of execution which makes the following changes:
==========
Shell Mode
==========
* rqfile (``rq.yaml`` by default) is skipped
* Fuel node is skipped
**Shell Mode** is activated whenever any of the following parameters are used via CLI: ``-C``, ``-S``, ``-P``, ``-G``.
This mode of execution makes the following changes:
* rqfile (``timmy_data/rq/default.yaml`` by default) is skipped
* Fuel node is skipped. If for some reason you need to run specific scripts/actions via Timmy on Fuel and on other nodes at the same time, create an rqfile instead (see :doc:`configuration </configuration>` for details, see ``timmy_data/rq/neutron.yaml`` as an example), coupled with ``--rqfile`` option or a custom config file to override default rqfile.
* outputs of commands (specified with ``-C`` options) and scripts (specified with ``-S``) are printed on screen
* any actions (cmds, scripts, files, filelists, put, **except** logs) and Parameter Based configuration defined in config are ignored.
The following parameters ("actions") are available, the usage of any of them enables **Shell Mode**:
The following parameters ("actions") are available via CLI:
* ``-C <command>`` - Bash command (string) to execute on nodes. Using multiple ``-C`` statements will produce the same result as using one with several commands separated by ``;`` (traditional Shell syntax), but for each ``-C`` statement a new SSH connection is established
* ``-C <command>`` - Bash command (string) to execute on nodes. Using multiple ``-C`` statements will produce the same result as using one with several commands separated by ``;`` (traditional Shell syntax), but for each ``-C`` statement a new SSH connection is established.
* ``-S <script>`` - name of the Bash script file to execute on nodes. If the filename does not contain a path separator, the file must be placed into the ``scripts`` folder inside the path specified by the ``rqdir`` config parameter (defaults to ``rq``). If a path separator is present, the given filename is used directly as provided.
* ``-P <file/path> <dest>`` - upload local data to nodes (wildcards supported). You must specify 2 values for each ``-P`` switch.
* ``-G <file/path>`` - download (collect) data from nodes
====
Logs
====
It's possible to specify custom log collection when using CLI:
* ``-L <base-path> <include-regex> <exclude-regex>``, ``--get-logs`` - specify a base path, include regex and exclude regex to collect logs. This option can be specified more than once; in this case the log lists are combined. This option **does not** disable default log collection defined in ``timmy_data/rq/default.yaml``.
* ``--logs-no-default`` - use this option if you **only** need logs specified via ``-L``.
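How an include regex and an exclude regex narrow down a list of log files can be sketched in Python as below. The sketch assumes that each regex is applied with ``re.search`` to the file path and that a file matching an exclude regex is dropped even if it also matches an include regex; ``select_logs`` and the sample paths are hypothetical and do not come from Timmy's code.

```python
import re


def select_logs(paths, include=None, exclude=None):
    """Keep paths matching any include regex (all paths if include is empty),
    then drop every path matching any exclude regex."""
    selected = []
    for path in paths:
        if include and not any(re.search(rx, path) for rx in include):
            continue
        if exclude and any(re.search(rx, path) for rx in exclude):
            continue
        selected.append(path)
    return selected


files = [
    "/var/log/remote/node-1/crmd.log",
    "/var/log/remote/node-1/nova.log",
    "/var/log/remote/node-1/pacemaker.log.1.gz",
]
picked = select_logs(
    files,
    include=["crmd|lrmd|corosync|pacemaker"],
    exclude=[r"\.gz$"],
)
print(picked)
```

With no include regex every file is a candidate, which mirrors the "if not set = include all" behaviour described for the **include** parameter.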
===============
Execution order
===============
Specified actions are executed for all applicable nodes, always in the following order:
1. put
2. commands
3. scripts
4. get, filelists
5. logs
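The fixed ordering above can be expressed as a small Python sketch: whatever order the actions are requested in, they are sorted by stage before running. The stage numbers follow the list above, while the action names used as dictionary keys and the ``execution_plan`` helper are illustrative assumptions rather than Timmy internals.

```python
# Stage numbers follow the documented order; the keys are illustrative labels.
STAGE = {
    "put": 1,
    "commands": 2,
    "scripts": 3,
    "get": 4,
    "filelists": 4,
    "logs": 5,
}


def execution_plan(requested):
    """Return the requested actions sorted by their documented stage."""
    return sorted(requested, key=lambda name: STAGE[name])


plan = execution_plan(["logs", "scripts", "get", "put", "filelists"])
print(plan)
```

Because Python's sort is stable, actions that share a stage (``get`` and ``filelists`` here) keep the relative order in which they were requested.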
========
Examples
========
* ``timmy`` - run according to the default configuration and default actions. Default actions are defined in ``rq.yaml`` (``/usr/share/timmy/rq.yaml``). Logs are not collected.
* ``timmy -l`` - run default actions and also collect logs (default log setup applied - defaults are hardcoded in ``timmy/conf.py``). Such execution is similar to Fuel's "diagnostic snapshot" action, but will finish faster and collect less logs.
* ``timmy --only-logs`` - only collect logs, no actions performed (default log setup, as above)
* ``timmy`` - run according to the default configuration and default actions. Default actions are defined in ``timmy_data/rq/default.yaml``. Logs are not collected.
* ``timmy -l`` - run default actions and also collect logs. Such execution is similar to Fuel's "diagnostic snapshot" action, but will finish faster and collect fewer logs. There is a default log collection period based on file modification time: only files modified within the last 30 days are collected.
* ``timmy -l --days 3`` - same as above but only collect log files updated within the last 3 days.
* ``timmy --only-logs`` - only collect logs, no actions (files, filelists, commands, scripts, put, get) performed.
* ``timmy -C 'uptime; free -m'`` - check uptime and memory on all nodes
* ``timmy -G /etc/nova/nova.conf`` - get nova.conf from all nodes
* ``timmy -R controller -P package.deb '' -C 'dpkg -i package.deb' -C 'rm package.deb' -C 'dpkg -l | grep [p]ackage'`` - push a package to all nodes, install it, remove the file and check that it is installed
* ``timmy -G /etc/nova/nova.conf`` - get ``nova.conf`` from all nodes
* ``timmy -R controller -P package.deb '' -C 'dpkg -i package.deb' -C 'rm package.deb' -C 'dpkg -l | grep [p]ackage'`` - push a package to all nodes with the controller role, install it, remove the file and check that it is installed. Commands are executed in the order in which they are provided.
* ``timmy -c myconf.yaml`` - use a custom config file and run the program according to it. Custom config can specify any actions, log setup, and other settings. See configuration doc for more details.
===============================
@ -67,7 +99,11 @@ If you want to perform a set of actions on the nodes without writing a long comm
02-disk-check: 'df -h'
and-also-ram: 'free -m'
logs:
exclude: '.*' # exclude all logs by default
path: '/var/log' # base path to search for logs
exclude: # a list of exclude regexes
- '.*' # exclude all logs by default - does not make much sense - just an example. If the intention is to not collect all logs then this 'logs' section can be removed altogether, just ensure that either rqfile is custom or 'null', or '--logs-no-default' is set via CLI / 'logs_no_default: True' set in config.
logs_days: 5 # collect only log files updated within the last 5 days
# an example of parameter-based configuration is below:
by_roles:
controller:
scripts: # I use script here to not overwrite the cmds we have already defined for all nodes
@ -76,9 +112,11 @@ If you want to perform a set of actions on the nodes without writing a long comm
- '/etc/coros*' # get all files from /etc/coros* wildcard path
fuel:
logs:
include: 'crmd|lrmd|corosync|pacemaker' # only get logs which names match (re.search is used) this regexp
path: '/var/log/remote'
include: # include regexp - non-matching log files will be excluded.
- 'crmd|lrmd|corosync|pacemaker'
Then you would run ``timmy -l -c my-config.yaml`` to execute timmy with such config.
Then you would run ``timmy -l -c my-config.yaml`` to execute Timmy with such config.
Instead of putting all structure in a config file you can move actions (cmds, files, filelists, scripts, logs) to an rqfile, and specify ``rqfile`` path in config (although in this example the config-way is more compact). ``rqfile`` structure is a bit different:
@ -100,16 +138,21 @@ Instead of putting all structure in a config file you can move actions (cmds, fi
logs:
by_roles:
fuel:
include: 'crmd|lrmd|corosync|pacemaker'
__default:
exclude: '.*'
path: '/var/log/remote'
include:
- 'crmd|lrmd|corosync|pacemaker'
__default: # again, this default section is useless, just serving as an example here.
path: '/var/log'
exclude:
- '.*'
Then the config should look like this:
::
rqdir: './pacemaker-debug'
rqfile: './pacemaker-rq.yaml'
rqfile:
- file: './pacemaker-rq.yaml'
hard_filter:
roles:
- fuel