This is the operational procedures guide that HPE used to operate and monitor their public Swift systems. It has been made publicly available. Change-Id: Iefb484893056d28beb69265d99ba30c3c84add2b
80 lines
2.3 KiB
ReStructuredText
80 lines
2.3 KiB
ReStructuredText
=================
|
|
Swift Ops Runbook
|
|
=================
|
|
|
|
This document contains operational procedures that Hewlett Packard Enterprise (HPE) uses to operate
|
|
and monitor the Swift system within the HPE Helion Public Cloud. This
|
|
document is an excerpt of a larger product-specific handbook. As such,
|
|
the material may appear incomplete. The suggestions and recommendations
|
|
made in this document are for our particular environment, and may not be
|
|
suitable for your environment or situation. We make no representations
|
|
concerning the accuracy, adequacy, completeness or suitability of the
|
|
information, suggestions or recommendations. This document are provided
|
|
for reference only. We are not responsible for your use of any
|
|
information, suggestions or recommendations contained herein.
|
|
|
|
This document also contains references to certain tools that we use to
|
|
operate the Swift system within the HPE Helion Public Cloud.
|
|
Descriptions of these tools are provided for reference only, as the tools themselves
|
|
are not publically available at this time.
|
|
|
|
- ``swift-direct``: This is similar to the ``swiftly`` tool.
|
|
|
|
|
|
.. toctree::
|
|
:maxdepth: 2
|
|
|
|
general.rst
|
|
diagnose.rst
|
|
procedures.rst
|
|
maintenance.rst
|
|
troubleshooting.rst
|
|
|
|
Is the system up?
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
If you have a report that Swift is down, perform the following basic checks:
|
|
|
|
#. Run swift functional tests.
|
|
|
|
#. From a server in your data center, use ``curl`` to check ``/healthcheck``.
|
|
|
|
#. If you have a monitoring system, check your monitoring system.
|
|
|
|
#. Check on your hardware load balancers infrastructure.
|
|
|
|
#. Run swift-recon on a proxy node.
|
|
|
|
Run swift function tests
|
|
------------------------
|
|
|
|
We would recommend that you set up your function tests against your production
|
|
system.
|
|
|
|
A script for running the function tests is located in ``swift/.functests``.
|
|
|
|
|
|
External monitoring
|
|
-------------------
|
|
|
|
- We use pingdom.com to monitor the external Swift API. We suggest the
|
|
following:
|
|
|
|
- Do a GET on ``/healthcheck``
|
|
|
|
- Create a container, make it public (x-container-read:
|
|
.r\*,.rlistings), create a small file in the container; do a GET
|
|
on the object
|
|
|
|
Reference information
|
|
~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Reference: Swift startup/shutdown
|
|
---------------------------------
|
|
|
|
- Use reload - not stop/start/restart.
|
|
|
|
- Try to roll sets of servers (especially proxy) in groups of less
|
|
than 20% of your servers.
|
|
|