40 Commits

Author SHA1 Message Date
Samuel Merritt
d88d12b120 Small clarification to swift-recon section of admin guide.
Apparently the use of port 6030 was causing some confusion.

Fixes bug 1095474.

Change-Id: I0cc71f4733ad91694e015a9b75c3eda080aca6fb
2013-03-17 15:58:06 -07:00
Jian Zhang
1d8a02f25c Added per disk PUT timing monitoring support.
Fixes bug 1104708

There could be a severe performance drop for Swift if one disk of one
storage node is problematic, due to the tragic state of async disk I/O.

This patch provides monitoring of PUT timing per kB transferred (ms/kB)
for each non-zero-byte request on each disk, reported to StatsD for
alerting.
- Adds "object-server.PUT.<device>.timing" metrics for object-server.
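
As a rough illustration of the ms/kB figure described above, the following sketch computes a per-kB PUT timing and builds the per-device metric name. The function names are hypothetical, and treating 1 kB as 1024 bytes is an assumption; neither is taken from the patch itself.

```python
def put_timing_ms_per_kb(elapsed_seconds, bytes_transferred):
    """Return milliseconds spent per kB transferred, or None for zero-byte requests.

    Hypothetical helper; assumes 1 kB == 1024 bytes.
    """
    if bytes_transferred <= 0:
        # Zero-byte requests are skipped, as the commit message describes.
        return None
    elapsed_ms = elapsed_seconds * 1000.0
    kilobytes = bytes_transferred / 1024.0
    return elapsed_ms / kilobytes


def put_timing_metric(device):
    """Build the per-device metric name quoted in the commit message."""
    return f"object-server.PUT.{device}.timing"


# Example: a 2048-byte PUT to device "sdb1" that took half a second.
timing = put_timing_ms_per_kb(0.5, 2048)
metric = put_timing_metric("sdb1")
```

A disk whose value is consistently far above its peers would be a candidate for the alerting this patch is aimed at.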

DocImpact.

Change-Id: Ie94bddad28e8be52e71683bf6c9db988664abe47
2013-02-28 02:52:06 -08:00
Darrell Bishop
bce8443c9e Adds first-byte latency timings for GET requests.
This was an outstanding TODO for StatsD Swift metrics.  A new timing
metric is tracked for (only) GET requests for accounts, containers,
and objects:

  proxy-server.<req_type>.GET.<status_int>.first-byte.timing

Also updated StatsD documentation in the Admin Guide to clarify that
timing metrics are sent in units of milliseconds.

Change-Id: I5bb781c06cefcb5280f4fb1112a526c029fe0c20
2013-02-13 15:58:57 -08:00
Jenkins
23f33b2069 Merge "Make statsd sample rate behave better." 2013-02-13 08:19:46 +00:00
Joe Gordon
45f0502b52 Fix spelling mistakes
git ls-files | misspellings -f -
Source: https://github.com/lyda/misspell-check

Change-Id: I4132e6a276e44e2a8985238358533d315ee8d9c4
2013-02-12 16:39:40 -08:00
Mehdi Abaakouk
a1395ec672 Allow changing the endpoint_type when using swift-dispersion tools
Fixes bug 1102319
DocImpact

Change-Id: I8fb0417ab9468e97ed01a6cb1e262630905e7f29
2013-01-31 16:10:37 +01:00
Florian Hines
00dbad0825 Add optional locking to swift-ring-builder
If invoked as 'swift-ring-builder-safe' the directory containing the builder
file provided will be locked (via lock_parent_directory()). This provides a
small safeguard against multiple instances of the swift-ring-builder (or
other utilities that observe this lock) attempting to write to or read
the builder/ring files while operations are in progress.

This is particularly useful in environments where ring management has been
automated (via Chef or custom solutions) but the operator still occasionally
needs to manually interact with the ring.

DocImpact

Change-Id: Ia362744a8151a91bfb586d01da582906726852e6
2013-01-25 08:00:33 -08:00
Darrell Bishop
8801b74090 Make statsd sample rate behave better.
As Dieter pointed out in bug 1090495
(https://bugs.launchpad.net/swift/+bug/1090495), the volume of metrics
can vary wildly between StatsD metrics.

This patch implements a partial solution by reducing the sample_rate
used for known high-volume metrics (operational experience will need to
inform this over time) and introducing a new tunable,
log_statsd_sample_rate_factor which is multiplied by the sample_rate for
every statsd stat.  This tunable can be used to reduce StatsD traffic
proportionally for all metrics and is intended to replace
log_statsd_default_sample_rate, which is left alone for
backward-compatibility, should anyone be using it.
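
The scaling described above can be sketched as follows. This is a minimal illustration, not the patch's code: the function names and the injectable `rng` parameter are hypothetical, and the only details taken from the message are that the factor is multiplied by each stat's sample_rate. The `|@rate` suffix is the standard StatsD wire format for sampled counters.

```python
import random


def effective_sample_rate(sample_rate, sample_rate_factor=1.0):
    """Scale a per-metric sample rate by the global factor."""
    return sample_rate * sample_rate_factor


def format_counter(name, sample_rate, sample_rate_factor=1.0, rng=random.random):
    """Return the StatsD payload for a counter increment, or None if sampled out.

    Hypothetical helper; rng is injectable so the sampling decision is testable.
    """
    rate = effective_sample_rate(sample_rate, sample_rate_factor)
    if rate < 1 and rng() >= rate:
        # This increment was not selected by sampling, so nothing is sent.
        return None
    payload = f"{name}:1|c"
    if rate < 1:
        # Tell the StatsD server the rate so it can scale the count back up.
        payload += f"|@{rate}"
    return payload
```

With a factor of 0.5, a metric already sampled at 0.5 would be sent about a quarter of the time, which is how one tunable can reduce traffic proportionally across all metrics.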

This patch also includes a drive-by fix for log_udp_port which wasn't
being converted to an int (I didn't verify that actually causes trouble
in SysLogHandler(), but it's definitely an improvement regardless).

Change-Id: Id404636e3629f6431cf1c4e64a143959750a3c23
2013-01-19 15:25:27 -08:00
Florian Hines
e474dfb720 Add dispersion report flags to limit reports
- Add two optional flags that let you limit swift-dispersion-report to only
reporting on containers OR objects.
- Also make dispersion.conf and swift-dispersion-report manpages
  current.

DocImpact

Change-Id: Iad56133cad261241db27d0e2103098e3c2f3c245
2012-12-09 18:20:08 -06:00
Jenkins
3af76e1096 Merge "statsd timing refactor" 2012-11-07 01:27:56 +00:00
Michael Barton
3586f829b0 statsd timing refactor
Change-Id: I99d9ddfbcad0f88e75c49235c8317ea97237d4e4
2012-11-06 15:39:25 -08:00
John Dickinson
ec75d1e343 add OPTIONS to proxy_logging configs and docs
Change-Id: I77e1d7fdcf217826402beeb7d583e3c7279c416c
2012-11-06 15:13:01 -08:00
Florian Hines
de09cbe6f4 Extended documentation for using custom loggers
Change-Id: I78a5c109c9440df752e390698502f57d4392fb67
2012-10-26 17:59:42 -05:00
Samuel Merritt
851bbe2ea9 Track unlinks of async_pendings.
It's not sufficient to just look at swift.object-updater.successes to
see the async_pending unlink rate. There are two different spots where
unlinks happen: one when an async_pending has been successfully
processed, and another when the updater notices multiple
async_pendings for the same object. Both events are now tracked under
the same name: swift.object-updater.unlinks.

FakeLogger has now sprouted a couple of convenience methods for
testing logged metrics.

Fixed pep8 1.3.3's complaints in the files this diff touches.

Also: bonus speling and, grammar fixes in the admin guide.

Change-Id: I8c1493784adbe24ba2b5512615e87669b3d94505
2012-10-23 10:27:21 -07:00
David Goetz
a6c44d2764 allow replicator run_once to check specific devices/partitions
Change-Id: If45f77fda269ae6e251579542e70eb71bd11fe2a
2012-09-28 12:24:15 -07:00
Darrell Bishop
4a2ae2b460 Updating proxy-server StatsD logging.
Removed many StatsD logging calls in proxy-server and added
swift-informant-style catch-all logging in the proxy-logger middleware.
Many errors previously rolled into the "proxy-server.<type>.errors"
counter will now appear broken down by response code and with timing
data at: "proxy-server.<type>.<verb>.<status>.timing".  Also, bytes
transferred (sum of in + out) will be at:
"proxy-server.<type>.<verb>.<status>.xfer".  The proxy-logging
middleware can get its StatsD config from standard vars in [DEFAULT] or
from access_log_statsd_* config vars in its config section.

Similarly to Swift Informant, request methods ("verbs") are filtered
using the new proxy-logging config var, "log_statsd_valid_http_methods"
which defaults to GET, HEAD, POST, PUT, DELETE, and COPY.  Requests with
methods not in this list use "BAD_METHOD" for <verb> in the metric name.
To avoid user error, access_log_statsd_valid_http_methods is also
accepted.
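
A minimal sketch of the verb filtering described above, assuming an exact-match check against the configured list. The function name is hypothetical; the default method list and the "BAD_METHOD" placeholder come from this commit message.

```python
# Defaults as listed in the commit message for log_statsd_valid_http_methods.
DEFAULT_VALID_METHODS = ("GET", "HEAD", "POST", "PUT", "DELETE", "COPY")


def timing_metric_name(req_type, method, status_int,
                       valid_methods=DEFAULT_VALID_METHODS):
    """Build a proxy-server timing metric name, filtering unknown verbs.

    Hypothetical helper; methods outside valid_methods become "BAD_METHOD".
    """
    verb = method if method in valid_methods else "BAD_METHOD"
    return f"proxy-server.{req_type}.{verb}.{status_int}.timing"


ok_name = timing_metric_name("object", "GET", 200)
bad_name = timing_metric_name("container", "OPTIONS", 405)
```

Filtering the verb keeps the set of metric names bounded even when clients send arbitrary method strings.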

Previously, proxy-server metrics used "Account", "Container", and
"Object" for the <type>, but these are now all lowercase.

Updated the admin guide's StatsD docs to reflect the above changes and
also include the "proxy-server.<type>.handoff_count" and
"proxy-server.<type>.handoff_all_count" metrics.

The proxy server now saves off the original req.method and proxy_logging
will use this if it can (both for request logging and as the "<verb>" in
the statsd timing metric).  This fixes bug 1025433.

Removed some stale access_log_* related code in proxy/server.py.  Also
removed the BaseApplication/Application distinction as it's no longer
necessary.

Fixed up the sample config files a bit (logging lines, mostly).

Fixed typo in SAIO development guide.

Got proxy_logging.py test coverage to 100%.

Fixed proxy_logging.py for PEP8 v1.3.2.

Enhanced test.unit.FakeLogger to track more calls to enable testing
StatsD metric calls.

Change-Id: I45d94cb76450be96d66fcfab56359bdfdc3a2576
2012-08-29 16:08:30 -07:00
Darrell Bishop
66400b7337 Add device name to *-replicator.removes for DBs
To tell when replication for a device has finished, it's important to
know when the replicator is removing objects.  This was previously
handled for the object-replicator
(object-replicator.partition.delete.count.<device> and
object-replicator.partition.update.count.<device> metrics) but not the
account and container replicators.

This patch extends the existing DB removal count metrics to make them
per-device.  The new metrics are:
 account-replicator.removes.<device>
 container-replicator.removes.<device>

There's also a bonus refactoring and increased test coverage of the DB
replicator code.

Change-Id: I2067317d4a5f8ad2a496834147954bdcdfc541c1
2012-08-22 13:35:09 -07:00
Darrell Bishop
af2ff124eb Update docs for new ring serialization.
The Admin Guide now contains information about the ring serialization
change (and importantly, how to downgrade, if necessary).

Also added container-server conf var, "allow_versions" to the Deployment
Guide.

Also changed description of proxy-server conf var,
"max_containers_whitelist" to say it contains "account names" not
"account hashes".

Change-Id: Ib23c6118cc5195cc04765afd28e442e4c735f0d4
2012-08-21 12:09:28 -07:00
Florian Hines
5f72a8db4a Fix Dispersion report and swift-bench on saio
We're still using saio:11000 in a few spots so a few things
don't work out of the box on the saio. Fixes bug #1024561

Change-Id: I226de54c2785b0d0b681c8d0cc24260adbd3d663
2012-07-13 17:48:37 -05:00
John Dickinson
d668b27c09 fixed doc table format
Change-Id: I319de933ecfb1e3853e3064656968c36980ce5f5
2012-05-28 13:36:59 -05:00
Florian Hines
ccb6334c17 Expand recon middleware support
Expand recon middleware to include support for account and container
servers in addition to the existing object servers. Also add support
for retrieving recent information from auditors, replicators, and
updaters. In the case of certain checks (such as container auditors)
the stats returned are only for the most recent path processed.

The middleware has also been refactored and should now also handle
errors better in cases where stats are unavailable.

While new checks have been added, the output from pre-existing
checks has not changed. This should allow existing 3rd party
utilities such as the Swift ZenPack to continue to function.

Change-Id: Ib9893a77b9b8a2f03179f2a73639bc4a6e264df7
2012-05-24 14:50:00 -05:00
Darrell Bishop
3d3ed34f44 Adding StatsD logging to Swift.
Documentation, including a list of metrics reported and their semantics,
is in the Admin Guide in a new section, "Reporting Metrics to StatsD".
An optional "metric prefix" may be configured which will be prepended to
every metric name sent to StatsD.

Here is the rationale for doing a deep integration like this versus only
sending metrics to StatsD in middleware.  It's the only way to report
some internal activities of Swift in a real-time manner. So to have one
way of reporting to StatsD and one place/style of configuration, even
some things (like, say, timing of PUT requests into the proxy-server)
which could be logged via middleware are consistently logged the same
way (deep integration via the logger delegate methods).

When log_statsd_host is configured, get_logger() injects a
swift.common.utils.StatsdClient object into the logger as
logger.statsd_client.  Then a set of delegate methods on LogAdapter
either pass through to the StatsdClient object or become no-ops. This
allows StatsD logging to look like:
    self.logger.increment('some.metric.here')
and do the right thing in all cases and with no messy conditional logic.
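
The delegate pattern described above can be sketched roughly like this. Both classes are hypothetical stand-ins written for illustration; they are not Swift's LogAdapter or StatsdClient, and only the `statsd_client` attribute name and the `increment` call are taken from the message.

```python
class FakeStatsdClient:
    """Hypothetical stand-in for a StatsD client; records calls instead of sending UDP."""

    def __init__(self):
        self.calls = []

    def increment(self, metric):
        self.calls.append(("increment", metric))


class SketchLogAdapter:
    """Illustrative adapter: delegates to statsd_client when set, otherwise no-ops."""

    def __init__(self, statsd_client=None):
        # statsd_client stays None when no StatsD host is configured.
        self.statsd_client = statsd_client

    def increment(self, metric):
        if self.statsd_client is not None:
            return self.statsd_client.increment(metric)
        return None  # No-op: callers need no conditional logic.


# Callers use the same line whether or not StatsD is configured.
configured = SketchLogAdapter(FakeStatsdClient())
configured.increment("some.metric.here")

unconfigured = SketchLogAdapter()
unconfigured.increment("some.metric.here")
```

Keeping the conditional inside the adapter is what lets every call site stay a single line.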

I wanted to use the pystatsd module for the StatsD client, but the
version on PyPi is lagging the git repo (and is missing both the prefix
functionality and timing_since() method).  So I wrote my
swift.common.utils.StatsdClient.  The interface is the same as
pystatsd.Client, but the code was written from scratch.  It's pretty
simple, and the tests I added cover it.  This also frees Swift from an
optional dependency on the pystatsd module, making this feature easier
to enable.

There's test coverage for the new code and all existing tests continue
to pass.

Refactored out _one_audit_pass() method in swift/account/auditor.py and
swift/container/auditor.py.

Fixed some misc. PEP8 violations.

Misc test cleanups and refactorings (particularly the way "fake logging"
is handled).

Change-Id: Ie968a9ae8771f59ee7591e2ae11999c44bfe33b2
2012-05-11 15:25:38 -07:00
Paul McMillan
92fbf44d10 Fixed grammar and improved docs.
Corrected its/it's mistakes, harmonized line wrapping within some docs
and clarified doc wording in several places.

Change-Id: Ib9ac6d5e859f770a702e1fad6de8d4abe0390b47
2012-04-10 12:27:14 -07:00
Florian Hines
5e4127ae2a Add json output option to swift-dispersion-report
Adds the configuration file option "dump_json" or command line
options [-j|--dump-json] to have swift-dispersion-report output
the report in json format. This allows the dispersion report to
be more easily consumed elsewhere.

There are also a few pep8 fixes and removal of unused imports.

Change-Id: I2374311ccbef43e6bbae24665c9584e60f3da173
2012-02-29 04:24:54 +00:00
gholt
65dba1a7aa Added swift-orphans and swift-oldies.
Change-Id: I95210098556a22d7bd05f245ae387ee13041fa61
2011-12-29 19:19:41 +00:00
Florian Hines
413ca11a5f Add sockstat info to recon.
Adds support for pulling info from /proc/net/sockstat and /proc/net/sockstat6 via recon.

Change-Id: Idb403c6eda199c5d36d96cc9027ee249c12c7d8b
2011-11-15 17:55:14 +00:00
Florian Hines
bb8c4eab41 Add documentation for Swift Recon.
Change-Id: I37f4fb624bdc5b8bbf2e691d29aa6b15cd648aa8
2011-10-21 00:17:10 +00:00
Julien Danjou
e3e8a1c586 Fix documentation leftover from swift-stats rename
Change-Id: Ia6f4eeb626cc34b6cec43cab92a0afe7b46354e0
Signed-off-by: Julien Danjou <julien.danjou@enovance.com>
2011-10-05 17:05:23 +02:00
gholt
3ee4a01100 Remove swauth; update references from swauth to testauth. 2011-05-26 02:17:42 +00:00
gholt
6c13001244 Rename swift-stats-* to swift-dispersion-* to avoid confusion with log stats stuff 2011-03-31 22:32:41 +00:00
gholt
bd22dbe712 Removing DevAuth 2011-03-14 02:56:37 +00:00
David Goetz
a86a569cae simplifying options and code 2011-02-21 16:37:12 -08:00
David Goetz
7728904dda audit zero byte files quickly without true value 2011-02-14 20:25:40 +00:00
gholt
09e39032bf new swauth-cleanup-tokens; restricted listing .auth account to .super_admin; doc updates 2010-12-09 17:57:26 -08:00
gholt
35f3487879 Incorporated Swauth into Swift as an optional DevAuth replacement. 2010-12-01 17:08:49 -08:00
Anne Gentle
45c59e0653 Edited to reflect ring creation not management 2010-11-30 14:15:41 -06:00
Anne Gentle
36935a2b5d Adding Citrix contributions to Admin Guide 2010-11-30 12:24:55 -06:00
Anne Gentle
6c5c1e3071 Spell check for .rst files 2010-10-13 11:28:27 -05:00
gholt
faa96c6aed Cluster health monitoring docs 2010-08-17 12:36:49 -07:00
Chuck Thier
e051495715 Added initial admin guide, and added more to the deployment guide, plus
cleaned up some of the docstring warnings
2010-07-30 14:57:20 -05:00