Samuel Merritt f64c00b00a Improve object-updater's stats logging
The object updater has five different stats, but its logging only told
you two of them (successes and failures), and it only told you after
finishing all the async_pendings for a device. If you have a cluster
that's been sick and has millions upon millions of async_pendings
lying around, then your object-updaters are frustratingly
silent. I've seen one cluster with around 8 million async_pendings per
disk where the object-updaters only emitted stats every 12 hours.

Yes, if you have StatsD logging set up properly, you can go look at
your graphs and get real-time feedback on what it's doing. If you
don't have that, all you get is a frustrating silence.

Now, the object updater tells you all of its stats (successes,
failures, quarantines due to bad pickles, unlinks, and errors), and it
tells you incremental progress every five minutes. The logging at the
end of a pass remains and has been expanded to also include all stats.
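
A minimal sketch of the periodic reporting described above, not the
actual patch. The names SweepStats, sweep, process, and report_interval
are hypothetical; the only details taken from this message are the five
stat names and the five-minute (300 second) interval.

```python
import time


class SweepStats:
    """Hypothetical bundle of the five counters named in this message."""

    FIELDS = ("successes", "failures", "quarantines", "unlinks", "errors")

    def __init__(self):
        for field in self.FIELDS:
            setattr(self, field, 0)

    def __str__(self):
        return ", ".join("%d %s" % (getattr(self, f), f) for f in self.FIELDS)


def sweep(pendings, process, log, report_interval=300, clock=time.time):
    """Process each pending update and log progress as the pass runs.

    process(item) is assumed to return one of SweepStats.FIELDS.
    A progress line is logged whenever report_interval seconds have
    elapsed since the last one, and a final line is logged at the end.
    """
    stats = SweepStats()
    last_report = clock()
    for item in pendings:
        outcome = process(item)
        setattr(stats, outcome, getattr(stats, outcome) + 1)
        now = clock()
        if now - last_report >= report_interval:
            log("sweep progress: %s" % stats)
            last_report = now
    # The end-of-pass line reports all five stats, not just two.
    log("sweep complete: %s" % stats)
    return stats
```

Passing the clock in as a parameter keeps the interval logic testable
without sleeping; a real daemon would simply use the default.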

Also included is a small change to what counts as an error: unmounted
drives no longer do. The goal is that only abnormal things count as
errors, like permission problems, malformed filenames, and so
on. These things should never happen, but if they do, they may
require operator intervention. Drives fail, so logging an error upon
encountering an unmounted drive is not useful.
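
The distinction can be illustrated with a rough sketch, assuming a
hypothetical helper count_device_errors that is not part of the patch:
an unmounted device is skipped with a log line, while an unexpected
OSError on a mounted device increments the error count.

```python
import os


def count_device_errors(device_paths, is_mounted=os.path.ismount, log=print):
    """Return (swept, errors) for the given device paths.

    Hypothetical helper: an unmounted device is skipped and is NOT
    counted as an error; a failure to list a mounted device is.
    """
    swept, errors = [], 0
    for path in device_paths:
        if not is_mounted(path):
            # Drives fail; this is expected and needs no error count.
            log("skipping %s: not mounted" % path)
            continue
        try:
            os.listdir(path)
        except OSError as err:
            # Abnormal (e.g. permissions); may need operator attention.
            log("error listing %s: %s" % (path, err))
            errors += 1
            continue
        swept.append(path)
    return swept, errors
```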

Change-Id: Idbddd507f0b633d14dffb7a9834fce93a10359ab
2018-01-17 13:59:23 -08:00