osd/OSD: Log slow ops/types to cluster logs

In addition to logging slow ops in mon and osd specific log files,
re-introduce logging the same information along with slow op type
details to cluster logs as well. The objective is to make debugging
slow ops easier.

Modify the log whitelisting string to "slow request" within qa suites in
order to make the search for the new warning log message within the
cluster log successful. This should not cause any issue as it's a
substring of the earlier string.

Fixes: https://tracker.ceph.com/issues/43975
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
This commit is contained in:
Sridhar Seshasayee 2020-02-14 16:43:52 +05:30
parent 6cd47842bb
commit d20f57000b
4 changed files with 11 additions and 6 deletions

View File

@ -53,7 +53,7 @@ tasks:
- \(SLOW_OPS\) - \(SLOW_OPS\)
- \(REQUEST_SLOW\) - \(REQUEST_SLOW\)
- \(TOO_FEW_PGS\) - \(TOO_FEW_PGS\)
- slow requests - slow request
- interactive: - interactive:
- parallel: - parallel:
- log-mon-rss - log-mon-rss

View File

@ -31,7 +31,7 @@ tasks:
- \(SLOW_OPS\) - \(SLOW_OPS\)
- \(REQUEST_SLOW\) - \(REQUEST_SLOW\)
- \(TOO_FEW_PGS\) - \(TOO_FEW_PGS\)
- slow requests - slow request
- workunit: - workunit:
clients: clients:
all: all:

View File

@ -12,4 +12,4 @@ overrides:
- \(SLOW_OPS\) - \(SLOW_OPS\)
- \(REQUEST_SLOW\) - \(REQUEST_SLOW\)
- \(TOO_FEW_PGS\) - \(TOO_FEW_PGS\)
- slow requests - slow request

View File

@ -7596,9 +7596,14 @@ vector<DaemonHealthMetric> OSD::get_health_metrics()
TrackedOpRef oldest_op; TrackedOpRef oldest_op;
auto count_slow_ops = [&](TrackedOp& op) { auto count_slow_ops = [&](TrackedOp& op) {
if (op.get_initiated() < too_old) { if (op.get_initiated() < too_old) {
lgeneric_subdout(cct,osd,20) << "slow op " << op.get_desc() stringstream ss;
<< " initiated " ss << "slow request " << op.get_desc()
<< op.get_initiated() << dendl; << " initiated "
<< op.get_initiated()
<< " currently "
<< op.state_string();
lgeneric_subdout(cct,osd,20) << ss.str() << dendl;
clog->warn() << ss.str();
slow++; slow++;
if (!oldest_op || op.get_initiated() < oldest_op->get_initiated()) { if (!oldest_op || op.get_initiated() < oldest_op->get_initiated()) {
oldest_op = &op; oldest_op = &op;