doc: update documentation for the MANY_OBJECTS_PER_PG warning
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
The current documentation for the MANY_OBJECTS_PER_PG warning
states that the threshold can be raised to silence the health
warning by adjusting the mon_pg_warn_max_object_skew config
option on the monitors. This does not appear to be true since
(at least) Luminous; the option should be adjusted on the
managers instead.
I ran into this problem and spent quite some time injecting
mon_pg_warn_max_object_skew into the monitors. I also added the option
to ceph.conf and restarted the monitors several times, but the warning
did not go away. I had to download the code to see what was
happening, and found this:
$ git grep -A 3 mon_pg_warn_max_object_skew src/common/options.cc
src/common/options.cc:1480: Option("mon_pg_warn_max_object_skew", Option::TYPE_FLOAT, Option::LEVEL_ADVANCED)
src/common/options.cc-1481- .set_default(10.0)
src/common/options.cc-1482- .set_description("max skew few average in objects per pg")
src/common/options.cc-1483- .add_service("mgr"),
After I restarted the ceph-mgr service, the warning went away.
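For reference, a minimal sketch of the kind of skew check this option
controls. It assumes the warning fires when a pool's objects-per-PG ratio
exceeds the cluster-wide average multiplied by the skew, and that a
non-positive skew disables the check. PoolStats and many_objects_per_pg
are illustrative names, not the actual Ceph code.

```cpp
#include <cstdint>

// Hypothetical per-pool summary; field names are illustrative, not Ceph's.
struct PoolStats {
  std::uint64_t objects;  // total objects stored in the pool
  std::uint64_t pg_num;   // number of placement groups in the pool
};

// Returns true when the pool's objects-per-PG ratio is more than
// `max_skew` times the cluster-wide average.
// Assumption of this sketch: max_skew <= 0 disables the check.
inline bool many_objects_per_pg(const PoolStats& pool,
                                double cluster_avg_objects_per_pg,
                                double max_skew) {
  if (max_skew <= 0.0 || pool.pg_num == 0 ||
      cluster_avg_objects_per_pg <= 0.0) {
    return false;
  }
  const double per_pg =
      static_cast<double>(pool.objects) / static_cast<double>(pool.pg_num);
  return per_pg > cluster_avg_objects_per_pg * max_skew;
}
```

With the default skew of 10.0 and a cluster average of 10 objects per PG,
a pool holding 12000 objects in 100 PGs would trip the check; raising the
skew to 20.0 on the daemon that evaluates it would clear it.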
Signed-off-by: Vangelis Tasoulas <vangelis@tasoulas.net>
If there are too many entries to send in a single osd op, the osd rejects
the request with EINVAL. This error happens in follow_olh(), which means
that requests against the object logical head (requests with no version
id) can't be resolved to the current object version. In multisite, this
also causes data sync to get stuck in retries.
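The text above does not say how the per-op limit is enforced. One common
way to stay under such a limit is to split the entries into bounded
batches and send one op per batch. The sketch below shows only that
splitting step; split_into_batches and max_per_op are hypothetical names,
and the limit value is left to the caller.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Splits `keys` into consecutive batches of at most `max_per_op` entries,
// so that each batch can be sent in its own op.
// `max_per_op` is a hypothetical caller-chosen limit and must be > 0.
inline std::vector<std::vector<std::string>>
split_into_batches(const std::vector<std::string>& keys,
                   std::size_t max_per_op) {
  std::vector<std::vector<std::string>> batches;
  if (max_per_op == 0) {
    return batches;  // nothing sensible to do without a positive limit
  }
  for (std::size_t i = 0; i < keys.size(); i += max_per_op) {
    const std::size_t end = std::min(i + max_per_op, keys.size());
    // Copy the half-open range [i, end) into a new batch.
    batches.emplace_back(keys.begin() + i, keys.begin() + end);
  }
  return batches;
}
```

Order is preserved within and across batches, so a caller that needs the
entries applied in sequence can issue the batches one after another.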
Fixes: http://tracker.ceph.com/issues/39118
Signed-off-by: Casey Bodley <cbodley@redhat.com>
The crimson ProtocolV2 class follows a state-machine design:
* states are defined in ProtocolV2::state_t;
* call `execute_<state_name>()` methods to trigger different states;
* V2 logic is implemented in each execute_<state_name>() method, with
explicit transitions to other states at the end of execute_*;
* each state is associated with a write state defined in Protocol.h:
- none: not allowed to send;
- delay: messages can be queued, but sending is delayed;
- open: dispatch queued messages/keepalives/acks;
- drop: do not send any messages, drop them all.
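A rough sketch of how the four write states could gate an outgoing queue.
This is not the code in Protocol.h: OutQueue, its methods, and the choice
to reject sends in the none state by returning false are all assumptions
made for illustration, and a plain vector stands in for the socket.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// The four write states listed above; the enum name is illustrative.
enum class write_state_t { none, delay, open, drop };

// Hypothetical outgoing queue whose behaviour depends on the write state.
class OutQueue {
 public:
  void set_write_state(write_state_t s) {
    state_ = s;
    if (state_ == write_state_t::drop) {
      pending_.clear();  // drop: discard everything queued so far
    } else if (state_ == write_state_t::open) {
      flush();           // open: dispatch whatever was delayed
    }
  }

  // Returns false when sending is not allowed (none) or the message is
  // discarded (drop); returns true when it is queued or dispatched.
  bool send(std::string msg) {
    switch (state_) {
      case write_state_t::none:
      case write_state_t::drop:
        return false;
      case write_state_t::delay:
        pending_.push_back(std::move(msg));
        return true;
      case write_state_t::open:
        pending_.push_back(std::move(msg));
        flush();
        return true;
    }
    return false;
  }

  const std::vector<std::string>& sent() const { return sent_; }
  std::size_t queued() const { return pending_.size(); }

 private:
  // Moves every pending message, in order, to the stand-in socket.
  void flush() {
    while (!pending_.empty()) {
      sent_.push_back(std::move(pending_.front()));
      pending_.pop_front();
    }
  }

  write_state_t state_ = write_state_t::none;
  std::deque<std::string> pending_;
  std::vector<std::string> sent_;  // stands in for the socket
};
```

Each execute_<state_name>() method would then set the write state that
matches the protocol state it enters, for example delay during a
handshake and open once the connection is ready.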
crimson ProtocolV2 is similar to async ProtocolV2, with some considerations:
* explicit and encapsulated client/server handshake workflow.
* futurized-exception-based fault handling, which can interrupt protocol
workflow at any time in each state.
* introduced a SERVER_WAIT state, which waits for the peer client's socket
to be reset or replaced and expects no further reads.
* introduced an explicit REPLACING state; async-msgr would be in the
NONE state while replacing.
Signed-off-by: Yingxin Cheng <yingxincheng@gmail.com>
The simplest async-msgr server, with the same behavior as the
crimson-msgr server, for apples-to-apples performance testing.
Signed-off-by: Yingxin Cheng <yingxincheng@gmail.com>
New features:
* --jobs: start multiple client messengers from core #1 ~ #jobs;
* --core: can assign server core to get away from busy client cores;
* --rounds: a client will send <rounds>/<jobs> messages;
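A small sketch of how the --jobs and --rounds values above translate into
per-client work. It assumes cores are numbered from 0, so using cores
#1 ~ #jobs needs at least jobs + 1 cores, and that any remainder of
<rounds>/<jobs> is simply dropped by integer division. ClientPlan and
plan_clients are hypothetical names, not part of the tool.

```cpp
#include <vector>

// Hypothetical description of one client messenger's share of the work.
struct ClientPlan {
  unsigned core;      // core the client messenger runs on
  unsigned messages;  // number of messages this client sends
};

// Builds one plan per client for cores 1..jobs, each sending rounds/jobs
// messages. Returns an empty vector when jobs is 0 or when there are not
// enough cores (assumption: cores 0..jobs must all exist).
inline std::vector<ClientPlan>
plan_clients(unsigned jobs, unsigned rounds, unsigned available_cores) {
  std::vector<ClientPlan> plans;
  if (jobs == 0 || available_cores <= jobs) {
    return plans;
  }
  for (unsigned core = 1; core <= jobs; ++core) {
    plans.push_back(ClientPlan{core, rounds / jobs});
  }
  return plans;
}
```

For example, 4 jobs and 1000 rounds on an 8-core machine give four
clients on cores 1 to 4, each sending 250 messages.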
Improved:
* Better configuration report;
* Report individual client results plus a summary;
* Validate if CPU number is sufficient before running;
* Sleep 1 second while connecting, so it won't hurt performance;
* Simplify client logic and fix bugs;
Removed unnecessary features:
* finish_decode() for MOSDOp;
* keepalive ratio;
Signed-off-by: Yingxin Cheng <yingxincheng@gmail.com>
Sharded data structures should only be allocated on core #0, or the
program will hang during exit.
Signed-off-by: Yingxin Cheng <yingxincheng@gmail.com>