A U32 limits the maximum size of memstore to a few GB, which
blocks our tests of memstore performance (as a prototype).
Bumping it to U64 suits wider usage.
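
As a rough illustration of the limit (a sketch only; kU32Max, fits_in_u32,
wrap_to_u32 and add_used_bytes are hypothetical names, not memstore
identifiers), a 32-bit byte counter wraps at 4 GiB while a 64-bit one
does not:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration; none of these exist in memstore.

// Largest byte count a 32-bit counter can hold (4 GiB - 1).
constexpr uint64_t kU32Max = std::numeric_limits<uint32_t>::max();

// True if a size can be stored in a 32-bit counter without loss.
constexpr bool fits_in_u32(uint64_t bytes) { return bytes <= kU32Max; }

// Narrowing to 32 bits keeps only the low 32 bits (wraps modulo 2^32).
constexpr uint32_t wrap_to_u32(uint64_t bytes) {
  return static_cast<uint32_t>(bytes);
}

// With a 64-bit counter the same accounting has ample headroom.
inline uint64_t add_used_bytes(uint64_t used, uint64_t delta) {
  assert(used <= std::numeric_limits<uint64_t>::max() - delta);
  return used + delta;
}
```

For example, a 5 GiB store does not fit in the 32-bit counter and would be
recorded as 1 GiB after wrapping.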
Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
This became conditional way back in 12e22b3d44
for unclear reasons. It probably predates the in_use checks. In any case,
at this point, we should only arrive here if the PG was queued, implying
that there will always be an event to process.
Signed-off-by: Sage Weil <sage@redhat.com>
If we don't handle the event, we need to put the PG back into the peering
queue or else the event won't get processed until the next event is
queued, at which point we'll be processing events with a delay.
The queue_null is not necessary (and is a waste of effort) because the
event is still in pg->peering_queue and the PG is queued.
Note that this only triggers when we exceed osd_map_max_advance, usually
when there is a lot of peering and recovery activity going on. A
workaround is to increase that value, but if you exceed osd_map_cache_size
you expose yourself to cache thrashing by the peering work queue, which
can cause serious problems with heavily degraded clusters and has bitten
lots of people on dumpling.
Backport: giant, firefly
Fixes: #10431
Signed-off-by: Sage Weil <sage@redhat.com>
The operation flags in the public C API are a distinct enum
and need to be translated to Ceph OSD flags, as happens in
the C++ API. It seems like the C enum and the C++ enum consciously
use the same values, so I reused the C++ translation function.
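
A minimal sketch of this kind of translation, assuming made-up flag names
and values (PublicOpFlag, OsdFlag and translate_flags_sketch are
hypothetical and are not the librados or OSD definitions):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical public API operation flags (stand-ins for the C/C++ enums).
enum PublicOpFlag : uint32_t {
  PUB_BALANCE_READS      = 1,
  PUB_LOCALIZE_READS     = 2,
  PUB_ORDER_READS_WRITES = 4,
};

// Hypothetical internal OSD flags; the values deliberately differ from the
// public ones, which is why a translation step is needed at all.
enum OsdFlag : uint32_t {
  OSD_BALANCE_READS  = 0x0100,
  OSD_LOCALIZE_READS = 0x2000,
  OSD_RWORDERED      = 0x4000,
};

// Map each public bit to its internal counterpart. Because the C and C++
// public enums share values, one function can serve both APIs.
inline uint32_t translate_flags_sketch(uint32_t pub) {
  const uint32_t known =
      PUB_BALANCE_READS | PUB_LOCALIZE_READS | PUB_ORDER_READS_WRITES;
  assert((pub & ~known) == 0);  // reject bits this sketch does not know
  uint32_t osd = 0;
  if (pub & PUB_BALANCE_READS)      osd |= OSD_BALANCE_READS;
  if (pub & PUB_LOCALIZE_READS)     osd |= OSD_LOCALIZE_READS;
  if (pub & PUB_ORDER_READS_WRITES) osd |= OSD_RWORDERED;
  return osd;
}
```

Passing the public flags straight through without this step would set
unrelated internal bits.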
Signed-off-by: Matthew Richards <mattjrichards@gmail.com>
Group all test directories used for mini clusters into a single
sub-directory (testdir). This is easier to clean up manually and less
error prone.
http://tracker.ceph.com/issues/10426
Fixes: #10426
Signed-off-by: Loic Dachary <ldachary@redhat.com>
This command takes a gid, rank or name, but
in the name case it would previously only work if
the named daemon had a rank assigned (mds_info->rank >= 0),
otherwise it would fail silently.
Signed-off-by: John Spray <john.spray@redhat.com>
Previously, a standby could become active even if 'cluster_down'
had been run. This was awkward, because it would get you a
"laggy or crashed" mds for the standby that was actually
up and running, just being ignored because of cluster_down.
Signed-off-by: John Spray <john.spray@redhat.com>
The existence of the pidfile must be checked outside of the loop to send
a signal to the daemon. Otherwise the daemon will remove the pidfile and
stop can return before the process is dead because it only checks
/proc/$pid if the pidfile exists.
http://tracker.ceph.com/issues/10389
Fixes: #10389
Signed-off-by: Loic Dachary <ldachary@redhat.com>
This feature determines whether we use tbl encoding for transactions or
the new map layout.
The primary uses peer_features to determine whether a transaction should
use tbl, while the replica just follows the primary.
Change-Id: I92ca6e5b59bd1acde6007ad0dffc085be17accab
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
When tbl is used (for compatibility), the Transaction::begin method needs
to build all fields used by the iterator. That includes coll_index,
object_index, data_bl, op_bl, etc.
Change-Id: I48ea74fec8d052f50da254a726a9c0dffead19bc
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
Finish append and swap for the new Transaction encode/decode layout.
Since append now modifies op_bl, we changed the order of append
and swap in ReplicatedBackend::sub_op_modify and
ReplicatedBackend::submit_transaction to avoid calling append on op_t, so
that op_t can be encoded in the message.
Change-Id: I6fb421e0defdb092fb9732eef818e90291b039f5
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
This patch adds a new Transaction::iterator interface for the new
encode/decode layout. The new iterator gives the whole Op struct in a
single decode_op method.
All ObjectStore implementations (FileStore/MemStore/KeyValueStore) are
also changed to use the new interface.
Change-Id: I1900a6ec302890df2c4357b071e4966c26d7f037
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
When use_tbl is true, Transaction::encode gives the same result as
before. When use_tbl is false, Transaction::encode uses the new
fields and logic, and all related methods such as get_encoded_bytes
and get_data_offset do the same.
Change-Id: Ia5864e489d47f37cf496fe3fb825b21977d2d938
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
This patch adds a new fixed-size struct, Transaction::Op, to represent
all actions.
All coll and ghobject values used by the transaction are kept in two maps:
coll: map<coll_t, __le32> coll_index;
object: map<ghobject_t, __le32> object_index;
The Op struct uses the map value (__le32) to refer to a coll or object,
so each coll and object only needs to be encoded once in the transaction.
Other variable-size fields (key/value/data) are encoded in the bufferlist
data_bl.
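
The indexing idea can be sketched as follows. This is an illustration under
assumptions, not the patch itself: OpSketch and IndexTable are hypothetical
names, and std::string stands in for coll_t / ghobject_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// Hypothetical fixed-size op record: it carries small integer ids instead
// of the (variable-size) collection and object names themselves.
struct OpSketch {
  uint32_t op;   // operation code
  uint32_t cid;  // id looked up in the collection index
  uint32_t oid;  // id looked up in the object index
};

// Hypothetical index: assigns each distinct key the next free id, so a key
// that appears in many ops is stored (and later encoded) only once.
class IndexTable {
 public:
  uint32_t get_id(const std::string& key) {
    auto it = index_.find(key);
    if (it != index_.end())
      return it->second;  // already known: reuse the id
    assert(index_.size() < std::numeric_limits<uint32_t>::max());
    uint32_t id = static_cast<uint32_t>(index_.size());
    index_[key] = id;
    return id;
  }
  std::size_t size() const { return index_.size(); }

 private:
  std::map<std::string, uint32_t> index_;
};
```

Two ops on the same object then share one index entry, and each OpSketch
stays the same size regardless of how long the names are.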
Change-Id: I52b2fcd3217a6cb35de7b309a6dd74a99478feb2
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
TransactionData wraps the following fields:
__le64 ops;
__le32 largest_data_len;
__le32 largest_data_off;
__le32 largest_data_off_in_tbl;
__le32 pad; //make TransactionData multiple of uint64_t
This struct can be encoded/decoded with a single memcpy instead of many
encode/decode operations.
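
A minimal sketch of the single-memcpy approach, assuming host-endian
fixed-width integers in place of the __le types (TransactionDataSketch,
encode_data and decode_data are hypothetical names, and a std::vector
stands in for a bufferlist):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Same field layout as described above, but with plain host-endian types.
// The real struct uses little-endian types so the bytes are portable.
struct TransactionDataSketch {
  uint64_t ops;
  uint32_t largest_data_len;
  uint32_t largest_data_off;
  uint32_t largest_data_off_in_tbl;
  uint32_t pad;  // keeps the struct a multiple of uint64_t
};
static_assert(sizeof(TransactionDataSketch) % sizeof(uint64_t) == 0,
              "struct must be a multiple of uint64_t");

// Encode: copy the whole struct into the buffer in one memcpy.
inline std::vector<unsigned char> encode_data(const TransactionDataSketch& d) {
  std::vector<unsigned char> buf(sizeof(d));
  std::memcpy(buf.data(), &d, sizeof(d));
  return buf;
}

// Decode: copy the bytes back into a struct in one memcpy.
inline TransactionDataSketch decode_data(const std::vector<unsigned char>& buf) {
  assert(buf.size() == sizeof(TransactionDataSketch));
  TransactionDataSketch d;
  std::memcpy(&d, buf.data(), sizeof(d));
  return d;
}
```

The explicit pad field means there are no compiler-inserted padding bytes
left uninitialized in the encoded output.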
Change-Id: I56df78def43bd2b80b77be0825756e133434a6e6
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
We don't need sobject and pool_override anymore since we don't need to
support anything older than dumpling.
Change-Id: I22c01d4b5c6bf99765bf6bc13aecadc997d6750c
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
Some tests were racing against the monitor. On a fast machine they worked,
but on slower machines (or sometimes when running in parallel) the monitor
lags behind. Use wait_for_clean to make sure the monitor is in the
desired state for the test to succeed.
http://tracker.ceph.com/issues/10384
Fixes: #10384
Signed-off-by: Loic Dachary <ldachary@redhat.com>