Add a flush_journal admin socket command so that the journal of an
online osd can be flushed to the permanent store. (For offline osds we
already have ceph-osd --flush-journal.)
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Instead of copying the files in the ceph repository, which is less
convenient.
When building, the headers are ignored even though they do
not exist. When creating the tarball with make dist, it fails because
they cannot be found. I misread src/gf_int.h to be include/gf_int.h and
wrongfully thought the submodules were to blame. This is why they were
removed shortly after being added.
Signed-off-by: Loic Dachary <loic@dachary.org>
The journal replay code has a check that decides which scatter locks
should be marked as dirty. So don't unconditionally mark scatter
locks dirty when the dirfrag is dirty.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
If the fragtree is (*^1) and the caller wants leaves under frag 00*,
get_leaves_under() should return an empty list.
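The expected behaviour can be sketched with a toy model. This is not the MDS fragtree_t code: FragTree, its splits map, and the bit-prefix strings ("" for the root, "0" for 0*, "00" for 00*) are assumptions made only for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model (hypothetical, not Ceph's fragtree_t): a frag is a bit-prefix
// string, and splits maps a frag to the number of bits it is split by.
struct FragTree {
  std::map<std::string, int> splits;

  // True if f is a node of the tree, i.e. reachable from the root by
  // following the recorded splits.
  bool contains(const std::string& f) const {
    std::string cur;  // start at the root
    while (cur.size() < f.size()) {
      auto it = splits.find(cur);
      if (it == splits.end())
        return false;  // cur is a leaf, so f lies below a leaf
      if (cur.size() + it->second > f.size())
        return false;  // f falls between two levels of the tree
      cur = f.substr(0, cur.size() + it->second);
    }
    return true;
  }

  // Leaves at or below f; empty when f is not part of the tree.
  std::vector<std::string> leaves_under(const std::string& f) const {
    std::vector<std::string> out;
    if (contains(f))
      collect(f, out);
    return out;
  }

 private:
  void collect(const std::string& f, std::vector<std::string>& out) const {
    auto it = splits.find(f);
    if (it == splits.end()) {  // no split recorded, so f is a leaf
      out.push_back(f);
      return;
    }
    int bits = it->second;
    for (int i = 0; i < (1 << bits); ++i) {
      std::string child = f;
      for (int b = bits - 1; b >= 0; --b)
        child += ((i >> b) & 1) ? '1' : '0';
      collect(child, out);
    }
  }
};
```

With the root split by one bit, the leaves are 0* and 1*. Asking for leaves under 00* names a frag below the leaf 0*, so the sketch returns an empty list rather than the enclosing leaf.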
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
When exporting a subtree, the migrator acquires the required locks,
then freezes the subtree and releases the locks. After the subtree is
frozen, it tries to acquire the same locks again.
This patch makes scatter locks stay in their old states if the inode
has an exporting dirfrag. This improves the chance that the migrator
acquires all required locks once the subtree is frozen.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
We have bumped the protocol version several times, so there is no need
to maintain compatibility with ancient messages.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
After importing an inode, the issued caps can be fewer than the caps
the client wants. So re-issue caps after importing the inode.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
During subtree migration, the importer may need to open subtree
bound dirfrags. Opening subtree bound dirfrags happens after the
exporter freezes the exporting subtree. So the discover message for
opening subtree bound dirfrags should not wait for any freezing
tree/directory, otherwise deadlock can happen.
In MDCache::handle_discover(), two cases can cause discover messages
to wait for a freezing tree/directory. One is fetching bare-bone
dirfrags. The other is merging dirfrags when some of the dirfrags are
frozen and some are freezing.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
A resolve ack message can abort slave requests that are being
journalled. The slave rollback does not handle this case properly. The
fix is to mark the slave request as aborted in this case. The slave
rollback code is executed after the slave prepare is safely journalled.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Only rbd and mount_ceph need secret.c, and only secret.c needs libkeyutils;
remove it from LIBCOMMON_DEPS so it's not a dependency for everything,
remove secret.c from libcommon.a, and add it to mount.ceph/rbd's sources;
add LIBKEYID_LIB to mount.ceph/rbd's LDADD.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Since all we really need on a snapdir is the context, we really only
need it to be !missing. However, it might become !missing before it
becomes !unreadable. That allows ops to end up in the
waiting_for_degraded queue before one in waiting_for_unreadable is
woken, which allows the ops to be reordered. Rather than reintroduce an
extra waiting_for_missing queue, simply require !unreadable for snapdir
(which implies !missing).
Fixes: #7777
Signed-off-by: Samuel Just <sam.just@inktank.com>
wip-mon-docs: Better explain required number of monitors & how to troubleshoot a monitor
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
The pqueue is protected by the wq lock, not by qlock; for example, see
OpWQ::_enqueue. qlock protects the pg_for_processing map only.
Fixes: #7735
Signed-off-by: Sage Weil <sage@inktank.com>
If i is the first entry, then setting cur = begin() sets us up to point at
something that we are about to delete. Move the check to the end to avoid
this.
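A minimal sketch of the ordering problem, assuming a std::list with a round-robin cursor. Ring, items, cur, and remove() are hypothetical names, not the code this commit touches.

```cpp
#include <cassert>
#include <list>
#include <utility>

// Hypothetical container with a round-robin cursor (not Ceph's class).
struct Ring {
  std::list<int> items;
  std::list<int>::iterator cur;

  explicit Ring(std::list<int> v) : items(std::move(v)), cur(items.begin()) {}

  // Remove the element at i while keeping cur valid.
  void remove(std::list<int>::iterator i) {
    if (cur == i)
      ++cur;                 // step off the element that is about to go away
    items.erase(i);          // for std::list this invalidates only i
    if (cur == items.end())  // wrap-around check done last, after the erase,
      cur = items.begin();   // so begin() can no longer be the erased element
  }
};
```

If the wrap-around check ran before the erase and i were the first entry, cur would be set to begin(), which is i itself, and would dangle once i is erased.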
Backport: emperor, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Otherwise, we might get into a situation where the primary
forgets about a stray pg. This is simpler and does not
increase the number of notifies by much.
Fixes: #7733
Signed-off-by: Samuel Just <sam.just@inktank.com>
The previous logic should have kept the current best info if it found a
replica which best could log-recover, but p couldn't. However, the
continue in that loop advanced the inner loop instead of the outer loop
allowing the primary case to take over in cases where best had a longer
tail. Instead, we will prefer the longer tail regardless of the other
infos to simplify the logic.
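The pitfall described above, a continue that advances the inner loop when the outer one was meant, can be shown on a generic example. The functions below are hypothetical and unrelated to the peering code. They show only the pitfall, not the fix chosen here, which drops the nested check entirely.

```cpp
#include <cassert>
#include <vector>

// Intended: count rows that contain no negative value.
// Bug: `continue` only skips one element of the inner loop.
int count_clean_rows_buggy(const std::vector<std::vector<int>>& rows) {
  int n = 0;
  for (const auto& row : rows) {
    for (int v : row) {
      if (v < 0)
        continue;  // meant "skip this row", but only skips this element
    }
    ++n;           // still reached for rows that contain a negative value
  }
  return n;
}

// Fixed: record the decision, leave the inner loop, then skip the row.
int count_clean_rows(const std::vector<std::vector<int>>& rows) {
  int n = 0;
  for (const auto& row : rows) {
    bool skip = false;
    for (int v : row) {
      if (v < 0) {
        skip = true;
        break;
      }
    }
    if (skip)
      continue;    // this continue belongs to the outer loop
    ++n;
  }
  return n;
}
```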
Fixes: #7755
Signed-off-by: Samuel Just <sam.just@inktank.com>
Not handling the error return from cmd_getval() may leave uninitialized
values, which can cause issues, especially with non-string values.
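A sketch of the pattern, using a hypothetical get_int() helper in place of cmd_getval(). The real signature is not shown in this message and is not reproduced here; the map-based command and the "size" key are assumptions for illustration.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a cmd_getval()-style lookup. Returns false and
// leaves `out` untouched when the key is missing or not a number.
bool get_int(const std::map<std::string, std::string>& cmd,
             const std::string& key, long& out) {
  auto it = cmd.find(key);
  if (it == cmd.end())
    return false;
  try {
    out = std::stol(it->second);
  } catch (const std::exception&) {
    return false;
  }
  return true;
}

// Caller that checks the return value before touching the output variable.
long size_or(const std::map<std::string, std::string>& cmd, long fallback) {
  long size;  // deliberately uninitialized, as in the problematic callers
  if (!get_int(cmd, "size", size))
    return fallback;  // never read `size` when the lookup failed
  return size;
}
```

Without the check, size would be read with whatever happened to be on the stack. A string output would at least be default-constructed to empty, which is why non-string values are the riskier case.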
Fixes: 6806
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>