Previously, if write_pos was behind the other pointers, we
would flag the header as invalid. However, that
prevented doing a "header set" to correct it.
Signed-off-by: John Spray <john.spray@inktank.com>
This requires making the late-decoded attributes mutable.
The motivation is to allow get_paths and all the code downstream
of that to be const too.
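
A minimal sketch of the pattern, not the actual MDS code: the
class name LazyPaths and its newline-separated encoding are
hypothetical. It shows how marking the lazily decoded members
mutable lets the accessor stay const.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object whose path list is decoded on demand.
class LazyPaths {
public:
  explicit LazyPaths(std::string encoded) : encoded_(std::move(encoded)) {}

  // const accessor: the first call decodes, later calls reuse the result.
  const std::vector<std::string>& get_paths() const {
    if (!decoded_) {
      decode();
    }
    return paths_;
  }

private:
  // Allowed in a const method because it only writes mutable members.
  void decode() const {
    std::string cur;
    for (char c : encoded_) {
      if (c == '\n') {
        paths_.push_back(cur);
        cur.clear();
      } else {
        cur.push_back(c);
      }
    }
    if (!cur.empty()) {
      paths_.push_back(cur);
    }
    decoded_ = true;
  }

  std::string encoded_;                      // assumed encoding: one path per line
  mutable std::vector<std::string> paths_;   // late-decoded attribute
  mutable bool decoded_ = false;             // late-decoded flag
};
```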
Signed-off-by: John Spray <john.spray@inktank.com>
Switch to the ENCODE_START, DECODE_START macros
for Journal::Header. As well as being nice generally,
this has the important side effect of making
journal headers written since JOURNAL_FORMAT_RESILIENT
unreadable to older MDSs, as they will fail their check
on Header.magic.
Signed-off-by: John Spray <john.spray@inktank.com>
It was doing an unnecessary series of splice() calls after
reading the header; replace these with a single pass through
the data and one splice at the end.
Signed-off-by: John Spray <john.spray@inktank.com>
This is useful for testing, so that we can create an
old-style journal and then test the version migration
by changing the config setting.
Signed-off-by: John Spray <john.spray@inktank.com>
Two main pieces to this:
* A new JournalPointer object that stores two journal
inodes so that we can do a double-buffered update,
followed by an atomic swap.
* An extended recovery process in MDLog that dereferences
the JournalPointer and conditionally rewrites the
journal to accommodate format updates.
The JournalPointer indirection should also be useful for
making cephfs-journal-tool do updates more safely.
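
A rough sketch of the double-buffered update, under stated
assumptions: PointerSketch, its field names and its methods are
hypothetical, and the in-memory swap stands in for what would be
a single persisted write of the pointer object.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical illustration only, not the real JournalPointer layout.
struct PointerSketch {
  uint64_t front = 0;  // inode of the journal currently in use
  uint64_t back = 0;   // inode of the journal being written, 0 if none

  // Step 1: record where the rewritten journal is being built.
  void begin_rewrite(uint64_t new_ino) { back = new_ino; }

  // Step 2: swap roles once the new journal is complete.
  void commit_rewrite() { std::swap(front, back); }

  // Step 3: forget the old journal and return its inode for deletion.
  uint64_t finish_rewrite() {
    uint64_t old_ino = back;
    back = 0;
    return old_ino;
  }
};
```

If the process stops before commit_rewrite(), front still names
the old, intact journal; if it stops after, back names the old
journal that remains to be deleted.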
Signed-off-by: John Spray <john.spray@inktank.com>
This is used in subsequent commits to delete journals
that are no longer needed, such as after rewriting
a journal in a different format.
Signed-off-by: John Spray <john.spray@inktank.com>
Fix redundant (and subtly incorrect) calculation of
the number of bytes needed. It worked because waiting
for a few more bytes before reading the entry size
of an old-format entry was harmless.
Signed-off-by: John Spray <john.spray@inktank.com>
Use the new CInode::encode_bare/decode_bare fns
in CDir, so that we only have one implementation
of the inode encode/decode code.
Signed-off-by: John Spray <john.spray@inktank.com>
For tools that would like to know which dentries are
touched by a metablob, without understanding its
internal format.
Signed-off-by: John Spray <john.spray@inktank.com>
Two problems were causing undump to fail:
* Objecter lock was not being taken around the .write() and
.write_full() calls, causing an assertion failure.
* Once that is fixed, it is necessary to use a separate,
local lock to protect the completion condition for
write operations.
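
The two-lock arrangement can be sketched roughly as follows.
This uses standard C++ primitives rather than the Ceph Mutex
and Cond classes, and WriteCompletion, write_and_wait and
client_lock are hypothetical names standing in for the real
objects.

```cpp
#include <condition_variable>
#include <mutex>

// Completion state guarded by its own local lock, separate from the
// client lock that protects submission.
struct WriteCompletion {
  std::mutex lock;
  std::condition_variable cond;
  bool done = false;
  int result = 0;

  // Called by the I/O path when the write finishes.
  void complete(int r) {
    std::lock_guard<std::mutex> l(lock);
    result = r;
    done = true;
    cond.notify_all();
  }

  // Called by the submitter; blocks until complete() has run.
  int wait() {
    std::unique_lock<std::mutex> l(lock);
    cond.wait(l, [this] { return done; });
    return result;
  }
};

// Hold the client lock only around submission, then wait on the
// completion's own lock so the I/O path is never blocked by the waiter.
template <typename Submit>
int write_and_wait(std::mutex& client_lock, Submit submit) {
  WriteCompletion c;
  {
    std::lock_guard<std::mutex> l(client_lock);
    submit(c);  // hypothetical stand-in for the .write()/.write_full() call
  }
  return c.wait();
}
```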
Signed-off-by: John Spray <john.spray@inktank.com>
CInode itself combined the on-disk format and
encode/decode logic with lots of other complex
behaviours. This separates the simple parts
out so that they can be used by other tools that
are interested in looking at inodes outside of
a running MDS.
There is a small overhead because CInodeStore
can't decode a SnapRealm inline, so it keeps
a temporary copy of the encoded bufferlist.
Signed-off-by: John Spray <john.spray@inktank.com>
Previously the only way to get at the payload
of things like EUpdate and EOpen was to replay() them
(which requires a full running MDS) or to use downcasting
(yuck).
Signed-off-by: John Spray <john.spray@inktank.com>
This allows us to implement journal splicing
without moving blocks around, and without modifying
the outer journal syntax.
Signed-off-by: John Spray <john.spray@inktank.com>
* Separate journal encoding/envelope format
code (JournalStream) from I/O code (Journaler)
* Add new sentinel and start_ptr fields to
prefix and suffix of log events.
* Add journal encoding version to journal header
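
As an illustration of the envelope idea only: the sketch below
assumes a layout of sentinel, length, payload, start_ptr, with
assumed field widths, an assumed sentinel value and host byte
order. The function names are hypothetical and this is not the
JournalStream implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Assumed marker value, for illustration only.
static const uint64_t kSentinel = 0x3141592653589793ULL;

template <typename T>
void put(std::string& out, T v) {
  out.append(reinterpret_cast<const char*>(&v), sizeof(v));
}

template <typename T>
bool get(const std::string& in, size_t& off, T& v) {
  if (off > in.size() || in.size() - off < sizeof(T)) return false;
  std::memcpy(&v, in.data() + off, sizeof(T));
  off += sizeof(T);
  return true;
}

// Append one entry to the journal; returns the offset where it starts.
uint64_t write_entry(std::string& journal, const std::string& payload) {
  uint64_t start = journal.size();
  put<uint64_t>(journal, kSentinel);                              // prefix
  put<uint32_t>(journal, static_cast<uint32_t>(payload.size()));  // length
  journal += payload;
  put<uint64_t>(journal, start);                                  // suffix
  return start;
}

// Read the entry at `off`. On success, fills `payload` and advances `off`.
// Fails if the sentinel or the trailing start_ptr does not match.
bool read_entry(const std::string& journal, size_t& off, std::string& payload) {
  size_t pos = off;
  uint64_t sentinel = 0, start_ptr = 0;
  uint32_t len = 0;
  if (!get(journal, pos, sentinel) || sentinel != kSentinel) return false;
  if (!get(journal, pos, len) || journal.size() - pos < len) return false;
  std::string body = journal.substr(pos, len);
  pos += len;
  if (!get(journal, pos, start_ptr) || start_ptr != off) return false;
  payload = body;
  off = pos;
  return true;
}
```

A reader that hits a damaged region can scan forward for the
sentinel and use the trailing start_ptr to confirm that it has
found a genuine entry boundary.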
Signed-off-by: John Spray <john.spray@inktank.com>
Suppress messages about failure to register admin sockets
if they are EEXIST, because this is a case that can occur
naturally if multiple objecter/librados clients are instantiated
within the same process.
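
The check amounts to something like the following sketch, where
register_failure_message is a hypothetical helper and the
message text is illustrative; it assumes the registration call
returns a negative errno value on failure.

```cpp
#include <cerrno>
#include <sstream>
#include <string>

// Returns the message to log for a registration result, or an empty
// string when nothing should be logged (success or -EEXIST).
std::string register_failure_message(const std::string& command, int r) {
  if (r >= 0 || r == -EEXIST) {
    return std::string();
  }
  std::ostringstream oss;
  oss << "failed to register admin socket command '" << command
      << "': error " << -r;
  return oss.str();
}
```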
Signed-off-by: John Spray <john.spray@inktank.com>