mirror of
https://github.com/ceph/ceph
synced 2025-02-15 14:58:01 +00:00
* Simplified anchor ops. * Rollback. git-svn-id: https://ceph.svn.sf.net/svnroot/ceph@1231 29311d96-e01e-0410-9327-a35deaab8ce9
54 lines
2.3 KiB
Plaintext
54 lines
2.3 KiB
Plaintext
|
|
ANCHOR TABLE PROTOCOL
|
|
|
|
MDS sends an update PREPARE to the anchortable MDS. The prepare is
|
|
identified by the ino and operation type; only one for each type
|
|
(create, update, destroy) can be pending at any time. Both parties
|
|
may actually be the same local node, but for simplicity we treat that
|
|
situation the same. (That is, we act as if they may fail
|
|
independently, even if they can't.)
|
|
|
|
The anchortable journals the proposed update, and responds with an
|
|
AGREE and a version number. This uniquely identifies the request.
|
|
|
|
The MDS can then update the filesystem metadata however it sees fit.
|
|
When it is finished (and the results journaled), it sends a COMMIT to
|
|
the anchortable. The table journals the commit, frees any state from
|
|
the transaction, and sends an ACK. The initiating MDS should then
|
|
journal the ACK to complete the transaction.
|
|
|
|
|
|
ANCHOR TABLE FAILURE
|
|
|
|
If the AT fails before journaling the PREPARE and sending the AGREE,
|
|
the initiating MDS will simply retry the request.
|
|
|
|
If the AT fails after journaling PREPARE but before journaling COMMIT,
|
|
it will resend AGREE to the initiating MDS.
|
|
|
|
If the AT fails after the COMMIT, the transaction has been closed, and it
|
|
takes no action. If it receives a COMMIT for which it has no open
|
|
transaction, it will reply with ACK.
|
|
|
|
|
|
INITIATING MDS FAILURE
|
|
|
|
If the MDS fails before the metadata update has been journaled, no
|
|
action is taken, since nothing is known about the previously proposed
|
|
transaction. If an AGREE message is received and there is no
|
|
corresponding PREPARE state, and ROLLBACK is sent to the anchor table.
|
|
|
|
If the MDS fails after journaling the metadata update but before
|
|
journaling the ACK, it resends COMMIT to the anchor table. If it
|
|
receives an AGREE after resending the COMMIT, it simply ignores the
|
|
AGREE. The anchortable will respond with an ACK, allowing the
|
|
initiating MDS to journal the final ACK and close out the transaction
|
|
locally.
|
|
|
|
On journal replay, each metadata update (EMetaBlob) encountered that
|
|
includes an anchor transaction is noted in the AnchorClient by adding
|
|
it to the pending_commit list, and each journaled ACK is removed from
|
|
that list. Journal replay may enounter ACKs with no prior metadata
|
|
update; these are ignored. When recovery finishes, a COMMIT is sent
|
|
for all outstanding transactions.
|