One recovery scenario sees crisscrossing 'agree' and 'commit':
C->S --> commit1
S->C --> agree
C<-S <== agree
C->S --> commit2 .. client resends commit!
S<-C <== commit1
S->C --> ack .. server resends ack!
S<-C <== commit2
S->C --> ack .. server resends ack!
C<-S <== commit1
client journals ack
C<-S <== commit2
client should ignore dup ack ***
*** but doesn't, because the 'remove from committing list' bit above was
never in the code, even as far back as v0.4 (just the comment). Instead,
the map was getting fixed up in the _logged_ack() completion. Move it
up here instead, where it belongs!
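
A minimal sketch of the idea, assuming a client that tracks in-flight commits
by id. CommitTracker, send_commit and handle_ack are hypothetical names for
illustration, not the actual code touched by this change. Dropping the entry
as soon as the first ack arrives means a resent ack finds nothing to act on:

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical client-side state; names are illustrative only.
struct CommitTracker {
  std::set<uint64_t> committing;    // commits sent but not yet acked
  std::vector<uint64_t> journaled;  // acks handed off to the journal

  // Sending (or resending) a commit keeps a single entry per id.
  void send_commit(uint64_t id) { committing.insert(id); }

  // Returns true if the ack was accepted, false if it was a duplicate.
  bool handle_ack(uint64_t id) {
    // Remove from the committing list here, on receipt, rather than in
    // the journal completion; a duplicate ack then erases nothing.
    if (committing.erase(id) == 0)
      return false;
    journaled.push_back(id);
    return true;
  }
};
```

With this shape, the second ack in the trace above is dropped without being
journaled a second time.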
Signed-off-by: Sage Weil <sage@newdream.net>
Wido saw a pg go active, but an activate log+info update crossed paths with
a pg_notify info, and the primary overwrote its updated shiny new info
with the stale old info from the replica. Don't do that. It causes
problems down the line. In this case, we got
osd/OSD.cc: In function 'void OSD::generate_backlog(PG*)':
osd/OSD.cc:3863: FAILED assert(!pg->is_active())
1: (ThreadPool::worker()+0x28f) [0x5b08ff]
2: (ThreadPool::WorkThread::entry()+0xd) [0x4edb8d]
3: (Thread::_entry_func(void*)+0xa) [0x46892a]
4: (()+0x69ca) [0x7f889ff249ca]
5: (clone()+0x6d) [0x7f889f1446cd]
on the replica because it was active but the primary was restarting peering
due to the bad info.
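
One way to guard against this kind of overwrite, sketched under the assumption
that each info carries a comparable (epoch, seq) version. Version, InfoRecord
and maybe_update_info are hypothetical names, and the real fix may well use a
different criterion:

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical version stamp: ordered by epoch first, then sequence.
struct Version {
  uint32_t epoch;
  uint64_t seq;
};

inline bool operator<(const Version& a, const Version& b) {
  return std::tie(a.epoch, a.seq) < std::tie(b.epoch, b.seq);
}

// Hypothetical stand-in for the info being tracked.
struct InfoRecord {
  Version last_update;
};

// Accept the incoming info only if it is strictly newer than what we
// already hold; stale or equal info is ignored. Returns true on update.
inline bool maybe_update_info(InfoRecord& current, const InfoRecord& incoming) {
  if (!(current.last_update < incoming.last_update))
    return false;
  current = incoming;
  return true;
}
```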
We only want to apply _newly_ removed snaps once, or else we try to trim
the same snaps multiple times, and crash like so
./include/interval_set.h: In function 'void interval_set<T>::insert(T, T) [with T = snapid_t]':
./include/interval_set.h:202: FAILED assert(0)
1: (interval_set<snapid_t>::insert(snapid_t, snapid_t)+0x12c) [0x6b1728]
2: (interval_set<snapid_t>::insert(snapid_t)+0x2f) [0x6b195d]
3: (ReplicatedPG::snap_trimmer()+0x1c02) [0x66d5d6]
4: (OSD::SnapTrimWQ::_process(PG*)+0x24) [0x6dc2ac]
5: (ThreadPool::WorkQueue<PG>::_void_process(void*)+0x28) [0x6fa28a]
6: (ThreadPool::worker()+0x23a) [0x7f57a4]
7: (ThreadPool::WorkThread::entry()+0x19) [0x73e9b1]
8: (Thread::_entry_func(void*)+0x20) [0x6508a4]
9: /lib/libpthread.so.0 [0x7fa2707dc73a]
10: (clone()+0x6d) [0x7fa26fa0669d]
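
A rough sketch of applying only the newly removed snaps, assuming the removed
set arrives as a cumulative set of ids. SnapTrimState and apply_removed are
hypothetical names, and std::set stands in for interval_set to keep the
example self-contained:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <set>

// Hypothetical trimming state; illustrative only.
struct SnapTrimState {
  std::set<uint64_t> seen_removed;  // removed snaps already applied
  std::set<uint64_t> trim_queue;    // snaps waiting to be trimmed

  // Queue only snaps not seen before. Returns how many were newly queued.
  std::size_t apply_removed(const std::set<uint64_t>& removed) {
    std::set<uint64_t> fresh;
    std::set_difference(removed.begin(), removed.end(),
                        seen_removed.begin(), seen_removed.end(),
                        std::inserter(fresh, fresh.end()));
    trim_queue.insert(fresh.begin(), fresh.end());
    seen_removed.insert(fresh.begin(), fresh.end());
    return fresh.size();
  }
};
```

Calling apply_removed twice with the same set queues nothing the second time,
so no snap is handed to the trimmer more than once.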
Signed-off-by: Sage Weil <sage@newdream.net>
Ignore lease sent vs lease_ack receive times because multiple lease msgs may
be in flight and the ack may be from a previous one. This was causing
spurious
[WRN] : lease_ack from follower sent at time(10.06.07_15:07:11.441391), before lease extend was sent (10.06.07_15:07:11.826340)! Clocks not synchronized.
messages.
It is sufficient to just check for messages received from the future. To
avoid cruftiness trying to do that when the only stamp is the lease
timeout, add a sent_timestamp to the message and use that instead. This
simplifies things quite a bit, at the expense of not being backward
compatible.
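
The check described above could look roughly like this. LeaseMsg and
sent_from_future are hypothetical names, and only the sent_timestamp field
comes from the description; the actual message layout is not shown here:

```cpp
#include <chrono>

using Clock = std::chrono::system_clock;

// Hypothetical message carrying the sender's wall-clock send time.
struct LeaseMsg {
  Clock::time_point sent_timestamp;
};

// A message stamped later than our own clock at receive time can only
// mean the clocks disagree; anything stamped earlier is not flagged,
// whichever lease it happens to be acking.
inline bool sent_from_future(const LeaseMsg& m, Clock::time_point now) {
  return m.sent_timestamp > now;
}
```

Because the test no longer compares against the time a particular lease was
sent, an ack for an earlier in-flight lease cannot trigger the warning.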