Commit Graph

7357 Commits

Author SHA1 Message Date
Sage Weil
491dd4342c mon: change MMDSMap to send map we have, not map we want. 2009-06-24 13:15:33 -07:00
Sage Weil
1fd46cc3e8 osd: make object delete not remove _head if there are clones
Truncate and rmattrs instead, so we can keep the SnapSet.

Still need to make 'ls' work properly.
2009-06-24 13:07:42 -07:00
Sage Weil
1d5c805e51 filestore: rmattrs command
Delete all object attrs
2009-06-24 13:07:42 -07:00
Greg Farnum
5c39415b49 messages: Added PaxosServiceMessage to repository so previous commits work. 2009-06-24 13:06:33 -07:00
Greg Farnum
dd3ffaf072 Monitor/Message: All messages used by Paxos are now PaxosServiceMessages. 2009-06-24 12:27:55 -07:00
Greg Farnum
540ad29980 mon/msg: PThey mostly hold version_t's now. Unused, though. 2009-06-24 12:27:55 -07:00
Sage Weil
b3f7108be5 osd: adjust recovery op accounting; explicitly track set of recovering objects
Use a single {start,finish}_recovery_op() func to start and stop
recovery ops, so that there is a single point for counter adjustments
to occur.  On reset, simply call into OSD multiple times.

Also maintain a set<sobject_t> in each PG and on the OSD to track
the set of objects that are recovering.  This can hopefully be
compiled out once all the bugs are identified.

We are chasing this:

osd/OSD.cc:3465: FAILED assert(recovery_ops_active >= 0)
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a769b]
 2: ./cosd(_ZN3OSD18finish_recovery_opEP2PGib+0x148) [0x696bce]
 3: ./cosd(_ZN12ReplicatedPG18finish_recovery_opEv+0x77) [0x6359c5]
 4: ./cosd(_ZN12ReplicatedPG17sub_op_push_replyEP14MOSDSubOpReply+0x540) [0x63628a]
 5: ./cosd(_ZN12ReplicatedPG15do_sub_op_replyEP14MOSDSubOpReply+0x64) [0x6407fe]
 6: ./cosd(_ZN3OSD10dequeue_opEP2PG+0x224) [0x6996ee]
 7: ./cosd(_ZN3OSD4OpWQ8_processEP2PG+0x21) [0x70d175]
 8: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6c9f78]
 9: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a825c]
 10: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70cb9f]
 11: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629d48]
 12: /lib/libpthread.so.0 [0x7f2f1e3f33f7]
 13: /lib/libc.so.6(clone+0x6d) [0x7f2f1d9c294d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2009-06-24 11:17:55 -07:00
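Note on b3f7108be5: the single start/finish entry points plus a set of recovering objects can be sketched roughly as below. This is a minimal illustration, not the OSD code. `RecoveryTracker`, `ObjectId`, `ops_active()` and `is_recovering()` are hypothetical names, and `std::string` stands in for `sobject_t`, whose definition is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for sobject_t (assumption: any ordered key works here).
using ObjectId = std::string;

// Hypothetical tracker: every counter adjustment goes through one of two
// functions, and the set records exactly which objects are in flight.
class RecoveryTracker {
public:
  // Single entry point: record the object, then bump the counter.
  void start_recovery_op(const ObjectId& oid) {
    bool inserted = recovering_.insert(oid).second;
    assert(inserted && "object is already recovering");
    (void)inserted;  // silence unused warning when asserts are compiled out
    ++ops_active_;
  }

  // Single exit point: the object must have been started earlier,
  // so the counter can never drop below zero unnoticed.
  void finish_recovery_op(const ObjectId& oid) {
    std::size_t erased = recovering_.erase(oid);
    assert(erased == 1 && "finishing an op that was never started");
    (void)erased;
    --ops_active_;
    assert(ops_active_ >= 0);
  }

  int ops_active() const { return ops_active_; }

  bool is_recovering(const ObjectId& oid) const {
    return recovering_.count(oid) > 0;
  }

private:
  int ops_active_ = 0;
  std::set<ObjectId> recovering_;
};
```

With this shape, a double finish trips the per-object assertion at the call that caused it, rather than surfacing later as a negative counter.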
Sage Weil
ca7d025449 osd: abort generate_backlog if already canceled
Bail out of generate_backlog if we've been canceled.  Fixes

osd/OSD.cc: In function 'void OSD::generate_backlog(PG*)':
osd/OSD.cc:3305: FAILED assert(!pg->is_active())
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a833b]
 2: ./cosd(_ZN3OSD16generate_backlogEP2PG+0xb6) [0x69a1a6]
 3: ./cosd(_ZN3OSD9BacklogWQ8_processEP2PG+0x21) [0x70d92b]
 4: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6ca5f8]
 5: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a8efc]
 6: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70d331]
 7: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629e48]
 8: /lib/libpthread.so.0 [0x7f0a8feed3f7]
 9: /lib/libc.so.6(clone+0x6d) [0x7f0a8f4bc94d]
2009-06-24 11:12:03 -07:00
Sage Weil
4b5572a679 osd: fix merge_log when log and olog share bottom
If log has 6'10 and olog has 7'10, on same object, merge_log
was failing to throw out log's 6'10 entry because the
last_kept iterator was still end().  Use a simple eversion_t
instead, and simplify existing (and otherwise correct)
log.bottom logic, but without the last_kept != end() guard
that threw us off.

09.06.23 16:52:56.032981 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log log(469'11020,476'11021] from osd0 into log(469'11020,469'11021]
09.06.23 16:52:56.033001 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log extending top to 476'11021
09.06.23 16:52:56.033033 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering]   ? 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033057 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033090 1145465168 osd4 485 pg[1.cd( v 476'11021/469'11021 (469'11020,476'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering m=1 l=1] merge_log result log(469'11020,476'11021] missing(1) changed=1
2009-06-23 22:06:24 -07:00
Sage Weil
2ebc00f10a filestore: use readdir_r to avoid SIGBUS badness
We need to use reentrant readdir, since multiple threads
will otherwise share the struct dirent and walk all over
each other.
2009-06-23 22:03:04 -07:00
Sage Weil
c301e10b0c mds: fix session purge bug
mds/Server.cc: In function 'void Server::_finish_session_purge(Session*)':
mds/Server.cc:410: FAILED assert(session->is_stale_purging())
 1: ./cmds(_ZN6Server21_finish_session_purgeEP7Session+0x392) [0x49edf2]
 2: ./cmds(_ZN6Server18find_idle_sessionsEv+0xa18) [0x4a3188]
 3: ./cmds(_ZN3MDS4tickEv+0x220) [0x484f60]
 4: ./cmds(_ZN9SafeTimer12EventWrapper6finishEi+0x1c1) [0x63eb11]
 5: ./cmds(_ZN5Timer11timer_entryEv+0x6f6) [0x6412d6]
 6: ./cmds(_ZN5Timer11TimerThread5entryEv+0xd) [0x46d53d]
 7: ./cmds(_ZN6Thread11_entry_funcEPv+0xc) [0x480c9c]
 8: /lib/libpthread.so.0 [0x7f51a9f4c3f7]
 9: /lib/libc.so.6(clone+0x6d) [0x7f51a951b94d]
2009-06-23 15:48:37 -07:00
Sage Weil
63f4073df2 osd: allow recovery of missing objects not in log
This happens when a scrub/repair tells us to recover an item, but
it's older than log.bottom.
2009-06-23 14:54:08 -07:00
Sage Weil
303fe163f9 osd: avoid using null ctx pointer
Use localt instead, it's on the stack.
2009-06-23 09:33:13 -07:00
Sage Weil
89eb728f14 osd: stop rewinding replica log when we reach log.bottom
We stop rewinding a replica log when we reach our own
log.bottom, because we don't know enough to do so in any
meaningful way, and because we can assume it is not
divergent at that point (barring any complete screwupedness).

Also, if we do change last_update, make sure last_complete is
rewound too.
2009-06-23 09:33:13 -07:00
Sage Weil
e6cb25111f mds: no fatal assert on ino allocation failures
We still log them LOG_ERR.  Client will be unhappy, but
that's their problem.
2009-06-23 09:33:12 -07:00
Sage Weil
40e44349b5 osd: small cleanups 2009-06-23 09:33:12 -07:00
Sage Weil
78780528f3 mds: don't choke on bad parallel_fetch paths
e.g., bad reconnect path from client, like /blah/file_not_dir/blah.
2009-06-23 09:33:12 -07:00
Sage Weil
54300766d9 rados: cleanup 2009-06-22 20:57:44 -07:00
Sage Weil
299eddaac3 kclient: make r_path[12] dup strings
The mds_request lifetime differs from the caller's stack, so we need to
duplicate these strings.  Fixes problems with request reply after MDS
recovery.
2009-06-22 20:21:58 -07:00
Sage Weil
8bbd6339ae kclient: clean up mds_request path generation 2009-06-22 20:21:58 -07:00
Sage Weil
541669cce2 todo 2009-06-22 20:21:58 -07:00
Sage Weil
f8d05356e9 Makefile: add missing kernel/ headers 2009-06-22 20:21:58 -07:00
Sage Weil
5f1ea72d55 kclient: import into fs/, not fs/staging/? 2009-06-22 20:21:58 -07:00
Greg Farnum
74ceb01289 rados/objecter: Changes to rados in/out, and various things work. 2009-06-22 15:35:36 -07:00
Greg Farnum
2bd1377016 Objecter/librados: Refactored and renamed for clarity. 2009-06-22 15:35:36 -07:00
Sage Weil
0e3e44449d todo 2009-06-22 10:05:19 -07:00
Sage Weil
794002e48b todo 2009-06-22 09:32:33 -07:00
Sage Weil
aa615b84f5 kclient: clean up unaligned pointer accesses
Get rid of the likes of *(__le64*)foo.

Get rid of useless ceph_decode_##_le() macros; use ceph_decode_copy
instead.
2009-06-20 15:00:31 -07:00
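Note on aa615b84f5: the idea of replacing casts such as *(__le64*)foo with a byte copy can be sketched in userspace C++ as below. This is only an illustration under stated assumptions. `decode_copy` and `decode_le64` here are hypothetical helpers written for this note, not the kernel client's `ceph_decode_copy` macro, whose definition does not appear in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy n bytes out of the buffer and advance the cursor.
// memcpy has no alignment requirement, unlike dereferencing a cast pointer.
inline void decode_copy(const unsigned char*& p, void* dst, std::size_t n) {
  assert(dst != nullptr);
  std::memcpy(dst, p, n);
  p += n;
}

// Hypothetical helper: decode a little-endian 64-bit value from a position
// that may be unaligned, independent of the host byte order.
inline std::uint64_t decode_le64(const unsigned char*& p) {
  unsigned char raw[8];
  decode_copy(p, raw, sizeof(raw));
  std::uint64_t v = 0;
  for (int i = 7; i >= 0; --i)
    v = (v << 8) | raw[i];  // raw[0] ends up as the least significant byte
  return v;
}
```

The cursor is passed by reference so that successive decodes walk the buffer without the caller doing pointer arithmetic.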
Sage Weil
3652d17afe cosd: conf updates 2009-06-20 14:01:10 -07:00
Sage Weil
1d21494a63 mon: allow repair of entire osd 2009-06-20 14:00:55 -07:00
Sage Weil
29a2b2f3a7 mds: reduce default memory, journal footprint 2009-06-20 14:00:55 -07:00
Sage Weil
e0097bc12e osd: do NOT include op vector when shipping raw transaction
This just doubles up the data payload.  And makes the MOSDSubOp printout
look like garbage, since e.g. the setxattr names are taken from the
portion of the data payload encoding the transaction.
2009-06-19 23:27:04 -07:00
Sage Weil
4148185392 kclient: strip out kernel version compatibility cruft 2009-06-19 15:03:18 -07:00
Sage Weil
7e51c52805 kclient: update script importer 2009-06-19 15:03:11 -07:00
Sage Weil
58f28204c5 todo 2009-06-19 14:56:32 -07:00
Sage Weil
9f58773e0a osd: on scrub repair, update replica pg stats as necessary
An MOSDPGInfo to an active replica is treated as a pg stat repair.  The
replica just saves it to disk.
2009-06-19 12:46:21 -07:00
Sage Weil
bcab1f16a9 osd: pass updated stats to replica
When we ship the raw transaction to the replica, we need to ship the
new pg_stat_t as well, since that isn't getting updated in parallel by
prepare_transaction().
2009-06-19 12:45:36 -07:00
Sage Weil
3e0c3e29d9 uclient: close mds session close race
If we get a mds push msg while closing the session, resend the close
request.
2009-06-19 11:40:09 -07:00
Sage Weil
4f0e84c982 objecter: some list_objects cleanups 2009-06-19 10:06:32 -07:00
Sage Weil
d82ea93ff5 osd: check that pg matches
Otherwise return an empty result.  May want to return an error here.. not
sure which tho.
2009-06-19 10:06:17 -07:00
Sage Weil
0258c459d2 todo: bugs that have come up >2x now 2009-06-18 21:19:25 -07:00
Sage Weil
f6981fce3a osd: adjust debug levels a bit
Try to put iterative output to be at 20, other stuff at 10,
so that we can tolerate 10 on large data sets.
2009-06-18 21:18:49 -07:00
Sage Weil
5d24752ada osd: fix initialization of log.complete_to in PG::activate()
The complete_to should point to the next object to get, which
should be just PAST info.last_complete.  That is because we
can trim the log up to and including last_complete (because
that entry is recovered), and we don't want to invalidate
the iterator.

That is
    while (log.complete_to->version <= info.last_complete)
      log.complete_to++;

and in sub_op_push,

    while (...) {
      ...
      if (info.last_complete < log.complete_to->version)
        info.last_complete = log.complete_to->version;
      log.complete_to++;
    }
2009-06-18 21:18:49 -07:00
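Note on 5d24752ada: the reason complete_to must sit just past last_complete can be sketched with a toy log, as below. This is an illustration, not the PG code. `Entry`, `Log`, `init_complete_to` and `trim` are hypothetical names, plain integers stand in for `eversion_t`, and the `end()` checks are an added assumption that the snippets in the commit message do not show.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for a pg log entry (real versions are eversion_t).
struct Entry {
  int version;
};

// Hypothetical toy log: entries ordered by version, plus the recovery cursor.
struct Log {
  std::list<Entry> entries;
  std::list<Entry>::iterator complete_to;
};

// Point complete_to at the first entry strictly newer than last_complete,
// i.e. the next object that still has to be recovered.
inline void init_complete_to(Log& log, int last_complete) {
  log.complete_to = log.entries.begin();
  while (log.complete_to != log.entries.end() &&
         log.complete_to->version <= last_complete)
    ++log.complete_to;
}

// Trim entries up to and including `to`. Erasing a std::list element only
// invalidates iterators to that element, so the cursor stays valid as long
// as it never points at an entry being trimmed.
inline void trim(Log& log, int to) {
  while (!log.entries.empty() && log.entries.front().version <= to) {
    assert(log.complete_to != log.entries.begin());
    log.entries.pop_front();
  }
}
```

If complete_to were left pointing at last_complete itself, trimming up to and including last_complete would erase the entry under the cursor, which is the invalidation the commit message describes.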
Sage Weil
ddd624275e osd: remove bad trim assertion: trim point may precede local log.bottom 2009-06-18 21:18:49 -07:00
Sage Weil
695e6a7625 osd: remove bad assertion to allow trim before pg is clean
We may trim the log before recovery completes.
2009-06-18 21:18:49 -07:00
Greg Farnum
17b0f2653d Objecter: now has list instead of librados. Hurrah. 2009-06-18 17:05:17 -07:00
Greg Farnum
f3210a7150 Objecter: Now resubmits *Op as part of tick() if the response takes too long. 2009-06-18 17:05:17 -07:00
Sage Weil
5720080b6f osd: be a bit more verbose about peer_info
Looking for residual bug where peer_info info is somehow missing
when activate() happens...
2009-06-18 16:42:35 -07:00
Sage Weil
9e33bf129b osd: don't trim pg log if degraded
Also be a bit more verbose about pg_trim_to changes.
2009-06-18 16:42:35 -07:00
Sage Weil
fc7ec60c7f osd: we don't use MOSDPGInfo to signal replica uptodate anymore
Clean out cruft from old replica-driven recovery.
2009-06-18 16:42:35 -07:00