Sage Weil
491dd4342c
mon: change MMDSMap to send map we have, not map we want.
2009-06-24 13:15:33 -07:00
Sage Weil
1fd46cc3e8
osd: make object delete not remove _head if there are clones
...
Truncate and rmattrs instead, so we can keep the SnapSet.
Still need to make 'ls' work properly.
2009-06-24 13:07:42 -07:00
Sage Weil
1d5c805e51
filestore: rmattrs command
...
Delete all object attrs
2009-06-24 13:07:42 -07:00
Greg Farnum
5c39415b49
messages: Added PaxosServiceMessage to repository so previous commits work.
2009-06-24 13:06:33 -07:00
Greg Farnum
dd3ffaf072
Monitor/Message: All messages used by Paxos are now PaxosServiceMessages.
2009-06-24 12:27:55 -07:00
Greg Farnum
540ad29980
mon/msg: PThey mostly hold version_t's now. Unused, though.
2009-06-24 12:27:55 -07:00
Sage Weil
b3f7108be5
osd: adjust recovery op accounting; explicitly track set of recovering objects
...
Use a single {start,finish}_recovery_op() func to start and stop
recovery ops, so that there is a single point for counter adjustments
to occur. On reset, simply call into OSD multiple times.
Also maintain a set<sobject_t> in each PG and on the OSD to track
the set of objects that are recovering. This can hopefully be
compiled out once all the bugs are identified.
We are chasing this:
osd/OSD.cc:3465: FAILED assert(recovery_ops_active >= 0)
1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a769b]
2: ./cosd(_ZN3OSD18finish_recovery_opEP2PGib+0x148) [0x696bce]
3: ./cosd(_ZN12ReplicatedPG18finish_recovery_opEv+0x77) [0x6359c5]
4: ./cosd(_ZN12ReplicatedPG17sub_op_push_replyEP14MOSDSubOpReply+0x540) [0x63628a]
5: ./cosd(_ZN12ReplicatedPG15do_sub_op_replyEP14MOSDSubOpReply+0x64) [0x6407fe]
6: ./cosd(_ZN3OSD10dequeue_opEP2PG+0x224) [0x6996ee]
7: ./cosd(_ZN3OSD4OpWQ8_processEP2PG+0x21) [0x70d175]
8: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6c9f78]
9: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a825c]
10: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70cb9f]
11: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629d48]
12: /lib/libpthread.so.0 [0x7f2f1e3f33f7]
13: /lib/libc.so.6(clone+0x6d) [0x7f2f1d9c294d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2009-06-24 11:17:55 -07:00
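A minimal sketch of the accounting idea described in b3f7108be5, not the actual OSD code: one start function and one finish function own the counter, and a set of in-flight objects catches a double finish before the counter can go negative. `RecoveryTracker`, `ObjectId` (a stand-in for `sobject_t`), and the accessor names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for sobject_t; the real type lives in the Ceph tree.
using ObjectId = std::string;

class RecoveryTracker {
public:
  // Single entry point for starting a recovery op on an object.
  void start_recovery_op(const ObjectId& oid) {
    bool inserted = recovering_.insert(oid).second;
    assert(inserted && "object is already recovering");
    (void)inserted;
    ++ops_active_;
  }

  // Single entry point for finishing one. The set lookup flags a finish
  // without a matching start, which is what would drive the counter negative.
  void finish_recovery_op(const ObjectId& oid) {
    std::size_t erased = recovering_.erase(oid);
    assert(erased == 1 && "finish without matching start");
    (void)erased;
    --ops_active_;
    assert(ops_active_ >= 0);
  }

  // On reset, finish every outstanding op through the same path.
  void reset() {
    while (!recovering_.empty()) {
      ObjectId oid = *recovering_.begin();  // copy before erasing
      finish_recovery_op(oid);
    }
  }

  int ops_active() const { return ops_active_; }
  bool is_recovering(const ObjectId& oid) const {
    return recovering_.count(oid) != 0;
  }

private:
  int ops_active_ = 0;
  std::set<ObjectId> recovering_;
};
```

Because every counter change goes through these two functions, a mismatch shows up at the offending call rather than later at the `>= 0` assert.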
Sage Weil
ca7d025449
osd: abort generate_backlog if already canceled
...
Bail out of generate_backlog if we've been canceled. Fixes
osd/OSD.cc: In function 'void OSD::generate_backlog(PG*)':
osd/OSD.cc:3305: FAILED assert(!pg->is_active())
1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a833b]
2: ./cosd(_ZN3OSD16generate_backlogEP2PG+0xb6) [0x69a1a6]
3: ./cosd(_ZN3OSD9BacklogWQ8_processEP2PG+0x21) [0x70d92b]
4: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6ca5f8]
5: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a8efc]
6: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70d331]
7: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629e48]
8: /lib/libpthread.so.0 [0x7f0a8feed3f7]
9: /lib/libc.so.6(clone+0x6d) [0x7f0a8f4bc94d]
2009-06-24 11:12:03 -07:00
Sage Weil
4b5572a679
osd: fix merge_log when log and olog share bottom
...
If log has 6'10 and olog has 7'10, on same object, merge_log
was failing to throw out log's 6'10 entry because the
last_kept iterator was still end(). Use a simple eversion_t
instead, and simplify existing (and otherwise correct)
log.bottom logic, but without the last_kept != end() guard
that threw us off.
09.06.23 16:52:56.032981 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log log(469'11020,476'11021] from osd0 into log(469'11020,469'11021]
09.06.23 16:52:56.033001 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log extending top to 476'11021
09.06.23 16:52:56.033033 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] ? 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033057 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033090 1145465168 osd4 485 pg[1.cd( v 476'11021/469'11021 (469'11020,476'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering m=1 l=1] merge_log result log(469'11020,476'11021] missing(1) changed=1
2009-06-23 22:06:24 -07:00
Sage Weil
2ebc00f10a
filestore: use readdir_r to avoid SIGBUS badness
...
We need to use reentrant readdir, since multiple threads
will otherwise share the struct dirent and walk all over
each other.
2009-06-23 22:03:04 -07:00
Sage Weil
c301e10b0c
mds: fix session purge bug
...
mds/Server.cc: In function 'void Server::_finish_session_purge(Session*)':
mds/Server.cc:410: FAILED assert(session->is_stale_purging())
1: ./cmds(_ZN6Server21_finish_session_purgeEP7Session+0x392) [0x49edf2]
2: ./cmds(_ZN6Server18find_idle_sessionsEv+0xa18) [0x4a3188]
3: ./cmds(_ZN3MDS4tickEv+0x220) [0x484f60]
4: ./cmds(_ZN9SafeTimer12EventWrapper6finishEi+0x1c1) [0x63eb11]
5: ./cmds(_ZN5Timer11timer_entryEv+0x6f6) [0x6412d6]
6: ./cmds(_ZN5Timer11TimerThread5entryEv+0xd) [0x46d53d]
7: ./cmds(_ZN6Thread11_entry_funcEPv+0xc) [0x480c9c]
8: /lib/libpthread.so.0 [0x7f51a9f4c3f7]
9: /lib/libc.so.6(clone+0x6d) [0x7f51a951b94d]
2009-06-23 15:48:37 -07:00
Sage Weil
63f4073df2
osd: allow recovery of missing objects not in log
...
This happens when a scrub/repair tells us to recover an item, but
it's older than log.bottom.
2009-06-23 14:54:08 -07:00
Sage Weil
303fe163f9
osd: avoid using null ctx pointer
...
Use localt instead, it's on the stack.
2009-06-23 09:33:13 -07:00
Sage Weil
89eb728f14
osd: stop rewinding replica log when we reach log.bottom
...
We stop rewinding a replica log when we reach our own
log.bottom, because we don't know enough to do so in any
meaningful way, and because we can assume it is not
divergent at that point (barring any complete screwupedness).
Also, if we do change last_update, make sure last_complete is
rewound too.
2009-06-23 09:33:13 -07:00
Sage Weil
e6cb25111f
mds: no fatal assert on ino allocation failures
...
We still log them LOG_ERR. Client will be unhappy, but
that's their problem.
2009-06-23 09:33:12 -07:00
Sage Weil
40e44349b5
osd: small cleanups
2009-06-23 09:33:12 -07:00
Sage Weil
78780528f3
mds: don't choke on bad parallel_fetch paths
...
e.g., bad reconnect path from client, like /blah/file_not_dir/blah.
2009-06-23 09:33:12 -07:00
Sage Weil
54300766d9
rados: cleanup
2009-06-22 20:57:44 -07:00
Sage Weil
299eddaac3
kclient: make r_path[12] dup strings
...
The mds_request lifetime differs from the caller's stack, so we need to
duplicate these strings. Fixes problems with request reply after MDS
recovery.
2009-06-22 20:21:58 -07:00
Sage Weil
8bbd6339ae
kclient: clean up mds_request path generation
2009-06-22 20:21:58 -07:00
Sage Weil
541669cce2
todo
2009-06-22 20:21:58 -07:00
Sage Weil
f8d05356e9
Makefile: add missing kernel/ headers
2009-06-22 20:21:58 -07:00
Sage Weil
5f1ea72d55
kclient: import into fs/, not fs/staging/?
2009-06-22 20:21:58 -07:00
Greg Farnum
74ceb01289
rados/objecter: Changes to rados in/out, and various things work.
2009-06-22 15:35:36 -07:00
Greg Farnum
2bd1377016
Objecter/librados: Refactored and renamed for clarity.
2009-06-22 15:35:36 -07:00
Sage Weil
0e3e44449d
todo
2009-06-22 10:05:19 -07:00
Sage Weil
794002e48b
todo
2009-06-22 09:32:33 -07:00
Sage Weil
aa615b84f5
kclient: clean up unaligned pointer accesses
...
Get rid of the likes of *(__le64*)foo.
Get rid of useless ceph_decode_##_le() macros; use ceph_decode_copy
instead.
2009-06-20 15:00:31 -07:00
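As an illustration of the problem aa615b84f5 addresses, here is a hedged user-space sketch of decoding from a possibly unaligned buffer without casting the pointer. `decode_le64` and `decode_copy` are hypothetical helpers written for this example; they are not the kernel client's `ceph_decode_copy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read a little-endian 64-bit value from a buffer that may be unaligned.
// Assembling the value byte by byte avoids the misaligned load that
// *(uint64_t*)p could perform, and is independent of host endianness.
inline std::uint64_t decode_le64(const unsigned char* p) {
  std::uint64_t v = 0;
  for (int i = 7; i >= 0; --i)
    v = (v << 8) | p[i];
  return v;
}

// Copy n raw bytes out of the buffer and advance the cursor past them.
inline void decode_copy(const unsigned char*& p, void* dst, std::size_t n) {
  assert(dst != nullptr);
  std::memcpy(dst, p, n);
  p += n;
}
```

Calling `decode_le64(buf + 1)` is safe even though `buf + 1` is generally not 8-byte aligned, which is exactly the case a pointer cast gets wrong on strict-alignment architectures.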
Sage Weil
3652d17afe
cosd: conf updates
2009-06-20 14:01:10 -07:00
Sage Weil
1d21494a63
mon: allow repair of entire osd
2009-06-20 14:00:55 -07:00
Sage Weil
29a2b2f3a7
mds: reduce default memory, journal footprint
2009-06-20 14:00:55 -07:00
Sage Weil
e0097bc12e
osd: do NOT include op vector when shipping raw transaction
...
This just doubles up the data payload. And makes the MOSDSubOp printout
look like garbage, since e.g. the setxattr names are taken from the
portion of the data payload encoding the transaction.
2009-06-19 23:27:04 -07:00
Sage Weil
4148185392
kclient: strip out kernel version compatibility cruft
2009-06-19 15:03:18 -07:00
Sage Weil
7e51c52805
kclient: update script importer
2009-06-19 15:03:11 -07:00
Sage Weil
58f28204c5
todo
2009-06-19 14:56:32 -07:00
Sage Weil
9f58773e0a
osd: on scrub repair, update replica pg stats as necessary
...
An MOSDPGInfo to an active replica is treated as a pg stat repair. The
replica just saves it to disk.
2009-06-19 12:46:21 -07:00
Sage Weil
bcab1f16a9
osd: pass updated stats to replica
...
When we ship the raw transaction to the replica, we need to ship the
new pg_stat_t as well, since that isn't getting updated in parallel by
prepare_transaction().
2009-06-19 12:45:36 -07:00
Sage Weil
3e0c3e29d9
uclient: close mds session close race
...
If we get a mds push msg while closing the session, resend the close
request.
2009-06-19 11:40:09 -07:00
Sage Weil
4f0e84c982
objecter: some list_objects cleanups
2009-06-19 10:06:32 -07:00
Sage Weil
d82ea93ff5
osd: check that pg matches
...
Otherwise return an empty result. May want to return an error here.. not
sure which tho.
2009-06-19 10:06:17 -07:00
Sage Weil
0258c459d2
todo: bugs that have come up >2x now
2009-06-18 21:19:25 -07:00
Sage Weil
f6981fce3a
osd: adjust debug levels a bit
...
Try to put iterative output at 20 and other stuff at 10,
so that we can tolerate 10 on large data sets.
2009-06-18 21:18:49 -07:00
Sage Weil
5d24752ada
osd: fix initialization of log.complete_to in PG::activate()
...
The complete_to should point to the next object to get, which
should be just PAST info.last_complete. That is because we
can trim the log up to and including last_complete (because
that entry is recovered), and we don't want to invalidate
the iterator.
That is
  while (log.complete_to->version <= info.last_complete)
    log.complete_to++;
and in sub_op_push,
  while (...) {
    ...
    if (info.last_complete < log.complete_to->version)
      info.last_complete = log.complete_to->version;
    log.complete_to++;
  }
2009-06-18 21:18:49 -07:00
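The iterator argument in 5d24752ada can be sketched with a plain `std::list`, assuming a simplified log whose entries carry only an integer version. `Entry`, `Log`, `init_complete_to`, and `trim` are hypothetical names for this example; unlike the loop quoted in the commit, the sketch adds an explicit `end()` check.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical, simplified log entry: only the version matters here.
struct Entry { std::uint64_t version; };
using Log = std::list<Entry>;

// Return an iterator to the first entry strictly past last_complete,
// i.e. the next object to get.
inline Log::iterator init_complete_to(Log& log, std::uint64_t last_complete) {
  Log::iterator it = log.begin();
  while (it != log.end() && it->version <= last_complete)
    ++it;
  return it;
}

// Drop entries up to and including trim_to. Erasing from a std::list
// invalidates only iterators that point at the erased entries.
inline void trim(Log& log, std::uint64_t trim_to) {
  assert(log.empty() || trim_to <= log.back().version);
  while (!log.empty() && log.front().version <= trim_to)
    log.pop_front();
}
```

With the iterator positioned past `last_complete`, trimming through `last_complete` never erases the entry it points at, so it stays valid. Had it pointed at `last_complete` itself, the same trim would leave it dangling.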
Sage Weil
ddd624275e
osd: remove bad trim assertion: trim point may precede local log.bottom
2009-06-18 21:18:49 -07:00
Sage Weil
695e6a7625
osd: remove bad assertion to allow trim before pg is clean
...
We may trim the log before recovery completes.
2009-06-18 21:18:49 -07:00
Greg Farnum
17b0f2653d
Objecter: now has list instead of librados. Hurrah.
2009-06-18 17:05:17 -07:00
Greg Farnum
f3210a7150
Objecter: Now resubmits *Op as part of tick() if the response takes too long.
2009-06-18 17:05:17 -07:00
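A rough sketch of the resubmit-on-tick behavior named in f3210a7150, under the assumption that each pending op records when it was last sent and that tick() compares this against a fixed timeout. `OpTracker`, `PendingOp`, and the transaction ids are hypothetical; the real Objecter tracks far more state.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical per-op bookkeeping.
struct PendingOp {
  Clock::time_point last_sent;
  int resends;
};

class OpTracker {
public:
  explicit OpTracker(Clock::duration timeout) : timeout_(timeout) {}

  // Record a newly sent op.
  void submit(std::uint64_t tid, Clock::time_point now) {
    assert(ops_.count(tid) == 0);
    ops_[tid] = PendingOp{now, 0};
  }

  // A reply retires the op so it is never resent.
  void handle_reply(std::uint64_t tid) { ops_.erase(tid); }

  // Called periodically; marks overdue ops as resent and returns their ids.
  std::vector<std::uint64_t> tick(Clock::time_point now) {
    std::vector<std::uint64_t> resent;
    for (auto& kv : ops_) {
      if (now - kv.second.last_sent >= timeout_) {
        kv.second.last_sent = now;
        ++kv.second.resends;
        resent.push_back(kv.first);
      }
    }
    return resent;
  }

private:
  Clock::duration timeout_;
  std::map<std::uint64_t, PendingOp> ops_;
};
```

Passing the current time in as a parameter keeps the sketch deterministic; a caller would typically pass `Clock::now()` from its timer callback and actually resend each returned id.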
Sage Weil
5720080b6f
osd: be a bit more verbose about peer_info
...
Looking for residual bug where peer_info info is somehow missing
when activate() happens...
2009-06-18 16:42:35 -07:00
Sage Weil
9e33bf129b
osd: don't trim pg log if degraded
...
Also be a bit more verbose about pg_trim_to changes.
2009-06-18 16:42:35 -07:00
Sage Weil
fc7ec60c7f
osd: we don't use MOSDPGInfo to signal replica uptodate anymore
...
Clean out cruft from old replica-driven recovery.
2009-06-18 16:42:35 -07:00