RepoMirrors/ceph

Mirror of https://github.com/ceph/ceph, synced 2024-12-14 07:25:50 +00:00

Author	SHA1	Message	Date
Sage Weil	491dd4342c	mon: change MMDSMap to send map we have, not map we want.	2009-06-24 13:15:33 -07:00
Sage Weil	1fd46cc3e8	osd: make object delete not remove _head if there are clones Truncate and rmattrs instead, so we can keep the SnapSet. Still need to make 'ls' work properly.	2009-06-24 13:07:42 -07:00
Sage Weil	1d5c805e51	filestore: rmattrs command Delete all object attrs	2009-06-24 13:07:42 -07:00
Greg Farnum	5c39415b49	messages: Added PaxosServiceMessage to repository so previous commits work.	2009-06-24 13:06:33 -07:00
Greg Farnum	dd3ffaf072	Monitor/Message: All messages used by Paxos are now PaxosServiceMessages.	2009-06-24 12:27:55 -07:00
Greg Farnum	540ad29980	mon/msg: PThey mostly hold version_t's now. Unused, though.	2009-06-24 12:27:55 -07:00
Sage Weil	b3f7108be5	osd: adjust recovery op accounting; explicitly track set of recovering objects Use a single {start,finish}_recovery_op() func to start and stop recovery ops, so that there is a single point for counter adjustments to occur. On reset, simply call into OSD multiple times. Also maintain a set<sobject_t> in each PG and on the OSD to track the set of objects that are recovering. This can hopefully be compiled out once all the bugs are identified. We are chasing this: osd/OSD.cc:3465: FAILED assert(recovery_ops_active >= 0) 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a769b] 2: ./cosd(_ZN3OSD18finish_recovery_opEP2PGib+0x148) [0x696bce] 3: ./cosd(_ZN12ReplicatedPG18finish_recovery_opEv+0x77) [0x6359c5] 4: ./cosd(_ZN12ReplicatedPG17sub_op_push_replyEP14MOSDSubOpReply+0x540) [0x63628a] 5: ./cosd(_ZN12ReplicatedPG15do_sub_op_replyEP14MOSDSubOpReply+0x64) [0x6407fe] 6: ./cosd(_ZN3OSD10dequeue_opEP2PG+0x224) [0x6996ee] 7: ./cosd(_ZN3OSD4OpWQ8_processEP2PG+0x21) [0x70d175] 8: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6c9f78] 9: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a825c] 10: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70cb9f] 11: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629d48] 12: /lib/libpthread.so.0 [0x7f2f1e3f33f7] 13: /lib/libc.so.6(clone+0x6d) [0x7f2f1d9c294d] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.	2009-06-24 11:17:55 -07:00
Sage Weil	ca7d025449	osd: abort generate_backlog if already canceled Bail out of generate_backlog if we've been canceled. Fixes osd/OSD.cc: In function 'void OSD::generate_backlog(PG*)': osd/OSD.cc:3305: FAILED assert(!pg->is_active()) 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a833b] 2: ./cosd(_ZN3OSD16generate_backlogEP2PG+0xb6) [0x69a1a6] 3: ./cosd(_ZN3OSD9BacklogWQ8_processEP2PG+0x21) [0x70d92b] 4: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6ca5f8] 5: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a8efc] 6: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70d331] 7: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629e48] 8: /lib/libpthread.so.0 [0x7f0a8feed3f7] 9: /lib/libc.so.6(clone+0x6d) [0x7f0a8f4bc94d]	2009-06-24 11:12:03 -07:00
Sage Weil	4b5572a679	osd: fix merge_log when log and olog share bottom If log has 6'10 and olog has 7'10, on same object, merge_log was failing to throw out log's 6'10 entry because the last_kept iterator was still end(). Use a simple eversion_t instead, and simplify existing (and otherwise correct) log.bottom logic, but without the last_kept != end() guard that threw us off. 09.06.23 16:52:56.032981 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log log(469'11020,476'11021] from osd0 into log(469'11020,469'11021] 09.06.23 16:52:56.033001 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log extending top to 476'11021 09.06.23 16:52:56.033033 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] ? 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949 09.06.23 16:52:56.033057 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949 09.06.23 16:52:56.033090 1145465168 osd4 485 pg[1.cd( v 476'11021/469'11021 (469'11020,476'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering m=1 l=1] merge_log result log(469'11020,476'11021] missing(1) changed=1	2009-06-23 22:06:24 -07:00
Sage Weil	2ebc00f10a	filestore: use readdir_r to avoid SIGBUS badness We need to use reentrant readdir, since multiple threads will otherwise share the struct dirent and walk all over each other.	2009-06-23 22:03:04 -07:00
Sage Weil	c301e10b0c	mds: fix session purge bug mds/Server.cc: In function 'void Server::_finish_session_purge(Session*)': mds/Server.cc:410: FAILED assert(session->is_stale_purging()) 1: ./cmds(_ZN6Server21_finish_session_purgeEP7Session+0x392) [0x49edf2] 2: ./cmds(_ZN6Server18find_idle_sessionsEv+0xa18) [0x4a3188] 3: ./cmds(_ZN3MDS4tickEv+0x220) [0x484f60] 4: ./cmds(_ZN9SafeTimer12EventWrapper6finishEi+0x1c1) [0x63eb11] 5: ./cmds(_ZN5Timer11timer_entryEv+0x6f6) [0x6412d6] 6: ./cmds(_ZN5Timer11TimerThread5entryEv+0xd) [0x46d53d] 7: ./cmds(_ZN6Thread11_entry_funcEPv+0xc) [0x480c9c] 8: /lib/libpthread.so.0 [0x7f51a9f4c3f7] 9: /lib/libc.so.6(clone+0x6d) [0x7f51a951b94d]	2009-06-23 15:48:37 -07:00
Sage Weil	63f4073df2	osd: allow recovery of missing objects not in log This happens when a scrub/repair tells us to recover an item, but it's older than log.bottom.	2009-06-23 14:54:08 -07:00
Sage Weil	303fe163f9	osd: avoid using null ctx pointer Use localt instead, it's on the stack.	2009-06-23 09:33:13 -07:00
Sage Weil	89eb728f14	osd: stop rewinding replica log when we reach log.bottom We stop rewinding a replica log when we reach our own log.bottom, because we don't know enough to do so in any meaningful way, and because we can assume it is not divergent at that point (barring any complete screwupedness). Also, if we do change last_update, make sure last_complete is rewound too.	2009-06-23 09:33:13 -07:00
Sage Weil	e6cb25111f	mds: no fatal assert on ino allocation failures We still log them LOG_ERR. Client will be unhappy, but that's their problem.	2009-06-23 09:33:12 -07:00
Sage Weil	40e44349b5	osd: small cleanups	2009-06-23 09:33:12 -07:00
Sage Weil	78780528f3	mds: don't choke on bad parallel_fetch paths e.g., bad reconnect path from client, like /blah/file_not_dir/blah.	2009-06-23 09:33:12 -07:00
Sage Weil	54300766d9	rados: cleanup	2009-06-22 20:57:44 -07:00
Sage Weil	299eddaac3	kclient: make r_path[12] dup strings The mds_request lifetime differs from the caller's stack, so we need to duplicate these strings. Fixes problems with request reply after MDS recovery.	2009-06-22 20:21:58 -07:00
Sage Weil	8bbd6339ae	kclient: clean up mds_request path generation	2009-06-22 20:21:58 -07:00
Sage Weil	541669cce2	todo	2009-06-22 20:21:58 -07:00
Sage Weil	f8d05356e9	Makefile: add missing kernel/ headers	2009-06-22 20:21:58 -07:00
Sage Weil	5f1ea72d55	kclient: import into fs/, not fs/staging/?	2009-06-22 20:21:58 -07:00
Greg Farnum	74ceb01289	rados/objecter: Changes to rados in/out, and various things work.	2009-06-22 15:35:36 -07:00
Greg Farnum	2bd1377016	Objecter/librados: Refactored and renamed for clarity.	2009-06-22 15:35:36 -07:00
Sage Weil	0e3e44449d	todo	2009-06-22 10:05:19 -07:00
Sage Weil	794002e48b	todo	2009-06-22 09:32:33 -07:00
Sage Weil	aa615b84f5	kclient: clean up unaligned pointer accesses Get rid of the likes of (__le64)foo. Get rid of useless ceph_decode_##_le() macros; use ceph_decode_copy instead.	2009-06-20 15:00:31 -07:00
Sage Weil	3652d17afe	cosd: conf updates	2009-06-20 14:01:10 -07:00
Sage Weil	1d21494a63	mon: allow repair of entire osd	2009-06-20 14:00:55 -07:00
Sage Weil	29a2b2f3a7	mds: reduce default memory, journal footprint	2009-06-20 14:00:55 -07:00
Sage Weil	e0097bc12e	osd: do NOT include op vector when shipping raw transaction This just doubles up the data payload. And makes the MOSDSubOp printout look like garbage, since e.g. the setxattr names are taken from the portion of the data payload encoding the transaction.	2009-06-19 23:27:04 -07:00
Sage Weil	4148185392	kclient: strip out kernel version compatibility cruft	2009-06-19 15:03:18 -07:00
Sage Weil	7e51c52805	kclient: update script importer	2009-06-19 15:03:11 -07:00
Sage Weil	58f28204c5	todo	2009-06-19 14:56:32 -07:00
Sage Weil	9f58773e0a	osd: on scrub repair, update replica pg stats as necessary An MOSDPGInfo to an active replica is treated as a pg stat repair. The replica just saves it to disk.	2009-06-19 12:46:21 -07:00
Sage Weil	bcab1f16a9	osd: pass updated stats to replica When we ship the raw transaction to the replica, we need to ship the new pg_stat_t as well, since that isn't getting updated in parallel by prepare_transaction().	2009-06-19 12:45:36 -07:00
Sage Weil	3e0c3e29d9	uclient: close mds session close race If we get a mds push msg while closing the session, resend the close request.	2009-06-19 11:40:09 -07:00
Sage Weil	4f0e84c982	objecter: some list_objects cleanups	2009-06-19 10:06:32 -07:00
Sage Weil	d82ea93ff5	osd: check that pg matches Otherwise return an empty result. May want to return an error here.. not sure which tho.	2009-06-19 10:06:17 -07:00
Sage Weil	0258c459d2	todo: bugs that have come up >2x now	2009-06-18 21:19:25 -07:00
Sage Weil	f6981fce3a	osd: adjust debug levels a bit Try to put iterative output to be at 20, other stuff at 10, so that we can tolerate 10 on large data sets.	2009-06-18 21:18:49 -07:00
Sage Weil	5d24752ada	osd: fix initialization of log.complete_to in PG::activate() The complete_to should point to the next object to get, which should be just PAST info.last_complete. That is because we can trim the log up to and including last_complete (because that entry is recovered), and we don't want to invalidate the iterator. That is while (log.complete_to->version <= info.last_complete) log.complete_to++; and in sub_op_push, while (...) { ... if (info.last_complete < log.complete_to->version) info.last_complete = log.complete_to->version; log.complete_to++; }	2009-06-18 21:18:49 -07:00
Sage Weil	ddd624275e	osd: remove bad trim assertion: trim point may precede local log.bottom	2009-06-18 21:18:49 -07:00
Sage Weil	695e6a7625	osd: remove bad assertion to allow trim before pg is clean We may trim the log before recovery completes.	2009-06-18 21:18:49 -07:00
Greg Farnum	17b0f2653d	Objecter: now has list instead of librados. Hurrah.	2009-06-18 17:05:17 -07:00
Greg Farnum	f3210a7150	Objecter: Now resubmits *Op as part of tick() if the response takes too long.	2009-06-18 17:05:17 -07:00
Sage Weil	5720080b6f	osd: be a bit more verbose about peer_info Looking for residual bug where peer_info info is somehow missing when activate() happens...	2009-06-18 16:42:35 -07:00
Sage Weil	9e33bf129b	osd: don't trim pg log if degraded Also be a bit more verbose about pg_trim_to changes.	2009-06-18 16:42:35 -07:00
Sage Weil	fc7ec60c7f	osd: we don't use MOSDPGInfo to signal replica uptodate anymore Clean out cruft from old replica-driven recovery.	2009-06-18 16:42:35 -07:00

7357 Commits