RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-01 16:42:29 +00:00

Author	SHA1	Message	Date
Colin Patrick McCabe	6722b0c85d	rpm: add pkgconfig to BuildRequires You can't build without pkgconfig. Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-09 11:18:32 -08:00
Colin Patrick McCabe	9df18d1984	rpm: set files-attr for radosgw Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-09 10:28:39 -08:00
Sage Weil	b4264fbbdc	filejournal: reset last_committed_seq if we find journal to be invalid If we read an event that's later than our expected entry, we set read_pos to -1 and discard the journal. If that happens we also need to reset last_committed_seq to avoid a crash like 2010-12-08 17:04:39.246950 7f269d138910 journal commit_finish thru 16904 2010-12-08 17:04:39.246961 7f269d138910 journal committed_thru 16904 < last_committed_seq 37778589 os/FileJournal.cc: In function 'virtual void FileJournal::committed_thru(uint64_t)': os/FileJournal.cc:854: FAILED assert(seq >= last_committed_seq) ceph version 0.24~rc (commit:fe10300317383ec29948d7dbe3cb31b3aa277e3c) 1: (FileJournal::committed_thru(unsigned long)+0xad) [0x588e7d] 2: (JournalingObjectStore::commit_finish()+0x8c) [0x57f2ec] 3: (FileStore::sync_entry()+0xcff) [0x5764cf] 4: (FileStore::SyncThread::entry()+0xd) [0x506d9d] 5: (Thread::_entry_func(void*)+0xa) [0x4790ba] 6: /lib/libpthread.so.0 [0x7f26a2f8373a] 7: (clone()+0x6d) [0x7f26a1c2569d] Fixes #631 Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-08 18:10:49 -08:00
Sage Weil	a9c098df47	mon: use helper for clock drift check; log relative instead of absolute time Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-08 11:12:51 -08:00
Sage Weil	fe10300317	mds: sync->mix replica state is sync->mix(2) When auth first moves to sync->mix, - auth sends AC_MIX to replicas - replicas go to sync->mix - replicas finish gather, send AC_SYNCACK, move to sync->mix(2) - auth gets all acks, sends AC_MIX again - replica moves to MIX So any new replica should just get sync->mix(2), so that it is not confused by the second AC_MIX. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:19 -08:00
Sage Weil	2000f69e99	mds: do not choose lock state on replicas The lock state has already been set during rejoin. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:19 -08:00
Sage Weil	3825c4b87b	mds: small rejoin cleanup Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	9b9b86935e	mds: rev mds cluster internal protocol The lock encoding changed with the dirty bit on scatterlocks. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	2ea9b2d7db	mds: fix replay of already-journaled requests Check for already-completed tids for both retried and replayed requests. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	b5fd2e4d4e	mds: open undef dirfrags during rejoin Any invented dirfrags have a version of 0. This will cause problems later if we pre_dirty() anything in that dir because the dir version won't be in sync (it'll be way too small). Also, we can do that at any point, e.g. when flushing dirty caps, and aren't allowed to delay, so we need to load those dirfrags now. In theory we could read only the fnode and not all the dentries, but we may as well. We should be more careful about memory than this patch is, though. Fixes #15. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	39c5933db0	mds: add missing try_clear_more() to scatterlock Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	c681ed752f	mds: explicitly pass scatterlock dirty flag to auth on gather This ensures that if the replica thinks it is flushing something the auth will always do a scatter_writebehind. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	9bbb33b436	mds: send LOCKFLUSHED to trigger finish_flush on replicas Since `f741766a` we have triggered start_flush and finish_flush on replicas. The problem is that the finish_flush didn't always happen for the mix->lock case: we would start_flush when we sent the AC_LOCKACK, but could only finish_flush if/when we got another SYNC or MIX. If the primary stayed in the LOCK state, we would keep our flushing flag. That in turn causes problems later when we try to eval_gather() (esp if we are auth at that point?). Fix this by sending an explicit AC_LOCKFLUSHED message to replicas after we do a scatter_writebehind. The replica will only set flushing if it flushed dirty data, which forces scatter_writebehind, so we will always get the LOCKFLUSHED to match. Replicas that didn't flush will also get it, but oh well. We'd need to keep track which ones sent dirty data to do that properly, though. TODO: still need to verify that this is correct for rejoin. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	681b010fdb	mds: clear EXPORTINGCAPS on export_reverse We need to reverse the effects of encode_export_inode_caps(), which is just the pin and state bit. The original problem can be reproduced with - ceph tell mds 0 injectargs '--mds-kill-import-at 5' - restart mds - recovery completes successfully - wait for the subtree to be reexported - fail with bad EXPORTINGCAPS get in encode_export_inode_caps Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	f97660ff40	mds: fix LOOKUPHASH to avoid creating bogus replica CDir We can't create the CDir if we are non-auth. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Sage Weil	4f6439945b	mds: introduce rejoin_invent_dirfrag() helper Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-07 16:44:18 -08:00
Colin Patrick McCabe	1e2e4aa0f4	automake: in scripts, use sysconfdir as-is Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-07 10:56:05 -08:00
Colin Patrick McCabe	10b6887eae	automake: in deb pkg, use --sysconfdir=/etc When building the debian packages, use --sysconfdir=/etc. Also, don't fudge sysconfdir in the init-ceph script. Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-07 10:48:19 -08:00
Sage Weil	57bcdc54d5	mkcephfs: require -k; update man page Force users to specify keyring location; update man page accordingly. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-06 22:18:16 -08:00
Yehuda Sadeh	87545d0620	configure: detect crypto++ library	2010-12-06 15:25:34 -08:00
Sage Weil	ebcc9395b0	osd: drop not-quite-copy constructor for object_info_t Making a copy-like constructor that doesn't actually copy is confusing and error prone. In this case, we initialized a clone's object_info with the head's snapid, causing problems with what info was encoded and crashing later in the snap_trimmer. Here the one caller already called copy_user_bits(); let's move the lost copy there. This backs out one of the changes in `0cc8d34e`. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-06 14:01:51 -08:00
Colin Patrick McCabe	b1afea515f	librados: fix error path in rados_deinitialize Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-06 11:17:10 -08:00
Yehuda Sadeh	aa3dda61dd	librados: fix the C++ interface init	2010-12-06 11:16:35 -08:00
Yehuda Sadeh	9a60481681	librados: fix C interface error handling in init code	2010-12-06 10:31:06 -08:00
Greg Farnum	bf030ca267	client: resync ioctl header from ceph-client. Previous change to the CEPH_IOCTL_MAGIC in `fbbf448` was incorrect! Signed-off-by: Greg Farnum <gregf@hq.newdream.net>	2010-12-06 09:59:50 -08:00
Laszlo Boszormenyi	4e3c201132	Tune Debian packaging for the upcoming v0.24 release. Including switch OpenSSL dependency to Crypto++ as it's being used instead of the former; remove radosacl as it's not compiled anymore and pristine clean the source. Explicitly note this is in a 1.0 package format.	2010-12-05 22:20:48 -08:00
Sage Weil	27b70eb57b	osd: search for unfound on osds in might_have_unfound We were looking at 'up', which is just the set of OSDs we should be on in the current epoch; nothing to do with where the objects might be found. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-04 21:29:00 -08:00
Sage Weil	8aa7b39138	Makefile: make radosacl build WITH_DEBUG only Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-04 20:45:50 -08:00
Yehuda Sadeh	23f370436e	ceph.spec.in: update dependency	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	e005925988	rgw: null terminate armor result	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	f2424dfbd5	rgw: get rid of openssl altogether	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	a28b449439	configure: check for the presence of libcrypto++ header files	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	8821377030	crypto: change include	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	76e02c71dc	common: remove base64.c	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	e135e9245e	crypto: remove old openssl implementation	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	7fa9426c6b	makefile.am: most binaries (except rgw_*) don't link with openssl	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	6ec622c0cf	common: use ceph_armor instead of openssl based functions also modify ceph_[un]armor to get dest buffer length	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	58f3ce4a34	crypto: test for allocation failure, cleanup	2010-12-03 19:34:37 -08:00
Yehuda Sadeh	15d8bdf3bf	crypto: use crypto++ for aes instead of openssl need to implement it more efficiently, currently going through a string object	2010-12-03 19:34:37 -08:00
Sage Weil	378d13df95	osd: remove poid/soid from ScrubMap::object; clean up callers The soid is in the key in the map; no need to store it in the value. Update the scrub code appropriately. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-03 10:02:30 -08:00
Sage Weil	a457cbb9c2	mon: fix typo Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-03 10:02:30 -08:00
Colin Patrick McCabe	a4cc929ced	make: create log directories and tmp directories Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-03 09:35:55 -08:00
Jim Schutt	a5297388a7	msgr: Correctly handle half-open connections. If poll() says a socket is ready for reading, but zero bytes are read, that means that the peer has sent a FIN. Handle that. One way the incorrect handling was manifesting is as follows: Under a heavy write load, clients log many messages like this: [19021.523192] libceph: tid 876 timed out on osd6, will reset osd [19021.523328] libceph: tid 866 timed out on osd10, will reset osd [19081.616032] libceph: tid 841 timed out on osd0, will reset osd [19081.616121] libceph: tid 826 timed out on osd2, will reset osd [19081.616176] libceph: tid 806 timed out on osd3, will reset osd [19081.616226] libceph: tid 875 timed out on osd9, will reset osd [19081.616275] libceph: tid 834 timed out on osd12, will reset osd [19081.616326] libceph: tid 874 timed out on osd10, will reset osd After the clients are done writing and the file system should be quiet, osd hosts have a high load with many active threads: $ ps u -C cosd USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1383 162 11.5 1456248 943224 ? Ssl 11:31 406:59 /usr/bin/cosd -i 7 -c /etc/ceph/ceph.conf $ for p in `ps -C cosd -o pid --no-headers`; do grep -nH State /proc/$p/task/*/status | grep -v sleep; done /proc/1383/task/10702/status:2:State: R (running) /proc/1383/task/10710/status:2:State: R (running) /proc/1383/task/10717/status:2:State: R (running) /proc/1383/task/11396/status:2:State: R (running) /proc/1383/task/27111/status:2:State: R (running) /proc/1383/task/27117/status:2:State: R (running) /proc/1383/task/27162/status:2:State: R (running) /proc/1383/task/27694/status:2:State: R (running) /proc/1383/task/27704/status:2:State: R (running) /proc/1383/task/27728/status:2:State: R (running) With this fix applied, a heavy load still causes many client resets of osds, but no runaway threads result. Signed-off-by: Jim Schutt <jaschut@sandia.gov> Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-03 09:10:58 -08:00
Colin Patrick McCabe	39b42b21e9	make: create /etc/ceph if it doesn't exist make: create /etc/ceph if it doesn't exist. On uninstall, remove the directory if it's empty. (Never remove a user's config file, though.) Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-02 17:35:32 -08:00
Colin Patrick McCabe	da5ab7c9a4	osd: object_info_t: decode old versions correctly object_info_t has one constructor that initializes everything from a bufferlist. This means that the decode function needs to give default values to fields in object_info_t that aren't found in the bufferlist. Signed-off-by: Colin McCabe <colinm@hq.newdream.net>	2010-12-02 16:56:48 -08:00
Greg Farnum	03eb4e7a07	man: add man page for cephfs Add to Makefile, debian, and ceph.spec.in bits	2010-12-02 16:18:38 -08:00
Sage Weil	78a1462243	osd: fix log tail vs last_complete assert on replica activation The last_complete may be below the log tail IFF we have a backlog. Fixes `756918be3b`. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-01 15:40:28 -08:00
Sage Weil	a3d8c52794	filestore: call lower-level do_transactions() during journal replay We used to call apply_transactions, which avoided rejournaling anything because the journal wasn't writeable yet, but that uses all kinds of other machinery that relies on threads and finishers and such that aren't appropriate or necessary when we're just replaying journaled events. Instead, call the lower-level do_transactions() directly. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-01 13:48:56 -08:00
Sage Weil	9ecbc300cb	filestore: do journal mode autodetect and sanity check _before_ replay Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-01 13:46:30 -08:00
Sage Weil	f9fa855a71	filestore: fix journal locking on trailing mode We're already holding journal_lock due to the surrounding op_submit_{start,finish}. Signed-off-by: Sage Weil <sage@newdream.net>	2010-12-01 11:05:11 -08:00
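
The msgr commit a5297388a7 in the list above rests on one rule: if poll() reports a socket as readable but the read returns zero bytes, the peer has sent a FIN. The following is a minimal sketch of that check in Python, not the Ceph messenger code. The names `peer_has_closed` and `demo` are hypothetical, and the sketch assumes a POSIX system where `select.poll` is available.

```python
import select
import socket


def peer_has_closed(sock, timeout_ms=1000):
    """Return True if poll() reports the socket readable but a read yields zero bytes.

    Hypothetical helper for illustration; not part of Ceph.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    events = poller.poll(timeout_ms)
    if not events:
        # Nothing to read within the timeout; we cannot tell whether the peer closed.
        return False
    # MSG_PEEK leaves any pending data in the buffer for the real reader.
    data = sock.recv(4096, socket.MSG_PEEK)
    # Readable with zero bytes means end of stream: the peer sent a FIN.
    return len(data) == 0


def demo():
    """Exercise the check on a local socket pair, before and after a half-close."""
    a, b = socket.socketpair()
    try:
        b.sendall(b"ping")
        while_open = peer_has_closed(a)    # data is pending, so not closed
        a.recv(4096)                       # drain the pending data
        b.shutdown(socket.SHUT_WR)         # peer half-closes its sending side
        after_close = peer_has_closed(a)   # readable, zero bytes
        return while_open, after_close
    finally:
        a.close()
        b.close()
```

A caller that ignores the zero-byte case keeps getting "readable" from poll() on every pass and spins, which is consistent with the runaway threads the commit message describes.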

12144 Commits