Commit Graph

17395 Commits

Author SHA1 Message Date
Samuel Just
798ef38b14 osd: delay pg list on a snapid until missing is empty
We cannot determine from the missing set whether an object existed
at a given snap.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-12-09 13:44:32 -08:00
Greg Farnum
d21f4abc0f msgr: turn up socket debug printouts
These shouldn't be too common and will help in debugging
socket leaks.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-12-08 18:01:20 -08:00
Josh Durgin
891025e539 udev: drop device number from name
The device number depends on how many rbd images have been
mapped. Removing it makes the name determined solely by the name,
image, and snapshot that are mapped, for ease of scripting or persistence
across reboots.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-08 16:36:47 -08:00
Henry C Chang
a5606ca435 pybind: trivial fix of missing argument
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com>
2011-12-08 13:10:29 -08:00
Sage Weil
e4db12978f crush: whitespace
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-08 08:42:59 -08:00
Sage Weil
808763ea81 osdmap: initialize cluster_snapshot_epoch
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-08 08:41:54 -08:00
Sage Weil
c94590abb9 crush: set max_devices=0 for map with empty buckets
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-08 08:41:39 -08:00
Sage Weil
ca002a3389 crush: fix stepping on unallocated memory
If size is 0 we can't write here.

Reported-by: pankaj singh <psingh.ait@gmail.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-08 08:06:26 -08:00
Sage Weil
d940d68d63 client: trim lru after flushing dirty data
Shouldn't matter, but it would be interesting to see if this affects
#1737.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 21:19:19 -08:00
Sage Weil
1545d03c5a client: unmount cleanup
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 21:19:16 -08:00
Sage Weil
f3c90f8d16 client: wait for sync writes even with cache enabled
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 21:19:14 -08:00
Sage Weil
adbe36394a client: send umount warnings to log, not stderr
stderr isn't usually open anyway.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 21:19:10 -08:00
Samuel Just
2d3721c6cc ObjectStore,ReplicatedPG: remove old collection_list_partial
No need for the old collection_list_partial instance: it's cleaner to
just use an hobject_t as the collection list handle.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-12-07 11:40:11 -08:00
Samuel Just
717621f66e librados,Objecter,PG: list objects now includes the locator key
Previously, there was no way to recover the locator key used to create
an object. Now, rados_objects_list_next and ObjectIterator will return
the key as well as the object name.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-12-07 11:40:11 -08:00
Sage Weil
322f93a2f7 hobject_t: encode max properly
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 11:40:11 -08:00
Sage Weil
0807e7d523 hobject_t: make filestore_hobject_key_t 64 bits
So we can return 0x100000000 when max=true.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 11:40:11 -08:00
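
The widening in 0807e7d523 can be illustrated with a minimal sketch. It assumes the key is otherwise a 32-bit value, which is why the sentinel 0x100000000 from the commit message does not fit. The names `MAX_KEY` and `filestore_key` are hypothetical, and how the real code derives a key from the hash is not shown in this log.

```python
# Hypothetical illustration; not the Ceph implementation.
MAX_KEY = 0x100000000  # value from the commit message: one past 0xFFFFFFFF


def filestore_key(hash32: int, is_max: bool) -> int:
    """Return a key that sorts after every 32-bit key when is_max is set."""
    if is_max:
        return MAX_KEY
    # Assumption: ordinary keys occupy only the low 32 bits.
    return hash32 & 0xFFFFFFFF


# The sentinel needs 33 bits, so a 32-bit key type cannot hold it.
bits_needed = MAX_KEY.bit_length()
```

With a 64-bit key type the sentinel compares greater than the largest ordinary key, so no separate flag is needed when comparing keys.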
Sage Weil
997265a245 os/HashIndex: some minimal debug output
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 11:40:11 -08:00
Sage Weil
9ab445a42d ObjectStore: Add collection_list_partial for hash order
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-12-07 11:40:11 -08:00
Sage Weil
63e3d86430 hobject_t: define explicit hash, operator<<; drop implicit sobject_t()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 11:40:10 -08:00
Samuel Just
cada2f2ef4 object.h: Sort hobject_t by nibble reversed hash
To match the HashIndex ordering, we need to sort hobject_t by the nibble
reversed hash.  We store objects in the filestore in a directory tree
with the least significant nibble at the top and the most at the bottom
to facilitate pg splitting in the future.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-12-07 11:40:10 -08:00
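
The nibble-reversed ordering described in cada2f2ef4 can be sketched as follows. This is an illustration only, assuming a 32-bit hash; the function name `reverse_nibbles` is hypothetical and is not taken from the Ceph source.

```python
# Hypothetical illustration; not the Ceph implementation.
def reverse_nibbles(hash32: int) -> int:
    """Reverse the order of the eight 4-bit nibbles of a 32-bit hash."""
    reversed_hash = 0
    for _ in range(8):
        # Move the current lowest nibble into the next-highest output slot.
        reversed_hash = (reversed_hash << 4) | (hash32 & 0xF)
        hash32 >>= 4
    return reversed_hash


# Sorting by the reversed value groups hashes by their least significant
# nibble first, like a directory tree with that nibble at the top level.
hashes = [0x00000010, 0x00000001, 0x00000002]
ordered = sorted(hashes, key=reverse_nibbles)
```

Here 0x00000010 sorts first because its least significant nibble is 0, even though it is numerically the largest of the three.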
Sage Weil
348321a591 hobject_t: sort by (max, hash, oid, snap)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 11:40:10 -08:00
Sage Weil
2026450bd1 hobject_t: define max value
Create a max value that is greater than all other values.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-07 11:40:10 -08:00
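
The two hobject_t commits above (348321a591 and 2026450bd1) describe an ordering by (max, hash, oid, snap) in which the max value is greater than everything else. A minimal sketch of that idea, assuming a plain integer comparison on the hash (the later commit cada2f2ef4 changes how the hash is compared) and using a hypothetical `HObject` class rather than the real C++ type:

```python
# Hypothetical illustration; not the Ceph implementation.
from dataclasses import dataclass


@dataclass(frozen=True)
class HObject:
    oid: str = ""
    snap: int = 0
    hash: int = 0
    is_max: bool = False

    def sort_key(self):
        # False sorts before True, so any object with is_max set
        # compares greater than every ordinary object.
        return (self.is_max, self.hash, self.oid, self.snap)


MAX_OBJECT = HObject(is_max=True)

objects = [
    HObject("b", 1, 0x20),
    MAX_OBJECT,
    HObject("a", 2, 0x20),
    HObject("z", 0, 0x10),
]
ordered = sorted(objects, key=HObject.sort_key)
```

Putting the max flag first in the tuple is what makes the max value usable as an end marker when listing a collection.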
Tommi Virtanen
745be30f51 gitignore: Ignore src/keyring, as created by vstart.sh
Commit 86c34ba9ee changed the filename but not .gitignore.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-12-06 15:22:16 -08:00
Samuel Just
a1ebd725dc ReplicatedPG: don't crash on empty data_subset in sub_op_push
If data_subset is empty (i.e., the data we pulled is no longer useful),
we should mark complete false and continue rather than fail the
assert in range_end().

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-12-06 14:44:46 -08:00
Josh Durgin
8afa5a5d9e workunits: fix secret file and temp file removal for kernel rbd
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 14:20:31 -08:00
Greg Farnum
03b03553b2 ReplicatedPG: do not ->put() scrub messages when adding to a WorkQueue.
This function is passing a reference from PG::active_rep_scrub to
the req_scrub_wq, not eliminating the reference (and the WorkQueue
doesn't grab a new reference itself, either).
The other alternative is to convert the WorkQueue to grab a
reference, but since they can cycle through the WorkQueue more than
once, and need to be ->put() outside the WorkQueue, I don't like
that option.
This should fix #1758.

Also add an assert to PG::_request_scrub_map to check on the other
possible cause of this bug (and fix the indentation).

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-12-06 14:24:08 -08:00
Josh Durgin
bcd26fca71 workunits: make rbd kernel workunit executable
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 13:36:51 -08:00
Tommi Virtanen
2bdf9078ef doc: Reorganize pip calls to use a requirements file.
The conditional before running pip install was unnecessary,
"pip install" on already installed packages is fast (as long
as it's not --upgrade), and --quiet makes it not spam the
console.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-12-06 12:13:03 -08:00
Tommi Virtanen
200d7c89a6 doc: Switch diagram tools from dia to ditaa.
Now you can create diagrams easily with the ".. ditaa::"
directive in the Sphinx documents.

admin/build-doc now checks for debs required for building
the documentation, or just lists commands missing for hosts
not using dpkg.

For more on Ditaa, see http://ditaa.sourceforge.net/

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
2011-12-06 12:07:59 -08:00
Sage Weil
33753c82af filestore: send back op error to log, not stderr
Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-06 10:50:01 -08:00
Sage Weil
20b7af79c6 doc: fix typo
Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-06 10:50:01 -08:00
Josh Durgin
66b6b1bff8 workunits: add some tests for kernel rbd
This covers some snapshot and resize functions that aren't tested by fs benchmarks.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 10:31:17 -08:00
Josh Durgin
fd9556f0ac rbd: the showmapped command shouldn't connect to the cluster
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 10:26:24 -08:00
Josh Durgin
16a211bfee ceph-rbdnamer: include snapshot name if present
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 10:26:24 -08:00
Josh Durgin
274f4890dc rbd, mount.ceph: use pre-stored secret if available
If a secret is specified, store and use it, but otherwise
check for a pre-existing secret to use.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 10:26:24 -08:00
Josh Durgin
0ad0fbfe3a secret: add is_kernel_secret function
This will let us know whether we can add a key mount option
if no secret is specified.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 10:26:24 -08:00
Josh Durgin
01d30e6acb secret: fix error check
add_key will return -1 when an error occurs, which should be handled at a higher level and not printed here.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 10:26:24 -08:00
Josh Durgin
575f717fd4 rbd: allow snapshots to be mapped
unmap and showmapped already support snapshots. map should too.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 10:26:24 -08:00
Josh Durgin
ddc11a8f17 test_rados.py: clean up after EEXIST test
This extra pool caused subsequent pool tests to fail.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-06 08:34:47 -08:00
Sage Weil
54758abccf Merge remote branch 'gh/stable'
2011-12-05 17:33:57 -08:00
Sage Weil
9512aed5f5 doc: fix rst syntax
Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-05 16:16:35 -08:00
Sage Weil
7178f1caa8 doc: document monitor cluster expansion/contraction
Pretty sure my rst syntax is wrong.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-05 14:07:44 -08:00
Sage Weil
16f79282cd cephtool: fix shutdown
Fix 'ceph -w' brokenness from commit ad13d0b7.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
2011-12-05 13:33:39 -08:00
Sage Weil
019597e6f4 filejournal: make FileJournal::open() arg slightly less weird
Pass in fs_op_seq (last_committed_seq), not the next expected seq, so we
can avoid subtracting and adding 1 in odd places.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-05 11:21:14 -08:00
Sage Weil
bfbc4324d6 Merge branch 'stable'
2011-12-05 11:21:08 -08:00
Sage Weil
86c34ba9ee vstart.sh: .ceph_keyring -> keyring
Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-05 11:21:04 -08:00
Sage Weil
1e3da7edcf filejournal: remove bogus check in read_entry
It is perfectly fine to read events that are older than the fs's seq from
the journal; open() will skip them when positioning the read pointer on
open.

Also, this code is nonsensical; it always failed the assertion.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-05 10:53:27 -08:00
Sage Weil
dc167bac78 filejournal: set last_committed_seq based on fs, not journal
last_committed_seq is the last seq committed to the fs, not the journal.
Set it when we begin replay with the fs provided value, not from the newest
entry in the journal.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-05 09:37:10 -08:00
Sage Weil
4a0b00a0f2 mon: stub perfcounters for monitor, cluster
The 'mon' perfcounter is for the local daemon and is always registered.

The 'cluster' perfcounter is for cluster state, and is only registered
(and thus only shows up via the admin socket) when the current daemon is
part of the cluster quorum.

No actual counters yet.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-02 15:35:38 -08:00
Sage Weil
8bbe576cab osd: safely requeue waiting_for_ondisk waiters on_role_change
This could conceivably cause the reply ordering mismatch seen in bug
#1490.  Not sure why we didn't also fix this caller when we fixed that
bug last time :).

Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-02 15:27:38 -08:00