When run by the udev rules, PATH is not defined. Thus,
ceph-disk-activate relies on its which() function to locate the
correct executable. The which() function used os.defpath if none was
set, and this worked for anything using it.
ad6b4b4b08 added a new default value to
PATH, so only /usr/bin was checked by callers that did not use
which(). This resulted in the mount command not being found when
ceph-disk-activate was run by udev, and thus osds failing to start
after being prepared by ceph-deploy.
Make ceph-disk consistently use the existing helpers (command() and
command_check_call()) that use which(), so lack of PATH does not
matter. Simplify _check_output() to use command(),
another wrapper around subprocess.Popen.
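
The lookup behaviour described above can be sketched roughly as follows.
This is a minimal illustration, not the actual ceph-disk code: the names
which() and command() come from the text, but their bodies and return
values here are assumptions.

```python
import os
import subprocess


def which(executable):
    """Return the full path of executable, or None if it is not found.

    Falls back to os.defpath when PATH is unset, as is the case when
    the script is started from a udev rule.
    """
    path = os.environ.get('PATH', os.defpath)
    for location in path.split(os.pathsep):
        candidate = os.path.join(location, executable)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def command(arguments, **kwargs):
    """Run arguments via subprocess.Popen after resolving argv[0].

    Hypothetical wrapper: returns (stdout bytes, return code).
    """
    arguments = list(arguments)
    resolved = which(arguments[0])
    if resolved is not None:
        # Use the absolute path so the child does not depend on PATH.
        arguments[0] = resolved
    process = subprocess.Popen(arguments, stdout=subprocess.PIPE, **kwargs)
    out, _ = process.communicate()
    return out, process.returncode
```

Because every caller goes through which(), an unset PATH only changes
which directories are searched, not whether the search happens.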
Fixes: #7258
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
There's a window in-between receiving an MOSDPGTemp message from an OSD
and actually handling it that may lead to the pool the pg temps refer to
no longer existing. This may happen if the MOSDPGTemp message is queued
pending dispatching due to an on-going proposal (maybe even the pool
removal).
This patch fixes such behavior in four steps:
1. Check if the pool exists in the osdmap upon preprocessing
- if pool does not exist in the osdmap, then the pool must have been
removed prior to handling the message, but after the osd sent it.
- safe to ignore the pg update
2. If all pg updates in the message have been ignored, ignore the whole
message. Otherwise, let prepare handle the rest.
3. Recheck if pool exists in the osdmap upon prepare
- We may have ignored this pg back in preprocess, but other pgs in the
message may have led the message to be passed on to prepare; ignore
pg update once more.
4. Check if pool is pending removal and ignore pg update if so.
We delegate checking the pending value to prepare_pgtemp() because in this
case we should only ignore the update IFF the pending value is in fact
committed. Otherwise we should retry the message. prepare_pgtemp() is
the appropriate place to do so.
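
A simplified model of the filtering described above, assuming pg ids
are (pool, seed) tuples and pools are plain sets. The function names
mirror the text, but the signatures are hypothetical, and the retry
path for a pending removal that has not been committed yet is left out.

```python
def preprocess_pgtemp(pg_updates, osdmap_pools):
    """Return True if the whole message can be ignored.

    pg_updates maps (pool, seed) -> list of OSD ids. An update is
    ignored when its pool is no longer in the osdmap.
    """
    live = [pgid for pgid in pg_updates if pgid[0] in osdmap_pools]
    return not live


def prepare_pgtemp(pg_updates, osdmap_pools, pending_removed_pools):
    """Return only the updates that should still be applied."""
    applied = {}
    for pgid, osds in pg_updates.items():
        pool = pgid[0]
        if pool not in osdmap_pools:
            # Already ignored in preprocess; other pgs in the same
            # message got it passed on, so ignore it once more.
            continue
        if pool in pending_removed_pools:
            # Assumes the pending removal is committed; the real
            # handler would retry the message otherwise.
            continue
        applied[pgid] = osds
    return applied
```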
Fixes: #7116
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit f513f66f48)
This was causing a problem in the Striper, but fixing it here will avoid
corner cases all over the tree. Note that we have to bail out before
the end-of-buffer check to avoid hitting that check when the bufferlist is
also empty.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
If we add a partial result that is 0-length, we used to hit an assert in
buffer::list::splice(). Add a unit test to verify the fix.
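
A toy model of the zero-length case, using a bytearray in place of a
bufferlist. It only illustrates why the early return has to come before
the end-of-buffer check; the function below is an assumption and does
not reproduce buffer::list::splice().

```python
def splice(buf, off, length):
    """Remove length bytes at off from buf and return them."""
    if length == 0:
        # Bail out first: on an empty buffer, off == len(buf) == 0
        # would otherwise trip the end-of-buffer check below.
        return bytearray()
    if off >= len(buf):
        raise IndexError("splice offset past end of buffer")
    if off + length > len(buf):
        raise IndexError("splice range past end of buffer")
    claimed = buf[off:off + length]
    del buf[off:off + length]
    return claimed
```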
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
In the RPM spec file there is a test to deploy the uuid hack udev rules
for older udev operating systems. This includes CentOS and RHEL, but the
check currently covers only CentOS, causing RHEL clients to get a bogus
osd rules file.
Adjust the conditional to apply to RHEL as well as CentOS. (The %{rhel}
macro is defined in both platforms' redhat-rpm-config package.)
Fixes http://tracker.ceph.com/issues/7245
Signed-off-by: Ken Dreyer <ken.dreyer@inktank.com>
(cherry picked from commit 64a0b4fa56)
If we do, we will require the v2 feature bit from clients.
We could only include feature bits for rules that are actually referenced
by pools, but for now making the user create the rule is simpler. There is
no need to create this rule ahead of time.
Signed-off-by: Sage Weil <sage@inktank.com>
There is one path where a mds that is not sending its beacon (e.g.,
because it is not running at all) will lead to proposal of new mdsmaps.
Fix it.
Backport: emperor, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Automake puts ceph_common.sh into libdir/ceph, but the Red Hat packaging
was not capturing this file.
Add the libdir/ceph location to the RPM packaging.
Fixes: #7117
(cherry picked from commit 2d0d48b829)
bufferlist::append(istream) now filters out empty lines; reflect this in
the test.
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 4b5f2570e9)
The POST authentication by signature validation looked up a user based
on the access key, then used the first secret key for the user. If the
access key used was not the first access key, then the expected
signature would be wrong, and the POST would be rejected.
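
The intended lookup can be illustrated as below. This is a sketch, not
the radosgw code: it assumes the S3-style POST signature (base64 of the
HMAC-SHA1 of the encoded policy), and the user_keys mapping and function
names are hypothetical.

```python
import base64
import hashlib
import hmac


def expected_signature(user_keys, access_key, encoded_policy):
    """Compute the signature with the secret paired to access_key.

    user_keys maps access key -> secret key for one user. Using the
    first secret regardless of access_key is the bug described above.
    """
    secret = user_keys[access_key]
    digest = hmac.new(secret.encode(), encoded_policy.encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def signatures_match(expected, received):
    """Compare the two signatures in constant time."""
    return hmac.compare_digest(expected, received)
```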
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
The signature variables for expected vs got are poorly named, and this
led to them being swapped in the signature validation failure print.
Change them to 'expected' and 'received' and make the related temporary
variables consistent to match.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Way back in a0ed9c2004 we introduced the
dirty flag, but we did not track it in the stats until much later in
c561d5ea22. Unfortunately this interval
spans the emperor release. To avoid making scrub error out and require
repair on *any* of those old pools, flag stats that were encoded before
now such that the dirty stats are ignored. Clear the flag if we *do*
do a repair so that it will be tracked properly thereafter.
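
A rough model of the flag's effect on scrub, assuming a stats record
with hypothetical field names. It is meant only to show the comparison
being skipped and the flag being cleared on repair.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    num_objects: int
    num_objects_dirty: int
    # Set for stats encoded before dirty objects were tracked.
    dirty_stats_invalid: bool = False


def scrub_compare(recorded, scrubbed, repair=False):
    """Return the names of mismatching fields; optionally repair."""
    errors = []
    if recorded.num_objects != scrubbed.num_objects:
        errors.append('num_objects')
    if (not recorded.dirty_stats_invalid
            and recorded.num_objects_dirty != scrubbed.num_objects_dirty):
        errors.append('num_objects_dirty')
    if repair:
        recorded.num_objects = scrubbed.num_objects
        recorded.num_objects_dirty = scrubbed.num_objects_dirty
        # From now on the dirty count is tracked properly.
        recorded.dirty_stats_invalid = False
    return errors
```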
Signed-off-by: Sage Weil <sage@inktank.com>
snap/clone promotion, flush, and other goodies
This is now passing the thrashing with both cache and snap ops:
sage-2014-01-13_15:45:26-rados:thrash-wip-cache-snap-testing-basic-plana
Reviewed-by: Samuel Just <sam.just@inktank.com>
find_object_context() has all the logic to choose a particular clone given
a logical snap. In the trim case, we want none of that: we just need to
pull the obc for a specific clone instance. Note that this changes
none of the failure cases (previously we asserted r == 0).
Signed-off-by: Sage Weil <sage@inktank.com>
We were fabricating an object_info_t correctly and writing it to disk, but
it was not reflected by the in-memory ObjectContext. If something came
along quickly (like backfill) and tried to use it, the info would be
invalid.
Fix this by fabricating it in the obc and copying it to the new_obs for
the update.
Fixes: #7122
Signed-off-by: Sage Weil <sage@inktank.com>
Previously, if a snap was deleted but the clone was there and we hadn't
trimmed it yet, we would still return the data. Instead, return ENOENT
unconditionally (even if it's not removed yet). This makes the behavior from
the client perspective more predictable and consistent.
Signed-off-by: Sage Weil <sage@inktank.com>
This reliably returns ENODEV due to the test at the finish of flush. Not
because we are actually racing with trim, though: the trimmer doesn't run
at all. I believe it captures the important property, though. Namely:
we should not write a promoted object that is "behind" the snap trimmer's
progress. The fact that we are in front of it (the trimmer hasn't started
yet) should not matter since the object is logically deleted anyway.
We probably want to make the OSD return ENODEV on read in the normal case
when you try to access a clone that is pending trimming.
Signed-off-by: Sage Weil <sage@inktank.com>
If the object no longer exists (for example, because the snap trimmer just
killed it) clean up the flush state without trying to mark the object
clean.
Signed-off-by: Sage Weil <sage@inktank.com>
If we are promoting a clone and realize that the object is no longer
defined for any snaps, abort the copy and delete any temp object.
If the defined snaps have changed, make sure they are updated in memory
so that on promote completion the snapshot metadata is correct.
Signed-off-by: Sage Weil <sage@inktank.com>
Previously the caller was generating a temp object name and passing it
down in several different ways. Instead, generate one when we realize
that we need it, and store it in *one* place (CopyResults), where
the completions can get at the information.
Signed-off-by: Sage Weil <sage@inktank.com>
Make other find_object_context() callers handle the case where the object
in question needs to be promoted. We add a flag here that forces a promote
for these secondary objects so that the entire operation happens in the
same pool. Forwarding is not allowed in this case.
Signed-off-by: Sage Weil <sage@inktank.com>
If we have a clean object and clone it in make_writeable(), the clone
should also be clean (it does not need to be written back to the base
pool). If the object was dirty, the clone should be dirty.
Signed-off-by: Sage Weil <sage@inktank.com>
Consider:
- base and cache have same object foo; marked clean in cache pool
- modify + clone foo in cache pool. foo clone is clean.
- foo clone is evicted
- foo clone is read, and promoted
- we read foo@something from base pool, and get the head's content
copy-get does not provide us with a snaps list. Instead, we use the
snap_seq from the head to infer what the snaps vector was in the cache
pool and will be in the base pool when we flush the updates to the object.
Signed-off-by: Sage Weil <sage@inktank.com>
This is needed by the cache layer when reading a logical snap from a head
object on the backend in order to correctly recreate the clone in the
cache layer.
Signed-off-by: Sage Weil <sage@inktank.com>
Do not promote a clone for a snap that we know doesn't exist. If
find_object_context() didn't give us a missing_oid, there is nothing to
promote.
Signed-off-by: Sage Weil <sage@inktank.com>
This makes its results reliable. Otherwise, we can't mix the is_dirty
test with flush, which eliminates much of its value.
Signed-off-by: Sage Weil <sage@inktank.com>