We are using a hash_map<> to map tids to Op*'s. In handle_osd_map(),
we will recalc_op_target() on each Op in a random (hash) order. These
will get put in a temp map<tid,Op*> to ensure they are resent in the
correct order, but their order on the session->ops list will be random.
Then later, if we reset an OSD connection, we will resend everything for
that session in ops order, which is be incorrect.
Fix this by explicitly reordering the requests to resend in
kick_requests(), much like we do in handle_osd_map(). This lets us
continue to use a hash_map<>, which is faster for reasonable numbers of
requests. A simpler but slower fix would be to just use map<> instead.
This is one of many bugs contributing to #2947.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
We are calling requeue_ops() on each individual op, which means we need
to requeue in reverse order (newest first, oldest last).
Fixes: #2947
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Fixes: #2961
Also make sure that size is 64 bit.
backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The waiting_for_ondisk (and ack) maps get dups of ops that are in progress.
If we have a peering change in which the role does not change, we will
requeue the in-progress ops but leave these in the waiting_for_ondisk
maps, which will then trigger an assert the next time we examine that map
and find it didn't match up with what we expected.
Fix this by requeuing these on any peering reset in on_change(). This
keeps the two queues in sync.
Fixes: #2956
Signed-off-by: Sage Weil <sage@inktank.com>
Specifically, /foo may exist and client may try to mount /foo/bar. That
GETATTR request is on #1/foo/bar, but we cannot return a null dentry on bar
because the client is not prepared to handle it and will crash in
fill_trace().
Fixes: #2959
Reported-by: Yan Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Since we are requeuing stuff anyway, do it all in the correct order. This
fixes a bug where take_waiters() comes along later (at activate_map time)
and puts waiting_for_map events at the front of the queue, in front of
e.g. waiting_for_missing. This breaks ordering from the client's
perspective.
The convention should be: whenever you requeue, requeuing everything
that logically follows it first.
Fixes: #2947
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Requeue them explicity from apply_and_flush_repops() and call it last, so
that the overall ordering is preserved.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
[ The following text is in the "UTF-8" character set. ]
[ Your display is set for the "ANSI_X3.4-1968" character set. ]
[ Some characters may be displayed incorrectly. ]
Fix config keys for OSD/MDS data dirs. As in documentation and other
places of the scripts the keys are 'osd data'/'mds data' and not
'osd_data'
In case if MDS: if 'mds data' doesn't exist, create it.
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
[ The following text is in the "UTF-8" character set. ]
[ Your display is set for the "ANSI_X3.4-1968" character set. ]
[ Some characters may be displayed incorrectly. ]
Change ceph osd create <osd-id> to ceph osd create <uuid>, since this
is what the command is really doing.
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
If mon_host fails to parse, return an error instead of success.
This avoids failing later on an assert monmap.size() > 0 in the
monmap in MonClient.
Fixes: #2913
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Issue #2504. This makes us listen and notify on more than
a single object, which reduces the contention of cache
notifications.
NOTE: This change requires that any radosgw and radosgw-admin
use the same 'rgw num control oids' config value. A config value
of 0 will maintain old compatibility, and will allow an upgraded
process run in conjuction with an old one. Setting value other
than 0 (or using the non-zero default) will require upgrading
and restarting all the gateways together. Failing to do so
might lead to inconsistent user and buckets metadata (which
will be resolved once gateways are restarted).
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Allow the buckets, and any child objects, of a user to be deleted when the
user is deleted through radosgw-admin. In reference to feature request
2499: http://tracker.newdream.net/issues/2499.
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Allow objects to be deleted through radosgw-admin with an optional flag
to delete the tail of that object during the processing of the intent log.
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Fixes: #2841.
Usage trim operation was encoding the wrong op structure (usage read).
Since the structures somewhat overlapped it somewhat worked, but user
info wasn't encoded.
Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
It was not encoding user, adding that and reset version
compatibility.
This changes affects command interface, makes use of
radosgw-admin usage trim incompatible. Use of old
radosgw-admin usage trim should be avoided, as it may
remove more data than requested. In any case, upgraded
server code will not handle old client's trim requests.
backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Fixes: #2878
We now allow complete multipart upload to use chunked encoding
when sending request data. With chunked encoding the HTTP_LENGTH
header is not required.
Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Fixes: #2879.
xml_handle_data() appends data to the object instead of just
replacing it. Parsed data can arrive in pieces, specifically
when data is escaped.
Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Fixes#2877.
Removing quotes from ETag before comparing it to what we
have when completing a multipart upload.
Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
The ceph-conf command only parses the conf; it does not apply default
config values. This breaks mkcephfs if values are not specified in the
config.
Let ceph-osd create its own key, fix copying, and fix creation/copying for
the mds.
Fixes: #2845
Reported-by: Florian Haas <florian@hastexo.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
This should only be called by ~ptr or when we are replacing the current
target with something new. It is not suitable for external consumption
Because it doesn't reset length and offset.
Signed-off-by: Sage Weil <sage@inktank.com>
Fixes: #1855
It is no longer possible to create a subuser and new S3 key associated
with that user through the radosgw-admin utility. In reference to Bug 1855
http://tracker.newdream.net/issues/1855.
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: caleb miles <caleb.miles@inktank.com>