Commit Graph

75585 Commits

Author SHA1 Message Date
Abhishek Lekshmanan
365f3215cc rgw: move reshard pool to ns in log pool
Since the pool was introduced only in Luminous dev and RC releases we
can probably upgrade without the need to bump up the struct version
numbers. This needs a pending release notes entry.

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2017-07-17 15:52:48 +02:00
Abhishek Lekshmanan
a0d8149930 rgw: dump the reshard pool in rgw zone params json
So that the zone get/put commands display the reshard pool as well

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2017-07-17 15:49:55 +02:00
xie xingguo
963c1cd3af mon/OSDMonitor: fix "ceph osd pool get rbd all --format=json-pretty"
Two problems:
(1) MIN_WRITE_RECENCY_FOR_PROMOTE is a tier-only option.
(2) should automatically filter out unset pool options, otherwise it will
    keep outputting rubbish:

{
    "pool": "rbd",
    "pool_id": 3,
    "min_write_recency_for_promote": 0
}
{
    "pool": "rbd",
    "pool_id": 3,
    "fast_read": 0
}
{
    "pool": "rbd",
    "pool_id": 3
}
{
    "pool": "rbd",
    "pool_id": 3
}
{
    "pool": "rbd",
    "pool_id": 3
}
{
    "pool": "rbd",
    "pool_id": 3,
    "csum_type": "???"
}

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-17 21:49:45 +08:00
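The filtering described in the commit above can be pictured with a small sketch. This is not the OSDMonitor code (which is C++). The function name `pool_get_all`, the `is_tier` parameter, and the use of `None` or an empty string to mean "unset" are assumptions made only for illustration. The tier-only option name is taken from the commit message.

```python
# Hypothetical markers for "option is not set" in this sketch.
UNSET = (None, "")

# Named in the commit message as a tier-only option.
TIER_ONLY = {"min_write_recency_for_promote"}


def pool_get_all(pool, pool_id, options, is_tier):
    """Build one JSON-ready dict for a pool.

    Unset options are skipped, and tier-only options are skipped
    unless the pool is a cache tier.
    """
    out = {"pool": pool, "pool_id": pool_id}
    for name, value in options.items():
        if value in UNSET:
            continue  # filter out unset pool options
        if name in TIER_ONLY and not is_tier:
            continue  # tier-only option on a non-tier pool
        out[name] = value
    return out
```

With this shape, a non-tier pool yields a single object containing only the options that are actually set, instead of a run of near-empty objects.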
xie xingguo
21b5640549 mon/OSDMonitor: drop unnecessary write permission for "crush get-tunable" command
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-17 21:49:44 +08:00
xie xingguo
a7917e59ab osd/OSD: filter out deprecated meta for bluestore
Journal path is filestore related...

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-17 21:49:42 +08:00
xie xingguo
b9526c5455 mon/OSDMonitor: cleanup last_osd_report if osd does not exist
In case we might want to reuse the same slot later.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-17 21:49:42 +08:00
xie xingguo
a98c7e84ac osd/OSD: gracefully shutdown on error exit during init
This can avoid crashes as below:

  0> 2017-07-12 09:34:47.427438 7f320ce61b80 -1 /home/xxg/build/ceph-dev/src/common/HeartbeatMap.cc: In function 'ceph::HeartbeatMap::~HeartbeatMap()'
thread 7f320ce61b80 time 2017-07-12 09:34:47.422986
/home/xxg/build/ceph-dev/src/common/HeartbeatMap.cc: 39: FAILED assert(m_workers.empty())

 ceph version 12.1.0-702-gc5b99af (c5b99af) luminous (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f320d8ba7f0]
 2: (ceph::HeartbeatMap::~HeartbeatMap()+0xf8) [0x7f320d9be0a8]
 3: (CephContext::~CephContext()+0x40c) [0x7f320d9a648c]
 4: (CephContext::put()+0xe6) [0x7f320d9a6776]
 5: (main()+0xad3) [0x7f320d282953]
 6: (__libc_start_main()+0xf5) [0x7f32094cfb15]
 7: (()+0x4964c9) [0x7f320d31f4c9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-17 21:49:41 +08:00
Sage Weil
3a4931b0e4 ceph: allow '-' with -i and -o for stdin/stdout
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-17 09:38:52 -04:00
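The convention in the commit above, where '-' stands for stdin or stdout, is a common command-line idiom. A minimal sketch follows. The helper names `read_input` and `write_output` and their optional stream parameters are hypothetical and are not taken from the ceph script.

```python
import sys


def read_input(path, stdin=None):
    """Return the contents of `path`, or of stdin when path is '-'."""
    if path == "-":
        stream = stdin if stdin is not None else sys.stdin
        return stream.read()
    with open(path) as f:
        return f.read()


def write_output(path, data, stdout=None):
    """Write `data` to `path`, or to stdout when path is '-'."""
    if path == "-":
        stream = stdout if stdout is not None else sys.stdout
        stream.write(data)
        return
    with open(path, "w") as f:
        f.write(data)
```

The stream parameters exist only so the helpers can be exercised without touching the real process streams.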
Ilya Dryomov
0f75d79c34 qa/tasks/rbd_fio: use teuthology.packaging for handling packages
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-17 15:32:51 +02:00
Kefu Chai
0abee472e3 mon: add force-create-pg back
and now it's "ceph osd force-create-pg"

Fixes: http://tracker.ceph.com/issues/20605
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-17 21:18:06 +08:00
Karol Mroz
fad3c4992d rgw: clarify when encountering eacces for reshard list
Specify that we don't have access to the reshard pool when encountering
EACCES.
TODO: get rgw's name and add that in the log message

Fixes: http://tracker.ceph.com/issues/20289

Signed-off-by: Karol Mroz <kmroz@suse.de>
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
2017-07-17 14:48:07 +02:00
Nathan Cutler
758bcf9a0c build/ops: rpm: fix typo WTIH_BABELTRACE
Introduced by b331898ea9

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-07-17 14:46:57 +02:00
Kefu Chai
7aabdc01eb Merge pull request #16339 from majianpeng/test-fio-print-more-perfcounter
test/fio: print all perfcounters rather than objectstore itself.

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-17 20:33:21 +08:00
Pan Liu
aeac0423cf os/bluestore: use reference to avoid string copy
Signed-off-by: Pan Liu <wanjun.lp@alibaba-inc.com>
2017-07-17 20:26:42 +08:00
Kefu Chai
c142f25a60 Merge pull request #16346 from liewegas/wip-20602
mon: skip crush smoke test when running under valgrind

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-17 20:15:24 +08:00
Kefu Chai
2b3adf71c8 Merge pull request #16302 from liewegas/wip-mds-dup-alerts
mon/MDSMonitor: fix segv when multiple MDSs raise same alert

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-17 19:46:24 +08:00
Nathan Cutler
0e199cef8c build/ops: rpm: socat is only needed for "make check"
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-07-17 13:13:33 +02:00
Nathan Cutler
a502a93907 build/ops: rpm: put mgr python build dependencies in make_check bcond
Fixes: http://tracker.ceph.com/issues/20425
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Signed-off-by: Tim Serong <tserong@suse.com>
2017-07-17 13:12:54 +02:00
Josh Durgin
6b3e6302a7 osd/PGLog.h: handle lost+delete entries the same as client deletes
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:36 -04:00
Josh Durgin
31d6a030e5 osd/PGLog.h: update missing set verification for deletes
Deleted objects may still be on-disk after merging a log that includes
deletes, so adjust the asserts accordingly.

A case like:

980'1192 (972'1186) modify foo
--- osd restart ---
999'1196 (980'1192) delete foo
1003'1199 (0'0) modify foo
1015'1208 (1003'1199) delete foo

Would trigger the assert(miter->second.have == oi.info) since the
'have' version would be reset to 0'0.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:36 -04:00
Josh Durgin
fff55a4834 osd/PGLog: client deletes are now part of the missing set
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:36 -04:00
Josh Durgin
b41fa99726 osd/PrimaryLogPG: check whether clones missing from the cache are recovering
This appears now that deletes are not processed inline from the PG log
- a clone that is missing only on a peer (due to being deleted) would
not stop rollback from promoting the clone, resulting in hitting an
assert on the replica when the promotion tried to write to the missing
object on the replica.

This only affects cache tiering due to the dependence on the
MAP_SNAP_CLONE flag in find_object_context() - missing_oid was not being checked for being
recovered, unlike the target oid for the op (in do_op()).

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:36 -04:00
Josh Durgin
96f9ff1be8 osd/PrimaryLogPG,PGBackend: handle deletes during recovery
Deletes are the same for EC and replicated pools, so add logic for
handling MOSDPGRecoveryDelete[Reply] to the base PGBackend
class.

Within PrimaryLogPG, add parallel paths for starting deletes,
recover_missing() and prep_object_replica_deletes(), and update the
local and global recovery callbacks to deal with lacking an
ObjectContext after a delete has been performed.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:35 -04:00
Josh Durgin
3a9d056d84 osd/PG: handle deletes in MissingLoc
There's no source needed for deleting an object, so don't keep track
of this. Update is_readable_with_acting/is_unfound, and add an
is_deleted() method to be used later.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:35 -04:00
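The idea in the commit above, that a pending delete needs no source and therefore should never count as unfound, can be sketched as follows. This is a simplified Python model and not the C++ MissingLoc class. The class layout, the field names, and the `add_missing` and `add_source` helpers are assumptions. Only `is_deleted` and `is_unfound` correspond to names mentioned in the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class MissingItem:
    need: str                # version still required, e.g. "999'1196"
    is_delete: bool = False  # True when recovery means deleting the object


@dataclass
class MissingLoc:
    needs_recovery: dict = field(default_factory=dict)  # oid -> MissingItem
    sources: dict = field(default_factory=dict)         # oid -> set of OSD ids

    def add_missing(self, oid, item):
        self.needs_recovery[oid] = item

    def add_source(self, oid, osd):
        item = self.needs_recovery.get(oid)
        if item is None or item.is_delete:
            return  # no source is needed for deleting an object
        self.sources.setdefault(oid, set()).add(osd)

    def is_deleted(self, oid):
        item = self.needs_recovery.get(oid)
        return item is not None and item.is_delete

    def is_unfound(self, oid):
        item = self.needs_recovery.get(oid)
        if item is None or item.is_delete:
            return False  # a delete can always be performed locally
        return not self.sources.get(oid)
```

In this model an object is unfound only if it still needs data and no OSD is known to hold it.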
Josh Durgin
fa037fb863 osd: add a 'delete' flag to missing items and related functions
This will track deletes that were in the pg log and still need to be
performed during recovery. Note that with these deleted objects we may
not have an accurate 'have' version, since the object may have already
been deleted locally, so tolerate this when examining divergent entries.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:35 -04:00
Josh Durgin
100a3d70e5 message, osd: add request/response messages for deletes during recovery
The existing BackfillRemove message has no reply, and PushOps have too
much logic that would need changing to accommodate deletions.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-17 02:00:35 -04:00
Sage Weil
dd61a7f737 Merge pull request #16189 from bassam/pr-msgr-bind-addr
mon: add support for public_bind_addr option

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-16 21:26:23 -05:00
Sage Weil
6e33ba0183 Merge pull request #16349 from liewegas/wip-vstart-bind
vstart.sh: bind restful, dashboard to ::, not 127.0.0.1

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-16 21:24:53 -05:00
Sage Weil
0be22af9e6 Merge pull request #16329 from joscollin/wip-cleanup-crush-warning
crush: silence warning from -Woverflow

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-16 21:24:36 -05:00
Sage Weil
d659266c76 Merge pull request #16345 from jcsp/wip-watch-channel
ceph.in: filter out audit from ceph -w

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-16 21:16:59 -05:00
Sage Weil
1a0d645b1c Merge pull request #16315 from majianpeng/bluestore-misc-fix
os/bluestore: misc fix and cleanups

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-16 21:11:11 -05:00
Sage Weil
6b5b15d921 Merge pull request #16351 from liewegas/wip-mgr-init-debug
mgr,mon: debug init and mgrdigest subscriptions
2017-07-16 21:08:50 -05:00
Haomai Wang
eab1c25f1f Merge pull request #16358 from liupan1111/wip-fix-client
test/msgr: fixed the hang issue for perf_msg_client

Reviewed-by: Haomai Wang <haomai@xsky.com>
2017-07-17 09:49:23 +08:00
Pan Liu
78c6b480fa test/msgr: fixed the hang issue for perf_msg_client
Signed-off-by: Pan Liu <wanjun.lp@alibaba-inc.com>
2017-07-17 09:42:04 +08:00
Sage Weil
f9433e488b qa/suites/rados/rest/mgr-restful: simplify
Use default port; don't bother setting bind addr.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-16 21:28:03 -04:00
Sage Weil
1e50fb494c mon/MgrMonitor: only induce mgr epoch shortly after mkfs
For early clusters, if there isn't an active manager, we eventually want
to trigger a health warning by rolling over the mgrmap epoch.  We don't
want to do that if we have no active/available manager after that.  Fix
by checking ever_had_active_mgr here.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-16 14:47:44 -04:00
Enming Zhang
a1b05dd2f6 rgw: acl grants num limit
According to AWS S3 in this document[1], an ACL can have up to 100
grants.

If the number of grants is larger than 100, S3 returns the following:
400
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>MalformedACLError</Code><Message>The XML you provided was not well-formed or did not validate against our published schema</Message><RequestId>10EC67824572C378</RequestId><HostId>AWL3NnQChs/HCfOTu5MtyEc9uzRuxpYMhmvXQry2CovCcuxO2/tMqY1zGoWOur86ipQt3v/WEiA=</HostId></Error>

Now, if the number of ACL grants in a request is larger than the maximum allowed, rgw returns
the following:
400
<?xml version="1.0" encoding="UTF-8"?><Error><Code>MalformedACLError</Code><Message>The request is rejected, because the acl grants number you requested is larger than the maximum 101 grants allowed in an acl.</Message><BucketName>222</BucketName><RequestId>tx000000000000000000017-00596b5fad-101a-default</RequestId><HostId>101a-default-default</HostId></Error>

The maximum number of acl grants can be configured in config file with the configuration item:

rgw_acl_grants_max_num

[1] http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html

Signed-off-by: Enming Zhang <enming.zhang@umcloud.com>
2017-07-16 20:58:43 +08:00
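The grant limit described in the commit above amounts to a count check before the ACL is accepted. A rough sketch follows, assuming a hypothetical `check_acl_grants` helper. The default of 100 is an assumption based on the AWS limit quoted in the message, and the real limit comes from the `rgw_acl_grants_max_num` option. The error body is reduced to the code and message fields.

```python
from xml.sax.saxutils import escape


def check_acl_grants(grants, max_num=100):
    """Return (status, body); body is None when the ACL is accepted.

    max_num stands in for the rgw_acl_grants_max_num setting; the default
    used here is an assumption, not the value shipped with rgw.
    """
    if len(grants) <= max_num:
        return 200, None
    msg = ("The request is rejected, because the acl grants number you "
           "requested is larger than the maximum %d grants allowed in an acl."
           % max_num)
    body = ('<?xml version="1.0" encoding="UTF-8"?>'
            "<Error><Code>MalformedACLError</Code>"
            "<Message>%s</Message></Error>" % escape(msg))
    return 400, body
```

A request carrying one grant more than the limit gets a 400 with a MalformedACLError body, and anything at or under the limit passes through.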
Enming Zhang
eb752e0b8d rgw: req xml params size limitation error msg
rgw now returns the following:

400
<?xml version="1.0" encoding="UTF-8"?><Error><Code>MalformedXML</Code><Message>The XML you provided was larger than the maximum 2048 bytes allowed.</Message><BucketName>333</BucketName><RequestId>tx000000000000000000009-00596a1331-101a-default</RequestId><HostId>101a-default-default</HostId></Error>

Signed-off-by: Enming Zhang <enming.zhang@umcloud.com>
2017-07-15 21:13:11 +08:00
Kefu Chai
171104cb93 Merge pull request #15587 from wjwithagen/wip-wjw-ceph-disk-is_diskdevice
ceph-disk/ceph_disk/main.py: Replace ST_ISBLK() test by is_diskdevice()

Reviewed-by: Loic Dachary <ldachary@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-15 16:31:35 +08:00
Kefu Chai
742a117728 Merge pull request #16347 from tchaikov/wip-test-ceph-disk
tests: ceph-disk: use communicate() instead of wait() for output

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2017-07-15 16:24:23 +08:00
xie xingguo
a373692fa1 mon/MonCommand: drop unnecessary write permission
since "log last" does not ask for it.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-15 14:23:44 +08:00
xie xingguo
464179b447 osd/OSDMap: kill dead structure "struct qi"
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-15 14:23:43 +08:00
Jos Collin
9f6806559d Merge pull request #16334 from wjwithagen/wjw-bug-stringyfy
core: "Stringify needs access to << before reference" src/include/stringify.h

Reviewed-by: Jos Collin <jcollin@redhat.com>
2017-07-15 05:21:23 +00:00
xie xingguo
09af9b8afb osd/OSDMap: allow bidirectional swap of pg-upmap-items
This is useful when we also want an even distribution of pg primaries across osds.
For example:
Was:
[0 1 2]

By applying bidirectional swap of pg-upmap-items mapping [[0,1],[1,0]], now:
[1 0 2]

Thus we successfully decrease the number of primaries of osd.0 by 1 without
affecting the current (even) distribution of global pgs.

Real example:
./bin/ceph pg ls-by-pool rbd
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES LOG DISK_LOG STATE        STATE_STAMP                VERSION REPORTED UP      UP_PRIMARY ACTING  ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP                LAST_DEEP_SCRUB DEEP_SCRUB_STAMP
3.0           0                  0        0         0       0     0   0        0 active+clean 2017-07-12 15:14:45.083441     0'0    29:13 [0,1,3]          0 [0,1,3]              0        0'0 2017-07-12 15:14:14.515989             0'0 2017-07-12 15:14:14.515989

./bin/ceph osd pg-upmap-items 3.0 0 1 1 0 3 5
set 3.0 pg_upmap_items mapping to [0->1,1->0,3->5]

./bin/ceph pg ls-by-pool rbd
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES LOG DISK_LOG STATE        STATE_STAMP                VERSION REPORTED UP      UP_PRIMARY ACTING  ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP                LAST_DEEP_SCRUB DEEP_SCRUB_STAMP
3.0           0                  0        0         0       0     0   0        0 active+clean 2017-07-12 15:16:22.648424     0'0    33:13 [1,0,5]          1 [1,0,5]              1        0'0 2017-07-12 15:14:14.515989             0'0 2017-07-12 15:14:14.515989

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-15 12:29:57 +08:00
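The effect of a bidirectional swap in the commit above can be reproduced with a toy model in which every (from, to) pair is applied to the up set at once. The function `apply_upmap_items` is hypothetical. The real OSDMap logic is C++ and performs further checks that this sketch leaves out.

```python
def apply_upmap_items(up, items):
    """Apply (from_osd, to_osd) pairs to an up set, all pairs at once.

    Every OSD is looked up in the original mapping exactly once, so a
    swap such as [(0, 1), (1, 0)] exchanges the two OSDs instead of
    mapping 0 to 1 and then straight back to 0.
    """
    mapping = dict(items)
    return [mapping.get(osd, osd) for osd in up]


# The two cases from the commit message.
print(apply_upmap_items([0, 1, 2], [(0, 1), (1, 0)]))          # [1, 0, 2]
print(apply_upmap_items([0, 1, 3], [(0, 1), (1, 0), (3, 5)]))  # [1, 0, 5]
```

The first element of the result is the primary, which is why the swap moves the primary from osd.0 to osd.1 without changing which OSDs hold the PG.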
xie xingguo
70e664b7c8 mon/OSDMonitor: another dedup case for pg-upmap-items
./bin/ceph osd pg-upmap-items 1.0 1 2 1 2
osd.1 -> osd.2 already exists, set 1.0 pg_upmap_items mapping to [1->2]

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-15 12:29:57 +08:00
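The dedup case in the commit above can be sketched as a parser that turns the flat list of OSD ids into pairs and drops exact repeats. The function name `parse_upmap_pairs` is hypothetical, and the monitor's actual validation is more involved than this.

```python
def parse_upmap_pairs(ids):
    """Turn a flat id list like [1, 2, 1, 2] into unique (from, to) pairs."""
    if len(ids) % 2:
        raise ValueError("expected an even number of OSD ids")
    pairs = []
    seen = set()
    for frm, to in zip(ids[0::2], ids[1::2]):
        if (frm, to) in seen:
            continue  # e.g. "osd.1 -> osd.2 already exists"
        seen.add((frm, to))
        pairs.append((frm, to))
    return pairs
```

For the command line shown in the commit message, `parse_upmap_pairs([1, 2, 1, 2])` keeps a single `(1, 2)` pair, matching the reported mapping `[1->2]`.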
Kefu Chai
c596bff584 qa/suites/ceph-disk: whitelist health warnings
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-15 11:27:02 +08:00
Kefu Chai
73c0740b08 tests: ceph-disk: use communicate() instead of wait() for output
to avoid possible deadlock. quote from doc of Popen.wait()

> This will deadlock when using stdout=PIPE and/or stderr=PIPE and the
> child process generates enough output to a pipe such that it blocks
> waiting for the OS pipe buffer to accept more data. Use communicate() to
> avoid that.

and print out the stdout and stderr using LOG.warn() if the command
fails.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-15 11:27:02 +08:00
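The pattern described in the commit above looks roughly like the sketch below. The helper name `run` and the logger setup are assumptions, not the ceph-disk test code. `LOG.warning()` is used here in place of `LOG.warn()`, which is a deprecated alias in current Python.

```python
import logging
import subprocess
import sys

LOG = logging.getLogger(__name__)


def run(args):
    """Run a command and return (returncode, stdout, stderr).

    communicate() drains both pipes while waiting for the child, so the
    child cannot block on a full pipe buffer the way it can with wait().
    """
    proc = subprocess.Popen(args,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    if proc.returncode != 0:
        # Surface the captured output only when the command fails.
        LOG.warning("%s failed: stdout=%r stderr=%r", args, out, err)
    return proc.returncode, out, err


if __name__ == "__main__":
    # A child that writes far more than a typical OS pipe buffer holds.
    rc, out, _ = run([sys.executable, "-c", "print('x' * 200000)"])
    print(rc, len(out))
```

With `wait()` in place of `communicate()`, the child in the usage example could block once the pipe buffer fills, and the parent would then wait forever.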
Kefu Chai
0cc65197d6 Merge pull request #16045 from Liuchang0812/wip-compact-osd-feature
osd: compact osd feature

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-07-15 10:57:13 +08:00
Jianpeng Ma
9ab14d1df7 test/fio: print all perfcounters rather than objectstore itself.
Need bluefs,rocksdb perfcounters.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
2017-07-15 06:15:53 +08:00
Sage Weil
d52763c2cc Merge pull request #16221 from liewegas/wip-20546
crush/CrushWrapper: make get_immediate_parent[_id] ignore per-class shadow hierarchy

Reviewed-by: Neha Ojha <nojha@redhat.com>
2017-07-14 15:09:22 -05:00