Commit Graph

75651 Commits

Author SHA1 Message Date
Jason Dillaman
6d7ac66ae2 mon: heuristics for auto-enabling pool applications upon upgrade
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Jason Dillaman
688026741b mon: health warning if in-use pools don't have application enabled
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Jason Dillaman
3514d6e53e mon: added new "osd pool application" commands
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Jason Dillaman
a15f6e4cea mon: store application metadata in pg_pool_t
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Sage Weil
572a942f8f mon: 'auth list' -> 'auth ls'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 12:33:14 -04:00
Sage Weil
2ebb8e13f6 mon: 'config-key list' -> 'config-key ls'
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 12:32:48 -04:00
Yuri Weinstein
35bc5f4165 Merge pull request #16275 from linuxbox2/wip-rgw-readdir-cookie
rgw_file: permit dirent offset computation

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-07-19 08:43:16 -07:00
Yuri Weinstein
c95ab13384 Merge pull request #16368 from theanalyst/fix/rgw-reshard-pool-ns
rgw: use a namespace for rgw reshard pool for upgrades as well

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2017-07-19 08:42:36 -07:00
Sage Weil
798e46b335 Merge pull request #16395 from jecluis/wip-fix-mon-mgr-bootstrap
mon/AuthMonitor: generate bootstrap-mgr key on upgrade

Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-19 10:02:06 -05:00
Sage Weil
6dec877335 Merge pull request #16425 from smithfarm/wip-ceph-w
doc: PendingReleaseNotes: "ceph -w" behavior has changed drastically

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Sage Weil <sage@redhat.com>
2017-07-19 09:47:12 -05:00
Kefu Chai
c3fba3c984 Merge pull request #16314 from tchaikov/wip-doc-replace-osd
doc: add instructions for replacing an OSD

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2017-07-19 22:38:11 +08:00
Willem Jan Withagen
4f49402589 qa/workunits/cephtool/test.sh: ceph osd stat out has changed, fix tests for that
The output of ceph osd stat has changed.
It used to print:

cluster b370a29d-9287-4ca3-ab57-3d824f65e339
 health HEALTH_OK
 monmap e1: 1 mons at {ceph1=10.0.0.8:6789/0}, election epoch 2, quorum 0 ceph1
 osdmap e63: 2 osds: 2 up, 2 in
  pgmap v41338: 952 pgs, 20 pools, 17130 MB data, 2199 objects
        115 GB used, 167 GB / 297 GB avail
             952 active+clean

but now the osdmap line has gone and thus this no longer works:
qa/workunits/cephtool/test.sh:1944:
old_pgs=$(ceph osd pool get $TEST_POOL_GETSET pg_num | sed -e 's/pg_num: //')
new_pgs=$(($old_pgs+$(ceph osd stat | grep osdmap | awk '{print $3}')*32))

4: qa/workunits/cephtool/test.sh: line 1945: 10+*32: syntax error: operand expected (error token is "*32")

 - Also parse the output as JSON with jq, for better reliability

Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2017-07-19 16:34:12 +02:00
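The failure described in the commit above comes from scraping free-form text: once the osdmap line is gone, the extracted field is empty and the shell arithmetic collapses to 10+*32. Below is a minimal Python sketch of the same idea as the jq-based fix. The key name num_osds, the sample JSON document and the new-style text line are assumptions for illustration and are not taken from the commit.

```python
import json


def osd_count_from_text(text):
    """Mimic `grep osdmap | awk '{print $3}'`: third field of the matching line, or ''."""
    for line in text.splitlines():
        if "osdmap" in line:
            fields = line.split()
            return fields[2] if len(fields) > 2 else ""
    return ""


def osd_count_from_json(doc, key="num_osds"):
    """Read the count from structured output; a missing key raises instead of yielding ''."""
    return int(json.loads(doc)[key])


old_text = " osdmap e63: 2 osds: 2 up, 2 in"   # line quoted in the commit message
new_text = "2 osds: 2 up, 2 in"                 # hypothetical line without "osdmap"
sample_json = '{"num_osds": 2}'                 # hypothetical JSON output

old_pgs = 10
new_pgs = old_pgs + osd_count_from_json(sample_json) * 32
```

With the text scraper, the new-style line yields an empty string, which is what produced the broken arithmetic; the JSON reader either returns an integer or fails loudly.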
Kefu Chai
e48fbe52b3 include/assert: test c++ before using static_cast<>
this partially reverts 2e7c72d.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-19 22:33:30 +08:00
Nathan Cutler
060084f708 doc: PendingReleaseNotes: "ceph -w" behavior has changed drastically
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-07-19 16:13:17 +02:00
Kefu Chai
0e33001e17 messages/: always set header.version in encode_payload()
we encode the payload without the write lock even when can_write ==
NOWRITE, if the message "can_fast_prepare". in that case, the "feature" of
the connection is 0, as no handshake has happened yet, so header.version
is always set to a version compatible with pre-luminous. but when the
message is re-encoded after the connection is re-established with luminous
features, header.version is not set back to HEADER_VERSION. that's why the
message's encoding is sometimes inconsistent with header.version.

in this change, we always set the header.version in encode_payload(), so
it's consistent even after connection reset and message re-encoding.

Fixes: http://tracker.ceph.com/issues/19939
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-07-19 21:30:25 +08:00
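The header.version problem described in the commit above can be illustrated with a toy model. This is a sketch under stated assumptions, not Ceph's C++ code: the class ToyMessage, the constants and the feature bit are all hypothetical names chosen for the example.

```python
HEADER_VERSION = 2          # hypothetical current header version
COMPAT_VERSION = 1          # hypothetical pre-luminous-compatible version
FEATURE_LUMINOUS = 1 << 0   # hypothetical feature bit


class ToyMessage:
    def __init__(self):
        self.header_version = HEADER_VERSION
        self.payload = None

    @staticmethod
    def _encode(features):
        # The payload format depends on the features of the connection.
        return "new-format" if features & FEATURE_LUMINOUS else "old-format"

    def encode_payload_buggy(self, features):
        # Only ever downgrades the version; a later re-encode never restores it.
        if not features & FEATURE_LUMINOUS:
            self.header_version = COMPAT_VERSION
        self.payload = self._encode(features)

    def encode_payload_fixed(self, features):
        # Always set the version so it matches the payload just encoded.
        if features & FEATURE_LUMINOUS:
            self.header_version = HEADER_VERSION
        else:
            self.header_version = COMPAT_VERSION
        self.payload = self._encode(features)


buggy = ToyMessage()
buggy.encode_payload_buggy(0)                  # fast prepare, no handshake yet
buggy.encode_payload_buggy(FEATURE_LUMINOUS)   # re-encode after reconnect

fixed = ToyMessage()
fixed.encode_payload_fixed(0)
fixed.encode_payload_fixed(FEATURE_LUMINOUS)
```

In the buggy variant the message ends up with a new-format payload but the old header version; in the fixed variant the two always agree.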
Sage Weil
f4d50c8f14 mon: define static mgr_commands at mkfs time
This closes a window between mkfs and when the first mgr goes active
where *no* mgr commands are defined, and things like 'pg dump' fail.  We
do not get the default set of commands defined by modules, but we get
everything else.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 08:58:41 -04:00
Sage Weil
c86c8b717e mgr: move mgr_commands to separate compilation unit
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 08:58:40 -04:00
Sage Weil
8badb73b9a mon/MonCommands: std::
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 08:58:40 -04:00
Sage Weil
388c8e4abd mon/MgrMonitor: mark mgr commands with FLAG_MGR
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-19 08:58:40 -04:00
John Spray
b28c300258 qa/doc: update for "mgr tell" no longer needed
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-19 08:58:40 -04:00
John Spray
54b693b06c mon: load mgr commands dynamically
So that the list of commands includes python modules,
thus allowing python-provided commands to be invoked
by the CLI without a `tell mgr` prefix.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-19 08:58:40 -04:00
John Spray
5c3846306b mgr: transmit command descriptions to mgr in activating beacon
The mgr already sends a beacon to the mon immediately
after loading python modules in Mgr::init, to indicate
that it is now available.  Use that beacon to transmit
the command descriptions.

The monitor should handle this beacon by persisting
the command descriptions before persisting the updated
mgrmap that indicates that the mgr is now active.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-19 08:58:40 -04:00
John Spray
9efacf2a77 mgr: use MonCommand for command descriptions
...and update the MonCommand encoding so that we
can readily send vectors of them around.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-19 08:58:40 -04:00
John Spray
2350d2aded encoding: remove encode_array_nohead
This was just a for loop.  No longer needed for
MonCommands, and the usage in memstore/PageSet
was just iterating over char* and should never have
been there to begin with.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-19 08:58:39 -04:00
Abhishek Varshney
596b4bc05f rgw: fix error message in removing bucket with --bypass-gc flag
Fixes: http://tracker.ceph.com/issues/20688

Signed-off-by: Abhishek Varshney <abhishek.varshney@flipkart.com>
2017-07-19 12:24:37 +00:00
Abhishek L
e8e70ecbb3 Merge pull request #16411 from smithfarm/wip-crn-regression
tools: ceph-release-notes: refactor and fix regressions

Reviewed-By: Kefu Chai <kchai@redhat.com>
Reviewed-By: Abhishek Lekshmanan <abhishek@suse.com>
2017-07-19 13:25:10 +02:00
Ilya Dryomov
7e7f6cfe5c qa/suites/krbd: add luminous thrash tests
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
0635c25e74 qa/suites/krbd: reorganize thrash tests
- factor out install and ceph into ceph/ceph.yaml
- pg_num thrashing + 20 minute health timeout for thrashosds
- common thrashosds-health.yaml whitelist
- drop iozone workload

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
dac11877e2 qa/suites/krbd: heavier rbd_fio workload
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
682c5a42e1 qa/tasks/rbd_fio: dump fio options before starting
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Ilya Dryomov
03f69b3275 qa/tasks/rbd_fio: support libaio engine
Want to set iodepth and do direct AIO.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-19 12:18:16 +02:00
Nathan Cutler
1a702adbfd tools: ceph-release-notes: match Reviewed-by more liberally
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-07-19 12:00:47 +02:00
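Both "Reviewed-by:" and "Reviewed-By:" spellings appear in this log, which is the kind of variation a more liberal match has to accept. The following is a minimal sketch of such matching; the pattern and the helper name reviewers are illustrative assumptions and are not the code from ceph-release-notes.

```python
import re

# Case-insensitive, tolerant of spaces or tabs around the colon (illustrative pattern).
REVIEWED_BY = re.compile(
    r"^[ \t]*reviewed-by[ \t]*:[ \t]*(.+?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def reviewers(message):
    """Return every reviewer named on a Reviewed-by line of a commit message."""
    return REVIEWED_BY.findall(message)


sample = (
    "tools: ceph-release-notes: refactor and fix regressions\n"
    "\n"
    "Reviewed-By: Kefu Chai <kchai@redhat.com>\n"
    "Reviewed-by: Jos Collin <jcollin@redhat.com>\n"
)
found = reviewers(sample)
```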
xie xingguo
29ef297dfb mon/HealthMonitor: fix mon_warn_on_osd_down_out_interval_zero does not work
And the output is wrong...

cluster:
    id:     b979e20d-6441-46b4-8663-954e1e8ce01d
    health: HEALTH_WARN
            1 osds down
            mon %names has mon_osd_down_out_interval set to 0

Now:
 health: HEALTH_WARN
            mon a is low on available space
            mon a has mon_osd_down_out_interval set to 0

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-19 17:40:20 +08:00
xie xingguo
0d05c2ac7c mon/HealthMonitor: fix wrong health level
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-19 16:48:13 +08:00
xie xingguo
3b6ddf66d1 mon/HealthMonitor: fix regex formatting
Was:
./bin/ceph -s
  cluster:
    id:     0f704e51-f496-4812-a782-f6bcc490a109
    health: HEALTH_ERR
            mon%plurals% %names% %isorare% low on available space

Now:
 ./bin/ceph -s
  cluster:
    id:     1bb2cc6d-ba6f-467a-b811-994101a42749
    health: HEALTH_ERR
            mon a is low on available space

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-07-19 16:43:50 +08:00
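The "Was" and "Now" output in the commit above shows %plurals%, %names% and %isorare% tokens being left unexpanded and then expanded. A small Python sketch of that kind of token substitution follows; the function render, the comma separator for multiple names and the replacement rules are assumptions for illustration, and the monitor itself is implemented in C++.

```python
import re


def render(template, names):
    """Expand %plurals%, %names% and %isorare% in a health message template."""
    plural = len(names) > 1
    values = {
        "plurals": "s" if plural else "",
        "names": ",".join(names),          # separator is an assumption
        "isorare": "are" if plural else "is",
    }
    # Replace each %key% token; unknown tokens are left untouched.
    return re.sub(r"%(\w+)%", lambda m: values.get(m.group(1), m.group(0)), template)


template = "mon%plurals% %names% %isorare% low on available space"
one = render(template, ["a"])
two = render(template, ["a", "b"])
```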
Nathan Cutler
96c672e891 tools: ceph-release-notes: fix sorted() key lambda regression
https://github.com/ceph/ceph/pull/16261 ported the script to Python 3, but it
retained the 2-argument version of the sorted() key function - in Python 3 the
key function takes only one argument.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-07-19 09:40:25 +02:00
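The regression described in the commit above is easy to reproduce in isolation. In Python 3 the key callable passed to sorted() receives a single element, so a two-argument comparison function raises TypeError; functools.cmp_to_key adapts such a function. The sample tuples below are illustrative and are not taken from the script.

```python
from functools import cmp_to_key

prs = [(16411, "tools"), (16275, "rgw_file"), (16368, "rgw")]


def by_number_cmp(a, b):
    """Old-style comparison: negative, zero or positive."""
    return (a[0] > b[0]) - (a[0] < b[0])


# Passing a two-argument function as key fails: key is called with one element.
try:
    sorted(prs, key=by_number_cmp)
    two_arg_key_works = True
except TypeError:
    two_arg_key_works = False

# Either write a one-argument key function...
by_key = sorted(prs, key=lambda pr: pr[0])
# ...or wrap the comparison function.
by_cmp = sorted(prs, key=cmp_to_key(by_number_cmp))
```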
Nathan Cutler
cadab4368b tools: ceph-release-notes: refactor and fix regression
This commit refactors the logic for determining the PR title and merge message,
and fixes a regression introduced by https://github.com/ceph/ceph/pull/16277

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-07-19 09:39:52 +02:00
Josh Durgin
16cc7efaac test: add a couple lost+delete unit tests
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:46 -04:00
Josh Durgin
62002a387d osd/PrimaryLogPG: guard lost_delete missing_loc change by feature flag
With deletes during recovery instead of during log processing, we need
to keep the entry in missing_loc.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:46 -04:00
Josh Durgin
6258444913 TestPGLog: add unit tests for rebuilding missing set
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:46 -04:00
Josh Durgin
b09249dee2 PGLog, PrimaryLogPG: rebuild the missing set when the OSDMap flag is set
The recovery_deletes flag will only be set once, by the 'ceph osd
require-osd-release luminous' command.

This matches how we rebuild the missing set when reading it off disk in
read_log_and_missing().

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:46 -04:00
Josh Durgin
a51d5fd9a1 osd_types, PGLog: encode missing based on features
Store whether the missing set should contain deletes, so that
persisted versions can be rebuilt if needed. Make missing_item
versioned, since it's persisted by the pg_log as an individual omap
value.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:45 -04:00
Josh Durgin
712f0da05c osd_types, Objecter: make recovery_deletes feature create a new interval
This is needed to create a single place to regenerate the missing set
- at the start of a new interval where support for recovery deletes
changed.

The missing set is otherwise not cleared, so it would need to be
rebuilt in arbitrary places if e.g. an osd not supporting it went down
and restarted with support, or if we used a feature flag command to
trigger rebuilding the missing set.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:45 -04:00
Josh Durgin
a67f3a8883 OSDMap, OSDMonitor: automatically set recovery deletes for luminous
Once the required osd release is luminous, all osds must support
recovery deletes, so set the flag then. This avoids an extra manual
step in luminous upgrades.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:45 -04:00
Josh Durgin
1ccc3c3cf7 OSDMap, OSDMonitor: add flag for all osds supporting recovery deletes
Just like sortbitwise, this can only be toggled on, and once on osds
that do not support it are not allowed to boot.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:45 -04:00
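The behaviour described in the commit above (a flag that can only be toggled on, after which OSDs lacking support may not boot) can be sketched as a toy model. The class ToyOSDMap and its methods are hypothetical names for illustration and do not mirror Ceph's OSDMap or OSDMonitor interfaces.

```python
class ToyOSDMap:
    """Toy model of a one-way cluster flag gating OSD boot."""

    def __init__(self):
        self.recovery_deletes = False

    def set_recovery_deletes(self):
        self.recovery_deletes = True

    def unset_recovery_deletes(self):
        # One-way flag: refuse to clear it once it has been set.
        if self.recovery_deletes:
            raise ValueError("recovery_deletes cannot be unset once set")

    def may_boot(self, osd_supports_recovery_deletes):
        # Before the flag is set, any OSD may boot; afterwards only supporting ones.
        return osd_supports_recovery_deletes or not self.recovery_deletes


osdmap = ToyOSDMap()
boot_before = osdmap.may_boot(False)
osdmap.set_recovery_deletes()
boot_after = osdmap.may_boot(False)
```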
Josh Durgin
5741d3001a osd: add incompat superblock feature for deletes during recovery
On-disk missing sets would need to be regenerated if downgraded from
luminous to kraken.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:45 -04:00
Josh Durgin
2525609b30 include/ceph_features.h: add feature bit for handling deletes during recovery
The BLKIN feature bit was actually unused - it was a remnant from
earlier versions of the blkin work, but all the encoding is handled by
struct-level versioning in the version that merged.

Use bit 60 (unused in any prior version) so that recovery deletes
could potentially be backported.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:44 -04:00
Josh Durgin
3e29446a80 osd/PGBackend: include min_epoch in RecoveryDelete messages
This matches ordering with other recovery messages, and may speed up
processing a bit.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:44 -04:00
Josh Durgin
060fe828cc osd/PGLog: reset complete_to when appending lost_delete entries
Since lost_deletes queue recovery directly, and don't go through
activate_not_complete(), our complete_to iterator may still point at
log.end() (a list iterator pointing to .end() will still point to
.end() after a push_back()). Reset it to point before these new
lost_delete entries. This is needed now that lost_deletes are
performed during recovery, instead of inline when merging logs.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-07-19 02:47:44 -04:00
Jos Collin
ff05c2eea7 Merge pull request #16386 from mikulely/rgw-cleanup
rgw: drop unused find_replacement() and some function docs

Reviewed-by: Jos Collin <jcollin@redhat.com>
2017-07-19 05:39:24 +00:00