Return errors from flushing to the caller. Warn
if an error occurs during invalidation, but don't retry,
since the higher level handles these cases, namely:
* rollback (doing this with an image open is asking for trouble)
* shrink (doing this with writes in flight may create extra objects anyway)
* shutdown (qemu flushes before closing the device)
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Drop the keyring encode method, and binary encoder.
Don't just encode in plaintext because we assume we get the whole
bufferlist, and encoding something like list<KeyRing> would thus fail.
Fixes: #2435
Signed-off-by: Sage Weil <sage@inktank.com>
This will make it easier for sysvinit and upstart to coexist.
We will break existing users who have a separate .conf for each node and
didn't add host lines. We'll need to make note of that in the release
notes.
Fixes: #2404
Signed-off-by: Sage Weil <sage@inktank.com>
If a write error occurs, mark the BufferHead dirty again, and
pass the return value to the completion. This makes flushing
return the write error, if one occurs, since the flush callback
is passed as the write callback.
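A minimal sketch of the completion path described above. The types and
names here (SketchBufferHead, finish_write) are hypothetical stand-ins,
not the real ObjectCacher classes:

```cpp
#include <cassert>
#include <functional>

// Hypothetical, simplified buffer states; the real cache has more.
enum class BhState { Clean, Dirty, Tx };

struct SketchBufferHead {
  BhState state = BhState::Dirty;
};

// Called when the backing write finishes with return value r
// (negative on error). on_finish stands in for the write callback,
// which is the flush callback when the write was issued by a flush.
inline void finish_write(SketchBufferHead& bh, int r,
                         const std::function<void(int)>& on_finish) {
  assert(bh.state == BhState::Tx);  // only in-flight buffers complete
  if (r < 0) {
    bh.state = BhState::Dirty;      // write failed: data still needs writing
  } else {
    bh.state = BhState::Clean;
  }
  if (on_finish) {
    on_finish(r);                   // propagate the write's return value
  }
}
```

Because the same return value reaches the completion, a flush that
triggered the write sees the error instead of a silent success.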
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Previously the return value of a read operation was ignored. Now a
read error sets the error field, and changes the BufferHead to a new
error state. Error state BufferHeads are treated as misses so they can
be retried when requested by a user of the ObjectCacher. When _readx
is called again internally, they're treated as hits so the error can
be returned to the user.
The error value is ignored if the BufferHead is not in the error
state.
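The hit/miss rule above could be sketched roughly as follows. ReadBh,
counts_as_hit and the external_call flag are assumptions made for
illustration, not the actual ObjectCacher interface:

```cpp
#include <cassert>

// Hypothetical subset of buffer states relevant to reads.
enum class ReadState { Missing, Clean, Error };

struct ReadBh {
  ReadState state = ReadState::Missing;
  int error = 0;  // only meaningful while state == Error
};

// Record the outcome of a backing read (r < 0 means failure).
inline void finish_read(ReadBh& bh, int r) {
  if (r < 0) {
    bh.state = ReadState::Error;
    bh.error = r;
  } else {
    bh.state = ReadState::Clean;
    bh.error = 0;
  }
}

// A user-initiated read treats an errored buffer as a miss, so the
// read is retried. The internal retry treats it as a hit, so the
// stored error can be handed back.
inline bool counts_as_hit(const ReadBh& bh, bool external_call) {
  if (bh.state == ReadState::Clean) return true;
  if (bh.state == ReadState::Error) return !external_call;
  return false;
}

// The error field is ignored unless the buffer is in the error state.
inline int result_for(const ReadBh& bh) {
  return bh.state == ReadState::Error ? bh.error : 0;
}
```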
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Only take our absence from the monmap to mean that we were removed if we
were ever a member in the first place.
This fixes the bootstrap case:
- create temp_monmap with existing member(s) plus new guy
- ceph-mon --mkfs --monmap temp_monmap --fsid ...
- start ceph-mon
Basically, this is just using the seed monmap as a way to tell the new
daemon which ip:port to use. Specifying mon addr, public network, or
public addr would also work.
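The membership rule can be illustrated with a small sketch. The struct,
its fields and the function name are hypothetical and only mirror the
wording of this message:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical view of what a monitor remembers about itself.
struct MonSelf {
  std::string name;
  bool has_ever_joined;  // assumed flag: were we ever a quorum member?
};

// Absence from the monmap only means "removed" if we were once a member.
// A freshly bootstrapped monitor that is not yet in the map keeps going.
inline bool was_removed(const MonSelf& self,
                        const std::set<std::string>& monmap_members) {
  const bool in_map = monmap_members.count(self.name) > 0;
  return !in_map && self.has_ever_joined;
}
```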
Fixes: #2436
Signed-off-by: Sage Weil <sage@inktank.com>
_readx is called again after each bh is read by C_RetryRead. This
resulted in the read being counted many times for the internal
caller that was just checking whether it was done yet.
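One way to picture the fix, as a sketch under the assumption that the
read path can tell a user request from an internal re-check. The
external_call flag and the counter struct are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the cache's read statistics.
struct ReadCounters {
  std::uint64_t reads = 0;
};

// external_call is true for the user's original request and false for
// the internal re-check that runs after each bh finishes reading.
inline void note_readx(ReadCounters& counters, bool external_call) {
  if (external_call) {
    ++counters.reads;  // count the logical read exactly once
  }
}
```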
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
1) Adjust h2 tags so that section titles are visually differentiated
2) Add 1.5em of margin to all pre blocks and tables
Signed-off-by: Ross Turk <ross@inktank.com>
This might be sufficient to let monitors with different versions of the
monmap encoding interoperate, but I'm too lazy to fully test it right now.
Signed-off-by: Sage Weil <sage@inktank.com>
Instead of selecting an encode method in the caller, use a normal features
argument to encode() and branch there.
Leave behavior of all callers untouched. We continue to assume, for
example, that all monitors have the same features, and that
'ceph mon getmap' should return the fully-featured encoding.
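As a rough illustration of branching inside encode() on a features
argument. The feature bit, the struct and the string output are all
hypothetical; the real code encodes into a bufferlist:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical feature bit and "all features" mask for this sketch.
constexpr std::uint64_t FEATURE_NEW_MONMAP = 1ull << 0;
constexpr std::uint64_t FEATURES_ALL = ~0ull;

struct TinyMonMap {
  std::uint32_t epoch;

  // The caller passes the peer's features; the format choice lives here
  // rather than in a separate encode method picked by the caller.
  std::string encode(std::uint64_t features) const {
    if (features & FEATURE_NEW_MONMAP) {
      return "v2:" + std::to_string(epoch);
    }
    return "v1:" + std::to_string(epoch);  // legacy encoding
  }
};
```

A caller that wants the fully-featured encoding would pass FEATURES_ALL.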
Signed-off-by: Sage Weil <sage@inktank.com>
Throttling is intended to stop the caller from submitting too many
requests, not to block requests that are being resent internally.
Exempting internal resends from the throttle prevents a deadlock when
handling an osdmap: previously handle_osd_map could block when resending
linger ops due to the throttling. This would stop the messenger's
dispatch thread from delivering any subsequent messages, so the throttle
budget would never be replenished.
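A minimal sketch of the idea, with hypothetical names (OpBudget,
submit_from_caller, resend_internally). It uses a non-blocking try_take
where the real throttle would block:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical op budget; a real throttle would block instead of refusing.
class OpBudget {
 public:
  explicit OpBudget(std::uint64_t max) : max_(max), used_(0) {}

  bool try_take(std::uint64_t n) {
    if (n > max_ - used_) return false;  // budget exhausted
    used_ += n;
    return true;
  }

  void put(std::uint64_t n) {
    assert(n <= used_);
    used_ -= n;
  }

  std::uint64_t used() const { return used_; }

 private:
  std::uint64_t max_;
  std::uint64_t used_;
};

struct SketchOp {
  std::uint64_t cost;
  int sends;
};

// New submissions from the caller go through the throttle.
inline bool submit_from_caller(OpBudget& budget, SketchOp& op) {
  if (!budget.try_take(op.cost)) return false;
  ++op.sends;
  return true;
}

// Internal resends (e.g. after a new osdmap) never touch the throttle,
// so the dispatch thread cannot stall waiting for budget.
inline void resend_internally(SketchOp& op) {
  ++op.sends;
}
```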
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
- Feed our keyring into the auth methods.
- Do not fail to build a ticket for type MON when we don't have a cap; it
won't be in the auth database. Also, we don't have caps on the monitors
that are enforced between each other.
Signed-off-by: Sage Weil <sage@newdream.net>
- Keep the mon. key in a separate keyring file, "keyring", in the mon
data dir.
- During init, if we don't find that file, copy the key from the keyserver
database.
- During mkfs, put the mon. key in that file, and remove it from the seed
file that primes the auth database.
This will allow admins to change the mon. key without bringing the cluster
online and doing something wonky.
Signed-off-by: Sage Weil <sage@newdream.net>
Pass the size of the weight vector into crush_do_rule() to ensure that we
don't access values past the end. This can happen if the caller misbehaves
and passes a weight vector that is smaller than max_devices.
Currently the monitor tries to prevent that from happening, but this will
gracefully tolerate previous bad osdmaps that got into this state. It's
also a bit more defensive.
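The bounds check can be sketched like this. device_is_out is a
hypothetical, simplified helper: it omits the hash-based handling of
partial weights and only shows how the vector size guards the lookup:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Treat any device id outside the weight vector as "out" instead of
// reading past the end of the array.
inline bool device_is_out(const std::uint32_t* weight, int weight_max,
                          int item) {
  if (item < 0 || item >= weight_max) {
    return true;             // no weight entry: never select this device
  }
  return weight[item] == 0;  // zero weight means the device is out
}
```

Passing weight_max alongside the pointer is what lets the callee
tolerate a vector shorter than max_devices.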
Signed-off-by: Sage Weil <sage@inktank.com>
It is possible that the crush map contains device ids that do not exist as
osds. Filter them out of the CRUSH result.
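A sketch of the filtering step, assuming the set of existing osd ids is
available as a std::set. The function name and signature are
hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Drop device ids from a raw CRUSH result that do not exist as osds,
// preserving the order of the ids that remain.
inline std::vector<int> filter_existing(const std::vector<int>& raw,
                                        const std::set<int>& existing_osds) {
  std::vector<int> out;
  std::copy_if(raw.begin(), raw.end(), std::back_inserter(out),
               [&](int id) { return existing_osds.count(id) > 0; });
  return out;
}
```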
Signed-off-by: Sage Weil <sage@newdream.net>
This lets us pass a keyring to the auth methods as a source for keys for
doing the authentication handshaking. Normally we pass a RotatingKeyRing
or the KeyServer, but for mon->mon we don't use a service key. This will
let us use a simple KeyRing for that.
Signed-off-by: Sage Weil <sage@newdream.net>