The final placement seed needs to factor in the pool, but the pool
can't be fed into stable_mod or you get weird results (for example,
1.ff and 1.adff won't necessarily map to the same thing because of the
stable_mod). Add the pool to the stable_mod result instead. The seed
itself doesn't need to be bounded; it's just an input for CRUSH. All
that matters is that there are a limited number of such inputs for a
given pool.
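
A minimal sketch of the seed calculation described above, not the
actual patch. It assumes that 1.ff and 1.adff denote pool 1 with raw
seeds 0xff and 0xadff, and that stable_mod has the usual "fold into
[0, b) using a power-of-two mask" shape. placement_seed() and its
parameter names are hypothetical.

```c
#include <stdint.h>

/*
 * Map x into [0, b) so that most values keep their slot as b grows.
 * bmask is assumed to be the smallest (2^n - 1) that is >= b - 1.
 */
static int stable_mod(int x, int b, int bmask)
{
	if ((x & bmask) < b)
		return x & bmask;
	return x & (bmask >> 1);
}

/* Hypothetical helper: fold the raw seed first, then add the pool. */
static uint32_t placement_seed(uint32_t ps, uint32_t pg_num,
			       uint32_t pg_num_mask, uint32_t pool)
{
	/* The pool is added outside stable_mod, so it never disturbs the fold. */
	return (uint32_t)stable_mod((int)ps, (int)pg_num, (int)pg_num_mask) + pool;
}
```

With pg_num = 256 and mask 255, both 0xff and 0xadff fold to 0xff, so
both get the same seed once pool 1 is added. The result can exceed
pg_num, which is fine because it is only an input for CRUSH.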
Needs to factor in frag_is_leftmost to account for . and .., just
like the fi->offset calculation in readdir_prepopulate. Fixes the
problem where an ls on a large dir returns duplicate entries.
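
A rough sketch of the leftmost-frag adjustment, under the assumption
that a frag keeps its value in the low 24 bits and that "." and ".."
take offsets 0 and 1 in the leftmost frag only. dentry_offset() is a
hypothetical helper, not a function from the tree.

```c
#include <stdint.h>

/* Assumed frag encoding: low 24 bits hold the value, high 8 the bit count. */
static int frag_is_leftmost(uint32_t frag)
{
	return (frag & 0xffffffu) == 0;
}

/* Hypothetical helper: readdir offset of the index-th dentry in a frag. */
static uint32_t dentry_offset(uint32_t frag, uint32_t index)
{
	/* "." and ".." occupy offsets 0 and 1 in the leftmost frag only. */
	return index + (frag_is_leftmost(frag) ? 2u : 0u);
}
```

If two code paths disagree on this adjustment, the same dentry can be
reported at two different offsets, which is one way an ls could end up
with duplicates.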
This is mainly just because /bin/ls will use the size, or blocks,
or blksize to decide how big a buffer to allocate for getdents,
and the default of 4MB is unreasonably big. 64k seems like an
okay number, I guess.
We would get incorrect results if we calculated the same mapping
twice in a row in certain cases. Der. Also, the permutation
calculation was basically just wrong.
The dentry dir offset calculation wasn't taking into account the
possibility of multiple readdir requests, which in turn meant bad results
for readdir-from-dcache.
Since doing this on the client side was a mess, the MDS includes a dentry
offset for each readdir dentry within the dirfrag. This value is stored
in di->offset (with adjustment in leftmost frag for . and ..), and that's
the value that's passed back via filldir.
The previous use of I_READDIR vs I_COMPLETE was flawed, mainly because
the state was maintained on a per-inode basis, but readdir proceeds on a
per-file basis.
Instead of flags, maintain a counter in the inode that is incremented each
time a dentry is released. When readdir starts, note the counter, and if
it is the same when readdir completes, AND we did not do any forward
seeks on the file handle, AND prepopulate succeeded on each hunk, then we
can set I_COMPLETE.
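
The counter scheme above can be sketched roughly as follows. All
struct and function names here are hypothetical stand-ins; only the
rule itself (same counter, no forward seeks, every prepopulate
succeeded) comes from the description.

```c
#include <stdbool.h>
#include <stdint.h>

struct inode_state {            /* hypothetical stand-in for the inode */
	uint64_t release_count; /* incremented each time a dentry is released */
	bool complete;          /* stand-in for I_COMPLETE */
};

struct file_state {             /* hypothetical per-file readdir state */
	uint64_t start_count;   /* release_count noted when readdir started */
	bool seeked_forward;    /* a forward seek happened on this handle */
	bool prepopulate_failed;/* prepopulate failed on at least one hunk */
};

static void dentry_released(struct inode_state *in)
{
	in->release_count++;
}

static void readdir_start(const struct inode_state *in, struct file_state *f)
{
	f->start_count = in->release_count;
	f->seeked_forward = false;
	f->prepopulate_failed = false;
}

static void readdir_finish(struct inode_state *in, const struct file_state *f)
{
	/* Only trust the cache if nothing was dropped or skipped meanwhile. */
	if (in->release_count == f->start_count &&
	    !f->seeked_forward && !f->prepopulate_failed)
		in->complete = true;
}
```

Because the snapshot lives in the per-file state, two concurrent
readdirs on the same inode no longer trample each other's progress.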
The OSD will implicitly set the bits based on your OSDOps or class method
calls. The client may still find it useful to specify these explicitly
for its own informational purposes.
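
As an illustration only, deriving the bits from the ops might look
like the sketch below. The flag and op names are hypothetical, and
class method calls are left out because their mode is not known from
the op alone.

```c
#include <stddef.h>

/* Hypothetical flag and op names, for illustration only. */
enum { FLAG_READ = 1, FLAG_WRITE = 2 };
enum op_kind { OP_READ, OP_STAT, OP_WRITE, OP_DELETE };

static int op_is_modify(enum op_kind k)
{
	return k == OP_WRITE || k == OP_DELETE;
}

/* Start from whatever the client specified, then OR in what the ops imply. */
static int flags_for_ops(const enum op_kind *ops, size_t n, int client_flags)
{
	int flags = client_flags;
	size_t i;

	for (i = 0; i < n; i++)
		flags |= op_is_modify(ops[i]) ? FLAG_WRITE : FLAG_READ;
	return flags;
}
```

A request mixing a read and a write ends up with both bits set even if
the client specified neither.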
Make sure the MOSDOpReply has bits set based on the _actual_ op performed.
Note that as things stand, this will confuse the Objecter, which relies
on these bits to choose the read or modify reply path and doesn't know
a priori what mode a method is.