From b95d4d3d0155525427574a8d7fa56f5b423b423e Mon Sep 17 00:00:00 2001 From: David Sterba Date: Wed, 5 Apr 2023 00:56:29 +0200 Subject: [PATCH] btrfs-progs: docs: add Development notes Copied from wiki. Signed-off-by: David Sterba --- Documentation/dev/Development-notes.rst | 486 ++++++++++++++++++++++++ Documentation/index.rst | 1 + 2 files changed, 487 insertions(+) create mode 100644 Documentation/dev/Development-notes.rst diff --git a/Documentation/dev/Development-notes.rst b/Documentation/dev/Development-notes.rst new file mode 100644 index 00000000..4d4f2f4e --- /dev/null +++ b/Documentation/dev/Development-notes.rst @@ -0,0 +1,486 @@ +Development notes +================= + +Collection of various notes about development practices, how-to's or +checklists. + +Adding a new ioctl, extending an existing one +--------------------------------------------- + +- add code to `strace `__ so the ioctl calls + are parsed into a human readable form. Most of the ioctls are already + `implemented `__ and + can be used a reference. + +Tracepoints +----------- + +The tracepoint message format should be compact and consistent, so please stick +to the following format: + +- *key=value* no spaces around *=* +- separated by spaces, not commas +- named values: print value and string, like "%llu(%s)", no space between, + string in parens +- avoid abbreviating key values too much, (e.g. use 'size' not 'sz') +- hexadecimal values are always preceded by *0x* (use "0x%llx") +- use *struct btrfs_inode* for inode types, not plain *struct inode* +- inode number type is *u64*, use *btrfs_ino* if needed +- key can be printed as *[%llu,%u,%llu]* +- enum types need to be exported as *TRACE_DEFINE_ENUM* + +**Example:** + +event: *btrfs__chunk* + +string: ``"root=%llu(%s) offset=%llu size=%llu num_stripes=%d sub_stripes=%d type=%s"`` + + +Error messages, verbosity +------------------------- + +- use *btrfs_\** helpers (btrfs_err, btrfs_info, ...), they print a filesystem + identification like ``BTRFS info (device sdb): ...`` +- first letter in the string is lowercase +- message contents + + - be descriptive + - keep the text length reasonable (fits one line without wrapping) + - no typos + - print values that refer to what happened (inode number, subvolume + id, path, offset, ...) + - print error value from the last call + - no *"\\n"* at the end of the string + - no *".*'' at the end of text + - un-indent the string so it fits under 80 columns + - don't split long strings, overflow 80 is ok in this case (we want + to be able to search for the error messages in the sources easily) + +- value representation + + - decimal: offsets, length, ordinary numbers + - hexadecimal: checksums + - hexadecimal + string: bitmasks (e.g. raid profiles, flags) + - intervals of integers: + + - closed interval (end values inclusive): [0, 4096] + - half-open (right value excluded): [0, 4096) + - half-open (left value excluded): (0, 4096] -- that one may look + strange but is used in several places + +Message level +------------- + +- btrfs_err -- such messages have high visibility so use them for serious + problems that need user attention +- btrfs_warn -- conditions that are not too serious but can point to potential + problems, the system should be still in a good state +- btrfs_info -- use for informative messages that are useful to see what's + happening in the filesystem or might help debugging problems in the future + and are worth keeping in the logs + +Error handling and transaction abort +------------------------------------ + +Please keep all transaction abort exactly at the place where they happen and do +not merge them to one. This pattern should be used everywhere and is important +when debugging because we can pinpoint the line in the code from the syslog +message and do not have to guess which way it got to the merged call. + +Error handling and return values +-------------------------------- + +Functions are supposed to handle all errors of the callees and clean up the +local context before returning. If a function does not need to pass errors to +the caller it can be switched to return *void*. Before doing so make sure that: + +- the function does not call any BUG/BUG_ON +- all callees properly handle errors and do not call BUG/BUG_ON in place of + error handling +- the whole call chain starting from the function satisfies the above + +Handling unexpected conditions +------------------------------ + +This is different than error handling. An unexpected condition happens when the +code invariants/assumptions do not hold and there's no way to recover from the +situation. This means that returning an error to the caller can't be done and +continuing would only propagate the logic error further. The reasons for that +bug can be two fold: internal (a genuine bug) or external (e.g. memory bitflip, +memory corrupted by other subsystem). In this case it is allowed to use the +nuclear option and do BUG_ON, that is otherwise highly discouraged. + +There are several ways how to react to the unexpected conditions: + +- ASSERT -- conditionally compiled in and crashes when the condition is false, + this is supposed to catch 'must never happen' at the time of development, + code must not continue +- WARN_ON -- light check that is visible in the log and allows the code to + continue but the reasons must be investigated +- BUG_ON -- last resort, checks condition that 'must never happen' and + continuing would cause more harm than the instant crash; code should always + try to avoid using it, but there are cases when sanity and invariant checks + are done in advance + +Error injection using eBPF +-------------------------- + +Functions can be annotated to enable error injection using the eBPF scripts. +See e.g. ``disk-io.c:open_ctree``. For btrfs-specific injection, the annotation +is ALLOW_ERROR_INJECTION, but beware that this only overrides the return value +and this can leak memory or other resources. For error injection to generic +helpers (e.g. memory allocation), you can use something like +``bcc/tools/inject.py kmalloc btrfs_alloc_device() -P 0.001`` + +Resources: + +- eBPF +- BCC tools + +Warnings and issues found by static checkers and similar tools +-------------------------------------------------------------- + +There are tools to automatically identify known issues in code and report them +as problems to be fixed, but not all such reports are valid or relevant in the +context of the code base. The fix should really fix the code, not just the +tool's warning. Such patches will be rejected with explanation first time and +ignored when sent repeatedly. Patches fixing real problems with a good +explanation are welcome. If you're not sure about sending such patch, please +ask the https://kernelnewbies.org/KernelJanitors for help. + +Do not blindly report issues caught by: + +- checkpatch.pl -- the script is good for catching some coding style but this + whole wiki page exists to be explicit what we want, not necessarily what + checkpatch wants +- clang static analyzer -- lots of the reports are not real problems and may + depend on a condition that's not recognized by the checker +- gcc -Wunused -- any of the -Wunused-\* options can report a valid issue but + it must be viewed in wider context and not just removed to get rid of the + warning +- codespell -- fixing typos is fine but should be done in batches and over + whole code base + +Hints: + +- if you find an issue, look in the whole code base if there are more instances + same or following a similar pattern +- look into git history of the changed code, why it got there and when, it may + help to understand if it's a bug or e.g. a stale code + +Coding style preferences +------------------------ + +Before applying recommendations from this sections, please make sure you're +familiar with the `kernel coding style guide +`__. + +The purpose of coding style is to maintain unified and consistent look & feel +of the patches and code, keeping distractions to minimum which decreases +cognitive load and allows focus on the important things. Coding style is not +only where to put white space or curly brackets but also coding patterns with +meaning that is established and understood in the developer group. The code in +linux kernel is maintained for a long period of time and maintainability is of +crucial importance. This means it does take time to write good code, with +attention to detail. Once written the code could stay unchanged for years but +will be read many times. `Read more +`__. + +Patches +^^^^^^^ + +- for patch subject use "btrfs:" followed by a lowercase +- read the patch again and fix all typos and grammar +- size units should use short form K/M/G/T or IEC form KiB/MiB/GiB +- don't write references to parameters to subject (like removing @pointer) +- do not end subject with a dot '.' +- parts of btrfs that could have a subject prefix to point to a specific subsystem + + - scrub, tests, integrity-checker, tree-checker, discard, locking, sysfs, + raid56, qgroup, compression, send, ioctl + +- additional information + + - if there's a stack trace relevant for the patch, add it ther (lockdep, + crash, warning) + - steps to reproduce a bug (that will also get turned to a proper fstests + case) + - sample output before/after if it could have impact on userspace + - *pahole* output if structure is being reorganized and optimized + +Function declarations +^^^^^^^^^^^^^^^^^^^^^ + +- avoid prototypes for static functions, order them in new code in a way that + does not need it + + - but don't move static functions just to get rid of the prototype + +- exported functions have btrfs\_ prefix +- do not use functions with double underscore, there's only a few valid uses of + that, namely when *\__function* is doing the core of the work with looser + checks, no locks or more parameters than *function* +- function type and name are on the same line, if this is too long, the + parameters continue on the next line (indented) +- 'static inline' functions should be small (measured by their resulting binary + size) +- conditional definitions should follow the style below, where the full + function body is in .c + +.. code-block:: c + + #ifdef CONFIG_BTRFS_DEBUG + void btrfs_assert_everything_is_fine(void *ptr); + #else + void btrfs_assert_everything_is_fine(void *ptr) { } + #endif + +Headers +^^^^^^^ + +- leave one newline before #endif in headers + +Comments +^^^^^^^^ + +- function comment goes to the .c file, not the header + + - kdoc style is recommended but the exact syntax is not mandated and + we're using only subset of the formatting + - the first line (mandatory): contains a brief description of what + the function does and should provide a summary information + + - do write in the imperative style "Iterate all pages and clear + some bits" + - don't write "This function is a helper to ...", "This is used + to ..." + + - parameter description (optional): + + - each line describes the parameter + - the list needs to be in the same order as for the function + - the list needs to be complete + - trivial parameters don't need to be explained, e.g. fs_info is + clear so the description could be 'the filesystem' + - context of the parameters matters a lot in some cases and + cannot be inferred from the name, then it should be documented + +.. code-block:: c + + /* +  * Look for blocks in the given offset. +  *  +  * @fs_info:    trivial parameters should be in the list but with some short description +  * @offset:     describe the context of the argument, e.g. offset to page or inode ... +  * +  * Long description comes here if necessary. +  * +  * Return value semantics if it's not obvious +  */ + +- comment new enum/define values, brief description or pointers to the code + that uses them +- comment new data structures, their purpose and context of use +- do not put struct member comments on the same line, put it on the line + before and do not trim/abbreviate the text +- comment text that fits on one line can use the */\* text \*/* format, slight + overflow of 80 chars is OK + +Misc +^^^^ + +- fix spelling, grammar and formatting of comments that get moved or changed +- fix coding style in code that's only moved +- one newline between functions + +Locking +^^^^^^^ + +- do not use ``spin_is_locked`` but ``lockdep_assert_held`` +- do not use ``assert_spin_locked`` without reading it's semantics (it does + not check if the caller hold the lock) +- use ``lockdep_assert_held`` and its friends for lock assertions +- add lock assertions to functions called deep in the call chain + +Code +^^^^ + +- default function return value is *int ret*, temporary return values should + be named like *ret2* etc +- structure initializers should use *{ 0 }* +- do not use *short* type if possible, if it fits to char/u8 use it instead, + or plain int + +Output +^^^^^^ + +- when dumping a lot of data after an error, consider what will remain visible + last + + - in case of ``btrfs_print_leaf``, print the specific error message after + that + +Expressions, operators +^^^^^^^^^^^^^^^^^^^^^^ + +- spaces around binary operators, no spaces for unary operators +- extra braces around expressions that might be harder to understand wrt + precedence are fine (logical and/or, shifts with other operations) + + - *a \* b + c*, *(a << b) + c*, *(a % b) + c* + +- 64bit division is not allowed, either avoid it completely, or use bit + shifts or use div_u64 helpers +- do not use chained assignments: no *a = b = c;* + +Variable declarations, parameters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- declaration block in a function should do only simple initializations + (pointers, constants); nothing that would require error handling or has + non-trivial side effects +- use *const* extensively +- add temporary variable to store a value if it's used multiple times in the + function, or if reading the value needs to chase a long pointer chain + +Kernel config options +--------------------- + +Compile-time config options for kernel that can help debugging, testing. They +usually take a hit on performance or resources (memory) so they should be +selected wisely. The options in **bold** should be safe to use by default for +debugging builds. + +Please refer to the option documentation for further details. + +- devices for testing + + - **CONFIG_BLK_DEV_LOOP** - enable loop device + - for fstests: **DM_FLAKEY**, **CONFIG_FAIL_MAKE_REQUEST** + - **CONFIG_SCSI_DEBUG** - fake scsi block device + +- memory + + - **CONFIG_SLUB_DEBUG** - boot with slub_debug + - CONFIG_DEBUG_PAGEALLOC + CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT (on + newer kernels) + - CONFIG_SCHED_STACK_END_CHECK + - CONFIG_PAGE_POISONING + - CONFIG_HAVE_DEBUG_KMEMLEAK + - CONFIG_FAILSLAB -- fault injection to kmalloc + - CONFIG_DEBUG_LIST + - CONFIG_BUG_ON_DATA_CORRUPTION + +- btrfs + + - **CONFIG_BTRFS_DEBUG** + - **CONFIG_BTRFS_ASSERT** + - **CONFIG_BTRFS_FS_RUN_SANITY_TESTS** -- basic tests on module load + - **CONFIG_BTRFS_FS_CHECK_INTEGRITY** -- block integrity checker + enabled by mount options + - **CONFIG_BTRFS_FS_REF_VERIFY** -- additional checks for block + references + +- locking + + - CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_MUTEXES + - CONFIG_DEBUG_LOCK_ALLOC + - CONFIG_PROVE_LOCKING, CONFIG_LOCKDEP + - CONFIG_LOCK_STAT + - CONFIG_PROVE_RCU + - CONFIG_DEBUG_ATOMIC_SLEEP + +- sanity checks + + - CONFIG_DEBUG_STACK_USAGE, CONFIG_HAVE_DEBUG_STACKOVERFLOW, + CONFIG_DEBUG_STACKOVERFLOW + - CONFIG_STACKTRACE + - CONFIG_KASAN -- address sanitizer + - CONFIG_UBSAN -- undefined behaviour sanitizer + - CONFIG_KCSAN -- concurrency checker + +- verbose reporting + + - CONFIG_DEBUG_BUGVERBOSE + +- tracing + + - CONFIG_TRACING etc + +BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low! +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Not a bug. Increase the config value of LOCKDEP_CHAINS_BITS, default is +16, 18 tends to work, increase if needed. + +fstests setup +------------- + +The fstests suite has very few "hard" requirements and will succeed without +actually running many tests. In order to ensure full test coverage, your test +environment should provide the settings from the following sections. Please +note that newly added tests silently add new dependencies, so you should always +review results after an update. + + +Kernel config options for complete test coverage +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- ``CONFIG_FAULT_INJECTION=y`` +- ``CONFIG_FAULT_INJECTION_DEBUG_FS=y`` +- ``CONFIG_FAIL_MAKE_REQUEST=y`` +- ``CONFIG_DM_FLAKEY=m`` or ``y`` +- ``CONFIG_DM_THIN_PROVISIONING=m`` or ``y`` +- ``CONFIG_DM_SNAPSHOT=m`` or ``y`` +- ``CONFIG_DM_DELAY=m`` or ``y`` +- ``CONFIG_DM_ERROR=m`` or ``y`` +- ``CONFIG_DM_LOG_WRITES=m`` or ``y`` +- ``CONFIG_DM_DUST=m`` or ``y`` +- ``CONFIG_BLK_DEV_LOOP=m`` or ``y`` +- ``CONFIG_EXT4_FS=m`` or ``y`` +- ``CONFIG_SCSI_DEBUG=m`` +- ``CONFIG_BLK_DEV_ZONED=y`` for zoned mode test coverage + + +Kernel config options for better bug reports +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +See the list in the section above for more options. + + +User space utilities and development library dependencies +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- fio +- dmsetup (device-mapper) +- lvm +- xfsprogs >= 4.3.1 (``xfs_io -c reflink`` is required) +- btrfsprogs +- dbench +- openssl +- libacl +- libattr +- libaio +- libuuid +- libcap-progs +- duperemove + +Note: This list may be incomplete. + +Storage environment +^^^^^^^^^^^^^^^^^^^ + +- At least 4 identically sized partitions/disks/virtual disks, specified using + ``$SCRATCH_DEV_POOL``, some tests may require 6 such partitions +- some tests need at least 10G of free space, as determined by ``df``, i.e. + the size of the device may need to be larger +- some tests require ``$LOGWRITES_DEV``, yet another separate block device, + for power fail testing +- for testing trim and discard, the devices must be capable of that (physical + or virtual) + +Other requirements +^^^^^^^^^^^^^^^^^^ + +- An ``fsgqa`` user and group must exist. +- An ``123456-fsgqa`` user and group must exist. diff --git a/Documentation/index.rst b/Documentation/index.rst index c91dc776..2e1c5a73 100644 --- a/Documentation/index.rst +++ b/Documentation/index.rst @@ -57,6 +57,7 @@ Welcome to BTRFS documentation! :maxdepth: 1 :caption: Developer documentation + dev/Development-notes dev/Experimental dev/dev-btrees dev/dev-btrfs-design