btrfs-progs: docs: enhance the error handling guidelines

Currently we only have a very brief explanation on the unexpected error
handling (only ASSERT()/WARN_ON()/BUG_ON()), and no further
recommendation on the proper usage of them.

This patch would improve the guideline by:

- Add btrfs_abort_transaction() usage
  Which is the recommended way when possible.

- More detailed explanation on the usage of ASSERT()
  Which is only a fail-fast option mostly designed for developers, thus
  is only recommended to rule out some invalid function usage.

- More detailed explanation on the usage of WARN_ON()
  Mostly for call sites which need a call trace strongly, and is not
  applicable for a btrfs_abort_transaction() call.

- Completely discourage the usage of BUG_ON()

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit is contained in:
Qu Wenruo 2023-08-11 14:00:28 +08:00 committed by David Sterba
parent 85c841f961
commit 6abbad4d1b

View File

@ -112,15 +112,53 @@ nuclear option and do BUG_ON, that is otherwise highly discouraged.
There are several ways how to react to the unexpected conditions:
- ASSERT -- conditionally compiled in and crashes when the condition is false,
this is supposed to catch 'must never happen' at the time of development,
code must not continue
- WARN_ON -- light check that is visible in the log and allows the code to
continue but the reasons must be investigated
- BUG_ON -- last resort, checks condition that 'must never happen' and
continuing would cause more harm than the instant crash; code should always
try to avoid using it, but there are cases when sanity and invariant checks
are done in advance
- btrfs_abort_transaction()
The recommended way if and only if we can not recover from the situation and
have a transaction handle.
This would cause the filesystem to be flipped read-only to prevent further
corruption.
Additionally call trace would be dumpped for the first btrfs_abort_transaction()
call site.
- ASSERT()
Conditionally compiled in and crashes when the condition is false.
This should only be utilized for debugging purposes, acts as a fail-fast
option for developers, thus should not be utilized for error handling.
It's recommended only for very basic (thus sometimes unnecessary) requirements.
Such usage should be easy to locate, have no complex call chain.
E.g. to rule out invalid function parameter combinations.
Should not be utilized on any data/metadata reads from disks, as they can be
invalid. For sanity check examples of on-disk metadata, please refer to
`Tree checker`.
- WARN_ON
Unconditional and noisy checks, but still allow the code to continue.
Should only be utilized if a call trace is critical for debugging.
Not recommended if:
* The call site is unique or can be easily located
In that case, an error message is recommended.
* The call site would eventually lead to a btrfs_abort_transaction() call
btrfs_abort_transaction() call would dump the stack anyway.
If the call trace is critical, it's recommended to move the
btrfs_abort_transaction() call closer to the place where the error happens.
- BUG_ON
Should not be utilized, and is incrementally removed or replaced in the code.
Error injection using eBPF
--------------------------