#
9fd5a48a |
|
16-Apr-2024 |
Nathan Chancellor <nathan@kernel.org> |
bcachefs: Fix format specifier in validate_bset_keys()

When building for 32-bit platforms, for which size_t is 'unsigned int', there is a warning from a format string in validate_bset_keys():

  fs/bcachefs/btree_io.c: In function 'validate_bset_keys':
  fs/bcachefs/btree_io.c:891:34: error: format '%lu' expects argument of type 'long unsigned int', but argument 12 has type 'unsigned int' [-Werror=format=]
    891 |                          "bad k->u64s %u (min %u max %lu)", k->u64s,
        |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  fs/bcachefs/btree_io.c:603:32: note: in definition of macro 'btree_err'
    603 |                                msg, ##__VA_ARGS__);                    \
        |                                ^~~
  fs/bcachefs/btree_io.c:887:21: note: in expansion of macro 'btree_err_on'
    887 |                 if (btree_err_on(!bkeyp_u64s_valid(&b->format, k),
        |                     ^~~~~~~~~~~~
  fs/bcachefs/btree_io.c:891:64: note: format string is defined here
    891 |                          "bad k->u64s %u (min %u max %lu)", k->u64s,
        |                                                      ~~^
        |                                                        |
        |                                                        long unsigned int
        |                                                      %u
  cc1: all warnings being treated as errors

BKEY_U64s is size_t so the entire expression is promoted to size_t. Use the '%zu' specifier so that there is no warning regardless of the width of size_t.

Fixes: 031ad9e7dbd1 ("bcachefs: Check for packed bkeys that are too big")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404130747.wH6Dd23p-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202404131536.HdAMBOVc-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
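The portability point in this message can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: `format_u64s_error()` is a hypothetical helper standing in for the `btree_err_on()` call, and only the message text is taken from the commit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not kernel code): formats the same message as the
 * fixed call site. 'max' is a size_t, so it is printed with %zu, which
 * matches size_t whether it is 32 or 64 bits wide. Using %lu here would
 * only be correct where size_t happens to be 'unsigned long'.
 */
static int format_u64s_error(char *buf, size_t len,
                             unsigned u64s, unsigned min, size_t max)
{
    return snprintf(buf, len, "bad k->u64s %u (min %u max %zu)",
                    u64s, min, max);
}
```

Calling `format_u64s_error(buf, sizeof(buf), 1, 2, 255)` fills `buf` with `bad k->u64s 1 (min 2 max 255)` on both 32-bit and 64-bit targets without a format warning.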
|
#
ba8ed36e |
|
11-Apr-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: don't queue btree nodes for rewrites during scan many nodes found during scan will be old nodes, overwritten by newer nodes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
031ad9e7 |
|
11-Apr-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Check for packed bkeys that are too big add missing validation; fixes assertion pop in bkey unpack Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
55936afe |
|
15-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Flag btrees with missing data We need this to know when we should attempt to reconstruct the snapshots btree Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e2a316b3 |
|
01-Apr-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: BCH_WATERMARK_interior_updates This adds a new watermark, higher priority than BCH_WATERMARK_reclaim, for interior btree updates. We've seen a deadlock where journal replay triggers a ton of btree node merges, and these use up all available open buckets and then interior updates get stuck. One cause of this is that we're currently lacking btree node merging on write buffer btrees - that needs to be fixed as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
812a9297 |
|
26-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix btree node keys accounting in topology repair path When dropping keys now outside a node because we're changing the node min/max, we need to redo the node's accounting as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
79032b07 |
|
23-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improved topology repair checks Consolidate bch2_gc_check_topology() and btree_node_interior_verify(), and replace them with an improved version, bch2_btree_node_check_topology(). This checks that children of an interior node correctly span the full range of the parent node with no overlaps. Also, ensure that topology repairs at runtime are always a fatal error; in particular, this adds a check in btree_iter_down() - if we don't find a key while walking down the btree that's indicative of a topology error and should be flagged as such, not a null ptr deref. Some checks in btree_update_interior.c remain BUG_ON()s, because we already checked the node for topology errors when starting the update, and the assertions indicate that we _just_ corrupted the btree node - i.e. the problem can't be existing on-disk corruption; they indicate an actual algorithmic bug. In the future, we'll be annotating the fsck errors list with which recovery pass corrects them; the open coded "run explicit recovery pass or fatal error" in bch2_btree_node_check_topology() will in the future be done for every fsck_err() call. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
3ed94062 |
|
17-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improve bch2_fatal_error() error messages should always include __func__ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a5860368 |
|
16-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Don't corrupt journal keys gap buffer when dropping alloc info Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
52946d82 |
|
06-Feb-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Kill more -EIO error codes This converts -EIOs related to btree node errors to private error codes, which will help with some ongoing debugging by giving us better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
cb6fc943 |
|
01-Feb-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: kill kvpmalloc() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
94817db9 |
|
08-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Correctly validate k->u64s in btree node read path validate_bset_keys() never properly validated k->u64s; it checked if it was 0, but not if it was smaller than keys for the given packed format; this fixes that small oversight. This patch was backported, so it's adding quite a few error enums so that they don't get renumbered and we don't have confusing gaps. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
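The check described here has two parts: the size must be nonzero, and it must be at least the key size implied by the packed format. A minimal sketch follows, assuming a hypothetical `struct packed_key_hdr` and a caller-supplied `format_key_u64s`; the kernel's real types and helper differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a packed key header: only the size field. */
struct packed_key_hdr {
    uint8_t u64s;   /* total key size, in units of 64-bit words */
};

/*
 * Returns true if the key size is plausible: nonzero (the check that
 * already existed, per the commit message) and not smaller than the key
 * portion required by the packed format (the check that was missing).
 * 'format_key_u64s' is an assumed parameter for this sketch.
 */
static bool key_u64s_valid(const struct packed_key_hdr *k,
                           unsigned format_key_u64s)
{
    return k->u64s != 0 && k->u64s >= format_key_u64s;
}
```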
|
#
ec4edd7b |
|
16-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Prep work for variable size btree node buffers bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we now have significant memory overhead in mostly empty btree roots. And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analogous to XFS's AGIs. Thus, we need to start allocating smaller btree node buffers when we can. This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
4819b66e |
|
05-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: improve checksum error messages new helpers: - bch2_csum_to_text() - bch2_csum_err_msg() standardize our checksum error messages a bit, and print out the checksums a bit more nicely. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2d02bfb0 |
|
05-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: improve validate_bset_keys() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e9bc59f9 |
|
03-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: add missing bch2_latency_acct() call Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c72e4d7a |
|
03-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: add time_stats for btree_node_read_done() Seeing weird latency issues in the btree node read path - add one for bch2_btree_node_read_done(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0beebd92 |
|
21-Dec-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bkey_for_each_ptr() now declares loop iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
53b67d8d |
|
23-Dec-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: better error message in btree_node_write_work() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
483dea44 |
|
05-Dec-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improve error message when finding wrong btree node single_device.merge_torture_flakey is, very rarely, finding a btree node that doesn't match the key that points to it: this patch improves the error message to print out more fields from the btree node header, so that we can see what else does or does not match the key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a564c9fa |
|
02-Dec-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Include btree_trans in more tracepoints This gives us more context information - e.g. which codepath is invoking btree node reads. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
cb52d23e |
|
11-Nov-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Rename BTREE_INSERT flags BTREE_INSERT flags are actually transaction commit flags - rename them for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0117591e |
|
30-Nov-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Don't drop journal pins in exit path There's no need to drop journal pins in our exit paths - the code was trying to have everything cleaned up on any shutdown, but better to just tweak the assertions a bit. This fixes a bug where calling into journal reclaim in the exit path would cause a null ptr deref. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
d4e3b928 |
|
17-Nov-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
closures: CLOSURE_CALLBACK() to fix type punning Control flow integrity is now checking that type signatures match on indirect function calls. That breaks closures, which embed a work_struct in a closure in such a way that a closure_fn may also be used as a workqueue fn by the underlying closure code. So we have to change closure fns to take a work_struct as their argument - but that results in a loss of clarity, as closure fns have different semantics from normal workqueue functions (they run owning a ref on the closure, which must be released with continue_at() or closure_return()). Thus, this patch introduces CLOSURE_CALLBACK() and closure_type() macros as suggested by Kees, to smooth things over a bit. Suggested-by: Kees Cook <keescook@chromium.org> Cc: Coly Li <colyli@suse.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a8958a1a |
|
02-Nov-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bkey_copy() is no longer a macro Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
b65db750 |
|
24-Oct-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Enumerate fsck errors This patch adds a superblock error counter for every distinct fsck error; this means that when analyzing filesystems out in the wild we'll be able to see what sorts of inconsistencies are being found and repaired, and hence what bugs to look for. Errors validating bkeys are not yet considered distinct fsck errors, but this patch adds a new helper, bkey_fsck_err(), in order to add distinct error types for them as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
94119eeb |
|
25-Oct-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Add IO error counts to bch_member We now track IO errors per device since filesystem creation. IO error counts can be viewed in sysfs, or with the 'bcachefs show-super' command. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
88dfe193 |
|
19-Oct-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_btree_id_str() Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6bd68ec2 |
|
12-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Heap allocate btree_trans We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocation to initialize it anyway, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
96dea3d5 |
|
12-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix W=12 build errors Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1809b8cb |
|
10-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Break up io.c More reorganization, this splits up io.c into - io_read.c - io_misc.c - fallocate, fpunch, truncate - io_write.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5cfd6977 |
|
09-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Array bounds fixes It's no longer legal to use a zero size array as a flexible array member - this causes UBSAN to complain. This patch switches our zero size arrays to normal flexible array members when possible, and inserts casts in other places (e.g. where we use the zero size array as a marker partway through an array). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
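For readers unfamiliar with the distinction, the sketch below shows a C99 flexible array member used where older code might have declared a zero-size array (`data[0]`). `struct u64_buf` and `u64_buf_alloc()` are hypothetical names for illustration only, not bcachefs types.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, for illustration only. */
struct u64_buf {
    size_t   nr;
    uint64_t data[];    /* flexible array member, instead of data[0] */
};

/*
 * Allocates the header plus room for 'nr' trailing elements and copies
 * them in. Bounds checkers such as UBSAN treat data[] as having a
 * run-time length, whereas a data[0] declaration claims zero elements.
 * Returns NULL on allocation failure; the caller frees with free().
 */
static struct u64_buf *u64_buf_alloc(const uint64_t *src, size_t nr)
{
    struct u64_buf *b = malloc(sizeof(*b) + nr * sizeof(b->data[0]));

    if (!b)
        return NULL;
    b->nr = nr;
    if (nr)
        memcpy(b->data, src, nr * sizeof(b->data[0]));
    return b;
}
```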
|
#
e08e63e4 |
|
06-Aug-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: BCH_COMPAT_bformat_overflow_done no longer required A while back, we changed bkey_format generation to ensure that the packed representation could never represent fields larger than the unpacked representation. This was to ensure that bkey_packed_successor() always gave a sensible result, but in the current code bkey_packed_successor() is only used in a debug assertion - not for anything important. This kills the requirement that we've gotten rid of those weird bkey formats, and instead changes the assertion to check if we're dealing with an old weird bkey format. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
56046e3e |
|
03-Aug-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Convert btree_err_type to normal error codes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
73adfcaf |
|
03-Aug-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix btree_err() macro Error code wasn't being propagated correctly, change it to match fsck_err() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ad52bac2 |
|
03-Aug-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Log a message when running an explicit recovery pass Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6c643965 |
|
03-Aug-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bkey_format helper improvements - add a to_text() method for bkey_format - convert bch2_bkey_format_validate() to modern error message style, where we pass a printbuf for the error string instead of returning a static string Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
922bc5a0 |
|
16-Jul-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Make topology repair a normal recovery pass This adds bch2_run_explicit_recovery_pass(), for rewinding recovery and explicitly running a specific recovery pass - this is a more general replacement for how we were running topology repair before. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ba8eeae8 |
|
27-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bcachefs_metadata_version_major_minor This introduces major/minor versioning to the superblock version number. Major version number changes indicate incompatible releases; we can move forward to a new major version number, but not backwards. Minor version numbers indicate compatible changes - these add features, but can still be mounted and used by old versions. With the recent patches that make it possible to roll out new btrees and key types without breaking compatibility, we should be able to roll out most new features without incompatible changes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
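The compatibility rule stated here can be sketched as follows. This is a simplified illustration under stated assumptions: the 10-bit minor field, the `version_*` helper names, and the single-condition check are all choices made for this sketch, and the real bcachefs helpers may apply additional constraints.

```c
#include <stdbool.h>
#include <stdint.h>

#define VERSION_MINOR_BITS 10u  /* assumed split, for illustration only */

/* Pack a major and minor number into one 16-bit version value. */
static inline uint16_t version_make(unsigned major, unsigned minor)
{
    return (uint16_t)((major << VERSION_MINOR_BITS) | minor);
}

static inline unsigned version_major(uint16_t v)
{
    return v >> VERSION_MINOR_BITS;
}

static inline unsigned version_minor(uint16_t v)
{
    return v & ((1u << VERSION_MINOR_BITS) - 1);
}

/*
 * Simplified rule from the commit message: a newer minor version on disk
 * is still mountable by older code, a newer major version is not.
 */
static bool version_mountable(uint16_t on_disk, uint16_t running)
{
    return version_major(on_disk) <= version_major(running);
}
```

Under these assumptions, code running version 1.3 can mount a filesystem written as 1.5, but not one written as 2.0.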
|
#
73bd774d |
|
06-Jul-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Assorted sparse fixes - endianness fixes - mark some things static - fix a few __percpu annotations - fix silent enum conversions Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
faa6cb6c |
|
28-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Allow for unknown btree IDs We need to allow filesystems with metadata from newer versions to be mountable and usable by older versions. This patch enables us to roll out new btrees without a new major version number; we can now handle btree roots for unknown btree types. The unknown btree roots will be retained, and fsck (including backpointers) will check them, the same as other btree types. We add a dynamic array for the extra, unknown btree roots, in addition to the fixed size btree root array, and add new helpers for looking up btree roots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a02a0121 |
|
28-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_version_compatible() This adds a new helper for checking if an on-disk version is compatible with the running version of bcachefs - prep work for introducing major:minor version numbers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
f33c58fc |
|
27-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Kill BTREE_INSERT_USE_RESERVE Now that we have journal watermarks and alloc watermarks unified, BTREE_INSERT_USE_RESERVE is redundant and can be deleted. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e4eb661d |
|
27-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix btree node write error message Error messages should include the error code, when available. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
19c304be |
|
28-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: GFP_NOIO -> GFP_NOFS GFP_NOIO dates from the bcache days, when we operated under the block layer. Now, GFP_NOFS is more appropriate, so switch all GFP_NOIO uses to GFP_NOFS. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1fb4fe63 |
|
20-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
six locks: Kill six_lock_state union As suggested by Linus, this drops the six_lock_state union in favor of raw bitmasks. On the one hand, bitfields give more type-level structure to the code. However, a significant amount of the code was working with six_lock_state as a u64/atomic64_t, and the conversions from the bitfields to the u64 were deemed a bit too out-there. More significantly, because bitfield order is poorly defined (#ifdef __LITTLE_ENDIAN_BITFIELD can be used, but is gross), incrementing the sequence number would overflow into the rest of the bitfield if the compiler didn't put the sequence number at the high end of the word. The new code is a bit saner when we're on an architecture without real atomic64_t support - all accesses to lock->state now go through atomic64_*() operations. On architectures with real atomic64_t support, we additionally use atomic bit ops for setting/clearing individual bits. Text size: 7467 bytes -> 4649 bytes - compilers still suck at bitfields. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
09ebfa61 |
|
21-Apr-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Drop a redundant error message When we're already read-only, we don't need to print out errors from writing btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
65d48e35 |
|
14-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Private error codes: ENOMEM This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ac2ccddc |
|
04-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Drop some anonymous structs, unions Rust bindgen doesn't cope well with anonymous structs and unions. This patch drops the fancy anonymous structs & unions in bkey_i that let us use the same helpers for bkey_i and bkey_packed; since bkey_packed is an internal type that's never exposed to outside code, it's only a minor inconvenience. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
45dd05b3 |
|
04-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: BKEY_PADDED_ONSTACK() Rust bindgen doesn't do anonymous structs very nicely: BKEY_PADDED() only needs the anonymous struct when it's used on the stack, to guarantee layout, not when it's embedded in another struct. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
3329cf1b |
|
02-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Centralize btree node lock initialization This fixes some confusion in the lockdep code due to initializing btree node/key cache locks with the same lockdep key, but different names. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1306f87d |
|
02-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Plumb btree_trans through btree cache code Soon, __bch2_btree_node_write() is going to require a btree_trans: zoned device support is going to require a new allocation for every btree node write. This is a bit of prep work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
12795a19 |
|
10-Feb-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Add some logging for btree node rewrites due to errors Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a8b3a677 |
|
02-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Nocow support This adds support for nocow mode, where we do writes in-place when possible. Patch components: - New boolean filesystem and inode option, nocow: note that when nocow is enabled, data checksumming and compression are implicitly disabled - To prevent in-place writes from racing with data moves (data_update.c) or bucket reuse (i.e. a bucket being reused and re-allocated while a nocow write is in flight), we have a new locking mechanism. Buckets can be locked for either data update or data move, using a fixed size hash table of two_state_shared locks. We don't have any chaining, meaning updates and moves to different buckets that hash to the same lock will wait unnecessarily - we'll want to watch for this becoming an issue. - The allocator path also needs to check for in-place writes in flight to a given bucket before giving it out: thus we add another counter to bucket_alloc_state so we can track this. - Fsync now may need to issue cache flushes to block devices instead of flushing the journal. We add a device bitmask to bch_inode_info, ei_devs_need_flush, which tracks devices that need to have flushes issued - note that this will lead to unnecessary flushes when other codepaths have already issued flushes, we may want to replace this with a sequence number. - New nocow write path: look up extents, and if they're writable write to them - otherwise fall back to the normal COW write path. XXX: switch to sequence numbers instead of bitmask for devs needing journal flush XXX: ei_quota_lock being a mutex means bch2_nocow_write_done() needs to run in process context - see if we can improve this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2e984040 |
|
01-Feb-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improve btree node read error path This ensures that failure to read a btree node is treated as a topology error, and returns the correct error so that the topology repair pass will be run. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
494dcc57 |
|
03-Jan-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Plumb saw_error through to btree_err() The btree node read path has the ability to kick off an asynchronous btree node rewrite if we saw and corrected an error. Previously this was only used for errors that caused one of the replicas to be unusable - this patch plumbs it through to all error paths, so that normal fsck errors can be corrected. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
b8fe1b1d |
|
03-Jan-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Convert btree_err() to a function This makes the code more readable, and reduces text size by 8 kb. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
149651dc |
|
25-Dec-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: fix fsck error Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e88a75eb |
|
24-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: New bpos_cmp(), bkey_cmp() replacements This patch introduces - bpos_eq() - bpos_lt() - bpos_le() - bpos_gt() - bpos_ge() and equivalent replacements for bkey_cmp(). Looking at the generated assembly these could probably be improved further, but we already see a significant code size improvement with this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
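The idea of replacing a three-way compare with dedicated predicates can be sketched like this. The `struct pos` type, its field names, and the inode/offset/snapshot ordering are assumptions for illustration; they are not the kernel's `struct bpos` definition or its actual implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical position type; fields and ordering are assumptions. */
struct pos {
    uint64_t inode;
    uint64_t offset;
    uint32_t snapshot;
};

static inline bool pos_eq(struct pos l, struct pos r)
{
    return l.inode == r.inode &&
           l.offset == r.offset &&
           l.snapshot == r.snapshot;
}

/* Lexicographic less-than: inode first, then offset, then snapshot. */
static inline bool pos_lt(struct pos l, struct pos r)
{
    if (l.inode != r.inode)
        return l.inode < r.inode;
    if (l.offset != r.offset)
        return l.offset < r.offset;
    return l.snapshot < r.snapshot;
}

/* The remaining predicates are derived from pos_lt(). */
static inline bool pos_le(struct pos l, struct pos r) { return !pos_lt(r, l); }
static inline bool pos_gt(struct pos l, struct pos r) { return pos_lt(r, l); }
static inline bool pos_ge(struct pos l, struct pos r) { return !pos_lt(l, r); }
```

A call site then reads `if (pos_lt(a, b))` rather than comparing the result of a three-way compare against zero, which states the intent directly and returns a plain boolean.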
|
#
42af0ad5 |
|
17-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix a race with b->write_type b->write_type needs to be set atomically with setting the btree_node_need_write flag, so move it into b->flags. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a1019576 |
|
22-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: More style fixes Fixes for various checkpatch errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2cb75179 |
|
28-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: should_compact_all() This factors out a properly-documented helper for deciding when we want to sort a btree node with MAX_BSETS bsets down to a single bset. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
46fee692 |
|
28-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improved btree write statistics This replaces sysfs btree_avg_write_size with btree_write_stats, which now breaks out statistics by the source of the btree write. Btree writes that are too small are a source of inefficiency, and excessive btree resort overhead - this will let us see what's causing them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
8cbb0002 |
|
30-Sep-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Write new btree nodes after parent update In order to avoid locking all btree nodes up to the root for btree node splits, we're going to have to introduce a new error path into bch2_btree_insert_node(); this means we can't have done any writes or modified global state before that point. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
d704d623 |
|
25-Sep-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: btree_err() now uses bch2_print_string_as_lines() We've seen long error messages get truncated here, so convert to the new bch2_print_string_as_lines(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ca7d8fca |
|
21-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: New locking functions In the future, with the new deadlock cycle detector, we won't be using bare six_lock_* anymore: lock wait entries will all be embedded in btree_trans, and we will need a btree_trans context whenever locking a btree node. This patch plumbs a btree_trans to the few places that need it, and adds two new locking functions - btree_node_lock_nopath, which may fail returning a transaction restart, and - btree_node_lock_nopath_nofail, to be used in places where we know we cannot deadlock (i.e. because we're holding no other locks). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
674cfc26 |
|
26-Aug-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Add persistent counters for all tracepoints Also, do some reorganizing/renaming, convert atomic counters in bch_fs to persistent counters, and add a few missing counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
bbf42884 |
|
17-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Always rebuild aux search trees when node boundaries change Topology repair may change btree node min/max keys: when it does so, we need to always rebuild eytzinger search trees because nodes directly depend on those values. This fixes a bug found by the 'kill_btree_node' test, where we'd pop an assertion in bch2_bset_search_linear(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
efa8a701 |
|
15-Aug-2022 |
Olexa Bilaniuk <obilaniu@gmail.com> |
bcachefs: remove dead whiteout_u64s argument. Signed-off-by: Olexa Bilaniuk <obilaniu@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1ed0a5d2 |
|
19-Jul-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Convert fsck errors to errcode.h Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
c9bd6732 |
|
13-Jun-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix btree node read retries b->written wasn't being reset to 0 in the btree node read retry path, causing decrypting & validation of previously read bsets to not be re-run - ouch. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
401ec4db |
|
03-Feb-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Printbuf rework This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
652018d6 |
|
06-Jun-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix btree node read error path We were forgetting to clear the read_in_flight flag - oops. This also fixes it to not call bch2_fatal_error() before topology repair has had a chance to do its thing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
c7372678 |
|
26-May-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Print message on btree node read retry success Right now, we print an error message on btree node read error, and we print that we're retrying, but we don't explicitly say if the retry succeeded - this makes things a little clearer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
ae21f74e |
|
18-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve invalid bkey error message Bkeys have gotten a lot bigger since this code was written and now are often formatted across multiple lines - while the reason a bkey is invalid will still be short and fit on a single line. This patch prints the error before the bkey, making it a bit more readable. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
c0960603 |
|
17-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Shutdown path improvements We're seeing occasional firings of the assertion in the key cache shutdown code that nr_dirty == 0, which means we must sometimes be doing transaction commits after we've gone read only. Cleanups & changes: - BCH_FS_ALLOC_CLEAN renamed to BCH_FS_CLEAN_SHUTDOWN - new helper bch2_btree_interior_updates_flush(), which returns true if it had to wait - bch2_btree_flush_writes() now also returns true if there were btree writes in flight - __bch2_fs_read_only now checks if btree writes were in flight in the shutdown loop: btree write completion does a transaction update, to update the pointer in the parent node - assert that !BCH_FS_CLEAN_SHUTDOWN in __bch2_trans_commit Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
cf0dd697 |
|
09-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't trigger extra assertions in journal replay We now pass a rw argument to .key_invalid methods so they can trigger assertions for updates but not on existing keys. We shouldn't trigger these extra assertions in journal replay - this patch changes the transaction commit path accordingly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
275c8426 |
|
03-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add rw to .key_invalid() This adds a new parameter to .key_invalid() methods for whether the key is being read or written; the idea being that methods can do more aggressive checks when a key is newly created and being written, when we wouldn't want to delete the key because of those checks. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
f0ac7df2 |
|
03-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Convert .key_invalid methods to printbufs Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
c6b2826c |
|
11-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Freespace, need_discard btrees This adds two new btrees for the upcoming allocator rewrite: an extents btree of free buckets, and a btree for buckets awaiting discards. We also add a new trigger for alloc keys to keep the new btrees up to date, and a compatibility path to initialize them on existing filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
3756111d |
|
21-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add printf format attribute to bch2_pr_buf() This tells the compiler to check printf format strings, and catches a few bugs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
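The compiler check described in the entry above can be illustrated with a minimal sketch. The names `demo_buf` and `demo_pr_buf()` are hypothetical stand-ins, not the bcachefs printbuf API; only the GCC/Clang `format(printf, ...)` attribute itself is real.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a printbuf; not the bcachefs type. */
struct demo_buf {
	char   data[128];
	size_t pos;
};

/*
 * format(printf, 2, 3): argument 2 is the format string, and the variadic
 * arguments checked against it start at argument 3.
 */
__attribute__((format(printf, 2, 3)))
static void demo_pr_buf(struct demo_buf *out, const char *fmt, ...)
{
	va_list args;
	size_t room;
	int n;

	if (out->pos >= sizeof(out->data))
		return;

	va_start(args, fmt);
	n = vsnprintf(out->data + out->pos, sizeof(out->data) - out->pos,
		      fmt, args);
	va_end(args);

	if (n < 0)
		return;

	/* vsnprintf returns the untruncated length; clamp to what fit. */
	room = sizeof(out->data) - out->pos - 1;
	out->pos += (size_t) n < room ? (size_t) n : room;
}
```

With the attribute in place, a call such as `demo_pr_buf(&b, "%lu", (size_t) 1)` should draw a `-Wformat` warning on platforms where `size_t` is not `unsigned long`, which is the class of bug the entry refers to.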
#
74b33393 |
|
20-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: x-macro metadata version enum Now we've got strings for metadata versions - this changes bch2_sb_to_text() and our mount log message to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
cc23255e |
|
10-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add a missing wakeup This fixes a rare bug with bch2_btree_flush_all_writes() getting stuck. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
30985537 |
|
04-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix usage of six lock's percpu mode Six locks have a percpu mode, which we use for interior btree nodes, as well as btree key cache keys for the subvolumes btree. We've been switching locks back and forth between percpu and non percpu mode as needed, but it turns out this is racy - when we're reusing an existing node, other threads could be attempting to lock it while we're switching it between modes. This patch fixes this by never switching 'struct btree' between the two modes, and instead segregating them between two different freed lists. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
bf3efff5 |
|
27-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix race leading to btree node write getting stuck Checking btree_node_may_write() isn't atomic with the other btree flags, dirty and need_write in particular. There was a rare race where we'd unblock a node from writing while __btree_node_flush() was setting need_write, and no thread would notice that the node was now both able to write and needed to be written. Fix this by adding btree node flags for will_make_reachable and write_blocked that can be checked in the cmpxchg loop in __bch2_btree_node_write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
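The idea in the entry above, folding every precondition for a write into one compare-and-exchange loop so that no flag can change between the check and the update, might be sketched as follows. The flag names and `demo_try_start_write()` are assumptions for illustration, written with C11 atomics rather than the kernel's `cmpxchg()`; this is not the code in `__bch2_btree_node_write()`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits, loosely named after the flags in the entry. */
enum {
	DEMO_DIRTY		= 1u << 0,
	DEMO_NEED_WRITE		= 1u << 1,
	DEMO_WRITE_BLOCKED	= 1u << 2,
	DEMO_WRITE_IN_FLIGHT	= 1u << 3,
};

/*
 * Atomically claim the node for writing. All conditions are evaluated
 * against the same snapshot of the flags word that the exchange then
 * replaces, so a concurrent flag change forces a retry.
 */
static bool demo_try_start_write(_Atomic unsigned *flags)
{
	unsigned old = atomic_load(flags), new;

	do {
		if (!(old & DEMO_NEED_WRITE) ||
		    (old & (DEMO_WRITE_BLOCKED | DEMO_WRITE_IN_FLIGHT)))
			return false;

		new = (old & ~(DEMO_DIRTY | DEMO_NEED_WRITE)) |
			DEMO_WRITE_IN_FLIGHT;
	} while (!atomic_compare_exchange_weak(flags, &old, new));

	return true;
}
```

On failure, `atomic_compare_exchange_weak()` reloads `old` with the current value, so each pass of the loop re-evaluates the conditions against fresh state.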
#
82732ef5 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_node_write_if_need() btree_node_write_if_need() kicks off a btree node write only if need_write is set; this makes the locking easier to reason about by moving the check into the cmpxchg loop in __bch2_btree_node_write(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
39dcace8 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix locking in btree_node_write_done() There was a rare recursive locking bug, in __bch2_btree_node_write() nowrite path -> btree_node_write_done(), in the path that kicks off another write. This splits out an inner __btree_node_write_done() that expects to be run with the btree node lock held. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
75ef2c59 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Start moving debug info from sysfs to debugfs In sysfs, files can only output at most PAGE_SIZE. This is a problem for debug info that needs to list an arbitrary number of times, and because of this limit some of our debug info has been terser and harder to read than we'd like. This patch moves info about journal pins and cached btree nodes to debugfs, and greatly expands and improves the output we return. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
55334d78 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill BCH_FS_HOLD_BTREE_WRITES This was just dead code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
fa8e94fa |
|
25-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Heap allocate printbufs This patch changes printbufs to dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
78a8f362 |
|
23-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve some btree node read error messages On btree node read error, it's helpful to see what we were trying to read - was it all zeroes? Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
a9de137b |
|
18-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Check for errors from crypto_skcipher_encrypt() Apparently it actually is possible for crypto_skcipher_encrypt() to return an error - not sure why that would be - but we need to replace our assertion with actual error handling. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
03ea3962 |
|
04-Jan-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Log & error message improvements - Add a shim uuid_unparse_lower() in the kernel, since %pU doesn't work in userspace - We don't need to print the bcachefs: or the filesystem name prefix in userspace - Improve a few error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
8244f320 |
|
14-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Option improvements This adds flags for options that must be a power of two (block size and btree node size), and options that are stored in the superblock as a power of two (encoded extent max). Also: options are now stored in memory in the same units they're displayed in (bytes): we now convert when getting and setting from the superblock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
62d5bd95 |
|
19-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill bch2_sort_repack_merge() The main function of bch2_sort_repack_merge() was to call .key_normalize on every key, which drops stale (cached) pointers - it hasn't actually merged extents in quite some time. But bch2_gc_gens() now works on individual keys - we used to gc old gens by rewriting entire btree nodes. With that gone, there's no need for internal btree code to be calling .key_normalize anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
2a863c6c |
|
14-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix debug build in userspace This fixes some compiler warnings that only trigger in userspace - dead code, a maybe uninitialized variable, a maybe null ptr passed to printk. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
c79272d1 |
|
09-Sep-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix some compiler warnings gcc couldn't always deduce that written wasn't used uninitialized Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
f7a966a3 |
|
30-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Clean up/rename bch2_trans_node_* fns These utility functions are for managing btree node state within a btree_trans - rename them for consistency, and drop some unneeded arguments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
9f6bd307 |
|
24-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Reduce iter->trans usage Disfavoured, and should go away. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
e719fc34 |
|
15-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: BSET_OFFSET() Add a field to struct bset for the sector offset within the btree node where it was written. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
9f1833ca |
|
10-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Update btree ptrs after every write This closes a significant hole (and last known hole) in our ability to verify metadata. Previously, since btree nodes are log structured, we couldn't detect lost btree writes that weren't the first write to a given node. Additionally, this seems to have led to some significant metadata corruption on multi device filesystems with metadata replication: since a write may have made it to one device and not another, if we read that btree node back from the replica that did have that write and started appending after that point, the other replica would have a gap in the bset entries and reading from that replica wouldn't find the rest of the bsets. But, since updates to interior btree nodes are now journalled, we can close this hole by updating pointers to btree nodes after every write with the currently written number of sectors, without negatively affecting performance. This means we will always detect lost or corrupt metadata - it also means that our btree is now a curious hybrid of COW and non COW btrees, with all the benefits of both (excluding complexity). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
0a700890 |
|
11-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kick off btree node writes from write completions This is a performance improvement by removing the need to wait for the in flight btree write to complete before kicking one off, which is going to be needed to avoid a performance regression with the upcoming patch to update btree ptrs after every btree write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
19d54324 |
|
10-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Really don't hold btree locks while btree IOs are in flight This is something we've attempted to stick to for quite some time, as it helps guarantee filesystem latency - but there's a few remaining paths that this patch fixes. This is also necessary for an upcoming patch to update btree pointers after every btree write - since the btree write completion path will now be doing btree operations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e3a67bdb |
|
10-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Regularize argument passing of btree_trans btree_trans should always be passed when we have one - iter->trans is disfavoured. This mainly updates old code in btree_update_interior.c, some of which predates btree_trans. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
50ad5d09 |
|
22-Jun-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix btree_node_read_all_replicas() error handling We weren't checking bch2_btree_node_read_done() for errors, oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
ee757054 |
|
10-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix a deadlock Waiting on a btree node write with btree locks held can deadlock, if the write errors: the write error path has to do a btree update to drop the pointer to the replica that errored. The interior update path has to wait on in flight btree writes before freeing nodes on disk. Previously, this was done in bch2_btree_interior_update_will_free_node(), and could deadlock; now, we just stash a pointer to the node and do it in btree_update_nodes_written(), just prior to the transactional part of the update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
9f2772c4 |
|
27-May-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Split out btree_error_wq We can't use btree_update_wq because btree updates may be waiting on btree writes to complete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
9dd89a05 |
|
22-May-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix an issue with inconsistent btree writes after unclean shutdown After unclean shutdown, btree writes may have completed on one device and not others - and this inconsistency could lead us to writing new bsets with a gap in our btree node in one of our replicas. Fortunately, this is only an issue with bsets that are newer than the most recent journal flush, and we already have a mechanism for detecting and blacklisting those. We just need to make sure to start new btree writes after the most recent _non_ blacklisted bset. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
731bdd2e |
|
22-May-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add a workqueue for btree io completions Also, clean up workqueue usage - we shouldn't be using system workqueues, pretty much everything we do needs to be on our own WQ_MEM_RECLAIM workqueues. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
1ce0cf5f |
|
21-May-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add a debug mode that always reads from every btree replica There's a new module parameter, verify_all_btree_replicas, that enables reading from every btree replica when reading in btree nodes and comparing them against each other. We've been seeing some strange btree corruption - this will hopefully aid in tracking it down and catching it more often. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
5bc38f44 |
|
07-May-2021 |
Dan Robertson <dan@dlrobertson.com> |
bcachefs: Fix oob write in __bch2_btree_node_write Fix a possible out of bounds write in __bch2_btree_node_write when the data buffer padding is cleared up to the block size. The out of bounds write is possible if the data buffers size is not a multiple of the block size. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
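The bounds problem described in the entry above can be shown with a small sketch: when zeroing padding up to the next block boundary, the end of the cleared region has to be clamped to the buffer size, because the buffer need not be a multiple of the block size. `demo_round_up()` and `demo_clear_padding()` are hypothetical helpers, not the functions touched by the fix.

```c
#include <stddef.h>
#include <string.h>

/* Round n up to a multiple of block; block must be nonzero. */
static size_t demo_round_up(size_t n, size_t block)
{
	return (n + block - 1) / block * block;
}

/*
 * Zero the tail of buf from `used` up to the next block boundary, but
 * never past buf_size. Returns the end offset of the zeroed region.
 */
static size_t demo_clear_padding(unsigned char *buf, size_t buf_size,
				 size_t used, size_t block)
{
	size_t end = demo_round_up(used, block);

	if (end > buf_size)
		end = buf_size;	/* buffer is not a multiple of the block size */
	if (used < end)
		memset(buf + used, 0, end - used);
	return end;
}
```

Without the clamp, a 10-byte buffer with 9 bytes used and a 4-byte block would have `memset()` write through offset 11, two bytes past the end.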
#
aae15aaf |
|
24-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: New and improved topology repair code This splits out btree topology repair into a separate pass, and makes some improvements: - When we have to pick which of two overlapping nodes to drop keys from, we use the btree node header sequence number to preserve the newer node - the gc code has been changed so that it doesn't bail out if we're continuing/ignoring on fsck error - this way the dump tool can skip running the repair pass but still walk all reachable metadata - add a new superblock flag indicating when a filesystem is known to have btree topology issues, and the topology repair pass should be run - changing the start/end of a node might mean keys in that node have to be deleted: this patch handles that better by splitting it out into a separate function and running it explicitly in the topology repair code, previously those keys were only being dropped when the btree node was read in. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
bcd25dac |
|
24-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Rewrite btree nodes with errors This patch adds self healing functionality for btree nodes - if we notice a problem when reading a btree node, we just rewrite it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
51c804ed |
|
06-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Punt btree writes to workqueue to submit We don't want to be submitting IO with btree locks held, and btree writes usually aren't latency sensitive. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2177147b |
|
06-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve bset compaction The previous patch that fixed btree nodes being written too aggressively now meant that we weren't sorting btree node bsets optimally - this patch fixes that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ba5f03d3 |
|
31-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add a sysfs var for average btree write size Useful number for performance tuning. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5f65d74d |
|
28-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add repair code for out of order keys in a btree node. This just drops the offending key - in the bug report where this was seen, it was clearly a single bit memory error, and fsck will fix the missing key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e751c01a |
|
24-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Start using bpos.snapshot field This patch starts treating the bpos.snapshot field like part of the key in the btree code: * bpos_successor() and bpos_predecessor() now include the snapshot field * Keys in btrees that will be using snapshots (extents, inodes, dirents and xattrs) now always have their snapshot field set to U32_MAX The btree iterator code gets a new flag, BTREE_ITER_ALL_SNAPSHOTS, that determines whether we're iterating over keys in all snapshots or not - internally, this controls whether bkey_(successor|predecessor) increment/decrement the snapshot field, or only the higher bits of the key. We add a new member to struct btree_iter, iter->snapshot: when BTREE_ITER_ALL_SNAPSHOTS is not set, iter->pos.snapshot should always equal iter->snapshot, which will be 0 for btrees that don't use snapshots, and always U32_MAX for btrees that will use snapshots (until we enable snapshot creation). This patch also introduces a new metadata version number, and compat code for reading from/writing to older versions - this isn't a forced upgrade (yet). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
4cf91b02 |
|
04-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Split out bpos_cmp() and bkey_cmp() With snapshots, we're going to need to differentiate between comparisons that should and shouldn't include the snapshot field. bpos_cmp is now the comparison function that does include the snapshot field, used by core btree code. Upper level filesystem code generally does _not_ want to compare against the snapshot field - that code wants keys to compare as equal even when one of them is in an ancestor snapshot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
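The distinction drawn in the entry above, one comparison that ignores the snapshot field and one that includes it, can be sketched as follows. `struct demo_bpos` and both helpers are simplified assumptions; the field order and types are illustrative, not the on-disk `struct bpos` layout.

```c
#include <stdint.h>

/* Hypothetical, simplified position: not the real struct bpos layout. */
struct demo_bpos {
	uint64_t inode;
	uint64_t offset;
	uint32_t snapshot;
};

static int demo_cmp_u64(uint64_t l, uint64_t r)
{
	return (l > r) - (l < r);
}

/* Upper-level comparison: positions in different snapshots compare equal. */
static int demo_bkey_cmp(struct demo_bpos l, struct demo_bpos r)
{
	int c = demo_cmp_u64(l.inode, r.inode);

	return c ? c : demo_cmp_u64(l.offset, r.offset);
}

/* Core btree comparison: the snapshot field breaks ties. */
static int demo_bpos_cmp(struct demo_bpos l, struct demo_bpos r)
{
	int c = demo_bkey_cmp(l, r);

	return c ? c : demo_cmp_u64(l.snapshot, r.snapshot);
}
```

Two positions that differ only in `snapshot` are equal under `demo_bkey_cmp()` but ordered under `demo_bpos_cmp()`, which is the behavior the entry attributes to filesystem code and core btree code respectively.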
#
0390ea8a |
|
24-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop bkey noops Bkey noops were introduced to deal with trimming inline data extents in place in the btree: if the u64s field of a bkey was 0, that u64 was a noop and we'd start looking for the next bkey immediately after it. But extent handling has been lifted above the btree - we no longer modify existing extents in place in the btree, and the compatibility code for old style extent btree nodes is gone, so we can completely drop this code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
84cc758d |
|
21-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Validate bset version field against sb version fields The superblock version fields need to be accurate to know whether a filesystem is supported, thus we should be verifying them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
50dc0f69 |
|
19-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Require all btree iterators to be freed We keep running into occasional bugs with btree transaction iterators overflowing - this will make those bugs more visible. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
f020bfcd |
|
04-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use bch2_bpos_to_text() more consistently Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2436cb9f |
|
20-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use x-macros for more enums This patch standardizes all the enums that have associated string tables (probably more enums should have string tables). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
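For readers unfamiliar with the x-macro technique named in the entry above, a minimal sketch: a single list macro is expanded twice, once to produce the enum and once to produce the matching string table, so the two cannot drift apart. The `DEMO_DATA_*` names and values are made up for illustration and are not the bcachefs enums.

```c
#include <stddef.h>

/* Single source of truth: each entry is x(name, number). */
#define DEMO_DATA_TYPES()	\
	x(free,   0)		\
	x(btree,  1)		\
	x(user,   2)

/* First expansion: enum constants. */
enum demo_data_type {
#define x(name, nr) DEMO_DATA_##name = nr,
	DEMO_DATA_TYPES()
#undef x
	DEMO_DATA_NR
};

/* Second expansion: string table indexed by the same numbers. */
static const char * const demo_data_type_strs[] = {
#define x(name, nr) [nr] = #name,
	DEMO_DATA_TYPES()
#undef x
	NULL
};
```

Adding a new type then means adding one `x(...)` line; the enum value and its printable name are generated together.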
#
41f8b09e |
|
20-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Rename BTREE_ID enums for consistency with other enums Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c052cf82 |
|
19-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: KEY_TYPE_discard is no longer used KEY_TYPE_discard used to be used for extent whiteouts, but when handling of overlapping extents was lifted above the core btree code it became unused. This patch updates various code to reflect that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
f2785955 |
|
19-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill support for !BTREE_NODE_NEW_EXTENT_OVERWRITE() bcachefs has been aggressively migrating filesystems and btree nodes to the new format for quite some time - this shouldn't affect anyone anymore, and lets us delete a _lot_ of code. Also, it frees up KEY_TYPE_discard for a new whiteout key type for snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
006d69aa |
|
16-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't drop ptrs to btree nodes If a ptr gen doesn't match the bucket gen, the bucket likely doesn't contain the data we want - but it's still possible the data we want might not have been overwritten, and for btree node pointers we can verify whether or not the node is the one we wanted with the node's sequence number, so it's better to keep the pointer and try reading from it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1889ad5a |
|
14-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add code to scan for/rewrite old btree nodes This adds a new data job type to scan for btree nodes in the old extent format, and rewrite them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
91f6ad6f |
|
02-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Include device in btree IO error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
51d2dfb8 |
|
26-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add BTREE_PTR_RANGE_UPDATED This is so that when we discover btree topology issues, we can just update the pointer to a btree node and signal btree read path that the min/max keys in the node header should be updated from the node pointer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a5cd80ea |
|
20-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix an assertion pop There was a race: btree node writes drop their reference on journal pins before clearing the btree_node_write_in_flight flag. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ed9d58a2 |
|
14-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Run jset_validate in write path as well This is because we had a bug where we were writing out journal entries with garbage last_seq, and not catching it. Also, completely ignore jset->last_seq when JSET_NO_FLUSH is true, because of aforementioned bug, but change the write path to set last_seq to 0 when JSET_NO_FLUSH is true. Minor other cleanups and comments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
07a1006a |
|
17-Dec-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Reduce/kill BKEY_PADDED use With various newer key types - stripe keys, inline data extents - the old approach of calculating the maximum size of the value is becoming more and more error prone. Better to switch to bkey_on_stack, which can dynamically allocate if necessary to handle any size bkey. In particular we also want to get rid of BKEY_EXTENT_VAL_U64s_MAX. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a2bfc841 |
|
06-Dec-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Try to print full btree error message Metadata corruption bugs are hard to debug if we can't see exactly what went wrong - try to allocate a bigger buffer so we can print out everything we have. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5db43418 |
|
03-Dec-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't issue btree writes that weren't journalled If we have an error in the btree interior update path that prevents us from journalling the update, we can't issue the corresponding btree node write - we didn't get a journal sequence number that would cause it to be ignored in recovery. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0fefe8d8 |
|
03-Dec-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve some IO error messages it's useful to know whether an error was for a read or a write - this also standardizes error messages a bit more. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1c74cec1 |
|
16-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add more debug checks tracking down a bug where we see a btree node pointer in the wrong node Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6d9378f3 |
|
10-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Hack around bch2_varint_decode invalid reads bch2_varint_decode can do reads up to 7 bytes past the end ptr, for the sake of performance - these extra bytes are always masked off. This won't be a problem in practice if we make sure to burn 8 bytes in any buffer that has bkeys in it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6a747c46 |
|
09-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add accounting for dirty btree nodes/keys This lets us improve journal reclaim, so that it now tries to make sure no more than 3/4s of the btree node cache and btree key cache are dirty - ensuring the shrinkers can free memory. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
811d2bcd |
|
06-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop typechecking from bkey_cmp_packed() This only did anything in two places, and those can just be replaced with bkey_cmp_left_packed(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
29364f34 |
|
02-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop sysfs interface to debug parameters It's not used much anymore, the module parameter interface is better. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e00711d2 |
|
24-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve some error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
9f115ce9 |
|
04-Aug-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix a bug with the journal_seq_blacklist mechanism Previously, we would start doing btree updates before writing the first journal entry; if this was after an unclean shutdown, this could cause those btree updates to not be blacklisted. Also, move some code to headers for userspace debug tools. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
7807e143 |
|
25-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Convert various code to printbuf printbufs know how big the buffer is that was allocated, so we can get rid of the random PAGE_SIZEs all over the place. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
4580baec |
|
25-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Remove some uses of PAGE_SIZE in the btree code For portability to userspace, we should try to avoid working in kernel pages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
63b214e7 |
|
21-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add bch2_blk_status_to_str() We define our own BLK_STS_REMOVED, so we need our own to_str helper too. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
89fd25be |
|
09-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use x-macros for data types Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
fff899b1 |
|
03-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Mark btree nodes as needing rewrite when not all replicas are RW This fixes a bug where recovery fails when one of the devices is read only. Also - consolidate the "must rewrite this node to insert it" behind a new btree node flag. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
306d40df |
|
02-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use blk_status_to_str() Improved error messages are always a good thing Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a34782a0 |
|
17-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Change bch2_dump_bset() to also print key values Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
9ef846a7 |
|
03-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve assorted error messages This also consolidates the various checks in bch2_mark_pointer() and bch2_trans_mark_pointer(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
f36dff28 |
|
12-May-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Validate that we read the correct btree node Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
bc970cec |
|
02-May-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix two more deadlocks Deadlock on shutdown: btree_update_nodes_written() unblocks btree nodes from being written; after doing so, it has to check if they were marked as needing to be written and if so kick off those writes - if that doesn't happen, we'll never release journal pins and shutdown will get stuck when flushing the journal. There was an error path where this didn't happen, because in the error path we don't actually want those btree node writes to happen; however, we still have to kick off the write path so the journal pins get released. The btree write path checks if we're in a journal error state and doesn't do the actual write if we are. Also - there was another deadlock because btree_update_nodes_written() was taking the btree update off of the unwritten_list too soon - before getting a journal reservation, which could fail and have to be retried. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
39fb2983 |
|
07-Jan-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill bkey_type_successor Previously, BTREE_ID_INODES was special - inodes were indexed by the inode field, which meant the offset field of struct bpos wasn't used, which led to special cases in e.g. the btree iterator code. Now, inodes in the inodes btree are indexed by the offset field. Also: previously min_key was special for extents btrees, min_key for extents would equal max_key for the previous node. Now, min_key = bkey_successor() of the previous node, same as non extent btrees. This means we can completely get rid of btree_type_successor/predecessor. Also make some improvements to the metadata IO validate/compat code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
4e4758c6 |
|
27-Mar-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use memalloc_nofs_save() vmalloc allocations don't always obey GFP_NOFS - memalloc_nofs_save() is the preferred approach for the future. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6357d607 |
|
08-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Journal updates to interior nodes Previously, the btree has always been self contained and internally consistent on disk without anything from the journal - the journal just contained pointers to the btree roots. However, this meant that btree node split or compact operations - i.e. anything that changes btree node topology and involves updates to interior nodes - would require that interior btree node to be written immediately, which means emitting a btree node write that's mostly empty (using 4k of space on disk if the filesystem blocksize is 4k to only write perhaps ~100 bytes of new keys). More importantly, this meant most btree node writes had to be FUA, and consumer drives have a history of slow and/or buggy FUA support - other filesystems have been bitten by this. This patch changes the interior btree update path to journal updates to interior nodes, after the writes for the new btree nodes have completed. Best of all, it turns out to simplify the interior node update path somewhat. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e3e464ac |
|
30-Dec-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Move extent overwrite handling out of core btree code Ever since the btree code was first written, handling of overwriting existing extents - including partially overwriting and splitting existing extents - was handled as part of the core btree insert path. The modern transaction and iterator infrastructure didn't exist then, so that was the only way for it to be done. This patch moves that outside of the core btree code to a pass that runs at transaction commit time. This is a significant simplification to the btree code and overall reduction in code size, but more importantly it gets us much closer to the core btree code being completely independent of extents and is important prep work for snapshots. This introduces a new feature bit; the old and new extent update models are incompatible when the filesystem needs journal replay. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
f1f5f114 |
|
26-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve an error message Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
72141e1f |
|
24-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use btree_ptr_v2.mem_ptr to avoid hash table lookup Nice performance optimization Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
548b3d20 |
|
07-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: btree_ptr_v2 Add a new btree ptr type which contains the sequence number (random 64 bit cookie, actually) for that btree node - this lets us verify that when we read in a btree node it really is the btree node we wanted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
237e8048 |
|
18-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: introduce b->hash_val This is partly prep work for introducing bch_btree_ptr_v2, but it'll also be a bit of a performance boost by moving the full key out of the hot part of struct btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1f49dafc |
|
06-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix bch2_ptr_swab for indirect extents bch2_ptr_swab was never updated when the code for generic keys with pointers was added - it assumed the entire val was only used for pointers. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
bcd6f3e0 |
|
26-Nov-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use KEY_TYPE_deleted whiteouts for extents Previously, partial overwrites of existing extents were handled implicitly by the btree code; when reading in a btree node, we'd do a mergesort of the different bsets and detect and fix partially overlapping extents during that mergesort. That approach won't work with snapshots: this changes extents to work like regular keys as far as the btree code is concerned, where a 0 size KEY_TYPE_deleted whiteout will completely overwrite an existing extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ae2f17d5 |
|
14-Dec-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill btree_node_iter_large Long overdue cleanup - this converts btree_node_iter_large uses to sort_iter. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
8f82280e |
|
14-Dec-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use one buffer for sorting whiteouts We're not really supposed to allocate from the same mempool more than once. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c297a763 |
|
13-Dec-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Refactor whiteouts compaction The whiteout compaction path - as opposed to just dropping whiteouts - is now only needed for extents, and soon will only be needed for extent btree nodes in the old format. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c9bebae6 |
|
29-Nov-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Whiteout changes More prep work for snapshots: extents will soon be using KEY_TYPE_deleted for whiteouts, with 0 size. But we won't be able to keep these whiteouts with the rest of the extents in the btree node, due to sorting invariants breaking. We can deal with this by immediately moving the new whiteouts to the unwritten whiteouts area - this just means those whiteouts won't be sorted, so we need new code to sort them prior to merging them with the rest of the keys to be written. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ad44bdc3 |
|
09-Nov-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: bkey noops For upcoming inline data extents, we're going to need to be able to shorten the value of existing bkeys in the btree - and to make that work we're going to need to be able to pad out the space the value previously took up with something. This patch changes the various code that iterates over bkeys to handle k->u64s == 0 as meaning "skip the next 8 bytes". Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
cdd775e6 |
|
21-Oct-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't use FUA unnecessarily Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
885678f6 |
|
03-Jul-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill direct access to bi_io_vec Switch to always using bio_add_page(), which merges contiguous pages now that we have multipage bvecs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
20bceecb |
|
15-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: More work to avoid transaction restarts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c43a6ef9 |
|
05-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: btree_bkey_cached_common This is prep work for the btree key cache: btree iterators will point to either struct btree, or a new struct bkey_cached. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1dd7f9d9 |
|
04-Apr-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Rewrite journal_seq_blacklist machinery Now, we store blacklisted journal sequence numbers in the superblock, not the journal: this helps to greatly simplify the code, and more importantly it's now implemented in a way that doesn't require all btree nodes to be visited before starting the journal - instead, we unconditionally blacklist the next 4 journal sequence numbers after an unclean shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
424eb881 |
|
25-Mar-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Only get btree iters from btree transactions Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
dc3b63dc |
|
21-Mar-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add time stats for btree updates Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
d0cc3def |
|
13-Jan-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: More allocator startup improvements Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
26609b61 |
|
01-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Make bkey types globally unique this lets us get rid of a lot of extra switch statements - in a lot of places we dispatch on the btree node type, and then the key type, so this is a nice cleanup across a lot of code. Also improve the on disk format versioning stuff. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5b8a9227 |
|
27-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Split out bkey_sort.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
319f9ac3 |
|
08-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: revamp to_text methods Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ac10a961 |
|
03-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Some fixes for building in userspace userspace allocators don't align allocations as nicely as kernel allocators, which meant that in some cases we weren't allocating big enough bvec arrays - just make the calculations more rigorous and explicit to fix it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5bd95a37 |
|
01-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: new avoid mechanism for io retries Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
198d6700 |
|
21-Oct-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: add functionality for heaps to update backpointers Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a2753581 |
|
30-Sep-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: bch2_extent_drop_ptrs() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
4cb13156 |
|
02-Oct-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: extent_ptr_decoded Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a00fd8c5 |
|
21-Aug-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Comparison function cleanups Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
271a3d3a |
|
21-Jul-2016 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: lift ordering restriction on 0 size extents This lifts the restriction that 0 size extents must not overlap with other extents, which means we can now sort extents and non extents the same way, and will let us simplify a bunch of other stuff as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1fe08f31 |
|
05-Aug-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: bkey_written() also cleanups of btree node offsets Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1c6fdbd8 |
|
17-Mar-2017 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Initial commit Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|