History log of /linux-master/fs/bcachefs/backpointers.c
Revision Date Author Comments
# 0389c09b 17-Apr-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix bio alloc in check_extent_checksum()

if the buffer is virtually mapped it won't be a single bvec

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f0a73d4f 13-Apr-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Check for backpointer bucket_offset >= bucket size

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 4c02e63d 30-Mar-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Check for extents that point to same space

In backpointer repair, if we get a missing backpointer - but there's
already a backpointer that points to an existing extent - we've got
multiple extents that point to the same space and need to decide which
to keep.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 11d5568d 27-Mar-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: fix backpointer for missing alloc key msg

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 47d2080e 25-Mar-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill bch2_bkey_ptr_data_type()

Remove some duplication, and inconsistency between check_fix_ptrs and
the main ptr marking paths

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a5e3dce4 21-Mar-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix assert in bch2_backpointer_invalid()

Backpointers that point to invalid devices are caught by fsck, not
.key_invalid; so .key_invalid needs to check for them instead of hitting
asserts.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cdce1094 11-Mar-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: reconstruct_alloc cleanup

Now that we've got the errors_silent mechanism, we don't have to check
if the reconstruct_alloc option is set all over the place.

Also - users no longer have to explicitly select fsck and fix_errors.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 91dcad18 22-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Pin btree cache in ram for random access in fsck

Various phases of fsck involve checking references from one btree to
another: this means doing a sequential scan of one btree, and then
mostly random access into the second.

This is particularly painful for checking extents <-> backpointers; we
can prefetch btree node access on the sequential scan, but not on the
random access portion, and this is particularly painful on spinning
rust, where we'd like to keep the pipeline fairly full of btree node
reads so that the elevator can reduce seeking.

This patch implements prefetching and pinning of the portion of the
btree that we'll be doing random access to. We already calculate how
much of the random access btree will fit in memory so it's a fairly
straightforward change.

This will put more pressure on system memory usage, so we introduce a
new option, fsck_memory_usage_percent, which is the percentage of total
system ram that fsck is allowed to pin.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 52946d82 06-Feb-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill more -EIO error codes

This converts -EIOs related to btree node errors to private error codes,
which will help with some ongoing debugging by giving us better error
messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1f626223 20-Feb-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: fix backpointer_to_text() when dev does not exist

Fixes:
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ec4edd7b 16-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Prep work for variable size btree node buffers

bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.

And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.

Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1a503904 15-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: extents_to_bp_state

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ba96d36c 15-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bkey_and_val_eq()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 074cbcda 03-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: fsck_err()s don't need to manually check c->sb.version anymore

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 80eab7a7 16-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: for_each_btree_key() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cf904c8d 16-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch_err_(fn|msg) check if should print

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a56c6171 30-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Make backpointer fsck wb flush check more rigorous

backpointers fsck now always runs in rw mode - the btree is being
modified while it runs, by e.g. copygc, rebalance, the discard worker,
the invalidate worker.

We could find a missing backpointer, flush the btree write buffer, and
then on the next iteration find a new key at the exact same position -
which will most likely need another write buffer flush.

Hence, we have to check for an exact match on last_flushed, not just the
pos.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0f64a6da 30-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: On missing backpointer to interior node, flush interior updates

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3f0e297d 28-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Explicity go RW for fsck

This eliminates a lot of BCH_TRANS_COMMIT_lazy_rw flags, and is less
error prone.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# c259bd95 26-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: No need to allocate keys for write buffer

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cd404e5b 12-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: backpointers fsck no longer uses BTREE_ITER_ALL_LEVELS

It appears that BTREE_ITER_ALL_LEVELS is racy with concurrent interior
node btree updates; unfortunate but not terribly surprising it's a
difficult problem - that was the original reason for gc_lock.

BTREE_ITER_ALL_LEVELS will probably be deleted in a subsequent patch,
this changes backpointers fsck to instead walk keys at one level of the
btree at a time.

This fixes the tiering_drop_alloc test, which stopped working with the
patch to not flush the journal after journal replay.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cb52d23e 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Rename BTREE_INSERT flags

BTREE_INSERT flags are actually transaction commit flags - rename them
for clarity.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 03cc1e67 07-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix null ptr deref in bch2_backpointer_get_node()

bch2_btree_iter_peek_node() can return a NULL ptr (when the tree is
shorter than the search depth); handle this with an early return.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: https://lore.kernel.org/linux-bcachefs/5fc3c28b-c232-4ec7-b0ac-4ef220ddf976@moroto.mountain/T/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 853960d0 04-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Simplify, fix bch2_backpointer_get_key()

- backpointer_not_found() checks backpointers_no_use_write_buffer, no
need to do it inbackpointer_get_key().

- always use backpointer_get_node() for pointers to nodes:
backpointer_get_key() was sometimes returning the key from the root
node unlocked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# daba90f2 04-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: kill thing_it_points_to arg to backpointer_not_found()

This can be calculated locally.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7cb2a789 03-Nov-2023 Brian Foster <bfoster@redhat.com>

bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield

The bucket_offset field of bch_backpointer is a 40-bit bitfield, but the
bch2_backpointer_swab() helper uses swab32. This leads to inconsistency
when an on-disk fs is accessed from an opposite endian machine.

As it turns out, we already have an internal swab40() helper that is
used from the bch_alloc_v4 swab callback. Lift it into the backpointers
header file and use it consistently in both places.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b65db750 24-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Enumerate fsck errors

This patch adds a superblock error counter for every distinct fsck
error; this means that when analyzing filesystems out in the wild we'll
be able to see what sorts of inconsistencies are being found and repair,
and hence what bugs to look for.

Errors validating bkeys are not yet considered distinct fsck errors, but
this patch adds a new helper, bkey_fsck_err(), in order to add distinct
error types for them as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 88dfe193 19-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_id_str()

Since we can run with unknown btree IDs, we can't directly index btree
IDs into fixed size arrays.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 82142a55 22-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a null ptr deref in bch2_get_alloc_in_memory_pos()

Reported-by: smatch
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6bd68ec2 12-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Heap allocate btree_trans

We're using more stack than we'd like in a number of functions, and
btree_trans is the biggest object that we stack allocate.

But we have to do a heap allocatation to initialize it anyways, so
there's no real downside to heap allocating the entire thing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 96dea3d5 12-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix W=12 build errors

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6bf3766b 12-Sep-2023 Colin Ian King <colin.i.king@gmail.com>

bcachefs: Fix a handful of spelling mistakes in various messages

There are several spelling mistakes in error messages. Fix these.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2110f21e 19-Jul-2023 Brian Foster <bfoster@redhat.com>

bcachefs: remove duplicate code between backpointer update paths

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 813e0cec 15-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Upgrade path fixes

Some minor fixes to not print errors that are actually due to a verson
upgrade.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 067d228b 07-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Enumerate recovery passes

Recovery and fsck have many different passes/jobs to do, which always
run in the same order - but not all of them run all the time. Some are
for fsck, some for unclean shutdown, some for version upgrades.

This adds some new structure: a defined list of recovery passes that we
can run in a loop, as well as consolidating the log messages.

The main benefit is consolidating the "should run this recovery pass"
logic, as well as cleaning up the "this recovery pass has finished"
state; instead of having a bunch of ad-hoc state bits in c->flags, we've
now got c->curr_recovery_pass.

By consolidating the "should run this recovery pass" logic, in the
future on disk format upgrades will be able to say "upgrading to this
version requires x passes to run", instead of forcing all of fsck to
run.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8726dc93 06-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Change check for invalid key types

As part of the forward compatibility patch series, we need to allow for
new key types without complaining loudly when running an old version.

This patch changes the flags parameter of bkey_invalid to an enum, and
adds a new flag to indicate we're being called from the transaction
commit path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 73bd774d 06-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Assorted sparse fixes

- endianness fixes
- mark some things static
- fix a few __percpu annotations
- fix silent enum conversions

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# faa6cb6c 28-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Allow for unknown btree IDs

We need to allow filesystems with metadata from newer versions to be
mountable and usable by older versions.

This patch enables us to roll out new btrees without a new major version
number; we can now handle btree roots for unknown btree types.

The unknown btree roots will be retained, and fsck (including
backpointers) will check them, the same as other btree types.

We add a dynamic array for the extra, unknown btree roots, in addition
to the fixed size btree root array, and add new helpers for looking up
btree roots.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1fa3e87a 27-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix leak in backpointers fsck

We were forgetting to exit a printbuf - whoops.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1bb3c2a9 20-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: New error message helpers

Add two new helpers for printing error messages with __func__ and
bch2_err_str():
- bch_err_fn
- bch_err_msg

Also kill the old error strings in the recovery path, which were causing
us to incorrectly report memory allocation failures - they're not needed
anymore.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# bc166d71 04-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve backpointers error message

the error message here dated from when backpointers could be stored in
alloc keys; now, we should always print the full key.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# bcb79a51 29-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_bkey_get_iter() helpers

Introduce new helpers for a common pattern:

bch2_trans_iter_init();
bch2_btree_iter_peek_slot();

- bch2_bkey_get_iter_type() returns -ENOENT if it doesn't find a key of
the correct type
- bch2_bkey_get_val_typed() copies the val out of the btree to a
(typically stack allocated) variable; it handles the case where the
value in the btree is smaller than the current version of the type,
zeroing out the remainder.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 174f930b 29-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bkey_ops.min_val_size

This adds a new field to bkey_ops for the minimum size of the value,
which standardizes that check and also enforces the new rule (previously
done somewhat ad-hoc) that we can extend value types by adding new
fields on to the end.

To make that work we do _not_ initialize min_val_size with sizeof,
instead we initialize it to the size of the first version of those
values.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 62a03559 31-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Rip out code for storing backpointers in alloc keys

We don't store backpointers in alloc keys anymore, since we gained the
btree write buffer.

This patch drops support for backpointers in alloc keys, and revs the on
disk format version so that we know a fsck is required.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6bdefe9c 29-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Use BTREE_ITER_INTENT in ec_stripe_update_extent()

This adds a flags param to bch2_backpointer_get_key() so that we can
pass BTREE_ITER_INTENT, since ec_stripe_update_extent() is updating the
extent immediately.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2f081584 14-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve the backpointer to missing extent message

We now print the pos where the backpointer was found in the btree, as
well as the exact bucket:bucket_offset of the data, to aid in grepping
through logs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 872c0311 13-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix bch2_check_extents_to_backpointers()

In rare cases, bch2_check_extents_to_backpointers() would incorrectly
flag an extent has having a missing backpointer when we just needed to
flush the btree write buffer - we weren't tracking the last flushed
position correctly.

This adds a level field to the last_flushed pos, fixing a bug where we'd
sometimes fail on a new root node.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e07cb974 25-Feb-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Flush write buffer as needed in backpointers repair

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 564fbd9d 17-Feb-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a 64 bit divide

This fixes a build failure on 32 bit

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# facafdcb 20-Dec-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Change bkey_invalid() rw param to flags

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7c057d35 12-Feb-2023 Kent Overstreet <kent.overstreet@linux.dev>

fixup bcachefs: New on disk format: Backpointers

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 53b1c6f4 14-Oct-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't use key cache during fsck

The btree key cache mainly helps with lock contention, at the cost of
additional memory overhead. During some fsck passes the memory overhead
really matters, but fsck is single threaded so lock contention is an
issue - so skipping the key cache during fsck will help with
performance.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b32f9a57 28-Sep-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Run check_extents_to_backpointers() in multiple passes

Similer to the previous patch for check_backpointers_to_extents(), if
the alloc + backpointers btrees do not fit in ram we need to run into
multiple passes.

The counting of btree nodes that fit in memory is different here,
because we have to walk the alloc and backpointers btrees at the same
time, since a backpointer could reside in either of them and we don't
know which without checking both.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 23792a71 09-Oct-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Run bch2_check_backpointers_to_extents() in multiple passes if necessary

When the extents + reflink btrees don't fit into memory this fsck pass
becomes _much_ slower, since we're doing random lookups.

This patch changes this pass to check how much of the relevant btrees
will fit into memory, and run in multiple passes if needed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a8c752bb 17-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: New on disk format: Backpointers

This patch adds backpointers: we now have a reverse index from device
and offset on that device (specifically, offset within a bucket) back to
btree nodes and (non cached) data extents.

The first 40 backpointers within a bucket are stored in the alloc key;
after that backpointers spill over to the next backpointers btree. This
is to help avoid performance regressions from additional btree updates
on large streaming workloads.

This patch adds all the code for creating, checking and repairing
backpointers. The next patch in the series is going to use backpointers
for copygc - finally getting rid of the need to scan all extents to do
copygc.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>