History log of /linux-master/fs/bcachefs/btree_cache.h
Revision Date Author Comments
# ec4edd7b 16-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Prep work for variable size btree node buffers

bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.

And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.

Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a564c9fa 02-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Include btree_trans in more tracepoints

This gives us more context information - e.g. which codepath is invoking
btree node reads.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 88dfe193 19-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_id_str()

Since we can run with unknown btree IDs, we can't directly index btree
IDs into fixed size arrays.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 73bd774d 06-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Assorted sparse fixes

- endianness fixes
- mark some things static
- fix a few __percpu annotations
- fix silent enum conversions

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# faa6cb6c 28-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Allow for unknown btree IDs

We need to allow filesystems with metadata from newer versions to be
mountable and usable by older versions.

This patch enables us to roll out new btrees without a new major version
number; we can now handle btree roots for unknown btree types.

The unknown btree roots will be retained, and fsck (including
backpointers) will check them, the same as other btree types.

We add a dynamic array for the extra, unknown btree roots, in addition
to the fixed size btree root array, and add new helpers for looking up
btree roots.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a345b0f3 06-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_node_to_text() const correctness

This is for the Rust interface - Rust cares more about const than C
does.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1306f87d 02-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Plumb btree_trans through btree cache code

Soon, __bch2_btree_node_write() is going to require a btree_trans: zoned
device support is going to require a new allocation for every btree node
write. This is a bit of prep work.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 77671e8f 21-Oct-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Move bkey bkey_unpack_key() to bkey.h

Long ago, bkey_unpack_key() was added to bset.h instead of bkey.h
because bkey.h didn't include btree_types.h, which it needs for the
compiled unpack function.

This patch finally moves it to the proper location.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ca7d8fca 21-Aug-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: New locking functions

In the future, with the new deadlock cycle detector, we won't be using
bare six_lock_* anymore: lock wait entries will all be embedded in
btree_trans, and we will need a btree_trans context whenever locking a
btree node.

This patch plumbs a btree_trans to the few places that need it, and adds
two new locking functions
- btree_node_lock_nopath, which may fail returning a transaction
restart, and
- btree_node_lock_nopath_nofail, to be used in places where we know we
cannot deadlock (i.e. because we're holding no other locks).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 30985537 04-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix usage of six lock's percpu mode

Six locks have a percpu mode, which we use for interior btree nodes, as
well as btree key cache keys for the subvolumes btree. We've been
switching locks back and forth between percpu and non percpu mode as
needed, but it turns out this is racy - when we're reusing an existing
node, other threads could be attempting to lock it while we're switching
it between modes.

This patch fixes this by never switching 'struct btree' between the two
modes, and instead segragating them between two different freed lists.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# de517c95 26-Feb-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use x-macros for btree node flags

This is for adding an array of strings for btree node flag names.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 8244f320 14-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Option improvements

This adds flags for options that must be a power of two (block size and
btree node size), and options that are stored in the superblock as a
power of two (encoded extent max).

Also: options are now stored in memory in the same units they're
displayed in (bytes): we now convert when getting and setting from the
superblock.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 67e0dd8f 30-Aug-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: btree_path

This splits btree_iter into two components: btree_iter is now the
externally visible componont, and it points to a btree_path which is now
reference counted.

This means we no longer have to clone iterators up front if they might
be mutated - btree_path can be shared by multiple iterators, and cloned
if an iterator would mutate a shared btree_path. This will help us use
iterators more efficiently, as well as slimming down the main long lived
state in btree_trans, and significantly cleans up the logic for iterator
lifetimes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 78cf784e 30-Aug-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Further reduce iter->trans usage

This is prep work for splitting btree_path out from btree_iter -
btree_path will not have a pointer to btree_trans.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 8b3e9bd6 24-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Always check for transaction restarts

On transaction restart iterators won't be locked anymore - make sure
we're always checking for errors.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 6e075b54 24-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_btree_iter_relock_intent()

This adds a new helper for btree_cache.c that does what we want where
the iterator is still being traverse - and also eliminates some
unnecessary transaction restarts.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# ceda1b9a 25-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Evict btree nodes we're deleting

There was a bug that led to duplicate btree node pointers being inserted
at the wrong level. The new topology repair code can fix that, except
that the btree cache code gets confused when we read in a btree node
from the pointer that was at the wrong level. This patch evicts nodes
that we're deleting to, which nicely solves the problem.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6adaac0b 20-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Update bch2_btree_verify()

bch2_btree_verify() verifies that the btree node on disk matches what we
have in memory. This patch changes it to verify every replica, and also
fixes it for interior btree nodes - there's a mem_ptr field which is
used as a scratch space and needs to be zeroed out for comparing with
what's on disk.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2fa81d0b 31-Mar-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix BTREE_FOREGROUND_MERGE_HYSTERESIS

We were multiplying instead of dividing - oops.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 54ca47e1 28-Mar-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill bch2_btree_node_get_sibling()

This patch reworks the btree node merge path to use a second btree
iterator to get the sibling node - which means
bch2_btree_iter_get_sibling() can be deleted. Also, it uses
bch2_btree_iter_traverse_all() if necessary - which means it should be
more reliable. We don't currently even try to make it work when
trans->nounlock is set - after a BTREE_INSERT_NOUNLOCK transaction
commit, hopefully this will be a worthwhile tradeoff.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 41f8b09e 20-Feb-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Rename BTREE_ID enums for consistency with other enums

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a0b73c1c 26-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add (partial) support for fixing btree topology

When we walk the btrees during recovery, part of that is checking that
btree topology is correct: for every interior btree node, its child
nodes should exactly span the range the parent node covers.

Previously, we had checks for this, but not repair code. Now that we
have the ability to do btree updates during initial GC, this patch adds
that repair code.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# edfbba58 11-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add btree node prefetching to bch2_btree_and_journal_walk()

bch2_btree_and_journal_walk() walks the btree overlaying keys from the
journal; it was introduced so that we could read in the alloc btree
prior to journal replay being done, when journalling of updates to
interior btree nodes was introduced.

But it didn't have btree node prefetching, which introduced a severe
regression with mount times, particularly on spinning rust. This patch
implements btree node prefetching for the btree + journal walk,
hopefully fixing that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d8ebed7d 19-Nov-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add btree cache stats to sysfs

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a301dc38 28-Oct-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Improve tracing for transaction restarts

We have a bug where we can get stuck with a process spinning in
transaction restarts - need more information.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 4580baec 25-Jul-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Remove some uses of PAGE_SIZE in the btree code

For portability to userspace, we should try to avoid working in kernel
pages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f270667a 10-Apr-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Slightly reduce btree split threshold

2/3rds performs a lot better than 3/4ths on the tested workloda, leading
to significanly fewer btree node compactions.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e62d65f2 15-Mar-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: trans_commit() path can now insert to interior nodes

This will be needed for the upcoming patches to journal updates to
interior btree nodes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 72141e1f 24-Feb-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use btree_ptr_v2.mem_ptr to avoid hash table lookup

Nice performance optimization

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 548b3d20 07-Feb-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: btree_ptr_v2

Add a new btree ptr type which contains the sequence number (random 64
bit cookie, actually) for that btree node - this lets us verify that
when we read in a btree node it really is the btree node we wanted.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 237e8048 18-Feb-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: introduce b->hash_val

This is partly prep work for introducing bch_btree_ptr_v2, but it'll
also be a bit of a performance boost by moving the full key out of the
hot part of struct btree.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5525f632 15-Jan-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Change btree split threshold to be in u64s

This fixes a bug with very small btree nodes where splitting would end
up with one of the new nodes empty.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b03b81df 10-May-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Don't pass around may_drop_locks

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# c43a6ef9 05-Jun-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: btree_bkey_cached_common

This is prep work for the btree key cache: btree iterators will point to
either struct btree, or a new struct bkey_cached.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 26609b61 01-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Make bkey types globally unique

this lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.

Also improve the on disk format versioning stuff.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 319f9ac3 08-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: revamp to_text methods

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1c6fdbd8 17-Mar-2017 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Initial commit

Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>