History log of /linux-master/fs/bcachefs/btree_update.h
# e07c28ab 08-Feb-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_bit_mod()

Provide a non-write buffer version of bch2_btree_bit_mod_buffered(), for
the subvolume children btree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 506b1876 08-Feb-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_bit_mod -> bch2_btree_bit_mod_buffered

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6474b706 11-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Clean up btree_trans

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7f9821a7 10-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: btree_insert_entry -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 74e600c1 10-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_path_put() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 559e6c23 16-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: trans_for_each_update() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 67997234 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: kill btree_trans->wb_updates

The btree write buffer path now creates a journal entry directly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 24de63da 10-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve trans->extra_journal_entries

Instead of using a darray, we now allocate journal entries for the
transaction commit path with our normal bump allocator - with an inlined
fastpath, and using btree_transaction_stats to remember how much to
initially allocate so as to avoid transaction restarts.

This is prep work for converting write buffer updates to use this
mechanism.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
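
Illustration (not part of the commit): a minimal standalone sketch of a
bump allocator that remembers its high-water mark, so that a later run can
size the buffer up front and stay on the fast path. All names here
(bump_arena, arena_stats, arena_*) are hypothetical stand-ins, not the
bcachefs API; the "grew" flag only models the reallocation event that the
commit message associates with a transaction restart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-transaction scratch memory. */
struct bump_arena {
	char	*base;
	size_t	size;	/* bytes currently allocated */
	size_t	used;	/* bytes handed out so far */
};

/* Hypothetical stand-in for per-call-site statistics. */
struct arena_stats {
	size_t	max_used;	/* high-water mark from earlier runs */
};

/* Start a run: size the arena from the remembered high-water mark. */
static int arena_begin(struct bump_arena *a, const struct arena_stats *s)
{
	a->size = s->max_used ? s->max_used : 64;
	a->used = 0;
	a->base = malloc(a->size);
	return a->base ? 0 : -1;
}

/*
 * Fast path: bump the offset. Slow path: grow the buffer and set *grew,
 * because pointers handed out earlier may no longer be valid.
 */
static void *arena_alloc(struct bump_arena *a, size_t bytes, int *grew)
{
	void *ret;

	assert(a->base);
	bytes = (bytes + 7) & ~(size_t) 7;	/* keep 8-byte alignment */

	if (a->used + bytes > a->size) {
		size_t new_size = a->size;
		char *n;

		while (new_size < a->used + bytes)
			new_size *= 2;
		n = realloc(a->base, new_size);
		if (!n)
			return NULL;
		a->base = n;
		a->size = new_size;
		*grew = 1;
	}

	ret = a->base + a->used;
	a->used += bytes;
	return ret;
}

/* End a run: remember how much was needed, then release the buffer. */
static void arena_end(struct bump_arena *a, struct arena_stats *s)
{
	if (a->used > s->max_used)
		s->max_used = a->used;
	free(a->base);
	a->base = NULL;
}
```

In this sketch the first run has to grow the buffer, while a second run
that starts from the recorded statistics does not.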


# c259bd95 26-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: No need to allocate keys for write buffer

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cb52d23e 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Rename BTREE_INSERT flags

BTREE_INSERT flags are actually transaction commit flags - rename them
for clarity.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5927310d 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch_str_hash_flags_t

Create a separate enum for str_hash flags - instead of abusing the
btree_insert_flags enum - and create a __bitwise typedef for sparse
typechecking.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
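
Illustration (not part of the commit): a minimal sketch of the general
"__bitwise typedef" pattern for giving two flag sets distinct types under
sparse. The kernel spells the annotations __bitwise and __force; the DEMO_
macros, the demo_*_flags_t types and the flag names below are hypothetical
and are not the actual bcachefs definitions.

```c
#include <assert.h>

/*
 * sparse defines __CHECKER__; other compilers see empty macros, so the
 * annotations cost nothing in a normal build.
 */
#ifdef __CHECKER__
#define DEMO_BITWISE	__attribute__((bitwise))
#define DEMO_FORCE	__attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

/* Two separate flag types that sparse treats as incompatible. */
typedef unsigned DEMO_BITWISE demo_hash_flags_t;
typedef unsigned DEMO_BITWISE demo_commit_flags_t;

#define DEMO_HASH_MUST_CREATE	((DEMO_FORCE demo_hash_flags_t) (1U << 0))
#define DEMO_HASH_MUST_REPLACE	((DEMO_FORCE demo_hash_flags_t) (1U << 1))
#define DEMO_COMMIT_NOFAIL	((DEMO_FORCE demo_commit_flags_t) (1U << 0))

/* Accepts only hash flags; returns 1 if "must create" is set. */
static int demo_hash_must_create(demo_hash_flags_t flags)
{
	return (flags & DEMO_HASH_MUST_CREATE) != 0;
}
```

A regular compiler accepts demo_hash_must_create(DEMO_COMMIT_NOFAIL)
silently, since both types are plain unsigned there; the point of the
pattern is that sparse is expected to flag that call as a type mismatch.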


# aa62aabb 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill dead BTREE_INSERT flags

BTREE_INSERT_NOWAIT and BTREE_INSERT_GC_LOCK_HELD are no longer used,
and can be deleted.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3b59fbec 14-Sep-2023 Jiapeng Chong <jiapeng.chong@linux.alibaba.com>

bcachefs: Remove duplicate include

./fs/bcachefs/btree_update.h: journal.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6573
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6bd68ec2 12-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Heap allocate btree_trans

We're using more stack than we'd like in a number of functions, and
btree_trans is the biggest object that we stack allocate.

But we have to do a heap allocation to initialize it anyways, so
there's no real downside to heap allocating the entire thing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 96dea3d5 12-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix W=12 build errors

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# aaad530a 27-Aug-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: BTREE_ID_logged_ops

Add a new btree for long running logged operations - i.e. for logging
operations that we can't do within a single btree transaction, so that
they can be resumed if we crash.

Keys in the logged operations btree will represent operations in
progress, with the state of the operation stored in the value.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# aef32bf7 11-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: __bch2_btree_insert() -> bch2_btree_insert_trans()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1e81f89b 06-Aug-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix assorted checkpatch nits

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 85beefef 20-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_trans_update_extent_overwrite()

Factor out a new helper, to be used when fsck has to repair overlapping
extents.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ac319b4f 20-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Move some declarations to the correct header

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8e992c6c 16-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_bit_mod()

New helper for bitset btrees.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# eabb10dc 19-Jul-2023 Brian Foster <bfoster@redhat.com>

bcachefs: support btree updates of prejournaled keys

Introduce support for prejournaled key updates. This allows a
transaction to commit an update for a key that already exists (and
is pinned) in the journal. This is required for btree write buffer
updates as the current scheme of journaling both on write buffer
insertion and write buffer (slow path) flush is unsafe in certain
crash recovery scenarios.

Create a small trans update wrapper to pass along the seq where the
key resides into the btree_insert_entry. From there, trans commit
passes the seq into the btree insert path where it is used to manage
the journal pin for the associated btree leaf.

Note that this patch only introduces the underlying mechanism and
otherwise includes no functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f33c58fc 27-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill BTREE_INSERT_USE_RESERVE

Now that we have journal watermarks and alloc watermarks unified,
BTREE_INSERT_USE_RESERVE is redundant and can be deleted.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ec14fc60 27-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill JOURNAL_WATERMARK

This unifies JOURNAL_WATERMARK with BCH_WATERMARK; we're working towards
specifying watermarks once in the transaction commit path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0fb3355d 26-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve bch2_bkey_make_mut()

bch2_bkey_make_mut() now takes the bkey_s_c by reference and points it
at the new, mutable key.

This helps in some fsck paths that may have multiple repair operations
on the same key.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
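
Illustration (not part of the commit): a minimal sketch of the "take the
read-only reference by pointer and repoint it at the mutable copy" idea.
struct demo_key and demo_make_mut() are hypothetical; the real helper works
on bkey_s_c and allocates from transaction memory rather than malloc().

```c
#include <assert.h>
#include <stdlib.h>

struct demo_key {
	unsigned	pos;
	int		val;
};

/*
 * Copy the key *k refers to, then repoint *k at the copy, so the caller's
 * read-only view and the returned mutable pointer stay in sync.
 */
static struct demo_key *demo_make_mut(const struct demo_key **k)
{
	struct demo_key *mut;

	assert(k && *k);
	mut = malloc(sizeof(*mut));
	if (!mut)
		return NULL;

	*mut = **k;
	*k = mut;
	return mut;
}
```

With this shape, a second repair step that inspects the key through the
caller's read-only pointer sees the changes made by the first step, which
is the situation the commit message describes for fsck.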


# ad520141 27-May-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix corruption with writeable snapshots

When partially overwriting an extent in an older snapshot, the existing
extent has to be split.

If the existing extent was overwritten in a different (sibling)
snapshot, we have to ensure that the split won't be visible in the
sibling snapshot.

data_update.c already has code for this,
bch2_insert_snapshot_writeouts() - we just need to move it into
btree_update_leaf.c and change bch2_trans_update_extent() to use it as
well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 51e84d3b 27-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_bkey_get_empty_slot()

Add a new helper for allocating a new slot in a btree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# dbda63bb 30-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_bkey_make_mut() now calls bch2_trans_update()

It's safe to call bch2_trans_update with a k/v pair where the value
hasn't been filled out, as long as the key part has been and the value
is filled out by transaction commit time.

This patch folds the bch2_trans_update() call into bch2_bkey_make_mut(),
eliminating a bit of boilerplate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f12a798a 30-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_bkey_get_mut() now calls bch2_trans_update()

It's safe to call bch2_trans_update with a k/v pair where the value
hasn't been filled out, as long as the key part has been and the value
is filled out by transaction commit time.

This patch folds the bch2_trans_update() call into bch2_bkey_get_mut(),
eliminating a bit of boilerplate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f8cb35fd 30-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_bkey_alloc() now calls bch2_trans_update()

It's safe to call bch2_trans_update with a k/v pair where the value
hasn't been filled out, as long as the key part has been and the value
is filled out by transaction commit time.

This patch folds the bch2_trans_update() call into bch2_bkey_alloc(),
eliminating a bit of boilerplate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 34dfa5db 27-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_bkey_get_mut() improvements

- bch2_bkey_get_mut() now handles types increasing in size, allocating
a buffer for the type's current size when necessary
- bch2_bkey_make_mut_typed()
- bch2_bkey_get_mut() now initializes the iterator, like
bch2_bkey_get_iter()

Also, refactor so that most of the code is in functions - now macros are
only used for wrappers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d67a16df 30-Apr-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Move bch2_bkey_make_mut() to btree_update.h

It's for doing updates - this is where it belongs, and the next patches
will be changing these helpers to use items from btree_update.h.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 62a03559 31-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Rip out code for storing backpointers in alloc keys

We don't store backpointers in alloc keys anymore, since we gained the
btree write buffer.

This patch drops support for backpointers in alloc keys, and revs the on
disk format version so that we know a fsck is required.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 349b1d83 22-Mar-2023 Brian Foster <bfoster@redhat.com>

bcachefs: use reservation for log messages during recovery

If we block on journal reservation attempting to log journal
messages during recovery, particularly for the first message(s)
before we start doing actual work, chances are the filesystem ends
up deadlocked.

Allow logged messages to use reserved journal space to mitigate this
problem. In the worst case where no space is available whatsoever,
this at least allows the fs to recognize that the journal is stuck
and fail the mount gracefully.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 76c70c57 22-Mar-2023 Brian Foster <bfoster@redhat.com>

bcachefs: remove unused bch2_trans_log_msg()

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 83ec519a 07-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: When shutting down, flush btree node writes last

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2798143a 16-Feb-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_insert_nonextent()

This adds a new helper to delete some redundant code in
bch2_trans_update_extent().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8ffa11a2 19-Jan-2023 Daniel Hill <daniel@gluo.nz>

bcachefs: let __bch2_btree_insert() pass in flags

This patch is prep work for the following patch.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 920e69bc 03-Jan-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Btree write buffer

This adds a new method of doing btree updates - a straight write buffer,
implemented as a flat fixed size array.

This is only useful when we don't need to read from the btree in order
to do the update, and when reading is infrequent - perfect for the LRU
btree.

This will make LRU btree updates fast enough that we'll be able to use
it for persistently indexing buckets by fragmentation, which will be a
massive boost to copygc performance.

Changes:
- A new btree_insert_type enum, for btree_insert_entries. Specifies
btree, btree key cache, or btree write buffer.

- bch2_trans_update_buffered(): updates via the btree write buffer
don't need a btree path, so we need a new update path.

- Transaction commit path changes:
The update to the btree write buffer both mutates global state, and can
fail if there isn't currently room. Therefore we do all write buffer
updates in the transaction all at once, and also if it fails we have
to revert filesystem usage counter changes.

If there isn't room we flush the write buffer in the transaction
commit error path and retry.

- A new persistent option, for specifying the number of entries in the
write buffer.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
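
Illustration (not part of the commit): a minimal sketch of a write buffer
as a flat fixed-size array, where an insert fails when the buffer is full
and the caller flushes and retries. The names (write_buffer, wb_*), the
tiny capacity and the plain int array standing in for the btree are all
hypothetical; the real flush path and journalling are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define WB_NR		4	/* tiny capacity so "full" is easy to hit */
#define TABLE_NR	16	/* size of the stand-in for the btree */

struct wb_key {
	unsigned	pos;
	int		val;
};

struct write_buffer {
	struct wb_key	keys[WB_NR];
	size_t		nr;
	unsigned	flushes;
};

/* Append without reading the table; fails if there is no room. */
static int wb_insert(struct write_buffer *wb, unsigned pos, int val)
{
	assert(pos < TABLE_NR);
	if (wb->nr == WB_NR)
		return -1;
	wb->keys[wb->nr++] = (struct wb_key) { pos, val };
	return 0;
}

/* Apply buffered updates in insertion order, so the newest value wins. */
static void wb_flush(struct write_buffer *wb, int *table)
{
	size_t i;

	for (i = 0; i < wb->nr; i++)
		table[wb->keys[i].pos] = wb->keys[i].val;
	wb->nr = 0;
	wb->flushes++;
}

/* Error path: if the buffer is full, flush it and retry once. */
static void wb_update(struct write_buffer *wb, int *table,
		      unsigned pos, int val)
{
	if (wb_insert(wb, pos, val)) {
		int ret;

		wb_flush(wb, table);
		ret = wb_insert(wb, pos, val);
		assert(!ret);
		(void) ret;
	}
}
```

Note that a buffered update is not visible in the table until a flush,
which is why this approach only suits updates that do not need to read
the btree first.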


# 30ca6ece 09-Feb-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Kill trans->flags

Recursive transaction commits are occasionally necessary - in
particular, for the upcoming btree write buffer's flush path.

This avoids bugs due to trans->flags being accidentally mutated
mid-commit, which can cause c->writes refcount leaks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 464b4155 05-Feb-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix bch2_trans_reset_updates()

This should have been resetting trans->fs_usage_deltas as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5bbe3f2d 14-Dec-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Log more messages in the journal

This patch

- Adds a mechanism for queuing up journal entries prior to the journal
being started, which will be used for early journal log messages

- Adds bch2_fs_log_msg() and improves bch2_trans_log_msg(), which now
take format strings. bch2_fs_log_msg() can be used before or after
the journal has been started, and will use the appropriate mechanism.

- Deletes the now obsolete bch2_journal_log_msg()

- And adds more log messages to the recovery path - messages for
journal/filesystem started, journal entries being blacklisted, and
journal replay starting/finishing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1ff7849f 09-Oct-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_btree_insert_node() no longer uses lock_write_nofail

Now that we have an error path plumbed through, there's no need to be
using bch2_btree_node_lock_write_nofail().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 549d173c 17-Jul-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: EINTR -> BCH_ERR_transaction_restart

Now that we have error codes, with subtypes, we can switch to our own
error code for transaction restarts - and even better, a distinct error
code for each transaction restart reason: clearer code and better
debugging.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
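
Illustration (not part of the commit): a minimal sketch of private error
codes with subtypes, where each specific restart reason has the generic
"transaction restart" code as its parent and a helper walks up the parent
chain. The DEMO_ERR_ names, the numeric base and demo_err_matches() are
hypothetical, not the bcachefs definitions.

```c
#include <assert.h>

enum demo_err {
	DEMO_ERR_START = 2048,	/* assumed to sit above standard errnos */
	DEMO_ERR_transaction_restart = DEMO_ERR_START,
	DEMO_ERR_transaction_restart_relock,
	DEMO_ERR_transaction_restart_mem_realloced,
	DEMO_ERR_no_space,
	DEMO_ERR_MAX,
};

/* Parent of each private code; 0 means "no parent". */
static const int demo_err_parent[DEMO_ERR_MAX - DEMO_ERR_START] = {
	[DEMO_ERR_transaction_restart_relock - DEMO_ERR_START] =
		DEMO_ERR_transaction_restart,
	[DEMO_ERR_transaction_restart_mem_realloced - DEMO_ERR_START] =
		DEMO_ERR_transaction_restart,
};

/* Returns 1 if err is class itself or one of its subtypes. */
static int demo_err_matches(int err, int class)
{
	err	= err < 0 ? -err : err;
	class	= class < 0 ? -class : class;

	while (err) {
		if (err == class)
			return 1;
		if (err < DEMO_ERR_START || err >= DEMO_ERR_MAX)
			break;
		err = demo_err_parent[err - DEMO_ERR_START];
	}
	return 0;
}
```

Callers that only care whether a restart happened test against the parent
code, while tracing and debugging can still report the specific reason.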


# e941ae7d 17-Jul-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add a counter for btree_trans restarts

This will help us improve nested transactions - we need to add
assertions that whenever an inner transaction handles a restart, it
still returns -EINTR to the outer transaction.

This also adds nested_lockrestart_do() and nested_commit_do() which use
the new counters to correctly return -EINTR when the transaction was
restarted.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
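
Illustration (not part of the commit): a minimal sketch of how a restart
counter lets a nested retry loop report a restart to its caller even after
the inner operation eventually succeeded. demo_trans, DEMO_RESTART and the
demo_* functions are hypothetical; the real helpers are macros and, at the
time of this commit, used -EINTR as the restart code.

```c
#include <assert.h>

#define DEMO_RESTART	1	/* hypothetical restart error code */

struct demo_trans {
	unsigned	restart_count;
};

typedef int (*demo_fn)(struct demo_trans *, void *);

/* Every restart is counted at the point where it is raised. */
static int demo_restart(struct demo_trans *trans)
{
	trans->restart_count++;
	return -DEMO_RESTART;
}

/* Plain retry loop: handles restarts and hides them from the caller. */
static int demo_lockrestart_do(struct demo_trans *trans, demo_fn fn, void *arg)
{
	int ret;

	do {
		ret = fn(trans, arg);
	} while (ret == -DEMO_RESTART);

	return ret;
}

/*
 * Nested variant: retries locally, but if the counter moved while doing
 * so, the outer transaction is told that a restart happened.
 */
static int demo_nested_lockrestart_do(struct demo_trans *trans,
				      demo_fn fn, void *arg)
{
	unsigned before = trans->restart_count;
	int ret = demo_lockrestart_do(trans, fn, arg);

	if (!ret && trans->restart_count != before)
		ret = -DEMO_RESTART;
	return ret;
}

/* Test helper: restarts *arg times, then succeeds. */
static int demo_flaky(struct demo_trans *trans, void *arg)
{
	int *failures_left = arg;

	assert(failures_left);
	if (*failures_left > 0) {
		(*failures_left)--;
		return demo_restart(trans);
	}
	return 0;
}
```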


# dadecd02 14-Jul-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_run()

This adds a new helper, bch2_trans_run(), that runs a function with a
btree_transaction context but without handling transaction restarts.
We're adding checks for nested transaction restart handling: when an
inner transaction handles a transaction restart it will still have to
return it to the outer transaction, or else assertions will be popped in
the outer transaction.

But some places don't need restart handling at the outer scope, so this
helper does what they need.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# e68914ca 13-Jul-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Rename __bch2_trans_do() -> commit_do()

Better/more descriptive naming, and prep for adding
nested_lockrestart_do() and nested_commit_do().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 0fbf71f8 29-May-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_reset_updates()

Factor out a new helper.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# aae29082 09-Apr-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_btree_delete_extent_at()

New helper, for deleting extents.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 31f63fd1 14-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Introduce a separate journal watermark for copygc

Since journal reclaim -> btree key cache flushing may require the
allocation of new btree nodes, it has an implicit dependency on copygc
in order to make forward progress - so we should avoid blocking copygc
unless the journal is really close to full.

This introduces watermarks to replace our single MAY_GET_UNRESERVED bit
in the journal, and adds a watermark for copygc and plumbs it through.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5aabb324 30-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_log_msg()

Add a new helper for logging messages to the journal - a new debugging
tool, an alternative to trace_printk().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 12ce5b7d 11-Jan-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Btree key cache coherency

- Updates to non key cache iterators will now be transparently
redirected to the key cache for cached btrees.

- Except when creating new keys: then the update goes to underlying
btree

For iterating over a cached btree to work, we need to ensure that if
a key exists in the key cache, it also exists in the btree - otherwise
the iterator code will skip past it and not check the key cache.

Otherwise, for consistency, all updates should go to the same place -
the key cache.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3763cb95 25-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Don't use in-memory bucket array for alloc updates

More prep work for getting rid of the in-memory bucket array: now that
we have BTREE_ITER_WITH_JOURNAL, the allocator code can do btree lookups
before journal replay is finished, and there's no longer any need for it
to get allocation information from the in-memory bucket array.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 1f2d9192 08-Jan-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: iter->update_path

With BTREE_ITER_FILTER_SNAPSHOTS, we have to distinguish between the
path where the key was found, and the path for inserting into the
current snapshot. This adds a new field to struct btree_iter for saving
a path for the current snapshot, and plumbs it through
bch2_trans_update().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d248ee56 29-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add iter_flags arg to bch2_btree_delete_range()

Will be used by the new snapshot tests, to pass in
BTREE_ITER_ALL_SNAPSHOTS.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 94a3e1a6 04-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_update() is now __must_check

With snapshots, bch2_trans_update() has to check if we need a whiteout,
which can cause a transaction restart, so this is important now.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# f3cf0999 24-Oct-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_btree_node_rewrite() now returns transaction restarts

We have been getting away from handling transaction restarts locally -
convert bch2_btree_node_rewrite() to the newer style.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 9a796fdb 19-Oct-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_exit() no longer returns errors

Now that peek_node()/next_node() are converted to return errors
directly, we don't need bch2_trans_exit() to return errors - it's
cleaner this way and wasn't used much anymore.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# ef1669ff 19-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Update fsck for snapshots

This updates the fsck algorithms to handle snapshots - meaning there
will be multiple versions of the same key (extents, inodes, dirents,
xattrs) in different snapshots, and we have to carefully consider which
keys are visible in which snapshot.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 67e0dd8f 30-Aug-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: btree_path

This splits btree_iter into two components: btree_iter is now the
externally visible component, and it points to a btree_path which is now
reference counted.

This means we no longer have to clone iterators up front if they might
be mutated - btree_path can be shared by multiple iterators, and cloned
if an iterator would mutate a shared btree_path. This will help us use
iterators more efficiently, as well as slimming down the main long lived
state in btree_trans, and significantly cleans up the logic for iterator
lifetimes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
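
Illustration (not part of the commit): a minimal sketch of iterators that
share a reference-counted path and clone it only when one of them would
mutate a shared path. demo_path, demo_iter and the demo_iter_* functions
are hypothetical; real paths hold btree node locks and live in the
transaction, not on the heap.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_path {
	unsigned	ref;	/* number of iterators pointing here */
	unsigned	pos;
};

struct demo_iter {
	struct demo_path *path;
};

static int demo_iter_init(struct demo_iter *iter, unsigned pos)
{
	iter->path = malloc(sizeof(*iter->path));
	if (!iter->path)
		return -1;
	iter->path->ref = 1;
	iter->path->pos = pos;
	return 0;
}

/* Copying an iterator is cheap: share the path, bump the refcount. */
static void demo_iter_copy(struct demo_iter *dst, const struct demo_iter *src)
{
	dst->path = src->path;
	dst->path->ref++;
}

/* Mutation: clone first if another iterator still uses this path. */
static int demo_iter_set_pos(struct demo_iter *iter, unsigned pos)
{
	assert(iter->path);

	if (iter->path->ref > 1) {
		struct demo_path *clone = malloc(sizeof(*clone));

		if (!clone)
			return -1;
		clone->ref = 1;
		clone->pos = iter->path->pos;
		iter->path->ref--;
		iter->path = clone;
	}

	iter->path->pos = pos;
	return 0;
}

static void demo_iter_exit(struct demo_iter *iter)
{
	if (--iter->path->ref == 0)
		free(iter->path);
	iter->path = NULL;
}
```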


# 9f6bd307 24-Aug-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Reduce iter->trans usage

Disfavoured, and should go away.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 1a488e73 27-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill BTREE_INSERT_NOUNLOCK

With the recent transaction restart changes, it's no longer needed - all
transaction commits have BTREE_INSERT_NOUNLOCK semantics.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 3cc5288a 28-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Change lockrestart_do() to always call bch2_trans_begin()

More consistent behaviour means less likely to trip over ourselves in
silly ways.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 700c25b3 24-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use bch2_trans_begin() more consistently

Upcoming patch will require that a transaction restart is always
immediately followed by bch2_trans_begin().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 9f1833ca 10-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Update btree ptrs after every write

This closes a significant hole (and last known hole) in our ability to
verify metadata. Previously, since btree nodes are log structured, we
couldn't detect lost btree writes that weren't the first write to a
given node. Additionally, this seems to have led to some significant
metadata corruption on multi device filesystems with metadata
replication: since a write may have made it to one device and not
another, if we read that btree node back from the replica that did have
that write and started appending after that point, the other replica
would have a gap in the bset entries and reading from that replica
wouldn't find the rest of the bsets.

But, since updates to interior btree nodes are now journalled, we can
close this hole by updating pointers to btree nodes after every write
with the currently written number of sectors, without negatively
affecting performance. This means we will always detect lost or corrupt
metadata - it also means that our btree is now a curious hybrid of COW
and non COW btrees, with all the benefits of both (excluding
complexity).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# e3a67bdb 10-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Regularize argument passing of btree_trans

btree_trans should always be passed when we have one - iter->trans is
disfavoured. This mainly updates old code in btree_update_interior.c,
some of which predates btree_trans.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# b00fde8f 05-Jul-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE

Add a new flag to control assertions about updating to internal snapshot
nodes, that normally should not be written to - to be used in an
upcoming patch.

Also do some renaming - trigger_flags is now update_flags.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# cd8319fd 07-Jun-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill trans->updates2

Now that extent handling has been lifted to bch2_trans_update(), we
don't need to keep two different lists of updates.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# bcd25dac 24-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Rewrite btree nodes with errors

This patch adds self healing functionality for btree nodes - if we
notice a problem when reading a btree node, we just rewrite it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d3ff7fec 07-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Improved check_directory_structure()

Now that we have inode backpointers, we can simplify checking directory
structure: instead of doing a DFS from the filesystem root and then
checking if we found everything, we can iterate over every inode and see
if we can go up until we get to the root.

This patch also has a number of fixes and simplifications for the inode
backpointer checks. Also, it turns out we don't actually need the
BCH_INODE_BACKPTR_UNTRUSTED flag.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 43d00243 03-Feb-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add a mechanism for running callbacks at trans commit time

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3187aa8d 21-Dec-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Don't use BTREE_INSERT_USE_RESERVE so much

Previously, we were using BTREE_INSERT_USE_RESERVE in a lot of places
where it no longer makes sense.

- we now have more open_buckets than we used to, and the reserves work
better, so we shouldn't need to use BTREE_INSERT_USE_RESERVE just
because we're holding open_buckets pinned anymore.

- We have the btree key cache for updates to the alloc btree, meaning
we no longer need the btree reserve to ensure the allocator can make
forward progress.

This means that we should only need a reserve for btree updates to
ensure that copygc can make forward progress.

Since it's now just for copygc, we can also fold RESERVE_BTREE into
RESERVE_MOVINGGC (the allocator's freelist reserve).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 087c2019 20-Nov-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_btree_delete_range_trans()

This helps reduce stack usage by avoiding multiple btree_trans on the
stack.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2ca88e5a 07-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Btree key cache

This introduces a new kind of btree iterator, cached iterators, which
point to keys cached in a hash table. The cache also acts as a write
cache - in the update path, we journal the update but defer updating the
btree until the cached entry is flushed by journal reclaim.

Cache coherency is for now up to the users to handle, which isn't ideal
but should be good enough for now.

These new iterators will be used for updating inodes and alloc info (the
alloc and stripes btrees).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6357d607 08-Feb-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Journal updates to interior nodes

Previously, the btree has always been self contained and internally
consistent on disk without anything from the journal - the journal just
contained pointers to the btree roots.

However, this meant that btree node split or compact operations - i.e.
anything that changes btree node topology and involves updates to
interior nodes - would require that interior btree node to be written
immediately, which means emitting a btree node write that's mostly empty
(using 4k of space on disk if the filesystem blocksize is 4k to only
write perhaps ~100 bytes of new keys).

More importantly, this meant most btree node writes had to be FUA, and
consumer drives have a history of slow and/or buggy FUA support - other
filesystems have been bitten by this.

This patch changes the interior btree update path to journal updates to
interior nodes, after the writes for the new btree nodes have completed.
Best of all, it turns out to simplify the interior node update path
somewhat.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 511ed5bf 15-Mar-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Drop unused export

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e3e464ac 30-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Move extent overwrite handling out of core btree code

Ever since the btree code was first written, handling of overwriting
existing extents - including partially overwriting and splitting existing
extents - was handled as part of the core btree insert path. The modern
transaction and iterator infrastructure didn't exist then, so that was
the only way for it to be done.

This patch moves that outside of the core btree code to a pass that runs
at transaction commit time.

This is a significant simplification to the btree code and overall
reduction in code size, but more importantly it gets us much closer to
the core btree code being completely independent of extents and is
important prep work for snapshots.

This introduces a new feature bit; the old and new extent update models
are incompatible when the filesystem needs journal replay.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 163e885a 26-Feb-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill TRANS_RESET_MEM|TRANS_RESET_ITERS

All iterators should be released now with bch2_trans_iter_put(), so
TRANS_RESET_ITERS shouldn't be needed anymore, and TRANS_RESET_MEM is
always used.

Also convert more code to __bch2_trans_do().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 237e8048 18-Feb-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: introduce b->hash_val

This is partly prep work for introducing bch_btree_ptr_v2, but it'll
also be a bit of a performance boost by moving the full key out of the
hot part of struct btree.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 24326cd1 31-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Sort & deduplicate updates in bch2_trans_update()

Previously, when doing multiple updates in the same transaction commit
that overwrote each other, we relied on doing the updates in the same
order as the bch2_trans_update() calls in order to get the correct
result. But that wasn't correct for triggers; bch2_trans_mark_update()
when marking overwrites would do the wrong thing because it hadn't seen
the update that was being overwritten.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
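
Illustration (not part of the commit): a minimal sketch of what keeping
queued updates sorted and deduplicated by position could look like. The
types and the queue_update() helper below are hypothetical, simplified
stand-ins and are not the bcachefs data structures or the actual
bch2_trans_update() implementation. The sketch assumes fixed-position
keys only; overlapping extents are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real bcachefs types. */
struct pos {
	unsigned long long inode;
	unsigned long long offset;
};

struct update {
	struct pos pos;
	int value;
};

#define MAX_UPDATES 16

struct update_list {
	struct update u[MAX_UPDATES];
	size_t nr;
};

/* Order positions by inode first, then by offset. */
static int pos_cmp(struct pos l, struct pos r)
{
	if (l.inode != r.inode)
		return l.inode < r.inode ? -1 : 1;
	if (l.offset != r.offset)
		return l.offset < r.offset ? -1 : 1;
	return 0;
}

/*
 * Queue an update, keeping the list sorted by position. An update at a
 * position that is already queued replaces the earlier entry, so later
 * passes over the list see exactly one update per position.
 *
 * Returns 0 on success, -1 if the list is full.
 */
static int queue_update(struct update_list *l, struct update n)
{
	size_t i;

	assert(l->nr <= MAX_UPDATES);

	for (i = 0; i < l->nr; i++) {
		int c = pos_cmp(n.pos, l->u[i].pos);

		if (c == 0) {
			/* Same position: overwrite in place (deduplicate). */
			l->u[i] = n;
			return 0;
		}
		if (c < 0)
			break;
	}

	if (l->nr == MAX_UPDATES)
		return -1;

	/* Shift later entries up by one slot and insert at index i. */
	memmove(&l->u[i + 1], &l->u[i], (l->nr - i) * sizeof(l->u[0]));
	l->u[i] = n;
	l->nr++;
	return 0;
}
```

With a list like this, the result no longer depends on the order of the
queueing calls, and any code that walks the list sees the update that is
being overwritten at a given position rather than a stale duplicate.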


# 2d594dfb 31-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Split out btree_trigger_flags

The trigger flags really belong with individual btree_insert_entries,
not the transaction commit flags - this splits out those flags and
unifies them with the BCH_BUCKET_MARK flags. Todo - split out
btree_trigger.c from buckets.c

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 54e86b58 30-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Make btree_insert_entry more private to update path

This should be private to btree_update_leaf.c, and we might end up
removing it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8b3bbe2c 24-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Don't reexecute triggers when retrying transaction commit

This was causing a bug with transaction iterators overflowing; now, if
triggers have to be reexecuted we always return -EINTR and retry from
the start of the transaction.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 58e2388f 22-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill BTREE_INSERT_ATOMIC

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b1fd23df 22-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Convert all bch2_trans_commit() users to BTREE_INSERT_ATOMIC

BTREE_INSERT_ATOMIC should really be the default mode, and there's not
that much code that doesn't need it - so this is prep work for getting
rid of the flag.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2a9101a9 19-Oct-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Refactor bch2_trans_commit() path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8f1965391 19-Oct-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Make btree_node_type_needs_gc() cheaper

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 64bc0011 26-Sep-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Rework btree iterator lifetimes

The btree_trans struct needs to memoize/cache btree iterators, so that
on transaction restart we don't have to completely redo btree lookups,
and so that we can do them all at once in the correct order when the
transaction had to restart to avoid a deadlock.

This switches the btree iterator lookups to work based on iterator
position, instead of trying to match them up based on the stack trace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a7199432 22-Sep-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill deferred btree updates

Will be replaced by cached btree iterators

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 36e9d698 07-Sep-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Do updates in order they were queued up in

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 4430ea70 05-Sep-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill BTREE_INSERT_NOMARK_INSERT

Was dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6e738539 24-May-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Improve key marking interface

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 20bceecb 15-May-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: More work to avoid transaction restarts

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 17758a6c 11-May-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_btree_delete_at_range()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 932aa837 11-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_mark_update()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# c6dd04f8 15-Apr-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Mark overwrites from journal replay in initial gc

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 76a0537b 27-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Sort updates in bch2_trans_update()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 134915f3 21-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Go rw lazily

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 9623ab27 15-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Btree update path cleanup

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0dc17247 13-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: kill struct btree_insert

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0564b167 13-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: convert bch2_btree_insert_at() usage to bch2_trans_commit()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 94d290e4 11-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: drop btree_insert->did_work

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# c93cead0 16-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Always use bch2_extent_trim_atomic()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a8e00bd4 07-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: increase BTREE_ITER_MAX

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3e5d6c59 19-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use journal preres for deferred btree updates

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8fe826f9 13-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Convert bucket invalidation to key marking path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 430735cd 18-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Persist alloc info on clean shutdown

- Does not persist alloc info for stripes yet
- Also does not yet include filesystem block/sector counts from
struct fs_usage
- Not made use of just yet

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3636ed48 17-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Deferred btree updates

Will be used in the future for inode updates, which will be very helpful
for multithreaded workloads that have to update the inode with every
extent update (appends, or updates that change i_sectors)

Also will be used eventually for fully persistent alloc info

However - we still need a mechanism for reserving space in the journal
prior to getting a journal reservation, so it's not technically safe to
make use of this just yet: we could deadlock with the journal full
(although that's not likely to be an issue in practice).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 26609b61 01-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Make bkey types globally unique

This lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.

Also improve the on disk format versioning stuff.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# fc3268c1 08-Aug-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: kill extent_insert_hook

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# fc88796d 17-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_update() now takes struct btree_insert_entry

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1c6fdbd8 17-Mar-2017 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Initial commit

Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>