History log of /linux-master/fs/bcachefs/recovery.h
Revision Date Author Comments
# 55936afe 15-Mar-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Flag btrees with missing data

We need this to know when we should attempt to reconstruct the snapshots
btree

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d2554263 23-Mar-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Split out recovery_passes.c

We've grown a fair amount of code for managing recovery passes; tracking
which ones we're running, which ones need to be run, and flagging in the
superblock which ones need to be run on the next recovery.

So it's worth splitting out into its own file, this code is pretty
different from the code in recovery.c.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7f391b2f 06-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_run_online_recovery_passes()

Add a new helper for running online recovery passes - i.e. online fsck.
This is a subset of our normal recovery passes, and does not - for now -
use or follow c->curr_recovery_pass.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 808c680f 29-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add persistent identifiers for recovery passes

The next patch will start to refer to recovery passes from the
superblock; naturally, we now need identifiers that don't change, since
the existing enum is in the order in which they are run and is not
fixed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e8c76927 17-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: print explicit recovery pass message only once

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 401585fe 05-Aug-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: btree_journal_iter.c

Split out a new file from recovery.c for managing the list of keys we
read from the journal: before journal replay finishes the btree iterator
code needs to be able to iterate over and return keys from the journal
as well, so there's a fair bit of code here.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0ed4ca14 03-Aug-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Ensure topology repair runs

This fixes should_restart_for_topology_repair() - previously it was
returning false if the btree io path had already seleceted topology
repair to run, even if it hadn't run yet.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ad52bac2 03-Aug-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Log a message when running an explicit recovery pass

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 065bd335 10-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Version table now lists required recovery passes

Now that we've got forward compatibility sorted out, we should be doing
more frequent version upgrades in the future.

To avoid having to run a full fsck for every version upgrade, this
improves the BCH_METADATA_VERSIONS() table to explicitly specify a
bitmask of recovery passes to run when upgrading to or past a given
version.

This means we can also delete PASS_UPGRADE().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 57617902 06-Jun-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix btree_and_journal_iter

We had a bug where btree_and_journal_iter would return the same key
twice - after deleting it (perhaps because it was present in both the
btree and the journal?)

This reworks btree_and_journal_iter to track the current position, much
like btree_paths, which makes the logic considerably simpler and more
robust.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 30525f68 21-May-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix journal_keys_search() overhead

Previously, on every btree_iter_peek() operation we were searching the
journal keys, doing a full binary search - which was slow.

This patch fixes that by saving our position in the journal keys, so
that we only do a full binary search when moving our position backwards
or a large jump forwards.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 5650bb46 11-Apr-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Introduce bch2_journal_keys_peek_(upto|slot)()

When many journal replay keys have been overwritten,
bch2_journal_keys_peek() was taking excessively long to scan before it
found a key to return.

Fix this by introducing bch2_journal_keys_peek_upto() which takes a
parameter for the end of the range we want, so that we can terminate the
search much sooner, and replace all uses of bch2_journal_keys_peek()
with peek_upto() or peek_slot().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# ce6201c4 20-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use a genradix for reading journal entries

Previously, the journal read path used a linked list for storing the
journal entries we read from disk. But there's been a bug that's been
causing journal_flush_delay to incorrectly be set to 0, leading to far
more journal entries than is normal being written out, which then means
filesystems are no longer able to start due to the O(n^2) behaviour of
inserting into/searching that linked list.

Fix this by switching to a radix tree.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# d1d7737f 03-Apr-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Gap buffer for journal keys

Btree updates before we go RW work by inserting into the array of keys
that journal replay will insert - but inserting into a flat array is
O(n), meaning if btree_gc needs to update many alloc keys, we're O(n^2).

Fortunately, the updates btree_gc does happens in sequential order,
which means a gap buffer works nicely here - this patch implements a gap
buffer for journal keys.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 45e4cd9e 24-Feb-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: run_one_trigger() now checks journal keys

Previously, when doing updates and running triggers before journal
replay completes, triggers would see the incorrect key for the old key
being overwritten - this patch updates the trigger code to check the
journal keys when necessary, needed for the upcoming allocator rewrite.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 5222a460 25-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: BTREE_ITER_WITH_JOURNAL

This adds a new btree iterator flag, BTREE_ITER_WITH_JOURNAL, that is
automatically enabled when initializing a btree iterator before journal
replay has completed - it overlays the contents of the journal with the
btree.

This lets us delete bch2_btree_and_journal_walk() and just use the
normal btree iterator interface instead - which also lets us delete a
significant amount of duplicated code.

Note that BTREE_ITER_WITH_JOURNAL is still unoptimized in this patch -
we're redoing the binary search over keys in the journal every time we
call bch2_btree_iter_peek().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# dfd41fb9 31-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix race between btree updates & journal replay

Add a flag to indicate whether a journal replay key has been
overwritten, and set/test it with appropriate btree locks held.

This fixes a race between the allocator - invalidating buckets, and
doing btree updates - and journal replay, which before this patch could
clobber the allocator thread's update with an older version of the key
from the journal.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# e75b2d4c 23-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_journal_key_insert() no longer transfers ownership

bch2_journal_key_insert() used to assume that the key passed to it was
allocated with kmalloc(), and on success took ownership. This patch
deletes that behaviour, making it more similar to
bch2_trans_update()/bch2_trans_commit().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 904823de 29-Oct-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Convert bch2_mark_key() to take a btree_trans *

This helps to unify the interface between bch2_mark_key() and
bch2_trans_mark_key() - and it also gives access to the journal
reservation and journal seq in the mark_key path.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# ac1019d3 29-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Clean up bch2_btree_and_journal_walk()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5b593ee1 26-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add support for doing btree updates prior to journal replay

Some errors may need to be fixed in order for GC to successfully run -
walk and mark all metadata. But we can't start the allocators and do
normal btree updates until after GC has completed, and allocation
information is known to be consistent, so we need a different method of
doing btree updates.

Fortunately, we already have code for walking the btree while overlaying
keys from the journal to be replayed. This patch adds an update path
that adds keys to the list of keys to be replayed by journal replay, and
also fixes up iterators.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b2930396 24-May-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix reading of alloc info after unclean shutdown

When updates to interior nodes started being journalled, that meant that
after an unclean shutdown, until journal replay is done we can't walk
the btree without overlaying the updates from the journal.

The initial btree gc was changed to walk the btree overlaying keys from
the journal - but bch2_alloc_read() and bch2_stripes_read() were missed.
Major whoops...

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f1d786a0 25-Mar-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add an option for keeping journal entries after startup

This will be used by the userspace debug tools.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e62d65f2 15-Mar-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: trans_commit() path can now insert to interior nodes

This will be needed for the upcoming patches to journal updates to
interior btree nodes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e3e464ac 30-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Move extent overwrite handling out of core btree code

Ever since the btree code was first written, handling of overwriting
existing extents - including partially overwriting and splittin existing
extents - was handled as part of the core btree insert path. The modern
transaction and iterator infrastructure didn't exist then, so that was
the only way for it to be done.

This patch moves that outside of the core btree code to a pass that runs
at transaction commit time.

This is a significant simplification to the btree code and overall
reduction in code size, but more importantly it gets us much closer to
the core btree code being completely independent of extents and is
important prep work for snapshots.

This introduces a new feature bit; the old and new extent update models
are incompatible when the filesystem needs journal replay.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5c4a5cd5 27-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: btree_and_journal_iter

Introduce a new iterator that iterates over keys in the btree with keys
from the journal overlaid on top. This factors out what the erasure
coding init code was doing manually.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e222d206 12-Jul-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix ec_stripes_read()

Change it to not mark keys that will be overwritten by keys in the
journal - this fixes a bug where we pop an assertion in
bucket_set_stripe() because of a stale pointer - because the stripe that
has the stale pointer has been deleted.

This code could be factored out and used elsewhere, at some point.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d0734356 11-Apr-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Deduplicate keys in the journal before replay

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1c6fdbd8 17-Mar-2017 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Initial commit

Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>