History log of /linux-master/fs/bcachefs/buckets.h
Revision Date Author Comments
# 9abb6dd7 12-Apr-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Standardize helpers for printing enum strs with bounds checks

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e2a316b3 01-Apr-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: BCH_WATERMARK_interior_updates

This adds a new watermark, higher priority than BCH_WATERMARK_reclaim,
for interior btree updates. We've seen a deadlock where journal replay
triggers a ton of btree node merges, and these use up all available open
buckets and then interior updates get stuck.

One cause of this is that we're currently lacking btree node merging on
write buffer btrees - that needs to be fixed as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5b14ce35 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_trans_account_disk_usage_change()

The disk space accounting rewrite is splitting out accounting for each
replicas set - those are moving to btree keys, instead of percpu
counters.

This breaks bch2_trans_fs_usage_apply() up, splitting out the part we
will still need.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e58f963c 06-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: helpers for printing data types

We need bounds checking since new versions may introduce new data types.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 4f9ec59f 28-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: unify extent trigger

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f4f78779 27-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: move stripe triggers to ec.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6820ac2c 27-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: move bch2_mark_alloc() to alloc_background.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6cacd0c4 27-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: unify reservation trigger

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 282e7c37 27-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: kill mem_trigger_run_overwrite_then_insert()

now that type signatures are unified, redundant

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ad00bce0 27-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: mark now takes bkey_s

Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 717296c3 27-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: trans_mark now takes bkey_s

Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0d963a63 03-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Move reflink_p triggers into reflink.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ed0cd515 23-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_dev_usage_to_text()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 25f64e99 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't use update_cached_sectors() in bch2_mark_alloc()

bch2_update_cached_sectors_list() is closer to how the new disk space
accounting works, called from trans_mark().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 523f33ef 22-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: All triggers are BTREE_TRIGGER_WANTS_OLD_AND_NEW

Upcoming rebalance_work btree will require extent triggers to be
BTREE_TRIGGER_WANTS_OLD_AND_NEW - so to reduce potential confusion,
let's just make all triggers BTREE_TRIGGER_WANTS_OLD_AND_NEW.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# bbe682c7 21-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Ensure devices are always correctly initialized

We can't mark device superblocks or allocate journal on a device that
isn't online.

That means we may need to do this on every mount, because we may have
formatted a new filesystem and then done the first mount
(bch2_fs_initialize()) in degraded mode.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 73bbeaa2 27-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bucket_lock() is now a sleepable lock

fsck_err() may sleep - it takes a mutex and may allocate memory, so
bucket_lock() needs to be a sleepable lock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8c2d82a6 13-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Change bucket_lock() to use bit_spin_lock()

bucket_lock() previously open coded a spinlock, because we need to cram
a spinlock into a single byte.

But it turns out not all archs support xchg() on a single byte; since we
need struct bucket to be small, this means we have to play fun games
with casts and ifdefs for endianness.

This fixes building on 32 bit arm, and likely other architectures.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3764647b 13-Sep-2023 Josh Poimboeuf <jpoimboe@kernel.org>

bcachefs: Remove undefined behavior in bch2_dev_buckets_reserved()

In general it's a good idea to avoid using bare unreachable() because it
introduces undefined behavior in compiled code. In this case it even
confuses GCC into emitting an empty unused
bch2_dev_buckets_reserved.part.0() function.

Use BUG() instead, which is nice and defined. While in theory it should
never trigger, if something were to go awry and the BCH_WATERMARK_NR
case were to actually hit, the failure mode is much more robust.

Fixes the following warnings:

vmlinux.o: warning: objtool: bch2_bucket_alloc_trans() falls through to next function bch2_reset_alloc_cursors()
vmlinux.o: warning: objtool: bch2_dev_buckets_reserved.part.0() is missing an ELF size annotation

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# fb8e5b4c 05-Aug-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: sb-members.c

Split out a new file for bch_sb_field_members - we'll likely want to
move more code here in the future.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 4dc5bb9a 16-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: move inode triggers to inode.c

bit of reorg

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 494036d8 27-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: BCH_WATERMARK_reclaim

Add another watermark for journal reclaim - this is needed for the next
patches, that unify BCH_WATERMARK with JOURNAL_WATERMARK.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e53a961c 24-Jun-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Rename enum alloc_reserve -> bch_watermark

This is prep work for consolidating with JOURNAL_WATERMARK.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 962210b2 22-May-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a buffer overrun in bch2_fs_usage_read()

We were copying the size of a struct bch_fs_usage_online to a struct
bch_fs_usage, which is 8 bytes smaller.

This adds some new helpers so we can do this correctly, and get rid of
some magic +1s too.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e84face6 01-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: RESERVE_stripe

Rework stripe creation path - new algorithm for deciding when to create
new stripes or reuse existing stripes.

We add a new allocation watermark, RESERVE_stripe, above RESERVE_none.
Then we always try to create a new stripe by doing RESERVE_stripe
allocations; if this fails, we reuse an existing stripe and allocate
buckets for it with the reserve watermark for the given write
(RESERVE_none or RESERVE_movinggc).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b1cfe5ed 01-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve dev_alloc_debug_to_text()

Now we also print the number of buckets reserved for each watermark.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2611a041 01-Mar-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch2_mark_key() now takes btree_id & level

btree & level are passed to trans_mark - for backpointers -
bch2_mark_key() should take them as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a8c752bb 17-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: New on disk format: Backpointers

This patch adds backpointers: we now have a reverse index from device
and offset on that device (specifically, offset within a bucket) back to
btree nodes and (non cached) data extents.

The first 40 backpointers within a bucket are stored in the alloc key;
after that backpointers spill over to the next backpointers btree. This
is to help avoid performance regressions from additional btree updates
on large streaming workloads.

This patch adds all the code for creating, checking and repairing
backpointers. The next patch in the series is going to use backpointers
for copygc - finally getting rid of the need to scan all extents to do
copygc.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 920e69bc 03-Jan-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Btree write buffer

This adds a new method of doing btree updates - a straight write buffer,
implemented as a flat fixed size array.

This is only useful when we don't need to read from the btree in order
to do the update, and when reading is infrequent - perfect for the LRU
btree.

This will make LRU btree updates fast enough that we'll be able to use
it for persistently indexing buckets by fragmentation, which will be a
massive boost to copygc performance.

Changes:
- A new btree_insert_type enum, for btree_insert_entries. Specifies
btree, btree key cache, or btree write buffer.

- bch2_trans_update_buffered(): updates via the btree write buffer
don't need a btree path, so we need a new update path.

- Transaction commit path changes:
The update to the btree write buffer both mutates global, and can
fail if there isn't currently room. Therefore we do all write buffer
updates in the transaction all at once, and also if it fails we have
to revert filesystem usage counter changes.

If there isn't room we flush the write buffer in the transaction
commit error path and retry.

- A new persistent option, for specifying the number of entries in the
write buffer.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b2d1d56b 13-Nov-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fixes for building in userspace

- Marking a non-static function as inline doesn't actually work and is
now causing problems - drop that

- Introduce BCACHEFS_LOG_PREFIX for when we want to prefix log messages
with bcachefs (filesystem name)

- Userspace doesn't have real percpu variables (maybe we can get this
fixed someday), put an #ifdef around bch2_disk_reservation_add()
fastpath

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 77671e8f 21-Oct-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Move bkey bkey_unpack_key() to bkey.h

Long ago, bkey_unpack_key() was added to bset.h instead of bkey.h
because bkey.h didn't include btree_types.h, which it needs for the
compiled unpack function.

This patch finally moves it to the proper location.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ed80c569 21-Oct-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Optimize bch2_dev_usage_read()

- add bch2_dev_usage_read_fast(), which doesn't return by value -
bch_dev_usage is big enough that we don't want the silent memcpy
- tweak the allocation path to only call bch2_dev_usage_read() once per
bucket allocated

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 58aaa083 22-Jun-2022 Daniel Hill <daniel@gluo.nz>

bcachefs: fix __dev_available().

__dev_available() now calculates available buckets correctly. Previously
it would almost always return 0 when we have cached data.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 30f0349d 14-Jun-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Split out dev_buckets_free()

Previously, dev_buckets_available() only counted buckets that are
eligible to be allocated right now - i.e. buckets that don't have cached
data, or need discard, or need gc gens, etc.

But most users of this function want to know how many buckets are
eligible to be allocated from without moving data around - copygc,
allocator striping, which means we should be including cached data
buckets etc.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# e1b8f5f5 31-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Plumb btree_id & level to trans_mark

For backpointers, we'll need the full key location - that means btree_id
and btree level. This patch plumbs it through.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 822835ff 31-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fold bucket_state in to BCH_DATA_TYPES()

Previously, we were missing accounting for buckets in need_gc_gens and
need_discard states. This matters because buckets in those states need
other btree operations done before they can be used, so they can't be
conuted when checking current number of free buckets against the
allocation watermark.

Also, we weren't directly counting free buckets at all. Now, data type 0
== BCH_DATA_free, and free buckets are counted; this means we can get
rid of the separate (poorly defined) count of unavailable buckets.

This is a new on disk format version, with upgrade and fsck required for
the accounting changes.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# f0ac7df2 03-Apr-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Convert .key_invalid methods to printbufs

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# c6b6d416 02-Apr-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: gc mark fn fixes, cleanups

mark_stripe_bucket() was busted; it was using @new unitialized.

Also, clean up all the gc mark functions, and convert them to the same
style.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 66d90823 13-Feb-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill struct bucket_mark

This switches struct bucket to using a lock, instead of cmpxchg. And now
that the protected members no longer need to fit into a u64, we can
expand the sector counts to 32 bits.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5735608c 10-Feb-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill main in-memory bucket array

All code using the in-memory bucket array, excluding GC, has now been
converted to use the alloc btree directly - so we can finally delete it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5f43f99c 10-Feb-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_dev_usage_update() no longer depends on bucket_mark

This is one of the last steps in getting rid of the main in-memory
bucket array.

This changes bch2_dev_usage_update() to take bkey_alloc_unpacked instead
of bucket_mark, and for the places where we are in fact working with
bucket_mark and don't have bkey_alloc_unpacked, we add a wrapper that
takes bucket_mark and converts to bkey_alloc_unpacked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f25d8215 09-Jan-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill allocator threads & freelists

Now that we have new persistent data structures for the allocator, this
patch converts the allocator to use them.

Now, foreground bucket allocation uses the freespace btree to find
buckets to allocate, instead of popping buckets off the freelist.

The background allocator threads are no longer needed and are deleted,
as well as the allocator freelists. Now we only need background tasks
for invalidating buckets containing cached data (when we are low on
empty buckets), and for issuing discards.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# c6b2826c 11-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Freespace, need_discard btrees

This adds two new btrees for the upcoming allocator rewrite: an extents
btree of free buckets, and a btree for buckets awaiting discards.

We also add a new trigger for alloc keys to keep the new btrees up to
date, and a compatibility path to initialize them on existing
filesystems.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3d48a7f8 31-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: KEY_TYPE_alloc_v4

This introduces a new alloc key which doesn't use varints. Soon we'll be
adding backpointers and storing them in alloc keys, which means our
pack/unpack workflow for alloc keys won't really work - we'll need to be
mutating alloc keys in place.

Instead of bch2_alloc_unpack(), we now have bch2_alloc_to_v4() that
converts older types of alloc keys to v4 if needed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 880e2275 12-Mar-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Move trigger fns to bkey_ops

This replaces the switch statements in bch2_mark_key(),
bch2_trans_mark_key() with new bkey methods - prep work for the next
patch, which fixes BTREE_TRIGGER_WANTS_OLD_AND_NEW.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 3598c56e 24-Feb-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Consolidate trigger code a bit

Upcoming patches are doing more work on the triggers code, this patch
just moves code around.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# ae94c78f 10-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_mark_key() now takes a bkey_i *

We're now coming up with triggers that modify the update being done. A
bkey_s_c is const - bkey_i is the correct type to be using here.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# c45c8667 24-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_gc_gens() no longer uses bucket array

Like the previous patches, this converts bch2_gc_gens() to use the alloc
btree directly, and private arrays of generation numbers for its own
recalculation of oldest_gen.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 0678cbe2 10-Jan-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Ignore cached data when calculating fragmentation

Previously, bucket fragmentation was considered to be bucket size -
total amount of live data, both dirty and cached.

This meant that if a bucket was full but only a small amount of data in
it was dirty - the rest cached, we'd get stuck: copygc wouldn't move the
dirty data out of the bucket and the allocator wouldn't be able to
invalidate and drop the cached data.

This changes fragmentation to exclude cached data, so that copygc will
evacuate these buckets and copygc/the allocator will always be able to
make forward progress.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 21aec962 04-Jan-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: New data structure for buckets waiting on journal commit

Implement a hash table, using cuckoo hashing, for empty buckets that are
waiting on a journal commit before they can be reused.

This replaces the journal_seq field of bucket_mark, and is part of
eventually getting rid of the in memory bucket array.

We may need to make bch2_bucket_needs_journal_commit() lockless, pending
profiling and testing.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# a7860877 25-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: New in-memory array for bucket gens

The main in-memory bucket array is going away, but we'll still need to
keep bucket generations in memory, at least for now - ptr_stale() needs
to be an efficient operation.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 47ac34ec 25-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Separate out gc_bucket()

Since the main in memory bucket array is going away, we don't want to be
calling bucket() or __bucket() when what we want is the GC in-memory
bucket.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 4b674b09 24-Dec-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill ptr_bucket_mark()

Only used in one place, we can just delete it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 58e1ea4b 28-Nov-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Push c->mark_lock usage down to where it is needed

This changes the bch2_mark_key() and related paths to take mark lock
where it is needed, instead of taking it in the upper transaction commit
path - by pushing down locking we'll be able to handle fsck errors
locally instead of requiring a separate check in the btree_gc code for
replicas being marked.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 502cfb35 28-Nov-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill bch2_replicas_delta_list_marked()

This changes bch2_trans_fs_usage_apply() to handle failure (replicas
entry missing) by reverting the changes it made - meaning we can make
the main transaction commit path a bit slimmer, and perhaps also
simplify some locking in upcoming patches.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# f0c3f88b 26-Oct-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Run insert triggers before overwrite triggers

Currently, btree triggers are run in natural key order, which presents a
problem for fallocate in INSERT_RANGE mode: since we're moving existing
extents to higher offsets, the trigger for deleting the old extent runs
before the trigger that adds the new extent, potentially leading to
indirect extents being deleted that shouldn't be when the delete causes
the refcount to hit 0.

This changes the order we run triggers so that for a givin btree, we run
all insert triggers before overwrite triggers, nicely sidestepping this
issue.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 904823de 29-Oct-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Convert bch2_mark_key() to take a btree_trans *

This helps to unify the interface between bch2_mark_key() and
bch2_trans_mark_key() - and it also gives access to the journal
reservation and journal seq in the mark_key path.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 67e0dd8f 30-Aug-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: btree_path

This splits btree_iter into two components: btree_iter is now the
externally visible componont, and it points to a btree_path which is now
reference counted.

This means we no longer have to clone iterators up front if they might
be mutated - btree_path can be shared by multiple iterators, and cloned
if an iterator would mutate a shared btree_path. This will help us use
iterators more efficiently, as well as slimming down the main long lived
state in btree_trans, and significantly cleans up the logic for iterator
lifetimes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 297d8934 10-Jun-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Extensive triggers cleanups

- We no longer mark subsets of extents, they're marked like regular
keys now - which means we can drop the offset & sectors arguments
to trigger functions
- Drop other arguments that are no longer needed anymore in various
places - fs_usage
- Drop the logic for handling extents in bch2_mark_update() that isn't
needed anymore, to match bch2_trans_mark_update()
- Better logic for hanlding the BTREE_ITER_CACHED_NOFILL case, where we
don't have an old key to mark

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 7e94eeff 31-Oct-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Inline fastpath of bch2_disk_reservation_add()

The fastpath now doesn't even disable preemption - instead we use a (non
locked) cmpxchg.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ed343411 18-May-2021 Dan Robertson <dan@dlrobertson.com>

bcachefs: statfs resports incorrect avail blocks

The current implementation of bch_statfs does not scale the number of
available blocks provided in f_bavail by the reserve factor. This causes
an allocation of a file of this size to fail.

Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# dac1525d 16-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: gc shouldn't care about owned_by_allocator

The owned_by_allocator field is a purely in memory thing, even if/when
we bring back GC at runtime there's no need for it to be recalculating
this field. This is prep work for pulling it out of struct bucket, and
eventually getting rid of the bucket array.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d62ab355 14-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix bch2_trans_mark_dev_sb()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 35d5aff2 03-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill bch2_fs_usage_scratch_get()

This is an important cleanup, eliminating an unnecessary copy in the
transaction commit path.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 9c2e6242 03-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix livelock calling bch2_mark_bkey_replicas()

The bug was that we were trying to find a replicas entry that wasn't
sorted - but, we can also simplify the code by not using
bch2_mark_bkey_replicas and instead ensuring the list of replicas
entries exists directly.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 65bcd657 28-Mar-2021 Kent Overstreet <kent.overstreet@gmail.com>

buckets.c fixups XXX squash

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cb66fc5f 13-Apr-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix copygc threshold

Awhile back the meaning of is_available_bucket() and thus also
bch_dev_usage->buckets_unavailable changed to include buckets that are
owned by the allocator - this was so that the stat could be persisted
like other allocation information, and wouldn't have to be regenerated
by walking each bucket at mount time.

This broke copygc, which needs to consider buckets that are reclaimable
and haven't yet been grabbed by the allocator thread and moved onta
freelist. This patch fixes that by adding dev_buckets_reclaimable() for
copygc and the allocator thread, and cleans up some of the callers a bit.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 180fb49d 21-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Journal updates to dev usage

This eliminates the need to scan every bucket to regenerate dev_usage at
mount time.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2abe5420 21-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Persist 64 bit io clocks

Originally, bcachefs - going back to bcache - stored, for each bucket, a
16 bit counter corresponding to how long it had been since the bucket
was read from. But, this required periodically rescaling counters on
every bucket to avoid wraparound. That wasn't an issue in bcache, where
we'd perodically rewrite the per bucket metadata all at once, but in
bcachefs we're trying to avoid having to walk every single bucket.

This patch switches to persisting 64 bit io clocks, corresponding to the
64 bit bucket timestaps introduced in the previous patch with
KEY_TYPE_alloc_v2.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# bfcf840d 22-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Mark superblocks transactionally

More work towards getting rid of the in memory struct bucket: this path
adds code for marking superblock and journal buckets via the btree, and
uses it in the device add and journal resize paths.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 9afc6652 22-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill bch2_invalidate_bucket()

This patch is working towards eventually getting rid of the in memory
struct bucket, and relying only on the btree representation.

Since bch2_invalidate_bucket() was only used for incrementing gens, not
invalidating cached data, no other counters were being changed as a side
effect - meaning it's safe for the allocator code to increment the
bucket gen directly.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 72eab8da 21-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Refactor dev usage

This is to make it more amenable for serialization.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cd9f3dfe 17-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix integer overflow in bch2_disk_reservation_get()

The sectors argument shouldn't have been a u32 - it can be up to U32_MAX
(i.e. fallocate creating persistent reservations), and if replication is
enabled we'll overflow when we calculate the real number of sectors to
reserve. Oops.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f30dd860 16-Oct-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Don't write bucket IO time lazily

With the btree key cache code, we don't need to update the alloc btree
lazily - and this will mean we can remove the bch2_alloc_write() call in
the shutdown path.

Future work: we really need to expend the bucket IO clocks from 16 to 64
bits, so that we don't have to rescale them.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 719fe7fb 10-Dec-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Update transactional triggers interface to pass old & new keys

This is needed to fix a bug where we're overflowing iterators within a
btree transaction, because we're updating the stripes btree (to update
block counts) and the stripes btree trigger is unnecessarily updating
the alloc btree - it doesn't need to update the alloc btree when the
pointers within a stripe aren't changing.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3d080aa5 22-Jul-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Delete unused arguments

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 89fd25be 09-Jul-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use x-macros for data types

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e63534a2 06-Jul-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Rework triggers interface

The trigger for stripe keys is shortly going to need both the old and
the new key passed to the trigger - this patch does that rework.

For now, this just changes the in memory triggers, and this doesn't
change how extent triggers work.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 697e45b2 06-Jul-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Kill BTREE_TRIGGER_NOOVERWRITES

This is prep work for reworking the triggers machinery - we have
triggers that need to know both the old and the new key.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 00b8ccf7 25-May-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Interior btree updates are now fully transactional

We now update the alloc info (bucket sector counts) atomically with
journalling the update to the interior btree nodes, and we also set new
btree roots atomically with the journalled part of the btree update.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e3e464ac 30-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Move extent overwrite handling out of core btree code

Ever since the btree code was first written, handling of overwriting
existing extents - including partially overwriting and splittin existing
extents - was handled as part of the core btree insert path. The modern
transaction and iterator infrastructure didn't exist then, so that was
the only way for it to be done.

This patch moves that outside of the core btree code to a pass that runs
at transaction commit time.

This is a significant simplification to the btree code and overall
reduction in code size, but more importantly it gets us much closer to
the core btree code being completely independent of extents and is
important prep work for snapshots.

This introduces a new feature bit; the old and new extent update models
are incompatible when the filesystem needs journal replay.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 548b3d20 07-Feb-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: btree_ptr_v2

Add a new btree ptr type which contains the sequence number (random 64
bit cookie, actually) for that btree node - this lets us verify that
when we read in a btree node it really is the btree node we wanted.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2d594dfb 31-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Split out btree_trigger_flags

The trigger flags really belong with individual btree_insert_entries,
not the transaction commit flags - this splits out those flags and
unifies them with the BCH_BUCKET_MARK flags. Todo - split out
btree_trigger.c from buckets.c

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 54e86b58 30-Dec-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Make btree_insert_entry more private to update path

This should be private to btree_update_leaf.c, and we might end up
removing it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b7ba66c8 28-Oct-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Inline more of bch2_trans_commit hot path

The main optimization here is that if we let
bch2_replicas_delta_list_apply() fail, we can completely skip calling
bch2_bkey_replicas_marked_locked().

And assorted other small optimizations.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 43de7376 07-Oct-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix erasure coding disk space accounting

Disk space accounting for erasure coding + compression was completely
broken - we need to calculate the parity sectors delta the same way we
calculate disk_sectors, by calculating the old and new usage and
subtracting to get the difference.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 06ab329c 29-Aug-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Improve pointer marking checks and error messages

Importantly, we don't want to use bch2_fs_inconsistent_on() for errors
that fsck can repair, becuase that will just put us in RO mode and
prevent fsck from actually fixing stuff. Probably want to get rid of it
in the future.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2cbe5cfe 09-Aug-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Rework calling convention for marking overwrites

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 88767d65 24-Jun-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Update path now handles triggers that generate more triggers

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6e738539 24-May-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Improve key marking interface

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3838be78 15-May-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Don't use a fixed size buffer for fs_usage_deltas

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6fb076e6 14-May-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix spurious inconsistency in recovery

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 932aa837 11-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_trans_mark_update()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5e82a9a1 10-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Write out fs usage consistently

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 94f651e2 17-Apr-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Return errors from for_each_btree_key()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# c6dd04f8 15-Apr-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Mark overwrites from journal replay in initial gc

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a1d58243 29-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: add ability to run gc on metadata only

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 36e916e1 29-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Caller now responsible for calling mark_key for gc

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 4d8100da 15-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Allocate fs_usage in do_btree_insert_at()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0dc17247 13-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: kill struct btree_insert

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 28062d32 20-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix gc handling of bucket gens

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ecf37a4a 14-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: fs_usage_u64s()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 768ac639 14-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add a mechanism for blocking the journal

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8fe826f9 13-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Convert bucket invalidation to key marking path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8777210b 12-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: refactor key marking code a bit

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2ecc6171 12-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix double counting when gc is running

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 39fbc5a4 11-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: gc lock no longer needed for disk reservations

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 76f4c7b0 11-Feb-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix oldest_gen handling

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 3e0745e2 24-Jan-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: initialize fs usage summary in recovery

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 430735cd 18-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Persist alloc info on clean shutdown

- Does not persist alloc info for stripes yet
- Also does not yet include filesystem block/sector counts yet, from
struct fs_usage
- Not made use of just yet

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7ef2a73a 21-Jan-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Fix check for if extent update is allocating

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 23f80d2b 17-Dec-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Factor out acc_u64s()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5663a415 27-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: refactor bch_fs_usage

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 641ab736 06-Dec-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: improve/clarify ptr_disk_sectors()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 9166b41d 25-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: s/usage_lock/mark_lock

better describes what it's for, and we're going to call a new lock
usage_lock

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8eb7f3ee 18-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: move dirty into bucket_mark

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 26609b61 01-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Make bkey types globally unique

this lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.

Also improve the on disk format versioning stuff.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# eeb83e25 22-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Hold usage_lock over mark_key and fs_usage_apply

Fixes an inconsistency at the end of gc

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# dfe9bfb3 24-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Stripes now properly subject to gc

gc now verifies the contents of the stripes radix tree, important for
persistent alloc info

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 9ca53b55 23-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: gc now operates on second set of bucket marks

This means we can now use gc to verify the allocation information -
important for testing persistant alloc info

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cd575ddf 01-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Erasure coding

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b35b1925 05-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Move key marking out of extents.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b092dadd 04-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Scale down number of writepoints when low on space

this means we don't have to reserve space for them when calculating
filesystem capacity

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 47799326 01-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: more key marking refactoring

prep work for erasure coding

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5b650fd1 24-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Account for internal fragmentation better

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 09f3297a 24-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: kill s_alloc, use bch_data_type

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a7c7a309 23-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: bch2_mark_key() now takes bch_data_type

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b29e197a 22-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Invalidate buckets when writing to alloc btree

Prep work for persistent alloc information. Refactoring also lets us
make free_inc much smaller, which means a lot fewer buckets stranded on
freelists.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b2be7c8b 22-Jul-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: kill bucket mark sector count saturation

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1c6fdbd8 17-Mar-2017 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Initial commit

Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>