History log of /linux-master/fs/bcachefs/rebalance.c
Revision Date Author Comments
# ba78af9e 18-Jan-2024 Daniel Hill <daniel@gluo.nz>

bcachefs: rebalance_status now shows correct units

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d7e77f53 16-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: opts->compression can now also be applied in the background

The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.
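
The fallback described above can be sketched roughly as follows. This is an illustration only, not the bcachefs code: `struct io_opts_sketch` and `background_compression_sketch()` are hypothetical names, and the value 0 is assumed to mean "option not set".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the IO options; 0 is assumed to mean "not set". */
struct io_opts_sketch {
	unsigned compression;
	unsigned background_compression;
};

/*
 * Pick the method the background path should apply:
 * background_compression if set, otherwise fall back to compression.
 */
static unsigned background_compression_sketch(const struct io_opts_sketch *opts)
{
	assert(opts != NULL);
	return opts->background_compression
		? opts->background_compression
		: opts->compression;
}
```

With only `compression` set, the background path in this sketch returns that value, which is why changing `compression` alone causes existing data to be recompressed.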

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# aead3428 16-Jan-2024 Colin Ian King <colin.i.king@gmail.com>

bcachefs: remove redundant variable tmp

The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.

Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 189c176c 15-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve move_extent tracepoint

Also print out the data_opts, so that we can see what specifically is
being done to an extent.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ef740a1e 15-Jan-2024 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add missing bch2_moving_ctxt_flush_all()

This fixes a bug with rebalance IOs getting stuck with reads completed,
but writes never being issued.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cf904c8d 16-Dec-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch_err_(fn|msg) check if should print

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0c069781 25-Nov-2023 Daniel Hill <daniel@gluo.nz>

bcachefs: rebalance should wakeup on shutdown if disabled

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 25d1e39d 24-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add a rebalance, data_update tracepoints

Add a tracepoint for rebalance, printing out
- the target option
- the compression option
- the key being rebalanced

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cb52d23e 11-Nov-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Rename BTREE_INSERT flags

BTREE_INSERT flags are actually transaction commit flags - rename them
for clarity.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f82755e4 30-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Data move path now uses bch2_trans_unlock_long()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1f7056b7 30-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Ensure copygc does not spin

If copygc does no work - finds no fragmented buckets - wait for a bit of
IO to happen.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# fb3f57bb 20-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: rebalance_work

This adds a new btree, rebalance_work, to eliminate scanning required
for finding extents that need work done on them in the background - i.e.
for the background_target and background_compression options.

rebalance_work is a bitset btree, where a KEY_TYPE_set corresponds to an
extent in the extents or reflink btree at the same pos.

A new extent field is added, bch_extent_rebalance, which indicates that
this extent has work that needs to be done in the background - and which
options to use. This allows per-inode options to be propagated to
indirect extents - at least in some circumstances. In this patch,
changing IO options on a file will not propagate the new options to
indirect extents pointed to by that file.

Updating (setting/clearing) the rebalance_work btree is done by the
extent trigger, which looks at the bch_extent_rebalance field.

Scanning is still required after changing IO path options - either just
for a given inode, or for the whole filesystem. We indicate that
scanning is required by adding a KEY_TYPE_cookie key to the
rebalance_work btree: the cookie counter is so that we can detect that
scanning is still required when an option has been flipped mid-way
through an existing scan.
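
As a toy model of the two mechanisms above (a bitset mirroring "this extent needs work", and a cookie counter that detects option changes during a scan), consider the sketch below. All names are hypothetical, and a 64-bit word stands in for the btree; the real code stores keys in a btree rather than bits in an integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_POS 64	/* toy key space: positions 0..63 */

/* Toy model of the rebalance_work btree: one bit per extent position. */
struct rebalance_work_sketch {
	uint64_t set;		/* bit i set: extent at pos i needs work */
	uint64_t cookie;	/* bumped whenever an IO option changes */
};

/* Models the extent trigger: mirror "extent needs work" into the bitset. */
static void extent_trigger_sketch(struct rebalance_work_sketch *w,
				  unsigned pos, bool needs_work)
{
	assert(pos < NR_POS);
	if (needs_work)
		w->set |= UINT64_C(1) << pos;
	else
		w->set &= ~(UINT64_C(1) << pos);
}

/* Changing an option requests a scan by bumping the cookie. */
static void option_changed_sketch(struct rebalance_work_sketch *w)
{
	w->cookie++;
}

/*
 * A scan records the cookie when it starts; the scan request may only
 * be cleared if the cookie is unchanged when the scan finishes.
 */
static bool scan_still_required_sketch(const struct rebalance_work_sketch *w,
				       uint64_t cookie_at_scan_start)
{
	return w->cookie != cookie_at_scan_start;
}
```

If an option is flipped while a scan is in progress, the cookie no longer matches the value recorded at scan start, so the scan request is kept.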

Future possible work:
- Propagate options to indirect extents when being changed
- Add other IO path options - nr_replicas, ec, to rebalance_work so
  they can be applied in the background when they change
- Add a counter, for bcachefs fs usage output, showing the pending
  amount of rebalance work: we'll probably want to do this after the
  disk space accounting rewrite (moving it to a new btree)

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a0bfe3b0 20-Oct-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: move.c exports, refactoring

Prep work for the new rebalance code - we need a few helpers exported.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1809b8cb 10-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Break up io.c

More reorganization, this splits up io.c into
- io_read.c
- io_misc.c - fallocate, fpunch, truncate
- io_write.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e46c181a 10-Sep-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Convert more code to bch_err_msg()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d0445e13 17-Aug-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix divide by zero in rebalance_work()

This fixes https://github.com/koverstreet/bcachefs-tools/issues/159

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 986e9842 12-Jul-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Compression levels

This allows including a compression level when specifying a compression
type, e.g.
compression=zstd:15

Values from 1 through 15 indicate compression levels, 0 or unspecified
indicates the default.

For LZ4, values 3-15 specify that the HC algorithm should be used.

Note that for compatibility, extents themselves only include the
compression type, not the compression level. This means that specifying
the same compression algorithm but different compression levels for the
compression and background_compression options will have no effect.
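
A minimal sketch of parsing the `type:level` syntax described above. The parser, its struct, and its error convention are assumptions made for illustration and do not reflect the actual bcachefs option parser; only the 0-15 range and the LZ4 HC threshold come from the text.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of "type[:level]"; level 0 means default. */
struct compression_opt_sketch {
	char		type[16];
	unsigned	level;
};

/* Returns 0 on success, -EINVAL on malformed input or level > 15. */
static int parse_compression_sketch(const char *s,
				    struct compression_opt_sketch *out)
{
	assert(s != NULL && out != NULL);

	const char *colon = strchr(s, ':');
	size_t len = colon ? (size_t)(colon - s) : strlen(s);

	if (!len || len >= sizeof(out->type))
		return -EINVAL;

	memcpy(out->type, s, len);
	out->type[len] = '\0';
	out->level = 0;		/* unspecified: use the default */

	if (colon) {
		char *end;
		unsigned long v;

		/* Reject an empty level, signs and leading whitespace. */
		if (colon[1] < '0' || colon[1] > '9')
			return -EINVAL;

		v = strtoul(colon + 1, &end, 10);
		if (*end || v > 15)
			return -EINVAL;
		out->level = (unsigned)v;
	}
	return 0;
}

/* Per the text above: for LZ4, levels 3-15 select the HC algorithm. */
static int lz4_uses_hc_sketch(const struct compression_opt_sketch *o)
{
	return !strcmp(o->type, "lz4") && o->level >= 3;
}
```

Because extents record only the type, two options that parse to the same `type` but different `level` values are indistinguishable once the data is written.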

XXX: perhaps we could add a warning for this

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5bc74082 30-May-2023 Brian Foster <bfoster@redhat.com>

bcachefs: don't spin in rebalance when background target is not usable

If a bcachefs filesystem is configured with a background device
(disk group), rebalance will relocate data to this device in the
background by checking extent keys for whether they currently reside
in the specified target. For keys that do not, rebalance performs a
read/write cycle to allow the write path to properly relocate data.

If the background target is not usable (read-only, for example),
however, the write path doesn't actually move data to another
device. Instead, rebalance spins indefinitely reading and rewriting
the same data over and over to the same device. If the background
target is made available again, the rebalance picks this up,
relocates the data, and eventually terminates.

To avoid this spinning behavior, update the rebalance background
target logic to not only check whether the extent is not in the
target, but whether the target is actually usable as well. If not,
then don't mark the key for rewrite.
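
The changed check can be illustrated with a small predicate. The types and the function name below are hypothetical; the sketch only captures the logic stated above (rewrite an extent only when it is outside the target and the target can actually accept writes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a background target; false when e.g. read-only. */
struct target_sketch {
	bool usable;
};

/* Hypothetical view of an extent key. */
struct extent_sketch {
	bool in_target;		/* data already resides in the target */
};

/*
 * Old logic (as described): rewrite whenever !in_target, which spins
 * if the write path cannot place data on the target.
 * New logic: additionally require the target to be usable.
 */
static bool mark_for_rewrite_sketch(const struct extent_sketch *e,
				    const struct target_sketch *t)
{
	assert(e != NULL && t != NULL);
	return !e->in_target && t->usable;
}
```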

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b2d1d56b 13-Nov-2022 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fixes for building in userspace

- Marking a non-static function as inline doesn't actually work and is
  now causing problems - drop that

- Introduce BCACHEFS_LOG_PREFIX for when we want to prefix log messages
  with bcachefs (filesystem name)

- Userspace doesn't have real percpu variables (maybe we can get this
  fixed someday), put an #ifdef around bch2_disk_reservation_add()
  fastpath

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# d4bf5eec 18-Jul-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use bch2_err_str() in error messages

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 0337cc7e 20-Jun-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: move.c refactoring

- add bch2_moving_ctxt_(init|exit)
- split out __bch2_evacuate_bucket() which takes an existing
  moving_ctxt, this will be used for improving copygc performance by
  pipelining across multiple buckets

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# c91996c5 15-Jun-2022 Daniel Hill <daniel@gluo.nz>

bcachefs: data jobs, including rebalance wait for copygc.

move_ratelimit() now has a bool that specifies whether we want to
wait for copygc to finish.

When copygc is running, we're probably low on free buckets; instead
of consuming the remaining buckets, we want to wait for copygc to
finish.

This should help with performance, and with runaway bucket fragmentation.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7f5c5d20 13-Jun-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Redo data_update interface

This patch significantly cleans up and simplifies the data_update
interface. Instead of only being able to specify a single pointer by
device to rewrite, we're now able to specify any or all of the pointers
in the original extent to be rewritten, as a bitmask.

data_cmd is no more: the various pred functions now just return true if
the extent should be moved/updated. All the data_update path does is
rewrite existing replicas, or add new ones.

This fixes a bug with background compression on replicated
filesystems, where rebalance -> data_update would drop the wrong old
replica, and keep trying to recompress an extent pointer, each time
failing to drop the right replica. Oops.

Now, the data update path doesn't look at the io options to decide which
pointers to keep and which to drop - it only goes off of the
data_update_options passed to it.
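
To illustrate the bitmask idea, here is a small sketch in which a pred-style function computes which pointers of an extent should be rewritten. The structures and names are hypothetical and much simpler than the real data_update_options; device numbers are assumed to be below 32 so they fit in an `unsigned` mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PTRS_SKETCH 8

/* Hypothetical extent: each pointer records the device it lives on. */
struct extent_ptrs_sketch {
	unsigned nr;
	unsigned dev[MAX_PTRS_SKETCH];
};

/*
 * Build a bitmask with bit i set if pointer i should be rewritten,
 * here because its device is not in the wanted device mask.
 */
static unsigned rewrite_ptrs_mask_sketch(const struct extent_ptrs_sketch *e,
					 unsigned wanted_dev_mask)
{
	unsigned mask = 0;

	assert(e != NULL && e->nr <= MAX_PTRS_SKETCH);
	for (unsigned i = 0; i < e->nr; i++) {
		assert(e->dev[i] < 32);
		if (!(wanted_dev_mask & (1U << e->dev[i])))
			mask |= 1U << i;
	}
	return mask;
}

/* Pred-style helper: true if the extent should be moved/updated at all. */
static bool extent_needs_update_sketch(const struct extent_ptrs_sketch *e,
				       unsigned wanted_dev_mask)
{
	return rewrite_ptrs_mask_sketch(e, wanted_dev_mask) != 0;
}
```

Since the mask names pointers by index, the update path in this sketch knows exactly which replicas to replace and needs no further inspection of the IO options.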

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>


# 401ec4db 03-Feb-2023 Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Printbuf rework

This converts bcachefs to the modern printbuf interface/implementation,
synced with the version to be submitted upstream.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# fa8e94fa 25-Feb-2022 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Heap allocate printbufs

This patch changes printbufs to dynamically allocate and reallocate a
buffer as needed. Stack usage has become a bit of a problem, and a major
cause of that has been static size string buffers on the stack.

The most involved part of this refactoring is that printbufs must now be
exited with printbuf_exit().
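
The general pattern (a print buffer that grows on the heap and must be explicitly released) can be sketched in userspace C as follows. This is not the kernel printbuf API: `struct pbuf_sketch` and its helpers are hypothetical and use `realloc()`/`vsnprintf()` from the C standard library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical heap-backed print buffer. */
struct pbuf_sketch {
	char	*buf;
	size_t	size;	/* bytes allocated */
	size_t	pos;	/* bytes used, excluding the terminating NUL */
};

#define PBUF_SKETCH_INIT { NULL, 0, 0 }

/* Append formatted text, growing the buffer as needed. Returns 0 or -1. */
static int pbuf_printf_sketch(struct pbuf_sketch *p, const char *fmt, ...)
{
	va_list ap;
	int len;

	assert(p != NULL && fmt != NULL);

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure the output first */
	va_end(ap);
	if (len < 0)
		return -1;

	if (p->pos + (size_t)len + 1 > p->size) {
		size_t new_size = p->pos + (size_t)len + 1;
		char *n = realloc(p->buf, new_size);

		if (!n)
			return -1;	/* old buffer is still valid */
		p->buf = n;
		p->size = new_size;
	}

	va_start(ap, fmt);
	vsnprintf(p->buf + p->pos, p->size - p->pos, fmt, ap);
	va_end(ap);
	p->pos += (size_t)len;
	return 0;
}

/* Every initialized buffer must be released, or the allocation leaks. */
static void pbuf_exit_sketch(struct pbuf_sketch *p)
{
	free(p->buf);
	p->buf = NULL;
	p->size = p->pos = 0;
}
```

The trade-off is the one the commit describes: no fixed-size buffer on the stack, at the cost of every user having to pair initialization with an exit call.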

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 8dd6ed94 23-Jul-2021 Brett Holman <bholman.devel@gmail.com>

bcachefs: add progress stats to sysfs

This adds progress stats to sysfs for copygc, rebalance, recovery, and the
cmd_job ioctls.

Signed-off-by: Brett Holman <bholman.devel@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# f020bfcd 04-Mar-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Use bch2_bpos_to_text() more consistently

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# a4805d66 22-Mar-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Scan for old btree nodes if necessary on mount

We dropped support for !BTREE_NODE_NEW_EXTENT_OVERWRITE but it turned
out there were people who still had filesystems with btree nodes in that
format in the wild. This adds a new compat feature that indicates we've
scanned for and rewritten nodes in the old format, and does that scan at
mount time if the option isn't set.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1889ad5a 14-Mar-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add code to scan for/rewrite old btree nodes

This adds a new data job type to scan for btree nodes in the old extent
format, and rewrite them.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# dab9ef0d 23-Feb-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add error message for some allocation failures

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 2abe5420 21-Jan-2021 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Persist 64 bit io clocks

Originally, bcachefs - going back to bcache - stored, for each bucket, a
16 bit counter corresponding to how long it had been since the bucket
was read from. But, this required periodically rescaling counters on
every bucket to avoid wraparound. That wasn't an issue in bcache, where
we'd periodically rewrite the per bucket metadata all at once, but in
bcachefs we're trying to avoid having to walk every single bucket.

This patch switches to persisting 64 bit io clocks, corresponding to the
64 bit bucket timestamps introduced in the previous patch with
KEY_TYPE_alloc_v2.
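
The wraparound problem with 16 bit counters can be seen in a short sketch. The helper names are hypothetical and the "clock" is just an integer tick count; the point is only that a 16 bit age is ambiguous after 65536 ticks, while a 64 bit timestamp is not.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Age computed from 16 bit clock values: the subtraction is done
 * modulo 65536, so anything older than 65535 ticks aliases to a
 * smaller age unless counters are periodically rescaled.
 */
static uint16_t age16_sketch(uint16_t now, uint16_t last_read)
{
	return (uint16_t)(now - last_read);
}

/* Age computed from 64 bit clock values: no practical wraparound. */
static uint64_t age64_sketch(uint64_t now, uint64_t last_read)
{
	assert(now >= last_read);
	return now - last_read;
}
```

For a bucket last read at tick 0 and a current time of tick 65546, the 16 bit version sees a truncated clock value of 10 and reports an age of 10, whereas the 64 bit version reports 65546.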

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b7a9bbfc 19-Nov-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Move journal reclaim to a kthread

This is to make tracing easier.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# b88e971e 22-Jul-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Don't drop replicas when copygcing ec data

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7807e143 25-Jul-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Convert various code to printbuf

printbufs know how big the buffer is that was allocated, so we can get
rid of the random PAGE_SIZEs all over the place.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# e77e4efc 07-Apr-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Account for ioclock slop when throttling rebalance thread

This should fix an issue where the rebalance thread was spinning.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# ab05de4c 23-Feb-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Track incompressible data

This fixes the background_compression option: without some way of marking
data as incompressible, rebalance will keep rewriting incompressible
data over and over.
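
A sketch of why the marker is needed: without the `incompressible` flag below, an extent that cannot be compressed would always look like it still needs work. The struct, field names, and predicate are hypothetical, and 0 is assumed to mean "no compression".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-extent state relevant to background compression. */
struct extent_state_sketch {
	unsigned compression_type;	/* 0: stored uncompressed */
	bool	 incompressible;	/* set after a failed attempt */
};

/* Should rebalance rewrite this extent to apply background compression? */
static bool needs_background_compress_sketch(const struct extent_state_sketch *e,
					     unsigned background_compression)
{
	assert(e != NULL);

	if (!background_compression)
		return false;		/* option not set: nothing to do */
	if (e->incompressible)
		return false;		/* already tried; do not retry */
	return e->compression_type != background_compression;
}
```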

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 182084e3 20-Jan-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Refactor rebalance_pred function

Before, the logic for if we should move an extent was duplicated
somewhat, in both rebalance_add_key() and rebalance_pred(); this
centralizes that in __rebalance_pred()

This is prep work for a patch that enables marking data as
incompressible.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 6876d2ab 16-Jan-2020 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Add a cond_resched() to rebalance loop

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 5055b509 06-Sep-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Rebalance now adds replicas if needed

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 99aaf570 25-Jul-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Refactor various code to not be extent specific

With reflink, various code now has to handle both KEY_TYPE_extent
or KEY_TYPE_reflink_v - so, convert it to be generic across all keys
with pointers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 424eb881 25-Mar-2019 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Only get btree iters from btree transactions

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 0b847a19 18-Dec-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Lots of option handling improvements

Add helptext to option definitions - so we can unify the option
handling with the format command

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 26609b61 01-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Make bkey types globally unique

This lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.

Also improve the on disk format versioning stuff.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 319f9ac3 08-Nov-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: revamp to_text methods

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1742237b 27-Sep-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: extent_for_each_ptr_decode()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 7b3f84ea 05-Oct-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Split out alloc_background.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# cf0517af 06-Sep-2018 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: fix a divide

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


# 1c6fdbd8 17-Mar-2017 Kent Overstreet <kent.overstreet@gmail.com>

bcachefs: Initial commit

Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>