Cross Reference: /linux-master/fs/btrfs/delayed-ref.c

History log of /linux-master/fs/btrfs/delayed-ref.c
Revision	Date	Author	Comments
# ef5a05c5	24-Feb-2024	Chengming Zhou <zhouchengming@bytedance.com>	btrfs: remove SLAB_MEM_SPREAD flag use The SLAB_MEM_SPREAD flag used to be implemented in SLAB, which was removed as of v6.8-rc1, so it became a dead flag since the commit 16a1d968358a ("mm/slab: remove mm/slab.c and slab_def.h"). And the series[1] went on to mark it obsolete to avoid confusion for users. Here we can just remove all its users, which has no functional change. [1] https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# b2c7d55e	20-Feb-2024	Kunwu Chan <chentao@kylinos.cn>	btrfs: use KMEM_CACHE() to create delayed ref caches Use the KMEM_CACHE() macro instead of kmem_cache_create() to simplify the creation of SLAB caches related to delayed refs when the default values are used. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# d57dd52a	16-Feb-2024	David Sterba <dsterba@suse.com>	btrfs: uninline some static inline helpers from delayed-ref.h The helpers are doing an initialization or release work, none of which is performance critical that it would require a static inline, so move them to the .c file. Signed-off-by: David Sterba <dsterba@suse.com>
# 609d9937	06-Nov-2023	Filipe Manana <fdmanana@suse.com>	btrfs: fix qgroup record leaks when using simple quotas When using simple quotas we are not supposed to allocate qgroup records when adding delayed references. However we allocate them if either mode of quotas is enabled (the new simple one or the old one), but then we never free them because running the accounting, which frees the records, is only run when using the old quotas (at btrfs_qgroup_account_extents()), resulting in a memory leak of the records allocated when adding delayed references. Fix this by allocating the records only if the old quotas mode is enabled. Also fix btrfs_qgroup_trace_extent_nolock() to return 1 if the old quotas mode is not enabled - meaning the caller has to free the record. Fixes: 182940f4f4db ("btrfs: qgroup: add new quota mode for simple quotas") Reported-by: syzbot+d3ddc6dcc6386dea398b@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/00000000000004769106097f9a34@google.com/ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 9ef17228	28-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: stop reserving excessive space for block group item insertions Space for block group item insertions, necessary after allocating a new block group, is reserved in the delayed refs block reserve. Currently we do this by incrementing the transaction handle's delayed_ref_updates counter and then calling btrfs_update_delayed_refs_rsv(), which will increase the size of the delayed refs block reserve by an amount that corresponds to the same amount we use for delayed refs, given by btrfs_calc_delayed_ref_bytes(). That is an excessive amount because it corresponds to the amount of space needed to insert one item in a btree (btrfs_calc_insert_metadata_size()) times 2 when the free space tree feature is enabled. All we need is an amount as given by btrfs_calc_insert_metadata_size(), since we only need to insert a block group item in the extent tree (or block group tree if this feature is enabled). By using btrfs_calc_insert_metadata_size() we will need to reserve 2 times less space when using the free space tree, putting less pressure on space reservation. So use helpers to reserve and release space for block group item insertions that use btrfs_calc_insert_metadata_size() for calculation of the space. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# f66e0209	28-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: stop reserving excessive space for block group item updates Space for block group item updates, necessary after allocating or deallocating an extent from a block group, is reserved in the delayed refs block reserve. Currently we do this by incrementing the transaction handle's delayed_ref_updates counter and then calling btrfs_update_delayed_refs_rsv(), which will increase the size of the delayed refs block reserve by an amount that corresponds to the same amount we use for delayed refs, given by btrfs_calc_delayed_ref_bytes(). That is an excessive amount because it corresponds to the amount of space needed to insert one item in a btree (btrfs_calc_insert_metadata_size()) times 2 when the free space tree feature is enabled. All we need is an amount as given by btrfs_calc_metadata_size(), since we only need to update an existing block group item in the extent tree (or block group tree if this feature is enabled). By using btrfs_calc_metadata_size() we will need to reserve 4 times less space when using the free space tree and 2 times less space when not using it, putting less pressure on space reservation. So use helpers to reserve and release space for block group item updates that use btrfs_calc_metadata_size() for calculation of the space. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# cecbb533	28-Jun-2023	Boris Burkov <boris@bur.io>	btrfs: record simple quota deltas in delayed refs At the moment that we run delayed refs, we make the final ref-count based decision on creating/removing extent (and metadata) items. Therefore, it is exactly the spot to hook up simple quotas. There are a few important subtleties to the fields we must collect to accurately track simple quotas, particularly when removing an extent. When removing a data extent, the ref could be in any tree (due to reflink, for example) and so we need to recover the owning root id from the owner ref item. When removing a metadata extent, we know the owning root from the owner field in the header when we create the delayed ref, so we can recover it from there. We must also be careful to handle reservations properly to not leaked reserved space. The happy path is freeing the reservation when the simple quota delta runs on a data extent. If that doesn't happen, due to refs canceling out or some error, the ref head already has the must_insert_reserved machinery to handle this, so we piggy back on that and use it to clean up the reserved data. Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
# cf79ac47	28-Jun-2023	Boris Burkov <boris@bur.io>	btrfs: track original extent owner in head_ref Simple quotas requires tracking the original creating root of any given extent. This gets complicated when multiple subvolumes create overlapping/contradictory refs in the same transaction. For example, due to modifying or deleting an extent while also snapshotting it. To resolve this in a general way, take advantage of the fact that we are essentially already tracking this for handling releasing reservations. The head ref coalesces the various refs and uses must_insert_reserved to check if it needs to create an extent/free reservation. Store the ref that set must_insert_reserved as the owning ref on the head ref. Note that this can result in writing an extent for the very first time with an owner different from its only ref, but it will look the same as if you first created it with the original owning ref, then added the other ref, then removed the owning ref. Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
# 610647d7	28-Jun-2023	Boris Burkov <boris@bur.io>	btrfs: rename tree_ref and data_ref owning_root commit 113479d5b8eb ("btrfs: rename root fields in delayed refs structs") changed these from ref_root to owning_root. However, there are many circumstances where that name is not really accurate and the root on the ref struct _is_ the referring root. In general, these are not the owning root, though it does happen in some ref merging cases involving overwrites during snapshots and similar. Simple quotas cares quite a bit about tracking the original owner of an extent through delayed refs, so rename these back to free up the name for the real owning root (which will live on the generic btrfs_ref and the head ref) Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
# 182940f4	16-May-2023	Boris Burkov <boris@bur.io>	btrfs: qgroup: add new quota mode for simple quotas Add a new quota mode called "simple quotas". It can be enabled by the existing quota enable ioctl via a new command, and sets an incompat bit, as the implementation of simple quotas will make backwards incompatible changes to the disk format of the extent tree. Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
# 28270e25	08-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: always reserve space for delayed refs when starting transaction When starting a transaction (or joining an existing one with btrfs_start_transaction()), we reserve space for the number of items we want to insert in a btree, but we don't do it for the delayed refs we will generate while using the transaction to modify (COW) extent buffers in a btree or allocate new extent buffers. Basically how it works: 1) When we start a transaction we reserve space for the number of items the caller wants to be inserted/modified/deleted in a btree. This space goes to the transaction block reserve; 2) If the delayed refs block reserve is not full, its size is greater than the amount of its reserved space, and the flush method is BTRFS_RESERVE_FLUSH_ALL, then we attempt to reserve more space for it corresponding to the number of items the caller wants to insert/modify/delete in a btree; 3) The size of the delayed refs block reserve is increased when a task creates delayed refs after COWing an extent buffer, allocating a new one or deleting (freeing) an extent buffer. This happens after the the task started or joined a transaction, whenever it calls btrfs_update_delayed_refs_rsv(); 4) The delayed refs block reserve is then refilled by anyone calling btrfs_delayed_refs_rsv_refill(), either during unlink/truncate operations or when someone else calls btrfs_start_transaction() with a 0 number of items and flush method BTRFS_RESERVE_FLUSH_ALL; 5) As a task COWs or allocates extent buffers, it consumes space from the transaction block reserve. When the task releases its transaction handle (btrfs_end_transaction()) or it attempts to commit the transaction, it releases any remaining space in the transaction block reserve that it did not use, as not all space may have been used (due to pessimistic space calculation) by calling btrfs_block_rsv_release() which will try to add that unused space to the delayed refs block reserve (if its current size is greater than its reserved space). That transferred space may not be enough to completely fulfill the delayed refs block reserve. Plus we have some tasks that will attempt do modify as many leaves as they can before getting -ENOSPC (and then reserving more space and retrying), such as hole punching and extent cloning which call btrfs_replace_file_extents(). Such tasks can generate therefore a high number of delayed refs, for both metadata and data (we can't know in advance how many file extent items we will find in a range and therefore how many delayed refs for dropping references on data extents we will generate); 6) If a transaction starts its commit before the delayed refs block reserve is refilled, for example by the transaction kthread or by someone who called btrfs_join_transaction() before starting the commit, then when running delayed references if we don't have enough reserved space in the delayed refs block reserve, we will consume space from the global block reserve. Now this doesn't make a lot of sense because: 1) We should reserve space for delayed references when starting the transaction, since we have no guarantees the delayed refs block reserve will be refilled; 2) If no refill happens then we will consume from the global block reserve when running delayed refs during the transaction commit; 3) If we have a bunch of tasks calling btrfs_start_transaction() with a number of items greater than zero and at the time the delayed refs reserve is full, then we don't reserve any space at btrfs_start_transaction() for the delayed refs that will be generated by a task, and we can therefore end up using a lot of space from the global reserve when running the delayed refs during a transaction commit; 4) There are also other operations that result in bumping the size of the delayed refs reserve, such as creating and deleting block groups, as well as the need to update a block group item because we allocated or freed an extent from the respective block group; 5) If we have a significant gap between the delayed refs reserve's size and its reserved space, two very bad things may happen: 1) The reserved space of the global reserve may not be enough and we fail the transaction commit with -ENOSPC when running delayed refs; 2) If the available space in the global reserve is enough it may result in nearly exhausting it. If the fs has no more unallocated device space for allocating a new block group and all the available space in existing metadata block groups is not far from the global reserve's size before we started the transaction commit, we may end up in a situation where after the transaction commit we have too little available metadata space, and any future transaction commit will fail with -ENOSPC, because although we were able to reserve space to start the transaction, we were not able to commit it, as running delayed refs generates some more delayed refs (to update the extent tree for example) - this includes not even being able to commit a transaction that was started with the goal of unlinking a file, removing an empty data block group or doing reclaim/balance, so there's no way to release metadata space. In the worst case the next time we mount the filesystem we may also fail with -ENOSPC due to failure to commit a transaction to cleanup orphan inodes. This later case was reported and hit by someone running a SLE (SUSE Linux Enterprise) distribution for example - where the fs had no more unallocated space that could be used to allocate a new metadata block group, and the available metadata space was about 1.5M, not enough to commit a transaction to cleanup an orphan inode (or do relocation of data block groups that were far from being full). So improve on this situation by always reserving space for delayed refs when calling start_transaction(), and if the flush method is BTRFS_RESERVE_FLUSH_ALL, also try to refill the delayed refs block reserve if it's not full. The space reserved for the delayed refs is added to a local block reserve that is part of the transaction handle, and when a task updates the delayed refs block reserve size, after creating a delayed ref, the space is transferred from that local reserve to the global delayed refs reserve (fs_info->delayed_refs_rsv). In case the local reserve does not have enough space, which may happen for tasks that generate a variable and potentially large number of delayed refs (such as the hole punching and extent cloning cases mentioned before), we transfer any available space and then rely on the current behaviour of hoping some other task refills the delayed refs reserve or fallback to the global block reserve. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# adb86dbe	08-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: stop doing excessive space reservation for csum deletion Currently when reserving space for deleting the csum items for a data extent, when adding or updating a delayed ref head, we determine how many leaves of csum items we can have and then pass that number to the helper btrfs_calc_delayed_ref_bytes(). This helper is used for calculating space for all tree modifications we need when running delayed references, however the amount of space it computes is excessive for deleting csum items because: 1) It uses btrfs_calc_insert_metadata_size() which is excessive because we only need to delete csum items from the csum tree, we don't need to insert any items, so btrfs_calc_metadata_size() is all we need (as it computes space needed to delete an item); 2) If the free space tree is enabled, it doubles the amount of space, which is pointless for csum deletion since we don't need to touch the free space tree or any other tree other than the csum tree. So improve on this by tracking how many csum deletions we have and using a new helper to calculate space for csum deletions (just a wrapper around btrfs_calc_metadata_size() with a comment). This reduces the amount of space we need to reserve for csum deletions by a factor of 4, and it helps reduce the number of times we have to block space reservations and have the reclaim task enter the space flushing algorithm (flush delayed items, flush delayed refs, etc) in order to satisfy tickets. For example this results in a total time decrease when unlinking (or truncating) files with many extents, as we end up having to block on space metadata reservations less often. Example test: $ cat test.sh #!/bin/bash DEV=/dev/nullb0 MNT=/mnt/test umount $DEV &> /dev/null mkfs.btrfs -f $DEV # Use compression to quickly create files with a lot of extents # (each with a size of 128K). mount -o compress=lzo $DEV $MNT # 100G gives at least 983040 extents with a size of 128K. xfs_io -f -c "pwrite -S 0xab -b 1M 0 120G" $MNT/foobar # Flush all delalloc and clear all metadata from memory. umount $MNT mount -o compress=lzo $DEV $MNT start=$(date +%s%N) rm -f $MNT/foobar end=$(date +%s%N) dur=$(( (end - start) / 1000000 )) echo "rm took $dur milliseconds" umount $MNT Before this change rm took: 7504 milliseconds After this change rm took: 6574 milliseconds (-12.4%) Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# b6ea3e6a	08-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: remove pointless initialization at btrfs_delayed_refs_rsv_release() There's no point in initializing to 0 the local variable 'released' as we don't use it before the next assignment to it. So remove the initialization. This may help avoid some warnings with clang tools such as the one reported/fixed by commit 966de47ff0c9 ("btrfs: remove redundant initialization of variables in log_new_ancestors"). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 3ee56a58	08-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: reserve space for delayed refs on a per ref basis Currently when reserving space for delayed refs we do it on a per ref head basis. This is generally enough because most back refs for an extent end up being inlined in the extent item - with the default leaf size of 16K we can have at most 33 inline back refs (this is calculated by the macro BTRFS_MAX_EXTENT_ITEM_SIZE()). The amount of bytes reserved for each ref head is given by btrfs_calc_delayed_ref_bytes(), which basically corresponds to a single path for insertion into the extent tree plus another path for insertion into the free space tree if it's enabled. However if we have reached the limit of inline refs or we have a mix of inline and non-inline refs, then we will need to insert a non-inline ref and update the existing extent item to update the total number of references for the extent. This implies we need reserved space for two insertion paths in the extent tree, but we only reserved for one path. The extent item and the non-inline ref item may be located in different leaves, or even if they are located in the same leaf, after updating the extent item and before inserting the non-inline ref item, the extent buffers in the btree path may have been written (due to memory pressure for e.g.), in which case we need to COW the entire path again. In this case since we have not reserved enough space for the delayed refs block reserve, we will use the global block reserve. If we are in a situation where the fs has no more unallocated space enough to allocate a new metadata block group and available space in the existing metadata block groups is close to the maximum size of the global block reserve (512M), we may end up consuming too much of the free metadata space to the point where we can't commit any future transaction because it will fail, with -ENOSPC, during its commit when trying to allocate an extent for some COW operation (running delayed refs generated by running delayed refs or COWing the root tree's root node at commit_cowonly_roots() for example). Such dramatic scenario can happen if we have many delayed refs that require the insertion of non-inline ref items, due to too many reflinks or snapshots. We also have situations where we use the global block reserve because we could not in advance know that we will need space to update some trees (block group creation for example), so this all adds up to increase the chances of exhausting the global block reserve and making any future transaction commit to fail with -ENOSPC and turn the fs into RO mode, or fail the mount operation in case the mount needs to start and commit a transaction, such as when we have orphans to cleanup for example - such case was reported and hit by someone running a SLE (SUSE Linux Enterprise) distribution for example - where the fs had no more unallocated space that could be used to allocate a new metadata block group, and the available metadata space was about 1.5M, not enough to commit a transaction to cleanup an orphan inode (or do relocation of data block groups that were far from being full). So reserve space for delayed refs by individual refs and not by ref heads, as we may need to COW multiple extent tree paths due to non-inline ref items. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 03551d65	08-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: pass a space_info argument to btrfs_reserve_metadata_bytes() We are passing a block reserve argument to btrfs_reserve_metadata_bytes() which is not really used, all we need is to pass the space_info associated to the block reserve, we don't change the block reserve at all. Not only it's pointless to pass the block reserve, it's also confusing as one might think that the reserved bytes will end up being added to the passed block reserve, when that's not the case. The pattern for reserving space and adding it to a block reserve is to first reserve space with btrfs_reserve_metadata_bytes() and if that succeeds, then add the space to a block reserve by calling btrfs_block_rsv_add_bytes(). Also the reverse of btrfs_reserve_metadata_bytes(), which is btrfs_space_info_free_bytes_may_use(), takes a space_info argument and not a block reserve, so one more reason to pass a space_info and not a block reserve to btrfs_reserve_metadata_bytes(). So change btrfs_reserve_metadata_bytes() and its callers to pass a space_info argument instead of a block reserve argument. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 9580503b	07-Sep-2023	David Sterba <dsterba@suse.com>	btrfs: reformat remaining kdoc style comments Function name in the comment does not bring much value to code not exposed as API and we don't stick to the kdoc format anymore. Update formatting of parameter descriptions. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# a7ddeeb0	08-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: prevent transaction block reserve underflow when starting transaction When starting a transaction, with a non-zero number of items, we reserve metadata space for that number of items and for delayed refs by doing a call to btrfs_block_rsv_add(), with the transaction block reserve passed as the block reserve argument. This reserves metadata space and adds it to the transaction block reserve. Later we migrate the space we reserved for delayed references from the transaction block reserve into the delayed refs block reserve, by calling btrfs_migrate_to_delayed_refs_rsv(). btrfs_migrate_to_delayed_refs_rsv() decrements the number of bytes to migrate from the source block reserve, and this however may result in an underflow in case the space added to the transaction block reserve ended up being used by another task that has not reserved enough space for its own use - examples are tasks doing reflinks or hole punching because they end up calling btrfs_replace_file_extents() -> btrfs_drop_extents() and may need to modify/COW a variable number of leaves/paths, so they keep trying to use space from the transaction block reserve when they need to COW an extent buffer, and may end up trying to use more space then they have reserved (1 unit/path only for removing file extent items). This can be avoided by simply reserving space first without adding it to the transaction block reserve, then add the space for delayed refs to the delayed refs block reserve and finally add the remaining reserved space to the transaction block reserve. This also makes the code a bit shorter and simpler. So just do that. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 2ed45c0f	08-Sep-2023	Filipe Manana <fdmanana@suse.com>	btrfs: fix race when refilling delayed refs block reserve If we have two (or more) tasks attempting to refill the delayed refs block reserve we can end up with the delayed block reserve being over reserved, that is, with a reserved space greater than its size. If this happens, we are holding to more reserved space than necessary for a while. The race happens like this: 1) The delayed refs block reserve has a size of 8M and a reserved space of 6M for example; 2) Task A calls btrfs_delayed_refs_rsv_refill(); 3) Task B also calls btrfs_delayed_refs_rsv_refill(); 4) Task A sees there's a 2M difference between the size and the reserved space of the delayed refs rsv, so it will reserve 2M of space by calling btrfs_reserve_metadata_bytes(); 5) Task B also sees that 2M difference, and like task A, it reserves another 2M of metadata space; 6) Both task A and task B increase the reserved space of block reserve by 2M, by calling btrfs_block_rsv_add_bytes(), so the block reserve ends up with a size of 8M and a reserved space of 10M; 7) The extra, over reserved space will eventually be freed by some task calling btrfs_delayed_refs_rsv_release() -> btrfs_block_rsv_release() -> block_rsv_release_bytes(), as there we will detect the over reserve and release that space. So fix this by checking if we still need to add space to the delayed refs block reserve after reserving the metadata space, and if we don't, just release that space immediately. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# f1ed785a	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: use a single switch statement when initializing delayed ref head At init_delayed_ref_head(), we are using two separate if statements to check the delayed ref head action, and initializing 'must_insert_reserved' to false twice, once when the variable is declared and once again in an else branch. Make this simpler and more straightforward by having a single switch statement, also moving the comment about a drop action to the corresponding switch case to make it more clear and eliminating the duplicated initialization of 'must_insert_reserved' to false. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 61c681fe	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: use bool type for delayed ref head fields that are used as booleans There's no point in have several fields defined as 1 bit unsigned int in struct btrfs_delayed_ref_head, we can instead use a bool type, it makes the code a bit more readable and it doesn't change the structure size. So switch them to proper booleans. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 1e6b71c3	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: assert correct lock is held at btrfs_select_ref_head() The function btrfs_select_ref_head() iterates over the red black tree of delayed reference heads, which is protected by the spinlock in the delayed refs root. The function doesn't take the lock, it's taken by its single caller, btrfs_obtain_ref_head(), because it needs to call that function and btrfs_delayed_ref_lock() in the same critical section (delimited by that spinlock). So assert at btrfs_select_ref_head() that we are holding the expected lock. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 798f4d95	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: get rid of label and goto at insert_delayed_ref() At insert_delayed_ref() there's no point of having a label and goto in the case we were able to insert the delayed ref head. We can just add the code under label to the if statement's body and return immediately, and also there is no need to track the return value in a variable, we can just return a literal true or false value directly. So do those changes. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# f38462c4	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: make insert_delayed_ref() return a bool instead of an int insert_delayed_ref() can only return 0 or 1, to indicate if the given delayed reference was added to the head reference or if it was merged into an existing delayed ref, respectively. So just make it return a boolean instead. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 293f8197	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: use a bool to track qgroup record insertion when adding ref head We are using an integer as a boolean to track the qgroup record insertion status when adding a delayed reference head. Since all we need is a boolean, switch the type from int to bool to make it more obvious. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 4d34ad34	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: remove pointless in_tree field from struct btrfs_delayed_ref_node The 'in_tree' field is really not needed in struct btrfs_delayed_ref_node, as we can check whether a reference is in the tree or not simply by checking its red black tree node member with RB_EMPTY_NODE(), as when we remove it from the tree we always call RB_CLEAR_NODE(). So remove that field and use RB_EMPTY_NODE(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 53499d5f	29-May-2023	Filipe Manana <fdmanana@suse.com>	btrfs: remove unused is_head field from struct btrfs_delayed_ref_node The 'is_head' field of struct btrfs_delayed_ref_node is no longer after commit d278850eff30 ("btrfs: remove delayed_ref_node from ref_head"), so remove it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 0e55a545	21-Mar-2023	Filipe Manana <fdmanana@suse.com>	btrfs: add helper to calculate space for delayed references Instead of duplicating the logic for calculating how much space is required for a given number of delayed references, add an inline helper to encapsulate that logic and use it everywhere we are calculating the space required. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 1d0df22a	21-Mar-2023	Filipe Manana <fdmanana@suse.com>	btrfs: calculate the right space for a single delayed ref when refilling When refilling the delayed block reserve we are incorrectly computing the amount of bytes for a single delayed reference if the free space tree is being used. In that case we should double the calculated amount. Everywhere else we compute the correct amount, like when updating the delayed block reserve, at btrfs_update_delayed_refs_rsv(), or when releasing space from the delayed block reserve, at btrfs_delayed_refs_rsv_release(). So fix btrfs_delayed_refs_rsv_refill() to multiply the amount of bytes for a single delayed reference by two in case the free space tree is used. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# a8fdc051	21-Mar-2023	Filipe Manana <fdmanana@suse.com>	btrfs: remove obsolete delayed ref throttling logic when truncating items We have this logic encapsulated in btrfs_should_throttle_delayed_refs() where we try to estimate if running the current amount of delayed references we have will take more than half a second, and if so, the caller btrfs_should_throttle_delayed_refs() should do something to prevent more and more delayed refs from being accumulated. This logic was added in commit 0a2b2a844af6 ("Btrfs: throttle delayed refs better") and then further refined in commit a79b7d4b3e81 ("Btrfs: async delayed refs"). The idea back then was that the caller of btrfs_should_throttle_delayed_refs() would release its transaction handle (by calling btrfs_end_transaction()) when that function returned true, then btrfs_end_transaction() would trigger an async job to run delayed references in a workqueue, and later start/join a transaction again and do more work. However we don't run delayed references asynchronously anymore, that was removed in commit db2462a6ad3d ("btrfs: don't run delayed refs in the end transaction logic"). That makes the logic that tries to estimate how long we will take to run our current delayed references, at btrfs_should_throttle_delayed_refs(), pointless as we don't take any action to run delayed references anymore. We do have other type of throttling, which consists of checking the size and reserved space of the delayed and global block reserves, as well as if fluhsing delayed references for the current transaction was already started, etc - this is all done by btrfs_should_end_transaction(), and the only user of btrfs_should_throttle_delayed_refs() does periodically call btrfs_should_end_transaction(). So remove btrfs_should_throttle_delayed_refs() and the infrastructure that keeps track of the average time used for running delayed references, as well as adapting btrfs_truncate_inode_items() to call btrfs_check_space_for_delayed_refs() instead. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# cf5fa929	21-Mar-2023	Filipe Manana <fdmanana@suse.com>	btrfs: simplify btrfs_should_throttle_delayed_refs() Currently btrfs_should_throttle_delayed_refs() returns 1 or 2 in case the delayed refs should be throttled, however the only caller (inode eviction and truncation path) does not care about those two different conditions, it treats the return value as a boolean. This allows us to remove one of the conditions in btrfs_should_throttle_delayed_refs() and change its return value from 'int' to 'bool'. So just do that. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 5c1f2c6b	21-Mar-2023	Filipe Manana <fdmanana@suse.com>	btrfs: pass a bool size update argument to btrfs_block_rsv_add_bytes() At btrfs_delayed_refs_rsv_refill(), we are passing a value of 0 to the 'update_size' argument of btrfs_block_rsv_add_bytes(), which is defined as a boolean. Functionally this is fine because a 0 is, implicitly, converted to a boolean false value. However it's easier to read an explicit 'false' value, so just pass 'false' instead of 0. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 0c555c97	12-Dec-2022	Johannes Thumshirn <johannes.thumshirn@wdc.com>	btrfs: directly pass in fs_info to btrfs_merge_delayed_refs Now that none of the functions called by btrfs_merge_delayed_refs() needs a btrfs_trans_handle, directly pass in a btrfs_fs_info to btrfs_merge_delayed_refs(). Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# afe2d748	12-Dec-2022	Johannes Thumshirn <johannes.thumshirn@wdc.com>	btrfs: drop trans parameter of insert_delayed_ref Now that drop_delayed_ref() doesn't need a btrfs_trans_handle, drop it from insert_delayed_ref() as well. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# f09f7851	12-Dec-2022	Johannes Thumshirn <johannes.thumshirn@wdc.com>	btrfs: remove trans parameter of merge_ref Now that drop_delayed_ref() doesn't get the btrfs_trans_handle passed in anymore, we can get rid of it in merge_ref() as well. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 4c89493f	12-Dec-2022	Johannes Thumshirn <johannes.thumshirn@wdc.com>	btrfs: drop unused trans parameter of drop_delayed_ref drop_delayed_ref() doesn't use the btrfs_trans_handle it gets passed in, so remove it. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 43dd529a	27-Oct-2022	David Sterba <dsterba@suse.com>	btrfs: update function comments Update, reformat or reword function comments. This also removes the kdoc marker so we don't get reports when the function name is missing. Changes made: - remove kdoc markers - reformat the brief description to be a proper sentence - reword to imperative voice - align parameter list - fix typos Signed-off-by: David Sterba <dsterba@suse.com>
# fc97a410	19-Oct-2022	Josef Bacik <josef@toxicpanda.com>	btrfs: move mount option definitions to fs.h These are fs wide definitions and helpers, move them out of ctree.h and into fs.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 9b569ea0	19-Oct-2022	Josef Bacik <josef@toxicpanda.com>	btrfs: move the printk helpers out of ctree.h We have a bunch of printk helpers that are in ctree.h. These have nothing to do with ctree.c, so move them into their own header. Subsequent patches will cleanup the printk helpers. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
# c70c2c5b	23-Jun-2022	David Sterba <dsterba@suse.com>	btrfs: switch btrfs_block_rsv::full to bool Use simple bool type for the block reserve full status, there's short to save space as there used to be int but there's no reason for that. Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
# 0e3696f8	20-Mar-2019	David Sterba <dsterba@suse.com>	btrfs: remove btrfs_delayed_extent_op::is_data The value of btrfs_delayed_extent_op::is_data is always false, we can cascade the change and simplify code that depends on it, removing the structure member eventually. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>