#
74e97958 |
|
21-Mar-2024 |
Boris Burkov <boris@bur.io> |
btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations Create subvolume, create snapshot and delete subvolume all use btrfs_subvolume_reserve_metadata() to reserve metadata for the changes done to the parent subvolume's fs tree, which cannot be mediated in the normal way via start_transaction. When quota groups (squota or qgroups) are enabled, this reserves qgroup metadata of type PREALLOC. Once the operation is associated to a transaction, we convert PREALLOC to PERTRANS, which gets cleared in bulk at the end of the transaction. However, the error paths of these three operations were not implementing this lifecycle correctly. They unconditionally converted the PREALLOC to PERTRANS in a generic cleanup step regardless of errors or whether the operation was fully associated to a transaction or not. This resulted in error paths occasionally converting this rsv to PERTRANS without calling record_root_in_trans successfully, which meant that unless that root got recorded in the transaction by some other thread, the end of the transaction would not free that root's PERTRANS, leaking it. Ultimately, this resulted in hitting a WARN in CONFIG_BTRFS_DEBUG builds at unmount for the leaked reservation. The fix is to ensure that every qgroup PREALLOC reservation observes the following properties: 1. any failure before record_root_in_trans is called successfully results in freeing the PREALLOC reservation. 2. after record_root_in_trans, we convert to PERTRANS, and now the transaction owns freeing the reservation. This patch enforces those properties on the three operations. Without it, generic/269 with squotas enabled at mkfs time would fail in ~5-10 runs on my system. With this patch, it ran successfully 1000 times in a row. Fixes: e85fde5162bf ("btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
9dcb6ed9 |
|
23-Jan-2024 |
David Sterba <dsterba@suse.com> |
btrfs: handle invalid root reference found in btrfs_find_root() The btrfs_find_root() looks up a root by a key, allowing to do an inexact search when key->offset is -1. It's never expected to find such item, as it would break allowed the range of a root id. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
0fe29838 |
|
23-Jan-2024 |
David Sterba <dsterba@suse.com> |
btrfs: handle root deletion lookup error in btrfs_del_root() We're deleting a root and looking it up by key does not succeed, this is an inconsistent state and we can't do anything. All callers handle errors and abort a transaction. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
2b712e3b |
|
25-Jan-2024 |
David Sterba <dsterba@suse.com> |
btrfs: remove unused included headers With help of neovim, LSP and clangd we can identify header files that are not actually needed to be included in the .c files. This is focused only on removal (with minor fixups), further cleanups are possible but will require doing the header files properly with forward declarations, minimized includes and include-what-you-use care. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
182940f4 |
|
16-May-2023 |
Boris Burkov <boris@bur.io> |
btrfs: qgroup: add new quota mode for simple quotas Add a new quota mode called "simple quotas". It can be enabled by the existing quota enable ioctl via a new command, and sets an incompat bit, as the implementation of simple quotas will make backwards incompatible changes to the disk format of the extent tree. Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
50564b65 |
|
12-Sep-2023 |
Filipe Manana <fdmanana@suse.com> |
btrfs: abort transaction on generation mismatch when marking eb as dirty When marking an extent buffer as dirty, at btrfs_mark_buffer_dirty(), we check if its generation matches the running transaction and if not we just print a warning. Such mismatch is an indicator that something really went wrong and only printing a warning message (and stack trace) is not enough to prevent a corruption. Allowing a transaction to commit with such an extent buffer will trigger an error if we ever try to read it from disk due to a generation mismatch with its parent generation. So abort the current transaction with -EUCLEAN if we notice a generation mismatch. For this we need to pass a transaction handle to btrfs_mark_buffer_dirty() which is always available except in test code, in which case we can pass NULL since it operates on dummy extent buffers and all test roots have a single node/leaf (root node at level 0). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
9580503b |
|
07-Sep-2023 |
David Sterba <dsterba@suse.com> |
btrfs: reformat remaining kdoc style comments Function name in the comment does not bring much value to code not exposed as API and we don't stick to the kdoc format anymore. Update formatting of parameter descriptions. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
aa5d3003 |
|
26-Oct-2022 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move orphan prototypes into orphan.h Move these out of ctree.h into orphan.h to cut down on code in ctree.h. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
45c40c8f |
|
24-Oct-2022 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move root tree prototypes to their own header Move all the root-tree.c prototypes to root-tree.h, and then update all the necessary files to include the new header. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
6db75318 |
|
19-Oct-2022 |
Sweet Tea Dorminy <sweettea-kernel@dorminy.me> |
btrfs: use struct fscrypt_str instead of struct qstr While struct qstr is more natural without fscrypt, since it's provided by dentries, struct fscrypt_str is provided by the fscrypt handlers processing dentries, and is thus more natural in the fscrypt world. Replace all of the struct qstr uses with struct fscrypt_str. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
e43eec81 |
|
19-Oct-2022 |
Sweet Tea Dorminy <sweettea-kernel@dorminy.me> |
btrfs: use struct qstr instead of name and namelen pairs Many functions throughout btrfs take name buffer and name length arguments. Most of these functions at the highest level are usually called with these arguments extracted from a supplied dentry's name. But the entire name can be passed instead, making each function a little more elegant. Each function whose arguments are currently the name and length extracted from a dentry is herein converted to instead take a pointer to the name in the dentry. The couple of calls to these calls without a struct dentry are converted to create an appropriate qstr to pass in. Additionally, every function which is only called with a name/len extracted directly from a qstr is also converted. This change has positive effect on stack consumption, frame of many functions is reduced but this will be used in the future for fscrypt related structures. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
07e81dc9 |
|
19-Oct-2022 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move accessor helpers into accessors.h This is a large patch, but because they're all macros it's impossible to split up. Simply copy all of the item accessors in ctree.h and paste them in accessors.h, and then update any files to include the header so everything compiles. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ reformat comments, style fixups ] Signed-off-by: David Sterba <dsterba@suse.com>
|
#
ec8eb376 |
|
19-Oct-2022 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move BTRFS_FS_STATE* definitions and helpers to fs.h We're going to use fs.h to hold fs wide related helpers and definitions, move the FS_STATE enum and related helpers to fs.h, and then update all files that need these definitions to include fs.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
9b569ea0 |
|
19-Oct-2022 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move the printk helpers out of ctree.h We have a bunch of printk helpers that are in ctree.h. These have nothing to do with ctree.c, so move them into their own header. Subsequent patches will cleanup the printk helpers. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
1fdbd03d |
|
22-Aug-2022 |
Filipe Manana <fdmanana@suse.com> |
btrfs: simplify error handling at btrfs_del_root_ref() At btrfs_del_root_ref() we are using two return variables, named 'ret' and 'err'. This makes it harder to follow and easier to return the wrong value in case an error happens - the previous patch in the series, which has the subject "btrfs: fix silent failure when deleting root reference", fixed a bug due to confusion created by these two variables. So change the function to use a single variable for tracking the return value of the function, using only 'ret', which is consistent with most of the codebase. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
47bf225a |
|
22-Aug-2022 |
Filipe Manana <fdmanana@suse.com> |
btrfs: fix silent failure when deleting root reference At btrfs_del_root_ref(), if btrfs_search_slot() returns an error, we end up returning from the function with a value of 0 (success). This happens because the function returns the value stored in the variable 'err', which is 0, while the error value we got from btrfs_search_slot() is stored in the 'ret' variable. So fix it by setting 'err' with the error value. Fixes: 8289ed9f93bef2 ("btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handling") CC: stable@vger.kernel.org # 5.16+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
d4135134 |
|
23-Mar-2022 |
Filipe Manana <fdmanana@suse.com> |
btrfs: avoid blocking on space revervation when doing nowait dio writes When doing a NOWAIT direct IO write, if we can NOCOW then it means we can proceed with the non-blocking, NOWAIT path. However reserving the metadata space and qgroup meta space can often result in blocking - flushing delalloc, wait for ordered extents to complete, trigger transaction commits, etc, going against the semantics of a NOWAIT write. So make the NOWAIT write path to try to reserve all the metadata it needs without resulting in a blocking behaviour - if we get -ENOSPC or -EDQUOT then return -EAGAIN to make the caller fallback to a blocking direct IO write. This is part of a patchset comprised of the following patches: btrfs: avoid blocking on page locks with nowait dio on compressed range btrfs: avoid blocking nowait dio when locking file range btrfs: avoid double nocow check when doing nowait dio writes btrfs: stop allocating a path when checking if cross reference exists btrfs: free path at can_nocow_extent() before checking for checksum items btrfs: release path earlier at can_nocow_extent() btrfs: avoid blocking when allocating context for nowait dio read/write btrfs: avoid blocking on space revervation when doing nowait dio writes The following test was run before and after applying this patchset: $ cat io-uring-nodatacow-test.sh #!/bin/bash DEV=/dev/sdc MNT=/mnt/sdc MOUNT_OPTIONS="-o ssd -o nodatacow" MKFS_OPTIONS="-R free-space-tree -O no-holes" NUM_JOBS=4 FILE_SIZE=8G RUN_TIME=300 cat <<EOF > /tmp/fio-job.ini [io_uring_rw] rw=randrw fsync=0 fallocate=posix group_reporting=1 direct=1 ioengine=io_uring iodepth=64 bssplit=4k/20:8k/20:16k/20:32k/10:64k/10:128k/5:256k/5:512k/5:1m/5 filesize=$FILE_SIZE runtime=$RUN_TIME time_based filename=foobar directory=$MNT numjobs=$NUM_JOBS thread EOF echo performance | \ tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor umount $MNT &> /dev/null mkfs.btrfs -f $MKFS_OPTIONS $DEV &> /dev/null mount $MOUNT_OPTIONS $DEV $MNT fio /tmp/fio-job.ini umount $MNT The test was run a 12 cores box with 64G of ram, using a non-debug kernel config (Debian's default config) and a spinning disk. Result before the patchset: READ: bw=407MiB/s (427MB/s), 407MiB/s-407MiB/s (427MB/s-427MB/s), io=119GiB (128GB), run=300175-300175msec WRITE: bw=407MiB/s (427MB/s), 407MiB/s-407MiB/s (427MB/s-427MB/s), io=119GiB (128GB), run=300175-300175msec Result after the patchset: READ: bw=436MiB/s (457MB/s), 436MiB/s-436MiB/s (457MB/s-457MB/s), io=128GiB (137GB), run=300044-300044msec WRITE: bw=435MiB/s (456MB/s), 435MiB/s-435MiB/s (456MB/s-456MB/s), io=128GiB (137GB), run=300044-300044msec That's about +7.2% throughput for reads and +6.9% for writes. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
b4be6aef |
|
18-Feb-2022 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: do not start relocation until in progress drops are done We hit a bug with a recovering relocation on mount for one of our file systems in production. I reproduced this locally by injecting errors into snapshot delete with balance running at the same time. This presented as an error while looking up an extent item WARNING: CPU: 5 PID: 1501 at fs/btrfs/extent-tree.c:866 lookup_inline_extent_backref+0x647/0x680 CPU: 5 PID: 1501 Comm: btrfs-balance Not tainted 5.16.0-rc8+ #8 RIP: 0010:lookup_inline_extent_backref+0x647/0x680 RSP: 0018:ffffae0a023ab960 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000000000 RBP: ffff943fd2a39b60 R08: 0000000000000000 R09: 0000000000000001 R10: 0001434088152de0 R11: 0000000000000000 R12: 0000000001d05000 R13: ffff943fd2a39b60 R14: ffff943fdb96f2a0 R15: ffff9442fc923000 FS: 0000000000000000(0000) GS:ffff944e9eb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1157b1fca8 CR3: 000000010f092000 CR4: 0000000000350ee0 Call Trace: <TASK> insert_inline_extent_backref+0x46/0xd0 __btrfs_inc_extent_ref.isra.0+0x5f/0x200 ? btrfs_merge_delayed_refs+0x164/0x190 __btrfs_run_delayed_refs+0x561/0xfa0 ? btrfs_search_slot+0x7b4/0xb30 ? btrfs_update_root+0x1a9/0x2c0 btrfs_run_delayed_refs+0x73/0x1f0 ? btrfs_update_root+0x1a9/0x2c0 btrfs_commit_transaction+0x50/0xa50 ? btrfs_update_reloc_root+0x122/0x220 prepare_to_merge+0x29f/0x320 relocate_block_group+0x2b8/0x550 btrfs_relocate_block_group+0x1a6/0x350 btrfs_relocate_chunk+0x27/0xe0 btrfs_balance+0x777/0xe60 balance_kthread+0x35/0x50 ? btrfs_balance+0xe60/0xe60 kthread+0x16b/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 </TASK> Normally snapshot deletion and relocation are excluded from running at the same time by the fs_info->cleaner_mutex. However if we had a pending balance waiting to get the ->cleaner_mutex, and a snapshot deletion was running, and then the box crashed, we would come up in a state where we have a half deleted snapshot. Again, in the normal case the snapshot deletion needs to complete before relocation can start, but in this case relocation could very well start before the snapshot deletion completes, as we simply add the root to the dead roots list and wait for the next time the cleaner runs to clean up the snapshot. Fix this by setting a bit on the fs_info if we have any DEAD_ROOT's that had a pending drop_progress key. If they do then we know we were in the middle of the drop operation and set a flag on the fs_info. Then balance can wait until this flag is cleared to start up again. If there are DEAD_ROOT's that don't have a drop_progress set then we're safe to start balance right away as we'll be properly protected by the cleaner_mutex. CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
9270501c |
|
09-Nov-2021 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: change root to fs_info for btrfs_reserve_metadata_bytes We used to need the root for btrfs_reserve_metadata_bytes to check the orphan cleanup state, but we no longer need that, we simply need the fs_info. Change btrfs_reserve_metadata_bytes() to use the fs_info, and change both btrfs_block_rsv_refill() and btrfs_block_rsv_add() to do the same as they simply call btrfs_reserve_metadata_bytes() and then manipulate the block_rsv that is being used. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
3212fa14 |
|
21-Oct-2021 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: drop the _nr from the item helpers Now that all call sites are using the slot number to modify item values, rename the SETGET helpers to raw_item_*(), and then rework the _nr() helpers to be the btrfs_item_*() btrfs_set_item_*() helpers, and then rename all of the callers to the new helpers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
8289ed9f |
|
01-Dec-2021 |
Qu Wenruo <wqu@suse.com> |
btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handling I hit the BUG_ON() with generic/475 test case, and to my surprise, all callers of btrfs_del_root_ref() are already aborting transaction, thus there is not need for such BUG_ON(), just go to @out label and caller will properly handle the error. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
a2c5062f |
|
20-May-2021 |
Kees Cook <keescook@chromium.org> |
btrfs: Use memset_startat() to clear end of struct In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Use memset_startat() so memset() doesn't get confused about writing beyond the destination member that is intended to be the starting point of zeroing through the end of the struct. Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: linux-btrfs@vger.kernel.org Reviewed-by: Nikolay Borisov <nborisov@suse.com> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Kees Cook <keescook@chromium.org>
|
#
e85fde51 |
|
24-Jul-2020 |
Qu Wenruo <wqu@suse.com> |
btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations [BUG] When quota is enabled for TEST_DEV, generic/013 sometimes fails like this: generic/013 14s ... _check_dmesg: something found in dmesg (see xfstests-dev/results//generic/013.dmesg) And with the following metadata leak: BTRFS warning (device dm-3): qgroup 0/1370 has unreleased space, type 2 rsv 49152 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 47912 at fs/btrfs/disk-io.c:4078 close_ctree+0x1dc/0x323 [btrfs] Call Trace: btrfs_put_super+0x15/0x17 [btrfs] generic_shutdown_super+0x72/0x110 kill_anon_super+0x18/0x30 btrfs_kill_super+0x17/0x30 [btrfs] deactivate_locked_super+0x3b/0xa0 deactivate_super+0x40/0x50 cleanup_mnt+0x135/0x190 __cleanup_mnt+0x12/0x20 task_work_run+0x64/0xb0 __prepare_exit_to_usermode+0x1bc/0x1c0 __syscall_return_slowpath+0x47/0x230 do_syscall_64+0x64/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ---[ end trace a6cfd45ba80e4e06 ]--- BTRFS error (device dm-3): qgroup reserved space leaked BTRFS info (device dm-3): disk space caching is enabled BTRFS info (device dm-3): has skinny extents [CAUSE] The qgroup preallocated meta rsv operations of that offending root are: btrfs_delayed_inode_reserve_metadata: rsv_meta_prealloc root=1370 num_bytes=131072 btrfs_delayed_inode_reserve_metadata: rsv_meta_prealloc root=1370 num_bytes=131072 btrfs_subvolume_reserve_metadata: rsv_meta_prealloc root=1370 num_bytes=49152 btrfs_delayed_inode_release_metadata: convert_meta_prealloc root=1370 num_bytes=-131072 btrfs_delayed_inode_release_metadata: convert_meta_prealloc root=1370 num_bytes=-131072 It's pretty obvious that, we reserve qgroup meta rsv in btrfs_subvolume_reserve_metadata(), but doesn't have corresponding release/convert calls in btrfs_subvolume_release_metadata(). This leads to the leakage. [FIX] To fix this bug, we should follow what we're doing in btrfs_delalloc_reserve_metadata(), where we reserve qgroup space, and add it to block_rsv->qgroup_rsv_reserved. And free the qgroup reserved metadata space when releasing the block_rsv. To do this, we need to change the btrfs_subvolume_release_metadata() to accept btrfs_root, and record the qgroup_to_release number, and call btrfs_qgroup_convert_reserved_meta() for it. Fixes: 733e03a0b26a ("btrfs: qgroup: Split meta rsv type into meta_prealloc and meta_pertrans") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
56e9357a |
|
15-May-2020 |
David Sterba <dsterba@suse.com> |
btrfs: simplify root lookup by id The main function to lookup a root by its id btrfs_get_fs_root takes the whole key, while only using the objectid. The value of offset is preset to (u64)-1 but not actually used until btrfs_find_root that does the actual search. Switch btrfs_get_fs_root to use only objectid and remove all local variables that existed just for the lookup. The actual key for search is set up in btrfs_get_fs_root, reusing another key variable. Signed-off-by: David Sterba <dsterba@suse.com>
|
#
63f018be |
|
10-Mar-2020 |
Nikolay Borisov <nborisov@suse.com> |
btrfs: Remove __ prefix from btrfs_block_rsv_release Currently the non-prefixed version is a simple wrapper used to hide the 4th argument of the prefixed version. This doesn't bring much value in practice and only makes the code harder to follow by adding another level of indirection. Rectify this by removing the __ prefix and have only one public function to release bytes from a block reservation. No semantic changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
807fc790 |
|
24-Feb-2020 |
Andy Shevchenko <andriy.shevchenko@linux.intel.com> |
btrfs: switch to use new generic UUID API There are new types and helpers that are supposed to be used in new code. As a preparation to get rid of legacy types and API functions do the conversion here. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
00246528 |
|
24-Jan-2020 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: rename btrfs_put_fs_root and btrfs_grab_fs_root We are now using these for all roots, rename them to btrfs_put_root() and btrfs_grab_root(); Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
bc44d7c4 |
|
24-Jan-2020 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: push btrfs_grab_fs_root into btrfs_get_fs_root Now that all callers of btrfs_get_fs_root are subsequently calling btrfs_grab_fs_root and handling dropping the ref when they are done appropriately, go ahead and push btrfs_grab_fs_root up into btrfs_get_fs_root. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
0b2dee5c |
|
24-Jan-2020 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: hold a ref for the root in btrfs_find_orphan_roots We lookup roots for every orphan item we have, we need to hold a ref on the root while we're doing this work. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
e59d18b4 |
|
24-Jan-2020 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: make btrfs_find_orphan_roots use btrfs_get_fs_root btrfs_find_orphan_roots has this weird thing where it looks up the root in cache to see if it is there before just reading the root. But the read it uses just reads the root, it doesn't do any of the init work, we do that by hand here. But this is unnecessary, all we really want is to see if the root still exists and add it to the dead roots list to be cleaned up, otherwise we delete the orphan item. Fix this by just using btrfs_get_fs_root directly with check_ref set to false so we get the orphan root items. Then we just handle in cache and out of cache roots the same, add them to the dead roots list and carry on. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
423a716c |
|
18-Dec-2019 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: do not delete mismatched root refs btrfs_del_root_ref() will simply WARN_ON() if the ref doesn't match in any way, and then continue to delete the reference. This shouldn't happen, we have these values because there's more to the reference than the original root and the sub root. If any of these checks fail, return -ENOENT. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
2bd36e7b |
|
22-Aug-2019 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: rename the btrfs_calc_*_metadata_size helpers btrfs_calc_trunc_metadata_size differs from trans_metadata_size in that it doesn't take into account any splitting at the levels, because truncate will never split nodes. However truncate _and_ changing will never split nodes, so rename btrfs_calc_trunc_metadata_size to btrfs_calc_metadata_size. Also btrfs_calc_trans_metadata_size is purely for inserting items, so rename this to btrfs_calc_insert_metadata_size. Making these clearer will help when I start using them differently in upcoming patches. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
28a32d2b |
|
19-Jun-2019 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: move the subvolume reservation stuff out of extent-tree.c This is just two functions, put it in root-tree.c since it involves root items. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
72bd2323 |
|
29-Apr-2019 |
Filipe Manana <fdmanana@suse.com> |
Btrfs: do not abort transaction at btrfs_update_root() after failure to COW path Currently when we fail to COW a path at btrfs_update_root() we end up always aborting the transaction. However all the current callers of btrfs_update_root() are able to deal with errors returned from it, many do end up aborting the transaction themselves (directly or not, such as the transaction commit path), other BUG_ON() or just gracefully cancel whatever they were doing. When syncing the fsync log, we call btrfs_update_root() through tree-log.c:update_log_root(), and if it returns an -ENOSPC error, the log sync code does not abort the transaction, instead it gracefully handles the error and returns -EAGAIN to the fsync handler, so that it falls back to a transaction commit. Any other error different from -ENOSPC, makes the log sync code abort the transaction. So remove the transaction abort from btrfs_update_log() when we fail to COW a path to update the root item, so that if an -ENOSPC failure happens we avoid aborting the current transaction and have a chance of the fsync succeeding after falling back to a transaction commit. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203413 Fixes: 79787eaab46121 ("btrfs: replace many BUG_ONs with proper error handling") Cc: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
7ac1e464 |
|
26-Feb-2019 |
Qu Wenruo <wqu@suse.com> |
btrfs: Don't panic when we can't find a root key When we failed to find a root key in btrfs_update_root(), we just panic. That's definitely not cool, fix it by outputting an unique error message, aborting current transaction and return -EUCLEAN. This should not normally happen as the root has been used by the callers in some way. Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
78c52d9e |
|
06-Feb-2019 |
Josef Bacik <josef@toxicpanda.com> |
btrfs: check for refs on snapshot delete resume There's a bug in snapshot deletion where we won't update the drop_progress key if we're in the UPDATE_BACKREF stage. This is a problem because we could drop refs for blocks we know don't belong to ours. If we crash or umount at the right time we could experience messages such as the following when snapshot deletion resumes BTRFS error (device dm-3): unable to find ref byte nr 66797568 parent 0 root 258 owner 1 offset 0 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 16052 at fs/btrfs/extent-tree.c:7108 __btrfs_free_extent.isra.78+0x62c/0xb30 [btrfs] CPU: 3 PID: 16052 Comm: umount Tainted: G W OE 5.0.0-rc4+ #147 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890FX Deluxe5, BIOS P1.40 05/03/2011 RIP: 0010:__btrfs_free_extent.isra.78+0x62c/0xb30 [btrfs] RSP: 0018:ffffc90005cd7b18 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff88842fade680 RSI: ffff88842fad6b18 RDI: ffff88842fad6b18 RBP: ffffc90005cd7bc8 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000001 R11: ffffffff822696b8 R12: 0000000003fb4000 R13: 0000000000000001 R14: 0000000000000102 R15: ffff88819c9d67e0 FS: 00007f08bb138fc0(0000) GS:ffff88842fac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f5d861ea0 CR3: 00000003e99fe000 CR4: 00000000000006e0 Call Trace: ? _raw_spin_unlock+0x27/0x40 ? btrfs_merge_delayed_refs+0x356/0x3e0 [btrfs] __btrfs_run_delayed_refs+0x75a/0x13c0 [btrfs] ? join_transaction+0x2b/0x460 [btrfs] btrfs_run_delayed_refs+0xf3/0x1c0 [btrfs] btrfs_commit_transaction+0x52/0xa50 [btrfs] ? start_transaction+0xa6/0x510 [btrfs] btrfs_sync_fs+0x79/0x1c0 [btrfs] sync_filesystem+0x70/0x90 generic_shutdown_super+0x27/0x120 kill_anon_super+0x12/0x30 btrfs_kill_super+0x16/0xa0 [btrfs] deactivate_locked_super+0x43/0x70 deactivate_super+0x40/0x60 cleanup_mnt+0x3f/0x80 __cleanup_mnt+0x12/0x20 task_work_run+0x8b/0xc0 exit_to_usermode_loop+0xce/0xd0 do_syscall_64+0x20b/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe To fix this simply mark dead roots we read from disk as DEAD and then set the walk_control->restarted flag so we know we have a restarted deletion. From here whenever we try to drop refs for blocks we check to verify our ref is set on them, and if it is not we skip it. Once we find a ref that is set we unset walk_control->restarted since the tree should be in a normal state from then on, and any problems we run into from there are different issues. I tested this with an existing broken fs and my reproducer that creates a broken fs and it fixed both file systems. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
f65e25e3 |
|
19-Feb-2019 |
YueHaibing <yuehaibing@huawei.com> |
btrfs: Remove unnecessary casts in btrfs_read_root_item There is a messy cast here: min_t(int, len, (int)sizeof(*item))); min_t() should normally cast to unsigned. It's not possible for "len" to be negative, but if it were then we definitely wouldn't want to pass negatives to read_extent_buffer(). Also there is an extra cast. This patch shouldn't affect runtime, it's just a clean up. Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
6025c19f |
|
31-Jul-2018 |
Lu Fengqi <lufq.fnst@cn.fujitsu.com> |
btrfs: Remove fs_info from btrfs_add_root_ref It can be referenced from the passed transaction handle. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
3ee1c553 |
|
31-Jul-2018 |
Lu Fengqi <lufq.fnst@cn.fujitsu.com> |
btrfs: Remove fs_info from btrfs_del_root_ref It can be referenced from the passed transaction handle. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
ab9ce7d4 |
|
31-Jul-2018 |
Lu Fengqi <lufq.fnst@cn.fujitsu.com> |
btrfs: Remove fs_info from btrfs_del_root It can be referenced from the passed transaction handle. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
95582b00 |
|
08-May-2018 |
Deepa Dinamani <deepa.kernel@gmail.com> |
vfs: change inode times to use struct timespec64 struct timespec is not y2038 safe. Transition vfs to use y2038 safe struct timespec64 instead. The change was made with the help of the following cocinelle script. This catches about 80% of the changes. All the header file and logic changes are included in the first 5 rules. The rest are trivial substitutions. I avoid changing any of the function signatures or any other filesystem specific data structures to keep the patch simple for review. The script can be a little shorter by combining different cases. But, this version was sufficient for my usecase. virtual patch @ depends on patch @ identifier now; @@ - struct timespec + struct timespec64 current_time ( ... ) { - struct timespec now = current_kernel_time(); + struct timespec64 now = current_kernel_time64(); ... - return timespec_trunc( + return timespec64_trunc( ... ); } @ depends on patch @ identifier xtime; @@ struct \( iattr \| inode \| kstat \) { ... - struct timespec xtime; + struct timespec64 xtime; ... } @ depends on patch @ identifier t; @@ struct inode_operations { ... int (*update_time) (..., - struct timespec t, + struct timespec64 t, ...); ... } @ depends on patch @ identifier t; identifier fn_update_time =~ "update_time$"; @@ fn_update_time (..., - struct timespec *t, + struct timespec64 *t, ...) { ... } @ depends on patch @ identifier t; @@ lease_get_mtime( ... , - struct timespec *t + struct timespec64 *t ) { ... } @te depends on patch forall@ identifier ts; local idexpression struct inode *inode_node; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn_update_time =~ "update_time$"; identifier fn; expression e, E3; local idexpression struct inode *node1; local idexpression struct inode *node2; local idexpression struct iattr *attr1; local idexpression struct iattr *attr2; local idexpression struct iattr attr; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; @@ ( ( - struct timespec ts; + struct timespec64 ts; | - struct timespec ts = current_time(inode_node); + struct timespec64 ts = current_time(inode_node); ) <+... when != ts ( - timespec_equal(&inode_node->i_xtime, &ts) + timespec64_equal(&inode_node->i_xtime, &ts) | - timespec_equal(&ts, &inode_node->i_xtime) + timespec64_equal(&ts, &inode_node->i_xtime) | - timespec_compare(&inode_node->i_xtime, &ts) + timespec64_compare(&inode_node->i_xtime, &ts) | - timespec_compare(&ts, &inode_node->i_xtime) + timespec64_compare(&ts, &inode_node->i_xtime) | ts = current_time(e) | fn_update_time(..., &ts,...) | inode_node->i_xtime = ts | node1->i_xtime = ts | ts = inode_node->i_xtime | <+... attr1->ia_xtime ...+> = ts | ts = attr1->ia_xtime | ts.tv_sec | ts.tv_nsec | btrfs_set_stack_timespec_sec(..., ts.tv_sec) | btrfs_set_stack_timespec_nsec(..., ts.tv_nsec) | - ts = timespec64_to_timespec( + ts = ... -) | - ts = ktime_to_timespec( + ts = ktime_to_timespec64( ...) | - ts = E3 + ts = timespec_to_timespec64(E3) | - ktime_get_real_ts(&ts) + ktime_get_real_ts64(&ts) | fn(..., - ts + timespec64_to_timespec(ts) ,...) ) ...+> ( <... when != ts - return ts; + return timespec64_to_timespec(ts); ...> ) | - timespec_equal(&node1->i_xtime1, &node2->i_xtime2) + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2) | - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2) + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2) | - timespec_compare(&node1->i_xtime1, &node2->i_xtime2) + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2) | node1->i_xtime1 = - timespec_trunc(attr1->ia_xtime1, + timespec64_trunc(attr1->ia_xtime1, ...) | - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2, + attr1->ia_xtime1 = timespec64_trunc(attr2->ia_xtime2, ...) | - ktime_get_real_ts(&attr1->ia_xtime1) + ktime_get_real_ts64(&attr1->ia_xtime1) | - ktime_get_real_ts(&attr.ia_xtime1) + ktime_get_real_ts64(&attr.ia_xtime1) ) @ depends on patch @ struct inode *node; struct iattr *attr; identifier fn; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; expression e; @@ ( - fn(node->i_xtime); + fn(timespec64_to_timespec(node->i_xtime)); | fn(..., - node->i_xtime); + timespec64_to_timespec(node->i_xtime)); | - e = fn(attr->ia_xtime); + e = fn(timespec64_to_timespec(attr->ia_xtime)); ) @ depends on patch forall @ struct inode *node; struct iattr *attr; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); fn (..., - &node->i_xtime, + &ts, ...); | + ts = timespec64_to_timespec(attr->ia_xtime); fn (..., - &attr->ia_xtime, + &ts, ...); ) ...+> } @ depends on patch forall @ struct inode *node; struct iattr *attr; struct kstat *stat; identifier ia_xtime =~ "^ia_[acm]time$"; identifier i_xtime =~ "^i_[acm]time$"; identifier xtime =~ "^[acm]time$"; identifier fn, ret; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime, + &ts, ...); | + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime); + &ts); | + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime, + &ts, ...); | + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime); + &ts); | + ts = timespec64_to_timespec(stat->xtime); ret = fn (..., - &stat->xtime); + &ts); ) ...+> } @ depends on patch @ struct inode *node; struct inode *node2; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier i_xtime3 =~ "^i_[acm]time$"; struct iattr *attrp; struct iattr *attrp2; struct iattr attr ; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; struct kstat *stat; struct kstat stat1; struct timespec64 ts; identifier xtime =~ "^[acmb]time$"; expression e; @@ ( ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1 ; | node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \); | node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \); | node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \); | stat->xtime = node2->i_xtime1; | stat1.xtime = node2->i_xtime1; | ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1 ; | ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2; | - e = node->i_xtime1; + e = timespec64_to_timespec( node->i_xtime1 ); | - e = attrp->ia_xtime1; + e = timespec64_to_timespec( attrp->ia_xtime1 ); | node->i_xtime1 = current_time(...); | node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); | node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); | - node->i_xtime1 = e; + node->i_xtime1 = timespec_to_timespec64(e); ) Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: <anton@tuxera.com> Cc: <balbi@kernel.org> Cc: <bfields@fieldses.org> Cc: <darrick.wong@oracle.com> Cc: <dhowells@redhat.com> Cc: <dsterba@suse.com> Cc: <dwmw2@infradead.org> Cc: <hch@lst.de> Cc: <hirofumi@mail.parknet.co.jp> Cc: <hubcap@omnibond.com> Cc: <jack@suse.com> Cc: <jaegeuk@kernel.org> Cc: <jaharkes@cs.cmu.edu> Cc: <jslaby@suse.com> Cc: <keescook@chromium.org> Cc: <mark@fasheh.com> Cc: <miklos@szeredi.hu> Cc: <nico@linaro.org> Cc: <reiserfs-devel@vger.kernel.org> Cc: <richard@nod.at> Cc: <sage@redhat.com> Cc: <sfrench@samba.org> Cc: <swhiteho@redhat.com> Cc: <tj@kernel.org> Cc: <trond.myklebust@primarydata.com> Cc: <tytso@mit.edu> Cc: <viro@zeniv.linux.org.uk>
|
#
c1d7c514 |
|
03-Apr-2018 |
David Sterba <dsterba@suse.com> |
btrfs: replace GPL boilerplate by SPDX -- sources Remove GPL boilerplate text (long, short, one-line) and keep the rest, ie. personal, company or original source copyright statements. Add the SPDX header. Signed-off-by: David Sterba <dsterba@suse.com>
|
#
bae15d95 |
|
07-Nov-2017 |
Qu Wenruo <wqu@suse.com> |
btrfs: Cleanup existing name_len checks Since tree-checker has verified leaf when reading from disk, we don't need the existing verify_dir_item() or btrfs_is_name_len_valid() checks. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
fa0d0888 |
|
09-Sep-2017 |
Christos Gkekas <chris.gekas@gmail.com> |
btrfs: Clean up dead code in root-tree The value of variable 'can_recover' is never used after being set, thus it should be removed, as it was never used since the first commit 68a7342c51c95042 ("Btrfs: cleanup orphaned root orphan item"). Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
1cd5447e |
|
17-Aug-2017 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: pass fs_info to btrfs_del_root instead of tree_root btrfs_del_roots always uses the tree_root. Let's pass fs_info instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
a4f78750 |
|
29-Jun-2017 |
David Sterba <dsterba@suse.com> |
btrfs: get fs_info from eb in btrfs_print_leaf, remove argument Signed-off-by: David Sterba <dsterba@suse.com>
|
#
bc98a42c |
|
17-Jul-2017 |
David Howells <dhowells@redhat.com> |
VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) Firstly by applying the following with coccinelle's spatch: @@ expression SB; @@ -SB->s_flags & MS_RDONLY +sb_rdonly(SB) to effect the conversion to sb_rdonly(sb), then by applying: @@ expression A, SB; @@ ( -(!sb_rdonly(SB)) && A +!sb_rdonly(SB) && A | -A != (sb_rdonly(SB)) +A != sb_rdonly(SB) | -A == (sb_rdonly(SB)) +A == sb_rdonly(SB) | -!(sb_rdonly(SB)) +!sb_rdonly(SB) | -A && (sb_rdonly(SB)) +A && sb_rdonly(SB) | -A || (sb_rdonly(SB)) +A || sb_rdonly(SB) | -(sb_rdonly(SB)) != A +sb_rdonly(SB) != A | -(sb_rdonly(SB)) == A +sb_rdonly(SB) == A | -(sb_rdonly(SB)) && A +sb_rdonly(SB) && A | -(sb_rdonly(SB)) || A +sb_rdonly(SB) || A ) @@ expression A, B, SB; @@ ( -(sb_rdonly(SB)) ? 1 : 0 +sb_rdonly(SB) | -(sb_rdonly(SB)) ? A : B +sb_rdonly(SB) ? A : B ) to remove left over excess bracketage and finally by applying: @@ expression A, SB; @@ ( -(A & MS_RDONLY) != sb_rdonly(SB) +(bool)(A & MS_RDONLY) != sb_rdonly(SB) | -(A & MS_RDONLY) == sb_rdonly(SB) +(bool)(A & MS_RDONLY) == sb_rdonly(SB) ) to make comparisons against the result of sb_rdonly() (which is a bool) work correctly. Signed-off-by: David Howells <dhowells@redhat.com>
|
#
64c7b014 |
|
06-Jun-2017 |
Su Yue <suy.fnst@cn.fujitsu.com> |
btrfs: Check name_len before in btrfs_del_root_ref btrfs_del_root_ref calls btrfs_search_slot and reads name from root_ref. Call btrfs_is_name_len_valid before memcmp. Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
fa7aede2 |
|
07-Apr-2017 |
Deepa Dinamani <deepa.kernel@gmail.com> |
btrfs: Use ktime_get_real_ts for root ctime btrfs_root_item maintains the ctime for root updates. This is not part of vfs_inode. Since current_time() uses struct inode* as an argument as Linus suggested, this cannot be used to update root times unless, we modify the signature to use inode. Since btrfs uses nanosecond time granularity, it can also use ktime_get_real_ts directly to obtain timestamp for the root. It is necessary to use the timespec time api here because the same btrfs_set_stack_timespec_*() apis are used for vfs inode times as well. These can be transitioned to using timespec64 when btrfs internally changes to use timespec64 as well. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: David Sterba <dsterba@suse.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
310712b2 |
|
18-Jan-2017 |
Omar Sandoval <osandov@fb.com> |
Btrfs: constify struct btrfs_{,disk_}key wherever possible In a lot of places, it's unclear when it's safe to reuse a struct btrfs_key after it has been passed to a helper function. Constify these arguments wherever possible to make it obvious. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
3a45bb20 |
|
09-Sep-2016 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: remove root parameter from transaction commit/end routines Now we only use the root parameter to print the root objectid in a tracepoint. We can use the root parameter from the transaction handle for that. It's also used to join the transaction with async commits, so we remove the comment that it's just for checking. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
2ff7e61e |
|
22-Jun-2016 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: take an fs_info directly when the root is not used otherwise There are loads of functions in btrfs that accept a root parameter but only use it to obtain an fs_info pointer. Let's convert those to just accept an fs_info pointer directly. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
0b246afa |
|
22-Jun-2016 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: root->fs_info cleanup, add fs_info convenience variables In routines where someptr->fs_info is referenced multiple times, we introduce a convenience variable. This makes the code considerably more readable. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
6bccf3ab |
|
21-Jun-2016 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: call functions that always use the same root with fs_info instead There are many functions that are always called with the same root argument. Rather than passing the same root every time, we can pass an fs_info pointer instead and have the function get the root pointer itself. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
5d163e0e |
|
20-Sep-2016 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: unsplit printed strings CodingStyle chapter 2: "[...] never break user-visible strings such as printk messages, because that breaks the ability to grep for them." This patch unsplits user-visible strings. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
35bbb97f |
|
17-Aug-2016 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: don't create or leak aliased root while cleaning up orphans commit 909c3a22da3 (Btrfs: fix loading of orphan roots leading to BUG_ON) avoids the BUG_ON but can add an aliased root to the dead_roots list or leak the root. Since we've already been loading roots into the radix tree, we should use it before looking the root up on disk. Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
|
#
66642832 |
|
10-Jun-2016 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: btrfs_abort_transaction, drop root parameter __btrfs_abort_transaction doesn't use its root parameter except to obtain an fs_info pointer. We can obtain that from trans->root->fs_info for now and from trans->fs_info in a later patch. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
01327610 |
|
19-May-2016 |
Nicholas D Steeves <nsteeves@gmail.com> |
btrfs: fix string and comment grammatical issues and typos Signed-off-by: Nicholas D Steeves <nsteeves@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
34d97007 |
|
16-Mar-2016 |
Anand Jain <anand.jain@oracle.com> |
btrfs: rename btrfs_std_error to btrfs_handle_fs_error btrfs_std_error() handles errors, puts FS into readonly mode (as of now). So its good idea to rename it to btrfs_handle_fs_error(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ edit changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
|
#
909c3a22 |
|
02-Mar-2016 |
Filipe Manana <fdmanana@suse.com> |
Btrfs: fix loading of orphan roots leading to BUG_ON When looking for orphan roots during mount we can end up hitting a BUG_ON() (at root-item.c:btrfs_find_orphan_roots()) if a log tree is replayed and qgroups are enabled. This is because after a log tree is replayed, a transaction commit is made, which triggers qgroup extent accounting which in turn does backref walking which ends up reading and inserting all roots in the radix tree fs_info->fs_root_radix, including orphan roots (deleted snapshots). So after the log tree is replayed, when finding orphan roots we hit the BUG_ON with the following trace: [118209.182438] ------------[ cut here ]------------ [118209.183279] kernel BUG at fs/btrfs/root-tree.c:314! [118209.184074] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [118209.185123] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic ppdev xor raid6_pq evdev sg parport_pc parport acpi_cpufreq tpm_tis tpm psmouse processor i2c_piix4 serio_raw pcspkr i2c_core button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs] [118209.186318] CPU: 14 PID: 28428 Comm: mount Tainted: G W 4.5.0-rc5-btrfs-next-24+ #1 [118209.186318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [118209.186318] task: ffff8801ec131040 ti: ffff8800af34c000 task.ti: ffff8800af34c000 [118209.186318] RIP: 0010:[<ffffffffa04237d7>] [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs] [118209.186318] RSP: 0018:ffff8800af34faa8 EFLAGS: 00010246 [118209.186318] RAX: 00000000ffffffef RBX: 00000000ffffffef RCX: 0000000000000001 [118209.186318] RDX: 0000000080000000 RSI: 0000000000000001 RDI: 00000000ffffffff [118209.186318] RBP: ffff8800af34fb08 R08: 0000000000000001 R09: 0000000000000000 [118209.186318] R10: ffff8800af34f9f0 R11: 6db6db6db6db6db7 R12: ffff880171b97000 [118209.186318] R13: ffff8801ca9d65e0 R14: ffff8800afa2e000 R15: 0000160000000000 [118209.186318] FS: 00007f5bcb914840(0000) GS:ffff88023edc0000(0000) knlGS:0000000000000000 [118209.186318] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [118209.186318] CR2: 00007f5bcaceb5d9 CR3: 00000000b49b5000 CR4: 00000000000006e0 [118209.186318] Stack: [118209.186318] fffffbffffffffff 010230ffffffffff 0101000000000000 ff84000000000000 [118209.186318] fbffffffffffffff 30ffffffffffffff 0000000000000101 ffff880082348000 [118209.186318] 0000000000000000 ffff8800afa2e000 ffff8800afa2e000 0000000000000000 [118209.186318] Call Trace: [118209.186318] [<ffffffffa042e2db>] open_ctree+0x1e37/0x21b9 [btrfs] [118209.186318] [<ffffffffa040a753>] btrfs_mount+0x97e/0xaed [btrfs] [118209.186318] [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf [118209.186318] [<ffffffff8117b87e>] mount_fs+0x67/0x131 [118209.186318] [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde [118209.186318] [<ffffffffa0409f81>] btrfs_mount+0x1ac/0xaed [btrfs] [118209.186318] [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf [118209.186318] [<ffffffff8108c26b>] ? lockdep_init_map+0xb9/0x1b3 [118209.186318] [<ffffffff8117b87e>] mount_fs+0x67/0x131 [118209.186318] [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde [118209.186318] [<ffffffff81195637>] do_mount+0x8a6/0x9e8 [118209.186318] [<ffffffff8119598d>] SyS_mount+0x77/0x9f [118209.186318] [<ffffffff81493017>] entry_SYSCALL_64_fastpath+0x12/0x6b [118209.186318] Code: 64 00 00 85 c0 89 c3 75 24 f0 41 80 4c 24 20 20 49 8b bc 24 f0 01 00 00 4c 89 e6 e8 e8 65 00 00 85 c0 89 c3 74 11 83 f8 ef 75 02 <0f> 0b 4c 89 e7 e8 da 72 00 00 eb 1c 41 83 bc 24 00 01 00 00 00 [118209.186318] RIP [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs] [118209.186318] RSP <ffff8800af34faa8> [118209.230735] ---[ end trace 83938f987d85d477 ]--- So fix this by not treating the error -EEXIST, returned when attempting to insert a root already inserted by the backref walking code, as an error. The following test case for xfstests reproduces the bug: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { _cleanup_flakey cd / rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter . ./common/dmflakey # real QA test starts here _supported_fs btrfs _supported_os Linux _require_scratch _require_dm_target flakey _require_metadata_journaling $SCRATCH_DEV rm -f $seqres.full _scratch_mkfs >>$seqres.full 2>&1 _init_flakey _mount_flakey _run_btrfs_util_prog quota enable $SCRATCH_MNT # Create 2 directories with one file in one of them. # We use these just to trigger a transaction commit later, moving the file from # directory a to directory b and doing an fsync against directory a. mkdir $SCRATCH_MNT/a mkdir $SCRATCH_MNT/b touch $SCRATCH_MNT/a/f sync # Create our test file with 2 4K extents. $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io # Create a snapshot and delete it. This doesn't really delete the snapshot # immediately, just makes it inaccessible and invisible to user space, the # snapshot is deleted later by a dedicated kernel thread (cleaner kthread) # which is woke up at the next transaction commit. # A root orphan item is inserted into the tree of tree roots, so that if a # power failure happens before the dedicated kernel thread does the snapshot # deletion, the next time the filesystem is mounted it resumes the snapshot # deletion. _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap # Now overwrite half of the extents we wrote before. Because we made a snapshpot # before, which isn't really deleted yet (since no transaction commit happened # after we did the snapshot delete request), the non overwritten extents get # referenced twice, once by the default subvolume and once by the snapshot. $XFS_IO_PROG -c "pwrite -S 0xbb 4K 8K" $SCRATCH_MNT/foobar | _filter_xfs_io # Now move file f from directory a to directory b and fsync directory a. # The fsync on the directory a triggers a transaction commit (because a file # was moved from it to another directory) and the file fsync leaves a log tree # with file extent items to replay. mv $SCRATCH_MNT/a/f $SCRATCH_MNT/a/b $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar echo "File digest before power failure:" md5sum $SCRATCH_MNT/foobar | _filter_scratch # Now simulate a power failure and mount the filesystem to replay the log tree. # After the log tree was replayed, we used to hit a BUG_ON() when processing # the root orphan item for the deleted snapshot. This is because when processing # an orphan root the code expected to be the first code inserting the root into # the fs_info->fs_root_radix radix tree, while in reallity it was the second # caller attempting to do it - the first caller was the transaction commit that # took place after replaying the log tree, when updating the qgroup counters. _flakey_drop_and_remount echo "File digest before after failure:" # Must match what he got before the power failure. md5sum $SCRATCH_MNT/foobar | _filter_scratch _unmount_flakey status=0 exit Fixes: 2d9e97761087 ("Btrfs: use btrfs_get_fs_root in resolve_indirect_ref") Cc: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
|
#
04b285f3 |
|
07-Feb-2016 |
Deepa Dinamani <deepa.kernel@gmail.com> |
btrfs: Replace CURRENT_TIME by current_fs_time() CURRENT_TIME macro is not appropriate for filesystems as it doesn't use the right granularity for filesystem timestamps. Use current_fs_time() instead. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: linux-btrfs@vger.kernel.org Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
0412e58c |
|
24-Aug-2015 |
Alexandru Moise <00moses.alexander00@gmail.com> |
btrfs: Fixed declaration of old_len old_len is used to store the return value of btrfs_item_size_nr(). The return value of btrfs_item_size_nr() is of type u32. To improve code correctness and avoid mixing signed and unsigned integers I've changed old_len to be of type u32 as well. Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
f14d104d |
|
08-Oct-2015 |
David Sterba <dsterba@suse.com> |
btrfs: switch more printks to our helpers Convert the simple cases, not all functions provide a way to reach the fs_info. Also skipped debugging messages (print-tree, integrity checker and pr_debug) and messages that are printed from possibly unfinished mount. Signed-off-by: David Sterba <dsterba@suse.com>
|
#
a4553fef |
|
25-Sep-2015 |
Anand Jain <anand.jain@oracle.com> |
Btrfs: consolidate btrfs_error() to btrfs_std_error() btrfs_error() and btrfs_std_error() does the same thing and calls _btrfs_std_error(), so consolidate them together. And the main motivation is that btrfs_error() is closely named with btrfs_err(), one handles error action the other is to log the error, so don't closely name them. Signed-off-by: Anand Jain <anand.jain@oracle.com> Suggested-by: David Sterba <dsterba@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
27cdeb70 |
|
02-Apr-2014 |
Miao Xie <miaox@cn.fujitsu.com> |
Btrfs: use bitfield instead of integer data type for the some variants in btrfs_root Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
|
#
886322e8 |
|
16-Feb-2014 |
Sachin Kamat <sachin.kamat@linaro.org> |
btrfs: Use PTR_ERR_OR_ZERO PTR_RET is deprecated. Use PTR_ERR_OR_ZERO instead. While at it also include missing err.h header. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Josef Bacik <jbacik@fb.com>
|
#
efe120a0 |
|
20-Dec-2013 |
Frank Holton <fholton@gmail.com> |
Btrfs: convert printk to btrfs_ and fix BTRFS prefix Convert all applicable cases of printk and pr_* to the btrfs_* macros. Fix all uses of the BTRFS prefix. Signed-off-by: Frank Holton <fholton@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
|
#
75ac2dd9 |
|
04-Nov-2013 |
Kelley Nielsen <kelleynnn@gmail.com> |
btrfs: expand btrfs_find_item() to include find_root_ref functionality This patch is the second step in bootstrapping the btrfs_find_item interface. The btrfs_find_root_ref() is similar to the former __inode_info(); it accepts four of its parameters, and duplicates the first half of its functionality. Replace the one former call to btrfs_find_root_ref() with a call to btrfs_find_item(), along with the defined key type that was used internally by btrfs_find_root ref, and a null found key. In btrfs_find_item(), add a test for the null key at the place where the functionality of btrfs_find_root_ref() ends; btrfs_find_item() then returns if the test passes. Finally, remove btrfs_find_root_ref(). Signed-off-by: Kelley Nielsen <kelleynnn@gmail.com> Suggested-by: Zach Brown <zab@redhat.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <clm@fb.com>
|
#
14927d95 |
|
25-Sep-2013 |
Miao Xie <miaox@cn.fujitsu.com> |
Btrfs: insert orphan roots into fs radix tree Now we don't drop all the deleted snapshots/subvolumes before the space balance. It means we have to relocate the space which is held by the dead snapshots/subvolumes. So we must into them into fs radix tree, or we would forget to commit the change of them when doing transaction commit, and it would corrupt the metadata. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
#
c1c9ff7c |
|
20-Aug-2013 |
Geert Uytterhoeven <geert@linux-m68k.org> |
Btrfs: Remove superfluous casts from u64 to unsigned long long u64 is "unsigned long long" on all architectures now, so there's no need to cast it when formatting it using the "ll" length modifier. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
#
171170c1 |
|
14-Aug-2013 |
Sergei Trofimovich <slyfox@gentoo.org> |
btrfs: mark some local function as 'static' Cc: Josef Bacik <jbacik@fusionio.com> Cc: Chris Mason <chris.mason@fusionio.com> Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
#
3cae210f |
|
15-Jul-2013 |
Qu Wenruo <quwenruo@cn.fujitsu.com> |
btrfs: Cleanup for using BTRFS_SETGET_STACK instead of raw convert Some codes still use the cpu_to_lexx instead of the BTRFS_SETGET_STACK_FUNCS declared in ctree.h. Also added some BTRFS_SETGET_STACK_FUNCS for btrfs_header btrfs_timespec and other structures. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: Miao Xie <miaoxie@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
#
68a7342c |
|
27-Jun-2013 |
Josef Bacik <jbacik@fusionio.com> |
Btrfs: cleanup orphaned root orphan item I hit a weird problem were my root item had been deleted but the orphan item had not. This isn't necessarily a problem, but it keeps the file system from being mounted. To fix this we just need to axe the orphan item if we can't find the fs root when we're putting them altogether. With this patch I was able to successfully mount my file system. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
#
b7394eb9 |
|
26-May-2013 |
Liu Bo <bo.li.liu@oracle.com> |
Btrfs: remove unused code in btrfs_del_root 'leaf' and 'ri' is not used somehow. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
#
cb517eab |
|
15-May-2013 |
Miao Xie <miaox@cn.fujitsu.com> |
Btrfs: cleanup the similar code of the fs root read There are several functions whose code is similar, such as btrfs_find_last_root() btrfs_read_fs_root_no_radix() Besides that, some functions are invoked twice, it is unnecessary, for example, we are sure that all roots which is found in btrfs_find_orphan_roots() have their orphan items, so it is unnecessary to check the orphan item again. So cleanup it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
#
5fbf83c1 |
|
19-Apr-2013 |
Stefan Behrens <sbehrens@giantdisaster.de> |
Btrfs: delete unused parameter to btrfs_read_root_item() Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
#
5f3ab90a |
|
07-Dec-2012 |
Anand Jain <anand.jain@oracle.com> |
Btrfs: rename root_times_lock to root_item_lock Originally root_times_lock was introduced as part of send/receive code however newly developed patch to label the subvol reused the same lock, so renaming it for a meaningful name. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
#
005d6427 |
|
18-Sep-2012 |
David Sterba <dsterba@suse.cz> |
btrfs: move transaction aborts to the point of failure Call btrfs_abort_transaction as early as possible when an error condition is detected, that way the line number reported is useful and we're not clueless anymore which error path led to the abort. Signed-off-by: David Sterba <dsterba@suse.cz>
|
#
dadd1105 |
|
30-Jul-2012 |
Dan Carpenter <dan.carpenter@oracle.com> |
Btrfs: fix some endian bugs handling the root times "trans->transid" is cpu endian but we want to store the data as little endian. "item->ctime.nsec" is only 32 bits, not 64. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
|
#
8ea05e3a |
|
25-Jul-2012 |
Alexander Block <ablock84@googlemail.com> |
Btrfs: introduce subvol uuids and times This patch introduces uuids for subvolumes. Each subvolume has it's own uuid. In case it was snapshotted, it also contains parent_uuid. In case it was received, it also contains received_uuid. It also introduces subvolume ctime/otime/stime/rtime. The first two are comparable to the times found in inodes. otime is the origin/creation time and ctime is the change time. stime/rtime are only valid on received subvolumes. stime is the time of the subvolume when it was sent. rtime is the time of the subvolume when it was received. Additionally to the times, we have a transid for each time. They are updated at the same place as the times. btrfs receive uses stransid and rtransid to find out if a received subvolume changed in the meantime. If an older kernel mounts a filesystem with the extented fields, all fields become invalid. The next mount with a new kernel will detect this and reset the fields. Signed-off-by: Alexander Block <ablock84@googlemail.com> Reviewed-by: David Sterba <dave@jikos.cz> Reviewed-by: Arne Jansen <sensille@gmx.net> Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Reviewed-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
|
#
79787eaa |
|
12-Mar-2012 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: replace many BUG_ONs with proper error handling btrfs currently handles most errors with BUG_ON. This patch is a work-in- progress but aims to handle most errors other than internal logic errors and ENOMEM more gracefully. This iteration prevents most crashes but can run into lockups with the page lock on occasion when the timing "works out." Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
#
b45a9d8b |
|
03-Oct-2011 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: btrfs_update_root error push-up btrfs_update_root BUG's when it can't alloc a path, yet it can recover from a search error. This patch returns -ENOMEM instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
#
d16cb050 |
|
03-Oct-2011 |
Jeff Mahoney <jeffm@suse.com> |
btrfs: Simplify btrfs_insert_root btrfs_insert_root is just a wrapper for btrfs_insert_item. Just return the error directly. Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
#
bf5f32ec |
|
14-Jul-2011 |
Mark Fasheh <mfasheh@suse.com> |
btrfs: make btrfs_set_root_node void This is fairly trivial - btrfs_set_root_node() - always returns zero so we can just make it void. All callers ignore the return code now anyway. I also made sure to check that none of the functions that btrfs_set_root_node() calls returns an error that we might have needed to catch and pass back. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
65a246c5 |
|
18-May-2011 |
Tsutomu Itoh <t-itoh@jp.fujitsu.com> |
Btrfs: return error code to caller when btrfs_del_item fails The error code is returned instead of calling BUG_ON when btrfs_del_item returns the error. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
f2a97a9d |
|
04-May-2011 |
David Sterba <dsterba@suse.cz> |
btrfs: remove all unused functions Remove static and global declarations and/or definitions. Reduces size of btrfs.ko by ~3.4kB. text data bss dec hex filename 402081 7464 200 409745 64091 btrfs.ko.base 398620 7144 200 405964 631cc btrfs.ko.remove-all Signed-off-by: David Sterba <dsterba@suse.cz>
|
#
b3b4aa74 |
|
20-Apr-2011 |
David Sterba <dsterba@suse.cz> |
btrfs: drop unused parameter from btrfs_release_path parameter tree root it's not used since commit 5f39d397dfbe140a14edecd4e73c34ce23c4f9ee ("Btrfs: Create extent_buffer interface for large blocksizes") Signed-off-by: David Sterba <dsterba@suse.cz>
|
#
08fe4db1 |
|
27-Mar-2011 |
Li Zefan <lizf@cn.fujitsu.com> |
Btrfs: Fix uninitialized root flags for subvolumes root_item->flags and root_item->byte_limit are not initialized when a subvolume is created. This bug is not revealed until we added readonly snapshot support - now you mount a btrfs filesystem and you may find the subvolumes in it are readonly. To work around this problem, we steal a bit from root_item->inode_item->flags, and use it to indicate if those fields have been properly initialized. When we read a tree root from disk, we check if the bit is set, and if not we'll set the flag and initialize the two fields of the root item. Reported-by: Andreas Philipp <philipp.andreas@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Andreas Philipp <philipp.andreas@gmail.com> cc: stable@kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
db5b493a |
|
23-Mar-2011 |
Tsutomu Itoh <t-itoh@jp.fujitsu.com> |
Btrfs: cleanup some BUG_ON() This patch changes some BUG_ON() to the error return. (but, most callers still use BUG_ON()) Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
559af821 |
|
29-Oct-2010 |
Andi Kleen <andi@firstfloor.org> |
Btrfs: cleanup warnings from gcc 4.6 (nonbugs) These are all the cases where a variable is set, but not read which are not bugs as far as I can see, but simply leftovers. Still needs more review. Found by gcc 4.6's new warnings Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
15e70000 |
|
17-May-2010 |
Sage Weil <sage@newdream.net> |
Btrfs: avoid BUG when dropping root and reference in same transaction If btrfs_ioctl_snap_destroy() deletes a snapshot but finishes with end_transaction(), the cleaner kthread may come in and drop the root in the same transaction. If that's the case, the root's refs still == 1 in the tree when btrfs_del_root() deletes the item, because commit_fs_roots() hasn't updated it yet (that happens during the commit). This wasn't a problem before only because btrfs_ioctl_snap_destroy() would commit the transaction before dropping the dentry reference, so the dead root wouldn't get queued up until after the fs root item was updated in the btree. Since it is not an error to drop the root reference and the root in the same transaction, just drop the BUG_ON() in btrfs_del_root(). Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
d68fc57b |
|
16-May-2010 |
Yan, Zheng <zheng.yan@oracle.com> |
Btrfs: Metadata reservation for orphan inodes reserve metadata space for handling orphan inodes Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
df66916e |
|
06-Nov-2009 |
Li Dongyang <jerry87905@gmail.com> |
Btrfs: skip btrfs_release_path in btrfs_update_root and btrfs_del_root We don't need to call btrfs_release_path because btrfs_free_path will do that for us. Signed-off-by: Li Dongyang <Jerry87905@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
76dda93c |
|
21-Sep-2009 |
Yan, Zheng <zheng.yan@oracle.com> |
Btrfs: add snapshot/subvolume destroy ioctl This patch adds snapshot/subvolume destroy ioctl. A subvolume that isn't being used and doesn't contains links to other subvolumes can be destroyed. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
4df27c4d |
|
21-Sep-2009 |
Yan, Zheng <zheng.yan@oracle.com> |
Btrfs: change how subvolumes are organized btrfs allows subvolumes and snapshots anywhere in the directory tree. If we snapshot a subvolume that contains a link to other subvolume called subvolA, subvolA can be accessed through both the original subvolume and the snapshot. This is similar to creating hard link to directory, and has the very similar problems. The aim of this patch is enforcing there is only one access point to each subvolume. Only the first directory entry (the one added when the subvolume/snapshot was created) is treated as valid access point. The first directory entry is distinguished by checking root forward reference. If the corresponding root forward reference is missing, we know the entry is not the first one. This patch also adds snapshot/subvolume rename support, the code allows rename subvolume link across subvolumes. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
5d4f98a2 |
|
10-Jun-2009 |
Yan Zheng <zheng.yan@oracle.com> |
Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) This commit introduces a new kind of back reference for btrfs metadata. Once a filesystem has been mounted with this commit, IT WILL NO LONGER BE MOUNTABLE BY OLDER KERNELS. When a tree block in subvolume tree is cow'd, the reference counts of all extents it points to are increased by one. At transaction commit time, the old root of the subvolume is recorded in a "dead root" data structure, and the btree it points to is later walked, dropping reference counts and freeing any blocks where the reference count goes to 0. The increments done during cow and decrements done after commit cancel out, and the walk is a very expensive way to go about freeing the blocks that are no longer referenced by the new btree root. This commit reduces the transaction overhead by avoiding the need for dead root records. When a non-shared tree block is cow'd, we free the old block at once, and the new block inherits old block's references. When a tree block with reference count > 1 is cow'd, we increase the reference counts of all extents the new block points to by one, and decrease the old block's reference count by one. This dead tree avoidance code removes the need to modify the reference counts of lower level extents when a non-shared tree block is cow'd. But we still need to update back ref for all pointers in the block. This is because the location of the block is recorded in the back ref item. We can solve this by introducing a new type of back ref. The new back ref provides information about pointer's key, level and in which tree the pointer lives. This information allow us to find the pointer by searching the tree. The shortcoming of the new back ref is that it only works for pointers in tree blocks referenced by their owner trees. This is mostly a problem for snapshots, where resolving one of these fuzzy back references would be O(number_of_snapshots) and quite slow. The solution used here is to use the fuzzy back references in the common case where a given tree block is only referenced by one root, and use the full back references when multiple roots have a reference on a given block. This commit adds per subvolume red-black tree to keep trace of cached inodes. The red-black tree helps the balancing code to find cached inodes whose inode numbers within a given range. This commit improves the balancing code by introducing several data structures to keep the state of balancing. The most important one is the back ref cache. It caches how the upper level tree blocks are referenced. This greatly reduce the overhead of checking back ref. The improved balancing code scales significantly better with a large number of snapshots. This is a very large commit and was written in a number of pieces. But, they depend heavily on the disk format change and were squashed together to make sure git bisect didn't end up in a bad state wrt space balancing or the format change. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
d397712b |
|
05-Jan-2009 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Fix checkpatch.pl warnings There were many, most are fixed now. struct-funcs.c generates some warnings but these are bogus. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
b2950863 |
|
02-Dec-2008 |
Christoph Hellwig <hch@lst.de> |
Btrfs: make things static and include the right headers Shut up various sparse warnings about symbols that should be either static or have their declarations in scope. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
ea9e8b11 |
|
17-Nov-2008 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: prevent loops in the directory tree when creating snapshots For a directory tree: /mnt/subvolA/subvolB btrfsctl -s /mnt/subvolA/subvolB /mnt Will create a directory loop with subvolA under subvolB. This commit uses the forward refs for each subvol and snapshot to error out before creating the loop. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
0660b5af |
|
17-Nov-2008 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Add backrefs and forward refs for subvols and snapshots Subvols and snapshots can now be referenced from any point in the directory tree. We need to maintain back refs for them so we can find lost subvols. Forward refs are added so that we know all of the subvols and snapshots referenced anywhere in the directory tree of a single subvol. This can be used to do recursive snapshotting (but they aren't yet) and it is also used to detect and prevent directory loops when creating new snapshots. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
d352ac68 |
|
29-Sep-2008 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: add and improve comments This improves the comments at the top of many functions. It didn't dive into the guts of functions because I was trying to avoid merging problems with the new allocator and back reference work. extent-tree.c and volumes.c were both skipped, and there is definitely more work todo in cleaning and commenting the code. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
1a40e23b |
|
26-Sep-2008 |
Zheng Yan <zheng.yan@oracle.com> |
Btrfs: update space balancing code This patch updates the space balancing code to utilize the new backref format. Before, btrfs-vol -b would break any COW links on data blocks or metadata. This was slow and caused the amount of space used to explode if a large number of snapshots were present. The new code can keeps the sharing of all data extents and most of the tree blocks. To maintain the sharing of data extents, the space balance code uses a seperate inode hold data extent pointers, then updates the references to point to the new location. To maintain the sharing of tree blocks, the space balance code uses reloc trees to relocate tree blocks in reference counted roots. There is one reloc tree for each subvol, and all reloc trees share same root key objectid. Reloc trees are snapshots of the latest committed roots of subvols (root->commit_root). To relocate a tree block referenced by a subvol, there are two steps. COW the block through subvol's reloc tree, then update block pointer in the subvol to point to the new block. Since all reloc trees share same root key objectid, doing special handing for tree blocks owned by them is easy. Once a tree block has been COWed in one reloc tree, we can use the resulting new block directly when the same block is required to COW again through other reloc trees. In this way, relocated tree blocks are shared between reloc trees, so they are also shared between subvols. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
e02119d5 |
|
05-Sep-2008 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Add a write ahead tree log to optimize synchronous operations File syncs and directory syncs are optimized by copying their items into a special (copy-on-write) log tree. There is one log tree per subvolume and the btrfs super block points to a tree of log tree roots. After a crash, items are copied out of the log tree and back into the subvolume. See tree-log.c for all the details. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
b48652c1 |
|
04-Aug-2008 |
Yan Zheng <zheng.yan@oracle.com> |
Btrfs: Various small fixes. This trivial patch contains two locking fixes and a off by one fix. --- Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
a7a16fd7 |
|
26-Jun-2008 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Fix deadlock while searching for dead roots on mount btrfs_find_dead_roots called btrfs_read_fs_root_no_radix, which means we end up calling btrfs_search_slot with a path already held. The fix is to remember the key inside btrfs_find_dead_roots and drop the path. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
bf4ef679 |
|
08-May-2008 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Properly find the root for snapshotted blocks during chunk relocation Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
d6667462 |
|
03-Jan-2008 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Change st_blocksize to 4k Some programs (python) do rwm cycles at the granularity returned by stat. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
edbd8d4e |
|
21-Dec-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Support for online FS resize (grow and shrink) Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
5f39d397 |
|
15-Oct-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Create extent_buffer interface for large blocksizes Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
5ce14bbc |
|
11-Sep-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Find and remove dead roots the first time a root is loaded. Dead roots are trees left over after a crash, and they were either in the process of being removed or were waiting to be removed when the box crashed. Before, a search of the entire tree of root pointers was done on mount looking for dead roots. Now, the search is done the first time we load a root. This makes mount faster when there are a large number of snapshots, and it enables the block accounting code to properly update the block counts on the latest root as old versions of the root are reaped after a crash. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
a1f39630 |
|
11-Jul-2007 |
Aneesh <aneesh.kumar@gmail.com> |
Btrfs: Some code cleanups Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
ec6b910f |
|
11-Jul-2007 |
Zach Brown <zach.brown@oracle.com> |
Btrfs: trivial include fixups Almost none of the files including module.h need to do so, remove them. Include sched.h in extent-tree.c to silence a warning about cond_resched() being undeclared. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
22bb92f3 |
|
22-Jun-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Documentation update Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
5eda7b5e |
|
22-Jun-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Add the ability to find and remove dead roots after a crash. Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
54aa1f4d |
|
22-Jun-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Audit callers and return codes to make sure -ENOSPC gets up the stack Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
f1ace244 |
|
13-Jun-2007 |
Aneesh <aneesh.kumar@linux.vnet.ibm.com> |
btrfs: Code cleanup Attaching below is some of the code cleanups that i came across while reading the code. a) alloc_path already calls init_path. b) Mention that btrfs_inode is the in memory copy.Ext4 have ext4_inode_info as the in memory copy ext4_inode as the disk copy Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
6cbd5570 |
|
12-Jun-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: add GPLv2 Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
5af3981c |
|
12-Jun-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: printk fixes Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
84f54cfa |
|
12-Jun-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: 64 bit div fixes Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
8fd17795 |
|
19-Apr-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: early fsync support Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
c5739bba |
|
10-Apr-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: snapshot progress Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
5caf2a00 |
|
02-Apr-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: dynamic allocation of path struct Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
d6025579 |
|
30-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: corruption hunt continues Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
d5719762 |
|
23-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
btrfs_create, btrfs_write_super, btrfs_sync_fs Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
e20d96d6 |
|
21-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
Mountable btrfs, with readdir Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
2e635a27 |
|
21-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: initial move to kernel module land Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
e089f05c |
|
16-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: transaction handles everywhere Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
62e2749e |
|
14-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Use a chunk of the key flags to record the item type. Add (untested and simple) directory item code Fix comp_keys to use the new key ordering Add btrfs_insert_empty_item Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
123abc88 |
|
14-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: variable block size support Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
#
3768f368 |
|
13-Mar-2007 |
Chris Mason <chris.mason@oracle.com> |
Btrfs: Change the super to point to a tree of trees to enable persistent snapshots Signed-off-by: Chris Mason <chris.mason@oracle.com>
|