#
6bb3f7f4 |
|
28-Jan-2024 |
Guoyu Ou <benogy@gmail.com> |
bcachefs: unlock parent dir if entry is not found in subvolume deletion
Parent dir is locked by user_path_locked_at() before validating the required dentry. It should be unlocked if we cannot perform the deletion. This fixes the problem:
$ bcachefs subvolume delete not-exist-entry
BCH_IOCTL_SUBVOLUME_DESTROY ioctl error: No such file or directory
$ bcachefs subvolume delete not-exist-entry
The second command gets stuck because the parent dir is still locked from the previous deletion attempt. Signed-off-by: Guoyu Ou <benogy@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
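The unlock-on-error pattern described in the entry above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct dir`, `dir_lock()`, `dir_unlock()` and `destroy_entry()` are hypothetical stand-ins, not the kernel's VFS API, and the mock lock returns -EBUSY where a real kernel lock would block.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory with one lock and at most one entry. */
struct dir {
    bool locked;
    const char *entry; /* name of the single entry, or NULL */
};

/* Mock lock: fails instead of blocking if the lock is already held. */
static bool dir_lock(struct dir *d)
{
    if (d->locked)
        return false;
    d->locked = true;
    return true;
}

static void dir_unlock(struct dir *d)
{
    d->locked = false;
}

/* Every exit path after a successful dir_lock() goes through dir_unlock(). */
static int destroy_entry(struct dir *parent, const char *name)
{
    int ret = 0;

    if (!dir_lock(parent))
        return -EBUSY; /* a real kernel lock would hang here */

    if (!parent->entry || strcmp(parent->entry, name) != 0) {
        ret = -ENOENT; /* lookup failed: still fall through to unlock */
        goto out;
    }

    parent->entry = NULL; /* "delete" the entry */
out:
    dir_unlock(parent);
    return ret;
}
```

With the unlock on the failure path, a second attempt on a missing name fails with -ENOENT again instead of finding the parent still locked.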
#
2acc59dd |
|
14-Jan-2024 |
Su Yue <glass.su@suse.com> |
bcachefs: grab s_umount only if snapshotting
When I was testing mongodb over bcachefs with compression, there is a lockdep warning when snapshotting mongodb data volume.
$ cat test.sh
prog=bcachefs
$prog subvolume create /mnt/data
$prog subvolume create /mnt/data/snapshots
while true;do
    $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
    sleep 1s
done
$ cat /etc/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /mnt/data/mongod.log
storage:
  dbPath: /mnt/data/
lockdep reports:
[ 3437.452330] ======================================================
[ 3437.452750] WARNING: possible circular locking dependency detected
[ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G E
[ 3437.453562] ------------------------------------------------------
[ 3437.453981] bcachefs/35533 is trying to acquire lock:
[ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
[ 3437.454875] but task is already holding lock:
[ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.456009] which lock already depends on the new lock.
[ 3437.456553] the existing dependency chain (in reverse order) is:
[ 3437.457054] -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
[ 3437.457507] down_read+0x3e/0x170
[ 3437.457772] bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.458206] __x64_sys_ioctl+0x93/0xd0
[ 3437.458498] do_syscall_64+0x42/0xf0
[ 3437.458779] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.459155] -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
[ 3437.459615] down_read+0x3e/0x170
[ 3437.459878] bch2_truncate+0x82/0x110 [bcachefs]
[ 3437.460276] bchfs_truncate+0x254/0x3c0 [bcachefs]
[ 3437.460686] notify_change+0x1f1/0x4a0
[ 3437.461283] do_truncate+0x7f/0xd0
[ 3437.461555] path_openat+0xa57/0xce0
[ 3437.461836] do_filp_open+0xb4/0x160
[ 3437.462116] do_sys_openat2+0x91/0xc0
[ 3437.462402] __x64_sys_openat+0x53/0xa0
[ 3437.462701] do_syscall_64+0x42/0xf0
[ 3437.462982] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.463359] -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
[ 3437.463843] down_write+0x3b/0xc0
[ 3437.464223] bch2_write_iter+0x5b/0xcc0 [bcachefs]
[ 3437.464493] vfs_write+0x21b/0x4c0
[ 3437.464653] ksys_write+0x69/0xf0
[ 3437.464839] do_syscall_64+0x42/0xf0
[ 3437.465009] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.465231] -> #0 (sb_writers#10){.+.+}-{0:0}:
[ 3437.465471] __lock_acquire+0x1455/0x21b0
[ 3437.465656] lock_acquire+0xc6/0x2b0
[ 3437.465822] mnt_want_write+0x46/0x1a0
[ 3437.465996] filename_create+0x62/0x190
[ 3437.466175] user_path_create+0x2d/0x50
[ 3437.466352] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.466617] __x64_sys_ioctl+0x93/0xd0
[ 3437.466791] do_syscall_64+0x42/0xf0
[ 3437.466957] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.467180] other info that might help us debug this:
[ 3437.469670] 2 locks held by bcachefs/35533:
other info that might help us debug this:
[ 3437.467507] Chain exists of:
  sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48
[ 3437.467979] Possible unsafe locking scenario:
[ 3437.468223]        CPU0                    CPU1
[ 3437.468405]        ----                    ----
[ 3437.468585]   rlock(&type->s_umount_key#48);
[ 3437.468758]                                lock(&c->snapshot_create_lock);
[ 3437.469030]                                lock(&type->s_umount_key#48);
[ 3437.469291]   rlock(sb_writers#10);
[ 3437.469434] *** DEADLOCK ***
[ 3437.469670] 2 locks held by bcachefs/35533:
[ 3437.469838] #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
[ 3437.470294] #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.470744] stack backtrace:
[ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G E 6.7.0-rc7-custom+ #85
[ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 3437.471694] Call Trace:
[ 3437.471795] <TASK>
[ 3437.471884] dump_stack_lvl+0x57/0x90
[ 3437.472035] check_noncircular+0x132/0x150
[ 3437.472202] __lock_acquire+0x1455/0x21b0
[ 3437.472369] lock_acquire+0xc6/0x2b0
[ 3437.472518] ? filename_create+0x62/0x190
[ 3437.472683] ? lock_is_held_type+0x97/0x110
[ 3437.472856] mnt_want_write+0x46/0x1a0
[ 3437.473025] ? filename_create+0x62/0x190
[ 3437.473204] filename_create+0x62/0x190
[ 3437.473380] user_path_create+0x2d/0x50
[ 3437.473555] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.473819] ? lock_acquire+0xc6/0x2b0
[ 3437.474002] ? __fget_files+0x2a/0x190
[ 3437.474195] ? __fget_files+0xbc/0x190
[ 3437.474380] ? lock_release+0xc5/0x270
[ 3437.474567] ? __x64_sys_ioctl+0x93/0xd0
[ 3437.474764] ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
[ 3437.475090] __x64_sys_ioctl+0x93/0xd0
[ 3437.475277] do_syscall_64+0x42/0xf0
[ 3437.475454] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.475691] RIP: 0033:0x7f2743c313af
======================================================
In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally and unlock it at the end of the function. There is a comment "why do we need this lock?"
about the lock coming from commit 42d237320e98 ("bcachefs: Snapshot creation, deletion"). The reason is that __bch2_ioctl_subvolume_create() calls sync_inodes_sb(), which requires s_umount to be held while it writes back all dirty inodes before the snapshot work is done. Fix it by read-locking s_umount only when snapshotting and unlocking it after sync_inodes_sb(). Signed-off-by: Su Yue <glass.su@suse.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
bbe6a7c8 |
|
14-Nov-2023 |
Al Viro <viro@zeniv.linux.org.uk> |
bch2_ioctl_subvolume_destroy(): fix locking make it use user_path_locked_at() to get the normal directory protection for modifications, as well as stable ->d_parent and ->d_name in victim Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
d9e14a4e |
|
30-Nov-2023 |
Brian Foster <bfoster@redhat.com> |
bcachefs: remove sb lock and flags update on explicit shutdown
bcachefs grabs s_umount and sets SB_RDONLY when the fs is shut down via the ioctl() interface. This has a couple of issues related to interactions between shutdown and freeze:
1. The flags == FSOP_GOING_FLAGS_DEFAULT case is a deadlock vector because freeze_bdev() calls into freeze_super(), which also acquires s_umount.
2. If an explicit shutdown occurs while the sb is frozen, SB_RDONLY alters the thaw path as if the sb was read-only at freeze time. This effectively leaks the frozen state and leaves the sb frozen indefinitely.
The usage of SB_RDONLY here goes back to the initial bcachefs commit and AFAICT is simply historical behavior. This behavior is unique to bcachefs relative to the handful of other filesystems that support the shutdown ioctl(). Typically, SB_RDONLY is reserved for the proper remount path, which itself is restricted from modifying frozen superblocks in reconfigure_super(). Drop the unnecessary sb lock and flags update in bch2_ioc_goingdown() to address both of these issues. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
982c3b30 |
|
24-Oct-2023 |
Christian Brauner <brauner@kernel.org> |
bdev: rename freeze and thaw helpers We have bdev_mark_dead() etc and we're going to move block device freezing to holder ops in the next patch. Make the naming consistent: * freeze_bdev() -> bdev_freeze() * thaw_bdev() -> bdev_thaw() Also document the return code. Link: https://lore.kernel.org/r/20231024-vfs-super-freeze-v2-2-599c19f4faac@kernel.org Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
0d72ab35 |
|
29-Dec-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: make RO snapshots actually RO Add checks to all the VFS paths for "are we in a RO snapshot?". Note - we don't check this when setting inode options via our xattr interface, since those generally only affect data placement, not contents of data. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>
|
#
7aebaabf |
|
04-Dec-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix creating snapshot with implicit source When creating a snapshot without specifying the source subvolume, we use the subvolume containing the new snapshot. Previously, this worked if the directory containing the new snapshot was the subvolume root - but we were using the incorrect helper, and got a subvolume ID of 0 when the parent directory wasn't the root of the subvolume, causing an emergency read-only. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
103ffe9a |
|
02-Nov-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: x-macro-ify inode flags enum This lets us use bch2_prt_bitflags to print them out. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
37fad949 |
|
28-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: snapshot_create_lock Add a new lock for snapshot creation - this addresses a few races with logged operations and snapshot deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1f12900a |
|
15-Sep-2023 |
Dan Carpenter <dan.carpenter@linaro.org> |
bcachefs: fs-ioctl: Fix copy_to_user() error code The copy_to_user() function returns the number of bytes that it wasn't able to copy but we want to return -EFAULT to the user. Fixes: e0750d947352 ("bcachefs: Initial commit") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
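The return-value pitfall described in the entry above can be shown with a small userspace mock. This is a sketch, not the bcachefs code: `mock_copy_to_user()`, `getflags_buggy()` and `getflags_fixed()` are hypothetical names, and the `writable` parameter is an assumption used to simulate a partially inaccessible destination.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_to_user(): like the kernel helper, it
 * returns the number of bytes it could NOT copy (0 means full success).
 * `writable` models how many destination bytes are accessible.
 */
static unsigned long mock_copy_to_user(void *dst, const void *src,
                                       unsigned long n, unsigned long writable)
{
    unsigned long copied = n < writable ? n : writable;

    memcpy(dst, src, copied);
    return n - copied;
}

/* Buggy pattern: the leftover byte count leaks out as the return value. */
static long getflags_buggy(unsigned *dst, unsigned long writable)
{
    unsigned flags = 0x10;

    return (long)mock_copy_to_user(dst, &flags, sizeof(flags), writable);
}

/* Fixed pattern: any short copy is reported as -EFAULT. */
static long getflags_fixed(unsigned *dst, unsigned long writable)
{
    unsigned flags = 0x10;

    if (mock_copy_to_user(dst, &flags, sizeof(flags), writable))
        return -EFAULT;
    return 0;
}
```

In the buggy variant a failed copy returns a small positive number, which a caller checking for negative error codes would treat as success.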
#
791236b8 |
|
12-Aug-2023 |
Joshua Ashton <joshua@froggi.es> |
bcachefs: Add btree_trans* to inode_set_fn This will be used when we need to re-hash a directory tree when setting flags. It is not possible to have concurrent btree_trans on a thread. Signed-off-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e47a390a |
|
27-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Convert -ENOENT to private error codes As with previous conversions, replace -ENOENT uses with more informative private error codes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e1e7ecaf |
|
15-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improve error handling in bch2_ioctl_subvolume_destroy() Pure style fixes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
65ff2d3a |
|
12-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Support FS_XFLAG_PROJINHERIT We already have support for the flag's semantics: inode options are inherited by children if they were explicitly set on the parent. This patch just maps the FS_XFLAG_PROJINHERIT flag to the "this option was explicitly set" bit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5c1ef830 |
|
18-Sep-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Errcodes can now subtype standard error codes The next patch is going to be adding private error codes for all the places we return -ENOSPC. Additionally, this patch updates return paths at all module boundaries to call bch2_err_class(), to return the standard error code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2027875b |
|
10-Oct-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add BCH_SUBVOLUME_UNLINKED Snapshot deletion needs to become a multi step process, where we unlink, then tear down the page cache, then delete the subvolume - the deleting flag is equivalent to an inode with i_nlink = 0. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
521b8067 |
|
20-Oct-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Delete dentry when deleting snapshots This fixes a bug where subsequently doing creates with the same name fails. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
42d23732 |
|
16-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Snapshot creation, deletion This is the final patch in the patch series implementing snapshots. This patch implements two new ioctls that work like creation and deletion of directories, but fancier. - BCH_IOCTL_SUBVOLUME_CREATE, for creating new subvolumes and snapshots - BCH_IOCTL_SUBVOLUME_DESTROY, for deleting subvolumes and snapshots Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
6fed42bb |
|
15-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Plumb through subvolume id To implement snapshots, we need every filesystem btree operation (every btree operation within a subvolume) to start by looking up the subvolume and getting the current snapshot ID, with bch2_subvolume_get_snapshot() - then, that snapshot ID is used for doing btree lookups in BTREE_ITER_FILTER_SNAPSHOTS mode. This patch adds those bch2_subvolume_get_snapshot() calls, and also switches to passing around a subvol_inum instead of just an inode number. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
284ae18c |
|
15-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add subvolume to ei_inode_info Filesystem operations generally operate within a subvolume: at the start of every btree transaction we'll be looking up (and locking) the subvolume to get the current snapshot ID, which we then use for our other btree lookups in BTREE_ITER_FILTER_SNAPSHOTS mode. But inodes don't record what subvolume they're in - they can't, because if they did we'd have to update every single inode within a subvolume when taking a snapshot in order to keep that field up to date. So it needs to be tracked in memory, based on how we got to that inode. Hence this patch adds a subvolume field to ei_inode_info, and switches to iget5() so we can index by it in the inode hash table. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
4495cbed |
|
22-May-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve FS_IOC_GOINGDOWN ioctl We weren't interpreting the flags argument at all. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
07bca3bd |
|
02-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill ei_str_hash Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
7af0cec3 |
|
24-Aug-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Some project id fixes Inode options that are accessible via the xattr interface are stored with a +1 bias, so that a value of 0 means unset. We weren't handling this consistently. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
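The "+1 bias" storage convention described in the entry above can be sketched as follows. The helper names (`opt_encode()`, `opt_is_set()`, `opt_decode()`) and the 64-bit width are assumptions made for illustration; they are not the bcachefs functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Store a user-visible value with a +1 bias so that stored 0 means "unset". */
static uint64_t opt_encode(uint64_t value)
{
    assert(value != UINT64_MAX); /* value + 1 would wrap around to "unset" */
    return value + 1;
}

/* A stored 0 means the option was never set on this inode. */
static bool opt_is_set(uint64_t stored)
{
    return stored != 0;
}

/* Undo the bias, falling back to a default when the option is unset. */
static uint64_t opt_decode(uint64_t stored, uint64_t dflt)
{
    return stored ? stored - 1 : dflt;
}
```

The kind of inconsistency the entry mentions arises when one path writes or compares the raw value and another applies the bias, so that a real value of 0 and "unset" are confused.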
#
96385742 |
|
02-Oct-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Factor out fs-common.c This refactoring makes the code easier to understand by separating the bcachefs btree transactional code from the linux VFS code - but more importantly, it's also to share code with the fuse port. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
168f4c5f |
|
24-Jun-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve bch2_lock_inodes() Can now be used for the two different types of locks we have so far Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
619f5bee |
|
17-Apr-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: some improvements to startup messages and options Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e19e57f8 |
|
19-Dec-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: fix new reinherit_attrs ioctl Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2fab25cd |
|
19-Dec-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: more project quota fixes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
8095708f |
|
17-Dec-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: bch2_ioc_reinherit_attrs() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
721d4ad8 |
|
13-Dec-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add flags to indicate if inode opts were inherited or explicitly set Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0f5254aa |
|
17-Dec-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: bch2_fs_quota_transfer improve quota transfer locking & make ei_qid usage more consistent Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2ea90048 |
|
17-Jul-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix mtime/ctime updates Also make inode flags consistent with how the rest of the inode is updated Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1c6fdbd8 |
|
17-Mar-2017 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Initial commit Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|