Cross Reference: /linux-master/fs/gfs2/inode.c

History log of /linux-master/fs/gfs2/inode.c
Revision	Date	Author	Comments
# e9f1e6bb	02-Feb-2024	Andreas Gruenbacher <agruenba@redhat.com>	Revert "gfs2: Use GL_NOBLOCK flag for non-blocking lookups" Commit "gfs2: Use GL_NOBLOCK flag for non-blocking lookups" has several issues, some of which are non-trivial to fix, so revert it for now: https://lore.kernel.org/gfs2/20240202050230.GA875515@ZenIV/T/ This reverts commit dd00aaeb343255a8a30de671bd27bde79a47c8e5. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# dd00aaeb	10-Nov-2023	Abhi Das <adas@redhat.com>	gfs2: Use GL_NOBLOCK flag for non-blocking lookups Add the GL_NOBLOCK flag to the locking requests in gfs2_permission() and gfs2_drevalidate() when called with the MAY_NOT_BLOCK flag and LOOKUP_RCU flag, respectively. This will cause the locking requests to be handled without sleeping if possible. We bail out with -ECHILD if we can't grant the glock immediately. Make sure not to dget() + dput() the parent dentry in gfs2_drevalidate() in LOOKUP_RCU mode; dput() is a sleeping operation. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 074d7306	30-Oct-2023	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Silence "suspicious RCU usage in gfs2_permission" warning Commit 0abd1557e21c added rcu_dereference() for dereferencing ip->i_gl in gfs2_permission. This now causes lockdep to complain when gfs2_permission is called in non-RCU context: WARNING: suspicious RCU usage in gfs2_permission Switch to rcu_dereference_check() and check for the MAY_NOT_BLOCK flag to shut up lockdep when we know that dereferencing ip->i_gl is safe. Fixes: 0abd1557e21c ("gfs2: fix an oops in gfs2_permission") Reported-by: syzbot+3e5130844b0c0e2b4948@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 062fb903	26-Jul-2023	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Rename gfs2_lookup_{ simple => meta } Function gfs2_lookup_simple() is used for looking up inodes in the metadata directory tree, so rename it to gfs2_lookup_meta() to closer match its purpose. Clean the function up a little on the way. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 4c7b3f7f	20-Oct-2023	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Get rid of gfs2_alloc_blocks generation parameter Get rid of the generation parameter of gfs2_alloc_blocks(): we only ever set the generation of the current inode while creating it, so do so directly. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 2d8d7990	21-Oct-2023	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: setattr_chown: Add missing initialization Add a missing initialization of variable ap in setattr_chown(). Without, chown() may be able to bypass quotas. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 0abd1557	01-Oct-2023	Al Viro <viro@zeniv.linux.org.uk>	gfs2: fix an oops in gfs2_permission In RCU mode, we might race with gfs2_evict_inode(), which zeroes ->i_gl. Freeing of the object it points to is RCU-delayed, so if we manage to fetch the pointer before it's been replaced with NULL, we are fine. Check if we'd fetched NULL and treat that as "bail out and tell the caller to get out of RCU mode". Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 21d9067e	20-Sep-2023	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Get rid of the gfs2_glock_is_held_* helpers Those helpers don't add any clarity and are easy to use wrong. Spell them out to make more obvious what's happening. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 580f721b	04-Oct-2023	Jeff Layton <jlayton@kernel.org>	gfs2: convert to new timestamp accessors Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-38-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
# 111c7d27	26-Jul-2023	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Use mapping->gfp_mask for metadata inodes Set mapping->gfp mask to GFP_NOFS for all metadata inodes so that allocating pages in the address space of those inodes won't call back into the filesystem. This allows to switch back from find_or_create_page() to grab_cache_page() in two places. Partially reverts commit 220cca2a4f58 ("GFS2: Change truncate page allocation to be GFP_NOFS"). Thanks to Dan Carpenter <dan.carpenter@linaro.org> for pointing out a Smatch static checker warning. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 913e9928	07-Aug-2023	Jeff Layton <jlayton@kernel.org>	fs: drop the timespec64 argument from update_time Now that all of the update_time operations are prepared for it, we can drop the timespec64 argument from the update_time operation. Do that and remove it from some associated functions like inode_update_time and inode_needs_update_time. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230807-mgctime-v7-8-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
# 541d4c79	07-Aug-2023	Jeff Layton <jlayton@kernel.org>	fs: drop the timespec64 arg from generic_update_time In future patches we're going to change how the ctime is updated to keep track of when it has been queried. The way that the update_time operation works (and a lot of its callers) make this difficult, since they grab a timestamp early and then pass it down to eventually be copied into the inode. All of the existing update_time callers pass in the result of current_time() in some fashion. Drop the "time" parameter from generic_update_time, and rework it to fetch its own timestamp. This change means that an update_time could fetch a different timestamp than was seen in inode_needs_update_time. update_time is only ever called with one of two flag combinations: Either S_ATIME is set, or S_MTIME\|S_CTIME\|S_VERSION are set. With this change we now treat the flags argument as an indicator that some value needed to be updated when last checked, rather than an indication to update specific timestamps. Rework the logic for updating the timestamps and put it in a new inode_update_timestamps helper that other update_time routines can use. S_ATIME is as treated as we always have, but if any of the other three are set, then we attempt to update all three. Also, some callers of generic_update_time need to know what timestamps were actually updated. Change it to return an S_* flag mask to indicate that and rework the callers to expect it. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230807-mgctime-v7-3-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
# 0d72b928	07-Aug-2023	Jeff Layton <jlayton@kernel.org>	fs: pass the request_mask to generic_fillattr generic_fillattr just fills in the entire stat struct indiscriminately today, copying data from the inode. There is at least one attribute (STATX_CHANGE_COOKIE) that can have side effects when it is reported, and we're looking at adding more with the addition of multigrain timestamps. Add a request_mask argument to generic_fillattr and have most callers just pass in the value that is passed to getattr. Have other callers (e.g. ksmbd) just pass in STATX_BASIC_STATS. Also move the setting of STATX_CHANGE_COOKIE into generic_fillattr. Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Paulo Alcantara (SUSE)" <pc@manguebit.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-2-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
# 8a8b8d91	05-Jul-2023	Jeff Layton <jlayton@kernel.org>	gfs2: convert to ctime accessor functions In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230705190309.579783-45-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
# 14a58517	14-Mar-2023	Andrew Price <anprice@redhat.com>	gfs2: Remove ghs[] from gfs2_unlink Replace the 3-item array with three variables for readability. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 2d084780	14-Mar-2023	Andrew Price <anprice@redhat.com>	gfs2: Remove ghs[] from gfs2_link Replace the 2-item array with two variables for readability. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 8dc14966	14-Mar-2023	Andrew Price <anprice@redhat.com>	gfs2: Remove duplicate i_nlink check from gfs2_link() The duplication is: struct gfs2_inode *ip = GFS2_I(inode); [...] error = -ENOENT; if (inode->i_nlink == 0) goto out_gunlock; [...] error = -EINVAL; if (!ip->i_inode.i_nlink) goto out_gunlock; The second check is removed. ENOENT is the correct error code for attempts to link a deleted inode (ref: link(2)). If we support O_TMPFILE in future the check will need to be updated with an exception for inodes flagged I_LINKABLE so sorting out this duplication now will make it a slightly cleaner change. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 9ffa1888	23-Jan-2023	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: gl_object races fix Function glock_clear_object() checks if the specified glock is still pointing at the right object and clears the gl_object pointer. To handle the case of incompletely constructed inodes, glock_clear_object() also allows gl_object to be NULL. However, in the teardown case, when iget_failed() is called and the inode is removed from the inode hash, by the time we get to the glock_clear_object() calls in gfs2_put_super() and its helpers, we don't have exclusion against concurrent gfs2_inode_lookup() and gfs2_create_inode() calls, and the inode and iopen glocks may already be pointing at another inode, so the checks in glock_clear_object() are incorrect. To better handle this case, always completely disassociate an inode from its glocks before tearing it down. In addition, get rid of a duplicate glock_clear_object() call in gfs2_evict_inode(). That way, glock_clear_object() will only ever be called when the glock points at the current inode, and the NULL check in glock_clear_object() can be removed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 4609e1f1	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->permission() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# 13e83a49	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->set_acl() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# e18275ae	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->rename() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# 5ebb29be	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->mknod() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# c54bd91e	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->mkdir() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# 7a77db95	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->symlink() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# 6c960e68	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->create() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# b74d24f7	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->getattr() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# c1632a0f	12-Jan-2023	Christian Brauner <brauner@kernel.org>	fs: port ->setattr() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# 2ec750a0	03-Dec-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Add gfs2_inode_lookup comment Add comment on when and why gfs2_cancel_delete_work() needs to be skipped in gfs2_inode_lookup(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 4ec3c19d	05-Nov-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Handle -EBUSY result of insert_inode_locked4 When creating a new inode, there is a small chance that an inode lookup for a previous version of the same inode is still in progress. In that case, that previous lookup will eventually fail, but we may still need to retry here. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 38552ff6	02-Nov-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Fix and clean up create / evict interaction When gfs2_create_inode() fails after creating a new inode, it uses the GIF_FREE_VFS_INODE and GIF_ALLOC_FAILED inode flags to communicate to gfs2_evict_inode() which parts of the inode need to be deallocated and destroyed. In some error cases, the inode ends up being allocated on disk and then accidentally left behind. In others, the inode is partially constructed and then not properly destroyed. Clean this up by completely handling the inode deallocation and destruction in gfs2_evict_inode(). This means that gfs2_evict_inode() may now be faced with partially constructed inodes, so add the necessary checks to cope with that. In particular, make sure that for incompletely constructed inodes, we're not accessing the buffers backing the on-disk blocks; the contents may be undefined. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 3d0258bc	02-Nov-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Clean up initialization of "ip" in gfs2_create_inode Initialize variable "ip" earlier so that it can be used interchangeably with "inode" everywhere. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 761fdbbc	04-Nov-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Get rid of ghs[] in gfs2_create_inode In gfs2_create_inode, get rid of the ghs array in favor of two separate variables. This makes the code much less irritating. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 35c23fba	02-Nov-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Add extra error check in alloc_dinode We have reserved the number of blocks we want to allocate, so the actual allocation isn't expected to fail. Nevertheless, make the code behave correctly even when things go wrong. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# cac2f8b8	22-Sep-2022	Christian Brauner <brauner@kernel.org>	fs: rename current get acl method The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. The current inode operation for getting posix acls takes an inode argument but various filesystems (e.g., 9p, cifs, overlayfs) need access to the dentry. In contrast to the ->set_acl() inode operation we cannot simply extend ->get_acl() to take a dentry argument. The ->get_acl() inode operation is called from: acl_permission_check() -> check_acl() -> get_acl() which is part of generic_permission() which in turn is part of inode_permission(). Both generic_permission() and inode_permission() are called in the ->permission() handler of various filesystems (e.g., overlayfs). So simply passing a dentry argument to ->get_acl() would amount to also having to pass a dentry argument to ->permission(). We should avoid this unnecessary change. So instead of extending the existing inode operation rename it from ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that passes a dentry argument and which filesystems that need access to the dentry can implement instead of ->get_inode_acl(). Filesystems like cifs which allow setting and getting posix acls but not using them for permission checking during lookup can simply not implement ->get_inode_acl(). This is intended to be a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Suggested-by/Inspired-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# 138060ba	23-Sep-2022	Christian Brauner <brauner@kernel.org>	fs: pass dentry to set acl method The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. Since some filesystem rely on the dentry being available to them when setting posix acls (e.g., 9p and cifs) they cannot rely on set acl inode operation. But since ->set_acl() is required in order to use the generic posix acl xattr handlers filesystems that do not implement this inode operation cannot use the handler and need to implement their own dedicated posix acl handlers. Update the ->set_acl() inode method to take a dentry argument. This allows all filesystems to rely on ->set_acl(). As far as I can tell all codepaths can be switched to rely on the dentry instead of just the inode. Note that the original motivation for passing the dentry separate from the inode instead of just the dentry in the xattr handlers was because of security modules that call security_d_instantiate(). This hook is called during d_instantiate_new(), d_add(), __d_instantiate_anon(), and d_splice_alias() to initialize the inode's security context and possibly to set security.* xattrs. Since this only affects security.* xattrs this is completely irrelevant for posix acls. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
# c412a97c	22-Aug-2022	Bob Peterson <rpeterso@redhat.com>	gfs2: Use TRY lock in gfs2_inode_lookup for UNLINKED inodes Before this patch, delete_work_func() would check for the GLF_DEMOTE flag on the iopen glock and if set, it would perform special processing. However, there was a race whereby the GLF_DEMOTE flag could be set by another process after the check. Then when it called gfs2_lookup_by_inum() which calls gfs2_inode_lookup(), it tried to lock the iopen glock in SH mode, but the GLF_DEMOTE flag prevented the request from being granted. But the iopen glock could never be demoted because that happens when the inode is evicted, and the evict was never completed because of the failed lookup. To fix that, change function gfs2_inode_lookup() so that when GFS2_BLKST_UNLINKED inodes are searched, it uses the LM_FLAG_TRY flag for the iopen glock. If the locking request fails, fail gfs2_inode_lookup() with -EAGAIN so that delete_work_func() can retry the operation later. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# ebdc416c	05-Apr-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Mark the remaining process-independent glock holders as GL_NOPID Add the GL_NOPID flag for the remaining glock holders which are not associated with the current process. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 29464ee3	23-Jan-2022	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: Switch lock order of inode and iopen glock This patch tries to fix the continual ABBA deadlocks we keep having between the iopen and inode glocks. This switches the lock order in gfs2_inode_lookup and gfs2_create_inode so the iopen glock is always locked first. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
# 7336905a	10-Dec-2021	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: gfs2_setattr_size error path fix When gfs2_setattr_size() fails, it calls gfs2_rs_delete(ip, NULL) to get rid of any reservations the inode may have. Instead, it should pass in the inode's write count as the second parameter to allow gfs2_rs_delete() to figure out if the inode has any writers left. In a next step, there are two instances of gfs2_rs_delete(ip, NULL) left where we know that there can be no other users of the inode. Replace those with gfs2_rs_deltree(&ip->i_res) to avoid the unnecessary write count check. With that, gfs2_rs_delete() is only called with the inode's actual write count, so get rid of the second parameter. Fixes: a097dc7e24cb ("GFS2: Make rgrp reservations part of the gfs2_inode structure") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
# 3d36e57f	30-Nov-2021	Andreas Gruenbacher <agruenba@redhat.com>	gfs2: gfs2_create_inode rework When gfs2_lookup_by_inum() calls gfs2_inode_lookup() for an uncached inode, gfs2_inode_lookup() will place a new tentative inode into the inode cache before verifying that there is a valid inode at the given address. This can race with gfs2_create_inode() which doesn't check for duplicates inodes. gfs2_create_inode() will try to assign the new inode to the corresponding inode glock, and glock_set_object() will complain that the glock is still in use by gfs2_inode_lookup's tentative inode. We noticed this bug after adding commit 486408d690e1 ("gfs2: Cancel remote delete work asynchronously") which allowed delete_work_func() to race with gfs2_create_inode(), but the same race exists for open-by-handle. Fix that by switching from insert_inode_hash() to insert_inode_locked4(), which does check for duplicate inodes. We know we've just managed to to allocate the new inode, so an inode tentatively created by gfs2_inode_lookup() will eventually go away and insert_inode_locked4() will always succeed. In addition, don't flush the inode glock work anymore (this can now only make things worse) and clean up glock_{set,clear}_object for the inode glock somewhat. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>