#
4d927b03 |
|
20-Dec-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Rename gfs2_withdrawn to gfs2_withdrawing_or_withdrawn This function checks whether the filesystem has been been marked to be withdrawn eventually or has been withdrawn already. Rename this function to avoid confusing code like checking for gfs2_withdrawing() when gfs2_withdrawn() has already returned true. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
015af1af |
|
20-Dec-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Mark withdraws as unlikely Mark the gfs2_withdrawn(), gfs2_withdrawing(), and gfs2_withdraw_in_prog() inline functions as likely to return %false. This allows to get rid of likely() and unlikely() annotations at the call sites of those functions. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
f9f229c1 |
|
13-Nov-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Add GL_NOBLOCK flag Add a GL_NOBLOCK flag for trying to take a glock without sleeping. This will be used for implementing non-blocking lookup (MAY_NOT_BLOCK in gfs2_permission, LOOKUP_RCU in gfs2_drevalidate). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
600f111e |
|
17-Nov-2023 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
fs: Rename mapping private members It is hard to find where mapping->private_lock, mapping->private_list and mapping->private_data are used, due to private_XXX being a relatively common name for variables and structure members in the kernel. To fit with other members of struct address_space, rename them all to have an i_ prefix. Tested with an allmodconfig build. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20231117215823.2821906-1-willy@infradead.org Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
bb25b975 |
|
01-Nov-2023 |
Su Hui <suhui@nfschina.com> |
gfs2: remove dead code in add_to_queue clang static analyzer complains that value stored to 'gh' is never read. The code of this line is useless after commit 0b93bac2271e ("gfs2: Remove LM_FLAG_PRIORITY flag"). Remove this code to save space. Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
a304c23c |
|
11-Sep-2023 |
Qi Zheng <zhengqi.arch@bytedance.com> |
gfs2: dynamically allocate the gfs2-glock shrinker Use new APIs to dynamically allocate the gfs2-glock shrinker. Link: https://lkml.kernel.org/r/20230911094444.68966-9-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Chuck Lever <cel@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Huang Rui <ray.huang@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nadav Amit <namit@vmware.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sean Paul <sean@poorly.run> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Tom Talpey <tom@talpey.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
0ede61d8 |
|
29-Sep-2023 |
Christian Brauner <brauner@kernel.org> |
file: convert to SLAB_TYPESAFE_BY_RCU In recent discussions around some performance improvements in the file handling area we discussed switching the file cache to rely on SLAB_TYPESAFE_BY_RCU which allows us to get rid of call_rcu() based freeing for files completely. This is a pretty sensitive change overall but it might actually be worth doing. The main downside is the subtlety. The other one is that we should really wait for Jann's patch to land that enables KASAN to handle SLAB_TYPESAFE_BY_RCU UAFs. Currently it doesn't but a patch for this exists. With SLAB_TYPESAFE_BY_RCU objects may be freed and reused multiple times which requires a few changes. So it isn't sufficient anymore to just acquire a reference to the file in question under rcu using atomic_long_inc_not_zero() since the file might have already been recycled and someone else might have bumped the reference. In other words, callers might see reference count bumps from newer users. For this reason it is necessary to verify that the pointer is the same before and after the reference count increment. This pattern can be seen in get_file_rcu() and __files_get_rcu(). In addition, it isn't possible to access or check fields in struct file without first aqcuiring a reference on it. Not doing that was always very dodgy and it was only usable for non-pointer data in struct file. With SLAB_TYPESAFE_BY_RCU it is necessary that callers first acquire a reference under rcu or they must hold the files_lock of the fdtable. Failing to do either one of this is a bug. Thanks to Jann for pointing out that we need to ensure memory ordering between reallocations and pointer check by ensuring that all subsequent loads have a dependency on the second load in get_file_rcu() and providing a fixup that was folded into this patch. Cc: Jann Horn <jannh@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
62862485 |
|
12-Sep-2023 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: fix glock shrinker ref issues Before this patch, function gfs2_scan_glock_lru would only try to free glocks that had a reference count of 0. But if the reference count ever got to 0, the glock should have already been freed. Shrinker function gfs2_dispose_glock_lru checks whether glocks on the LRU are demote_ok, and if so, tries to demote them. But that's only possible if the reference count is at least 1. This patch changes gfs2_scan_glock_lru so it will try to demote and/or dispose of glocks that have a reference count of 1 and which are either demotable, or are already unlocked. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
e7beb8b6 |
|
23-Aug-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Rename SDF_DEACTIVATING to SDF_KILL Rename the SDF_DEACTIVATING flag to SDF_KILL to make it more obvious that this relates to the kill_sb filesystem operation. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
3c69c437 |
|
23-Aug-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Rename sd_{ glock => kill }_wait Rename sd_glock_wait to sd_kill_wait: we'll use it for other things related to "killing" a filesystem on unmount soon (kill_sb). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
66fa9912 |
|
25-Jul-2023 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: conversion deadlock do_promote bypass Consider the following case: 1. A glock is held in shared mode. 2. A process requests the glock in exclusive mode (rename). 3. Before the lock is granted, more processes (read / ls) request the glock in shared mode again. 4. gfs2 sends a request to dlm for the lock in exclusive mode because that holder is at the head of the queue. 5. Somehow the dlm request gets canceled, so dlm sends us back a response with state == LM_ST_SHARED and LM_OUT_CANCELED. So at that point, the glock is still held in shared mode. 6. finish_xmote gets called to process the response from dlm. It detects that the glock is not in the requested mode and no demote is in progress, so it moves the canceled holder to the tail of the queue and finds the new holder at the head of the queue. That holder is requesting the glock in shared mode. 7. finish_xmote calls do_xmote to transition the glock into shared mode, but the glock is already in shared mode and so do_xmote complains about that with: GLOCK_BUG_ON(gl, gl->gl_state == gl->gl_target); Instead, in finish_xmote, after moving the canceled holder to the tail of the queue, check if any new holders can be granted. Only call do_xmote to repeat the dlm request if the holder at the head of the queue is requesting the glock in a mode that is incompatible with the mode the glock is currently held in. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
0b93bac2 |
|
08-Aug-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Remove LM_FLAG_PRIORITY flag The last user of this flag was removed in commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
de3e7f97 |
|
08-Aug-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: do_promote cleanup Change function do_promote to return true on success, and false otherwise. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
af1abe11 |
|
16-Nov-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Rename remaining "transaction" glock references The transaction glock was repurposed to serve as the new freeze glock years ago. Don't refer to it as the transaction glock anymore. Also, to be more precise, call it the "freeze glock" instead of the "freeze lock". Ditto for the journal glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
6c0246a9 |
|
05-Dec-2022 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Cease delete work during unmount Add a check to delete_work_func() so that it quits when it finds that the filesystem is deactivating. This speeds up the delete workqueue draining in gfs2_kill_sb(). In addition, make sure that iopen_go_callback() won't queue any new delete work while the filesystem is deactivating. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
f0e56edc |
|
20-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Split the two kinds of glock "delete" work Function delete_work_func() is used for two purposes: * to immediately try to evict the glock's inode, and * to verify after a little while that the inode has been deleted as expected, and didn't just get skipped. These two operations are not separated very well, so introduce two new glock flags to improved that. Split gfs2_queue_delete_work() into gfs2_queue_try_to_evict and gfs2_queue_verify_evict(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
0247f4e9 |
|
06-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Move delete workqueue into super block Move the global delete workqueue into struct gfs2_sbd so that we can flush / drain it without interfering with other filesystems. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
3056dc46 |
|
09-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Get rid of GLF_PENDING_DELETE flag Get rid of the GLF_PENDING_DELETE glock flag introduced by commit a0e3cc65fa29 ("gfs2: Turn gl_delete into a delayed work"). The only use of that flag is to prevent the iopen glock from being demoted (i.e., unlocked) while delete work is pending. It turns out that demoting the iopen glock while delete work is pending is perfectly fine; we only need to make sure that the glock isn't being freed while still in use. This is ensured by the previous patch because delete_work_func() owns a reference while the work is queued or running. With these changes, gfs2_queue_delete_work() no longer takes the glock spin lock, so we can use it in iopen_go_callback() instead of open-coding it there. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
228804a3 |
|
09-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Make glock lru list scanning safer In __gfs2_glock_put(), remove the glock from the lru list *after* dropping the glock lock. This prevents deadlocks against gfs2_scan_glock_lru(). In gfs2_scan_glock_lru(), make sure that the glock's reference count is zero before moving the glock to the dispose list. This skips glocks that are marked dead as well as glocks that are still in use. Additionally, switch to spin_trylock() as we already do in gfs2_dispose_glock_lru(); this alone would also be enough to prevent deadlocks against __gfs2_glock_put(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
8fb8f70e |
|
09-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Clean up gfs2_scan_glock_lru Switch to list_for_each_entry_safe() and eliminate the "skipped" list in gfs2_scan_glock_lru(). At the same time, scan the requested number of items to scan, not one more than that number. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
9ffa1888 |
|
23-Jan-2023 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: gl_object races fix Function glock_clear_object() checks if the specified glock is still pointing at the right object and clears the gl_object pointer. To handle the case of incompletely constructed inodes, glock_clear_object() also allows gl_object to be NULL. However, in the teardown case, when iget_failed() is called and the inode is removed from the inode hash, by the time we get to the glock_clear_object() calls in gfs2_put_super() and its helpers, we don't have exclusion against concurrent gfs2_inode_lookup() and gfs2_create_inode() calls, and the inode and iopen glocks may already be pointing at another inode, so the checks in glock_clear_object() are incorrect. To better handle this case, always completely disassociate an inode from its glocks before tearing it down. In addition, get rid of a duplicate glock_clear_object() call in gfs2_evict_inode(). That way, glock_clear_object() will only ever be called when the glock points at the current inode, and the NULL check in glock_clear_object() can be removed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
6b46a061 |
|
09-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Remove support for glock holder auto-demotion (2) As a follow-up to the previous commit, move the recovery related code in __gfs2_glock_dq() to gfs2_glock_dq() where it better fits. No functional change. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
ba3e77a4 |
|
09-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Remove support for glock holder auto-demotion Remove the support for glock holder auto-demotion (commit dc732906c245 and folow-ups) as we are not planning to use this feature, and the additional code therefore only adds unnecessary complexity. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
f0c0ade8 |
|
05-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Minor gfs2_try_evict cleanup In gfs2_try_evict(), when an inode can't be evicted, we are grabbing a temporary reference on the inode glock to poke that glock. That should be safe, but it's easier to just grab an inode reference as we already do earlier in this function. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
88f4a9f8 |
|
05-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Partially revert gfs2_inode_lookup change Commit c412a97cf6c5 changed delete_work_func() to always perform an inode lookup when gfs2_try_evict() fails. This doesn't make sense as a gfs2_try_evict() failure indicates that the inode is likely still in use. Revert that change. Fixes: c412a97cf6c5 ("gfs2: Use TRY lock in gfs2_inode_lookup for UNLINKED inodes") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
3781ec9e |
|
05-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Uninline and improve glock_{set,clear}_object Those functions have reached a size at which having them inline isn't useful anymore, so uninline them. In addition, report the glock name on assertion failures. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
97236ad5 |
|
04-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Avoid dequeuing GL_ASYNC glock holders twice When a locking request fails, the associated glock holder is automatically dequeued from the list of active and waiting holders. For GL_ASYNC locking requests, this will obviously happen asynchronously and it can race with attempts to cancel that locking request via gfs2_glock_dq(). Therefore, don't forget to check if a locking request has already been dequeued in gfs2_glock_dq(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
4ad02083 |
|
02-Dec-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Make gfs2_glock_hold return its glock argument This allows code like 'gl = gfs2_glock_hold(...)'. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
86934198 |
|
18-Aug-2022 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Clear flags when withdraw prevents xmote There are a couple places in function do_xmote where normal processing is circumvented due to withdraws in progress. However, since we bypass most of do_xmote() we bypass telling dlm to lock the dlm lock, which means dlm will never respond with a completion callback. Since the completion callback ordinarily clears GLF_LOCK, this patch changes function do_xmote to handle those situations more gracefully so the file system may be unmounted after withdraw. A very similar situation happens with the GLF_DEMOTE_IN_PROGRESS flag, which is cleared by function finish_xmote(). Since the withdraw causes us to skip the majority of do_xmote, it therefore also skips the call to finish_xmote() so the DEMOTE_IN_PROGRESS flag needs to be cleared manually. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
053640a7 |
|
18-Aug-2022 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Dequeue waiters when withdrawn When a withdraw occurs, ordinary (not system) glocks may not be granted anymore. Later, when the file system is unmounted, gfs2_gl_hash_clear() tries to clear out all the glocks, but these un-grantable pending waiters prevent some glocks from being freed. So the unmount hangs, at least for its ten-minute timeout period. This patch takes measures to remove any pending waiters from the glocks that will never be granted. This allows the unmount to proceed in a reasonable period of time. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
c412a97c |
|
22-Aug-2022 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Use TRY lock in gfs2_inode_lookup for UNLINKED inodes Before this patch, delete_work_func() would check for the GLF_DEMOTE flag on the iopen glock and if set, it would perform special processing. However, there was a race whereby the GLF_DEMOTE flag could be set by another process after the check. Then when it called gfs2_lookup_by_inum() which calls gfs2_inode_lookup(), it tried to lock the iopen glock in SH mode, but the GLF_DEMOTE flag prevented the request from being granted. But the iopen glock could never be demoted because that happens when the inode is evicted, and the evict was never completed because of the failed lookup. To fix that, change function gfs2_inode_lookup() so that when GFS2_BLKST_UNLINKED inodes are searched, it uses the LM_FLAG_TRY flag for the iopen glock. If the locking request fails, fail gfs2_inode_lookup() with -EAGAIN so that delete_work_func() can retry the operation later. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
cbe6d257 |
|
05-Apr-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Add GL_NOPID flag for process-independent glock holders Add a GL_NOPID flag to indicate that once a glock holder has been acquired, it won't be associated with the current process anymore. This is useful for iopen and flock glocks which are associated with open files, as well as journal glock holders and similar which are associated with the filesystem. Once GL_NOPID is used for all applicable glocks (see the next patches), processes will no longer be falsely reported as holding glocks which they are not actually holding in the glocks dump file. Unlike before, when a process is reported as having "(ended)", this will indicate an actual bug. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
56535dc6 |
|
23-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Add flocks to glockfd debugfs file Include flock glocks in the "glockfd" debugfs file. Those are similar to the iopen glocks; while an open file is holding an flock, it is holding the file's flock glock. We cannot take f_fl_mutex in gfs2_glockfd_seq_show_flock() or else dumping the "glockfd" file would block on flock operations. Instead, use the file->f_lock spin lock to protect the f_fl_gh.gh_gl glock pointer. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
4480c27c |
|
08-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Add glockfd debugfs file When a process has a gfs2 file open, the file is keeping a reference on the underlying gfs2 inode, and the inode is keeping the inode's iopen glock held in shared mode. In other words, the process depends on the iopen glock of each open gfs2 file. Expose those dependencies in a new "glockfd" debugfs file. The new debugfs file contains one line for each gfs2 file descriptor, specifying the tgid, file descriptor number, and glock name, e.g., 1601 6 5/816d This list is compiled by iterating all tasks on the system using find_ge_pid(), and all file descriptors of each task using task_lookup_next_fd_rcu(). To make that work from gfs2, export those two functions. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
6feaec81 |
|
10-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: List traversal in do_promote is safe In do_promote(), we're never removing the current entry from the list and so the list traversal is actually safe. Switch back to list_for_each_entry(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
0befb851 |
|
10-Jun-2022 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: do_promote glock holder stealing fix In do_promote(), when the glock had no strong holders, we were accidentally calling demote_incompat_holders() with new_gh == NULL, so no weak holders were considered incompatible. Instead, the new holder should have been passed in. For doing that, the HIF_HOLDER flag needs to be set in new_gh to prevent may_grant() from complaining. This means that the new holder will now be recognized as a current holder, so skip over it explicitly in demote_incompat_holders() to prevent it from being dequeued. To further clarify things, we can now rename new_gh to current_gh in demote_incompat_holders(); after all, the HIF_HOLDER flag is already set, which means the new holder is already a current holder. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
8f0028fc |
|
10-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Use better variable name In do_promote() and add_to_queue(), use current_gh as the variable name for the first strong holder we could find: this matches the variable name is may_grant(), and more clearly indicates that we're interested in one (any) of the current strong holders. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
5f38a4d3 |
|
09-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Make go_instantiate take a glock Make go_instantiate take a glock instead of a glock holder as its argument: this handler is supposed to instantiate the object associated with the glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
86c30a01 |
|
10-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Add new go_held glock operation Right now, inode_go_instantiate() contains functionality that relates to how a glock is held rather than the glock itself, like waiting for pending direct I/O to complete and completing interrupted truncates. This code is meant to be run each time a holder is acquired, but go_instantiate is actually only called once, when the glock is instantiated. To fix that, introduce a new go_held glock operation that is called each time a glock holder is acquired. Move the holder specific code in inode_go_instantiate() over to inode_go_held(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
de3f906f |
|
02-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Revert 'Fix "truncate in progress" hang' Now that interrupted truncates are completed in the context of the process taking the glock, there is no need for the glock state engine to delegate that task to gfs2_quotad or for quotad to perform those truncates anymore. Get rid of the obsolete associated infrastructure. Reverts commit 813e0c46c9e2 ("GFS2: Fix "truncate in progress" hang"). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
53d69132 |
|
10-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Instantiate glocks ouside of glock state engine Instantiate glocks outside of the glock state engine: there is no real reason for instantiating them inside the glock state engine; it only complicates the code. Instead, instantiate them in gfs2_glock_wait() and gfs2_glock_async_wait() using the new gfs2_glock_holder_ready() helper. On top of that, the only other place that acquires a glock without using gfs2_glock_wait() or gfs2_glock_async_wait() is gfs2_upgrade_iopen_glock(), so call gfs2_glock_holder_ready() there as well. If a dinode has a pending truncate, the glock-specific instantiate function for inodes wakes up the truncate function in the quota daemon. Waiting for the completion of the truncate was previously done by the glock state engine, but we now need to wait in inode_go_instantiate(). This also means that gfs2_instantiate() will now no longer return any "special" error codes. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
bdff777c |
|
09-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix up gfs2_glock_async_wait Since commit 1fc05c8d8426 ("gfs2: cancel timed-out glock requests"), a pending locking request can be canceled by calling gfs2_glock_dq() on the pending holder. In gfs2_glock_async_wait(), when we time out, use that to cancel the remaining locking requests and dequeue the locking requests already granted. That's simpler as well as more efficient than waiting for all locking requests to eventually be granted and dequeuing them then. In addition, gfs2_glock_async_wait() promises that by the time the function completes, all glocks are either granted or dequeued, but the implementation doesn't keep that promise if individual locking requests fail. Fix that as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
44dab005 |
|
14-Jun-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Minor gfs2_glock_nq_m cleanup Add state and flags arguments to gfs2_rlist_alloc() to make it somewhat more obvious which state and flags an rlist uses. With that, stop knocking off flags in gfs2_glock_nq_m() and its nq_m_sync() helper that are never set in the first place. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
565f82b5 |
|
26-May-2022 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Rewrap overlong comment in do_promote Rewrap the comment to keep the line length below 80 characters. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
e33c267a |
|
31-May-2022 |
Roman Gushchin <roman.gushchin@linux.dev> |
mm: shrinkers: provide shrinkers with names Currently shrinkers are anonymous objects. For debugging purposes they can be identified by count/scan function names, but it's not always useful: e.g. for superblock's shrinkers it's nice to have at least an idea of to which superblock the shrinker belongs. This commit adds names to shrinkers. register_shrinker() and prealloc_shrinker() functions are extended to take a format and arguments to master a name. In some cases it's not possible to determine a good name at the time when a shrinker is allocated. For such cases shrinker_debugfs_rename() is provided. The expected format is: <subsystem>-<shrinker_type>[:<instance>]-<id> For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair. After this change the shrinker debugfs directory looks like: $ cd /sys/kernel/debug/shrinker/ $ ls dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42 mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43 mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44 rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49 sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13 sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36 sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19 sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10 sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9 sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37 sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38 sb-dax-11 sb-proc-45 sb-tmpfs-35 sb-debugfs-7 sb-proc-46 sb-tmpfs-40 [roman.gushchin@linux.dev: fix build warnings] Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Dave Chinner <dchinner@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
11d8b79e |
|
08-May-2022 |
Kees Cook <keescook@chromium.org> |
gfs2: Use container_of() for gfs2_glock(aspace) Clang's structure layout randomization feature gets upset when it sees struct address_space (which is randomized) cast to struct gfs2_glock. This is due to seeing the mapping pointer as being treated as an array of gfs2_glock, rather than "something else, before struct address_space": In file included from fs/gfs2/acl.c:23: fs/gfs2/meta_io.h:44:12: error: casting from randomized structure pointer type 'struct address_space *' to 'struct gfs2_glock *' return (((struct gfs2_glock *)mapping) - 1)->gl_name.ln_sbd; ^ Replace the instances of open-coded pointer math with container_of() usage, and update the allocator to match. Some cleanups and conversion of gfs2_glock_get() and gfs2_glock_dealloc() by Andreas. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/202205041550.naKxwCBj-lkp@intel.com Cc: Bob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bill Wendling <morbo@google.com> Cc: cluster-devel@redhat.com Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
a4e8145e |
|
14-Jan-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Initialize gh_error in gfs2_glock_nq The gh_error field if a glock holder is initialized to zero in gfs2_holder_init(). When a locking operation fails, gh_error is set to an error code; when it succeeds, the gh_error value is left unchanged. The field isn't initialized in gfs2_holder_reinit(), which is a problem. Instead of fixing that directly, initialize gh_error in gfs2_glock_nq(). That also obsoletes the assignment in do_flock(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
5a27a43e |
|
12-Jan-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Make use of list_is_first Use list_is_first() instead of open-coding it. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
1fc05c8d |
|
23-Jan-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: cancel timed-out glock requests The gfs2 evict code tries to upgrade the iopen glock from SH to EX. If the attempt to upgrade times out, gfs2 needs to tell dlm to cancel the lock request or it can deadlock. We also need to wake up the process waiting for the lock when dlm sends its AST back to gfs2. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
356b8103 |
|
03-Feb-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
Revert "gfs2: check context in gfs2_glock_put" It turns out that the might_sleep() call that commit 660a6126f8c3 adds is triggering occasional data corruption in testing. We're not sure about the root cause yet, but since this commit was added as a debugging aid only, revert it for now. This reverts commit 660a6126f8c3208f6df8d552039cda078a8426d1. Fixes: 660a6126f8c3 ("gfs2: check context in gfs2_glock_put") Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
3c5c67ec |
|
29-Nov-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix gfs2_instantiate description The description of gfs2_instantiate accidentally lists a glock argument, but the function takes a glock holder. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
ffd0cd3c |
|
18-Nov-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix __gfs2_holder_init function name in kernel-doc comment The function name in the kernel-doc comment wasn't updated when the function was renamed. Fixes: b016d9a84abd ("gfs2: Save ip from gfs2_glock_nq_init") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
e11b02df |
|
22-Nov-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix remote demote of weak glock holders When we mock up a temporary holder in gfs2_glock_cb to demote weak holders in response to a remote locking conflict, we don't set the HIF_HOLDER flag. This causes function may_grant to BUG. Fix by setting the missing HIF_HOLDER flag in the mock glock holder. In addition, define the mock glock holder where it is used. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
a7ac203d |
|
08-Nov-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix "Introduce flag for glock holder auto-demotion" Function demote_incompat_holders iterates over the list of glock holders with list_for_each_entry, and it then sometimes removes the current holder from the list. This will get the loop stuck; we must use list_for_each_entry_safe instead. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
7a92deaa |
|
03-Nov-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix atomic bug in gfs2_instantiate Replace test_bit() + set_bit() with test_and_set_bit() where we need an atomic operation. Use clear_and_wake_up_bit() instead of open coding it. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
660a6126 |
|
13-Oct-2021 |
Alexander Aring <aahringo@redhat.com> |
gfs2: check context in gfs2_glock_put Add a might_sleep call into gfs2_glock_put which can sleep in DLM when the last reference is released. This will show problems earlier, and not only when the last reference is put. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
7427f3bb |
|
07-Oct-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix glock_hash_walk bugs So far, glock_hash_walk took a reference on each glock it iterated over, and it was the examiner's responsibility to drop those references. Dropping the final reference to a glock can sleep and the examiners are called in a RCU critical section with spin locks held, so examiners that didn't need the extra reference had to drop it asynchronously via gfs2_glock_queue_put or similar. This wasn't done correctly in thaw_glock which did call gfs2_glock_put, and not at all in dump_glock_func. Change glock_hash_walk to not take glock references at all. That way, the examiners that don't need them won't have to bother with slow asynchronous puts, and the examiners that do need references can take them themselves. Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
486408d6 |
|
11-Oct-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Cancel remote delete work asynchronously In gfs2_inode_lookup and gfs2_create_inode, we're calling gfs2_cancel_delete_work which currently cancels any remote delete work (delete_work_func) synchronously. This means that if the work is currently running, it will wait for it to finish. We're doing this to pevent a previous instance of an inode from having any influence on the next instance. However, delete_work_func uses gfs2_inode_lookup internally, and we can end up in a deadlock when delete_work_func gets interrupted at the wrong time. For example, (1) An inode's iopen glock has delete work queued, but the inode itself has been evicted from the inode cache. (2) The delete work is preempted before reaching gfs2_inode_lookup. (3) Another process recreates the inode (gfs2_create_inode). It tries to cancel any outstanding delete work, which blocks waiting for the ongoing delete work to finish. (4) The delete work calls gfs2_inode_lookup, which blocks waiting for gfs2_create_inode to instantiate and unlock the new inode => deadlock. It turns out that when the delete work notices that its inode has been re-instantiated, it will do nothing. This means that it's safe to cancel the delete work asynchronously. This prevents the kind of deadlock described above. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
f2e70d8f |
|
06-Oct-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: fix GL_SKIP node_scope problems Before this patch, when a glock was locked, the very first holder on the queue would unlock the lockref and call the go_instantiate glops function (if one existed), unless GL_SKIP was specified. When we introduced the new node-scope concept, we allowed multiple holders to lock glocks in EX mode and share the lock. But node-scope introduced a new problem: if the first holder has GL_SKIP and the next one does NOT, since it is not the first holder on the queue, the go_instantiate op was not called. Eventually the GL_SKIP holder may call the instantiate sub-function (e.g. gfs2_rgrp_bh_get) but there was still a window of time in which another non-GL_SKIP holder assumes the instantiate function had been called by the first holder. In the case of rgrp glocks, this led to a NULL pointer dereference on the buffer_heads. This patch tries to fix the problem by introducing two new glock flags: GLF_INSTANTIATE_NEEDED, which keeps track of when the instantiate function needs to be called to "fill in" or "read in" the object before it is referenced. GLF_INSTANTIATE_IN_PROG which is used to determine when a process is in the process of reading in the object. Whenever a function needs to reference the object, it checks the GLF_INSTANTIATE_NEEDED flag, and if set, it sets GLF_INSTANTIATE_IN_PROG and calls the glops "go_instantiate" function. As before, the gl_lockref spin_lock is unlocked during the IO operation, which may take a relatively long amount of time to complete. While unlocked, if another process determines go_instantiate is still needed, it sees GLF_INSTANTIATE_IN_PROG is set, and waits for the go_instantiate glop operation to be completed. Once GLF_INSTANTIATE_IN_PROG is cleared, it needs to check GLF_INSTANTIATE_NEEDED again because the other process's go_instantiate operation may not have been successful. Functions that previously called the instantiate sub-functions now call directly into gfs2_instantiate so the new bits are managed properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
e6f85600 |
|
06-Oct-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: split glock instantiation off from do_promote Before this patch, function do_promote had a section of code that did the actual instantiation. This patch splits that off into its own function, gfs2_instantiate, which prepares us for the next patch that will use that function. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
60d8bae9 |
|
06-Oct-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: further simplify do_promote This patch further simplifies function do_promote by eliminating some redundant code in favor of using a lock_released flag. This is just prep work for a future patch. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
17a6ecee |
|
06-Oct-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: re-factor function do_promote This patch simply re-factors function do_promote to reduce the indents. The logic should be unchanged. This makes future patches more readable. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
d74d0ce5 |
|
05-Oct-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Remove 'first' trace_gfs2_promote argument Remove the 'first' argument of trace_gfs2_promote: with GL_SKIP, the 'first' holder isn't the one that instantiates the glock (gl_instantiate), which is what the 'first' flag was apparently supposed to indicate. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
3278b977 |
|
29-Sep-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: change go_lock to go_instantiate Before this patch, the go_lock glock operations (glops) did not do any actual locking. They were used to instantiate objects, like reading in dinodes and rgrps from the media. This patch renames the functions to go_instantiate for clarity. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
b016d9a8 |
|
30-Sep-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Save ip from gfs2_glock_nq_init Before this patch, when a glock was locked by function gfs2_glock_nq_init, it initialized the holder gh_ip (return address) as gfs2_glock_nq_init. That made it extremely difficult to track down problems because many functions call gfs2_glock_nq_init. This patch changes the function so that it saves gh_ip from the caller of gfs2_glock_nq_init, which makes it easy to backtrack which holder took the lock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
c1442f6b |
|
12-Sep-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: move GL_SKIP check from glops to do_promote Before this patch, each individual "go_lock" glock operation (glop) checked the GL_SKIP flag, and if set, would skip further processing. This patch changes the logic so the go_lock caller, function go_promote, checks the GL_SKIP flag before calling the go_lock op in the first place. This avoids having to unnecessarily unlock gl_lockref.lock only to re-lock it again. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
4c69038d |
|
12-Sep-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Add GL_SKIP holder flag to dump_holder Somehow, the GL_SKIP flag was missed when dumping glock holders. This patch adds it to function hflags2str. I added it at the end because I wanted Holder and Skip flags together to read "Hs" rather than "sH" to avoid confusion with "Shared" ("SH") holder state. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
dc732906 |
|
19-Aug-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Introduce flag for glock holder auto-demotion This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure that will allow glocks to be demoted automatically on locking conflicts. When a locking request comes in that isn't compatible with the locking state of an active holder and that holder has the HIF_MAY_DEMOTE flag set, the holder will be demoted before the incoming locking request is granted. Note that this mechanism demotes active holders (with the HIF_HOLDER flag set), while before we were only demoting glocks without any active holders. This allows processes to keep hold of locks that may form a cyclic locking dependency; the core glock logic will then break those dependencies in case a conflicting locking request occurs. We'll use this to avoid giving up the inode glock proactively before faulting in pages. Processes that allow a glock holder to be taken away indicate this by calling gfs2_holder_allow_demote(), which sets the HIF_MAY_DEMOTE flag. Later, they call gfs2_holder_disallow_demote() to clear the flag again, and then they check if their holder is still queued: if it is, they are still holding the glock; if it isn't, they can re-acquire the glock (or abort). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
61444649 |
|
09-Aug-2021 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Clean up function may_grant Pass the first current glock holder into function may_grant and deobfuscate the logic there. While at it, switch from BUG_ON to GLOCK_BUG_ON in may_grant. To make that build cleanly, de-constify the may_grant arguments. We're now using function find_first_holder in do_promote, so move the function's definition above do_promote. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
08d73666 |
|
02-Aug-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Remove redundant check from gfs2_glock_dq In function gfs2_glock_dq, it checks to see if this is the fast path. Before this patch, it checked both "find_first_holder(gl) == NULL" and list_empty(&gl->gl_holders), which is redundant. If gl_holders is empty then find_first_holder must return NULL. This patch removes the redundancy. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
a8f1d32d |
|
27-Jul-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Eliminate vestigial HIF_FIRST Holder flag HIF_FIRST is no longer used or needed, so remove it. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
38a618db |
|
07-Jun-2021 |
Baokun Li <libaokun1@huawei.com> |
gfs2: Use list_move_tail instead of list_del/list_add_tail Using list_move_tail() instead of list_del() + list_add_tail(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
1ab19c5d |
|
18-May-2021 |
Hillf Danton <hdanton@sina.com> |
gfs2: Fix use-after-free in gfs2_glock_shrink_scan The GLF_LRU flag is checked under lru_lock in gfs2_glock_remove_from_lru() to remove the glock from the lru list in __gfs2_glock_put(). On the shrink scan path, the same flag is cleared under lru_lock but because of cond_resched_lock(&lru_lock) in gfs2_dispose_glock_lru(), progress on the put side can be made without deleting the glock from the lru list. Keep GLF_LRU across the race window opened by cond_resched_lock(&lru_lock) to ensure correct behavior on both sides - clear GLF_LRU after list_del under lru_lock. Reported-by: syzbot <syzbot+34ba7ddbf3021981a228@syzkaller.appspotmail.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
865cc3e9 |
|
18-May-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: fix a deadlock on withdraw-during-mount Before this patch, gfs2 would deadlock because of the following sequence during mount: mount gfs2_fill_super gfs2_make_fs_rw <--- Detects IO error with glock kthread_stop(sdp->sd_quotad_process); <--- Blocked waiting for quotad to finish logd Detects IO error and the need to withdraw calls gfs2_withdraw gfs2_make_fs_ro kthread_stop(sdp->sd_quotad_process); <--- Blocked waiting for quotad to finish gfs2_quotad gfs2_statfs_sync gfs2_glock_wait <---- Blocked waiting for statfs glock to be granted glock_work_func do_xmote <---Detects IO error, can't release glock: blocked on withdraw glops->go_inval glock_blocked_by_withdraw requeue glock work & exit <--- work requeued, blocked by withdraw This patch makes a special exception for the statfs system inode glock, which allows the statfs glock UNLOCK to proceed normally. That allows the quotad daemon to exit during the withdraw, which allows the logd daemon to exit during the withdraw, which allows the mount to exit. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
20265d9a |
|
18-May-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: fix scheduling while atomic bug in glocks Before this patch, in the unlikely event that gfs2_glock_dq encountered a withdraw, it would do a wait_on_bit to wait for its journal to be recovered, but it never released the glock's spin_lock, which caused a scheduling-while-atomic error. This patch unlocks the lockref spin_lock before waiting for recovery. Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish") Cc: stable@vger.kernel.org # v5.7+ Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
7716506a |
|
04-May-2021 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
mm: introduce and use mapping_empty() Patch series "Remove nrexceptional tracking", v2. We actually use nrexceptional for very little these days. It's a minor pain to keep in sync with nrpages, but the pain becomes much bigger with the THP patches because we don't know how many indices a shadow entry occupies. It's easier to just remove it than keep it accurate. Also, we save 8 bytes per inode which is nothing to sneeze at; on my laptop, it would improve shmem_inode_cache from 22 to 23 objects per 16kB, and inode_cache from 26 to 27 objects. Combined, that saves a megabyte of memory from a combined usage of 25MB for both caches. Unfortunately, ext4 doesn't cross a magic boundary, so it doesn't save any memory for ext4. This patch (of 4): Instead of checking the two counters (nrpages and nrexceptional), we can just check whether i_pages is empty. Link: https://lkml.kernel.org/r/20201026151849.24232-1-willy@infradead.org Link: https://lkml.kernel.org/r/20201026151849.24232-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
c551f66c |
|
30-Mar-2021 |
Lee Jones <lee.jones@linaro.org> |
gfs2: Fix a number of kernel-doc warnings Building the kernel with W=1 results in a number of kernel-doc warnings like incorrect function names and parameter descriptions. Fix those, mostly by adding missing parameter descriptions, removing left-over descriptions, and demoting some less important kernel-doc comments into regular comments. Originally proposed by Lee Jones; improved and combined into a single patch by Andreas. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
f68effb3 |
|
19-Mar-2021 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Eliminate gh parameter from go_xmote_bh func The only glock that uses go_xmote_bh glops function is the freeze glock which uses freeze_go_xmote_bh. It does not use its gh parameter, so this patch eliminates the unneeded parameter. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
4f0f586b |
|
08-Apr-2021 |
Sami Tolvanen <samitolvanen@google.com> |
treewide: Change list_sort to use const pointers list_sort() internally casts the comparison function passed to it to a different type with constant struct list_head pointers, and uses this pointer to call the functions, which trips indirect call Control-Flow Integrity (CFI) checking. Instead of removing the consts, this change defines the list_cmp_func_t type and changes the comparison function types of all list_sort() callers to use const pointers, thus avoiding type mismatches. Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
|
#
06e908cd |
|
18-Apr-2018 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Allow node-wide exclusive glock sharing Introduce a new LM_FLAG_NODE_SCOPE glock holder flag: when taking a glock in LM_ST_EXCLUSIVE (EX) mode and with the LM_FLAG_NODE_SCOPE flag set, the exclusive lock is shared among all local processes who are holding the glock in EX mode and have the LM_FLAG_NODE_SCOPE flag set. From the point of view of other nodes, the lock is still held exclusively. A future patch will start using this flag to improve performance with rgrp sharing. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
a55a47a3 |
|
27-Nov-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
Revert "GFS2: Prevent delete work from occurring on glocks used for create" Since commit a0e3cc65fa29 ("gfs2: Turn gl_delete into a delayed work"), we're cancelling any pending delete work of an iopen glock before attaching a new inode to that glock in gfs2_create_inode. This means that delete_work_func can no longer be queued or running when attaching the iopen glock to the new inode, and we can revert commit a4923865ea07 ("GFS2: Prevent delete work from occurring on glocks used for create"), which tried to achieve the same but in a racy way. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
515b269d |
|
23-Nov-2020 |
Alexander Aring <aahringo@redhat.com> |
gfs2: set lockdep subclass for iopen glocks This patch introduce a new globs attribute to define the subclass of the glock lockref spinlock. This avoid the following lockdep warning, which occurs when we lock an inode lock while an iopen lock is held: ============================================ WARNING: possible recursive locking detected 5.10.0-rc3+ #4990 Not tainted -------------------------------------------- kworker/0:1/12 is trying to acquire lock: ffff9067d45672d8 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: lockref_get+0x9/0x20 but task is already holding lock: ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&gl->gl_lockref.lock); lock(&gl->gl_lockref.lock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/0:1/12: #0: ffff9067c1bfdd38 ((wq_completion)delete_workqueue){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540 #1: ffffac594006be70 ((work_completion)(&(&gl->gl_delete)->work)){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540 #2: ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260 stack backtrace: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.10.0-rc3+ #4990 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Workqueue: delete_workqueue delete_work_func Call Trace: dump_stack+0x8b/0xb0 __lock_acquire.cold+0x19e/0x2e3 lock_acquire+0x150/0x410 ? lockref_get+0x9/0x20 _raw_spin_lock+0x27/0x40 ? lockref_get+0x9/0x20 lockref_get+0x9/0x20 delete_work_func+0x188/0x260 process_one_work+0x237/0x540 worker_thread+0x4d/0x3b0 ? process_one_work+0x540/0x540 kthread+0x127/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
da7d554f |
|
26-Oct-2020 |
Alexander Aring <aahringo@redhat.com> |
gfs2: Wake up when sd_glock_disposal becomes zero Commit fc0e38dae645 ("GFS2: Fix glock deallocation race") fixed a sd_glock_disposal accounting bug by adding a missing atomic_dec statement, but it failed to wake up sd_glock_wait when that decrement causes sd_glock_disposal to reach zero. As a consequence, gfs2_gl_hash_clear can now run into a 10-minute timeout instead of being woken up. Add the missing wakeup. Fixes: fc0e38dae645 ("GFS2: Fix glock deallocation race") Cc: stable@vger.kernel.org # v2.6.39+ Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
2ffed529 |
|
15-Oct-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Only access gl_delete for iopen glocks Only initialize gl_delete for iopen glocks, but more importantly, only access it for iopen glocks in flush_delete_work: flush_delete_work is called for different types of glocks including rgrp glocks, and those use gl_vm which is in a union with gl_delete. Without this fix, we'll end up clobbering gl_vm, which results in general memory corruption. Fixes: a0e3cc65fa29 ("gfs2: Turn gl_delete into a delayed work") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
dbffb29d |
|
12-Oct-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Fix comments to glock_hash_walk The comments before function glock_hash_walk had the wrong name and an extra parameter. This simply fixes the comments. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
e2c6c8a7 |
|
12-Oct-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: eliminate GLF_QUEUED flag in favor of list_empty(gl_holders) Before this patch, glock.c maintained a flag, GLF_QUEUED, which indicated when a glock had a holder queued. It was only checked for inode glocks, although set and cleared by all glocks, and it was only used to determine whether the glock should be held for the minimum hold time before releasing. The problem is that the flag is not accurate at all. If a process holds the glock, the flag is set. When they dequeue the glock, it only cleared the flag in cases when the state actually changed. So if the state doesn't change, the flag may still be set, even when nothing is queued. This happens to iopen glocks often: the get held in SH, then the file is closed, but the glock remains in SH mode. We don't need a special flag to indicate this: we can simply tell whether the glock has any items queued to the holders queue. It's a waste of cpu time to maintain it. This patch eliminates the flag in favor of simply checking list_empty on the glock holders. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
ee1e2c77 |
|
16-Sep-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: call truncate_inode_pages_final for address space glocks Before this patch, we were not calling truncate_inode_pages_final for the address space for glocks, which left the possibility of a leak. We now take care of the problem instead of complaining, and we do it during glock tear-down.. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
e8a8023e |
|
15-Sep-2020 |
Liu Shixin <liushixin2@huawei.com> |
gfs2: convert to use DEFINE_SEQ_ATTRIBUTE macro Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code. Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
c07bfb4d |
|
27-Jul-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix refcount leak in gfs2_glock_poke In gfs2_glock_poke, make sure gfs2_holder_uninit is called on the local glock holder. Without that, we're leaking a glock and a pid reference. Fixes: 9e8990dea926 ("gfs2: Smarter iopen glock waiting") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
5deaf1f6 |
|
09-Jun-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Add some flags missing from glock output Before this patch, three flags were not represented in the glock output. This patch adds them in: c - GLF_INODE_CREATING P - GLF_PENDING_DELETE x - GLF_FREEING (both f and F are already used) Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
34244d71 |
|
10-Jun-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Don't sleep during glock hash walk In flush_delete_work, instead of flushing each individual pending delayed work item, cancel and re-queue them for immediate execution. The waiting isn't needed here because we're already waiting for all queued work items to complete in gfs2_flush_delete_work. This makes the code more efficient, but more importantly, it avoids sleeping during a rhashtable walk, inside rcu_read_lock(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
9e8990de |
|
17-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Smarter iopen glock waiting When trying to upgrade the iopen glock from a shared to an exclusive lock in gfs2_evict_inode, abort the wait if there is contention on the corresponding inode glock: in that case, the inode must still be in active use on another node, and we're not guaranteed to get the iopen glock anytime soon. To make this work even better, when we notice contention on the iopen glock and we can't evict the corresponsing inode and release the iopen glock immediately, poke the inode glock. The other node(s) trying to acquire the lock can then abort instead of timing out. Thanks to Heinz Mauelshagen for pointing out a locking bug in a previous version of this patch. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
35b6f8fb |
|
17-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Wake up when setting GLF_DEMOTE Wake up the sdp->sd_async_glock_wait wait queue when setting the GLF_DEMOTE flag. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
b0dcffd8 |
|
15-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Check inode generation number in delete_work_func In delete_work_func, if the iopen glock still has an inode attached, limit the inode lookup to that specific generation number: in the likely case that the inode was deleted on the node on which the inode's link count dropped to zero, we can skip verifying the on-disk block type and reading in the inode. The same applies if another node that had the inode open managed to delete the inode before us. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
6bdcadea |
|
14-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Minor gfs2_lookup_by_inum cleanup Use a zero no_formal_ino instead of a NULL pointer to indicate that any inode generation number will qualify: a valid inode never has a zero no_formal_ino. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
8c7b9262 |
|
13-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Give up the iopen glock on contention When there's contention on the iopen glock, it means that the link count of the corresponding inode has dropped to zero on a remote node which is now trying to delete the inode. In that case, try to evict the inode so that the iopen glock will be released, which will allow the remote node to do its job. When the inode is still open locally, the inode's reference count won't drop to zero and so we'll keep holding the inode and its iopen glock. The remote node will time out its request to grab the iopen glock, and when the inode is finally closed locally, we'll try to delete it ourself. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
a0e3cc65 |
|
16-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Turn gl_delete into a delayed work This requires flushing delayed work items in gfs2_make_fs_ro (which is called before unmounting a filesystem). When inodes are deleted and then recreated, pending gl_delete work items would have no effect because the inode generations will have changed, so we can cancel any pending gl_delete works before reusing iopen glocks. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
f286d627 |
|
13-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Keep track of deleted inode generations in LVBs When deleting an inode, keep track of the generation of the deleted inode in the inode glock Lock Value Block (LVB). When trying to delete an inode remotely, check the last-known inode generation against the deleted inode generation to skip duplicate remote deletes. This avoids taking the resource group glock in order to verify the block type. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
15f2547b |
|
14-Jan-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Allow ASPACE glocks to also have an lvb Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
ea4e61c7 |
|
23-May-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: introduce new gfs2_glock_assert_withdraw Before this patch, asserts based on glocks did not print the glock with the error. This patch introduces a new macro, gfs2_glock_assert_withdraw which first prints the glock, then takes the assert. This also changes a few glock asserts to the new macro. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
7e901d6e |
|
22-May-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: print mapping->nrpages in glock dump for address space glocks This patch makes the glock dumps in debugfs print the number of pages (nrpages) for address space glocks. This will aid in debugging. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
b14c9490 |
|
08-May-2020 |
Bob Peterson <rpeterso@redhat.com> |
Revert "gfs2: Don't demote a glock until its revokes are written" This reverts commit df5db5f9ee112e76b5202fbc331f990a0fc316d6. This patch fixes a regression: patch df5db5f9ee112 allowed function run_queue() to bypass its call to do_xmote() if revokes were queued for the glock. That's wrong because its call to do_xmote() is what is responsible for calling the go_sync() glops functions to sync both the ail list and any revokes queued for it. By bypassing the call, gfs2 could get into a stand-off where the glock could not be demoted until its revokes are written back, but the revokes would not be written back because do_xmote() was never called. It "sort of" works, however, because there are other mechanisms like the log flush daemon (logd) that can sync the ail items and revokes, if it deems it necessary. The problem is: without file system pressure, it might never deem it necessary. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
b11e1a84 |
|
06-May-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: If go_sync returns error, withdraw but skip invalidate Before this patch, if the go_sync operation returned an error during the do_xmote process (such as unable to sync metadata to the journal) the code did goto out. That kept the glock locked, so it could not be given away, which correctly avoids file system corruption. However, it never set the withdraw bit or requeueing the glock work. So it would hang forever, unable to ever demote the glock. This patch changes to goto to a new label, skip_inval, so that errors from go_sync are treated the same way as errors from go_inval: The delayed withdraw bit is set and the work is requeued. That way, the logd should eventually figure out there's a problem and withdraw properly there. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
a8b7528b |
|
23-Apr-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Fix error exit in do_xmote Before this patch, if an error was detected from glock function go_sync by function do_xmote, it would return. But the function had temporarily unlocked the gl_lockref spin_lock, and it never re-locked it. When the caller of do_xmote tried to unlock it again, it was already unlocked, which resulted in a corrupted spin_lock value. This patch makes sure the gl_lockref spin_lock is re-locked after it is unlocked. Thanks to Wu Bo <wubo40@huawei.com> for reporting this problem. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
969183bc |
|
03-Feb-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Switch to list_{first,last}_entry Replace open-coded versions of list_first_entry and list_last_entry with those functions. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
1c634f94 |
|
13-Nov-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Do proper error checking for go_sync family of glops functions Before this patch, function do_xmote would try to sync out the glock dirty data by calling the appropriate glops function XXX_go_sync() but it did not check for a good return code. If the sync was not possible due to an io error or whatever, do_xmote would continue on and call go_inval and release the glock to other cluster nodes. When those nodes go to replay the journal, they may already be holding glocks for the journal records that should have been synced, but were not due to the ignored error. This patch introduces proper error code checking to the go_sync family of glops functions. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
df5db5f9 |
|
13-Nov-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Don't demote a glock until its revokes are written Before this patch, run_queue would demote glocks based on whether there are any more holders. But if the glock has pending revokes that haven't been written to the media, giving up the glock might end in file system corruption if the revokes never get written due to io errors, node crashes and fences, etc. In that case, another node will replay the metadata blocks associated with the glock, but because the revoke was never written, it could replay that block even though the glock had since been granted to another node who might have made changes. This patch changes the logic in run_queue so that it never demotes a glock until its count of pending revokes reaches zero. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
d93ae386 |
|
29-Apr-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Check for log write errors before telling dlm to unlock Before this patch, function do_xmote just assumed all the writes submitted to the journal were finished and successful, and it called the go_unlock function to release the dlm lock. But if they're not, and a revoke failed to make its way to the journal, a journal replay on another node will cause corruption if we let the go_inval function continue and tell dlm to release the glock to another node. This patch adds a couple checks for errors in do_xmote after the calls to go_sync and go_inval. If an error is found, we cannot withdraw yet, because the withdraw itself uses glocks to make the file system read-only. Instead, we flag the error. Later, asserts should cause another node to replay the journal before continuing, thus protecting rgrp and dinode glocks and maintaining the integrity of the metadata. Note that we only need to do this for journaled glocks. System glocks should be able to progress even under withdrawn conditions. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
33dbd1e4 |
|
22-May-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: fix infinite loop when checking ail item count before go_inval Before this patch, the rgrp_go_inval and inode_go_inval functions each checked if there were any items left on the ail count (by way of a count), and if so, did a withdraw. But the withdraw code now uses glocks when changing the file system to read-only status. So we can not have glock functions withdrawing or a hang will likely result: The glocks can't be serviced by the work_func if the work_func is busy doing its own withdraw. This patch removes the checks from the go_inval functions and adds a centralized check in do_xmote to warn about the problem and not withdraw, but flag the error so it's eventually caught when the logd daemon eventually runs. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
601ef0d5 |
|
28-Jan-2020 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Force withdraw to replay journals and wait for it to finish When a node withdraws from a file system, it often leaves its journal in an incomplete state. This is especially true when the withdraw is caused by io errors writing to the journal. Before this patch, a withdraw would try to write a "shutdown" record to the journal, tell dlm it's done with the file system, and none of the other nodes know about the problem. Later, when the problem is fixed and the withdrawn node is rebooted, it would then discover that its own journal was incomplete, and replay it. However, replaying it at this point is almost guaranteed to introduce corruption because the other nodes are likely to have used affected resource groups that appeared in the journal since the time of the withdraw. Replaying the journal later will overwrite any changes made, and not through any fault of dlm, which was instructed during the withdraw to release those resources. This patch makes file system withdraws seen by the entire cluster. Withdrawing nodes dequeue their journal glock to allow recovery. The remaining nodes check all the journals to see if they are clean or in need of replay. They try to replay dirty journals, but only the journals of withdrawn nodes will be "not busy" and therefore available for replay. Until the journal replay is complete, no i/o related glocks may be given out, to ensure that the replay does not cause the aforementioned corruption: We cannot allow any journal replay to overwrite blocks associated with a glock once it is held. The "live" glock which is now used to signal when a withdraw occurs. When a withdraw occurs, the node signals its withdraw by dequeueing the "live" glock and trying to enqueue it in EX mode, thus forcing the other nodes to all see a demote request, by way of a "1CB" (one callback) try lock. The "live" glock is not granted in EX; the callback is only just used to indicate a withdraw has occurred. Note that all nodes in the cluster must wait for the recovering node to finish replaying the withdrawing node's journal before continuing. To this end, it checks that the journals are clean multiple times in a retry loop. Also note that the withdraw function may be called from a wide variety of situations, and therefore, we need to take extra precautions to make sure pointers are valid before using them in many circumstances. We also need to take care when glocks decide to withdraw, since the withdraw code now uses glocks. Also, before this patch, if a process encountered an error and decided to withdraw, if another process was already withdrawing, the second withdraw would be silently ignored, which set it free to unlock its glocks. That's correct behavior if the original withdrawer encounters further errors down the road. But if secondary waiters don't wait for the journal replay, unlocking glocks will allow other nodes to use them, despite the fact that the journal containing those blocks is being replayed. The replay needs to finish before our glocks are released to other nodes. IOW, secondary withdraws need to wait for the first withdraw to finish. For example, if an rgrp glock is unlocked by a process that didn't wait for the first withdraw, a journal replay could introduce file system corruption by replaying a rgrp block that has already been granted to a different cluster node. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
a72d2401 |
|
13-Jun-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Allow some glocks to be used during withdraw We need to allow some glocks to be enqueued, dequeued, promoted, and demoted when we're withdrawn. For example, to maintain metadata integrity, we should disallow the use of inode and rgrp glocks when withdrawn. Other glocks, like iopen or the transaction glocks may be safely used because none of their metadata goes through the journal. So in general, we should disallow all glocks with an address space, and allow all the others. One exception is: we need to allow our active journal to be demoted so others may recover it. Allowing glocks after withdraw gives us the ability to take appropriate action (in a following patch) to have our journal properly replayed by another node rather than just abandoning the current transactions and pretending nothing bad happened, leaving the other nodes free to modify the blocks we had in our journal, which may result in file system corruption. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
b3422cac |
|
13-Nov-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Rework how rgrp buffer_heads are managed Before this patch, the rgrp code had a serious problem related to how it managed buffer_heads for resource groups. The problem caused file system corruption, especially in cases of journal replay. When an rgrp glock was demoted to transfer ownership to a different cluster node, do_xmote() first calls rgrp_go_sync and then rgrp_go_inval, as expected. When it calls rgrp_go_sync, that called gfs2_rgrp_brelse() that dropped the buffer_head reference count. In most cases, the reference count went to zero, which is right. However, there were other places where the buffers are handled differently. After rgrp_go_sync, do_xmote called rgrp_go_inval which called gfs2_rgrp_brelse a second time, then rgrp_go_inval's call to truncate_inode_pages_range would get rid of the pages in memory, but only if the reference count drops to 0. Unfortunately, gfs2_rgrp_brelse was setting bi->bi_bh = NULL. So when rgrp_go_sync called gfs2_rgrp_brelse, it lost the pointer to the buffer_heads in cases where the reference count was still 1. Therefore, when rgrp_go_inval called gfs2_rgrp_brelse a second time, it failed the check for "if (bi->bi_bh)" and thus failed to call brelse a second time. Because of that, the reference count on those buffers sometimes failed to drop from 1 to 0. And that caused function truncate_inode_pages_range to keep the pages in page cache rather than freeing them. The next time the rgrp glock was acquired, the metadata read of the rgrp buffers re-used the pages in memory, which were now wrong because they were likely modified by the other node who acquired the glock in EX (which is why we demoted the glock). This re-use of the page cache caused corruption because changes made by the other nodes were never seen, so the bitmaps were inaccurate. For some reason, the problem became most apparent when journal replay forced the replay of rgrps in memory, which caused newer rgrp data to be overwritten by the older in-core pages. A big part of the problem was that the rgrp buffer were released in multiple places: The go_unlock function would release them when the glock was released rather than when the glock is demoted, which is clearly wrong because our intent was to cache them until the glock is demoted from SH or EX. This patch attempts to clean up the mess and make one consistent and centralized mechanism for managing the rgrp buffer_heads by implementing several changes: 1. It eliminates the call to gfs2_rgrp_brelse() from rgrp_go_sync. We don't want to release the buffers or zero the pointers when syncing for the reasons stated above. It only makes sense to release them when the glock is actually invalidated (go_inval). And when we do, then we set the bh pointers to NULL. 2. The go_unlock function (which was only used for rgrps) is eliminated, as we've talked about doing many times before. The go_unlock function was called too early in the glock dq process, and should not happen until the glock is invalidated. 3. It also eliminates the call to rgrp_brelse in gfs2_clear_rgrpd. That will now happen automatically when the rgrp glocks are demoted, and shouldn't happen any sooner or later than that. Instead, function gfs2_clear_rgrpd has been modified to demote the rgrp glocks, and therefore, free those pages, before the remaining glocks are culled by gfs2_gl_hash_clear. This prevents the gl_object from hanging around when the glocks are culled. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
f7be987b |
|
15-Jan-2020 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Remove GFS2_MIN_LVB_SIZE define The dlm lockspace is set up to have lock value blocks of GDLM_LVB_SIZE bytes, and dlm is the only lock manager we support, so there is no point in claiming that the lock value block could have any other size. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
d99724c3 |
|
15-Nov-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Close timing window with GLF_INVALIDATE_IN_PROGRESS This patch closes a timing window in which two processes compete and overlap in the execution of do_xmote for the same glock: Process A Process B ------------------------------------ ----------------------------- 1. Grabs gl_lockref and calls do_xmote 2. Grabs gl_lockref but is blocked 3. Sets GLF_INVALIDATE_IN_PROGRESS 4. Unlocks gl_lockref 5. Calls do_xmote 6. Call glops->go_sync 7. test_and_clear_bit GLF_DIRTY 8. Call gfs2_log_flush Call glops->go_sync 9. (slow IO, so it blocks a long time) test_and_clear_bit GLF_DIRTY It's not dirty (step 7) returns 10. Tests GLF_INVALIDATE_IN_PROGRESS 11. Calls go_inval (rgrp_go_inval) 12. gfs2_rgrp_relse does brelse 13. truncate_inode_pages_range 14. Calls lm_lock UN In step 14 we've just told dlm to give the glock to another node when, in fact, process A has not finished the IO and synced all buffer_heads to disk and make sure their revokes are done. This patch fixes the problem by changing the GLF_INVALIDATE_IN_PROGRESS to use test_and_set_bit, and if the bit is already set, process B just ignores it and trusts that process A will do the do_xmote in the proper order. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
eb43e660 |
|
14-Nov-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Introduce function gfs2_withdrawn Add function gfs2_withdrawn and replace all checks for the SDF_WITHDRAWN bit to call it. This does not change the logic or function of gfs2, and it facilitates later improvements to the withdraw sequence. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
ad26967b |
|
29-Aug-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Use async glocks for rename Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can reverse the roles of which directories are "old" and which are "new" for the purposes of rename. This can cause deadlocks where two nodes end up waiting for each other. There can be several layers of directory dependencies across many nodes. This patch fixes the problem by acquiring all gfs2_rename's inode glocks asychronously and waiting for all glocks to be acquired. That way all inodes are locked regardless of the order. The timeout value for multiple asynchronous glocks is calculated to be the total of the individual wait times for each glock times two. Since gfs2_exchange is very similar to gfs2_rename, both functions are patched in the same way. A new async glock wait queue, sd_async_glock_wait, keeps a list of waiters for these events. If gfs2's holder_wake function detects an async holder, it wakes up any waiters for the event. The waiter only tests whether any of its requests are still pending. Since the glocks are sent to dlm asychronously, the wait function needs to check to see which glocks, if any, were granted. If a glock is granted by dlm (and therefore held), its minimum hold time is checked and adjusted as necessary, as other glock grants do. If the event times out, all glocks held thus far must be dequeued to resolve any existing deadlocks. Then, if there are any outstanding locking requests, we need to loop around and wait for dlm to respond to those requests too. After we release all requests, we return -ESTALE to the caller (vfs rename) which loops around and retries the request. Node1 Node2 --------- --------- 1. Enqueue A Enqueue B 2. Enqueue B Enqueue A 3. A granted 6. B granted 7. Wait for B 8. Wait for A 9. A times out (since Node 1 holds A) 10. Dequeue B (since it was granted) 11. Wait for all requests from DLM 12. B Granted (since Node2 released it in step 10) 13. Rename 14. Dequeue A 15. DLM Grants A 16. Dequeue A (due to the timeout and since we no longer have B held for our task). 17. Dequeue B 18. Return -ESTALE to vfs 19. VFS retries the operation, goto step 1. This release-all-locks / acquire-all-locks may slow rename / exchange down as both nodes struggle in the same way and do the same thing. However, this will only happen when there is contention for the same inodes, which ought to be rare. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
01123cf1 |
|
29-Aug-2019 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: create function gfs2_glock_update_hold_time This patch moves the code that updates glock minimum hold time to a separate function. This will be called by a future patch. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
98fb0574 |
|
13-Aug-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Fix possible fs name overflows This patch fixes three places in which temporary character buffers could overflow due to the addition of the file system id from patch 3792ce973f07. Thanks to Dan Carpenter for pointing it out. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
3792ce97 |
|
09-May-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: dump fsid when dumping glock problems Before this patch, if a glock error was encountered, the glock with the problem was dumped. But sometimes you may have lots of file systems mounted, and that doesn't tell you which file system it was for. This patch adds a new boolean parameter fsid to the dump_glock family of functions. For non-error cases, such as dumping the glocks debugfs file, the fsid is not dumped in order to keep lock dumps and glocktop as clean as possible. For all error cases, such as GLOCK_BUG_ON, the file system id is now printed. This will make it easier to debug. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
04aea0ca |
|
07-May-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN Before this patch, the superblock flag indicating when a file system is withdrawn was called SDF_SHUTDOWN. This patch simply renames it to the more obvious SDF_WITHDRAWN. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
15a798f7 |
|
05-Jun-2019 |
Kefeng Wang <wangkefeng.wang@huawei.com> |
gfs2: Use IS_ERR_OR_NULL Use IS_ERR_OR_NULL where appropriate. (Several more places converted by Andreas.) Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
638803d4 |
|
06-Jun-2019 |
Bob Peterson <rpeterso@redhat.com> |
Revert "gfs2: Replace gl_revokes with a GLF flag" Commit 73118ca8baf7 introduced a glock reference counting bug in gfs2_trans_remove_revoke. Given that, replacing gl_revokes with a GLF flag is no longer useful, so revert that commit. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
7336d0e6 |
|
31-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 398 Based on 1 normalized pattern(s): this copyrighted material is made available to anyone wishing to use modify copy or redistribute it subject to the terms and conditions of the gnu general public license version 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 44 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531081038.653000175@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
73118ca8 |
|
04-Apr-2019 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Replace gl_revokes with a GLF flag The gl_revokes value determines how many outstanding revokes a glock has on the superblock revokes list; this is used to avoid unnecessary log flushes. However, gl_revokes is only ever tested for being zero, and it's only decremented in revoke_lo_after_commit, which removes all revokes from the list, so we know that the gl_revoke values of all the glocks on the list will reach zero. Therefore, we can replace gl_revokes with a bit flag. This saves an atomic counter in struct gfs2_glock. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
9287c645 |
|
04-Apr-2019 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix occasional glock use-after-free This patch has to do with the life cycle of glocks and buffers. When gfs2 metadata or journaled data is queued to be written, a gfs2_bufdata object is assigned to track the buffer, and that is queued to various lists, including the glock's gl_ail_list to indicate it's on the active items list. Once the page associated with the buffer has been written, it is removed from the ail list, but its life isn't over until a revoke has been successfully written. So after the block is written, its bufdata object is moved from the glock's gl_ail_list to a file-system-wide list of pending revokes, sd_log_le_revoke. At that point the glock still needs to track how many revokes it contributed to that list (in gl_revokes) so that things like glock go_sync can ensure all the metadata has been not only written, but also revoked before the glock is granted to a different node. This is to guarantee journal replay doesn't replay the block once the glock has been granted to another node. Ross Lagerwall recently discovered a race in which an inode could be evicted, and its glock freed after its ail list had been synced, but while it still had unwritten revokes on the sd_log_le_revoke list. The evict decremented the glock reference count to zero, which allowed the glock to be freed. After the revoke was written, function revoke_lo_after_commit tried to adjust the glock's gl_revokes counter and clear its GLF_LFLUSH flag, at which time it referenced the freed glock. This patch fixes the problem by incrementing the glock reference count in gfs2_add_revoke when the glock's first bufdata object is moved from the glock to the global revokes list. Later, when the glock's last such bufdata object is freed, the reference count is decremented. This guarantees that whichever process finishes last (the revoke writing or the evict) will properly free the glock, and neither will reference the glock after it has been freed. Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
7881ef3f |
|
27-Mar-2019 |
Ross Lagerwall <ross.lagerwall@citrix.com> |
gfs2: Fix lru_count going negative Under certain conditions, lru_count may drop below zero resulting in a large amount of log spam like this: vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \ negative objects to delete nr=-1 This happens as follows: 1) A glock is moved from lru_list to the dispose list and lru_count is decremented. 2) The dispose function calls cond_resched() and drops the lru lock. 3) Another thread takes the lru lock and tries to add the same glock to lru_list, checking if the glock is on an lru list. 4) It is on a list (actually the dispose list) and so it avoids incrementing lru_count. 5) The glock is moved to lru_list. 5) The original thread doesn't dispose it because it has been re-added to the lru list but the lru_count has still decreased by one. Fix by checking if the LRU flag is set on the glock rather than checking if the glock is on some list and rearrange the code so that the LRU flag is added/removed precisely when the glock is added/removed from lru_list. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
605b0487 |
|
06-Mar-2019 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix missed wakeups in find_insert_glock Mark Syms has reported seeing tasks that are stuck waiting in find_insert_glock. It turns out that struct lm_lockname contains four padding bytes on 64-bit architectures that function glock_waitqueue doesn't skip when hashing the glock name. As a result, we can end up waking up the wrong waitqueue, and the waiting tasks may be stuck forever. Fix that by using ht_parms.key_len instead of sizeof(struct lm_lockname) for the key length. Reported-by: Mark Syms <mark.syms@citrix.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
2abbf9a4 |
|
22-Jan-2019 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
gfs: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. There is no need to save the dentries for the debugfs files, so drop those variables to save a bit of space and make the code simpler. Cc: Bob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: cluster-devel@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
27a2660f |
|
18-Apr-2018 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Dump nrpages for inodes and their glocks This patch is based on an idea from Steve Whitehouse. The idea is to dump the number of pages for inodes in the glock dumps. The additional locking required me to drop const from quite a few places. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
e54c78a2 |
|
03-Oct-2018 |
Bob Peterson <rpeterso@redhat.com> |
gfs2: Use fs_* functions instead of pr_* function where we can Before this patch, various errors and messages were reported using the pr_* functions: pr_err, pr_warn, pr_info, etc., but that does not tell you which gfs2 mount had the problem, which is often vital to debugging. This patch changes the calls from pr_* to fs_* in most of the messages so that the file system id is printed along with the message. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
6da2ec56 |
|
12-Jun-2018 |
Kees Cook <keescook@chromium.org> |
treewide: kmalloc() -> kmalloc_array() The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(char) * COUNT + COUNT , ...) | kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) | kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) | kmalloc(sizeof(TYPE) * C2, ...) | kmalloc(C1 * C2 * C3, ...) | kmalloc(C1 * C2, ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
|
#
3fd5d3ad |
|
27-Mar-2018 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Stop using rhashtable_walk_peek Function rhashtable_walk_peek is problematic because there is no guarantee that the glock previously returned still exists; when that key is deleted, rhashtable_walk_peek can end up returning a different key, which will cause an inconsistent glock dump. Fix this by keeping track of the current glock in the seq file iterator functions instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
7ac07fda |
|
08-Jan-2018 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Glock dump performance regression fix Restore an optimization removed in commit 7f19449553 "Fix debugfs glocks dump": keep the glock hash table iterator active while the glock dump file is held open. This avoids having to rescan the hash table from the start for each read, with quadratically rising runtime. In addition, use rhastable_walk_peek for resuming a glock dump at the current position: when a glock doesn't fit in the provided buffer anymore, the next read must revisit the same glock. Finally, also restart the dump from the first entry when we notice that the hash table has been resized in gfs2_glock_seq_start. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
97a6ec4a |
|
04-Dec-2017 |
Tom Herbert <tom@quantonium.net> |
rhashtable: Change rhashtable_walk_start to return void Most callers of rhashtable_walk_start don't care about a resize event which is indicated by a return value of -EAGAIN. So calls to rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something like this is common: ret = rhashtable_walk_start(rhiter); if (ret && ret != -EAGAIN) goto out; Since zero and -EAGAIN are the only possible return values from the function this check is pointless. The condition never evaluates to true. This patch changes rhashtable_walk_start to return void. This simplifies code for the callers that ignore -EAGAIN. For the few cases where the caller cares about the resize event, particularly where the table can be walked in mulitple parts for netlink or seq file dump, the function rhashtable_walk_start_check has been added that returns -EAGAIN on a resize event. Signed-off-by: Tom Herbert <tom@quantonium.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
10201655 |
|
19-Sep-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix debugfs glocks dump The switch to rhashtables (commit 88ffbf3e03) broke the debugfs glock dump (/sys/kernel/debug/gfs2/<device>/glocks) for dumps bigger than a single buffer: the right function for restarting an rhashtable iteration from the beginning of the hash table is rhashtable_walk_enter; rhashtable_walk_stop + rhashtable_walk_start will just resume from the current position. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Cc: stable@vger.kernel.org # v4.3+
|
#
d296b15e |
|
30-Aug-2017 |
Arvind Yadav <arvind.yadav.cs@gmail.com> |
gfs2: constify rhashtable_params rhashtable_params are not supposed to change at runtime. All Functions rhashtable_* working with const rhashtable_params provided by <linux/rhashtable.h>. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
27c3b415 |
|
18-Aug-2017 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Fix up some sparse warnings This patch cleans up various pieces of GFS2 to avoid sparse errors. This doesn't fix them all, but it fixes several. The first error, in function glock_hash_walk was a genuine bug where the rhashtable could be started and not stopped. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
a91323e2 |
|
04-Aug-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Clean up waiting on glocks The prepare_to_wait_on_glock and finish_wait_on_glock functions introduced in commit 56a365be "gfs2: gfs2_glock_get: Wait on freeing glocks" are better removed, resulting in cleaner code. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
71c1b213 |
|
01-Aug-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: gfs2_evict_inode: Put glocks asynchronously gfs2_evict_inode is called to free inodes under memory pressure. The function calls into DLM when an inode's last cluster-wide reference goes away (remote unlink) and to release the glock and associated DLM lock before finally destroying the inode. However, if DLM is blocked on memory to become available, calling into DLM again will deadlock. Avoid that by decoupling releasing glocks from destroying inodes in that case: with gfs2_glock_queue_put, glocks will be dequeued asynchronously in work queue context, when the associated inodes have likely already been destroyed. With this change, inodes can end up being unlinked, remote-unlink can be triggered, and then the inode can be reallocated before all remote-unlink callbacks are processed. To detect that, revalidate the link count in gfs2_evict_inode to make sure we're not deleting an allocated, referenced inode. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
0515480a |
|
01-Aug-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: gfs2_glock_get: Wait on freeing glocks Keep glocks in their hash table until they are freed instead of removing them when their last reference is dropped. This allows to wait for any previous instances of a glock to go away in gfs2_glock_get before creating a new glocks. Special thanks to Andy Price for finding and fixing a problem which also required us to delete the rcu_read_unlock from the error case in function gfs2_glock_get. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
645ebd49 |
|
26-Jul-2017 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Don't waste time locking lru_lock for non-lru glocks Before this patch, glock_dq would call gfs2_glock_remove_from_lru. For glocks that are never put on the LRU, such as the transaction glock, this just takes the spin_lock, determines there's nothing to be done because the list is empty, then unlocks again. This was causing unnecessary lock contention on the lru_lock spin_lock. This patch adds a check for GLOF_LRU in the glops before taking the spin_lock. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
961ae1d8 |
|
07-Jul-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix glock rhashtable rcu bug Before commit 88ffbf3e03 "GFS2: Use resizable hash table for glocks", glocks were freed via call_rcu to allow reading the glock hashtable locklessly using rcu. This was then changed to free glocks immediately, which made reading the glock hashtable unsafe. Bring back the original code for freeing glocks via call_rcu. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Cc: stable@vger.kernel.org # 4.3+
|
#
6b0c7440 |
|
30-Jun-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Clean up glock work enqueuing This patch adds a standardized queueing mechanism for glock work with spin_lock protection to prevent races. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
ed17545d |
|
05-May-2017 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Allow glocks to be unlocked after withdraw This bug fixes a regression introduced by patch 0d1c7ae9d8. The intent of the patch was to stop promoting glocks after a file system is withdrawn due to a variety of errors, because doing so results in a BUG(). (You should be able to unmount after a withdraw rather than having the kernel panic.) Unfortunately, it also stopped demotions, so glocks could not be unlocked after withdraw, which means the unmount would hang. This patch allows function do_xmote to demote locks to an unlocked state after a withdraw, but not promote them. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
0a52aba7 |
|
21-Feb-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Switch to rhashtable_lookup_get_insert_fast Switch from rhashtable_lookup_insert_fast + rhashtable_lookup_fast to rhashtable_lookup_get_insert_fast, which is cleaner and avoids an extra rhashtable lookup. At the same time, turn the retry loop in gfs2_glock_get into an infinite loop. The lookup or insert will eventually succeed, usually very fast, but there is no reason to give up trying at a fixed number of iterations. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
972b044e |
|
16-Mar-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Don't pack struct lm_lockname As per a suggestion by Linus, don't pack struct lm_lockname: we did that because the struct is used as a rhashtable key, but packing tells the compiler that the 64-bit fields in the struct may be unaligned, causing it to generate worse code on some architectures. Instead, rearrange the fields in the struct so that there is no padding between fields, and exclude any tail padding from the hash key size. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
92ecd73a |
|
09-Mar-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Deduplicate gfs2_{glocks,glstats}_open Both functions are identical except for the seq_operations used. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
cc37a627 |
|
09-Mar-2017 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Replace rhashtable_walk_init with rhashtable_walk_enter Function rhashtable_walk_init is deprecated. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
0d1c7ae9 |
|
02-Mar-2017 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Prevent BUG from occurring when normal Withdraws occur When the GFS2 file system withdraws due to metadata corruption, it often has outstanding transactions in the journal and delayed work queued for its glocks. This patch adds some new checks for a withdrawn file system before proceeding with operations that would obviously cause a BUG() to be triggered. That allows GFS2 to be safely unmounted rather than cause the system to go down. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
f38e5fb9 |
|
21-Feb-2017 |
Andrew Price <anprice@redhat.com> |
gfs2: Add missing rcu locking for glock lookup We must hold the rcu read lock across looking up glocks and trying to bump their refcount to prevent the glocks from being freed in between. Cc: <stable@vger.kernel.org> # 4.3+ Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
98687f42 |
|
11-Feb-2017 |
Herbert Xu <herbert@gondor.apana.org.au> |
gfs2: Use rhashtable walk interface in glock_hash_walk The function glock_hash_walk walks the rhashtable by hand. This is broken because if it catches the hash table in the middle of a rehash, then it will miss entries. This patch replaces the manual walk by using the rhashtable walk interface. Fixes: 88ffbf3e037e ("GFS2: Use resizable hash table for glocks") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
bf3f14d6 |
|
15-Feb-2017 |
David S. Miller <davem@davemloft.net> |
rhashtable: Revert nested table changes. This reverts commits: 6a25478077d987edc5e2f880590a2bc5fcab4441 9dbbfb0ab6680c6a85609041011484e6658e7d3c 40137906c5f55c252194ef5834130383e639536f It's too risky to put in this late in the release cycle. We'll put these changes into the next merge window instead. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6a254780 |
|
11-Feb-2017 |
Herbert Xu <herbert@gondor.apana.org.au> |
gfs2: Use rhashtable walk interface in glock_hash_walk The function glock_hash_walk walks the rhashtable by hand. This is broken because if it catches the hash table in the middle of a rehash, then it will miss entries. This patch replaces the manual walk by using the rhashtable walk interface. Fixes: 88ffbf3e037e ("GFS2: Use resizable hash table for glocks") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8b0e1953 |
|
24-Dec-2016 |
Thomas Gleixner <tglx@linutronix.de> |
ktime: Cleanup ktime_set() usage ktime_set(S,N) was required for the timespec storage type and is still useful for situations where a Seconds and Nanoseconds part of a time value needs to be converted. For anything where the Seconds argument is 0, this is pointless and can be replaced with a simple assignment. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
|
#
7c0f6ba6 |
|
24-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
14d37564 |
|
14-Dec-2016 |
Dan Carpenter <dan.carpenter@oracle.com> |
GFS2: Fix reference to ERR_PTR in gfs2_glock_iter_next This patch fixes a place where function gfs2_glock_iter_next can reference an invalid error pointer. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
e0d735c1 |
|
20-Sep-2016 |
Chao Yu <chao@kernel.org> |
gfs2: fix to detect failure of register_shrinker register_shrinker can fail after commit 1d3d4437eae1 ("vmscan: per-node deferred work"), we should detect the failure of it, otherwise we may fail to register shrinker after gfs2 module was been inited successfully. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
47a9a527 |
|
01-Aug-2016 |
Fabian Frederick <fabf@skynet.be> |
GFS2: use BIT() macro Replace 1 << value shift by more explicit BIT() macro Also fixes two bare unsigned definitions: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned hsize = BIT(ip->i_depth); Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
6df9f9a2 |
|
17-Jun-2016 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Lock holder cleanup Make the code more readable by cleaning up the different ways of initializing lock holders and checking for initialized lock holders: mark lock holders as uninitialized by setting the holder's glock to NULL (gfs2_holder_mark_uninitialized) instead of zeroing out the entire object or using a separate flag. Recognize initialized holders by their non-NULL glock (gfs2_holder_initialized). Don't zero out holder objects which are immeditiately initialized via gfs2_holder_init or gfs2_glock_nq_init. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
ec5ec66b |
|
13-Jun-2016 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Get rid of gfs2_ilookup Now that gfs2_lookup_by_inum only takes the inode glock for new inodes (and not for cached inodes anymore), there no longer is a need to optimize the cached-inode case in gfs2_get_dentry or delete_work_func, and gfs2_ilookup can be removed. In addition, gfs2_get_dentry wasn't checking the GFS2_DIF_SYSTEM flag in i_diskflags in the gfs2_ilookup case (see gfs2_lookup_by_inum); this inconsistency goes away as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
3ce37b2c |
|
13-Jun-2016 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Fix gfs2_lookup_by_inum lock inversion The current gfs2_lookup_by_inum takes the glock of a presumed inode identified by block number, verifies that the block is indeed an inode, and then instantiates and reads the new inode via gfs2_inode_lookup. However, instantiating a new inode may block on freeing a previous instance of that inode (__wait_on_freeing_inode), and freeing an inode requires to take the glock already held, leading to lock inversion and deadlock. Fix this by first instantiating the new inode, then verifying that the block is an inode (if required), and then reading in the new inode, all in gfs2_inode_lookup. If the block we are looking for is not an inode, we discard the new inode via iget_failed, which marks inodes as bad and unhashes them. Other tasks waiting on that inode will get back a bad inode back from ilookup or iget_locked; in that case, retry the lookup. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
a527b38e |
|
11-Apr-2016 |
Denys Vlasenko <dvlasenk@redhat.com> |
GFS2: fs/gfs2/glock.c: Deinline do_error, save 1856 bytes This function compiles to 522 bytes of machine code. Error paths are not very time critical. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
8f6fd83c |
|
02-Mar-2016 |
Bob Copeland <me@bobcopeland.com> |
rhashtable: accept GFP flags in rhashtable_walk_init In certain cases, the 802.11 mesh pathtable code wants to iterate over all of the entries in the forwarding table from the receive path, which is inside an RCU read-side critical section. Enable walks inside atomic sections by allowing GFP_ATOMIC allocations for the walker state. Change all existing callsites to pass in GFP_KERNEL. Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Bob Copeland <me@bobcopeland.com> [also adjust gfs2/glock.c and rhashtable tests] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
#
3e11e530 |
|
23-Mar-2016 |
Benjamin Marzinski <bmarzins@redhat.com> |
GFS2: ignore unlock failures after withdraw After gfs2 has withdrawn the filesystem, it may still have many locks not in the unlocked state. If it is using lock_dlm, it will failed trying the unlocks since it has already unmounted the lock manager. Instead, it should set the SDF_SKIP_DLM_UNLOCK flag on withdraw, to signal that it can skip the lock_manager on unlocks, and failback to lock_nolock style unlocking. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
ff34245d |
|
27-Mar-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Don't filter out I_FREEING inodes anymore This patch basically reverts a very old patch from 2008, 7a9f53b3c1875bef22ad4588e818bc046ef183da, with the title "Alternate gfs2_iget to avoid looking up inodes being freed". The original patch was designed to avoid a deadlock caused by lock ordering with try_rgrp_unlink. The patch forced the function to not find inodes that were being removed by VFS. The problem is, that made it impossible for nodes to delete their own unlinked dinodes after a certain point in time, because the inode needed was not found by this filtering process. There is no longer a need for the patch, since function try_rgrp_unlink no longer locks the inode: All it does is queue the glock onto the delete work_queue, so there should be no more deadlock. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a4923865 |
|
07-Dec-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Prevent delete work from occurring on glocks used for create This patch tries to prevent delete work (queued via iopen callback) from executing if the glock is currently being used to create a new inode. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7508abc4 |
|
18-Dec-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Check if iopen is held when deleting inode This patch fixes an error condition in which an inode is partially created in gfs2_create_inode() but then some error is discovered, which causes it to fail and call iput() before the iopen glock is created or held. In that case, gfs2_delete_inode would try to unlock an iopen glock that doesn't yet exist. Therefore, we test its holder (which must exist) for the HIF_HOLDER bit before trying to dq it. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
2aba1b5b |
|
19-May-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Reintroduce a timeout in function gfs2_gl_hash_clear At some point in the past, we used to have a timeout when GFS2 was unmounting, trying to clear out its glocks. If the timeout expires, it would dump the remaining glocks to the kernel messages so that developers can debug the problem. That timeout was eliminated, probably by accident. This patch reintroduces it. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
b58bf407 |
|
24-Jul-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Reduce size of incore inode This patch makes no functional changes. Its goal is to reduce the size of the gfs2 inode in memory by rearranging structures and changing the size of some variables within the structure. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
3dd1dd8c |
|
12-Nov-2015 |
Andrew Price <anprice@redhat.com> |
GFS2: Use rht_for_each_entry_rcu in glock_hash_walk This lockdep splat was being triggered on umount: [55715.973122] =============================== [55715.980169] [ INFO: suspicious RCU usage. ] [55715.981021] 4.3.0-11553-g8d3de01-dirty #15 Tainted: G W [55715.982353] ------------------------------- [55715.983301] fs/gfs2/glock.c:1427 suspicious rcu_dereference_protected() usage! The code it refers to is the rht_for_each_entry_safe usage in glock_hash_walk. The condition that triggers the warning is lockdep_rht_bucket_is_held(tbl, hash) which is checked in the __rcu_dereference_protected macro. The rhashtable buckets are not changed in glock_hash_walk so it's safe to rely on the rcu protection. Replace the rht_for_each_entry_safe() usage with rht_for_each_entry_rcu(), which doesn't care whether the bucket lock is held if the rcu read lock is held. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
f3dd1649 |
|
29-Oct-2015 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Remove gl_spin define Commit e66cf161 replaced the gl_spin spinlock in struct gfs2_glock with a gl_lockref lockref and defined gl_spin as gl_lockref.lock (the spinlock in gl_lockref). Remove that define to make the references to gl_lockref.lock more obvious. Signed-off-by: Andreas Gruenbacher <andreas.gruenbacher@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
8f7e0a80 |
|
27-Aug-2015 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: A minor "sbstats" cleanup It seems cleaner to avoid the temporary value here. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
4d207133 |
|
26-Aug-2015 |
Ben Hutchings <ben@decadent.org.uk> |
gfs2: Make statistics unsigned, suitable for use with do_div() None of these statistics can meaningfully be negative, and the numerator for do_div() must have the type u64. The generic implementation of do_div() used on some 32-bit architectures asserts that, resulting in a compiler error in gfs2_rgrp_congested(). Fixes: 0166b197c2ed ("GFS2: Average in only non-zero round-trip times ...") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Andreas Gruenbacher <agruenba@redhat.com>
|
#
88ffbf3e |
|
16-Mar-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Use resizable hash table for glocks This patch changes the glock hash table from a normal hash table to a resizable hash table, which scales better. This also simplifies a lot of code. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
15562c43 |
|
16-Mar-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Move glock superblock pointer to field gl_name What uniquely identifies a glock in the glock hash table is not gl_name, but gl_name and its superblock pointer. This patch makes the gl_name field correspond to a unique glock identifier. That will allow us to simplify hashing with a future patch, since the hash algorithm can then take the gl_name and hash its components in one operation. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
81648d04 |
|
27-Aug-2015 |
Andreas Gruenbacher <agruenba@redhat.com> |
gfs2: Simplify the seq file code for "sbstats" Don't use struct gfs2_glock_iter as the helper data structure for iterating through "sbstats"; we are not iterating through glocks here. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
e7ccaf5f |
|
12-Jun-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Don't add all glocks to the lru The glocks used for resource groups often come and go hundreds of thousands of times per second. Adding them to the lru list just adds unnecessary contention for the lru_lock spin_lock, especially considering we're almost certainly going to re-use the glock and take it back off the lru microseconds later. We never want the glock shrinker to cull them anyway. This patch adds a new bit in the glops that determines which glock types get put onto the lru list and which ones don't. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7b4ddfa7 |
|
24-Mar-2015 |
Chengyu Song <csong84@gatech.edu> |
gfs2: incorrect check for debugfs returns debugfs_create_dir and debugfs_create_file may return -ENODEV when debugfs is not configured, so the return value should be checked against ERROR_VALUE as well, otherwise the later dereference of the dentry pointer would crash the kernel. Signed-off-by: Chengyu Song <csong84@gatech.edu> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b83ae6d4 |
|
14-Jan-2015 |
Christoph Hellwig <hch@lst.de> |
fs: remove mapping->backing_dev_info Now that we never use the backing_dev_info pointer in struct address_space we can simply remove it and save 4 to 8 bytes in every inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
8f6cb409 |
|
05-Jan-2015 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Eliminate __gfs2_glock_remove_from_lru Since the only caller of function __gfs2_glock_remove_from_lru locks the same spin_lock as gfs2_glock_remove_from_lru, the functions can be combined. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
30badc95 |
|
18-Nov-2014 |
Markus Elfring <elfring@users.sourceforge.net> |
GFS2: Deletion of unnecessary checks before two function calls The functions iput() and put_pid() test whether their argument is NULL and then return immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d29c0afe |
|
03-Oct-2014 |
Fabian Frederick <fabf@skynet.be> |
GFS2: use _RET_IP_ instead of (unsigned long)__builtin_return_address(0) use macro definition Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
5bef3e7c |
|
26-Jun-2014 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Allow flocks to use normal glock dq rather than dq_wait This patch allows flock glocks to use a non-blocking dequeue rather than dq_wait. It also reverts the previous patch I had posted regarding dq_wait. The reverted patch isn't necessarily a bad idea, but I decided this might avoid unforeseen side effects, and was therefore safer. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fe0bbd29 |
|
23-Jun-2014 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Use GFP_NOFS when allocating glocks Normally GFP_KERNEL is ok here, but there is now a rarely used code path relating to deallocation of unlinked inodes (in certain corner cases) which if hit at times of memory shortage can cause recursion while trying to free memory. One solution would be to try and move the gfs2_glock_get() call so that it is no longer called while another glock is held, but that doesn't look at all easy, so GFP_NOFS is the best solution for the time being. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
94a09a39 |
|
23-Jun-2014 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix race in glock lru glock disposal We must not leave items on the LRU list with GLF_LOCK set, since they can be removed if the glock is brought back into use, which may then potentially result in a hang, waiting for GLF_LOCK to clear. It doesn't happen very often, since it requires a glock that has not been used for a long time to be brought back into use at the same moment that the shrinker is part way through disposing of glocks. The fix is to set GLF_LOCK at a later time, when we already know that the other locks can be obtained. Also, we now only release the lru_lock in case a resched is needed, rather than on every iteration. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
79272b35 |
|
20-Jun-2014 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Only wait for demote when last holder is dequeued Function gfs2_glock_dq_wait is supposed to dequeue a glock and then wait for the lock to be demoted. The problem is, if this is a shared lock, its demote will depend on the other holders, which means you might end up waiting forever because the other process is blocked. This problem is especially apparent when dealing with nested flocks. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
74316201 |
|
06-Jul-2014 |
NeilBrown <neilb@suse.de> |
sched: Remove proliferation of wait_on_bit() action functions The current "wait_on_bit" interface requires an 'action' function to be provided which does the actual waiting. There are over 20 such functions, many of them identical. Most cases can be satisfied by one of just two functions, one which uses io_schedule() and one which just uses schedule(). So: Rename wait_on_bit and wait_on_bit_lock to wait_on_bit_action and wait_on_bit_lock_action to make it explicit that they need an action function. Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io which are *not* given an action function but implicitly use a standard one. The decision to error-out if a signal is pending is now made based on the 'mode' argument rather than being encoded in the action function. All instances of the old wait_on_bit and wait_on_bit_lock which can use the new version have been changed accordingly and their action functions have been discarded. wait_on_bit{_lock} does not return any specific error code in the event of a signal so the caller must check for non-zero and interpolate their own error code as appropriate. The wait_on_bit() call in __fscache_wait_on_invalidate() was ambiguous as it specified TASK_UNINTERRUPTIBLE but used fscache_wait_bit_interruptible as an action function. David Howells confirms this should be uniformly "uninterruptible" The main remaining user of wait_on_bit{,_lock}_action is NFS which needs to use a freezer-aware schedule() call. A comment in fs/gfs2/glock.c notes that having multiple 'action' functions is useful as they display differently in the 'wchan' field of 'ps'. (and /proc/$PID/wchan). As the new bit_wait{,_io} functions are tagged "__sched", they will not show up at all, but something higher in the stack. So the distinction will still be visible, only with different function names (gds2_glock_wait versus gfs2_glock_dq_wait in the gfs2/glock.c case). Since first version of this patch (against 3.15) two new action functions appeared, on in NFS and one in CIFS. CIFS also now uses an action function that makes the same freezer aware schedule call as NFS. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: David Howells <dhowells@redhat.com> (fscache, keys) Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2) Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
4e857c58 |
|
17-Mar-2014 |
Peter Zijlstra <peterz@infradead.org> |
arch: Mass conversion of smp_mb__*() Mostly scripted conversion of the smp_mb__* barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
01b172b7 |
|
12-Mar-2014 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Ensure workqueue is scheduled after noexp request This patch closes a small timing window whereby a request to hold the transaction glock can get stuck. The problem is that after the DLM has granted the lock, it can get into a state whereby it doesn't transition the glock to a held state, due to not having requeued the glock state machine to finish the transition. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d77d1b58 |
|
06-Mar-2014 |
Joe Perches <joe@perches.com> |
GFS2: Use pr_<level> more consistently Add pr_fmt, remove embedded "GFS2: " prefixes. This now consistently emits lower case "gfs2: " for each message. Other miscellanea around these changes: o Add missing newlines o Coalesce formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fc554ed3 |
|
05-Mar-2014 |
Fabian Frederick <fabf@skynet.be> |
GFS2: global conversion to pr_foo() -All printk(KERN_foo converted to pr_foo(). -Messages updated to fit in 80 columns. -fs_macros converted as well. -fs_printk removed. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
ac3beb6a |
|
16-Jan-2014 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Don't use ENOBUFS when ENOMEM is the correct error code Al Viro has tactfully pointed out that we are using the incorrect error code in some cases. This patch fixes that, and also removes the (unused) return value for glock dumping. > * gfs2_iget() - ENOBUFS instead of ENOMEM. ENOBUFS is > "No buffer space available (POSIX.1 (XSI STREAMS option))" and since > we don't support STREAMS it's probably fair game, but... what the hell? Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk>
|
#
0b3a2c99 |
|
02-Jan-2014 |
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> |
GFS2: Fix unsafe dereference in dump_holder() GLOCK_BUG_ON() might call this function without RCU read lock. Make sure that RCU read lock is held when using task_struct returned from pid_task(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e3c4269d |
|
12-Nov-2013 |
Michal Nazarewicz <mina86@mina86.com> |
GFS2: fix potential NULL pointer dereference Commit [e66cf1610: GFS2: Use lockref for glocks] replaced call: atomic_read(&gi->gl->gl_ref) == 0 with: __lockref_is_dead(&gl->gl_lockref) therefore changing how gl is accessed, from gi->gl to plan gl. However, gl can be a NULL pointer, and so gi->gl needs to be used instead (which is guaranteed not to be NULL because fo the while loop checking that condition). Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e66cf161 |
|
15-Oct-2013 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Use lockref for glocks Currently glocks have an atomic reference count and also a spinlock which covers various internal fields, such as the state. This intent of this patch is to replace the spinlock and the atomic reference count with a lockref structure. This contains a spinlock which we can continue to use as before, and a reference counter which is used in conjuction with the spinlock to replace the previous atomic counter. As a result of this there are some new rules for reference counting on glocks. We need to distinguish between reference count changes under gl_spin (which are now just increment or decrement of the new counter, provided the count cannot hit zero) and those which are outside of gl_spin, but which now take gl_spin internally. The conversion is relatively straight forward. There is probably some further clean up which can be done, but the priority at this stage is to make the change in as simple a manner as possible. A consequence of this change is that the reference count is being decoupled from the lru list processing. This should allow future adoption of the lru_list code with glocks in due course. The reason for using the "dead" state and not just relying on 0 being the "invalid state" is so that in due course 0 ref counts can be allowable. The intent is to eventually be able to remove the ref count changes which are currently hidden away in state_change(). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
1ab6c499 |
|
27-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
fs: convert fs shrinkers to new scan/count API Convert the filesystem shrinkers to use the new API, and standardise some of the behaviours of the shrinkers at the same time. For example, nr_to_scan means the number of objects to scan, not the number of objects to free. I refactored the CIFS idmap shrinker a little - it really needs to be broken up into a shrinker per tree and keep an item count with the tree root so that we don't need to walk the tree every time the shrinker needs to count the number of objects in the tree (i.e. all the time under memory pressure). [glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree] [assorted fixes folded in] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Glauber Costa <glommer@openvz.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
55f841ce |
|
27-Aug-2013 |
Glauber Costa <glommer@openvz.org> |
super: fix calculation of shrinkable objects for small numbers The sysctl knob sysctl_vfs_cache_pressure is used to determine which percentage of the shrinkable objects in our cache we should actively try to shrink. It works great in situations in which we have many objects (at least more than 100), because the aproximation errors will be negligible. But if this is not the case, specially when total_objects < 100, we may end up concluding that we have no objects at all (total / 100 = 0, if total < 100). This is certainly not the biggest killer in the world, but may matter in very low kernel memory situations. Signed-off-by: Glauber Costa <glommer@openvz.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
068213f7 |
|
25-Jul-2013 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Remove unnecessary memory barrier Function test_and_clear_bit implies a memory barrier, so subsequent memory barriers are unnecessary. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7286b31e |
|
20-Aug-2013 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Take glock reference in examine_bucket() We need to check the glock ref counter in a race free way in order to ensure that the gfs2_glock_hold() call will succeed. The easiest way to do that is to simply take the reference count early in the common code of examine_bucket, skipping any glocks with zero ref count. That means that the examiner functions all need to put their reference on the glock once they've performed their function. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: David Teigland <teigland@redhat.com> Tested-by: David Teigland <teigland@redhat.com>
|
#
dfc4616d |
|
15-Aug-2013 |
Dan Carpenter <dan.carpenter@oracle.com> |
GFS2: alloc_workqueue() doesn't return an ERR_PTR alloc_workqueue() returns a NULL on error, it doesn't return an ERR_PTR. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7af584d3 |
|
12-Dec-2012 |
Joe Perches <joe@perches.com> |
gfs2: Convert print_symbol to %pSR Use the new vsprintf extension to avoid any possible message interleaving. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
#
222cb538 |
|
24-Apr-2013 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Flush work queue before clearing glock hash tables There was a timing window when a GFS2 file system was unmounted that caused GFS2 to call BUG() and panic the kernel. The call to BUG() is meant to ensure that the glock reference count, gl_ref, never gets down to zero and bounce back up again. What was happening during umount is that function gfs2_put_super was dequeing its glocks for well-known files. In particular, we saw it on the journal glock, sd_jinode_gh. The dequeue caused delayed work to be queued for the glock state machine, to transition the lock to an "unlocked" state. While the work was still queued, gfs2_put_super called gfs2_gl_hash_clear to clear out the glock hash tables. If the timing was just so, the glock work function would drop the reference count at the time when it was being checked for zero, and that caused BUG() to be called. This patch calls flush_workqueue before clearing the glock hash tables, thereby ensuring that the delayed work is executed before the hash tables are cleared, and therefore the reference count never goes to zero until the glock is cleared. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7bd8b2eb |
|
10-Apr-2013 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Add origin indicator to glock demote tracing This adds the origin indicator to the trace point for glock demotion, so that it is possible to see where demote requests have come from. Note that requests generated from the demote_rq sysfs interface will show as remote, since they are intended to replicate exactly the effect of a demote reuqest from a remote node. It is still possible to tell these apart by looking at the process which initiated the demote request. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
81ffbf65 |
|
10-Apr-2013 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Add origin indicator to glock callbacks This patch adds a bool indicating whether the demote request was originated locally or remotely. This is then used by the iopen ->go_callback() to make 100% sure that it will only respond to remote callbacks. Since ->evict_inode() uses GL_NOCACHE when it attempts to get an exclusive lock on the iopen lock, this may result in extra scheduling of the workqueue in case that the exclusive promotion request failed. This patch prevents that from happening. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
28fb3027 |
|
26-Feb-2013 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Remove gfs2_refresh_inode from inode creation path The original method for creating inodes used in GFS2 was to fill out a buffer, with all the information, and then to read that buffer into the in-core inode, using gfs2_refresh_inode() The problem with this approach is that all the inode's fields need to be calculated ahead of time, and were stored in various variables making the code rather complicated. The new approach is simply to allocate the in-core inode earlier and fill in as many fields as possible ahead of time. These can then be used to initilise the on disk representation. The code has been working towards the point where it is possible to remove gfs2_refresh_inode() because all the fields are correctly initialised ahead of time. We've now reached that milestone, and have reversed the order of setting up the in core and on disk inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
4506a519 |
|
01-Feb-2013 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Split glock lru processing into two parts The intent here is to split the processing of the glock lru list into two parts, so that the selection of glocks and the disposal are separate functions. The plan is then, that further updates can then be made to these functions in the future to improve the selection of glocks and also the efficiency of glock disposal. The new feature which this patch brings is sorting the glocks to be disposed of into glock number (and thus also disk block number) order. Not all glocks will need i/o in order to dispose of them, but some will, and at least we'll generate mostly disk block order i/o now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
2a005855 |
|
13-Dec-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Separate LRU scanning from shrinker This breaks out the LRU scanning function from the shrinker in preparation for adding other callers to the LRU scanner. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
252aa6f5 |
|
11-Dec-2012 |
Rafael Aquini <aquini@redhat.com> |
mm: redefine address_space.assoc_mapping Overhaul struct address_space.assoc_mapping renaming it to address_space.private_data and its type is redefined to void*. By this approach we consistently name the .private_* elements from struct address_space as well as allow extended usage for address_space association with other data structures through ->private_data. Also, all users of old ->assoc_mapping element are converted to reflect its new name and type change (->private_data). Signed-off-by: Rafael Aquini <aquini@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <andi@firstfloor.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
4e2f8849 |
|
14-Nov-2012 |
David Teigland <teigland@redhat.com> |
GFS2: remove redundant lvb pointer The lksb struct already contains a pointer to the lvb, so another directly from the glock struct is not needed. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
dba2d70c |
|
14-Nov-2012 |
David Teigland <teigland@redhat.com> |
GFS2: only use lvb on glocks that need it Save the effort of allocating, reading and writing the lvb for most glocks that do not use it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fb6791d1 |
|
13-Nov-2012 |
David Teigland <teigland@redhat.com> |
GFS2: skip dlm_unlock calls in unmount When unmounting, gfs2 does a full dlm_unlock operation on every cached lock. This can create a very large amount of work and can take a long time to complete. However, the vast majority of these dlm unlock operations are unnecessary because after all the unlocks are done, gfs2 leaves the dlm lockspace, which automatically clears the locks of the leaving node, without unlocking each one individually. So, gfs2 can skip explicit dlm unlocks, and use dlm_release_lockspace to remove the locks implicitly. The one exception is when the lock's lvb is being used. In this case, dlm_unlock is called because it may update the lvb of the resource. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
06dfc306 |
|
24-Oct-2012 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Rename glops go_xmote_th to go_sync [Editorial: This is a nit, but has been a minor irritation for a long time:] This patch renames glops structure item for go_xmote_th to go_sync. The functionality is unchanged; it's just for readability. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
8eae1ca0 |
|
15-Oct-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Review bug traps in glops.c Two of the bug traps here could really be warnings. The others are converted from BUG() to GLOCK_BUG_ON() since we'll most likely need to know the glock state in order to debug any issues which arise. As a result of this, __dump_glock has to be renamed and is no longer static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e5dc76b9 |
|
08-Aug-2012 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Eliminate redundant calls to may_grant Function add_to_queue was checking may_grant for the passed-in holder for every iteration of its gh2 loop. Now it only checks it once at the beginning to see if a try lock is futile. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
81e1d450 |
|
08-Aug-2012 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Combine functions gfs2_glock_dq_wait and wait_on_demote Function gfs2_glock_dq_wait called two-line function wait_on_demote, so they were combined. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
07a79049 |
|
08-Aug-2012 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Combine functions gfs2_glock_wait and wait_on_holder Function gfs2_glock_wait only called function wait_on_holder and returned its return code, so they were combined for readability. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
4abb6ad9 |
|
08-Aug-2012 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: inline __gfs2_glock_schedule_for_reclaim Since function gfs2_glock_schedule_for_reclaim is only two significant lines, we can eliminate it, simplifying the code and making it more readable. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
0fe2f1e9 |
|
11-Jun-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Size seq_file buffer more carefully This places a limit on the buffer size for archs with larger PAGE_SIZE. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
|
#
1bb49303 |
|
11-Jun-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Use seq_vprintf for glocks debugfs file Make use of the newly added seq_vprintf() function. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
|
#
90306c41 |
|
29-May-2012 |
Benjamin Marzinski <bmarzins@redhat.com> |
GFS2: Use lvbs for storing rgrp information with mount option Instead of reading in the resource groups when gfs2 is checking for free space to allocate from, gfs2 can store the necessary infromation in the resource group's lvb. Also, instead of searching for unlinked inodes in every resource group that's checked for free space, gfs2 can store the number of unlinked but inodes in the lvb, and only check for unlinked inodes if it will find some. The first time a resource group is locked, the lvb must initialized. Since this involves counting the unlinked inodes in the resource group, this takes a little extra time. But after that, if the resource group is locked with GL_SKIP, the buffer head won't be read in unless it's actually needed. Enabling the resource groups lvbs is done via the rgrplvb mount option. If this option isn't set, the lvbs will still be set and updated, but they won't be verfied or used by the filesystem. To safely turn on this option, all of the nodes mounting the filesystem must be running code with this patch, and the filesystem must have been completely unmounted since they were updated. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
ba1ddcb6 |
|
08-Jun-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Cache last hash bucket for glock seq_files For the glocks and glstats seq_files, which are exposed via debugfs we should cache the most recent hash bucket, along with the offset into that bucket. This allows us to restart from that point, rather than having to begin at the beginning each time. This is an idea from Eric Dumazet, however I've slightly extended it so that if the position from which we are due to start is at any point beyond the last cached point, we start from the last cached point, plus whatever is the appropriate offset. I don't really expect people to be lseeking around these files, but if they did so with only positive offsets, then we'd still get some of the benefit of using a cached offset. With my simple test of around 200k entries in the file, I'm seeing an approx 10x speed up. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
df5d2f55 |
|
07-Jun-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Increase buffer size for glocks and glstats debugfs files As per Al Viro's suggestion, this increases the buffer size used for these two files. This provides a speed up of slightly less than 8x (i.e. proportional to the buffer size) for cases when we have large numbers of glocks. Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a245769f |
|
20-Jan-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: glock statistics gathering The stats are divided into two sets: those relating to the super block and those relating to an individual glock. The super block stats are done on a per cpu basis in order to try and reduce the overhead of gathering them. They are also further divided by glock type. In the case of both the super block and glock statistics, the same information is gathered in each case. The super block statistics are used to provide default values for most of the glock statistics, so that newly created glocks should have, as far as possible, a sensible starting point. The statistics are divided into three pairs of mean and variance, plus two counters. The mean/variance pairs are smoothed exponential estimates and the algorithm used is one which will be very familiar to those used to calculation of round trip times in network code. The three pairs of mean/variance measure the following things: 1. DLM lock time (non-blocking requests) 2. DLM lock time (blocking requests) 3. Inter-request time (again to the DLM) A non-blocking request is one which will complete right away, whatever the state of the DLM lock in question. That currently means any requests when (a) the current state of the lock is exclusive (b) the requested state is either null or unlocked or (c) the "try lock" flag is set. A blocking request covers all the other lock requests. There are two counters. The first is there primarily to show how many lock requests have been made, and thus how much data has gone into the mean/variance calculations. The other counter is counting queueing of holders at the top layer of the glock code. Hopefully that number will be a lot larger than the number of dlm lock requests issued. So why gather these statistics? There are several reasons we'd like to get a better idea of these timings: 1. To be able to better set the glock "min hold time" 2. To spot performance issues more easily 3. To improve the algorithm for selecting resource groups for allocation (to base it on lock wait time, rather than blindly using a "try lock") Due to the smoothing action of the updates, a step change in some input quantity being sampled will only fully be taken into account after 8 samples (or 4 for the variance) and this needs to be carefully considered when interpreting the results. Knowing both the time it takes a lock request to complete and the average time between lock requests for a glock means we can compute the total percentage of the time for which the node is able to use a glock vs. time that the rest of the cluster has its share. That will be very useful when setting the lock min hold time. The other point to remember is that all times are in nanoseconds. Great care has been taken to ensure that we measure exactly the quantities that we want, as accurately as possible. There are always inaccuracies in any measuring system, but I hope this is as accurate as we can reasonably make it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
4043b886 |
|
16-Jan-2012 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix race between lru_list and glock ref count This patch fixes a narrow race window between the glock ref count hitting zero and glocks being removed from the lru_list. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e0c2a9aa |
|
09-Jan-2012 |
David Teigland <teigland@redhat.com> |
GFS2: dlm based recovery coordination This new method of managing recovery is an alternative to the previous approach of using the userland gfs_controld. - use dlm slot numbers to assign journal id's - use dlm recovery callbacks to initiate journal recovery - use a dlm lock to determine the first node to mount fs - use a dlm lock to track journals that need recovery Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7cf8dcd3 |
|
15-Jun-2011 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Automatically adjust glock min hold time This patch is a performance improvement for GFS2 in a clustered environment. It makes the glock hold time self-adjusting. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
1495f230 |
|
24-May-2011 |
Ying Han <yinghan@google.com> |
vmscan: change shrinker API by passing shrink_control struct Change each shrinker's API by consolidating the existing parameters into shrink_control struct. This will simplify any further features added w/o touching each file of shrinker. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: fix warning] [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API] [akpm@linux-foundation.org: fix xfs warning] [akpm@linux-foundation.org: update gfs2] Signed-off-by: Ying Han <yinghan@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
f90e5b5b |
|
24-May-2011 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Processes waiting on inode glock that no processes are holding This patch fixes a race in the GFS2 glock state machine that may result in lockups. The symptom is that all nodes but one will hang, waiting for a particular glock. All the holder records will have the "W" (Waiting) bit set. The other node will typically have the glock stuck in Exclusive mode (EX) with no holder records, but the dinode will be cached. In other words, an entry with "I:" will appear in the glock dump for that glock, but nothing else. The race has to do with the glock "Pending Demote" bit, which can be set, then immediately reset, thus losing the fact that another node needs the glock. The sequence of events is: 1. Something schedules the glock workqueue (e.g. glock request from fs) 2. The glock workqueue gets to the point between the test of the reply pending bit and the spin lock: if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags)) { finish_xmote(gl, gl->gl_reply); drop_ref = 1; } down_read(&gfs2_umount_flush_sem); <---- i.e. here spin_lock(&gl->gl_spin); 3. In comes (a) the reply to our EX lock request setting GLF_REPLY_PENDING and (b) the demote request which sets GLF_PENDING_DEMOTE 4. The following test is executed: if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) && gl->gl_state != LM_ST_UNLOCKED && gl->gl_demote_state != LM_ST_EXCLUSIVE) { This resets the pending demote flag, and gl->gl_demote_state is not equal to exclusive, however because the reply from the dlm arrived after we checked for the GLF_REPLY_PENDING flag, gl->gl_state is still equal to unlocked, so although we reset the GLF_PENDING_DEMOTE flag, we didn't then set the GLF_DEMOTE flag or reinstate the GLF_PENDING_DEMOTE_FLAG. The patch closes the timing window by only transitioning the "Pending demote" bit to the "demote" flag once we know the other conditions (not unlocked and not exclusive) are met. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
588da3b3 |
|
04-May-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Don't use a try lock when promoting to a higher mode Previously we marked all locks being promoted to a higher mode with the try flag to avoid any potential deadlocks issues. The DLM is able to detect these and report them in way that GFS2 can deal with them correctly. So we can just request the required mode and wait for a response without needing to perform this check. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
1879fd6a |
|
25-Apr-2011 |
Christoph Hellwig <hch@infradead.org> |
add hlist_bl_lock/unlock helpers Now that the whole dcache_hash_bucket crap is gone, go all the way and also remove the weird locking layering violations for locking the hash buckets. Add hlist_bl_lock/unlock helpers to move the locking into the list abstraction instead of requiring each caller to open code it. After all allowing for the bit locks is the whole point of these helpers over the plain hlist variant. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
4667a0ec |
|
18-Apr-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Make writeback more responsive to system conditions This patch adds writeback_control to writing back the AIL list. This means that we can then take advantage of the information we get in ->write_inode() in order to set off some pre-emptive writeback. In addition, the AIL code is cleaned up a bit to make it a bit simpler to understand. There is still more which can usefully be done in this area, but this is a good start at least. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
f42ab085 |
|
14-Apr-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Optimise glock lru and end of life inodes The GLF_LRU flag introduced in the previous patch can be used to check if a glock is on the lru list when a new holder is queued and if so remove it, without having first to get the lru_lock. The main purpose of this patch however is to optimise the glocks left over when an inode at end of life is being evicted. Previously such glocks were left with the GLF_LFLUSH flag set, so that when reclaimed, each one required a log flush. This patch resets the GLF_LFLUSH flag when there is nothing left to flush thus preventing later log flushes as glocks are reused or demoted. In order to do this, we need to keep track of the number of revokes which are outstanding, and also to clear the GLF_LFLUSH bit after a log commit when only revokes have been processed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
627c10b7 |
|
14-Apr-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Improve tracing support (adds two flags) This adds support for two new flags. One keeps track of whether the glock is on the LRU list or not. The other isn't really a flag as such, but an indication of whether the glock has an attached object or not. This indication is reported without any locking, which is ok since we do not dereference the object pointer but merely report whether it is NULL or not. Also, this fixes one place where a tracepoint was missing, which was at the point we remove deallocated blocks from the journal. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
29687a2a |
|
30-Mar-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Alter point of entry to glock lru list for glocks with an address_space Rather than allowing the glocks to be scheduled for possible reclaim as soon as they have exited the journal, this patch delays their entry to the list until the glocks in question are no longer in use. This means that we will rely on the vm for writeback of all dirty data and metadata from now on. When glocks are added to the lru list they should be freeable much faster since all the I/O required to free them should have already been completed. This should lead to much better I/O patterns under low memory conditions. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
25985edc |
|
30-Mar-2011 |
Lucas De Marchi <lucas.demarchi@profusion.mobi> |
Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
#
7e32d026 |
|
15-Mar-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Don't use _raw version of RCU dereference As per RCU glock patch review comments, don't use the _raw version of this function here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
#
fa1bbdea |
|
10-Mar-2011 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: Optimize glock multiple-dequeue code This is a small patch that optimizes multiple glock dequeue operations. It changes the unlock order to be more efficient and makes it easier for lock debugging tools to unravel. It also eliminates the need for the temp variable x, although that would likely be optimized out. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fc0e38da |
|
09-Mar-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix glock deallocation race This patch fixes a race in deallocating glocks which was introduced in the RCU glock patch. We need to ensure that the glock count is kept correct even in the case that there is a race to add a new glock into the hash table. Also, to avoid having to wait for an RCU grace period, the glock counter can be decremented before call_rcu() is called. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
58a69cb4 |
|
16-Feb-2011 |
Tejun Heo <tj@kernel.org> |
workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable' There are two spellings in use for 'freeze' + 'able' - 'freezable' and 'freezeable'. The former is the more prominent one. The latter is mostly used by workqueue and in a few other odd places. Unify the spelling to 'freezable'. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Dubov <oakad@yahoo.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Steven Whitehouse <swhiteho@redhat.com>
|
#
edae38a6 |
|
31-Jan-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix glock queue trace point Somehow this tracepoint landed up in the wrong place. This moves it to where it should be. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
bc015cb8 |
|
19-Jan-2011 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Use RCU for glock hash table This has a number of advantages: - Reduces contention on the hash table lock - Makes the code smaller and simpler - Should speed up glock dumps when under load - Removes ref count changing in examine_bucket - No longer need hash chain lock in glock_put() in common case There are some further changes which this enables and which we may do in the future. One is to look at using SLAB_RCU, and another is to look at using a per-cpu counter for the per-sb glock counter, since that is touched twice in the lifetime of each glock (but only used at umount time). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
#
47a25380 |
|
30-Nov-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Merge glock state fields into a bitfield We can only merge the fields into a bitfield if the locking rules for them are the same. In this case gl_spin covers all of the fields (write side) but a couple of them are used with GLF_LOCK as the read side lock, which should be ok since we know that the field in question won't be changing at the time. The gl_req setting has to be done earlier (in glock.c) in order to place it under gl_spin. The gl_reply setting also has to be brought under gl_spin in order to comply with the new rules. This saves 4*sizeof(unsigned int) per glock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>
|
#
921169ca |
|
28-Nov-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Clean up of gdlm_lock function The DLM never returns -EAGAIN in response to dlm_lock(), and even if it did, the test in gdlm_lock() was wrong anyway. Once that test is removed, it is possible to greatly simplify this code by simply using a "normal" error return code (0 for success). We then no longer need the LM_OUT_ASYNC return code which can be removed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
5e69069c |
|
09-Nov-2010 |
Joe Perches <joe@perches.com> |
GFS2: fs/gfs2/glock.c: Use printf extension %pV Using %pV reduces the number of printk calls and eliminates any possible message interleaving from other printk calls. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
cc18152e |
|
05-Nov-2010 |
Joe Perches <joe@perches.com> |
GFS2: fs/gfs2/glock.c: Convert sprintf_symbol to %pS Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d2115778 |
|
03-Nov-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Change two WQ_RESCUERs into WQ_MEM_RECLAIM The WQ_RESCUER flag should only be used internally to the workqueue implementation. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
|
#
044b9414 |
|
03-Nov-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix inode deallocation race This area of the code has always been a bit delicate due to the subtleties of lock ordering. The problem is that for "normal" alloc/dealloc, we always grab the inode locks first and the rgrp lock later. In order to ensure no races in looking up the unlinked, but still allocated inodes, we need to hold the rgrp lock when we do the lookup, which means that we can't take the inode glock. The solution is to borrow the technique already used by NFS to solve what is essentially the same problem (given an inode number, look up the inode carefully, checking that it really is in the expected state). We cannot do that directly from the allocation code (lock ordering again) so we give the job to the pre-existing delete workqueue and carry on with the allocation as normal. If we find there is no space, we do a journal flush (required anyway if space from a deallocation is to be released) which should block against the pending deallocations, so we should always get the space back. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
c741c455 |
|
29-Sep-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix spectator umount issue The tests further down the recovery function relating to unlocking the journal need to be updated to match the intial test. Also, a test in the umount code which was surplus to requirements has been removed. Umounting spectator mounts now works correctly, as expected. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
9fa0ea9f |
|
13-Sep-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Use new workqueue scheme The recovery workqueue can be freezable since we want it to finish what it is doing if the system is to be frozen (although why you'd want to freeze a cluster node is beyond me since it will result in it being ejected from the cluster). It does still make sense for single node GFS2 filesystems though. The glock workqueue will benefit from being able to run more work items concurrently. A test running postmark shows improved performance and multi-threaded workloads are likely to benefit even more. It needs to be high priority because the latency directly affects the latency of filesystem glock operations. The delete workqueue is similar to the recovery workqueue in that it must not get blocked by memory allocations, and may run for a long time. Potentially other GFS2 threads might also be converted to workqueues, but I'll leave that for a later patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
|
#
7b5e3d5f |
|
03-Sep-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Don't enforce min hold time when two demotes occur in rapid succession Due to the design of the VFS, it is quite usual for operations on GFS2 to consist of a lookup (requiring a shared lock) followed by an operation requiring an exclusive lock. If a remote node has cached an exclusive lock, then it will receive two demote events in rapid succession firstly for a shared lock and then to unlocked. The existing min hold time code was triggering in this case, even if the node was otherwise idle since the state change time was being updated by the initial demote. This patch introduces logic to skip the min hold timer in the case that a "double demote" of this kind has occurred. The min hold timer will still be used in all other cases. A new glock flag is introduced which is used to keep track of whether there have been any newly queued holders since the last glock state change. The min hold time is only applied if the flag is set. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Tested-by: Abhijith Das <adas@redhat.com>
|
#
0809f6ec |
|
02-Aug-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix recovery stuck bug (try #2) This is a clean up of the code which deals with LM_FLAG_NOEXP which aims to remove any possible race conditions by using gl_spin to cover the gap between testing for the LM_FLAG_NOEXP and the GL_FROZEN flag. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7cdee5db |
|
29-Jul-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
Revert "GFS2: recovery stuck on transaction lock" This reverts commit b7dc2df5725fe7355fd76000ead7e39728e1b8a9. The initial patch didn't quite work since it doesn't cover all the possible routes by which the GLF_FROZEN flag might be set. A revised fix is coming up in the next patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d5341a92 |
|
23-Jul-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Make "try" lock not try quite so hard This looks like a big change, but in reality its only a single line of actual code change, the rest is just moving a function to before its new caller. The "try" flag for glocks is a rather subtle and delicate setting since it requires that the state machine tries just hard enough to ensure that it has a good chance of getting the requested lock, but no so hard that the request can land up blocked behind another. The patch adds in an additional check which will fail any queued try locks if there is another request blocking the try lock request which is not granted and compatible, nor in progress already. The check is made only after all pending locks which may be granted have been granted. I've checked this with the reproducer for the reported flock bug which this is intended to fix, and it now passes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7f8275d0 |
|
18-Jul-2010 |
Dave Chinner <dchinner@redhat.com> |
mm: add context argument to shrinker callback The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
b7dc2df5 |
|
23-Jun-2010 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: recovery stuck on transaction lock This patch fixes bugzilla bug #590878: GFS2: recovery stuck on transaction lock. We set the frozen flag on the glock when we receive a completion that cannot be delivered due to blocked locks. At that point we check to see whether the first waiting holder has the noexp flag set. If the noexp lock is queued later, then we need to unfreeze the glock at that point in time, namely, in the glock work function. This patch was originally written by Steve Whitehouse, but since he's on holiday, I'm submitting it. It's been well tested with a complex recovery test called revolver. Signed-off-by: Steve Whitehouse <swhiteho@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
1a0eae88 |
|
14-Apr-2010 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: glock livelock This patch fixes a couple gfs2 problems with the reclaiming of unlinked dinodes. First, there were a couple of livelocks where everything would come to a halt waiting for a glock that was seemingly held by a process that no longer existed. In fact, the process did exist, it just had the wrong pid number in the holder information. Second, there was a lock ordering problem between inode locking and glock locking. Third, glock/inode contention could sometimes cause inodes to be improperly marked invalid by iget_failed. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
#
4818972e |
|
22-Feb-2010 |
Bob Peterson <rpeterso@redhat.com> |
GFS2: print glock numbers in hex This patch changes glock numbers from printing in decimal to hex. Since DLM prints corresponding resource IDs in hex, it makes debugging easier. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
c1184f8a |
|
08-Jan-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Remove loopy umount code As a consequence of the previous patch, we can now remove the loop which used to be required due to the circular dependency between the inodes and glocks. Instead we can just invalidate the inodes, and then clear up any glocks which are left. Also we no longer need the rwsem since there is no longer any danger of the inode invalidation calling back into the glock code (and from there back into the inode code). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
009d8518 |
|
07-Dec-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Metadata address space clean up Since the start of GFS2, an "extra" inode has been used to store the metadata belonging to each inode. The only reason for using this inode was to have an extra address space, the other fields were unused. This means that the memory usage was rather inefficient. The reason for keeping each inode's metadata in a separate address space is that when glocks are requested on remote nodes, we need to be able to efficiently locate the data and metadata which relating to that glock (inode) in order to sync or sync and invalidate it (depending on the remotely requested lock mode). This patch adds a new type of glock, which has in addition to its normal fields, has an address space. This applies to all inode and rgrp glocks (but to no other glock types which remain as before). As a result, we no longer need to have the second inode. This results in three major improvements: 1. A saving of approx 25% of memory used in caching inodes 2. A removal of the circular dependency between inodes and glocks 3. No confusion between "normal" and "metadata" inodes in super.c Although the first of these is the more immediately apparent, the second is just as important as it now enables a number of clean ups at umount time. Those will be the subject of future patches. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
8f05228e |
|
29-Jan-2010 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Extend umount wait coverage to full glock lifetime Although all glocks are, by the time of the umount glock wait, scheduled for demotion, some of them haven't made it far enough through the process for the original set of waiting code to wait for them. This extends the ref count to the whole glock lifetime in order to ensure that the waiting does catch all glocks. It does make it a bit more invasive, but it seems the only sensible solution at the moment. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
26bb7505 |
|
27-Nov-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix glock refcount issues This patch fixes some ref counting issues. Firstly by moving the point at which we drop the ref count after a dlm lock operation has completed we ensure that we never call gfs2_glock_hold() on a lock with a zero ref count. Secondly, by using atomic_dec_and_lock() in gfs2_glock_put() we ensure that at no time will a glock with zero ref count appear on the lru_list. That means that we can remove the check for this in our shrinker (which was racy). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7e71c55e |
|
22-Sep-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix potential race in glock code We need to be careful of the ordering between clearing the GLF_LOCK bit and scheduling the workqueue. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b94a170e |
|
23-Jul-2009 |
Benjamin Marzinski <bmarzins@redhat.com> |
GFS2: remove dcache entries for remote deleted inodes When a file is deleted from a gfs2 filesystem on one node, a dcache entry for it may still exist on other nodes in the cluster. If this happens, gfs2 will be unable to free this file on disk. Because of this, it's possible to have a gfs2 filesystem with no files on it and no free space. With this patch, when a node receives a callback notifying it that the file is being deleted on another node, it schedules a new workqueue thread to remove the file's dcache entry. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
8ff22a6f |
|
10-Jul-2009 |
Benjamin Marzinski <bmarzins@redhat.com> |
GFS2: Don't put unlikely reclaim candidates on the reclaim list. GFS2 was placing far too many glocks on the reclaim list that were not good candidates for freeing up from cache. These locks would sit there and repeatedly get scanned to see if they could be reclaimed, wasting a lot of time when there was memory pressure. This fix does more checks on the locks to see if they are actually likely to be removable from cache. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a51b56ff |
|
30-Jun-2009 |
Benjamin Marzinski <bmarzins@redhat.com> |
GFS2: Fix panic in glock memory shrinker It is possible for gfs2_shrink_glock_memory() to check a glock for demotion that's in the process of being freed by gfs2_glock_put(). In this case, gfs2_shrink_glock_memory() will acquire a new reference to this glock, and then try to free the glock itself when it drops the refernce. To solve this, gfs2_shrink_glock_memory() just needs to check if the glock is in the process of being freed, and if so skip it without ever unlocking the lru_lock. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
2163b1e6 |
|
25-Jun-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Shrink the shrinker This patch removes some of the special cases that the shrinker was trying to deal with. As a result we leave fewer items on the list and none at all which cannot be demoted. This makes the list scanning more efficient and solves some issues seen with large numbers of inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
63997775 |
|
12-Jun-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Add tracepoints This patch adds the ability to trace various aspects of the GFS2 filesystem. The trace points are divided into three groups, glocks, logging and bmap. These points have been chosen because they allow inspection of the major internal functions of GFS2 and they are also generic enough that they are unlikely to need any major changes as the filesystem evolves. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fe64d517 |
|
19-May-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Umount recovery race fix This patch fixes a race condition where we can receive recovery requests part way through processing a umount. This was causing problems since the recovery thread had already gone away. Looking in more detail at the recovery code, it was really trying to implement a slight variation on a work queue, and that happens to align nicely with the recently introduced slow-work subsystem. As a result I've updated the code to use slow-work, rather than its own home grown variety of work queue. When using the wait_on_bit() function, I noticed that the wait function that was supplied as an argument was appearing in the WCHAN field, so I've updated the function names in order to produce more meaningful output. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
0c7a531a |
|
30-Apr-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix glock ref counting bug Depending on the ordering of events as we go around the glock shrinker loop, it is possible to drop the ref count of a glock incorrectly. It doesn't happen very often. This patch corrects the got_ref variable, fixing the problem. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a228df63 |
|
07-Apr-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Move umount flush rwsem The rwsem, used only on umount, is in the wrong place in glock.c. This patch moves it up a bit so that it does not get called under a spinlock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
64d576ba |
|
12-Feb-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Add a "demote a glock" interface to sysfs This adds a sysfs file called demote_rq to GFS2's per filesystem directory. Its possible to use this file to demote arbitrary glocks in exactly the same way as if a request had come in from a remote node. This is intended for testing issues relating to caching of data under glocks. Despite that, the interface is generic enough to send requests to any type of glock, but be careful as its not always safe to send an arbitrary message to an arbitrary glock. For that reason and to prevent DoS, this interface is restricted to root only. The messages look like this: <type>:<glocknumber> <mode> Example: echo -n "2:13324 EX" >/sys/fs/gfs2/unity:myfs/demote_rq Which means "please demote inode glock (type 2) number 13324 so that I can get an EX (exclusive) lock". The lock modes are those which would normally be sent by a remote node in its callback so if you want to unlock a glock, you use EX, to demote to shared, use SH or PR (depending on whether you like GFS2 or DLM lock modes better!). If the glock doesn't exist, you'll get -ENOENT returned. If the arguments don't make sense, you'll get -EINVAL returned. The plan is that this interface will be used in combination with the blktrace patch which I recently posted for comments although it is, of course, still useful in its own right. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d8348de0 |
|
05-Feb-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix deadlock on journal flush This patch fixes a deadlock when the journal is flushed and there are dirty inodes other than the one which caused the journal flush. Originally the journal flushing code was trying to obtain the transaction glock while running the flush code for an inode glock. We no longer require the transaction glock at this point in time since we know that any attempt to get the transaction glock from another node will result in a journal flush. So if we are flushing the journal, we can be sure that the transaction lock is still cached from when the transaction was started. By inlining a version of gfs2_trans_begin() (minus the bit which gets the transaction glock) we can avoid the deadlock problems caused if there is a demote request queued up on the transaction glock. In addition I've also moved the umount rwsem so that it covers the glock workqueue, since it all demotions are done by this workqueue now. That fixes a bug on umount which I came across while fixing the original problem. Reported-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
ac2425e7 |
|
13-Jan-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Remove unused field from glock The time stamp field is unused in the glock now that we are using a shrinker, so that we can remove it and save sizeof(unsigned long) bytes in each glock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
f057f6cd |
|
12-Jan-2009 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Merge lock_dlm module into GFS2 This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!) Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM. This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
eb8374e7 |
|
25-Dec-2008 |
Julia Lawall <julia@diku.dk> |
GFS2: Use DEFINE_SPINLOCK SPIN_LOCK_UNLOCKED is deprecated. The following makes the change suggested in Documentation/spinlocks.txt The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ declarer name DEFINE_SPINLOCK; identifier xxx_lock; @@ - spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED; + DEFINE_SPINLOCK(xxx_lock); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fefc03bf |
|
19-Dec-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
Revert "GFS2: Fix use-after-free bug on umount" This reverts commit 78802499912f1ba31ce83a94c55b5a980f250a43. The original patch is causing problems in relation to order of operations at umount in relation to jdata files. I need to fix this a different way. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
3af165ac |
|
27-Nov-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix use-after-free bug on umount There was a use-after-free with the GFS2 super block during umount. This patch moves almost all of the umount code from ->put_super into ->kill_sb, the only bit that cannot be moved being the glock hash clearing which has to remain as ->put_super due to umount ordering requirements. As a result its now obvious that the kfree is the final operation, whereas before it was hidden in ->put_super. Also gfs2_jindex_free is then only referenced from a single file so thats moved and marked static too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
2bfb6449 |
|
26-Nov-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Move four functions from super.c The functions which are being moved can all be marked static in their new locations, since they only have a single caller each. Their new locations are more logical than before and some of the functions are small enough that the compiler might well inline them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
97cc1025 |
|
20-Nov-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Kill two daemons with one patch This patch removes the two daemons, gfs2_scand and gfs2_glockd and replaces them with a shrinker which is called from the VM. The net result is that GFS2 responds better when there is memory pressure, since it shrinks the glock cache at the same rate as the VFS shrinks the dcache and icache. There are no longer any time based criteria for shrinking glocks, they are kept until such time as the VM asks for more memory and then we demote just as many glocks as required. There are potential future changes to this code, including the possibility of sorting the glocks which are to be written back into inode number order, to get a better I/O ordering. It would be very useful to have an elevator based workqueue implementation for this, as that would automatically deal with the read I/O cases at the same time. This patch is my answer to Andrew Morton's remark, made during the initial review of GFS2, asking why GFS2 needs so many kernel threads, the answer being that it doesn't :-) This patch is a net loss of about 200 lines of code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
813e0c46 |
|
18-Nov-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix "truncate in progress" hang Following on from the recent clean up of gfs2_quotad, this patch moves the processing of "truncate in progress" inodes from the glock workqueue into gfs2_quotad. This fixes a hang due to the "truncate in progress" processing requiring glocks in order to complete. It might seem odd to use gfs2_quotad for this particular item, but we have to use a pre-existing thread since creating a thread implies a GFP_KERNEL memory allocation which is not allowed from the glock workqueue context. Of the existing threads, gfs2_logd and gfs2_recoverd may deadlock if used for this operation. gfs2_scand and gfs2_glockd are both scheduled for removal at some (hopefully not too distant) future point. That leaves only gfs2_quotad whose workload is generally fairly light and is easily adapted for this extra task. Also, as a result of this change, it opens the way for a future patch to make the reading of the inode's information asynchronous with respect to the glock workqueue, which is another improvement that has been on the list for some time now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
55ba474d |
|
24-Oct-2008 |
Harvey Harrison <harvey.harrison@gmail.com> |
GFS2: sparse annotation of gl->gl_spin fs/gfs2/glock.c:308:5: warning: context problem in 'do_promote': '_spin_unlock' expected different context fs/gfs2/glock.c:308:5: context '*gl+28': wanted >= 1, got 0 fs/gfs2/glock.c:529:2: warning: context problem in 'do_xmote': '_spin_unlock' expected different context fs/gfs2/glock.c:529:2: context '*gl+28': wanted >= 1, got 0 fs/gfs2/glock.c:925:3: warning: context problem in 'add_to_queue': '_spin_unlock' expected different context fs/gfs2/glock.c:925:3: context '*gl+28': wanted >= 1, got 0 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
719ee344 |
|
18-Sep-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: high time to take some time over atime Until now, we've used the same scheme as GFS1 for atime. This has failed since atime is a per vfsmnt flag, not a per fs flag and as such the "noatime" flag was not getting passed down to the filesystems. This patch removes all the "special casing" around atime updates and we simply use the VFS's atime code. The net result is that GFS2 will now support all the same atime related mount options of any other filesystem on a per-vfsmnt basis. We do lose the "lazy atime" updates, but we gain "relatime". We could add lazy atime to the VFS at a later date, if there is a requirement for that variant still - I suspect relatime will be enough. Also we lose about 100 lines of code after this patch has been applied, and I have a suspicion that it will speed things up a bit, even when atime is "on". So it seems like a nice clean up as well. From a user perspective, everything stays the same except the loss of the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very least, and to be honest I don't think anybody ever used it) and that a number of options which were ignored before now work correctly. Please let me know if you've got any comments. I'm pushing this out early so that you can all see what my plans are. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
dff52574 |
|
02-Sep-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix race relating to glock min-hold time In the case that a request for a glock arrives right after the grant reply has arrived, it sometimes means that the gl_tstamp field hasn't been updated recently enough. The net result is that the min-hold time for the glock is ignored. If this happens often enough, it leads to poor performance. This patch adds an additional test, so that if the reply pending bit is set on a glock, then it will select the maximum length of time for the min-hold time, rather than looking at gl_tstamp. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
c1e817d0 |
|
22-Jul-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
GFS2: Fix debugfs glock file iterator Due to an incorrect iterator, some glocks were being missed from the glock dumps obtained via debugfs. This patch fixes the problem and ensures that we don't miss any glocks in future. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
209806ab |
|
07-Jul-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Allow local DF locks when holding a cached EX glock We already allow local SH locks while we hold a cached EX glock, so here we allow DF locks as well. This works only because we rely on the VFS's invalidation for locally cached data, and because if we hold an EX lock, then we know that no other node can be caching data relating to this file. It dramatically speeds up initial writes to O_DIRECT files since we fall back to buffered I/O for this and would otherwise bounce between DF and EX modes on each and every write call. The lessons to be learned from that are to ensure that (for the time being anyway) O_DIRECT files are preallocated and that they are written to using reasonably large I/O sizes. Even so this change fixes that corner case nicely Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
265d529c |
|
07-Jul-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix delayed demote race There is a race in the delayed demote code where it does the wrong thing if a demotion to UN has occurred for other reasons before the delay has expired. This patch adds an assert to catch that condition as well as fixing the root cause by adding an additional check for the UN state. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>
|
#
1bdad606 |
|
03-Jun-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove remote lock dropping code There are several reasons why this is undesirable: 1. It never happens during normal operation anyway 2. If it does happen it causes performance to be very, very poor 3. It isn't likely to solve the original problem (memory shortage on remote DLM node) it was supposed to solve 4. It uses a bunch of arbitrary constants which are unlikely to be correct for any particular situation and for which the tuning seems to be a black art. 5. In an N node cluster, only 1/N of the dropped locked will actually contribute to solving the problem on average. So all in all we are better off without it. This also makes merging the lock_dlm module into GFS2 a bit easier. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
048bca22 |
|
23-May-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] No lock_nolock This patch merges the lock_nolock module into GFS2 itself. As well as removing some of the overhead of the module, it also means that its now impossible to build GFS2 without a lock module (which would be a pointless thing to do anyway). We also plan to merge lock_dlm into GFS2 in the future, but that is a more tricky task, and will therefore be a separate patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>
|
#
6802e340 |
|
21-May-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Clean up the glock core This patch implements a number of cleanups to the core of the GFS2 glock code. As a result a lot of code is removed. It looks like a really big change, but actually a large part of this patch is either removing or moving existing code. There are some new bits too though, such as the new run_queue() function which is considerably streamlined. Highlights of this patch include: o Fixes a cluster coherency bug during SH -> EX lock conversions o Removes the "glmutex" code in favour of a single bit lock o Removes the ->go_xmote_bh() for inodes since it was duplicating ->go_lock() o We now only use the ->lm_lock() function for both locks and unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED) o The fast path is considerably shortly, giving performance gains especially with lock_nolock o The glock_workqueue is now used for all the callbacks from the DLM which allows us to simplify the lock_dlm module (see following patch) o The way is now open to make further changes such as eliminating the two threads (gfs2_glockd and gfs2_scand) in favour of a more efficient scheme. This patch has undergone extensive testing with various test suites so it should be pretty stable by now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>
|
#
58e9fee1 |
|
14-Mar-2008 |
Benjamin Marzinski <bmarzins@redhat.com> |
[GFS2] Invalidate cache at correct point GFS2 wasn't invalidating its cache before it called into the lock manager with a request that could potentially drop a lock. This was leaving a window where the lock could be actually be held by another node, but the file's page cache would still appear valid, causing coherency problems. This patch moves the cache invalidation to before the lock manager call when dropping a lock. It also adds the option to the lock_dlm lock manager to not use conversion mode deadlock avoidance, which, on a conversion from shared to exclusive, could internally drop the lock, and then reacquire in. GFS2 now asks lock_dlm to not do this. Instead, GFS2 manually drops the lock and reacquires it. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
840ca0ec |
|
12-Feb-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix bug where we called drop_bh incorrectly As a result of an earlier patch, drop_bh was being called in cases when it shouldn't have been. Since we never have a gh in the drop case and we always have a gh in the promote case, we can use that extra information to tell which case has been seen. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>
|
#
cf45b752 |
|
31-Jan-2008 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Remove rgrp and glock version numbers This patch further reduces GFS2's memory requirements by eliminating the 64-bit version number fields in lieu of a couple bits. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
da755fdb |
|
30-Jan-2008 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove lm.[ch] and distribute content The functions in lm.c were just wrappers which were mostly only used in one other file. By moving the functions to the files where they are being used, they can be marked static and also this will usually result in them being inlined since they are often only used from one point in the code. A couple of really trivial functions have been inlined by hand into the function which called them as it makes the code clearer to do that. We also gain from one fewer function call in the glock lock and unlock paths. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
ab0d7566 |
|
29-Jan-2008 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Eliminate gl_req_bh This patch further reduces the memory needs of GFS2 by eliminating the gl_req_bh variable from struct gfs2_glock. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
29d38cd1 |
|
28-Jan-2008 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Get rid of gl_waiters2 This patch reduces memory by replacing the int variable gl_waiters2 by a single bit in the gl_flags. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
048786f1 |
|
28-Jan-2008 |
Adrian Bunk <bunk@kernel.org> |
[GFS2] make gfs2_glock_hold() static gfs2_glock_hold() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
ef8c441c |
|
28-Jan-2008 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Only wake the reclaim daemon if we need to This patch only wakes up the glock reclaim daemon if there is actually something to be reclaimed. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
eccba068 |
|
07-Feb-2008 |
Pavel Emelyanov <xemul@openvz.org> |
gfs2: make gfs2_glock.gl_owner_pid be a struct pid * The gl_owner_pid field is used to get the lock owning task by its pid, so make it in a proper manner, i.e. by using the struct pid pointer and pid_task() function. The pid_task() becomes exported for the gfs2 module. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
b1e058da |
|
07-Feb-2008 |
Pavel Emelyanov <xemul@openvz.org> |
gfs2: make gfs2_holder.gh_owner_pid be a struct pid * The gl_owner_pid field is used to get the holder task by its pid and check whether the current is a holder, so make it in a proper manner, i.e. via the struct pid * manipulations. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
398bbe68 |
|
11-Dec-2007 |
Bob Peterson <rpeterso@redhat.com> |
[GFS2] Reorganize function gfs2_glmutex_lock This patch optimizes the function gfs2_glmutex_lock. The basic theory is: Why bother initializing a holder, setting up wait bits and then waiting on them, if you know the glock can be yours. So the holder stuff is placed inside the if checking if the glock is locked. This one needs careful scrutiny because changing anything to do with locking should strike terror into one's heart. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
1a2781cf |
|
16-Nov-2007 |
Fabio Massimo Di Nitto <fabbione@ubuntu.com> |
[GFS2] Fix runtime issue with UP kernels The issue is indeed UP vs SMP and it is totally random. spin_is_locked() is a bad assertion because there is no correct answer on UP. on UP spin_is_locked() has to return either one value or another, always. This means that in my setup I am lucky enough to trigger the issue and your you are lucky enough not to. the patch in attachment removes the bogus calls to BUG_ON and according to David (in CC and thanks for the long explanation on the problem) we can rely upon things like lockdep to find problem that might be trying to catch. Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
2bcd610d |
|
08-Nov-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Don't add glocks to the journal The only reason for adding glocks to the journal was to keep track of which locks required a log flush prior to release. We add a flag to the glock to allow this check to be made in a simpler way. This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64) and means that we can avoid extra work during the journal flush. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e589665e |
|
02-Nov-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove flags no longer required The HIF_MUTEX and HIF_PROMOTE flags were set on the glock holders depending upon which of the two waiters lists they were going to be queued upon. They were then tested when the holders were taken off the lists to ensure that the right type of holder was being dequeued. Since we are already using separate lists, there doesn't seem a lot of point having these flags as well, and since setting them and testing them is in the fast path for locking and unlocking glock, this patch removes them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
3042a2cc |
|
02-Nov-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Reorder writeback for glock sync Previously we were doing (write data, wait for data, write metadata, wait for metadata). After this patch we so (write metadata, write data, wait for data, wait for metadata) which should be more efficient. Also I noticed that the drop_bh and xmote_bh functions were almost identical. In fact the only difference was a single test, and that test is such that in the drop_bh case, it would always evaluate to the correct result. As such we can use the xmote_bh functions in all the places where we were using the drop_bh function and remove the drop_bh functions. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
c2932e03 |
|
01-Nov-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove "reclaim limit" This call to reclaim glocks is not needed, and in particular we don't want it in the fast path for locking glocks. The limit was entirely arbitrary anyway and we can't expect users to adjust things like this, the remaining code will do the right thing on its own. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
cc7e79b1 |
|
04-Oct-2007 |
Wendy Cheng <wcheng@redhat.com> |
[GFS2] Handle multiple glock demote requests Fix a race condition where multiple glock demote requests are sent to a node back-to-back. This patch does a check inside handle_callback() to see whether a demote request is in progress. If true, it sets a flag to make sure run_queue() will loop again to handle the new request, instead of erronously setting gl_demote_state to a different state. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
49e61f2e |
|
13-Sep-2007 |
Wendy Cheng <wcheng@redhat.com> |
[GFS2] Move inode deletion out of blocking_cb Move inode deletion code out of blocking_cb handle_callback route to avoid racy conditions that end up blocking lock_dlm1 thread. Fix bugzilla 286821. Signed-off-by: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b4c20166 |
|
13-Sep-2007 |
Abhijith Das <adas@redhat.com> |
[GFS2] flocks from same process trip kernel BUG at fs/gfs2/glock.c:1118! This patch adds a new flag to the gfs2_holder structure GL_FLOCK. It is set on holders of glocks representing flocks. This flag is checked in add_to_queue() and a process is permitted to queue more than one holder onto a glock if it is set. This solves the issue of a process not being able to do multiple flocks on the same file. Through a single descriptor, a process can now promote and demote flocks. Through multiple descriptors a process can now queue multiple flocks on the same file. There's still the problem of a process deadlocking itself (because gfs2 blocking locks are not interruptible) by queueing incompatible deadlock. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
c4f68a13 |
|
23-Aug-2007 |
Benjamin Marzinski <bmarzins@redhat.com> |
[GFS2] delay glock demote for a minimum hold time When a lot of IO, with some distributed mmap IO, is run on a GFS2 filesystem in a cluster, it will deadlock. The reason is that do_no_page() will repeatedly call gfs2_sharewrite_nopage(), because each node keeps giving up the glock too early, and is forced to call unmap_mapping_range(). This bumps the mapping->truncate_count sequence count, forcing do_no_page() to retry. This patch institutes a minimum glock hold time a tenth a second. This insures that even in heavy contention cases, the node has enough time to get some useful work done before it gives up the glock. A second issue is that when gfs2_glock_dq() is called from within a page fault to demote a lock, and the associated page needs to be written out, it will try to acqire a lock on it, but it has already been locked at a higher level. This patch puts makes gfs2_glock_dq() use the work queue as well, to avoid this issue. This is the same patch as Steve Whitehouse originally proposed to fix this issue, execpt that gfs2_glock_dq() now grabs a reference to the glock before it queues up the work on it. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a947e033 |
|
21-Aug-2007 |
Abhijith Das <adas@redhat.com> |
[GFS2] Wendy's dump lockname in hex & fix glock dump With this patch, gfs2 glockdump through the debugfs filesystem will only dump glocks for the specified filesystem instead of all glocks. Also, to aid debugging, the glock number is dumped in hex instead of decimal. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Abhijith Das <adas@redhat.com>
|
#
8fbbfd21 |
|
01-Aug-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Reduce number of gfs2_scand processes to one We only need a single gfs2_scand process rather than the one per filesystem which we had previously. As a result the parameter determining the frequency of gfs2_scand runs becomes a module parameter rather than a mount parameter as it was before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
4ef29002 |
|
31-Jul-2007 |
Denis Cheng <crquan@gmail.com> |
[GFS2] mark struct *_operations const these struct *_operations are all method tables, thus should be const. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7b08fc62 |
|
24-Jul-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix an oops in glock dumping This fixes an oops which was occurring during glock dumping due to the seq file code not taking a reference to the glock. Also this fixes a memory leak which occurred in certain cases, in turn preventing the filesystem from unmounting. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
aa0481e5 |
|
21-Jul-2007 |
Jesper Juhl <jesper.juhl@gmail.com> |
[GFS2] Clean up duplicate includes in fs/gfs2/ This patch cleans up duplicate includes in fs/gfs2/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
26caee5b |
|
23-Jul-2007 |
Josef Whiter <jwhiter@redhat.com> |
[GFS2] Fix calculation of demote state If a glock is in the exclusive state and a request for demote to deferred has been received, then further requests for demote to shared are being ignored. This patch fixes that by ensuring that we demote to unlocked in that case. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
87124e58 |
|
23-Jul-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix two races relating to glock callbacks One of the races relates to referencing a variable while not holding its protecting spinlock. The patch simply moves the test inside the spin lock. The other races occurs when a demote to unlocked request occurs during the time a demote to shared request is already running. This of course only happens in the case that the lock was in the exclusive mode to start with. The patch adds a check to see if another demote request has occurred in the mean time and if it has, then it performs a second demote. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
eaf5bd3c |
|
19-Jun-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Simplify multiple glock aquisition There is a bug in the code which acquires multiple glocks where if the initial out-of-order attempt fails part way though we can land up trying to acquire the wrong number of glocks. This is part of the fix for red hat bz #239737. The other part of the bz doesn't apply to upstream kernels since it was fixed by: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d3717bdf8f08a0e1039158c8bab2c24d20f492b6 Since the out-of-order code doesn't appear to add anything to the performance of GFS2, this patch just removed it rather than trying to fix it. It should be much easier to see whats going on here now. In addition, we don't allocate any memory unless we are using a lot of glocks (which is a relatively uncommon case). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d93cfa98 |
|
11-Jun-2007 |
Abhijith Das <adas@redhat.com> |
[GFS2] Fix deallocation issues There were two issues during deallocation of unlinked inodes. The first was relating to the use of a "try" lock which in the case of the inode lock wasn't trying hard enough to deallocate in all circumstances (now changed to a normal glock) and in the case of the iopen lock didn't wait for the demotion of the shared lock before attempting to get the exclusive lock, and thereby sometimes (timing dependent) not completing the deallocation when it should have done. The second issue related to the lack of a way to invalidate dcache entries on remote nodes (now fixed by this patch) which meant that unlinks were taking a long time to return disk space to the fs. By adding some code to invalidate the dcache entries across the cluster for unlinked inodes, that is now fixed. This patch was written jointly by Abhijith Das and Steven Whitehouse. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
dbb7cae2 |
|
15-May-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Clean up inode number handling This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
cd81a4ba |
|
13-May-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] Addendum patch 2 for gfs2_grow This addendum patch 2 corrects three things: 1. It fixes a stupid mistake in the previous addendum that broke gfs2. Ref: https://www.redhat.com/archives/cluster-devel/2007-May/msg00162.html 2. It fixes a problem that Dave Teigland pointed out regarding the external declarations in ops_address.h being in the wrong place. 3. It recasts a couple more %llu printks to (unsigned long long) as requested by Steve Whitehouse. I would have loved to put this all in one revised patch, but there was a rush to get some patches for RHEL5. Therefore, the previous patches were applied to the git tree "as is" and therefore, I'm posting another addendum. Sorry. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
37fde8ca |
|
01-May-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Uncomment sprintf_symbol calling code Now that the patch from -mm has gone upstream, we can uncomment the code in GFS2 which uses sprintf_symbol. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Robert Peterson <rpeterso@redhat.com>
|
#
5f882096 |
|
18-Apr-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] lockdump improvements The patch below consists of the following changes (in code order): 1. I fixed a minor compiler warning regarding the printing of a kernel symbol address. 2. I implemented a suggestion from Dave Teigland that moves the debugfs information for gfs2 into a subdirectory so we can easily expand our use of debugfs in the future. The current code keeps the glock information in: /debug/gfs2/<fs> With the patch, the new code keeps the glock information in: /debug/gfs2/<fs>/glock That will allow us to create more debugfs files in the future. 3. This fixes a bug whereby a failed mount attempt causes the debugfs file to not be deleted. Failed mount attempts should always clean up after themselves, including deleting the debugfs file and/or directory. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7a0079d9 |
|
17-Apr-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) This is for Bugzilla Bug 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) seen at the "gfs2 summit". This also fixes the bug that caused garbage to be printed by the "initialized at" field. I apologize for the kludge, but that code will all be ripped out anyway when the official sprint_symbol function becomes available in the Linux kernel. I also changed some formatting so that spaces are replaced by proper tabs. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
04b933f2 |
|
23-Mar-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] Red Hat bz 228540: owner references In Testing the previously posted and accepted patch for https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228540 I uncovered some gfs2 badness. It turns out that the current gfs2 code saves off a process pointer when glocks is taken in both the glock and glock holder structures. Those structures will persist in memory long after the process has ended; pointers to poisoned memory. This problem isn't caused by the 228540 fix; the new capability introduced by the fix just uncovered the problem. I wrote this patch that avoids saving process pointers and instead saves off the process pid. Rather than referencing the bad pointers, it now does process lookups. There is special code that makes the output nicer for printing holder information for processes that have ended. This patch also adds a stub for the new "sprint_symbol" function that exists in Andrew Morton's -mm patch set, but won't go into the base kernel until 2.6.22, since it adds functionality but doesn't fix a bug. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
420d2a10 |
|
18-Mar-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix a bug on i386 due to evaluation order Since gcc didn't evaluate the last two terms of the expression in glock.c:1881 as a constant expression, it resulted in an error on i386 due to the lack of a 64bit divide instruction. This adds some brackets to fix the problem. This was reported by Andrew Morton. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org>
|
#
3b8249f6 |
|
16-Mar-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix bz 224480 and cleanup glock demotion code This patch prevents the printing of a warning message in cases where the fs is functioning normally by handing off responsibility for unlinked, but still open inodes, to another node for eventual deallocation. Also, there is now an improved system for ensuring that such requests to other nodes do not get lost. The callback on the iopen lock is only ever called when i_nlink == 0 and when a node is unable to deallocate it due to it still being in use on another node. When a node receives the callback therefore, it knows that i_nlink must be zero, so we mark it as such (in gfs2_drop_inode) in order that it will then attempt deallocation of the inode itself. As an additional benefit, queuing a demote request no longer requires a memory allocation. This simplifies the code for dealing with gfs2_holders as it removes one special case. There are two new fields in struct gfs2_glock. gl_demote_state is the state which the remote node has requested and gl_demote_time is the time when the request came in. Both fields are only valid when the GLF_DEMOTE flag is set in gl_flags. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
5c7342d8 |
|
07-Mar-2007 |
Josef Whiter <jwhiter@redhat.com> |
[GFS2] fix bz 231369, gfs2 will oops if you specify an invalid mount option If you specify an invalid mount option when trying to mount a gfs2 filesystem, gfs2 will oops. The attached patch resolves this problem. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7c52b166 |
|
16-Mar-2007 |
Robert Peterson <rpeterso@redhat.com> |
[GFS2] Add gfs2_tool lockdump support to gfs2 (bz 228540) The attached patch resolves bz 228540. This adds the capability for gfs2 to dump gfs2 locks through the debugfs file system. This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from gfs2 because all the ioctls were stripped out. Please see the bugzilla for more history about the fix. This patch is also attached to the bugzilla record. The patch is against Steve Whitehouse's latest nmw git tree kernel (2.6.21-rc1) and has been tested on system trin-10. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
95d97b7d |
|
06-Mar-2007 |
Andrew Morton <akpm@linux-foundation.org> |
[GFS2] build fix fs/gfs2/glock.c:2198: error: 'THIS_MODULE' undeclared here (not in a function) Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
631c42e1 |
|
01-Mar-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] go_drop_bh is never used, so remove it The ->go_drop_bh function is never used, so this removes it and the single caller, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
61be084e |
|
29-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Put back semaphore to avoid umount problem Dave Teigland fixed this bug a while back, but I managed to mistakenly remove the semaphore during later development. It is required to avoid the list of inodes changing during an invalidate_inodes call. I have made it an rwsem since the read side will be taken frequently during normal filesystem operation. The write site will only happen during umount of the file system. Also the bug only triggers when using the DLM lock manager and only then under certain conditions as its timing related. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>
|
#
d043e190 |
|
23-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix typo in glock.c This is a one letter typo fix in glock.c, spotted by Rob Kenna. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
90101c31 |
|
23-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Compile fix for glock.c This one liner got missed from the previous patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
12132933 |
|
22-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove queue_empty() function This function is not longer required since we do not do recursive locking in the glock layer. As a result all its callers can be replaceed with list_empty() calls. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b5d32bea |
|
21-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Tidy up glops calls This patch doesn't make any changes to the ordering of the various operations related to glocking, but it does tidy up the calls to the glops.c functions to make the structure more obvious. The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be made static within glock.c since they are called by every set of glock operations. The xmote_th and drop_th glock operations are then made conditional upon those two routines existing and called from the previously mentioned functions in glock.c respectively. Also it can be seen that the go_sync operation isn't needed since it can easily be replaced by calls to xmote_bh and drop_bh respectively. This results in no longer (confusingly) calling back into routines in glock.c from glops.c and also reducing the glock operations by one member. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
1c0f4872 |
|
21-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove local exclusive glock mode Here is a patch for GFS2 to remove the local exclusive flag. In the places it was used, mutex's are always held earlier in the call path, so it appears redundant in the LM_ST_SHARED case. Also, the GFS2 holders were setting local exclusive in any case where the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock code where the flag was tested have been replaced with tests for the lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well as globally exclusive). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
6bd9c8c2 |
|
19-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove unused go_callback operation This is never used, so we might as well remove it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e5dab552 |
|
18-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove the "greedy" function from glock.[ch] The "greedy" code was an attempt to retain glocks for a minimum length of time when they relate to mmap()ed files. The current implementation of this feature is not, however, ideal in that it required allocating memory in order to do this and its overly complicated. It also misses the mark by ignoring the other I/O operations which are just as likely to suffer from the same problem. So the plan is to remove this now and then add the functionality back as part of the glock state machine at a later date (and thus take into account all the possible users of this feature) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fee852e3 |
|
17-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Shrink gfs2_inode memory by half Here is something I spotted (while looking for something entirely different) the other day. Rather than using a completion in each and every struct gfs2_holder, this removes it in favour of hashed wait queues, thus saving a considerable amount of memory both on the stack (where a number of gfs2_holder structures are allocated) and in particular in the gfs2_inode which has 8 gfs2_holder structures embedded within it. As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to 1912 bytes, a saving of 576 bytes per inode (no thats not a typo!). In actual practice we get a much better result than that since now that a gfs2_inode is under the 2048 byte barrier, we get two per 4k slab page effectively halving the amount of memory required to store gfs2_inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
3699e3a4 |
|
17-Jan-2007 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Clean up/speed up readdir This removes the extra filldir callback which gfs2 was using to enclose an attempt at readahead for inodes during readdir. The code was too complicated and also hurts performance badly in the case that the getdents64/readdir call isn't being followed by stat() and it wasn't even getting it right all the time when it was. As a result, on my test box an "ls" of a directory containing 250000 files fell from about 7mins (freshly mounted, so nothing cached) to between about 15 to 25 seconds. When the directory content was cached, the time taken fell from about 3mins to about 4 or 5 seconds. Interestingly in the cached case, running "ls -l" once reduced the time taken for subsequent runs of "ls" to about 6 secs even without this patch. Now it turns out that there was a special case of glocks being used for prefetching the metadata, but because of the timeouts for these locks (set to 10 secs) the metadata was being timed out before it was being used and this the prefetch code was constantly trying to prefetch the same data over and over. Calling "ls -l" meant that the inodes were brought into memory and once the inodes are cached, the glocks are not disposed of until the inodes are pushed out of the cache, thus extending the lifetime of the glocks, and thus bringing down the time for subsequent runs of "ls" considerably. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
0ac23069 |
|
28-Nov-2006 |
Randy Dunlap <randy.dunlap@oracle.com> |
[GFS2] lock function parameter Fix function parameter typing: fs/gfs2/glock.c:100: warning: function declaration isn't a prototype Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b004157a |
|
23-Nov-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix journal flush problem This fixes a bug which resulted in poor performance due to flushing the journal too often. The code path in question was via the inode_go_sync() function in glops.c. The solution is not to flush the journal immediately when inodes are ejected from memory, but batch up the work for glockd to deal with later on. This means that glocks may now live on beyond the end of the lifetime of their inodes (but not very much longer in the normal case). Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in calculation of the number of free journal blocks. The gfs2_logd process has been altered to be more responsive to the journal filling up. We now wake it up when the number of uncommitted journal blocks has reached the threshold level rather than trying to flush directly at the end of each transaction. This again means doing fewer, but larger, log flushes in general. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
1a14d3a6 |
|
20-Nov-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Simplify glops functions The go_sync callback took two flags, but one of them was set on every call, so this patch removes once of the flags and makes the previously conditional operations (on this flag), unconditional. The go_inval callback took three flags, each of which was set on every call to it. This patch removes the flags and makes the operations unconditional, which makes the logic rather more obvious. Two now unused flags are also removed from incore.h. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
ab923031 |
|
15-Nov-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix memory allocation in glock.c Change from GFP_KERNEL to GFP_NOFS as this was causing a slow down when trying to push inodes from cache. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
c594d886 |
|
08-Nov-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove unused GL_DUMP flag There is no way to set the GL_DUMP flag, and in any case the same thing can be done with systemtap if required for debugging, so this removes it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b60623c2 |
|
31-Oct-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Shrink gfs2_inode (3) - di_mode This removes the duplicate di_mode field in favour of using the inode->i_mode field. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
c4028958 |
|
22-Nov-2006 |
David Howells <dhowells@redhat.com> |
WorkStruct: make allyesconfig Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>
|
#
907b9bce |
|
25-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2/DLM] Fix trailing whitespace As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
7d308590 |
|
18-Sep-2006 |
Fabio Massimo Di Nitto <fabbione@ubuntu.com> |
[GFS2] Export lm_interface to kernel headers lm_interface.h has a few out of the tree clients such as GFS1 and userland tools. Right now, these clients keeps a copy of the file in their build tree that can go out of sync. Move lm_interface.h to include/linux, export it to userland and clean up fs/gfs2 to use the new location. Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a8336344 |
|
14-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix glock hash clearing A one liner bug fix to prevent the return value being wrong when more than one superblock is mounted. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
16feb9fe |
|
13-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use atomic_t rather than kref in glock.c Use atomic_t as the ref count in glocks rather than a kref. This is another step towards using RCU for the glock hash. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b6397893 |
|
12-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use hlist for glock hash chains This results in smaller list heads, so that we can have more chains in the same amount of memory (twice as many). I've multiplied the size of the table by four though - this is because we are saving memory by not having one lock per chain any more. So we land up using about the same amount of memory for the hash table as we did before I started these changes, the difference being that we now have four times as many hash chains. The reason that I say "about the same amount of memory" is that the actual amount now depends upon the NR_CPUS and some of the config variables, so that its not exact and in some cases we do use more memory. Eventually we might want to scale the hash table size according to the size of physical ram as measured on module load. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
24264434 |
|
11-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Rewrite of examine_bucket() The existing implementation of this function in glock.c was not very efficient as it relied upon keeping a cursor element upon the hash chain in question and moving it along. This new version improves upon this by using the current element as a cursor. This is possible since we only look at the "next" element in the list after we've taken the read_lock() subsequent to calling the examiner function. Obviously we have to eventually drop the ref count that we are then left with and we cannot do that while holding the read_lock, so we do that next time we drop the lock. That means either just before we examine another glock, or when the loop has terminated. The new implementation has several advantages: it uses only a read_lock() rather than a write_lock(), so it can run simnultaneously with other code, it doesn't need a "plug" element, so that it removes a test not only from this list iterator, but from all the other glock list iterators too. So it makes things faster and smaller. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
94610610 |
|
09-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove unused function from glock.c The callback for iopen locks is unused, so this removes it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a5e08a9e |
|
09-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Add consts to glock sorting function Add back the consts which were casted away in the glock sorting function. Also add early exit code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
087efdd3 |
|
09-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Make glock hash locks proportional to NR_CPUS Make the number of locks used for hash chains in glock.c proportional to NR_CPUS. Also move constants for the number of hash chains into glock.c from incore.h since they are not used outside of glock.c. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
37b2fa6a |
|
08-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Move rwlocks in glock.c into their own array This splits the rwlocks guarding the hash chains of the glock hash table into their own array. This will reduce memory usage in some cases due to better alignment, although the real reason for doing it is to allow the two tables to be different sizes in future (i.e. the locks will be sized proportionally with the max number of CPUs and the hash chains sized proportinally with the size of physical memory) In order to allow this, the gl_bucket member of struct gfs2_glock has now become gl_hash, so we record the hash rather than a pointer to the bucket itself. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
9b47c11d |
|
08-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use void * instead of typedef for locking module interface As requested by Jan Engelhardt, this removes the typedefs in the locking module interface and replaces them with void *. Also since we are changing the interface, I've added a few consts as well. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
1c089c32 |
|
07-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove one typedef This removes one of the typedefs from the locking interface. It is replaced by a forward declaration of the gfs2 superblock. The other two are not so easy to solve since in their case, they can refer to one of two possible structures. Cc: David Teigland <teigland@redhat.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
85d1da67 |
|
07-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Move glock hash table out of superblock There are several reasons why we want to do this: - Firstly its large and thus we'll scale better with multiple GFS2 fs mounted at the same time - Secondly its easier to scale its size as required (thats a plan for later patches) - Thirdly, we can use kzalloc rather than vmalloc when allocating the superblock (its now only 4888 bytes) - Fourth its all part of my plan to eventually be able to use RCU with the glock hash. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b8547856 |
|
07-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Add gfs2 superblock to glock hash function This is another patch preparing for sharing of the glock hash table between different gfs2 mounts. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
cd915493 |
|
03-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Change all types to uX style This makes all fixed size types have consistent names. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a91ea69f |
|
03-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Align all labels against LH side This makes everything consistent. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
50299965 |
|
04-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Tidy up locking code As per Jan Engelhardt's second email, this removes some unused code, and fixes up indenting in various places. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e9fc2aa0 |
|
01-Sep-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Update copyright, tidy up incore.h As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this updates the copyright message to say "version" in full rather than "v.2". Also incore.h has been updated to remove forward structure declarations which are not required. The gfs2_quota_lvb structure has now had endianess annotations added to it. Also quota.c has been updated so that we now store the lvb data locally in endian independant format to avoid needing a structure in host endianess too. As a result the endianess conversions are done as required at various points and thus the conversion routines in lvb.[ch] are no longer required. I've moved the one remaining constant in lvb.h thats used into lm.h and removed the unused lvb.[ch]. I have not changed the HIF_ constants. That is left to a later patch which I hope will unify the gh_flags and gh_iflags fields of the struct gfs2_holder. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
899be4d3 |
|
29-Aug-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Add superblock into key for glock lookups This adds the superblock as a key for glock lookups. Since the glocks are already stored in a per-superblock table, this has no effect at the moment. Later on this will change though. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d6a53727 |
|
30-Aug-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use const on glock lookup key Use const for the glock name which is being used as a lookup key in the glock hash table. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
ec45d9f5 |
|
30-Aug-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use slab properly with glocks We can take advantage of the slab allocator to ensure that all the list heads and the spinlock (plus one or two other fields) are initialised by slab to speed up allocation of glocks. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
5e2b0613 |
|
30-Aug-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove unused code from glock layer Remove the unused sync feature from glocks. This is currently done by calling the required functions to sync pages/blocks directly so this code isn't needed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
8fb4b536 |
|
30-Aug-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Make glock operations const For all the usual reasons of enforcing correctness and potentially reducing code size, this patch makes the glock operations const. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
86384605 |
|
25-Aug-2006 |
Abhijith Das <adas@redhat.com> |
[GFS2] Allow mounting of gfs2 and gfs2meta at the same time This patch allows the simultaneous mounting of gfs2meta and gfs2 filesystems. A restriction however is that a gfs2meta fs may only be mounted if its corresponding gfs2 filesystem is also mounted. Also, a gfs2 filesystem cannot be unmounted before its gfs2meta filesystem. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
a2242db0 |
|
24-Aug-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Speed up scanning of glocks I noticed the gfs2_scand seemed to be taking a lot of CPU, so in order to cut that down a bit, here is a patch. Firstly the type of a glock is a constant during its lifetime, so that its possible to check this without needing locking. I've moved the (common) case of testing for an inode glock outside of the glmutex lock. Also there was a mutex left over from when the glock cache was master of the inode cache. That isn't required any more so I've removed that too. There is probably scope for further speed ups in the future in this area. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
88721877 |
|
10-Aug-2006 |
Russell Cattelan <cattelan@redhat.com> |
[GFS2] Fix a couple of refcount leaks. recovery.c add a brelse to deal with gfs2_replay_read_block being called twice on the same block. add a dput to drop the ref count on the root inode. This was causing lingering glocks and thus causing a mount failure to hang. Fix a endian conversion macro that was was swizzling 16bits when it should have been swizzling 32. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
5dd9feaf |
|
28-Jul-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix bug in clear_inode We should have been waiting for lock demotion to finish in clear_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
f45b7ddd |
|
27-Jul-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use a bio to read the superblock This means that we don't need to create a special inode just to contain a struct address_space in order to read a single disk block. Instead we read the disk block directly. Its slightly faster, and uses slightly less memory, but the real reason for doing this is that it removes a special case from the glock code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
29937ac6 |
|
06-Jul-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fixes to scanning of glocks (again) This really is the correct fix this time. We just ignore all glocks associated with inodes until the inodes are pushed from the inode cache. At that point the glocks are queued for reclaim, so we don't need to do it here. Also fix one or two other minor bugs. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
627add2d |
|
05-Jul-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Correct logic in glock scanner Under certain circumstances the glock scanning logic would demote locks which ought not to have been selected for demotion. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
af18ddb8 |
|
24-Jun-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Eliminate one instance of __GFP_NOFAIL This removes one instance of GFP_NOFAIL from the glock callback function. It also fixes a bug where a , was used at a line end rather than ; causing unintended results. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
feaa7bba |
|
14-Jun-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix unlinked file handling This patch fixes the way we have been dealing with unlinked, but still open files. It removes all limits (other than memory for inodes, as per every other filesystem) on numbers of these which we can support on GFS2. It also means that (like other fs) its the responsibility of the last process to close the file to deallocate the storage, rather than the person who did the unlinking. Note that with GFS2, those two events might take place on different nodes. Also there are a number of other changes: o We use the Linux inode subsystem as it was intended to be used, wrt allocating GFS2 inodes o The Linux inode cache is now the point which we use for local enforcement of only holding one copy of the inode in core at once (previous to this we used the glock layer). o We no longer use the unlinked "special" file. We just ignore it completely. This makes unlinking more efficient. o We now use the 4th block allocation state. The previously unused state is used to track unlinked but still open inodes. o gfs2_inoded is no longer needed o Several fields are now no longer needed (and removed) from the in core struct gfs2_inode o Several fields are no longer needed (and removed) from the in core superblock There are a number of future possible optimisations and clean ups which have been made possible by this patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
382066da |
|
24-May-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Casts for printing 64bit numbers Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
320dd101 |
|
18-May-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] glock debugging and inode cache changes This adds some extra debugging to glock.c and changes inode.c's deallocation code to call the debugging code at a suitable moment. I'm chasing down a particular bug to do with deallocation at the moment and the code can go again once the bug is fixed. Also this includes the first part of some changes to unify the Linux struct inode and GFS2's struct gfs2_inode. This transformation will happen in small parts over the next short period. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
3a8a9a10 |
|
18-May-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Update copyright date to 2006 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
bd896801 |
|
18-May-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove semaphore.h from C files We no longer use semaphores, everything has been converted to mutex or rwsem, so we don't need to include this header any more. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fd88de56 |
|
05-May-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Readpages support This adds readpages support (and also corrects a small bug in the readpage error path at the same time). Hopefully this will improve performance by allowing GFS to submit larger lumps of I/O at a time. In order to simplify the setting of BH_Boundary, it currently gets set when we hit the end of a indirect pointer block. There is always a boundary at this point with the current allocation code. It doesn't get all the boundaries right though, so there is still room for improvement in this. See comments in fs/gfs2/ops_address.c for further information about readpages with GFS2. Signed-off-by: Steven Whitehouse
|
#
56409abb |
|
28-Apr-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove some unused code Remove some of the unused code flagged up by Adrian Bunk. Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse
|
#
08bc2dbc |
|
28-Apr-2006 |
Adrian Bunk <bunk@stusta.de> |
[GFS2] [-mm patch] fs/gfs2/: possible cleanups This patch contains the following possible cleanups: - make needlessly global code static - #if 0 unused functions - remove the following global function that was both unused and unimplemented: - super.c: gfs2_do_upgrade() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
36327521 |
|
28-Apr-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Reordering in deallocation to avoid recursive locking Despite my earlier careful search, there was a recursive lock left in the deallocation code. This removes it. It also should speed up deallocation be reducing the number of locking operations which take place by using two "try lock" operations on the two locks involved in inode deallocation which allows us to grab the locks out of order (compared with NFS which grabs the inode lock first and the iopen lock later). It is ok for us to fail while doing this since if it does fail it means that someone else is still using the inode and thus it wouldn't be possible to deallocate anyway. This fixes the bug reported to me by Rob Kenna. Cc: Rob Kenna <rkenna@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
e7f5c01c |
|
27-Apr-2006 |
David Teigland <teigland@redhat.com> |
[GFS2] Remove redundant casts to/from void Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
579b78a4 |
|
26-Apr-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Remove GL_NEVER_RECURSE flag There is no point in keeping this flag since recursion is not now allowed for any glock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
5965b1f4 |
|
26-Apr-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Don't do recursive locking in glock layer This patch changes the last user of recursive locking so that it no longer needs this feature and removes it from the glock layer. This makes the glock code a lot simpler and easier to understand. Its also a prerequsite to adding support for the AOP_TRUNCATED_PAGE return code (or at least it is if you don't want your brain to melt in the process) I've left in a couple of checks just in case there is some place else in the code which is still using this feature that I didn't spot yet, but they can probably be removed long term. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
190562bd |
|
20-Apr-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Fix a bug: scheduling under a spinlock At some stage, a mutex was added to gfs2_glock_put() without checking all its call sites. Two of them were called from under a spinlock causing random delays at various points and crashes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
fe1bdedc |
|
18-Apr-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use vmalloc() in dir code When allocating memory to sort directory entries, use vmalloc() rather than kmalloc() since for larger directories, the required size can easily be graeter than the 128k maximum of kmalloc(). Also adding the first steps towards getting the AOP_TRUNCATED_PAGE return code get in the glock code by flagging all places where we request a glock and we are holding a page lock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d0dc80db |
|
29-Mar-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Update debugging code Update the debugging code in trans.c and at the same time improve the debugging code for gfs2_holders. The new code should be pretty fast during the normal case and provide just as much information in case of errors (or more). One small function from glock.c has moved to glock.h as a static inline so that its return address won't get in the way of the debugging. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
5c676f6d |
|
27-Feb-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Macros removal in gfs2.h As suggested by Pekka Enberg <penberg@cs.helsinki.fi>. The DIV_RU macro is renamed DIV_ROUND_UP and and moved to kernel.h The other macros are gone from gfs2.h as (although not requested by Pekka Enberg) are a number of included header file which are now included individually. The inode number comparison function is now an inline function. The DT2IF and IF2DT may be addressed in a future patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d92a8d48 |
|
27-Feb-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Audit printk and kmalloc All printk calls now have KERN_ set where required and a couple of kmalloc(), memset(.., 0, ...) calls changed to kzalloc(). This is in response to comments from: Pekka Enberg <penberg@cs.helsinki.fi> and Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
d95cb943 |
|
23-Feb-2006 |
David Teigland <teigland@redhat.com> |
[GFS2] Patch to remove stats counters from GFS2 (II) Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
6a6b3d01 |
|
23-Feb-2006 |
David Teigland <teigland@redhat.com> |
[GFS2] Patch to remove stats gathering from GFS2 Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
f55ab26a |
|
20-Feb-2006 |
Steven Whitehouse <swhiteho@redhat.com> |
[GFS2] Use mutices rather than semaphores As well as a number of minor bug fixes, this patch changes GFS to use mutices rather than semaphores. This results in better information in case there are any locking problems. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
#
b3b94faa |
|
16-Jan-2006 |
David Teigland <teigland@redhat.com> |
[GFS2] The core of GFS2 This patch contains all the core files for GFS2. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|