#
4e98cc90 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: allow scrub to hook metadata updates in other writers Certain types of filesystem metadata can only be checked by scanning every file in the entire filesystem. Specific examples of this include quota counts, file link counts, and reverse mappings of file extents. Directory and parent pointer reconstruction may also fall into this category. File scanning is much trickier than scanning AG metadata because we have to take inode locks in the same order as the rest of [VX]FS, we can't be holding buffer locks when we do that, and scanning the whole filesystem takes time. Earlier versions of the online repair patchset relied heavily on fsfreeze as a means to quiesce the filesystem so that we could take locks in the proper order without worrying about concurrent updates from other writers. Reviewers of those patches opined that freezing the entire fs to check and repair something was not sufficiently better than unmounting to run fsck offline. I don't agree with that 100%, but the message was clear: find a way to repair things that minimizes the quiet period where nobody can write to the filesystem. Generally, building btree indexes online can be split into two phases: a collection phase where we compute the records that will be put into the new btree; and a construction phase, where we construct the physical btree blocks and persist them. While it's simple to hold resource locks for the entirety of the two phases to ensure that the new index is consistent with the rest of the system, we don't need to hold resource locks during the collection phase if we have a means to receive live updates of other work going on elsewhere in the system. The goal of this patch, then, is to enable online fsck to learn about metadata updates going on in other threads while it constructs a shadow copy of the metadata records to verify or correct the real metadata. To minimize the overhead when online fsck isn't running, we use srcu notifiers because they prioritize fast access to the notifier call chain (particularly when the chain is empty) at a cost to configuring notifiers. Online fsck should be relatively infrequent, so this is acceptable. The intended usage model is fairly simple. Code that modifies a metadata structure of interest should declare a xfs_hook_chain structure in some well defined place, and call xfs_hook_call whenever an update happens. Online fsck code should define a struct notifier_block and use xfs_hook_add to attach the block to the chain, along with a function to be called. This function should synchronize with the fsck scanner to update whatever in-memory data the scanner is collecting. When finished, xfs_hook_del removes the notifier from the list and waits for them all to complete. Originally, I selected srcu notifiers over blocking notifiers to implement live hooks because they seemed to have fewer impacts to scalability. The per-call cost of srcu_notifier_call_chain is higher (19ns) than blocking_notifier_ (4ns) in the single threaded case, but blocking notifiers use an rwsem to stabilize the list. Cacheline bouncing for that rwsem is costly to runtime code when there are a lot of CPUs running regular filesystem operations. If there are no hooks installed, this is a total waste of CPU time. Therefore, I stuck with srcu notifiers, despite trading off single threaded performance for multithreaded performance. I also wasn't thrilled with the very high teardown time for srcu notifiers, since the caller has to wait for the next rcu grace period. This can take a long time if there are a lot of CPUs. Then I discovered the jump label implementation of static keys. Jump labels use kernel code patching to replace a branch with a nop sled when the key is disabled. IOWs, they can eliminate the overhead of _call_chain when there are no hooks enabled. This makes blocking notifiers competitive again -- scrub runs faster because teardown of the chain is a lot cheaper, and runtime code only pays the rwsem locking overhead when scrub is actually running. With jump labels enabled, calls to empty notifier chains are elided from the call sites when there are no hooks registered, which means that the overhead is 0.36ns when fsck is not running. This is perfect for most of the architectures that XFS is expected to run on (e.g. x86, powerpc, arm64, s390x, riscv). For architectures that don't support jump labels (e.g. m68k) the runtime overhead of checking the static key is an atomic counter read. This isn't great, but it's still cheaper than taking a shared rwsem. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
785dd131 |
|
19-Feb-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
xfs: Remove mrlock wrapper mrlock was an rwsem wrapper that also recorded whether the lock was held for read or write. Now that we can ask the generic code whether the lock is held for read or write, we can remove this wrapper and use an rwsem directly. As the comment says, we can't use lockdep to assert that the ILOCK is held for write, because we might be in a workqueue, and we aren't able to tell lockdep that we do in fact own the lock. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
|
#
d4c75a1b |
|
15-Jan-2024 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert remaining kmem_free() to kfree() The remaining callers of kmem_free() are freeing heap memory, so we can convert them directly to kfree() and get rid of kmem_free() altogether. This conversion was done with: $ for f in `git grep -l kmem_free fs/xfs`; do > sed -i s/kmem_free/kfree/ $f > done $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
|
#
afdc1155 |
|
15-Jan-2024 |
Dave Chinner <dchinner@redhat.com> |
xfs: move kmem_to_page() Move it to the general xfs linux wrapper header file so we can prepare to remove kmem.h Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
|
#
ef5a83b7 |
|
16-Oct-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: use shifting and masking when converting rt extents, if possible Avoid the costs of integer division (32-bit and 64-bit) if the realtime extent size is a power of two. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
a76dba3b |
|
10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: create scaffolding for creating debugfs entries Set up debugfs directories for xfs as a whole, and a subdirectory for each mounted filesystem. This will enable the creation of debugfs files in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
d5c88131 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: allow queued AG intents to drain before scrubbing When a writer thread executes a chain of log intent items, the AG header buffer locks will cycle during a transaction roll to get from one intent item to the next in a chain. Although scrub takes all AG header buffer locks, this isn't sufficient to guard against scrub checking an AG while that writer thread is in the middle of finishing a chain because there's no higher level locking primitive guarding allocation groups. When there's a collision, cross-referencing between data structures (e.g. rmapbt and refcountbt) yields false corruption events; if repair is running, this results in incorrect repairs, which is catastrophic. Fix this by adding to the perag structure the count of active intents and make scrub wait until it has both AG header buffer locks and the intent counter reaches zero. One quirk of the drain code is that deferred bmap updates also bump and drop the intent counter. A fundamental decision made during the design phase of the reverse mapping feature is that updates to the rmapbt records are always made by the same code that updates the primary metadata. In other words, callers of bmapi functions expect that the bmapi functions will queue deferred rmap updates. Some parts of the reflink code queue deferred refcount (CUI) and bmap (BUI) updates in the same head transaction, but the deferred work manager completely finishes the CUI before the BUI work is started. As a result, the CUI drops the intent count long before the deferred rmap (RUI) update even has a chance to bump the intent count. The only way to keep the intent count elevated between the CUI and RUI is for the BUI to bump the counter until the RUI has been created. A second quirk of the intent drain code is that deferred work items must increment the intent counter as soon as the work item is added to the transaction. When a BUI completes and queues an RUI, the RUI must increment the counter before the BUI decrements it. The only way to accomplish this is to require that the counter be bumped as soon as the deferred work item is created in memory. In the next patches we'll improve on this facility, but this patch provides the basic functionality. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
5970e15d |
|
20-Nov-2022 |
Jeff Layton <jlayton@kernel.org> |
filelock: move file locking definitions to separate header file The file locking definitions have lived in fs.h since the dawn of time, but they are only used by a small subset of the source files that include it. Move the file locking definitions to a new header file, and add the appropriate #include directives to the source files that need them. By doing this we trim down fs.h a bit and limit the amount of rebuilding that has to be done when we make changes to the file locking APIs. Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Steve French <stfrench@microsoft.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org>
|
#
d03025ae |
|
14-Jul-2022 |
Bart Van Assche <bvanassche@acm.org> |
fs/xfs: Use the enum req_op and blk_opf_t types Improve static type checking by using the enum req_op type for variables that represent a request operation and the new blk_opf_t type for the combination of a request operation with request flags. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-63-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
919edbad |
|
29-Mar-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: drop async cache flushes from CIL commits. Jan Kara reported a performance regression in dbench that he bisected down to commit bad77c375e8d ("xfs: CIL checkpoint flushes caches unconditionally"). Whilst developing the journal flush/fua optimisations this cache was part of, it appeared to made a significant difference to performance. However, now that this patchset has settled and all the correctness issues fixed, there does not appear to be any significant performance benefit to asynchronous cache flushes. In fact, the opposite is true on some storage types and workloads, where additional cache flushes that can occur from fsync heavy workloads have measurable and significant impact on overall throughput. Local dbench testing shows little difference on dbench runs with sync vs async cache flushes on either fast or slow SSD storage, and no difference in streaming concurrent async transaction workloads like fs-mark. Fast NVME storage. From `dbench -t 30`, CIL scale: clients async sync BW Latency BW Latency 1 935.18 0.855 915.64 0.903 8 2404.51 6.873 2341.77 6.511 16 3003.42 6.460 2931.57 6.529 32 3697.23 7.939 3596.28 7.894 128 7237.43 15.495 7217.74 11.588 512 5079.24 90.587 5167.08 95.822 fsmark, 32 threads, create w/ 64 byte xattr w/32k logbsize create chown unlink async 1m41s 1m16s 2m03s sync 1m40s 1m19s 1m54s Slower SATA SSD storage: From `dbench -t 30`, CIL scale: clients async sync BW Latency BW Latency 1 78.59 15.792 83.78 10.729 8 367.88 92.067 404.63 59.943 16 564.51 72.524 602.71 76.089 32 831.66 105.984 870.26 110.482 128 1659.76 102.969 1624.73 91.356 512 2135.91 223.054 2603.07 161.160 fsmark, 16 threads, create w/32k logbsize create unlink async 5m06s 4m15s sync 5m00s 4m22s And on Jan's test machine: 5.18-rc8-vanilla 5.18-rc8-patched Amean 1 71.22 ( 0.00%) 64.94 * 8.81%* Amean 2 93.03 ( 0.00%) 84.80 * 8.85%* Amean 4 150.54 ( 0.00%) 137.51 * 8.66%* Amean 8 252.53 ( 0.00%) 242.24 * 4.08%* Amean 16 454.13 ( 0.00%) 439.08 * 3.31%* Amean 32 835.24 ( 0.00%) 829.74 * 0.66%* Amean 64 1740.59 ( 0.00%) 1686.73 * 3.09%* Performance and cache flush behaviour is restored to pre-regression levels. As such, we can now consider the async cache flush mechanism an unnecessary exercise in premature optimisation and hence we can now remove it and the infrastructure it requires completely. Fixes: bad77c375e8d ("xfs: CIL checkpoint flushes caches unconditionally") Reported-and-tested-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
a793d79e |
|
02-Dec-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
fs: move mapping helpers The low-level mapping helpers were so far crammed into fs.h. They are out of place there. The fs.h header should just contain the higher-level mapping helpers that interact directly with vfs objects such as struct super_block or struct inode and not the bare mapping helpers. Similarly, only vfs and specific fs code shall interact with low-level mapping helpers. And so they won't be made accessible automatically through regular {g,u}id helpers. Link: https://lore.kernel.org/r/20211123114227.3124056-3-brauner@kernel.org (v1) Link: https://lore.kernel.org/r/20211130121032.3753852-3-brauner@kernel.org (v2) Link: https://lore.kernel.org/r/20211203111707.3901969-3-brauner@kernel.org Cc: Seth Forshee <sforshee@digitalocean.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> CC: linux-fsdevel@vger.kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Seth Forshee <sforshee@digitalocean.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
#
0431d926 |
|
18-Jun-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: async blkdev cache flush The new checkpoint cache flush mechanism requires us to issue an unconditional cache flush before we start a new checkpoint. We don't want to block for this if we can help it, and we have a fair chunk of CPU work to do between starting the checkpoint and issuing the first journal IO. Hence it makes sense to amortise the latency cost of the cache flush by issuing it asynchronously and then waiting for it only when we need to issue the first IO in the transaction. To do this, we need async cache flush primitives to submit the cache flush bio and to wait on it. The block layer has no such primitives for filesystems, so roll our own for the moment. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
db07349d |
|
29-Mar-2021 |
Christoph Hellwig <hch@lst.de> |
xfs: move the di_flags field to struct xfs_inode In preparation of removing the historic icinode struct, move the flags field into the containing xfs_inode structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
9669f51d |
|
22-Jan-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: consolidate the eofblocks and cowblocks workers Remove the separate cowblocks work items and knob so that we can control and run everything from a single blockgc work queue. Note that the speculative_prealloc_lifetime sysfs knob retains its historical name even though the functions move to prefix xfs_blockgc_*. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
25219dbf |
|
09-Oct-2020 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: fix fallocate functions when rtextsize is larger than 1 In commit fe341eb151ec, I forgot that xfs_free_file_space isn't strictly a "remove mapped blocks" function. It is actually a function to zero file space by punching out the middle and writing zeroes to the unaligned ends of the specified range. Therefore, putting a rtextsize alignment check in that function is wrong because that breaks unaligned ZERO_RANGE on the realtime volume. Furthermore, xfs_file_fallocate already has alignment checks for the functions require the file range to be aligned to the size of a fundamental allocation unit (which is 1 FSB on the data volume and 1 rt extent on the realtime volume). Create a new helper to check fallocate arguments against the realtiem allocation unit size, fix the fallocate frontend to use it, fix free_file_space to delete the correct range, and remove a now redundant check from insert_file_space. NOTE: The realtime extent size is not required to be a power of two! Fixes: fe341eb151ec ("xfs: ensure that fpunch, fcollapse, and finsert operations are aligned to rt extent size") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
|
#
c63290e3 |
|
16-Sep-2020 |
Kaixu Xia <kaixuxia@tencent.com> |
xfs: remove the unused SYNCHRONIZE macro There are no callers of the SYNCHRONIZE() macro, so remove it. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
0d5a5714 |
|
02-Jul-2020 |
Yafang Shao <laoar.shao@gmail.com> |
xfs: remove useless definitions in xfs_linux.h Remove current_pid(), current_test_flags() and current_clear_flags_nested(), because they are useless. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
d5f0f49a |
|
26-Feb-2020 |
Christoph Hellwig <hch@lst.de> |
xfs: clean up the attr flag confusion The ATTR_* flags have a long IRIX history, where they a userspace interface, the on-disk format and an internal interface. We've split out the on-disk interface to the XFS_ATTR_* values, but despite (or because?) of that the flag have still been a mess. Switch the internal interface to pass the on-disk XFS_ATTR_* flags for the namespace and the Linux XATTR_* flags for the actual flags instead. The ATTR_* values that are actually used are move to xfs_fs.h with a new XFS_IOC_* prefix to not conflict with the userspace version that has the same name and must have the same value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
ba8adad5 |
|
21-Feb-2020 |
Christoph Hellwig <hch@lst.de> |
xfs: remove the kuid/kgid conversion wrappers Remove the XFS wrappers for converting from and to the kuid/kgid types. Mostly this means switching to VFS i_{u,g}id_{read,write} helpers, but in a few spots the calls to the conversion functions is open coded. To match the use of sb->s_user_ns in the helpers and other file systems, sb->s_user_ns is also used in the quota code. The ACL code already does the conversion in a grotty layering violation in the VFS xattr code, so it keeps using init_user_ns for the identity mapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
6519f708 |
|
17-Nov-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: report corruption only as a regular error Redefine XFS_IS_CORRUPT so that it reports corruptions only via xfs_corruption_report. Since these are on-disk contents (and not checks of internal state), we don't ever want to panic the kernel. This also amends the corruption report to recommend unmounting and running xfs_repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
1ec28615 |
|
11-Nov-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add a XFS_IS_CORRUPT macro Add a new macro, XFS_IS_CORRUPT, which we will use to integrate some corruption reporting when the corruption test expression is true. This will be used in the next patch to remove the ugly XFS_WANT_CORRUPT* macros. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
9842b56c |
|
02-Nov-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: make the assertion message functions take a mount parameter Make the assfail and asswarn functions take a struct xfs_mount so that we can start tying debugging and corruption messages to a particular mount. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
6ad5b325 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: use bios directly to read and write the log recovery buffers The xfs_buf structure is basically used as a glorified container for a memory allocation in the log recovery code. Replace it with a call to kmem_alloc_large and a simple abstraction to read into or write from it synchronously using chained bios. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
1e85a367 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: remove the no-op spinlock_destroy stub Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
0703a8e1 |
|
08-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: replace do_mod with native operations do_mod() is a hold-over from when we have different sizes for file offsets and and other internal values for 40 bit XFS filesystems. Hence depending on build flags variables passed to do_mod() could change size. We no longer support those small format filesystems and hence everything is of fixed size theses days, even on 32 bit platforms. As such, we can convert all the do_mod() callers to platform optimised modulus operations as defined by linux/math64.h. Individual conversions depend on the types of variables being used. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
9bb54cb5 |
|
07-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: clean up MIN/MAX Get rid of the MIN/MAX macros and just use the native min/max macros directly in the XFS code. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
4a2d01b0 |
|
07-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: xfs_reflink_convert_cow() memory allocation deadlock xfs_reflink_convert_cow() manipulates the incore extent list in GFP_KERNEL context in the IO submission path whilst holding locked pages under writeback. This is a memory reclaim deadlock vector. This code is not in a transaction, so any memory allocations it makes aren't protected via the memalloc_nofs_save() context that transactions carry. Hence we need to run this call under memalloc_nofs_save() context to prevent potential memory allocations from being run as GFP_KERNEL and deadlocking. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
0b61f8a4 |
|
05-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert to SPDX license tags Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
c9690043 |
|
09-Jan-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: use %px for data pointers when debugging Starting with commit 57e734423ad ("vsprintf: refactor %pK code out of pointer"), the behavior of the raw '%p' printk format specifier was changed to print a 32-bit hash of the pointer value to avoid leaking kernel pointers into dmesg. For most situations that's good. This is /undesirable/ behavior when we're trying to debug XFS, however, so define a PTR_FMT that prints the actual pointer when we're in debug mode. Note that %p for tracepoints still prints the raw pointer, so in the long run we could consider rewriting some of these messages as tracepoints. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
a0158315 |
|
08-Jan-2018 |
Richard Wareing <rwareing@fb.com> |
xfs: Show realtime device stats on statfs calls if realtime flags set - Reports realtime device free blocks in statfs calls if (realtime) inheritance bit is set on the inode of directory, or realtime flag in the case of files. This is a bit more intuitive, especially for use-cases which are using a much larger device for the realtime device. - Add XFS_IS_REALTIME_MOUNT option to gate based on the existence of a realtime device on the mount, similar to the XFS_IS_REALTIME_INODE option. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Richard Wareing <rwareing@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
274e0a1f |
|
20-Nov-2017 |
Christoph Hellwig <hch@lst.de> |
xfs: abstract out dev_t conversions And move them to xfs_linux.h so that xfsprogs can stub them out more easily. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
fc41e2a1 |
|
06-Nov-2017 |
Christoph Hellwig <hch@lst.de> |
xfs: always define STATIC to static noinline Ever since we added the noinline tag there is no good reason to define away the static for debug builds - we'll get just as good debug information with our without it, so don't mess up sparse and other checkers due to it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
52c732ee |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: refactor btree block header checking functions Refactor the btree block header checks to have an internal function that returns the address of the failing check without logging errors. The scrubber will call the internal function, while the external version will maintain the current logging behavior. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
b31ff3cd |
|
12-Sep-2017 |
Richard Wareing <rwareing@fb.com> |
xfs: XFS_IS_REALTIME_INODE() should be false if no rt device present If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on a directory in a filesystem that does not have a realtime device and create a new file in that directory, it gets marked as a real time file. When data is written and a fsync is issued, the filesystem attempts to flush a non-existent rt device during the fsync process. This results in a crash dereferencing a null buftarg pointer in xfs_blkdev_issue_flush(): BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: xfs_blkdev_issue_flush+0xd/0x20 ..... Call Trace: xfs_file_fsync+0x188/0x1c0 vfs_fsync_range+0x3b/0xa0 do_fsync+0x3d/0x70 SyS_fsync+0x10/0x20 do_syscall_64+0x4d/0xb0 entry_SYSCALL64_slow_path+0x25/0x25 Setting RT inode flags does not require special privileges so any unprivileged user can cause this oops to occur. To reproduce, confirm kernel is compiled with CONFIG_XFS_RT=y and run: # mkfs.xfs -f /dev/pmem0 # mount /dev/pmem0 /mnt/test # mkdir /mnt/test/foo # xfs_io -c 'chattr +t' /mnt/test/foo # xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait. Kernels built with CONFIG_XFS_RT=n are not exposed to this bug. Fixes: f538d4da8d52 ("[XFS] write barrier support") Cc: <stable@vger.kernel.org> Signed-off-by: Richard Wareing <rwareing@fb.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
6eb0b8df |
|
07-Jul-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: rename MAXPATHLEN to XFS_SYMLINK_MAXLEN XFS has a maximum symlink target length of 1024 bytes; this is a holdover from the Irix days. Unfortunately, the constant establishing this is 'MAXPATHLEN' and is /not/ the same as the Linux MAXPATHLEN, which is 4096. The kernel enforces its 1024 byte MAXPATHLEN on symlink targets, but xfsprogs picks up the (Linux) system 4096 byte MAXPATHLEN, which means that xfs_repair doesn't complain about oversized symlinks. Since this is an on-disk format constraint, put the define in the XFS namespace and move everything over to use the new name. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
c8ce540d |
|
16-Jun-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: remove double-underscore integer types This is a purely mechanical patch that removes the private __{u,}int{8,16,32,64}_t typedefs in favor of using the system {u,}int{8,16,32,64}_t typedefs. This is the sed script used to perform the transformation and fix the resulting whitespace and indentation errors: s/typedef\t__uint8_t/typedef __uint8_t\t/g s/typedef\t__uint/typedef __uint/g s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g s/__uint8_t\t/__uint8_t\t\t/g s/__uint/uint/g s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g s/__int/int/g /^typedef.*int[0-9]*_t;$/d Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
d905fdaa |
|
04-May-2017 |
Amir Goldstein <amir73il@gmail.com> |
xfs: use the common helper uuid_is_null() Use the common helper uuid_is_null() and remove the xfs specific helper uuid_is_nil(). The common helper does not check for the NULL pointer value as xfs helper did, but xfs code never calls the helper with a pointer that can be NULL. Conform comments and warning strings to use the term 'null uuid' instead of 'nil uuid', because this is the terminology used by lib/uuid.c and its users. It is also the terminology used in userspace by libuuid and xfsprogs. Signed-off-by: Amir Goldstein <amir73il@gmail.com> [hch: remove now unused uuid.[ch]] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
#
f9727a17 |
|
17-May-2017 |
Christoph Hellwig <hch@lst.de> |
uuid: rename uuid types Our "little endian" UUID really is a Wintel GUID, so rename it and its helpers such (guid_t). The big endian UUID is the only true one, so give it the name uuid_t. The uuid_le and uuid_be names are retained for now, but will hopefully go away soon. The exception to that are the _cmp helpers that will be replaced by better primitives ASAP and thus don't get the new names. Also the _to_bin helpers are named to match the better named uuid_parse routine in userspace. Also remove the existing typedef in XFS that's now been superceeded by the generic type name. Signed-off-by: Christoph Hellwig <hch@lst.de> [andy: also update the UUID_LE/UUID_BE macros including fallout] Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
b1f359f9 |
|
05-May-2017 |
Christoph Hellwig <hch@lst.de> |
xfs: use uuid_be to implement the uuid_t type Use the generic Linux definition to implement our UUID type, this will allow using more generic infrastructure in the future. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
5146d0b7 |
|
11-Apr-2017 |
Eric Sandeen <sandeen@redhat.com> |
xfs: remove custom do_div implementations Long ago, all this gunk was added with a lament about problems with gcc's do_div, and a fun recommendation in the changelog: egcs-2.91.66 is the recommended compiler version for building XFS. All this special stuff was needed to work around an old gcc bug, apparently, and it's been there ever since. There should be no need for this anymore, so remove it. Remove the special 32-bit xfs_do_mod as well; just let the kernel's do_div() handle all this. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
174cd4b1 |
|
02-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
bf46ecc3 |
|
17-Jan-2017 |
Amir Goldstein <amir73il@gmail.com> |
xfs: make the ASSERT() condition likely The ASSERT() condition is the normal case, not the exception, so testing the condition should be likely(), not unlikely(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
7c0f6ba6 |
|
24-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
6031e73a |
|
06-Dec-2016 |
Lucas Stach <dev@lynxeye.de> |
xfs: use rhashtable to track buffer cache On filesystems with a lot of metadata and in metadata intensive workloads xfs_buf_find() is showing up at the top of the CPU cycles trace. Most of the CPU time is spent on CPU cache misses while traversing the rbtree. As the buffer cache does not need any kind of ordering, but fast lookups a hashtable is the natural data structure to use. The rhashtable infrastructure provides a self-scaling hashtable implementation and allows lookups to proceed while the table is going through a resize operation. This reduces the CPU-time spent for the lookups to 1/3 even for small filesystems with a relatively small number of cached buffers, with possibly much larger gains on higher loaded filesystems. [dchinner: reduce minimum hash size to an acceptable size for large filesystems with many AGs with no active use.] [dchinner: remove stale rbtree asserts.] [dchinner: use xfs_buf_map for compare function argument.] [dchinner: make functions static.] [dchinner: remove redundant comments.] Signed-off-by: Lucas Stach <dev@lynxeye.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
83104d44 |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: garbage collect old cowextsz reservations Trim CoW reservations made on behalf of a cowextsz hint if they get too old or we run low on quota, so long as we don't have dirty data awaiting writeback or directio operations in progress. Garbage collection of the cowextsize extents are kept separate from prealloc extent reaping because setting the CoW prealloc lifetime to a (much) higher value than the regular prealloc extent lifetime has been useful for combatting CoW fragmentation on VM hosts where the VMs experience bursty write behaviors and we can keep the utilization ratios low enough that we don't start to run out of space. IOWs, it benefits us to keep the CoW fork reservations around for as long as we can unless we run out of blocks or hit inode reclaim. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
aa2dd0ad |
|
19-Jul-2016 |
Christoph Hellwig <hch@lst.de> |
xfs: remove __arch_pack Instead we always declare struct xfs_dir2_sf_hdr as packed. That's the expected layout, and while most major architectures do the packing by default the new structure size and offset checker showed that not only the ARM old ABI got this wrong, but various minor embedded architectures did as well. [Verified that no code change on x86-64 results from this change] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
09cbfeaf |
|
01-Apr-2016 |
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> |
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
80529c45 |
|
11-Oct-2015 |
Bill O'Donnell <billodo@redhat.com> |
xfs: pass xfsstats structures to handlers and macros This patch is the next step toward per-fs xfs stats. The patch makes the show and clear routines able to handle any stats structure associated with a kobject. Instead of a single global xfsstats structure, add kobject and a pointer to a per-cpu struct xfsstats. Modify the macros that manipulate the stats accordingly: XFS_STATS_INC, XFS_STATS_DEC, and XFS_STATS_ADD now access xfsstats->xs_stats. The sysfs functions need to get from the kobject back to the xfsstats structure which contains it, and pass the pointer to the ->xs_stats percpu structure into the show & clear routines. Signed-off-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
b2a922cd |
|
21-Jun-2015 |
Christoph Hellwig <hch@lst.de> |
xfs: remove xfs_caddr_t Just use char pointers directly instead of the confusing typedef to a pointer type. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
fc51c2b5 |
|
21-Jun-2015 |
Christoph Hellwig <hch@lst.de> |
xfs: remove inst_t We can simply use a void pointer to pass a long return addresses in the debugging helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
db9d67d6 |
|
21-Jun-2015 |
Christoph Hellwig <hch@lst.de> |
xfs: remove __psint_t and __psunsigned_t Replace uses of __psint_t with the proper uintptr_t and ptrdiff_t types, and remove the defintions of __psint_t and __psunsigned_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
5681ca40 |
|
23-Feb-2015 |
Dave Chinner <dchinner@redhat.com> |
xfs: Remove icsb infrastructure Now that the in-core superblock infrastructure has been replaced with generic per-cpu counters, we don't need it anymore. Nuke it from orbit so we are sure that it won't haunt us again... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
6d3ebaae |
|
27-Nov-2014 |
Christoph Hellwig <hch@lst.de> |
xfs: merge xfs_dinode.h into xfs_format.h More consolidatation for the on-disk format defintions. Note that the XFS_IS_REALTIME_INODE moves to xfs_linux.h instead as it is not related to the on disk format, but depends on a CONFIG_ option. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
e076b0f3 |
|
01-Oct-2014 |
Dave Chinner <dchinner@redhat.com> |
xfs: kill time.h The typedef for timespecs and nanotime() are completely unnecessary, and delay() can be moved to fs/xfs/linux.h, which means this file can go away. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
b92cc59f6 |
|
03-Aug-2014 |
Dave Chinner <dchinner@redhat.com> |
xfs: kill xfs_vnode.h Move the IO flag definitions to xfs_inode.h and kill the header file as it is now empty. Removing the xfs_vnode.h file showed up an implicit header include path: xfs_linux.h -> xfs_vnode.h -> xfs_fs.h And so every xfs header file has been inplicitly been including xfs_fs.h where it is needed or not. Hence the removal of xfs_vnode.h causes all sorts of build issues because BBTOB() and friends are no longer automatically included in the build. This also gets fixed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
d5cf09ba |
|
29-Jul-2014 |
Christoph Hellwig <hch@lst.de> |
xfs: require 64-bit sector_t Trying to support tiny disks only and saving a bit memory might have made sense on an SGI O2 15 years ago, but is pretty pointless today. Remove the rarely tested codepath that uses various smaller in-memory types to reduce our test matrix and make the codebase a little bit smaller and less complicated. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
a31b1d3d |
|
14-Jul-2014 |
Brian Foster <bfoster@redhat.com> |
xfs: add xfs_mount sysfs kobject Embed a base kobject into xfs_mount. This creates a kobject associated with each XFS mount and a subdirectory in sysfs with the name of the filesystem. The subdirectory lifecycle matches that of the mount. Also add the new xfs_sysfs.[c,h] source files with some XFS sysfs infrastructure to facilitate attribute creation. Note that there are currently no attributes exported as part of the xfs_mount kobject. It exists solely to serve as a per-mount container for child objects. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
d99831ff |
|
21-Jun-2014 |
Eric Sandeen <sandeen@sandeen.net> |
xfs: return is not a function return is not a function. "return(EIO);" is silly; "return (EIO);" moreso. return is not a function. Nuke the pointless parens. [dchinner: catch a couple of extra cases in xfs_attr_list.c, xfs_acl.c and xfs_linux.h.] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
ca23f8fd |
|
26-Feb-2014 |
Eric Sandeen <sandeen@sandeen.net> |
xfs: add xfs_verifier_error() We want to distinguish between corruption, CRC errors, etc. In addition, the full stack trace on verifier errors seems less than helpful; it looks more like an oops than corruption. Create a new function to specifically alert the user to verifier errors, which can differentiate between EFSCORRUPTED and CRC mismatches. It doesn't dump stack unless the xfs error level is turned up high. Define a new error message (EFSBADCRC) to clearly identify CRC errors. (Defined to EBADMSG, bad message) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
51582170 |
|
26-Feb-2014 |
Eric Sandeen <sandeen@redhat.com> |
xfs: add helper for verifying checksums on xfs_bufs Many/most callers of xfs_verify_cksum() pass bp->b_addr and BBTOB(bp->b_length) as the first 2 args. Add a helper which can just accept the bp and the crc offset, and work it out on its own, for brevity. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
5c753909 |
|
07-Aug-2013 |
Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> |
xfs: remove two unused macro definitions in xfs_linux.h Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
c5eeb7ec |
|
15-Aug-2013 |
Dwight Engen <dwight.engen@oracle.com> |
xfs: create wrappers for converting kuid_t to/from uid_t Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Dwight Engen <dwight.engen@oracle.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
4f3d71f6 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: move kernel specific type definitions to xfs.h xfs_types.h is shared with userspace, so having kernel specific types defined in it is problematic. Move all the kernel specific defines to xfs_linux.h so we can remove the __KERNEL__ guards from xfs_types.h. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
742ae1e3 |
|
30-Apr-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce CONFIG_XFS_WARN Running a CONFIG_XFS_DEBUG kernel in production environments is not the best idea as it introduces significant overhead, can change the behaviour of algorithms (such as allocation) to improve test coverage, and (most importantly) panic the machine on non-fatal errors. There are many cases where all we want to do is run a kernel with more bounds checking enabled, such as is provided by the ASSERT() statements throughout the code, but without all the potential overhead and drawbacks. This patch converts all the ASSERT statements to evaluate as WARN_ON(1) statements and hence if they fail dump a warning and a stack trace to the log. This has minimal overhead and does not change any algorithms, and will allow us to find strange "out of bounds" problems more easily on production machines. There are a few places where assert statements contain debug only code. These are converted to be debug-or-warn only code so that we still get all the assert checks in the code. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
3d6e0361 |
|
27-Mar-2013 |
Rich Johnston <rjohnston@sgi.com> |
xfs: Add ratelimited printk for different alert levels Ratelimited printk will be useful in printing xfs messages which are otherwise not required to be printed always due to their high rate (to prevent kernel ring buffer from overflowing), while at the same time required to be printed. Signed-off-by: Raghavendra D Prabhu <rprabhu@wnohang.net> Reviewed-by: Rich Johnston <rjohnston@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
bc02e869 |
|
15-Nov-2012 |
Christoph Hellwig <hch@lst.de> |
xfs: add CRC infrastructure - add a mount feature bit for CRC enabled filesystems - add some helpers for generating and verifying the CRCs - add a copy_uuid helper The checksumming helpers are loosely based on similar ones in sctp, all other bits come from Dave Chinner. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
579b62fa |
|
06-Nov-2012 |
Brian Foster <bfoster@redhat.com> |
xfs: add background scanning to clear eofblocks inodes Create a new mount workqueue and delayed_work to enable background scanning and freeing of eofblocks inodes. The scanner kicks in once speculative preallocation occurs and stops requeueing itself when no eofblocks inodes exist. The scan interval is based on the new 'speculative_prealloc_lifetime' tunable (default to 5m). The background scanner performs unfiltered, best effort scans (which skips inodes under lock contention or with a dirty cache mapping). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
0030807c |
|
11-Oct-2011 |
Christoph Hellwig <hch@infradead.org> |
xfs: revert to using a kthread for AIL pushing Currently we have a few issues with the way the workqueue code is used to implement AIL pushing: - it accidentally uses the same workqueue as the syncer action, and thus can be prevented from running if there are enough sync actions active in the system. - it doesn't use the HIGHPRI flag to queue at the head of the queue of work items At this point I'm not confident enough in getting all the workqueue flags and tweaks right to provide a perfectly reliable execution context for AIL pushing, which is the most important piece in XFS to make forward progress when the log fills. Revert back to use a kthread per filesystem which fixes all the above issues at the cost of having a task struct and stack around for each mounted filesystem. In addition this also gives us much better ways to diagnose any issues involving hung AIL pushing and removes a small amount of code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
|
#
c59d87c4 |
|
12-Aug-2011 |
Christoph Hellwig <hch@infradead.org> |
xfs: remove subdirectories Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the annoying subdirectories in the XFS source code. Besides the large amount of file rename the only changes are to the Makefile, a few files including headers with the subdirectory prefix, and the binary sysctl compat code that includes a header under fs/xfs/ from kernel/. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
|