#
7a2192ac |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: create refcount bag structure for btree repairs Create a bag structure for refcount information that uses the refcount bag btree defined in the previous patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
18a1e644 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: define an in-memory btree for storing refcount bag info during repairs Create a new in-memory btree type so that we can store refcount bag info in a much more memory-efficient and performant format. Recall that the refcount recordset regenerator computes the new recordset from browsing the rmap records. Let's say that the rmap records are: {agbno: 10, length: 40, ...} {agbno: 11, length: 3, ...} {agbno: 12, length: 20, ...} {agbno: 15, length: 1, ...} It is convenient to have a data structure that could quickly tell us the refcount for an arbitrary agbno without wasting memory. An array or a list could do that pretty easily. List suck because of the pointer overhead. xfarrays are a lot more compact, but we want to minimize sparse holes in the xfarray to constrain memory usage. Maintaining any kind of record order isn't needed for correctness, so I created the "rcbag", which is shorthand for an unordered list of (excerpted) reverse mappings. So we add the first rmap to the rcbag, and it looks like: 0: {agbno: 10, length: 40} The refcount for agbno 10 is 1. Then we move on to block 11, so we add the second rmap: 0: {agbno: 10, length: 40} 1: {agbno: 11, length: 3} The refcount for agbno 11 is 2. We move on to block 12, so we add the third: 0: {agbno: 10, length: 40} 1: {agbno: 11, length: 3} 2: {agbno: 12, length: 20} The refcount for agbno 12 and 13 is 3. We move on to block 14, and remove the second rmap: 0: {agbno: 10, length: 40} 1: NULL 2: {agbno: 12, length: 20} The refcount for agbno 14 is 2. We move on to block 15, and add the last rmap. But we don't care where it is and we don't want to expand the array so we put it in slot 1: 0: {agbno: 10, length: 40} 1: {agbno: 15, length: 1} 2: {agbno: 12, length: 20} The refcount for block 15 is 3. Notice how order doesn't matter in this list? That's why repair uses an unordered list, or "bag". The data structure is not a set because it does not guarantee uniqueness. That said, adding and removing specific items is now an O(n) operation because we have no idea where that item might be in the list. Overall, the runtime is O(n^2) which is bad. I realized that I could easily refactor the btree code and reimplement the refcount bag with an xfbtree. Adding and removing is now O(log2 n), so the runtime is at least O(n log2 n), which is much faster. In the end, the rcbag becomes a sorted list, but that's merely a detail of the implementation. The repair code doesn't care. (Note: That horrible xfs_db bmap_inflate command can be used to exercise this sort of rcbag insanity by cranking up refcounts quickly.) Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
32080a9b |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair the rmapbt Rebuild the reverse mapping btree from all primary metadata. This first patch establishes the bare mechanics of finding records and putting together a new ondisk tree; more complex pieces are needed to make it work properly. Link: Documentation/filesystems/xfs-online-fsck-design.rst Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
a095686a |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: support in-memory btrees Adapt the generic btree cursor code to be able to create a btree whose buffers come from a (presumably in-memory) buftarg with a header block that's specific to in-memory btrees. We'll connect this to other parts of online scrub in the next patches. Note that in-memory btrees always have a block size matching the system memory page size for efficiency reasons. There are also a few things we need to do to finalize a btree update; that's covered in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
5076a604 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: support in-memory buffer cache targets Allow the buffer cache to target in-memory files by making it possible to have a buftarg that maps pages from private shmem files. As the prevous patch alludes, the in-memory buftarg contains its own cache, points to a shmem file, and does not point to a block_device. The next few patches will make it possible to construct an xfs_btree in pageable memory by using this buftarg. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
4ed080cd |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair summary counters Use the same summary counter calculation infrastructure to generate new values for the in-core summary counters. The difference between the scrubber and the repairer is that the repairer will freeze the fs during setup, which means that the values should match exactly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
6b631c60 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: teach repair to fix file nlinks Fix the file link counts since we just computed the correct ones. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
f1184081 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: teach scrub to check file nlinks Create the necessary scrub code to walk the filesystem's directory tree so that we can compute file link counts. Similar to quotacheck, we create an incore shadow array of link count information and then we walk the filesystem a second time to compare the link counts. We need live updates to keep the information up to date during the lengthy scan, so this scrubber remains disabled until the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
96ed2ae4 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair dquots based on live quotacheck results Use the shadow quota counters that live quotacheck creates to reset the incore dquot counters. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
48dd9117 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: implement live quotacheck inode scan Create a new trio of scrub functions to check quota counters. While the dquots themselves are filesystem metadata and should be checked early, the dquot counter values are computed from other metadata and are therefore summary counters. We don't plug these into the scrub dispatch just yet, because we still need to be able to watch quota updates while doing our scan. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
4e98cc90 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: allow scrub to hook metadata updates in other writers Certain types of filesystem metadata can only be checked by scanning every file in the entire filesystem. Specific examples of this include quota counts, file link counts, and reverse mappings of file extents. Directory and parent pointer reconstruction may also fall into this category. File scanning is much trickier than scanning AG metadata because we have to take inode locks in the same order as the rest of [VX]FS, we can't be holding buffer locks when we do that, and scanning the whole filesystem takes time. Earlier versions of the online repair patchset relied heavily on fsfreeze as a means to quiesce the filesystem so that we could take locks in the proper order without worrying about concurrent updates from other writers. Reviewers of those patches opined that freezing the entire fs to check and repair something was not sufficiently better than unmounting to run fsck offline. I don't agree with that 100%, but the message was clear: find a way to repair things that minimizes the quiet period where nobody can write to the filesystem. Generally, building btree indexes online can be split into two phases: a collection phase where we compute the records that will be put into the new btree; and a construction phase, where we construct the physical btree blocks and persist them. While it's simple to hold resource locks for the entirety of the two phases to ensure that the new index is consistent with the rest of the system, we don't need to hold resource locks during the collection phase if we have a means to receive live updates of other work going on elsewhere in the system. The goal of this patch, then, is to enable online fsck to learn about metadata updates going on in other threads while it constructs a shadow copy of the metadata records to verify or correct the real metadata. To minimize the overhead when online fsck isn't running, we use srcu notifiers because they prioritize fast access to the notifier call chain (particularly when the chain is empty) at a cost to configuring notifiers. Online fsck should be relatively infrequent, so this is acceptable. The intended usage model is fairly simple. Code that modifies a metadata structure of interest should declare a xfs_hook_chain structure in some well defined place, and call xfs_hook_call whenever an update happens. Online fsck code should define a struct notifier_block and use xfs_hook_add to attach the block to the chain, along with a function to be called. This function should synchronize with the fsck scanner to update whatever in-memory data the scanner is collecting. When finished, xfs_hook_del removes the notifier from the list and waits for them all to complete. Originally, I selected srcu notifiers over blocking notifiers to implement live hooks because they seemed to have fewer impacts to scalability. The per-call cost of srcu_notifier_call_chain is higher (19ns) than blocking_notifier_ (4ns) in the single threaded case, but blocking notifiers use an rwsem to stabilize the list. Cacheline bouncing for that rwsem is costly to runtime code when there are a lot of CPUs running regular filesystem operations. If there are no hooks installed, this is a total waste of CPU time. Therefore, I stuck with srcu notifiers, despite trading off single threaded performance for multithreaded performance. I also wasn't thrilled with the very high teardown time for srcu notifiers, since the caller has to wait for the next rcu grace period. This can take a long time if there are a lot of CPUs. Then I discovered the jump label implementation of static keys. Jump labels use kernel code patching to replace a branch with a nop sled when the key is disabled. IOWs, they can eliminate the overhead of _call_chain when there are no hooks enabled. This makes blocking notifiers competitive again -- scrub runs faster because teardown of the chain is a lot cheaper, and runtime code only pays the rwsem locking overhead when scrub is actually running. With jump labels enabled, calls to empty notifier chains are elided from the call sites when there are no hooks registered, which means that the overhead is 0.36ns when fsck is not running. This is perfect for most of the architectures that XFS is expected to run on (e.g. x86, powerpc, arm64, s390x, riscv). For architectures that don't support jump labels (e.g. m68k) the runtime overhead of checking the static key is an atomic counter read. This isn't great, but it's still cheaper than taking a shared rwsem. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
8660c7b7 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: implement live inode scan for scrub This patch implements a live file scanner for online fsck functions that require the ability to walk a filesystem to gather metadata records and stay informed about metadata changes to files that have already been visited. The iscan structure consists of two inode number cursors: one to track which inode we want to visit next, and a second one to track which inodes have already been visited. This second cursor is key to capturing live updates to files previously scanned while the main thread continues scanning -- any inode greater than this value hasn't been scanned and can go on its way; any other update must be incorporated into the collected data. It is critical for the scanning thraad to hold exclusive access on the inode until after marking the inode visited. This new code is a separate patch from the patchsets adding callers for the sake of enabling the author to move patches around his tree with ease. The intended usage model for this code is roughly: xchk_iscan_start(iscan, 0, 0); while ((error = xchk_iscan_iter(sc, iscan, &ip)) == 1) { xfs_ilock(ip, ...); /* capture inode metadata */ xchk_iscan_mark_visited(iscan, ip); xfs_iunlock(ip, ...); xfs_irele(ip); } xchk_iscan_stop(iscan); if (error) return error; Hook functions for live updates can then do: if (xchk_iscan_want_live_update(...)) /* update the captured inode metadata */ Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
f078d4ea |
|
15-Jan-2024 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert kmem_alloc() to kmalloc() kmem_alloc() is just a thin wrapper around kmalloc() these days. Convert everything to use kmalloc() so we can get rid of the wrapper. Note: the transaction region allocation in xlog_add_to_transaction() can be a high order allocation. Converting it to use kmalloc(__GFP_NOFAIL) results in warnings in the page allocation code being triggered because the mm subsystem does not want us to use __GFP_NOFAIL with high order allocations like we've been doing with the kmem_alloc() wrapper for a couple of decades. Hence this specific case gets converted to xlog_kvmalloc() rather than kmalloc() to avoid this issue. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
|
#
a5b91555 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair quotas Fix anything that causes the quota verifiers to fail. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
21d75009 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: improve dquot iteration for scrub Upon a closer inspection of the quota record scrubber, I noticed that dqiterate wasn't actually walking all possible dquots for the mapped blocks in the quota file. This is due to xfs_qm_dqget_next skipping all XFS_IS_DQUOT_UNINITIALIZED dquots. For a fsck program, we really want to look at all the dquots, even if all counters and limits in the dquot record are zero. Rewrite the implementation to do this, as well as switching to an iterator paradigm to reduce the number of indirect calls. This enables removal of the old broken dqiterate code from xfs_dquot.c. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
ffd37b22 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: online repair of realtime bitmaps Fix all the file metadata surrounding the realtime bitmap file, which includes the rt geometry, file size, forks, and space mappings. The bitmap contents themselves cannot be fixed without rt rmap, so that will come later. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
dbbdbd00 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair problems in CoW forks Try to repair errors that we see in file CoW forks so that we don't do stupid things like remap garbage into a file. There's not a lot we can do with the COW fork -- the ondisk metadata record only that the COW staging extents are owned by the refcount btree, which effectively means that we can't reconstruct this incore structure from scratch. Actually, this is even worse -- we can't touch written extents, because those map space that are actively under writeback, and there's not much to do with delalloc reservations. Hence we can only detect crosslinked unwritten extents and fix them by punching out the problematic parts and replacing them with delalloc extents. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
8f71bede |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair inode fork block mapping data structures Use the reverse-mapping btree information to rebuild an inode block map. Update the btree bulk loading code as necessary to support inode rooted btrees and fix some bitrot problems. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
2d295fe6 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair inode records If an inode is so badly damaged that it cannot be loaded into the cache, fix the ondisk metadata and try again. If there /is/ a cached inode, fix any problems and apply any optimizations that can be solved incore. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
9099cd38 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair refcount btrees Reconstruct the refcount data from the rmap btree. Link: https://docs.kernel.org/filesystems/xfs-online-fsck-design.html#case-study-rebuilding-the-space-reference-counts Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
dbfbf3bd |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair inode btrees Use the rmapbt to find inode chunks, query the chunks to compute hole and free masks, and with that information rebuild the inobt and finobt. Refer to the case study in Documentation/filesystems/xfs-online-fsck-design.rst for more details. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
4bdfd7d1 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair free space btrees Rebuild the free space btrees from the gaps in the rmap btree. Refer to the case study in Documentation/filesystems/xfs-online-fsck-design.rst for more details. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
0f08af0f |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: move the per-AG datatype bitmaps to separate files Move struct xagb_bitmap to its own pair of C and header files per request of Christoph. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
be408417 |
|
06-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: implement block reservation accounting for btrees we're staging Create a new xrep_newbt structure to encapsulate a fake root for creating a staged btree cursor as well as to track all the blocks that we need to reserve in order to build that btree. As for the particular choice of lowspace thresholds and btree block slack factors -- at this point one could say that the thresholds in online repair come from bulkload_estimate_ag_slack in xfs_repair[1]. But that's not the entire story, since the offline btree rebuilding code in xfs_repair was merged as a retroport of the online btree code in this patchset! Before xfs_btree_staging.[ch] came along, xfs_repair determined the slack factor (aka the number of slots to leave unfilled in each new btree block) via open-coded logic in repair/phase5.c[2]. At that point the slack factors were arbitrary quantities per btree. The rmapbt automatically left 10 slots free; everything else left zero. That had a noticeable effect on performance straight after mounting because adding records to /any/ btree would result in splits. A few years ago when this patch was first written, Dave and I decided that repair should generate btree blocks that were 75% full unless space was tight, in which case it should try to fill the blocks to nearly full. We defined tight as ~10% free to avoid repair failures but settled on 3/32 (~9%) to avoid div64. IOWs, we mostly pulled the thresholds out of thin air. We've been QAing with those geometry numbers ever since. ;) Link: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/tree/repair/bulkload.c?h=v6.5.0#n114 Link: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/tree/repair/phase5.c?h=v4.19.0#n1349 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
b7d47a77 |
|
10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: move the realtime summary file scrubber to a separate source file Move the realtime summary file checking code to a separate file in preparation to actually implement it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
d7a74cad |
|
10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: track usage statistics of online fsck Track the usage, outcomes, and run times of the online fsck code, and report these values via debugfs. The columns in the file are: * scrubber name * number of scrub invocations * clean objects found * corruptions found * optimizations found * cross referencing failures * inconsistencies found during cross referencing * incomplete scrubs * warnings * number of time scrub had to retry * cumulative amount of time spent scrubbing (microseconds) * number of repair inovcations * successfully repaired objects * cumuluative amount of time spent repairing (microseconds) Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
3934e8eb |
|
10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: create a big array data structure Create a simple 'big array' data structure for storage of fixed-size metadata records that will be used to reconstruct a btree index. For repair operations, the most important operations are append, iterate, and sort. Earlier implementations of the big array used linked lists and suffered from severe problems -- pinning all records in kernel memory was not a good idea and frequently lead to OOM situations; random access was very inefficient; and record overhead for the lists was unacceptably high at 40-60%. Therefore, the big memory array relies on the 'xfile' abstraction, which creates a memfd file and stores the records in page cache pages. Since the memfd is created in tmpfs, the memory pages can be pushed out to disk if necessary and we have a built-in usage limit of 50% of physical memory. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
e06ef14b |
|
10-Aug-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: move the post-repair block reaping code to a separate file Reaping blocks after a repair is a complicated affair involving a lot of rmap btree lookups and figuring out if we're going to unmap or free old metadata blocks that might be crosslinked. Eventually, we will need to be able to reap per-AG metadata blocks, bmbt blocks from inode forks, garbage CoW staging extents, and (even later) blocks from btrees rooted in inodes. This results in a lot of reaping code, so we might as well split that off while it's easy. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
fed050f3 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: cross-reference rmap records with ag btrees Strengthen the rmap btree record checker a little more by comparing OWN_FS and OWN_LOG reverse mappings against the AG headers and internal logs, respectively. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
4c233b5c |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: streamline the directory iteration code for scrub Currently, online scrub reuses the xfs_readdir code to walk every entry in a directory. This isn't awesome for performance, since we end up cycling the directory ILOCK needlessly and coding around the particular quirks of the VFS dir_context interface. Create a streamlined version of readdir that keeps the ILOCK (since the walk function isn't going to copy stuff to userspace), skips a whole lot of directory walk cursor checks (since we start at 0 and walk to the end) and has a sane way to return error codes. Note: Porting the dotdot checking code is left for a subsequent patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
d5c88131 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: allow queued AG intents to drain before scrubbing When a writer thread executes a chain of log intent items, the AG header buffer locks will cycle during a transaction roll to get from one intent item to the next in a chain. Although scrub takes all AG header buffer locks, this isn't sufficient to guard against scrub checking an AG while that writer thread is in the middle of finishing a chain because there's no higher level locking primitive guarding allocation groups. When there's a collision, cross-referencing between data structures (e.g. rmapbt and refcountbt) yields false corruption events; if repair is running, this results in incorrect repairs, which is catastrophic. Fix this by adding to the perag structure the count of active intents and make scrub wait until it has both AG header buffer locks and the intent counter reaches zero. One quirk of the drain code is that deferred bmap updates also bump and drop the intent counter. A fundamental decision made during the design phase of the reverse mapping feature is that updates to the rmapbt records are always made by the same code that updates the primary metadata. In other words, callers of bmapi functions expect that the bmapi functions will queue deferred rmap updates. Some parts of the reflink code queue deferred refcount (CUI) and bmap (BUI) updates in the same head transaction, but the deferred work manager completely finishes the CUI before the BUI work is started. As a result, the CUI drops the intent count long before the deferred rmap (RUI) update even has a chance to bump the intent count. The only way to keep the intent count elevated between the CUI and RUI is for the BUI to bump the counter until the RUI has been created. A second quirk of the intent drain code is that deferred work items must increment the intent counter as soon as the work item is added to the transaction. When a BUI completes and queues an RUI, the RUI must increment the counter before the BUI decrements it. The only way to accomplish this is to require that the counter be bumped as soon as the deferred work item is created in memory. In the next patches we'll improve on this facility, but this patch provides the basic functionality. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
3cfb9290 |
|
16-Mar-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: test dir/attr hash when loading module Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in ext4 against generic/454. The cause of this test failure was the unfortunate combination of setting an xattr name containing UTF8 encoded emoji, an xattr hash function that accepted a char pointer with no explicit signedness, signed type extension of those chars to an int, and the 6.2 build tools maintainers deciding to mandate -funsigned-char across the board. As a result, the ondisk extended attribute structure written out by 6.1 and 6.2 were not the same. This discrepancy, in fact, had been noticeable if a filesystem with such an xattr were moved between any two architectures that don't employ the same signedness of a raw "char" declaration. The only reason anyone noticed is that x86 gcc defaults to signed, and no such -funsigned-char update was made to e2fsprogs, so e2fsck immediately started reporting data corruption. After a day and a half of discussing how to handle this use case (xattrs with bit 7 set anywhere in the name) without breaking existing users, Linus merged his own patch and didn't tell the maintainer. None of the ext4 developers realized this until AUTOSEL announced that the commit had been backported to stable. In the end, this problem could have been detected much earlier if there had been any useful tests of hash function(s) in use inside ext4 to make sure that they always produce the same outputs given the same inputs. The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's vulnerable to this problem. However, let's avoid all this drama by adding our own self test to check that the da hash produces the same outputs for a static pile of inputs on various platforms. This enables us to fix any breakage that may result in a controlled fashion. The buffer and test data are identical to the patches submitted to xfsprogs. Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/ Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
6f643c57 |
|
02-Jun-2022 |
Shiyang Ruan <ruansy.fnst@fujitsu.com> |
xfs: implement ->notify_failure() for XFS Introduce xfs_notify_failure.c to handle failure related works, such as implement ->notify_failure(), register/unregister dax holder in xfs, and so on. If the rmap feature of XFS enabled, we can query it to find files and metadata which are associated with the corrupt data. For now all we do is kill processes with that file mapped into their address spaces, but future patches could actually do something about corrupt metadata. After that, the memory failure needs to notify the processes who are using those files. Link: https://lkml.kernel.org/r/20220603053738.1218681-7-ruansy.fnst@fujitsu.com Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.wiliams@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Cc: Jane Chu <jane.chu@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
784eb7d8 |
|
13-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: add in-memory iunlink log item Now that we have a clean operation to update the di_next_unlinked field of inode cluster buffers, we can easily defer this operation to transaction commit time so we can order the inode cluster buffer locking consistently. To do this, we introduce a new in-memory log item to track the unlinked list item modification that we are going to make. This follows the same observations as the in-memory double linked list used to track unlinked inodes in that the inodes on the list are pinned in memory and cannot go away, and hence we can simply reference them for the duration of the transaction without needing to take active references or pin them or look them up. This allows us to pass the xfs_inode to the transaction commit code along with the modification to be made, and then order the logged modifications via the ->iop_sort and ->iop_precommit operations for the new log item type. As this is an in-memory log item, it doesn't have formatting, CIL or AIL operational hooks - it exists purely to run the inode unlink modifications and is then removed from the transaction item list and freed once the precommit operation has run. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
fd920008 |
|
03-May-2022 |
Allison Henderson <allison.henderson@oracle.com> |
xfs: Set up infrastructure for log attribute replay Currently attributes are modified directly across one or more transactions. But they are not logged or replayed in the event of an error. The goal of log attr replay is to enable logging and replaying of attribute operations using the existing delayed operations infrastructure. This will later enable the attributes to become part of larger multi part operations that also must first be recorded to the log. This is mostly of interest in the scheme of parent pointers which would need to maintain an attribute containing parent inode information any time an inode is moved, created, or removed. Parent pointers would then be of interest to any feature that would need to quickly derive an inode path from the mount point. Online scrub, nfs lookups and fs grow or shrink operations are all features that could take advantage of this. This patch adds two new log item types for setting or removing attributes as deferred operations. The xfs_attri_log_item will log an intent to set or remove an attribute. The corresponding xfs_attrd_log_item holds a reference to the xfs_attri_log_item and is freed once the transaction is done. Both log items use a generic xfs_attr_log_format structure that contains the attribute name, value, flags, inode, and an op_flag that indicates if the operations is a set or remove. [dchinner: added extra little bits needed for intent whiteouts] Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
86ffa471 |
|
01-May-2020 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: refactor log recovery item sorting into a generic dispatch structure Create a generic dispatch structure to delegate recovery of different log item types into various code modules. This will enable us to move code specific to a particular log item type out of xfs_log_recover.c and into the log item source. The first operation we virtualize is the log item sorting. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
166405f6 |
|
22-Apr-2020 |
Arnd Bergmann <arnd@arndb.de> |
xfs: stop CONFIG_XFS_DEBUG from changing compiler flags I ran into a linker warning in XFS that originates from a mismatch between libelf, binutils and objtool when certain files in the kernel are built with "gcc -g": x86_64-linux-ld: fs/xfs/xfs_trace.o: unable to initialize decompress status for section .debug_info After some discussion, nobody could identify why xfs sets this flag here. CONFIG_XFS_DEBUG used to enable lots of unrelated settings, but now its main purpose is to enable extra consistency checks and assertions that are unrelated to the debug info. Remove the Makefile logic to set the flag here. If anyone relies on the debug info, this can simply be enabled again with the global CONFIG_DEBUG_INFO option. Dave Chinner writes: I'm pretty sure it was needed for the original kgdb integration back in the early 2000s. That was when SGI used to patch their XFS dev tree with kgdb and debug symbols were needed by the custom kgdb modules that were ported across from the Irix kernel debugger. ISTR that the early kcrash kernel dump analysis tools (again, originated from the Irix "icrash" kernel dump tools) had custom XFS debug scripts that needed also the debug info to work correctly... Which is a long way of saying "we don't need it anymore" instead of "nobody knows why it was set"... :) Suggested-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/lkml/20200409074130.GD21033@infradead.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
e06536a6 |
|
11-Mar-2020 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: introduce fake roots for ag-rooted btrees Create an in-core fake root for AG-rooted btree types so that callers can generate a whole new btree using the upcoming btree bulk load function without making the new tree accessible from the rest of the filesystem. It is up to the individual btree type to provide a function to create a staged cursor (presumably with the appropriate callouts to update the fakeroot) and then commit the staged root back into the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
957ee13e |
|
08-Nov-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: remove the now unused dir ops infrastructure Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
3f6d70e8 |
|
12-Jul-2019 |
Eric Sandeen <sandeen@redhat.com> |
xfs: move xfs_trans_inode.c to libxfs/ Userspace now has an identical xfs_trans_inode.c which it has already moved to libxfs/ so do the same move for kernelspace. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
40786717 |
|
03-Jul-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: multithreaded iwalk implementation Create a parallel iwalk implementation and switch quotacheck to use it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
a211432c |
|
02-Jul-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create simplified inode walk function Create a new iterator function to simplify walking inodes in an XFS filesystem. This new iterator will replace the existing open-coded walking that goes on in various places. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
caeaea98 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: merge xfs_trans_bmap.c into xfs_bmap_item.c Keep all bmap item related code together. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
3cfce1e3 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: merge xfs_trans_rmap.c into xfs_rmap_item.c Keep all rmap item related code together in one file. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
effd5e96 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: merge xfs_trans_refcount.c into xfs_refcount_item.c Keep all the refcount item related code together in one file. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
81f40041 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: merge xfs_trans_extfree.c into xfs_extfree_item.c Keep all the extree item related code together in one file. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
6ad5b325 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: use bios directly to read and write the log recovery buffers The xfs_buf structure is basically used as a glorified container for a memory allocation in the log recovery code. Replace it with a call to kmem_alloc_large and a simple abstraction to read into or write from it synchronously using chained bios. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
9cc342f6 |
|
13-May-2019 |
Masahiro Yamada <yamada.masahiro@socionext.com> |
treewide: prefix header search paths with $(srctree)/ Currently, the Kbuild core manipulates header search paths in a crazy way [1]. To fix this mess, I want all Makefiles to add explicit $(srctree)/ to the search paths in the srctree. Some Makefiles are already written in that way, but not all. The goal of this work is to make the notation consistent, and finally get rid of the gross hacks. Having whitespaces after -I does not matter since commit 48f6e3cf5bc6 ("kbuild: do not drop -I without parameter"). [1]: https://patchwork.kernel.org/patch/9632347/ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
#
75efa57d |
|
25-Apr-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add online scrub for superblock counters Teach online scrub how to check the filesystem summary counters. We use the incore delalloc block counter along with the incore AG headers to compute expected values for fdblocks, icount, and ifree, and then check that the percpu counter is within a certain threshold of the expected value. This is done to avoid having to freeze or otherwise lock the filesystem, which means that we're only checking that the counters are fairly close, not that they're exactly correct. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
4860a05d |
|
16-Apr-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub/repair should update filesystem metadata health Now that we have the ability to track sick metadata in-core, make scrub and repair update those health assessments after doing work. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
6772c1f1 |
|
12-Apr-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: track metadata health status Add the necessary in-core metadata fields to keep track of which parts of the filesystem have been observed and which parts were observed to be unhealthy, and print a warning at unmount time if we have unfixed problems. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
bc270b53 |
|
29-Jul-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: move the repair extent list into its own file Move the xrep_extent_list code into a separate file. Logically, this data structure is really just a clumsy bitmap, and in the next patch we'll make this more obvious. No functional changes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
86210fbe |
|
07-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: move various type verifiers to common file New verification functions like xfs_verify_fsbno() and xfs_verify_agino() are spread across multiple files and different header files. They really don't fit cleanly into the places they've been put, and have wider scope than the current header includes. Move the type verifiers to a new file in libxfs (xfs-types.c) and the prototypes to xfs_types.h where they will be visible to all the code that uses the types. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
0b61f8a4 |
|
05-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert to SPDX license tags Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
d25522f1 |
|
29-May-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: repair superblocks If one of the backup superblocks is found to differ seriously from superblock 0, write out a fresh copy from the in-core sb. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
b16817b6 |
|
14-May-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: move growfs core to libxfs So it can be shared with userspace (e.g. mkfs) easily. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
84d42ea6 |
|
14-May-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: implement the metadata repair ioctl flag Plumb in the pieces necessary to make the "scrub" subfunction of the scrub ioctl actually work. This means that we make the IFLAG_REPAIR flag to the scrub ioctl actually do something, and we add an errortag knob so that xfstests can force the kernel to rebuild a metadata structure even if there's nothing wrong with it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
6bdcf26a |
|
03-Nov-2017 |
Christoph Hellwig <hch@lst.de> |
xfs: use a b+tree for the in-core extent list Replace the current linear list and the indirection array for the in-core extent list with a b+tree to avoid the need for larger memory allocations for the indirection array when lots of extents are present. The current extent list implementations leads to heavy pressure on the memory allocator when modifying files with a high extent count, and can lead to high latencies because of that. The replacement is a b+tree with a few quirks. The leaf nodes directly store the extent record in two u64 values. The encoding is a little bit different from the existing in-core extent records so that the start offset and length which are required for lookups can be retreived with simple mask operations. The inner nodes store a 64-bit key containing the start offset in the first half of the node, and the pointers to the next lower level in the second half. In either case we walk the node from the beginninig to the end and do a linear search, as that is more efficient for the low number of cache lines touched during a search (2 for the inner nodes, 4 for the leaf nodes) than a binary search. We store termination markers (zero length for the leaf nodes, an otherwise impossible high bit for the inner nodes) to terminate the key list / records instead of storing a count to use the available cache lines as efficiently as possible. One quirk of the algorithm is that while we normally split a node half and half like usual btree implementations we just spill over entries added at the very end of the list to a new node on its own. This means we get a 100% fill grade for the common cases of bulk insertion when reading an inode into memory, and when only sequentially appending to a file. The downside is a slightly higher chance of splits on the first random insertions. Both insert and removal manually recurse into the lower levels, but the bulk deletion of the whole tree is still implemented as a recursive function call, although one limited by the overall depth and with very little stack usage in every iteration. For the first few extents we dynamically grow the list from a single extent to the next powers of two until we have a first full leaf block and that building the actual tree. The code started out based on the generic lib/btree.c code from Joern Engel based on earlier work from Peter Zijlstra, but has since been rewritten beyond recognition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
c2fc338c |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub quota information Perform some quick sanity testing of the disk quota information. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
29b0767b |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub realtime bitmap/summary Perform simple tests of the realtime bitmap and summary. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
0f28b257 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub directory parent pointers Scrub parent pointers, sort of. For directories, we can ride the '..' entry up to the parent to confirm that there's at most one dentry that points back to this directory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
2a721dbb |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub symbolic links Create the infrastructure to scrub symbolic link data. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
eec0482e |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub extended attributes Scrub the hash tree, keys, and values in an extended attribute structure. Refactor the attribute code to use the transaction if the caller supplied one to avoid buffer deadocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
a5c46e5e |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub directory metadata Scrub the hash tree and all the entries in a directory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
7c4a07a4 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub directory/attribute btrees Provide a way to check the shape and scrub the hashes and records in a directory or extended attribute btree. These are helper functions for the directory & attribute scrubbers in subsequent patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> [fengguang: remove unneeded variable to store return value] Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
99d9d8d0 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub inode block mappings Scrub an individual inode's block mappings to make sure they make sense. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
80e4e126 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub inodes Scrub the fields within an inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
edc09b52 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub refcount btrees Plumb in the pieces necessary to check the refcount btree. If rmap is available, check the reference count by performing an interval query against the rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
c7e693d9 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub rmap btrees Check the reverse mapping records to make sure that the contents make sense. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
3daa6641 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub inode btrees Check the records of the inode btrees to make sure that the values make sense given the inode records themselves. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
efa7a99c |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub free space btrees Check the extent records free space btrees to ensure that the values look sane. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
21fb4cb1 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: scrub the secondary superblocks Ensure that the geometry presented in the backup superblocks matches the primary superblock so that repair can recover the filesystem if that primary gets corrupted. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
537964bc |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create helpers to scrub a metadata btree Create helper functions and tracepoints to deal with errors while scrubbing a metadata btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
dcb660f9 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: probe the scrub ioctl Create a probe scrubber with id 0. This will be used by xfs_scrub to probe the kernel's abilities to scrub (and repair) the metadata. We do this by validating the ioctl inputs from userspace, preparing the filesystem for a scrub (or a repair) operation, and immediately returning to userspace. Userspace can use the returned errno and structure state to decide (in broad terms) if scrub/repair are supported by the running kernel. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
36fd6e86 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create an ioctl to scrub AG metadata Create an ioctl that can be used to scrub internal filesystem metadata. The new ioctl takes the metadata type, an (optional) AG number, an (optional) inode number and generation, and a flags argument. This will be used by the upcoming XFS online scrub tool. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
d905fdaa |
|
04-May-2017 |
Amir Goldstein <amir73il@gmail.com> |
xfs: use the common helper uuid_is_null() Use the common helper uuid_is_null() and remove the xfs specific helper uuid_is_nil(). The common helper does not check for the NULL pointer value as xfs helper did, but xfs code never calls the helper with a pointer that can be NULL. Conform comments and warning strings to use the term 'null uuid' instead of 'nil uuid', because this is the terminology used by lib/uuid.c and its users. It is also the terminology used in userspace by libuuid and xfsprogs. Signed-off-by: Amir Goldstein <amir73il@gmail.com> [hch: remove now unused uuid.[ch]] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
#
e89c0413 |
|
28-Mar-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: implement the GETFSMAP ioctl Introduce a new ioctl that uses the reverse mapping btree to return information about the physical layout of the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
3993baeb |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: introduce the CoW fork Introduce a new in-core fork for storing copy-on-write delalloc reservations and allocated extents that are in the process of being written out. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
77d61fe4 |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: log bmap intent items Provide a mechanism for higher levels to create BUI/BUD items, submit them to the log, and a stub function to deal with recovered BUI items. These parts will be connected to the rmapbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
6413a014 |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create bmbt update intent log items Create bmbt update intent/done log items to record redo information in the log. Because we roll transactions multiple times for reflink operations, we also have to track the status of the metadata updates that will be recorded in the post-roll transactions in case we crash before committing the final transaction. This mechanism enables log recovery to finish what was already started. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
f997ee21 |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: log refcount intent items Provide a mechanism for higher levels to create CUI/CUD items, submit them to the log, and a stub function to deal with recovered CUI items. These parts will be connected to the refcountbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
baf4bcac |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create refcount update intent log items Create refcount update intent/done log items to record redo information in the log. Because we need to roll transactions between updating the bmbt mapping and updating the reverse mapping, we also have to track the status of the metadata updates that will be recorded in the post-roll transactions, just in case we crash before committing the final transaction. This mechanism enables log recovery to finish what was already started. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
bdf28630 |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add refcount btree operations Implement the generic btree operations required to manipulate refcount btree blocks. The implementation is similar to the bmapbt, though it will only allocate and free blocks from the AG. Since the refcount root and level fields are separate from the existing roots and levels array, they need a separate logging flag. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> [hch: fix logging of AGF refcount btree fields] Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
1946b91c |
|
03-Oct-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: define the on-disk refcount btree format Start constructing the refcount btree implementation by establishing the on-disk format and everything needed to read, write, and manipulate the refcount btree blocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
3fd129b6 |
|
18-Sep-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: set up per-AG free space reservations One unfortunate quirk of the reference count and reverse mapping btrees -- they can expand in size when blocks are written to *other* allocation groups if, say, one large extent becomes a lot of tiny extents. Since we don't want to start throwing errors in the middle of CoWing, we need to reserve some blocks to handle future expansion. The transaction block reservation counters aren't sufficient here because we have to have a reserve of blocks in every AG, not just somewhere in the filesystem. Therefore, create two per-AG block reservation pools. One feeds the AGFL so that rmapbt expansion always succeeds, and the other feeds all other metadata so that refcountbt expansion never fails. Use the count of how many reserved blocks we need to have on hand to create a virtual reservation in the AG. Through selective clamping of the maximum length of allocation requests and of the length of the longest free extent, we can make it look like there's less free space in the AG unless the reservation owner is asking for blocks. In other words, play some accounting tricks in-core to make sure that we always have blocks available. On the plus side, there's nothing to clean up if we crash, which is contrast to the strategy that the rough draft used (actually removing extents from the freespace btrees). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
9e88b5d8 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: log rmap intent items Provide a mechanism for higher levels to create RUI/RUD items, submit them to the log, and a stub function to deal with recovered RUI items. These parts will be connected to the rmapbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
5880f2d7 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create rmap update intent log items Create rmap update intent/done log items to record redo information in the log. Because we need to roll transactions between updating the bmbt mapping and updating the reverse mapping, we also have to track the status of the metadata updates that will be recorded in the post-roll transactions, just in case we crash before committing the final transaction. This mechanism enables log recovery to finish what was already started. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
035e00ac |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: define the on-disk rmap btree format Originally-From: Dave Chinner <dchinner@redhat.com> Now we have all the surrounding call infrastructure in place, we can start filling out the rmap btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate rmap btree blocks. This prepares the way for adding the btree operations implementation. [darrick: record owner and offset info in rmap btree] [darrick: fork, bmbt and unwritten state in rmap btree] [darrick: flags are a separate field in xfs_rmap_irec] [darrick: calculate maxlevels separately] [darrick: move the 'unwritten' bit into unused parts of rm_offset] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
673930c3 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: introduce rmap extent operation stubs Originally-From: Dave Chinner <dchinner@redhat.com> Add the stubs into the extent allocation and freeing paths that the rmap btree implementation will hook into. While doing this, add the trace points that will be used to track rmap btree extent manipulations. [darrick.wong@oracle.com: Extend the stubs to take full owner info.] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
4e0cc29b |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: move deferred operations into a separate file All the code around struct xfs_bmap_free basically implements a deferred operation framework through which we can roll transactions (to unlock buffers and avoid violating lock order rules) while managing all the necessary log redo items. Previously we only used this code to free extents after some sort of mapping operation, but with the advent of rmap and reflink, we suddenly need to do more than that. With that in mind, xfs_bmap_free really becomes a deferred ops control structure. Rename the structure and move the deferred ops into their own file to avoid further bloating of the bmap code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
15d66ac2 |
|
08-Jul-2016 |
Benjamin Coddington <bcodding@redhat.com> |
xfs: abstract block export operations from nfsd layouts Instead of creeping pnfs layout configuration into filesystems, move the definition of block-based export operations under a more abstract configuration. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Dave Chinner <david@fromorbit.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f99d4fbd |
|
04-Mar-2016 |
Christoph Hellwig <hch@lst.de> |
nfsd: add SCSI layout support This is a simple extension to the block layout driver to use SCSI persistent reservations for access control and fencing, as well as SCSI VPD pages for device identification. For this we need to pass the nfs4_client to the proc_getdeviceinfo method to generate the reservation key, and add a new fence_client method to allow for fence actions in the layout driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
81c39329 |
|
04-Mar-2016 |
Christoph Hellwig <hch@lst.de> |
nfsd: add a new config option for the block layout driver Split the config symbols into a generic pNFS one, which is invisible and gets selected by the layout drivers, and one for the block layout driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
985ef4dc |
|
18-Oct-2015 |
Dave Chinner <dchinner@redhat.com> |
xfs: stats are no longer dependent on CONFIG_PROC_FS So we need to fix the makefile to understand this, otherwise build errors with CONFIG_PROC_FS=n occur. Reported-and-tested-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
1cfc4a9c |
|
28-Jul-2015 |
Dave Chinner <dchinner@redhat.com> |
libxfs: add xfs_bit.c The header side of xfs_bit.c is already in libxfs, and the sparse inode code requires the xfs_next_bit() function so pull in the xfs_bit.c file so that a sparse inode enabled libxfs compiles cleanly in userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
52785112 |
|
15-Feb-2015 |
Christoph Hellwig <hch@lst.de> |
xfs: implement pNFS export operations Add operations to export pNFS block layouts from an XFS filesystem. See the previous commit adding the operations for an explanation of them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
a31b1d3d |
|
14-Jul-2014 |
Brian Foster <bfoster@redhat.com> |
xfs: add xfs_mount sysfs kobject Embed a base kobject into xfs_mount. This creates a kobject associated with each XFS mount and a subdirectory in sysfs with the name of the filesystem. The subdirectory lifecycle matches that of the mount. Also add the new xfs_sysfs.[c,h] source files with some XFS sysfs infrastructure to facilitate attribute creation. Note that there are currently no attributes exported as part of the xfs_mount kobject. It exists solely to serve as a per-mount container for child objects. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
30f712c9 |
|
24-Jun-2014 |
Dave Chinner <dchinner@redhat.com> |
libxfs: move source files Move all the source files that are shared with userspace into libxfs/. This is done as one big chunk simpy to get it done quickly Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
69116a13 |
|
24-Jun-2014 |
Dave Chinner <dchinner@redhat.com> |
xfs: create libxfs infrastructure To minimise the differences between kernel and userspace code, split the kernel code into the same structure as the userspace code. That is, the gneric core functionality of XFS is moved to a libxfs/ directory and treat it as a layering barrier in the XFS code. This patch introduces the libxfs directory, the build infrastructure and an initial source and header file to build. The libxfs directory will contain the header files that are needed to build libxfs - most of userspace does not care about the location of these header files as they are accessed indirectly. Hence keeping them inside libxfs makes it easy to track the changes and script the sync process as the directory structure will be identical. To allow this changeover to occur in the kernel code, there are some temporary infrastructure in the makefiles to grab the header filesystem from both locations. Once all the files are moved, modifications will be made in the source code that will make the need for these include directives go away. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
32c5483a |
|
29-Oct-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: abstract the differences in dir2/dir3 via an ops vector Lots of the dir code now goes through switches to determine what is the correct on-disk format to parse. It generally involves a "xfs_sbversion_hasfoo" check, deferencing the superblock version and feature fields and hence touching several cache lines per operation in the process. Some operations do multiple checks because they nest conditional operations and they don't pass the information in a direct fashion between each other. Hence, add an ops vector to the xfs_inode structure that is configured when the inode is initialised to point to all the correct decode and encoding operations. This will significantly reduce the branchiness and cacheline footprint of the directory object decoding and encoding. This is the first patch in a series of conversion patches. It will introduce the ops structure, the setup of it and add the first operation to the vector. Subsequent patches will convert directory ops one at a time to keep the changes simple and obvious. Just this patch shows the benefit of such an approach on code size. Just converting the two shortform dir operations as this patch does decreases the built binary size by ~1500 bytes: $ size fs/xfs/xfs.o.orig fs/xfs/xfs.o.p1 text data bss dec hex filename 794490 96802 1096 892388 d9de4 fs/xfs/xfs.o.orig 792986 96802 1096 890884 d9804 fs/xfs/xfs.o.p1 $ That's a significant decrease in the instruction cache footprint of the directory code for such a simple change, and indicates that this approach is definitely worth pursuing further. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
c963c619 |
|
14-Oct-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split xfs_rtalloc.c for userspace sanity xfs_rtalloc.c is partially shared with userspace. Split the file up into two parts - one that is kernel private and the other which is wholly shared with userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
9aede1d8 |
|
14-Oct-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split dquot buffer operations out Parts of userspace want to be able to read and modify dquot buffers (e.g. xfs_db) so we need to split out the reading and writing of these buffers so it is easy to shared code with libxfs in userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
5a96a945 |
|
12-Aug-2013 |
Jie Liu <jeff.liu@oracle.com> |
xfs: Add xfs_log_rlimit.c Add source files for xfs_log_rlimit.c The new file is used for log size calculations and validation shared with userspace. [dchinner: xfs_log_calc_max_attrsetm_res() does not modify the tr_attrsetm reservation, just calculates the maximum. ] [dchinner: rework loop in xfs_log_get_max_trans_res() ] Signed-off-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
a133d952 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: consolidate extent swap code So we don't need xfs_dfrag.h in userspace anymore, move the extent swap ioctl structure definition to xfs_fs.h where most of the other ioctl structure definitions are. Now that we don't need separate files for extent swapping, separate the basic file descriptor checking code to xfs_ioctl.c, and the code that does the extent swap operation to xfs_bmap_util.c. This cleanly separates the user interface code from the physical mechanism used to do the extent swap. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
e546cb79 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: consolidate xfs_utils.c There are a few small helper functions in xfs_util, all related to xfs_inode modifications. Move them all to xfs_inode.c so all xfs_inode operations are consiolidated in the one place. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
f6bba201 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: consolidate xfs_rename.c Move the rename code to xfs_inode.c to continue consolidating all the kernel xfs_inode operations in the one place. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
c24b5dfa |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: kill xfs_vnodeops.[ch] Now we have xfs_inode.c for holding kernel-only XFS inode operations, move all the inode operations from xfs_vnodeops.c to this new file as it holds another set of kernel-only inode operations. The name of this file traces back to the days of Irix and it's vnodes which we don't have anymore. Essentially this move consolidates the inode locking functions and a bunch of XFS inode operations into the one file. Eventually the high level functions will be merged into the VFS interface functions in xfs_iops.c. This leaves only internal preallocation, EOF block manipulation and hole punching functions in vnodeops.c. Move these to xfs_bmap_util.c where we are already consolidating various in-kernel physical extent manipulation and querying functions. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
68988114 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: create xfs_bmap_util.[ch] There is a bunch of code in xfs_bmap.c that is kernel specific and not shared with userspace. To minimise the difference between the kernel and userspace code, shift this unshared code to xfs_bmap_util.c, and the declarations to xfs_bmap_util.h. The biggest issue here is xfs_bmap_finish() - userspace has it's own definition of this function, and so we need to move it out of xfs_bmap.[ch]. This means several other files need to include xfs_bmap_util.h as well. It also introduces and interesting dance for the stack switching code in xfs_bmapi_allocate(). The stack switching/workqueue code is actually moved to xfs_bmap_util.c, so that userspace can simply use a #define in a header file to connect the dots without needing to know about the stack switch code at all. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
ff55068c |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce xfs_sb.c for sharing with libxfs xfs_mount.c is shared with userspace, but the only functions that are shared are to do with physical superblock manipulations. This means that less than 25% of the xfs_mount.c code is actually shared with userspace. Move all the superblock functions to xfs_sb.c and share that instead with libxfs. Note that this will leave all the in-core transaction related superblock counter modifications in xfs_mount.c as none of that is shared with userspace. With a few more small changes, xfs_mount.h won't need to be shared with userspace anymore, either. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
1fb7e48d |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split out the remote symlink handling The remote symlink format definition and manipulation needs to be shared with userspace, but the in-kernel interfaces do not. Split the remote symlink format handling out into xfs_symlink_remote.[ch] fo it can easily be shared with userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
fde2227c |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split out attribute fork truncation code into separate file The attribute inactivation code is not used by userspace, so like the attribute listing, split it out into a separate file to minimise the differences between the filesystem shared with libxfs in userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
abec5f2b |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split out attribute listing code into separate file The attribute listing code is not used by userspace, so like the directory readdir code, split it out into a separate file to minimise the differences between the filesystem shared with libxfs in userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
4a8af273 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: move getdents code into it's own file The directory readdir code is not used by userspace, but it is intermingled with files that are shared with userspace. This makes it difficult to compare the differences between the userspac eand kernel files are the userspace files don't have the getdents code in them. Move all the kernel getdents code to a separate file to bring the shared content between userspace and kernel files closer together. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
1fd7115e |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce xfs_inode_buf.c for inode buffer operations The only thing remaining in xfs_inode.[ch] are the operations that read, write or verify physical inodes in their underlying buffers. Move all this code to xfs_inode_buf.[ch] and so we can stop sharing xfs_inode.[ch] with userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
5c4d97d0 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: move inode fork definitions to a new header file The inode fork definitions are a combination of on-disk format definition and in-memory tracking and manipulation. They are both shared with userspace, so move them all into their own file so sharing is easy to do and track. This removes all inode fork related information from xfs_inode.h. Do the same for the all the C code that currently resides in xfs_inode.c for the same reason. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
7fd36c44 |
|
12-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split out transaction reservation code The transaction reservation size calculations is used by both kernel and userspace, but most of the transaction code in xfs_trans.c is kernel specific. Split all the transaction reservation code out into it's own files to make sharing with userspace simpler. This just leaves kernel-only definitions in xfs_trans.h, so it doesn't need to be shared with userspace anymore, either. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
3ebe7d2d |
|
27-Jun-2013 |
Dave Chinner <david@fromorbit.com> |
xfs: Inode create log items Introduce the inode create log item type for logical inode create logging. Instead of logging the changes in buffers, pass the range to be initialised through the log by a new transaction type. This reduces the amount of log space required to record initialisation during allocation from about 128 bytes per inode to a small fixed amount per inode extent to be initialised. This requires a new log item type to track it through the log and the AIL. This is a relatively simple item - most callbacks are noops as this item has the same life cycle as the transaction. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
95920cd6 |
|
02-Apr-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split remote attribute code out Adding CRC support to remote attributes adds a significant amount of remote attribute specific code. Split the existing remote attribute code out into it's own file so that all the relevant remote attribute code is in a single, easy to find place. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
19de7351 |
|
02-Apr-2013 |
Dave Chinner <dchinner@redhat.com> |
xfs: split out symlink code into it's own file. The symlink code is about to get more complicated when CRCs are added for remote symlink blocks. The symlink management code is mostly self contained, so move it to it's own files so that all the new code and the existing symlink code will not be intermingled with other unrelated code. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
fb595814 |
|
12-Nov-2012 |
Dave Chinner <dchinner@redhat.com> |
xfs: remove xfs_flushinval_pages It's just a simple wrapper around VFS functionality, and is actually bugging in that it doesn't remove mappings before invalidating the page cache. Remove it and replace it with the correct VFS functionality. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Andrew Dahl <adahl@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
33479e05 |
|
08-Oct-2012 |
Dave Chinner <dchinner@redhat.com> |
xfs: remove xfs_iget.c The inode cache functions remaining in xfs_iget.c can be moved to xfs_icache.c along with the other inode cache functions. This removes all functionality from xfs_iget.c, so the file can simply be removed. This move results in various functions now only having the scope of a single file (e.g. xfs_inode_free()), so clean up all the definitions and exported prototypes in xfs_icache.[ch] and xfs_inode.h appropriately. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
6d8b79cf |
|
08-Oct-2012 |
Dave Chinner <dchinner@redhat.com> |
xfs: rename xfs_sync.[ch] to xfs_icache.[ch] xfs_sync.c now only contains inode reclaim functions and inode cache iteration functions. It is not related to sync operations anymore. Rename to xfs_icache.c to reflect it's contents and prepare for consolidation with the other inode cache file that exists (xfs_iget.c). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
2af51f3a |
|
22-Apr-2012 |
Dave Chinner <dchinner@redhat.com> |
xfs: move xfs_do_force_shutdown() and kill xfs_rw.c xfs_do_force_shutdown now is the only thing in xfs_rw.c. There is no need to keep it in it's own file anymore, so move it to xfs_fsops.c next to xfs_fs_goingdown() and kill xfs_rw.c. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
efc27b52 |
|
29-Apr-2012 |
Dave Chinner <dchinner@redhat.com> |
xfs: move busy extent handling to it's own file To make it easier to handle userspace code merges, move all the busy extent handling out of the allocation code and into it's own file. The userspace code does not need the busy extent code, so this simplifies the merging of the kernel code into the userspace xfsprogs library. Because the busy extent code has been almost completely rewritten over the past couple of years, also update the copyright on this new file to include the authors that made all those changes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
48776fd2 |
|
13-Mar-2012 |
Christoph Hellwig <hch@infradead.org> |
xfs: use common code for quota statistics Switch the quota code over to use the generic XFS statistics infrastructure. While the legacy /proc/fs/xfs/xqm and /proc/fs/xfs/xqmstats interfaces are preserved for now the statistics that still have a meaning with the current code are now also available from /proc/fs/xfs/stats. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
|
#
b6bede3b |
|
14-Aug-2011 |
Christoph Hellwig <hch@infradead.org> |
xfs: fix tracing builds inside the source tree The code really requires the current source directory to be in the header search path. We already do this if building with an object tree separate from the source, but it needs to be added manually if building inside the source. The cflags addition for it accidentally got removed when collapsing the xfs directory structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
|
#
c59d87c4 |
|
12-Aug-2011 |
Christoph Hellwig <hch@infradead.org> |
xfs: remove subdirectories Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the annoying subdirectories in the XFS source code. Besides the large amount of file rename the only changes are to the Makefile, a few files including headers with the subdirectory prefix, and the binary sysctl compat code that includes a header under fs/xfs/ from kernel/. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
|
#
06f8e2d6 |
|
12-Aug-2011 |
Alex Elder <aelder@sgi.com> |
xfs: don't expect xfs headers to be in subdirectories Fix up some #include directives in preparation for moving a few header files out of xfs source subdirectories. Note that "xfs_linux.h" also got its quoting convention for included files switched. Signed-off-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
c84470dd |
|
13-Jul-2011 |
Christoph Hellwig <hch@lst.de> |
xfs: remove leftovers of the old btree tracing code Remove various bits left over from the old kdb-only btree tracing code, but leave the actual trace point stubs in place to ease adding new event based btree tracing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
0ccd234c |
|
14-Jan-2011 |
matt mooney <mfm@muteddisk.com> |
fs: change to new flag variable Replace EXTRA_CFLAGS with ccflags-y. And change ntfs-objs to ntfs-y for cleaner conditional inclusion. Signed-off-by: matt mooney <mfm@muteddisk.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
|
#
9130090b |
|
06-Mar-2011 |
Dave Chinner <dchinner@redhat.com> |
xfs: kill support/debug.[ch] The remaining functionality in debug.[ch] is effectively just assert handling, conditional debug definitions and hex dumping. The hex dumping and assert function can be moved into the new printk module, while the rest can be moved into top-level header files. This allows fs/xfs/support/debug.[ch] to be completely removed from the codebase. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
10e38391 |
|
01-Mar-2011 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce new logging API. Most of the logging infrastructure in XFS is unneccessary and designed around the infrastructure supplied by Irix rather than Linux. To rationalise the logging interfaces, start by introducing simple printk wrappers similar to the dev_printk() infrastructure. Later patches will convert code to use this new interface. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
a46db608 |
|
07-Jan-2011 |
Christoph Hellwig <hch@infradead.org> |
xfs: add FITRIM support Allow manual discards from userspace using the FITRIM ioctl. This is not intended to be run during normal workloads, as the freepsace btree walks can cause large performance degradation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
|
#
e98c414f |
|
23-Jun-2010 |
Christoph Hellwig <hch@infradead.org> |
xfs: simplify log item descriptor tracking Currently we track log item descriptor belonging to a transaction using a complex opencoded chunk allocator. This code has been there since day one and seems to work around the lack of an efficient slab allocator. This patch replaces it with dynamically allocated log item descriptors from a dedicated slab pool, linked to the transaction by a linked list. This allows to greatly simplify the log item descriptor tracking to the point where it's just a couple hundred lines in xfs_trans.c instead of a separate file. The external API has also been simplified while we're at it - the xfs_trans_add_item and xfs_trans_del_item functions to add/ delete items from a transaction have been simplified to the bare minium, and the xfs_trans_find_item function is replaced with a direct dereference of the li_desc field. All debug code walking the list of log items in a transaction is down to a simple list_for_each_entry. Note that we could easily use a singly linked list here instead of the double linked list from list.h as the fastpath only does deletion from sequential traversal. But given that we don't have one available as a library function yet I use the list.h functions for simplicity. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
288699fe |
|
23-Jun-2010 |
Christoph Hellwig <hch@infradead.org> |
xfs: drop dmapi hooks Dmapi support was never merged upstream, but we still have a lot of hooks bloating XFS for it, all over the fast pathes of the filesystem. This patch drops over 700 lines of dmapi overhead. If we'll ever get HSM support in mainline at least the namespace events can be done much saner in the VFS instead of the individual filesystem, so it's not like this is much help for future work. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
71e330b5 |
|
20-May-2010 |
Dave Chinner <dchinner@redhat.com> |
xfs: Introduce delayed logging core code The delayed logging code only changes in-memory structures and as such can be enabled and disabled with a mount option. Add the mount option and emit a warning that this is an experimental feature that should not be used in production yet. We also need infrastructure to track committed items that have not yet been written to the log. This is what the Committed Item List (CIL) is for. The log item also needs to be extended to track the current log vector, the associated memory buffer and it's location in the Commit Item List. Extend the log item and log vector structures to enable this tracking. To maintain the current log format for transactions with delayed logging, we need to introduce a checkpoint transaction and a context for tracking each checkpoint from initiation to transaction completion. This includes adding a log ticket for tracking space log required/used by the context checkpoint. To track all the changes we need an io vector array per log item, rather than a single array for the entire transaction. Using the new log vector structure for this requires two passes - the first to allocate the log vector structures and chain them together, and the second to fill them out. This log vector chain can then be passed to the CIL for formatting, pinning and insertion into the CIL. Formatting of the log vector chain is relatively simple - it's just a loop over the iovecs on each log vector, but it is made slightly more complex because we re-write the iovec after the copy to point back at the memory buffer we just copied into. This code also needs to pin log items. If the log item is not already tracked in this checkpoint context, then it needs to be pinned. Otherwise it is already pinned and we don't need to pin it again. The only other complexity is calculating the amount of new log space the formatting has consumed. This needs to be accounted to the transaction in progress, and the accounting is made more complex becase we need also to steal space from it for log metadata in the checkpoint transaction. Calculate all this at insert time and update all the tickets, counters, etc correctly. Once we've formatted all the log items in the transaction, attach the busy extents to the checkpoint context so the busy extents live until checkpoint completion and can be processed at that point in time. Transactions can then be freed at this point in time. Now we need to issue checkpoints - we are tracking the amount of log space used by the items in the CIL, so we can trigger background checkpoints when the space usage gets to a certain threshold. Otherwise, checkpoints need ot be triggered when a log synchronisation point is reached - a log force event. Because the log write code already handles chained log vectors, writing the transaction is trivial, too. Construct a transaction header, add it to the head of the chain and write it into the log, then issue a commit record write. Then we can release the checkpoint log ticket and attach the context to the log buffer so it can be called during Io completion to complete the checkpoint. We also need to allow for synchronising multiple in-flight checkpoints. This is needed for two things - the first is to ensure that checkpoint commit records appear in the log in the correct sequence order (so they are replayed in the correct order). The second is so that xfs_log_force_lsn() operates correctly and only flushes and/or waits for the specific sequence it was provided with. To do this we need a wait variable and a list tracking the checkpoint commits in progress. We can walk this list and wait for the checkpoints to change state or complete easily, an this provides the necessary synchronisation for correct operation in both cases. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
|
#
dda35b8f |
|
15-Feb-2010 |
Christoph Hellwig <hch@infradead.org> |
xfs: merge xfs_lrw.c into xfs_file.c Currently the code to implement the file operations is split over two small files. Merge the content of xfs_lrw.c into xfs_file.c to have it in one place. Note that I haven't done various cleanups that are possible after this yet, they will follow in the next patch. Also the function xfs_dev_is_read_only which was in xfs_lrw.c before really doesn't fit in here at all and was moved to xfs_mount.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
|
#
388f1f0c |
|
25-Jan-2010 |
Dave Chinner <david@fromorbit.com> |
xfs: turn off sign warnings Because they cause warnings in static inline functions conditionally compiled into XFS from the VFS (e.g. fsnotify). Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
58c75cfb |
|
19-Jan-2010 |
Dave Chinner <david@fromorbit.com> |
xfs: make compile warn about char sign mismatches again The -fno-unsigned-char directive has no effect anymore as the XFs build is clean. However, the kernel build hides pointer sign differences so turn that back on so that we can clean up all the mismatches prior to a userspace code resync. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
0b1b213f |
|
14-Dec-2009 |
Christoph Hellwig <hch@infradead.org> |
xfs: event tracing support Convert the old xfs tracing support that could only be used with the out of tree kdb and xfsidbg patches to use the generic event tracer. To use it make sure CONFIG_EVENT_TRACING is enabled and then enable all xfs trace channels by: echo 1 > /sys/kernel/debug/tracing/events/xfs/enable or alternatively enable single events by just doing the same in one event subdirectory, e.g. echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable or set more complex filters, etc. In Documentation/trace/events.txt all this is desctribed in more detail. To reads the events do a cat /sys/kernel/debug/tracing/trace Compared to the last posting this patch converts the tracing mostly to the one tracepoint per callsite model that other users of the new tracing facility also employ. This allows a very fine-grained control of the tracing, a cleaner output of the traces and also enables the perf tool to use each tracepoint as a virtual performance counter, allowing us to e.g. count how often certain workloads git various spots in XFS. Take a look at http://lwn.net/Articles/346470/ for some examples. Also the btree tracing isn't included at all yet, as it will require additional core tracing features not in mainline yet, I plan to deliver it later. And the really nice thing about this patch is that it actually removes many lines of code while adding this nice functionality: fs/xfs/Makefile | 8 fs/xfs/linux-2.6/xfs_acl.c | 1 fs/xfs/linux-2.6/xfs_aops.c | 52 - fs/xfs/linux-2.6/xfs_aops.h | 2 fs/xfs/linux-2.6/xfs_buf.c | 117 +-- fs/xfs/linux-2.6/xfs_buf.h | 33 fs/xfs/linux-2.6/xfs_fs_subr.c | 3 fs/xfs/linux-2.6/xfs_ioctl.c | 1 fs/xfs/linux-2.6/xfs_ioctl32.c | 1 fs/xfs/linux-2.6/xfs_iops.c | 1 fs/xfs/linux-2.6/xfs_linux.h | 1 fs/xfs/linux-2.6/xfs_lrw.c | 87 -- fs/xfs/linux-2.6/xfs_lrw.h | 45 - fs/xfs/linux-2.6/xfs_super.c | 104 --- fs/xfs/linux-2.6/xfs_super.h | 7 fs/xfs/linux-2.6/xfs_sync.c | 1 fs/xfs/linux-2.6/xfs_trace.c | 75 ++ fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++ fs/xfs/linux-2.6/xfs_vnode.h | 4 fs/xfs/quota/xfs_dquot.c | 110 --- fs/xfs/quota/xfs_dquot.h | 21 fs/xfs/quota/xfs_qm.c | 40 - fs/xfs/quota/xfs_qm_syscalls.c | 4 fs/xfs/support/ktrace.c | 323 --------- fs/xfs/support/ktrace.h | 85 -- fs/xfs/xfs.h | 16 fs/xfs/xfs_ag.h | 14 fs/xfs/xfs_alloc.c | 230 +----- fs/xfs/xfs_alloc.h | 27 fs/xfs/xfs_alloc_btree.c | 1 fs/xfs/xfs_attr.c | 107 --- fs/xfs/xfs_attr.h | 10 fs/xfs/xfs_attr_leaf.c | 14 fs/xfs/xfs_attr_sf.h | 40 - fs/xfs/xfs_bmap.c | 507 +++------------ fs/xfs/xfs_bmap.h | 49 - fs/xfs/xfs_bmap_btree.c | 6 fs/xfs/xfs_btree.c | 5 fs/xfs/xfs_btree_trace.h | 17 fs/xfs/xfs_buf_item.c | 87 -- fs/xfs/xfs_buf_item.h | 20 fs/xfs/xfs_da_btree.c | 3 fs/xfs/xfs_da_btree.h | 7 fs/xfs/xfs_dfrag.c | 2 fs/xfs/xfs_dir2.c | 8 fs/xfs/xfs_dir2_block.c | 20 fs/xfs/xfs_dir2_leaf.c | 21 fs/xfs/xfs_dir2_node.c | 27 fs/xfs/xfs_dir2_sf.c | 26 fs/xfs/xfs_dir2_trace.c | 216 ------ fs/xfs/xfs_dir2_trace.h | 72 -- fs/xfs/xfs_filestream.c | 8 fs/xfs/xfs_fsops.c | 2 fs/xfs/xfs_iget.c | 111 --- fs/xfs/xfs_inode.c | 67 -- fs/xfs/xfs_inode.h | 76 -- fs/xfs/xfs_inode_item.c | 5 fs/xfs/xfs_iomap.c | 85 -- fs/xfs/xfs_iomap.h | 8 fs/xfs/xfs_log.c | 181 +---- fs/xfs/xfs_log_priv.h | 20 fs/xfs/xfs_log_recover.c | 1 fs/xfs/xfs_mount.c | 2 fs/xfs/xfs_quota.h | 8 fs/xfs/xfs_rename.c | 1 fs/xfs/xfs_rtalloc.c | 1 fs/xfs/xfs_rw.c | 3 fs/xfs/xfs_trans.h | 47 + fs/xfs/xfs_trans_buf.c | 62 - fs/xfs/xfs_vnodeops.c | 8 70 files changed, 2151 insertions(+), 2592 deletions(-) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
|
#
ef14f0c1 |
|
10-Jun-2009 |
Christoph Hellwig <hch@lst.de> |
xfs: use generic Posix ACL code This patch rips out the XFS ACL handling code and uses the generic fs/posix_acl.c code instead. The ondisk format is of course left unchanged. This also introduces the same ACL caching all other Linux filesystems do by adding pointers to the acl and default acl in struct xfs_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
#
7d095257 |
|
08-Jun-2009 |
Christoph Hellwig <hch@lst.de> |
xfs: kill xfs_qmops Kill the quota ops function vector and replace it with direct calls or stubs in the CONFIG_XFS_QUOTA=n case. Make sure we check XFS_IS_QUOTA_RUNNING in the right spots. We can remove the number of those checks because the XFS_TRANS_DQ_DIRTY flag can't be set otherwise. This brings us back closer to the way this code worked in IRIX and earlier Linux versions, but we keep a lot of the more useful factoring of common code. Eventually we should also kill xfs_qm_bhv.c, but that's left for a later patch. Reduces the size of the source code by about 250 lines and the size of XFS module by about 1.5 kilobytes with quotas enabled: text data bss dec hex filename 615957 2960 3848 622765 980ad fs/xfs/xfs.o 617231 3152 3848 624231 98667 fs/xfs/xfs.o.old Fallout: - xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects the inode locked and xfs_qm_dqattach which does the locking around it, thus removing XFS_QMOPT_ILOCKED. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
#
fcafb71b |
|
09-Feb-2009 |
Christoph Hellwig <hch@lst.de> |
xfs: get rid of indirections in the quotaops implementation Currently we call from the nicely abstracted linux quotaops into a ugly multiplexer just to split the calls out at the same boundary again. Rewrite the quota ops handling to remove that obfucation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
|
#
5a8d0f3c |
|
02-Dec-2008 |
Christoph Hellwig <hch@lst.de> |
move inode tracing out of xfs_vnode. Move the inode tracing into xfs_iget.c / xfs_inode.h and kill xfs_vnode.c now that it's empty. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
|
#
f95099ba |
|
02-Dec-2008 |
Christoph Hellwig <hch@lst.de> |
kill xfs_unmount_flush There's almost nothing left in this function, instead remove the IRELE on the real times inodes and the call to XFS_QM_UNMOUNT into xfs_unmountfs. For the regular unmount case that means it now also happenes after dmapi notification, but otherwise there is no difference in behaviour. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
|
#
fe4fa4b8 |
|
30-Oct-2008 |
David Chinner <david@fromorbit.com> |
[XFS] move sync code to its own file The sync code in XFS is spread around several files. While it used to make sense to have such a distribution, the code is about to be cleaned up and so centralising it in one spot as the first step makes sense. SGI-PV: 988139 SGI-Modid: xfs-linux-melb:xfs-kern:32282a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
|
#
8c4ed633 |
|
29-Oct-2008 |
Christoph Hellwig <hch@infradead.org> |
[XFS] make btree tracing generic Make the existing bmap btree tracing generic so that it applies to all btree types. Some fragments lifted from a patch by Dave Chinner. SGI-PV: 985583 SGI-Modid: xfs-linux-melb:xfs-kern:32187a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Bill O'Donnell <billodo@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
|
#
0ec58516 |
|
22-Jun-2008 |
Lachlan McIlroy <lachlan@redback.melbourne.sgi.com> |
[XFS] Use the generic xattr methods. Use the generic set, get and removexattr methods and supply the s_xattr array with fine-grained handlers. All XFS/Linux highlevel attr handling is rewritten from scratch and placed into fs/xfs/linux-2.6/xfs_xattr.c so that it's separated from the generic low-level code. SGI-PV: 982343 SGI-Modid: xfs-linux-melb:xfs-kern:31234a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
#
269cdfaf |
|
28-Nov-2007 |
Lachlan McIlroy <lachlan@redback.melbourne.sgi.com> |
[XFS] Added quota targets and removed dmapi directory Fixes build failures introduced by bad merge to mainline.
|
#
794f744b |
|
26-Nov-2007 |
Eric Sandeen <sandeen@sandeen.net> |
[XFS] Fix up xfs out-of-tree builds. (a.k.a. external modules) Change -I include directives to find headers in the out-of-tree spot. This allows a directory containing only xfs files to be built as: SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29878a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
#
58b7983d |
|
26-Nov-2007 |
Andi Kleen <ak@linux.intel.com> |
[XFS] Remove Makefile wrappers in XFS Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will really still compile on 2.4; so drop that. This moves Makefile-linux-26 into Makefile and drops Kbuild. Also having wrappers as both Kbuild and Makefile seemed redundant anyways. The patch is relatively large because it renames a file, but no functional changes. SGI-PV: 971050 SGI-Modid: xfs-linux-melb:xfs-kern:29781a Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
#
cde410a9 |
|
04-Sep-2005 |
Nathan Scott <nathans@sgi.com> |
[XFS] Sort out some cosmetic differences between XFS trees. SGI-PV: 904196 SGI-Modid: xfs-linux-melb:xfs-kern:23719a Signed-off-by: Nathan Scott <nathans@sgi.com>
|
#
9effd8e6 |
|
05-May-2005 |
Eric Sandeen <sandeen@sgi.com> |
[XFS] Enable XFS_VNODE_TRACE SGI Modid: xfs-linux:xfs-kern:190725a Signed-off-by: Eric Sandeen <sandeen@sgi.com> Signed-off-by: Christoph Hellwig <hch@sgi.com> .
|
#
1da177e4 |
|
16-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|