#
4bdfd7d1 |
|
15-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: repair free space btrees Rebuild the free space btrees from the gaps in the rmap btree. Refer to the case study in Documentation/filesystems/xfs-online-fsck-design.rst for more details. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
e3042be3 |
|
06-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: automatic freeing of freshly allocated unwritten space As mentioned in the previous commit, online repair wants to allocate space to write out a new metadata structure, and it also wants to hedge against system crashes during repairs by logging (and later cancelling) EFIs to free the space if we crash before committing the new data structure. Therefore, create a trio of functions to schedule automatic reaping of freshly allocated unwritten space. xfs_alloc_schedule_autoreap creates a paused EFI representing the space we just allocated. Once the allocations are made and the autoreaps scheduled, we can start writing to disk. If the writes succeed, xfs_alloc_cancel_autoreap marks the EFI work items as stale and unpauses the pending deferred work item. Assuming that's done in the same transaction that commits the new structure into the filesystem, we guarantee that either the new object is fully visible, or that all the space gets reclaimed. If the writes succeed but only part of an extent was used, repair must call the same _cancel_autoreap function to kill the first EFI and then log a new EFI to free the unused space. The first EFI is already committed, so it cannot be changed. For full extents that aren't used, xfs_alloc_commit_autoreap will unpause the EFI, which results in the space being freed during the next _defer_finish cycle. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
4c88fef3 |
|
06-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: remove __xfs_free_extent_later xfs_free_extent_later is a trivial helper, so remove it to reduce the amount of thinking required to understand the deferred freeing interface. This will make it easier to introduce automatic reaping of speculative allocations in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
2d7d1e7e |
|
29-Jun-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: AGI length should be bounds checked Similar to the recent patch strengthening the AGF agf_length verification, the AGI verifier does not check that the AGI length field is within known good bounds. This isn't currently checked by runtime kernel code, yet we assume in many places that it is correct and verify other metadata against it. Add length verification to the AGI verifier. Just like the AGF length checking, the length of the AGI must be equal to the size of the AG specified in the superblock, unless it is the last AG in the filesystem. In that case, it must be less than or equal to sb->sb_agblocks and greater than XFS_MIN_AG_BLOCKS, which is the smallest AG a growfs operation will allow to exist. There's only one place in the filesystem that actually uses agi_length, but let's not leave it vulnerable to the same weird nonsense that generates syzbot bugs, eh? Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
8ebbf262 |
|
28-Jun-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: don't block in busy flushing when freeing extents If the current transaction holds a busy extent and we are trying to allocate a new extent to fix up the free list, we can deadlock if the AG is entirely empty except for the busy extent held by the transaction. This can occur at runtime processing an XEFI with multiple extents in this path: __schedule+0x22f at ffffffff81f75e8f schedule+0x46 at ffffffff81f76366 xfs_extent_busy_flush+0x69 at ffffffff81477d99 xfs_alloc_ag_vextent_size+0x16a at ffffffff8141711a xfs_alloc_ag_vextent+0x19b at ffffffff81417edb xfs_alloc_fix_freelist+0x22f at ffffffff8141896f xfs_free_extent_fix_freelist+0x6a at ffffffff8141939a __xfs_free_extent+0x99 at ffffffff81419499 xfs_trans_free_extent+0x3e at ffffffff814a6fee xfs_extent_free_finish_item+0x24 at ffffffff814a70d4 xfs_defer_finish_noroll+0x1f7 at ffffffff81441407 xfs_defer_finish+0x11 at ffffffff814417e1 xfs_itruncate_extents_flags+0x13d at ffffffff8148b7dd xfs_inactive_truncate+0xb9 at ffffffff8148bb89 xfs_inactive+0x227 at ffffffff8148c4f7 xfs_fs_destroy_inode+0xb8 at ffffffff81496898 destroy_inode+0x3b at ffffffff8127d2ab do_unlinkat+0x1d1 at ffffffff81270df1 do_syscall_64+0x40 at ffffffff81f6b5f0 entry_SYSCALL_64_after_hwframe+0x44 at ffffffff8200007c This can also happen in log recovery when processing an EFI with multiple extents through this path: context_switch() kernel/sched/core.c:3881 __schedule() kernel/sched/core.c:5111 schedule() kernel/sched/core.c:5186 xfs_extent_busy_flush() fs/xfs/xfs_extent_busy.c:598 xfs_alloc_ag_vextent_size() fs/xfs/libxfs/xfs_alloc.c:1641 xfs_alloc_ag_vextent() fs/xfs/libxfs/xfs_alloc.c:828 xfs_alloc_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:2362 xfs_free_extent_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:3029 __xfs_free_extent() fs/xfs/libxfs/xfs_alloc.c:3067 xfs_trans_free_extent() fs/xfs/xfs_extfree_item.c:370 xfs_efi_recover() fs/xfs/xfs_extfree_item.c:626 xlog_recover_process_efi() fs/xfs/xfs_log_recover.c:4605 xlog_recover_process_intents() fs/xfs/xfs_log_recover.c:4893 xlog_recover_finish() fs/xfs/xfs_log_recover.c:5824 xfs_log_mount_finish() fs/xfs/xfs_log.c:764 xfs_mountfs() fs/xfs/xfs_mount.c:978 xfs_fs_fill_super() fs/xfs/xfs_super.c:1908 mount_bdev() fs/super.c:1417 xfs_fs_mount() fs/xfs/xfs_super.c:1985 legacy_get_tree() fs/fs_context.c:647 vfs_get_tree() fs/super.c:1547 do_new_mount() fs/namespace.c:2843 do_mount() fs/namespace.c:3163 ksys_mount() fs/namespace.c:3372 __do_sys_mount() fs/namespace.c:3386 __se_sys_mount() fs/namespace.c:3383 __x64_sys_mount() fs/namespace.c:3383 do_syscall_64() arch/x86/entry/common.c:296 entry_SYSCALL_64() arch/x86/entry/entry_64.S:180 To avoid this deadlock, we should not block in xfs_extent_busy_flush() if we hold a busy extent in the current transaction. Now that the EFI processing code can handle requeuing a partially completed EFI, we can detect this situation in xfs_extent_busy_flush() and return -EAGAIN rather than going to sleep forever. The -EAGAIN get propagated back out to the xfs_trans_free_extent() context, where the EFD is populated and the transaction is rolled, thereby moving the busy extents into the CIL. At this point, we can retry the extent free operation again with a clean transaction. If we hit the same "all free extents are busy" situation when trying to fix up the free list, we can safely call xfs_extent_busy_flush() and wait for the busy extents to resolve and wake us. At this point, the allocation search can make progress again and we can fix up the free list. This deadlock was first reported by Chandan in mid-2021, but I couldn't make myself understood during review, and didn't have time to fix it myself. It was reported again in March 2023, and again I have found myself unable to explain the complexities of the solution needed during review. As such, I don't have hours more time to waste trying to get the fix written the way it needs to be written, so I'm just doing it myself. This patchset is largely based on Wengang Wang's last patch, but with all the unnecessary stuff removed, split up into multiple patches and cleaned up somewhat. Reported-by: Chandan Babu R <chandanrlinux@gmail.com> Reported-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
6a2a9d77 |
|
28-Jun-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass alloc flags through to xfs_extent_busy_flush() To avoid blocking in xfs_extent_busy_flush() when freeing extents and the only busy extents are held by the current transaction, we need to pass the XFS_ALLOC_FLAG_FREEING flag context all the way into xfs_extent_busy_flush(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
|
#
b742d7b4 |
|
28-Jun-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: use deferred frees for btree block freeing Btrees that aren't freespace management trees use the normal extent allocation and freeing routines for their blocks. Hence when a btree block is freed, a direct call to xfs_free_extent() is made and the extent is immediately freed. This puts the entire free space management btrees under this path, so we are stacking btrees on btrees in the call stack. The inobt, finobt and refcount btrees all do this. However, the bmap btree does not do this - it calls xfs_free_extent_later() to defer the extent free operation via an XEFI and hence it gets processed in deferred operation processing during the commit of the primary transaction (i.e. via intent chaining). We need to change xfs_free_extent() to behave in a non-blocking manner so that we can avoid deadlocks with busy extents near ENOSPC in transactions that free multiple extents. Inserting or removing a record from a btree can cause a multi-level tree merge operation and that will free multiple blocks from the btree in a single transaction. i.e. we can call xfs_free_extent() multiple times, and hence the btree manipulation transaction is vulnerable to this busy extent deadlock vector. To fix this, convert all the remaining callers of xfs_free_extent() to use xfs_free_extent_later() to queue XEFIs and hence defer processing of the extent frees to a context that can be safely restarted if a deadlock condition is detected. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
|
#
7dfee17b |
|
04-Jun-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: validate block number being freed before adding to xefi Bad things happen in defered extent freeing operations if it is passed a bad block number in the xefi. This can come from a bogus agno/agbno pair from deferred agfl freeing, or just a bad fsbno being passed to __xfs_free_extent_later(). Either way, it's very difficult to diagnose where a null perag oops in EFI creation is coming from when the operation that queued the xefi has already been completed and there's no longer any trace of it around.... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
6abc7aef |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: replace xfs_btree_has_record with a general keyspace scanner The current implementation of xfs_btree_has_record returns true if it finds /any/ record within the given range. Unfortunately, that's not sufficient for scrub. We want to be able to tell if a range of keyspace for a btree is devoid of records, is totally mapped to records, or is somewhere in between. By forcing this to be a boolean, we conflated sparseness and fullness, which caused scrub to return incorrect results. Fix the API so that we can tell the caller which of those three is the current state. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
35e3b9a1 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: standardize ondisk to incore conversion for free space btrees Create a xfs_alloc_btrec_to_irec function to convert an ondisk record to an incore record, and a xfs_alloc_check_irec function to detect corruption. Replace all the open-coded logic with calls to the new helpers and bubble up corruption reports. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
f6b38463 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: give xfs_extfree_intent its own perag reference Give the xfs_extfree_intent an passive reference to the perag structure data. This reference will be used to enable scrub intent draining functionality in subsequent patches. The space being freed must already be allocated, so we need to able to run even if the AG is being offlined or shrunk. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
b2ccab31 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: pass per-ag references to xfs_free_extent Pass a reference to the per-AG structure to xfs_free_extent. Most callers already have one, so we can eliminate unnecessary lookups. The one exception to this is the EFI code, which the next patch will fix. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
230e8fe8 |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: fold xfs_alloc_ag_vextent() into callers We don't need the multiplexing xfs_alloc_ag_vextent() provided anymore - we can just call the exact/near/size variants directly. This allows us to remove args->type completely and stop using args->fsbno as an input to the allocator algorithms. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
5f36b2ce |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce xfs_alloc_vextent_exact_bno() Two of the callers to xfs_alloc_vextent_this_ag() actually want exact block number allocation, not anywhere-in-ag allocation. Split this out from _this_ag() as a first class citizen so no external extent allocation code needs to care about args->type anymore. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
db4710fd |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce xfs_alloc_vextent_near_bno() The remaining callers of xfs_alloc_vextent() are all doing NEAR_BNO allocations. We can replace that function with a new xfs_alloc_vextent_near_bno() function that does this explicitly. We also multiplex NEAR_BNO allocations through xfs_alloc_vextent_this_ag via args->type. Replace all of these with direct calls to xfs_alloc_vextent_near_bno(), too. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
2a7f6d41 |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: use xfs_alloc_vextent_start_bno() where appropriate Change obvious callers of single AG allocation to use xfs_alloc_vextent_start_bno(). Callers no long need to specify XFS_ALLOCTYPE_START_BNO, and so the type can be driven inward and removed. While doing this, also pass the allocation target fsb as a parameter rather than encoding it in args->fsbno. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
319c9e87 |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: use xfs_alloc_vextent_first_ag() where appropriate Change obvious callers of single AG allocation to use xfs_alloc_vextent_first_ag(). This gets rid of XFS_ALLOCTYPE_FIRST_AG as the type used within xfs_alloc_vextent_first_ag() during iteration is _THIS_AG. Hence we can remove the setting of args->type from all the callers of _first_ag() and remove the alloctype. While doing this, pass the allocation target fsb as a parameter rather than encoding it in args->fsbno. This starts the process of making args->fsbno an output only variable rather than input/output. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
74c36a86 |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: use xfs_alloc_vextent_this_ag() where appropriate Change obvious callers of single AG allocation to use xfs_alloc_vextent_this_ag(). Drive the per-ag grabbing out to the callers, too, so that callers with active references don't need to do new lookups just for an allocation in a context that already has a perag reference. The only remaining caller that does single AG allocation through xfs_alloc_vextent() is xfs_bmap_btalloc() with XFS_ALLOCTYPE_NEAR_BNO. That is going to need more untangling before it can be converted cleanly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
cec7bb7d |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass perag to xfs_alloc_read_agfl We have the perag in most places we call xfs_alloc_read_agfl, so pass the perag instead of a mount/agno pair. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
8c392eb2 |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass perag to xfs_alloc_put_freelist It's available in all callers, so pass it in so that the perag can be passed further down the stack. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
49f0d84e |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass perag to xfs_alloc_get_freelist It's available in all callers, so pass it in so that the perag can be passed further down the stack. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
fa044ae7 |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass perag to xfs_read_agf We have the perag in most places we call xfs_read_agf, so pass the perag instead of a mount/agno pair. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
08d3e84f |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass perag to xfs_alloc_read_agf() xfs_alloc_read_agf() initialises the perag if it hasn't been done yet, so it makes sense to pass it the perag rather than pull a reference from the buffer. This allows callers to be per-ag centric rather than passing mount/agno pairs everywhere. Whilst modifying the xfs_reflink_find_shared() function definition, declare it static and remove the extern declaration as it is an internal function only these days. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
76b47e52 |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: kill xfs_alloc_pagf_init() Trivial wrapper around xfs_alloc_read_agf(), can be easily replaced by passing a NULL agfbp to xfs_alloc_read_agf(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
f53dde11 |
|
20-Apr-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert AGF log flags to unsigned. 5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned fields to be unsigned. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
93defd5a |
|
16-Mar-2022 |
Darrick J. Wong <djwong@kernel.org> |
xfs: document the XFS_ALLOC_AGFL_RESERVE constant Currently, we use this undocumented macro to encode the minimum number of blocks needed to replenish a completely empty AGFL when an AG is nearly full. This has lead to confusion on the part of the maintainers, so let's document what the value actually means, and move it to xfs_alloc.c since it's not used outside of that module. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
b3b5ff41 |
|
12-Oct-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: reduce the size of struct xfs_extent_free_item We only use EFIs to free metadata blocks -- not regular data/attr fork extents. Remove all the fields that we never use, for a net reduction of 16 bytes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
|
#
c201d9ca |
|
12-Oct-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: rename xfs_bmap_add_free to xfs_free_extent_later xfs_bmap_add_free isn't a block mapping function; it schedules deferred freeing operations for a later point in a compound transaction chain. While it's primarily used by bunmapi, its use has expanded beyond that. Move it to xfs_alloc.c and rename the function since it's now general freeing functionality. Bring the slab cache bits in line with the way we handle the other intent items. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
|
#
7cb3efb4 |
|
13-Oct-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: rename m_ag_maxlevels to m_allocbt_maxlevels Years ago when XFS was thought to be much more simple, we introduced m_ag_maxlevels to specify the maximum btree height of per-AG btrees for a given filesystem mount. Then we observed that inode btrees don't actually have the same height and split that off; and now we have rmap and refcount btrees with much different geometries and separate maxlevels variables. The 'ag' part of the name doesn't make much sense anymore, so rename this to m_alloc_maxlevels to reinforce that this is the maximum height of the *free space* btrees. This sets us up for the next patch, which will add a variable to track the maximum height of all AG btrees. (Also take the opportunity to improve adjacent comments and fix minor style problems.) Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
ebd9027d |
|
18-Aug-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert xfs_sb_version_has checks to use mount features This is a conversion of the remaining xfs_sb_version_has..(sbp) checks to use xfs_has_..(mp) feature checks. This was largely done with a vim replacement macro that did: :0,$s/xfs_sb_version_has\(.*\)&\(.*\)->m_sb/xfs_has_\1\2/g<CR> A couple of other variants were also used, and the rest touched up by hand. $ size -t fs/xfs/built-in.a text data bss dec hex filename before 1127533 311352 484 1439369 15f689 (TOTALS) after 1125360 311352 484 1437196 15ee0c (TOTALS) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
159eb69d |
|
10-Aug-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: make the record pointer passed to query_range functions const The query_range functions are supposed to call a caller-supplied function on each record found in the dataset. These functions don't own the memory storing the record, so don't let them change the record. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
04dcb474 |
|
10-Aug-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: make the key parameters to all btree query range functions const Range query functions are not supposed to modify the query keys that are being passed in, so mark them all const. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
45d06621 |
|
01-Jun-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass perags through to the busy extent code All of the callers of the busy extent API either have perag references available to use so we can pass a perag to the busy extent functions rather than having them have to do unnecessary lookups. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
30151967 |
|
22-Jan-2021 |
Chandan Babu R <chandanrlinux@gmail.com> |
xfs: Introduce error injection to allocate only minlen size extents for files This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which helps userspace test programs to get xfs_bmap_btalloc() to always allocate minlen sized extents. This is required for test programs which need a guarantee that minlen extents allocated for a file do not get merged with their existing neighbours in the inode's BMBT. "Inode fork extent overflow check" for Directories, Xattrs and extension of realtime inodes need this since the file offset at which the extents are being allocated cannot be explicitly controlled from userspace. One way to use this error tag is to, 1. Consume all of the free space by sequentially writing to a file. 2. Punch alternate blocks of the file. This causes CNTBT to contain sufficient number of one block sized extent records. 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag. After step 3, xfs_bmap_btalloc() will issue space allocation requests for minlen sized extents only. ENOSPC error code is returned to userspace when there aren't any "one block sized" extents left in any of the AGs. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
508578f2 |
|
12-May-2020 |
Nishad Kamdar <nishadkamdar@gmail.com> |
xfs: Use the correct style for SPDX License Identifier This patch corrects the SPDX License Identifier style in header files related to XFS File System support. For C header files Documentation/process/license-rules.rst mandates C-like comments. (opposed to C source files where C++ style should be used). Changes made by using a script provided by Joe Perches here: https://lkml.org/lkml/2019/2/7/46. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
183606d8 |
|
10-Mar-2020 |
Christoph Hellwig <hch@lst.de> |
xfs: remove the agfl_bno member from struct xfs_agfl struct xfs_agfl is a header in front of the AGFL entries that exists for CRC enabled file systems. For not CRC enabled file systems the AGFL is simply a list of agbno. Make the CRC case similar to that by just using the list behind the new header. This indirectly solves a problem with modern gcc versions that warn about taking addresses of packed structures (and we have to pack the AGFL given that gcc rounds up structure sizes). Also replace the helper macro to get from a buffer with an inline function in xfs_alloc.h to make the code easier to read. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
c34d570d |
|
30-Oct-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: cleanup use of the XFS_ALLOC_ flags Always set XFS_ALLOC_USERDATA for data fork allocations, and check it in xfs_alloc_is_userdata instead of the current obsfucated check. Also remove the xfs_alloc_is_userdata and xfs_alloc_allow_busy_reuse helpers to make the code a little easier to understand. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
fd638f1d |
|
30-Oct-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: move extent zeroing to xfs_bmapi_allocate Move the extent zeroing case there for the XFS_BMAPI_ZERO flag outside the low-level allocator and into xfs_bmapi_allocate, where is still is in transaction context, but outside the very lowlevel code where it doesn't belong. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
ce840429 |
|
23-Sep-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: revert 1baa2800e62d ("xfs: remove the unused XFS_ALLOC_USERDATA flag") Revert this commit, as it caused periodic regressions in xfs/173 w/ 1k blocks. [1] https://lore.kernel.org/lkml/20190919014602.GN15734@shao2-debian/ Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
1baa2800 |
|
30-Aug-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: remove the unused XFS_ALLOC_USERDATA flag Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
66e3237e |
|
12-Dec-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: const-ify xfs_owner_info arguments Only certain functions actually change the contents of an xfs_owner_info; the rest can accept a const struct pointer. This will enable us to save stack space by hoisting static owner info types to be const global variables. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
64396ff2 |
|
11-Jul-2018 |
Brian Foster <bfoster@redhat.com> |
xfs: remove xfs_alloc_arg firstblock field The xfs_alloc_arg.firstblock field is used to control the starting agno for an allocation. The structure already carries a pointer to the transaction, which carries the current firstblock value. Remove the field and access ->t_firstblock directly in the allocation code. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
86210fbe |
|
07-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: move various type verifiers to common file New verification functions like xfs_verify_fsbno() and xfs_verify_agino() are spread across multiple files and different header files. They really don't fit cleanly into the places they've been put, and have wider scope than the current header includes. Move the type verifiers to a new file in libxfs (xfs-types.c) and the prototypes to xfs_types.h where they will be visible to all the code that uses the types. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
0b61f8a4 |
|
05-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert to SPDX license tags Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
9f3a080e |
|
14-May-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: hoist xfs_scrub_agfl_walk to libxfs as xfs_agfl_walk This function is basically a generic AGFL block iterator, so promote it to libxfs ahead of online repair wanting to use it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
4e529339 |
|
10-May-2018 |
Brian Foster <bfoster@redhat.com> |
xfs: factor out nodiscard helpers The changes to skip discards of speculative preallocation and unwritten extents introduced several new wrapper functions through the bunmapi -> extent free codepath to reduce churn in all of the associated callers. In several cases, these wrappers simply toggle a single flag to skip or not skip discards for the resulting blocks. The explicit _nodiscard() wrappers for such an isolated set of callers is a bit overkill. Kill off these wrappers and replace with the calls to the underlying functions in the contexts that need to control discard behavior. Retain the wrappers that preserve the original calling conventions to serve the original purpose of reducing code churn. This is a refactoring patch and does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
fcb762f5 |
|
09-May-2018 |
Brian Foster <bfoster@redhat.com> |
xfs: add bmapi nodiscard flag Freed extents are unconditionally discarded when online discard is enabled. Define XFS_BMAPI_NODISCARD to allow callers to bypass discards when unnecessary. For example, this will be useful for eofblocks trimming. This patch does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
4223f659 |
|
07-May-2018 |
Brian Foster <bfoster@redhat.com> |
xfs: create agfl block free helper function Refactor the AGFL block free code into a new helper such that it can be invoked from deferred context. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
a1f69417 |
|
06-Apr-2018 |
Eric Sandeen <sandeen@sandeen.net> |
xfs: non-scrub - remove unused function parameters Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
a78ee256 |
|
06-Mar-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert XFS_AGFL_SIZE to a helper function The AGFL size calculation is about to get more complex, so lets turn the macro into a function first and remove the macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> [darrick: forward port to newer kernel, simplify the helper] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
ce1d802e |
|
16-Jan-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add scrub cross-referencing helpers for the free space btrees Add a couple of functions to the free space btrees that will be used to cross-reference metadata against the bnobt/cntbt, and a generic btree function that provides the real implementation. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
21ec5416 |
|
17-Oct-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create block pointer check functions Create some helper functions to check that a block pointer points within the filesystem (or AG) and doesn't point at static metadata. We will use this for scrub. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
26788097 |
|
16-Jun-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: export various function for the online scrubber Export various internal functions so that the online scrubber can use them to check the state of metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
e9a2599a |
|
28-Mar-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create a function to query all records in a btree Create a helper function that will query all records in a btree. This will be used by the online repair functions to examine every record in a btree to rebuild a second btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
2d520bfa |
|
28-Mar-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: provide a query_range function for freespace btrees Implement a query_range function for the bnobt and cntbt. This will be used for getfsmap fallback if there is no rmapbt and by the online scrub and repair code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
8d242e93 |
|
17-Feb-2017 |
Christoph Hellwig <hch@lst.de> |
xfs: remove XFS_ALLOCTYPE_ANY_AG and XFS_ALLOCTYPE_START_AG XFS_ALLOCTYPE_ANY_AG was only used for the RT allocator and is unused now, and XFS_ALLOCTYPE_START_AG has been unused for a while. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
54fee133 |
|
09-Jan-2017 |
Christoph Hellwig <hch@lst.de> |
xfs: adjust allocation length in xfs_alloc_space_available We must decide in xfs_alloc_fix_freelist if we can perform an allocation from a given AG is possible or not based on the available space, and should not fail the allocation past that point on a healthy file system. But currently we have two additional places that second-guess xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the maxlen parameter to remove the reservation before doing the allocation (but ignores the various minium freespace requirements), and xfs_alloc_fix_minleft tries to fix up the allocated length after we've found an extent, but ignores the reservations and also doesn't take the AGFL into account (and thus fails allocations for not matching minlen in some cases). Remove all these later fixups and just correct the maxlen argument inside xfs_alloc_fix_freelist once we have the AGF buffer locked. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
292378ed |
|
25-Sep-2016 |
Dave Chinner <dchinner@redhat.com> |
xfs: remote attribute blocks aren't really userdata When adding a new remote attribute, we write the attribute to the new extent before the allocation transaction is committed. This means we cannot reuse busy extents as that violates crash consistency semantics. Hence we currently treat remote attribute extent allocation like userdata because it has the same overwrite ordering constraints as userdata. Unfortunately, this also allows the allocator to incorrectly apply extent size hints to the remote attribute extent allocation. This results in interesting failures, such as transaction block reservation overruns and in-memory inode attribute fork corruption. To fix this, we need to separate the busy extent reuse configuration from the userdata configuration. This changes the definition of XFS_BMAPI_METADATA slightly - it now means that allocation is metadata and reuse of busy extents is acceptible due to the metadata ordering semantics of the journal. If this flag is not set, it means the allocation is that has unordered data writeback, and hence busy extent reuse is not allowed. It no longer implies the allocation is for user data, just that the data write will not be strictly ordered. This matches the semantics for both user data and remote attribute block allocation. As such, This patch changes the "userdata" field to a "datatype" field, and adds a "no busy reuse" flag to the field. When we detect an unordered data extent allocation, we immediately set the no reuse flag. We then set the "user data" flags based on the inode fork we are allocating the extent to. Hence we only set userdata flags on data fork allocations now and consider attribute fork remote extents to be an unordered metadata extent. The result is that remote attribute extents now have the expected allocation semantics, and the data fork allocation behaviour is completely unchanged. It should be noted that there may be other ways to fix this (e.g. use ordered metadata buffers for the remote attribute extent data write) but they are more invasive and difficult to validate both from a design and implementation POV. Hence this patch takes the simple, obvious route to fixing the problem... Reported-and-tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
3fd129b6 |
|
18-Sep-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: set up per-AG free space reservations One unfortunate quirk of the reference count and reverse mapping btrees -- they can expand in size when blocks are written to *other* allocation groups if, say, one large extent becomes a lot of tiny extents. Since we don't want to start throwing errors in the middle of CoWing, we need to reserve some blocks to handle future expansion. The transaction block reservation counters aren't sufficient here because we have to have a reserve of blocks in every AG, not just somewhere in the filesystem. Therefore, create two per-AG block reservation pools. One feeds the AGFL so that rmapbt expansion always succeeds, and the other feeds all other metadata so that refcountbt expansion never fails. Use the count of how many reserved blocks we need to have on hand to create a virtual reservation in the AG. Through selective clamping of the maximum length of allocation requests and of the length of the longest free extent, we can make it look like there's less free space in the AG unless the reservation owner is asking for blocks. In other words, play some accounting tricks in-core to make sure that we always have blocks available. On the plus side, there's nothing to clean up if we crash, which is contrast to the strategy that the rough draft used (actually removing extents from the freespace btrees). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
04f13060 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: don't update rmapbt when fixing agfl Allow a caller of xfs_alloc_fix_freelist to disable rmapbt updates when fixing the AG freelist. xfs_repair needs this during phase 5 to be able to adjust the freelist while it's reconstructing the rmap btree; the missing entries will be added back at the very end of phase 5 once the AGFL contents settle down. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
52548852 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: rmap btree requires more reserved free space Originally-From: Dave Chinner <dchinner@redhat.com> The rmap btree is allocated from the AGFL, which means we have to ensure ENOSPC is reported to userspace before we run out of free space in each AG. The last allocation in an AG can cause a full height rmap btree split, and that means we have to reserve at least this many blocks *in each AG* to be placed on the AGFL at ENOSPC. Update the various space calculation functions to handle this. Also, because the macros are now executing conditional code and are called quite frequently, convert them to functions that initialise variables in the struct xfs_mount, use the new variables everywhere and document the calculations better. [darrick.wong@oracle.com: don't reserve blocks if !rmap] [dchinner@redhat.com: update m_ag_max_usable after growfs] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
340785cc |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add owner field to extent allocation and freeing For the rmap btree to work, we have to feed the extent owner information to the the allocation and freeing functions. This information is what will end up in the rmap btree that tracks allocated extents. While we technically don't need the owner information when freeing extents, passing it allows us to validate that the extent we are removing from the rmap btree actually belonged to the owner we expected it to belong to. We also define a special set of owner values for internal metadata that would otherwise have no owner. This allows us to tell the difference between metadata owned by different per-ag btrees, as well as static fs metadata (e.g. AG headers) and internal journal blocks. There are also a couple of special cases we need to take care of - during EFI recovery, we don't actually know who the original owner was, so we need to pass a wildcard to indicate that we aren't checking the owner for validity. We also need special handling in growfs, as we "free" the space in the last AG when extending it, but because it's new space it has no actual owner... While touching the xfs_bmap_add_free() function, re-order the parameters to put the struct xfs_mount first. Extend the owner field to include both the owner type and some sort of index within the owner. The index field will be used to support reverse mappings when reflink is enabled. When we're freeing extents from an EFI, we don't have the owner information available (rmap updates have their own redo items). xfs_free_extent therefore doesn't need to do an rmap update. Make sure that the log replay code signals this correctly. This is based upon a patch originally from Dave Chinner. It has been extended to add more owner information with the intent of helping recovery operations when things go wrong (e.g. offset of user data block in a file). [dchinner: de-shout the xfs_rmap_*_owner helpers] [darrick: minor style fixes suggested by Christoph Hellwig] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
8018026e |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: rmap btree add more reserved blocks Originally-From: Dave Chinner <dchinner@redhat.com> XFS reserves a small amount of space in each AG for the minimum number of free blocks needed for operation. Adding the rmap btree increases the number of reserved blocks, but it also increases the complexity of the calculation as the free inode btree is optional (like the rmbt). Rather than calculate the prealloc blocks every time we need to check it, add a function to calculate it at mount time and store it in the struct xfs_mount, and convert the XFS_PREALLOC_BLOCKS macro just to use the xfs-mount variable directly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
4d89e20b |
|
20-Jun-2016 |
Dave Chinner <dchinner@redhat.com> |
xfs: separate freelist fixing into a separate helper Break up xfs_free_extent() into a helper that fixes the freelist. This helper will be used subsequently to ensure the freelist during deferred rmap processing. [darrick: refactor to put this at the head of the patchset] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
0d5a75e9 |
|
01-Jun-2016 |
Eric Sandeen <sandeen@sandeen.net> |
xfs: make several functions static Al Viro noticed that xfs_lock_inodes should be static, and that led to ... a few more. These are just the easy ones, others require moving functions higher in source files, so that's not done here to keep this review simple. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
2e9101da |
|
03-Jan-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
libxfs: make xfs_alloc_fix_freelist non-static Since xfs_repair wants to use xfs_alloc_fix_freelist, remove the static designation. xfsprogs already has this; this simply brings the kernel up to date. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
3fbbbea3 |
|
02-Nov-2015 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce BMAPI_ZERO for allocating zeroed extents To enable DAX to do atomic allocation of zeroed extents, we need to drive the block zeroing deep into the allocator. Because xfs_bmapi_write() can return merged extents on allocation that were only partially allocated (i.e. requested range spans allocated and hole regions, allocation into the hole was contiguous), we cannot zero the extent returned from xfs_bmapi_write() as that can overwrite existing data with zeros. Hence we have to drive the extent zeroing into the allocation code, prior to where we merge the extents into the BMBT and return the resultant map. This means we need to propagate this need down to the xfs_alloc_vextent() and issue the block zeroing at this point. While this functionality is being introduced for DAX, there is no reason why it is specific to DAX - we can per-zero blocks during the allocation transaction on any type of device. It's just slow (and usually slower than unwritten allocation and conversion) on traditional block devices so doesn't tend to get used. We can, however, hook hardware zeroing optimisations via sb_issue_zeroout() to this operation, so it may be useful in future and hence the "allocate zeroed blocks" API needs to be implementation neutral. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
496817b4 |
|
21-Jun-2015 |
Dave Chinner <dchinner@redhat.com> |
xfs: clean up XFS_MIN_FREELIST macros We no longer calculate the minimum freelist size from the on-disk AGF, so we don't need the macros used for this. That means the nested macros can be cleaned up, and turn this into an actual function so the logic is clear and concise. This will make it much easier to add support for the rmap btree when the time comes. This also gets rid of the XFS_AG_MAXLEVELS macro used by these freelist macros as it is simply a wrapper around a single variable. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
50adbcb4 |
|
21-Jun-2015 |
Dave Chinner <dchinner@redhat.com> |
xfs: xfs_alloc_fix_freelist() can use incore perag structures At the moment, xfs_alloc_fix_freelist() uses a mix of per-ag based access and agf buffer based access to freelist and space usage information. However, once the AGF buffer is locked inside this function, it is guaranteed that both the in-memory and on-disk values are identical. xfs_alloc_fix_freelist() doesn't modify the values in the structures directly, so it is a read-only user of the infomration, and hence can use the per-ag structure exclusively for determining what it should do. This opens up an avenue for cleaning up a lot of duplicated logic whose only difference is the structure it gets the data from, and in doing so removes a lot of needless byte swapping overhead when fixing up the free list. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
bfe46d4e |
|
28-May-2015 |
Brian Foster <bfoster@redhat.com> |
xfs: support min/max agbno args in block allocator The block allocator supports various arguments to tweak block allocation behavior and set allocation requirements. The sparse inode chunk feature introduces a new requirement not supported by the current arguments. Sparse inode allocations must convert or merge into an inode record that describes a fixed length chunk (64 inodes x inodesize). Full inode chunk allocations by definition always result in valid inode records. Sparse chunk allocations are smaller and the associated records can refer to blocks not owned by the inode chunk. This model can result in invalid inode records in certain cases. For example, if a sparse allocation occurs near the start of an AG, the aligned inode record for that chunk might refer to agbno 0. If an allocation occurs towards the end of the AG and the AG size is not aligned, the inode record could refer to blocks beyond the end of the AG. While neither of these scenarios directly result in corruption, they both insert invalid inode records and at minimum cause repair to complain, are unlikely to merge into full chunks over time and set land mines for other areas of code. To guarantee sparse inode chunk allocation creates valid inode records, support the ability to specify an agbno range limit for XFS_ALLOCTYPE_NEAR_BNO block allocations. The min/max agbno's are specified in the allocation arguments and limit the block allocation algorithms to that range. The starting 'agbno' hint is clamped to the range if the specified agbno is out of range. If no sufficient extent is available within the range, the allocation fails. For backwards compatibility, the min/max fields can be initialized to 0 to disable range limiting (e.g., equivalent to min=0,max=agsize). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
4fb6e8ad |
|
27-Nov-2014 |
Christoph Hellwig <hch@lst.de> |
xfs: merge xfs_ag.h into xfs_format.h More on-disk format consolidation. A few declarations that weren't on-disk format related move into better suitable spots. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
84be0ffc |
|
24-Jun-2014 |
Dave Chinner <dchinner@redhat.com> |
libxfs: move header files Move all the header files that are shared with userspace into libxfs. This is done as one big chunk simpy to get it done quickly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|