#
5ef819c3 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: rename btree helpers that depends on the block number representation All these helpers hardcode fsblocks or agblocks and not just the pointer size. Rename them so that the names are still fitting when we add the long format in-memory blocks and adjust the checks when calling them to check the btree types and not just pointer length. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
ec793e69 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: remove xfs_btnum_t The last checks for bc_btnum can be replaced with helpers that check the btree ops. This allows adding new btrees to XFS without having to update a global enum. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> [djwong: complete the ops predicates] Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
14dd46cf |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: split xfs_inobt_init_cursor Split xfs_inobt_init_cursor into separate routines for the inobt and finobt to prepare for the removal of the xfs_btnum global enumeration of btree types. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
4bfb028a |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: remove the btnum argument to xfs_inobt_count_blocks xfs_inobt_count_blocks is only used for the finobt. Hardcode the btnum argument and rename the function to match that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
3038fd81 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: remove xfs_inobt_cur This helper provides no real advantage over just open code the two calls in it in the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
7f47734a |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: add a sick_mask to struct xfs_btree_ops Clean up xfs_btree_mark_sick by adding a sick_mask to the btree-ops for all AG-root btrees. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
77953b97 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: add a name field to struct xfs_btree_ops The btnum in struct xfs_btree_ops is often used for printing a symbolic name for the btree. Add a name field to the ops structure and use that directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
6234dee7 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: remove xfs_inobt_stage_cursor Just open code the two calls in the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
f6c98d92 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor Make the levels initialization in xfs_inobt_init_cursor conditional and merge the two helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
2b9e7f26 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: don't override bc_ops for staging btrees Add a few conditionals for staging btrees to the core btree code instead of overloading the bc_ops vector. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
4f0cd5a5 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: split out a btree type from the btree ops geometry flags Two of the btree cursor flags are always used together and encode the fundamental btree type. There currently are two such types: 1) an on-disk AG-rooted btree with 32-bit pointers 2) an on-disk inode-rooted btree with 64-bit pointers and we're about to add: 3) an in-memory btree with 64-bit pointers Introduce a new enum and a new type field in struct xfs_btree_geom to encode this type directly instead of using flags and change most code to switch on this enum. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> [djwong: make the pointer lengths explicit] Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
1a9d2629 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: store the btree pointer length in struct xfs_btree_ops Make the pointer length an explicit field in the btree operations structure so that the next patch (which introduces an explicit btree type enum) doesn't have to play a bunch of awkward games with inferring the pointer length from the enumeration. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
07b7f2e3 |
|
22-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
xfs: move the btree stats offset into struct btree_ops The statistics offset is completely static, move it into the btree_ops structure instead of the cursor. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
90cfae81 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: move lru refs to the btree ops structure Move the btree buffer LRU refcount to the btree ops structure so that we can eliminate the last bc_btnum switch in the generic btree code. We're about to create repair-specific btree types, and we don't want that stuff cluttering up libxfs. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
d8d6df42 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: extern some btree ops structures Expose these static btree ops structures so that we can reference them in the AG initialization code in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
f9e325bf |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: drop XFS_BTREE_CRC_BLOCKS All existing btree types set XFS_BTREE_CRC_BLOCKS when running against a V5 filesystem. All currently proposed btree types are V5 only and use the richer XFS_BTREE_CRC_BLOCKS format. Therefore, we can drop this flag and change the conditional to xfs_has_crc. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
056d22c8 |
|
22-Feb-2024 |
Darrick J. Wong <djwong@kernel.org> |
xfs: set the btree cursor bc_ops in xfs_btree_alloc_cursor This is a precursor to putting more static data in the btree ops structure. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
4c88fef3 |
|
06-Dec-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: remove __xfs_free_extent_later xfs_free_extent_later is a trivial helper, so remove it to reduce the amount of thinking required to understand the deferred freeing interface. This will make it easier to introduce automatic reaping of speculative allocations in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
b742d7b4 |
|
28-Jun-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: use deferred frees for btree block freeing Btrees that aren't freespace management trees use the normal extent allocation and freeing routines for their blocks. Hence when a btree block is freed, a direct call to xfs_free_extent() is made and the extent is immediately freed. This puts the entire free space management btrees under this path, so we are stacking btrees on btrees in the call stack. The inobt, finobt and refcount btrees all do this. However, the bmap btree does not do this - it calls xfs_free_extent_later() to defer the extent free operation via an XEFI and hence it gets processed in deferred operation processing during the commit of the primary transaction (i.e. via intent chaining). We need to change xfs_free_extent() to behave in a non-blocking manner so that we can avoid deadlocks with busy extents near ENOSPC in transactions that free multiple extents. Inserting or removing a record from a btree can cause a multi-level tree merge operation and that will free multiple blocks from the btree in a single transaction. i.e. we can call xfs_free_extent() multiple times, and hence the btree manipulation transaction is vulnerable to this busy extent deadlock vector. To fix this, convert all the remaining callers of xfs_free_extent() to use xfs_free_extent_later() to queue XEFIs and hence defer processing of the extent frees to a context that can be safely restarted if a deadlock condition is detected. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
|
#
4a200a09 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: implement masked btree key comparisons for _has_records scans For keyspace fullness scans, we want to be able to mask off the parts of the key that we don't care about. For most btree types we /do/ want the full keyspace, but for checking that a given space usage also has a full complement of rmapbt records (even if different/multiple owners) we need this masking so that we only track sparseness of rm_startblock, not the whole keyspace (which is extremely sparse). Augment the ->diff_two_keys and ->keys_contiguous helpers to take a third union xfs_btree_key argument, and wire up xfs_rmap_has_records to pass this through. This third "mask" argument should contain a nonzero value in each structure field that should be used in the key comparisons done during the scan. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
6abc7aef |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: replace xfs_btree_has_record with a general keyspace scanner The current implementation of xfs_btree_has_record returns true if it finds /any/ record within the given range. Unfortunately, that's not sufficient for scrub. We want to be able to tell if a range of keyspace for a btree is devoid of records, is totally mapped to records, or is somewhere in between. By forcing this to be a boolean, we conflated sparseness and fullness, which caused scrub to return incorrect results. Fix the API so that we can tell the caller which of those three is the current state. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
366a0b8d |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: standardize ondisk to incore conversion for inode btrees Create a xfs_inobt_check_irec function to detect corruption in btree records. Fix all xfs_inobt_btrec_to_irec callsites to call the new helper and bubble up corruption reports. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
9b2e5a23 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: create traced helper to get extra perag references There are a few places in the XFS codebase where a caller has either an active or a passive reference to a perag structure and wants to give a passive reference to some other piece of code. Btree cursor creation and inode walks are good examples of this. Replace the open-coded logic with a helper to do this. The new function adds a few safeguards -- it checks that there's at least one reference to the perag structure passed in, and it records the refcount bump in the ftrace information. This makes it much easier to debug perag refcounting problems. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
b2ccab31 |
|
11-Apr-2023 |
Darrick J. Wong <djwong@kernel.org> |
xfs: pass per-ag references to xfs_free_extent Pass a reference to the per-AG structure to xfs_free_extent. Most callers already have one, so we can eliminate unnecessary lookups. The one exception to this is the EFI code, which the next patch will fix. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
db4710fd |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce xfs_alloc_vextent_near_bno() The remaining callers of xfs_alloc_vextent() are all doing NEAR_BNO allocations. We can replace that function with a new xfs_alloc_vextent_near_bno() function that does this explicitly. We also multiplex NEAR_BNO allocations through xfs_alloc_vextent_this_ag via args->type. Replace all of these with direct calls to xfs_alloc_vextent_near_bno(), too. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
74c36a86 |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: use xfs_alloc_vextent_this_ag() where appropriate Change obvious callers of single AG allocation to use xfs_alloc_vextent_this_ag(). Drive the per-ag grabbing out to the callers, too, so that callers with active references don't need to do new lookups just for an allocation in a context that already has a perag reference. The only remaining caller that does single AG allocation through xfs_alloc_vextent() is xfs_bmap_btalloc() with XFS_ALLOCTYPE_NEAR_BNO. That is going to need more untangling before it can be converted cleanly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
7ac2ff8b |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: perags need atomic operational state We currently don't have any flags or operational state in the xfs_perag except for the pagf_init and pagi_init flags. And the agflreset flag. Oh, there's also the pagf_metadata and pagi_inodeok flags, too. For controlling per-ag operations, we are going to need some atomic state flags. Hence add an opstate field similar to what we already have in the mount and log, and convert all these state flags across to atomic bit operations. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
bab8b795 |
|
12-Feb-2023 |
Dave Chinner <dchinner@redhat.com> |
xfs: inobt can use perags in many more places than it does Lots of code in the inobt infrastructure is passed both xfs_mount and perags. We only need perags for the per-ag inode allocation code, so reduce the duplication by passing only the perags as the primary object. This ends up reducing the code size by a bit: text data bss dec hex filename orig 1138878 323979 548 1463405 16546d (TOTALS) patched 1138709 323979 548 1463236 1653c4 (TOTALS) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
36029dee |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: make is_log_ag() a first class helper We check if an ag contains the log in many places, so make this a first class XFS helper by lifting it to fs/xfs/libxfs/xfs_ag.h and renaming it xfs_ag_contains_log(). The convert all the places that check if the AG contains the log to use this helper. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
3829c9a1 |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: replace xfs_ag_block_count() with perag accesses Many of the places that call xfs_ag_block_count() have a perag available. These places can just read pag->block_count directly instead of calculating the AG block count from first principles. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
99b13c7f |
|
07-Jul-2022 |
Dave Chinner <dchinner@redhat.com> |
xfs: pass perag to xfs_ialloc_read_agi() xfs_ialloc_read_agi() initialises the perag if it hasn't been done yet, so it makes sense to pass it the perag rather than pull a reference from the buffer. This allows callers to be per-ag centric rather than passing mount/agno pairs everywhere. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
e7720afa |
|
27-Sep-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: remove kmem_zone typedef Remove these typedefs by referencing kmem_cache directly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
|
#
9fa47bdc |
|
23-Sep-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: use separate btree cursor cache for each btree type Now that we have the infrastructure to track the max possible height of each btree type, we can create a separate slab cache for cursors of each type of btree. For smaller indices like the free space btrees, this means that we can pack more cursors into a slab page, improving slab utilization. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
0ed5f735 |
|
23-Sep-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: compute absolute maximum nlevels for each btree type Add code for all five btree types so that we can compute the absolute maximum possible btree height for each btree type. This is a setup for the next patch, which makes every btree type have its own cursor cache. The functions are exported so that we can have xfs_db report the absolute maximum btree heights for each btree type, rather than making everyone run their own ad-hoc computations. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
c940a0c5 |
|
16-Sep-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: dynamically allocate cursors based on maxlevels To support future btree code, we need to be able to size btree cursors dynamically for very large btrees. Switch the maxlevels computation to use the precomputed values in the superblock, and create cursors that can handle a certain height. For now, we retain the btree cursor cache that can handle up to 9-level btrees, though a subsequent patch introduces separate caches for each btree type, where each cache's objects will be exactly tall enough to handle the specific btree type. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
56370ea6 |
|
16-Sep-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: refactor btree cursor allocation function Refactor btree allocation to a common helper. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
cc411740 |
|
23-Sep-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: remove xfs_btree_cur.bc_blocklog This field isn't used by anyone, so get rid of it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
04fcad80 |
|
18-Aug-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: introduce xfs_buf_daddr() Introduce a helper function xfs_buf_daddr() to extract the disk address of the buffer from the struct xfs_buf. This will replace direct accesses to bp->b_bn and bp->b_maps[0].bm_bn, as well as the XFS_BUF_ADDR() macro. This patch introduces the helper function and replaces all uses of XFS_BUF_ADDR() as this is just a simple sed replacement. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
ebd9027d |
|
18-Aug-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert xfs_sb_version_has checks to use mount features This is a conversion of the remaining xfs_sb_version_has..(sbp) checks to use xfs_has_..(mp) feature checks. This was largely done with a vim replacement macro that did: :0,$s/xfs_sb_version_has\(.*\)&\(.*\)->m_sb/xfs_has_\1\2/g<CR> A couple of other variants were also used, and the rest touched up by hand. $ size -t fs/xfs/built-in.a text data bss dec hex filename before 1127533 311352 484 1439369 15f689 (TOTALS) after 1125360 311352 484 1437196 15ee0c (TOTALS) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
38c26bfd |
|
18-Aug-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: replace xfs_sb_version checks with feature flag checks Convert the xfs_sb_version_hasfoo() to checks against mp->m_features. Checks of the superblock itself during disk operations (e.g. in the read/write verifiers and the to/from disk formatters) are not converted - they operate purely on the superblock state. Everything else should use the mount features. Large parts of this conversion were done with sed with commands like this: for f in `git grep -l xfs_sb_version_has fs/xfs/*.c`; do sed -i -e 's/xfs_sb_version_has\(.*\)(&\(.*\)->m_sb)/xfs_has_\1(\2)/' $f done With manual cleanups for things like "xfs_has_extflgbit" and other little inconsistencies in naming. The result is ia lot less typing to check features and an XFS binary size reduced by a bit over 3kB: $ size -t fs/xfs/built-in.a text data bss dec hex filenam before 1130866 311352 484 1442702 16038e (TOTALS) after 1127727 311352 484 1439563 15f74b (TOTALS) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
#
deb06b9a |
|
12-Aug-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: make the start pointer passed to btree alloc_block functions const The @start pointer passed to each per-AG btree type's ->alloc_block function isn't supposed to be modified, since it's a hint about the location of the btree block being split that is to be fed to the allocator, so mark the parameter const. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
b5a6e5fe |
|
12-Aug-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: make the pointer passed to btree set_root functions const The pointer passed to each per-AG btree type's ->set_root function isn't supposed to be modified (that function sets an external pointer to the root block) so mark them const. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
8e38dc88 |
|
10-Aug-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: make the keys and records passed to btree inorder functions const The inorder functions are simple predicates, which means that they don't modify the parameters. Mark them all const. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
23825cd1 |
|
10-Aug-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: mark the record passed into btree init_key functions as const These functions initialize a key from a record, but they aren't supposed to modify the record. Mark it const. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
d29d5577 |
|
10-Aug-2021 |
Darrick J. Wong <djwong@kernel.org> |
xfs: make the key parameters to all btree key comparison functions const The btree key comparison functions are not allowed to change the keys that are passed in, so mark them const. We'll need this for the next patch, which adds const to the btree range query functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
50f02fe3 |
|
01-Jun-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: remove agno from btree cursor Now that everything passes a perag, the agno is not needed anymore. Convert all the users to use pag->pag_agno instead and remove the agno from the cursor. This was largely done as an automated search and replace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
7b13c515 |
|
01-Jun-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: use perag for ialloc btree cursors Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
#
be9fb17d |
|
01-Jun-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: add a perag to the btree cursor Which will eventually completely replace the agno in it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
30933120 |
|
01-Jun-2021 |
Dave Chinner <dchinner@redhat.com> |
xfs: push perags through the ag reservation callouts We currently pass an agno from the AG reservation functions to the individual feature accounting functions, which in future may have to do perag lookups to access per-AG state. Instead, pre-emptively plumb the perag through from the highest AG reservation layer to the feature callouts so they won't have to look it up again. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
2e984bad |
|
04-Dec-2020 |
Joseph Qi <joseph.qi@linux.alibaba.com> |
xfs: remove unneeded return value check for *init_cursor() Since *init_cursor() can always return a valid cursor, the NULL check in caller is unneeded. So clean them up. This also keeps the behavior consistent with other callers. Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
11f74423 |
|
26-Aug-2020 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: support inode btree blockcounts in online repair Add the necessary bits to the online repair code to support logging the inode btree counters when rebuilding the btrees, and to support fixing the counters when rebuilding the AGI. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
1ac35f06 |
|
26-Aug-2020 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: use the finobt block counts to speed up mount times Now that we have reliable finobt block counts, use them to speed up the per-AG block reservation calculations at mount time. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
2a39946c |
|
17-Aug-2020 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: store inode btree block counts in AGI header Add a btree block usage counters for both inode btrees to the AGI header so that we don't have to walk the entire finobt at mount time to create the per-AG reservations. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
32a2b11f |
|
22-Jul-2020 |
Carlos Maiolino <cmaiolino@redhat.com> |
xfs: Remove kmem_zone_zalloc() usage Use kmem_cache_zalloc() directly. With the exception of xlog_ticket_alloc() which will be dealt on the next patch for readability. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
c29ce8f4 |
|
11-Mar-2020 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add support for inode btree staging cursors Add support for btree staging cursors for the inode btrees. This is needed both for online repair and also to convert xfs_repair to use btree bulk loading. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
576af732 |
|
10-Mar-2020 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert btree cursor ag-private member name bc_private.a -> bc_ag conversion via script: `sed -i 's/bc_private\.a/bc_ag/g' fs/xfs/*[ch] fs/xfs/*/*[ch]` And then revert the change to the bc_ag #define in fs/xfs/libxfs/xfs_btree.h manually. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
370c782b |
|
10-Mar-2020 |
Christoph Hellwig <hch@lst.de> |
xfs: remove XFS_BUF_TO_AGI Just dereference bp->b_addr directly and make the code a little simpler and more clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
a211432c |
|
02-Jul-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create simplified inode walk function Create a new iterator function to simplify walking inodes in an XFS filesystem. This new iterator will replace the existing open-coded walking that goes on in various places. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
250d4b4c |
|
28-Jun-2019 |
Eric Sandeen <sandeen@sandeen.net> |
xfs: remove unused header files There are many, many xfs header files which are included but unneeded (or included twice) in the xfs code, so remove them. nb: xfs_linux.h includes about 9 headers for everyone, so those explicit includes get removed by this. I'm not sure what the preference is, but if we wanted explicit includes everywhere, a followup patch could remove those xfs_*.h includes from xfs_linux.h and move them into the files that need them. Or it could be left as-is. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
dbd329f1 |
|
28-Jun-2019 |
Christoph Hellwig <hch@lst.de> |
xfs: add struct xfs_mount pointer to struct xfs_buf We need to derive the mount pointer from a buffer in a lot of place. Add a direct pointer to short cut the pointer chasing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
ef325959 |
|
05-Jun-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: separate inode geometry Separate the inode geometry information into a distinct structure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
5cd213b0 |
|
20-May-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: don't reserve per-AG space for an internal log It turns out that the log can consume nearly all the space in an AG, and when this happens this it's possible that there will be less free space in the AG than the reservation would try to hide. On a debug kernel this can trigger an ASSERT in xfs/250: XFS: Assertion failed: xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved + xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <= pag->pagf_freeblks + pag->pagf_flcount, file: fs/xfs/libxfs/xfs_ag_resv.c, line: 319 The log is permanently allocated, so we know we're never going to have to expand the btrees to hold any records associated with the log space. We therefore can treat the space as if it doesn't exist. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
#
e1f6ca11 |
|
14-Feb-2019 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: rename m_inotbt_nores to m_finobt_nores Rename this flag variable to imply more strongly that it's related to the free inode btree (finobt) operation. No functional changes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
8473fee3 |
|
07-Feb-2019 |
Brian Foster <bfoster@redhat.com> |
xfs: distinguish between inobt and finobt magic values The inode btree verifier code is shared between the inode btree and free inode btree because the underlying metadata formats are essentially equivalent. A side effect of this is that the verifier cannot determine whether a particular btree block should have an inobt or finobt magic value. This logic allows an unfortunate xfs_repair bug to escape detection where certain level > 0 nodes of the finobt are stamped with inobt magic by xfs_repair finobt reconstruction. This is fortunately not a severe problem since the inode btree magic values do not contribute to any changes in kernel behavior, but we do need a means to detect and prevent this problem in the future. Add a field to xfs_buf_ops to store the v4 and v5 superblock magic values expected by a particular verifier. Add a helper to check an on-disk magic value against the value expected by the verifier. Call the helper from the shared [f]inobt verifier code for magic value verification. This ensures that the inode btree blocks each have the appropriate magic value based on specific tree type and superblock version. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
01e68f40 |
|
07-Feb-2019 |
Brian Foster <bfoster@redhat.com> |
xfs: create a separate finobt verifier The inobt verifier is reused for the inobt and finobt, which prevents the ability to distinguish between magic values on a per-tree basis. Create a separate finobt structure in preparation for changes to enforce the appropriate magic value for the associated tree. This patch has no functional change. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
7280feda |
|
12-Dec-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: remove xfs_rmap_ag_owner and friends Owner information for static fs metadata can be defined readonly at build time because it never changes across filesystems. This enables us to reduce stack usage (particularly in scrub) because we can use the statically defined oinfo structures. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
c0876897 |
|
19-Nov-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: finobt AG reserves don't consider last AG can be a runt The last AG may be very small comapred to all other AGs, and hence AG reservations based on the superblock AG size may actually consume more space than the AG actually has. This results on assert failures like: XFS: Assertion failed: xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved + xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <= pag->pagf_freeblks + pag->pagf_flcount, file: fs/xfs/libxfs/xfs_ag_resv.c, line: 319 [ 48.932891] xfs_ag_resv_init+0x1bd/0x1d0 [ 48.933853] xfs_fs_reserve_ag_blocks+0x37/0xb0 [ 48.934939] xfs_mountfs+0x5b3/0x920 [ 48.935804] xfs_fs_fill_super+0x462/0x640 [ 48.936784] ? xfs_test_remount_options+0x60/0x60 [ 48.937908] mount_bdev+0x178/0x1b0 [ 48.938751] mount_fs+0x36/0x170 [ 48.939533] vfs_kern_mount.part.43+0x54/0x130 [ 48.940596] do_mount+0x20e/0xcb0 [ 48.941396] ? memdup_user+0x3e/0x70 [ 48.942249] ksys_mount+0xba/0xd0 [ 48.943046] __x64_sys_mount+0x21/0x30 [ 48.943953] do_syscall_64+0x54/0x170 [ 48.944835] entry_SYSCALL_64_after_hwframe+0x49/0xbe Hence we need to ensure the finobt per-ag space reservations take into account the size of the last AG rather than treat it like all the other full size AGs. Note that both refcountbt and rmapbt already take the size of the AG into account via reading the AGF length directly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
ebcbef3a |
|
29-Jul-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: pass transaction lock while setting up agresv on cyclic metadata Pass a tranaction pointer through to all helpers that calculate the per-AG block reservation. Online repair will use this to reinitialize per-ag reservations while it still holds all the AG headers locked to the repair transaction. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
0b04b6b8 |
|
19-Jul-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: trivial xfs_btree_del_cursor cleanups The error argument to xfs_btree_del_cursor already understands the "nonzero for error" semantics, so remove pointless error testing in the callers and pass it directly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
0b61f8a4 |
|
05-Jun-2018 |
Dave Chinner <dchinner@redhat.com> |
xfs: convert to SPDX license tags Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
2e050e64 |
|
24-May-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: fix inobt magic number check In commit a6a781a58befcbd467c ("xfs: have buffer verifier functions report failing address") the bad magic number return was ported incorrectly. Fixes: a6a781a58befcbd467ce843af4eaca3906aa1f08 Reported-by: syzbot+08ab33be0178b76851c8@syzkaller.appspotmail.com Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
#
14861c47 |
|
09-May-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add helpers to calculate btree size Add a bunch of helper functions that calculate the sizes of various btrees. These will be used to repair btrees and btree headers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
a1f69417 |
|
06-Apr-2018 |
Eric Sandeen <sandeen@sandeen.net> |
xfs: non-scrub - remove unused function parameters Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
e157ebdc |
|
06-Mar-2018 |
Carlos Maiolino <cmaiolino@redhat.com> |
Cleanup old XFS_BTREE_* traces Remove unused legacy btree traces from IRIX era. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
ad90bb58 |
|
12-Jan-2018 |
Brian Foster <bfoster@redhat.com> |
xfs: account finobt blocks properly in perag reservation XFS started using the perag metadata reservation pool for free inode btree blocks in commit 76d771b4cbe33 ("xfs: use per-AG reservations for the finobt"). To handle backwards compatibility, finobt blocks are accounted against the pool so long as the full reservation is available at mount time. Otherwise the ->m_inotbt_nores flag is set and the filesystem falls back to the traditional per-transaction finobt reservation. This commit has two problems: - finobt blocks are always accounted against the metadata reservation on allocation, regardless of ->m_inotbt_nores state - finobt blocks are never returned to the reservation pool on free The first problem affects reflink+finobt filesystems where the full finobt reservation is not available at mount time. finobt blocks are essentially stolen from the reflink reservation, putting refcountbt management at risk of allocation failure. The second problem is an unconditional leak of metadata reservation whenever finobt is enabled. Update the finobt block allocation callouts to consider ->m_inotbt_nores and account blocks appropriately. Blocks should be consistently accounted against the metadata pool when ->m_inotbt_nores is false and otherwise tagged as RESV_NONE. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
b5572597 |
|
08-Jan-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: create a new buf_ops pointer to verify structure metadata Expose all metadata structure buffer verifier functions via buf_ops. These will be used by the online scrub mechanism to look for problems with buffers that are already sitting around in memory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
bc1a09b8 |
|
08-Jan-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: refactor verifier callers to print address of failing check Refactor the callers of verifiers to print the instruction address of a failing check. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
a6a781a5 |
|
08-Jan-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: have buffer verifier functions report failing address Modify each function that checks the contents of a metadata buffer to return the instruction address of the failing test so that we can report more precise failure errors to the log. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
31ca03c9 |
|
08-Jan-2018 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: refactor xfs_verifier_error and xfs_buf_ioerror Since all verification errors also mark the buffer as having an error, we can combine these two calls. Later we'll add a xfs_failaddr_t parameter to promote the idea of reporting corruption errors and the address of the failing check to enable better debugging reports. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
#
118bb47e |
|
16-Jun-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: plumb in needed functions for range querying of various btrees Plumb in the pieces (init_high_key, diff_two_keys) necessary to call query_range on the inode space and block mapping btrees and to extract raw btree records. This will eventually be used by the inobt and bmbt scrubbers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
38dee376 |
|
16-Jun-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: always compile the btree inorder check functions The btree record and key inorder check functions will be used by the btree scrubber code, so make sure they're always built. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
|
#
c8ce540d |
|
16-Jun-2017 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: remove double-underscore integer types This is a purely mechanical patch that removes the private __{u,}int{8,16,32,64}_t typedefs in favor of using the system {u,}int{8,16,32,64}_t typedefs. This is the sed script used to perform the transformation and fix the resulting whitespace and indentation errors: s/typedef\t__uint8_t/typedef __uint8_t\t/g s/typedef\t__uint/typedef __uint/g s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g s/__uint8_t\t/__uint8_t\t\t/g s/__uint/uint/g s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g s/__int/int/g /^typedef.*int[0-9]*_t;$/d Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
76d771b4 |
|
25-Jan-2017 |
Christoph Hellwig <hch@lst.de> |
xfs: use per-AG reservations for the finobt Currently we try to rely on the global reserved block pool for block allocations for the free inode btree, but I have customer reports (fairly complex workload, need to find an easier reproducer) where that is not enough as the AG where we free an inode that requires a new finobt block is entirely full. This causes us to cancel a dirty transaction and thus a file system shutdown. I think the right way to guard against this is to treat the finot the same way as the refcount btree and have a per-AG reservations for the possible worst case size of it, and the patch below implements that. Note that this could increase mount times with large finobt trees. In an ideal world we would have added a field for the number of finobt fields to the AGI, similar to what we did for the refcount blocks. We should do add it next time we rev the AGI or AGF format by adding new fields. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
#
b24a978c |
|
08-Dec-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: use GPF_NOFS when allocating btree cursors Use NOFS for allocating btree cursors, since they can be called under the ilock. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
11ef38af |
|
04-Dec-2016 |
Dave Chinner <dchinner@redhat.com> |
xfs: make xfs btree stats less huge Embedding a switch statement in every btree stats inc/add adds a lot of code overhead to the core btree infrastructure paths. Stats are supposed to be small and lightweight, but the btree stats have become big and bloated as we've added more btrees. It needs fixing because the reflink code will just add more overhead again. Convert the v2 btree stats to arrays instead of independent variables, and instead use the type to index the specific btree array via an enum. This allows us to use array based indexing to update the stats, rather than having to derefence variables specific to the btree type. If we then wrap the xfsstats structure in a union and place uint32_t array beside it, and calculate the correct btree stats array base array index when creating a btree cursor, we can easily access entries in the stats structure without having to switch names based on the btree type. We then replace with the switch statement with a simple set of stats wrapper macros, resulting in a significant simplification of the btree stats code, and: text data bss dec hex filename 48905 144 8 49057 bfa1 fs/xfs/libxfs/xfs_btree.o.old 36793 144 8 36945 9051 fs/xfs/libxfs/xfs_btree.o it reduces the core btree infrastructure code size by close to 25%! Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
3fd129b6 |
|
18-Sep-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: set up per-AG free space reservations One unfortunate quirk of the reference count and reverse mapping btrees -- they can expand in size when blocks are written to *other* allocation groups if, say, one large extent becomes a lot of tiny extents. Since we don't want to start throwing errors in the middle of CoWing, we need to reserve some blocks to handle future expansion. The transaction block reservation counters aren't sufficient here because we have to have a reserve of blocks in every AG, not just somewhere in the filesystem. Therefore, create two per-AG block reservation pools. One feeds the AGFL so that rmapbt expansion always succeeds, and the other feeds all other metadata so that refcountbt expansion never fails. Use the count of how many reserved blocks we need to have on hand to create a virtual reservation in the AG. Through selective clamping of the maximum length of allocation requests and of the length of the longest free extent, we can make it look like there's less free space in the AG unless the reservation owner is asking for blocks. In other words, play some accounting tricks in-core to make sure that we always have blocks available. On the plus side, there's nothing to clean up if we crash, which is contrast to the strategy that the rough draft used (actually removing extents from the freespace btrees). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
973b8319 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: remove the get*keys and update_keys btree ops pointers These are internal btree functions; we don't need them to be dispatched via function pointers. Make them static again and just check the overlapped flag to figure out what we need to do. The strategy behind this patch was suggested by Christoph. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
340785cc |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add owner field to extent allocation and freeing For the rmap btree to work, we have to feed the extent owner information to the the allocation and freeing functions. This information is what will end up in the rmap btree that tracks allocated extents. While we technically don't need the owner information when freeing extents, passing it allows us to validate that the extent we are removing from the rmap btree actually belonged to the owner we expected it to belong to. We also define a special set of owner values for internal metadata that would otherwise have no owner. This allows us to tell the difference between metadata owned by different per-ag btrees, as well as static fs metadata (e.g. AG headers) and internal journal blocks. There are also a couple of special cases we need to take care of - during EFI recovery, we don't actually know who the original owner was, so we need to pass a wildcard to indicate that we aren't checking the owner for validity. We also need special handling in growfs, as we "free" the space in the last AG when extending it, but because it's new space it has no actual owner... While touching the xfs_bmap_add_free() function, re-order the parameters to put the struct xfs_mount first. Extend the owner field to include both the owner type and some sort of index within the owner. The index field will be used to support reverse mappings when reflink is enabled. When we're freeing extents from an EFI, we don't have the owner information available (rmap updates have their own redo items). xfs_free_extent therefore doesn't need to do an rmap update. Make sure that the log replay code signals this correctly. This is based upon a patch originally from Dave Chinner. It has been extended to add more owner information with the intent of helping recovery operations when things go wrong (e.g. offset of user data block in a file). [dchinner: de-shout the xfs_rmap_*_owner helpers] [darrick: minor style fixes suggested by Christoph Hellwig] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
70b22659 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: add function pointers for get/update keys to the btree Add some function pointers to bc_ops to get the btree keys for leaf and node blocks, and to update parent keys of a block. Convert the _btree_updkey calls to use our new pointer, and modify the tree shape changing code to call the appropriate get_*_keys pointer instead of _btree_copy_keys because the overlapping btree has to calculate high key values. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
e5821e57 |
|
02-Aug-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
xfs: during btree split, save new block key & ptr for future insertion When a btree block has to be split, we pass the new block's ptr from xfs_btree_split() back to xfs_btree_insert() via a pointer parameter; however, we pass the block's key through the cursor's record. It is a little weird to "initialize" a record from a key since the non-key attributes will have garbage values. When we go to add support for interval queries, we have to be able to pass the lowest and highest keys accessible via a pointer. There's no clean way to pass this back through the cursor's record field. Therefore, pass the key directly back to xfs_btree_insert() the same way that we pass the btree_ptr. As a bonus, we no longer need init_rec_from_key and can drop it from the codebase. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
edfd9dd5 |
|
07-Feb-2016 |
Christoph Hellwig <hch@lst.de> |
xfs: move buffer invalidation to xfs_btree_free_block ... instead of leaving it in the methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
c5ab131b |
|
03-Jan-2016 |
Darrick J. Wong <darrick.wong@oracle.com> |
libxfs: refactor short btree block verification Create xfs_btree_sblock_verify() to verify short-format btree blocks (i.e. the per-AG btrees with 32-bit block pointers) instead of open-coding them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
233135b7 |
|
03-Jan-2016 |
Eric Sandeen <sandeen@redhat.com> |
xfs: print name of verifier if it fails This adds a name to each buf_ops structure, so that if a verifier fails we can print the type of verifier that failed it. Should be a slight debugging aid, I hope. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
ce748eaa |
|
28-Jul-2015 |
Eric Sandeen <sandeen@sandeen.net> |
xfs: create new metadata UUID field and incompat flag This adds a new superblock field, sb_meta_uuid. If set, along with a new incompat flag, the code will use that field on a V5 filesystem to compare to metadata UUIDs, which allows us to change the user- visible UUID at will. Userspace handles the setting and clearing of the incompat flag as appropriate, as the UUID gets changed; i.e. setting the user-visible UUID back to the original UUID (as stored in the new field) will remove the incompatible feature flag. If the incompat flag is not set, this copies the user-visible UUID into into the meta_uuid slot in memory when the superblock is read from disk; the meta_uuid field is not written back to disk in this case. The remainder of this patch simply switches verifiers, initializers, etc to use the new sb_meta_uuid field. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
56d1115c |
|
28-May-2015 |
Brian Foster <bfoster@redhat.com> |
xfs: allocate sparse inode chunks on full chunk allocation failure xfs_ialloc_ag_alloc() makes several attempts to allocate a full inode chunk. If all else fails, reduce the allocation to the sparse length and alignment and attempt to allocate a sparse inode chunk. If sparse chunk allocation succeeds, check whether an inobt record already exists that can track the chunk. If so, inherit and update the existing record. Otherwise, insert a new record for the sparse chunk. Create helpers to align sparse chunk inode records and insert or update existing records in the inode btrees. The xfs_inobt_insert_sprec() helper implements the merge or update semantics required for sparse inode records with respect to both the inobt and finobt. To update the inobt, either insert a new record or merge with an existing record. To update the finobt, use the updated inobt record to either insert or replace an existing record. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
4148c347 |
|
28-May-2015 |
Brian Foster <bfoster@redhat.com> |
xfs: helper to convert holemask to inode alloc. bitmap The inobt record holemask field is a condensed data type designed to fit into the existing on-disk record and is zero based (allocated regions are set to 0, sparse regions are set to 1) to provide backwards compatibility. This makes the type somewhat complex for use in higher level inode manipulations such as individual inode allocation, etc. Rather than foist the complexity of dealing with this field to every bit of logic that requires inode granular information, create a helper to convert the holemask to an inode allocation bitmap. The inode allocation bitmap is inode granularity similar to the inobt record free mask and indicates which inodes of the chunk are physically allocated on disk, irrespective of whether the inode is considered allocated or free by the filesystem. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
5419040f |
|
28-May-2015 |
Brian Foster <bfoster@redhat.com> |
xfs: introduce inode record hole mask for sparse inode chunks The inode btrees track 64 inodes per record regardless of inode size. Thus, inode chunks on disk vary in size depending on the size of the inodes. This creates a contiguous allocation requirement for new inode chunks that can be difficult to satisfy on an aged and fragmented (free space) filesystems. The inode record freecount currently uses 4 bytes on disk to track the free inode count. With a maximum freecount value of 64, only one byte is required. Convert the freecount field to a single byte and use two of the remaining 3 higher order bytes left for the hole mask field. Use the final leftover byte for the total count field. The hole mask field tracks holes in the chunks of physical space that the inode record refers to. This facilitates the sparse allocation of inode chunks when contiguous chunks are not available and allows the inode btrees to identify what portions of the chunk contain valid inodes. The total count field contains the total number of valid inodes referred to by the record. This can also be deduced from the hole mask. The count field provides clarity and redundancy for internal record verification. Note that neither of the new fields can be written to disk on fs' without sparse inode support. Doing so writes to the high-order bytes of freecount and causes corruption from the perspective of older kernels. The on-disk inobt record data structure is updated with a union to distinguish between the original, "full" format and the new, "sparse" format. The conversion routines to get, insert and update records are updated to translate to and from the on-disk record accordingly such that freecount remains a 4-byte value on non-supported fs, yet the new fields of the in-core record are always valid with respect to the record. This means that higher level code can refer to the current in-core record format unconditionally and lower level code ensures that records are translated to/from disk according to the capabilities of the fs. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
bb58e618 |
|
27-Nov-2014 |
Christoph Hellwig <hch@lst.de> |
xfs: move most of xfs_sb.h to xfs_format.h More on-disk format consolidation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
4fb6e8ad |
|
27-Nov-2014 |
Christoph Hellwig <hch@lst.de> |
xfs: merge xfs_ag.h into xfs_format.h More on-disk format consolidation. A few declarations that weren't on-disk format related move into better suitable spots. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
2451337d |
|
24-Jun-2014 |
Dave Chinner <dchinner@redhat.com> |
xfs: global error sign conversion Convert all the errors the core XFs code to negative error signs like the rest of the kernel and remove all the sign conversion we do in the interface layers. Errors for conversion (and comparison) found via searches like: $ git grep " E" fs/xfs $ git grep "return E" fs/xfs $ git grep " E[A-Z].*;$" fs/xfs Negation points found via searches like: $ git grep "= -[a-z,A-Z]" fs/xfs $ git grep "return -[a-z,A-D,F-Z]" fs/xfs $ git grep " -[a-z].*;" fs/xfs [ with some bits I missed from Brian Foster ] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|
#
30f712c9 |
|
24-Jun-2014 |
Dave Chinner <dchinner@redhat.com> |
libxfs: move source files Move all the source files that are shared with userspace into libxfs/. This is done as one big chunk simpy to get it done quickly Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
|