History log of /linux-master/fs/xfs/scrub/iscan.c
Revision Date Author Comments
# 14dd46cf 22-Feb-2024 Christoph Hellwig <hch@lst.de>

xfs: split xfs_inobt_init_cursor

Split xfs_inobt_init_cursor into separate routines for the inobt and
finobt to prepare for the removal of the xfs_btnum global enumeration
of btree types.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>


# 5385f1a6 22-Feb-2024 Darrick J. Wong <djwong@kernel.org>

xfs: repair file modes by scanning for a dirent pointing to us

Repair might encounter an inode with a totally garbage i_mode. To fix
this problem, we have to figure out if the file was a regular file, a
directory, or a special file. One way to figure this out is to check if
there are any directories with entries pointing down to the busted file.

This patch recovers the file mode by scanning every directory entry on
the filesystem to see if there are any that point to the busted file.
If the ftype of all such dirents are consistent, the mode is recovered
from the ftype. If no dirents are found, the file becomes a regular
file. In all cases, ACLs are canceled and the file is made accessible
only by root.

A previous patch attempted to guess the mode by reading the beginning of
the file data. This was rejected by Christoph on the grounds that we
cannot trust user-controlled data blocks. Users do not have direct
control over the ondisk contents of directory entries, so this method
should be much safer.

If all the dirents have the same ftype, then we can translate that back
into an S_IFMT flag and fix the file. If not, reset the mode to
S_IFREG.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>


# 82334a79 22-Feb-2024 Darrick J. Wong <djwong@kernel.org>

xfs: iscan batching should handle unallocated inodes too

The inode scanner tries to reduce contention on the AGI header buffer
lock by grabbing references to consecutive allocated inodes. Batching
stops as soon as we encounter an unallocated inode. This is unfortunate
because in the worst case performance collapses to the old "one at a
time" behavior if every other inode is free.

This is correct behavior, but we could do better. Unallocated inodes by
definition have nothing to scan, which means the iscan can ignore them
as long as someone ensures that the scan data will reflect another
thread allocating the inode and adding interesting metadata to that
inode. That mechanism is, of course, the live update hooks.

Therefore, extend the batching mechanism to track unallocated inodes
adjacent to the scan cursor. The _want_live_update predicate can tell
the caller's live update hook to incorporate all live updates to what
the scanner thinks is an unallocated inode if (after dropping the AGI)
some other thread allocates one of those inodes and begins using it.

Note that we cannot just copy the ir_free bitmap into the scan cursor
because the batching stops if iget says the inode is in an intermediate
state (e.g. on the inactivation list) and cannot be igrabbed.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>


# a7a686cb 22-Feb-2024 Darrick J. Wong <djwong@kernel.org>

xfs: cache a bunch of inodes for repair scans

After observing xfs_scrub taking forever to rebuild parent pointers on a
pptrs enabled filesystem, I decided to profile what the system was
doing. It turns out that when there are a lot of threads trying to scan
the filesystem, most of our time is spent contending on AGI buffer
locks. Given that we're walking the inobt records anyway, we can often
tell ahead of time when there's a bunch of (up to 64) consecutive inodes
that we could grab all at once.

Do this to amortize the cost of taking the AGI lock across as many
inodes as we possibly can. On the author's system this seems to improve
parallel throughput from barely one and a half cores to slightly
sublinear scaling. The obvious antipattern here of course is where the
freemask has every other bit set (e.g. all 0xA's)

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>


# c473a332 22-Feb-2024 Darrick J. Wong <djwong@kernel.org>

xfs: stagger the starting AG of scrub iscans to reduce contention

Online directory and parent repairs on parent-pointer equipped
filesystems have shown that starting a large number of parallel iscans
causes a lot of AGI buffer contention. Try to reduce this by making it
so that iscans scan wrap around the end of the filesystem, and using a
rotor to stagger where each scanner begins. Surprisingly, this boosts
CPU utilization (on the author's test machines) from effectively
single-threaded to 160%. Not great, but see the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>


# 8660c7b7 22-Feb-2024 Darrick J. Wong <djwong@kernel.org>

xfs: implement live inode scan for scrub

This patch implements a live file scanner for online fsck functions that
require the ability to walk a filesystem to gather metadata records and
stay informed about metadata changes to files that have already been
visited.

The iscan structure consists of two inode number cursors: one to track
which inode we want to visit next, and a second one to track which
inodes have already been visited. This second cursor is key to
capturing live updates to files previously scanned while the main thread
continues scanning -- any inode greater than this value hasn't been
scanned and can go on its way; any other update must be incorporated
into the collected data. It is critical for the scanning thraad to hold
exclusive access on the inode until after marking the inode visited.

This new code is a separate patch from the patchsets adding callers for
the sake of enabling the author to move patches around his tree with
ease. The intended usage model for this code is roughly:

xchk_iscan_start(iscan, 0, 0);
while ((error = xchk_iscan_iter(sc, iscan, &ip)) == 1) {
xfs_ilock(ip, ...);
/* capture inode metadata */
xchk_iscan_mark_visited(iscan, ip);
xfs_iunlock(ip, ...);

xfs_irele(ip);
}
xchk_iscan_stop(iscan);
if (error)
return error;

Hook functions for live updates can then do:

if (xchk_iscan_want_live_update(...))
/* update the captured inode metadata */

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>