Searched +hist:43 +hist:b6573b (Results 1 - 4 of 4) sorted by relevance

/linux-master/fs/f2fs/
H A DKconfigdiff 3fde13f8 Fri Jan 22 02:46:43 MST 2021 Chao Yu <chao@kernel.org> f2fs: compress: support compress level

Expand 'compress_algorithm' mount option to accept parameter as format of
<algorithm>:<level>, by this way, it gives a way to allow user to do more
specified config on lz4 and zstd compression level, then f2fs compression
can provide higher compress ratio.

In order to set compress level for lz4 algorithm, it needs to set
CONFIG_LZ4HC_COMPRESS and CONFIG_F2FS_FS_LZ4HC config to enable lz4hc
compress algorithm.

CR and performance number on lz4/lz4hc algorithm:

dd if=enwik9 of=compressed_file conv=fsync

Original blocks: 244382

lz4 lz4hc-9
compressed blocks 170647 163270
compress ratio 69.8% 66.8%
speed 16.4207 s, 60.9 MB/s 26.7299 s, 37.4 MB/s

compress ratio = after / before

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 43b6573b Wed Mar 02 01:04:24 MST 2016 Keith Mok <ek9852@gmail.com> f2fs: use cryptoapi crc32 functions

The crc function is done bit by bit.
Optimize this by use cryptoapi
crc32 function which is backed by h/w acceleration.

Signed-off-by: Keith Mok <ek9852@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 43b6573b Wed Mar 02 01:04:24 MST 2016 Keith Mok <ek9852@gmail.com> f2fs: use cryptoapi crc32 functions

The crc function is done bit by bit.
Optimize this by use cryptoapi
crc32 function which is backed by h/w acceleration.

Signed-off-by: Keith Mok <ek9852@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 3b218e3a Tue Oct 29 00:43:01 MDT 2013 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: introduce CONFIG_F2FS_CHECK_FS for BUG_ON control

This config will support an option to remove so many BUG_ONs that degrade
the performance potentially.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
H A Dcheckpoint.cdiff 47c8ebcc Thu Jan 27 14:31:43 MST 2022 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: add a way to limit roll forward recovery time

This adds a sysfs entry to call checkpoint during fsync() in order to avoid
long elapsed time to run roll-forward recovery when booting the device.
Default value doesn't enforce the limitation which is same as before.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 3b42c741 Sat Feb 20 02:38:43 MST 2021 Chao Yu <chao@kernel.org> f2fs: update comments for explicit memory barrier

Add more detailed comments for explicit memory barrier used by
f2fs, in order to enhance code readability.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff b4b10061 Tue Mar 31 12:43:07 MDT 2020 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: refactor resize_fs to avoid meta updates in progress

Sahitya raised an issue:
- prevent meta updates while checkpoint is in progress

allocate_segment_for_resize() can cause metapage updates if
it requires to change the current node/data segments for resizing.
Stop these meta updates when there is a checkpoint already
in progress to prevent inconsistent CP data.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff bae0ee7a Tue Dec 25 02:43:42 MST 2018 Chao Yu <chao@kernel.org> f2fs: check PageWriteback flag for ordered case

For all ordered cases in f2fs_wait_on_page_writeback(), we need to
check PageWriteback status, so let's clean up to relocate the check
into f2fs_wait_on_page_writeback().

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 4354994f Mon Aug 20 20:21:43 MDT 2018 Daniel Rosenberg <drosen@google.com> f2fs: checkpoint disabling

Note that, it requires "f2fs: return correct errno in f2fs_gc".

This adds a lightweight non-persistent snapshotting scheme to f2fs.

To use, mount with the option checkpoint=disable, and to return to
normal operation, remount with checkpoint=enable. If the filesystem
is shut down before remounting with checkpoint=enable, it will revert
back to its apparent state when it was first mounted with
checkpoint=disable. This is useful for situations where you wish to be
able to roll back the state of the disk in case of some critical
failure.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
[Jaegeuk Kim: use SB_RDONLY instead of MS_RDONLY]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff e4c5d848 Fri Sep 30 18:37:43 MDT 2016 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: introduce update_ckpt_flags to clean up

This patch add update_ckpt_flags() to clean up the flow.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff c2a080ae Tue Aug 30 20:43:19 MDT 2016 Chao Yu <chao@kernel.org> f2fs: fix to set superblock dirty correctly

tests/generic/251 of fstest suit complains us with below message:

------------[ cut here ]------------
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 7698 Comm: fstrim Tainted: G O 4.7.0+ #21
task: e9f4e000 task.stack: e7262000
EIP: 0060:[<f89fcefe>] EFLAGS: 00010202 CPU: 2
EIP is at write_checkpoint+0xfde/0x1020 [f2fs]
EAX: f33eb300 EBX: eecac310 ECX: 00000001 EDX: ffff0001
ESI: eecac000 EDI: eecac5f0 EBP: e7263dec ESP: e7263d18
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: b76ab01c CR3: 2eb89de0 CR4: 000406f0
Stack:
00000001 a220fb7b e9f4e000 00000002 419ff2d3 b3a05151 00000002 e9f4e5d8
e9f4e000 419ff2d3 b3a05151 eecac310 c10b8154 b3a05151 419ff2d3 c10b78bd
e9f4e000 e9f4e000 e9f4e5d8 00000001 e9f4e000 ec409000 eecac2cc eecac288
Call Trace:
[<c10b8154>] ? __lock_acquire+0x3c4/0x760
[<c10b78bd>] ? mark_held_locks+0x5d/0x80
[<f8a10632>] f2fs_trim_fs+0x1c2/0x2e0 [f2fs]
[<f89e9f56>] f2fs_ioctl+0x6b6/0x10b0 [f2fs]
[<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20
[<c10b4281>] ? trace_hardirqs_off_caller+0x91/0x120
[<f89e98a0>] ? __exchange_data_block+0xd30/0xd30 [f2fs]
[<c120b2e1>] do_vfs_ioctl+0x81/0x7f0
[<c11d57c5>] ? kmem_cache_free+0x245/0x2e0
[<c1217840>] ? get_unused_fd_flags+0x40/0x40
[<c1206eec>] ? putname+0x4c/0x50
[<c11f631e>] ? do_sys_open+0x16e/0x1d0
[<c1001990>] ? do_fast_syscall_32+0x30/0x1c0
[<c13d51df>] ? __this_cpu_preempt_check+0xf/0x20
[<c120baa8>] SyS_ioctl+0x58/0x80
[<c1001a01>] do_fast_syscall_32+0xa1/0x1c0
[<c178cc54>] sysenter_past_esp+0x45/0x74
EIP: [<f89fcefe>] write_checkpoint+0xfde/0x1020 [f2fs] SS:ESP 0068:e7263d18
---[ end trace 4de95d7e6b3aa7c6 ]---

The reason is: with below call stack, we will encounter BUG_ON during
doing fstrim.

Thread A Thread B
- write_checkpoint
- do_checkpoint
- f2fs_write_inode
- update_inode_page
- update_inode
- set_page_dirty
- f2fs_set_node_page_dirty
- inc_page_count
- percpu_counter_inc
- set_sbi_flag(SBI_IS_DIRTY)
- clear_sbi_flag(SBI_IS_DIRTY)

Thread C Thread D
- f2fs_write_node_page
- set_node_addr
- __set_nat_cache_dirty
- nm_i->dirty_nat_cnt++
- do_vfs_ioctl
- f2fs_ioctl
- f2fs_trim_fs
- write_checkpoint
- f2fs_bug_on(nm_i->dirty_nat_cnt)

Fix it by setting superblock dirty correctly in do_checkpoint and
f2fs_write_node_page.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff a1961246 Fri May 20 10:43:20 MDT 2016 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: introduce f2fs_i_links_write with mark_inode_dirty_sync

This patch introduces f2fs_i_links_write() to call mark_inode_dirty_sync() when
changing inode->i_links.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff b951a4ec Fri May 13 00:57:43 MDT 2016 Yunlei He <heyunlei@huawei.com> f2fs: no need inc dirty pages under inode lock

No need inc dirty pages under inode lock

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 43b6573b Wed Mar 02 01:04:24 MST 2016 Keith Mok <ek9852@gmail.com> f2fs: use cryptoapi crc32 functions

The crc function is done bit by bit.
Optimize this by use cryptoapi
crc32 function which is backed by h/w acceleration.

Signed-off-by: Keith Mok <ek9852@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 43b6573b Wed Mar 02 01:04:24 MST 2016 Keith Mok <ek9852@gmail.com> f2fs: use cryptoapi crc32 functions

The crc function is done bit by bit.
Optimize this by use cryptoapi
crc32 function which is backed by h/w acceleration.

Signed-off-by: Keith Mok <ek9852@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
H A Dsuper.cdiff 2736e8ee Thu Jun 08 05:02:43 MDT 2023 Christoph Hellwig <hch@lst.de> block: use the holder as indication for exclusive opens

The current interface for exclusive opens is rather confusing as it
requires both the FMODE_EXCL flag and a holder. Remove the need to pass
FMODE_EXCL and just key off the exclusive open off a non-NULL holder.

For blkdev_put this requires adding the holder argument, which provides
better debug checking that only the holder actually releases the hold,
but at the same time allows removing the now superfluous mode argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd]
Link: https://lore.kernel.org/r/20230608110258.189493-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff c5bf8348 Mon Feb 06 07:43:08 MST 2023 Yangtao Li <frank.li@vivo.com> f2fs: fix to set ipu policy

For LFS mode, it should update outplace and no need inplace update.
When using LFS mode for small-volume devices, IPU will not be used,
and the OPU writing method is actually used, but F2FS_IPU_FORCE can
be read from the ipu_policy node, which is different from the actual
situation. And remount to lfs mode should be disallowed when
f2fs ipu is enabled, let's fix it.

Fixes: 84b89e5d943d ("f2fs: add auto tuning for small devices")
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 12607c1b Wed Nov 30 10:36:43 MST 2022 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: specify extent cache for read explicitly

Let's descrbie it's read extent cache.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff e33c267a Tue May 31 21:22:24 MDT 2022 Roman Gushchin <roman.gushchin@linux.dev> mm: shrinkers: provide shrinkers with names

Currently shrinkers are anonymous objects. For debugging purposes they
can be identified by count/scan function names, but it's not always
useful: e.g. for superblock's shrinkers it's nice to have at least an
idea of to which superblock the shrinker belongs.

This commit adds names to shrinkers. register_shrinker() and
prealloc_shrinker() functions are extended to take a format and arguments
to master a name.

In some cases it's not possible to determine a good name at the time when
a shrinker is allocated. For such cases shrinker_debugfs_rename() is
provided.

The expected format is:
<subsystem>-<shrinker_type>[:<instance>]-<id>
For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair.

After this change the shrinker debugfs directory looks like:
$ cd /sys/kernel/debug/shrinker/
$ ls
dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42
mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43
mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44
rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49
sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13
sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36
sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19
sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10
sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9
sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37
sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38
sb-dax-11 sb-proc-45 sb-tmpfs-35
sb-debugfs-7 sb-proc-46 sb-tmpfs-40

[roman.gushchin@linux.dev: fix build warnings]
Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff 908ea654 Wed May 25 03:43:36 MDT 2022 Yufen Yu <yuyufen@huawei.com> f2fs: add f2fs_init_write_merge_io function

Almost all other initialization of variables in f2fs_fill_super are
extraced to a single function. Also do it for write_io[], which can
make code more clean.

This patch just refactors the code, theres no functional change.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[Jaegeuk Kim: clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 9d6b0cd7 Tue Feb 22 12:31:43 MST 2022 Matthew Wilcox (Oracle) <willy@infradead.org> fs: Remove flags parameter from aops->write_begin

There are no more aop flags left, so remove the parameter.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
diff 47c8ebcc Thu Jan 27 14:31:43 MST 2022 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: add a way to limit roll forward recovery time

This adds a sysfs entry to call checkpoint during fsync() in order to avoid
long elapsed time to run roll-forward recovery when booting the device.
Default value doesn't enforce the limitation which is same as before.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 4f993264 Mon Aug 02 18:15:43 MDT 2021 Chao Yu <chao@kernel.org> f2fs: introduce discard_unit mount option

As James Z reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=213877

[1.] One-line summary of the problem:
Mount multiple SMR block devices exceed certain number cause system non-response

[2.] Full description of the problem/report:
Created some F2FS on SMR devices (mkfs.f2fs -m), then mounted in sequence. Each device is the same Model: HGST HSH721414AL (Size 14TB).
Empirically, found that when the amount of SMR device * 1.5Gb > System RAM, the system ran out of memory and hung. No dmesg output. For example, 24 SMR Disk need 24*1.5GB = 36GB. A system with 32G RAM can only mount 21 devices, the 22nd device will be a reproducible cause of system hang.
The number of SMR devices with other FS mounted on this system does not interfere with the result above.

[3.] Keywords (i.e., modules, networking, kernel):
F2FS, SMR, Memory

[4.] Kernel information
[4.1.] Kernel version (uname -a):
Linux 5.13.4-200.fc34.x86_64 #1 SMP Tue Jul 20 20:27:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

[4.2.] Kernel .config file:
Default Fedora 34 with f2fs-tools-1.14.0-2.fc34.x86_64

[5.] Most recent kernel version which did not have the bug:
None

[6.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/admin-guide/oops-tracing.rst)
None

[7.] A small shell script or example program which triggers the
problem (if possible)
mount /dev/sdX /mnt/0X

[8.] Memory consumption

With 24 * 14T SMR Block device with F2FS
free -g
total used free shared buff/cache available
Mem: 46 36 0 0 10 10
Swap: 0 0 0

With 3 * 14T SMR Block device with F2FS
free -g
total used free shared buff/cache available
Mem: 7 5 0 0 1 1
Swap: 7 0 7

The root cause is, there are three bitmaps:
- cur_valid_map
- ckpt_valid_map
- discard_map
and each of them will cost ~500MB memory, {cur, ckpt}_valid_map are
necessary, but discard_map is optional, since this bitmap will only be
useful in mountpoint that small discard is enabled.

For a blkzoned device such as SMR or ZNS devices, f2fs will only issue
discard for a section(zone) when all blocks of that section are invalid,
so, for such device, we don't need small discard functionality at all.

This patch introduces a new mountoption "discard_unit=block|segment|
section" to support issuing discard with different basic unit which is
aligned to block, segment or section, so that user can specify
"discard_unit=segment" or "discard_unit=section" to disable small
discard functionality.

Note that this mount option can not be changed by remount() due to
related metadata need to be initialized during mount().

In order to save memory, let's use "discard_unit=section" for blkzoned
device by default.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 0dd57178 Mon May 17 19:57:54 MDT 2021 Chao Yu <chao@kernel.org> f2fs: add MODULE_SOFTDEP to ensure crc32 is included in the initramfs

As marcosfrm reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=213089

Initramfs generators rely on "pre" softdeps (and "depends") to include
additional required modules.

F2FS does not declare "pre: crc32" softdep. Then every generator (dracut,
mkinitcpio...) has to maintain a hardcoded list for this purpose.

Hence let's use MODULE_SOFTDEP("pre: crc32") in f2fs code.

Fixes: 43b6573bac95 ("f2fs: use cryptoapi crc32 functions")
Reported-by: marcosfrm <marcosfrm@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 3fde13f8 Fri Jan 22 02:46:43 MST 2021 Chao Yu <chao@kernel.org> f2fs: compress: support compress level

Expand 'compress_algorithm' mount option to accept parameter as format of
<algorithm>:<level>, by this way, it gives a way to allow user to do more
specified config on lz4 and zstd compression level, then f2fs compression
can provide higher compress ratio.

In order to set compress level for lz4 algorithm, it needs to set
CONFIG_LZ4HC_COMPRESS and CONFIG_F2FS_FS_LZ4HC config to enable lz4hc
compress algorithm.

CR and performance number on lz4/lz4hc algorithm:

dd if=enwik9 of=compressed_file conv=fsync

Original blocks: 244382

lz4 lz4hc-9
compressed blocks 170647 163270
compress ratio 69.8% 66.8%
speed 16.4207 s, 60.9 MB/s 26.7299 s, 37.4 MB/s

compress ratio = after / before

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
H A Df2fs.hdiff ffc143db Fri May 26 14:43:23 MDT 2023 Matthew Wilcox (Oracle) <willy@infradead.org> filemap: Add fgf_t typedef

Similarly to gfp_t, define fgf_t as its own type to prevent various
misuses and confusion. Leave the flags as FGP_* for now to reduce the
size of this patch; they will be converted to FGF_* later. Move the
documentation to the definition of the type insted of burying it in the
__filemap_get_folio() documentation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
diff 55847850 Fri Apr 14 04:43:08 MDT 2023 Wu Bo <bo.wu@vivo.com> f2fs: allocate trace path buffer from names_cache

It would be better to use the dedicated slab to store path.

Signed-off-by: Wu Bo <bo.wu@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff d23be468 Sat Feb 04 02:43:45 MST 2023 qixiaoyu1 <qxy65535@gmail.com> f2fs: add sysfs nodes to set last_age_weight

Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 12607c1b Wed Nov 30 10:36:43 MST 2022 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: specify extent cache for read explicitly

Let's descrbie it's read extent cache.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 0d5b9d81 Tue Jul 12 09:26:43 MDT 2022 Chao Yu <chao@kernel.org> f2fs: invalidate meta pages only for post_read required inode

After commit e3b49ea36802 ("f2fs: invalidate META_MAPPING before
IPU/DIO write"), invalidate_mapping_pages() will be called to
avoid race condition in between IPU/DIO and readahead for GC.

However, readahead flow is only used for post_read required inode,
so this patch adds check condition to avoids unnecessary page cache
invalidating for non-post_read inode.

Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 908ea654 Wed May 25 03:43:36 MDT 2022 Yufen Yu <yuyufen@huawei.com> f2fs: add f2fs_init_write_merge_io function

Almost all other initialization of variables in f2fs_fill_super are
extraced to a single function. Also do it for write_io[], which can
make code more clean.

This patch just refactors the code, theres no functional change.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[Jaegeuk Kim: clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 759820c9 Sat May 21 05:11:43 MDT 2022 Julia Lawall <Julia.Lawall@inria.fr> f2fs: fix typo in comment

Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff c7f91bd4 Tue Feb 22 11:43:13 MST 2022 Bart Van Assche <bvanassche@acm.org> f2fs: Restore rwsem lockdep support

Lockdep uses lock class keys in its analysis. init_rwsem() instantiates
one lock class key with each init_rwsem() user as follows:

#define init_rwsem(sem) \
do { \
static struct lock_class_key __key; \
\
__init_rwsem((sem), #sem, &__key); \
} while (0)

Commit e4544b63a7ee ("f2fs: move f2fs to use reader-unfair rwsems") reduced
the number of lock class keys from one per init_rwsem() user to one per
file in which init_f2fs_rwsem() is used. This causes the same lock class key
to be associated with multiple f2fs rwsems and also triggers a number of
false positive lockdep deadlock reports. Fix this by again instantiating one
lock class key with each init_f2fs_rwsem() caller.

Cc: Tim Murray <timmurray@google.com>
Reported-by: syzbot+0b9cadf5fc45a98a5083@syzkaller.appspotmail.com
Fixes: e4544b63a7ee ("f2fs: move f2fs to use reader-unfair rwsems")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 47c8ebcc Thu Jan 27 14:31:43 MST 2022 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: add a way to limit roll forward recovery time

This adds a sysfs entry to call checkpoint during fsync() in order to avoid
long elapsed time to run roll-forward recovery when booting the device.
Default value doesn't enforce the limitation which is same as before.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 4f993264 Mon Aug 02 18:15:43 MDT 2021 Chao Yu <chao@kernel.org> f2fs: introduce discard_unit mount option

As James Z reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=213877

[1.] One-line summary of the problem:
Mount multiple SMR block devices exceed certain number cause system non-response

[2.] Full description of the problem/report:
Created some F2FS on SMR devices (mkfs.f2fs -m), then mounted in sequence. Each device is the same Model: HGST HSH721414AL (Size 14TB).
Empirically, found that when the amount of SMR device * 1.5Gb > System RAM, the system ran out of memory and hung. No dmesg output. For example, 24 SMR Disk need 24*1.5GB = 36GB. A system with 32G RAM can only mount 21 devices, the 22nd device will be a reproducible cause of system hang.
The number of SMR devices with other FS mounted on this system does not interfere with the result above.

[3.] Keywords (i.e., modules, networking, kernel):
F2FS, SMR, Memory

[4.] Kernel information
[4.1.] Kernel version (uname -a):
Linux 5.13.4-200.fc34.x86_64 #1 SMP Tue Jul 20 20:27:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

[4.2.] Kernel .config file:
Default Fedora 34 with f2fs-tools-1.14.0-2.fc34.x86_64

[5.] Most recent kernel version which did not have the bug:
None

[6.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/admin-guide/oops-tracing.rst)
None

[7.] A small shell script or example program which triggers the
problem (if possible)
mount /dev/sdX /mnt/0X

[8.] Memory consumption

With 24 * 14T SMR Block device with F2FS
free -g
total used free shared buff/cache available
Mem: 46 36 0 0 10 10
Swap: 0 0 0

With 3 * 14T SMR Block device with F2FS
free -g
total used free shared buff/cache available
Mem: 7 5 0 0 1 1
Swap: 7 0 7

The root cause is, there are three bitmaps:
- cur_valid_map
- ckpt_valid_map
- discard_map
and each of them will cost ~500MB memory, {cur, ckpt}_valid_map are
necessary, but discard_map is optional, since this bitmap will only be
useful in mountpoint that small discard is enabled.

For a blkzoned device such as SMR or ZNS devices, f2fs will only issue
discard for a section(zone) when all blocks of that section are invalid,
so, for such device, we don't need small discard functionality at all.

This patch introduces a new mountoption "discard_unit=block|segment|
section" to support issuing discard with different basic unit which is
aligned to block, segment or section, so that user can specify
"discard_unit=segment" or "discard_unit=section" to disable small
discard functionality.

Note that this mount option can not be changed by remount() due to
related metadata need to be initialized during mount().

In order to save memory, let's use "discard_unit=section" for blkzoned
device by default.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Completed in 805 milliseconds