History log of /linux-master/fs/f2fs/xattr.c
Revision Date Author Comments
# 86d7d57a 11-Dec-2023 Zhiguo Niu <zhiguo.niu@unisoc.com>

f2fs: fix to check return value of f2fs_recover_xattr_data

Should check return value of f2fs_recover_xattr_data in
__f2fs_setxattr rather than doing invalid retry if error happen.

Also just do set_page_dirty in f2fs_recover_xattr_data when
page is changed really.

Fixes: 50a472bbc79f ("f2fs: do not return EFSCORRUPTED, but try to run online repair")
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# e26b6d39 06-Nov-2023 Eric Biggers <ebiggers@google.com>

f2fs: explicitly null-terminate the xattr list

When setting an xattr, explicitly null-terminate the xattr list. This
eliminates the fragile assumption that the unused xattr space is always
zeroed.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 50a472bb 19-Oct-2023 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: do not return EFSCORRUPTED, but try to run online repair

If we return the error, there's no way to recover the status as of now, since
fsck does not fix the xattr boundary issue.

Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# a1c0752c 29-Sep-2023 Wedson Almeida Filho <walmeida@microsoft.com>

f2fs: move f2fs_xattr_handlers and f2fs_xattr_handler_map to .rodata

This makes it harder for accidental or malicious changes to
f2fs_xattr_handlers or f2fs_xattr_handler_map at runtime.

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <chao@kernel.org>
Cc: linux-f2fs-devel@lists.sourceforge.net
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Link: https://lore.kernel.org/r/20230930050033.41174-11-wedsonaf@gmail.com
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>


# bc3994ff 19-Jul-2023 Chao Yu <chao@kernel.org>

f2fs: remove unneeded check condition in __f2fs_setxattr()

It has checked return value of write_all_xattrs(), remove unneeded
following check condition.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 8874ad7d 19-Jul-2023 Chao Yu <chao@kernel.org>

f2fs: fix to update i_ctime in __f2fs_setxattr()

generic/728 - output mismatch (see /media/fstests/results//generic/728.out.bad)
--- tests/generic/728.out 2023-07-19 07:10:48.362711407 +0000
+++ /media/fstests/results//generic/728.out.bad 2023-07-19 08:39:57.000000000 +0000
QA output created by 728
+Expected ctime to change after setxattr.
+Expected ctime to change after removexattr.
Silence is golden
...
(Run 'diff -u /media/fstests/tests/generic/728.out /media/fstests/results//generic/728.out.bad' to see the entire diff)
generic/729 1s

It needs to update i_ctime after {set,remove}xattr, fix it.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# c62ebd35 05-Jul-2023 Jeff Layton <jlayton@kernel.org>

f2fs: convert to ctime accessor functions

In later patches, we're going to change how the inode's ctime field is
used. Switch to using accessor functions instead of raw accesses of
inode->i_ctime.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230705190309.579783-41-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>


# 5eda1ad1 28-Jun-2023 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix deadlock in i_xattr_sem and inode page lock

Thread #1:

[122554.641906][ T92] f2fs_getxattr+0xd4/0x5fc
-> waiting for f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);

[122554.641927][ T92] __f2fs_get_acl+0x50/0x284
[122554.641948][ T92] f2fs_init_acl+0x84/0x54c
[122554.641969][ T92] f2fs_init_inode_metadata+0x460/0x5f0
[122554.641990][ T92] f2fs_add_inline_entry+0x11c/0x350
-> Locked dir->inode_page by f2fs_get_node_page()

[122554.642009][ T92] f2fs_do_add_link+0x100/0x1e4
[122554.642025][ T92] f2fs_create+0xf4/0x22c
[122554.642047][ T92] vfs_create+0x130/0x1f4

Thread #2:

[123996.386358][ T92] __get_node_page+0x8c/0x504
-> waiting for dir->inode_page lock

[123996.386383][ T92] read_all_xattrs+0x11c/0x1f4
[123996.386405][ T92] __f2fs_setxattr+0xcc/0x528
[123996.386424][ T92] f2fs_setxattr+0x158/0x1f4
-> f2fs_down_write(&F2FS_I(inode)->i_xattr_sem);

[123996.386443][ T92] __f2fs_set_acl+0x328/0x430
[123996.386618][ T92] f2fs_set_acl+0x38/0x50
[123996.386642][ T92] posix_acl_chmod+0xc8/0x1c8
[123996.386669][ T92] f2fs_setattr+0x5e0/0x6bc
[123996.386689][ T92] notify_change+0x4d8/0x580
[123996.386717][ T92] chmod_common+0xd8/0x184
[123996.386748][ T92] do_fchmodat+0x60/0x124
[123996.386766][ T92] __arm64_sys_fchmodat+0x28/0x3c

Cc: <stable@vger.kernel.org>
Fixes: 27161f13e3c3 "f2fs: avoid race in between read xattr & write xattr"
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# d549b741 01-Feb-2023 Christian Brauner <brauner@kernel.org>

fs: rename generic posix acl handlers

Reflect in their naming and document that they are kept around for
legacy reasons and shouldn't be used anymore by new code.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>


# a5488f29 01-Feb-2023 Christian Brauner <brauner@kernel.org>

fs: simplify ->listxattr() implementation

The ext{2,4}, erofs, f2fs, and jffs2 filesystems use the same logic to
check whether a given xattr can be listed. Simplify them and avoid
open-coding the same check by calling the helper we introduced earlier.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-erofs@lists.ozlabs.org
Cc: linux-ext4@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>


# 0c95c025 01-Feb-2023 Christian Brauner <brauner@kernel.org>

fs: drop unused posix acl handlers

Remove struct posix_acl_{access,default}_handler for all filesystems
that don't depend on the xattr handler in their inode->i_op->listxattr()
method in any way. There's nothing more to do than to simply remove the
handler. It's been effectively unused ever since we introduced the new
posix acl api.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>


# 01beba79 12-Jan-2023 Christian Brauner <brauner@kernel.org>

fs: port inode_owner_or_capable() to mnt_idmap

Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>


# 39f60c1c 12-Jan-2023 Christian Brauner <brauner@kernel.org>

fs: port xattr to mnt_idmap

Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>


# 95fa90c9 28-Sep-2022 Chao Yu <chao@kernel.org>

f2fs: support recording errors into superblock

This patch supports to record detail reason of FSCORRUPTED error into
f2fs_super_block.s_errors[].

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# e4544b63 07-Jan-2022 Tim Murray <timmurray@google.com>

f2fs: move f2fs to use reader-unfair rwsems

f2fs rw_semaphores work better if writers can starve readers,
especially for the checkpoint thread, because writers are strictly
more important than reader threads. This prevents significant priority
inversion between low-priority readers that blocked while trying to
acquire the read lock and a second acquisition of the write lock that
might be blocking high priority work.

Signed-off-by: Tim Murray <timmurray@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# dd9d4a3a 12-Dec-2021 Chao Yu <chao@kernel.org>

f2fs: clean up __find_inline_xattr() with __find_xattr()

Just cleanup, no logic change.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 645a3c40 12-Dec-2021 Chao Yu <chao@kernel.org>

f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr()

As Wenqing Liu reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=215235

- Overview
page fault in f2fs_setxattr() when mount and operate on corrupted image

- Reproduce
tested on kernel 5.16-rc3, 5.15.X under root

1. unzip tmp7.zip
2. ./single.sh f2fs 7

Sometimes need to run the script several times

- Kernel dump
loop0: detected capacity change from 0 to 131072
F2FS-fs (loop0): Found nat_bits in checkpoint
F2FS-fs (loop0): Mounted with checkpoint version = 7548c2ee
BUG: unable to handle page fault for address: ffffe47bc7123f48
RIP: 0010:kfree+0x66/0x320
Call Trace:
__f2fs_setxattr+0x2aa/0xc00 [f2fs]
f2fs_setxattr+0xfa/0x480 [f2fs]
__f2fs_set_acl+0x19b/0x330 [f2fs]
__vfs_removexattr+0x52/0x70
__vfs_removexattr_locked+0xb1/0x140
vfs_removexattr+0x56/0x100
removexattr+0x57/0x80
path_removexattr+0xa3/0xc0
__x64_sys_removexattr+0x17/0x20
do_syscall_64+0x37/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae

The root cause is in __f2fs_setxattr(), we missed to do sanity check on
last xattr entry, result in out-of-bound memory access during updating
inconsistent xattr data of target inode.

After the fix, it can detect such xattr inconsistency as below:

F2FS-fs (loop11): inode (7) has invalid last xattr entry, entry_size: 60676
F2FS-fs (loop11): inode (8) has corrupted xattr
F2FS-fs (loop11): inode (8) has corrupted xattr
F2FS-fs (loop11): inode (8) has invalid last xattr entry, entry_size: 47736

Cc: stable@vger.kernel.org
Reported-by: Wenqing Liu <wenqingliu0120@gmail.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 10a26878 28-Oct-2021 Chao Yu <chao@kernel.org>

f2fs: support fault injection for dquot_initialize()

This patch adds a new function f2fs_dquot_initialize() to wrap
dquot_initialize(), and it supports to inject fault into
f2fs_dquot_initialize() to simulate inner failure occurs in
dquot_initialize().

Usage:
a) echo 65536 > /sys/fs/f2fs/<dev>/inject_type or
b) mount -o fault_type=65536 <dev> <mountpoint>

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 32410577 08-Aug-2021 Chao Yu <chao@kernel.org>

f2fs: support fault injection for f2fs_kmem_cache_alloc()

This patch supports to inject fault into f2fs_kmem_cache_alloc().

Usage:
a) echo 32768 > /sys/fs/f2fs/<dev>/inject_type or
b) mount -o fault_type=32768 <dev> <mountpoint>

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 5f029c04 05-Apr-2021 Yi Zhuang <zhuangyi1@huawei.com>

f2fs: clean up build warnings

This patch combined the below three clean-up patches.

- modify open brace '{' following function definitions
- ERROR: spaces required around that ':'
- ERROR: spaces required before the open parenthesis '('
- ERROR: spaces prohibited before that ','
- Made suggested modifications from checkpatch in reference to WARNING:
Missing a blank line after declarations

Signed-off-by: Yi Zhuang <zhuangyi1@huawei.com>
Signed-off-by: Jia Yang <jiayang5@huawei.com>
Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# e65ce2a5 21-Jan-2021 Christian Brauner <christian.brauner@ubuntu.com>

acl: handle idmapped mounts

The posix acl permission checking helpers determine whether a caller is
privileged over an inode according to the acls associated with the
inode. Add helpers that make it possible to handle acls on idmapped
mounts.

The vfs and the filesystems targeted by this first iteration make use of
posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to
translate basic posix access and default permissions such as the
ACL_USER and ACL_GROUP type according to the initial user namespace (or
the superblock's user namespace) to and from the caller's current user
namespace. Adapt these two helpers to handle idmapped mounts whereby we
either map from or into the mount's user namespace depending on in which
direction we're translating.
Similarly, cap_convert_nscap() is used by the vfs to translate user
namespace and non-user namespace aware filesystem capabilities from the
superblock's user namespace to the caller's user namespace. Enable it to
handle idmapped mounts by accounting for the mount's user namespace.

In addition the fileystems targeted in the first iteration of this patch
series make use of the posix_acl_chmod() and, posix_acl_update_mode()
helpers. Both helpers perform permission checks on the target inode. Let
them handle idmapped mounts. These two helpers are called when posix
acls are set by the respective filesystems to handle this case we extend
the ->set() method to take an additional user namespace argument to pass
the mount's user namespace down.

Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>


# 21cb47be 21-Jan-2021 Christian Brauner <christian.brauner@ubuntu.com>

inode: make init and permission helpers idmapped mount aware

The inode_owner_or_capable() helper determines whether the caller is the
owner of the inode or is capable with respect to that inode. Allow it to
handle idmapped mounts. If the inode is accessed through an idmapped
mount it according to the mount's user namespace. Afterwards the checks
are identical to non-idmapped mounts. If the initial user namespace is
passed nothing changes so non-idmapped mounts will see identical
behavior as before.

Similarly, allow the inode_init_owner() helper to handle idmapped
mounts. It initializes a new inode on idmapped mounts by mapping the
fsuid and fsgid of the caller from the mount's user namespace. If the
initial user namespace is passed nothing changes so non-idmapped mounts
will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-7-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>


# 2e0cd472 31-Jan-2021 Liu Song <liu.song11@zte.com.cn>

f2fs: remove unnecessary initialization in xattr.c

These variables will be explicitly assigned before use,
so there is no need to initialize.

Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 17232e83 25-Dec-2020 Chao Yu <chao@kernel.org>

f2fs: enhance to update i_mode and acl atomically in f2fs_setattr()

Previously, in f2fs_setattr(), we don't update S_ISUID|S_ISGID|S_ISVTX
bits with S_IRWXUGO bits and acl entries atomically, so in error path,
chmod() may partially success, this patch enhances to make chmod() flow
being atomical.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# c8eb7024 14-Sep-2020 Chao Yu <chao@kernel.org>

f2fs: clean up kvfree

After commit 0b6d4ca04a86 ("f2fs: don't return vmalloc() memory from
f2fs_kmalloc()"), f2fs_k{m,z}alloc() will not return vmalloc()'ed
memory, so clean up to use kfree() instead of kvfree() to free
vmalloc()'ed memory.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# a87aff1d 24-Jul-2020 Jack Qiu <jack.qiu@huawei.com>

f2fs: space related cleanup

Just for code style, no logic change
1. delete useless space
2. change spaces into tab

Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# a999150f 25-Feb-2020 Chao Yu <chao@kernel.org>

f2fs: use kmem_cache pool during inline xattr lookups

It's been observed that kzalloc() on lookup_all_xattrs() are called millions
of times on Android, quickly becoming the top abuser of slub memory allocator.

Use a dedicated kmem cache pool for xattr lookups to mitigate this.

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# ba3b583c 14-Feb-2020 Chao Yu <chao@kernel.org>

f2fs: clean up parameter of macro XATTR_SIZE()

Just cleanup, no logic change.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 3addc1ae 27-Feb-2020 Chao Yu <chao@kernel.org>

f2fs: remove i_sem lock coverage in f2fs_setxattr()

f2fs_inode.xattr_ver field was gone after commit d260081ccf37
("f2fs: change recovery policy of xattr node block"), remove i_sem
lock coverage in f2fs_setxattr()

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 688078e7 18-Oct-2019 Randall Huang <huangrandall@google.com>

f2fs: fix to avoid memory leakage in f2fs_listxattr

In f2fs_listxattr, there is no boundary check before
memcpy e_name to buffer.
If the e_name_len is corrupted,
unexpected memory contents may be returned to the buffer.

Signed-off-by: Randall Huang <huangrandall@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 00e09c0b 23-Aug-2019 Chao Yu <chao@kernel.org>

f2fs: enhance f2fs_is_checkpoint_ready()'s readability

This patch changes sematics of f2fs_is_checkpoint_ready()'s return
value as: return true when checkpoint is ready, other return false,
it can improve readability of below conditions.

f2fs_submit_page_write()
...
if (is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN) ||
!f2fs_is_checkpoint_ready(sbi))
__submit_merged_bio(io);

f2fs_balance_fs()
...
if (!f2fs_is_checkpoint_ready(sbi))
return;

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# a25c2cdc 22-Jul-2019 Chao Yu <chao@kernel.org>

f2fs: fix to detect cp error in f2fs_setxattr()

It needs to return -EIO if filesystem has been shutdown, fix the
miss case in f2fs_setxattr().

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 955ebcd3 22-Jul-2019 Chao Yu <chao@kernel.org>

f2fs: fix to spread f2fs_is_checkpoint_ready()

We missed to call f2fs_is_checkpoint_ready() in several places, it may
allow space allocation even when free space was exhausted during
checkpoint is disabled, fix to add them.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# c83414ae 19-Jun-2019 Chao Yu <chao@kernel.org>

f2fs: set SBI_NEED_FSCK for xattr corruption case

If xattr is corrupted, let's print kernel message and set SBI_NEED_FSCK
for further repair.

Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 10f966bb 19-Jun-2019 Chao Yu <chao@kernel.org>

f2fs: use generic EFSBADCRC/EFSCORRUPTED

f2fs uses EFAULT as error number to indicate filesystem is corrupted
all the time, but generic filesystems use EUCLEAN for such condition,
we need to change to follow others.

This patch adds two new macros as below to wrap more generic error
code macros, and spread them in code.

EFSBADCRC EBADMSG /* Bad CRC detected */
EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */

Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 2777e654 11-Apr-2019 Randall Huang <huangrandall@google.com>

f2fs: fix to avoid accessing xattr across the boundary

When we traverse xattr entries via __find_xattr(),
if the raw filesystem content is faked or any hardware failure occurs,
out-of-bound error can be detected by KASAN.
Fix the issue by introducing boundary check.

[ 38.402878] c7 1827 BUG: KASAN: slab-out-of-bounds in f2fs_getxattr+0x518/0x68c
[ 38.402891] c7 1827 Read of size 4 at addr ffffffc0b6fb35dc by task
[ 38.402935] c7 1827 Call trace:
[ 38.402952] c7 1827 [<ffffff900809003c>] dump_backtrace+0x0/0x6bc
[ 38.402966] c7 1827 [<ffffff9008090030>] show_stack+0x20/0x2c
[ 38.402981] c7 1827 [<ffffff900871ab10>] dump_stack+0xfc/0x140
[ 38.402995] c7 1827 [<ffffff9008325c40>] print_address_description+0x80/0x2d8
[ 38.403009] c7 1827 [<ffffff900832629c>] kasan_report_error+0x198/0x1fc
[ 38.403022] c7 1827 [<ffffff9008326104>] kasan_report_error+0x0/0x1fc
[ 38.403037] c7 1827 [<ffffff9008325000>] __asan_load4+0x1b0/0x1b8
[ 38.403051] c7 1827 [<ffffff90085fcc44>] f2fs_getxattr+0x518/0x68c
[ 38.403066] c7 1827 [<ffffff90085fc508>] f2fs_xattr_generic_get+0xb0/0xd0
[ 38.403080] c7 1827 [<ffffff9008395708>] __vfs_getxattr+0x1f4/0x1fc
[ 38.403096] c7 1827 [<ffffff9008621bd0>] inode_doinit_with_dentry+0x360/0x938
[ 38.403109] c7 1827 [<ffffff900862d6cc>] selinux_d_instantiate+0x2c/0x38
[ 38.403123] c7 1827 [<ffffff900861b018>] security_d_instantiate+0x68/0x98
[ 38.403136] c7 1827 [<ffffff9008377db8>] d_splice_alias+0x58/0x348
[ 38.403149] c7 1827 [<ffffff900858d16c>] f2fs_lookup+0x608/0x774
[ 38.403163] c7 1827 [<ffffff900835eacc>] lookup_slow+0x1e0/0x2cc
[ 38.403177] c7 1827 [<ffffff9008367fe0>] walk_component+0x160/0x520
[ 38.403190] c7 1827 [<ffffff9008369ef4>] path_lookupat+0x110/0x2b4
[ 38.403203] c7 1827 [<ffffff900835dd38>] filename_lookup+0x1d8/0x3a8
[ 38.403216] c7 1827 [<ffffff900835eeb0>] user_path_at_empty+0x54/0x68
[ 38.403229] c7 1827 [<ffffff9008395f44>] SyS_getxattr+0xb4/0x18c
[ 38.403241] c7 1827 [<ffffff9008084200>] el0_svc_naked+0x34/0x38

Signed-off-by: Randall Huang <huangrandall@google.com>
[Jaegeuk Kim: Fix wrong ending boundary]
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 2c28aba8 05-Mar-2019 Chao Yu <chao@kernel.org>

f2fs: fix to adapt small inline xattr space in __find_inline_xattr()

With below testcase, we will fail to find existed xattr entry:

1. mkfs.f2fs -O extra_attr -O flexible_inline_xattr /dev/zram0
2. mount -t f2fs -o inline_xattr_size=1 /dev/zram0 /mnt/f2fs/
3. touch /mnt/f2fs/file
4. setfattr -n "user.name" -v 0 /mnt/f2fs/file
5. getfattr -n "user.name" /mnt/f2fs/file

/mnt/f2fs/file: user.name: No such attribute

The reason is for inode which has very small inline xattr size,
__find_inline_xattr() will fail to traverse any entry due to first
entry may not be loaded from xattr node yet, later, we may skip to
check entire xattr datas in __find_xattr(), result in such wrong
condition.

This patch adds condition to check such case to avoid this issue.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 2a6a7e72 05-Mar-2019 Chao Yu <chao@kernel.org>

f2fs: fix to use kvfree instead of kzfree

As Jiqun Li reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=202747

System can panic due to using wrong allocate/free function pair
in xattr interface:
- use kvmalloc to allocate memory
- use kzfree to free memory

Let's fix to use kvfree instead of kzfree, BTW, we are safe to
get rid of kzfree, since there is no such confidential data stored
as xattr, we don't need to zero it before free memory.

Fixes: 5222595d093e ("f2fs: use kvmalloc, if kmalloc is failed")
Reported-by: Jiqun Li <jiqun.li@unisoc.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# eecfa427 25-Jan-2019 Gao Xiang <xiang@kernel.org>

f2fs: use xattr_prefix to wrap up

Let's use xattr_prefix instead of open code.
No logic changes.

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 64beba05 26-Dec-2018 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: sanity check of xattr entry size

There is a security report where f2fs_getxattr() has a hole to expose wrong
memory region when the image is malformed like this.

f2fs_getxattr: entry->e_name_len: 4, size: 12288, buffer_size: 16384, len: 4

Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# bae0ee7a 25-Dec-2018 Chao Yu <chao@kernel.org>

f2fs: check PageWriteback flag for ordered case

For all ordered cases in f2fs_wait_on_page_writeback(), we need to
check PageWriteback status, so let's clean up to relocate the check
into f2fs_wait_on_page_writeback().

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 7c1a000d 11-Sep-2018 Chao Yu <chao@kernel.org>

f2fs: add SPDX license identifiers

Remove the verbose license text from f2fs files and replace them with
SPDX tags. This does not change the license of any of the code.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 797c1cb5 19-Jul-2018 Chao Yu <chao@kernel.org>

f2fs: restrict setting up inode.i_advise

In order to give advise to f2fs to recognize hot/cold file, it is possible
that we can set specific bit in inode.i_advise through setxattr(), but
there are several bits which are used internally, such as encrypt_bit,
keep_size_bit, they should never be changed through setxattr().

So that this patch 1) adds FADVISE_MODIFIABLE_BITS to filter modifiable
bits user given, 2) supports to clear {hot,cold}_file bits.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 5d3ce4f7 18-Jul-2018 Hyunchul Lee <cheol.lee@lge.com>

f2fs: avoid duplicated permission check for "trusted." xattrs

Because xattr_permission already checks CAP_SYS_ADMIN
capability, we don't need to check it.

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 4d57b86d 29-May-2018 Chao Yu <chao@kernel.org>

f2fs: clean up symbol namespace

As Ted reported:

"Hi, I was looking at f2fs's sources recently, and I noticed that there
is a very large number of non-static symbols which don't have a f2fs
prefix. There's well over a hundred (see attached below).

As one example, in fs/f2fs/dir.c there is:

unsigned char get_de_type(struct f2fs_dir_entry *de)

This function is clearly only useful for f2fs, but it has a generic
name. This means that if any other file system tries to have the same
symbol name, there will be a symbol conflict and the kernel would not
successfully build. It also means that when someone is looking f2fs
sources, it's not at all obvious whether a function such as
read_data_page(), invalidate_blocks(), is a generic kernel function
found in the fs, mm, or block layers, or a f2fs specific function.

You might want to fix this at some point. Hopefully Kent's bcachefs
isn't similarly using genericly named functions, since that might
cause conflicts with f2fs's functions --- but just as this would be a
problem that we would rightly insist that Kent fix, this is something
that we should have rightly insisted that f2fs should have fixed
before it was integrated into the mainline kernel.

acquire_orphan_inode
add_ino_entry
add_orphan_inode
allocate_data_block
allocate_new_segments
alloc_nid
alloc_nid_done
alloc_nid_failed
available_free_memory
...."

This patch adds "f2fs_" prefix for all non-static symbols in order to:
a) avoid conflict with other kernel generic symbols;
b) to indicate the function is f2fs specific one instead of generic
one;

Reported-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# b2c4692b 20-Jan-2018 Daeho Jeong <daeho.jeong@samsung.com>

f2fs: correct removexattr behavior for null valued extended attribute

__vfs_removexattr() transfers "NULL" value to the setxattr handler of
the f2fs filesystem in order to remove the extended attribute. But,
__f2fs_setxattr() just ignores the removal request when the value of
the extended attribute is already NULL. We have to remove the extended
attribute itself even if the value of that is already NULL.

We can reporduce this bug with the below:

1. touch file
2. setfattr -n "user.foo" file
3. setfattr -x "user.foo" file
4. getfattr -d file
> user.foo

Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
Tested-by: Hobin Woo <hobin.woo@samsung.com>
Tested-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# d620439f 28-Dec-2017 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix missing error number for xattr operation

This fixes generic/449 hang problem caused by no ENOSPC forever which should be
returned by setxattr under disk full scenario.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# acbf054d 30-Nov-2017 Chao Yu <chao@kernel.org>

f2fs: inject fault to kzalloc

This patch introduces f2fs_kzalloc based on f2fs_kmalloc in order to
support error injection for kzalloc().

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# bf9c1427 16-Oct-2017 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: handle error case when adding xattr entry

This patch fixes recovering incomplete xattr entries remaining in inline xattr
and xattr block, caused by any kind of errors.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 6afc662e 06-Sep-2017 Chao Yu <chao@kernel.org>

f2fs: support flexible inline xattr size

Now, in product, more and more features based on file encryption were
introduced, their demand of xattr space is increasing, however, inline
xattr has fixed-size of 200 bytes, once inline xattr space is full, new
increased xattr data would occupy additional xattr block which may bring
us more space usage and performance regression during persisting.

In order to resolve above issue, it's better to expand inline xattr size
flexibly according to user's requirement.

So this patch introduces new filesystem feature 'flexible inline xattr',
and new mount option 'inline_xattr_size=%u', once mkfs enables the
feature, we can use the option to make f2fs supporting flexible inline
xattr size.

To support this feature, we add extra attribute i_inline_xattr_size in
inode layout, indicating that how many space inline xattr borrows from
block address mapping space in inode layout, by this, we can easily
locate and store flexible-sized inline xattr data in inode.

Inode disk layout:
+----------------------+
| .i_mode |
| ... |
| .i_ext |
+----------------------+
| .i_extra_isize |
| .i_inline_xattr_size |-----------+
| ... | |
+----------------------+ |
| .i_addr | |
| - block address or | |
| - inline data | |
+----------------------+<---+ v
| inline xattr | +---inline xattr range
+----------------------+<---+
| .i_nid |
+----------------------+
| node_footer |
| (nid, ino, offset) |
+----------------------+

Note that, we have to cnosider backward compatibility which reserved
inline_data space, 200 bytes, all the time, reported by Sheng Yong.

Previous inline data or directory always reserved 200 bytes in inode layout,
even if inline_xattr is disabled. In order to keep inline_dentry's structure
for backward compatibility, we get the space back only from inline_data.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reported-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# d8d1389e 23-Oct-2017 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add missing quota_initialize

This patch adds to call quota_intialize in f2fs_set_acl, f2fs_unlink,
and f2fs_rename.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 9c77f754 19-Oct-2017 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove obsolete pointer for truncate_xattr_node

This patch removes obosolete parameter for truncate_xattr_node.

Suggested-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 63840695 04-Sep-2017 Chao Yu <chao@kernel.org>

f2fs: introduce read_xattr_block

Commit ba38c27eb93e ("f2fs: enhance lookup xattr") introduces
lookup_all_xattrs duplicating from read_all_xattrs, which leaves
lots of similar codes in between them, so introduce new help
read_xattr_block to clean up redundant codes.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# a5f433f7 04-Sep-2017 Chao Yu <chao@kernel.org>

f2fs: introduce read_inline_xattr

Commit ba38c27eb93e ("f2fs: enhance lookup xattr") introduces
lookup_all_xattrs duplicating from read_all_xattrs, which leaves
lots of similar codes in between them, so introduce new help
read_inline_xattr to clean up redundant codes.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 27161f13 06-Sep-2017 Yunlei He <heyunlei@huawei.com>

f2fs: avoid race in between read xattr & write xattr

Thread A: Thread B:
-f2fs_getxattr
-lookup_all_xattrs
-xnid = F2FS_I(inode)->i_xattr_nid;
-f2fs_setxattr
-__f2fs_setxattr
-write_all_xattrs
-truncate_xattr_node
... ...
-write_checkpoint
... ...
-alloc_nid <- nid reuse
-get_node_page
-f2fs_bug_on <- nid != node_footer->nid

It's need a rw_sem to avoid the race

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 5f4ce6ab 17-Jul-2017 Yunlei He <heyunlei@huawei.com>

f2fs: remove unused input parameter

This patch remove unused input parameter in function
new_node_page.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Yong Sheng <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 22588f87 22-Mar-2017 Chao Yu <chao@kernel.org>

f2fs: don't reserve additional space in xattr block

In this patch, we change xattr block disk layout as below:

Before:
xattr node block layout
+---------------------------------------------+---------------+-------------+
| node block xattr entries | reserved | node footer |
| 4068 Bytes | 4 Bytes | 24 Bytes |

In memory layout
+--------------------+---------------------------------+--------------------+
| inline xattr | node block xattr entries | reserved |
| 200 Bytes | 4068 Bytes | 4 Bytes |

After:
xattr node block layout
+-------------------------------------------------------------+-------------+
| node block xattr entries | node footer |
| 4072 Bytes | 24 Bytes |

In memory layout
+--------------------+---------------------------------+--------------------+
| inline xattr | node block xattr entries | reserved |
| 200 Bytes | 4072 Bytes | 4 Bytes |

With this change, we don't need to reserve additional space in node block,
just keep reserved space in logical in-memory layout. So that it would help
to enlarge valid free space of xattr node block.

As tested, generic/026 shows max stored xattr entires number increases from
531 to 532 when inline_xattr option is enabled.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 89e9eabd 22-Mar-2017 Chao Yu <chao@kernel.org>

f2fs: clean up xattr operation

1. don't allocate redundant memory in read_all_xattrs.
2. introduce RESERVED_XATTR_SIZE for cleanup.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# b71deadb 10-Mar-2017 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: le16_to_cpu for xattr->e_value_size

This patch fixes missing le16 conversion, reported by kbuild test robot.

Fixes: 5f35a2cd5 ("f2fs: Don't update the xattr data that same as the exist")
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 0ec4a5b6 25-Feb-2017 Kinglong Mee <kinglongmee@gmail.com>

f2fs: drop the duplicate pval in f2fs_getxattr

Fixes: ba38c27eb9 ("f2fs: enhance lookup xattr")
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 5f35a2cd 25-Feb-2017 Kinglong Mee <kinglongmee@gmail.com>

f2fs: Don't update the xattr data that same as the exist

f2fs removes the old xattr data and appends the new data although
the new data is same as the exist.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# d260081c 08-Feb-2017 Chao Yu <chao@kernel.org>

f2fs: change recovery policy of xattr node block

Currently, if we call fsync after updating the xattr date belongs to the
file, f2fs needs to trigger checkpoint to keep xattr data consistent. But,
this policy cause low performance as checkpoint will block most foreground
operations and cause unneeded and unrelated IOs around checkpoint.

This patch will reuse regular file recovery policy for xattr node block,
so, we change to write xattr node block tagged with fsync flag to warm
area instead of cold area, and during recovery, we search warm node chain
for fsynced xattr block, and do the recovery.

So, for below application IO pattern, performance can be improved
obviously:
- touch file
- create/update/delete xattr entry in file
- fsync file

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# ba38c27e 24-Jan-2017 Chao Yu <chao@kernel.org>

f2fs: enhance lookup xattr

Previously, in getxattr we will load all entries both in inline xattr and
xattr node block, and then do the lookup in all entries, but our lookup
flow shows low efficiency, since if we can lookup and hit in inline xattr
of inode page cache first, we don't need to load and lookup xattr node
block, which can obviously save cpu time and IO latency.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: initialize NULL to avoid warning]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 7c45729a 14-Oct-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: keep dirty inodes selectively for checkpoint

This is to avoid no free segment bug during checkpoint caused by a number of
dirty inodes.

The case was reported by Chao like this.
1. mount with lazytime option
2. fill 4k file until disk is full
3. sync filesystem
4. read all files in the image
5. umount

In this case, we actually don't need to flush dirty inode to inode page during
checkpoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# c2050a45 14-Sep-2016 Deepa Dinamani <deepa.kernel@gmail.com>

fs: Replace current_fs_time() with current_time()

current_fs_time() uses struct super_block* as an argument.
As per Linus's suggestion, this is changed to take struct
inode* as a parameter instead. This is because the function
is primarily meant for vfs inode timestamps.
Also the function was renamed as per Arnd's suggestion.

Change all calls to current_fs_time() to use the new
current_time() function instead. current_fs_time() will be
deleted.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 86696966 18-Sep-2016 Chao Yu <chao@kernel.org>

f2fs: fix to return error number of read_all_xattrs correctly

We treat all error in read_all_xattrs as a no memory error, which covers
the real reason of failure in it. Fix it by return correct errno in order
to reflect the real cause.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# bbf156f7 29-Aug-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix lost xattrs of directories

This patch enhances the xattr consistency of dirs from suddern power-cuts.

Possible scenario would be:
1. dir->setxattr used by per-file encryption
2. file->setxattr goes into inline_xattr
3. file->fsync

In that case, we should do checkpoint for #1.
Otherwise we'd lose dir's key information for the file given #2.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# b56ab837 30-Jun-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid mark_inode_dirty

Let's check inode's dirtiness before calling mark_inode_dirty.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# a0995af6 27-Jun-2016 Tiezhu Yang <yangtiezhu@loongson.cn>

f2fs: remove unnecessary goto statement

When base_addr is NULL, there is no need to call kzfree,
it should return -ENOMEM directly. Additionally, it is
better to initialize variable 'error' with 0.

Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# ee6d182f 20-May-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove syncing inode page in all the cases

This patch reduces to call them across the whole tree.
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()

Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that, this is doable, since we call mark_inode_dirty_sync() for all
inode's field change which needs to update on-disk inode as well.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 205b9822 20-May-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: call mark_inode_dirty_sync for i_field changes

This patch calls mark_inode_dirty_sync() for the following on-disk inode
changes.

-> largest
-> ctime/mtime/atime
-> i_current_depth
-> i_xattr_nid
-> i_pino
-> i_advise
-> i_flags
-> i_mode

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 91942321 20-May-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use inode pointer for {set, clear}_inode_flag

This patch refactors to use inode pointer for set_inode_flag and
clear_inode_flag.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 59301226 27-May-2016 Al Viro <viro@zeniv.linux.org.uk>

switch xattr_handler->set() to passing dentry and inode separately

preparation for similar switch in ->setxattr() (see the next commit for
rationale).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# e3bc808c 04-May-2016 Chao Yu <chao@kernel.org>

f2fs: remove unneeded memset when updating xattr

Each of fields in struct f2fs_xattr_entry will be assigned later,
so previously we don't need to memset the struct.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 58457f1c 12-Apr-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: give -E2BIG for no space in xattr

This patch returns -E2BIG if there is no space to add an xattr entry.
This should fix generic/026 in xfstests as well.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# b296821a 10-Apr-2016 Al Viro <viro@zeniv.linux.org.uk>

xattr_handler: pass dentry and inode as separate arguments of ->get()

... and do not assume they are already attached to each other

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# fec1d657 20-Jan-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use wait_for_stable_page to avoid contention

In write_begin, if storage supports stable_page, we don't need to wait for
writeback to update its contents.
This patch introduces to use wait_for_stable_page instead of
wait_on_page_writeback.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# d0239e1b 08-Jan-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: detect idle time depending on user behavior

This patch adds last time that user requested filesystem operations.
This information is used to detect whether system is idle or not later.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 2c4db1a6 07-Jan-2016 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clean up f2fs_balance_fs

This patch adds one parameter to clean up all the callers of f2fs_balance_fs.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 764a5c6b 02-Dec-2015 Andreas Gruenbacher <agruenba@redhat.com>

xattr handlers: Simplify list operation

Change the list operation to only return whether or not an attribute
should be listed. Copying the attribute names into the buffer is moved
to the callers.

Since the result only depends on the dentry and not on the attribute
name, we do not pass the attribute name to list operations.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 98e9cb57 02-Dec-2015 Andreas Gruenbacher <agruenba@redhat.com>

vfs: Distinguish between full xattr names and proper prefixes

Add an additional "name" field to struct xattr_handler. When the name
is set, the handler matches attributes with exactly that name. When the
prefix is set instead, the handler matches attributes with the given
prefix and with a non-empty suffix.

This patch should avoid bugs like the one fixed in commit c361016a in
the future.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 29608d20 04-Oct-2015 Andreas Gruenbacher <agruenba@redhat.com>

f2fs: xattr simplifications

Now that the xattr handler is passed to the xattr handler operations, we
have access to the attribute name prefix, so simplify
f2fs_xattr_generic_list.

Also, f2fs_xattr_advise_list is only ever called for
f2fs_xattr_advise_handler; there is no need to double check for that.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Changman Lee <cm224.lee@samsung.com>
Cc: Chao Yu <chao2.yu@samsung.com>
Cc: linux-f2fs-devel@lists.sourceforge.net
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# d9a82a04 04-Oct-2015 Andreas Gruenbacher <agruenba@redhat.com>

xattr handlers: Pass handler to operations instead of flags

The xattr_handler operations are currently all passed a file system
specific flags value which the operations can use to disambiguate between
different handlers; some file systems use that to distinguish the xattr
namespace, for example. In some oprations, it would be useful to also have
access to the handler prefix. To allow that, pass a pointer to the handler
to operations instead of the flags value alone.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 037fe70c 13-Jul-2015 Chao Yu <chao@kernel.org>

f2fs: correct return value of ->setxattr

This patch fixes to return correct error number of ->setxattr, which
is reported by xfstest tests/generic/026 as below:

generic/026 - output mismatch
--- tests/generic/026.out
+++ results/generic/026.out.bad
@@ -4,6 +4,6 @@
1 below acl max
acl max
1 above acl max
-chacl: cannot set access acl on "largeaclfile": Argument list too long
+chacl: cannot set access acl on "largeaclfile": Numerical result out of range
use 16 aces
use 17 aces
...
Ran: generic/026
Failures: generic/026
Failed 1 of 1 tests

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# f424f664 20-Apr-2015 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs crypto: add encryption policy and password salt support

This patch adds encryption policy and password salt support through ioctl
implementation.

It adds three ioctls:
F2FS_IOC_SET_ENCRYPTION_POLICY,
F2FS_IOC_GET_ENCRYPTION_POLICY,
F2FS_IOC_GET_ENCRYPTION_PWSALT, which use xattr operations.

Note that, these definition and codes are taken from ext4 crypto support.
For f2fs, xattr operations and on-disk flags for superblock and inode were
changed.

Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ildar Muslukhov <muslukhovi@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 2b0143b5 17-Mar-2015 David Howells <dhowells@redhat.com>

VFS: normal filesystems (and lustre): d_inode() annotations

that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 30c62fdb 22-Mar-2015 Chao Yu <chao@kernel.org>

f2fs: persist system.advise into on-disk inode

This patch fixes to dirty inode for persisting i_advise of f2fs inode info into
on-disk inode if user sets system.advise through setxattr. Otherwise the new
value will be lost.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 84e97c27 22-Mar-2015 Chao Yu <chao@kernel.org>

f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get

We will encounter oops by executing below command.
getfattr -n system.advise /mnt/f2fs/file
Killed

message log:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs]
*pdpt = 00000000319b7001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
Modules linked in: f2fs(O) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev
snd_seq_device snd_timer bnep snd rfcomm microcode bluetooth soundcore i2c_piix4 mac_hid serio_raw parport_pc ppdev lp parport
binfmt_misc hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs]
CPU: 3 PID: 3134 Comm: getfattr Tainted: G O 4.0.0-rc1 #6
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: f3a71b60 ti: f19a6000 task.ti: f19a6000
EIP: 0060:[<f8b54d69>] EFLAGS: 00010246 CPU: 3
EIP is at f2fs_xattr_advise_get+0x29/0x40 [f2fs]
EAX: 00000000 EBX: f19a7e71 ECX: 00000000 EDX: f8b5b467
ESI: 00000000 EDI: f2008570 EBP: f19a7e14 ESP: f19a7e08
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 00000000 CR3: 319b8000 CR4: 000007f0
Stack:
f8b5a634 c0cbb580 00000000 f19a7e34 c1193850 00000000 00000007 f19a7e71
f19a7e64 c0cbb580 c1193810 f19a7e50 c1193c00 00000000 00000000 00000000
c0cbb580 00000000 f19a7f70 c1194097 00000000 00000000 00000000 74737973
Call Trace:
[<c1193850>] generic_getxattr+0x40/0x50
[<c1193810>] ? xattr_resolve_name+0x80/0x80
[<c1193c00>] vfs_getxattr+0x70/0xa0
[<c1194097>] getxattr+0x87/0x190
[<c11801d7>] ? path_lookupat+0x57/0x5f0
[<c11819d2>] ? putname+0x32/0x50
[<c116653a>] ? kmem_cache_alloc+0x2a/0x130
[<c11819d2>] ? putname+0x32/0x50
[<c11819d2>] ? putname+0x32/0x50
[<c11819d2>] ? putname+0x32/0x50
[<c11827f9>] ? user_path_at_empty+0x49/0x70
[<c118283f>] ? user_path_at+0x1f/0x30
[<c11941e7>] path_getxattr+0x47/0x80
[<c11948e7>] SyS_getxattr+0x27/0x30
[<c163f748>] sysenter_do_call+0x12/0x12
Code: 66 90 55 89 e5 57 56 53 66 66 66 66 90 8b 78 20 89 d3 ba 67 b4 b5 f8 89 d8 89 ce e8 42 7c 7b c8 85 c0 75 16 0f b6 87 44 01 00
00 <88> 06 b8 01 00 00 00 5b 5e 5f 5d c3 8d 76 00 b8 ea ff ff ff eb
EIP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] SS:ESP 0068:f19a7e08
CR2: 0000000000000000
---[ end trace 860260654f1f416a ]---

The reason is that in getfattr there are two steps which is indicated by strace info:
1) try to lookup and get size of specified xattr.
2) get value of the extented attribute.

strace info:
getxattr("/mnt/f2fs/file", "system.advise", 0x0, 0) = 1
getxattr("/mnt/f2fs/file", "system.advise", "\x00", 256) = 1

For the first step, getfattr may pass a NULL pointer in @value and zero in @size
as parameters for ->getxattr, but we access this @value pointer directly without
checking whether the pointer is valid or not in f2fs_xattr_advise_get, so the
oops occurs.

This patch fixes this issue by verifying @value pointer before using.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# bce8d112 13-Oct-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid deadlock on init_inode_metadata

Previously, init_inode_metadata does not hold any parent directory's inode
page. So, f2fs_init_acl can grab its parent inode page without any problem.
But, when we use inline_dentry, that page is grabbed during f2fs_add_link,
so that we can fall into deadlock condition like below.

INFO: task mknod:11006 blocked for more than 120 seconds.
Tainted: G OE 3.17.0-rc1+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mknod D ffff88003fc94580 0 11006 11004 0x00000000
ffff880007717b10 0000000000000002 ffff88003c323220 ffff880007717fd8
0000000000014580 0000000000014580 ffff88003daecb30 ffff88003c323220
ffff88003fc94e80 ffff88003ffbb4e8 ffff880007717ba0 0000000000000002
Call Trace:
[<ffffffff8173dc40>] ? bit_wait+0x50/0x50
[<ffffffff8173d4cd>] io_schedule+0x9d/0x130
[<ffffffff8173dc6c>] bit_wait_io+0x2c/0x50
[<ffffffff8173da3b>] __wait_on_bit_lock+0x4b/0xb0
[<ffffffff811640a7>] __lock_page+0x67/0x70
[<ffffffff810acf50>] ? autoremove_wake_function+0x40/0x40
[<ffffffff811652cc>] pagecache_get_page+0x14c/0x1e0
[<ffffffffa029afa9>] get_node_page+0x59/0x130 [f2fs]
[<ffffffffa02a63ad>] read_all_xattrs+0x24d/0x430 [f2fs]
[<ffffffffa02a6ca2>] f2fs_getxattr+0x52/0xe0 [f2fs]
[<ffffffffa02a7481>] f2fs_get_acl+0x41/0x2d0 [f2fs]
[<ffffffff8122d847>] get_acl+0x47/0x70
[<ffffffff8122db5a>] posix_acl_create+0x5a/0x150
[<ffffffffa02a7759>] f2fs_init_acl+0x29/0xcb [f2fs]
[<ffffffffa0286a8d>] init_inode_metadata+0x5d/0x340 [f2fs]
[<ffffffffa029253a>] f2fs_add_inline_entry+0x12a/0x2e0 [f2fs]
[<ffffffffa0286ea5>] __f2fs_add_link+0x45/0x4a0 [f2fs]
[<ffffffffa028b5b6>] ? f2fs_new_inode+0x146/0x220 [f2fs]
[<ffffffffa028b816>] f2fs_mknod+0x86/0xf0 [f2fs]
[<ffffffff811e3ec1>] vfs_mknod+0xe1/0x160
[<ffffffff811e4b26>] SyS_mknod+0x1f6/0x200
[<ffffffff81741d7f>] tracesys+0xe1/0xe6

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 9850cf4a 02-Sep-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: need fsck.f2fs when f2fs_bug_on is triggered

If any f2fs_bug_on is triggered, fsck.f2fs is needed.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 4081363f 02-Sep-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce F2FS_I_SB, F2FS_M_SB, and F2FS_P_SB

This patch adds three inline functions to clean up dirty casting codes.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# e1c42045 06-Aug-2014 arter97 <qkrwngud825@gmail.com>

f2fs: fix typo

Fix typo and some grammatical errors.

The words "filesystem" and "readahead" are being used without the space treewide.

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# d631abda 01-Jun-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix recursive lock by f2fs_setxattr

This patch should resolve the following recursive lock.

[<ffffffff8135a9c3>] call_rwsem_down_write_failed+0x13/0x20
[<ffffffffa01749dc>] f2fs_setxattr+0x5c/0xa0 [f2fs]
[<ffffffffa0174c99>] __f2fs_set_acl+0x1b9/0x340 [f2fs]
[<ffffffffa017515a>] f2fs_init_acl+0x4a/0xcb [f2fs]
[<ffffffffa0159abe>] __f2fs_add_link+0x26e/0x780 [f2fs]
[<ffffffffa015d4d8>] f2fs_mkdir+0xb8/0x150 [f2fs]
[<ffffffff811cebd7>] vfs_mkdir+0xb7/0x160
[<ffffffff811cf89b>] SyS_mkdir+0xab/0xe0
[<ffffffff817244bf>] tracesys+0xe1/0xe6
[<ffffffffffffffff>] 0xffffffffffffffff

The call path indicates:
- f2fs_add_link
: down_write(&fi->i_sem);

- init_inode_metadata
- f2fs_init_acl
- __f2fs_set_acl
- f2fs_setxattr
: down_write(&fi->i_sem);

Here we should not call f2fs_setxattr, but __f2fs_setxattr.
But __f2fs_setxattr is a static function in xattr.c, so that I found the other
generic approach to use f2fs_setxattr.

In f2fs_setxattr, the page pointer is only given from init_inode_metadata.
So, this patch adds this condition to avoid this in f2fs_setxattr.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>


# 54b591df 29-Apr-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: split grab_cache_page and wait_on_page_writeback for node pages

This patch splits grab_cache_page_write_begin into grab_cache_page and
wait_on_page_writeback for node pages.

This patch intends to enhance the latency to get node pages by alleviating
unnecessary wait_on_page_writeback.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 916decbf 22-Apr-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: return errors right after checking them

This patch adds two error conditions early in the setxattr operations.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# c02745ef 22-Apr-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: pass flags field to setxattr functions

This patch passes the "flags" field to the low level setxattr functions
to use XATTR_REPLACE in the following patches.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# e1123268 22-Apr-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clean up long variable names

This patch includes simple clean-ups to reduce unnecessary long variable names.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 6e452d69 22-Mar-2014 Chao Yu <chao@kernel.org>

f2fs: avoid unneeded lookup when xattr name length is too long

In f2fs_setxattr we have limit this attribute name length, so we should also
check it in f2fs_getxattr to avoid useless lookup caused by invalid name length.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 808a1d74 20-Mar-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid RECLAIM_FS-ON-W warning

This patch should resolve the following possible bug.

RECLAIM_FS-ON-W at:
mark_held_locks+0xb9/0x140
lockdep_trace_alloc+0x85/0xf0
__kmalloc+0x53/0x1d0
read_all_xattrs+0x3d1/0x3f0 [f2fs]
f2fs_getxattr+0x4f/0x100 [f2fs]
f2fs_get_acl+0x4c/0x290 [f2fs]
get_acl+0x4f/0x80
posix_acl_create+0x72/0x180
f2fs_init_acl+0x29/0xcc [f2fs]
__f2fs_add_link+0x259/0x710 [f2fs]
f2fs_create+0xad/0x1c0 [f2fs]
vfs_create+0xed/0x150
do_last+0xd36/0xed0
path_openat+0xc5/0x680
do_filp_open+0x43/0xa0
do_sys_open+0x13c/0x230
SyS_creat+0x1e/0x20
system_call_fastpath+0x16/0x1b

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# d928bfbf 20-Mar-2014 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce fi->i_sem to protect fi's info

This patch introduces fi->i_sem to protect fi's info that includes xattr_ver,
pino, i_nlink.
This enables to remove i_mutex during f2fs_sync_file, resulting in performance
improvement when a number of fsync calls are triggered from many concurrent
threads.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# a6dda0e6 20-Dec-2013 Christoph Hellwig <hch@infradead.org>

f2fs: use generic posix ACL infrastructure

f2fs has some weird mode bit handling, so still using the old
chmod code for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 6bacf52f 05-Dec-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add unlikely() macro for compiler more aggressively

This patch adds unlikely() macro into the most of codes.
The basic rule is to add that when:
- checking unusual errors,
- checking page mappings,
- and the other unlikely conditions.

Change log from v1:
- Don't add unlikely for the NULL test and error test: advised by Andi Kleen.

Cc: Chao Yu <chao2.yu@samsung.com>
Cc: Andi Kleen <andi@firstfloor.org>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# cc3de6a3 29-Oct-2013 Chao Yu <chao@kernel.org>

f2fs: fix calculating incorrect free size when update xattr in __f2fs_setxattr

During xattr updating, free size should be corrected to remainder free size
+ old entry size.
It can avoid ENOSPC error when we update old entry with the same size new
entry at fully filled xattr.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 5d56b671 29-Oct-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add an option to avoid unnecessary BUG_ONs

If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS
in your kernel config.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# e479556b 27-Sep-2013 Gu Zheng <guz.fnst@cn.fujitsu.com>

f2fs: use rw_sem instead of fs_lock(locks mutex)

The fs_locks is used to block other ops(ex, recovery) when doing checkpoint.
And each other operate routine(besides checkpoint) needs to acquire a fs_lock,
there is a terrible problem here, if these are too many concurrency threads acquiring
fs_lock, so that they will block each other and may lead to some performance problem,
but this is not the phenomenon we want to see.
Though there are some optimization patches introduced to enhance the usage of fs_lock,
but the thorough solution is using a *rw_sem* to replace the fs_lock.
Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block
other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other,
this can avoid the problem described above completely.
Because of the weakness of rw_sem, the above change may introduce a potential problem
that the checkpoint thread might get starved if other threads are intensively locking
the read semaphore for I/O.(Pointed out by Xu Jin)
In order to avoid this, a wait_list is introduced, the appending read semaphore ops
will be dropped into the wait_list if checkpoint thread is waiting for write semaphore,
and will be waked up when checkpoint thread gives up write semaphore.
Thanks to Kim's previous review and test, and will be very glad to see other guys'
performance tests about this patch.

V2:
-fix the potential starvation problem.
-use more suitable func name suggested by Xu Jin.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: adjust minor coding standard]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 52ab9560 24-Sep-2013 Russ Knize <Russ.Knize@motorola.com>

f2fs: don't GC or take an fs_lock from f2fs_initxattrs()

f2fs_initxattrs() is called internally from within F2FS and should
not call functions that are used by VFS handlers. This avoids
certain deadlocks:

- vfs_create()
- f2fs_create() <-- takes an fs_lock
- f2fs_add_link()
- __f2fs_add_link()
- init_inode_metadata()
- f2fs_init_security()
- security_inode_init_security()
- f2fs_initxattrs()
- f2fs_setxattr() <-- also takes an fs_lock

If the caller happens to grab the same fs_lock from the pool in both
places, they will deadlock. There are also deadlocks involving
multiple threads and mutexes:

- f2fs_write_begin()
- f2fs_balance_fs() <-- takes gc_mutex
- f2fs_gc()
- write_checkpoint()
- block_operations()
- mutex_lock_all() <-- blocks trying to grab all fs_locks

- f2fs_mkdir() <-- takes an fs_lock
- __f2fs_add_link()
- f2fs_init_security()
- security_inode_init_security()
- f2fs_initxattrs()
- f2fs_setxattr()
- f2fs_balance_fs() <-- blocks trying to take gc_mutex

Signed-off-by: Russ Knize <Russ.Knize@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 65985d93 14-Aug-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support the inline xattrs

0. modified inode structure
--------------------------------------
metadata (e.g., i_mtime, i_ctime, etc)
--------------------------------------
direct pointers [0 ~ 873]

inline xattrs (200 bytes by default)

indirect pointers [0 ~ 4]
--------------------------------------
node footer
--------------------------------------

1. setxattr flow
- read_all_xattrs copies all the xattrs from inline and xattr node block.
- handle xattr entries
- write_all_xattrs copies modified xattrs into inline and xattr node block.

2. getxattr flow
- read_all_xattrs copies all the xattrs from inline and xattr node block.
- check target entries

3. Usage
# mount -t f2fs -o inline_xattr $DEV $MNT

Once mounted with the inline_xattr option, f2fs marks all the newly created
files to reserve an amount of inline xattr space explicitly inside the inode
block. Without the mount option, f2fs will not touch any existing files and
newly created files as well.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# dd9cfe23 12-Aug-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce __find_xattr for readability

The __find_xattr is to search the wanted xattr entry starting from the
base_addr.

If not found, the returned entry is the last empty xattr entry that can be
allocated newly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 479bd73a 12-Aug-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: should cover i_xattr_nid with its xattr node page lock

Previously, f2fs_setxattr assigns i_xattr_nid in the inode page inconsistently.

The scenario is:

= Thread 1 = = Thread 2 = = fi->i_xattr_nid = = on-disk nid =

f2fs_setxattr 0 0
new_node_page X 0
sync_inode_page X X
checkpoint X X -.
grab_cache_page X X |
--> allocate a new xattr node block or -ENOSPC <----------------'

At this moment, the checkpoint stores inconsistent data where the inode has
i_xattr_nid but actual xattr node block is not allocated yet.

So, we should assign the real i_xattr_nid only after its xattr node block is
allocated.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# d71b5564 09-Aug-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce cur_cp_version function to reduce code size

This patch introduces a new inline function, cur_cp_version, to reduce redundant
codes.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# e518ff81 08-Aug-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix inconsistency between xattr node blocks and its inode

Previously xattr node blocks are stored to the COLD_NODE log, which means that
our roll-forward mechanism doesn't recover the xattr node blocks at all.
Only the direct node blocks in the WARM_NODE log can be recovered.

So, let's resolve the issue simply by conducting checkpoint during fsync when a
file has a modified xattr node block.

This approach is able to degrade the performance, but normally the checkpoint
overhead is shown at the initial fsync call after the xattr entry changes.
Once the checkpoint is done, no additional overhead would be occurred.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 8ae8f162 03-Jun-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: support xattr security labels

This patch adds the support of security labels for f2fs, which will be used
by Linus Security Models (LSMs).

Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules:
"Linux Security Modules (LSM) is a framework that allows the Linux kernel to
support a variety of computer security models while avoiding favoritism toward
any single security implementation. The framework is licensed under the terms of
the GNU General Public License and is standard part of the Linux kernel since
Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted
modules in the official kernel.".

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 1e03e38b 30-May-2013 Jason Hrycay <jhrycay@gmail.com>

f2fs: handle errors from get_node_page calls

Add check for error pointers returned from get_node_page in order to
avoid dereferencing a bad address on the next use.

Signed-off-by: Jason Hrycay <jason.hrycay@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 39936837 22-Nov-2012 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce a new global lock scheme

In the previous version, f2fs uses global locks according to the usage types,
such as directory operations, block allocation, block write, and so on.

Reference the following lock types in f2fs.h.
enum lock_type {
RENAME, /* for renaming operations */
DENTRY_OPS, /* for directory operations */
DATA_WRITE, /* for data write */
DATA_NEW, /* for data allocation */
DATA_TRUNC, /* for data truncate */
NODE_NEW, /* for node allocation */
NODE_TRUNC, /* for node truncate */
NODE_WRITE, /* for node write */
NR_LOCK_TYPE,
};

In that case, we lose the performance under the multi-threading environment,
since every types of operations must be conducted one at a time.

In order to address the problem, let's share the locks globally with a mutex
array regardless of any types.
So, let users grab a mutex and perform their jobs in parallel as much as
possbile.

For this, I propose a new global lock scheme as follows.

0. Data structure
- f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
- f2fs_sb_info -> node_write

1. mutex_lock_op(sbi)
- try to get an avaiable lock from the array.
- returns the index of the gottern lock variable.

2. mutex_unlock_op(sbi, index of the lock)
- unlock the given index of the lock.

3. mutex_lock_all(sbi)
- grab all the locks in the array before the checkpoint.

4. mutex_unlock_all(sbi)
- release all the locks in the array after checkpoint.

5. block_operations()
- call mutex_lock_all()
- sync_dirty_dir_inodes()
- grab node_write
- sync_node_pages()

Note that,
the pairs of mutex_lock_op()/mutex_unlock_op() and
mutex_lock_all()/mutex_unlock_all() should be used together.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 7c909772 17-Mar-2013 Namjae Jeon <namjae.jeon@samsung.com>

f2fs: reorganize f2fs_setxattr

make use of F2FS_NAME_LEN for name length checking,
change return conditions at few places, by assigning
storing the errorvalue in 'error' and making a common
exit path.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 7d82db83 10-Jan-2013 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add f2fs_balance_fs in several interfaces

The f2fs_balance_fs() is to check the number of free sections and decide whether
it needs to conduct cleaning or not. If there are not enough free sections, the
cleaning job should be started.

In order to control an amount of free sections even under high utilization, f2fs
should call f2fs_balance_fs at all the VFS interfaces that are able to produce
dirty pages.
This patch adds the function calls in the missing interfaces as follows.

1. f2fs_setxattr()
The f2fs_setxattr() produces dirty node pages so that we should call
f2fs_balance_fs() either likewise doing in other VFS interfaces such as
f2fs_lookup(), f2fs_mkdir(), and so on.

2. f2fs_sync_file()
We should guarantee serving free sections for syncing metadata during fsync.
Previously, there is no space check before triggering checkpoint and
sync_node_pages.
Therefore, if a bunch of fsync calls are triggered under 100% of FS utilization,
f2fs is able to be faced with no free sections, resulting in BUG_ON().

3. f2fs_sync_fs()
Before calling write_checkpoint(), we should guarantee that there are minimum
free sections.

4. f2fs_write_inode()
f2fs_write_inode() is also able to produce dirty node pages.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 9836b8b9 27-Dec-2012 Leon Romanovsky <leon@kernel.org>

f2fs: unify string length declarations and usage

This patch is intended to unify string length declarations and usage.
There are number of calls to strlen which return size_t object.
The size of this object depends on compiler if it will be bigger,
equal or even smaller than an unsigned int

Signed-off-by: Leon Romanovsky <leon@leon.nu>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# 573ea5fc 30-Nov-2012 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: resolve build failures

There exist two build failures reported by Randy Dunlap as follows.

(on i386)
a. (config-r8857)
ERROR: "f2fs_xattr_advise_handler" [fs/f2fs/f2fs.ko] undefined!

Key configs in (config-r8857) are as follows.
CONFIG_F2FS_FS=m
# CONFIG_F2FS_STAT_FS is not set
CONFIG_F2FS_FS_XATTR=y
# CONFIG_F2FS_FS_POSIX_ACL is not set

The error was occurred due to the function location that we made a mistake.
Recently we added a new functionality for users to indicate cold files
explicitly through xattr operations (i.e., f2fs_xattr_advise_handler).

This handler should have been added in xattr.c instead of acl.c in order
to avoid an undefined operation like in this case where XATTR is set and
ACL is not set.

b. (config-r8855)
fs/f2fs/file.c: In function 'f2fs_vm_page_mkwrite':
fs/f2fs/file.c:97:2: error: implicit declaration of function
'block_page_mkwrite_return'

Key config in (config-r8855) is CONFIG_BLOCK.

Obviously, f2fs works on top of the block device so that we should consider
carefully a sort of config dependencies.

The reason why this error was occurred was that f2fs_vm_page_mkwrite() calls
block_page_mkwrite_return() which is enalbed only if CONFIG_BLOCK is set.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>


# 0a8165d7 28-Nov-2012 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: adjust kernel coding style

As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment
blocks. Instead, just use "/*".

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>


# af48b85b 02-Nov-2012 Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add xattr and acl functionalities

This implements xattr and acl functionalities.

- F2FS uses a node page to contain use extended attributes.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>