Searched hist:592 (Results 276 - 300 of 377) sorted by relevance

<<111213141516

/linux-master/tools/perf/Documentation/
H A Dperf-stat.txtdiff 430daf2d Mon Mar 20 14:17:00 MDT 2017 Andi Kleen <ak@linux.intel.com> perf stat: Collapse identically named events

The uncore PMU has a lot of duplicated PMUs for different subsystems.
When expanding an uncore alias we usually end up with a large
number of identically named aliases, which makes perf stat
output difficult to read.

Automatically sum them up in perf stat, unless --no-merge is specified.

This can be default because only the uncores generally have duplicated
aliases. Other PMUs have unique names.

Before:

% perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1

Performance counter stats for 'system wide':

694,976 Bytes unc_c_llc_lookup.any
706,304 Bytes unc_c_llc_lookup.any
956,608 Bytes unc_c_llc_lookup.any
782,720 Bytes unc_c_llc_lookup.any
605,696 Bytes unc_c_llc_lookup.any
442,816 Bytes unc_c_llc_lookup.any
659,328 Bytes unc_c_llc_lookup.any
509,312 Bytes unc_c_llc_lookup.any
263,936 Bytes unc_c_llc_lookup.any
592,448 Bytes unc_c_llc_lookup.any
672,448 Bytes unc_c_llc_lookup.any
608,640 Bytes unc_c_llc_lookup.any
641,024 Bytes unc_c_llc_lookup.any
856,896 Bytes unc_c_llc_lookup.any
808,832 Bytes unc_c_llc_lookup.any
684,864 Bytes unc_c_llc_lookup.any
710,464 Bytes unc_c_llc_lookup.any
538,304 Bytes unc_c_llc_lookup.any

1.002577660 seconds time elapsed

After:

% perf stat -a -e unc_c_llc_lookup.any sleep 1

Performance counter stats for 'system wide':

2,685,120 Bytes unc_c_llc_lookup.any

1.002648032 seconds time elapsed

v2: Split collect_aliases. Rename alias flag.
v3: Make sure unsupported/not counted is always printed.
v4: Factor out callback change into separate patch.
v5: Move check for bad results here
Move merged check into collect_data

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
/linux-master/drivers/usb/core/
H A Ddriver.cdiff 592fbbe4 Tue Sep 19 08:08:43 MDT 2006 Alan Stern <stern@rowland.harvard.edu> USB: fix root-hub resume when CONFIG_USB_SUSPEND is not set

This patch (as786) removes a redundant test and fixes a problem
involving repeated system sleeps when CONFIG_USB_SUSPEND is not set.
During the first wakeup, the root hub's dev.power.power_state.event
field doesn't get updated, causing it not to be suspended during the
second sleep transition.

This takes care of the issue raised by Rafael J. Wysocki and Mattia
Dongili.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
/linux-master/drivers/scsi/qla2xxx/
H A Dqla_iocb.cdiff 134f6695 Mon Jun 26 04:58:03 MDT 2023 Dan Carpenter <dan.carpenter@linaro.org> scsi: qla2xxx: Silence a static checker warning

Smatch and Clang both complain that LOGIN_TEMPLATE_SIZE is more than
sizeof(ha->plogi_els_payld.fl_csp).

Smatch warning:
drivers/scsi/qla2xxx/qla_iocb.c:3075 qla24xx_els_dcmd2_iocb()
warn: '&ha->plogi_els_payld.fl_csp' sometimes too small '16' size = 112

Clang warning:
include/linux/fortify-string.h:592:4: error: call to
'__read_overflow2_field' declared with 'warning' attribute: detected
read beyond size of field (2nd parameter); maybe use struct_group()?
[-Werror,-Wattribute-warning]
__read_overflow2_field(q_size_field, size);

When I was reading this code I assumed the "- 4" meant that we were
skipping the last 4 bytes but actually it turned out that we are
skipping the first four bytes.

I have re-written it remove the magic numbers, be more clear and
silence the static checker warnings.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/4aa0485e-766f-4b02-8d5d-c6781ea8f511@moroto.mountain
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
/linux-master/fs/xfs/
H A Dxfs_trans.hdiff 1998764e Tue Jun 27 00:12:40 MDT 2006 Alexey Dobriyan <adobriyan@gmail.com> [XFS] Reduce size of xfs_trans_t structure. * remove ->t_forw, ->t_back --
unused * ->t_ag_freeblks_delta, ->t_ag_flist_delta, ->t_ag_btree_delta
are debugging aid -- wrap them in everyone's favourite way. As a
result, cut "xfs_trans" slab object size from 592 to 572 bytes here.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26319a

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
H A Dxfs_super.cdiff 4887e531 Mon Apr 22 00:13:16 MDT 2024 Christoph Hellwig <hch@lst.de> xfs: compile out v4 support if disabled

Add a few strategic IS_ENABLED statements to let the compiler eliminate
unused code when CONFIG_XFS_SUPPORT_V4 is disabled.

This saves multiple kilobytes of .text in my .config:

$ size xfs.o.*
text data bss dec hex filename
1363633 294836 592 1659061 1950b5 xfs.o.new
1371453 294868 592 1666913 196f61 xfs.o.old

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
diff 4887e531 Mon Apr 22 00:13:16 MDT 2024 Christoph Hellwig <hch@lst.de> xfs: compile out v4 support if disabled

Add a few strategic IS_ENABLED statements to let the compiler eliminate
unused code when CONFIG_XFS_SUPPORT_V4 is disabled.

This saves multiple kilobytes of .text in my .config:

$ size xfs.o.*
text data bss dec hex filename
1363633 294836 592 1659061 1950b5 xfs.o.new
1371453 294868 592 1666913 196f61 xfs.o.old

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
/linux-master/arch/x86/boot/compressed/
H A Dhead_64.Sdiff 592a9b8c Tue May 14 02:50:42 MDT 2013 Zhang Yanfei <zhangyanfei@cn.fujitsu.com> x86/mm: Drop unneeded include <asm/*pgtable, page*_types.h>

arch/x86/boot/compressed/head_64.S includes <asm/pgtable_types.h> and
<asm/page_types.h> but it doesn't look like it needs them. So remove them.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/5191FAE2.4020403@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
/linux-master/drivers/net/ethernet/broadcom/bnx2x/
H A Dbnx2x_cmn.cdiff 592b9b8d Thu Jun 25 06:19:29 MDT 2015 Yuval Mintz <Yuval.Mintz@qlogic.com> bnx2x: Fix linearization for encapsulated packets

Due to FW constraints, driver must make sure that transmitted SKBs will
not be too fragmented, or in the case that they are - that each 'window'
of fragments passed to the FW would contain at least an mss worth of data.

For encapsultaed packets the calculation is wrong, since it ignores the
inner headers in the calculation of the headers' length.
This could lead to a FW assertion in case of a too-fragmented encapsulated
packet.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
/linux-master/net/openvswitch/
H A Ddatapath.cdiff 494bea39 Tue Aug 15 23:30:07 MDT 2017 Liping Zhang <zlpnobody@gmail.com> openvswitch: fix skb_panic due to the incorrect actions attrlen

For sw_flow_actions, the actions_len only represents the kernel part's
size, and when we dump the actions to the userspace, we will do the
convertions, so it's true size may become bigger than the actions_len.

But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use the actions_len
to alloc the skbuff, so the user_skb's size may become insufficient and
oops will happen like this:
skbuff: skb_over_panic: text:ffffffff8148fabf len:1749 put:157 head:
ffff881300f39000 data:ffff881300f39000 tail:0x6d5 end:0x6c0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
<IRQ>
[<ffffffff8148be82>] skb_put+0x43/0x44
[<ffffffff8148fabf>] skb_zerocopy+0x6c/0x1f4
[<ffffffffa0290d36>] queue_userspace_packet+0x3a3/0x448 [openvswitch]
[<ffffffffa0292023>] ovs_dp_upcall+0x30/0x5c [openvswitch]
[<ffffffffa028d435>] output_userspace+0x132/0x158 [openvswitch]
[<ffffffffa01e6890>] ? ip6_rcv_finish+0x74/0x77 [ipv6]
[<ffffffffa028e277>] do_execute_actions+0xcc1/0xdc8 [openvswitch]
[<ffffffffa028e3f2>] ovs_execute_actions+0x74/0x106 [openvswitch]
[<ffffffffa0292130>] ovs_dp_process_packet+0xe1/0xfd [openvswitch]
[<ffffffffa0292b77>] ? key_extract+0x63c/0x8d5 [openvswitch]
[<ffffffffa029848b>] ovs_vport_receive+0xa1/0xc3 [openvswitch]
[...]

Also we can find that the actions_len is much little than the orig_len:
crash> struct sw_flow_actions 0xffff8812f539d000
struct sw_flow_actions {
rcu = {
next = 0xffff8812f5398800,
func = 0xffffe3b00035db32
},
orig_len = 1384,
actions_len = 592,
actions = 0xffff8812f539d01c
}

So as a quick fix, use the orig_len instead of the actions_len to alloc
the user_skb.

Last, this oops happened on our system running a relative old kernel, but
the same risk still exists on the mainline, since we use the wrong
actions_len from the beginning.

Fixes: ccea74457bbd ("openvswitch: include datapath actions with sampled-packet upcall to userspace")
Cc: Neil McKee <neil.mckee@inmon.com>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
/linux-master/arch/powerpc/kvm/
H A Dbook3s_pr.cdiff 592f5d87 Tue Mar 13 16:31:02 MDT 2012 Alexander Graf <agraf@suse.de> KVM: PPC: Book3S: PR: Fix preemption

We were leaking preemption counters. Fix the code to always toggle
between preempt and non-preempt properly.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
/linux-master/fs/f2fs/
H A Drecovery.cdiff cad83c96 Fri May 07 04:10:38 MDT 2021 Chao Yu <chao@kernel.org> f2fs: fix to avoid racing on fsync_entry_slab by multi filesystem instances

As syzbot reported, there is an use-after-free issue during f2fs recovery:

Use-after-free write at 0xffff88823bc16040 (in kfence-#10):
kmem_cache_destroy+0x1f/0x120 mm/slab_common.c:486
f2fs_recover_fsync_data+0x75b0/0x8380 fs/f2fs/recovery.c:869
f2fs_fill_super+0x9393/0xa420 fs/f2fs/super.c:3945
mount_bdev+0x26c/0x3a0 fs/super.c:1367
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x86/0x270 fs/super.c:1497
do_new_mount fs/namespace.c:2905 [inline]
path_mount+0x196f/0x2be0 fs/namespace.c:3235
do_mount fs/namespace.c:3248 [inline]
__do_sys_mount fs/namespace.c:3456 [inline]
__se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433
do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
entry_SYSCALL_64_after_hwframe+0x44/0xae

The root cause is multi f2fs filesystem instances can race on accessing
global fsync_entry_slab pointer, result in use-after-free issue of slab
cache, fixes to init/destroy this slab cache only once during module
init/destroy procedure to avoid this issue.

Reported-by: syzbot+9d90dad32dd9727ed084@syzkaller.appspotmail.com
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
H A Dcheckpoint.cdiff 6a257471 Mon Sep 28 19:23:12 MDT 2020 Chao Yu <chao@kernel.org> f2fs: fix to check segment boundary during SIT page readahead

As syzbot reported:

kernel BUG at fs/f2fs/segment.h:657!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 16220 Comm: syz-executor.0 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:f2fs_ra_meta_pages+0xa51/0xdc0 fs/f2fs/segment.h:657
Call Trace:
build_sit_entries fs/f2fs/segment.c:4195 [inline]
f2fs_build_segment_manager+0x4b8a/0xa3c0 fs/f2fs/segment.c:4779
f2fs_fill_super+0x377d/0x6b80 fs/f2fs/super.c:3633
mount_bdev+0x32e/0x3f0 fs/super.c:1417
legacy_get_tree+0x105/0x220 fs/fs_context.c:592
vfs_get_tree+0x89/0x2f0 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x1387/0x2070 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount fs/namespace.c:3390 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3390
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

@blkno in f2fs_ra_meta_pages could exceed max segment count, causing panic
in following sanity check in current_sit_addr(), add check condition to
avoid this issue.

Reported-by: syzbot+3698081bcf0bb2d12174@syzkaller.appspotmail.com
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
diff 29e7043f Tue Feb 10 17:23:12 MST 2015 Jaegeuk Kim <jaegeuk@kernel.org> f2fs: fix sparse warnings

This patch resolves the following warnings.

include/trace/events/f2fs.h:150:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:180:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:150:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
include/trace/events/f2fs.h:180:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)

fs/f2fs/checkpoint.c:27:19: warning: symbol 'inode_entry_slab' was not declared. Should it be static?
fs/f2fs/checkpoint.c:577:15: warning: cast to restricted __le32
fs/f2fs/checkpoint.c:592:15: warning: cast to restricted __le32

fs/f2fs/trace.c:19:1: warning: symbol 'pids' was not declared. Should it be static?
fs/f2fs/trace.c:21:21: warning: symbol 'last_io' was not declared. Should it be static?

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
/linux-master/fs/notify/fanotify/
H A Dfanotify_user.cdiff 592f6b84 Mon Jan 27 18:07:19 MST 2014 Heiko Carstens <hca@linux.ibm.com> compat: fix sys_fanotify_mark

Commit 91c2e0bcae72 ("unify compat fanotify_mark(2), switch to
COMPAT_SYSCALL_DEFINE") added a new unified compat fanotify_mark syscall
to be used by all architectures.

Unfortunately the unified version merges the split mask parameter in a
wrong way: the lower and higher word got swapped.

This was discovered with glibc's tst-fanotify test case.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reported-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Acked-by: "David S. Miller" <davem@davemloft.net>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
/linux-master/fs/btrfs/
H A Dvolumes.cdiff b2a61667 Tue Jul 27 01:13:03 MDT 2021 Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> btrfs: fix rw device counting in __btrfs_free_extra_devids

When removing a writeable device in __btrfs_free_extra_devids, the rw
device count should be decremented.

This error was caught by Syzbot which reported a warning in
close_fs_devices:

WARNING: CPU: 1 PID: 9355 at fs/btrfs/volumes.c:1168 close_fs_devices+0x763/0x880 fs/btrfs/volumes.c:1168
Modules linked in:
CPU: 0 PID: 9355 Comm: syz-executor552 Not tainted 5.13.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:close_fs_devices+0x763/0x880 fs/btrfs/volumes.c:1168
RSP: 0018:ffffc9000333f2f0 EFLAGS: 00010293
RAX: ffffffff8365f5c3 RBX: 0000000000000001 RCX: ffff888029afd4c0
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff88802846f508 R08: ffffffff8365f525 R09: ffffed100337d128
R10: ffffed100337d128 R11: 0000000000000000 R12: dffffc0000000000
R13: ffff888019be8868 R14: 1ffff1100337d10d R15: 1ffff1100337d10a
FS: 00007f6f53828700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000047c410 CR3: 00000000302a6000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_close_devices+0xc9/0x450 fs/btrfs/volumes.c:1180
open_ctree+0x8e1/0x3968 fs/btrfs/disk-io.c:3693
btrfs_fill_super fs/btrfs/super.c:1382 [inline]
btrfs_mount_root+0xac5/0xc60 fs/btrfs/super.c:1749
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x86/0x270 fs/super.c:1498
fc_mount fs/namespace.c:993 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1023
btrfs_mount+0x3d3/0xb50 fs/btrfs/super.c:1809
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x86/0x270 fs/super.c:1498
do_new_mount fs/namespace.c:2905 [inline]
path_mount+0x196f/0x2be0 fs/namespace.c:3235
do_mount fs/namespace.c:3248 [inline]
__do_sys_mount fs/namespace.c:3456 [inline]
__se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433
do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
entry_SYSCALL_64_after_hwframe+0x44/0xae

Because fs_devices->rw_devices was not 0 after
closing all devices. Here is the call trace that was observed:

btrfs_mount_root():
btrfs_scan_one_device():
device_list_add(); <---------------- device added
btrfs_open_devices():
open_fs_devices():
btrfs_open_one_device(); <-------- writable device opened,
rw device count ++
btrfs_fill_super():
open_ctree():
btrfs_free_extra_devids():
__btrfs_free_extra_devids(); <--- writable device removed,
rw device count not decremented
fail_tree_roots:
btrfs_close_devices():
close_fs_devices(); <------- rw device count off by 1

As a note, prior to commit cf89af146b7e ("btrfs: dev-replace: fail
mount if we don't have replace item with target device"), rw_devices
was decremented on removing a writable device in
__btrfs_free_extra_devids only if the BTRFS_DEV_STATE_REPLACE_TGT bit
was not set for the device. However, this check does not need to be
reinstated as it is now redundant and incorrect.

In __btrfs_free_extra_devids, we skip removing the device if it is the
target for replacement. This is done by checking whether device->devid
== BTRFS_DEV_REPLACE_DEVID. Since BTRFS_DEV_STATE_REPLACE_TGT is set
only on the device with devid BTRFS_DEV_REPLACE_DEVID, no devices
should have the BTRFS_DEV_STATE_REPLACE_TGT bit set after the check,
and so it's redundant to test for that bit.

Additionally, following commit 82372bc816d7 ("Btrfs: make
the logic of source device removing more clear"), rw_devices is
incremented whenever a writeable device is added to the alloc
list (including the target device in btrfs_dev_replace_finishing), so
all removals of writable devices from the alloc list should also be
accompanied by a decrement to rw_devices.

Reported-by: syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com
Fixes: cf89af146b7e ("btrfs: dev-replace: fail mount if we don't have replace item with target device")
CC: stable@vger.kernel.org # 5.10+
Tested-by: syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff b2a61667 Tue Jul 27 01:13:03 MDT 2021 Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> btrfs: fix rw device counting in __btrfs_free_extra_devids

When removing a writeable device in __btrfs_free_extra_devids, the rw
device count should be decremented.

This error was caught by Syzbot which reported a warning in
close_fs_devices:

WARNING: CPU: 1 PID: 9355 at fs/btrfs/volumes.c:1168 close_fs_devices+0x763/0x880 fs/btrfs/volumes.c:1168
Modules linked in:
CPU: 0 PID: 9355 Comm: syz-executor552 Not tainted 5.13.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:close_fs_devices+0x763/0x880 fs/btrfs/volumes.c:1168
RSP: 0018:ffffc9000333f2f0 EFLAGS: 00010293
RAX: ffffffff8365f5c3 RBX: 0000000000000001 RCX: ffff888029afd4c0
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff88802846f508 R08: ffffffff8365f525 R09: ffffed100337d128
R10: ffffed100337d128 R11: 0000000000000000 R12: dffffc0000000000
R13: ffff888019be8868 R14: 1ffff1100337d10d R15: 1ffff1100337d10a
FS: 00007f6f53828700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000047c410 CR3: 00000000302a6000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_close_devices+0xc9/0x450 fs/btrfs/volumes.c:1180
open_ctree+0x8e1/0x3968 fs/btrfs/disk-io.c:3693
btrfs_fill_super fs/btrfs/super.c:1382 [inline]
btrfs_mount_root+0xac5/0xc60 fs/btrfs/super.c:1749
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x86/0x270 fs/super.c:1498
fc_mount fs/namespace.c:993 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1023
btrfs_mount+0x3d3/0xb50 fs/btrfs/super.c:1809
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x86/0x270 fs/super.c:1498
do_new_mount fs/namespace.c:2905 [inline]
path_mount+0x196f/0x2be0 fs/namespace.c:3235
do_mount fs/namespace.c:3248 [inline]
__do_sys_mount fs/namespace.c:3456 [inline]
__se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433
do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
entry_SYSCALL_64_after_hwframe+0x44/0xae

Because fs_devices->rw_devices was not 0 after
closing all devices. Here is the call trace that was observed:

btrfs_mount_root():
btrfs_scan_one_device():
device_list_add(); <---------------- device added
btrfs_open_devices():
open_fs_devices():
btrfs_open_one_device(); <-------- writable device opened,
rw device count ++
btrfs_fill_super():
open_ctree():
btrfs_free_extra_devids():
__btrfs_free_extra_devids(); <--- writable device removed,
rw device count not decremented
fail_tree_roots:
btrfs_close_devices():
close_fs_devices(); <------- rw device count off by 1

As a note, prior to commit cf89af146b7e ("btrfs: dev-replace: fail
mount if we don't have replace item with target device"), rw_devices
was decremented on removing a writable device in
__btrfs_free_extra_devids only if the BTRFS_DEV_STATE_REPLACE_TGT bit
was not set for the device. However, this check does not need to be
reinstated as it is now redundant and incorrect.

In __btrfs_free_extra_devids, we skip removing the device if it is the
target for replacement. This is done by checking whether device->devid
== BTRFS_DEV_REPLACE_DEVID. Since BTRFS_DEV_STATE_REPLACE_TGT is set
only on the device with devid BTRFS_DEV_REPLACE_DEVID, no devices
should have the BTRFS_DEV_STATE_REPLACE_TGT bit set after the check,
and so it's redundant to test for that bit.

Additionally, following commit 82372bc816d7 ("Btrfs: make
the logic of source device removing more clear"), rw_devices is
incremented whenever a writeable device is added to the alloc
list (including the target device in btrfs_dev_replace_finishing), so
all removals of writable devices from the alloc list should also be
accompanied by a decrement to rw_devices.

Reported-by: syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com
Fixes: cf89af146b7e ("btrfs: dev-replace: fail mount if we don't have replace item with target device")
CC: stable@vger.kernel.org # 5.10+
Tested-by: syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff 0697d9a6 Wed Nov 18 02:03:26 MST 2020 Johannes Thumshirn <johannes.thumshirn@wdc.com> btrfs: don't access possibly stale fs_info data for printing duplicate device

Syzbot reported a possible use-after-free when printing a duplicate device
warning device_list_add().

At this point it can happen that a btrfs_device::fs_info is not correctly
setup yet, so we're accessing stale data, when printing the warning
message using the btrfs_printk() wrappers.

==================================================================
BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068

CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1d6/0x29e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943
btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359
btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44840a
RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630
RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a
R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003

Allocated by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
kmalloc_node include/linux/slab.h:577 [inline]
kvmalloc_node+0x81/0x110 mm/util.c:574
kvmalloc include/linux/mm.h:757 [inline]
kvzalloc include/linux/mm.h:765 [inline]
btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x113/0x200 mm/slab.c:3756
deactivate_locked_super+0xa7/0xf0 fs/super.c:335
btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8880878e0000
which belongs to the cache kmalloc-16k of size 16384
The buggy address is located 1704 bytes inside of
16384-byte region [ffff8880878e0000, ffff8880878e4000)
The buggy address belongs to the page:
page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0
head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00
raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

The syzkaller reproducer for this use-after-free crafts a filesystem image
and loop mounts it twice in a loop. The mount will fail as the crafted
image has an invalid chunk tree. When this happens btrfs_mount_root() will
call deactivate_locked_super(), which then cleans up fs_info and
fs_info::sb. If a second thread now adds the same block-device to the
filesystem, it will get detected as a duplicate device and
device_list_add() will reject the duplicate and print a warning. But as
the fs_info pointer passed in is non-NULL this will result in a
use-after-free.

Instead of printing possibly uninitialized or already freed memory in
btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the
device name will be skipped altogether.

There was a slightly different approach discussed in
https://lore.kernel.org/linux-btrfs/20200114060920.4527-1-anand.jain@oracle.com/t/#u

Link: https://lore.kernel.org/linux-btrfs/000000000000c9e14b05afcc41ba@google.com
Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff 0697d9a6 Wed Nov 18 02:03:26 MST 2020 Johannes Thumshirn <johannes.thumshirn@wdc.com> btrfs: don't access possibly stale fs_info data for printing duplicate device

Syzbot reported a possible use-after-free when printing a duplicate device
warning device_list_add().

At this point it can happen that a btrfs_device::fs_info is not correctly
setup yet, so we're accessing stale data, when printing the warning
message using the btrfs_printk() wrappers.

==================================================================
BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068

CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1d6/0x29e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943
btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359
btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44840a
RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630
RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a
R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003

Allocated by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
kmalloc_node include/linux/slab.h:577 [inline]
kvmalloc_node+0x81/0x110 mm/util.c:574
kvmalloc include/linux/mm.h:757 [inline]
kvzalloc include/linux/mm.h:765 [inline]
btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x113/0x200 mm/slab.c:3756
deactivate_locked_super+0xa7/0xf0 fs/super.c:335
btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8880878e0000
which belongs to the cache kmalloc-16k of size 16384
The buggy address is located 1704 bytes inside of
16384-byte region [ffff8880878e0000, ffff8880878e4000)
The buggy address belongs to the page:
page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0
head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00
raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

The syzkaller reproducer for this use-after-free crafts a filesystem image
and loop mounts it twice in a loop. The mount will fail as the crafted
image has an invalid chunk tree. When this happens btrfs_mount_root() will
call deactivate_locked_super(), which then cleans up fs_info and
fs_info::sb. If a second thread now adds the same block-device to the
filesystem, it will get detected as a duplicate device and
device_list_add() will reject the duplicate and print a warning. But as
the fs_info pointer passed in is non-NULL this will result in a
use-after-free.

Instead of printing possibly uninitialized or already freed memory in
btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the
device name will be skipped altogether.

There was a slightly different approach discussed in
https://lore.kernel.org/linux-btrfs/20200114060920.4527-1-anand.jain@oracle.com/t/#u

Link: https://lore.kernel.org/linux-btrfs/000000000000c9e14b05afcc41ba@google.com
Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff 0697d9a6 Wed Nov 18 02:03:26 MST 2020 Johannes Thumshirn <johannes.thumshirn@wdc.com> btrfs: don't access possibly stale fs_info data for printing duplicate device

Syzbot reported a possible use-after-free when printing a duplicate device
warning device_list_add().

At this point it can happen that a btrfs_device::fs_info is not correctly
setup yet, so we're accessing stale data, when printing the warning
message using the btrfs_printk() wrappers.

==================================================================
BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068

CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1d6/0x29e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943
btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359
btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44840a
RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630
RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a
R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003

Allocated by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
kmalloc_node include/linux/slab.h:577 [inline]
kvmalloc_node+0x81/0x110 mm/util.c:574
kvmalloc include/linux/mm.h:757 [inline]
kvzalloc include/linux/mm.h:765 [inline]
btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x113/0x200 mm/slab.c:3756
deactivate_locked_super+0xa7/0xf0 fs/super.c:335
btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8880878e0000
which belongs to the cache kmalloc-16k of size 16384
The buggy address is located 1704 bytes inside of
16384-byte region [ffff8880878e0000, ffff8880878e4000)
The buggy address belongs to the page:
page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0
head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00
raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

The syzkaller reproducer for this use-after-free crafts a filesystem image
and loop mounts it twice in a loop. The mount will fail as the crafted
image has an invalid chunk tree. When this happens btrfs_mount_root() will
call deactivate_locked_super(), which then cleans up fs_info and
fs_info::sb. If a second thread now adds the same block-device to the
filesystem, it will get detected as a duplicate device and
device_list_add() will reject the duplicate and print a warning. But as
the fs_info pointer passed in is non-NULL this will result in a
use-after-free.

Instead of printing possibly uninitialized or already freed memory in
btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the
device name will be skipped altogether.

There was a slightly different approach discussed in
https://lore.kernel.org/linux-btrfs/20200114060920.4527-1-anand.jain@oracle.com/t/#u

Link: https://lore.kernel.org/linux-btrfs/000000000000c9e14b05afcc41ba@google.com
Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff 0697d9a6 Wed Nov 18 02:03:26 MST 2020 Johannes Thumshirn <johannes.thumshirn@wdc.com> btrfs: don't access possibly stale fs_info data for printing duplicate device

Syzbot reported a possible use-after-free when printing a duplicate device
warning device_list_add().

At this point it can happen that a btrfs_device::fs_info is not correctly
setup yet, so we're accessing stale data, when printing the warning
message using the btrfs_printk() wrappers.

==================================================================
BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068

CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1d6/0x29e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943
btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359
btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44840a
RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630
RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a
R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003

Allocated by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
kmalloc_node include/linux/slab.h:577 [inline]
kvmalloc_node+0x81/0x110 mm/util.c:574
kvmalloc include/linux/mm.h:757 [inline]
kvzalloc include/linux/mm.h:765 [inline]
btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x113/0x200 mm/slab.c:3756
deactivate_locked_super+0xa7/0xf0 fs/super.c:335
btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8880878e0000
which belongs to the cache kmalloc-16k of size 16384
The buggy address is located 1704 bytes inside of
16384-byte region [ffff8880878e0000, ffff8880878e4000)
The buggy address belongs to the page:
page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0
head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00
raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

The syzkaller reproducer for this use-after-free crafts a filesystem image
and loop mounts it twice in a loop. The mount will fail as the crafted
image has an invalid chunk tree. When this happens btrfs_mount_root() will
call deactivate_locked_super(), which then cleans up fs_info and
fs_info::sb. If a second thread now adds the same block-device to the
filesystem, it will get detected as a duplicate device and
device_list_add() will reject the duplicate and print a warning. But as
the fs_info pointer passed in is non-NULL this will result in a
use-after-free.

Instead of printing possibly uninitialized or already freed memory in
btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the
device name will be skipped altogether.

There was a slightly different approach discussed in
https://lore.kernel.org/linux-btrfs/20200114060920.4527-1-anand.jain@oracle.com/t/#u

Link: https://lore.kernel.org/linux-btrfs/000000000000c9e14b05afcc41ba@google.com
Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff 0697d9a6 Wed Nov 18 02:03:26 MST 2020 Johannes Thumshirn <johannes.thumshirn@wdc.com> btrfs: don't access possibly stale fs_info data for printing duplicate device

Syzbot reported a possible use-after-free when printing a duplicate device
warning device_list_add().

At this point it can happen that a btrfs_device::fs_info is not correctly
setup yet, so we're accessing stale data, when printing the warning
message using the btrfs_printk() wrappers.

==================================================================
BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068

CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1d6/0x29e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943
btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359
btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44840a
RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630
RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a
R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003

Allocated by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
kmalloc_node include/linux/slab.h:577 [inline]
kvmalloc_node+0x81/0x110 mm/util.c:574
kvmalloc include/linux/mm.h:757 [inline]
kvzalloc include/linux/mm.h:765 [inline]
btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x113/0x200 mm/slab.c:3756
deactivate_locked_super+0xa7/0xf0 fs/super.c:335
btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8880878e0000
which belongs to the cache kmalloc-16k of size 16384
The buggy address is located 1704 bytes inside of
16384-byte region [ffff8880878e0000, ffff8880878e4000)
The buggy address belongs to the page:
page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0
head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00
raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

The syzkaller reproducer for this use-after-free crafts a filesystem image
and loop mounts it twice in a loop. The mount will fail as the crafted
image has an invalid chunk tree. When this happens btrfs_mount_root() will
call deactivate_locked_super(), which then cleans up fs_info and
fs_info::sb. If a second thread now adds the same block-device to the
filesystem, it will get detected as a duplicate device and
device_list_add() will reject the duplicate and print a warning. But as
the fs_info pointer passed in is non-NULL this will result in a
use-after-free.

Instead of printing possibly uninitialized or already freed memory in
btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the
device name will be skipped altogether.

There was a slightly different approach discussed in
https://lore.kernel.org/linux-btrfs/20200114060920.4527-1-anand.jain@oracle.com/t/#u

Link: https://lore.kernel.org/linux-btrfs/000000000000c9e14b05afcc41ba@google.com
Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff 0697d9a6 Wed Nov 18 02:03:26 MST 2020 Johannes Thumshirn <johannes.thumshirn@wdc.com> btrfs: don't access possibly stale fs_info data for printing duplicate device

Syzbot reported a possible use-after-free when printing a duplicate device
warning device_list_add().

At this point it can happen that a btrfs_device::fs_info is not correctly
setup yet, so we're accessing stale data, when printing the warning
message using the btrfs_printk() wrappers.

==================================================================
BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068

CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1d6/0x29e lib/dump_stack.c:118
print_address_description+0x66/0x620 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report+0x132/0x1d0 mm/kasan/report.c:530
btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245
device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943
btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359
btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44840a
RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630
RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a
R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003

Allocated by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461
kmalloc_node include/linux/slab.h:577 [inline]
kvmalloc_node+0x81/0x110 mm/util.c:574
kvmalloc include/linux/mm.h:757 [inline]
kvzalloc include/linux/mm.h:765 [inline]
btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6945:
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
__kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422
__cache_free mm/slab.c:3418 [inline]
kfree+0x113/0x200 mm/slab.c:3756
deactivate_locked_super+0xa7/0xf0 fs/super.c:335
btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
fc_mount fs/namespace.c:978 [inline]
vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008
btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732
legacy_get_tree+0xea/0x180 fs/fs_context.c:592
vfs_get_tree+0x88/0x270 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x179d/0x29e0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount+0x126/0x180 fs/namespace.c:3390
do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8880878e0000
which belongs to the cache kmalloc-16k of size 16384
The buggy address is located 1704 bytes inside of
16384-byte region [ffff8880878e0000, ffff8880878e4000)
The buggy address belongs to the page:
page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0
head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00
raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

The syzkaller reproducer for this use-after-free crafts a filesystem image
and loop mounts it twice in a loop. The mount will fail as the crafted
image has an invalid chunk tree. When this happens btrfs_mount_root() will
call deactivate_locked_super(), which then cleans up fs_info and
fs_info::sb. If a second thread now adds the same block-device to the
filesystem, it will get detected as a duplicate device and
device_list_add() will reject the duplicate and print a warning. But as
the fs_info pointer passed in is non-NULL this will result in a
use-after-free.

Instead of printing possibly uninitialized or already freed memory in
btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the
device name will be skipped altogether.

There was a slightly different approach discussed in
https://lore.kernel.org/linux-btrfs/20200114060920.4527-1-anand.jain@oracle.com/t/#u

Link: https://lore.kernel.org/linux-btrfs/000000000000c9e14b05afcc41ba@google.com
Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff 592d92ee Tue Mar 14 14:33:55 MDT 2017 Liu Bo <bo.li.liu@oracle.com> Btrfs: create a helper for getting chunk map

We have similar code here and there, this merges them into a helper.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
H A Dvolumes.hdiff 592d92ee Tue Mar 14 14:33:55 MDT 2017 Liu Bo <bo.li.liu@oracle.com> Btrfs: create a helper for getting chunk map

We have similar code here and there, this merges them into a helper.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
H A Dextent_io.hdiff 592a0ce9 Tue Jan 09 08:46:25 MST 2024 Filipe Manana <fdmanana@suse.com> btrfs: remove extent_map_tree forward declaration at extent_io.h

There's no need to do a forward declaration of struct extent_map_tree at
extent_io.h, as there are no function prototypes, inline functions or data
structures that refer to struct extent_map_tree.

So remove that forward declaration, which is not needed since commit
477a30ba5f8d ("btrfs: Sink extent_tree arguments in
try_release_extent_mapping").

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
/linux-master/drivers/platform/x86/
H A Dthinkpad_acpi.cdiff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function

Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
text data bss dec hex filename
64688 584 592 65864 10148 (TOTALS-BEFORE)
64641 584 592 65817 10119 (TOTALS-AFTER)

Also, while at it, if we see (*str && isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
drivers/leds/led-class.c
drivers/leds/ledtrig-timer.c
drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str && isspace(*str)) { \(str++;\|++str;\) }
|
- *str &&
isspace(*str)
)

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Julia Lawall <julia@diku.dk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function

Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
text data bss dec hex filename
64688 584 592 65864 10148 (TOTALS-BEFORE)
64641 584 592 65817 10119 (TOTALS-AFTER)

Also, while at it, if we see (*str && isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
drivers/leds/led-class.c
drivers/leds/ledtrig-timer.c
drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str && isspace(*str)) { \(str++;\|++str;\) }
|
- *str &&
isspace(*str)
)

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Julia Lawall <julia@diku.dk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
/linux-master/fs/ext4/
H A Dextents.cdiff 592acbf1 Fri May 10 17:28:06 MDT 2019 Sriram Rajagopalan <sriramr@arista.com> ext4: zero out the unused memory region in the extent tree block

This commit zeroes out the unused memory region in the buffer_head
corresponding to the extent metablock after writing the extent header
and the corresponding extent node entries.

This is done to prevent random uninitialized data from getting into
the filesystem when the extent block is synced.

This fixes CVE-2019-11833.

Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
diff 592ddec7 Wed Dec 12 02:50:10 MST 2018 Chandan Rajendra <chandan@linux.vnet.ibm.com> ext4: use IS_ENCRYPTED() to check encryption status

This commit removes the ext4 specific ext4_encrypted_inode() and makes
use of the generic IS_ENCRYPTED() macro to check for the encryption
status of an inode.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
/linux-master/drivers/block/zram/
H A Dzram_drv.cdiff ebaf9ab5 Tue Jul 26 16:22:45 MDT 2016 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: switch to crypto compress API

We don't have an idle zstreams list anymore and our write path now works
absolutely differently, preventing preemption during compression. This
removes possibilities of read paths preempting writes at wrong places
(which could badly affect the performance of both paths) and at the same
time opens the door for a move from custom LZO/LZ4 compression backends
implementation to a more generic one, using crypto compress API.

Joonsoo Kim [1] attempted to do this a while ago, but faced with the
need of introducing a new crypto API interface. The root cause was the
fact that crypto API compression algorithms require a compression stream
structure (in zram terminology) for both compression and decompression
ops, while in reality only several of compression algorithms really need
it. This resulted in a concept of context-less crypto API compression
backends [2]. Both write and read paths, though, would have been
executed with the preemption enabled, which in the worst case could have
resulted in a decreased worst-case performance, e.g. consider the
following case:

CPU0

zram_write()
spin_lock()
take the last idle stream
spin_unlock()

<< preempted >>

zram_read()
spin_lock()
no idle streams
spin_unlock()
schedule()

resuming zram_write compression()

but it took me some time to realize that, and it took even longer to
evolve zram and to make it ready for crypto API. The key turned out to be
-- drop the idle streams list entirely. Without the idle streams list we
are free to use compression algorithms that require compression stream for
decompression (read), because streams are now placed in per-cpu data and
each write path has to disable preemption for compression op, almost
completely eliminating the aforementioned case (technically, we still have
a small chance, because write path has a fast and a slow paths and the
slow path is executed with the preemption enabled; but the frequency of
failed fast path is too low).

TEST
====

- 4 CPUs, x86_64 system
- 3G zram, lzo
- fio tests: read, randread, write, randwrite, rw, randrw

test script [3] command:
ZRAM_SIZE=3G LOG_SUFFIX=XXXX FIO_LOOPS=5 ./zram-fio-test.sh

BASE PATCHED
jobs1
READ: 2527.2MB/s 2482.7MB/s
READ: 2102.7MB/s 2045.0MB/s
WRITE: 1284.3MB/s 1324.3MB/s
WRITE: 1080.7MB/s 1101.9MB/s
READ: 430125KB/s 437498KB/s
WRITE: 430538KB/s 437919KB/s
READ: 399593KB/s 403987KB/s
WRITE: 399910KB/s 404308KB/s
jobs2
READ: 8133.5MB/s 7854.8MB/s
READ: 7086.6MB/s 6912.8MB/s
WRITE: 3177.2MB/s 3298.3MB/s
WRITE: 2810.2MB/s 2871.4MB/s
READ: 1017.6MB/s 1023.4MB/s
WRITE: 1018.2MB/s 1023.1MB/s
READ: 977836KB/s 984205KB/s
WRITE: 979435KB/s 985814KB/s
jobs3
READ: 13557MB/s 13391MB/s
READ: 11876MB/s 11752MB/s
WRITE: 4641.5MB/s 4682.1MB/s
WRITE: 4164.9MB/s 4179.3MB/s
READ: 1453.8MB/s 1455.1MB/s
WRITE: 1455.1MB/s 1458.2MB/s
READ: 1387.7MB/s 1395.7MB/s
WRITE: 1386.1MB/s 1394.9MB/s
jobs4
READ: 20271MB/s 20078MB/s
READ: 18033MB/s 17928MB/s
WRITE: 6176.8MB/s 6180.5MB/s
WRITE: 5686.3MB/s 5705.3MB/s
READ: 2009.4MB/s 2006.7MB/s
WRITE: 2007.5MB/s 2004.9MB/s
READ: 1929.7MB/s 1935.6MB/s
WRITE: 1926.8MB/s 1932.6MB/s
jobs5
READ: 18823MB/s 19024MB/s
READ: 18968MB/s 19071MB/s
WRITE: 6191.6MB/s 6372.1MB/s
WRITE: 5818.7MB/s 5787.1MB/s
READ: 2011.7MB/s 1981.3MB/s
WRITE: 2011.4MB/s 1980.1MB/s
READ: 1949.3MB/s 1935.7MB/s
WRITE: 1940.4MB/s 1926.1MB/s
jobs6
READ: 21870MB/s 21715MB/s
READ: 19957MB/s 19879MB/s
WRITE: 6528.4MB/s 6537.6MB/s
WRITE: 6098.9MB/s 6073.6MB/s
READ: 2048.6MB/s 2049.9MB/s
WRITE: 2041.7MB/s 2042.9MB/s
READ: 2013.4MB/s 1990.4MB/s
WRITE: 2009.4MB/s 1986.5MB/s
jobs7
READ: 21359MB/s 21124MB/s
READ: 19746MB/s 19293MB/s
WRITE: 6660.4MB/s 6518.8MB/s
WRITE: 6211.6MB/s 6193.1MB/s
READ: 2089.7MB/s 2080.6MB/s
WRITE: 2085.8MB/s 2076.5MB/s
READ: 2041.2MB/s 2052.5MB/s
WRITE: 2037.5MB/s 2048.8MB/s
jobs8
READ: 20477MB/s 19974MB/s
READ: 18922MB/s 18576MB/s
WRITE: 6851.9MB/s 6788.3MB/s
WRITE: 6407.7MB/s 6347.5MB/s
READ: 2134.8MB/s 2136.1MB/s
WRITE: 2132.8MB/s 2134.4MB/s
READ: 2074.2MB/s 2069.6MB/s
WRITE: 2087.3MB/s 2082.4MB/s
jobs9
READ: 19797MB/s 19994MB/s
READ: 18806MB/s 18581MB/s
WRITE: 6878.7MB/s 6822.7MB/s
WRITE: 6456.8MB/s 6447.2MB/s
READ: 2141.1MB/s 2154.7MB/s
WRITE: 2144.4MB/s 2157.3MB/s
READ: 2084.1MB/s 2085.1MB/s
WRITE: 2091.5MB/s 2092.5MB/s
jobs10
READ: 19794MB/s 19784MB/s
READ: 18794MB/s 18745MB/s
WRITE: 6984.4MB/s 6676.3MB/s
WRITE: 6532.3MB/s 6342.7MB/s
READ: 2150.6MB/s 2155.4MB/s
WRITE: 2156.8MB/s 2161.5MB/s
READ: 2106.4MB/s 2095.6MB/s
WRITE: 2109.7MB/s 2098.4MB/s

BASE PATCHED
jobs1 perfstat
stalled-cycles-frontend 102,480,595,419 ( 41.53%) 114,508,864,804 ( 46.92%)
stalled-cycles-backend 51,941,417,832 ( 21.05%) 46,836,112,388 ( 19.19%)
instructions 283,612,054,215 ( 1.15) 283,918,134,959 ( 1.16)
branches 56,372,560,385 ( 724.923) 56,449,814,753 ( 733.766)
branch-misses 374,826,000 ( 0.66%) 326,935,859 ( 0.58%)
jobs2 perfstat
stalled-cycles-frontend 155,142,745,777 ( 40.99%) 164,170,979,198 ( 43.82%)
stalled-cycles-backend 70,813,866,387 ( 18.71%) 66,456,858,165 ( 17.74%)
instructions 463,436,648,173 ( 1.22) 464,221,890,191 ( 1.24)
branches 91,088,733,902 ( 760.088) 91,278,144,546 ( 769.133)
branch-misses 504,460,363 ( 0.55%) 394,033,842 ( 0.43%)
jobs3 perfstat
stalled-cycles-frontend 201,300,397,212 ( 39.84%) 223,969,902,257 ( 44.44%)
stalled-cycles-backend 87,712,593,974 ( 17.36%) 81,618,888,712 ( 16.19%)
instructions 642,869,545,023 ( 1.27) 644,677,354,132 ( 1.28)
branches 125,724,560,594 ( 690.682) 126,133,159,521 ( 694.542)
branch-misses 527,941,798 ( 0.42%) 444,782,220 ( 0.35%)
jobs4 perfstat
stalled-cycles-frontend 246,701,197,429 ( 38.12%) 280,076,030,886 ( 43.29%)
stalled-cycles-backend 119,050,341,112 ( 18.40%) 110,955,641,671 ( 17.15%)
instructions 822,716,962,127 ( 1.27) 825,536,969,320 ( 1.28)
branches 160,590,028,545 ( 688.614) 161,152,996,915 ( 691.068)
branch-misses 650,295,287 ( 0.40%) 550,229,113 ( 0.34%)
jobs5 perfstat
stalled-cycles-frontend 298,958,462,516 ( 38.30%) 344,852,200,358 ( 44.16%)
stalled-cycles-backend 137,558,742,122 ( 17.62%) 129,465,067,102 ( 16.58%)
instructions 1,005,714,688,752 ( 1.29) 1,007,657,999,432 ( 1.29)
branches 195,988,773,962 ( 697.730) 196,446,873,984 ( 700.319)
branch-misses 695,818,940 ( 0.36%) 624,823,263 ( 0.32%)
jobs6 perfstat
stalled-cycles-frontend 334,497,602,856 ( 36.71%) 387,590,419,779 ( 42.38%)
stalled-cycles-backend 163,539,365,335 ( 17.95%) 152,640,193,639 ( 16.69%)
instructions 1,184,738,177,851 ( 1.30) 1,187,396,281,677 ( 1.30)
branches 230,592,915,640 ( 702.902) 231,253,802,882 ( 702.356)
branch-misses 747,934,786 ( 0.32%) 643,902,424 ( 0.28%)
jobs7 perfstat
stalled-cycles-frontend 396,724,684,187 ( 37.71%) 460,705,858,952 ( 43.84%)
stalled-cycles-backend 188,096,616,496 ( 17.88%) 175,785,787,036 ( 16.73%)
instructions 1,364,041,136,608 ( 1.30) 1,366,689,075,112 ( 1.30)
branches 265,253,096,936 ( 700.078) 265,890,524,883 ( 702.839)
branch-misses 784,991,589 ( 0.30%) 729,196,689 ( 0.27%)
jobs8 perfstat
stalled-cycles-frontend 440,248,299,870 ( 36.92%) 509,554,793,816 ( 42.46%)
stalled-cycles-backend 222,575,930,616 ( 18.67%) 213,401,248,432 ( 17.78%)
instructions 1,542,262,045,114 ( 1.29) 1,545,233,932,257 ( 1.29)
branches 299,775,178,439 ( 697.666) 300,528,458,505 ( 694.769)
branch-misses 847,496,084 ( 0.28%) 748,794,308 ( 0.25%)
jobs9 perfstat
stalled-cycles-frontend 506,269,882,480 ( 37.86%) 592,798,032,820 ( 44.43%)
stalled-cycles-backend 253,192,498,861 ( 18.93%) 233,727,666,185 ( 17.52%)
instructions 1,721,985,080,913 ( 1.29) 1,724,666,236,005 ( 1.29)
branches 334,517,360,255 ( 694.134) 335,199,758,164 ( 697.131)
branch-misses 873,496,730 ( 0.26%) 815,379,236 ( 0.24%)
jobs10 perfstat
stalled-cycles-frontend 549,063,363,749 ( 37.18%) 651,302,376,662 ( 43.61%)
stalled-cycles-backend 281,680,986,810 ( 19.07%) 277,005,235,582 ( 18.55%)
instructions 1,901,859,271,180 ( 1.29) 1,906,311,064,230 ( 1.28)
branches 369,398,536,153 ( 694.004) 370,527,696,358 ( 688.409)
branch-misses 967,929,335 ( 0.26%) 890,125,056 ( 0.24%)

BASE PATCHED
seconds elapsed 79.421641008 78.735285546
seconds elapsed 61.471246133 60.869085949
seconds elapsed 62.317058173 62.224188495
seconds elapsed 60.030739363 60.081102518
seconds elapsed 74.070398362 74.317582865
seconds elapsed 84.985953007 85.414364176
seconds elapsed 97.724553255 98.173311344
seconds elapsed 109.488066758 110.268399318
seconds elapsed 122.768189405 122.967164498
seconds elapsed 135.130035105 136.934770801

On my other system (8 x86_64 CPUs, short version of test results):

BASE PATCHED
seconds elapsed 19.518065994 19.806320662
seconds elapsed 15.172772749 15.594718291
seconds elapsed 13.820925970 13.821708564
seconds elapsed 13.293097816 14.585206405
seconds elapsed 16.207284118 16.064431606
seconds elapsed 17.958376158 17.771825767
seconds elapsed 19.478009164 19.602961508
seconds elapsed 21.347152811 21.352318709
seconds elapsed 24.478121126 24.171088735
seconds elapsed 26.865057442 26.767327618

So performance-wise the numbers are quite similar.

Also update zcomp interface to be more aligned with the crypto API.

[1] http://marc.info/?l=linux-kernel&m=144480832108927&w=2
[2] http://marc.info/?l=linux-kernel&m=145379613507518&w=2
[3] https://github.com/sergey-senozhatsky/zram-perf-test

Link: http://lkml.kernel.org/r/20160531122017.2878-3-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff ebaf9ab5 Tue Jul 26 16:22:45 MDT 2016 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: switch to crypto compress API

We don't have an idle zstreams list anymore and our write path now works
absolutely differently, preventing preemption during compression. This
removes possibilities of read paths preempting writes at wrong places
(which could badly affect the performance of both paths) and at the same
time opens the door for a move from custom LZO/LZ4 compression backends
implementation to a more generic one, using crypto compress API.

Joonsoo Kim [1] attempted to do this a while ago, but faced with the
need of introducing a new crypto API interface. The root cause was the
fact that crypto API compression algorithms require a compression stream
structure (in zram terminology) for both compression and decompression
ops, while in reality only several of compression algorithms really need
it. This resulted in a concept of context-less crypto API compression
backends [2]. Both write and read paths, though, would have been
executed with the preemption enabled, which in the worst case could have
resulted in a decreased worst-case performance, e.g. consider the
following case:

CPU0

zram_write()
spin_lock()
take the last idle stream
spin_unlock()

<< preempted >>

zram_read()
spin_lock()
no idle streams
spin_unlock()
schedule()

resuming zram_write compression()

but it took me some time to realize that, and it took even longer to
evolve zram and to make it ready for crypto API. The key turned out to be
-- drop the idle streams list entirely. Without the idle streams list we
are free to use compression algorithms that require compression stream for
decompression (read), because streams are now placed in per-cpu data and
each write path has to disable preemption for compression op, almost
completely eliminating the aforementioned case (technically, we still have
a small chance, because write path has a fast and a slow paths and the
slow path is executed with the preemption enabled; but the frequency of
failed fast path is too low).

TEST
====

- 4 CPUs, x86_64 system
- 3G zram, lzo
- fio tests: read, randread, write, randwrite, rw, randrw

test script [3] command:
ZRAM_SIZE=3G LOG_SUFFIX=XXXX FIO_LOOPS=5 ./zram-fio-test.sh

BASE PATCHED
jobs1
READ: 2527.2MB/s 2482.7MB/s
READ: 2102.7MB/s 2045.0MB/s
WRITE: 1284.3MB/s 1324.3MB/s
WRITE: 1080.7MB/s 1101.9MB/s
READ: 430125KB/s 437498KB/s
WRITE: 430538KB/s 437919KB/s
READ: 399593KB/s 403987KB/s
WRITE: 399910KB/s 404308KB/s
jobs2
READ: 8133.5MB/s 7854.8MB/s
READ: 7086.6MB/s 6912.8MB/s
WRITE: 3177.2MB/s 3298.3MB/s
WRITE: 2810.2MB/s 2871.4MB/s
READ: 1017.6MB/s 1023.4MB/s
WRITE: 1018.2MB/s 1023.1MB/s
READ: 977836KB/s 984205KB/s
WRITE: 979435KB/s 985814KB/s
jobs3
READ: 13557MB/s 13391MB/s
READ: 11876MB/s 11752MB/s
WRITE: 4641.5MB/s 4682.1MB/s
WRITE: 4164.9MB/s 4179.3MB/s
READ: 1453.8MB/s 1455.1MB/s
WRITE: 1455.1MB/s 1458.2MB/s
READ: 1387.7MB/s 1395.7MB/s
WRITE: 1386.1MB/s 1394.9MB/s
jobs4
READ: 20271MB/s 20078MB/s
READ: 18033MB/s 17928MB/s
WRITE: 6176.8MB/s 6180.5MB/s
WRITE: 5686.3MB/s 5705.3MB/s
READ: 2009.4MB/s 2006.7MB/s
WRITE: 2007.5MB/s 2004.9MB/s
READ: 1929.7MB/s 1935.6MB/s
WRITE: 1926.8MB/s 1932.6MB/s
jobs5
READ: 18823MB/s 19024MB/s
READ: 18968MB/s 19071MB/s
WRITE: 6191.6MB/s 6372.1MB/s
WRITE: 5818.7MB/s 5787.1MB/s
READ: 2011.7MB/s 1981.3MB/s
WRITE: 2011.4MB/s 1980.1MB/s
READ: 1949.3MB/s 1935.7MB/s
WRITE: 1940.4MB/s 1926.1MB/s
jobs6
READ: 21870MB/s 21715MB/s
READ: 19957MB/s 19879MB/s
WRITE: 6528.4MB/s 6537.6MB/s
WRITE: 6098.9MB/s 6073.6MB/s
READ: 2048.6MB/s 2049.9MB/s
WRITE: 2041.7MB/s 2042.9MB/s
READ: 2013.4MB/s 1990.4MB/s
WRITE: 2009.4MB/s 1986.5MB/s
jobs7
READ: 21359MB/s 21124MB/s
READ: 19746MB/s 19293MB/s
WRITE: 6660.4MB/s 6518.8MB/s
WRITE: 6211.6MB/s 6193.1MB/s
READ: 2089.7MB/s 2080.6MB/s
WRITE: 2085.8MB/s 2076.5MB/s
READ: 2041.2MB/s 2052.5MB/s
WRITE: 2037.5MB/s 2048.8MB/s
jobs8
READ: 20477MB/s 19974MB/s
READ: 18922MB/s 18576MB/s
WRITE: 6851.9MB/s 6788.3MB/s
WRITE: 6407.7MB/s 6347.5MB/s
READ: 2134.8MB/s 2136.1MB/s
WRITE: 2132.8MB/s 2134.4MB/s
READ: 2074.2MB/s 2069.6MB/s
WRITE: 2087.3MB/s 2082.4MB/s
jobs9
READ: 19797MB/s 19994MB/s
READ: 18806MB/s 18581MB/s
WRITE: 6878.7MB/s 6822.7MB/s
WRITE: 6456.8MB/s 6447.2MB/s
READ: 2141.1MB/s 2154.7MB/s
WRITE: 2144.4MB/s 2157.3MB/s
READ: 2084.1MB/s 2085.1MB/s
WRITE: 2091.5MB/s 2092.5MB/s
jobs10
READ: 19794MB/s 19784MB/s
READ: 18794MB/s 18745MB/s
WRITE: 6984.4MB/s 6676.3MB/s
WRITE: 6532.3MB/s 6342.7MB/s
READ: 2150.6MB/s 2155.4MB/s
WRITE: 2156.8MB/s 2161.5MB/s
READ: 2106.4MB/s 2095.6MB/s
WRITE: 2109.7MB/s 2098.4MB/s

BASE PATCHED
jobs1 perfstat
stalled-cycles-frontend 102,480,595,419 ( 41.53%) 114,508,864,804 ( 46.92%)
stalled-cycles-backend 51,941,417,832 ( 21.05%) 46,836,112,388 ( 19.19%)
instructions 283,612,054,215 ( 1.15) 283,918,134,959 ( 1.16)
branches 56,372,560,385 ( 724.923) 56,449,814,753 ( 733.766)
branch-misses 374,826,000 ( 0.66%) 326,935,859 ( 0.58%)
jobs2 perfstat
stalled-cycles-frontend 155,142,745,777 ( 40.99%) 164,170,979,198 ( 43.82%)
stalled-cycles-backend 70,813,866,387 ( 18.71%) 66,456,858,165 ( 17.74%)
instructions 463,436,648,173 ( 1.22) 464,221,890,191 ( 1.24)
branches 91,088,733,902 ( 760.088) 91,278,144,546 ( 769.133)
branch-misses 504,460,363 ( 0.55%) 394,033,842 ( 0.43%)
jobs3 perfstat
stalled-cycles-frontend 201,300,397,212 ( 39.84%) 223,969,902,257 ( 44.44%)
stalled-cycles-backend 87,712,593,974 ( 17.36%) 81,618,888,712 ( 16.19%)
instructions 642,869,545,023 ( 1.27) 644,677,354,132 ( 1.28)
branches 125,724,560,594 ( 690.682) 126,133,159,521 ( 694.542)
branch-misses 527,941,798 ( 0.42%) 444,782,220 ( 0.35%)
jobs4 perfstat
stalled-cycles-frontend 246,701,197,429 ( 38.12%) 280,076,030,886 ( 43.29%)
stalled-cycles-backend 119,050,341,112 ( 18.40%) 110,955,641,671 ( 17.15%)
instructions 822,716,962,127 ( 1.27) 825,536,969,320 ( 1.28)
branches 160,590,028,545 ( 688.614) 161,152,996,915 ( 691.068)
branch-misses 650,295,287 ( 0.40%) 550,229,113 ( 0.34%)
jobs5 perfstat
stalled-cycles-frontend 298,958,462,516 ( 38.30%) 344,852,200,358 ( 44.16%)
stalled-cycles-backend 137,558,742,122 ( 17.62%) 129,465,067,102 ( 16.58%)
instructions 1,005,714,688,752 ( 1.29) 1,007,657,999,432 ( 1.29)
branches 195,988,773,962 ( 697.730) 196,446,873,984 ( 700.319)
branch-misses 695,818,940 ( 0.36%) 624,823,263 ( 0.32%)
jobs6 perfstat
stalled-cycles-frontend 334,497,602,856 ( 36.71%) 387,590,419,779 ( 42.38%)
stalled-cycles-backend 163,539,365,335 ( 17.95%) 152,640,193,639 ( 16.69%)
instructions 1,184,738,177,851 ( 1.30) 1,187,396,281,677 ( 1.30)
branches 230,592,915,640 ( 702.902) 231,253,802,882 ( 702.356)
branch-misses 747,934,786 ( 0.32%) 643,902,424 ( 0.28%)
jobs7 perfstat
stalled-cycles-frontend 396,724,684,187 ( 37.71%) 460,705,858,952 ( 43.84%)
stalled-cycles-backend 188,096,616,496 ( 17.88%) 175,785,787,036 ( 16.73%)
instructions 1,364,041,136,608 ( 1.30) 1,366,689,075,112 ( 1.30)
branches 265,253,096,936 ( 700.078) 265,890,524,883 ( 702.839)
branch-misses 784,991,589 ( 0.30%) 729,196,689 ( 0.27%)
jobs8 perfstat
stalled-cycles-frontend 440,248,299,870 ( 36.92%) 509,554,793,816 ( 42.46%)
stalled-cycles-backend 222,575,930,616 ( 18.67%) 213,401,248,432 ( 17.78%)
instructions 1,542,262,045,114 ( 1.29) 1,545,233,932,257 ( 1.29)
branches 299,775,178,439 ( 697.666) 300,528,458,505 ( 694.769)
branch-misses 847,496,084 ( 0.28%) 748,794,308 ( 0.25%)
jobs9 perfstat
stalled-cycles-frontend 506,269,882,480 ( 37.86%) 592,798,032,820 ( 44.43%)
stalled-cycles-backend 253,192,498,861 ( 18.93%) 233,727,666,185 ( 17.52%)
instructions 1,721,985,080,913 ( 1.29) 1,724,666,236,005 ( 1.29)
branches 334,517,360,255 ( 694.134) 335,199,758,164 ( 697.131)
branch-misses 873,496,730 ( 0.26%) 815,379,236 ( 0.24%)
jobs10 perfstat
stalled-cycles-frontend 549,063,363,749 ( 37.18%) 651,302,376,662 ( 43.61%)
stalled-cycles-backend 281,680,986,810 ( 19.07%) 277,005,235,582 ( 18.55%)
instructions 1,901,859,271,180 ( 1.29) 1,906,311,064,230 ( 1.28)
branches 369,398,536,153 ( 694.004) 370,527,696,358 ( 688.409)
branch-misses 967,929,335 ( 0.26%) 890,125,056 ( 0.24%)

BASE PATCHED
seconds elapsed 79.421641008 78.735285546
seconds elapsed 61.471246133 60.869085949
seconds elapsed 62.317058173 62.224188495
seconds elapsed 60.030739363 60.081102518
seconds elapsed 74.070398362 74.317582865
seconds elapsed 84.985953007 85.414364176
seconds elapsed 97.724553255 98.173311344
seconds elapsed 109.488066758 110.268399318
seconds elapsed 122.768189405 122.967164498
seconds elapsed 135.130035105 136.934770801

On my other system (8 x86_64 CPUs, short version of test results):

BASE PATCHED
seconds elapsed 19.518065994 19.806320662
seconds elapsed 15.172772749 15.594718291
seconds elapsed 13.820925970 13.821708564
seconds elapsed 13.293097816 14.585206405
seconds elapsed 16.207284118 16.064431606
seconds elapsed 17.958376158 17.771825767
seconds elapsed 19.478009164 19.602961508
seconds elapsed 21.347152811 21.352318709
seconds elapsed 24.478121126 24.171088735
seconds elapsed 26.865057442 26.767327618

So performance-wise the numbers are quite similar.

Also update zcomp interface to be more aligned with the crypto API.

[1] http://marc.info/?l=linux-kernel&m=144480832108927&w=2
[2] http://marc.info/?l=linux-kernel&m=145379613507518&w=2
[3] https://github.com/sergey-senozhatsky/zram-perf-test

Link: http://lkml.kernel.org/r/20160531122017.2878-3-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
/linux-master/drivers/tty/
H A Dn_tty.cdiff a5660b41 Mon Apr 04 15:26:54 MDT 2011 Linus Torvalds <torvalds@linux-foundation.org> tty: fix endless work loop when the buffer fills up

Commit f23eb2b2b285 ('tty: stop using "delayed_work" in the tty layer')
ended up causing hung machines on UP with no preemption, because the
work routine to flip the buffer data to the ldisc would endlessly re-arm
itself if the destination buffer had filled up.

With the delayed work, that only caused a timer-driving polling of the
tty state every timer tick, but without the delay we just ended up with
basically a busy loop instead.

Stop the insane polling, and instead make the code that opens up the
receive room re-schedule the buffer flip work. That's what we should
have been doing anyway.

This same "poll for tty room" issue is almost certainly also the cause
of excessive kworker activity when idle reported by Dave Jones, who also
reported "flush_to_ldisc executing 2500 times a second" back in Nov 2010:

http://lkml.org/lkml/2010/11/30/592

which is that silly flushing done every timer tick. Wasting both power
and CPU for no good reason.

Reported-and-tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Reported-and-tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: Greg KH <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
/linux-master/tools/perf/util/
H A Dannotate.hdiff 592c10e2 Wed Apr 11 07:30:03 MDT 2018 Arnaldo Carvalho de Melo <acme@redhat.com> perf annotate: Allow showing offsets in more than just jump targets

Jesper wanted to see offsets at callq sites when doing some performance
investigation related to retpolines, so save him some time by providing
an 'struct annotation_options' to control where offsets should appear:
just on jump targets? That + call instructions? All?

This puts in place the logic to show the offsets, now we need to wire
this up in the TUI browser (next patch) and on the 'perf annotate --stdio2"
interface, where we need a more general mechanism to setup the
'annotation_options' struct from the command line.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-m3jc9c3swobye9tj08gnh5i7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
H A Dutil.hdiff 592d5a6b Wed Sep 02 01:56:34 MDT 2015 Jiri Olsa <jolsa@kernel.org> tools lib api fs: Move tracing_path interface into api/fs/tracing_path.c

Moving tracing_path interface into api/fs/tracing_path.c out of util.c.
It seems generic enough to be used by others, and I couldn't think of
better place.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Matt Fleming <matt.fleming@intel.com>
Reviewed-by: Raphael Beamonte <raphael.beamonte@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1441180605-24737-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
/linux-master/drivers/staging/rtl8192e/rtl8192e/
H A Drtl_core.cdiff 1f3aefb5 Sun Feb 22 07:58:08 MST 2015 Melike Yurtoglu <aysemelikeyurtoglu@gmail.com> Staging: rtl8192e: replace memcpy() by ether_addr_copy() using coccinelle and pack variable

This patch focuses on fixing the following warning generated
by checkpatch.pl for the file rxtx.c

Prefer ether_addr_copy() over memcpy() if the Ethernet addresses
are __aligned(2)

@@ expression e1, e2; @@

- memcpy(e1, e2, ETH_ALEN);
+ ether_addr_copy(e1, e2);

struct net_device {
char name[16]; /* 0 16*/
struct hlist_node name_hlist; /* 16 16*/
char * ifalias; /* 32 8*/
long unsigned int mem_end; /* 40 8*/
long unsigned int mem_start; /* 48 8*/
long unsigned int base_addr; /* 56 8*/
/* --- cacheline 1 boundary (64 bytes) --- */
int irq; /* 64 4*/

/* XXX 4 bytes hole, try to pack */

long unsigned int state; /* 72 8*/
struct list_head dev_list; /* 80 16*/
struct list_head napi_list; /* 96 16*/
struct list_head unreg_list; /* 112 16*/
/* --- cacheline 2 boundary (128 bytes) --- */
struct list_head close_list; /* 128 16*/
struct {
struct list_head upper; /* 144 16*/
struct list_head lower; /* 160 16*/
} adj_list; /* 144 32*/
struct {
struct list_head upper; /* 176 16*/
struct list_head lower; /* 192 16*/
} all_adj_list; /* 176 32*/
/* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */
netdev_features_t features; /* 208 8*/
netdev_features_t hw_features; /* 216 8*/
netdev_features_t wanted_features; /* 224 8*/
netdev_features_t vlan_features; /* 232 8*/
netdev_features_t hw_enc_features; /* 240 8*/
netdev_features_t mpls_features; /* 248 8*/
/* --- cacheline 4 boundary (256 bytes) --- */
int ifindex; /* 256 4*/
int iflink; /* 260 4*/
struct net_device_stats stats; /* 264 184*/
/* --- cacheline 7 boundary (448 bytes) --- */
atomic_long_t rx_dropped; /* 448 8*/
atomic_long_t tx_dropped; /* 456 8*/
atomic_t carrier_changes; /* 464 4*/

/* XXX 4 bytes hole, try to pack */

const struct iw_handler_def * wireless_handlers; /* 472 8*/
struct iw_public_data * wireless_data; /* 480 8*/
const struct net_device_ops * netdev_ops; /* 488 8*/
const struct ethtool_ops * ethtool_ops; /* 496 8*/
const struct forwarding_accel_ops * fwd_ops; /* 504 8*/
/* --- cacheline 8 boundary (512 bytes) --- */
const struct header_ops * header_ops; /* 512 8*/
unsigned int flags; /* 520 4*/
unsigned int priv_flags; /* 524 4*/
short unsigned int gflags; /* 528 2*/
short unsigned int padded; /* 530 2*/
unsigned char operstate; /* 532 1*/
unsigned char link_mode; /* 533 1*/
unsigned char if_port; /* 534 1*/
unsigned char dma; /* 535 1*/
unsigned int mtu; /* 536 4*/
short unsigned int type; /* 540 2*/
short unsigned int hard_header_len; /* 542 2*/
short unsigned int needed_headroom; /* 544 2*/
short unsigned int needed_tailroom; /* 546 2*/
unsigned char perm_addr[32]; /* 548 32*/
/* --- cacheline 9 boundary (576 bytes) was 4 bytes ago --- */
unsigned char addr_assign_type; /* 580 1*/
unsigned char addr_len; /* 581 1*/
short unsigned int neigh_priv_len; /* 582 2*/
short unsigned int dev_id; /* 584 2*/
short unsigned int dev_port; /* 586 2*/
spinlock_t addr_list_lock; /* 588 4*/
struct netdev_hw_addr_list uc; /* 592 24*/
struct netdev_hw_addr_list mc; /* 616 24*/
/* --- cacheline 10 boundary (640 bytes) --- */
struct netdev_hw_addr_list dev_addrs; /* 640 24*/
struct kset * queues_kset; /* 664 8*/
unsigned char name_assign_type; /* 672 1*/
bool uc_promisc; /* 673 1*/

/* XXX 2 bytes hole, try to pack */

unsigned int promiscuity; /* 676 4*/
unsigned int allmulti; /* 680 4*/

/* XXX 4 bytes hole, try to pack */

struct vlan_info * vlan_info; /* 688 8*/
struct dsa_switch_tree * dsa_ptr; /* 696 8*/
/* --- cacheline 11 boundary (704 bytes) --- */
struct tipc_bearer * tipc_ptr; /* 704 8*/
void * atalk_ptr; /* 712 8*/
struct in_device * ip_ptr; /* 720 8*/
struct dn_dev * dn_ptr; /* 728 8*/
struct inet6_dev * ip6_ptr; /* 736 8*/
void * ax25_ptr; /* 744 8*/
struct wireless_dev * ieee80211_ptr; /* 752 8*/
struct wpan_dev * ieee802154_ptr; /* 760 8*/
/* --- cacheline 12 boundary (768 bytes) --- */
long unsigned int last_rx; /* 768 8*/
unsigned char * dev_addr; /* 776 8*/
struct netdev_rx_queue * _rx; /* 784 8*/
unsigned int num_rx_queues; /* 792 4*/
unsigned int real_num_rx_queues; /* 796 4*/
long unsigned int gro_flush_timeout; /* 800 8*/
rx_handler_func_t * rx_handler; /* 808 8*/
void * rx_handler_data; /* 816 8*/
struct netdev_queue * ingress_queue; /* 824 8*/
/* --- cacheline 13 boundary (832 bytes) --- */
unsigned char broadcast[32]; /* 832 32*/

/* XXX 32 bytes hole, try to pack */

/* --- cacheline 14 boundary (896 bytes) --- */
struct netdev_queue * _tx; /* 896 8*/
unsigned int num_tx_queues; /* 904 4*/
unsigned int real_num_tx_queues; /* 908 4*/
struct Qdisc * qdisc; /* 912 8*/
long unsigned int tx_queue_len; /* 920 8*/
spinlock_t tx_global_lock; /* 928 4*/

/* XXX 4 bytes hole, try to pack */

struct xps_dev_maps * xps_maps; /* 936 8*/
struct cpu_rmap * rx_cpu_rmap; /* 944 8*/
long unsigned int trans_start; /* 952 8*/
/* --- cacheline 15 boundary (960 bytes) --- */
int watchdog_timeo; /* 960 4*/

/* XXX 4 bytes hole, try to pack */

struct timer_list watchdog_timer; /* 968 80*/
/* --- cacheline 16 boundary (1024 bytes) was 24 bytes ago ---* */
int * pcpu_refcnt; /* 1048 8*/
struct list_head todo_list; /* 1056 16*/
struct hlist_node index_hlist; /* 1072 16*/
/* --- cacheline 17 boundary (1088 bytes) --- */
struct list_head link_watch_list; /* 1088 16*/
enum {
NETREG_UNINITIALIZED = 0,
NETREG_REGISTERED = 1,
NETREG_UNREGISTERING = 2,
NETREG_UNREGISTERED = 3,
NETREG_RELEASED = 4,
NETREG_DUMMY = 5,
} reg_state:8; /* 1104 4 */
/* Bitfield combined with next fields */

bool dismantle; /* 1105 1*/

/* Bitfield combined with previous fields */

enum {
RTNL_LINK_INITIALIZED = 0,
RTNL_LINK_INITIALIZING = 1,
} rtnl_link_state:16; /* 1104 4 */

/* XXX 4 bytes hole, try to pack */

void (*destructor)(struct net_device *);/* 1112 8 */
struct netpoll_info * npinfo; /* 1120 8*/
struct net * nd_net; /* 1128 8*/
union {
void * ml_priv; /* 8*/
struct pcpu_lstats * lstats; /* 8*/
struct pcpu_sw_netstats * tstats; /* 8*/
struct pcpu_dstats * dstats; /* 8*/
struct pcpu_vstats * vstats; /* 8*/
}; /* 1136 8*/
struct garp_port * garp_port; /* 1144 8*/
/* --- cacheline 18 boundary (1152 bytes) was 4 bytes ago --- */
struct mrp_port * mrp_port; /* 1152 8*/
struct device dev; /* 1160 696*/

/* XXX last struct has 7 bytes of padding */

/* --- cacheline 29 boundary (1856 bytes) was 4 bytes ago --- */
const struct attribute_group * sysfs_groups[4]; /* 1856 32*/
const struct attribute_group * sysfs_rx_queue_group; /* 18888 */
const struct rtnl_link_ops * rtnl_link_ops; /* 1896 8*/
unsigned int gso_max_size; /* 1904 4*/
u16 gso_max_segs; /* 1908 2*/
u16 gso_min_segs; /* 1910 2*/
const struct dcbnl_rtnl_ops * dcbnl_ops; /* 1912 8*/
/* --- cacheline 30 boundary (1920 bytes) was 4 bytes ago --- */
u8 num_tc; /* 1920 1*/

/* XXX 1 byte hole, try to pack */

struct netdev_tc_txq tc_to_txq[16]; /* 1922 64*/
/* --- cacheline 31 boundary (1984 bytes) was 6 bytes ago --- */
u8 prio_tc_map[16]; /* 1986 16*/

/* XXX 2 bytes hole, try to pack */

unsigned int fcoe_ddp_xid; /* 2004 4*/
struct phy_device * phydev; /* 2008 8*/
struct lock_class_key * qdisc_tx_busylock; /* 2016 8*/
int group; /* 2024 4*/

/* XXX 4 bytes hole, try to pack */

struct pm_qos_request pm_qos_req; /* 2032 176*/
/* --- cacheline 34 boundary (2176 bytes) was 36 bytes ago --- * */

/* size: 2240, cachelines: 35, members: 120 */
/* sum members: 2147, holes: 11, sum holes: 65 */
/* padding: 32 */
/* paddings: 1, sum paddings: 7 */

/* BRAIN FART ALERT! 2240 != 2147 + 65(holes), diff = 28 */

};

Signed-off-by: Melike Yurtoglu <aysemelikeyurtoglu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
/linux-master/fs/
H A Deventpoll.cdiff 3419b23a Sun Jun 25 06:48:14 MDT 2006 Davide Libenzi <davidel@xmailserver.org> [PATCH] epoll: use unlocked wqueue operations

A few days ago Arjan signaled a lockdep red flag on epoll locks, and
precisely between the epoll's device structure lock (->lock) and the wait
queue head lock (->lock).

Like I explained in another email, and directly to Arjan, this can't happen
in reality because of the explicit check at eventpoll.c:592, that does not
allow to drop an epoll fd inside the same epoll fd. Since lockdep is
working on per-structure locks, it will never be able to know of policies
enforced in other parts of the code.

It was decided time ago of having the ability to drop epoll fds inside
other epoll fds, that triggers a very trick wakeup operations (due to
possibly reentrant callback-driven wakeups) handled by the
ep_poll_safewake() function. While looking again at the code though, I
noticed that all the operations done on the epoll's main structure wait
queue head (->wq) are already protected by the epoll lock (->lock), so that
locked-style functions can be used to manipulate the ->wq member. This
makes both a lock-acquire save, and lockdep happy.

Running totalmess on my dual opteron for a while did not reveal any problem
so far:

http://www.xmailserver.org/totalmess.c

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
/linux-master/fs/gfs2/
H A Dops_fstype.cdiff eb602521 Thu Mar 04 07:28:57 MST 2021 Yang Li <yang.lee@linux.alibaba.com> gfs2: make function gfs2_make_fs_ro() to void type

It fixes the following warning detected by coccinelle:
./fs/gfs2/super.c:592:5-10: Unneeded variable: "error". Return "0" on
line 628

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
H A Dsuper.cdiff eb602521 Thu Mar 04 07:28:57 MST 2021 Yang Li <yang.lee@linux.alibaba.com> gfs2: make function gfs2_make_fs_ro() to void type

It fixes the following warning detected by coccinelle:
./fs/gfs2/super.c:592:5-10: Unneeded variable: "error". Return "0" on
line 628

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Completed in 2743 milliseconds

<<111213141516