Searched +hist:4 +hist:b6f5d20 (Results 51 - 75 of 106) sorted by last modified time
/linux-master/fs/debugfs/ | ||
H A D | file.c | diff d472cf79 Mon Sep 19 11:24:18 MDT 2022 Akinobu Mita <akinobu.mita@gmail.com> debugfs: fix error when writing negative value to atomic_t debugfs file The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()"), so we have to use a 64-bit value to write a negative value for a debugfs file created by debugfs_create_atomic_t(). This restores the previous behaviour by introducing DEFINE_DEBUGFS_ATTRIBUTE_SIGNED for a signed value. Link: https://lkml.kernel.org/r/20220919172418.45257-4-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> diff 275678e7 Mon Feb 17 21:31:50 MST 2020 Taehee Yoo <ap420073@gmail.com> debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff a37f4958 Sat Dec 07 09:16:03 MST 2019 Eric Snowberg <eric.snowberg@oracle.com> debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff a37f4958 Sat Dec 07 09:16:03 MST 2019 Eric Snowberg <eric.snowberg@oracle.com> debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff a37f4958 Sat Dec 07 09:16:03 MST 2019 Eric Snowberg <eric.snowberg@oracle.com> debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff a37f4958 Sat Dec 07 09:16:03 MST 2019 Eric Snowberg <eric.snowberg@oracle.com> debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 8e580263 Fri Oct 11 07:29:27 MDT 2019 Greg Kroah-Hartman <gregkh@linuxfoundation.org> debugfs: remove return value of debugfs_create_size_t() No one checks the return value of debugfs_create_size_t(), as it's not needed, so make the return value void, so that no one tries to do so in the future. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20191011132931.1186197-4-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 4d45f797 Tue Mar 22 07:11:18 MDT 2016 Nicolai Stange <nicstange@gmail.com> debugfs: unproxify files created through debugfs_create_bool() Currently, the struct file_operations fops_bool associated with files created through the debugfs_create_bool() helpers are not file lifetime aware. Thus, a lifetime managing proxy is created around fops_bool each time such a file is opened which is an unnecessary waste of resources. Implement file lifetime management for the fops_bool file_operations. Namely, make debugfs_read_file_bool() and debugfs_write_file_bool() safe against file removals by means of debugfs_use_file_start() and debugfs_use_file_finish(). Make debugfs_create_bool() create its files in non-proxying operation mode through debugfs_create_mode_unsafe(). Finally, purge debugfs_create_mode() as debugfs_create_bool() had been its last user. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/bfs/ | ||
H A D | file.c | diff 0af57378 Mon Jun 28 20:36:12 MDT 2021 Christoph Hellwig <hch@lst.de> mm: require ->set_page_dirty to be explicitly wired up Remove the CONFIG_BLOCK default to __set_page_dirty_buffers and just wire that method up for the missing instances. [hch@lst.de: ecryptfs: add a ->set_page_dirty cludge] Link: https://lkml.kernel.org/r/20210624125250.536369-1-hch@lst.de Link: https://lkml.kernel.org/r/20210614061512.3966143-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Tyler Hicks <code@tyhicks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 4e29d50a Mon Jul 05 06:15:01 MDT 2010 Artem Bityutskiy <Artem.Bityutskiy@nokia.com> BFS: clean up the superblock usage BFS is a very simple FS and its superblocks contains only static information and is never changed. However, the BFS code for some misterious reasons marked its buffer head as dirty from time to time, but nothing in that buffer was ever changed. This patch removes all the BFS superblock manipulation, simply because it is not needed. It removes: 1. The si_sbh filed from 'struct bfs_sb_info' because it is not needed. We only need to read the SB once on mount to get the start of data blocks and the FS size. After this, we can forget about the SB. 2. All instances of 'mark_buffer_dirty(sbh)' for BFS SB because it is never changed. 3. The '->sync_fs()' method because there is nothing to sync (inodes are synched by VFS). 4. The '->write_super()' method, again, because the SB is never changed. Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4e29d50a Mon Jul 05 06:15:01 MDT 2010 Artem Bityutskiy <Artem.Bityutskiy@nokia.com> BFS: clean up the superblock usage BFS is a very simple FS and its superblocks contains only static information and is never changed. However, the BFS code for some misterious reasons marked its buffer head as dirty from time to time, but nothing in that buffer was ever changed. This patch removes all the BFS superblock manipulation, simply because it is not needed. It removes: 1. The si_sbh filed from 'struct bfs_sb_info' because it is not needed. We only need to read the SB once on mount to get the start of data blocks and the FS size. After this, we can forget about the SB. 2. All instances of 'mark_buffer_dirty(sbh)' for BFS SB because it is never changed. 3. The '->sync_fs()' method because there is nothing to sync (inodes are synched by VFS). 4. The '->write_super()' method, again, because the SB is never changed. Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/coda/ | ||
H A D | file.c | diff b1deb685 Mon Nov 08 19:34:36 MST 2021 Alex Shi <alexs@kernel.org> coda: remove err which no one care No one care 'err' in func coda_release, so better remove it. Link: https://lkml.kernel.org/r/20210908140308.18491-4-jaharkes@cs.cmu.edu Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Jing Yangyang <jing.yangyang@zte.com.cn> Cc: Xin Tan <tanxin.ctf@gmail.com> Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn> Cc: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 4c728ef5 Mon Dec 22 13:11:15 MST 2008 Christoph Hellwig <hch@lst.de> add a vfs_fsync helper Fsync currently has a fdatawrite/fdatawait pair around the method call, and a mutex_lock/unlock of the inode mutex. All callers of fsync have to duplicate this, but we have a few and most of them don't quite get it right. This patch adds a new vfs_fsync that takes care of this. It's a little more complicated as usual as ->fsync might get a NULL file pointer and just a dentry from nfsd, but otherwise gets afile and we want to take the mapping and file operations from it when it is there. Notes on the fsync callers: - ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the lower file - coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host file, and returning 0 when ->fsync was missing - shm wasn't calling either filemap_fdatawrite / filemap_fdatawait nor taking i_mutex. Now given that shared memory doesn't have disk backing not doing anything in fsync seems fine and I left it out of the vfs_fsync conversion for now, but in that case we might just not pass it through to the lower file at all but just call the no-op simple_sync_file directly. [and now actually export vfs_fsync] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | dir.c | diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 4acdaf27 Mon Jul 25 23:42:34 MDT 2011 Al Viro <viro@zeniv.linux.org.uk> switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/include/linux/ | ||
H A D | crash_dump.h | diff e0690479 Fri Apr 29 15:37:59 MDT 2022 Matthew Wilcox (Oracle) <willy@infradead.org> vmcore: convert read_from_oldmem() to take an iov_iter Remove the read_from_oldmem() wrapper introduced earlier and convert all the remaining callers to pass an iov_iter. Link: https://lkml.kernel.org/r/20220408090636.560886-4-bhe@redhat.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> diff 65fddcfc Mon Jun 08 22:32:42 MDT 2020 Mike Rapoport <rppt@kernel.org> mm: reorder includes after introduction of linux/pgtable.h The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 4c5b566c Mon Mar 30 12:15:44 MDT 2020 Kairui Song <kasong@redhat.com> crash_dump: Remove no longer used saved_max_pfn saved_max_pfn was originally introduced in commit 92aa63a5a1bf ("[PATCH] kdump: Retrieve saved max pfn") It used to make sure that the user does not try to read the physical memory beyond saved_max_pfn. But since commit 921d58c0e699 ("vmcore: remove saved_max_pfn check") it's no longer used for the check. This variable doesn't have any users anymore so just remove it. [ bp: Drop the Calgary IOMMU reference from the commit message. ] Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lkml.kernel.org/r/20200330181544.1595733-1-kasong@redhat.com diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff be8a8d06 Wed Sep 11 15:24:49 MDT 2013 Michael Holzheu <holzheu@linux.vnet.ibm.com> vmcore: introduce ELF header in new memory feature For s390 we want to use /proc/vmcore for our SCSI stand-alone dump (zfcpdump). We have support where the first HSA_SIZE bytes are saved into a hypervisor owned memory area (HSA) before the kdump kernel is booted. When the kdump kernel starts, it is restricted to use only HSA_SIZE bytes. The advantages of this mechanism are: * No crashkernel memory has to be defined in the old kernel. * Early boot problems (before kexec_load has been done) can be dumped * Non-Linux systems can be dumped. We modify the s390 copy_oldmem_page() function to read from the HSA memory if memory below HSA_SIZE bytes is requested. Since we cannot use the kexec tool to load the kernel in this scenario, we have to build the ELF header in the 2nd (kdump/new) kernel. So with the following patch set we would like to introduce the new function that the ELF header for /proc/vmcore can be created in the 2nd kernel memory. The following steps are done during zfcpdump execution: 1. Production system crashes 2. User boots a SCSI disk that has been prepared with the zfcpdump tool 3. Hypervisor saves CPU state of boot CPU and HSA_SIZE bytes of memory into HSA 4. Boot loader loads kernel into low memory area 5. Kernel boots and uses only HSA_SIZE bytes of memory 6. Kernel saves registers of non-boot CPUs 7. Kernel does memory detection for dump memory map 8. Kernel creates ELF header for /proc/vmcore 9. /proc/vmcore uses this header for initialization 10. The zfcpdump user space reads /proc/vmcore to write dump to SCSI disk - copy_oldmem_page() copies from HSA for memory below HSA_SIZE - copy_oldmem_page() copies from real memory for memory above HSA_SIZE Currently for s390 we create the ELF core header in the 2nd kernel with a small trick. We relocate the addresses in the ELF header in a way that for the /proc/vmcore code it seems to be in the 1st kernel (old) memory and the read_from_oldmem() returns the correct data. This allows the /proc/vmcore code to use the ELF header in the 2nd kernel. This patch: Exchange the old mechanism with the new and much cleaner function call override feature that now offcially allows to create the ELF core header in the 2nd kernel. To use the new feature the following function have to be defined by the architecture backend code to read from new memory: * elfcorehdr_alloc: Allocate ELF header * elfcorehdr_free: Free the memory of the ELF header * elfcorehdr_read: Read from ELF header * elfcorehdr_read_notes: Read from ELF notes Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Willeke <willeke@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 313162d0 Mon Jan 30 09:46:54 MST 2012 Paul Gortmaker <paul.gortmaker@windriver.com> device.h: audit and cleanup users in main include dir The <linux/device.h> header includes a lot of stuff, and it in turn gets a lot of use just for the basic "struct device" which appears so often. Clean up the users as follows: 1) For those headers only needing "struct device" as a pointer in fcn args, replace the include with exactly that. 2) For headers not really using anything from device.h, simply delete the include altogether. 3) For headers relying on getting device.h implicitly before being included themselves, now explicitly include device.h 4) For files in which doing #1 or #2 uncovers an implicit dependency on some other header, fix by explicitly adding the required header(s). Any C files that were implicitly relying on device.h to be present have already been dealt with in advance. Total removals from #1 and #2: 51. Total additions coming from #3: 9. Total other implicit dependencies from #4: 7. As of 3.3-rc1, there were 110, so a net removal of 42 gives about a 38% reduction in device.h presence in include/* Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> diff 313162d0 Mon Jan 30 09:46:54 MST 2012 Paul Gortmaker <paul.gortmaker@windriver.com> device.h: audit and cleanup users in main include dir The <linux/device.h> header includes a lot of stuff, and it in turn gets a lot of use just for the basic "struct device" which appears so often. Clean up the users as follows: 1) For those headers only needing "struct device" as a pointer in fcn args, replace the include with exactly that. 2) For headers not really using anything from device.h, simply delete the include altogether. 3) For headers relying on getting device.h implicitly before being included themselves, now explicitly include device.h 4) For files in which doing #1 or #2 uncovers an implicit dependency on some other header, fix by explicitly adding the required header(s). Any C files that were implicitly relying on device.h to be present have already been dealt with in advance. Total removals from #1 and #2: 51. Total additions coming from #3: 9. Total other implicit dependencies from #4: 7. As of 3.3-rc1, there were 110, so a net removal of 42 gives about a 38% reduction in device.h presence in include/* Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | ramfs.h | diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 4b19de6d Thu Oct 02 15:50:16 MDT 2008 Nick Piggin <npiggin@suse.de> mm: tiny-shmem nommu fix The previous patch db203d53d474aa068984e409d807628f5841da1b ("mm: tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU architectures because it was using the expanding truncate to signal ramfs to allocate a physically contiguous RAM backing the inode (otherwise it is unusable for "memory mapping" it to userspace). However do_truncate is what caused the lock ordering error, due to it taking i_mutex. In this case, we can actually just call ramfs directly to allocate memory for the mapping, rather than go via truncate. Acked-by: David Howells <dhowells@redhat.com> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/ | ||
H A D | char_dev.c | diff 68279f9c Wed Oct 11 10:55:00 MDT 2023 Alexey Dobriyan <adobriyan@gmail.com> treewide: mark stuff as __ro_after_init __read_mostly predates __ro_after_init. Many variables which are marked __read_mostly should have been __ro_after_init from day 1. Also, mark some stuff as "const" and "__init" while I'm at it. [akpm@linux-foundation.org: revert sysctl_nr_open_min, sysctl_nr_open_max changes due to arm warning] [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/4f6bb9c0-abba-4ee4-a7aa-89265e886817@p183 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> diff 68279f9c Wed Oct 11 10:55:00 MDT 2023 Alexey Dobriyan <adobriyan@gmail.com> treewide: mark stuff as __ro_after_init __read_mostly predates __ro_after_init. Many variables which are marked __read_mostly should have been __ro_after_init from day 1. Also, mark some stuff as "const" and "__init" while I'm at it. [akpm@linux-foundation.org: revert sysctl_nr_open_min, sysctl_nr_open_max changes due to arm warning] [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/4f6bb9c0-abba-4ee4-a7aa-89265e886817@p183 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> diff 4b0be572 Fri Feb 15 05:27:13 MST 2019 Chengguang Xu <cgxu519@gmx.com> chardev: code cleanup for __register_chrdev_region() It's just code cleanup, not functional change. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff a5d31a3f Thu Jun 15 14:05:20 MDT 2017 Logan Gunthorpe <logang@deltatee.com> char_dev: extend dynamic allocation of majors into a higher range We've run into problems with running out of dynamicly assign char device majors particullarly on automated test systems with all-yes-configs. Roughly 40 dynamic assignments can be made with such kernels at this time while space is reserved for only 20. Currently, the kernel only prints a warning when dynamic allocation overflows the reserved region. And when this happens drivers that have fixed assignments can randomly fail depending on the order of initialization of other drivers. Thus, adding a new char device can cause unexpected failures in completely unrelated parts of the kernel. This patch solves the problem by extending dynamic major number allocations down from 511 once the 234-254 region fills up. Fixed majors already exist above 255 so the infrastructure to support high number majors is already in place. The patch reserves an additional 128 major numbers which should hopefully last us a while. Kernels that don't require more than 20 dynamic majors assigned (which is pretty typical) should not be affected by this change. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Cox <alan@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Linus Walleij <linus.walleij@linaro.org> Link: https://lkml.org/lkml/2017/6/4/107 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 233ed09d Fri Mar 17 00:48:08 MDT 2017 Logan Gunthorpe <logang@deltatee.com> chardev: add helper function to register char devs with a struct device Credit for this patch goes is shared with Dan Williams [1]. I've taken things one step further to make the helper function more useful and clean up calling code. There's a common pattern in the kernel whereby a struct cdev is placed in a structure along side a struct device which manages the life-cycle of both. In the naive approach, the reference counting is broken and the struct device can free everything before the chardev code is entirely released. Many developers have solved this problem by linking the internal kobjs in this fashion: cdev.kobj.parent = &parent_dev.kobj; The cdev code explicitly gets and puts a reference to it's kobj parent. So this seems like it was intended to be used this way. Dmitrty Torokhov first put this in place in 2012 with this commit: 2f0157f char_dev: pin parent kobject and the first instance of the fix was then done in the input subsystem in the following commit: 4a215aa Input: fix use-after-free introduced with dynamic minor changes Subsequently over the years, however, this issue seems to have tripped up multiple developers independently. For example, see these commits: 0d5b7da iio: Prevent race between IIO chardev opening and IIO device (by Lars-Peter Clausen in 2013) ba0ef85 tpm: Fix initialization of the cdev (by Jason Gunthorpe in 2015) 5b28dde [media] media: fix use-after-free in cdev_put() when app exits after driver unbind (by Shauh Khan in 2016) This technique is similarly done in at least 15 places within the kernel and probably should have been done so in another, at least, 5 places. The kobj line also looks very suspect in that one would not expect drivers to have to mess with kobject internals in this way. Even highly experienced kernel developers can be surprised by this code, as seen in [2]. To help alleviate this situation, and hopefully prevent future wasted effort on this problem, this patch introduces a helper function to register a char device along with its parent struct device. This creates a more regular API for tying a char device to its parent without the developer having to set members in the underlying kobject. This patch introduce cdev_device_add and cdev_device_del which replaces a common pattern including setting the kobj parent, calling cdev_add and then calling device_add. It also introduces cdev_set_parent for the few cases that set the kobject parent without using device_add. [1] https://lkml.org/lkml/2017/2/13/700 [2] https://lkml.org/lkml/2017/2/10/370 Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 01d553d0 Fri Sep 29 03:00:08 MDT 2006 Amos Waterland <apw@us.ibm.com> [PATCH] Chardev checking of overlapping ranges The code in __register_chrdev_region checks that if the driver wishing to register has the same major as an existing driver the new minor range is strictly less than the existing minor range. However, it does not also check that the new minor range is strictly greater than the existing minor range. That is, if driver X has registered with major=x and minor=0-3, __register_chrdev_region will allow driver Y to register with major=x and minor=1-4. Signed-off-by: Amos Waterland <apw@us.ibm.com> Cc: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 7170be5f Sat Jan 14 14:20:38 MST 2006 Neil Horman <nhorman@redhat.com> [PATCH] convert /proc/devices to use seq_file interface A Christoph suggested that the /proc/devices file be converted to use the seq_file interface. This patch does that. I've obxerved one or two installation that had sufficiently large sans that they overran the 4k limit on /proc/devices. Signed-off-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | binfmt_misc.c | diff 21ca59b3 Wed Oct 27 16:31:14 MDT 2021 Christian Brauner <brauner@kernel.org> binfmt_misc: enable sandboxed mounts Enable unprivileged sandboxes to create their own binfmt_misc mounts. This is based on Laurent's work in [1] but has been significantly reworked to fix various issues we identified in earlier versions. While binfmt_misc can currently only be mounted in the initial user namespace, binary types registered in this binfmt_misc instance are available to all sandboxes (Either by having them installed in the sandbox or by registering the binary type with the F flag causing the interpreter to be opened right away). So binfmt_misc binary types are already delegated to sandboxes implicitly. However, while a sandbox has access to all registered binary types in binfmt_misc a sandbox cannot currently register its own binary types in binfmt_misc. This has prevented various use-cases some of which were already outlined in [1] but we have a range of issues associated with this (cf. [3]-[5] below which are just a small sample). Extend binfmt_misc to be mountable in non-initial user namespaces. Similar to other filesystem such as nfsd, mqueue, and sunrpc we use keyed superblock management. The key determines whether we need to create a new superblock or can reuse an already existing one. We use the user namespace of the mount as key. This means a new binfmt_misc superblock is created once per user namespace creation. Subsequent mounts of binfmt_misc in the same user namespace will mount the same binfmt_misc instance. We explicitly do not create a new binfmt_misc superblock on every binfmt_misc mount as the semantics for load_misc_binary() line up with the keying model. This also allows us to retrieve the relevant binfmt_misc instance based on the caller's user namespace which can be done in a simple (bounded to 32 levels) loop. Similar to the current binfmt_misc semantics allowing access to the binary types in the initial binfmt_misc instance we do allow sandboxes access to their parent's binfmt_misc mounts if they do not have created a separate binfmt_misc instance. Overall, this will unblock the use-cases mentioned below and in general will also allow to support and harden execution of another architecture's binaries in tight sandboxes. For instance, using the unshare binary it possible to start a chroot of another architecture and configure the binfmt_misc interpreter without being root to run the binaries in this chroot and without requiring the host to modify its binary type handlers. Henning had already posted a few experiments in the cover letter at [1]. But here's an additional example where an unprivileged container registers qemu-user-static binary handlers for various binary types in its separate binfmt_misc mount and is then seamlessly able to start containers with a different architecture without affecting the host: root [lxc monitor] /var/snap/lxd/common/lxd/containers f1 1000000 \_ /sbin/init 1000000 \_ /lib/systemd/systemd-journald 1000000 \_ /lib/systemd/systemd-udevd 1000100 \_ /lib/systemd/systemd-networkd 1000101 \_ /lib/systemd/systemd-resolved 1000000 \_ /usr/sbin/cron -f 1000103 \_ /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 1000000 \_ /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers 1000104 \_ /usr/sbin/rsyslogd -n -iNONE 1000000 \_ /lib/systemd/systemd-logind 1000000 \_ /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 vt220 1000107 \_ dnsmasq --conf-file=/dev/null -u lxc-dnsmasq --strict-order --bind-interfaces --pid-file=/run/lxc/dnsmasq.pid --liste 1000000 \_ [lxc monitor] /var/lib/lxc f1-s390x 1100000 \_ /usr/bin/qemu-s390x-static /sbin/init 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-journald 1100000 \_ /usr/bin/qemu-s390x-static /usr/sbin/cron -f 1100103 \_ /usr/bin/qemu-s390x-static /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-ac 1100000 \_ /usr/bin/qemu-s390x-static /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers 1100104 \_ /usr/bin/qemu-s390x-static /usr/sbin/rsyslogd -n -iNONE 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-logind 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/0 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/1 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/2 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/3 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-udevd [1]: https://lore.kernel.org/all/20191216091220.465626-1-laurent@vivier.eu [2]: https://discuss.linuxcontainers.org/t/binfmt-misc-permission-denied [3]: https://discuss.linuxcontainers.org/t/lxd-binfmt-support-for-qemu-static-interpreters [4]: https://discuss.linuxcontainers.org/t/3-1-0-binfmt-support-service-in-unprivileged-guest-requires-write-access-on-hosts-proc-sys-fs-binfmt-misc [5]: https://discuss.linuxcontainers.org/t/qemu-user-static-not-working-4-11 Link: https://lore.kernel.org/r/20191216091220.465626-2-laurent@vivier.eu (origin) Link: https://lore.kernel.org/r/20211028103114.2849140-2-brauner@kernel.org (v1) Cc: Sargun Dhillon <sargun@sargun.me> Cc: Serge Hallyn <serge@hallyn.com> Cc: Jann Horn <jannh@google.com> Cc: Henning Schild <henning.schild@siemens.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Laurent Vivier <laurent@vivier.eu> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> --- /* v2 */ - Serge Hallyn <serge@hallyn.com>: - Use GFP_KERNEL_ACCOUNT for userspace triggered allocations when a new binary type handler is registered. - Christian Brauner <christian.brauner@ubuntu.com>: - Switch authorship to me. I refused to do that earlier even though Laurent said I should do so because I think it's genuinely bad form. But by now I have changed so many things that it'd be unfair to blame Laurent for any potential bugs in here. - Add more comments that explain what's going on. - Rename functions while changing them to better reflect what they are doing to make the code easier to understand. - In the first version when a specific binary type handler was removed either through a write to the entry's file or all binary type handlers were removed by a write to the binfmt_misc mount's status file all cleanup work happened during inode eviction. That includes removal of the relevant entries from entry list. While that works fine I disliked that model after thinking about it for a bit. Because it means that there was a window were someone has already removed a or all binary handlers but they could still be safely reached from load_misc_binary() when it has managed to take the read_lock() on the entries list while inode eviction was already happening. Again, that perfectly benign but it's cleaner to remove the binary handler from the list immediately meaning that ones the write to then entry's file or the binfmt_misc status file returns the binary type cannot be executed anymore. That gives stronger guarantees to the user. diff 21ca59b3 Wed Oct 27 16:31:14 MDT 2021 Christian Brauner <brauner@kernel.org> binfmt_misc: enable sandboxed mounts Enable unprivileged sandboxes to create their own binfmt_misc mounts. This is based on Laurent's work in [1] but has been significantly reworked to fix various issues we identified in earlier versions. While binfmt_misc can currently only be mounted in the initial user namespace, binary types registered in this binfmt_misc instance are available to all sandboxes (Either by having them installed in the sandbox or by registering the binary type with the F flag causing the interpreter to be opened right away). So binfmt_misc binary types are already delegated to sandboxes implicitly. However, while a sandbox has access to all registered binary types in binfmt_misc a sandbox cannot currently register its own binary types in binfmt_misc. This has prevented various use-cases some of which were already outlined in [1] but we have a range of issues associated with this (cf. [3]-[5] below which are just a small sample). Extend binfmt_misc to be mountable in non-initial user namespaces. Similar to other filesystem such as nfsd, mqueue, and sunrpc we use keyed superblock management. The key determines whether we need to create a new superblock or can reuse an already existing one. We use the user namespace of the mount as key. This means a new binfmt_misc superblock is created once per user namespace creation. Subsequent mounts of binfmt_misc in the same user namespace will mount the same binfmt_misc instance. We explicitly do not create a new binfmt_misc superblock on every binfmt_misc mount as the semantics for load_misc_binary() line up with the keying model. This also allows us to retrieve the relevant binfmt_misc instance based on the caller's user namespace which can be done in a simple (bounded to 32 levels) loop. Similar to the current binfmt_misc semantics allowing access to the binary types in the initial binfmt_misc instance we do allow sandboxes access to their parent's binfmt_misc mounts if they do not have created a separate binfmt_misc instance. Overall, this will unblock the use-cases mentioned below and in general will also allow to support and harden execution of another architecture's binaries in tight sandboxes. For instance, using the unshare binary it possible to start a chroot of another architecture and configure the binfmt_misc interpreter without being root to run the binaries in this chroot and without requiring the host to modify its binary type handlers. Henning had already posted a few experiments in the cover letter at [1]. But here's an additional example where an unprivileged container registers qemu-user-static binary handlers for various binary types in its separate binfmt_misc mount and is then seamlessly able to start containers with a different architecture without affecting the host: root [lxc monitor] /var/snap/lxd/common/lxd/containers f1 1000000 \_ /sbin/init 1000000 \_ /lib/systemd/systemd-journald 1000000 \_ /lib/systemd/systemd-udevd 1000100 \_ /lib/systemd/systemd-networkd 1000101 \_ /lib/systemd/systemd-resolved 1000000 \_ /usr/sbin/cron -f 1000103 \_ /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 1000000 \_ /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers 1000104 \_ /usr/sbin/rsyslogd -n -iNONE 1000000 \_ /lib/systemd/systemd-logind 1000000 \_ /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 vt220 1000107 \_ dnsmasq --conf-file=/dev/null -u lxc-dnsmasq --strict-order --bind-interfaces --pid-file=/run/lxc/dnsmasq.pid --liste 1000000 \_ [lxc monitor] /var/lib/lxc f1-s390x 1100000 \_ /usr/bin/qemu-s390x-static /sbin/init 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-journald 1100000 \_ /usr/bin/qemu-s390x-static /usr/sbin/cron -f 1100103 \_ /usr/bin/qemu-s390x-static /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-ac 1100000 \_ /usr/bin/qemu-s390x-static /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers 1100104 \_ /usr/bin/qemu-s390x-static /usr/sbin/rsyslogd -n -iNONE 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-logind 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/0 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/1 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/2 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/3 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-udevd [1]: https://lore.kernel.org/all/20191216091220.465626-1-laurent@vivier.eu [2]: https://discuss.linuxcontainers.org/t/binfmt-misc-permission-denied [3]: https://discuss.linuxcontainers.org/t/lxd-binfmt-support-for-qemu-static-interpreters [4]: https://discuss.linuxcontainers.org/t/3-1-0-binfmt-support-service-in-unprivileged-guest-requires-write-access-on-hosts-proc-sys-fs-binfmt-misc [5]: https://discuss.linuxcontainers.org/t/qemu-user-static-not-working-4-11 Link: https://lore.kernel.org/r/20191216091220.465626-2-laurent@vivier.eu (origin) Link: https://lore.kernel.org/r/20211028103114.2849140-2-brauner@kernel.org (v1) Cc: Sargun Dhillon <sargun@sargun.me> Cc: Serge Hallyn <serge@hallyn.com> Cc: Jann Horn <jannh@google.com> Cc: Henning Schild <henning.schild@siemens.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Laurent Vivier <laurent@vivier.eu> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> --- /* v2 */ - Serge Hallyn <serge@hallyn.com>: - Use GFP_KERNEL_ACCOUNT for userspace triggered allocations when a new binary type handler is registered. - Christian Brauner <christian.brauner@ubuntu.com>: - Switch authorship to me. I refused to do that earlier even though Laurent said I should do so because I think it's genuinely bad form. But by now I have changed so many things that it'd be unfair to blame Laurent for any potential bugs in here. - Add more comments that explain what's going on. - Rename functions while changing them to better reflect what they are doing to make the code easier to understand. - In the first version when a specific binary type handler was removed either through a write to the entry's file or all binary type handlers were removed by a write to the binfmt_misc mount's status file all cleanup work happened during inode eviction. That includes removal of the relevant entries from entry list. While that works fine I disliked that model after thinking about it for a bit. Because it means that there was a window were someone has already removed a or all binary handlers but they could still be safely reached from load_misc_binary() when it has managed to take the read_lock() on the entries list while inode eviction was already happening. Again, that perfectly benign but it's cleaner to remove the binary handler from the list immediately meaning that ones the write to then entry's file or the binfmt_misc status file returns the binary type cannot be executed anymore. That gives stronger guarantees to the user. diff 1c5976ef Wed Oct 27 16:31:13 MDT 2021 Christian Brauner <brauner@kernel.org> binfmt_misc: cleanup on filesystem umount Currently, registering a new binary type pins the binfmt_misc filesystem. Specifically, this means that as long as there is at least one binary type registered the binfmt_misc filesystem survives all umounts, i.e. the superblock is not destroyed. Meaning that a umount followed by another mount will end up with the same superblock and the same binary type handlers. This is a behavior we tend to discourage for any new filesystems (apart from a few special filesystems such as e.g. configfs or debugfs). A umount operation without the filesystem being pinned - by e.g. someone holding a file descriptor to an open file - should usually result in the destruction of the superblock and all associated resources. This makes introspection easier and leads to clearly defined, simple and clean semantics. An administrator can rely on the fact that a umount will guarantee a clean slate making it possible to reinitialize a filesystem. Right now all binary types would need to be explicitly deleted before that can happen. This allows us to remove the heavy-handed calls to simple_pin_fs() and simple_release_fs() when creating and deleting binary types. This in turn allows us to replace the current brittle pinning mechanism abusing dget() which has caused a range of bugs judging from prior fixes in [2] and [3]. The additional dget() in load_misc_binary() pins the dentry but only does so for the sake to prevent ->evict_inode() from freeing the node when a user removes the binary type and kill_node() is run. Which would mean ->interpreter and ->interp_file would be freed causing a UAF. This isn't really nicely documented nor is it very clean because it relies on simple_pin_fs() pinning the filesystem as long as at least one binary type exists. Otherwise it would cause load_misc_binary() to hold on to a dentry belonging to a superblock that has been shutdown. Replace that implicit pinning with a clean and simple per-node refcount and get rid of the ugly dget() pinning. A similar mechanism exists for e.g. binderfs (cf. [4]). All the cleanup work can now be done in ->evict_inode(). In a follow-up patch we will make it possible to use binfmt_misc in sandboxes. We will use the cleaner semantics where a umount for the filesystem will cause the superblock and all resources to be deallocated. In preparation for this apply the same semantics to the initial binfmt_misc mount. Note, that this is a user-visible change and as such a uapi change but one that we can reasonably risk. We've discussed this in earlier versions of this patchset (cf. [1]). The main user and provider of binfmt_misc is systemd. Systemd provides binfmt_misc via autofs since it is configurable as a kernel module and is used by a few exotic packages and users. As such a binfmt_misc mount is triggered when /proc/sys/fs/binfmt_misc is accessed and is only provided on demand. Other autofs on demand filesystems include EFI ESP which systemd umounts if the mountpoint stays idle for a certain amount of time. This doesn't apply to the binfmt_misc autofs mount which isn't touched once it is mounted meaning this change can't accidently wipe binary type handlers without someone having explicitly unmounted binfmt_misc. After speaking to systemd folks they don't expect this change to affect them. In line with our general policy, if we see a regression for systemd or other users with this change we will switch back to the old behavior for the initial binfmt_misc mount and have binary types pin the filesystem again. But while we touch this code let's take the chance and let's improve on the status quo. [1]: https://lore.kernel.org/r/20191216091220.465626-2-laurent@vivier.eu [2]: commit 43a4f2619038 ("exec: binfmt_misc: fix race between load_misc_binary() and kill_node()" [3]: commit 83f918274e4b ("exec: binfmt_misc: shift filp_close(interp_file) from kill_node() to bm_evict_inode()") [4]: commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II") Link: https://lore.kernel.org/r/20211028103114.2849140-1-brauner@kernel.org (v1) Cc: Sargun Dhillon <sargun@sargun.me> Cc: Serge Hallyn <serge@hallyn.com> Cc: Jann Horn <jannh@google.com> Cc: Henning Schild <henning.schild@siemens.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Laurent Vivier <laurent@vivier.eu> Cc: linux-fsdevel@vger.kernel.org Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> --- /* v2 */ - Christian Brauner <christian.brauner@ubuntu.com>: - Add more comments that explain what's going on. - Rename functions while changing them to better reflect what they are doing to make the code easier to understand. - In the first version when a specific binary type handler was removed either through a write to the entry's file or all binary type handlers were removed by a write to the binfmt_misc mount's status file all cleanup work happened during inode eviction. That includes removal of the relevant entries from entry list. While that works fine I disliked that model after thinking about it for a bit. Because it means that there was a window were someone has already removed a or all binary handlers but they could still be safely reached from load_misc_binary() when it has managed to take the read_lock() on the entries list while inode eviction was already happening. Again, that perfectly benign but it's cleaner to remove the binary handler from the list immediately meaning that ones the write to then entry's file or the binfmt_misc status file returns the binary type cannot be executed anymore. That gives stronger guarantees to the user. diff 1c5976ef Wed Oct 27 16:31:13 MDT 2021 Christian Brauner <brauner@kernel.org> binfmt_misc: cleanup on filesystem umount Currently, registering a new binary type pins the binfmt_misc filesystem. Specifically, this means that as long as there is at least one binary type registered the binfmt_misc filesystem survives all umounts, i.e. the superblock is not destroyed. Meaning that a umount followed by another mount will end up with the same superblock and the same binary type handlers. This is a behavior we tend to discourage for any new filesystems (apart from a few special filesystems such as e.g. configfs or debugfs). A umount operation without the filesystem being pinned - by e.g. someone holding a file descriptor to an open file - should usually result in the destruction of the superblock and all associated resources. This makes introspection easier and leads to clearly defined, simple and clean semantics. An administrator can rely on the fact that a umount will guarantee a clean slate making it possible to reinitialize a filesystem. Right now all binary types would need to be explicitly deleted before that can happen. This allows us to remove the heavy-handed calls to simple_pin_fs() and simple_release_fs() when creating and deleting binary types. This in turn allows us to replace the current brittle pinning mechanism abusing dget() which has caused a range of bugs judging from prior fixes in [2] and [3]. The additional dget() in load_misc_binary() pins the dentry but only does so for the sake to prevent ->evict_inode() from freeing the node when a user removes the binary type and kill_node() is run. Which would mean ->interpreter and ->interp_file would be freed causing a UAF. This isn't really nicely documented nor is it very clean because it relies on simple_pin_fs() pinning the filesystem as long as at least one binary type exists. Otherwise it would cause load_misc_binary() to hold on to a dentry belonging to a superblock that has been shutdown. Replace that implicit pinning with a clean and simple per-node refcount and get rid of the ugly dget() pinning. A similar mechanism exists for e.g. binderfs (cf. [4]). All the cleanup work can now be done in ->evict_inode(). In a follow-up patch we will make it possible to use binfmt_misc in sandboxes. We will use the cleaner semantics where a umount for the filesystem will cause the superblock and all resources to be deallocated. In preparation for this apply the same semantics to the initial binfmt_misc mount. Note, that this is a user-visible change and as such a uapi change but one that we can reasonably risk. We've discussed this in earlier versions of this patchset (cf. [1]). The main user and provider of binfmt_misc is systemd. Systemd provides binfmt_misc via autofs since it is configurable as a kernel module and is used by a few exotic packages and users. As such a binfmt_misc mount is triggered when /proc/sys/fs/binfmt_misc is accessed and is only provided on demand. Other autofs on demand filesystems include EFI ESP which systemd umounts if the mountpoint stays idle for a certain amount of time. This doesn't apply to the binfmt_misc autofs mount which isn't touched once it is mounted meaning this change can't accidently wipe binary type handlers without someone having explicitly unmounted binfmt_misc. After speaking to systemd folks they don't expect this change to affect them. In line with our general policy, if we see a regression for systemd or other users with this change we will switch back to the old behavior for the initial binfmt_misc mount and have binary types pin the filesystem again. But while we touch this code let's take the chance and let's improve on the status quo. [1]: https://lore.kernel.org/r/20191216091220.465626-2-laurent@vivier.eu [2]: commit 43a4f2619038 ("exec: binfmt_misc: fix race between load_misc_binary() and kill_node()" [3]: commit 83f918274e4b ("exec: binfmt_misc: shift filp_close(interp_file) from kill_node() to bm_evict_inode()") [4]: commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II") Link: https://lore.kernel.org/r/20211028103114.2849140-1-brauner@kernel.org (v1) Cc: Sargun Dhillon <sargun@sargun.me> Cc: Serge Hallyn <serge@hallyn.com> Cc: Jann Horn <jannh@google.com> Cc: Henning Schild <henning.schild@siemens.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Laurent Vivier <laurent@vivier.eu> Cc: linux-fsdevel@vger.kernel.org Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> --- /* v2 */ - Christian Brauner <christian.brauner@ubuntu.com>: - Add more comments that explain what's going on. - Rename functions while changing them to better reflect what they are doing to make the code easier to understand. - In the first version when a specific binary type handler was removed either through a write to the entry's file or all binary type handlers were removed by a write to the binfmt_misc mount's status file all cleanup work happened during inode eviction. That includes removal of the relevant entries from entry list. While that works fine I disliked that model after thinking about it for a bit. Because it means that there was a window were someone has already removed a or all binary handlers but they could still be safely reached from load_misc_binary() when it has managed to take the read_lock() on the entries list while inode eviction was already happening. Again, that perfectly benign but it's cleaner to remove the binary handler from the list immediately meaning that ones the write to then entry's file or the binfmt_misc status file returns the binary type cannot be executed anymore. That gives stronger guarantees to the user. diff e7850f4d Fri Mar 12 22:07:41 MST 2021 Lior Ribak <liorribak@gmail.com> binfmt_misc: fix possible deadlock in bm_register_write There is a deadlock in bm_register_write: First, in the begining of the function, a lock is taken on the binfmt_misc root inode with inode_lock(d_inode(root)). Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call open_exec on the user-provided interpreter. open_exec will call a path lookup, and if the path lookup process includes the root of binfmt_misc, it will try to take a shared lock on its inode again, but it is already locked, and the code will get stuck in a deadlock To reproduce the bug: $ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register backtrace of where the lock occurs (#5): 0 schedule () at ./arch/x86/include/asm/current.h:15 1 0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=<optimized out>, state=state@entry=2) at kernel/locking/rwsem.c:992 2 0xffffffff81b5150a in __down_read_common (state=2, sem=<optimized out>) at kernel/locking/rwsem.c:1213 3 __down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1222 4 down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1355 5 0xffffffff811ee22a in inode_lock_shared (inode=<optimized out>) at ./include/linux/fs.h:783 6 open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177 7 path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366 8 0xffffffff811efe1c in do_filp_open (dfd=<optimized out>, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396 9 0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=<optimized out>, flags@entry=0) at fs/exec.c:913 10 0xffffffff811e4a92 in open_exec (name=<optimized out>) at fs/exec.c:948 11 0xffffffff8124aa84 in bm_register_write (file=<optimized out>, buffer=<optimized out>, count=19, ppos=<optimized out>) at fs/binfmt_misc.c:682 12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF ", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603 13 0xffffffff811defda in ksys_write (fd=<optimized out>, buf=0xa758d0 ":iiiii:E::ii::i:CF ", count=19) at fs/read_write.c:658 14 0xffffffff81b49813 in do_syscall_64 (nr=<optimized out>, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46 15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120 To solve the issue, the open_exec call is moved to before the write lock is taken by bm_register_write Link: https://lkml.kernel.org/r/20210228224414.95962-1-liorribak@gmail.com Fixes: 948b701a607f1 ("binfmt_misc: add persistent opened binary handler for containers") Signed-off-by: Lior Ribak <liorribak@gmail.com> Acked-by: Helge Deller <deller@gmx.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 51f39a1f Fri Dec 12 17:57:29 MST 2014 David Drysdale <drysdale@google.com> syscalls: implement execveat() system call This patchset adds execveat(2) for x86, and is derived from Meredydd Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528). The primary aim of adding an execveat syscall is to allow an implementation of fexecve(3) that does not rely on the /proc filesystem, at least for executables (rather than scripts). The current glibc version of fexecve(3) is implemented via /proc, which causes problems in sandboxed or otherwise restricted environments. Given the desire for a /proc-free fexecve() implementation, HPA suggested (https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be an appropriate generalization. Also, having a new syscall means that it can take a flags argument without back-compatibility concerns. The current implementation just defines the AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be added in future -- for example, flags for new namespaces (as suggested at https://lkml.org/lkml/2006/7/11/474). Related history: - https://lkml.org/lkml/2006/12/27/123 is an example of someone realizing that fexecve() is likely to fail in a chroot environment. - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered documenting the /proc requirement of fexecve(3) in its manpage, to "prevent other people from wasting their time". - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a problem where a process that did setuid() could not fexecve() because it no longer had access to /proc/self/fd; this has since been fixed. This patch (of 4): Add a new execveat(2) system call. execveat() is to execve() as openat() is to open(): it takes a file descriptor that refers to a directory, and resolves the filename relative to that. In addition, if the filename is empty and AT_EMPTY_PATH is specified, execveat() executes the file to which the file descriptor refers. This replicates the functionality of fexecve(), which is a system call in other UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and so relies on /proc being mounted). The filename fed to the executed program as argv[0] (or the name of the script fed to a script interpreter) will be of the form "/dev/fd/<fd>" (for an empty filename) or "/dev/fd/<fd>/<filename>", effectively reflecting how the executable was found. This does however mean that execution of a script in a /proc-less environment won't work; also, script execution via an O_CLOEXEC file descriptor fails (as the file will not be accessible after exec). Based on patches by Meredydd Luff. Signed-off-by: David Drysdale <drysdale@google.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: Shuah Khan <shuah.kh@samsung.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@aerifal.cx> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 6b899c4e Wed Dec 10 16:52:08 MST 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: add comments & debug logs When trying to develop a custom format handler, the errors returned all effectively get bucketed as EINVAL with no kernel messages. The other errors (ENOMEM/EFAULT) are internal/obvious and basic. Thus any time a bad handler is rejected, the developer has to walk the dense code and try to guess where it went wrong. Needing to dive into kernel code is itself a fairly high barrier for a lot of people. To improve this situation, let's deploy extensive pr_debug markers at logical parse points, and add comments to the dense parsing logic. It let's you see exactly where the parsing aborts, the string the kernel received (useful when dealing with shell code), how it translated the buffers to binary data, and how it will apply the mask at runtime. Some example output: $ echo ':qemu-foo:M::\x7fELF\xAD\xAD\x01\x00:\xff\xff\xff\xff\xff\x00\xff\x00:/usr/bin/qemu-foo:POC' > register $ dmesg binfmt_misc: register: received 92 bytes binfmt_misc: register: delim: 0x3a {:} binfmt_misc: register: name: {qemu-foo} binfmt_misc: register: type: M (magic) binfmt_misc: register: offset: 0x0 binfmt_misc: register: magic[raw]: 5c 78 37 66 45 4c 46 5c 78 41 44 5c 78 41 44 5c \x7fELF\xAD\xAD\ binfmt_misc: register: magic[raw]: 78 30 31 5c 78 30 30 00 x01\x00. binfmt_misc: register: mask[raw]: 5c 78 66 66 5c 78 66 66 5c 78 66 66 5c 78 66 66 \xff\xff\xff\xff binfmt_misc: register: mask[raw]: 5c 78 66 66 5c 78 30 30 5c 78 66 66 5c 78 30 30 \xff\x00\xff\x00 binfmt_misc: register: mask[raw]: 00 . binfmt_misc: register: magic/mask length: 8 binfmt_misc: register: magic[decoded]: 7f 45 4c 46 ad ad 01 00 .ELF.... binfmt_misc: register: mask[decoded]: ff ff ff ff ff 00 ff 00 ........ binfmt_misc: register: magic[masked]: 7f 45 4c 46 ad 00 01 00 .ELF.... binfmt_misc: register: interpreter: {/usr/bin/qemu-foo} binfmt_misc: register: flag: P (preserve argv0) binfmt_misc: register: flag: O (open binary) binfmt_misc: register: flag: C (preserve creds) The [raw] lines show us exactly what was received from userspace. The lines after that show us how the kernel has decoded things. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 6b899c4e Wed Dec 10 16:52:08 MST 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: add comments & debug logs When trying to develop a custom format handler, the errors returned all effectively get bucketed as EINVAL with no kernel messages. The other errors (ENOMEM/EFAULT) are internal/obvious and basic. Thus any time a bad handler is rejected, the developer has to walk the dense code and try to guess where it went wrong. Needing to dive into kernel code is itself a fairly high barrier for a lot of people. To improve this situation, let's deploy extensive pr_debug markers at logical parse points, and add comments to the dense parsing logic. It let's you see exactly where the parsing aborts, the string the kernel received (useful when dealing with shell code), how it translated the buffers to binary data, and how it will apply the mask at runtime. Some example output: $ echo ':qemu-foo:M::\x7fELF\xAD\xAD\x01\x00:\xff\xff\xff\xff\xff\x00\xff\x00:/usr/bin/qemu-foo:POC' > register $ dmesg binfmt_misc: register: received 92 bytes binfmt_misc: register: delim: 0x3a {:} binfmt_misc: register: name: {qemu-foo} binfmt_misc: register: type: M (magic) binfmt_misc: register: offset: 0x0 binfmt_misc: register: magic[raw]: 5c 78 37 66 45 4c 46 5c 78 41 44 5c 78 41 44 5c \x7fELF\xAD\xAD\ binfmt_misc: register: magic[raw]: 78 30 31 5c 78 30 30 00 x01\x00. binfmt_misc: register: mask[raw]: 5c 78 66 66 5c 78 66 66 5c 78 66 66 5c 78 66 66 \xff\xff\xff\xff binfmt_misc: register: mask[raw]: 5c 78 66 66 5c 78 30 30 5c 78 66 66 5c 78 30 30 \xff\x00\xff\x00 binfmt_misc: register: mask[raw]: 00 . binfmt_misc: register: magic/mask length: 8 binfmt_misc: register: magic[decoded]: 7f 45 4c 46 ad ad 01 00 .ELF.... binfmt_misc: register: mask[decoded]: ff ff ff ff ff 00 ff 00 ........ binfmt_misc: register: magic[masked]: 7f 45 4c 46 ad 00 01 00 .ELF.... binfmt_misc: register: interpreter: {/usr/bin/qemu-foo} binfmt_misc: register: flag: P (preserve argv0) binfmt_misc: register: flag: O (open binary) binfmt_misc: register: flag: C (preserve creds) The [raw] lines show us exactly what was received from userspace. The lines after that show us how the kernel has decoded things. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 6b899c4e Wed Dec 10 16:52:08 MST 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: add comments & debug logs When trying to develop a custom format handler, the errors returned all effectively get bucketed as EINVAL with no kernel messages. The other errors (ENOMEM/EFAULT) are internal/obvious and basic. Thus any time a bad handler is rejected, the developer has to walk the dense code and try to guess where it went wrong. Needing to dive into kernel code is itself a fairly high barrier for a lot of people. To improve this situation, let's deploy extensive pr_debug markers at logical parse points, and add comments to the dense parsing logic. It let's you see exactly where the parsing aborts, the string the kernel received (useful when dealing with shell code), how it translated the buffers to binary data, and how it will apply the mask at runtime. Some example output: $ echo ':qemu-foo:M::\x7fELF\xAD\xAD\x01\x00:\xff\xff\xff\xff\xff\x00\xff\x00:/usr/bin/qemu-foo:POC' > register $ dmesg binfmt_misc: register: received 92 bytes binfmt_misc: register: delim: 0x3a {:} binfmt_misc: register: name: {qemu-foo} binfmt_misc: register: type: M (magic) binfmt_misc: register: offset: 0x0 binfmt_misc: register: magic[raw]: 5c 78 37 66 45 4c 46 5c 78 41 44 5c 78 41 44 5c \x7fELF\xAD\xAD\ binfmt_misc: register: magic[raw]: 78 30 31 5c 78 30 30 00 x01\x00. binfmt_misc: register: mask[raw]: 5c 78 66 66 5c 78 66 66 5c 78 66 66 5c 78 66 66 \xff\xff\xff\xff binfmt_misc: register: mask[raw]: 5c 78 66 66 5c 78 30 30 5c 78 66 66 5c 78 30 30 \xff\x00\xff\x00 binfmt_misc: register: mask[raw]: 00 . binfmt_misc: register: magic/mask length: 8 binfmt_misc: register: magic[decoded]: 7f 45 4c 46 ad ad 01 00 .ELF.... binfmt_misc: register: mask[decoded]: ff ff ff ff ff 00 ff 00 ........ binfmt_misc: register: magic[masked]: 7f 45 4c 46 ad 00 01 00 .ELF.... binfmt_misc: register: interpreter: {/usr/bin/qemu-foo} binfmt_misc: register: flag: P (preserve argv0) binfmt_misc: register: flag: O (open binary) binfmt_misc: register: flag: C (preserve creds) The [raw] lines show us exactly what was received from userspace. The lines after that show us how the kernel has decoded things. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff bbaecc08 Mon Oct 13 16:52:03 MDT 2014 Mike Frysinger <vapier@gentoo.org> binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
H A D | bad_inode.c | diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff a528d35e Tue Jan 31 09:46:22 MST 2017 David Howells <dhowells@redhat.com> statx: Add a system call to make enhanced file info available Add a system call to make extended file information available, including file creation and some attribute flags where available through the underlying filesystem. The getattr inode operation is altered to take two additional arguments: a u32 request_mask and an unsigned int flags that indicate the synchronisation mode. This change is propagated to the vfs_getattr*() function. Functions like vfs_stat() are now inline wrappers around new functions vfs_statx() and vfs_statx_fd() to reduce stack usage. ======== OVERVIEW ======== The idea was initially proposed as a set of xattrs that could be retrieved with getxattr(), but the general preference proved to be for a new syscall with an extended stat structure. A number of requests were gathered for features to be included. The following have been included: (1) Make the fields a consistent size on all arches and make them large. (2) Spare space, request flags and information flags are provided for future expansion. (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an __s64). (4) Creation time: The SMB protocol carries the creation time, which could be exported by Samba, which will in turn help CIFS make use of FS-Cache as that can be used for coherency data (stx_btime). This is also specified in NFSv4 as a recommended attribute and could be exported by NFSD [Steve French]. (5) Lightweight stat: Ask for just those details of interest, and allow a netfs (such as NFS) to approximate anything not of interest, possibly without going to the server [Trond Myklebust, Ulrich Drepper, Andreas Dilger] (AT_STATX_DONT_SYNC). (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks its cached attributes are up to date [Trond Myklebust] (AT_STATX_FORCE_SYNC). And the following have been left out for future extension: (7) Data version number: Could be used by userspace NFS servers [Aneesh Kumar]. Can also be used to modify fill_post_wcc() in NFSD which retrieves i_version directly, but has just called vfs_getattr(). It could get it from the kstat struct if it used vfs_xgetattr() instead. (There's disagreement on the exact semantics of a single field, since not all filesystems do this the same way). (8) BSD stat compatibility: Including more fields from the BSD stat such as creation time (st_btime) and inode generation number (st_gen) [Jeremy Allison, Bernd Schubert]. (9) Inode generation number: Useful for FUSE and userspace NFS servers [Bernd Schubert]. (This was asked for but later deemed unnecessary with the open-by-handle capability available and caused disagreement as to whether it's a security hole or not). (10) Extra coherency data may be useful in making backups [Andreas Dilger]. (No particular data were offered, but things like last backup timestamp, the data version number and the DOS archive bit would come into this category). (11) Allow the filesystem to indicate what it can/cannot provide: A filesystem can now say it doesn't support a standard stat feature if that isn't available, so if, for instance, inode numbers or UIDs don't exist or are fabricated locally... (This requires a separate system call - I have an fsinfo() call idea for this). (12) Store a 16-byte volume ID in the superblock that can be returned in struct xstat [Steve French]. (Deferred to fsinfo). (13) Include granularity fields in the time data to indicate the granularity of each of the times (NFSv4 time_delta) [Steve French]. (Deferred to fsinfo). (14) FS_IOC_GETFLAGS value. These could be translated to BSD's st_flags. Note that the Linux IOC flags are a mess and filesystems such as Ext4 define flags that aren't in linux/fs.h, so translation in the kernel may be a necessity (or, possibly, we provide the filesystem type too). (Some attributes are made available in stx_attributes, but the general feeling was that the IOC flags were to ext[234]-specific and shouldn't be exposed through statx this way). (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer, Michael Kerrisk]. (Deferred, probably to fsinfo. Finding out if there's an ACL or seclabal might require extra filesystem operations). (16) Femtosecond-resolution timestamps [Dave Chinner]. (A __reserved field has been left in the statx_timestamp struct for this - if there proves to be a need). (17) A set multiple attributes syscall to go with this. =============== NEW SYSTEM CALL =============== The new system call is: int ret = statx(int dfd, const char *filename, unsigned int flags, unsigned int mask, struct statx *buffer); The dfd, filename and flags parameters indicate the file to query, in a similar way to fstatat(). There is no equivalent of lstat() as that can be emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags. There is also no equivalent of fstat() as that can be emulated by passing a NULL filename to statx() with the fd of interest in dfd. Whether or not statx() synchronises the attributes with the backing store can be controlled by OR'ing a value into the flags argument (this typically only affects network filesystems): (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this respect. (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise its attributes with the server - which might require data writeback to occur to get the timestamps correct. (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a network filesystem. The resulting values should be considered approximate. mask is a bitmask indicating the fields in struct statx that are of interest to the caller. The user should set this to STATX_BASIC_STATS to get the basic set returned by stat(). It should be noted that asking for more information may entail extra I/O operations. buffer points to the destination for the data. This must be 256 bytes in size. ====================== MAIN ATTRIBUTES RECORD ====================== The following structures are defined in which to return the main attribute set: struct statx_timestamp { __s64 tv_sec; __s32 tv_nsec; __s32 __reserved; }; struct statx { __u32 stx_mask; __u32 stx_blksize; __u64 stx_attributes; __u32 stx_nlink; __u32 stx_uid; __u32 stx_gid; __u16 stx_mode; __u16 __spare0[1]; __u64 stx_ino; __u64 stx_size; __u64 stx_blocks; __u64 __spare1[1]; struct statx_timestamp stx_atime; struct statx_timestamp stx_btime; struct statx_timestamp stx_ctime; struct statx_timestamp stx_mtime; __u32 stx_rdev_major; __u32 stx_rdev_minor; __u32 stx_dev_major; __u32 stx_dev_minor; __u64 __spare2[14]; }; The defined bits in request_mask and stx_mask are: STATX_TYPE Want/got stx_mode & S_IFMT STATX_MODE Want/got stx_mode & ~S_IFMT STATX_NLINK Want/got stx_nlink STATX_UID Want/got stx_uid STATX_GID Want/got stx_gid STATX_ATIME Want/got stx_atime{,_ns} STATX_MTIME Want/got stx_mtime{,_ns} STATX_CTIME Want/got stx_ctime{,_ns} STATX_INO Want/got stx_ino STATX_SIZE Want/got stx_size STATX_BLOCKS Want/got stx_blocks STATX_BASIC_STATS [The stuff in the normal stat struct] STATX_BTIME Want/got stx_btime{,_ns} STATX_ALL [All currently available stuff] stx_btime is the file creation time, stx_mask is a bitmask indicating the data provided and __spares*[] are where as-yet undefined fields can be placed. Time fields are structures with separate seconds and nanoseconds fields plus a reserved field in case we want to add even finer resolution. Note that times will be negative if before 1970; in such a case, the nanosecond fields will also be negative if not zero. The bits defined in the stx_attributes field convey information about a file, how it is accessed, where it is and what it does. The following attributes map to FS_*_FL flags and are the same numerical value: STATX_ATTR_COMPRESSED File is compressed by the fs STATX_ATTR_IMMUTABLE File is marked immutable STATX_ATTR_APPEND File is append-only STATX_ATTR_NODUMP File is not to be dumped STATX_ATTR_ENCRYPTED File requires key to decrypt in fs Within the kernel, the supported flags are listed by: KSTAT_ATTR_FS_IOC_FLAGS [Are any other IOC flags of sufficient general interest to be exposed through this interface?] New flags include: STATX_ATTR_AUTOMOUNT Object is an automount trigger These are for the use of GUI tools that might want to mark files specially, depending on what they are. Fields in struct statx come in a number of classes: (0) stx_dev_*, stx_blksize. These are local system information and are always available. (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino, stx_size, stx_blocks. These will be returned whether the caller asks for them or not. The corresponding bits in stx_mask will be set to indicate whether they actually have valid values. If the caller didn't ask for them, then they may be approximated. For example, NFS won't waste any time updating them from the server, unless as a byproduct of updating something requested. If the values don't actually exist for the underlying object (such as UID or GID on a DOS file), then the bit won't be set in the stx_mask, even if the caller asked for the value. In such a case, the returned value will be a fabrication. Note that there are instances where the type might not be valid, for instance Windows reparse points. (2) stx_rdev_*. This will be set only if stx_mode indicates we're looking at a blockdev or a chardev, otherwise will be 0. (3) stx_btime. Similar to (1), except this will be set to 0 if it doesn't exist. ======= TESTING ======= The following test program can be used to test the statx system call: samples/statx/test-statx.c Just compile and run, passing it paths to the files you want to examine. The file is built automatically if CONFIG_SAMPLES is enabled. Here's some example output. Firstly, an NFS directory that crosses to another FSID. Note that the AUTOMOUNT attribute is set because transiting this directory will cause d_automount to be invoked by the VFS. [root@andromeda ~]# /tmp/test-statx -A /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:26 Inode: 1703937 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------) Secondly, the result of automounting on that directory. [root@andromeda ~]# /tmp/test-statx /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:27 Inode: 2 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4acdaf27 Mon Jul 25 23:42:34 MDT 2011 Al Viro <viro@zeniv.linux.org.uk> switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/ufs/ | ||
H A D | dir.c | diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff abf5d15f Sun Jun 25 06:47:24 MDT 2006 Evgeniy Dushistov <dushistov@mail.ru> [PATCH] ufs: easy debug Currently to turn on debug mode "user" has to edit ~10 files, to turn off he has to do it again. This patch introduce such changes: 1)turn on(off) debug messages via ".config" 2)remove unnecessary duplication of code 3)make "UFSD" macros more similar to function 4)fix some compiler warnings Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | file.c | diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/sysv/ | ||
H A D | dir.c | diff 4bb1a137 Thu Jan 19 08:32:29 MST 2023 Fabio M. De Francesco <fmdefrancesco@gmail.com> fs/sysv: Use the offset_in_page() helper Use the offset_in_page() helper because it is more suitable than doing explicit subtractions between pointers to directory entries and kernel virtual addresses of mapped pages. Cc: Ira Weiny <ira.weiny@intel.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 54b21a79 Sun Jan 08 02:03:05 MST 2006 Andrew Morton <akpm@osdl.org> [PATCH] fix possible PAGE_CACHE_SHIFT overflows We've had two instances recently of overflows when doing 64_bit_value = (32_bit_value << PAGE_CACHE_SHIFT) I did a tree-wide grep of `<<.*PAGE_CACHE_SHIFT' and this is the result. - afs_rxfs_fetch_descriptor.offset is of type off_t, which seems broken. - jfs and jffs are limited to 4GB anyway. - reiserfs map_block_for_writepage() takes an unsigned long for the block - it should take sector_t. (It'll fail for huge filesystems with blocksize<PAGE_CACHE_SIZE) - cramfs_read() needs to use sector_t (I think cramsfs is busted on large filesystems anyway) - affs is limited in file size anyway. - I generally didn't fix 32-bit overflows in directory operations. - arm's __flush_dcache_page() is peculiar. What if the page lies beyond 4G? - gss_wrap_req_priv() needs checking (snd_buf->page_base) Cc: Oleg Drokin <green@linuxhacker.ru> Cc: David Howells <dhowells@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: <reiserfs-dev@namesys.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-fsdevel@vger.kernel.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 54b21a79 Sun Jan 08 02:03:05 MST 2006 Andrew Morton <akpm@osdl.org> [PATCH] fix possible PAGE_CACHE_SHIFT overflows We've had two instances recently of overflows when doing 64_bit_value = (32_bit_value << PAGE_CACHE_SHIFT) I did a tree-wide grep of `<<.*PAGE_CACHE_SHIFT' and this is the result. - afs_rxfs_fetch_descriptor.offset is of type off_t, which seems broken. - jfs and jffs are limited to 4GB anyway. - reiserfs map_block_for_writepage() takes an unsigned long for the block - it should take sector_t. (It'll fail for huge filesystems with blocksize<PAGE_CACHE_SIZE) - cramfs_read() needs to use sector_t (I think cramsfs is busted on large filesystems anyway) - affs is limited in file size anyway. - I generally didn't fix 32-bit overflows in directory operations. - arm's __flush_dcache_page() is peculiar. What if the page lies beyond 4G? - gss_wrap_req_priv() needs checking (snd_buf->page_base) Cc: Oleg Drokin <green@linuxhacker.ru> Cc: David Howells <dhowells@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: <reiserfs-dev@namesys.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-fsdevel@vger.kernel.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/jfs/ | ||
H A D | namei.c | diff 4acdaf27 Mon Jul 25 23:42:34 MDT 2011 Al Viro <viro@zeniv.linux.org.uk> switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4e34e719 Sat Jul 23 09:37:31 MDT 2011 Christoph Hellwig <hch@lst.de> fs: take the ACL checks to common code Replace the ->check_acl method with a ->get_acl method that simply reads an ACL from disk after having a cache miss. This means we can replace the ACL checking boilerplate code with a single implementation in namei.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4f4b401b Thu Sep 01 08:02:43 MDT 2005 Dave Kleikamp <shaggy@austin.ibm.com> JFS: allow extended attributes to be set within a existing transaction Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> |
/linux-master/fs/jffs2/ | ||
H A D | os-linux.h | diff 265de8ce Thu Jul 11 21:55:23 MDT 2019 Christoph Hellwig <hch@lst.de> jffs2: pass the correct prototype to read_cache_page Fix the callback jffs2 passes to read_cache_page to actually have the proper type expected. Casting around function pointers can easily hide typing bugs, and defeats control flow protection. Link: http://lkml.kernel.org/r/20190520055731.24538-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff aef9ab47 Thu May 18 17:28:49 MDT 2006 David Woodhouse <dwmw2@infradead.org> [JFFS2] Support new device nodes Device node major/minor numbers are just stored in the payload of a single data node. Just extend that to 4 bytes and use new_encode_dev() for it. We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo). This preserves backwards compatibility with older code as much as possible. If we do make devices with major or minor numbers above 255, and then mount the file system with the old code, it'll just read the first two bytes and get the numbers wrong. If it comes to garbage-collect it, it'll then write back those wrong numbers. But that's about the best we can expect. Signed-off-by: David Woodhouse <dwmw2@infradead.org> diff aef9ab47 Thu May 18 17:28:49 MDT 2006 David Woodhouse <dwmw2@infradead.org> [JFFS2] Support new device nodes Device node major/minor numbers are just stored in the payload of a single data node. Just extend that to 4 bytes and use new_encode_dev() for it. We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo). This preserves backwards compatibility with older code as much as possible. If we do make devices with major or minor numbers above 255, and then mount the file system with the old code, it'll just read the first two bytes and get the numbers wrong. If it comes to garbage-collect it, it'll then write back those wrong numbers. But that's about the best we can expect. Signed-off-by: David Woodhouse <dwmw2@infradead.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | dir.c | diff 798b7347 Fri Jun 19 03:06:35 MDT 2020 Zhe Li <lizhe67@huawei.com> jffs2: fix UAF problem The log of UAF problem is listed below. BUG: KASAN: use-after-free in jffs2_rmdir+0xa4/0x1cc [jffs2] at addr c1f165fc Read of size 4 by task rm/8283 ============================================================================= BUG kmalloc-32 (Tainted: P B O ): kasan: bad access detected ----------------------------------------------------------------------------- INFO: Allocated in 0xbbbbbbbb age=3054364 cpu=0 pid=0 0xb0bba6ef jffs2_write_dirent+0x11c/0x9c8 [jffs2] __slab_alloc.isra.21.constprop.25+0x2c/0x44 __kmalloc+0x1dc/0x370 jffs2_write_dirent+0x11c/0x9c8 [jffs2] jffs2_do_unlink+0x328/0x5fc [jffs2] jffs2_rmdir+0x110/0x1cc [jffs2] vfs_rmdir+0x180/0x268 do_rmdir+0x2cc/0x300 ret_from_syscall+0x0/0x3c INFO: Freed in 0x205b age=3054364 cpu=0 pid=0 0x2e9173 jffs2_add_fd_to_list+0x138/0x1dc [jffs2] jffs2_add_fd_to_list+0x138/0x1dc [jffs2] jffs2_garbage_collect_dirent.isra.3+0x21c/0x288 [jffs2] jffs2_garbage_collect_live+0x16bc/0x1800 [jffs2] jffs2_garbage_collect_pass+0x678/0x11d4 [jffs2] jffs2_garbage_collect_thread+0x1e8/0x3b0 [jffs2] kthread+0x1a8/0x1b0 ret_from_kernel_thread+0x5c/0x64 Call Trace: [c17ddd20] [c02452d4] kasan_report.part.0+0x298/0x72c (unreliable) [c17ddda0] [d2509680] jffs2_rmdir+0xa4/0x1cc [jffs2] [c17dddd0] [c026da04] vfs_rmdir+0x180/0x268 [c17dde00] [c026f4e4] do_rmdir+0x2cc/0x300 [c17ddf40] [c001a658] ret_from_syscall+0x0/0x3c The root cause is that we don't get "jffs2_inode_info.sem" before we scan list "jffs2_inode_info.dents" in function jffs2_rmdir. This patch add codes to get "jffs2_inode_info.sem" before we scan "jffs2_inode_info.dents" to slove the UAF problem. Signed-off-by: Zhe Li <lizhe67@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> diff 4acdaf27 Mon Jul 25 23:42:34 MDT 2011 Al Viro <viro@zeniv.linux.org.uk> switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4e34e719 Sat Jul 23 09:37:31 MDT 2011 Christoph Hellwig <hch@lst.de> fs: take the ACL checks to common code Replace the ->check_acl method with a ->get_acl method that simply reads an ACL from disk after having a cache miss. This means we can replace the ACL checking boilerplate code with a single implementation in namei.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff aef9ab47 Thu May 18 17:28:49 MDT 2006 David Woodhouse <dwmw2@infradead.org> [JFFS2] Support new device nodes Device node major/minor numbers are just stored in the payload of a single data node. Just extend that to 4 bytes and use new_encode_dev() for it. We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo). This preserves backwards compatibility with older code as much as possible. If we do make devices with major or minor numbers above 255, and then mount the file system with the old code, it'll just read the first two bytes and get the numbers wrong. If it comes to garbage-collect it, it'll then write back those wrong numbers. But that's about the best we can expect. Signed-off-by: David Woodhouse <dwmw2@infradead.org> diff aef9ab47 Thu May 18 17:28:49 MDT 2006 David Woodhouse <dwmw2@infradead.org> [JFFS2] Support new device nodes Device node major/minor numbers are just stored in the payload of a single data node. Just extend that to 4 bytes and use new_encode_dev() for it. We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo). This preserves backwards compatibility with older code as much as possible. If we do make devices with major or minor numbers above 255, and then mount the file system with the old code, it'll just read the first two bytes and get the numbers wrong. If it comes to garbage-collect it, it'll then write back those wrong numbers. But that's about the best we can expect. Signed-off-by: David Woodhouse <dwmw2@infradead.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | file.c | diff 23892d38 Wed Aug 03 09:53:12 MDT 2022 Yifei Liu <yifeliu@cs.stonybrook.edu> jffs2: correct logic when creating a hole in jffs2_write_begin Bug description and fix: 1. Write data to a file, say all 1s from offset 0 to 16. 2. Truncate the file to a smaller size, say 8 bytes. 3. Write new bytes (say 2s) from an offset past the original size of the file, say at offset 20, for 4 bytes. This is supposed to create a "hole" in the file, meaning that the bytes from offset 8 (where it was truncated above) up to the new write at offset 20, should all be 0s (zeros). 4. Flush all caches using "echo 3 > /proc/sys/vm/drop_caches" (or unmount and remount) the f/s. 5. Check the content of the file. It is wrong. The 1s that used to be between bytes 9 and 16, before the truncation, have REAPPEARED (they should be 0s). We wrote a script and helper C program to reproduce the bug (reproduce_jffs2_write_begin_issue.sh, write_file.c, and Makefile). We can make them available to anyone. The above example is shown when writing a small file within the same first page. But the bug happens for larger files, as long as steps 1, 2, and 3 above all happen within the same page. The problem was traced to the jffs2_write_begin code, where it goes into an 'if' statement intended to handle writes past the current EOF (i.e., writes that may create a hole). The code computes a 'pageofs' that is the floor of the write position (pos), aligned to the page size boundary. In other words, 'pageofs' will never be larger than 'pos'. The code then sets the internal jffs2_raw_inode->isize to the size of max(current inode size, pageofs) but that is wrong: the new file size should be the 'pos', which is larger than both the current inode size and pageofs. Similarly, the code incorrectly sets the internal jffs2_raw_inode->dsize to the difference between the pageofs minus current inode size; instead it should be the current pos minus the current inode size. Finally, inode->i_size was also set incorrectly. The patch below fixes this bug. The bug was discovered using a new tool for finding f/s bugs using model checking, called MCFS (Model Checking File Systems). Signed-off-by: Yifei Liu <yifeliu@cs.stonybrook.edu> Signed-off-by: Erez Zadok <ezk@cs.stonybrook.edu> Signed-off-by: Manish Adkar <madkar@cs.stonybrook.edu> Signed-off-by: Richard Weinberger <richard@nod.at> diff 23892d38 Wed Aug 03 09:53:12 MDT 2022 Yifei Liu <yifeliu@cs.stonybrook.edu> jffs2: correct logic when creating a hole in jffs2_write_begin Bug description and fix: 1. Write data to a file, say all 1s from offset 0 to 16. 2. Truncate the file to a smaller size, say 8 bytes. 3. Write new bytes (say 2s) from an offset past the original size of the file, say at offset 20, for 4 bytes. This is supposed to create a "hole" in the file, meaning that the bytes from offset 8 (where it was truncated above) up to the new write at offset 20, should all be 0s (zeros). 4. Flush all caches using "echo 3 > /proc/sys/vm/drop_caches" (or unmount and remount) the f/s. 5. Check the content of the file. It is wrong. The 1s that used to be between bytes 9 and 16, before the truncation, have REAPPEARED (they should be 0s). We wrote a script and helper C program to reproduce the bug (reproduce_jffs2_write_begin_issue.sh, write_file.c, and Makefile). We can make them available to anyone. The above example is shown when writing a small file within the same first page. But the bug happens for larger files, as long as steps 1, 2, and 3 above all happen within the same page. The problem was traced to the jffs2_write_begin code, where it goes into an 'if' statement intended to handle writes past the current EOF (i.e., writes that may create a hole). The code computes a 'pageofs' that is the floor of the write position (pos), aligned to the page size boundary. In other words, 'pageofs' will never be larger than 'pos'. The code then sets the internal jffs2_raw_inode->isize to the size of max(current inode size, pageofs) but that is wrong: the new file size should be the 'pos', which is larger than both the current inode size and pageofs. Similarly, the code incorrectly sets the internal jffs2_raw_inode->dsize to the difference between the pageofs minus current inode size; instead it should be the current pos minus the current inode size. Finally, inode->i_size was also set incorrectly. The patch below fixes this bug. The bug was discovered using a new tool for finding f/s bugs using model checking, called MCFS (Model Checking File Systems). Signed-off-by: Yifei Liu <yifeliu@cs.stonybrook.edu> Signed-off-by: Erez Zadok <ezk@cs.stonybrook.edu> Signed-off-by: Manish Adkar <madkar@cs.stonybrook.edu> Signed-off-by: Richard Weinberger <richard@nod.at> diff 265de8ce Thu Jul 11 21:55:23 MDT 2019 Christoph Hellwig <hch@lst.de> jffs2: pass the correct prototype to read_cache_page Fix the callback jffs2 passes to read_cache_page to actually have the proper type expected. Casting around function pointers can easily hide typing bugs, and defeats control flow protection. Link: http://lkml.kernel.org/r/20190520055731.24538-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 4e34e719 Sat Jul 23 09:37:31 MDT 2011 Christoph Hellwig <hch@lst.de> fs: take the ACL checks to common code Replace the ->check_acl method with a ->get_acl method that simply reads an ACL from disk after having a cache miss. This means we can replace the ACL checking boilerplate code with a single implementation in namei.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4fc67fbe Tue Oct 18 17:29:38 MDT 2005 Todd Poynor <tpoynor@mvista.com> [JFFS2] Return 0, not number of bytes written, for success at commit_write Some callers to block-layer commit_write function treat non-zero return as error, notably the loopback mount driver sometimes used in conjunction with JFFS2 on NAND flash for bad block avoidance, etc. Return zero for success as do various other commit_write functions. Signed-off-by: Todd Poynor <tpoynor@mvista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
/linux-master/fs/hpfs/ | ||
H A D | dir.c | diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/hfs/ | ||
H A D | hfs_fs.h | diff 4ddfc3dc Wed Jun 20 01:47:26 MDT 2018 Arnd Bergmann <arnd@arndb.de> hfs/hfsplus: use 64-bit inode timestamps The interpretation of on-disk timestamps in HFS and HFS+ differs between 32-bit and 64-bit kernels at the moment. Use 64-bit timestamps consistently so apply the current 64-bit behavior everyhere. According to the official documentation for HFS+ [1], inode timestamps are supposed to cover the time range from 1904 to 2040 as originally used in classic MacOS. The traditional Linux usage is to convert the timestamps into an unsigned 32-bit number based on the Unix epoch and from there to a time_t. On 32-bit systems, that wraps the time from 2038 to 1902, so the last two years of the valid time range become garbled. On 64-bit systems, all times before 1970 get turned into timestamps between 2038 and 2106, which is more convenient but also different from the documented behavior. Looking at the Darwin sources [2], it seems that MacOS is inconsistent in yet another way: all timestamps are wrapped around to a 32-bit unsigned number when written to the disk, but when read back, all numeric values lower than 2082844800U are assumed to be invalid, so we cannot represent the times before 1970 or the times after 2040. While all implementations seem to agree on the interpretation of values between 1970 and 2038, they often differ on the exact range they support when reading back values outside of the common range: MacOS (traditional): 1904-2040 Apple Documentation: 1904-2040 MacOS X source comments: 1970-2040 MacOS X source code: 1970-2038 32-bit Linux: 1902-2038 64-bit Linux: 1970-2106 hfsfuse: 1970-2040 hfsutils (32 bit, old libc) 1902-2038 hfsutils (32 bit, new libc) 1970-2106 hfsutils (64 bit) 1904-2040 hfsplus-utils 1904-2040 hfsexplorer 1904-2040 7-zip 1904-2040 Out of the above, the range from 1970 to 2106 seems to be the most useful, as it allows using HFS and HFS+ beyond year 2038, and this matches the behavior that most users would see today on Linux, as few people run 32-bit kernels any more. Link: [1] https://developer.apple.com/library/archive/technotes/tn/tn1150.html Link: [2] https://opensource.apple.com/source/hfs/hfs-407.30.1/core/MacOSStubs.c.auto.html Link: https://lore.kernel.org/lkml/20180711224625.airwna6gzyatoowe@eaf/ Suggested-by: "Ernesto A. Fernández" <ernesto.mnd.fernandez@gmail.com> Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reviewed-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- v3: revert back to 1970-2106 time range fix bugs found in review merge both patches into one drop cc:stable tag v2: treat pre-1970 dates as invalid following MacOS X behavior, reword and expand changelog text diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | dir.c | diff 4acdaf27 Mon Jul 25 23:42:34 MDT 2011 Al Viro <viro@zeniv.linux.org.uk> switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4e82d61b Fri May 27 14:42:05 MDT 2011 Sage Weil <sage@newdream.net> hfs: remove unnecessary dentry_unhash on rmdir, dir rename hfs does not have problems with references to unlinked directories. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/autofs/ | ||
H A D | root.c | diff 797a1d89 Mon Jun 12 04:45:19 MDT 2023 Jeff Layton <jlayton@kernel.org> autofs: set ctime as well when mtime changes on a dir When adding entries to a directory, POSIX generally requires that the ctime also be updated alongside the mtime. Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: Ian Kent <raven@themaw.net> Message-Id: <20230612104524.17058-4-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
H A D | autofs_i.h | diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/drivers/misc/ibmasm/ | ||
H A D | ibmasmfs.c | diff 4a2ef475 Wed Oct 04 12:51:53 MDT 2023 Jeff Layton <jlayton@kernel.org> ibmasm: convert to new timestamp accessors Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-6-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> diff 5a0e3ad6 Wed Mar 24 02:04:11 MDT 2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/affs/ | ||
H A D | file.c | diff 34113026 Wed Jul 12 21:55:08 MDT 2023 Matthew Wilcox (Oracle) <willy@infradead.org> affs: convert data read and write to use folios We still need to convert to/from folios in write_begin & write_end to fit the API, but this removes a lot of calls to old page-based functions, removing many hidden calls to compound_head(). Link: https://lkml.kernel.org/r/20230713035512.4139457-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> Acked-by: David Sterba <dsterba@suse.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.com> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Tom Rix <trix@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> diff 0af57378 Mon Jun 28 20:36:12 MDT 2021 Christoph Hellwig <hch@lst.de> mm: require ->set_page_dirty to be explicitly wired up Remove the CONFIG_BLOCK default to __set_page_dirty_buffers and just wire that method up for the missing instances. [hch@lst.de: ecryptfs: add a ->set_page_dirty cludge] Link: https://lkml.kernel.org/r/20210624125250.536369-1-hch@lst.de Link: https://lkml.kernel.org/r/20210614061512.3966143-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Tyler Hicks <code@tyhicks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff a80f2d22 Mon Apr 24 14:13:10 MDT 2017 Fabian Frederick <fabf@skynet.be> fs/affs: bugfix: Write files greater than page size on OFS Previous AFFS patch fixed OFS write operations but unveiled another bug: files greater than 4KB are being created with a wrong size resulting in errors like the following: dd if=/dev/zero of=file bs=4097 count=1 cp file /mnt/affs/ cp: error writing '/mnt/affs/file': Bad address Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 92cab82b Fri Dec 12 17:57:55 MST 2014 Fabian Frederick <fabf@skynet.be> fs/affs/file.c: remove obsolete pagesize check linux kernel doesn't manage page sizes below 4kb. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 63259326 Wed Sep 11 15:25:48 MDT 2013 Dan Carpenter <dan.carpenter@oracle.com> affs: use loff_t in affs_truncate() It seems pretty unlikely that AFFS supports files over 4GB but we may as well leave use loff_t just for cleanness sake instead of truncating it to 32 bits. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Marco Stornelli <marco.stornelli@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/hfsplus/ | ||
H A D | dir.c | diff b2441318 Wed Nov 01 08:07:57 MDT 2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org> License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> diff 017f8da4 Fri Jun 06 15:36:21 MDT 2014 Hin-Tak Leung <htl10@users.sourceforge.net> hfsplus: fix worst-case unicode to char conversion of file names and attributes This is a series of 3 patches which corrects issues in HFS+ concerning the use of non-english file names and attributes. Names and attributes are stored internally as UTF-16 units up to a fixed maximum size, and convert to and from user-representation by NLS. The code incorrectly assume that NLS string lengths are equal to unicode lengths, which is only true for English ascii usage. This patch (of 3): The HFS Plus Volume Format specification (TN1150) states that file names are stored internally as a maximum of 255 unicode characters, as defined by The Unicode Standard, Version 2.0 [Unicode, Inc. ISBN 0-201-48345-9]. File names are converted by the NLS system on Linux before presented to the user. 255 CJK characters converts to UTF-8 with 1 unicode character to up to 3 bytes, and to GB18030 with 1 unicode character to up to 4 bytes. Thus, trying in a UTF-8 locale to list files with names of more than 85 CJK characters results in: $ ls /mnt ls: reading directory /mnt: File name too long The receiving buffer to hfsplus_uni2asc() needs to be 255 x NLS_MAX_CHARSET_SIZE bytes, not 255 bytes as the code has always been. Similar consideration applies to attributes, which are stored internally as a maximum of 127 UTF-16BE units. See XNU source for an up-to-date reference on attributes. Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6 is not attainable in the case of conversion to UTF-8, as going beyond 3 bytes requires the use of surrogate pairs, i.e. consuming two input units. Thanks Anton Altaparmakov for reviewing an earlier version of this change. This patch fixes all callers of hfsplus_uni2asc(), and also enables the use of long non-English file names in HFS+. The getting and setting, and general usage of long non-English attributes requires further forthcoming work, in the following patches of this series. [akpm@linux-foundation.org: fix build] Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net> Reviewed-by: Anton Altaparmakov <anton@tuxera.com> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Sougata Santra <sougata@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 4acdaf27 Mon Jul 25 23:42:34 MDT 2011 Al Viro <viro@zeniv.linux.org.uk> switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
/linux-master/fs/fat/ | ||
H A D | file.c | diff 4d7ca409 Thu Jan 12 16:49:32 MST 2023 Christian Brauner <brauner@kernel.org> fs: port vfs{g,u}id helpers to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> diff 4b789936 Thu Jan 21 06:19:56 MST 2021 Christian Brauner <christian.brauner@ubuntu.com> fat: handle idmapped mounts Let fat handle idmapped mounts. This allows to have the same fat mount appear in multiple locations with different id mappings. This allows to expose a vfat formatted USB stick to multiple user with different ids on the host or in user namespaces allowing for dac permissions: mount -o uid=1000,gid=1000 /dev/sdb /mnt u1001@f2-vm:/lower1$ ls -ln /mnt/ total 4 -rwxr-xr-x 1 1000 1000 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1000 1000 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1000 1000 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1000 1000 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1000 1000 0 Oct 28 04:01 eee mount-idmapped --map-mount b:1000:1001:1 u1001@f2-vm:/lower1$ ls -ln /lower1/ total 4 -rwxr-xr-x 1 1001 1001 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1001 1001 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1001 1001 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1001 1001 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1001 1001 0 Oct 28 04:01 eee u1001@f2-vm:/lower1$ touch /lower1/fff u1001@f2-vm:/lower1$ ls -ln /lower1/fff -rwxr-xr-x 1 1001 1001 0 Oct 28 04:03 /lower1/fff u1001@f2-vm:/lower1$ ls -ln /mnt/fff -rwxr-xr-x 1 1000 1000 0 Oct 28 04:03 /mnt/fff Link: https://lore.kernel.org/r/20210121131959.646623-38-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> diff 4b789936 Thu Jan 21 06:19:56 MST 2021 Christian Brauner <christian.brauner@ubuntu.com> fat: handle idmapped mounts Let fat handle idmapped mounts. This allows to have the same fat mount appear in multiple locations with different id mappings. This allows to expose a vfat formatted USB stick to multiple user with different ids on the host or in user namespaces allowing for dac permissions: mount -o uid=1000,gid=1000 /dev/sdb /mnt u1001@f2-vm:/lower1$ ls -ln /mnt/ total 4 -rwxr-xr-x 1 1000 1000 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1000 1000 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1000 1000 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1000 1000 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1000 1000 0 Oct 28 04:01 eee mount-idmapped --map-mount b:1000:1001:1 u1001@f2-vm:/lower1$ ls -ln /lower1/ total 4 -rwxr-xr-x 1 1001 1001 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1001 1001 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1001 1001 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1001 1001 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1001 1001 0 Oct 28 04:01 eee u1001@f2-vm:/lower1$ touch /lower1/fff u1001@f2-vm:/lower1$ ls -ln /lower1/fff -rwxr-xr-x 1 1001 1001 0 Oct 28 04:03 /lower1/fff u1001@f2-vm:/lower1$ ls -ln /mnt/fff -rwxr-xr-x 1 1000 1000 0 Oct 28 04:03 /mnt/fff Link: https://lore.kernel.org/r/20210121131959.646623-38-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> diff 4b789936 Thu Jan 21 06:19:56 MST 2021 Christian Brauner <christian.brauner@ubuntu.com> fat: handle idmapped mounts Let fat handle idmapped mounts. This allows to have the same fat mount appear in multiple locations with different id mappings. This allows to expose a vfat formatted USB stick to multiple user with different ids on the host or in user namespaces allowing for dac permissions: mount -o uid=1000,gid=1000 /dev/sdb /mnt u1001@f2-vm:/lower1$ ls -ln /mnt/ total 4 -rwxr-xr-x 1 1000 1000 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1000 1000 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1000 1000 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1000 1000 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1000 1000 0 Oct 28 04:01 eee mount-idmapped --map-mount b:1000:1001:1 u1001@f2-vm:/lower1$ ls -ln /lower1/ total 4 -rwxr-xr-x 1 1001 1001 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1001 1001 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1001 1001 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1001 1001 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1001 1001 0 Oct 28 04:01 eee u1001@f2-vm:/lower1$ touch /lower1/fff u1001@f2-vm:/lower1$ ls -ln /lower1/fff -rwxr-xr-x 1 1001 1001 0 Oct 28 04:03 /lower1/fff u1001@f2-vm:/lower1$ ls -ln /mnt/fff -rwxr-xr-x 1 1000 1000 0 Oct 28 04:03 /mnt/fff Link: https://lore.kernel.org/r/20210121131959.646623-38-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> diff 4b789936 Thu Jan 21 06:19:56 MST 2021 Christian Brauner <christian.brauner@ubuntu.com> fat: handle idmapped mounts Let fat handle idmapped mounts. This allows to have the same fat mount appear in multiple locations with different id mappings. This allows to expose a vfat formatted USB stick to multiple user with different ids on the host or in user namespaces allowing for dac permissions: mount -o uid=1000,gid=1000 /dev/sdb /mnt u1001@f2-vm:/lower1$ ls -ln /mnt/ total 4 -rwxr-xr-x 1 1000 1000 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1000 1000 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1000 1000 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1000 1000 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1000 1000 0 Oct 28 04:01 eee mount-idmapped --map-mount b:1000:1001:1 u1001@f2-vm:/lower1$ ls -ln /lower1/ total 4 -rwxr-xr-x 1 1001 1001 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1001 1001 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1001 1001 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1001 1001 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1001 1001 0 Oct 28 04:01 eee u1001@f2-vm:/lower1$ touch /lower1/fff u1001@f2-vm:/lower1$ ls -ln /lower1/fff -rwxr-xr-x 1 1001 1001 0 Oct 28 04:03 /lower1/fff u1001@f2-vm:/lower1$ ls -ln /mnt/fff -rwxr-xr-x 1 1000 1000 0 Oct 28 04:03 /mnt/fff Link: https://lore.kernel.org/r/20210121131959.646623-38-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> diff 4b789936 Thu Jan 21 06:19:56 MST 2021 Christian Brauner <christian.brauner@ubuntu.com> fat: handle idmapped mounts Let fat handle idmapped mounts. This allows to have the same fat mount appear in multiple locations with different id mappings. This allows to expose a vfat formatted USB stick to multiple user with different ids on the host or in user namespaces allowing for dac permissions: mount -o uid=1000,gid=1000 /dev/sdb /mnt u1001@f2-vm:/lower1$ ls -ln /mnt/ total 4 -rwxr-xr-x 1 1000 1000 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1000 1000 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1000 1000 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1000 1000 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1000 1000 0 Oct 28 04:01 eee mount-idmapped --map-mount b:1000:1001:1 u1001@f2-vm:/lower1$ ls -ln /lower1/ total 4 -rwxr-xr-x 1 1001 1001 4 Oct 28 03:44 aaa -rwxr-xr-x 1 1001 1001 0 Oct 28 01:09 bbb -rwxr-xr-x 1 1001 1001 0 Oct 28 01:10 ccc -rwxr-xr-x 1 1001 1001 0 Oct 28 03:46 ddd -rwxr-xr-x 1 1001 1001 0 Oct 28 04:01 eee u1001@f2-vm:/lower1$ touch /lower1/fff u1001@f2-vm:/lower1$ ls -ln /lower1/fff -rwxr-xr-x 1 1001 1001 0 Oct 28 04:03 /lower1/fff u1001@f2-vm:/lower1$ ls -ln /mnt/fff -rwxr-xr-x 1 1000 1000 0 Oct 28 04:03 /mnt/fff Link: https://lore.kernel.org/r/20210121131959.646623-38-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> diff a528d35e Tue Jan 31 09:46:22 MST 2017 David Howells <dhowells@redhat.com> statx: Add a system call to make enhanced file info available Add a system call to make extended file information available, including file creation and some attribute flags where available through the underlying filesystem. The getattr inode operation is altered to take two additional arguments: a u32 request_mask and an unsigned int flags that indicate the synchronisation mode. This change is propagated to the vfs_getattr*() function. Functions like vfs_stat() are now inline wrappers around new functions vfs_statx() and vfs_statx_fd() to reduce stack usage. ======== OVERVIEW ======== The idea was initially proposed as a set of xattrs that could be retrieved with getxattr(), but the general preference proved to be for a new syscall with an extended stat structure. A number of requests were gathered for features to be included. The following have been included: (1) Make the fields a consistent size on all arches and make them large. (2) Spare space, request flags and information flags are provided for future expansion. (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an __s64). (4) Creation time: The SMB protocol carries the creation time, which could be exported by Samba, which will in turn help CIFS make use of FS-Cache as that can be used for coherency data (stx_btime). This is also specified in NFSv4 as a recommended attribute and could be exported by NFSD [Steve French]. (5) Lightweight stat: Ask for just those details of interest, and allow a netfs (such as NFS) to approximate anything not of interest, possibly without going to the server [Trond Myklebust, Ulrich Drepper, Andreas Dilger] (AT_STATX_DONT_SYNC). (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks its cached attributes are up to date [Trond Myklebust] (AT_STATX_FORCE_SYNC). And the following have been left out for future extension: (7) Data version number: Could be used by userspace NFS servers [Aneesh Kumar]. Can also be used to modify fill_post_wcc() in NFSD which retrieves i_version directly, but has just called vfs_getattr(). It could get it from the kstat struct if it used vfs_xgetattr() instead. (There's disagreement on the exact semantics of a single field, since not all filesystems do this the same way). (8) BSD stat compatibility: Including more fields from the BSD stat such as creation time (st_btime) and inode generation number (st_gen) [Jeremy Allison, Bernd Schubert]. (9) Inode generation number: Useful for FUSE and userspace NFS servers [Bernd Schubert]. (This was asked for but later deemed unnecessary with the open-by-handle capability available and caused disagreement as to whether it's a security hole or not). (10) Extra coherency data may be useful in making backups [Andreas Dilger]. (No particular data were offered, but things like last backup timestamp, the data version number and the DOS archive bit would come into this category). (11) Allow the filesystem to indicate what it can/cannot provide: A filesystem can now say it doesn't support a standard stat feature if that isn't available, so if, for instance, inode numbers or UIDs don't exist or are fabricated locally... (This requires a separate system call - I have an fsinfo() call idea for this). (12) Store a 16-byte volume ID in the superblock that can be returned in struct xstat [Steve French]. (Deferred to fsinfo). (13) Include granularity fields in the time data to indicate the granularity of each of the times (NFSv4 time_delta) [Steve French]. (Deferred to fsinfo). (14) FS_IOC_GETFLAGS value. These could be translated to BSD's st_flags. Note that the Linux IOC flags are a mess and filesystems such as Ext4 define flags that aren't in linux/fs.h, so translation in the kernel may be a necessity (or, possibly, we provide the filesystem type too). (Some attributes are made available in stx_attributes, but the general feeling was that the IOC flags were to ext[234]-specific and shouldn't be exposed through statx this way). (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer, Michael Kerrisk]. (Deferred, probably to fsinfo. Finding out if there's an ACL or seclabal might require extra filesystem operations). (16) Femtosecond-resolution timestamps [Dave Chinner]. (A __reserved field has been left in the statx_timestamp struct for this - if there proves to be a need). (17) A set multiple attributes syscall to go with this. =============== NEW SYSTEM CALL =============== The new system call is: int ret = statx(int dfd, const char *filename, unsigned int flags, unsigned int mask, struct statx *buffer); The dfd, filename and flags parameters indicate the file to query, in a similar way to fstatat(). There is no equivalent of lstat() as that can be emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags. There is also no equivalent of fstat() as that can be emulated by passing a NULL filename to statx() with the fd of interest in dfd. Whether or not statx() synchronises the attributes with the backing store can be controlled by OR'ing a value into the flags argument (this typically only affects network filesystems): (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this respect. (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise its attributes with the server - which might require data writeback to occur to get the timestamps correct. (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a network filesystem. The resulting values should be considered approximate. mask is a bitmask indicating the fields in struct statx that are of interest to the caller. The user should set this to STATX_BASIC_STATS to get the basic set returned by stat(). It should be noted that asking for more information may entail extra I/O operations. buffer points to the destination for the data. This must be 256 bytes in size. ====================== MAIN ATTRIBUTES RECORD ====================== The following structures are defined in which to return the main attribute set: struct statx_timestamp { __s64 tv_sec; __s32 tv_nsec; __s32 __reserved; }; struct statx { __u32 stx_mask; __u32 stx_blksize; __u64 stx_attributes; __u32 stx_nlink; __u32 stx_uid; __u32 stx_gid; __u16 stx_mode; __u16 __spare0[1]; __u64 stx_ino; __u64 stx_size; __u64 stx_blocks; __u64 __spare1[1]; struct statx_timestamp stx_atime; struct statx_timestamp stx_btime; struct statx_timestamp stx_ctime; struct statx_timestamp stx_mtime; __u32 stx_rdev_major; __u32 stx_rdev_minor; __u32 stx_dev_major; __u32 stx_dev_minor; __u64 __spare2[14]; }; The defined bits in request_mask and stx_mask are: STATX_TYPE Want/got stx_mode & S_IFMT STATX_MODE Want/got stx_mode & ~S_IFMT STATX_NLINK Want/got stx_nlink STATX_UID Want/got stx_uid STATX_GID Want/got stx_gid STATX_ATIME Want/got stx_atime{,_ns} STATX_MTIME Want/got stx_mtime{,_ns} STATX_CTIME Want/got stx_ctime{,_ns} STATX_INO Want/got stx_ino STATX_SIZE Want/got stx_size STATX_BLOCKS Want/got stx_blocks STATX_BASIC_STATS [The stuff in the normal stat struct] STATX_BTIME Want/got stx_btime{,_ns} STATX_ALL [All currently available stuff] stx_btime is the file creation time, stx_mask is a bitmask indicating the data provided and __spares*[] are where as-yet undefined fields can be placed. Time fields are structures with separate seconds and nanoseconds fields plus a reserved field in case we want to add even finer resolution. Note that times will be negative if before 1970; in such a case, the nanosecond fields will also be negative if not zero. The bits defined in the stx_attributes field convey information about a file, how it is accessed, where it is and what it does. The following attributes map to FS_*_FL flags and are the same numerical value: STATX_ATTR_COMPRESSED File is compressed by the fs STATX_ATTR_IMMUTABLE File is marked immutable STATX_ATTR_APPEND File is append-only STATX_ATTR_NODUMP File is not to be dumped STATX_ATTR_ENCRYPTED File requires key to decrypt in fs Within the kernel, the supported flags are listed by: KSTAT_ATTR_FS_IOC_FLAGS [Are any other IOC flags of sufficient general interest to be exposed through this interface?] New flags include: STATX_ATTR_AUTOMOUNT Object is an automount trigger These are for the use of GUI tools that might want to mark files specially, depending on what they are. Fields in struct statx come in a number of classes: (0) stx_dev_*, stx_blksize. These are local system information and are always available. (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino, stx_size, stx_blocks. These will be returned whether the caller asks for them or not. The corresponding bits in stx_mask will be set to indicate whether they actually have valid values. If the caller didn't ask for them, then they may be approximated. For example, NFS won't waste any time updating them from the server, unless as a byproduct of updating something requested. If the values don't actually exist for the underlying object (such as UID or GID on a DOS file), then the bit won't be set in the stx_mask, even if the caller asked for the value. In such a case, the returned value will be a fabrication. Note that there are instances where the type might not be valid, for instance Windows reparse points. (2) stx_rdev_*. This will be set only if stx_mode indicates we're looking at a blockdev or a chardev, otherwise will be 0. (3) stx_btime. Similar to (1), except this will be set to 0 if it doesn't exist. ======= TESTING ======= The following test program can be used to test the statx system call: samples/statx/test-statx.c Just compile and run, passing it paths to the files you want to examine. The file is built automatically if CONFIG_SAMPLES is enabled. Here's some example output. Firstly, an NFS directory that crosses to another FSID. Note that the AUTOMOUNT attribute is set because transiting this directory will cause d_automount to be invoked by the VFS. [root@andromeda ~]# /tmp/test-statx -A /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:26 Inode: 1703937 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------) Secondly, the result of automounting on that directory. [root@andromeda ~]# /tmp/test-statx /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:27 Inode: 2 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff b1da47e2 Tue Jul 01 07:01:28 MDT 2008 Miklos Szeredi <mszeredi@suse.cz> [patch 3/4] fat: dont call notify_change The FAT_IOCTL_SET_ATTRIBUTES ioctl() calls notify_change() to change the file mode before changing the inode attributes. Replace with explicit calls to security_inode_setattr(), fat_setattr() and fsnotify_change(). This is equivalent to the original. The reason it is needed, is that later in the series we move the immutable check into notify_change(). That would break the FAT_IOCTL_SET_ATTRIBUTES ioctl, as it needs to perform the mode change regardless of the immutability of the file. [Fix error if fat is built as a module. Thanks to OGAWA Hirofumi for noticing.] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> diff 4b6f5d20 Tue Mar 28 02:56:42 MST 2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
Completed in 539 milliseconds