Searched +hist:16 +hist:db3d3f (Results 1 - 3 of 3) sorted by relevance

/linux-master/include/linux/
H A Dsysctl.hdiff 3bc269cf Tue Aug 08 16:49:59 MDT 2023 Joel Granados <joel.granados@gmail.com> sysctl: Add size arg to __register_sysctl_init

This commit adds table_size to __register_sysctl_init in preparation for
the removal of the sentinel elements in the ctl_table arrays (last empty
markers). And though we do *not* remove any sentinels in this commit, we
set things up by calculating the ctl_table array size with ARRAY_SIZE.

We add a table_size argument to __register_sysctl_init and modify the
register_sysctl_init macro to calculate the array size with ARRAY_SIZE.
The original callers do not need to be updated as they will go through
the new macro.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 9edbfe92 Tue Aug 08 16:49:58 MDT 2023 Joel Granados <joel.granados@gmail.com> sysctl: Add size to register_sysctl

This commit adds table_size to register_sysctl in preparation for the
removal of the sentinel elements in the ctl_table arrays (last empty
markers). And though we do *not* remove any sentinels in this commit, we
set things up by either passing the table_size explicitly or using
ARRAY_SIZE on the ctl_table arrays.

We replace the register_syctl function with a macro that will add the
ARRAY_SIZE to the new register_sysctl_sz function. In this way the
callers that are already using an array of ctl_table structs do not
change. For the callers that pass a ctl_table array pointer, we pass the
table_size to register_sysctl_sz instead of the macro.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff bff97cf1 Tue Aug 08 16:49:57 MDT 2023 Joel Granados <joel.granados@gmail.com> sysctl: Add a size arg to __register_sysctl_table

We make these changes in order to prepare __register_sysctl_table and
its callers for when we remove the sentinel element (empty element at
the end of ctl_table arrays). We don't actually remove any sentinels in
this commit, but we *do* make sure to use ARRAY_SIZE so the table_size
is available when the removal occurs.

We add a table_size argument to __register_sysctl_table and adjust
callers, all of which pass ctl_table pointers and need an explicit call
to ARRAY_SIZE. We implement a size calculation in register_net_sysctl in
order to forward the size of the array pointer received from the network
register calls.

The new table_size argument does not yet have any effect in the
init_header call which is still dependent on the sentinel's presence.
table_size *does* however drive the `kzalloc` allocation in
__register_sysctl_table with no adverse effects as the allocated memory
is either one element greater than the calculated ctl_table array (for
the calls in ipc_sysctl.c, mq_sysctl.c and ucount.c) or the exact size
of the calculated ctl_table array (for the call from sysctl_net.c and
register_sysctl). This approach will allows us to "just" remove the
sentinel without further changes to __register_sysctl_table as
table_size will represent the exact size for all the callers at that
point.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 1e887723 Tue Aug 08 16:49:55 MDT 2023 Joel Granados <joel.granados@gmail.com> sysctl: Add ctl_table_size to ctl_table_header

The new ctl_table_size element will hold the size of the ctl_table
arrays contained in the ctl_table_header. This value should eventually
be passed by the callers to the sysctl register infrastructure. And
while this commit introduces the variable, it does not set nor use it
because that requires case by case considerations for each caller.

It provides two important things: (1) A place to put the
result of the ctl_table array calculation when it gets introduced for
each caller. And (2) the size that will be used as the additional
stopping criteria in the list_for_each_table_entry macro (to be added
when all the callers are migrated)

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 2f2665c1 Fri Jun 16 02:59:22 MDT 2023 Joel Granados <j.granados@samsung.com> sysctl: replace child with an enumeration

This is part of the effort to remove the empty element at the end of
ctl_table structs. "child" was a deprecated elem in this struct and was
being used to differentiate between two types of ctl_tables: "normal"
and "permanently emtpy".

What changed?:
* Replace "child" with an enumeration that will have two values: the
default (0) and the permanently empty (1). The latter is left at zero
so when struct ctl_table is created with kzalloc or in a local
context, it will have the zero value by default. We document the
new enum with kdoc.
* Remove the "empty child" check from sysctl_check_table
* Remove count_subheaders function as there is no longer a need to
calculate how many headers there are for every child
* Remove the recursive call to unregister_sysctl_table as there is no
need to traverse down the child tree any longer
* Add a new SYSCTL_PERM_EMPTY_DIR binary flag
* Remove the last remanence of child from partport/procfs.c

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 0199849a Tue May 02 19:30:04 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: remove register_sysctl_paths()

The deprecation for register_sysctl_paths() is over. We can rejoice as
we nuke register_sysctl_paths(). The routine register_sysctl_table()
was the only user left of register_sysctl_paths(), so we can now just
open code and move the implementation over to what used to be
to __register_sysctl_paths().

The old dynamic struct ctl_table_set *set is now the point to
sysctl_table_root.default_set.

The old dynamic const struct ctl_path *path was being used in the
routine register_sysctl_paths() with a static:

static const struct ctl_path null_path[] = { {} };

Since this is a null path we can now just simplfy the old routine
and remove its use as its always empty.

This saves us a total of 230 bytes.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 2/7 grow/shrink: 1/1 up/down: 1015/-1245 (-230)
Function old new delta
register_leaf_sysctl_tables.constprop - 524 +524
register_sysctl_table 22 497 +475
__pfx_register_leaf_sysctl_tables.constprop - 16 +16
null_path 8 - -8
__pfx_register_sysctl_paths 16 - -16
__pfx_register_leaf_sysctl_tables 16 - -16
__pfx___register_sysctl_paths 16 - -16
__register_sysctl_base 29 12 -17
register_sysctl_paths 18 - -18
register_leaf_sysctl_tables 534 - -534
__register_sysctl_paths 620 - -620
Total: Before=21259666, After=21259436, chg -0.00%

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
/linux-master/kernel/
H A Dsysctl.cdiff e95d372c Tue May 16 00:38:20 MDT 2023 Kefeng Wang <wangkefeng.wang@huawei.com> mm: page_alloc: move sysctls into it own fils

This moves all page alloc related sysctls to its own file, as part of the
kernel/sysctl.c spring cleaning, also move some functions declarations
from mm.h into internal.h.

Link: https://lkml.kernel.org/r/20230516063821.121844-13-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff 28898e26 Sun May 28 14:54:20 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: move security keys sysctl registration to its own file

The security keys sysctls are already declared on its own file,
just move the sysctl registration to its own file to help avoid
merge conflicts on sysctls.c, and help with clearing up sysctl.c
further.

This creates a small penalty of 23 bytes:

./scripts/bloat-o-meter vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function old new delta
init_security_keys_sysctls - 33 +33
__pfx_init_security_keys_sysctls - 16 +16
sysctl_init_bases 85 59 -26
Total: Before=21256937, After=21256960, chg +0.00%

But soon we'll be saving tons of bytes anyway, as we modify the
sysctl registrations to use ARRAY_SIZE and so we get rid of all the
empty array elements so let's just clean this up now.

Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 28898e26 Sun May 28 14:54:20 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: move security keys sysctl registration to its own file

The security keys sysctls are already declared on its own file,
just move the sysctl registration to its own file to help avoid
merge conflicts on sysctls.c, and help with clearing up sysctl.c
further.

This creates a small penalty of 23 bytes:

./scripts/bloat-o-meter vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function old new delta
init_security_keys_sysctls - 33 +33
__pfx_init_security_keys_sysctls - 16 +16
sysctl_init_bases 85 59 -26
Total: Before=21256937, After=21256960, chg +0.00%

But soon we'll be saving tons of bytes anyway, as we modify the
sysctl registrations to use ARRAY_SIZE and so we get rid of all the
empty array elements so let's just clean this up now.

Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 861dc0b4 Sun May 28 14:43:46 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: move umh sysctl registration to its own file

Move the umh sysctl registration to its own file, the array is
already there. We do this to remove the clutter out of kernel/sysctl.c
to avoid merge conflicts.

This also lets the sysctls not be built at all now when CONFIG_SYSCTL
is not enabled.

This has a small penalty of 23 bytes but soon we'll be removing
all the empty entries on sysctl arrays so just do this cleanup
now:

./scripts/bloat-o-meter vmlinux.base vmlinux.1
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function old new delta
init_umh_sysctls - 33 +33
__pfx_init_umh_sysctls - 16 +16
sysctl_init_bases 111 85 -26
Total: Before=21256914, After=21256937, chg +0.00%

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 861dc0b4 Sun May 28 14:43:46 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> sysctl: move umh sysctl registration to its own file

Move the umh sysctl registration to its own file, the array is
already there. We do this to remove the clutter out of kernel/sysctl.c
to avoid merge conflicts.

This also lets the sysctls not be built at all now when CONFIG_SYSCTL
is not enabled.

This has a small penalty of 23 bytes but soon we'll be removing
all the empty entries on sysctl arrays so just do this cleanup
now:

./scripts/bloat-o-meter vmlinux.base vmlinux.1
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function old new delta
init_umh_sysctls - 33 +33
__pfx_init_umh_sysctls - 16 +16
sysctl_init_bases 111 85 -26
Total: Before=21256914, After=21256937, chg +0.00%

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 01e6aac7 Thu May 18 14:37:41 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> signal: move show_unhandled_signals sysctl to its own file

The show_unhandled_signals sysctl is the only sysctl for debug
left on kernel/sysctl.c. We've been moving the syctls out from
kernel/sysctl.c so to help avoid merge conflicts as the shared
array gets out of hand.

This change incurs simplifies sysctl registration by localizing
it where it should go for a penalty in size of increasing the
kernel by 23 bytes, we accept this given recent cleanups have
actually already saved us 1465 bytes in the prior commits.

./scripts/bloat-o-meter vmlinux.3-remove-dev-table vmlinux.4-remove-debug-table
add/remove: 3/1 grow/shrink: 0/1 up/down: 177/-154 (23)
Function old new delta
signal_debug_table - 128 +128
init_signal_sysctls - 33 +33
__pfx_init_signal_sysctls - 16 +16
sysctl_init_bases 85 59 -26
debug_table 128 - -128
Total: Before=21256967, After=21256990, chg +0.00%

Reviewed-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 01e6aac7 Thu May 18 14:37:41 MDT 2023 Luis Chamberlain <mcgrof@kernel.org> signal: move show_unhandled_signals sysctl to its own file

The show_unhandled_signals sysctl is the only sysctl for debug
left on kernel/sysctl.c. We've been moving the syctls out from
kernel/sysctl.c so to help avoid merge conflicts as the shared
array gets out of hand.

This change incurs simplifies sysctl registration by localizing
it where it should go for a penalty in size of increasing the
kernel by 23 bytes, we accept this given recent cleanups have
actually already saved us 1465 bytes in the prior commits.

./scripts/bloat-o-meter vmlinux.3-remove-dev-table vmlinux.4-remove-debug-table
add/remove: 3/1 grow/shrink: 0/1 up/down: 177/-154 (23)
Function old new delta
signal_debug_table - 128 +128
init_signal_sysctls - 33 +33
__pfx_init_signal_sysctls - 16 +16
sysctl_init_bases 85 59 -26
debug_table 128 - -128
Total: Before=21256967, After=21256990, chg +0.00%

Reviewed-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 374a723c Mon May 16 03:07:53 MDT 2022 Fanjun Kong <bh1scw@gmail.com> kernel/sysctl.c: Remove trailing white space

This patch removes the trailing white space in kernel/sysysctl.c

Signed-off-by: Fanjun Kong <bh1scw@gmail.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
[mcgrof: fix commit message subject]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff 06d17766 Wed Feb 16 09:23:21 MST 2022 tangmeng <tangmeng@uniontech.com> kernel/reboot: move reboot sysctls to its own file

kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to places
where they actually belong. The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we
just care about the core logic.

All filesystem syctls now get reviewed by fs folks. This commit
follows the commit of fs, move the poweroff_cmd and ctrl-alt-del
sysctls to its own file, kernel/reboot.c.

Signed-off-by: tangmeng <tangmeng@uniontech.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
diff a2071573 Mon Aug 02 16:59:36 MDT 2021 Jia He <hejianet@gmail.com> sysctl: introduce new proc handler proc_dobool

This is to let bool variable could be correctly displayed in
big/little endian sysctl procfs. sizeof(bool) is arch dependent,
proc_dobool should work in all arches.

Suggested-by: Pan Xinhui <xinhui@linux.vnet.ibm.com>
Signed-off-by: Jia He <hejianet@gmail.com>
[thuth: rebased the patch to the current kernel version]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
H A Dfork.cdiff 3888750e Thu Nov 16 06:36:37 MST 2023 Heiko Carstens <hca@linux.ibm.com> arch: remove ARCH_TASK_STRUCT_ALLOCATOR

IA-64 was the only architecture which selected ARCH_TASK_STRUCT_ALLOCATOR.
IA-64 was removed with commit cf8e8658100d ("arch: Remove Itanium (IA-64)
architecture"). Therefore remove support for ARCH_THREAD_STACK_ALLOCATOR
as well.

Link: https://lkml.kernel.org/r/20231116133638.1636277-3-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff f72709ab Thu Nov 16 06:36:36 MST 2023 Heiko Carstens <hca@linux.ibm.com> arch: remove ARCH_THREAD_STACK_ALLOCATOR

Patch series "Remove unused code after IA-64 removal".

While looking into something different I noticed that there are a couple
of Kconfig options which were only selected by IA-64 and which are now
unused.

So remove them and simplify the code a bit.


This patch (of 3):

IA-64 was the only architecture which selected ARCH_THREAD_STACK_ALLOCATOR.
IA-64 was removed with commit cf8e8658100d ("arch: Remove Itanium (IA-64)
architecture"). Therefore remove support for ARCH_THREAD_STACK_ALLOCATOR as
well.

Link: https://lkml.kernel.org/r/20231116133638.1636277-1-hca@linux.ibm.com
Link: https://lkml.kernel.org/r/20231116133638.1636277-2-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff a1140cb2 Tue Aug 23 09:45:32 MDT 2022 Kuniyuki Iwashima <kuniyu@amazon.com> seccomp: Move copy_seccomp() to no failure path.

Our syzbot instance reported memory leaks in do_seccomp() [0], similar
to the report [1]. It shows that we miss freeing struct seccomp_filter
and some objects included in it.

We can reproduce the issue with the program below [2] which calls one
seccomp() and two clone() syscalls.

The first clone()d child exits earlier than its parent and sends a
signal to kill it during the second clone(), more precisely before the
fatal_signal_pending() test in copy_process(). When the parent receives
the signal, it has to destroy the embryonic process and return -EINTR to
user space. In the failure path, we have to call seccomp_filter_release()
to decrement the filter's refcount.

Initially, we called it in free_task() called from the failure path, but
the commit 3a15fb6ed92c ("seccomp: release filter after task is fully
dead") moved it to release_task() to notify user space as early as possible
that the filter is no longer used.

To keep the change and current seccomp refcount semantics, let's move
copy_seccomp() just after the signal check and add a WARN_ON_ONCE() in
free_task() for future debugging.

[0]:
unreferenced object 0xffff8880063add00 (size 256):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.914s)
hex dump (first 32 bytes):
01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
backtrace:
do_seccomp (./include/linux/slab.h:600 ./include/linux/slab.h:733 kernel/seccomp.c:666 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffffc90000035000 (size 4096):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 05 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
__vmalloc_node_range (mm/vmalloc.c:3226)
__vmalloc_node (mm/vmalloc.c:3261 (discriminator 4))
bpf_prog_alloc_no_stats (kernel/bpf/core.c:91)
bpf_prog_alloc (kernel/bpf/core.c:129)
bpf_prog_create_from_user (net/core/filter.c:1414)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffff888003fa1000 (size 1024):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
bpf_prog_alloc_no_stats (./include/linux/slab.h:600 ./include/linux/slab.h:733 kernel/bpf/core.c:95)
bpf_prog_alloc (kernel/bpf/core.c:129)
bpf_prog_create_from_user (net/core/filter.c:1414)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffff888006360240 (size 16):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 16 bytes):
01 00 37 00 76 65 72 6c e0 83 01 06 80 88 ff ff ..7.verl........
backtrace:
bpf_prog_store_orig_filter (net/core/filter.c:1137)
bpf_prog_create_from_user (net/core/filter.c:1428)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffff8880060183e0 (size 8):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 8 bytes):
06 00 00 00 00 00 ff 7f ........
backtrace:
kmemdup (mm/util.c:129)
bpf_prog_store_orig_filter (net/core/filter.c:1144)
bpf_prog_create_from_user (net/core/filter.c:1428)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

[1]: https://syzkaller.appspot.com/bug?id=2809bb0ac77ad9aa3f4afe42d6a610aba594a987

[2]:
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

void main(void)
{
struct sock_filter filter[] = {
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};
struct sock_fprog fprog = {
.len = sizeof(filter) / sizeof(filter[0]),
.filter = filter,
};
long i, pid;

syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 0, &fprog);

for (i = 0; i < 2; i++) {
pid = syscall(__NR_clone, CLONE_NEWNET | SIGKILL, NULL, NULL, 0);
if (pid == 0)
return;
}
}

Fixes: 3a15fb6ed92c ("seccomp: release filter after task is fully dead")
Reported-by: syzbot+ab17848fe269b573eb71@syzkaller.appspotmail.com
Reported-by: Ayushman Dutta <ayudutta@amazon.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220823154532.82913-1-kuniyu@amazon.com
diff a1140cb2 Tue Aug 23 09:45:32 MDT 2022 Kuniyuki Iwashima <kuniyu@amazon.com> seccomp: Move copy_seccomp() to no failure path.

Our syzbot instance reported memory leaks in do_seccomp() [0], similar
to the report [1]. It shows that we miss freeing struct seccomp_filter
and some objects included in it.

We can reproduce the issue with the program below [2] which calls one
seccomp() and two clone() syscalls.

The first clone()d child exits earlier than its parent and sends a
signal to kill it during the second clone(), more precisely before the
fatal_signal_pending() test in copy_process(). When the parent receives
the signal, it has to destroy the embryonic process and return -EINTR to
user space. In the failure path, we have to call seccomp_filter_release()
to decrement the filter's refcount.

Initially, we called it in free_task() called from the failure path, but
the commit 3a15fb6ed92c ("seccomp: release filter after task is fully
dead") moved it to release_task() to notify user space as early as possible
that the filter is no longer used.

To keep the change and current seccomp refcount semantics, let's move
copy_seccomp() just after the signal check and add a WARN_ON_ONCE() in
free_task() for future debugging.

[0]:
unreferenced object 0xffff8880063add00 (size 256):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.914s)
hex dump (first 32 bytes):
01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
backtrace:
do_seccomp (./include/linux/slab.h:600 ./include/linux/slab.h:733 kernel/seccomp.c:666 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffffc90000035000 (size 4096):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 05 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
__vmalloc_node_range (mm/vmalloc.c:3226)
__vmalloc_node (mm/vmalloc.c:3261 (discriminator 4))
bpf_prog_alloc_no_stats (kernel/bpf/core.c:91)
bpf_prog_alloc (kernel/bpf/core.c:129)
bpf_prog_create_from_user (net/core/filter.c:1414)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffff888003fa1000 (size 1024):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
bpf_prog_alloc_no_stats (./include/linux/slab.h:600 ./include/linux/slab.h:733 kernel/bpf/core.c:95)
bpf_prog_alloc (kernel/bpf/core.c:129)
bpf_prog_create_from_user (net/core/filter.c:1414)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffff888006360240 (size 16):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 16 bytes):
01 00 37 00 76 65 72 6c e0 83 01 06 80 88 ff ff ..7.verl........
backtrace:
bpf_prog_store_orig_filter (net/core/filter.c:1137)
bpf_prog_create_from_user (net/core/filter.c:1428)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
unreferenced object 0xffff8880060183e0 (size 8):
comm "repro_seccomp", pid 230, jiffies 4294687090 (age 9.915s)
hex dump (first 8 bytes):
06 00 00 00 00 00 ff 7f ........
backtrace:
kmemdup (mm/util.c:129)
bpf_prog_store_orig_filter (net/core/filter.c:1144)
bpf_prog_create_from_user (net/core/filter.c:1428)
do_seccomp (kernel/seccomp.c:671 kernel/seccomp.c:708 kernel/seccomp.c:1871 kernel/seccomp.c:1991)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

[1]: https://syzkaller.appspot.com/bug?id=2809bb0ac77ad9aa3f4afe42d6a610aba594a987

[2]:
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

void main(void)
{
struct sock_filter filter[] = {
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};
struct sock_fprog fprog = {
.len = sizeof(filter) / sizeof(filter[0]),
.filter = filter,
};
long i, pid;

syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 0, &fprog);

for (i = 0; i < 2; i++) {
pid = syscall(__NR_clone, CLONE_NEWNET | SIGKILL, NULL, NULL, 0);
if (pid == 0)
return;
}
}

Fixes: 3a15fb6ed92c ("seccomp: release filter after task is fully dead")
Reported-by: syzbot+ab17848fe269b573eb71@syzkaller.appspotmail.com
Reported-by: Ayushman Dutta <ayudutta@amazon.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220823154532.82913-1-kuniyu@amazon.com
diff 8d9bfb26 Wed Sep 14 16:18:07 MDT 2022 Mike Kravetz <mike.kravetz@oracle.com> hugetlb: add vma based lock for pmd sharing

Allocate a new hugetlb_vma_lock structure and hang off vm_private_data for
synchronization use by vmas that could be involved in pmd sharing. This
data structure contains a rw semaphore that is the primary tool used for
synchronization.

This new structure is ref counted, so that it can exist when NOT attached
to a vma. This is only helpful in resolving lock ordering issues where
code may need to obtain the vma_lock while there are no guarantees the vma
may go away. By obtaining a ref on the structure, it can be guaranteed
that at least the rw semaphore will not go away.

Only add infrastructure for the new lock here. Actual use will be added
in subsequent patches.

[mike.kravetz@oracle.com: fix build issue for missing hugetlb_vma_lock_release]
Link: https://lkml.kernel.org/r/YyNUtA1vRASOE4+M@monkey
Link: https://lkml.kernel.org/r/20220914221810.95771-7-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff 434c9eef Mon May 16 18:56:16 MDT 2022 Paul E. McKenney <paulmck@kernel.org> rcu-tasks: Add data structures for lightweight grace periods

This commit adds fields to task_struct and to rcu_tasks_percpu that will
be used to avoid the task-list scan for RCU Tasks Trace grace periods,
and also initializes these fields.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: KP Singh <kpsingh@kernel.org>
diff 434c9eef Mon May 16 18:56:16 MDT 2022 Paul E. McKenney <paulmck@kernel.org> rcu-tasks: Add data structures for lightweight grace periods

This commit adds fields to task_struct and to rcu_tasks_percpu that will
be used to avoid the task-list scan for RCU Tasks Trace grace periods,
and also initializes these fields.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: KP Singh <kpsingh@kernel.org>
diff a3d29e82 Mon Feb 07 16:02:50 MST 2022 Peter Zijlstra <peterz@infradead.org> sched: Define and initialize a flag to identify valid PASID in the task

Add a new single bit field to the task structure to track whether this task
has initialized the IA32_PASID MSR to the mm's PASID.

Initialize the field to zero when creating a new task with fork/clone.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220207230254.3342514-8-fenghua.yu@intel.com
diff 701fac40 Mon Feb 07 16:02:48 MST 2022 Fenghua Yu <fenghua.yu@intel.com> iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

PASIDs are process-wide. It was attempted to use refcounted PASIDs to
free them when the last thread drops the refcount. This turned out to
be complex and error prone. Given the fact that the PASID space is 20
bits, which allows up to 1M processes to have a PASID associated
concurrently, PASID resource exhaustion is not a realistic concern.

Therefore, it was decided to simplify the approach and stick with lazy
on demand PASID allocation, but drop the eager free approach and make an
allocated PASID's lifetime bound to the lifetime of the process.

Get rid of the refcounting mechanisms and replace/rename the interfaces
to reflect this new approach.

[ bp: Massage commit message. ]

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220207230254.3342514-6-fenghua.yu@intel.com
diff a6cbd440 Mon Feb 07 16:02:47 MST 2022 Fenghua Yu <fenghua.yu@intel.com> kernel/fork: Initialize mm's PASID

A new mm doesn't have a PASID yet when it's created. Initialize
the mm's PASID on fork() or for init_mm to INVALID_IOASID (-1).

INIT_PASID (0) is reserved for kernel legacy DMA PASID. It cannot be
allocated to a user process. Initializing the process's PASID to 0 may
cause confusion that's why the process uses the reserved kernel legacy
DMA PASID. Initializing the PASID to INVALID_IOASID (-1) explicitly
tells the process doesn't have a valid PASID yet.

Even though the only user of mm_pasid_init() is in fork.c, define it in
<linux/sched/mm.h> as the first of three mm/pasid life cycle functions
(init/set/drop) to keep these all together.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220207230254.3342514-5-fenghua.yu@intel.com

Completed in 1017 milliseconds