History log of /linux-master/drivers/tty/sysrq.c
Revision Date Author Comments
# 39ff20f5 19-Nov-2023 Tomas Mudrunka <tomas.mudrunka@gmail.com>

/proc/sysrq-trigger: accept multiple keys at once

This way we can do:
`echo _reisub > /proc/sysrq-trigger`
Instead of:
`for i in r e i s u b; do echo "$i" > /proc/sysrq-trigger; done;`

This can be very useful when trying to execute a sysrq combo remotely
or from userspace. When sending keys in multiple separate writes,
userspace (e.g. bash or ssh) can be killed before the whole combo is completed.
Therefore, putting all keys in a single write is a more robust approach.
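
The single-write behaviour can be modelled in userspace roughly as follows. This is only a sketch under assumptions: the function name, the output buffer, and the reading that a leading `_` switches the write into "process every key" mode (inferred from the `_reisub` example) are illustrative, not the kernel code itself.

```c
#include <stddef.h>

/*
 * Hypothetical userspace model of a multi-key write.
 * Keys that would be handed to the sysrq handler are collected in out[].
 * Assumption: without a leading '_', only the first key is processed.
 */
size_t sysrq_model_dispatch(const char *buf, size_t count,
                            char *out, size_t out_cap)
{
    int bulk = 0;   /* set once a '_' has been seen */
    size_t n = 0;   /* number of keys dispatched so far */

    for (size_t i = 0; i < count; i++) {
        char c = buf[i];

        if (c == '_')
            bulk = 1;           /* marker only, not a key */
        else if (n < out_cap)
            out[n++] = c;       /* stands in for handling the key */

        if (!bulk)
            break;              /* legacy behaviour: one key per write */
    }
    return n;
}
```

In this model, `"_reisub"` yields six keys from one write, while `"reisub"` without the marker yields only `r`.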

Signed-off-by: Tomas Mudrunka <tomas.mudrunka@gmail.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20231120111451.527952-1-tomas.mudrunka@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# dd976a97 09-Oct-2023 Muhammad Usama Anjum <usama.anjum@collabora.com>

tty/sysrq: replace smp_processor_id() with get_cpu()

smp_processor_id() shouldn't be called from preemptible code.
Instead use get_cpu() and put_cpu(), which disable and re-enable preemption in
addition to getting the processor id. Re-enable preemption only after
calling schedule_work() to make sure that the work gets scheduled on all
cores other than the current core. We want to avoid a scenario where the
current core's stack trace is printed multiple times and one core's
stack trace isn't printed because the current task was rescheduled.

This fixes the following bug:

[ 119.143590] sysrq: Show backtrace of all active CPUs
[ 119.143902] BUG: using smp_processor_id() in preemptible [00000000] code: bash/873
[ 119.144586] caller is debug_smp_processor_id+0x20/0x30
[ 119.144827] CPU: 6 PID: 873 Comm: bash Not tainted 5.10.124-dirty #3
[ 119.144861] Hardware name: QEMU QEMU Virtual Machine, BIOS 2023.05-1 07/22/2023
[ 119.145053] Call trace:
[ 119.145093] dump_backtrace+0x0/0x1a0
[ 119.145122] show_stack+0x18/0x70
[ 119.145141] dump_stack+0xc4/0x11c
[ 119.145159] check_preemption_disabled+0x100/0x110
[ 119.145175] debug_smp_processor_id+0x20/0x30
[ 119.145195] sysrq_handle_showallcpus+0x20/0xc0
[ 119.145211] __handle_sysrq+0x8c/0x1a0
[ 119.145227] write_sysrq_trigger+0x94/0x12c
[ 119.145247] proc_reg_write+0xa8/0xe4
[ 119.145266] vfs_write+0xec/0x280
[ 119.145282] ksys_write+0x6c/0x100
[ 119.145298] __arm64_sys_write+0x20/0x30
[ 119.145315] el0_svc_common.constprop.0+0x78/0x1e4
[ 119.145332] do_el0_svc+0x24/0x8c
[ 119.145348] el0_svc+0x10/0x20
[ 119.145364] el0_sync_handler+0x134/0x140
[ 119.145381] el0_sync+0x180/0x1c0

Cc: jirislaby@kernel.org
Cc: stable@vger.kernel.org
Fixes: 47cab6a722d4 ("debug lockups: Improve lockup detection, fix generic arch fallback")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20231009162021.3607632-1-usama.anjum@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# a27f3b72 12-Jul-2023 Jiri Slaby <jirislaby@kernel.org>

tty: sysrq: use switch in sysrq_key_table_key2index()

Using a switch with range cases makes the code more aligned and readable.
Also expand the 36 into an explicit addition of 10 + 26 to make the source
of the constant more obvious.
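
A sketch of what such a switch might look like, reconstructed from the description above rather than copied from the kernel. The function name is hypothetical, and the `case a ... b:` range syntax is a GNU C extension (accepted by GCC and Clang), so this is not portable ISO C.

```c
/*
 * Hypothetical reconstruction: map a key to a table index.
 * Digits occupy 0..9, lowercase letters 10..35, uppercase 36..61.
 */
int key2index_sketch(unsigned char key)
{
    switch (key) {
    case '0' ... '9':
        return key - '0';
    case 'a' ... 'z':
        return key - 'a' + 10;
    case 'A' ... 'Z':
        /* 10 digits + 26 lowercase letters, written out instead of 36 */
        return key - 'A' + 10 + 26;
    default:
        return -1;  /* no table slot for this key */
    }
}
```

Writing `10 + 26` keeps the origin of the offset visible: ten digit slots followed by twenty-six lowercase slots.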

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20230712081811.29004-5-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 8ac20a03 12-Jul-2023 Jiri Slaby <jirislaby@kernel.org>

tty: sysrq: switch the rest of keys to u8

Propagate u8 more from the bottom to the interface, so that sysrq
callers (usually drivers) see that u8 is expected.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20230712081811.29004-4-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# bcb48185 12-Jul-2023 Jiri Slaby <jirislaby@kernel.org>

tty: sysrq: switch sysrq handlers from int to u8

The parameter passed to sysrq handlers is a key (a character). So change
the type from 'int' to 'u8'. Let it specifically be 'u8' for two
reasons:
* unsigned: unsigned values come from the upper layers (devices) and the
tty layer assumes unsigned in most places, and
* 8-bit: as that is what is supposed to be used one day in all the layers
built on top of tty. (Currently, we use mostly 'unsigned char' and
in some places still only 'char'. (But that also translates to the former
thanks to -funsigned-char.))
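
The signedness point can be illustrated with a small userspace sketch. The `u8` typedef here is a stand-in for the kernel type, and both helper names are hypothetical; the negative result for `signed char` assumes the usual two's-complement conversion performed by GCC and Clang.

```c
#include <stdint.h>

typedef uint8_t u8;  /* userspace stand-in for the kernel's u8 */

/* Widening a signed 8-bit key sign-extends values above 0x7f. */
int key_as_int_from_signed(signed char key)
{
    return key;
}

/* Widening a u8 key always yields a value in 0..255. */
int key_as_int_from_u8(u8 key)
{
    return key;
}
```

With a byte such as 0xE9, the `u8` variant yields 233, whereas the signed variant yields a negative number, which is the kind of surprise an unsigned 8-bit type avoids.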

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de> # DRM
Acked-by: WANG Xuerui <git@xen0n.name> # loongarch
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20230712081811.29004-3-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 00ef7eff 12-Jul-2023 Jiri Slaby <jirislaby@kernel.org>

tty: sysrq: rename and re-type i in sysrq_handle_loglevel()

'i' is too generic a name for something which carries a 'loglevel'. Name
it as such and make it 'u8', the same as key will become in the next
patches.

Note that we are not stripping any high bits away; 'key' is given only
8-bit values.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20230712081811.29004-2-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 527ed4f7 30-Jun-2023 Kefeng Wang <wangkefeng.wang@huawei.com>

mm: remove arguments of show_mem()

All callers of show_mem() pass 0 and NULL, so we can remove the two
arguments by directly calling __show_mem(0, NULL, MAX_NR_ZONES - 1) in
show_mem().

Link: https://lkml.kernel.org/r/20230630062253.189440-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 292a089d 20-Dec-2022 Steven Rostedt (Google) <rostedt@goodmis.org>

treewide: Convert del_timer*() to timer_shutdown*()

Due to several bugs caused by timers being re-armed after they are
shut down and just before they are freed, a new timer state called
"shutdown" was added. After a timer is set to this state, it can no
longer be re-armed.
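
The idea of a terminal "shutdown" state can be sketched with a toy userspace model. Everything here (the struct and all three function names) is hypothetical and only illustrates the state rule described above; it does not reflect how the kernel timer code is implemented.

```c
#include <stdbool.h>

/* Toy model of a timer with a one-way "shutdown" state. */
struct timer_model {
    bool pending;   /* timer is currently armed */
    bool shutdown;  /* once set, arming is refused forever */
};

/* Try to arm the timer; returns false if it has been shut down. */
bool timer_model_arm(struct timer_model *t)
{
    if (t->shutdown)
        return false;
    t->pending = true;
    return true;
}

/* Plain delete: disarms, but the timer may be re-armed later. */
void timer_model_del(struct timer_model *t)
{
    t->pending = false;
}

/* Shutdown: disarms and blocks any later re-arm, so freeing is safe. */
void timer_model_shutdown(struct timer_model *t)
{
    t->pending = false;
    t->shutdown = true;
}
```

The difference that matters before a free is visible in the model: after a plain delete, a late arm still succeeds, while after shutdown it is rejected.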

The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.

This was created by using a coccinelle script and the following
commands:

$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)

$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch

Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 07a22b61 23-Jun-2022 Petr Mladek <pmladek@suse.com>

Revert "printk: add functions to prefer direct printing"

This reverts commit 2bb2b7b57f81255c13f4395ea911d6bdc70c9fe2.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that the console lock serialized console drivers with
respect to each other. Removing that serialization opens a big area of
possible problems that were not considered by the people involved in the
development and review.

printk() is crucial for debugging kernel issues and console output is
a very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it needs to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-7-pmladek@suse.com


# 5390e7f4 17-Jan-2022 Changbin Du <changbin.du@intel.com>

sysrq: do not omit current cpu when showing backtrace of all active CPUs

The backtrace of the current CPU should also be printed, as it is active.
This change adds a stack trace for the current CPU and prints a hint for
idle CPUs in the generic workqueue-based printing. (x86 already does this.)

Now it looks like below:
[ 279.401567] sysrq: Show backtrace of all active CPUs
[ 279.407234] sysrq: CPU5:
[ 279.407505] Call Trace:
[ 279.408789] [<ffffffff8000606c>] dump_backtrace+0x2c/0x3a
[ 279.411698] [<ffffffff800060ac>] show_stack+0x32/0x3e
[ 279.411809] [<ffffffff80542258>] sysrq_handle_showallcpus+0x4c/0xc6
[ 279.411929] [<ffffffff80542f16>] __handle_sysrq+0x106/0x26c
[ 279.412034] [<ffffffff805436a8>] write_sysrq_trigger+0x64/0x74
[ 279.412139] [<ffffffff8029cd48>] proc_reg_write+0x8e/0xe2
[ 279.412252] [<ffffffff8021a8f8>] vfs_write+0x90/0x2be
[ 279.412362] [<ffffffff8021acd2>] ksys_write+0xa6/0xce
[ 279.412467] [<ffffffff8021ad24>] sys_write+0x2a/0x38
[ 279.412689] [<ffffffff80003ff8>] ret_from_syscall+0x0/0x2
[ 279.417173] sysrq: CPU6: backtrace skipped as idling
[ 279.417185] sysrq: CPU4: backtrace skipped as idling
[ 279.417187] sysrq: CPU0: backtrace skipped as idling
[ 279.417181] sysrq: CPU7: backtrace skipped as idling
[ 279.417190] sysrq: CPU1: backtrace skipped as idling
[ 279.417193] sysrq: CPU3: backtrace skipped as idling
[ 279.417219] sysrq: CPU2:
[ 279.419179] Call Trace:
[ 279.419440] [<ffffffff8000606c>] dump_backtrace+0x2c/0x3a
[ 279.419782] [<ffffffff800060ac>] show_stack+0x32/0x3e
[ 279.420015] [<ffffffff80542b30>] showacpu+0x5c/0x96
[ 279.420317] [<ffffffff800ba71c>] flush_smp_call_function_queue+0xd6/0x218
[ 279.420569] [<ffffffff800bb438>] generic_smp_call_function_single_interrupt+0x14/0x1c
[ 279.420798] [<ffffffff800079ae>] handle_IPI+0xaa/0x13a
[ 279.421024] [<ffffffff804dcb92>] riscv_intc_irq+0x56/0x70
[ 279.421274] [<ffffffff80a05b70>] generic_handle_arch_irq+0x6a/0xfa
[ 279.421518] [<ffffffff80004006>] ret_from_exception+0x0/0x10
[ 279.421750] [<ffffffff80096492>] rcu_idle_enter+0x16/0x1e

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Link: https://lore.kernel.org/r/20220117154300.2808-1-changbin.du@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 8ec8719f 18-Apr-2022 Junwen Wu <wudaemon@gmail.com>

tty/sysrq: change the definition of sysrq_key_table's element to make it more readable

The definitions of sysrq_key_table's elements, such as sysrq_thaw_op and
sysrq_showallcpus_op, are not consistent with sysrq_ftrace_dump_op.
Consistency makes code more readable.

Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Junwen Wu <wudaemon@gmail.com>
Link: https://lore.kernel.org/r/20220418153703.97705-1-wudaemon@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 2bb2b7b5 21-Apr-2022 John Ogness <john.ogness@linutronix.de>

printk: add functions to prefer direct printing

Once kthread printing is available, console printing will no longer
occur in the context of the printk caller. However, there are some
special contexts where it is desirable for the printk caller to
directly print out kernel messages. Using pr_flush() to wait for
threaded printers is only possible if the caller is in a sleepable
context and the kthreads are active. That is not always the case.

Introduce printk_prefer_direct_enter() and printk_prefer_direct_exit()
functions to explicitly (and globally) activate/deactivate preferred
direct console printing. The term "direct console printing" refers to
printing to all enabled consoles from the context of the printk
caller. The term "prefer" is used because this type of printing is
only best effort. If the console is currently locked or other
printers are already actively printing, the printk caller will need
to rely on the other contexts to handle the printing.

This preferred direct printing is how all printing has been handled
until now (unless it was explicitly deferred).

When kthread printing is introduced, there may be some unanticipated
problems due to kthreads being unable to flush important messages.
In order to minimize such risks, preferred direct printing is
activated for the primary important messages when the system
experiences general types of major errors. These are:

- emergency reboot/shutdown
- cpu and rcu stalls
- hard and soft lockups
- hung tasks
- warn
- sysrq

Note that since kthread printing does not yet exist, no behavior
changes result from this commit. This is only implementing the
counter and marking the various places where preferred direct
printing is active.
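
A counter of this kind can be sketched in userspace as below. The function names are hypothetical and C11 atomics stand in for whatever the kernel uses; the sketch only shows the nesting behaviour implied by paired enter/exit calls.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of contexts currently asking for direct printing. */
static atomic_int prefer_direct_count;

/* Mark the start of a section that prefers direct printing. */
void prefer_direct_enter_model(void)
{
    atomic_fetch_add(&prefer_direct_count, 1);
}

/* Mark the end of such a section. */
void prefer_direct_exit_model(void)
{
    atomic_fetch_sub(&prefer_direct_count, 1);
}

/* Direct printing is preferred while at least one section is open. */
bool prefer_direct_active_model(void)
{
    return atomic_load(&prefer_direct_count) > 0;
}
```

Using a counter rather than a flag lets independent callers (for example, a sysrq handler and a lockup report) overlap without one switching the preference off for the other.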

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org> # for RCU
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220421212250.565456-13-john.ogness@linutronix.de


# 3aee752c 25-Sep-2021 Oskari Pirhonen <xxc3ncoredxx@gmail.com>

tty/sysrq: More intuitive Shift handling

Make Alt-SysRq-Shift-<key> behave like Alt-Shift-SysRq-<key>.

Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oskari Pirhonen <xxc3ncoredxx@gmail.com>
Link: https://lore.kernel.org/r/YU/6SCmUr9qGkqBu@dj3ntoo
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 55df0933 19-Oct-2021 Imran Khan <imran.f.khan@oracle.com>

workqueue: Introduce show_one_worker_pool and show_one_workqueue.

Currently show_workqueue_state shows the state of all workqueues and of
all worker pools. In certain cases we may need to dump state of only a
specific workqueue or worker pool. For example in destroy_workqueue we
only need to show state of the workqueue which is getting destroyed.

So rename show_workqueue_state to show_all_workqueues (to signify that it
dumps the state of all busy workqueues) and divide it into more granular
functions (show_one_workqueue and show_one_worker_pool) that show the
states of individual workqueues and worker pools and can be used in
cases such as the one mentioned above.

Also, as mentioned earlier, make destroy_workqueue dump data pertaining
only to the workqueue that is being destroyed, and make users of the
earlier interface (show_workqueue_state) use the new interface
(show_all_workqueues).

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>


# 1143637f 13-Aug-2021 Changbin Du <changbin.du@intel.com>

tty: replace in_irq() with in_hardirq()

Replace the obsolete and ambiguous macro in_irq() with the new
macro in_hardirq().

Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Link: https://lore.kernel.org/r/20210814005033.2381-1-changbin.du@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 149ad2c6 07-Apr-2021 Xiaofei Tan <tanxiaofei@huawei.com>

tty/sysrq: Fix issues of code indent should use tabs

Fix the "code indent should use tabs" issues reported by checkpatch.pl.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Link: https://lore.kernel.org/r/1617779210-51576-3-git-send-email-tanxiaofei@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 2c4a4cde 07-Apr-2021 Xiaofei Tan <tanxiaofei@huawei.com>

tty/sysrq: Add a blank line after declarations

Add a blank line after declarations, reported by checkpatch.pl.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Link: https://lore.kernel.org/r/1617779210-51576-2-git-send-email-tanxiaofei@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# a27eb0cb 18-Aug-2020 Andrzej Pietrasiewicz <andrzej.p@collabora.com>

tty/sysrq: Extend the sysrq_key_table to cover capital letters

All slots in sysrq_key_table[] are either used, reserved or at least
commented with their intended use. This patch makes capital letter versions
available, which means adding 26 more entries.

For already existing SysRq operations the user presses Alt-SysRq-<key>, and
for the newly added ones Alt-Shift-SysRq-<key>.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://lore.kernel.org/r/20200818112825.6445-2-andrzej.p@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# b818f09e 20-Jun-2020 Christoph Hellwig <hch@lst.de>

tty/sysrq: emergency_thaw_all does not depend on CONFIG_BLOCK

We can also thaw non-block file systems. Remove the CONFIG_BLOCK in
sysrq.c after making the prototype available unconditionally.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 9cb8f069 08-Jun-2020 Dmitry Safonov <0x7f454c46@gmail.com>

kernel: rename show_stack_loglvl() => show_stack()

Now that the last users of show_stack() have been converted to use an
explicit log level, show_stack_loglvl() can drop its redundant suffix and
once again become the well-known show_stack().

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200418201944.482088-51-dima@arista.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ab34b46d 08-Jun-2020 Dmitry Safonov <0x7f454c46@gmail.com>

sysrq: use show_stack_loglvl()

Show the stack trace on a CPU with the same log level as "CPU%d" header.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Link: http://lkml.kernel.org/r/20200418201944.482088-45-dima@arista.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7fffe31d 13-May-2020 Emil Velikov <emil.l.velikov@gmail.com>

tty/sysrq: constify the the sysrq_key_op(s)

All the users treat them as immutable, so annotate them as such.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Link: https://lore.kernel.org/r/20200513214351.2138580-3-emil.l.velikov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 23cbedf8 13-May-2020 Emil Velikov <emil.l.velikov@gmail.com>

tty/sysrq: constify the sysrq API

The user is not supposed to tinker with the underlying sysrq_key_op.
Make that explicit by adding a handful of const annotations.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Link: https://lore.kernel.org/r/20200513214351.2138580-2-emil.l.velikov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 0f1c9688 13-May-2020 Emil Velikov <emil.l.velikov@gmail.com>

tty/sysrq: alpha: export and use __sysrq_get_key_op()

Export a pointer to the sysrq_get_key_op(). This way we can cleanly
unregister it, instead of the current solution of modifying it in place.

Since __sysrq_get_key_op() is no longer used externally, let's make it
a static function.

This patch will allow us to limit access to each and every sysrq op and
constify the sysrq handling.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-kernel@vger.kernel.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Link: https://lore.kernel.org/r/20200513214351.2138580-1-emil.l.velikov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 66bb1c95 20-Apr-2020 Dmitry Safonov <0x7f454c46@gmail.com>

tty/sysrq: Export sysrq_mask(), sysrq_toggle_support()

Build fix for serial_core being module:
ERROR: modpost: "sysrq_toggle_support" [drivers/tty/serial/serial_core.ko] undefined!
ERROR: modpost: "sysrq_mask" [drivers/tty/serial/serial_core.ko] undefined!

Fixes: eaee41727e6d ("sysctl/sysrq: Remove __sysrq_enabled copy")
Cc: Jiri Slaby <jslaby@suse.com>
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200420172317.599611-1-dima@arista.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# eaee4172 02-Mar-2020 Dmitry Safonov <0x7f454c46@gmail.com>

sysctl/sysrq: Remove __sysrq_enabled copy

Many embedded boards have a disconnected TTL level serial which can
generate some garbage that can lead to spurious false sysrq detects.

Currently, sysrq can be either completely disabled for the serial console
or always enabled (with CONFIG_MAGIC_SYSRQ_SERIAL), since
commit 732dbf3a6104 ("serial: do not accept sysrq characters via serial port")

At Arista, we have such boards that can generate BREAK and random
garbage. While disabling sysrq for serial console would solve
the problem with spurious false sysrq triggers, it's also desirable
to have a way to enable sysrq back.

Having a way to enable sysrq was beneficial for debugging lockups through
manual investigation in the field while, on the other hand, preventing
false sysrq detections.

As preparation for adding a sysrq_toggle_support() call to uart,
remove the private copy of sysrq_enabled from sysctl; sysctl should reflect
the actual status of sysrq.

Furthermore, the private copy is already incorrect when
sysrq_always_enabled is true. So, remove __sysrq_enabled and use a
getter helper, sysrq_mask(), to check the sysrq_key_op enabled status.
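
A getter of this shape might look like the sketch below. The struct, its field names and the function name are hypothetical, and returning 1 ("everything enabled") when the override is set is an assumption about the intended semantics rather than a quotation of the patch.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the sysrq enable state. */
struct sysrq_state_model {
    bool always_enabled;  /* override that force-enables sysrq */
    int enabled;          /* configured enable value / bitmask */
};

/*
 * Report the effective mask from a single source of truth.
 * A private copy of 'enabled' would miss the always_enabled override.
 */
int sysrq_mask_model(const struct sysrq_state_model *s)
{
    if (s->always_enabled)
        return 1;         /* assumed meaning: all functions enabled */
    return s->enabled;
}
```

Reading through one helper keeps the sysctl view and the actual handler behaviour from drifting apart, which is the inconsistency the message describes.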

Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20200302175135.269397-2-dima@arista.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 97a32539 03-Feb-2020 Alexey Dobriyan <adobriyan@gmail.com>

proc: convert everything to "struct proc_ops"

The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
seq_file.h.

Conversion rule is:

llseek => proc_lseek
unlocked_ioctl => proc_ioctl

xxx => proc_xxx

delete ".owner = THIS_MODULE" line

[akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
[sfr@canb.auug.org.au: fix kernel/sched/psi.c]
Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# f06327d1 12-Dec-2019 Dmitry Safonov <0x7f454c46@gmail.com>

sysrq: Remove sysrq_handler_registered

sysrq_toggle_support() can be called in parallel, in turn calling
input_(un)register_handler(), which fortunately is safe to call
in parallel and regardless of the registered/unregistered status of
sysrq_handler.
Remove sysrq_handler_registered, as it serves no purpose there
and may confuse readers about a possible race.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20191213000657.931618-2-dima@arista.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# c39ea0b9 14-May-2019 Feng Tang <feng.tang@intel.com>

panic: avoid the extra noise dmesg

When a kernel panic happens, it will first print the panic call stack,
then the ending message, like:

[ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception
[ 35.749975] ------------[ cut here ]------------

The above messages are very useful for debugging.

But if the system is configured not to reboot on panic, i.e. the
"panic_timeout" parameter equals 0, it will likely print out many noisy
messages, such as a WARN() call stack for each and every CPU except the
panicking one, like below:

WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190
Call Trace:
<IRQ>
try_to_wake_up
default_wake_function
autoremove_wake_function
__wake_up_common
__wake_up_common_lock
__wake_up
wake_up_klogd_work_func
irq_work_run_list
irq_work_tick
update_process_times
tick_sched_timer
__hrtimer_run_queues
hrtimer_interrupt
smp_apic_timer_interrupt
apic_timer_interrupt

For people working in console mode, the screen will first show the panic
call stack, which is immediately overwritten by these noisy extra messages.
This makes debugging much more difficult, as the original context gets
lost on screen.

These noisy messages also confuse some users: I have seen many bug
reporters post the noisy messages to bugzilla instead of the real
panic call stack and context.

Add a flag "suppress_printk", which gets set in panic(), to avoid those
noisy messages without changing the current kernel behavior, so that both
panic blinking and the sysrq magic key keep working as is, as suggested by
Petr Mladek.

To verify this, make sure the kernel is not configured to reboot on panic
and, in the console, run
# echo c > /proc/sysrq-trigger
to see if the console only prints out the panic call stack.

Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6ac972dd 13-Mar-2019 Julien Grall <julien.grall@arm.com>

tty/sysrq: Convert show_lock to raw_spinlock_t

Systems which don't provide arch_trigger_cpumask_backtrace() will
invoke showacpu() via smp_call_function(), whose callback is invoked
with interrupts disabled even on -RT systems.

The function acquires show_lock, whose only purpose is to
ensure that the CPUs don't print simultaneously. Otherwise the
output would clash and it would be hard to tell the output from CPUx
apart from CPUy.

On -RT, spin_lock() cannot be acquired from this context; a
raw_spin_lock() is required. It will increase the system's latency
while the sysrq request is performed, and other CPUs will block on the lock
until the request is done. This is okay because the user asked for a
backtrace of all active CPUs and under "normal circumstances in
production" this path should not be triggered.

Signed-off-by: Julien Grall <julien.grall@arm.com>
[bigeasy@linutronix.de: commit description]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# c3fee609 11-Jan-2019 Petr Mladek <pmladek@suse.com>

sysrq: Remove duplicated sysrq message

The commit 97f5f0cd8cd0a0544 ("Input: implement SysRq as a separate input
handler") added pr_fmt() definition. It caused a duplicated message
prefix in the sysrq header messages, for example:

[ 177.053931] sysrq: SysRq : Show backtrace of all active CPUs
[ 742.864776] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c)

Fixes: 97f5f0cd8cd0a05 ("Input: implement SysRq as a separate input handler")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 075e1a0c 11-Jan-2019 Petr Mladek <pmladek@suse.com>

sysrq: Restore original console_loglevel when sysrq disabled

The sysrq header line is printed with an increased loglevel
to provide users some positive feedback.

The original loglevel is not restored when the sysrq operation
is disabled. This bug was introduced in 2.6.12 (pre-git-history)
by the commit ("Allow admin to enable only some of the Magic-Sysrq
functions").
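
The save/raise/restore pattern at issue can be sketched in userspace as follows. All names, the header log level value and the printed strings are illustrative assumptions; the sketch only shows that the original level has to be restored on the "disabled" path as well as on the normal one.

```c
#include <stdbool.h>
#include <stdio.h>

#define MODEL_HEADER_LOGLEVEL 7  /* assumed raised level, for illustration */

/*
 * Model of handling one sysrq key: temporarily raise the console log
 * level so the header line is visible, then restore it on every path.
 */
void handle_sysrq_model(int *console_loglevel, bool op_enabled,
                        void (*handler)(void))
{
    int orig = *console_loglevel;

    *console_loglevel = MODEL_HEADER_LOGLEVEL;

    if (op_enabled) {
        printf("sysrq: running operation\n");
        *console_loglevel = orig;
        if (handler)
            handler();
    } else {
        printf("sysrq: operation is disabled\n");
        *console_loglevel = orig;  /* the restore that was missing */
    }
}
```

Without the restore in the `else` branch, a single disabled key press would leave the console at the raised level for all later messages.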

Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 8fefbc6d 26-Nov-2018 Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>

tty/sysrq: Do not call sync directly from sysrq_do_reset()

sysrq_do_reset() is called in softirq context, so it cannot call
sync() directly. Instead, call orderly_reboot(), which creates a work
item to run /sbin/reboot, or does emergency_sync() and restarts if the
command fails.

Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 8341f2f2 20-Sep-2018 Matthias Kaehlcke <mka@chromium.org>

sysrq: Use panic() to force a crash

sysrq_handle_crash() currently forces a crash by dereferencing a
NULL pointer, which is undefined behavior in C. Just call panic()
instead, which is simpler and doesn't depend on compiler specific
handling of the undefined behavior.

Remove the comment on why the RCU lock needs to be released, it isn't
accurate anymore since the crash now isn't handled by the page fault
handler (for reference: the comment was added by commit 984cf355aeaa
("sysrq: Fix warning in sysrq generated crash.")). Releasing the lock
is still good practice though.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 279070b9 21-Nov-2018 Yangtao Li <tiny.windzz@gmail.com>

tty/sysrq: add of_node_put()

of_find_node_by_path() acquires a reference to the node
returned by it and that reference needs to be dropped by its caller.
bl_idle_init() doesn't do that, so fix it.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# b16503ba 21-Jul-2018 Eric W. Biederman <ebiederm@xmission.com>

signal: send_sig_all no longer needs SEND_SIG_FORCED

Now that send_signal always delivers SEND_SIG_PRIV signals to a pid
namespace init it is no longer necessary to use SEND_SIG_FORCED when
calling do_send_sig_info to ensure that pid namespace inits are
signaled and possibly killed. Using SEND_SIG_PRIV is sufficient.

So use SEND_SIG_PRIV so that userspace when it receives a SIGTERM can
tell that the kernel sent the signal and not some random userspace
application.

Fixes: b82c32872db2 ("sysrq: use SEND_SIG_FORCED instead of force_sig()")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>


# 40b3b025 21-Jul-2018 Eric W. Biederman <ebiederm@xmission.com>

signal: Pass pid type into do_send_sig_info

This passes the information we already have at the call site into
do_send_sig_info, ultimately allowing for better handling of signals
sent to a group of processes during fork.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>


# 70f68ee8 14-Mar-2018 Dominik Brodowski <linux@dominikbrodowski.net>

fs: add ksys_sync() helper; remove in-kernel calls to sys_sync()

Using this helper allows us to avoid the in-kernel calls to the
sys_sync() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_sync().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results were:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 8c318fa9 16-Oct-2017 Kees Cook <keescook@chromium.org>

tty/sysrq: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
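
The pattern behind this conversion can be sketched in userspace: the callback receives a pointer to the timer itself and recovers the enclosing object from it, which is what from_timer() does via container_of(). This is a minimal sketch under stated assumptions; `fake_timer`, `fake_timer_setup()`, `fake_timer_fire()` and `reset_state` are hypothetical stand-ins, not kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list: it stores only the callback. */
struct fake_timer {
	void (*function)(struct fake_timer *t);
};

/* Same idea as the kernel's container_of(): step back from a member to its parent. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical object that embeds a timer (loosely modelled on a sysrq state struct). */
struct reset_state {
	int fired;
	struct fake_timer keyreset_timer;
};

/* Stand-in for timer_setup(): record the callback, with no opaque data argument. */
static void fake_timer_setup(struct fake_timer *t,
			     void (*fn)(struct fake_timer *))
{
	t->function = fn;
}

/* The callback gets the timer pointer and derives its owner from it. */
static void reset_timer_fn(struct fake_timer *t)
{
	struct reset_state *state =
		container_of_sketch(t, struct reset_state, keyreset_timer);

	state->fired++;
}

/* Stand-in for timer expiry: invoke the callback with the timer itself. */
static void fake_timer_fire(struct fake_timer *t)
{
	t->function(t);
}
```

Because the owner is derived from the timer pointer, no separate data argument has to be stored alongside the callback.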

Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# b00bebbc 10-Sep-2017 Jibin Xu <jibin.xu@windriver.com>

sysrq : fix Show Regs call trace on ARM

When the kernel configuration options SMP, PREEMPT and DEBUG_PREEMPT are
enabled, running
echo 1 >/proc/sys/kernel/sysrq
echo p >/proc/sysrq-trigger
makes the kernel print the call trace below:

sysrq: SysRq : Show Regs
BUG: using __this_cpu_read() in preemptible [00000000] code: sh/435
caller is __this_cpu_preempt_check+0x18/0x20
Call trace:
[<ffffff8008088e80>] dump_backtrace+0x0/0x1d0
[<ffffff8008089074>] show_stack+0x24/0x30
[<ffffff8008447970>] dump_stack+0x90/0xb0
[<ffffff8008463950>] check_preemption_disabled+0x100/0x108
[<ffffff8008463998>] __this_cpu_preempt_check+0x18/0x20
[<ffffff80084c9194>] sysrq_handle_showregs+0x1c/0x40
[<ffffff80084c9c7c>] __handle_sysrq+0x12c/0x1a0
[<ffffff80084ca140>] write_sysrq_trigger+0x60/0x70
[<ffffff8008251e00>] proc_reg_write+0x90/0xd0
[<ffffff80081f1788>] __vfs_write+0x48/0x90
[<ffffff80081f241c>] vfs_write+0xa4/0x190
[<ffffff80081f3354>] SyS_write+0x54/0xb0
[<ffffff80080833f0>] el0_svc_naked+0x24/0x28

This can be seen on a common board like an r-pi3.
It happens because echo p >/proc/sysrq-trigger calls
get_irq_regs() outside of IRQ context; if preemption is
enabled in this situation, the kernel prints the call
trace. Since many prior discussions on the mailing lists
have made it clear that get_irq_regs() returns either NULL
or stale data when used outside of IRQ context, we simply
avoid calling it outside of IRQ context.
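
The guard described in this message can be modelled in userspace as follows. This is a sketch under stated assumptions, not the kernel code: `in_irq_sketch`, `irq_regs` and `struct regs_sketch` are hypothetical stand-ins for in_irq(), the saved per-CPU register pointer and struct pt_regs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs. */
struct regs_sketch {
	unsigned long pc;
};

static bool in_irq_sketch;		/* models in_irq() */
static struct regs_sketch *irq_regs;	/* models the saved pointer; may be stale */

/* Models get_irq_regs(): returns whatever was saved last, valid or not. */
static struct regs_sketch *get_irq_regs_sketch(void)
{
	return irq_regs;
}

/* Returns the registers to show, or NULL when nothing trustworthy exists. */
static struct regs_sketch *regs_for_showregs(void)
{
	struct regs_sketch *regs = NULL;

	/* Only consult the saved registers when actually in IRQ context. */
	if (in_irq_sketch)
		regs = get_irq_regs_sketch();
	return regs;
}
```

In this model, a write to the trigger file corresponds to in_irq_sketch being false, so the possibly stale pointer is never read.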

Signed-off-by: Jibin Xu <jibin.xu@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# d75da004 03-May-2017 Michal Hocko <mhocko@suse.com>

oom: improve oom disable handling

Tetsuo has reported that the sysrq-triggered OOM killer prints
misleading information when no tasks are selected:

sysrq: SysRq : Manual OOM execution
Out of memory: Kill process 4468 ((agetty)) score 0 or sacrifice child
Killed process 4468 ((agetty)) total-vm:43704kB, anon-rss:1760kB, file-rss:0kB, shmem-rss:0kB
sysrq: SysRq : Manual OOM execution
Out of memory: Kill process 4469 (systemd-cgroups) score 0 or sacrifice child
Killed process 4469 (systemd-cgroups) total-vm:10704kB, anon-rss:120kB, file-rss:0kB, shmem-rss:0kB
sysrq: SysRq : Manual OOM execution
sysrq: OOM request ignored because killer is disabled
sysrq: SysRq : Manual OOM execution
sysrq: OOM request ignored because killer is disabled
sysrq: SysRq : Manual OOM execution
sysrq: OOM request ignored because killer is disabled

The real reason is that there are no eligible tasks for the OOM killer
to select but since commit 7c5f64f84483 ("mm: oom: deduplicate victim
selection code for memcg and global oom") the semantic of out_of_memory
has changed without updating moom_callback.

This patch updates moom_callback to report that no task was eligible, which
is the case both when the oom killer is disabled and when there are no
eligible tasks. To help distinguish the first case from the second, add a
printk to both oom_killer_{enable,disable}. This information is useful on
its own because it might help debugging potential memory allocation failures.

Fixes: 7c5f64f84483 ("mm: oom: deduplicate victim selection code for memcg and global oom")
Link: http://lkml.kernel.org/r/20170404134705.6361-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5e351410 03-Apr-2017 Eric Biggers <ebiggers@google.com>

net: ibm: emac: remove unused sysrq handler for 'c' key

Since commit d6580a9f1523 ("kexec: sysrq: simplify sysrq-c handler"),
the sysrq handler for the 'c' key has been sysrq_crash_op. Debugging
code in the ibm_emac driver also tries to register a handler for the 'c'
key, but this has no effect because register_sysrq_key() doesn't replace
existing handlers. Since evidently no one has cared enough to fix this
in the last 8 years, and it's very rare for drivers to register sysrq
handlers (for good reason), just remove the dead code.
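
Why the second registration was dead code can be illustrated with a small userspace model of a key table whose register function refuses to overwrite an occupied slot. This is only a sketch: `key_table`, `register_key()` and `lookup_key()` are hypothetical names, and the -1 return value is an assumption of the model rather than a statement about register_sysrq_key().

```c
#include <stddef.h>

#define NKEYS 26	/* sketch: only 'a'..'z' are modelled */

/* Hypothetical handler descriptor. */
struct key_op {
	const char *name;
};

static const struct key_op *key_table[NKEYS];

/* Returns 0 on success, -1 if the key is out of range or already taken. */
static int register_key(int key, const struct key_op *op)
{
	if (key < 'a' || key > 'z')
		return -1;
	if (key_table[key - 'a'] != NULL)
		return -1;	/* the existing handler wins */
	key_table[key - 'a'] = op;
	return 0;
}

/* Returns the handler registered for the key, or NULL if there is none. */
static const struct key_op *lookup_key(int key)
{
	if (key < 'a' || key > 'z')
		return NULL;
	return key_table[key - 'a'];
}
```

With this model, a driver registering 'c' after the core handler gets an error back and the core handler stays in place, so the driver's handler can never run.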

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 29930025 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h>

We are going to split <linux/sched/task.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# b17b0153 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h>

We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/debug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 3f07c014 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>

We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 9af744d7 22-Feb-2017 Michal Hocko <mhocko@suse.com>

lib/show_mem.c: teach show_mem to work with the given nodemask

show_mem() allows filtering out node-specific data that is irrelevant
to the allocation request via SHOW_MEM_FILTER_NODES. The filtering is
done in skip_free_areas_node which skips all nodes which are not in the
mems_allowed of the current process. This works most of the time as
expected because the nodemask shouldn't be outside of the allocating
task but there are some exceptions. E.g. memory hotplug might want to
request allocations from outside of the allowed nodes (see
new_node_page).

Get rid of this hardcoded behavior and push the allocation mask down the
show_mem path and use it instead of cpuset_current_mems_allowed. NULL
nodemask is interpreted as cpuset_current_mems_allowed.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170117091543.25850-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 802c0388 05-Jan-2017 Akinobu Mita <akinobu.mita@gmail.com>

sysrq: attach sysrq handler correctly for 32-bit kernel

The sysrq input handler should be attached to the input device which has
a left alt key.

On 32-bit kernels, some input devices which have a left alt key cannot
attach the sysrq handler, because the keybit bitmap in struct
input_device_id for sysrq is not correctly initialized. KEY_LEFTALT is 56,
which is greater than BITS_PER_LONG on 32-bit kernels.

I found this problem when using a matrix keypad device which defines
a KEY_LEFTALT (56) but doesn't have a KEY_O (24 == 56%32).
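
The arithmetic behind this symptom can be reproduced in userspace by modelling the bitmap as an array of 32-bit words. This is a sketch under stated assumptions: `set_key_buggy()` shows one way the described misinitialization can arise (the mask is stored in word 0 regardless of the bit number), and all helper names are hypothetical. Only the key codes 24 and 56 come from the message above.

```c
#include <stdint.h>

#define KEY_O_CODE		24
#define KEY_LEFTALT_CODE	56
#define WORD_BITS		32	/* model a 32-bit unsigned long */
#define NWORDS			4

/* Word index of a bit, in the spirit of the kernel's BIT_WORD(). */
static unsigned int bit_word(unsigned int nr)
{
	return nr / WORD_BITS;
}

/* Mask of a bit within its word, in the spirit of the kernel's BIT_MASK(). */
static uint32_t bit_mask(unsigned int nr)
{
	return UINT32_C(1) << (nr % WORD_BITS);
}

/* Buggy pattern: the mask always lands in word 0. */
static void set_key_buggy(uint32_t *map, unsigned int nr)
{
	map[0] |= bit_mask(nr);
}

/* Fixed pattern: select the word first, then apply the mask. */
static void set_key_fixed(uint32_t *map, unsigned int nr)
{
	map[bit_word(nr)] |= bit_mask(nr);
}

/* Returns 1 if the bit for key code nr is set in the map, 0 otherwise. */
static int test_key(const uint32_t *map, unsigned int nr)
{
	return (map[bit_word(nr)] & bit_mask(nr)) != 0;
}
```

With 32-bit words, bit 56 lives in word 1 at position 24, so storing its mask in word 0 sets the bit for key code 24 instead. That matches the report: a device with KEY_LEFTALT but without KEY_O fails to match.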

Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 094a3262 28-Sep-2016 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "drivers/tty: Explicitly pass current to show_stack"

This reverts commit 9f12cea96f47f98d612a0a0b84f950a0163731bf.

Mark writes:
Unfortunately, this patch will result in erroneous stack traces
on some architectures. Sorry about this; I should have verified
this more thoroughly before sending the series out.

Please drop the patch at your earliest convenience.

Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 9f12cea9 26-Sep-2016 Mark Rutland <mark.rutland@arm.com>

drivers/tty: Explicitly pass current to show_stack

As noted in commit:

81539169f283329f ("x86/dumpstack: Remove NULL task pointer convention")

... having a NULL task parameter imply current leads to subtle bugs in stack
walking code (so far seen on both x86 and arm64), makes callsites harder to
read, and is unnecessary as all callers have access to current.

As a step towards removing the problematic NULL-implies-current idiom entirely,
have the sysrq code explicitly pass current to show_stack.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 2a966b77 26-Jul-2016 Vladimir Davydov <vdavydov.dev@gmail.com>

mm: oom: add memcg to oom_control

It's a part of oom context just like allocation order and nodemask, so
let's move it to oom_control instead of passing it in the argument list.

Link: http://lkml.kernel.org/r/40e03fd7aaf1f55c75d787128d6d17c5a71226c2.1464358556.git.vdavydov@virtuozzo.com
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 984cf355 17-Dec-2015 Ani Sinha <ani@arista.com>

sysrq: Fix warning in sysrq generated crash.

Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced
spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since
rcu_read_lock() does not disable preemption, faulthandler_disabled() in
__do_page_fault() in x86/fault.c returns false. When the code later calls
might_sleep() in the pagefault handler, we get the following warning:

BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187
in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash
Preemption disabled at:[<ffffffff81484339>] printk+0x48/0x4a

To fix this, we release the RCU read lock before we crash.

Tested this patch on linux 3.18 by booting off one of our boards.

Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq")

Signed-off-by: Ani Sinha <ani@arista.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>


# 3bce6f64 19-Aug-2015 Paul Gortmaker <paul.gortmaker@windriver.com>

drivers/tty: make sysrq.c slightly more explicitly non-modular

The Kconfig currently controlling compilation of this code is:

config.debug:config MAGIC_SYSRQ
bool "Magic SysRq key"

...meaning that it currently is not being built as a module by anyone.

Let's remove the traces of modularity we can so that when reading the
driver there is less doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We don't delete the module.h include since other parts of the file are
using content from there.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 54e9e291 08-Sep-2015 David Rientjes <rientjes@google.com>

mm, oom: pass an oom order of -1 when triggered by sysrq

The force_kill member of struct oom_control isn't needed if an order of -1
is used instead. This is the same as order == -1 in struct
compact_control which requires full memory compaction.
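
The sentinel idea can be sketched as follows: instead of carrying a separate boolean, the order field itself signals a sysrq-triggered kill. This is a simplified userspace model; `struct oom_control_sketch` keeps only the one member relevant here and `is_sysrq_oom_sketch()` is a hypothetical helper name.

```c
#include <stdbool.h>

/* Simplified stand-in for struct oom_control; the real struct has more members. */
struct oom_control_sketch {
	int order;	/* allocation order, or -1 for a sysrq-triggered kill */
};

/* True when the OOM kill was requested via sysrq rather than by an allocation. */
static bool is_sysrq_oom_sketch(const struct oom_control_sketch *oc)
{
	return oc->order == -1;
}
```

Since a real allocation order is never negative, -1 cannot collide with a legitimate value, which is what makes the separate force_kill flag redundant.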

This patch introduces no functional change.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6e0fc46d 08-Sep-2015 David Rientjes <rientjes@google.com>

mm, oom: organize oom context into struct

There are essential elements to an oom context that are passed around to
multiple functions.

Organize these elements into a new struct, struct oom_control, that
specifies the context for an oom condition.

This patch introduces no functional change.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# dc56401f 24-Jun-2015 Johannes Weiner <hannes@cmpxchg.org>

mm: oom_kill: simplify OOM killer locking

The zonelist locking and the oom_sem are two overlapping locks that are
used to serialize global OOM killing against different things.

The historical zonelist locking serializes OOM kills from allocations with
overlapping zonelists against each other to prevent killing more tasks
than necessary in the same memory domain. Only when neither tasklists nor
zonelists from two concurrent OOM kills overlap (tasks in separate memcgs
bound to separate nodes) are OOM kills allowed to execute in parallel.

The younger oom_sem is a read-write lock to serialize OOM killing against
the PM code trying to disable the OOM killer altogether.

However, the OOM killer is a fairly cold error path, there is really no
reason to optimize for highly performant and concurrent OOM kills. And
the oom_sem is just flat-out redundant.

Replace both locking schemes with a single global mutex serializing OOM
kills regardless of context.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d1e9a4f5 19-May-2015 James Hogan <jhogan@kernel.org>

MIPS: Add SysRq operation to dump TLBs on all CPUs

Add a MIPS specific SysRq operation to dump the TLB entries on all CPUs,
using the 'x' trigger key.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10072/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>


# ffb6e0c9 26-May-2015 Arnd Bergmann <arnd@arndb.de>

tty: remove platform_sysrq_reset_seq

The platform_sysrq_reset_seq code was intended as a way for an embedded
platform to provide its own sysrq sequence at compile time. After over two
years, nobody has started using it in an upstream kernel, and the platforms
that were interested in it have moved on to devicetree, which can be used
to configure the sequence without requiring kernel changes. The method is
also incompatible with the way that most architectures build support for
multiple platforms into a single kernel.

Now the code is producing warnings when built with gcc-5.1:

drivers/tty/sysrq.c: In function 'sysrq_init':
drivers/tty/sysrq.c:959:33: warning: array subscript is above array bounds [-Warray-bounds]
key = platform_sysrq_reset_seq[i];

We could fix this, but it seems unlikely that it will ever be used, so
let's just remove the code instead. We still have the option to pass the
sequence either in DT, using the kernel command line, or using the
/sys/module/sysrq/parameters/reset_seq file.

Fixes: 154b7a489a ("Input: sysrq - allow specifying alternate reset sequence")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>


# 9c27847d 26-May-2015 Luis R. Rodriguez <mcgrof@suse.com>

kernel/params: constify struct kernel_param_ops uses

Most code already uses consts for the struct kernel_param_ops,
sweep the kernel for the last offending stragglers. Other than
include/linux/moduleparam.h and kernel/params.c all other changes
were generated with the following Coccinelle SmPL patch. Merge
conflicts between trees can be handled with Coccinelle.

In the future git could get Coccinelle merge support to deal with
patch --> fail --> grammar --> Coccinelle --> new patch conflicts
automatically for us on patches where the grammar is available and
the patch is of high confidence. Consider this a feature request.

Test compiled on x86_64 against:

* allnoconfig
* allmodconfig
* allyesconfig

@ const_found @
identifier ops;
@@

const struct kernel_param_ops ops = {
};

@ const_not_found depends on !const_found @
identifier ops;
@@

-struct kernel_param_ops ops = {
+const struct kernel_param_ops ops = {
};

Generated-by: Coccinelle SmPL
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: cocci@systeme.lip6.fr
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>


# abab381f 22-May-2015 Arnd Bergmann <arnd@arndb.de>

tty: remove platform_sysrq_reset_seq

The platform_sysrq_reset_seq code was intended as a way for an embedded
platform to provide its own sysrq sequence at compile time. After over
two years, nobody has started using it in an upstream kernel, and
the platforms that were interested in it have moved on to devicetree,
which can be used to configure the sequence without requiring kernel
changes. The method is also incompatible with the way that most
architectures build support for multiple platforms into a single
kernel.

Now the code is producing warnings when built with gcc-5.1:

drivers/tty/sysrq.c: In function 'sysrq_init':
drivers/tty/sysrq.c:959:33: warning: array subscript is above array bounds [-Warray-bounds]
key = platform_sysrq_reset_seq[i];

We could fix this, but it seems unlikely that it will ever be used,
so let's just remove the code instead. We still have the option to
pass the sequence either in DT, using the kernel command line,
or using the /sys/module/sysrq/parameters/reset_seq file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 154b7a489a ("Input: sysrq - allow specifying alternate reset sequence")
----
v2: moved sysrq_reset_downtime_ms variable to avoid introducing a compile
warning when CONFIG_INPUT is disabled
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 3494fc30 09-Mar-2015 Tejun Heo <tj@kernel.org>

workqueue: dump workqueues on sysrq-t

Workqueues are used extensively throughout the kernel but sometimes
it's difficult to debug stalls involving work items because visibility
into its inner workings is fairly limited. Although sysrq-t task dump
annotates each active worker task with the information on the work
item being executed, it is challenging to find out which work items
are pending or delayed on which queues and how pools are being
managed.

This patch implements show_workqueue_state() which dumps all busy
workqueues and pools and is called from the sysrq-t handler. At the
end of sysrq-t dump, something like the following is printed.

Showing busy workqueues and worker pools:
...
workqueue filler_wq: flags=0x0
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
in-flight: 491:filler_workfn, 507:filler_workfn
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
in-flight: 501:filler_workfn
pending: filler_workfn
...
workqueue test_wq: flags=0x8
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1
in-flight: 510(RESCUER):test_workfn BAR(69) BAR(500)
delayed: test_workfn1 BAR(492), test_workfn2
...
pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 137
pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=3 manager: 469
pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 idle: 16
pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 62

The above shows that test_wq is executing test_workfn() on pid 510
which is the rescuer and also that there are two tasks 69 and 500
waiting for the work item to finish in flush_work(). As test_wq has
max_active of 1, there are two work items for test_workfn1() and
test_workfn2() which are delayed till the current work item is
finished. In addition, pid 492 is flushing test_workfn1().

The work item for test_workfn() is being executed on pwq of pool 2
which is the normal priority per-cpu pool for CPU 1. The pool has
three workers, two of which are executing filler_workfn() for
filler_wq and the last one is assuming the manager role trying to
create more workers.

This extra workqueue state dump will hopefully help chasing down hangs
involving workqueues.

v3: cpulist_pr_cont() replaced with "%*pbl" printf formatting.

v2: As suggested by Andrew, minor formatting change in pr_cont_work(),
printk()'s replaced with pr_info()'s, and cpumask printing now
uses cpulist_pr_cont().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>


# c32b3cbe 11-Feb-2015 Michal Hocko <mhocko@suse.cz>

oom, PM: make OOM detection in the freezer path raceless

Commit 5695be142e20 ("OOM, PM: OOM killed task shouldn't escape PM
suspend") has left a race window when OOM killer manages to
note_oom_kill after freeze_processes checks the counter. The race
window is quite small and really unlikely, and a partial solution was deemed
sufficient at the time of submission.

Tejun wasn't happy about this partial solution though and insisted on a
full solution. That requires the full OOM and freezer's task freezing
exclusion, though. This is done by this patch which introduces oom_sem
RW lock and turns oom_killer_disable() into a full OOM barrier.

oom_killer_disabled check is moved from the allocation path to the OOM
level and we take oom_sem for reading for both the check and the whole
OOM invocation.

oom_killer_disable() takes oom_sem for writing so it waits for all
currently running OOM killer invocations. Then it disables all further
OOMs by setting oom_killer_disabled and checks for any oom victims.
Victims are counted via mark_tsk_oom_victim resp. unmark_oom_victim. The
last victim wakes up all waiters enqueued by oom_killer_disable().
Therefore this function acts as the full OOM barrier.

The page fault path is covered now as well although it was assumed to be
safe before. As per Tejun, "We used to have freezing points deep in file
system code which may be reacheable from page fault." so it would be
better and more robust to not rely on freezing points here. Same applies
to the memcg OOM killer.

out_of_memory tells the caller whether the OOM was allowed to trigger and
the callers are supposed to handle the situation. The page allocation
path simply fails the allocation same as before. The page fault path will
retry the fault (more on that later) and Sysrq OOM trigger will simply
complain to the log.

Normally there wouldn't be any unfrozen user tasks after
try_to_freeze_tasks so the function will not block. But if there was an
OOM killer racing with try_to_freeze_tasks and the OOM victim didn't
finish yet then we have to wait for it. This should complete in a finite
time, though, because

- the victim cannot loop in the page fault handler (it would die
  on the way out from the exception)
- it cannot loop in the page allocator because all the further
  allocation would fail and __GFP_NOFAIL allocations are not
  acceptable at this stage
- it shouldn't be blocked on any locks held by frozen tasks
  (try_to_freeze expects lockless context) and kernel threads and
  work queues are not frozen yet

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 401e4a7c 11-Feb-2015 Michal Hocko <mhocko@suse.cz>

sysrq: convert printk to pr_* equivalent

While touching this area let's convert printk to pr_*. This also makes
the printing of continuation lines done properly.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8d060bf4 06-Aug-2014 David Rientjes <rientjes@google.com>

mm, oom: ensure memoryless node zonelist always includes zones

With memoryless node support being worked on, it's possible that, as an
optimization, a node may not have a non-NULL zonelist. When
CONFIG_NUMA is enabled and node 0 is memoryless, this means the zonelist
for first_online_node may become NULL.

The oom killer requires a zonelist that includes all memory zones for
the sysrq trigger and pagefault out of memory handler.

Ensure that a non-NULL zonelist is always passed to the oom killer.

[akpm@linux-foundation.org: fix non-numa build]
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 722773af 06-Jun-2014 Rik van Riel <riel@redhat.com>

sysrq,rcu: suppress RCU stall warnings while sysrq runs

Some sysrq handlers can run for a long time, because they dump a lot of
data onto a serial console. Having RCU stall warnings pop up in the
middle of them only makes the problem worse.

This patch temporarily disables RCU stall warnings while a sysrq request
is handled.

Signed-off-by: Rik van Riel <riel@redhat.com>
Suggested-by: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Madper Xie <cxie@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 984d74a7 06-Jun-2014 Rik van Riel <riel@redhat.com>

sysrq: rcu-ify __handle_sysrq

Echoing values into /proc/sysrq-trigger seems to be a popular way to get
information out of the kernel. However, dumping information about
thousands of processes, or hundreds of CPUs to serial console can result
in IRQs being blocked for minutes, resulting in various kinds of cascade
failures.

The most common failure is due to interrupts being blocked for a very
long time. This can lead to things like failed IO requests, and other
things the system cannot easily recover from.

This problem is easily fixable by making __handle_sysrq use RCU instead
of spin_lock_irqsave.

This leaves the warning that RCU grace periods have not elapsed for a
long time, but the system will come back from that automatically.

It also leaves sysrq-from-irq-context when the sysrq keys are pressed,
but that is probably desired since people want that to work in
situations where the system is already hosed.

The callers of register_sysrq_key and unregister_sysrq_key appear to be
capable of sleeping.

Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Madper Xie <cxie@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a8fe19eb 04-Jun-2014 Borislav Petkov <bp@suse.de>

kernel/printk: use symbolic defines for console loglevels

... instead of naked numbers.

Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.

Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8eaede49 06-Oct-2013 Ben Hutchings <ben@decadent.org.uk>

sysrq: Allow magic SysRq key functions to be disabled through Kconfig

Turn the initial value of sysctl kernel.sysrq (SYSRQ_DEFAULT_ENABLE)
into a Kconfig variable.

Original version by Bastian Blank <waldi@debian.org>.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 4c076eb0 03-Aug-2013 Mathieu J. Poirier <mathieu.poirier@linaro.org>

Input: sysrq - DT binding for key sequence

Adding a simple device tree binding for the specification of key
sequences. Definition of the keys found in the sequence are located in
'include/uapi/linux/input.h'.

For the sysrq driver, holding the sequence of keys down for a specific
amount of time will reset the system.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>


# 3d289517 05-Jun-2013 Mathieu J. Poirier <mathieu.poirier@linaro.org>

Input: sysrq - request graceful shutdown for key reset

Attempt to reboot the system gracefully when a key combo is detected.
If the reset combination is pressed a second time we assume that the
graceful reboot failed and perform an emergency reboot. This
functionality is useful when the UI is stuck but the system is otherwise
working fine.
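
The escalation described above can be sketched in plain C. This is an
illustrative userspace model, not the driver code: demo_reset_state,
demo_reset_action and demo_on_reset_combo() are hypothetical names, and
the sketch only returns which kind of reboot would be requested instead
of performing one.

```c
#include <stdbool.h>

/* Hypothetical result type: which kind of reboot the combo asks for. */
enum demo_reset_action {
	DEMO_RESET_GRACEFUL,
	DEMO_RESET_EMERGENCY,
};

/* Hypothetical per-device state: has a graceful reboot been requested? */
struct demo_reset_state {
	bool reset_requested;
};

/*
 * Called each time the full reset combination is detected.
 * First detection: remember the request and ask for a graceful reboot.
 * Any later detection: assume the graceful path failed and escalate.
 */
enum demo_reset_action demo_on_reset_combo(struct demo_reset_state *st)
{
	if (st->reset_requested)
		return DEMO_RESET_EMERGENCY;

	st->reset_requested = true;
	return DEMO_RESET_GRACEFUL;
}
```

The state is deliberately never cleared in this sketch: once a graceful
reboot has been requested, every further detection escalates.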

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>


# 86b40567 01-Jun-2013 Jingoo Han <jg1.han@samsung.com>

tty: replace strict_strtoul() with kstrtoul()

The usage of strict_strtoul() is not preferred, because
strict_strtoul() is obsolete. Thus, kstrtoul() should be
used.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 39030786 01-Apr-2013 Mathieu J. Poirier <mathieu.poirier@linaro.org>

Input: sysrq - supplement reset sequence with timeout functionality

Some devices have too few buttons, which makes it hard to have
a reset combo that won't trigger automatically. As such, timeout
functionality that requires the combination to be held for
a given amount of time before triggering is introduced.

If a key combo is recognized and held for a 'timeout' amount of time,
the system triggers a reset. If the timeout value is omitted the
driver simply ignores the functionality.
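
The hold-for-timeout rule can be modelled as a small predicate. This is
a sketch under assumptions, not the driver implementation:
demo_reset_due() and its parameters are hypothetical, and an omitted
timeout is modelled here as 0, read as "trigger as soon as the combo is
recognized".

```c
#include <stdbool.h>

/*
 * Decide whether a reset should fire.
 *
 * combo_held: all keys of the reset sequence are currently down
 * held_ms:    how long the complete combo has been held, in ms
 * timeout_ms: required hold time; 0 models an omitted timeout
 *             (assumption: the reset then fires immediately)
 */
bool demo_reset_due(bool combo_held, unsigned long held_ms,
		    unsigned long timeout_ms)
{
	if (!combo_held)
		return false;	/* releasing any key cancels the reset */

	return held_ms >= timeout_ms;
}
```

For example, with a 500 ms timeout a combo held for 200 ms does not
trigger, while one held for 500 ms or longer does.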

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>


# afa80ccb 07-Mar-2013 zhangwei(Jovi) <jovi.zhangwei@huawei.com>

sysrq: fix inconsistent help message of sysrq key

Currently the help message of /proc/sysrq-trigger highlights its
upper-case characters, like below:

SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E)
memory-full-oom-kill(F) kill-all-tasks(I) ...

This can mislead users into triggering sysrq with an upper-case
character, which is inconsistent with the lower-case character the key
is actually registered under.

The inconsistent help message will also cause more confusion when the
26 upper-case letters are put into use in the future.

This patch fixes it.

Thanks to Andrew and Randy for their comments.

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# adf96e6f 27-Feb-2013 Linus Torvalds <torvalds@linux-foundation.org>

sysrq: don't depend on weak undefined arrays to have an address that compares as NULL

When taking an address of an extern array, gcc quite naturally should be
able to say "an address of an object can never be NULL" and just
optimize away the test entirely.

However, the new alternate sysrq reset code (commit 154b7a489a5b:
"Input: sysrq - allow specifying alternate reset sequence") did exactly
that: it declared platform_sysrq_reset_seq[] as a weak array and
expected that testing the address of the array would show whether it
actually got linked against something or not.

And that doesn't work with all gcc versions. Clearly it works with
*some* versions of gcc, and maybe it's even supposed to work, but it
really is a very fragile concept.

So instead of testing the address of the weak variable, just create a
weak instance of that array that is empty. If some platform then has a
real platform_sysrq_reset_seq[] that overrides our weak one, the linker
will switch to that one, and it all works without any run-time
conditionals at all.
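
A minimal userspace sketch of the "weak empty instance" pattern follows.
It assumes GCC or Clang (for __attribute__((weak))); demo_reset_seq,
DEMO_SEQ_END and demo_reset_seq_len() are hypothetical names chosen for
illustration, not the kernel's identifiers.

```c
#include <stddef.h>

/* Hypothetical terminator value marking the end of a key sequence. */
#define DEMO_SEQ_END 0

/*
 * Weak default: an "empty" sequence holding only the terminator.
 * If another object file defines a non-weak demo_reset_seq[], the
 * linker picks that one instead and this definition is dropped.
 */
__attribute__((weak)) unsigned short demo_reset_seq[] = { DEMO_SEQ_END };

/*
 * Count the keys before the terminator. No NULL test on the array
 * address is needed: the symbol always exists, it may just be empty.
 */
size_t demo_reset_seq_len(const unsigned short *seq)
{
	size_t n = 0;

	while (seq[n] != DEMO_SEQ_END)
		n++;
	return n;
}
```

With only the weak default linked in, demo_reset_seq_len(demo_reset_seq)
is 0, so callers can treat "no platform override" and "empty sequence"
as the same case.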

Reported-by: Dave Airlie <airlied@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8bd75c77 07-Feb-2013 Clark Williams <williams@redhat.com>

sched/rt: Move rt specific bits into new header file

Move rt scheduler definitions out of include/linux/sched.h into
new file include/linux/sched/rt.h

Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 154b7a48 07-Jan-2013 Mathieu Poirier <mathieu.poirier@linaro.org>

Input: sysrq - allow specifying alternate reset sequence

This patch adds keyreset functionality to the sysrq driver. It allows
certain button/key combinations to be used in order to trigger emergency
reboots.

Redefining the '__weak platform_sysrq_reset_seq' variable is required
to trigger the feature. Alternatively keys can be passed to the driver
via a module parameter.

This functionality comes from the keyreset driver submitted by
Arve Hjønnevåg in the Android kernel.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>


# 53186095 14-Nov-2012 David Rientjes <rientjes@google.com>

mm, oom: ensure sysrq+f always passes valid zonelist

With hotpluggable and memoryless nodes, it's possible that node 0 will
not be online, so use the first online node's zonelist rather than
hardcoding node 0 to pass a zonelist with all zones to the oom killer.
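
The idea of picking the first online node instead of assuming node 0 can
be illustrated with a plain bitmask. This is a simplified userspace
sketch: demo_first_online_node() and the unsigned long mask are
hypothetical stand-ins for the kernel's node mask helpers.

```c
#include <limits.h>

/*
 * Return the lowest node id whose bit is set in online_mask,
 * or -1 if no node is online. Bit N set means node N is online.
 */
int demo_first_online_node(unsigned long online_mask)
{
	int max_nodes = (int)(sizeof(online_mask) * CHAR_BIT);
	int nid;

	for (nid = 0; nid < max_nodes; nid++) {
		if (online_mask & (1UL << nid))
			return nid;
	}
	return -1;
}
```

With nodes 1 and 2 online and node 0 offline (mask 0x6), the sketch
selects node 1, whereas hardcoding node 0 would pick an offline node.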

Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 916ca14a 16-Oct-2012 David S. Miller <davem@davemloft.net>

sparc64: Add global PMU register dumping via sysrq.

Signed-off-by: David S. Miller <davem@davemloft.net>


# b82c3287 05-Apr-2012 Anton Vorontsov <anton.vorontsov@linaro.org>

sysrq: use SEND_SIG_FORCED instead of force_sig()

Change send_sig_all() to use do_send_sig_info(SEND_SIG_FORCED) instead
of force_sig(SIGKILL). With the recent changes we do not need force_ to
kill the CLONE_NEWPID tasks.

And this is more correct. force_sig() can race with the exiting thread,
while do_send_sig_info(group => true) kills the whole process.

Some more notes from Oleg Nesterov:

> Just one note. This change makes no difference for sysrq_handle_kill().
> But it obviously changes the behaviour sysrq_handle_term(). I think
> this is fine, if you want to really kill the task which blocks/ignores
> SIGTERM you can use sysrq_handle_kill().
>
> Even ignoring the reasons why force_sig() is simply wrong here,
> force_sig(SIGTERM) looks strange. The task won't be killed if it has
> a handler, but SIG_IGN can't help. However if it has the handler
> but blocks SIGTERM temporary (this is very common) it will be killed.

Also,

> force_sig() can't kill the process if the main thread has already
> exited. IOW, it is trivial to create the process which can't be
> killed by sysrq.

So, this patch fixes the issue.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 08ab9b10 21-Mar-2012 David Rientjes <rientjes@google.com>

mm, oom: force oom kill on sysrq+f

The oom killer chooses not to kill a thread if:

- an eligible thread has already been oom killed and has yet to exit,
  and

- an eligible thread is exiting but has yet to free all its memory and
  is not the thread attempting to currently allocate memory.

SysRq+F manually invokes the global oom killer to kill a memory-hogging
task. This is normally done as a last resort to free memory when no
progress is being made or to test the oom killer itself.

For both uses, we always want to kill a thread and never defer. This
patch causes SysRq+F to always kill an eligible thread and can be used to
force a kill even if another oom killed thread has failed to exit.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 079c9534 28-Feb-2012 Alan Cox <alan@linux.intel.com>

vt:tackle kbd_table

Keyboard struct lifetime is easy, but the locking is not and is completely
ignored by the existing code. Tackle this one head on.

- Make the kbd_table private so we can run down all direct users
- Hoick the relevant ioctl handlers into the keyboard layer
- Lock them with the keyboard lock so they don't change mid keypress
- Add helpers for things like console stop/start so we isolate the poking
  around properly
- Tweak the braille console so it still builds

There are a couple of FIXME locking cases left for ioctls that are so hideous
they should be addressed in a later patch. After this patch the kbd_table is
private and all the keyboard jiggery pokery is in one place.

This update fixes speakup and also a memory leak in the original.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# d3a532a9 06-Feb-2012 Anton Vorontsov <anton.vorontsov@linaro.org>

sysrq: Properly check for kernel threads

There's a real possibility of killing kernel threads that might
have issued use_mm(), so kthread's mm might become non-NULL.

This patch fixes the issue by checking for PF_KTHREAD (just as
get_task_mm()).
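
The difference between the two checks can be shown with a toy task
structure. This is an illustrative userspace model only: struct
demo_task, DEMO_PF_KTHREAD and both helpers are hypothetical, and the
flag value is not the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PF_KTHREAD task flag. */
#define DEMO_PF_KTHREAD 0x1u

/* Toy task: only the two fields relevant to the check. */
struct demo_task {
	unsigned int flags;
	void *mm;	/* address space; may be borrowed by a kernel thread */
};

/* Old-style heuristic: treat any task with a non-NULL mm as userspace. */
bool demo_looks_like_user_task(const struct demo_task *t)
{
	return t->mm != NULL;
}

/* Flag-based check: a kernel thread stays one even with a borrowed mm. */
bool demo_is_kernel_thread(const struct demo_task *t)
{
	return (t->flags & DEMO_PF_KTHREAD) != 0;
}
```

A kernel thread that has temporarily adopted an mm passes the mm-based
heuristic and would be signalled; the flag-based check still identifies
it as a kernel thread.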

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# e502babe 06-Feb-2012 Anton Vorontsov <anton.vorontsov@linaro.org>

sysrq: Fix possible race with exiting task

sysrq should grab the tasklist lock, otherwise calling force_sig() is
not safe, as it might race with an exiting task whose ->sighand might
already be set to NULL.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# ff01bb48 16-Sep-2011 Al Viro <viro@zeniv.linux.org.uk>

fs: move code out of buffer.c

Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export
kill_bdev as well, so brd doesn't have to open code it. Reduce
buffer_head.h requirement accordingly.

Removed a rather large comment from invalidate_bdev, as it looked a bit
obsolete to bother moving. The small comment replacing it says enough.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# b2b755b5 24-Mar-2011 David Rientjes <rientjes@google.com>

lib, arch: add filter argument to show_mem and fix private implementations

Commit ddd588b5dd55 ("oom: suppress nodes that are not allowed from
meminfo on oom kill") moved lib/show_mem.o out of lib/lib.a, which
resulted in build warnings on all architectures that implement their own
versions of show_mem():

lib/lib.a(show_mem.o): In function `show_mem':
show_mem.c:(.text+0x1f4): multiple definition of `show_mem'
arch/sparc/mm/built-in.o:(.text+0xd70): first defined here

The fix is to remove __show_mem() and add its argument to show_mem() in
all implementations to prevent this breakage.

Architectures that implement their own show_mem() actually don't do
anything with the argument yet, but they could be made to filter nodes
that aren't allowed in the current context in the future just like the
generic implementation.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 96fd7ce5 04-Nov-2010 Greg Kroah-Hartman <gregkh@suse.de>

TTY: create drivers/tty and move the tty core files there

The tty code should be in its own subdirectory and not in the char
driver with all of the cruft that is currently there.

Based on work done by Arnd Bergmann <arnd@arndb.de>

Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>