History log of /linux-master/kernel/debug/debug_core.c
Revision Date Author Comments
# dd712d3d 22-Aug-2023 Douglas Anderson <dianders@chromium.org>

kgdb: Flush console before entering kgdb on panic

When entering kdb/kgdb on a kernel panic, it was be observed that the
console isn't flushed before the `kdb` prompt came up. Specifically,
when using the buddy lockup detector on arm64 and running:
echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

I could see:
[ 26.161099] lkdtm: Performing direct entry HARDLOCKUP
[ 32.499881] watchdog: Watchdog detected hard LOCKUP on cpu 6
[ 32.552865] Sending NMI from CPU 5 to CPUs 6:
[ 32.557359] NMI backtrace for cpu 6
... [backtrace for cpu 6] ...
[ 32.558353] NMI backtrace for cpu 5
... [backtrace for cpu 5] ...
[ 32.867471] Sending NMI from CPU 5 to CPUs 0-4,7:
[ 32.872321] NMI backtrace forP cpuANC: Hard LOCKUP

Entering kdb (current=..., pid 0) on processor 5 due to Keyboard Entry
[5]kdb>

As you can see, backtraces for the other CPUs start printing and get
interleaved with the kdb PANIC print.

Let's replicate the commands to flush the console in the kdb panic
entry point to avoid this.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20230822131945.1.I5b460ae8f954e4c4f628a373d6e74713c06dd26f@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# bcb48185 12-Jul-2023 Jiri Slaby <jirislaby@kernel.org>

tty: sysrq: switch sysrq handlers from int to u8

The passed parameter to sysrq handlers is a key (a character). So change
the type from 'int' to 'u8'. Let it specifically be 'u8' for two
reasons:
* unsigned: unsigned values come from the upper layers (devices) and the
tty layer assumes unsigned on most places, and
* 8-bit: as that what's supposed to be one day in all the layers built
on the top of tty. (Currently, we use mostly 'unsigned char' and
somewhere still only 'char'. (But that also translates to the former
thanks to -funsigned-char.))

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de> # DRM
Acked-by: WANG Xuerui <git@xen0n.name> # loongarch
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20230712081811.29004-3-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 7964cf8c 06-Sep-2022 Liam R. Howlett <Liam.Howlett@Oracle.com>

mm: remove vmacache

By using the maple tree and the maple tree state, the vmacache is no
longer beneficial and is complicating the VMA code. Remove the vmacache
to reduce the work in keeping it up to date and code complexity.

Link: https://lkml.kernel.org/r/20220906194824.2110408-26-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# eadb2f47 23-May-2022 Daniel Thompson <daniel.thompson@linaro.org>

lockdown: also lock down previous kgdb use

KGDB and KDB allow read and write access to kernel memory, and thus
should be restricted during lockdown. An attacker with access to a
serial port (for example, via a hypervisor console, which some cloud
vendors provide over the network) could trigger the debugger so it is
important that the debugger respect the lockdown mode when/if it is
triggered.

Fix this by integrating lockdown into kdb's existing permissions
mechanism. Unfortunately kgdb does not have any permissions mechanism
(although it certainly could be added later) so, for now, kgdb is simply
and brutally disabled by immediately exiting the gdb stub without taking
any action.

For lockdowns established early in the boot (e.g. the normal case) then
this should be fine but on systems where kgdb has set breakpoints before
the lockdown is enacted than "bad things" will happen.

CVE: CVE-2022-21499
Co-developed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# f8416aa2 06-Sep-2021 Cai Huoqing <caihuoqing@baidu.com>

kernel: debug: Convert to SPDX identifier

use SPDX-License-Identifier instead of a verbose license text

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210906112302.937-1-caihuoqing@baidu.com
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# c8daba46 28-May-2021 Gustavo A. R. Silva <gustavoars@kernel.org>

kgdb: Fix fall-through warning for Clang

In preparation to enable -Wimplicit-fallthrough for Clang, fix a
fall-through warning by explicitly adding a goto statement instead
of letting the code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20210528200222.GA39201@embeddedor
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 220a31b0 29-May-2021 Zhen Lei <thunder.leizhen@huawei.com>

kgdb: Fix spelling mistakes

Fix some spelling mistakes in comments:
initalization ==> initialization
detatch ==> detach
represntation ==> representation
hexidecimal ==> hexadecimal
delimeter ==> delimiter
architecure ==> architecture

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210529110305.9446-3-thunder.leizhen@huawei.com
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# d54ce615 25-Feb-2021 Sumit Garg <sumit.garg@linaro.org>

kgdb: fix to kill breakpoints on initmem after boot

Currently breakpoints in kernel .init.text section are not handled
correctly while allowing to remove them even after corresponding pages
have been freed.

Fix it via killing .init.text section breakpoints just prior to initmem
pages being freed.

Doug: "HW breakpoints aren't handled by this patch but it's probably
not such a big deal".

Link: https://lkml.kernel.org/r/20210224081652.587785-1-sumit.garg@linaro.org
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Suggested-by: Doug Anderson <dianders@chromium.org>
Acked-by: Doug Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# f11e2bc6 10-Feb-2021 Daniel Thompson <daniel.thompson@linaro.org>

kgdb: Remove kgdb_schedule_breakpoint()

To the very best of my knowledge there has never been any in-tree
code that calls this function. It exists largely to support an
out-of-tree driver that provides kgdb-over-ethernet using the
netpoll API.

kgdboe has been out-of-tree for more than 10 years and I don't
recall any serious attempt to upstream it at any point in the last
five. At this stage it looks better to stop carrying this code in
the kernel and integrate the code into the out-of-tree driver
instead.

The long term trajectory for the kernel looks likely to include
effort to remove or reduce the use of tasklets (something that has
also been true for the last 10 years). Thus the main real reason
for this patch is to make explicit that the in-tree kgdb features
do not require tasklets.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20210210142525.2876648-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Jason Wessel <jason.wessel@windriver.com>


# 2da2687b 25-Jan-2021 Lukas Bulwahn <lukas.bulwahn@gmail.com>

kgdb: rectify kernel-doc for kgdb_unregister_io_module()

The command 'find ./kernel/debug/ | xargs ./scripts/kernel-doc -none'
reported a typo in the kernel-doc of kgdb_unregister_io_module().

Rectify the kernel-doc, such that no issues remain for ./kernel/debug/.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20210125144847.21896-1-lukas.bulwahn@gmail.com
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 545b8c8d 15-Jun-2020 Peter Zijlstra <peterz@infradead.org>

smp: Cleanup smp_call_function*()

Get rid of the __call_single_node union and cleanup the API a little
to avoid external code relying on the structure layout as much.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>


# 771910f7 27-Sep-2020 Daniel Thompson <daniel.thompson@linaro.org>

kernel: debug: Centralize dbg_[de]activate_sw_breakpoints

During debug trap execution we expect dbg_deactivate_sw_breakpoints()
to be paired with an dbg_activate_sw_breakpoint(). Currently although
the calls are paired correctly they are needlessly smeared across three
different functions. Worse this also results in code to drive polled I/O
being called with breakpoints activated which, in turn, needlessly
increases the set of functions that will recursively trap if breakpointed.

Fix this by moving the activation of breakpoints into the debug core.

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200927211531.1380577-4-daniel.thompson@linaro.org
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 4c4197ed 27-Sep-2020 Daniel Thompson <daniel.thompson@linaro.org>

kgdb: Add NOKPROBE labels on the trap handler functions

Currently kgdb honours the kprobe blocklist but doesn't place its own
trap handling code on the list. Add labels to discourage attempting to
use kgdb to debug itself.

Not every functions that executes from the trap handler needs to be
marked up: relatively early in the trap handler execution (just after
we bring the other CPUs to a halt) all breakpoints are replaced with
the original opcodes. This patch marks up code in the debug_core that
executes between trap entry and the breakpoints being deactivated
and, also, code that executes between breakpoint activation and trap
exit.

To be clear these changes are not sufficient to make recursive trapping
impossible since they do not include library calls made during kgdb's
entry/exit logic. However going much further whilst we are sharing the
kprobe blocklist risks reducing the capabilities of kprobe and this
would be a bad trade off (especially so given kgdb's users are currently
conditioned to avoid recursive traps).

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200927211531.1380577-3-daniel.thompson@linaro.org
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# f2d10ff4 27-Sep-2020 Daniel Thompson <daniel.thompson@linaro.org>

kgdb: Honour the kprobe blocklist when setting breakpoints

Currently kgdb has absolutely no safety rails in place to discourage or
prevent a user from placing a breakpoint in dangerous places such as
the debugger's own trap entry/exit and other places where it is not safe
to take synchronous traps.

Introduce a new config symbol KGDB_HONOUR_BLOCKLIST and modify the
default implementation of kgdb_validate_break_address() so that we use
the kprobe blocklist to prohibit instrumentation of critical functions
if the config symbol is set. The config symbol dependencies are set to
ensure that the blocklist will be enabled by default if we enable KGDB
and are compiling for an architecture where we HAVE_KPROBES.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20200927211531.1380577-2-daniel.thompson@linaro.org
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# e16c33e2 07-Aug-2020 Youling Tang <tangyouling@loongson.cn>

kernel/debug: Fix spelling mistake in debug_core.c

Fix typo: "notifiter" --> "notifier"
"overriden" --> "overridden"

Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Link: https://lore.kernel.org/r/1596793480-22559-1-git-send-email-tangyouling@loongson.cn
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# b18b099e 30-Jun-2020 Douglas Anderson <dianders@chromium.org>

kgdb: Make "kgdbcon" work properly with "kgdb_earlycon"

On my system the kernel processes the "kgdb_earlycon" parameter before
the "kgdbcon" parameter. When we setup "kgdb_earlycon" we'll end up
in kgdb_register_callbacks() and "kgdb_use_con" won't have been set
yet so we'll never get around to starting "kgdbcon". Let's remedy
this by detecting that the IO module was already registered when
setting "kgdb_use_con" and registering the console then.

As part of this, to avoid pre-declaring things, move the handling of
the "kgdbcon" further down in the file.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200630151422.1.I4aa062751ff5e281f5116655c976dff545c09a46@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# b13fecb1 13-Jul-2020 Kees Cook <keescook@chromium.org>

treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD()

This converts all the existing DECLARE_TASKLET() (and ...DISABLED)
macros with DECLARE_TASKLET_OLD() in preparation for refactoring the
tasklet callback type. All existing DECLARE_TASKLET() users had a "0"
data argument, it has been removed here as well.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>


# 440ab9e1 02-Jun-2020 Douglas Anderson <dianders@chromium.org>

kgdb: Avoid suspicious RCU usage warning

At times when I'm using kgdb I see a splat on my console about
suspicious RCU usage. I managed to come up with a case that could
reproduce this that looked like this:

WARNING: suspicious RCU usage
5.7.0-rc4+ #609 Not tainted
-----------------------------
kernel/pid.c:395 find_task_by_pid_ns() needs rcu_read_lock() protection!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by swapper/0/1:
#0: ffffff81b6b8e988 (&dev->mutex){....}-{3:3}, at: __device_attach+0x40/0x13c
#1: ffffffd01109e9e8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x20c/0x7ac
#2: ffffffd01109ea90 (dbg_slave_lock){....}-{2:2}, at: kgdb_cpu_enter+0x3ec/0x7ac

stack backtrace:
CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ #609
Hardware name: Google Cheza (rev3+) (DT)
Call trace:
dump_backtrace+0x0/0x1b8
show_stack+0x1c/0x24
dump_stack+0xd4/0x134
lockdep_rcu_suspicious+0xf0/0x100
find_task_by_pid_ns+0x5c/0x80
getthread+0x8c/0xb0
gdb_serial_stub+0x9d4/0xd04
kgdb_cpu_enter+0x284/0x7ac
kgdb_handle_exception+0x174/0x20c
kgdb_brk_fn+0x24/0x30
call_break_hook+0x6c/0x7c
brk_handler+0x20/0x5c
do_debug_exception+0x1c8/0x22c
el1_sync_handler+0x3c/0xe4
el1_sync+0x7c/0x100
rpmh_rsc_probe+0x38/0x420
platform_drv_probe+0x94/0xb4
really_probe+0x134/0x300
driver_probe_device+0x68/0x100
__device_attach_driver+0x90/0xa8
bus_for_each_drv+0x84/0xcc
__device_attach+0xb4/0x13c
device_initial_probe+0x18/0x20
bus_probe_device+0x38/0x98
device_add+0x38c/0x420

If I understand properly we should just be able to blanket kgdb under
one big RCU read lock and the problem should go away. We'll add it to
the beast-of-a-function known as kgdb_cpu_enter().

With this I no longer get any splats and things seem to work fine.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200602154729.v2.1.I70e0d4fd46d5ed2aaf0c98a355e8e1b7a5bb7e4e@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# fe557319 17-Jun-2020 Christoph Hellwig <hch@lst.de>

maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault

Better describe what these functions do.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b1350132 26-May-2020 Douglas Anderson <dianders@chromium.org>

kgdb: Don't call the deinit under spinlock

When I combined kgdboc_earlycon with an inflight patch titled ("soc:
qcom-geni-se: Add interconnect support to fix earlycon crash") [1]
things went boom. Specifically I got a crash during the transition
between kgdboc_earlycon and the main kgdboc that looked like this:

Call trace:
__schedule_bug+0x68/0x6c
__schedule+0x75c/0x924
schedule+0x8c/0xbc
schedule_timeout+0x9c/0xfc
do_wait_for_common+0xd0/0x160
wait_for_completion_timeout+0x54/0x74
rpmh_write_batch+0x1fc/0x23c
qcom_icc_bcm_voter_commit+0x1b4/0x388
qcom_icc_set+0x2c/0x3c
apply_constraints+0x5c/0x98
icc_set_bw+0x204/0x3bc
icc_put+0x30/0xf8
geni_remove_earlycon_icc_vote+0x6c/0x9c
qcom_geni_serial_earlycon_exit+0x10/0x1c
kgdboc_earlycon_deinit+0x38/0x58
kgdb_register_io_module+0x11c/0x194
configure_kgdboc+0x108/0x174
kgdboc_probe+0x38/0x60
platform_drv_probe+0x90/0xb0
really_probe+0x130/0x2fc
...

The problem was that we were holding the "kgdb_registration_lock"
while calling into code that didn't expect to be called in spinlock
context.

Let's slightly defer when we call the deinit code so that it's not
done under spinlock.

NOTE: this does mean that the "deinit" call of the old kgdb IO module
is now made _after_ the init of the new IO module, but presumably
that's OK.

[1] https://lkml.kernel.org/r/1588919619-21355-3-git-send-email-akashast@codeaurora.org

Fixes: 220995622da5 ("kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200526142001.1.I523dc33f96589cb9956f5679976d402c8cda36fa@changeid
[daniel.thompson@linaro.org: Resolved merge issues by hand]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# f83b04d3 15-Apr-2020 Vincent Chen <vincent.chen@sifive.com>

kgdb: Add kgdb_has_hit_break function

The break instruction in RISC-V does not have an immediate value field, so
the kernel cannot identify the purpose of each trap exception through the
opcode. This makes the existing identification schemes in other
architecture unsuitable for the RISC-V kernel. To solve this problem, this
patch adds kgdb_has_hit_break(), which can help RISC-V kernel identify
the KGDB trap exception.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>


# 22099562 07-May-2020 Douglas Anderson <dianders@chromium.org>

kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles

We want to enable kgdb to debug the early parts of the kernel.
Unfortunately kgdb normally is a client of the tty API in the kernel
and serial drivers don't register to the tty layer until fairly late
in the boot process.

Serial drivers do, however, commonly register a boot console. Let's
enable the kgdboc driver to work with boot consoles to provide early
debugging.

This change co-opts the existing read() function pointer that's part
of "struct console". It's assumed that if a boot console (with the
flag CON_BOOT) has implemented read() that both the read() and write()
function are polling functions. That means they work without
interrupts and read() will return immediately (with 0 bytes read) if
there's nothing to read. This should be a safe assumption since it
appears that no current boot consoles implement read() right now and
there seems no reason to do so unless they wanted to support
"kgdboc_earlycon".

The normal/expected way to make all this work is to use
"kgdboc_earlycon" and "kgdboc" together. You should point them both
to the same physical serial connection. At boot time, as the system
transitions from the boot console to the normal console (and registers
a tty), kgdb will switch over.

One awkward part of all this, though, is that there can be a window
where the boot console goes away and we can't quite transtion over to
the main kgdboc that uses the tty layer. There are two main problems:

1. The act of registering the tty doesn't cause any call into kgdboc
so there is a window of time when the tty is there but kgdboc's
init code hasn't been called so we can't transition to it.

2. On some serial drivers the normal console inits (and replaces the
boot console) quite early in the system. Presumably these drivers
were coded up before earlycon worked as well as it does today and
probably they don't need to do this anymore, but it causes us
problems nontheless.

Problem #1 is not too big of a deal somewhat due to the luck of probe
ordering. kgdboc is last in the tty/serial/Makefile so its probe gets
right after all other tty devices. It's not fun to rely on this, but
it does work for the most part.

Problem #2 is a big deal, but only for some serial drivers. Other
serial drivers end up registering the console (which gets rid of the
boot console) and tty at nearly the same time.

The way we'll deal with the window when the system has stopped using
the boot console and the time when we're setup using the tty is to
keep using the boot console. This may sound surprising, but it has
been found to work well in practice. If it doesn't work, it shouldn't
be too hard for a given serial driver to make it keep working.
Specifically, it's expected that the read()/write() function provided
in the boot console should be the same (or nearly the same) as the
normal kgdb polling functions. That means continuing to use them
should work just fine. To make things even more likely to work work
we'll also trap the recently added exit() function in the boot console
we're using and delay any calls to it until we're all done with the
boot console.

NOTE: there could be ways to use all this in weird / unexpected ways.
If you do something like this, it's a bit of a buyer beware situation.
Specifically:
- If you specify only "kgdboc_earlycon" but not "kgdboc" then
(depending on your serial driver) things will probably work OK, but
you'll get a warning printed the first time you use kgdb after the
boot console is gone. You'd only be able to do this, of course, if
the serial driver you're running atop provided an early boot console.
- If your "kgdboc_earlycon" and "kgdboc" devices are not the same
device things should work OK, but it'll be your job to switch over
which device you're monitoring (including figuring out how to switch
over gdb in-flight if you're using it).

When trying to enable "kgdboc_earlycon" it should be noted that the
names that are registered through the boot console layer and the tty
layer are not the same for the same port. For example when debugging
on one board I'd need to pass "kgdboc_earlycon=qcom_geni
kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
console name is a pain and there will rarely be more than one boot
console enabled, you can provide the "kgdboc_earlycon" parameter
without specifying the name of the boot console. In this case we'll
just pick the first boot that implements read() that we find.

This new "kgdboc_earlycon" parameter should be contrasted to the
existing "ekgdboc" parameter. While both provide a way to debug very
early, the usage and mechanisms are quite different. Specifically
"kgdboc_earlycon" is meant to be used in tandem with "kgdboc" and
there is a transition from one to the other. The "ekgdboc" parameter,
on the other hand, replaces the "kgdboc" parameter. It runs the same
logic as the "kgdboc" parameter but just relies on your TTY driver
being present super early. The only known usage of the old "ekgdboc"
parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
be noted that "kbd" has special treatment allowing it to init early as
a tty device.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.8.I8fba5961bf452ab92350654aa61957f23ecf0100@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 3ca676e4 07-May-2020 Douglas Anderson <dianders@chromium.org>

kgdb: Prevent infinite recursive entries to the debugger

If we detect that we recursively entered the debugger we should hack
our I/O ops to NULL so that the panic() in the next line won't
actually cause another recursion into the debugger. The first line of
kgdb_panic() will check this and return.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.6.I89de39f68736c9de610e6f241e68d8dbc44bc266@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# b1a57bbf 07-May-2020 Douglas Anderson <dianders@chromium.org>

kgdb: Delay "kgdbwait" to dbg_late_init() by default

Using kgdb requires at least some level of architecture-level
initialization. If nothing else, it relies on the architecture to
pass breakpoints / crashes onto kgdb.

On some architectures this all works super early, specifically it
starts working at some point in time before Linux parses
early_params's. On other architectures it doesn't. A survey of a few
platforms:

a) x86: Presumably it all works early since "ekgdboc" is documented to
work here.
b) arm64: Catching crashes works; with a simple patch breakpoints can
also be made to work.
c) arm: Nothing in kgdb works until
paging_init() -> devicemaps_init() -> early_trap_init()

Let's be conservative and, by default, process "kgdbwait" (which tells
the kernel to drop into the debugger ASAP at boot) a bit later at
dbg_late_init() time. If an architecture has tested it and wants to
re-enable super early debugging, they can select the
ARCH_HAS_EARLY_DEBUG KConfig option. We'll do this for x86 to start.
It should be noted that dbg_late_init() is still called quite early in
the system.

Note that this patch doesn't affect when kgdb runs its init. If kgdb
is set to initialize early it will still initialize when parsing
early_param's. This patch _only_ inhibits the initial breakpoint from
"kgdbwait". This means:

* Without any extra patches arm64 platforms will at least catch
crashes after kgdb inits.
* arm platforms will catch crashes (and could handle a hardcoded
kgdb_breakpoint()) any time after early_trap_init() runs, even
before dbg_late_init().

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20200507130644.v4.4.I3113aea1b08d8ce36dc3720209392ae8b815201b@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 202164fb 07-May-2020 Douglas Anderson <dianders@chromium.org>

kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb

In commit 81eaadcae81b ("kgdboc: disable the console lock when in
kgdb") we avoided the WARN_CONSOLE_UNLOCKED() yell when we were in
kgdboc. That still works fine, but it turns out that we get a similar
yell when using other I/O drivers. One example is the "I/O driver"
for the kgdb test suite (kgdbts). When I enabled that I again got the
same yells.

Even though "kgdbts" doesn't actually interact with the user over the
console, using it still causes kgdb to print to the consoles. That
trips the same warning:
con_is_visible+0x60/0x68
con_scroll+0x110/0x1b8
lf+0x4c/0xc8
vt_console_print+0x1b8/0x348
vkdb_printf+0x320/0x89c
kdb_printf+0x68/0x90
kdb_main_loop+0x190/0x860
kdb_stub+0x2cc/0x3ec
kgdb_cpu_enter+0x268/0x744
kgdb_handle_exception+0x1a4/0x200
kgdb_compiled_brk_fn+0x34/0x44
brk_handler+0x7c/0xb8
do_debug_exception+0x1b4/0x228

Let's increment/decrement the "ignore_console_lock_warning" variable
all the time when we enter the debugger.

This will allow us to later revert commit 81eaadcae81b ("kgdboc:
disable the console lock when in kgdb").

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.1.Ied2b058357152ebcc8bf68edd6f20a11d98d7d4e@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# c69b470e 13-May-2020 Emil Velikov <emil.l.velikov@gmail.com>

kdb: constify sysrq_key_op

With earlier commits, the API no longer discards the const-ness of the
sysrq_key_op. As such we can add the notation.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-kernel@vger.kernel.org
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: kgdb-bugreport@lists.sourceforge.net
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Link: https://lore.kernel.org/r/20200513214351.2138580-9-emil.l.velikov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# a1350207 06-May-2020 Andy Shevchenko <andriy.shevchenko@linux.intel.com>

kgdb: Drop malformed kernel doc comment

Kernel doc does not understand POD variables to be referred to.

.../debug_core.c:73: warning: cannot understand function prototype:
'int kgdb_connected; '

Convert kernel doc to pure comment.

Fixes: dc7d55270521 ("kgdb: core")
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# d07ce4e3 21-Oct-2019 Daniel Thompson <daniel.thompson@linaro.org>

kdb: Avoid array subscript warnings on non-SMP builds

Recent versions of gcc (reported on gcc-7.4) issue array subscript
warnings for builds where SMP is not enabled.

kernel/debug/debug_core.c: In function 'kdb_dump_stack_on_cpu':
kernel/debug/debug_core.c:452:17: warning: array subscript is outside array
+bounds [-Warray-bounds]
if (!(kgdb_info[cpu].exception_state & DCPU_IS_SLAVE)) {
~~~~~~~~~^~~~~
kernel/debug/debug_core.c:469:33: warning: array subscript is outside array
+bounds [-Warray-bounds]
kgdb_info[cpu].exception_state |= DCPU_WANT_BT;
kernel/debug/debug_core.c:470:18: warning: array subscript is outside array
+bounds [-Warray-bounds]
while (kgdb_info[cpu].exception_state & DCPU_WANT_BT)

There is no bug here but there is scope to improve the code
generation for non-SMP systems (whilst also silencing the warning).

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 2277b492582d ("kdb: Fix stack crawling on 'running' CPUs that aren't the master")
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20191021101057.23861-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>


# 2277b492 25-Sep-2019 Douglas Anderson <dianders@chromium.org>

kdb: Fix stack crawling on 'running' CPUs that aren't the master

In kdb when you do 'btc' (back trace on CPU) it doesn't necessarily
give you the right info. Specifically on many architectures
(including arm64, where I tested) you can't dump the stack of a
"running" process that isn't the process running on the current CPU.
This can be seen by this:

echo SOFTLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
# wait 2 seconds
<sysrq>g

Here's what I see now on rk3399-gru-kevin. I see the stack crawl for
the CPU that handled the sysrq but everything else just shows me stuck
in __switch_to() which is bogus:

======

[0]kdb> btc
btc: cpu status: Currently on cpu 0
Available cpus: 0, 1-3(I), 4, 5(I)
Stack traceback for pid 0
0xffffff801101a9c0 0 0 1 0 R 0xffffff801101b3b0 *swapper/0
Call trace:
dump_backtrace+0x0/0x138
...
kgdb_compiled_brk_fn+0x34/0x44
...
sysrq_handle_dbg+0x34/0x5c
Stack traceback for pid 0
0xffffffc0f175a040 0 0 1 1 I 0xffffffc0f175aa30 swapper/1
Call trace:
__switch_to+0x1e4/0x240
0xffffffc0f65616c0
Stack traceback for pid 0
0xffffffc0f175d040 0 0 1 2 I 0xffffffc0f175da30 swapper/2
Call trace:
__switch_to+0x1e4/0x240
0xffffffc0f65806c0
Stack traceback for pid 0
0xffffffc0f175b040 0 0 1 3 I 0xffffffc0f175ba30 swapper/3
Call trace:
__switch_to+0x1e4/0x240
0xffffffc0f659f6c0
Stack traceback for pid 1474
0xffffffc0dde8b040 1474 727 1 4 R 0xffffffc0dde8ba30 bash
Call trace:
__switch_to+0x1e4/0x240
__schedule+0x464/0x618
0xffffffc0dde8b040
Stack traceback for pid 0
0xffffffc0f17b0040 0 0 1 5 I 0xffffffc0f17b0a30 swapper/5
Call trace:
__switch_to+0x1e4/0x240
0xffffffc0f65dd6c0

===

The problem is that 'btc' eventually boils down to
show_stack(task_struct, NULL);

...and show_stack() doesn't work for "running" CPUs because their
registers haven't been stashed.

On x86 things might work better (I haven't tested) because kdb has a
special case for x86 in kdb_show_stack() where it passes the stack
pointer to show_stack(). This wouldn't work on arm64 where the stack
crawling function seems needs the "fp" and "pc", not the "sp" which is
presumably why arm64's show_stack() function totally ignores the "sp"
parameter.

NOTE: we _can_ get a good stack dump for all the cpus if we manually
switch each one to the kdb master and do a back trace. AKA:
cpu 4
bt
...will give the expected trace. That's because now arm64's
dump_backtrace will now see that "tsk == current" and go through a
different path.

In this patch I fix the problems by catching a request to stack crawl
a task that's running on a CPU and then I ask that CPU to do the stack
crawl.

NOTE: this will (presumably) change what stack crawls are printed for
x86 machines. Now kdb functions will show up in the stack crawl.
Presumably this is OK but if it's not we can go back and add a special
case for x86 again.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 7d92bda2 25-Sep-2019 Douglas Anderson <dianders@chromium.org>

kgdb: don't use a notifier to enter kgdb at panic; call directly

Right now kgdb/kdb hooks up to debug panics by registering for the panic
notifier. This works OK except that it means that kgdb/kdb gets called
_after_ the CPUs in the system are taken offline. That means that if
anything important was happening on those CPUs (like something that might
have contributed to the panic) you can't debug them.

Specifically I ran into a case where I got a panic because a task was
"blocked for more than 120 seconds" which was detected on CPU 2. I nicely
got shown stack traces in the kernel log for all CPUs including CPU 0,
which was running 'PID: 111 Comm: kworker/0:1H' and was in the middle of
__mmc_switch().

I then ended up at the kdb prompt where switched over to kgdb to try to
look at local variables of the process on CPU 0. I found that I couldn't.
Digging more, I found that I had no info on any tasks running on CPUs
other than CPU 2 and that asking kdb for help showed me "Error: no saved
data for this cpu". This was because all the CPUs were offline.

Let's move the entry of kdb/kgdb to a direct call from panic() and stop
using the generic notifier. Putting a direct call in allows us to order
things more properly and it also doesn't seem like we're breaking any
abstractions by calling into the debugger from the panic function.

Daniel said:

: This patch changes the way kdump and kgdb interact with each other.
: However it would seem rather odd to have both tools simultaneously armed
: and, even if they were, the user still has the option to use panic_timeout
: to force a kdump to happen. Thus I think the change of order is
: acceptable.

Link: http://lkml.kernel.org/r/20190703170354.217312-1-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Feng Tang <feng.tang@intel.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d8a050f5 21-Oct-2018 Nadav Amit <namit@vmware.com>

kgdb: fix comment regarding static function

The comment that says that module_event() is not static is clearly
wrong.

Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 162bc7f5 04-Dec-2018 Douglas Anderson <dianders@chromium.org>

kdb: Don't back trace on a cpu that didn't round up

If you have a CPU that fails to round up and then run 'btc' you'll end
up crashing in kdb becaue we dereferenced NULL. Let's add a check.
It's wise to also set the task to NULL when leaving the debugger so
that if we fail to round up on a later entry into the debugger we
won't backtrace a stale task.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 87b09592 04-Dec-2018 Douglas Anderson <dianders@chromium.org>

kgdb: Don't round up a CPU that failed rounding up before

If we're using the default implementation of kgdb_roundup_cpus() that
uses smp_call_function_single_async() we can end up hanging
kgdb_roundup_cpus() if we try to round up a CPU that failed to round
up before.

Specifically smp_call_function_single_async() will try to wait on the
csd lock for the CPU that we're trying to round up. If the previous
round up never finished then that lock could still be held and we'll
just sit there hanging.

There's not a lot of use trying to round up a CPU that failed to round
up before. Let's keep a flag that indicates whether the CPU started
but didn't finish to round up before. If we see that flag set then
we'll skip the next round up.

In general we have a few goals here:
- We never want to end up calling smp_call_function_single_async()
when the csd is still locked. This is accomplished because
flush_smp_call_function_queue() unlocks the csd _before_ invoking
the callback. That means that when kgdb_nmicallback() runs we know
for sure the the csd is no longer locked. Thus when we set
"rounding_up = false" we know for sure that the csd is unlocked.
- If there are no timeouts rounding up we should never skip a round
up.

NOTE #1: In general trying to continue running after failing to round
up CPUs doesn't appear to be supported in the debugger. When I
simulate this I find that kdb reports "Catastrophic error detected"
when I try to continue. I can overrule and continue anyway, but it
should be noted that we may be entering the land of dragons here.
Possibly the "Catastrophic error detected" was added _because_ of the
future failure to round up, but even so this is an area of the code
that hasn't been strongly tested.

NOTE #2: I did a bit of testing before and after this change. I
introduced a 10 second hang in the kernel while holding a spinlock
that I could invoke on a certain CPU with 'taskset -c 3 cat /sys/...".

Before this change if I did:
- Invoke hang
- Enter debugger
- g (which warns about Catastrophic error, g again to go anyway)
- g
- Enter debugger

...I'd hang the rest of the 10 seconds without getting a debugger
prompt. After this change I end up in the debugger the 2nd time after
only 1 second with the standard warning about 'Timed out waiting for
secondary CPUs.'

I'll also note that once the CPU finished waiting I could actually
debug it (aka "btc" worked)

I won't promise that everything works perfectly if the errant CPU
comes back at just the wrong time (like as we're entering or exiting
the debugger) but it certainly seems like an improvement.

NOTE #3: setting 'kgdb_info[cpu].rounding_up = false' is in
kgdb_nmicallback() instead of kgdb_call_nmi_hook() because some
implementations override kgdb_call_nmi_hook(). It shouldn't hurt to
have it in kgdb_nmicallback() in any case.

NOTE #4: this logic is really only needed because there is no API call
like "smp_try_call_function_single_async()" or "smp_csd_is_locked()".
If such an API existed then we'd use it instead, but it seemed a bit
much to add an API like this just for kgdb.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 3cd99ac3 04-Dec-2018 Douglas Anderson <dianders@chromium.org>

kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()

When I had lockdep turned on and dropped into kgdb I got a nice splat
on my system. Specifically it hit:
DEBUG_LOCKS_WARN_ON(current->hardirq_context)

Specifically it looked like this:
sysrq: SysRq : DEBUG
------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(current->hardirq_context)
WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27
pstate: 604003c9 (nZCv DAIF +PAN -UAO)
pc : lockdep_hardirqs_on+0xf0/0x160
...
Call trace:
lockdep_hardirqs_on+0xf0/0x160
trace_hardirqs_on+0x188/0x1ac
kgdb_roundup_cpus+0x14/0x3c
kgdb_cpu_enter+0x53c/0x5cc
kgdb_handle_exception+0x180/0x1d4
kgdb_compiled_brk_fn+0x30/0x3c
brk_handler+0x134/0x178
do_debug_exception+0xfc/0x178
el1_dbg+0x18/0x78
kgdb_breakpoint+0x34/0x58
sysrq_handle_dbg+0x54/0x5c
__handle_sysrq+0x114/0x21c
handle_sysrq+0x30/0x3c
qcom_geni_serial_isr+0x2dc/0x30c
...
...
irq event stamp: ...45
hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4
hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130
softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34
softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100
---[ end trace adf21f830c46e638 ]---

Looking closely at it, it seems like a really bad idea to be calling
local_irq_enable() in kgdb_roundup_cpus(). If nothing else that seems
like it could violate spinlock semantics and cause a deadlock.

Instead, let's use a private csd alongside
smp_call_function_single_async() to round up the other CPUs. Using
smp_call_function_single_async() doesn't require interrupts to be
enabled so we can remove the offending bit of code.

In order to avoid duplicating this across all the architectures that
use the default kgdb_roundup_cpus(), we'll add a "weak" implementation
to debug_core.c.

Looking at all the people who previously had copies of this code,
there were a few variants. I've attempted to keep the variants
working like they used to. Specifically:
* For arch/arc we passed NULL to kgdb_nmicallback() instead of
get_irq_regs().
* For arch/mips there was a bit of extra code around
kgdb_nmicallback()

NOTE: In this patch we will still get into trouble if we try to round
up a CPU that failed to round up before. We'll try to round it up
again and potentially hang when we try to grab the csd lock. That's
not new behavior but we'll still try to do better in a future patch.

Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 9ef7fa50 04-Dec-2018 Douglas Anderson <dianders@chromium.org>

kgdb: Remove irq flags from roundup

The function kgdb_roundup_cpus() was passed a parameter that was
documented as:

> the flags that will be used when restoring the interrupts. There is
> local_irq_save() call before kgdb_roundup_cpus().

Nobody used those flags. Anyone who wanted to temporarily turn on
interrupts just did local_irq_enable() and local_irq_disable() without
looking at them. So we can definitely remove the flags.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>


# 38b8d208 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h>

We are going to move softlockup APIs out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

<linux/nmi.h> already includes <linux/sched.h>.

Include the <linux/nmi.h> header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 314ff785 03-Feb-2017 Ingo Molnar <mingo@kernel.org>

mm/vmacache, sched/headers: Introduce 'struct vmacache' and move it from <linux/sched.h> to <linux/mm_types>

The <linux/sched.h> header includes various vmacache related defines,
which are arguably misplaced.

Move them to mm_types.h and minimize the sched.h impact by putting
all task vmacache state into a new 'struct vmacache' structure.

No change in functionality.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 2d13bb64 14-Dec-2016 Douglas Anderson <dianders@chromium.org>

kernel/debug/debug_core.c: more properly delay for secondary CPUs

We've got a delay loop waiting for secondary CPUs. That loop uses
loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures. It is quite
common to see things like this in the boot log:

Calibrating delay loop (skipped), value calculated using timer
frequency.. 48.00 BogoMIPS (lpj=24000)

In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.

Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout. We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.

[akpm@linux-foundation.org: simplifications, per Daniel]
Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5516fd7b 28-Jan-2015 Colin Cross <ccross@android.com>

debug: prevent entering debug mode on panic/exception.

On non-developer devices, kgdb prevents the device from rebooting
after a panic.

Incase of panics and exceptions, to allow the device to reboot, prevent
entering debug mode to avoid getting stuck waiting for the user to
interact with debugger.

To avoid entering the debugger on panic/exception without any extra
configuration, panic_timeout is being used which can be set via
/proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the
default value.

Setting panic_timeout indicates that the user requested machine to
perform unattended reboot after panic. We dont want to get stuck waiting
for the user input incase of panic.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Cc: Android Kernel Team <kernel-team@android.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Colin Cross <ccross@android.com>
[Kiran: Added context to commit message.
panic_timeout is used instead of break_on_panic and
break_on_exception to honor CONFIG_PANIC_TIMEOUT
Modified the commit as per community feedback]
Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# df0036d1 08-Jan-2015 Jason Wessel <jason.wessel@windriver.com>

kdb: Fix off by one error in kdb_cpu()

There was a follow on replacement patch against the prior
"kgdb: Timeout if secondary CPUs ignore the roundup".

See: https://lkml.org/lkml/2015/1/7/442

This patch is the delta vs the patch that was committed upstream:
* Fix an off-by-one error in kdb_cpu().
* Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we
really want a static limit.
* Removed the "KGDB: " prefix from the pr_crit() in debug_core.c
(kgdb-next contains a patch which introduced pr_fmt() to this file
to the tag will now be applied automatically).

Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 0f16996c 12-Jun-2014 Fabian Frederick <fabf@skynet.be>

kernel/debug/debug_core.c: Logging clean-up

-Convert printk( to pr_foo()
-Add pr_fmt
-Coalesce formats

Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# a1465d2f 11-Nov-2014 Daniel Thompson <daniel.thompson@linaro.org>

kgdb: timeout if secondary CPUs ignore the roundup

Currently if an active CPU fails to respond to a roundup request the CPU
that requested the roundup will become stuck. This needlessly reduces the
robustness of the debugger.

This patch introduces a timeout allowing the system state to be examined
even when the system contains unresponsive processors. It also modifies
kdb's cpu command to make it censor attempts to switch to unresponsive
processors and to report their state as (D)ead.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 4e857c58 17-Mar-2014 Peter Zijlstra <peterz@infradead.org>

arch: Mass conversion of smp_mb__*()

Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 615d6e87 07-Apr-2014 Davidlohr Bueso <davidlohr@hp.com>

mm: per-thread vma caching

This patch is a continuation of efforts trying to optimize find_vma(),
avoiding potentially expensive rbtree walks to locate a vma upon faults.
The original approach (https://lkml.org/lkml/2013/11/1/410), where the
largest vma was also cached, ended up being too specific and random,
thus further comparison with other approaches were needed. There are
two things to consider when dealing with this, the cache hit rate and
the latency of find_vma(). Improving the hit-rate does not necessarily
translate in finding the vma any faster, as the overhead of any fancy
caching schemes can be too high to consider.

We currently cache the last used vma for the whole address space, which
provides a nice optimization, reducing the total cycles in find_vma() by
up to 250%, for workloads with good locality. On the other hand, this
simple scheme is pretty much useless for workloads with poor locality.
Analyzing ebizzy runs shows that, no matter how many threads are
running, the mmap_cache hit rate is less than 2%, and in many situations
below 1%.

The proposed approach is to replace this scheme with a small per-thread
cache, maximizing hit rates at a very low maintenance cost.
Invalidations are performed by simply bumping up a 32-bit sequence
number. The only expensive operation is in the rare case of a seq
number overflow, where all caches that share the same address space are
flushed. Upon a miss, the proposed replacement policy is based on the
page number that contains the virtual address in question. Concretely,
the following results are seen on an 80 core, 8 socket x86-64 box:

1) System bootup: Most programs are single threaded, so the per-thread
scheme does improve ~50% hit rate by just adding a few more slots to
the cache.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 50.61% | 19.90 |
| patched | 73.45% | 13.58 |
+----------------+----------+------------------+

2) Kernel build: This one is already pretty good with the current
approach as we're dealing with good locality.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 75.28% | 11.03 |
| patched | 88.09% | 9.31 |
+----------------+----------+------------------+

3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 70.66% | 17.14 |
| patched | 91.15% | 12.57 |
+----------------+----------+------------------+

4) Ebizzy: There's a fair amount of variation from run to run, but this
approach always shows nearly perfect hit rates, while baseline is just
about non-existent. The amounts of cycles can fluctuate between
anywhere from ~60 to ~116 for the baseline scheme, but this approach
reduces it considerably. For instance, with 80 threads:

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 1.06% | 91.54 |
| patched | 99.97% | 14.18 |
+----------------+----------+------------------+

[akpm@linux-foundation.org: fix nommu build, per Davidlohr]
[akpm@linux-foundation.org: document vmacache_valid() logic]
[akpm@linux-foundation.org: attempt to untangle header files]
[akpm@linux-foundation.org: add vmacache_find() BUG_ON]
[hughd@google.com: add vmacache_valid_mm() (from Oleg)]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: adjust and enhance comments]
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Tested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d498d4b4 28-Jan-2014 Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

KGDB: make kgdb_breakpoint() as noinline

The function kgdb_breakpoint() sets up break point at
compile time by calling arch_kgdb_breakpoint();
Though this call is surrounded by wmb() barrier,
the compile can still re-order the break point,
because this scheduling barrier is not a code motion
barrier in gcc.

Making kgdb_breakpoint() as noinline solves this problem
of code reording around break point instruction and also
avoids problem of being called as inline function from
other places

More details about discussion on this can be found here
http://comments.gmane.org/gmane.linux.ports.arm.kernel/269732

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# fc8b1374 14-Jan-2014 Mike Travis <travis@sgi.com>

kgdb/kdb: Fix no KDB config problem

Some code added to the debug_core module had KDB dependencies
that it shouldn't have. Move the KDB dependent REASON back to
the caller to remove the dependency in the debug core code.

Update the call from the UV NMI handler to conform to the new
interface.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20140114162551.318251993@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 8daaa5f8 02-Oct-2013 Mike Travis <travis@sgi.com>

kdb: Add support for external NMI handler to call KGDB/KDB

This patch adds a kgdb_nmicallin() interface that can be used by
external NMI handlers to call the KGDB/KDB handler. The primary
need for this is for those types of NMI interrupts where all the
CPUs have already received the NMI signal. Therefore no
send_IPI(NMI) is required, and in fact it will cause a 2nd
unhandled NMI to occur. This generates the "Dazed and Confuzed"
messages.

Since all the CPUs are getting the NMI at roughly the same time,
it's not guaranteed that the first CPU that hits the NMI handler
will manage to enter KGDB and set the dbg_master_lock before the
slaves start entering. The new argument "send_ready" was added
for KGDB to signal the NMI handler to release the slave CPUs for
entry into KGDB.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20131002151417.928886849@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# f3456509 30-Apr-2013 zhangwei(Jovi) <jovi.zhangwei@huawei.com>

kgdb/sysrq: fix inconstistent help message of sysrq key

Currently help message of /proc/sysrq-trigger highlight its upper-case
characters, like below:

SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E)
memory-full-oom-kill(F) kill-all-tasks(I) ...

this would confuse user trigger sysrq by upper-case character, which is
inconsistent with the real lower-case character registed key.

This inconsistent help message will also lead more confused when
26 upper-case letters put into use in future.

This patch fix kgdb sysrq key: "debug(g)"

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 16559ae4 04-Feb-2013 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

kgdb: remove #include <linux/serial_8250.h> from kgdb.h

There's no reason kgdb.h itself needs to include the 8250 serial port
header file. So push it down to the _very_ limited number of individual
drivers that need the values in that file, and fix up the places where
people really wanted serial_core.h and platform_device.h.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# f30fed10 12-Oct-2012 Jason Wessel <jason.wessel@windriver.com>

kgdb: Add module event hooks

Allow gdb to auto load kernel modules when it is attached,
which makes it trivially easy to debug module init functions
or pre-set breakpoints in a kernel module that has not loaded yet.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 5a14fead 24-Sep-2012 Anton Vorontsov <anton.vorontsov@linaro.org>

kernel/debug: Mask KGDB NMI upon entry

The new arch callback should manage NMIs that usually cause KGDB to
enter. That is, not all NMIs should be enabled/disabled, but only
those that issue kgdb_handle_exception().

We must mask it as serial-line interrupt can be used as an NMI, so
if the original KGDB-entry cause was say a breakpoint, then every
input to KDB console will cause KGDB to reenter, which we don't want.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 98b54aa1 21-Mar-2012 Jason Wessel <jason.wessel@windriver.com>

kgdb,debug_core: pass the breakpoint struct instead of address and memory

There is extra state information that needs to be exposed in the
kgdb_bpt structure for tracking how a breakpoint was installed. The
debug_core only uses the the probe_kernel_write() to install
breakpoints, but this is not enough for all the archs. Some arch such
as x86 need to use text_poke() in order to install a breakpoint into a
read only page.

Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and
kgdb_arch_remove_breakpoint() allows other archs to set the type
variable which indicates how the breakpoint was installed.

Cc: stable@vger.kernel.org # >= 2.6.36
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 9ffc93f2 28-Mar-2012 David Howells <dhowells@redhat.com>

Remove all #inclusions of asm/system.h

Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>


# bec4d62e 19-Mar-2012 Jason Wessel <jason.wessel@windriver.com>

kgdb,debug_core: add the ability to control the reboot notifier

Sometimes it is desirable to stop the kernel debugger before allowing
a system to reboot either with kdb or kgdb. This patch adds the
ability to turn the reboot notifier on and off or enter the debugger
and stop kernel execution before rebooting.

It is possible to change the setting after booting the kernel with the
following:

echo 1 > /sys/module/debug_core/parameters/kgdbreboot

It is also possible to change this setting using kdb / kgdb to
manipulate the variable directly.

Using KDB:
mm kgdbreboot 1

Using gdb:
set kgdbreboot=1

Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 2366e047 16-Mar-2012 Jason Wessel <jason.wessel@windriver.com>

kgdb,debug-core,gdbstub: Hook the reboot notifier for debugger detach

The gdbstub and kdb should get detached if the system is rebooting.
Calling gdbstub_exit() will set the proper debug core state and send a
message to any debugger that is connected to correctly detach.

An attached debugger will receive the exit code from
include/linux/reboot.h based on SYS_HALT, SYS_REBOOT, etc...

Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 60063497 26-Jul-2011 Arun Sharma <asharma@fb.com>

atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 25985edc 30-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi>

Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>


# d7ba979d 18-Aug-2010 Dongdong Deng <dongdong.deng@windriver.com>

debug_core,x86,blackfin: Clean up hw debug disable API

The kgdb_disable_hw_debug() was an architecture specific function for
disabling all hardware breakpoints on a per cpu basis when entering
the debug core.

This patch will remove the weak function kdbg_disable_hw_debug() and
change it into a call back which lives with the rest of hw breakpoint
call backs in struct kgdb_arch.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 495363d3 21-May-2010 Jason Wessel <jason.wessel@windriver.com>

kdb,debug_core: adjust master cpu switch logic against new debug_core locking

The kdb shell needs to enforce switching back to the original CPU that
took the exception before restoring normal kernel execution. Resuming
from a different CPU than what took the original exception will cause
problems with spin locks that are freed from the a different processor
than had taken the lock.

The special logic in dbg_cpu_switch() can go away entirely with
because the state of what cpus want to be masters or slaves will
remain unchanged between entry and exit of the debug_core exception
context.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# dfee3a7b 21-May-2010 Jason Wessel <jason.wessel@windriver.com>

debug_core: refactor locking for master/slave cpus

For quite some time there have been problems with memory barriers and
various races with NMI on multi processor systems using the kernel
debugger. The algorithm for entering the kernel debug core and
resuming kernel execution was racy and had several known edge case
problems with attempting to debug something on a heavily loaded system
using breakpoints that are hit repeatedly and quickly.

The prior "locking" design entry worked as follows:

* The atomic counter kgdb_active was used with atomic exchange in
order to elect a master cpu out of all the cpus that may have
taken a debug exception.
* The master cpu increments all elements of passive_cpu_wait[].
* The master cpu issues the round up cpus message.
* Each "slave cpu" that enters the debug core increments its own
element in cpu_in_kgdb[].
* Each "slave cpu" spins on passive_cpu_wait[] until it becomes 0.
* The master cpu debugs the system.

The new scheme removes the two arrays of atomic counters and replaces
them with 2 single counters. One counter is used to count the number
of cpus waiting to become a master cpu (because one or more hit an
exception). The second counter is use to indicate how many cpus have
entered as slave cpus.

The new entry logic works as follows:

* One or more cpus enters via kgdb_handle_exception() and increments
the masters_in_kgdb. Each cpu attempts to get the spin lock called
dbg_master_lock.
* The master cpu sets kgdb_active to the current cpu.
* The master cpu takes the spinlock dbg_slave_lock.
* The master cpu asks to round up all the other cpus.
* Each slave cpu that is not already in kgdb_handle_exception()
will enter and increment slaves_in_kgdb. Each slave will now spin
try_locking on dbg_slave_lock.
* The master cpu waits for the sum of masters_in_kgdb and slaves_in_kgdb
to be equal to the sum of the online cpus.
* The master cpu debugs the system.

In the new design the kgdb_active can only be changed while holding
dbg_master_lock. Stress testing has not turned up any further
entry/exit races that existed in the prior locking design. The prior
locking design suffered from atomic variables not being truly atomic
(in the capacity as used by kgdb) along with memory barrier races.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Dongdong Deng <dongdong.deng@windriver.com>


# c1bb9a9c 13-Sep-2010 Dongdong Deng <dongdong.deng@windriver.com>

debug_core: disable hw_breakpoints on all cores in kgdb_cpu_enter()

The slave cpus do not have the hw breakpoints disabled upon entry to
the debug_core and as a result could cause unrecoverable recursive
faults on badly placed breakpoints, or get out of sync with the arch
specific hw breakpoint operations.

This patch addresses the problem by invoking kgdb_disable_hw_debug()
earlier in kgdb_enter_cpu for each cpu that enters the debug core.

The hw breakpoint dis/enable flow should be:

master_debug_cpu slave_debug_cpu
\ /
kgdb_cpu_enter
|
kgdb_disable_hw_debug --> uninstall pre-enabled hw_breakpoint
|
do add/rm dis/enable operates to hw_breakpoints on master_debug_cpu..
|
correct_hw_break --> correct/install the enabled hw_breakpoint
|
leave_kgdb

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# fb70b588 12-Aug-2010 Jason Wessel <jason.wessel@windriver.com>

debug_core: stop rcu warnings on kernel resume

When returning from the kernel debugger reset the rcu jiffies_stall
value to prevent the rcu stall detector from sending NMI events which
invoke a stack dump for each cpu in the system.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 16cdc628 06-Aug-2010 Jason Wessel <jason.wessel@windriver.com>

debug_core: move all watch dog syncs to a single function

Move the various clock and watch dog syncs to a single function in
advance of adding another sync for the rcu stall detector.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 1495cc9d 17-Aug-2010 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Input: sysrq - drop tty argument from sysrq ops handlers

Noone is using tty argument so let's get rid of it.

Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>


# 3fa43aba 05-Aug-2010 Jason Wessel <jason.wessel@windriver.com>

debug_core,kdb: fix crash when arch does not have single step

When an arch such as mips and microblaze does not implement either HW
or software single stepping the debug core should re-enter kdb. The
kdb code will properly ignore the single step operation. Attempting
to single step the kernel without software or hardware support causes
unpredictable kernel crashes.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# b0679c63 21-Jul-2010 Jason Wessel <jason.wessel@windriver.com>

debug_core,kdb: fix kgdb_connected bit set in the wrong place

Immediately following an exit from the kdb shell the kgdb_connected
variable should be set to zero, unless there are breakpoints planted.
If the kgdb_connected variable is not zeroed out with kdb, it is
impossible to turn off kdb.

This patch is merely a work around for now, the real fix will check
for the breakpoints.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# a2531293 18-Jul-2010 Pavel Machek <pavel@ucw.cz>

update email address

pavel@suse.cz no longer works, replace it with working address.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


# 0b4b3827 20-May-2010 Jason Wessel <jason.wessel@windriver.com>

x86, kgdb, init: Add early and late debug states

The kernel debugger can operate well before mm_init(), but the x86
hardware breakpoint code which uses the perf api requires that the
kernel allocators are initialized.

This means the kernel debug core needs to provide an optional arch
specific call back to allow the initialization functions to run after
the kernel has been further initialized.

The kdb shell already had a similar restriction with an early
initialization and late initialization. The kdb_init() was moved into
the debug core's version of the late init which is called
dbg_late_init();

CC: kgdb-bugreport@lists.sourceforge.net
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 4402c153 20-May-2010 Jason Wessel <jason.wessel@windriver.com>

kdb,debug_core: Allow the debug core to receive a panic notification

It is highly desirable to trap into kdb on panic. The debug core will
attempt to register as the first in line for the panic notifier.

CC: Ingo Molnar <mingo@elte.hu>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 6d906340 20-May-2010 Jason Wessel <jason.wessel@windriver.com>

debug_core,kdb: Allow the debug core to process a recursive debug entry

This allows kdb to debug a crash with in the kms code with a
single level recursive re-entry.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 1cee5e35 03-Jun-2009 Jason Wessel <jason.wessel@windriver.com>

kgdb: Add the ability to schedule a breakpoint via a tasklet

Some kgdb I/O modules require the ability to create a breakpoint
tasklet, such as kgdboc and external modules such as kgdboe. The
breakpoint tasklet is used as an asynchronous entry point into the
debugger which will have a different function scope than the current
execution path where it might not be safe to have an inline
breakpoint. This is true of some of the kgdb I/O drivers which share
code with kgdb and rest of the kernel users.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# f503b5ae 20-May-2010 Jason Wessel <jason.wessel@windriver.com>

x86,kgdb: Add low level debug hook

The only way the debugger can handle a trap in inside rcu_lock,
notify_die, or atomic_notifier_call_chain without a triple fault is
to have a low level "first opportunity handler" in the int3 exception
handler.

Generally this will be something the vast majority of folks will not
need, but for those who need it, it is added as a kernel .config
option called KGDB_LOW_LEVEL_TRAP.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: H. Peter Anvin <hpa@zytor.com>
CC: x86@kernel.org
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# 98ec1878 27-Apr-2009 Jason Wessel <jason.wessel@windriver.com>

kgdb: remove post_primary_code references

Remove all the references to the kgdb_post_primary_code. This
function serves no useful purpose because you can obtain the same
information from the "struct kgdb_state *ks" from with in the
debugger, if for some reason you want the data.

Also remove the unintentional duplicate assignment for ks->ex_vector.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# a0de055c 20-May-2010 Jason Wessel <jason.wessel@windriver.com>

kgdb: gdb "monitor" -> kdb passthrough

One of the driving forces behind integrating another front end (kdb)
to the debug core is to allow front end commands to be accessible via
gdb's monitor command. It is true that you could write gdb macros to
get certain data, but you may want to just use gdb to access the
commands that are available in the kdb front end.

This patch implements the Rcmd gdb stub packet. In gdb you access
this with the "monitor" command. For instance you could type "monitor
help", "monitor lsmod" or "monitor ps A" etc...

There is no error checking or command restrictions on what you can and
cannot access at this point. Doing something like trying to set
breakpoints with the monitor command is going to cause nothing but
problems. Perhaps in the future only the commands that are actually
known to work with the gdb monitor command will be available.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# f5316b4a 20-May-2010 Jason Wessel <jason.wessel@windriver.com>

kgdb,8250,pl011: Return immediately from console poll

The design of the kdb shell requires that every device that can
provide input to kdb have a polling routine that exits immediately if
there is no character available. This is required in order to get the
page scrolling mechanism working.

Changing the kernel debugger I/O API to require all polling character
routines to exit immediately if there is no data allows the kernel
debugger to process multiple input channels.

NO_POLL_CHAR will be the return code to the polling routine when ever
there is no character available.

CC: linux-serial@vger.kernel.org
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# dcc78711 20-May-2010 Jason Wessel <jason.wessel@windriver.com>

kgdb: core changes to support kdb

These are the minimum changes to the kgdb core in order to enable an
API to connect a new front end (kdb) to the debug core.

This patch introduces the dbg_kdb_mode variable controls where the
user level I/O is routed. It will be routed to the gdbstub (kgdb) or
to the kdb front end which is a simple shell available over the kgdboc
connection.

You can switch back and forth between kdb or the gdb stub mode of
operation dynamically. From gdb stub mode you can blindly type
"$3#33", or from the kdb mode you can enter "kgdb" to switch to the
gdb stub.

The logic in the debug core depends on kdb to look for the typical gdb
connection sequences and return immediately with KGDB_PASS_EVENT if a
gdb serial command sequence is detected. That should allow a
reasonably seamless transition between kdb -> gdb without leaving the
kernel exception state. The two gdb serial queries that kdb is
responsible for detecting are the "?" and "qSupported" packets.

CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Martin Hicks <mort@sgi.com>


# 53197fc4 02-Apr-2010 Jason Wessel <jason.wessel@windriver.com>

Separate the gdbstub from the debug core

Split the former kernel/kgdb.c into debug_core.c which contains the
kernel debugger exception logic and to the gdbstub.c which contains
the logic for allowing gdb to talk to the debug core.

This also created a private include file called debug_core.h which
contains all the definitions to glue the debug_core to any other
debugger connections.

CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>


# c4338209 19-May-2009 Jason Wessel <jason.wessel@windriver.com>

Move kernel/kgdb.c to kernel/debug/debug_core.c

Move kgdb.c in preparation to separate the gdbstub from the debug
core and exception handling.

CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>