History log of /linux-master/arch/arm64/kernel/entry-ftrace.S
Revision Date Author Comments
# 36469703 08-Apr-2023 Donglin Peng <pengdonglin@sangfor.com.cn>

arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL

The previous patch ("function_graph: Support recording and printing
the return value of function") has laid the groundwork for the for
the funcgraph-retval, and this modification makes it available on
the ARM64 platform.

We introduce a new structure called fgraph_ret_regs for the ARM64
platform to hold return registers and the frame pointer. We then
fill its content in the return_to_handler and pass its address to
the function ftrace_return_to_handler to record the return value.

Link: https://lkml.kernel.org/r/c78366416ce93f704ae7000c4ee60eb4258c38f7.1680954589.git.pengdonglin@sangfor.com.cn

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Donglin Peng <pengdonglin@sangfor.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>


# 2aa6ac03 05-Apr-2023 Florent Revest <revest@chromium.org>

arm64: ftrace: Add direct call support

This builds up on the CALL_OPS work which extends the ftrace patchsite
on arm64 with an ops pointer usable by the ftrace trampoline.

This ops pointer is valid at all time. Indeed, it is either pointing to
ftrace_list_ops or to the single ops which should be called from that
patchsite.

There are a few cases to distinguish:
- If a direct call ops is the only one tracing a function:
- If the direct called trampoline is within the reach of a BL
instruction
-> the ftrace patchsite jumps to the trampoline
- Else
-> the ftrace patchsite jumps to the ftrace_caller trampoline which
reads the ops pointer in the patchsite and jumps to the direct
call address stored in the ops
- Else
-> the ftrace patchsite jumps to the ftrace_caller trampoline and its
ops literal points to ftrace_list_ops so it iterates over all
registered ftrace ops, including the direct call ops and calls its
call_direct_funcs handler which stores the direct called
trampoline's address in the ftrace_regs and the ftrace_caller
trampoline will return to that address instead of returning to the
traced function

Signed-off-by: Florent Revest <revest@chromium.org>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230405180250.2046566-2-revest@chromium.org
Signed-off-by: Will Deacon <will@kernel.org>


# baaf553d 23-Jan-2023 Mark Rutland <mark.rutland@arm.com>

arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS

This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64.
This allows each ftrace callsite to provide an ftrace_ops to the common
ftrace trampoline, allowing each callsite to invoke distinct tracer
functions without the need to fall back to list processing or to
allocate custom trampolines for each callsite. This significantly speeds
up cases where multiple distinct trace functions are used and callsites
are mostly traced by a single tracer.

The main idea is to place a pointer to the ftrace_ops as a literal at a
fixed offset from the function entry point, which can be recovered by
the common ftrace trampoline. Using a 64-bit literal avoids branch range
limitations, and permits the ops to be swapped atomically without
special considerations that apply to code-patching. In future this will
also allow for the implementation of DYNAMIC_FTRACE_WITH_DIRECT_CALLS
without branch range limitations by using additional fields in struct
ftrace_ops.

As noted in the core patch adding support for
DYNAMIC_FTRACE_WITH_CALL_OPS, this approach allows for directly invoking
ftrace_ops::func even for ftrace_ops which are dynamically-allocated (or
part of a module), without going via ftrace_ops_list_func.

Currently, this approach is not compatible with CLANG_CFI, as the
presence/absence of pre-function NOPs changes the offset of the
pre-function type hash, and there's no existing mechanism to ensure a
consistent offset for instrumented and uninstrumented functions. When
CLANG_CFI is enabled, the existing scheme with a global ops->func
pointer is used, and there should be no functional change. I am
currently working with others to allow the two to work together in
future (though this will liekly require updated compiler support).

I've benchamrked this with the ftrace_ops sample module [1], which is
not currently upstream, but available at:

https://lore.kernel.org/lkml/20230103124912.2948963-1-mark.rutland@arm.com
git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git ftrace-ops-sample-20230109

Using that module I measured the total time taken for 100,000 calls to a
trivial instrumented function, with a number of tracers enabled with
relevant filters (which would apply to the instrumented function) and a
number of tracers enabled with irrelevant filters (which would not apply
to the instrumented function). I tested on an M1 MacBook Pro, running
under a HVF-accelerated QEMU VM (i.e. on real hardware).

Before this patch:

Number of tracers || Total time | Per-call average time (ns)
Relevant | Irrelevant || (ns) | Total | Overhead
=========+============++=============+==============+============
0 | 0 || 94,583 | 0.95 | -
0 | 1 || 93,709 | 0.94 | -
0 | 2 || 93,666 | 0.94 | -
0 | 10 || 93,709 | 0.94 | -
0 | 100 || 93,792 | 0.94 | -
---------+------------++-------------+--------------+------------
1 | 1 || 6,467,833 | 64.68 | 63.73
1 | 2 || 7,509,708 | 75.10 | 74.15
1 | 10 || 23,786,792 | 237.87 | 236.92
1 | 100 || 106,432,500 | 1,064.43 | 1063.38
---------+------------++-------------+--------------+------------
1 | 0 || 1,431,875 | 14.32 | 13.37
2 | 0 || 6,456,334 | 64.56 | 63.62
10 | 0 || 22,717,000 | 227.17 | 226.22
100 | 0 || 103,293,667 | 1032.94 | 1031.99
---------+------------++-------------+--------------+--------------

Note: per-call overhead is estimated relative to the baseline case
with 0 relevant tracers and 0 irrelevant tracers.

After this patch

Number of tracers || Total time | Per-call average time (ns)
Relevant | Irrelevant || (ns) | Total | Overhead
=========+============++=============+==============+============
0 | 0 || 94,541 | 0.95 | -
0 | 1 || 93,666 | 0.94 | -
0 | 2 || 93,709 | 0.94 | -
0 | 10 || 93,667 | 0.94 | -
0 | 100 || 93,792 | 0.94 | -
---------+------------++-------------+--------------+------------
1 | 1 || 281,000 | 2.81 | 1.86
1 | 2 || 281,042 | 2.81 | 1.87
1 | 10 || 280,958 | 2.81 | 1.86
1 | 100 || 281,250 | 2.81 | 1.87
---------+------------++-------------+--------------+------------
1 | 0 || 280,959 | 2.81 | 1.86
2 | 0 || 6,502,708 | 65.03 | 64.08
10 | 0 || 18,681,209 | 186.81 | 185.87
100 | 0 || 103,550,458 | 1,035.50 | 1034.56
---------+------------++-------------+--------------+------------

Note: per-call overhead is estimated relative to the baseline case
with 0 relevant tracers and 0 irrelevant tracers.

As can be seen from the above:

a) Whenever there is a single relevant tracer function associated with a
tracee, the overhead of invoking the tracer is constant, and does not
scale with the number of tracers which are *not* associated with that
tracee.

b) The overhead for a single relevant tracer has dropped to ~1/7 of the
overhead prior to this series (from 13.37ns to 1.86ns). This is
largely due to permitting calls to dynamically-allocated ftrace_ops
without going through ftrace_ops_list_func.

I've run the ftrace selftests from v6.2-rc3, which reports:

| # of passed: 110
| # of failed: 0
| # of unresolved: 3
| # of untested: 0
| # of unsupported: 0
| # of xfailed: 1
| # of undefined(test bug): 0

... where the unresolved entries were the tests for DIRECT functions
(which are not supported), and the checkbashisms selftest (which is
irrelevant here):

| [8] Test ftrace direct functions against tracers [UNRESOLVED]
| [9] Test ftrace direct functions against kprobes [UNRESOLVED]
| [62] Meta-selftest: Checkbashisms [UNRESOLVED]

... with all other tests passing (or failing as expected).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# cfce092d 22-Nov-2022 Mark Rutland <mark.rutland@arm.com>

ftrace: arm64: remove static ftrace

The build test robot pointer out that there's a build failure when:

CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=n

... due to some mismatched ifdeffery, some of which checks
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS, and some of which checks
CONFIG_DYNAMIC_FTRACE_WITH_ARGS, leading to some missing definitions expected
by the core code when CONFIG_DYNAMIC_FTRACE=n and consequently
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=n.

There's really not much point in supporting CONFIG_DYNAMIC_FTRACE=n (AKA
static ftrace). All supported toolchains allow us to implement
DYNAMIC_FTRACE, distributions all prefer DYNAMIC_FTRACE, and both
powerpc and s390 removed support for static ftrace in commits:

0c0c52306f4792a4 ("powerpc: Only support DYNAMIC_FTRACE not static")
5d6a0163494c78ad ("s390/ftrace: enforce DYNAMIC_FTRACE if FUNCTION_TRACER is selected")

... and according to Steven, static ftrace is only supported on x86 to
allow testing that the core code still functions in this configuration.

Given that, let's simplify matters by removing arm64's support for
static ftrace. This avoids the problem originally reported, and leaves
us with less code to maintain.

Fixes: 26299b3f6ba2 ("ftrace: arm64: move from REGS to ARGS")
Link: https://lore.kernel.org/r/202211212249.livTPi3Y-lkp@intel.com
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221122163624.1225912-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>


# 26299b3f 03-Nov-2022 Mark Rutland <mark.rutland@arm.com>

ftrace: arm64: move from REGS to ARGS

This commit replaces arm64's support for FTRACE_WITH_REGS with support
for FTRACE_WITH_ARGS. This removes some overhead and complexity, and
removes some latent issues with inconsistent presentation of struct
pt_regs (which can only be reliably saved/restored at exception
boundaries).

FTRACE_WITH_REGS has been supported on arm64 since commit:

3b23e4991fb66f6d ("arm64: implement ftrace with regs")

As noted in the commit message, the major reasons for implementing
FTRACE_WITH_REGS were:

(1) To make it possible to use the ftrace graph tracer with pointer
authentication, where it's necessary to snapshot/manipulate the LR
before it is signed by the instrumented function.

(2) To make it possible to implement LIVEPATCH in future, where we need
to hook function entry before an instrumented function manipulates
the stack or argument registers. Practically speaking, we need to
preserve the argument/return registers, PC, LR, and SP.

Neither of these need a struct pt_regs, and only require the set of
registers which are live at function call/return boundaries. Our calling
convention is defined by "Procedure Call Standard for the ArmĀ® 64-bit
Architecture (AArch64)" (AKA "AAPCS64"), which can currently be found
at:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst

Per AAPCS64, all function call argument and return values are held in
the following GPRs:

* X0 - X7 : parameter / result registers
* X8 : indirect result location register
* SP : stack pointer (AKA SP)

Additionally, ad function call boundaries, the following GPRs hold
context/return information:

* X29 : frame pointer (AKA FP)
* X30 : link register (AKA LR)

... and for ftrace we need to capture the instrumented address:

* PC : program counter

No other GPRs are relevant, as none of the other arguments hold
parameters or return values:

* X9 - X17 : temporaries, may be clobbered
* X18 : shadow call stack pointer (or temorary)
* X19 - X28 : callee saved

This patch implements FTRACE_WITH_ARGS for arm64, only saving/restoring
the minimal set of registers necessary. This is always sufficient to
manipulate control flow (e.g. for live-patching) or to manipulate
function arguments and return values.

This reduces the necessary stack usage from 336 bytes for pt_regs down
to 112 bytes for ftrace_regs + 32 bytes for two frame records, freeing
up 188 bytes. This could be reduced further with changes to the
unwinder.

As there is no longer a need to save different sets of registers for
different features, we no longer need distinct `ftrace_caller` and
`ftrace_regs_caller` trampolines. This allows the trampoline assembly to
be simpler, and simplifies code which previously had to handle the two
trampolines.

I've tested this with the ftrace selftests, where there are no
unexpected failures.

Co-developed-by: Florent Revest <revest@chromium.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Florent Revest <revest@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221103170520.931305-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>


# 2598ac6e 09-Nov-2022 Sami Tolvanen <samitolvanen@google.com>

arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER

The 0-day bot reports that arm64 builds with CONFIG_CFI_CLANG +
CONFIG_FTRACE are broken when CONFIG_FUNCTION_GRAPH_TRACER is not
enabled:

ld.lld: error: undefined symbol: __kcfi_typeid_ftrace_stub_graph
>>> referenced by entry-ftrace.S:299 (arch/arm64/kernel/entry-ftrace.S:299)
>>> arch/arm64/kernel/entry-ftrace.o:(.text+0x48) in archive vmlinux.a

This is caused by ftrace_stub_graph using SYM_TYPE_FUNC_START when
the address of the function is not taken in any C translation unit.

Fix the build by only defining ftrace_stub_graph when it's actually
needed, i.e. with CONFIG_FUNCTION_GRAPH_TRACER.

Link: https://lore.kernel.org/lkml/202210251659.tRMs78RH-lkp@intel.com/
Fixes: 883bbbffa5a4 ("ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20221109192831.3057131-1-samitolvanen@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 883bbbff 18-Oct-2022 Peter Zijlstra <peterz@infradead.org>

ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()

Different function signatures means they needs to be different
functions; otherwise CFI gets upset.

As triggered by the ftrace boot tests:

[] CFI failure at ftrace_return_to_handler+0xac/0x16c (target: ftrace_stub+0x0/0x14; expected type: 0x0a5d5347)

Fixes: 3c516f89e17e ("x86: Add support for CONFIG_CFI_CLANG")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/Y06dg4e1xF6JTdQq@hirez.programming.kicks-ass.net


# 0d8116cc 14-Jun-2022 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: remove redundant label

Since commit:

c4a0ebf87cebbfa2 ("arm64/ftrace: Make function graph use ftrace directly")

The 'ftrace_common_return' label has been unused.

Remove it.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Chengming Zhou <zhouchengming@bytedance.com>
Cc: Will Deacon <will@kernel.org>
Tested-by: "Ivan T. Ivanov" <iivanov@suse.de>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220614080944.1349146-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# c4a0ebf8 20-Apr-2022 Chengming Zhou <zhouchengming@bytedance.com>

arm64/ftrace: Make function graph use ftrace directly

As we do in commit 0c0593b45c9b ("x86/ftrace: Make function graph
use ftrace directly"), we don't need special hook for graph tracer,
but instead we use graph_ops:func function to install return_hooker.

Since commit 3b23e4991fb6 ("arm64: implement ftrace with regs") add
implementation for FTRACE_WITH_REGS on arm64, we can easily adopt
the same cleanup on arm64.

And this cleanup only changes the FTRACE_WITH_REGS implementation,
so the mcount-based implementation is unaffected.

While in theory it would be possible to make a similar cleanup for
!FTRACE_WITH_REGS, this will require rework of the core code, and
so for now we only change the FTRACE_WITH_REGS implementation.

Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Link: https://lore.kernel.org/r/20220420160006.17880-2-zhouchengming@bytedance.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 742a15b1 14-Dec-2021 Mark Brown <broonie@kernel.org>

arm64: Use BTI C directly and unconditionally

Now we have a macro for BTI C that looks like a regular instruction change
all the users of the current BTI_C macro to just emit a BTI C directly and
remove the macro.

This does mean that we now unconditionally BTI annotate all assembly
functions, meaning that they are worse in this respect than code generated
by the compiler. The overhead should be minimal for implementations with a
reasonable HINT implementation.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20211214152714.2380849-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 35b6b28e 29-Nov-2021 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: add missing BTIs

When branch target identifiers are in use, code reachable via an
indirect branch requires a BTI landing pad at the branch target site.

When building FTRACE_WITH_REGS atop patchable-function-entry, we miss
BTIs at the start start of the `ftrace_caller` and `ftrace_regs_caller`
trampolines, and when these are called from a module via a PLT (which
will use a `BR X16`), we will encounter a BTI failure, e.g.

| # insmod lkdtm.ko
| lkdtm: No crash points registered, enable through debugfs
| # echo function_graph > /sys/kernel/debug/tracing/current_tracer
| # cat /sys/kernel/debug/provoke-crash/DIRECT
| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x34000001 -- BTI
| CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3
| Hardware name: linux,dummy-virt (DT)
| pstate: 60400405 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=jc)
| pc : ftrace_caller+0x0/0x3c
| lr : lkdtm_debugfs_open+0xc/0x20 [lkdtm]
| sp : ffff800012e43b00
| x29: ffff800012e43b00 x28: 0000000000000000 x27: ffff800012e43c88
| x26: 0000000000000000 x25: 0000000000000000 x24: ffff0000c171f200
| x23: ffff0000c27b1e00 x22: ffff0000c2265240 x21: ffff0000c23c8c30
| x20: ffff8000090ba380 x19: 0000000000000000 x18: 0000000000000000
| x17: 0000000000000000 x16: ffff80001002bb4c x15: 0000000000000000
| x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000900ff0
| x11: ffff0000c4166310 x10: ffff800012e43b00 x9 : ffff8000104f2384
| x8 : 0000000000000001 x7 : 0000000000000000 x6 : 000000000000003f
| x5 : 0000000000000040 x4 : ffff800012e43af0 x3 : 0000000000000001
| x2 : ffff8000090b0000 x1 : ffff0000c171f200 x0 : ffff0000c23c8c30
| Kernel panic - not syncing: Unhandled exception
| CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace+0x0/0x1a4
| show_stack+0x24/0x30
| dump_stack_lvl+0x68/0x84
| dump_stack+0x1c/0x38
| panic+0x168/0x360
| arm64_exit_nmi.isra.0+0x0/0x80
| el1h_64_sync_handler+0x68/0xd4
| el1h_64_sync+0x78/0x7c
| ftrace_caller+0x0/0x3c
| do_dentry_open+0x134/0x3b0
| vfs_open+0x38/0x44
| path_openat+0x89c/0xe40
| do_filp_open+0x8c/0x13c
| do_sys_openat2+0xbc/0x174
| __arm64_sys_openat+0x6c/0xbc
| invoke_syscall+0x50/0x120
| el0_svc_common.constprop.0+0xdc/0x100
| do_el0_svc+0x84/0xa0
| el0_svc+0x28/0x80
| el0t_64_sync_handler+0xa8/0x130
| el0t_64_sync+0x1a0/0x1a4
| SMP: stopping secondary CPUs
| Kernel Offset: disabled
| CPU features: 0x0,00000f42,da660c5f
| Memory Limit: none
| ---[ end Kernel panic - not syncing: Unhandled exception ]---

Fix this by adding the required `BTI C`, as we only require these to be
reachable via BL for direct calls or BR X16/X17 for PLTs. For now, these
are open-coded in the function prologue, matching the style of the
`__hwasan_tag_mismatch` trampoline.

In future we may wish to consider adding a new SYM_CODE_START_*()
variant which has an implicit BTI.

When ftrace is built atop mcount, the trampolines are marked with
SYM_FUNC_START(), and so get an implicit BTI. We may need to change
these over to SYM_CODE_START() in future for RELIABLE_STACKTRACE, in
case we need to apply special care aroud the return address being
rewritten.

Fixes: 97fed779f2a6 ("arm64: bti: Provide Kconfig for kernel mode BTI")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211129135709.2274019-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>


# 71e70184 11-Jan-2021 Jianlin Lv <Jianlin.Lv@arm.com>

arm64: rename S_FRAME_SIZE to PT_REGS_SIZE

S_FRAME_SIZE is the size of the pt_regs structure, no longer the size of
the kernel stack frame, the name is misleading. In keeping with arm32,
rename S_FRAME_SIZE to PT_REGS_SIZE.

Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210112015813.2340969-1-Jianlin.Lv@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 258c3d62 18-May-2020 Will Deacon <will@kernel.org>

arm64: entry-ftrace.S: Update comment to indicate that x18 is live

The Shadow Call Stack pointer is held in x18, so update the ftrace
entry comment to indicate that it cannot be safely clobbered.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>


# 69d113b5 10-Mar-2020 Kunihiko Hayashi <hayashi.kunihiko@socionext.com>

arm64: entry-ftrace.S: Fix missing argument for CONFIG_FUNCTION_GRAPH_TRACER=y

Missing argument of another SYM_INNER_LABEL() breaks build for
CONFIG_FUNCTION_GRAPH_TRACER=y.

Fixes: e2d591d29d44 ("arm64: entry-ftrace.S: Convert to modern annotations for assembly functions")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Brown <broonie@kernel.org>


# 1e4729ed 18-Feb-2020 Mark Brown <broonie@kernel.org>

arm64: ftrace: Modernise annotation of return_to_handler

In an effort to clarify and simplify the annotation of assembly
functions new macros have been introduced. These replace ENTRY and
ENDPROC with two different annotations for normal functions and those
with unusual calling conventions.

return_to_handler does entertaining things with LR so doesn't follow the
usual C conventions and should therefore be annotated as code rather than
a function.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# e434b08b 18-Feb-2020 Mark Brown <broonie@kernel.org>

arm64: ftrace: Correct annotation of ftrace_caller assembly

In an effort to clarify and simplify the annotation of assembly
functions new macros have been introduced. These replace ENTRY and
ENDPROC with two different annotations for normal functions and those
with unusual calling conventions.

The patchable function entry versions of ftrace_*_caller don't follow the
usual AAPCS rules, pushing things onto the stack which they don't clean up,
and therefore should be annotated as code rather than functions.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# e2d591d2 18-Feb-2020 Mark Brown <broonie@kernel.org>

arm64: entry-ftrace.S: Convert to modern annotations for assembly functions

In an effort to clarify and simplify the annotation of assembly functions
in the kernel new macros have been introduced. These replace ENTRY and
ENDPROC and also add a new annotation for static functions which previously
had no ENTRY equivalent. Update the annotations in the core kernel code to
the new macros.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 70927d02 06-Dec-2019 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: fix ifdeffery

When I tweaked the ftrace entry assembly in commit:

3b23e4991fb66f6d ("arm64: implement ftrace with regs")

... my ifdeffery tweaks left ftrace_graph_caller undefined for
CONFIG_DYNAMIC_FTRACE && CONFIG_FUNCTION_GRAPH_TRACER when ftrace is
based on mcount.

The kbuild test robot reported that this issue is detected at link time:

| arch/arm64/kernel/entry-ftrace.o: In function `skip_ftrace_call':
| arch/arm64/kernel/entry-ftrace.S:238: undefined reference to `ftrace_graph_caller'
| arch/arm64/kernel/entry-ftrace.S:238:(.text+0x3c): relocation truncated to fit: R_AARCH64_CONDBR19 against undefined symbol
| `ftrace_graph_caller'
| arch/arm64/kernel/entry-ftrace.S:243: undefined reference to `ftrace_graph_caller'
| arch/arm64/kernel/entry-ftrace.S:243:(.text+0x54): relocation truncated to fit: R_AARCH64_CONDBR19 against undefined symbol
| `ftrace_graph_caller'

This patch fixes the ifdeffery so that the mcount version of
ftrace_graph_caller doesn't depend on CONFIG_DYNAMIC_FTRACE. At the same
time, a redundant #else is removed from the ifdeffery for the
patchable-function-entry version of ftrace_graph_caller.

Fixes: 3b23e4991fb66f6d ("arm64: implement ftrace with regs")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Torsten Duwe <duwe@lst.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 3b23e499 08-Feb-2019 Torsten Duwe <duwe@lst.de>

arm64: implement ftrace with regs

This patch implements FTRACE_WITH_REGS for arm64, which allows a traced
function's arguments (and some other registers) to be captured into a
struct pt_regs, allowing these to be inspected and/or modified. This is
a building block for live-patching, where a function's arguments may be
forwarded to another function. This is also necessary to enable ftrace
and in-kernel pointer authentication at the same time, as it allows the
LR value to be captured and adjusted prior to signing.

Using GCC's -fpatchable-function-entry=N option, we can have the
compiler insert a configurable number of NOPs between the function entry
point and the usual prologue. This also ensures functions are AAPCS
compliant (e.g. disabling inter-procedural register allocation).

For example, with -fpatchable-function-entry=2, GCC 8.1.0 compiles the
following:

| unsigned long bar(void);
|
| unsigned long foo(void)
| {
| return bar() + 1;
| }

... to:

| <foo>:
| nop
| nop
| stp x29, x30, [sp, #-16]!
| mov x29, sp
| bl 0 <bar>
| add x0, x0, #0x1
| ldp x29, x30, [sp], #16
| ret

This patch builds the kernel with -fpatchable-function-entry=2,
prefixing each function with two NOPs. To trace a function, we replace
these NOPs with a sequence that saves the LR into a GPR, then calls an
ftrace entry assembly function which saves this and other relevant
registers:

| mov x9, x30
| bl <ftrace-entry>

Since patchable functions are AAPCS compliant (and the kernel does not
use x18 as a platform register), x9-x18 can be safely clobbered in the
patched sequence and the ftrace entry code.

There are now two ftrace entry functions, ftrace_regs_entry (which saves
all GPRs), and ftrace_entry (which saves the bare minimum). A PLT is
allocated for each within modules.

Signed-off-by: Torsten Duwe <duwe@suse.de>
[Mark: rework asm, comments, PLTs, initialization, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Julien Thierry <jthierry@redhat.com>
Cc: Will Deacon <will@kernel.org>


# d2912cb1 04-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# dbd31962 07-Dec-2018 Mark Rutland <mark.rutland@arm.com>

arm64: frace: use asm EXPORT_SYMBOL()

For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the ftrace exports
to the assembly files the functions are defined in.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 7dc48bf9 15-Nov-2018 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: always pass instrumented pc in x0

The core ftrace hooks take the instrumented PC in x0, but for some
reason arm64's prepare_ftrace_return() takes this in x1.

For consistency, let's flip the argument order and always pass the
instrumented PC in x0.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Torsten Duwe <duwe@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 49e258e0 15-Nov-2018 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: remove return_regs macros

The save_return_regs and restore_return_regs macros are only used by
return_to_handler, and having them defined out-of-line only serves to
obscure the logic.

Before we complicate, let's clean this up and fold the logic directly
into return_to_handler, saving a few lines of macro boilerplate in the
process. At the same time, a missing trailing space is added to the
comments, fixing a code style violation.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Torsten Duwe <duwe@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 6e803e2e 15-Nov-2018 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: don't adjust the LR value

The core ftrace code requires that when it is handed the PC of an
instrumented function, this PC is the address of the instrumented
instruction. This is necessary so that the core ftrace code can identify
the specific instrumentation site. Since the instrumented function will
be a BL, the address of the instrumented function is LR - 4 at entry to
the ftrace code.

This fixup is applied in the mcount_get_pc and mcount_get_pc0 helpers,
which acquire the PC of the instrumented function.

The mcount_get_lr helper is used to acquire the LR of the instrumented
function, whose value does not require this adjustment, and cannot be
adjusted to anything meaningful. No adjustment of this value is made on
other architectures, including arm. However, arm64 adjusts this value by
4.

This patch brings arm64 in line with other architectures and removes the
adjustment of the LR value.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Torsten Duwe <duwe@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 5c176aff 15-Nov-2018 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: enable graph FP test

The core frace code has an optional sanity check on the frame pointer
passed by ftrace_graph_caller and return_to_handler. This is cheap,
useful, and enabled unconditionally on x86, sparc, and riscv.

Let's do the same on arm64, so that we can catch any problems early.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Torsten Duwe <duwe@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# e4fe1966 15-Nov-2018 Mark Rutland <mark.rutland@arm.com>

arm64: ftrace: use GLOBAL()

The global exports of ftrace_call and ftrace_graph_call are somewhat
painful to read. Let's use the generic GLOBAL() macro to ameliorate
matters.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Torsten Duwe <duwe@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# d125bffc 03-Nov-2017 Julien Thierry <julien.thierry.kdev@gmail.com>

arm64: Fix static use of function graph

Function graph does not work currently when CONFIG_DYNAMIC_TRACE is not
set. This is because ftrace_function_trace is not always set to ftrace_stub
when function_graph is in use.

Do not skip checking of graph tracer functions when ftrace_function_trace
is set.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# f705d954 14-Feb-2017 Arnd Bergmann <arnd@arndb.de>

arm64: include asm/assembler.h in entry-ftrace.S

In a randconfig build I ran into this build error:

arch/arm64/kernel/entry-ftrace.S: Assembler messages:
arch/arm64/kernel/entry-ftrace.S:101: Error: unknown mnemonic `ldr_l' -- `ldr_l x2,ftrace_trace_function'

The macro is defined in asm/assembler.h, so we should include that file.

Fixes: 829d2bd13392 ("arm64: entry-ftrace.S: avoid open-coded {adr,ldr}_l")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 829d2bd1 17-Jan-2017 Mark Rutland <mark.rutland@arm.com>

arm64: entry-ftrace.S: avoid open-coded {adr,ldr}_l

Some places in the kernel open-code sequences using ADRP for a symbol
another instruction using a :lo12: relocation for that same symbol.
These sequences are easy to get wrong, and more painful to read than is
necessary. For these reasons, it is preferable to use the
{adr,ldr,str}_l macros for these cases.

This patch makes use of these in entry-ftrace.S, removing open-coded
sequences using adrp. This results in a minor code change, since a
temporary register is not used when generating the address for some
symbols, but this is fine, as the value of the temporary register is not
used elsewhere.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# e4a744ef 19-Aug-2016 Josh Poimboeuf <jpoimboe@redhat.com>

ftrace: Remove CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST from config

Make HAVE_FUNCTION_GRAPH_FP_TEST a normal define, independent from
kconfig. This removes some config file pollution and simplifies the
checking for the fp test.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2c4e5f05054d6d367f702fd153af7a0109dd5c81.1471607358.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# ee556d00 29-Sep-2015 Li Bin <huawei.libin@huawei.com>

arm64: ftrace: fix function_graph tracer panic

When function graph tracer is enabled, the following operation
will trigger panic:

mount -t debugfs nodev /sys/kernel
echo next_tgid > /sys/kernel/tracing/set_ftrace_filter
echo function_graph > /sys/kernel/tracing/current_tracer
ls /proc/

------------[ cut here ]------------
[ 198.501417] Unable to handle kernel paging request at virtual address cb88537fdc8ba316
[ 198.506126] pgd = ffffffc008f79000
[ 198.509363] [cb88537fdc8ba316] *pgd=00000000488c6003, *pud=00000000488c6003, *pmd=0000000000000000
[ 198.517726] Internal error: Oops: 94000005 [#1] SMP
[ 198.518798] Modules linked in:
[ 198.520582] CPU: 1 PID: 1388 Comm: ls Tainted: G
[ 198.521800] Hardware name: linux,dummy-virt (DT)
[ 198.522852] task: ffffffc0fa9e8000 ti: ffffffc0f9ab0000 task.ti: ffffffc0f9ab0000
[ 198.524306] PC is at next_tgid+0x30/0x100
[ 198.525205] LR is at return_to_handler+0x0/0x20
[ 198.526090] pc : [<ffffffc0002a1070>] lr : [<ffffffc0000907c0>] pstate: 60000145
[ 198.527392] sp : ffffffc0f9ab3d40
[ 198.528084] x29: ffffffc0f9ab3d40 x28: ffffffc0f9ab0000
[ 198.529406] x27: ffffffc000d6a000 x26: ffffffc000b786e8
[ 198.530659] x25: ffffffc0002a1900 x24: ffffffc0faf16c00
[ 198.531942] x23: ffffffc0f9ab3ea0 x22: 0000000000000002
[ 198.533202] x21: ffffffc000d85050 x20: 0000000000000002
[ 198.534446] x19: 0000000000000002 x18: 0000000000000000
[ 198.535719] x17: 000000000049fa08 x16: ffffffc000242efc
[ 198.537030] x15: 0000007fa472b54c x14: ffffffffff000000
[ 198.538347] x13: ffffffc0fada84a0 x12: 0000000000000001
[ 198.539634] x11: ffffffc0f9ab3d70 x10: ffffffc0f9ab3d70
[ 198.540915] x9 : ffffffc0000907c0 x8 : ffffffc0f9ab3d40
[ 198.542215] x7 : 0000002e330f08f0 x6 : 0000000000000015
[ 198.543508] x5 : 0000000000000f08 x4 : ffffffc0f9835ec0
[ 198.544792] x3 : cb88537fdc8ba316 x2 : cb88537fdc8ba306
[ 198.546108] x1 : 0000000000000002 x0 : ffffffc000d85050
[ 198.547432]
[ 198.547920] Process ls (pid: 1388, stack limit = 0xffffffc0f9ab0020)
[ 198.549170] Stack: (0xffffffc0f9ab3d40 to 0xffffffc0f9ab4000)
[ 198.582568] Call trace:
[ 198.583313] [<ffffffc0002a1070>] next_tgid+0x30/0x100
[ 198.584359] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[ 198.585503] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[ 198.586574] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[ 198.587660] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[ 198.588896] Code: aa0003f5 2a0103f4 b4000102 91004043 (885f7c60)
[ 198.591092] ---[ end trace 6a346f8f20949ac8 ]---

This is because when using function graph tracer, if the traced
function return value is in multi regs ([x0-x7]), return_to_handler
may corrupt them. So in return_to_handler, the parameter regs should
be protected properly.

Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Acked-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# f1ba46ee 07-Nov-2014 Ard Biesheuvel <ardb@kernel.org>

arm64: ftrace: eliminate literal pool entries

Replace ldr xN, =<symbol> with adrp/add or adrp/ldr [as appropriate]
in the implementation of _mcount(), which may be called very often.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# ac694fda 25-Jun-2014 Steven Rostedt (Red Hat) <rostedt@goodmis.org>

arm64, ftrace: Remove check of obsolete variable function_trace_stop

Nothing sets function_trace_stop to disable function tracing anymore.
Remove the check for it in the arch code.

arm64 was broken anyway, as it had an ifdef testing
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST which is only set if
the arch supports the code (which it obviously did not), and
it was testing a non existent ftrace_trace_stop instead of
function_trace_stop.

Link: http://lkml.kernel.org/r/20140627124421.GP26276@arm.com

Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


# a46ec3a1 11-Jun-2014 Paul Bolle <pebolle@tiscali.nl>

arm64: ftrace: Fix comment typo 'CONFIG_FUNCTION_GRAPH_FP_TEST'

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# bd7d38db 30-Apr-2014 AKASHI Takahiro <takahiro.akashi@linaro.org>

arm64: ftrace: Add dynamic ftrace support

This patch allows "dynamic ftrace" if CONFIG_DYNAMIC_FTRACE is enabled.
Here we can turn on and off tracing dynamically per-function base.

On arm64, this is done by patching single branch instruction to _mcount()
inserted by gcc -pg option. The branch is replaced to NOP initially at
kernel start up, and later on, NOP to branch to ftrace_caller() when
enabled or branch to NOP when disabled.
Please note that ftrace_caller() is a counterpart of _mcount() in case of
'static' ftrace.

More details on architecture specific requirements are described in
Documentation/trace/ftrace-design.txt.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>


# 819e50e2 30-Apr-2014 AKASHI Takahiro <takahiro.akashi@linaro.org>

arm64: Add ftrace support

This patch implements arm64 specific part to support function tracers,
such as function (CONFIG_FUNCTION_TRACER), function_graph
(CONFIG_FUNCTION_GRAPH_TRACER) and function profiler
(CONFIG_FUNCTION_PROFILER).

With 'function' tracer, all the functions in the kernel are traced with
timestamps in ${sysfs}/tracing/trace. If function_graph tracer is
specified, call graph is generated.

The kernel must be compiled with -pg option so that _mcount() is inserted
at the beginning of functions. This function is called on every function's
entry as long as tracing is enabled.
In addition, function_graph tracer also needs to be able to probe function's
exit. ftrace_graph_caller() & return_to_handler do this by faking link
register's value to intercept function's return path.

More details on architecture specific requirements are described in
Documentation/trace/ftrace-design.txt.

Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>