#
1b273124 |
|
28-Feb-2024 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracepoints: Use WARN() and not WARN_ON() for warnings There are two WARN_ON*() warnings in tracepoint.h that deal with RCU usage. But when they trigger, especially from using a TRACE_EVENT() macro, the information is not very helpful and is confusing: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at include/trace/events/lock.h:24 lock_acquire+0x2b2/0x2d0 Where the above warning takes you to: TRACE_EVENT(lock_acquire, <<<--- line 24 in lock.h TP_PROTO(struct lockdep_map *lock, unsigned int subclass, int trylock, int read, int check, struct lockdep_map *next_lock, unsigned long ip), [..] Change the WARN_ON_ONCE() to WARN_ONCE() and add a string that allows someone to search for exactly where the bug happened. Link: https://lore.kernel.org/linux-trace-kernel/20240228133112.0d64fb1b@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: Borislav Petkov <bp@alien8.de> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
#
e2d0d7b2 |
|
06-Jun-2023 |
Masami Hiramatsu (Google) <mhiramat@kernel.org> |
tracing/probes: Add tracepoint support on fprobe_events Allow fprobe_events to trace raw tracepoints so that user can trace tracepoints which don't have traceevent wrappers. This new event is always available if the fprobe_events is enabled (thus no kconfig), because the fprobe_events depends on the trace-event and traceporint. e.g. # echo 't sched_overutilized_tp' >> dynamic_events # echo 't 9p_client_req' >> dynamic_events # cat dynamic_events t:tracepoints/sched_overutilized_tp sched_overutilized_tp t:tracepoints/_9p_client_req 9p_client_req The event name is based on the tracepoint name, but if it is started with digit character, an underscore '_' will be added. NOTE: to avoid further confusion, this renames TPARG_FL_TPOINT to TPARG_FL_TEVENT because this flag is used for eprobe (trace-event probe). And reuse TPARG_FL_TPOINT for this raw tracepoint probe. Link: https://lore.kernel.org/all/168507471874.913472.17214624519622959593.stgit@mhiramat.roam.corp.google.com/ Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202305020453.afTJ3VVp-lkp@intel.com/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
#
c2679254 |
|
10-Mar-2023 |
Steven Rostedt (Google) <rostedt@goodmis.org> |
tracing: Make tracepoint lockdep check actually test something A while ago where the trace events had the following: rcu_read_lock_sched_notrace(); rcu_dereference_sched(...); rcu_read_unlock_sched_notrace(); If the tracepoint is enabled, it could trigger RCU issues if called in the wrong place. And this warning was only triggered if lockdep was enabled. If the tracepoint was never enabled with lockdep, the bug would not be caught. To handle this, the above sequence was done when lockdep was enabled regardless if the tracepoint was enabled or not (although the always enabled code really didn't do anything, it would still trigger a warning). But a lot has changed since that lockdep code was added. One is, that sequence no longer triggers any warning. Another is, the tracepoint when enabled doesn't even do that sequence anymore. The main check we care about today is whether RCU is "watching" or not. So if lockdep is enabled, always check if rcu_is_watching() which will trigger a warning if it is not (tracepoints require RCU to be watching). Note, that old sequence did add a bit of overhead when lockdep was enabled, and with the latest kernel updates, would cause the system to slow down enough to trigger kernel "stalled" warnings. Link: http://lore.kernel.org/lkml/20140806181801.GA4605@redhat.com Link: http://lore.kernel.org/lkml/20140807175204.C257CAC5@viggo.jf.intel.com Link: https://lore.kernel.org/lkml/20230307184645.521db5c9@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20230310172856.77406446@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
#
2455f0e1 |
|
15-Feb-2023 |
Ross Zwisler <zwisler@chromium.org> |
tracing: Always use canonical ftrace path The canonical location for the tracefs filesystem is at /sys/kernel/tracing. But, from Documentation/trace/ftrace.rst: Before 4.1, all ftrace tracing control files were within the debugfs file system, which is typically located at /sys/kernel/debug/tracing. For backward compatibility, when mounting the debugfs file system, the tracefs file system will be automatically mounted at: /sys/kernel/debug/tracing Many comments and Kconfig help messages in the tracing code still refer to this older debugfs path, so let's update them to avoid confusion. Link: https://lore.kernel.org/linux-trace-kernel/20230215223350.2658616-2-zwisler@google.com Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
#
408b9611 |
|
12-Jan-2023 |
Peter Zijlstra <peterz@infradead.org> |
tracing: WARN on rcuidle ARCH_WANTS_NO_INSTR (a superset of CONFIG_GENERIC_ENTRY) disallows any and all tracing when RCU isn't enabled. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195541.416110581@infradead.org
|
#
170ab26b |
|
21-Jul-2022 |
Li zeming <zeming@nfschina.com> |
tracepoints: It is CONFIG_TRACEPOINTS not CONFIG_TRACEPOINT When reading this note, CONFIG_TRACEPOINT searches my configuration file, and the result is CONFIG_TRACEPOINTS, the search results are consistent with the following macro definitions. I think it should be repaired. Link: https://lkml.kernel.org/r/20220721081904.3798-1-zeming@nfschina.com Signed-off-by: Li zeming <zeming@nfschina.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
#
6f0e6c15 |
|
08-Jun-2022 |
Frederic Weisbecker <frederic@kernel.org> |
context_tracking: Take IRQ eqs entrypoints over RCU The RCU dynticks counter is going to be merged into the context tracking subsystem. Prepare with moving the IRQ extended quiescent states entrypoints to context tracking. For now those are dumb redirection to existing RCU calls. [ paulmck: Apply Stephen Rothwell feedback from -next. ] [ paulmck: Apply Nathan Chancellor feedback. ] Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Yu Liao <liaoyu15@huawei.com> Cc: Phil Auld <pauld@redhat.com> Cc: Paul Gortmaker<paul.gortmaker@windriver.com> Cc: Alex Belits <abelits@marvell.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
|
#
c3b1c377 |
|
02-Aug-2021 |
Huang Shijie <shijie@os.amperecomputing.com> |
tracing: Fix a typo in tracepoint.h It should be @prev_pid, not @prev_prid. Link: https://lkml.kernel.org/r/20210802140234.5383-1-shijie@os.amperecomputing.com Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
9913d574 |
|
29-Jun-2021 |
Steven Rostedt (VMware) <rostedt@goodmis.org> |
tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing All internal use cases for tracepoint_probe_register() is set to not ever be called with the same function and data. If it is, it is considered a bug, as that means the accounting of handling tracepoints is corrupted. If the function and data for a tracepoint is already registered when tracepoint_probe_register() is called, it will call WARN_ON_ONCE() and return with EEXISTS. The BPF system call can end up calling tracepoint_probe_register() with the same data, which now means that this can trigger the warning because of a user space process. As WARN_ON_ONCE() should not be called because user space called a system call with bad data, there needs to be a way to register a tracepoint without triggering a warning. Enter tracepoint_probe_register_may_exist(), which can be called, but will not cause a WARN_ON() if the probe already exists. It will still error out with EEXIST, which will then be sent to the user space that performed the BPF system call. This keeps the previous testing for issues with other users of the tracepoint code, while letting BPF call it with duplicated data and not warn about it. Link: https://lore.kernel.org/lkml/20210626135845.4080-1-penguin-kernel@I-love.SAKURA.ne.jp/ Link: https://syzkaller.appspot.com/bug?id=41f4318cf01762389f4d1c1c459da4f542fe5153 Cc: stable@vger.kernel.org Fixes: c4f6699dfcb85 ("bpf: introduce BPF_RAW_TRACEPOINT") Reported-by: syzbot <syzbot+721aa903751db87aa244@syzkaller.appspotmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot+721aa903751db87aa244@syzkaller.appspotmail.com Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
f2cc020d |
|
23-Mar-2021 |
Ingo Molnar <mingo@kernel.org> |
tracing: Fix various typos in comments Fix ~59 single-word typos in the tracing code comments, and fix the grammar in a handful of places. Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
7211f0a2 |
|
04-Feb-2021 |
Steven Rostedt (VMware) <rostedt@goodmis.org> |
tracepoints: Code clean up Restructure the code a bit to make it simpler, fix some formatting problems and add READ_ONCE/WRITE_ONCE to make sure there's no compiler load/store tearing to the variables that can be accessed across CPUs. Started with Mathieu Desnoyers's patch: Link: https://lore.kernel.org/lkml/20210203175741.20665-1-mathieu.desnoyers@efficios.com/ And will keep his signature, but I will take the responsibility of this being correct, and keep the authorship. Link: https://lkml.kernel.org/r/20210204143004.61126582@gandalf.local.home Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
d9a1be1b |
|
08-Feb-2021 |
Steven Rostedt (VMware) <rostedt@goodmis.org> |
tracepoints: Do not punish non static call users With static calls, a tracepoint can call the callback directly if there is only one callback registered to that tracepoint. When there is more than one, the static call will call the tracepoint's "iterator" function, which needs to reload the tracepoint's "funcs" array again, as it could have changed since the first time it was loaded. But an arch without static calls is punished by having to load the tracepoint's "funcs" array twice. Once in the DO_TRACE macro, and once again in the iterator macro. For archs without static calls, there's no reason to load the array macro in the first place, since the iterator function will do it anyway. Change the __DO_TRACE_CALL() macro to do the load and call of the tracepoints funcs array only for architectures with static calls, and just call the iterator function directly for architectures without static calls. Link: https://lkml.kernel.org/r/20210208201050.909329787@goodmis.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
1746fd44 |
|
08-Feb-2021 |
Steven Rostedt (VMware) <rostedt@goodmis.org> |
tracepoints: Remove unnecessary "data_args" macro parameter While working on a clean up that would restructure the difference between architectures that have static calls vs those that do not, I was stumbling over the "data_args" parameter that includes "__data" in the arguments. The issue was that one version didn't even need it, while the other one did. Instead of injecting a "__data = NULL;" into the macro for the unneeded version, just remove it completely. The original idea behind data_args is that there may be a case of a tracepoint with no arguments. But this is considered bad practice, and all tracepoints should pass something to that location (that's what tracepoints were created for). Link: https://lkml.kernel.org/r/20210208201050.768074128@goodmis.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
c8b186a8 |
|
02-Feb-2021 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
tracepoint: Fix race between tracing and removing tracepoint When executing a tracepoint, the tracepoint's func is dereferenced twice - in __DO_TRACE() (where the returned pointer is checked) and later on in __traceiter_##_name where the returned pointer is dereferenced without checking which leads to races against tracepoint_removal_sync() and crashes. This adds a check before referencing the pointer in tracepoint_ptr_deref. Link: https://lkml.kernel.org/r/20210202072326.120557-1-aik@ozlabs.ru Cc: stable@vger.kernel.org Fixes: d25e37d89dd2f ("tracepoint: Optimize using static_call()") Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
33def849 |
|
21-Oct-2020 |
Joe Perches <joe@perches.com> |
treewide: Convert macro and uses of __section(foo) to __section("foo") Use a more generic form for __section that requires quotes to avoid complications with clang and gcc differences. Remove the quote operator # from compiler_attributes.h __section macro. Convert all unquoted __section(foo) uses to quoted __section("foo"). Also convert __attribute__((section("foo"))) uses to __section("foo") even if the __attribute__ has multiple list entry forms. Conversion done using the script at: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
de394e75 |
|
07-Sep-2020 |
peterz@infradead.org <peterz@infradead.org> |
tracepoint: Fix overly long tracepoint names Stephen Rothwell reported: > Exported symbols need to be <= (64 - sizeof(Elf_Addr)) long. This is > presumably 56 on 64 bit arches and the above symbol (including the '.') > is 56 characters long. Shorten the tracepoint symbol name. Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200908105743.GW2674@hirez.programming.kicks-ass.net
|
#
d25e37d8 |
|
18-Aug-2020 |
Steven Rostedt (VMware) <rostedt@goodmis.org> |
tracepoint: Optimize using static_call() Currently the tracepoint site will iterate a vector and issue indirect calls to however many handlers are registered (ie. the vector is long). Using static_call() it is possible to optimize this for the common case of only having a single handler registered. In this case the static_call() can directly call this handler. Otherwise, if the vector is longer than 1, call a function that iterates the whole vector like the current code. [peterz: updated to new interface] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org
|
#
1c39d761 |
|
30-Jul-2020 |
Nick Desaulniers <ndesaulniers@google.com> |
tracepoint: Use __used attribute definitions from compiler_attributes.h Just a small cleanup while I was touching this header. compiler_attributes.h does feature detection of these __attributes__(()) and provides more concise ways to invoke them. Link: https://lkml.kernel.org/r/20200730224555.2142154-3-ndesaulniers@google.com Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
f3751ad0 |
|
30-Jul-2020 |
Nick Desaulniers <ndesaulniers@google.com> |
tracepoint: Mark __tracepoint_string's __used __tracepoint_string's have their string data stored in .rodata, and an address to that data stored in the "__tracepoint_str" section. Functions that refer to those strings refer to the symbol of the address. Compiler optimization can replace those address references with references directly to the string data. If the address doesn't appear to have other uses, then it appears dead to the compiler and is removed. This can break the /tracing/printk_formats sysfs node which iterates the addresses stored in the "__tracepoint_str" section. Like other strings stored in custom sections in this header, mark these __used to inform the compiler that there are other non-obvious users of the address, so they should still be emitted. Link: https://lkml.kernel.org/r/20200730224555.2142154-2-ndesaulniers@google.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: stable@vger.kernel.org Fixes: 102c9323c35a8 ("tracing: Add __tracepoint_string() to export string pointers") Reported-by: Tim Murray <timmurray@google.com> Reported-by: Simon MacMullen <simonmacm@google.com> Suggested-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
a2806ef7 |
|
13-Apr-2020 |
Nikolay Borisov <nborisov@suse.com> |
tracing: Remove DECLARE_TRACE_NOARGS This macro was intentionally broken so that the kernel code is not poluted with such noargs macro used simply as markers. This use case can be satisfied by using dummy no inline functions. Just remove it. Link: http://lkml.kernel.org/r/20200413153246.8511-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
52a6e82a |
|
31-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 365 Based on 1 normalized pattern(s): this file is released under the gplv2 see the file copying for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 3 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531081035.872590698@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
16336345 |
|
26-Mar-2019 |
Yafang Shao <laoar.shao@gmail.com> |
tracing: introduce TRACE_EVENT_NOP() Sometimes we want to define a tracepoint as a do-nothing function. So I introduce TRACE_EVENT_NOP, DECLARE_EVENT_CLASS_NOP and DEFINE_EVENT_NOP for this kind of usage. Link: http://lkml.kernel.org/r/1553602391-11926-2-git-send-email-laoar.shao@gmail.com Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
0c7a52e4 |
|
27-Nov-2018 |
Zenghui Yu <yuzenghui@huawei.com> |
tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique After enabling KVM event tracing, almost all of trace_kvm_exit()'s printk shows "kvm_exit: IRQ: ..." even if the actual exception_type is NOT IRQ. More specifically, trace_kvm_exit() is defined in virt/kvm/arm/trace.h by TRACE_EVENT. This slight problem may have existed after commit e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU"). There are two variables in trace_kvm_exit() and __DO_TRACE() which have the same name, *idx*. Thus the actual value of *idx* will be overwritten when tracing. Fix it by adding a simple prefix. Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Wang Haibin <wanghaibin.wang@huawei.com> Cc: linux-trace-devel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
74401729 |
|
06-Nov-2018 |
Paul E. McKenney <paulmck@kernel.org> |
tracing: Replace synchronize_sched() and call_rcu_sched() Now that synchronize_rcu() waits for preempt-disable regions of code as well as RCU read-side critical sections, synchronize_sched() can be replaced by synchronize_rcu(). Similarly, call_rcu_sched() can be replaced by call_rcu(). This commit therefore makes these changes. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: <linux-kernel@vger.kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
9c0be3f6 |
|
13-Oct-2018 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracepoint: Fix tracepoint array element size mismatch commit 46e0c9be206f ("kernel: tracepoints: add support for relative references") changes the layout of the __tracepoint_ptrs section on architectures supporting relative references. However, it does so without turning struct tracepoint * const into const int elsewhere in the tracepoint code, which has the following side-effect: Setting mod->num_tracepoints is done in by module.c: mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs", sizeof(*mod->tracepoints_ptrs), &mod->num_tracepoints); Basically, since sizeof(*mod->tracepoints_ptrs) is a pointer size (rather than sizeof(int)), num_tracepoints is erroneously set to half the size it should be on 64-bit arch. So a module with an odd number of tracepoints misses the last tracepoint due to effect of integer division. So in the module going notifier: for_each_tracepoint_range(mod->tracepoints_ptrs, mod->tracepoints_ptrs + mod->num_tracepoints, tp_module_going_check_quiescent, NULL); the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually evaluates to something within the bounds of the array, but miss the last tracepoint if the number of tracepoints is odd on 64-bit arch. Fix this by introducing a new typedef: tracepoint_ptr_t, which is either "const int" on architectures that have PREL32 relocations, or "struct tracepoint * const" on architectures that does not have this feature. Also provide a new tracepoint_ptr_defer() static inline to encapsulate deferencing this type rather than duplicate code and ugly idefs within the for_each_tracepoint_range() implementation. This issue appears in 4.19-rc kernels, and should ideally be fixed before the end of the rc cycle. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Jessica Yu <jeyu@kernel.org> Link: http://lkml.kernel.org/r/20181013191050.22389-1-mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morris <james.morris@microsoft.com> Cc: James Morris <jmorris@namei.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
865e63b0 |
|
04-Sep-2018 |
Steven Rostedt (VMware) <rostedt@goodmis.org> |
tracing: Add back in rcu_irq_enter/exit_irqson() for rcuidle tracepoints Borislav reported the following splat: ============================= WARNING: suspicious RCU usage 4.19.0-rc1+ #1 Not tainted ----------------------------- ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 2, debug_locks = 1 RCU used illegally from extended quiescent state! 1 lock held by swapper/0/0: #0: 000000004557ee0e (rcu_read_lock){....}, at: perf_event_output_forward+0x0/0x130 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1 Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012 Call Trace: dump_stack+0x85/0xcb perf_event_output_forward+0xf6/0x130 __perf_event_overflow+0x52/0xe0 perf_swevent_overflow+0x91/0xb0 perf_tp_event+0x11a/0x350 ? find_held_lock+0x2d/0x90 ? __lock_acquire+0x2ce/0x1350 ? __lock_acquire+0x2ce/0x1350 ? retint_kernel+0x2d/0x2d ? find_held_lock+0x2d/0x90 ? tick_nohz_get_sleep_length+0x83/0xb0 ? perf_trace_cpu+0xbb/0xd0 ? perf_trace_buf_alloc+0x5a/0xa0 perf_trace_cpu+0xbb/0xd0 cpuidle_enter_state+0x185/0x340 do_idle+0x1eb/0x260 cpu_startup_entry+0x5f/0x70 start_kernel+0x49b/0x4a6 secondary_startup_64+0xa4/0xb0 This is due to the tracepoints moving to SRCU usage which does not require RCU to be "watching". But perf uses these tracepoints with RCU and expects it to be. Hence, we still need to add in the rcu_irq_enter/exit_irqson() calls for "rcuidle" tracepoints. This is a temporary fix until we have SRCU working in NMI context, and then perf can be converted to use that instead of normal RCU. Link: http://lkml.kernel.org/r/20180904162611.6a120068@gandalf.local.home Cc: x86-ml <x86@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Reported-by: Borislav Petkov <bp@alien8.de> Tested-by: Borislav Petkov <bp@alien8.de> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Fixes: e6753f23d961d ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
46e0c9be |
|
21-Aug-2018 |
Ard Biesheuvel <ardb@kernel.org> |
kernel: tracepoints: add support for relative references To avoid the need for relocating absolute references to tracepoint structures at boot time when running relocatable kernels (which may take a disproportionate amount of space), add the option to emit these tables as relative references instead. Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morris <james.morris@microsoft.com> Cc: James Morris <jmorris@namei.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
e6753f23 |
|
30-Jul-2018 |
Joel Fernandes (Google) <joel@joelfernandes.org> |
tracepoint: Make rcuidle tracepoint callers use SRCU In recent tests with IRQ on/off tracepoints, a large performance overhead ~10% is noticed when running hackbench. This is root caused to calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the tracepoint code. Following a long discussion on the list [1] about this, we concluded that srcu is a better alternative for use during rcu idle. Although it does involve extra barriers, its lighter than the sched-rcu version which has to do additional RCU calls to notify RCU idle about entry into RCU sections. In this patch, we change the underlying implementation of the trace_*_rcuidle API to use SRCU. This has shown to improve performance alot for the high frequency irq enable/disable tracepoints. Test: Tested idle and preempt/irq tracepoints. Here are some performance numbers: With a run of the following 30 times on a single core x86 Qemu instance with 1GB memory: hackbench -g 4 -f 2 -l 3000 Completion times in seconds. CONFIG_PROVE_LOCKING=y. No patches (without this series) Mean: 3.048 Median: 3.025 Std Dev: 0.064 With Lockdep using irq tracepoints with RCU implementation: Mean: 3.451 (-11.66 %) Median: 3.447 (-12.22%) Std Dev: 0.049 With Lockdep using irq tracepoints with SRCU implementation (this series): Mean: 3.020 (I would consider the improvement against the "without this series" case as just noise). Median: 3.013 Std Dev: 0.033 [1] https://patchwork.kernel.org/patch/10344297/ [remove rcu_read_lock_sched_notrace as its the equivalent of preempt_disable_notrace and is unnecessary to call in tracepoint code] Link: http://lkml.kernel.org/r/20180730222423.196630-3-joel@joelfernandes.org Cleaned-up-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ Simplified WARN_ON_ONCE() ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
ec15872d |
|
08-May-2018 |
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> |
docs: fix broken references with multiple hints The script: ./scripts/documentation-file-ref-check --fix Gives multiple hints for broken references on some files. Manually use the one that applies for some files. Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: James Morris <james.morris@microsoft.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Jonathan Corbet <corbet@lwn.net>
|
#
844ccdd7 |
|
03-Oct-2017 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Eliminate rcu_irq_enter_disabled() Now that the irq path uses the rcu_nmi_{enter,exit}() algorithm, rcu_irq_enter() and rcu_irq_exit() may be used from any context. There is thus no need for rcu_irq_enter_disabled() and for the checks using it. This commit therefore eliminates rcu_irq_enter_disabled(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
#
4f0dfd76 |
|
31-May-2017 |
Jeremy Linton <jeremy.linton@arm.com> |
tracing: define TRACE_DEFINE_SIZEOF() macro to map sizeof's to their values Perf has a problem that if sizeof() macros are used within TRACE_EVENT() macro's they end up in userspace as "sizeof(kernel structure)" which cannot properly be parsed. Add a macro which can forward this data through the eval_map for userspace utilization. Link: http://lkml.kernel.org/r/20170531215653.3240-10-jeremy.linton@arm.com Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
00f4b652 |
|
31-May-2017 |
Jeremy Linton <jeremy.linton@arm.com> |
trace: rename trace_enum_map to trace_eval_map Each enum is loaded into the trace_enum_map, as we are now using this for more than enums rename it. Link: http://lkml.kernel.org/r/20170531215653.3240-3-jeremy.linton@arm.com Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
d54b6eeb |
|
06-Apr-2017 |
Steven Rostedt (VMware) <rostedt@goodmis.org> |
tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events Stack tracing discovered that there's a small location inside the RCU infrastructure where calling rcu_irq_enter() does not work. As trace events use rcu_irq_enter() it must make sure that it is functionable. A check against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace event should ever be used in that part of RCU. If the warning is triggered, then the trace event is ignored. Restructure the __DO_TRACE() a bit to get rid of the prercu and postrcu, and just have an rcucheck that does the work from within the _DO_TRACE() macro. gcc optimization will compile out the rcucheck=0 case. Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
#
8cf868af |
|
28-Nov-2016 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracing: Have the reg function allow to fail Some tracepoints have a registration function that gets enabled when the tracepoint is enabled. There may be cases that the registraction function must fail (for example, can't allocate enough memory). In this case, the tracepoint should also fail to register, otherwise the user would not know why the tracepoint is not working. Cc: David Howells <dhowells@redhat.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Anton Blanchard <anton@samba.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
dc17147d |
|
09-Mar-2016 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracing: Fix check for cpu online when event is disabled Commit f37755490fe9b ("tracepoints: Do not trace when cpu is offline") added a check to make sure that tracepoints only get called when the cpu is online, as it uses rcu_read_lock_sched() for protection. Commit 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") added lockdep checks (including rcu checks) for events that are not enabled to catch possible RCU issues that would only be triggered if a trace event was enabled. Commit f37755490fe9b only stopped the warnings when the trace event was enabled but did not prevent warnings if the trace event was called when disabled. To fix this, the cpu online check is moved to where the condition is added to the trace event. This will place the cpu online check in all places that it may be used now and in the future. Cc: stable@vger.kernel.org # v3.18+ Fixes: f37755490fe9b ("tracepoints: Do not trace when cpu is offline") Fixes: 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
f3775549 |
|
14-Feb-2016 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracepoints: Do not trace when cpu is offline The tracepoint infrastructure uses RCU sched protection to enable and disable tracepoints safely. There are some instances where tracepoints are used in infrastructure code (like kfree()) that get called after a CPU is going offline, and perhaps when it is coming back online but hasn't been registered yet. This can probuce the following warning: [ INFO: suspicious RCU usage. ] 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S ------------------------------- include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/8/0. stack backtrace: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34 Call Trace: [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable) [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170 [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440 [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100 [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150 [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140 [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310 [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60 [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40 [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560 [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360 [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14 This warning is not a false positive either. RCU is not protecting code that is being executed while the CPU is offline. Instead of playing "whack-a-mole(TM)" and adding conditional statements to the tracepoints we find that are used in this instance, simply add a cpu_online() test to the tracepoint code where the tracepoint will be ignored if the CPU is offline. Use of raw_smp_processor_id() is fine, as there should never be a case where the tracepoint code goes from running on a CPU that is online and suddenly gets migrated to a CPU that is offline. Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org Reported-by: Denis Kirjanov <kda@linux-powerpc.org> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints") Cc: stable@vger.kernel.org # v2.6.28+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
2701121b |
|
14-Dec-2015 |
Denis Kirjanov <kda@linux-powerpc.org> |
tracing: Introduce TRACE_EVENT_FN_COND macro TRACE_EVENT_FN can't be used in some circumstances like invoking trace functions from offlined CPU due to RCU usage. This patch adds the TRACE_EVENT_FN_COND macro to make such trace points conditional. Link: http://lkml.kernel.org/r/1450124286-4822-1-git-send-email-kda@linux-powerpc.org Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
7c9906ca |
|
31-Oct-2015 |
Paul E. McKenney <paulmck@kernel.org> |
rcu: Don't redundantly disable irqs in rcu_irq_{enter,exit}() This commit replaces a local_irq_save()/local_irq_restore() pair with a lockdep assertion that interrupts are already disabled. This should remove the corresponding overhead from the interrupt entry/exit fastpaths. This change was inspired by the fact that Iftekhar Ahmed's mutation testing showed that removing rcu_irq_enter()'s call to local_ird_restore() had no effect, which might indicate that interrupts were always enabled anyway. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
#
bd2a634d |
|
01-Dec-2015 |
Andi Kleen <ak@linux.intel.com> |
tracepoints: Move struct tracepoint to new tracepoint-defs.h header Steven recommended open coding access to tracepoint->key to add trace points to headers. Unfortunately this is difficult for some headers (such as x86 asm/msr.h) because including tracepoint.h includes so many other headers that it causes include loops. The main problem is the include of linux/rcupdate.h, which pulls in a lot of other headers. The rcu header is only needed when actually defining trace points. Move the struct tracepoint into a separate tracepoint-defs.h header that can be included without pulling in all of RCU. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1449018060-1742-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
a15920be |
|
02-Nov-2015 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracepoints: Fix documentation of RCU lockdep checks The documentation on top of __DECLARE_TRACE() does not match its implementation since the condition check has been added to the RCU lockdep checks. Update the documentation to match its implementation. Link: http://lkml.kernel.org/r/1446504164-21563-1-git-send-email-mathieu.desnoyers@efficios.com CC: Dave Hansen <dave@sr71.net> Fixes: a05d59a56733 "tracing: Add condition check to RCU lockdep checks" Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
7904b5c4 |
|
22-Sep-2015 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracepoint: Give priority to probes of tracepoints In order to guarantee that a probe will be called before other probes that are attached to a tracepoint, there needs to be a mechanism to provide priority of one probe over the others. Adding a prio field to the struct tracepoint_func, which lets the probes be sorted by the priority set in the structure. If no priority is specified, then a priority of 10 is given (this is a macro, and perhaps may be changed in the future). Now probes may be added to affect other probes that are attached to a tracepoint with a guaranteed order. One use case would be to allow tracing of tracepoints be able to filter by pid. A special (higher priority probe) may be added to the sched_switch tracepoint and set the necessary flags of the other tracepoints to notify them if they should be traced or not. In case a tracepoint is enabled at the sched_switch tracepoint too, the order of the two are not random. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
c63b7682 |
|
01-Aug-2015 |
Tal Shorer <tal.shorer@gmail.com> |
tracing: Allow disabling compilation of specific trace systems Allow a trace events header file to disable compilation of its trace events by defining the preprocessor macro NOTRACE. This could be done, for example, according to a Kconfig option. Link: http://lkml.kernel.org/r/1438432079-11704-3-git-send-email-tal.shorer@gmail.com Signed-off-by: Tal Shorer <tal.shorer@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
0c564a53 |
|
24-Mar-2015 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values Several tracepoints use the helper functions __print_symbolic() or __print_flags() and pass in enums that do the mapping between the binary data stored and the value to print. This works well for reading the ASCII trace files, but when the data is read via userspace tools such as perf and trace-cmd, the conversion of the binary value to a human string format is lost if an enum is used, as userspace does not have access to what the ENUM is. For example, the tracepoint trace_tlb_flush() has: __print_symbolic(REC->reason, { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" }, { TLB_REMOTE_SHOOTDOWN, "remote shootdown" }, { TLB_LOCAL_SHOOTDOWN, "local shootdown" }, { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" }) Which maps the enum values to the strings they represent. But perf and trace-cmd do no know what value TLB_LOCAL_MM_SHOOTDOWN is, and would not be able to map it. With TRACE_DEFINE_ENUM(), developers can place these in the event header files and ftrace will convert the enums to their values: By adding: TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH); TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN); TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN); TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN); $ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format [...] __print_symbolic(REC->reason, { 0, "flush on task switch" }, { 1, "remote shootdown" }, { 2, "local shootdown" }, { 3, "local mm shootdown" }) The above is what userspace expects to see, and tools do not need to be modified to parse them. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Cc: Guilherme Cox <cox@computer.org> Cc: Tony Luck <tony.luck@gmail.com> Cc: Xie XiuQi <xiexiuqi@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
a05d59a5 |
|
06-Feb-2015 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracing: Add condition check to RCU lockdep checks The trace_tlb_flush() tracepoint can be called when a CPU is going offline. When a CPU is offline, RCU is no longer watching that CPU and since the tracepoint is protected by RCU, it must not be called. To prevent the tlb_flush tracepoint from being called when the CPU is offline, it was converted to a TRACE_EVENT_CONDITION where the condition checks if the CPU is online before calling the tracepoint. Unfortunately, this was not enough to stop lockdep from complaining about it. Even though the RCU protected code of the tracepoint will never be called, the condition is hidden within the tracepoint, and even though the condition prevents RCU code from being called, the lockdep checks are outside the tracepoint (this is to test tracepoints even when they are not enabled). Even though tracepoints should be checked to be RCU safe when they are not enabled, the condition should still be considered when checking RCU. Link: http://lkml.kernel.org/r/CA+icZUUGiGDoL5NU8RuxKzFjoLjEKRtUWx=JB8B9a0EQv-eGzQ@mail.gmail.com Fixes: 3a630178fd5f "tracing: generate RCU warnings even when tracepoints are disabled" Cc: stable@vger.kernel.org # 3.18+ Acked-by: Dave Hansen <dave@sr71.net> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
3a630178 |
|
07-Aug-2014 |
Dave Hansen <dave.hansen@linux.intel.com> |
tracing: generate RCU warnings even when tracepoints are disabled Dave Jones reported seeing a bug from one of my TLB tracepoints: http://lkml.kernel.org/r/20140806181801.GA4605@redhat.com I've been running these patches for months and never saw this. But, a big chunk of my testing, especially with all the debugging enabled, was in a vm where intel_idle doesn't work. On the systems where I was using intel_idle, I never had lockdep enabled and this tracepoint on at the same time. This patch ensures that whenever we have lockdep available, we do _some_ RCU activity at the site of the tracepoint, despite whether the tracepoint's condition matches or even if the tracepoint itself is completely disabled. This is a bit of a hack, but it is pretty self-contained. I confirmed that with this patch plus lockdep I get the same splat as Dave Jones did, but without enabling the tracepoint explicitly. Link: http://lkml.kernel.org/p/20140807175204.C257CAC5@viggo.jf.intel.com Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Dave Jones <davej@redhat.com>, Cc: paulmck@linux.vnet.ibm.com Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
3c49b52b |
|
25-Jul-2014 |
Steven Rostedt <rostedt@goodmis.org> |
tracing: Do not do anything special with tracepoint_string when tracing is disabled When CONFIG_TRACING is not enabled, there's no reason to save the trace strings either by the linker or as a static variable that can be referenced later. Simply pass back the string that is given to tracepoint_string(). Had to move the define to include/linux/tracepoint.h so that it is still visible when CONFIG_TRACING is not set. Link: http://lkml.kernel.org/p/1406318733-26754-2-git-send-email-nicolas.pitre@linaro.org Suggested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
7c65bbc7 |
|
06-May-2014 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracing: Add trace_<tracepoint>_enabled() function There are some code paths in the kernel that need to do some preparations before it calls a tracepoint. As that code is worthless overhead when the tracepoint is not enabled, it would be prudent to have that code only run when the tracepoint is active. To accomplish this, all tracepoints now get a static inline function called "trace_<tracepoint-name>_enabled()" which returns true when the tracepoint is enabled and false otherwise. As an added bonus, that function uses the static_key of the tracepoint such that no branch is needed. if (trace_mytracepoint_enabled()) { arg = process_tp_arg(); trace_mytracepoint(arg); } Will keep the "process_tp_arg()" (which may be expensive to run) from being executed when the tracepoint isn't enabled. It's best to encapsulate the tracepoint itself in the if statement just to keep races. For example, if you had: if (trace_mytracepoint_enabled()) arg = process_tp_arg(); trace_mytracepoint(arg); There's a chance that the tracepoint could be enabled just after the if statement, and arg will be undefined when calling the tracepoint. Link: http://lkml.kernel.org/r/20140506094407.507b6435@gandalf.local.home Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
b725dfea |
|
09-Apr-2014 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracepoint: Fix sparse warnings in tracepoint.c Fix the following sparse warnings: CHECK kernel/tracepoint.c kernel/tracepoint.c:184:18: warning: incorrect type in assignment (different address spaces) kernel/tracepoint.c:184:18: expected struct tracepoint_func *tp_funcs kernel/tracepoint.c:184:18: got struct tracepoint_func [noderef] <asn:4>*funcs kernel/tracepoint.c:216:18: warning: incorrect type in assignment (different address spaces) kernel/tracepoint.c:216:18: expected struct tracepoint_func *tp_funcs kernel/tracepoint.c:216:18: got struct tracepoint_func [noderef] <asn:4>*funcs kernel/tracepoint.c:392:24: error: return expression in void function CC kernel/tracepoint.o kernel/tracepoint.c: In function tracepoint_module_going: kernel/tracepoint.c:491:6: warning: symbol 'syscall_regfunc' was not declared. Should it be static? kernel/tracepoint.c:508:6: warning: symbol 'syscall_unregfunc' was not declared. Should it be static? Link: http://lkml.kernel.org/r/1397049883-28692-1-git-send-email-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
eb7d035c |
|
08-Apr-2014 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracepoint: Simplify tracepoint module search Instead of copying the num_tracepoints and tracepoints_ptrs from the module structure to the tp_mod structure, which only uses it to find the module associated to tracepoints of modules that are coming and going, simply copy the pointer to the module struct to the tracepoint tp_module structure. Also removed un-needed brackets around an if statement. Link: http://lkml.kernel.org/r/20140408201705.4dad2c4a@gandalf.local.home Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
de7b2973 |
|
08-Apr-2014 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints Register/unregister tracepoint probes with struct tracepoint pointer rather than tracepoint name. This change, which vastly simplifies tracepoint.c, has been proposed by Steven Rostedt. It also removes 8.8kB (mostly of text) to the vmlinux size. From this point on, the tracers need to pass a struct tracepoint pointer to probe register/unregister. A probe can now only be connected to a tracepoint that exists. Moreover, tracers are responsible for unregistering the probe before the module containing its associated tracepoint is unloaded. text data bss dec hex filename 10443444 4282528 10391552 25117524 17f4354 vmlinux.orig 10434930 4282848 10391552 25109330 17f2352 vmlinux Link: http://lkml.kernel.org/r/1396992381-23785-2-git-send-email-mathieu.desnoyers@efficios.com CC: Ingo Molnar <mingo@kernel.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Frank Ch. Eigler <fche@redhat.com> CC: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> [ SDR - fixed return val in void func in tracepoint_module_going() ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
0dea6d52 |
|
20-Mar-2014 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracepoint: Remove unused API functions After the following commit: commit b75ef8b44b1cb95f5a26484b0e2fe37a63b12b44 Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Date: Wed Aug 10 15:18:39 2011 -0400 Tracepoint: Dissociate from module mutex The following functions became unnecessary: - tracepoint_probe_register_noupdate, - tracepoint_probe_unregister_noupdate, - tracepoint_probe_update_all. In fact, none of the in-kernel tracers, nor LTTng, nor SystemTAP use them. Remove those. Moreover, the functions: - tracepoint_iter_start, - tracepoint_iter_next, - tracepoint_iter_stop, - tracepoint_iter_reset. are unused by in-kernel tracers, LTTng and SystemTAP. Remove those too. Link: http://lkml.kernel.org/r/1395379142-2118-2-git-send-email-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
45ab2813 |
|
26-Feb-2014 |
Steven Rostedt (Red Hat) <rostedt@goodmis.org> |
tracing: Do not add event files for modules that fail tracepoints If a module fails to add its tracepoints due to module tainting, do not create the module event infrastructure in the debugfs directory. As the events will not work and worse yet, they will silently fail, making the user wonder why the events they enable do not display anything. Having a warning on module load and the events not visible to the users will make the cause of the problem much clearer. Link: http://lkml.kernel.org/r/20140227154923.265882695@goodmis.org Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT" Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org # 2.6.31+ Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
2621bca8 |
|
02-Dec-2013 |
Viresh Kumar <viresh.kumar@linaro.org> |
tracepoint: comment fix "binay" -> "binary" Binary was written as binay, probably by mistake. Fix it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
#
d5b5f391 |
|
14-Nov-2013 |
Peter Zijlstra <peterz@infradead.org> |
ftrace, perf: Avoid infinite event generation loop Vince's perf-trinity fuzzer found yet another 'interesting' problem. When we sample the irq_work_exit tracepoint with period==1 (or PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an infinite event generation loop: ,-> <IPI> | irq_work_exit() -> | trace_irq_work_exit() -> | ... | __perf_event_overflow() -> (due to fasync) | irq_work_queue() -> (irq_work_list must be empty) '--------- arch_irq_work_raise() Similar things can happen due to regular poll() wakeups if we exceed the ring-buffer wakeup watermark, or have an event_limit. To avoid this, dis-allow sampling this particular tracepoint. In order to achieve this, create a special perf_perm function pointer for each event and call this (when set) on trying to create a tracepoint perf event. [ roasted: use expr... to allow for ',' in your expression ] Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jones <davej@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
f5abaa1b |
|
20-Jun-2013 |
Steven Rostedt <rostedt@goodmis.org> |
tracing: Add DEFINE_EVENT_FN() macro Each TRACE_EVENT() adds several helper functions. If two or more trace events share the same structure and print format, they can also share most of these helper functions and save a lot of space from duplicate code. This is why the DECLARE_EVENT_CLASS() and DEFINE_EVENT() were created. Some events require a trigger to be called at registering and unregistering of the event and to do so they use TRACE_EVENT_FN(). If multiple events require a trigger, they currently have no choice but to use TRACE_EVENT_FN() as there's no DEFINE_EVENT_FN() available. This unfortunately causes a lot of wasted duplicate code created. By adding a DEFINE_EVENT_FN(), these events can still use a DECLARE_EVENT_CLASS() and then define their own triggers. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/51C3236C.8030508@hds.com Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
#
d6284099 |
|
22-May-2013 |
Paul E. McKenney <paulmck@kernel.org> |
trace: Allow idle-safe tracepoints to be called from irq __DECLARE_TRACE_RCU() currently creates an _rcuidle() tracepoint which may safely be invoked from what RCU considers to be an idle CPU. However, these _rcuidle() tracepoints may -not- be invoked from the handler of an irq taken from idle, because rcu_idle_enter() zeroes RCU's nesting-level counter, so that the rcu_irq_exit() returning to idle will trigger a WARN_ON_ONCE(). This commit therefore substitutes rcu_irq_enter() for rcu_idle_exit() and rcu_irq_exit() for rcu_idle_enter() in order to make the _rcuidle() tracepoints usable from irq handlers as well as from process context. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org>
|
#
7ece55a4 |
|
05-Sep-2012 |
Josh Triplett <josh@joshtriplett.org> |
trace: Don't declare trace_*_rcuidle functions in modules Tracepoints declare a static inline trace_*_rcuidle variant of the trace function, to support safely generating trace events from the idle loop. Module code never actually uses that variant of trace functions, because modules don't run code that needs tracing with RCU idled. However, the declaration of those otherwise unused functions causes the module to reference rcu_idle_exit and rcu_idle_enter, which RCU does not export to modules. To avoid this, don't generate trace_*_rcuidle functions for tracepoints declared in module code. Link: http://lkml.kernel.org/r/20120905062306.GA14756@leaf Reported-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
326f418b |
|
28-Jun-2012 |
Jason Baron <jbaron@redhat.com> |
tracepoint: Use static_key_false(), since static_branch() is deprecated Convert the last user of static_branch() -> static_key_false(). Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/5fffcd40a6c063769badcdd74a7d90980500dbcb.1340909155.git.jbaron@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
c5905afb |
|
24-Feb-2012 |
Ingo Molnar <mingo@elte.hu> |
static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]() So here's a boot tested patch on top of Jason's series that does all the cleanups I talked about and turns jump labels into a more intuitive to use facility. It should also address the various misconceptions and confusions that surround jump labels. Typical usage scenarios: #include <linux/static_key.h> struct static_key key = STATIC_KEY_INIT_TRUE; if (static_key_false(&key)) do unlikely code else do likely code Or: if (static_key_true(&key)) do likely code else do unlikely code The static key is modified via: static_key_slow_inc(&key); ... static_key_slow_dec(&key); The 'slow' prefix makes it abundantly clear that this is an expensive operation. I've updated all in-kernel code to use this everywhere. Note that I (intentionally) have not pushed through the rename blindly through to the lowest levels: the actual jump-label patching arch facility should be named like that, so we want to decouple jump labels from the static-key facility a bit. On non-jump-label enabled architectures static keys default to likely()/unlikely() branches. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jason Baron <jbaron@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: a.p.zijlstra@chello.nl Cc: mathieu.desnoyers@efficios.com Cc: davem@davemloft.net Cc: ddaney.cavm@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
2fbb90db |
|
07-Feb-2012 |
Steven Rostedt <srostedt@redhat.com> |
tracing/rcu: Add trace_##name##__rcuidle() static tracepoint for inside rcu_idle_exit() sections Added is a new static inline function that lets *any* tracepoint be used inside a rcu_idle_exit() section. And this also solves the problem where the same tracepoint may be used inside a rcu_idle_exit() section as well as outside of one. I added a new tracepoint function with a "_rcuidle" extension. All tracepoints can be used with either the normal "trace_foobar()" function, or the "trace_foobar_rcuidle()" function when inside a rcu_idle_exit() section. All tracepoints defined by TRACE_EVENT() or any of the derivatives will have a "_rcuidle()" function also defined. When a tracepoint is used within an rcu_idle_exit() section, the "_rcuidle()" version must be used. This denotes that the tracepoint is within rcu_idle_exit() and it allows the rcu read locks within the tracepoint to still be valid, as this version takes us out of rcu_idle_exit(). Another nice aspect about this patch is that "static inline"s are not compiled into text when not used. So only the tracepoints that actually use the _rcuidle() version will have them defined in the actual text that is booted. Link: http://lkml.kernel.org/r/1328563113.2200.39.camel@gandalf.stny.rr.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
b75ef8b4 |
|
10-Aug-2011 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
Tracepoint: Dissociate from module mutex Copy the information needed from struct module into a local module list held within tracepoint.c from within the module coming/going notifier. This vastly simplifies locking of tracepoint registration / unregistration, because we don't have to take the module mutex to register and unregister tracepoints anymore. Steven Rostedt ran into dependency problems related to modules mutex vs kprobes mutex vs ftrace mutex vs tracepoint mutex that seems to be hard to fix without removing this dependency between tracepoint and module mutex. (note: it should be investigated whether kprobes could benefit of being dissociated from the modules mutex too.) This also fixes module handling of tracepoint list iterators, because it was expecting the list to be sorted by pointer address. Given we have control on our own list now, it's OK to sort this list which has tracepoints as its only purpose. The reason why this sorting is required is to handle the fact that seq files (and any read() operation from user-space) cannot hold the tracepoint mutex across multiple calls, so list entries may vanish between calls. With sorting, the tracepoint iterator becomes usable even if the list don't contain the exact item pointed to by the iterator anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Jason Baron <jbaron@redhat.com> CC: Ingo Molnar <mingo@elte.hu> CC: Lai Jiangshan <laijs@cn.fujitsu.com> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Thomas Gleixner <tglx@linutronix.de> CC: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
d430d3d7 |
|
16-Mar-2011 |
Jason Baron <jbaron@redhat.com> |
jump label: Introduce static_branch() interface Introduce: static __always_inline bool static_branch(struct jump_label_key *key); instead of the old JUMP_LABEL(key, label) macro. In this way, jump labels become really easy to use: Define: struct jump_label_key jump_key; Can be used as: if (static_branch(&jump_key)) do unlikely code enable/disale via: jump_label_inc(&jump_key); jump_label_dec(&jump_key); that's it! For the jump labels disabled case, the static_branch() becomes an atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(), atomic_dec() operations. We show testing results for this change below. Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct. Since we now require a 'struct jump_label_key *key', we can store a pointer into the jump table addresses. In this way, we can enable/disable jump labels, in basically constant time. This change allows us to completely remove the previous hashtable scheme. Thanks to Peter Zijlstra for this re-write. Testing: I ran a series of 'tbench 20' runs 5 times (with reboots) for 3 configurations, where tracepoints were disabled. jump label configured in avg: 815.6 jump label *not* configured in (using atomic reads) avg: 800.1 jump label *not* configured in (regular reads) avg: 803.4 Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110316212947.GA8792@redhat.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Suggested-by: H. Peter Anvin <hpa@linux.intel.com> Tested-by: David Daney <ddaney@caviumnetworks.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
65498646 |
|
26-Jan-2011 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracepoints: Fix section alignment using pointer array Make the tracepoints more robust, making them solid enough to handle compiler changes by not relying on anything based on compiler-specific behavior with respect to structure alignment. Implement an approach proposed by David Miller: use an array of const pointers to refer to the individual structures, and export this pointer array through the linker script rather than the structures per se. It will consume 32 extra bytes per tracepoint (24 for structure padding and 8 for the pointers), but are less likely to break due to compiler changes. History: commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() added the aligned(32) type and variable attribute to the tracepoint structures to deal with gcc happily aligning statically defined structures on 32-byte multiples. One attempt was to use a 8-byte alignment for tracepoint structures by applying both the variable and type attribute to tracepoint structures definitions and declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5. The reason is that the "aligned" attribute only specify the _minimum_ alignment for a structure, leaving both the compiler and the linker free to align on larger multiples. Because tracepoint.c expects the structures to be placed as an array within each section, up-alignment cause NULL-pointer exceptions due to the extra unexpected padding. (this patch applies on top of -tip) Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: David S. Miller <davem@davemloft.net> LKML-Reference: <20110126222622.GA10794@Krystal> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> CC: Andrew Morton <akpm@linux-foundation.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
bd1c8b22 |
|
04-Jan-2011 |
Lai Jiangshan <laijs@cn.fujitsu.com> |
tracepoint: Add __rcu annotation Add __rcu annotation to : (struct tracepoint)->funcs Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4D22D4F1.50505@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
ec6e7c3a |
|
06-Jan-2011 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracing/trivial: Add missing comma in TRACE_EVENT comment Add missing comma within the TRACE_EVENT() example in tracepoint.h. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20110106184532.GA2526@Krystal> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
287050d3 |
|
02-Dec-2010 |
Steven Rostedt <srostedt@redhat.com> |
tracing: Add TRACE_EVENT_CONDITIONAL() There are instances in the kernel that we only want to trace a tracepoint when a certain condition is set. But we do not want to test for that condition in the core kernel. If we test for that condition before calling the tracepoin, then we will be performing that test even when tracing is not enabled. This is 99.99% of the time. We currently can just filter out on that condition, but that happens after we write to the trace buffer. We just wasted time writing to the ring buffer for an event we never cared about. This patch adds: TRACE_EVENT_CONDITION() and DEFINE_EVENT_CONDITION() These have a new TP_CONDITION() argument that comes right after the TP_ARGS(). This condition can use the parameters of TP_ARGS() in the TRACE_EVENT() to determine if the tracepoint should be traced or not. The TP_CONDITION() will be placed in a if (cond) trace; For example, for the tracepoint sched_wakeup, it is useless to trace a wakeup event where the caller never actually wakes anything up (where success == 0). So adding: TP_CONDITION(success), which uses the "success" parameter of the wakeup tracepoint will have it only trace when we have successfully woken up a task. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
1ed0c597 |
|
17-Nov-2010 |
Frederic Weisbecker <fweisbec@gmail.com> |
tracing: New macro to set up initial event flags value This introduces the new TRACE_EVENT_FLAGS() macro in order to set up initial event flags value. This macro must simply follow the definition of a trace event and take the event name and the flag value as parameters: TRACE_EVENT(my_event, ..... .... ); TRACE_EVENT_FLAGS(my_event, 1) This will set up 1 as the initial my_event->flags value. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Jason Baron <jbaron@redhat.com>
|
#
8f7b50c5 |
|
17-Sep-2010 |
Jason Baron <jbaron@redhat.com> |
jump label: Tracepoint support for jump labels Make use of the jump label infrastructure for tracepoints. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <a9ba2056e2c9cf332c3c300b577463ce66ff23a8.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
b70e4f05 |
|
21-Jun-2010 |
Wu Zhangjin <wuzhangjin@gmail.com> |
tracing: Fix undeclared ENOSYS in include/linux/tracepoint.h The header file include/linux/tracepoint.h may be included without include/linux/errno.h and then the compiler will fail on building for undelcared ENOSYS. This patch fixes this problem via including <linux/errno.h> to include/linux/tracepoint.h. Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com> LKML-Reference: <1277118549-622-1-git-send-email-wuzhangjin@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
38516ab5 |
|
20-Apr-2010 |
Steven Rostedt <srostedt@redhat.com> |
tracing: Let tracepoints have data passed to tracepoint callbacks This patch adds data to be passed to tracepoint callbacks. The created functions from DECLARE_TRACE() now need a mandatory data parameter. For example: DECLARE_TRACE(mytracepoint, int value, value) Will create the register function: int register_trace_mytracepoint((void(*)(void *data, int value))probe, void *data); As the first argument, all callbacks (probes) must take a (void *data) parameter. So a callback for the above tracepoint will look like: void myprobe(void *data, int value) { } The callback may choose to ignore the data parameter. This change allows callbacks to register a private data pointer along with the function probe. void mycallback(void *data, int value); register_trace_mytracepoint(mycallback, mydata); Then the mycallback() will receive the "mydata" as the first parameter before the args. A more detailed example: DECLARE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status)); /* In the C file */ DEFINE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status)); [...] trace_mytracepoint(status); /* In a file registering this tracepoint */ int my_callback(void *data, int status) { struct my_struct my_data = data; [...] } [...] my_data = kmalloc(sizeof(*my_data), GFP_KERNEL); init_my_data(my_data); register_trace_mytracepoint(my_callback, my_data); The same callback can also be registered to the same tracepoint as long as the data registered is different. Note, the data must also be used to unregister the callback: unregister_trace_mytracepoint(my_callback, my_data); Because of the data parameter, tracepoints declared this way can not have no args. That is: DECLARE_TRACE(mytracepoint, TP_PROTO(void), TP_ARGS()); will cause an error. If no arguments are needed, a new macro can be used instead: DECLARE_TRACE_NOARGS(mytracepoint); Since there are no arguments, the proto and args fields are left out. This is part of a series to make the tracepoint footprint smaller: text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4914025 1088868 861512 6864405 68be15 vmlinux.class 4918492 1084612 861512 6864616 68bee8 vmlinux.tracepoint Again, this patch also increases the size of the kernel, but lays the ground work for decreasing it. v5: Fixed net/core/drop_monitor.c to handle these updates. v4: Moved the DECLARE_TRACE() DECLARE_TRACE_NOARGS out of the #ifdef CONFIG_TRACE_POINTS, since the two are the same in both cases. The __DECLARE_TRACE() is what changes. Thanks to Frederic Weisbecker for pointing this out. v3: Made all register_* functions require data to be passed and all callbacks to take a void * parameter as its first argument. This makes the calling functions comply with C standards. Also added more comments to the modifications of DECLARE_TRACE(). v2: Made the DECLARE_TRACE() have the ability to pass arguments and added a new DECLARE_TRACE_NOARGS() for tracepoints that do not need any arguments. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
53da59aa |
|
29-Apr-2010 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
tracepoints: Add check trace callback type This check is meant to be used by tracepoint users which do a direct cast of callbacks to (void *) for direct registration, thus bypassing the register_trace_##name and unregister_trace_##name checks. This permits to ensure that the callback type matches the function type at the call site, but without generating any code. Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100430165959.GA25605@Krystal> CC: Ingo Molnar <mingo@elte.hu> CC: Andrew Morton <akpm@linux-foundation.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Peter Zijlstra <peterz@infradead.org> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Lai Jiangshan <laijs@cn.fujitsu.com> CC: Li Zefan <lizf@cn.fujitsu.com> CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
2e26ca71 |
|
05-May-2010 |
Steven Rostedt <srostedt@redhat.com> |
tracing: Fix tracepoint.h DECLARE_TRACE() to allow more than one header When more than one header is included under CREATE_TRACE_POINTS the DECLARE_TRACE() macro is not defined back to its original meaning and the second include will fail to initialize the TRACE_EVENT() and DECLARE_TRACE() correctly. To fix this the tracepoint.h file moves the define of DECLARE_TRACE() out of the #ifdef _LINUX_TRACEPOINT_H protection (just like the define of the TRACE_EVENT()). This way the define_trace.h will undef the DECLARE_TRACE() at the end and allow new headers to start from scratch. This patch also requires fixing the include/events/napi.h It currently uses DECLARE_TRACE() and should be converted to a TRACE_EVENT() format. But I'll leave that change to the authors of that file. But since the napi.h file depends on using the CREATE_TRACE_POINTS and does not define its own DEFINE_TRACE() it must use the define_trace.h method instead. Cc: Neil Horman <nhorman@tuxdriver.com> Cc: David S. Miller <davem@davemloft.net> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
7f5b7742 |
|
16-Mar-2010 |
Lai Jiangshan <laijs@cn.fujitsu.com> |
rcu: Fix tracepoints & lockdep false positive tracepoint.h uses rcu_dereference(), which triggers this warning: [ 0.701161] =================================================== [ 0.702211] [ INFO: suspicious rcu_dereference_check() usage. ] [ 0.702716] --------------------------------------------------- [ 0.703203] include/trace/events/workqueue.h:68 invoked rcu_dereference_check() without protection! [ 0.703971] [ 0.703990] other info that might help us debug this: [ 0.703993] [ 0.705590] [ 0.705604] rcu_scheduler_active = 1, debug_locks = 0 [ 0.706712] 1 lock held by swapper/1: [ 0.707229] #0: (cpu_add_remove_lock){+.+.+.}, at: [<c0142f54>] cpu_maps_update_begin+0x14/0x20 [ 0.710097] [ 0.710106] stack backtrace: [ 0.712602] Pid: 1, comm: swapper Not tainted 2.6.34-rc1-tip-01613-g72662bb #168 [ 0.713231] Call Trace: [ 0.713997] [<c017174d>] lockdep_rcu_dereference+0x9d/0xb0 [ 0.714746] [<c015a117>] create_workqueue_thread+0x107/0x110 [ 0.715353] [<c015aee0>] ? worker_thread+0x0/0x340 [ 0.715845] [<c015a8e8>] __create_workqueue_key+0x138/0x240 [ 0.716427] [<c0142f92>] ? cpu_maps_update_done+0x12/0x20 [ 0.717012] [<c086b12f>] init_workqueues+0x6f/0x80 [ 0.717530] [<c08576d2>] kernel_init+0x102/0x1f0 [ 0.717570] [<c08575d0>] ? kernel_init+0x0/0x1f0 [ 0.718944] [<c01030fa>] kernel_thread_helper+0x6/0x10 Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B9F48AD.4000404@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
091ad365 |
|
26-Nov-2009 |
Ingo Molnar <mingo@elte.hu> |
events: Rename TRACE_EVENT_TEMPLATE() to DECLARE_EVENT_CLASS() It is not quite obvious at first sight what TRACE_EVENT_TEMPLATE does: does it define an event as well beyond defining a template? To clarify this, rename it to DECLARE_EVENT_CLASS, which follows the various 'DECLARE_*()' idioms we already have in the kernel: DECLARE_EVENT_CLASS(class) DEFINE_EVENT(class, event1) DEFINE_EVENT(class, event2) DEFINE_EVENT(class, event3) To complete this logic we should also rename TRACE_EVENT() to: DEFINE_SINGLE_EVENT(single_event) ... but in a more quiet moment of the kernel cycle. Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
e5bc9721 |
|
18-Nov-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: Create new DEFINE_EVENT_PRINT After creating the TRACE_EVENT_TEMPLATE I started to look at other trace points to see what duplication was made. I noticed that there are several trace points where they are almost identical except for the name and the output format. Since TRACE_EVENT_TEMPLATE was successful in bringing down the size of trace events, I added a DEFINE_EVENT_PRINT. DEFINE_EVENT_PRINT is used just like DEFINE_EVENT is. That is, the DEFINE_EVENT_PRINT also uses a TRACE_EVENT_TEMPLATE, but it allows the developer to overwrite the print format. If there are two or more TRACE_EVENTS that are identical except for the name and print, then they can be converted to use a TRACE_EVENT_TEMPLATE. Since the TRACE_EVENT_TEMPLATE already does the print output, the first trace event would have its print format held in the TRACE_EVENT_TEMPLATE and be defined with a DEFINE_EVENT. The rest will use the DEFINE_EVENT_PRINT and override the print format. Converting the sched trace points to both DEFINE_EVENT and DEFINE_EVENT_PRINT. Five were converted to DEFINE_EVENT and two were converted to DEFINE_EVENT_PRINT. I was able to get the following: $ size kernel/sched.o-* text data bss dec hex filename 79299 6776 2520 88595 15a13 kernel/sched.o-notrace 101941 11896 2584 116421 1c6c5 kernel/sched.o-templ 104779 11896 2584 119259 1d1db kernel/sched.o-trace sched.o-notrace is the scheduler compiled with no trace points. sched.o-templ is with the use of DEFINE_EVENT and DEFINE_EVENT_PRINT sched.o-trace is the current trace events. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
ff038f5c |
|
18-Nov-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: Create new TRACE_EVENT_TEMPLATE There are some places in the kernel that define several tracepoints and they are all identical besides the name. The code to enable, disable and record is created for every trace point even if most of the code is identical. This patch adds TRACE_EVENT_TEMPLATE that lets the developer create a template TRACE_EVENT and create trace points with DEFINE_EVENT, which is based off of a given template. Each trace point used by this will share most of the code, and bring down the size of the kernel when there are several duplicate events. Usage is: TRACE_EVENT_TEMPLATE(name, proto, args, tstruct, assign, print); Which would be the same as defining a normal TRACE_EVENT. To create the trace events that the trace points will use: DEFINE_EVENT(template, name, proto, args) is done. The template is the name of the TRACE_EVENT_TEMPLATE to use. The name is the name of the trace point. The parameters proto and args must be the same as the proto and args of the template. If they are not the same, then a compile error will result. I tried hard removing this duplication but the C preprocessor is not powerful enough (or my CPP magic experience points is not at a high enough level) to not need them. A lot of trace events are coming in with new XFS development. Most of the trace points are identical except for the name. The following shows the advantage of having TRACE_EVENT_TEMPLATE: $ size fs/xfs/xfs.o.* text data bss dec hex filename 452114 2788 3520 458422 6feb6 fs/xfs/xfs.o.old 638482 38116 3744 680342 a6196 fs/xfs/xfs.o.template 996954 38116 4480 1039550 fdcbe fs/xfs/xfs.o.trace xfs.o.old is without any tracepoints. xfs.o.template uses the new TRACE_EVENT_TEMPLATE. xfs.o.trace uses the current TRACE_EVENT macros. Requested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
55dff495 |
|
23-Sep-2009 |
Randy Dunlap <randy.dunlap@oracle.com> |
docs: fix various Documentation/ paths in header files Fix various Documentation/ paths in include/linux/. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Reviewed-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
8cd09a59 |
|
22-Sep-2009 |
Li Hong <lihong.hi@gmail.com> |
tracing: Fix a comment and a trivial format issue in tracepoint.h Fix the tracepoint documentation path in tracepoints headers and a misaligned tabulation. Signed-off-by: Li Hong <lihong.hi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> LKML-Reference: <3a3680030909220300h7cf18849q4d4702b9d4feaa67@mail.gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
#
7cb2e3ee |
|
25-Aug-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: add comments to explain TRACE_EVENT out of protection The commit: commit 5ac35daa9343936038a3c9c4f4d6d3fe6a2a7bd8 Author: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> tracing/events: fix the include file dependencies Moved the TRACE_EVENT out of the ifdef protection of tracepoints.h but uses the define of TRACE_EVENT itself as protection. This patch adds comments to explain why. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
5ac35daa |
|
25-Aug-2009 |
Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> |
tracing/events: fix the include file dependencies The TRACE_EVENT depends on the include/linux/tracepoint.h first and include/trace/ftrace.h later, if we include the ftrace.h early, a building error will occur. Both define TRACE_EVENT in trace_a.h and trace_b.h, if we include those in .c file, like this: #define CREATE_TRACE_POINTS include <trace/events/trace_a.h> include <trace/events/trace_b.h> The above will not work, because the TRACE_EVENT was re-defined by the previous .h file. Reported-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> LKML-Reference: <4A937F5E.3020802@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
97419875 |
|
24-Aug-2009 |
Josh Stone <jistone@redhat.com> |
tracing: Move tracepoint callbacks from declaration to definition It's not strictly correct for the tracepoint reg/unreg callbacks to occur when a client is hooking up, because the actual tracepoint may not be present yet. This happens to be fine for syscall, since that's in the core kernel, but it would cause problems for tracepoints defined in a module that hasn't been loaded yet. It also means the reg/unreg has to be EXPORTed for any modules to use the tracepoint (as in SystemTap). This patch removes DECLARE_TRACE_WITH_CALLBACK, and instead introduces DEFINE_TRACE_FN which stores the callbacks in struct tracepoint. The callbacks are used now when the active state of the tracepoint changes in set_tracepoint & disable_tracepoint. This also introduces TRACE_EVENT_FN, so ftrace events can also provide registration callbacks if needed. Signed-off-by: Josh Stone <jistone@redhat.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> LKML-Reference: <1251150194-1713-4-git-send-email-jistone@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
#
63fbdab3 |
|
10-Aug-2009 |
Jason Baron <jbaron@redhat.com> |
tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro Introduce a new 'DECLARE_TRACE_WITH_CALLBACK()' macro, so that tracepoints can associate an external register/unregister function. This prepares for the syscalls tracer conversion to trace events. We will need to perform arch level operations once a syscall event is turned on/off, such as TIF flags setting, hence the need of such specific callbacks. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
#
156f5a78 |
|
02-Jun-2009 |
GeunSik Lim <leemgs1@gmail.com> |
debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. Many developers use "/debug/" or "/debugfs/" or "/sys/kernel/debug/" directory name to mount debugfs filesystem for ftrace according to ./Documentation/tracers/ftrace.txt file. And, three directory names(ex:/debug/, /debugfs/, /sys/kernel/debug/) is existed in kernel source like ftrace, DRM, Wireless, Documentation, Network[sky2]files to mount debugfs filesystem. debugfs means debug filesystem for debugging easy to use by greg kroah hartman. "/sys/kernel/debug/" name is suitable as directory name of debugfs filesystem. - debugfs related reference: http://lwn.net/Articles/334546/ Fix inconsistency of directory name to mount debugfs filesystem. * From Steven Rostedt - find_debugfs() and tracing_files() in this patch. Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com> Acked-by : Inaky Perez-Gonzalez <inaky@linux.intel.com> Reviewed-by : Steven Rostedt <rostedt@goodmis.org> Reviewed-by : James Smart <james.smart@emulex.com> CC: Jiri Kosina <trivial@kernel.org> CC: David Airlie <airlied@linux.ie> CC: Peter Osterlund <petero2@telia.com> CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com> CC: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> CC: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
#
b8e65554 |
|
24-Apr-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: remove deprecated TRACE_FORMAT The TRACE_FORMAT macro has been deprecated by the TRACE_EVENT macro. There are no more users. All new users must use the TRACE_EVENT macro. [ Impact: remove old functionality ] Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
ea20d929 |
|
10-Apr-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: consolidate trace and trace_event headers Impact: clean up Neil Horman (et. al.) criticized the way the trace events were broken up into two files. The reason for that was that ftrace needed to separate out the declarations from where the #include <linux/tracepoint.h> was used. It then dawned on me that the tracepoint.h header only needs to define the TRACE_EVENT macro if it is not already defined. The solution is simply to test if TRACE_EVENT is defined, and if it is not then the linux/tracepoint.h header can define it. This change consolidates all the <traces>.h and <traces>_event_types.h into the <traces>.h file. Reported-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Theodore Tso <tytso@mit.edu> Reported-by: Jiaying Zhang <jiayingz@google.com> Cc: Zhaolei <zhaolei@cn.fujitsu.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
#
ef18012b |
|
10-Mar-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: remove funky whitespace in the trace code Impact: clean up There existed a lot of <space><tab>'s in the tracing code. This patch removes them. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
823f9124 |
|
09-Mar-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: document TRACE_EVENT macro in tracepoint.h Impact: clean up / comments Kosaki Motohiro asked about an explanation to the TRACE_EVENT macro. Ingo Molnar replied with a nice description. This patch takes the description that Ingo wrote (with some slight modifications) and adds it to the tracepoint.h file. Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
30a8fecc |
|
09-Mar-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: flip the TP_printk and TP_fast_assign in the TRACE_EVENT macro Impact: clean up In trying to stay consistant with the C style format in the TRACE_EVENT macro, it makes more sense to do the printk after the assigning of the variables. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
157587d7 |
|
09-Mar-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: remove obsolete TRACE_EVENT_FORMAT macro Impact: clean up The TRACE_EVENT_FORMAT macro is no longer used by trace points and only the DECLARE_TRACE, TRACE_FORMAT or TRACE_EVENT macros should be used by them. Although the TRACE_EVENT_FORMAT macro is still used by the internal tracing utility, it should not be used in core kernel code. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
da4d0302 |
|
09-Mar-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: new format for specialized trace points Impact: clean up and enhancement The TRACE_EVENT_FORMAT macro looks quite ugly and is limited in its ability to save data as well as to print the record out. Working with Ingo Molnar, we came up with a new format that is much more pleasing to the eye of C developers. This new macro is more C style than the old macro, and is more obvious to what it does. Here's the example. The only updated macro in this patch is the sched_switch trace point. The old method looked like this: TRACE_EVENT_FORMAT(sched_switch, TP_PROTO(struct rq *rq, struct task_struct *prev, struct task_struct *next), TP_ARGS(rq, prev, next), TP_FMT("task %s:%d ==> %s:%d", prev->comm, prev->pid, next->comm, next->pid), TRACE_STRUCT( TRACE_FIELD(pid_t, prev_pid, prev->pid) TRACE_FIELD(int, prev_prio, prev->prio) TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN], next_comm, TP_CMD(memcpy(TRACE_ENTRY->next_comm, next->comm, TASK_COMM_LEN))) TRACE_FIELD(pid_t, next_pid, next->pid) TRACE_FIELD(int, next_prio, next->prio) ), TP_RAW_FMT("prev %d:%d ==> next %s:%d:%d") ); The above method is hard to read and requires two format fields. The new method: /* * Tracepoint for task switches, performed by the scheduler: * * (NOTE: the 'rq' argument is not used by generic trace events, * but used by the latency tracer plugin. ) */ TRACE_EVENT(sched_switch, TP_PROTO(struct rq *rq, struct task_struct *prev, struct task_struct *next), TP_ARGS(rq, prev, next), TP_STRUCT__entry( __array( char, prev_comm, TASK_COMM_LEN ) __field( pid_t, prev_pid ) __field( int, prev_prio ) __array( char, next_comm, TASK_COMM_LEN ) __field( pid_t, next_pid ) __field( int, next_prio ) ), TP_printk("task %s:%d [%d] ==> %s:%d [%d]", __entry->prev_comm, __entry->prev_pid, __entry->prev_prio, __entry->next_comm, __entry->next_pid, __entry->next_prio), TP_fast_assign( memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN); __entry->prev_pid = prev->pid; __entry->prev_prio = prev->prio; memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN); __entry->next_pid = next->pid; __entry->next_prio = next->prio; ) ); This macro is called TRACE_EVENT, it is broken up into 5 parts: TP_PROTO: the proto type of the trace point TP_ARGS: the arguments of the trace point TP_STRUCT_entry: the structure layout of the entry in the ring buffer TP_printk: the printk format TP_fast_assign: the method used to write the entry into the ring buffer The structure is the definition of how the event will be saved in the ring buffer. The printk is used by the internal tracing in case of an oops, and the kernel needs to print out the format of the record to the console. This the TP_printk gives a means to show the records in a human readable format. It is also used to print out the data from the trace file. The TP_fast_assign is executed directly. It is basically like a C function, where the __entry is the handle to the record. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
2939b046 |
|
09-Mar-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: replace TP<var> with TP_<var> Impact: clean up The macros TPPROTO, TPARGS, TPFMT, TPRAWFMT, and TPCMD all look a bit ugly. This patch adds an underscore to their names. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
62992804 |
|
28-Feb-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: create the C style tracing for the sched subsystem This patch utilizes the TRACE_EVENT_FORMAT macro to enable the C style faster tracing for the sched subsystem trace points. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
3cdfdf91 |
|
25-Feb-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: wrap arguments with PARAMS Peter Zijlstra warned that TPPROTO and TPARGS might become something other than a simple copy of itself. To prevent this from having side effects in the TRACE_FORMAT macro in tracepoint.h, we add a PARAMS() macro to be defined as just a wrapper. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
eef62a68 |
|
25-Feb-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: rename DEFINE_TRACE_FMT to just TRACE_FORMAT There's been a bit confusion to whether DEFINE/DECLARE_TRACE_FMT should be a DEFINE or a DECLARE. Ingo Molnar suggested simply calling it TRACE_FORMAT. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
7c37730c |
|
23-Feb-2009 |
Steven Rostedt <srostedt@redhat.com> |
tracing: add DEFINE_TRACE_FMT to tracepoint.h This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE. This allows for the developers to place format strings and args in with their tracepoint declaration. A tracer may now override the DEFINE_TRACE_FMT macro and use it to record a default format. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
|
#
7e066fb8 |
|
14-Nov-2008 |
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> |
tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() Impact: API *CHANGE*. Must update all tracepoint users. Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint structure in a single spot for all the kernel. It helps reducing memory consumption, especially when declaring a lot of tracepoints, e.g. for kmalloc tracing. *API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for tracepoint declarations rather than DEFINE_TRACE(). This is the sane way to do it. The name previously used was misleading. Updates scheduler instrumentation to follow this API change. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
5f382671 |
|
14-Nov-2008 |
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> |
tracepoints: do not put arguments in name Impact: cleanup That's overkill, takes space. We have a global tracepoint registery in header files anyway. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
c420970e |
|
14-Nov-2008 |
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> |
tracepoints: use unregister return value Impact: bugfix. Unregistering a tracepoint can fail. Return the error value. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
da7b3eab |
|
14-Nov-2008 |
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> |
tracepoints: use rcu_*_sched_notrace Make sure tracepoints can be called within ftrace callbacks. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
127cafbb |
|
27-Oct-2008 |
Lai Jiangshan <laijs@cn.fujitsu.com> |
tracepoint: introduce *_noupdate APIs. Impact: add new tracepoint APIs to allow the batched registration of probes new APIs separate tracepoint_probe_register(), tracepoint_probe_unregister() into 2 steps. The first step of them is just update tracepoint_entry, not connect or disconnect. this patch introduces tracepoint_probe_update_all() for update all. these APIs are very useful for registering lots of probes but just updating once. Another very important thing is that *_noupdate APIs do not require module_mutex. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
231375cc |
|
03-Oct-2008 |
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> |
tracepoints: synchronize unregister static inline Turn tracepoint synchronize unregister into a static inline. There is no reason to keep it as a macro over a static inline. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f2461fc8 |
|
06-Oct-2008 |
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> |
tracepoints: tracepoint_synchronize_unregister() Create tracepoint_synchronize_unregister() which must be called before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
97e1c18e |
|
17-Jul-2008 |
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> |
tracing: Kernel Tracepoints Implementation of kernel tracepoints. Inspired from the Linux Kernel Markers. Allows complete typing verification by declaring both tracing statement inline functions and probe registration/unregistration static inline functions within the same macro "DEFINE_TRACE". No format string is required. See the tracepoint Documentation and Samples patches for usage examples. Taken from the documentation patch : "A tracepoint placed in code provides a hook to call a function (probe) that you can provide at runtime. A tracepoint can be "on" (a probe is connected to it) or "off" (no probe is attached). When a tracepoint is "off" it has no effect, except for adding a tiny time penalty (checking a condition for a branch) and space penalty (adding a few bytes for the function call at the end of the instrumented function and adds a data structure in a separate section). When a tracepoint is "on", the function you provide is called each time the tracepoint is executed, in the execution context of the caller. When the function provided ends its execution, it returns to the caller (continuing from the tracepoint site). You can put tracepoints at important locations in the code. They are lightweight hooks that can pass an arbitrary number of parameters, which prototypes are described in a tracepoint declaration placed in a header file." Addition and removal of tracepoints is synchronized by RCU using the scheduler (and preempt_disable) as guarantees to find a quiescent state (this is really RCU "classic"). The update side uses rcu_barrier_sched() with call_rcu_sched() and the read/execute side uses "preempt_disable()/preempt_enable()". We make sure the previous array containing probes, which has been scheduled for deletion by the rcu callback, is indeed freed before we proceed to the next update. It therefore limits the rate of modification of a single tracepoint to one update per RCU period. The objective here is to permit fast batch add/removal of probes on _different_ tracepoints. Changelog : - Use #name ":" #proto as string to identify the tracepoint in the tracepoint table. This will make sure not type mismatch happens due to connexion of a probe with the wrong type to a tracepoint declared with the same name in a different header. - Add tracepoint_entry_free_old. - Change __TO_TRACE to get rid of the 'i' iterator. Masami Hiramatsu <mhiramat@redhat.com> : Tested on x86-64. Performance impact of a tracepoint : same as markers, except that it adds about 70 bytes of instructions in an unlikely branch of each instrumented function (the for loop, the stack setup and the function call). It currently adds a memory read, a test and a conditional branch at the instrumentation site (in the hot path). Immediate values will eventually change this into a load immediate, test and branch, which removes the memory read which will make the i-cache impact smaller (changing the memory read for a load immediate removes 3-4 bytes per site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it also saves the d-cache hit). About the performance impact of tracepoints (which is comparable to markers), even without immediate values optimizations, tests done by Hideo Aoki on ia64 show no regression. His test case was using hackbench on a kernel where scheduler instrumentation (about 5 events in code scheduler code) was added. Quoting Hideo Aoki about Markers : I evaluated overhead of kernel marker using linux-2.6-sched-fixes git tree, which includes several markers for LTTng, using an ia64 server. While the immediate trace mark feature isn't implemented on ia64, there is no major performance regression. So, I think that we don't have any issues to propose merging marker point patches into Linus's tree from the viewpoint of performance impact. I prepared two kernels to evaluate. The first one was compiled without CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS. I downloaded the original hackbench from the following URL: http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c I ran hackbench 5 times in each condition and calculated the average and difference between the kernels. The parameter of hackbench: every 50 from 50 to 800 The number of CPUs of the server: 2, 4, and 8 Below is the results. As you can see, major performance regression wasn't found in any case. Even if number of processes increases, differences between marker-enabled kernel and marker- disabled kernel doesn't increase. Moreover, if number of CPUs increases, the differences doesn't increase either. Curiously, marker-enabled kernel is better than marker-disabled kernel in more than half cases, although I guess it comes from the difference of memory access pattern. * 2 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 4.811 | 4.872 | +0.061 | +1.27 | 100 | 9.854 | 10.309 | +0.454 | +4.61 | 150 | 15.602 | 15.040 | -0.562 | -3.6 | 200 | 20.489 | 20.380 | -0.109 | -0.53 | 250 | 25.798 | 25.652 | -0.146 | -0.56 | 300 | 31.260 | 30.797 | -0.463 | -1.48 | 350 | 36.121 | 35.770 | -0.351 | -0.97 | 400 | 42.288 | 42.102 | -0.186 | -0.44 | 450 | 47.778 | 47.253 | -0.526 | -1.1 | 500 | 51.953 | 52.278 | +0.325 | +0.63 | 550 | 58.401 | 57.700 | -0.701 | -1.2 | 600 | 63.334 | 63.222 | -0.112 | -0.18 | 650 | 68.816 | 68.511 | -0.306 | -0.44 | 700 | 74.667 | 74.088 | -0.579 | -0.78 | 750 | 78.612 | 79.582 | +0.970 | +1.23 | 800 | 85.431 | 85.263 | -0.168 | -0.2 | -------------------------------------------------------------- * 4 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.586 | 2.584 | -0.003 | -0.1 | 100 | 5.254 | 5.283 | +0.030 | +0.56 | 150 | 8.012 | 8.074 | +0.061 | +0.76 | 200 | 11.172 | 11.000 | -0.172 | -1.54 | 250 | 13.917 | 14.036 | +0.119 | +0.86 | 300 | 16.905 | 16.543 | -0.362 | -2.14 | 350 | 19.901 | 20.036 | +0.135 | +0.68 | 400 | 22.908 | 23.094 | +0.186 | +0.81 | 450 | 26.273 | 26.101 | -0.172 | -0.66 | 500 | 29.554 | 29.092 | -0.461 | -1.56 | 550 | 32.377 | 32.274 | -0.103 | -0.32 | 600 | 35.855 | 35.322 | -0.533 | -1.49 | 650 | 39.192 | 38.388 | -0.804 | -2.05 | 700 | 41.744 | 41.719 | -0.025 | -0.06 | 750 | 45.016 | 44.496 | -0.520 | -1.16 | 800 | 48.212 | 47.603 | -0.609 | -1.26 | -------------------------------------------------------------- * 8 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.094 | 2.072 | -0.022 | -1.07 | 100 | 4.162 | 4.273 | +0.111 | +2.66 | 150 | 6.485 | 6.540 | +0.055 | +0.84 | 200 | 8.556 | 8.478 | -0.078 | -0.91 | 250 | 10.458 | 10.258 | -0.200 | -1.91 | 300 | 12.425 | 12.750 | +0.325 | +2.62 | 350 | 14.807 | 14.839 | +0.032 | +0.22 | 400 | 16.801 | 16.959 | +0.158 | +0.94 | 450 | 19.478 | 19.009 | -0.470 | -2.41 | 500 | 21.296 | 21.504 | +0.208 | +0.98 | 550 | 23.842 | 23.979 | +0.137 | +0.57 | 600 | 26.309 | 26.111 | -0.198 | -0.75 | 650 | 28.705 | 28.446 | -0.259 | -0.9 | 700 | 31.233 | 31.394 | +0.161 | +0.52 | 750 | 34.064 | 33.720 | -0.344 | -1.01 | 800 | 36.320 | 36.114 | -0.206 | -0.57 | -------------------------------------------------------------- Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: 'Peter Zijlstra' <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|