History log of /linux-master/tools/perf/builtin-report.c
Revision Date Author Comments
# 2f1e20fe 29-Feb-2024 Ian Rogers <irogers@google.com>

perf report: Sort child tasks by tid

Commit 91e467bc568f ("perf machine: Use hashtable for machine
threads") made the iteration of thread tids unordered. The perf report
--tasks output now shows child threads in an order determined by the
hashing. For example, in this snippet tid 3 appears after tid 256 even
though they have the same ppid 2:

```
$ perf report --tasks
% pid tid ppid comm
0 0 -1 |swapper
2 2 0 | kthreadd
256 256 2 | kworker/12:1H-k
693761 693761 2 | kworker/10:1-mm
1301762 1301762 2 | kworker/1:1-mm_
1302530 1302530 2 | kworker/u32:0-k
3 3 2 | rcu_gp
...
```

The output is easier to read if threads appear numerically
increasing. To allow for this, read all threads into a list then sort
with a comparator that orders by the child task's of the first common
parent. The list creation and deletion are created as utilities on
machine. The indentation is possible by counting the number of
parents a child has.

With this change the output for the same data file is now like:
```
$ perf report --tasks
% pid tid ppid comm
0 0 -1 |swapper
1 1 0 | systemd
823 823 1 | systemd-journal
853 853 1 | systemd-udevd
3230 3230 1 | systemd-timesyn
3236 3236 1 | auditd
3239 3239 3236 | audisp-syslog
3321 3321 1 | accounts-daemon
...
```

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-2-irogers@google.com


# 0bdfbd04 08-Feb-2024 Adrian Hunter <adrian.hunter@intel.com>

perf tools: Make it possible to see perf's kernel and module memory mappings

Dump kmaps if using 'perf --debug kmaps' or verbose > 2 (e.g. -vvv) for
tools 'perf script' and 'perf report' if there is no browser.

Example:

$ perf --debug kmaps script 2>&1 >/dev/null | grep kvm.intel
build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20]
Map: 0-3a3 4f5d8 [kvm_intel].modinfo
Map: 0-5240 5f280 [kvm_intel]__versions
Map: 0-30 64 [kvm_intel].note.Linux
Map: 0-14 644c0 [kvm_intel].orc_header
Map: 0-5297 43680 [kvm_intel].rodata
Map: 0-5bee 3b837 [kvm_intel].text.unlikely
Map: 0-7e0 41430 [kvm_intel].noinstr.text
Map: 0-2080 713c0 [kvm_intel].bss
Map: 0-26 705c8 [kvm_intel].data..read_mostly
Map: 0-5888 6a4c0 [kvm_intel].data
Map: 0-22 70220 [kvm_intel].data.once
Map: 0-40 705f0 [kvm_intel].data..percpu
Map: 0-1685 41d20 [kvm_intel].init.text
Map: 0-4b8 6fd60 [kvm_intel].init.data
Map: 0-380 70248 [kvm_intel]__dyndbg
Map: 0-8 70218 [kvm_intel].exit.data
Map: 0-438 4f980 [kvm_intel]__param
Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1
Map: 0-3657 493b8 [kvm_intel].rodata.str1.8
Map: 0-e0 70640 [kvm_intel].data..ro_after_init
Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module
Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko

The example above shows how the module section mappings are all wrong
except for the main .text mapping at 0xffffffffc13a7000.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240208085326.13432-2-adrian.hunter@intel.com


# 7727d59d 24-Jan-2024 Namhyung Kim <namhyung@kernel.org>

perf tools: Add -H short option for --hierarchy

I found the hierarchy mode useful, but it's easy to make a typo when
using it. Let's add a short option for that.

Also update the documentation. :)

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20240125055124.1579617-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# 81e57dee 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf report: Support data type profiling

Enable type annotation when the 'type' sort key is used.

It shows type of variables the samples access at the moment. Users can
see which types are accessed frequently.

$ perf report -s dso,type --stdio
...
# Overhead Shared Object Data Type
# ........ ................. .........
#
35.47% [kernel.kallsyms] (unknown)
1.62% [kernel.kallsyms] struct sched_entry
1.23% [kernel.kallsyms] struct cfs_rq
0.83% [kernel.kallsyms] struct task_struct
0.34% [kernel.kallsyms] struct list_head
0.30% [kernel.kallsyms] struct mem_cgroup
...

Committer testing:

With the perf.data file collected in the previous cset:

# perf report --stdio -s type
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 4 of event 'cpu_atom/mem-loads,ldlat=30/P'
# Event count (approx.): 7
#
# Overhead Data Type
# ........ .........
#
42.86% struct list_head
42.86% (unknown)
14.29% char

#
# (Tip: To record callchains for each sample: perf record -g)
#
# perf report --stdio -s dso,type
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 4 of event 'cpu_atom/mem-loads,ldlat=30/P'
# Event count (approx.): 7
#
# Overhead Shared Object Data Type
# ........ .................... .........
#
42.86% [kernel.kallsyms] struct list_head
28.57% libc.so.6 (unknown)
14.29% [kernel.kallsyms] char
14.29% ld-linux-x86-64.so.2 (unknown)

#
# (Tip: Save output of perf stat using: perf stat record <target workload>)
#
#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 431be14b 06-Dec-2023 Ian Rogers <irogers@google.com>

perf report: Use function to add missing maps lock

Switch maps__fprintf_task from loop macro maps__for_each_entry to
maps__for_each_map function that takes a callback. The function holds
the maps lock, which should be held during iteration.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231207011722.1220634-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7f929aea 28-Nov-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate: Ensure init/exit for global options

Now it only cares about the global options so it can just handle it
without the argument.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231128175441.721579-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 22197fb2 28-Nov-2023 Namhyung Kim <namhyung@kernel.org>

perf ui/browser/annotate: Use global annotation_options

Now it can use the global options and no need save local browser
options separately.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231128175441.721579-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 41fd3cac 28-Nov-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate: Use global annotation_options

Now it can directly use the global options and no need to pass it as an
argument.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231128175441.721579-5-namhyung@kernel.org
[ Fixup build with GTK2=1 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 14953f03 28-Nov-2023 Namhyung Kim <namhyung@kernel.org>

perf report: Convert to the global annotation_options

Use the global option and drop the local copy.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231128175441.721579-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9ffa6c75 02-Nov-2023 Ian Rogers <irogers@google.com>

perf machine thread: Remove exited threads by default

'struct thread' values hold onto references to mmaps, DSOs, etc. When a
thread exits it is necessary to clean all of this memory up by removing
the thread from the machine's threads. Some tools require this doesn't
happen, such as auxtrace events, 'perf report' if offcpu events exist or
if a task list is being generated, so add a 'struct symbol_conf' member
to make the behavior optional. When an exited thread is left in the
machine's threads, mark it as exited.

This change relates to commit 40826c45eb0b8856 ("perf thread: Remove
notion of dead threads") . Dead threads were removed as they had a
reference count of 0 and were difficult to reason about with the
reference count checker. Here a thread is removed from threads when it
exits, unless via symbol_conf the exited thread isn't remove and is
marked as exited. Reference counting behaves as it normally does.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231102175735.2272696-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0e0f03d7 27-Oct-2023 Colin Ian King <colin.i.king@gmail.com>

perf report: Fix spelling mistake "heirachy" -> "hierarchy"

There is a spelling mistake in a ui error message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20231027084011.1167091-1-colin.i.king@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# a6e4a4a1 24-Oct-2023 Namhyung Kim <namhyung@kernel.org>

perf report: Fix hierarchy mode on pipe input

The hierarchy mode needs to setup output formats for each evsel.
Normally setup_sorting() handles this at the beginning, but it cannot
do that if data comes from a pipe since there's no evsel info before
reading the data. And then perf report cannot process the samples
in hierarchy mode and think as if there's no sample.

Let's check the condition and setup the output formats after reading
data so that it can find evsels.

Before:

$ ./perf record -o- true | ./perf report -i- --hierarchy -q
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
Error:
The - data has no samples!

After:

$ ./perf record -o- true | ./perf report -i- --hierarchy -q
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
94.76% true
94.76% [kernel.kallsyms]
94.76% [k] filemap_fault
5.24% perf-ex
5.24% [kernel.kallsyms]
5.06% [k] __memset
0.18% [k] native_write_msr

Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20231025003121.2811738-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# d82257d7 22-Jun-2023 Ian Rogers <irogers@google.com>

perf symbol: Remove now unused symbol_conf.sort_by_name

Previously used to specify symbol_name_rb_node was in use.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Jason Wang <wangborong@cdjrlc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20230623054520.4118442-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# 2c9f7bd7 08-Jun-2023 Ian Rogers <irogers@google.com>

perf report: Avoid 'parent_thread' thread leak on '--tasks' processing

Caught with address sanitizer and reference count checking.

Committer notes:

The command leading to this leak:

# perf record -a sleep 2
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.516 MB perf.data (6422 samples) ]
# perf report --tasks
# pid tid ppid comm
0 0 -1 |swapper
1 1 0 | systemd
1474 1474 1 | systemd
2816 2816 1474 | gjs
2816 2825 2816 | gmain
2816 2831 2816 | gdbus
2816 2861 2816 | JS Helper
2816 2862 2816 | JS Helper
2816 2863 2816 | JS Helper
2816 2864 2816 | JS Helper
2816 2865 2816 | JS Helper
2816 2866 2816 | JS Helper
2816 2867 2816 | JS Helper
2816 2868 2816 | JS Helper
3072 3072 1474 | gsd-printer
3072 3082 3072 | gmain
3072 3083 3072 | gdbus
2600 2600 1474 | gnome-shell
15621 15621 2600 | firefox
15771 15771 15621 | WebExtensions
15771 15872 15771 | TaskCon~ller #6
15771 15873 15771 | TaskCon~ller #7
15771 15778 15771 | IPC I/O Child
15771 15779 15771 | Socket Thread
15771 15780 15771 | HTML5 Parser
15771 15781 15771 | JS Watchdog
# <SNIP>

When it is going to exit a thread__put(parent_thread) was missed, add it
to have ASAN clean.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0dd5041c 08-Jun-2023 Ian Rogers <irogers@google.com>

perf addr_location: Add init/exit/copy functions

struct addr_location holds references to multiple reference counted
objects. Add init/exit functions to make maintenance of those more
consistent with the rest of the code and to try to avoid
leaks. Modification of thread reference counts isn't included in this
change.

Committer notes:

I needed to initialize result to sample->ip to make sure is set to
something, fixing a compile time error, mostly keeping the previous
logic as build_alloc_func_list() already does debugging/error prints
about what went wrong if it takes the 'goto out'.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ee84a303 08-Jun-2023 Ian Rogers <irogers@google.com>

perf thread: Add accessor functions for thread

Using accessors will make it easier to add reference count checking in
later patches.

Committer notes:

thread->nsinfo wasn't wrapped as it is used together with
nsinfo__zput(), where does a trick to set the field with a refcount
being dropped to NULL, and that doesn't work well with using
thread__nsinfo(thread), that loses the &thread->nsinfo pointer.

When refcount checking is added to 'struct thread', later in this
series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to
check the thread pointer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7ee227f6 08-Jun-2023 Ian Rogers <irogers@google.com>

perf thread: Make threads rbtree non-invasive

Separate the rbtree out of thread and into a new struct
thread_rb_node. The refcnt is in thread and the rbtree is responsible
for a single count.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2a6e5e8a 04-Apr-2023 Ian Rogers <irogers@google.com>

perf map: Add accessors for ->pgoff and ->reloc

Later changes will add reference count checking for 'struct map'. Add
accessors so that the reference count check is only necessary in one
place.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ddee3f2b 04-Apr-2023 Ian Rogers <irogers@google.com>

perf map: Add accessors for ->prot, ->priv and ->flags

Later changes will add reference count checking for 'struct map'. Add an
accessor so that the reference count check is only necessary in one
place.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e5116f46 20-Mar-2023 Ian Rogers <irogers@google.com>

perf map: Add accessor for start and end

Later changes will add reference count checking for struct map, start
and end are frequently accessed variables. Add an accessor so that the
reference count check is only necessary in one place.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 63df0e4b 20-Mar-2023 Ian Rogers <irogers@google.com>

perf map: Add accessor for dso

Later changes will add reference count checking for struct map, with
dso being the most frequently accessed variable. Add an accessor so
that the reference count check is only necessary in one place.

Additional changes:
- add a dso variable to avoid repeated map__dso calls.
- in builtin-mem.c dump_raw_samples, code only partially tested for
dso == NULL. Make the possibility of NULL consistent.
- in thread.c thread__memcpy fix use of spaces and use tabs.

Committer notes:

Did missing conversions on these files:

tools/perf/arch/powerpc/util/skip-callchain-idx.c
tools/perf/arch/powerpc/util/sym-handling.c
tools/perf/ui/browsers/hists.c
tools/perf/ui/gtk/annotate.c
tools/perf/util/cs-etm.c
tools/perf/util/thread.c
tools/perf/util/unwind-libunwind-local.c
tools/perf/util/unwind-libunwind.c

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ff583dc4 20-Mar-2023 Ian Rogers <irogers@google.com>

perf maps: Remove rb_node from struct map

struct map is reference counted, having it also be a node in an
red-black tree complicates the reference counting. Switch to having a
map_rb_node which is a red-block tree node but points at the reference
counted struct map. This reference is responsible for a single reference
count.

Committer notes:

Fixed up tools/perf/util/unwind-libunwind-local.c to use map_rb_node as
well.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 57594454 28-Mar-2023 Ian Rogers <irogers@google.com>

perf symbol: Add command line support for addr2line path

Allow addr2line to be set either on the command line or via the
perfconfig file. This doesn't currently work with llvm-addr2line as
the addr2line code emits two things:
1) the address to decode,
2) a bogus ',' value.
The expectation is the bogus value will generate:
??
??:0
that terminates the addr2line reading. However, the output from
llvm-addr2line is a single line with just the input ',' locking up the
addr2line reading that is expecting a second line.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 56d9117c 28-Mar-2023 Ian Rogers <irogers@google.com>

perf annotate: Own objdump_path and disassembler_style strings

Make struct annotation_options own the strings objdump_path and
disassembler_style, freeing them on exit. Add missing strdup for
disassembler_style when read from a config file.

Committer notes:

Converted free(obj->member) to zfree(&obj->member) in
annotation_options__exit()

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 217b7d41 28-Mar-2023 Ian Rogers <irogers@google.com>

perf annotate: Add init/exit to annotation_options remove default

The annotation__default_options global variable was used to initialize
annotation_options. Switch to the init/exit pattern as later changes
will give ownership over strings and this will be necessary to avoid
memory leaks.

Committer note:

Fix the GTK2=1 build, hist_entry__gtk_annotate() needs to receive a
'struct annotation_options' pointer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8f08c363 28-Mar-2023 Ian Rogers <irogers@google.com>

perf report: Additional config warnings

If the default_sort_order isn't correctly strdup-ed warn and return an
error. Debug warn if no option is matched.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9d2dc632 11-Mar-2023 Ian Rogers <irogers@google.com>

perf evlist: Remove nr_groups

Maintaining the number of groups during event parsing is problematic
and since changing to sort/regroup events can only be computed by a
linear pass over the evlist. As the value is generally only used in
tests, rather than hold it in a variable compute it by passing over
the evlist when necessary.

This change highlights that libpfm's counting of groups with a single
entry disagreed with regular event parsing. The libpfm tests are
updated accordingly.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 378ef0f5 05-Dec-2022 Ian Rogers <irogers@google.com>

perf build: Use libtraceevent from the system

Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.

If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.

This also disables CONFIG_TRACE that controls "perf trace".

CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.

Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed. The
majority of commands continue to work including "perf test".

Committer notes:

Fixed up a tools/perf/util/Build reject and added:

#include <traceevent/event-parse.h>

to tools/perf/util/scripting-engines/trace-event-perl.c.

Committer testing:

$ rpm -qi libtraceevent-devel
Name : libtraceevent-devel
Version : 1.5.3
Release : 2.fc36
Architecture: x86_64
Install Date: Mon 25 Jul 2022 03:20:19 PM -03
Group : Unspecified
Size : 27728
License : LGPLv2+ and GPLv2+
Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm
Build Date : Fri 15 Apr 2022 10:57:01 AM -03
Build Host : buildvm-x86-05.iad2.fedoraproject.org
Packager : Fedora Project
Vendor : Fedora Project
URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
Bug URL : https://bugz.fedoraproject.org/libtraceevent
Summary : Development headers of libtraceevent
Description :
Development headers of libtraceevent-libs
$

Default build:

$ ldd ~/bin/perf | grep tracee
libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
$

# perf trace -e sched:* --max-events 10
0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
#

Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.

Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:

- Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y

- perf-$(CONFIG_LIBTRACEEVENT) += scripts/

- bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y

- The python binding needed some fixups and util/trace-event.c can't be
built and linked with the python binding shared object, so remove it
in tools/perf/util/setup.py and exclude it from the list of
dependencies in the python/perf.so Makefile.perf target.

Building without libtraceevent-devel installed uncovered more build
failures:

- The python binding tools/perf/util/python.c was assuming that
traceevent/parse-events.h was always available, which was the case
when we defaulted to using the in-kernel tools/lib/traceevent/ files,
now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
the other parts of it that deal with tracepoints.

- We have to ifdef the rules in the Build files with
CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
detect libtraceevent-devel installed in the system. Simplification here
to avoid these two ways of disabling builtin-trace.c and not having
CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean
way.

From Athira:

<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>

Then, ditto for arm64 and s390, detected by container cross build tests.

- s/390 uses test__checkevent_tracepoint() that is now only available if
HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT.

Also from Athira:

<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>

Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a527c2c1 18-Oct-2022 James Clark <james.clark@arm.com>

perf tools: Make quiet mode consistent between tools

Use the global quiet variable everywhere so that all tools hide warnings
in quiet mode and update the documentation to reflect this.

'perf probe' claimed that errors are not printed in quiet mode but I
don't see this so remove it from the docs.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221018094137.783081-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d7ba22d4 01-Sep-2022 Namhyung Kim <namhyung@kernel.org>

perf report: Show per-event LOST SAMPLES stat

Display lost samples with --stat (if not zero):

$ perf report --stat
Aggregated stats:
TOTAL events: 64
COMM events: 2 ( 3.1%)
EXIT events: 1 ( 1.6%)
SAMPLE events: 26 (40.6%)
MMAP2 events: 4 ( 6.2%)
LOST_SAMPLES events: 1 ( 1.6%)
ATTR events: 2 ( 3.1%)
FINISHED_ROUND events: 1 ( 1.6%)
ID_INDEX events: 1 ( 1.6%)
THREAD_MAP events: 1 ( 1.6%)
CPU_MAP events: 1 ( 1.6%)
EVENT_UPDATE events: 2 ( 3.1%)
TIME_CONV events: 1 ( 1.6%)
FEATURE events: 20 (31.2%)
FINISHED_INIT events: 1 ( 1.6%)
cycles:uH stats:
SAMPLE events: 14
LOST_SAMPLES events: 1
instructions:uH stats:
SAMPLE events: 12

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901195739.668604-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 557cc18e 07-Jul-2022 Ian Rogers <irogers@google.com>

perf gtk: Only support --gtk if compiled in

If HAVE_GTK2_SUPPORT isn't defined then --gtk can't succeed, don't
support it as a command line option in this case.

v2. Is a rebase. Patch appears to have been missed in:
https://lore.kernel.org/lkml/Ygu40djM1MqAfkcF@kernel.org/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: xaizek <xaizek@posteo.net>
Link: https://lore.kernel.org/r/20220707203836.345918-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ccb17cae 14-Apr-2022 Leo Yan <leo.yan@linaro.org>

perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event

Since commit bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem
info is not available") "perf mem report" and "perf report --mem-mode"
don't report result if the PERF_SAMPLE_DATA_SRC bit is missed in sample
type.

The commit ffab487052054162 ("perf: arm-spe: Fix perf report
--mem-mode") partially fixes the issue. It adds PERF_SAMPLE_DATA_SRC
bit for Arm SPE event, this allows the perf data file generated by
kernel v5.18-rc1 or later version can be reported properly.

On the other hand, perf tool still fails to be backward compatibility
for a data file recorded by an older version's perf which contains Arm
SPE trace data. This patch is a workaround in reporting phase, when
detects ARM SPE PMU event and without PERF_SAMPLE_DATA_SRC bit, it will
force to set the bit in the sample type and give a warning info.

Fixes: bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/20220414123201.842754-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3402ae0a 23-Jan-2022 Ian Rogers <irogers@google.com>

perf tui: Only support --tui with slang

Make the --tui command line flags dependent HAVE_SLANG_SUPPORT. This was
reported as confusing in:
https://lore.kernel.org/linux-perf-users/YevaTkzdXmFKdGpc@zx-spectrum.none/

Reported-by: xaizek <xaizek@posteo.net>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: xaizek <xaizek@posteo.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220123191849.3655855-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# aa8db3e4 17-Dec-2021 Alexandre Truong <alexandre.truong@arm.com>

perf callchain: Enable dwarf_callchain_users on arm64

Enable dwarf_callchain_users on arm64 which will be needed to do a
DWARF unwind in order to get the caller of the leaf frame.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-5-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d9fc7061 18-Nov-2021 Ian Rogers <irogers@google.com>

perf report: Fix memory leaks around perf_tip()

perf_tip() may allocate memory or use a literal, this means memory
wasn't freed if allocated. Change the API so that literals aren't used.

At the same time add missing frees for system_path. These issues were
spotted using leak sanitizer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211118073804.2149974-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a3df50ab 18-Oct-2021 James Clark <james.clark@arm.com>

perf tools: Refactor out kernel symbol argument sanity checking

User supplied values for vmlinux and kallsyms are checked before
continuing. Refactor this into a function so that it can be used
elsewhere.

Reviewed-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2681bd85 19-Jul-2021 Namhyung Kim <namhyung@kernel.org>

perf tools: Remove repipe argument from perf_session__new()

The repipe argument is only used by perf inject and the all others
passes 'false'. Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.

This is a preparation of the change the pipe input/output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a37338aa 15-Jul-2021 Riccardo Mancini <rickyman7@gmail.com>

perf report: Free generated help strings for sort option

ASan reports the memory leak of the strings allocated by sort_help() when
running perf report.

This patch changes the returned pointer to char* (instead of const
char*), saves it in a temporary variable, and finally deallocates it at
function exit.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 702fb9b415e7c99b ("perf report: Show all sort keys in help output")
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/a38b13f02812a8a6759200b9063c6191337f44d4.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3a683120 06-Jul-2021 Jiri Olsa <jolsa@redhat.com>

libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups

Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group
interface to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fba7c866 06-Jul-2021 Jiri Olsa <jolsa@redhat.com>

libperf: Move 'leader' from tools/perf to perf_evsel::leader

Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.

Also add several evsel helpers to ease up the transition:

struct evsel *evsel__leader(struct evsel *evsel);
- get leader evsel

bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
- true if evsel has leader as leader

bool evsel__is_leader(struct evsel *evsel);
- true if evsel is itw own leader

void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
- set leader for evsel

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

tools/perf/util/bpf_counter.c

- if (evsel->leader->core.nr_members > 1) {
+ if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 38fe0e01 06-Jul-2021 Jiri Olsa <jolsa@redhat.com>

libperf: Move 'idx' from tools/perf to perf_evsel::idx

Move evsel::idx to perf_evsel::idx, so we can move the group interface
to libperf.

Committer notes:

Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that
appeared in my tree in my local tree.

Also fixed up these:

$ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx'
tools/perf/ui/gtk/annotate.c: evsel->idx + i);
tools/perf/ui/gtk/annotate.c: evsel->idx);
$

That running 'make -C tools/perf build-test' caught.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 892ba7f1 29-Jun-2021 Namhyung Kim <namhyung@kernel.org>

perf report: Fix --task and --stat with pipe input

Current 'perf report' fails to process a pipe input when --task or
--stat options are used. This is because they reset all the tool
callbacks and fails to find a matching event for a sample.

When pipe input is used, the event info is passed via ATTR records so it
needs to handle that operation. Otherwise the following error occurs.
Note, -14 (= -EFAULT) comes from evlist__parse_sample():

# perf record -a -o- sleep 1 | perf report -i- --stat
Can't parse sample, err = -14
0x271044 [0x38]: failed to process type: 9
Error:
failed to process sample
#

Committer testing:

Before:

$ perf record -o- sleep 1 | perf report -i- --stat
Can't parse sample, err = -14
[ perf record: Woken up 1 times to write data ]
0x1350 [0x30]: failed to process type: 9
Error:
failed to process sample
[ perf record: Captured and wrote 0.000 MB - ]
$

After:

$ perf record -o- sleep 1 | perf report -i- --stat
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]

Aggregated stats:
TOTAL events: 41
COMM events: 2 ( 4.9%)
EXIT events: 1 ( 2.4%)
SAMPLE events: 9 (22.0%)
MMAP2 events: 4 ( 9.8%)
ATTR events: 1 ( 2.4%)
FINISHED_ROUND events: 1 ( 2.4%)
THREAD_MAP events: 1 ( 2.4%)
CPU_MAP events: 1 ( 2.4%)
EVENT_UPDATE events: 1 ( 2.4%)
TIME_CONV events: 1 ( 2.4%)
FEATURE events: 19 (46.3%)
cycles:uhH stats:
SAMPLE events: 9
$

Fixes: a4a4d0a7a2b20f78 ("perf report: Add --stats option to display quick data statistics")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210630043058.1131295-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d5a8bd0f 26-May-2021 Jin Yao <yao.jin@linux.intel.com>

perf mem: Disable 'mem-loads-aux' group before reporting

For some platforms, such as Alderlake, the 'mem-loads' event is required
to use together with 'mem-loads-aux' within a group and 'mem-loads-aux'
must be the group leader. Now we disable this group before reporting
because 'mem-loads-aux' is just an auxiliary event. It doesn't carry
any valid memory load result. If we show the 'mem-loads-aux' +
'mem-loads' as a group in report, it needs many of changes but they
are totally unnecessary.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8f08cf33 26-Apr-2021 Namhyung Kim <namhyung@kernel.org>

perf report: Make --skip-empty as default

so that the compact output is shown by default. Also add 'report.skip-empty'
config option to override the default. Users can also use --no-skip-empty
command line option to change the behavior anytime.

Committer testing:

$ perf report --stat

Aggregated stats:
TOTAL events: 19
COMM events: 2
EXIT events: 1
SAMPLE events: 8
MMAP2 events: 4
FINISHED_ROUND events: 1
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
cycles:u stats:
SAMPLE events: 8
$ perf config report.skip-empty=false
$ perf report --stat

Aggregated stats:
TOTAL events: 19
MMAP events: 0
LOST events: 0
COMM events: 2
EXIT events: 1
THROTTLE events: 0
UNTHROTTLE events: 0
FORK events: 0
READ events: 0
SAMPLE events: 8
MMAP2 events: 4
AUX events: 0
ITRACE_START events: 0
LOST_SAMPLES events: 0
SWITCH events: 0
SWITCH_CPU_WIDE events: 0
NAMESPACES events: 0
KSYMBOL events: 0
BPF_EVENT events: 0
CGROUP events: 0
TEXT_POKE events: 0
ATTR events: 0
EVENT_TYPE events: 0
TRACING_DATA events: 0
BUILD_ID events: 0
FINISHED_ROUND events: 1
ID_INDEX events: 0
AUXTRACE_INFO events: 0
AUXTRACE events: 0
AUXTRACE_ERROR events: 0
THREAD_MAP events: 1
CPU_MAP events: 1
STAT_CONFIG events: 0
STAT events: 0
STAT_ROUND events: 0
EVENT_UPDATE events: 0
TIME_CONV events: 1
FEATURE events: 0
COMPRESSED events: 0
cycles:u stats:
SAMPLE events: 8
$ perf config report.skip-empty
report.skip-empty=false
$

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2775de0b 26-Apr-2021 Namhyung Kim <namhyung@kernel.org>

perf report: Add --skip-empty option to suppress 0 event stat

To make the output more readable, I think it's better to remove 0's in
the output. Also the dummy event has no event stats so it just wasts
the space. Let's use the --skip-empty option to suppress it.

$ perf report --stat --skip-empty

Aggregated stats:
TOTAL events: 16530
MMAP events: 226
COMM events: 1596
EXIT events: 2
THROTTLE events: 121
UNTHROTTLE events: 117
FORK events: 1595
SAMPLE events: 719
MMAP2 events: 12147
CGROUP events: 2
FINISHED_ROUND events: 2
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
cycles stats:
SAMPLE events: 719

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 55f75444 26-Apr-2021 Namhyung Kim <namhyung@kernel.org>

perf report: Show event sample counts in --stat output

To make the output identical with perf report -D, it needs to show
per-event sample counts along with the aggregated stat at the end.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0f0abbac 26-Apr-2021 Namhyung Kim <namhyung@kernel.org>

perf hists: Split hists_stats from events_stats

Each struct hists have events_stats but most of the fields were not
used. It's to count number of samples and periods whether filtered or
not. And other fields are used only by evlist.

So it'd be better to split hists_stats and events_stats to reduce
wasted memory in the struct hists. This makes the output of event
statistics in the perf report compact by skipping 0 events in each
evsel/hists.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2e989f82 19-Feb-2021 Jin Yao <yao.jin@linux.intel.com>

perf report: Create option to disable raw event ordering

Warning "dso not found" is reported when using "perf report -D".

66702781413407 0x32c0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x2): 28177/28177: 0x55e493e00563 period: 106578 addr: 0
... thread: perf:28177
...... dso: <not found>

66702727832429 0x9dd8 [0x38]: PERF_RECORD_COMM exec: triad_loop:28177/28177

The PERF_RECORD_SAMPLE event (timestamp: 66702781413407) should be after the
PERF_RECORD_COMM event (timestamp: 66702727832429), but it's early processed.

So for most of cases, it makes sense to keep the event ordered even for dump
mode. But it would be also useful to disable ordered_events for reporting raw
dump to see events as they are stored in the perf.data file.

So now, set ordered_events by default to true and add a new option
'disable-order' to disable it. For example,

perf report -D --disable-order

Fixes: 977f739b7126b ("perf report: Disable ordered_events for raw dump")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210219070005.12397-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 03de8656 09-Dec-2020 Namhyung Kim <namhyung@kernel.org>

perf report: Support --header-only for pipe mode

The --header-only checks file header and prints the feature data. But
as pipe mode doesn't have it in the header it prints almost nothing.
Change it to process first few records until it founds HEADER_FEATURE.

Before:
$ perf record -o- true | perf report -i- --header-only
# ========
# captured on : Thu Dec 10 14:34:59 2020
# header version : 1
# data offset : 0
# data size : 0
# feat offset : 0
# ========
#

After:
$ perf record -o- true | perf report -i- --header-only
# ========
# captured on : Thu Dec 10 14:49:11 2020
# header version : 1
# data offset : 0
# data size : 0
# feat offset : 0
# ========
#
# hostname : balhae
# os release : 5.7.17-1xxx
# perf version : 5.10.rc6.gdb0ea13cc741
# arch : x86_64
# nrcpus online : 8
# nrcpus avail : 8
# cpudesc : Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
# cpuid : GenuineIntel,6,142,12
# total memory : 16158916 kB
# cmdline : perf record -o- true
# event : name = cycles, , id = { 81, 82, 83, 84, 85, 86, 87, 88 }, size = 120, ...
# CPU_TOPOLOGY info available, use -I to display
# NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: intel_pt = 9, intel_bts = 8, software = 1, power = 20, uprobe = 7, ...
# time of first sample : 0.000000
# time of last sample : 0.000000
# sample duration : 0.000 ms
# MEM_TOPOLOGY info available, use -I to display
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20201210061302.88213-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 78e1bc25 30-Nov-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Use the right prefix for 'struct evlist' event attribute config methods

perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 64b4778b 30-Nov-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Use the right prefix for 'struct evlist' event group methods

perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 71273724 30-Nov-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Use the right prefix for 'struct evlist' print methods

perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f4bd0b4a 30-Nov-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Use the right prefix for 'struct evlist' browser methods

perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 977f739b 27-Aug-2020 Jiri Olsa <jolsa@kernel.org>

perf report: Disable ordered_events for raw dump

Disable ordered_events for report raw dump, because for raw dump we want
to see events as they are stored in the perf.data file, not sorted by
time.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200827134830.126721-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 92c7d7cd 17-Jun-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Fix the class prefix for 'struct evlist' branch_type methods

To differentiate from libperf's 'struct perf_evlist' methods.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b3c2cc2b 17-Jun-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Fix the class prefix for 'struct evlist' sample_type methods

To differentiate from libperf's 'struct perf_evlist' methods.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 11b6e548 08-Jun-2020 Gaurav Singh <gaurav1086@gmail.com>

perf report: Fix NULL pointer dereference in hists__fprintf_nr_sample_events()

The 'evname' variable can be NULL, as it is checked a few lines back,
check it before using.

Fixes: 9e207ddfa207 ("perf report: Show call graph from reference events")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>


# 3e9b26dc 01-Jun-2020 Tiezhu Yang <yangtiezhu@loongson.cn>

perf tools: Remove some duplicated includes

There exists some duplicated includes in tools/perf, remove them.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1591071304-19338-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0d71a2b2 07-May-2020 Jiri Olsa <jolsa@kernel.org>

perf callchain: Setup callchain properly in pipe mode

Callchains are automatically initialized by checking on event's
sample_type. For pipe mode we need to put this check into attr event
code.

Moving the callchains setup code into callchain_param_setup function and
calling it from attr event process code.

This enables pipe output having callchains, like:

# perf record -g -e 'raw_syscalls:sys_enter' true | perf script
# perf record -g -e 'raw_syscalls:sys_enter' true | perf report

Committer notes:

We still need the next patch for the above output to work.

Reported-by: Paul Khuong <pvk@pvk.ca>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200507095024.2789147-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 10c513f7 05-May-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Rename perf_evsel__resort*() to evsel__resort*()

As it is a 'struct evsel' method, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c754c382 30-Apr-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Rename perf_evsel__is_*() to evsel__is*()

As those are 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 347c751a 29-Apr-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Rename perf_evsel__group_desc() to evsel__group_desc()

As it is a 'struct evsel' method, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8ab2e96d 29-Apr-2020 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Rename *perf_evsel__*name() to *evsel__*name()

As they are 'struct evsel' methods or related routines, not part of
tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ec90e42c 29-Apr-2020 Adrian Hunter <adrian.hunter@intel.com>

perf auxtrace: Add option to synthesize branch stack for regular events

There is an existing option to synthesize branch stacks for synthesized
events. Add a new option to synthesize branch stacks for regular events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200429150751.12570-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6fa9c3e7 26-Apr-2020 Zou Wei <zou_wei@huawei.com>

perf report: Fix warning assignment of 0/1 to bool variable

Fixes coccicheck warning:

tools/perf/builtin-report.c:1403:2-34: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1587904683-3510-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b1d1429b 19-Mar-2020 Kan Liang <kan.liang@linux.intel.com>

perf report: Add option to enable the LBR stitching approach

With the LBR stitching approach, the reconstructed LBR call stack can
break the HW limitation. However, it may reconstruct invalid call stacks
in some cases, e.g. exception handing such as setjmp/longjmp. Also, it
may impact the processing time especially when the number of samples
with stitched LBRs are huge.

Add an option to enable the approach.

# To display the perf.data header info, please use
# --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 6K of event 'cycles'
# Event count (approx.): 6492797701
#
# Children Self Command Shared Object Symbol
# ........ ........ ............... ..................
# .................................
#
99.99% 99.99% tchain_edit tchain_edit [.] f43
|
---main
f1
f2
f3
f4
f5
f6
f7
f8
f9
f10
f11
f12
f13
f14
f15
f16
f17
f18
f19
f20
f21
f22
f23
f24
f25
f26
f27
f28
f29
f30
f31
|
--99.65%--f32
f33
f34
f35
f36
f37
f38
f39
f40
f41
f42
f43

Committer testing:

$ perf record --call-graph lbr /wb/tchain_edit
[ perf record: Woken up 23 times to write data ]
[ perf record: Captured and wrote 5.578 MB perf.data (6839 samples) ]
$ perf report --header-only | egrep 'cpu(desc|.*capabilities)'
# cpudesc : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
$

Before:

$ perf report --no-children --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 6K of event 'cycles:u'
# Event count (approx.): 6459523879
#
# Overhead Command Shared Object Symbol
# ........ ........... ................ .......................
#
99.95% tchain_edit tchain_edit [.] f43
|
--99.92%--f43
f42
f41
f40
f39
f38
f37
f36
f35
f34
f33
f32
f31
f30
f29
f28
f27
f26
f25
f24
f23
f22
f21
f20
f19
f18
f17
f16
f15
f14
f13
f12
f11

0.03% tchain_edit tchain_edit [.] f42
0.01% tchain_edit tchain_edit [.] f41
0.00% tchain_edit tchain_edit [.] f31
0.00% tchain_edit ld-2.29.so [.] _dl_relocate_object
0.00% tchain_edit ld-2.29.so [.] memmove
0.00% tchain_edit [unknown] [k] 0xffffffff93a00b17

After:

$ perf report --stitch-lbr --no-children --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 6K of event 'cycles:u'
# Event count (approx.): 6459496645
#
# Overhead Command Shared Object Symbol
# ........ ........... ................ ........................
#
99.97% tchain_edit tchain_edit [.] f43
|
--99.93%--f43
f42
f41
f40
f39
f38
f37
f36
f35
f34
f33
f32
f31
f30
f29
f28
f27
f26
f25
f24
f23
f22
f21
f20
f19
f18
f17
f16
f15
f14
f13
f12
f11
f10
f9
f8
f7
f6
f5
f4
f3
f2
f1
main
__libc_start_main

0.02% tchain_edit [unknown] [k] 0xffffffff93a00b17
0.01% tchain_edit tchain_edit [.] f31
0.00% tchain_edit ld-2.29.so [.] _dl_important_hwcaps

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200319202517.23423-14-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1c5c25b3 01-Apr-2020 Adrian Hunter <adrian.hunter@intel.com>

perf auxtrace: Add an option to synthesize callchains for regular events

Currently, callchains can be synthesized only for synthesized events. Add
an itrace option to synthesize callchains for regular events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ba78c1c5 25-Mar-2020 Namhyung Kim <namhyung@kernel.org>

perf tools: Basic support for CGROUP event

Implement basic functionality to support cgroup tracking. Each cgroup
can be identified by inode number which can be read from userspace too.
The actual cgroup processing will come in the later patch.

Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
[ fix perf test failure on sampling parsing ]
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200325124536.2800725-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5e3b810a 19-Feb-2020 Jin Yao <yao.jin@linux.intel.com>

perf report: Support a new key to reload the browser

Sometimes we may need to reload the browser to update the output since
some options are changed.

This patch creates a new key K_RELOAD. Once the __cmd_report() returns
K_RELOAD, it would repeat the whole process, such as, read samples from
data file, sort the data and display in the browser.

v5:
---
1. Fix the 'make NO_SLANG=1' error. Define K_RELOAD in util/hist.h.
2. Skip setup_sorting() in repeat path if last key is K_RELOAD.

v4:
---
Need to quit in perf_evsel_menu__run if key is K_RELOAD.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200220013616.19916-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 429a5f9d 19-Feb-2020 Jin Yao <yao.jin@linux.intel.com>

perf report: Allow specifying event to be used as sort key in --group output

When performing "perf report --group", it shows the event group
information together. By default, the output is sorted by the first
event in group.

It would be nice for user to select any event for sorting. This patch
introduces a new option "--group-sort-idx" to sort the output by the
event at the index n in event group.

For example,

Before:

# perf report --group --stdio

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
1.56% 0.01% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494ce
1.56% 0.00% 0.00% 0.00% mgen [kernel.kallsyms] [k] task_tick_fair
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
0.00% 0.03% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] g_main_context_check
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] apic_timer_interrupt
...

After:

# perf report --group --stdio --group-sort-idx 3

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
0.00% 0.00% 0.00% 0.06% swapper [kernel.kallsyms] [k] hrtimer_start_range_ns
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] update_curr
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] native_apic_msr_eoi_write
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] __update_load_avg_se
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] scheduler_tick

Now the output is sorted by the fourth event in group.

v7:
---
Rebase to latest perf/core, no other change.

v4:
---
1. Update Documentation/perf-report.txt to mention
'--group-sort-idx' support multiple groups with different
amount of events and it should be used on grouped events.

2. Update __hpp__group_sort_idx(), just return when the
idx is out of limit.

3. Return failure on symbol_conf.group_sort_idx && !session->evlist->nr_groups.
So now we don't need to use together with --group.

v3:
---
Refine the code in __hpp__group_sort_idx().

Before:
for (i = 1; i < nr_members; i++) {
if (i == idx) {
ret = field_cmp(fields_a[i], fields_b[i]);
if (ret)
goto out;
}
}

After:
if (idx >= 1 && idx < nr_members) {
ret = field_cmp(fields_a[idx], fields_b[idx]);
if (ret)
goto out;
}

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200220013616.19916-2-yao.jin@linux.intel.com
[ Renamed pair_fields_alloc() to hist_entry__new_pair() and combined decl + assignment of vars ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c3b10649 13-Mar-2020 Jin Yao <yao.jin@linux.intel.com>

perf report: Fix no branch type statistics report issue

Previously we could get the report of branch type statistics.

For example:

# perf record -j any,save_type ...
# t perf report --stdio

#
# Branch Statistics:
#
COND_FWD: 40.6%
COND_BWD: 4.1%
CROSS_4K: 24.7%
CROSS_2M: 12.3%
COND: 44.7%
UNCOND: 0.0%
IND: 6.1%
CALL: 24.5%
RET: 24.7%

But now for the recent perf, it can't report the branch type statistics.

It's a regression issue caused by commit 40c39e304641 ("perf report: Fix
a no annotate browser displayed issue"), which only counts the branch
type statistics for browser mode.

This patch moves the branch_type_count() outside of ui__has_annotation()
checking, then branch type statistics can work for stdio mode.

Fixes: 40c39e304641 ("perf report: Fix a no annotate browser displayed issue")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200313134607.12873-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cca0cc76 02-Feb-2020 Jin Yao <yao.jin@linux.intel.com>

perf block-info: Allow selecting which columns to report and its order

Currently we use a predefined array to set the block info output
formats, it's fixed and inflexible.

This patch adds two parameters "block_hpps" and "nr_hpps" in
block_info__create_report and other static functions, in order to let
user decide which columns to report and with specified report ordering.
It should be more flexible.

Buffers will be allocated to contain the new fmts, of course, we need to
release them before perf exits.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200202141655.32053-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7384083b 12-Feb-2020 Ravi Bangoria <ravi.bangoria@linux.ibm.com>

perf annotate: Make perf config effective

perf default config set by user in [annotate] section is totally ignored
by annotate code. Fix it.

Before:

$ ./perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true

$ ./perf annotate shash
│ unsigned h = 0;
│ movl $0x0,-0xc(%rbp)
│ while (*s)
│ ↓ jmp 44
│ h = 65599 * h + *s++;
11.33 │24: mov -0xc(%rbp),%eax
43.50 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax

After:

│ movl $0x0,-0xc(%rbp)
│ ↓ jmp 44
1 │1 24: mov -0xc(%rbp),%eax
4 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%rax

Note that we have removed show_nr_samples and show_total_period from
annotation_options because they are not used. Instead of them we use
symbol_conf.show_nr_samples and symbol_conf.show_total_period.

Committer testing:

Using 'perf annotate --stdio2' to use the TUI rendering but emitting the output to stdio:

# perf config
#
# perf config annotate.hide_src_code=true
# perf config
annotate.hide_src_code=true
#
# perf config annotate.show_nr_jumps=true
# perf config annotate.show_nr_samples=true
# perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true
#
#

Before:

# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Percent
00000000000609f0 <ObjectInstance::weak_pointer_was_finalized()@@Base>:
endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
100.00 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#

After:

# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1 1b: xor %eax,%eax
pop %rbp
← retq
nop
1 20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#
# perf config annotate.show_nr_jumps
annotate.show_nr_jumps=true
# perf config annotate.show_nr_jumps=false
# perf config annotate.show_nr_jumps
annotate.show_nr_jumps=false
#
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
cmpq $0x0,0x18(%rbp)
↑ jne 1b
mov %rbp,%rdi
→ callq ObjectBase::jsobj_addr() const@plt
mov $0x1,%eax
pop %rbp
← retq
#

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Link: http://lore.kernel.org/lkml/20200213064306.160480-6-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c3314a74 07-Jan-2020 Jin Yao <yao.jin@linux.intel.com>

perf report: Fix no libunwind compiled warning break s390 issue

Commit 800d3f561659 ("perf report: Add warning when libunwind not
compiled in") breaks the s390 platform. S390 uses libdw-dwarf-unwind for
call chain unwinding and had no support for libunwind.

So the warning "Please install libunwind development packages during the
perf build." caused the confusion even if the call-graph is displayed
correctly.

This patch adds checking for HAVE_DWARF_SUPPORT, which is set when
libdw-dwarf-unwind is compiled in.

Fixes: 800d3f561659 ("perf report: Add warning when libunwind not compiled in")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200107191745.18415-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3b0b16bf 07-Jan-2020 Andi Kleen <ak@linux.intel.com>

perf tools: Support --prefix/--prefix-strip

The objdump utility has useful --prefix / --prefix-strip options to
allow changing source code file names hardcoded into executables' debug
info. Add options to 'perf report', 'perf top' and 'perf annotate',
which are then passed to objdump.

$ mkdir foo
$ echo 'main() { for (;;); }' > foo/foo.c
$ gcc -g foo/foo.c
foo/foo.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | main() { for (;;); }
| ^~~~
$ perf record ./a.out
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.230 MB perf.data (5721 samples) ]
$ mv foo bar
$ perf annotate
<does not show source code>
$ perf annotate --prefix=/home/ak/lsrc/git/bar --prefix-strip=5
<does show source code>

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20200107210444.214071-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# aa9d1f83 03-Jan-2020 Andi Kleen <ak@linux.intel.com>

perf report: Clarify in help that --children is default

Refer to --no-children, which is what most people probably want.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20200103183643.149150-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0feba17b 19-Dec-2019 Jin Yao <yao.jin@linux.intel.com>

perf report: Fix incorrectly added dimensions as switch perf data file

We observed an issue that was some extra columns displayed after switching
perf data file in browser. The steps to reproduce:

1. perf record -a -e cycles,instructions -- sleep 3
2. perf report --group
3. In browser, we use hotkey 's' to switch to another perf.data
4. Now in browser, the extra columns 'Self' and 'Children' are displayed.

The issue is setup_sorting() executed again after repeat path, so dimensions
are added again.

This patch checks the last key returned from __cmd_report(). If it's
K_SWITCH_INPUT_DATA, skips the setup_sorting().

Fixes: ad0de0971b7f ("perf report: Enable the runtime switching of perf data file")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191220013722.20592-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bb30acae 14-Nov-2019 Ravi Bangoria <ravi.bangoria@linux.ibm.com>

perf report: Bail out --mem-mode if mem info is not available

If perf.data is recorded without -d, don't allow user to use --mem-mode
with 'perf report'. symbol_daddr and phys_daddr can be recorded
separately and may be present in the perf.data but at the report time
they are associated with mem-mode fields and thus this restriction
applies to them as well.

Before:
$ perf record ls
$ perf report --mem-mode --stdio
# Overhead Local Weight Memory access Symbol
# ........ ............ ............. .......................
55.56% 0 N/A [k] 0xffffffff81a00ae7

After:
$ perf report --mem-mode --stdio
Error:
Selected --mem-mode but no mem data. Did you call perf record without -d?

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191114132213.5419-4-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fe87797d 25-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread: Rename thread->mg to thread->maps

One more step on the merge of 'struct maps' with 'struct map_groups'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-69vcr8pubpym90skxhmbwhiw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 79b6bb73 25-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf maps: Merge 'struct maps' with 'struct map_groups'

And pick the shortest name: 'struct maps'.

The split existed because we used to have two groups of maps, one for
functions and one for variables, but that only complicated things,
sometimes we needed to figure out what was at some address and then had
to first try it on the functions group and if that failed, fall back to
the variables one.

That split is long gone, so for quite a while we had only one struct
maps per struct map_groups, simplify things by combining those structs.

First patch is the minimum needed to merge both, follow up patches will
rename 'thread->mg' to 'thread->maps', etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 848a5e50 18-Nov-2019 Jin Yao <yao.jin@linux.intel.com>

perf report: Jump to symbol source view from total cycles view

This patch supports jumping from tui total cycles view to symbol source
view.

For example,

perf record -b ./div
perf report --total-cycles

In total cycles view, we can select one entry and press 'a' or press
ENTER key to jump to symbol source view.

This patch also sets sort_order to NULL in cmd_report() which will use
the default branch sort order. The percent value in new annotate view
will be consistent with the percent in annotate view switched from perf
report (we observed the original percent gap with previous patches).

v2:
---
Fix the 'make NO_SLANG=1' error. (set __maybe_unused to
annotation_opts in block_hists_tui_browse()).

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191118140849.20714-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0e3149f8 19-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf dso: Move dso_id from 'struct map' to 'struct dso'

And take it into account when looking up DSOs when we have the dso_id
fields obtained from somewhere, like from PERF_RECORD_MMAP2 records.

Instances of struct map pointing to the same DSO pathname but with
anything in dso_id different are in fact different DSOs, so better have
different 'struct dso' instances to reflect that. At some point we may
want to get copies of the contents of the different objects if we want
to do correct annotation or other analysis.

With this we get 'struct map' 24 bytes leaner:

$ pahole -C map ~/bin/perf
struct map {
union {
struct rb_node rb_node __attribute__((__aligned__(8))); /* 0 24 */
struct list_head node; /* 0 16 */
} __attribute__((__aligned__(8))); /* 0 24 */
u64 start; /* 24 8 */
u64 end; /* 32 8 */
_Bool erange_warned:1; /* 40: 0 1 */
_Bool priv:1; /* 40: 1 1 */

/* XXX 6 bits hole, try to pack */
/* XXX 3 bytes hole, try to pack */

u32 prot; /* 44 4 */
u64 pgoff; /* 48 8 */
u64 reloc; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
u64 (*map_ip)(struct map *, u64); /* 64 8 */
u64 (*unmap_ip)(struct map *, u64); /* 72 8 */
struct dso * dso; /* 80 8 */
refcount_t refcnt; /* 88 4 */
u32 flags; /* 92 4 */

/* size: 96, cachelines: 2, members: 13 */
/* sum members: 92, holes: 1, sum holes: 3 */
/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
/* forced alignments: 1 */
/* last cacheline: 32 bytes */
} __attribute__((__aligned__(8)));
$

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-g4hxxmraplo7wfjmk384mfsb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 99459a84 18-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf map: Move maj/min/ino/ino_generation to separate struct

And this patch highlights where these fields are being used: in the sort
order where it uses it to compare maps and classify samples taking into
account not just the DSO, but those DSO id fields.

I think these should be used to differentiate DSOs with the same name
but different 'struct dso_id' fields, i.e. these fields should move to
'struct dso' and then be used as part of the key when doing lookups for
DSOs, in addition to the DSO name.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8v5isitqy0dup47nnwkpc80f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 29754894 04-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Pass a 'map_symbol' in places receiving a pair of 'map' and 'symbol' pointers

We are already passing things like:

symbol__annotate(ms->sym, ms->map, ...)

So shorten the signature of such functions to receive the 'map_symbol'
pointer.

This also paves the way to having the 'struct map_groups' pointer in the
'struct map_symbol' so that we can get rid of 'struct map'->groups.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-23yx8v1t41nzpkpi7rdrozww@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7fa46cbf 07-Nov-2019 Jin Yao <yao.jin@linux.intel.com>

perf report: Sort by sampled cycles percent per block for tui

Previous patch has implemented a new option "--total-cycles". But only
stdio mode is supported.

This patch supports the tui mode and support '--percent-limit'.

For example,

perf record -b ./div
perf report --total-cycles --percent-limit 1

# Samples: 2753248 of event 'cycles'
Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
26.04% 2.8M 0.40% 18 [div.c:42 -> div.c:39] div
15.17% 1.2M 0.16% 7 [random_r.c:357 -> random_r.c:380] libc-2.27.so
5.11% 402.0K 0.04% 2 [div.c:27 -> div.c:28] div
4.87% 381.6K 0.04% 2 [random.c:288 -> random.c:291] libc-2.27.so
4.53% 381.0K 0.04% 2 [div.c:40 -> div.c:40] div
3.85% 300.9K 0.02% 1 [div.c:22 -> div.c:25] div
3.08% 241.1K 0.02% 1 [rand.c:26 -> rand.c:27] libc-2.27.so
3.06% 240.0K 0.02% 1 [random.c:291 -> random.c:291] libc-2.27.so
2.78% 215.7K 0.02% 1 [random.c:298 -> random.c:298] libc-2.27.so
2.52% 198.3K 0.02% 1 [random.c:293 -> random.c:293] libc-2.27.so
2.36% 184.8K 0.02% 1 [rand.c:28 -> rand.c:28] libc-2.27.so
2.33% 180.5K 0.02% 1 [random.c:295 -> random.c:295] libc-2.27.so
2.28% 176.7K 0.02% 1 [random.c:295 -> random.c:295] libc-2.27.so
2.20% 168.8K 0.02% 1 [rand@plt+0 -> rand@plt+0] div
1.98% 158.2K 0.02% 1 [random_r.c:388 -> random_r.c:388] libc-2.27.so
1.57% 123.3K 0.02% 1 [div.c:42 -> div.c:44] div
1.44% 116.0K 0.42% 19 [random_r.c:357 -> random_r.c:394] libc-2.27.so

--------------------------------------------------

v7:
---
1. Since we have used use_browser in report__browse_block_hists
to support stdio mode, now we also add supporting for tui.

2. Move block tui browser code from ui/browsers/hists.c
to block-info.c.

v6:
---
Create report__tui_browse_block_hists in block-info.c
(codes are moved from builtin-report.c).

v5:
---
Fix a crash issue when running perf report without
'--total-cycles'. The issue is because the internal flag
is renamed from 'total_cycles' to 'total_cycles_mode' in
previous patch but this patch still uses 'total_cycles'
to check if the '--total-cycles' option is enabled, which
causes the code to be inconsistent.

v4:
---
Since the block collection is moved out of printing in
previous patch, this patch is updated accordingly for
tui supporting.

v3:
---
Minor change since the function name is changed:
block_total_cycles_percent -> block_info__total_cycles_percent

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0b49f836 07-Nov-2019 Jin Yao <yao.jin@linux.intel.com>

perf report: Support --percent-limit for --total-cycles

We have already supported the '--total-cycles' option in previous patch.
It's also useful to show entries only above a threshold percent.

This patch enables '--percent-limit' for not showing entries
under that percent.

For example:

perf report --total-cycles --stdio --percent-limit 1

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 2M of event 'cycles'
# Event count (approx.): 2753248
#
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
# ............... .............. ........... .......... ................................................................. ....................
#
26.04% 2.8M 0.40% 18 [div.c:42 -> div.c:39] div
15.17% 1.2M 0.16% 7 [random_r.c:357 -> random_r.c:380] libc-2.27.so
5.11% 402.0K 0.04% 2 [div.c:27 -> div.c:28] div
4.87% 381.6K 0.04% 2 [random.c:288 -> random.c:291] libc-2.27.so
4.53% 381.0K 0.04% 2 [div.c:40 -> div.c:40] div
3.85% 300.9K 0.02% 1 [div.c:22 -> div.c:25] div
3.08% 241.1K 0.02% 1 [rand.c:26 -> rand.c:27] libc-2.27.so
3.06% 240.0K 0.02% 1 [random.c:291 -> random.c:291] libc-2.27.so
2.78% 215.7K 0.02% 1 [random.c:298 -> random.c:298] libc-2.27.so
2.52% 198.3K 0.02% 1 [random.c:293 -> random.c:293] libc-2.27.so
2.36% 184.8K 0.02% 1 [rand.c:28 -> rand.c:28] libc-2.27.so
2.33% 180.5K 0.02% 1 [random.c:295 -> random.c:295] libc-2.27.so
2.28% 176.7K 0.02% 1 [random.c:295 -> random.c:295] libc-2.27.so
2.20% 168.8K 0.02% 1 [rand@plt+0 -> rand@plt+0] div
1.98% 158.2K 0.02% 1 [random_r.c:388 -> random_r.c:388] libc-2.27.so
1.57% 123.3K 0.02% 1 [div.c:42 -> div.c:44] div
1.44% 116.0K 0.42% 19 [random_r.c:357 -> random_r.c:394] libc-2.27.so

Committer testing:

From second exapmple onwards slightly edited for brevity:

# perf report --total-cycles --percent-limit 2 --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 6M of event 'cycles'
# Event count (approx.): 6299936
#
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
# ............... .............. ........... .......... ...................................................................... ....................
#
2.17% 1.7M 0.08% 607 [compiler.h:199 -> common.c:221] [kernel.vmlinux]
#
# (Tip: Create an archive with symtabs to analyse on other machine: perf archive)
#
# perf report --total-cycles --percent-limit 1 --stdio
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
2.17% 1.7M 0.08% 607 [compiler.h:199 -> common.c:221] [kernel.vmlinux]
1.75% 1.3M 8.34% 65.5K [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151] libc-2.29.so
#
# perf report --total-cycles --percent-limit 0.7 --stdio
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
2.17% 1.7M 0.08% 607 [compiler.h:199 -> common.c:221] [kernel.vmlinux]
1.75% 1.3M 8.34% 65.5K [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151] libc-2.29.so
0.72% 544.5K 0.03% 230 [entry_64.S:657 -> entry_64.S:662] [kernel.vmlinux]
#

-------------------------------------------

It only shows the entries which 'Sampled Cycles%' > 1%.

v7:
---
No functional change. Only fix the conflict issue because
previous patches are changed.

v6:
---
No functional change. Only fix the conflict issue because
previous patches are changed.

v5:
---
No functional change. Only fix the conflict issue because
previous patches are changed.

v4:
---
No functional change. Only fix the build issue because
previous patches are changed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-7-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6f7164fa 07-Nov-2019 Jin Yao <yao.jin@linux.intel.com>

perf report: Sort by sampled cycles percent per block for stdio

It would be useful to support sorting for all blocks by the sampled
cycles percent per block. This is useful to concentrate on the globally
hottest blocks.

This patch implements a new option "--total-cycles" which sorts all
blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the percent:

percent = block sampled cycles aggregation / total sampled cycles

Note that, this patch only supports "--stdio" mode.

For example,

# perf record -b ./div
# perf report --total-cycles --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 2M of event 'cycles'
# Event count (approx.): 2753248
#
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
# ............... .............. ........... .......... ................................................ .................
#
26.04% 2.8M 0.40% 18 [div.c:42 -> div.c:39] div
15.17% 1.2M 0.16% 7 [random_r.c:357 -> random_r.c:380] libc-2.27.so
5.11% 402.0K 0.04% 2 [div.c:27 -> div.c:28] div
4.87% 381.6K 0.04% 2 [random.c:288 -> random.c:291] libc-2.27.so
4.53% 381.0K 0.04% 2 [div.c:40 -> div.c:40] div
3.85% 300.9K 0.02% 1 [div.c:22 -> div.c:25] div
3.08% 241.1K 0.02% 1 [rand.c:26 -> rand.c:27] libc-2.27.so
3.06% 240.0K 0.02% 1 [random.c:291 -> random.c:291] libc-2.27.so
2.78% 215.7K 0.02% 1 [random.c:298 -> random.c:298] libc-2.27.so
2.52% 198.3K 0.02% 1 [random.c:293 -> random.c:293] libc-2.27.so
2.36% 184.8K 0.02% 1 [rand.c:28 -> rand.c:28] libc-2.27.so
2.33% 180.5K 0.02% 1 [random.c:295 -> random.c:295] libc-2.27.so
2.28% 176.7K 0.02% 1 [random.c:295 -> random.c:295] libc-2.27.so
2.20% 168.8K 0.02% 1 [rand@plt+0 -> rand@plt+0] div
1.98% 158.2K 0.02% 1 [random_r.c:388 -> random_r.c:388] libc-2.27.so
1.57% 123.3K 0.02% 1 [div.c:42 -> div.c:44] div
1.44% 116.0K 0.42% 19 [random_r.c:357 -> random_r.c:394] libc-2.27.so
0.25% 182.5K 0.02% 1 [random_r.c:388 -> random_r.c:391] libc-2.27.so
0.00% 48 1.07% 48 [x86_pmu_enable+284 -> x86_pmu_enable+298] [kernel.kallsyms]
0.00% 74 1.64% 74 [vm_mmap_pgoff+0 -> vm_mmap_pgoff+92] [kernel.kallsyms]
0.00% 73 1.62% 73 [vm_mmap+0 -> vm_mmap+48] [kernel.kallsyms]
0.00% 63 0.69% 31 [up_write+0 -> up_write+34] [kernel.kallsyms]
0.00% 13 0.29% 13 [setup_arg_pages+396 -> setup_arg_pages+413] [kernel.kallsyms]
0.00% 3 0.07% 3 [setup_arg_pages+418 -> setup_arg_pages+450] [kernel.kallsyms]
0.00% 616 6.84% 308 [security_mmap_file+0 -> security_mmap_file+72] [kernel.kallsyms]
0.00% 23 0.51% 23 [security_mmap_file+77 -> security_mmap_file+87] [kernel.kallsyms]
0.00% 4 0.02% 1 [sched_clock+0 -> sched_clock+4] [kernel.kallsyms]
0.00% 4 0.02% 1 [sched_clock+9 -> sched_clock+12] [kernel.kallsyms]
0.00% 1 0.02% 1 [rcu_nmi_exit+0 -> rcu_nmi_exit+9] [kernel.kallsyms]

Committer testing:

This should provide material for hours of endless joy, both from looking
for suspicious things in the implementation of this patch, such as the
top one:

# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
2.17% 1.7M 0.08% 607 [compiler.h:199 -> common.c:221] [kernel.vmlinux]

As well from things that look legit:

# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
0.16% 123.0K 0.60% 4.7K [nospec-branch.h:265 -> nospec-branch.h:278] [kernel.vmlinux]

:-)

Very short system wide taken branches session:

# perf record -h -b

Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]

-b, --branch-any sample any taken branches

#
# perf record -b
^C[ perf record: Woken up 595 times to write data ]
[ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]

#
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
#
# perf report --total-cycles --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 6M of event 'cycles'
# Event count (approx.): 6299936
#
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
# ............... .............. ........... .......... ...................................................................... ....................
#
2.17% 1.7M 0.08% 607 [compiler.h:199 -> common.c:221] [kernel.vmlinux]
1.75% 1.3M 8.34% 65.5K [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151] libc-2.29.so
0.72% 544.5K 0.03% 230 [entry_64.S:657 -> entry_64.S:662] [kernel.vmlinux]
0.56% 541.8K 0.09% 672 [compiler.h:199 -> common.c:300] [kernel.vmlinux]
0.39% 293.2K 0.01% 104 [list_debug.c:43 -> list_debug.c:61] [kernel.vmlinux]
0.36% 278.6K 0.03% 272 [entry_64.S:1289 -> entry_64.S:1308] [kernel.vmlinux]
0.30% 260.8K 0.07% 564 [clear_page_64.S:47 -> clear_page_64.S:50] [kernel.vmlinux]
0.28% 215.3K 0.05% 369 [traps.c:623 -> traps.c:628] [kernel.vmlinux]
0.23% 178.1K 0.04% 278 [entry_64.S:271 -> entry_64.S:275] [kernel.vmlinux]
0.20% 152.6K 0.09% 706 [paravirt.c:177 -> paravirt.c:179] [kernel.vmlinux]
0.20% 155.8K 0.05% 373 [entry_64.S:153 -> entry_64.S:175] [kernel.vmlinux]
0.18% 136.6K 0.03% 222 [msr.h:105 -> msr.h:166] [kernel.vmlinux]
0.16% 123.0K 0.60% 4.7K [nospec-branch.h:265 -> nospec-branch.h:278] [kernel.vmlinux]
0.16% 118.3K 0.01% 44 [entry_64.S:632 -> entry_64.S:657] [kernel.vmlinux]
0.14% 104.5K 0.00% 28 [rwsem.c:1541 -> rwsem.c:1544] [kernel.vmlinux]
0.13% 99.2K 0.01% 53 [spinlock.c:150 -> spinlock.c:152] [kernel.vmlinux]
0.13% 95.5K 0.00% 35 [swap.c:456 -> swap.c:471] [kernel.vmlinux]
0.12% 96.2K 0.05% 407 [copy_user_64.S:175 -> copy_user_64.S:209] [kernel.vmlinux]
0.11% 85.9K 0.00% 31 [swap.c:400 -> page-flags.h:188] [kernel.vmlinux]
0.10% 73.0K 0.01% 52 [paravirt.h:763 -> list.h:131] [kernel.vmlinux]
0.07% 56.2K 0.03% 214 [filemap.c:1524 -> filemap.c:1557] [kernel.vmlinux]
0.07% 54.2K 0.02% 145 [memory.c:1032 -> memory.c:1049] [kernel.vmlinux]
0.07% 50.3K 0.00% 39 [mmzone.c:49 -> mmzone.c:69] [kernel.vmlinux]
0.06% 48.3K 0.01% 40 [paravirt.h:768 -> page_alloc.c:3304] [kernel.vmlinux]
0.06% 46.7K 0.02% 155 [memory.c:1032 -> memory.c:1056] [kernel.vmlinux]
0.06% 46.9K 0.01% 103 [swap.c:867 -> swap.c:902] [kernel.vmlinux]
0.06% 47.8K 0.00% 34 [entry_64.S:1201 -> entry_64.S:1202] [kernel.vmlinux]

-----------------------------------------------------------

v7:
---
Use use_browser in report__browse_block_hists for supporting
stdio and potential tui mode.

v6:
---
Create report__browse_block_hists in block-info.c (codes are
moved from builtin-report.c). It's called from
perf_evlist__tty_browse_hists.

v5:
---
1. Move all block functions to block-info.c

2. Move the code of setting ms in block hist_entry to
other patch.

v4:
---
1. Use new option '--total-cycles' to replace
'-s total_cycles' in v3.

2. Move block info collection out of block info
printing.

v3:
---
1. Use common function block_info__process_sym to
process the blocks per symbol.

2. Remove the nasty hack for skipping calculation
of column length

3. Some minor cleanup

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7841f40a 07-Nov-2019 Jin Yao <yao.jin@linux.intel.com>

perf hist: Count the total cycles of all samples

We can get the per sample cycles by hist__account_cycles(). It's also
useful to know the total cycles of all samples in order to get the
cycles coverage for a single program block in further. For example:

coverage = per block sampled cycles / total sampled cycles

This patch creates a new argument 'total_cycles' in hist__account_cycles(),
which will be added with the cycles of each sample.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8efc4f05 28-Oct-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf maps: Add for_each_entry()/_safe() iterators

To reduce boilerplate, provide a more compact form using an idiom
present in other trees of data structures.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-59gmq4kg1r68ou1wknyjl78x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 800d3f56 10-Oct-2019 Jin Yao <yao.jin@linux.intel.com>

perf report: Add warning when libunwind not compiled in

We received a user report that call-graph DWARF mode was enabled in
'perf record' but 'perf report' didn't unwind the callstack correctly.
The reason was, libunwind was not compiled in.

We can use 'perf -vv' to check the compiled libraries but it would be
valuable to report a warning to user directly (especially valuable for
a perf newbie).

The warning is:

Warning:
Please install libunwind development packages during the perf build.

Both TUI and stdio are supported.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011022122.26369-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6ef81c55 21-Aug-2019 Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>

perf session: Return error code for perf_session__new() function on failure

This patch is to return error code of perf_new_session function on
failure instead of NULL.

Test Results:

Before Fix:

$ perf c2c report -input
failed to open nput: No such file or directory

$ echo $?
0
$

After Fix:

$ perf c2c report -input
failed to open nput: No such file or directory

$ echo $?
254
$

Committer notes:

Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(...,
session), i.e. we need to pass zero in case of failure, which was the
case before when NULL was returned by perf_session__new() for failure,
but now we need to negate the result of IS_ERR(session) to respect that
TEST_ASSERT_VAL) expectation of zero meaning failure.

Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shawn Landden <shawn@git.icu>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fb71c86c 03-Sep-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove util.h from where it is not needed

Check that it is not needed and remove, fixing up some fallout for
places where it was only serving to get something else.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9h6dg6lsqe2usyqjh5rrues4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d3300a3c 30-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Move mem_info and branch_info out of symbol.h

The mem_info struct goes to mem-events.h and branch_info goes to
branch.h, where they belong, this way we can remove several headers from
symbols.h and trim the include dependency tree more.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-aupw71xnravcsu2xoabfmhpc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 171f7474 30-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Remove needless ui/progress.h from hist.h

We only need a forward declaration, add it and fixup all the files that
need ui_progress definitions but were wrongly getting it from hist.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-84a90o9jdxybffxo9jmouokw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4a3cec84 30-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf dsos: Move the dsos struct and its methods to separate source files

So that we can reduce the header dependency tree further, in the process
noticed that lots of places were getting even things like build-id
routines and 'struct perf_tool' definition indirectly, so fix all those
too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8520a98d 29-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf debug: Remove needless include directives from debug.h

All we need there is a forward declaration for 'union perf_event', so
remove it from there and add missing header directives in places using
things from this indirect include.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1b8896fb 28-Aug-2019 Jiri Olsa <jolsa@kernel.org>

libperf: Add PERF_RECORD_HEADER_FEATURE 'struct feature_event' to perf/event.h

Move the PERF_RECORD_HEADER_FEATURE event definition to libperf's
event.h.

In order to keep libperf simple, we switch 'u64/u32/u16/u8' types used
events to their generic '__u*' versions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190828135717.7245-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2da39f1c 27-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Remove needless util.h from evlist.h

There is no need for that util/util.h include there and, remove it,
pruning the include tree, fix the fallout by adding necessary headers to
places that were getting needed includes indirectly from evlist.h ->
util.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-s9f7uve8wvykr5itcm7m7d8q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 97b9d866 22-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf srcline: Add missing srcline.h header to files needing its defs

When srcline was introduced it wrongly added the include to util/sort.h,
even with that header not needing the definitions it provides, fix it by
adding it to the places that need it as a pre patch to remove srcline.h
from sort.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-shuebppedtye8hrgxk15qe3x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 10ccbc1c 09-Aug-2019 Alexey Budankov <alexey.budankov@linux.intel.com>

perf report: Prefer DWARF callstacks to LBR ones when captured both

Display DWARF based callchains when the perf.data file contains raw thread
stack data as LBR callstack data.

Commiter testing:

This changes the output from the branch stack based one, i.e. without
this patch, for the same file as in the previous csets:

# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles'
# Event count (approx.): 13
#
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ........ ....... .................... ........................... ......................................... ..................
#
7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827
7.69% ls ld-2.29.so [k] _start [k] _dl_start -
7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790
7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278
7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581
7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228
7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55
7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67
7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112
7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334
7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383
7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45
7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116

#
# (Tip: For memory address profiling, try: perf mem record / perf mem report)
#

To the one that shows call chains:

# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 10 of event 'cycles'
# Event count (approx.): 3204047
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... .................. .........................................
#
55.01% 0.00% ls [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
|
---entry_SYSCALL_64_after_hwframe
do_syscall_64
|
--16.01%--__x64_sys_execve
__do_execve_file.isra.0
search_binary_handler
load_elf_binary
elf_map
vm_mmap_pgoff
do_mmap
mmap_region
perf_event_mmap
perf_iterate_sb
perf_iterate_ctx
perf_event_mmap_output
perf_output_copy
memcpy_erms

55.01% 39.00% ls [kernel.vmlinux] [k] do_syscall_64
|
|--39.00%--0xffffffffffffffff
| _dl_map_object
| open_verify.constprop.0
| __lseek64 (inlined)
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
|
--16.01%--do_syscall_64
__x64_sys_execve
__do_execve_file.isra.0
search_binary_handler
load_elf_binary
elf_map
vm_mmap_pgoff
do_mmap
mmap_region
perf_event_mmap
perf_iterate_sb
perf_iterate_ctx
perf_event_mmap_output
perf_output_copy
memcpy_erms

42.95% 42.95% ls libpthread-2.29.so [.] __pthread_initialize_minimal_internal
|
---_init
__pthread_initialize_minimal_internal

42.95% 0.00% ls libpthread-2.29.so [.] _init
|
---_init
__pthread_initialize_minimal_internal

<SNIP>

#
# (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
#
#

The branch stack view be explicitely selected using:

# perf report -h branch-stack

Usage: perf report [<options>]

-b, --branch-stack use branch records for per branch histogram filling

#

I.e. after this patch:

# perf report -b --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles'
# Event count (approx.): 13
#
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ........ ....... .................... ........................... ......................................... ..................
#
7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827
7.69% ls ld-2.29.so [k] _start [k] _dl_start -
7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790
7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278
7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581
7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228
7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55
7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67
7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112
7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334
7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383
7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45
7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116

#
# (Tip: Show current config key-value pairs: perf config --list)
#
#

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ef4b1a53 15-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add --switch-on/--switch-off events

Since 'perf top' shares the histogram browser with 'perf report', then
the same explanation in the previous cset applies.

An additional example uses a pair of SDT events available for systemtap:

# perf probe --exec=/usr/bin/stap '%*:*'
Added new events:
sdt_stap:benchmark__thread__start (on %* in /usr/bin/stap)
sdt_stap:benchmark (on %* in /usr/bin/stap)
sdt_stap:benchmark__thread__end (on %* in /usr/bin/stap)
sdt_stap:pass6__start (on %* in /usr/bin/stap)
sdt_stap:pass6__end (on %* in /usr/bin/stap)
sdt_stap:pass5__start (on %* in /usr/bin/stap)
sdt_stap:pass5__end (on %* in /usr/bin/stap)
sdt_stap:pass0__start (on %* in /usr/bin/stap)
sdt_stap:pass0__end (on %* in /usr/bin/stap)
sdt_stap:pass1a__start (on %* in /usr/bin/stap)
sdt_stap:pass1b__start (on %* in /usr/bin/stap)
sdt_stap:pass1__end (on %* in /usr/bin/stap)
sdt_stap:pass2__start (on %* in /usr/bin/stap)
sdt_stap:pass2__end (on %* in /usr/bin/stap)
sdt_stap:pass3__start (on %* in /usr/bin/stap)
sdt_stap:pass3__end (on %* in /usr/bin/stap)
sdt_stap:pass4__start (on %* in /usr/bin/stap)
sdt_stap:pass4__end (on %* in /usr/bin/stap)
sdt_stap:benchmark__start (on %* in /usr/bin/stap)
sdt_stap:benchmark__end (on %* in /usr/bin/stap)
sdt_stap:cache__get (on %* in /usr/bin/stap)
sdt_stap:cache__clean (on %* in /usr/bin/stap)
sdt_stap:cache__add__module (on %* in /usr/bin/stap)
sdt_stap:cache__add__source (on %* in /usr/bin/stap)
sdt_stap:stap_system__complete (on %* in /usr/bin/stap)
sdt_stap:stap_system__start (on %* in /usr/bin/stap)
sdt_stap:stap_system__spawn (on %* in /usr/bin/stap)
sdt_stap:stap_system__fork (on %* in /usr/bin/stap)
sdt_stap:intern_string (on %* in /usr/bin/stap)
sdt_stap:client__start (on %* in /usr/bin/stap)
sdt_stap:client__end (on %* in /usr/bin/stap)

You can now use it in all perf tools, such as:

perf record -e sdt_stap:client__end -aR sleep 1

#

From these we're use the two below to run systemtap's test suite:

# perf record -e sdt_stap:pass2__*,cycles:P make installcheck > /dev/null
^C[ perf record: Woken up 8 times to write data ]
[ perf record: Captured and wrote 2.691 MB perf.data (39638 samples) ]
Terminated
# perf script | grep sdt_stap
stap 28979 [000] 19424.302660: sdt_stap:pass2__start: (561b9a537de3) arg1=140730364262544
stap 28979 [000] 19424.333083: sdt_stap:pass2__end: (561b9a53a9e1) arg1=140730364262544
stap 29045 [006] 19424.933460: sdt_stap:pass2__start: (563edddcede3) arg1=140722674883152
stap 29045 [006] 19424.963794: sdt_stap:pass2__end: (563edddd19e1) arg1=140722674883152
# perf script | grep cycles | wc -l
39634
#

Looking at the whole perf.data file:

[root@quaco testsuite]# perf report | grep cycles:P -A25
# Samples: 39K of event 'cycles:P'
# Event count (approx.): 34044267368
#
# Overhead Command Shared Object Symbol
# ........ ....... .................... ................................
#
3.50% cc1 cc1 [.] ht_lookup_with_hash
3.04% cc1 cc1 [.] _cpp_lex_token
2.11% cc1 cc1 [.] ggc_internal_alloc
1.83% cc1 cc1 [.] cpp_get_token_with_location
1.68% cc1 libc-2.29.so [.] _int_malloc
1.41% cc1 cc1 [.] linemap_position_for_column
1.25% cc1 cc1 [.] ggc_internal_cleared_alloc
1.20% cc1 cc1 [.] c_lex_with_flags
1.18% cc1 cc1 [.] get_combined_adhoc_loc
1.05% cc1 libc-2.29.so [.] malloc
1.01% cc1 libc-2.29.so [.] _int_free
0.96% stap stap [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, stringtable_hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > >
0.78% stap stap [.] lexer::scan
0.74% cc1 cc1 [.] _cpp_lex_direct
0.70% cc1 cc1 [.] pop_scope
0.70% cc1 cc1 [.] c_parser_declspecs
0.69% stap libc-2.29.so [.] _int_malloc
0.68% cc1 cc1 [.] htab_find_slot
0.68% cc1 [kernel.vmlinux] [k] prepare_exit_to_usermode
0.64% cc1 [kernel.vmlinux] [k] clear_page_erms
[root@quaco testsuite]#

And now only what happens in slices demarcated by those start/end SDT
events:

[root@quaco testsuite]# perf report --switch-on=sdt_stap:pass2__start --switch-off=sdt_stap:pass2__end | grep cycles:P -A100
# Samples: 240 of event 'cycles:P'
# Event count (approx.): 206491934
#
# Overhead Command Shared Object Symbol
# ........ ....... ................... ................................................
#
38.99% stap stap [.] systemtap_session::register_library_aliases
19.47% stap stap [.] match_key::operator<
15.01% stap libc-2.29.so [.] __memcmp_avx2_movbe
5.19% stap libc-2.29.so [.] _int_malloc
2.50% stap libstdc++.so.6.0.26 [.] std::_Rb_tree_insert_and_rebalance
2.30% stap stap [.] match_node::build_no_more
2.07% stap libc-2.29.so [.] malloc
1.66% stap stap [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::find
1.66% stap stap [.] match_node::bind
1.58% stap [kernel.vmlinux] [k] prepare_exit_to_usermode
1.17% stap [kernel.vmlinux] [k] native_irq_return_iret
0.87% stap stap [.] 0x0000000000032ec4
0.77% stap libstdc++.so.6.0.26 [.] std::_Rb_tree_increment
0.47% stap stap [.] std::vector<derived_probe_builder*, std::allocator<derived_probe_builder*> >::_M_realloc_insert<derived_probe_builder* const&>
0.47% stap [kernel.vmlinux] [k] get_page_from_freelist
0.47% stap [kernel.vmlinux] [k] swapgs_restore_regs_and_return_to_usermode
0.47% stap [kernel.vmlinux] [k] do_user_addr_fault
0.46% stap [kernel.vmlinux] [k] __pagevec_lru_add_fn
0.46% stap stap [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::_M_emplace_unique<std::pair<match_key, match_node*> >
0.42% stap libstdc++.so.6.0.26 [.] 0x00000000000c18fa
0.40% stap [kernel.vmlinux] [k] interrupt_entry
0.40% stap [kernel.vmlinux] [k] update_load_avg
0.40% stap [kernel.vmlinux] [k] __intel_pmu_disable_all
0.40% stap [kernel.vmlinux] [k] clear_page_erms
0.39% stap [kernel.vmlinux] [k] __mod_node_page_state
0.39% stap [kernel.vmlinux] [k] error_entry
0.39% stap [kernel.vmlinux] [k] sync_regs
0.38% stap [kernel.vmlinux] [k] __handle_mm_fault
0.38% stap stap [.] derive_probes

#
# (Tip: System-wide collection from all CPUs: perf record -a)
#
[root@quaco testsuite]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-408hvumcnyn93a0auihnawew@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5643b1a5 21-Jul-2019 Jiri Olsa <jolsa@kernel.org>

libperf: Move nr_members from perf's evsel to libperf's perf_evsel

Move the nr_members member from perf's evsel to libperf's perf_evsel.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-60-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 63503dba 21-Jul-2019 Jiri Olsa <jolsa@kernel.org>

perf evlist: Rename struct perf_evlist to struct evlist

Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.

Committer notes:

Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 32dcd021 21-Jul-2019 Jiri Olsa <jolsa@kernel.org>

perf evsel: Rename struct perf_evsel to struct evsel

Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.

Committer notes:

Added fixes for arm64, provided by Jiri.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7f7c536f 04-Jul-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib: Adopt zalloc()/zfree() from tools/perf

Eroding a bit more the tools/perf/util/util.h hodpodge header.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fc50e0ba 03-Jul-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: perf_evsel__name(NULL) is valid, no need to check evsel

It'll return "unknown", no need to open code it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-4okvjmm18arjrcyfhuahgfxm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 526bbbdd 26-Jun-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Use skip_spaces()

No change in behaviour intended.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lcywlfqbi37nhegmhl1ar6wg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3052ba56 25-Jun-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

tools perf: Move from sane_ctype.h obtained from git to the Linux's original

We got the sane_ctype.h headers from git and kept using it so far, but
since that code originally came from the kernel sources to the git
sources, perhaps its better to just use the one in the kernel, so that
we can leverage tools/perf/check_headers.sh to be notified when our copy
gets out of sync, i.e. when fixes or goodies are added to the code we've
copied.

This will help with things like tools/lib/string.c where we want to have
more things in common with the kernel, such as strim(), skip_spaces(),
etc so as to go on removing the things that we have in tools/perf/util/
and instead using the code in the kernel, indirectly and removing things
like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
are made to the original code.

Hopefully this also should help with reducing the difference of code
hosted in tools/ to the one in the kernel proper.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4885c90c 04-Jun-2019 Adrian Hunter <adrian.hunter@intel.com>

perf report: Set perf time interval in itrace_synth_ops

Instruction trace decoders can optimize output based on what time
intervals will be filtered, so pass that information in
itrace_synth_ops.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190604130017.31207-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cb62c6f1 18-Mar-2019 Alexey Budankov <alexey.budankov@linux.intel.com>

perf report: Implement perf.data record decompression

zstd_init(, comp_level = 0) initializes decompression part of API only
hat now consists of zstd_decompress_stream() function.

The perf.data PERF_RECORD_COMPRESSED records are decompressed using
zstd_decompress_stream() function into a linked list of mmaped memory
regions of mmap_comp_len size (struct decomp).

After decompression of one COMPRESSED record its content is iterated and
fetched for usual processing. The mmaped memory regions with
decompressed events are kept in the linked list till the tool process
termination.

When dumping raw records (e.g., perf report -D --header) file offsets of
events from compressed records are printed as zero.

Committer notes:

Since now we have support for processing PERF_RECORD_COMPRESSED, we see
none, in raw form, like we saw in the previous patch commiter notes,
they were decompressed into the usual PERF_RECORD_{FORK,MMAP,COMM,etc}
records, we only see the stats for those PERF_RECORD_COMPRESSED events,
and since I used the file generated in the commiter notes for the
previous patch, there they are, 2 compressed records:

$ perf report --header-only | grep cmdline
# cmdline : /home/acme/bin/perf record -z2 sleep 1
$ perf report -D | grep COMPRESS
COMPRESSED events: 2
COMPRESSED events: 0
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 15 of event 'cycles:u'
# Event count (approx.): 962227
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ...........................
#
46.99% sleep libc-2.28.so [.] _dl_addr
29.24% sleep [unknown] [k] 0xffffffffaea00a67
16.45% sleep libc-2.28.so [.] __GI__IO_un_link.part.1
5.92% sleep ld-2.28.so [.] _dl_setup_hash
1.40% sleep libc-2.28.so [.] __nanosleep
0.00% sleep [unknown] [k] 0xffffffffaea00163

#
# (Tip: To see callchains in a more compact form: perf report -g folded)
#
$

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bdd1666b 15-Mar-2019 Jin Yao <yao.jin@linux.intel.com>

perf annotate: Remove hist__account_cycles() from callback

The hist__account_cycles() function is executed when the
hist_iter__branch_callback() is called.

But it looks it's not necessary. In hist__account_cycles, it already
walks on all branch entries.

This patch moves the hist__account_cycles out of callback, now the data
processing is much faster than before.

Previous code has an issue that the ch[offset].num++ (in
__symbol__account_cycles) is executed repeatedly since
hist__account_cycles is called in each hist_iter__branch_callback, so
the counting of ch[offset].num is not correct (too big).

With this patch, the issue is fixed. And we don't need the code of
"ch->reset >= ch->num / 2" to check if there are too many overlaps (in
annotation__count_and_fill), otherwise some data would be hidden.

Now, we can try, for example:

perf record -b ...
perf annotate or perf report -s symbol

The before/after output should be no change.

v3:
---
Fix the crash in stdio mode.
Like previous code, it needs the checking of ui__has_annotation()
before hist__account_cycles()

v2:
---
1. Cover the similar perf report
2. Remove the checking code "ch->reset >= ch->num / 2"

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1552684577-29041-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 702fb9b4 14-Mar-2019 Andi Kleen <ak@linux.intel.com>

perf report: Show all sort keys in help output

Show all the supported sort keys in the command line help output, so
that it's not needed to refer to the manpage.

Before:

% perf report -h
...
-s, --sort <key[,key2...]>
sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline, ... Please refer the man page for the complete list.

After:

% perf report -h
...
-s, --sort <key[,key2...]>
sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys overhead_guest_us overhead_children sample period pid comm dso symbol parent cpu ...

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20190314225002.30108-5-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-9r3uz2ch4izoi1uln3f889co@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4968ac8f 11-Mar-2019 Andi Kleen <ak@linux.intel.com>

perf report: Implement browsing of individual samples

Now 'perf report' can show whole time periods with 'perf script', but
the user still has to find individual samples of interest manually.

It would be expensive and complicated to search for the right samples in
the whole perf file. Typically users only need to look at a small number
of samples for useful analysis.

Also the full scripts tend to show samples of all CPUs and all threads
mixed up, which can be very confusing on larger systems.

Add a new --samples option to save a small random number of samples per
hist entry.

Use a reservoir sample technique to select a representatve number of
samples.

Then allow browsing the samples using 'perf script' as part of the hist
entry context menu. This automatically adds the right filters, so only
the thread or cpu of the sample is displayed. Then we use less' search
functionality to directly jump the to the time stamp of the selected
sample.

It uses different menus for assembler and source display. Assembler
needs xed installed and source needs debuginfo.

Currently it only supports as many samples as fit on the screen due to
some limitations in the slang ui code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311174605.GA29294@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2a1292cb 05-Mar-2019 Andi Kleen <ak@linux.intel.com>

perf report: Parse time quantum

Many workloads change over time. 'perf report' currently aggregates the
whole time range reported in perf.data.

This patch adds an option for a time quantum to quantisize the perf.data
over time.

This just adds the option, will be used in follow on patches for a time
sort key.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-6-andi@firstfloor.org
[ Use NSEC_PER_[MU]SEC ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 52bab886 05-Mar-2019 Andi Kleen <ak@linux.intel.com>

perf report: Support output in nanoseconds

Upcoming changes add timestamp output in perf report. Add a --ns
argument similar to perf script to support nanoseconds resolution when
needed.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 284c4e18 01-Mar-2019 Jin Yao <yao.jin@linux.intel.com>

perf time-utils: Refactor time range parsing code

Jiri points out that we don't need any time checking and time string
parsing if the --time option is not set. That makes sense.

This patch refactors the time range parsing code, move the duplicated
code from perf report and perf script to time_utils and check if --time
option is set before parsing the time string. This patch is no logic
change expected. So the usage of --time is same as before.

For example:

Select the first and second 10% time slices:
perf report --time 10%/1,10%/2
perf script --time 10%/1,10%/2

Select the slices from 0% to 10% and from 30% to 40%:
perf report --time 0%-10%,30%-40%
perf script --time 0%-10%,30%-40%

Select the time slices from timestamp 3971 to 3973
perf report --time 3971,3973
perf script --time 3971,3973

Committer testing:

Using the above examples, check before and after to see if it remains
the same:

$ perf record -F 10000 -- find . -name "*.[ch]" -exec cat {} + > /dev/null
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 1.626 MB perf.data (42392 samples) ]
$
$ perf report --time 10%/1,10%/2 > /tmp/report.before.1
$ perf script --time 10%/1,10%/2 > /tmp/script.before.1
$ perf report --time 0%-10%,30%-40% > /tmp/report.before.2
$ perf script --time 0%-10%,30%-40% > /tmp/script.before.2
$ perf report --time 180457.375844,180457.377717 > /tmp/report.before.3
$ perf script --time 180457.375844,180457.377717 > /tmp/script.before.3

For example, the 3rd test produces this slice:

$ cat /tmp/script.before.3
cat 3147 180457.375844: 2143 cycles:uppp: 7f79362590d9 cfree@GLIBC_2.2.5+0x9 (/usr/lib64/libc-2.28.so)
cat 3147 180457.375986: 2245 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat)
cat 3147 180457.376012: 2164 cycles:uppp: 7f7936257430 _int_malloc+0x8c0 (/usr/lib64/libc-2.28.so)
cat 3147 180457.376140: 2921 cycles:uppp: 558b70f3a554 [unknown] (/usr/bin/cat)
cat 3147 180457.376296: 2844 cycles:uppp: 7f7936258abe malloc+0x4e (/usr/lib64/libc-2.28.so)
cat 3147 180457.376431: 2717 cycles:uppp: 558b70f3b0ca [unknown] (/usr/bin/cat)
cat 3147 180457.376667: 2630 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat)
cat 3147 180457.376795: 2442 cycles:uppp: 7f79362bff55 read+0x15 (/usr/lib64/libc-2.28.so)
cat 3147 180457.376927: 2376 cycles:uppp: ffffffff9aa00163 [unknown] ([unknown])
cat 3147 180457.376954: 2307 cycles:uppp: 7f7936257438 _int_malloc+0x8c8 (/usr/lib64/libc-2.28.so)
cat 3147 180457.377116: 3091 cycles:uppp: 7f7936258a70 malloc+0x0 (/usr/lib64/libc-2.28.so)
cat 3147 180457.377362: 2945 cycles:uppp: 558b70f3a3b0 [unknown] (/usr/bin/cat)
cat 3147 180457.377517: 2727 cycles:uppp: 558b70f3a9aa [unknown] (/usr/bin/cat)
$

Install 'coreutils-debuginfo' to see cat's guts (symbols), but then, the
above chunk translates into this 'perf report' output:

$ cat /tmp/report.before.3
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles:uppp' (time slices: 180457.375844,180457.377717)
# Event count (approx.): 33552
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
17.69% cat libc-2.28.so [.] malloc
14.53% cat cat [.] 0x000000000000586e
13.33% cat libc-2.28.so [.] _int_malloc
8.78% cat cat [.] 0x00000000000023b0
8.71% cat cat [.] 0x0000000000002554
8.13% cat cat [.] 0x00000000000029aa
8.10% cat cat [.] 0x00000000000030ca
7.28% cat libc-2.28.so [.] read
7.08% cat [unknown] [k] 0xffffffff9aa00163
6.39% cat libc-2.28.so [.] cfree@GLIBC_2.2.5

#
# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
#
$

Now lets see after applying this patch, nothing should change:

$ perf report --time 10%/1,10%/2 > /tmp/report.after.1
$ perf script --time 10%/1,10%/2 > /tmp/script.after.1
$ perf report --time 0%-10%,30%-40% > /tmp/report.after.2
$ perf script --time 0%-10%,30%-40% > /tmp/script.after.2
$ perf report --time 180457.375844,180457.377717 > /tmp/report.after.3
$ perf script --time 180457.375844,180457.377717 > /tmp/script.after.3
$ diff -u /tmp/report.before.1 /tmp/report.after.1
$ diff -u /tmp/script.before.1 /tmp/script.after.1
$ diff -u /tmp/report.before.2 /tmp/report.after.2
--- /tmp/report.before.2 2019-03-01 11:01:53.526094883 -0300
+++ /tmp/report.after.2 2019-03-01 11:09:18.231770467 -0300
@@ -352,5 +352,5 @@

#
-# (Tip: Generate a script for your data: perf script -g <lang>)
+# (Tip: Treat branches as callchains: perf report --branch-history)
#
$ diff -u /tmp/script.before.2 /tmp/script.after.2
$ diff -u /tmp/report.before.3 /tmp/report.after.3
--- /tmp/report.before.3 2019-03-01 11:03:08.890045588 -0300
+++ /tmp/report.after.3 2019-03-01 11:09:40.660224002 -0300
@@ -22,5 +22,5 @@

#
-# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
+# (Tip: List events using substring match: perf list <keyword>)
#
$ diff -u /tmp/script.before.3 /tmp/script.after.3
$

Cool, just the 'perf report' tips changed, QED.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1551435186-6008-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2d4f2799 21-Feb-2019 Jiri Olsa <jolsa@kernel.org>

perf data: Add global path holder

Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.

This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.

Also it actually makes the code little simpler.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# dbd2a1d5 04-Feb-2019 Jiri Olsa <jolsa@kernel.org>

perf report: Move symbol annotation to the resort phase

Currently we make the annotation for the IPC column during the entry
display, already outside of the progress bar scope, so it appears like
'perf report' is stuck.

Move the annotation retrieval to the resort phase, so that all the data
are ready for display.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190204141808.23031-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1101f69a 27-Jan-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

pref tools: Add missing map.h includes

Lots of places get the map.h file indirectly, and since we're going to
remove it from machine.h, then those need to include it directly, do it
now, before we remove that dep.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f3acb3a8 06-Dec-2018 Davidlohr Bueso <dave@stgolabs.net>

perf machine: Use cached rbtrees

At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something required for nearly every operation dealing with
machine->guests and threads->entries.

The conversion is straightforward, however, it's worth noticing that the
rb_erase_init() calls have been replaced by rb_erase_cached() which has
no _init() flavor, however, the node is explicitly cleared next anyway,
which was redundant until now.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-3-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 49b8e2be 02-Nov-2018 Rasmus Villemoes <linux@rasmusvillemoes.dk>

perf tools: Replace automatic const char[] variables by statics

An automatic const char[] variable gets initialized at runtime, just
like any other automatic variable. For long strings, that uses a lot of
stack and wastes time building the string; e.g. for the "No %s
allocation events..." case one has:

444516: 48 b8 4e 6f 20 25 73 20 61 6c movabs $0x6c61207325206f4e,%rax # "No %s al"
...
444674: 48 89 45 80 mov %rax,-0x80(%rbp)
444678: 48 b8 6c 6f 63 61 74 69 6f 6e movabs $0x6e6f697461636f6c,%rax # "location"
444682: 48 89 45 88 mov %rax,-0x78(%rbp)
444686: 48 b8 20 65 76 65 6e 74 73 20 movabs $0x2073746e65766520,%rax # " events "
444690: 66 44 89 55 c4 mov %r10w,-0x3c(%rbp)
444695: 48 89 45 90 mov %rax,-0x70(%rbp)
444699: 48 b8 66 6f 75 6e 64 2e 20 20 movabs $0x20202e646e756f66,%rax

Make them all static so that the compiler just references objects in .rodata.

Committer testing:

Ok, using dwarves's codiff tool:

$ codiff --functions /tmp/perf.before ~/bin/perf
builtin-sched.c:
cmd_sched | -48
1 function changed, 48 bytes removed, diff: -48

builtin-report.c:
cmd_report | -32
1 function changed, 32 bytes removed, diff: -32

builtin-kmem.c:
cmd_kmem | -64
build_alloc_func_list | -50
2 functions changed, 114 bytes removed, diff: -114

builtin-c2c.c:
perf_c2c__report | -390
1 function changed, 390 bytes removed, diff: -390

ui/browsers/header.c:
tui__header_window | -104
1 function changed, 104 bytes removed, diff: -104

/home/acme/bin/perf:
9 functions changed, 688 bytes removed, diff: -688

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181102230624.20064-1-linux@rasmusvillemoes.dk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ec6ae74f 30-Nov-2018 Jin Yao <yao.jin@linux.intel.com>

perf report: Display average IPC and IPC coverage per symbol

Support displaying the average IPC and IPC coverage per symbol in 'perf
report' --tui and --stdio modes.

For example,

$ perf record -b ...
$ perf report -s symbol

Overhead Symbol IPC [IPC Coverage]
39.60% [.] __random 2.30 [ 54.8%]
18.02% [.] main 0.43 [ 54.3%]
14.21% [.] compute_flag 2.29 [100.0%]
14.16% [.] rand 0.36 [100.0%]
7.06% [.] __random_r 2.57 [ 70.5%]
6.85% [.] rand@plt 0.00 [ 0.0%]

Jiri Olsa <jolsa@redhat.com> provided the patch to support the --stdio
mode. I merged Jiri's code in this patch.

$ perf report -s symbol --stdio

# Overhead Symbol IPC [IPC Coverage]
# ........ ........................... ....................
#
39.60% [.] __random 2.30 [ 54.8%]
18.02% [.] main 0.43 [ 54.3%]
14.21% [.] compute_flag 2.29 [100.0%]
14.16% [.] rand 0.36 [100.0%]
7.06% [.] __random_r 2.57 [ 70.5%]
6.85% [.] rand@plt 0.00 [ 0.0%]
0.02% [k] run_timer_softirq 1.60 [ 57.2%]

The columns "IPC" and "[IPC Coverage]" are automatically enabled when
the sort-key "symbol" is specified. If the perf.data file doesn't
contain timed LBR information, columns are filled with "-".

For example,

# Overhead Symbol IPC [IPC Coverage]
# ........ ........................... ....................
#
46.57% [.] main - -
17.60% [.] rand - -
15.84% [.] __random_r - -
11.90% [.] __random - -
6.50% [.] compute_flag - -
1.59% [.] rand@plt - -
0.00% [.] _dl_relocate_object - -
0.00% [k] tlb_flush_mmu - -
0.00% [k] perf_event_mmap - -
0.00% [k] native_sched_clock - -
0.00% [k] intel_pmu_handle_irq_v4 - -
0.00% [k] native_write_msr - -

v3:
---
Removed the sortkey 'ipc' from command-line. The columns "IPC"
and "[IPC Coverage]" are automatically enabled when "symbol"
is specified.

v2:
---
Merge in Jiri's patch to support stdio mode

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1543586097-27632-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4ab8455f 03-Oct-2018 Jiri Olsa <jolsa@redhat.com>

perf evsel: Store ids for events with their own cpus perf_event__synthesize_event_update_cpus

John reported crash when recording on an event under PMU with cpumask defined:

root@localhost:~# ./perf_debug_ record -e armv8_pmuv3_0/br_mis_pred/ sleep 1
perf: Segmentation fault
Obtained 9 stack frames.
./perf_debug_() [0x4c5ef8]
[0xffff82ba267c]
./perf_debug_() [0x4bc5a8]
./perf_debug_() [0x419550]
./perf_debug_() [0x41a928]
./perf_debug_() [0x472f58]
./perf_debug_() [0x473210]
./perf_debug_() [0x4070f4]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe0) [0xffff8294c8a0]
Segmentation fault (core dumped)

We synthesize an update event that needs to touch the evsel id array, which is
not defined at that time. Fixing this by forcing the id allocation for events
with their own cpus.

Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Fixes: bfd8f72c2778 ("perf record: Synthesize unit/scale/... in event update")
Link: http://lkml.kernel.org/r/20181003212052.GA32371@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c12e039d 13-Sep-2018 Andi Kleen <ak@linux.intel.com>

perf tools: Report itrace options in help

I often forget all the options that --itrace accepts. Instead of burying
them in the man page only report them in the normal command line help
too to make them easier accessible.

v2: Align

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180914031038.4160-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 89f1688a 13-Sep-2018 Jiri Olsa <jolsa@kernel.org>

perf tools: Remove perf_tool from event_op2

Now that we keep a perf_tool pointer inside perf_session, there's no
need to have a perf_tool argument in the event_op2 callback. Remove it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180913125450.21342-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ece2a4f4 08-Aug-2018 Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>

tools lib traceevent, perf tools: Rename pevent_set_* APIs

In order to make libtraceevent into a proper library, variables, data
structures and functions require a unique prefix to prevent name space
conflicts. That prefix will be "tep_" and not "pevent_". This changes
APIs: pevent_set_file_bigendian, pevent_set_flag,
pevent_set_function_resolver, pevent_set_host_bigendian,
pevent_set_long_size, pevent_set_page_size and pevent_get_long_size

Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lkml.kernel.org/r/20180808180701.256265951@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e6902d1b 04-Aug-2018 Jiri Olsa <jolsa@kernel.org>

perf report: Add --percent-type option

Set annotation percent type from following choices:

global-period, local-period, global-hits, local-hits

With following report option setup the percent type will be passed to
annotation browser:

$ perf report --percent-type period-local

The local/global keywords set if the percentage is computed in the scope
of the function (local) or the whole data (global). The period/hits
keywords set the base the percentage is computed on - the samples period
or the number of samples (hits).

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180804130521.11408-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e9de7e2f 20-Jun-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hists: Clarify callchain disabling when available

We want to allow having mixed events with/without callchains, not
using a global flag to show callchains, but allowing supressing
callchains when they are present.

So invert the logic of the last parameter to hists__fprint() to
that effect.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ohqyisr6qge79qa95ojslptx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 92ead7ee 25-Jun-2018 Ravi Bangoria <ravi.bangoria@linux.ibm.com>

perf tools: Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE]

perf_event__process_feature() accesses feat_ops[HEADER_LAST_FEATURE]
which is not defined and thus perf is crashing. HEADER_LAST_FEATURE is
used as an end marker for the perf report but it's unused for perf
script/annotate. Ignore HEADER_LAST_FEATURE for perf script/annotate,
just like it is done in 'perf report'.

Before:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
#

After:
# perf record -o - ls | perf script
<SNIP 'ls' output>
Segmentation fault (core dumped)
ls 7031 4392.099856: 250000 cpu-clock:uhH: 7f5e0ce7cd60
ls 7031 4392.100355: 250000 cpu-clock:uhH: 7f5e0c706ef7
#

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 57b5de463925 ("perf report: Support forced leader feature in pipe mode")
Link: http://lkml.kernel.org/r/20180625124220.6434-4-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9d0199cd 28-May-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: No need to have report_callchain_help as a global

It is used in a single place, move the declaration to that function.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-p650ofrl8xike4dewxod51gg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f178fd2d 28-May-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Move objdump_path to struct annotation_options

One more step in grouping annotation options.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sogzdhugoavm6fyw60jnb0vs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cd0cccba 28-May-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hists browser: Pass annotation_options from tool to browser

So that things changed in the command line may percolate to the browser
code without using globals.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-5daawc40zhl6gcs600com1ua@git.kernel.org
[ Merged fix for NO_SLANG=1 build provided by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a47e843e 28-May-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Move disassembler_style global to annotation_options

Continuing to group annotation specific stuff into a struct.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-p3cdhltj58jt0byjzg3g7obx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1eddd9e4 28-May-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Adopt anotation options from symbol_conf

Continuing to group annotation options in an annotation specific struct.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-astei92tzxp4yccag5pxb2h7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e345f3bd 23-May-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Pass perf_evsel instead of just evsel->idx

The code gets shorter and we'll be able to use evsel->evlist in a
followup patch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-t0s7vy19wq5kak74kavm8swf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a26bb0ba 21-May-2018 Jin Yao <yao.jin@linux.intel.com>

perf report: Use perf_evlist__force_leader to support '--group'

Since we created a new function perf_evlist__force_leader(), remove the
old code and use that new evlist method.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526914666-31839-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3183f8ca 26-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Unify symbol maps

Remove the split of symbol tables for data (MAP__VARIABLE) and for
functions (MAP__FUNCTION), its unneeded and there were various places
doing two lookups to find a symbol, so simplify this.

We still will consider only the symbols that matched the filters in
place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in
the patch, just so that we consider only the same symbols as before,
to reduce the possibility of regressions.

All the tests on 50-something build environments, in varios versions
of lots of distros and cross build environments were performed without
build regressions, as usual with all pull requests the other tests were
also performed: 'perf test' and 'make -C tools/perf build-test'.

Also this was done at a great granularity so that regressions can be
bisected more easily.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hiq0fy2rsleupnqqwuojo1ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e94b861a 23-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf map: Introduce map__has_symbols()

To further simplify checking if symbols are available for a given map
and to reduce the number of users of MAP__{FUNCTION,VARIABLE}.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-iyfoyvbfdti5uehgpjum3qrq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d88205db 23-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf dso: Add dso__has_symbols() method

To replace longer code sequences in various places.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tlk3klbkfyjrbfjvryyznfju@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 91340c51 16-Mar-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Introduce --ignore-vmlinux command line option

We've had this in 'perf top' for quite a while, useful if one wishes
to force using /proc/kcore to do annotation using the patched kernel
instead of the ELF image it started from, aka vmlinux.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ircpvox4wzsv7gasrpb28fw9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7f0b6fde 16-Mar-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Move the default annotate options to the library

One more thing that goes from the TUI code to be used more widely,
for instance it'll affect the default options used by:

perf annotate --stdio2

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-0nsz0dm0akdbo30vgja2a10e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 57b5de46 14-Mar-2018 Jiri Olsa <jolsa@kernel.org>

perf report: Support forced leader feature in pipe mode

Stephane reported a problem with forced leader in pipe mode, where
report does not force the group output. The reason is that we don't
force the leader in pipe mode.

This patch adds HEADER_LAST_FEATURE mark to have a point where we have
all events and features received, and force the group if requested.

$ perf record --group -e '{cycles, instructions}' -o - kill | perf report -i - --group

SNIP

# Overhead Command Shared Object Symbol
# ................ ....... ................ .......................
#
28.36% 0.00% kill libc-2.25.so [.] __unregister_atfork
26.32% 0.00% kill libc-2.25.so [.] _dl_addr
26.10% 0.00% kill ld-2.25.so [.] _dl_relocate_object
17.32% 0.00% kill ld-2.25.so [.] __tunables_init
1.70% 0.01% kill [unknown] [k] 0xffffffffafa01a40
0.20% 0.00% kill ld-2.25.so [.] _start
0.00% 48.77% kill ld-2.25.so [.] do_lookup_x
0.00% 42.97% kill libc-2.25.so [.] _IO_getline
0.00% 6.35% kill ld-2.25.so [.] strcmp
0.00% 1.71% kill ld-2.25.so [.] _dl_sysdep_start
0.00% 0.19% kill ld-2.25.so [.] _dl_start

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180314092205.23291-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ea85ab24 07-Mar-2018 Wang YanQing <udknight@gmail.com>

perf report: Provide libtraceevent with a kernel symbol resolver

So that beautifiers wanting to resolve kernel function addresses to
names can do its work, and when we use "perf report" for output of "perf
kmem record", we will get kernel symbol output.

This patch affect the output of "perf report" for the record data
generated by "perf kmem record" looks like below:

Before patch:
0.01% call_site=ffffffff814e5828 ptr=0x99bb000 bytes_req=3616 bytes_alloc=4096 gfp_flags=GFP_ATOMIC
0.01% call_site=ffffffff81370b87 ptr=0x428a3060 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL|GFP_ZERO

After patch:
0.01% (aa_alloc_task_context+0x27) call_site=ffffffff81370b87 ptr=0x428a3060 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL|GFP_ZERO
0.01% (__tty_buffer_request_room+0x88) call_site=ffffffff814e5828 ptr=0x99bb000 bytes_req=3616 bytes_alloc=4096 gfp_flags=GFP_ATOMIC

Signed-off-by: Wang YanQing <udknight@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180308032850.GA12383@udknight-ThinkPad-E550
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8ef278bb 07-Mar-2018 Jiri Olsa <jolsa@kernel.org>

perf report: Fix the output for stdio events list

Changing the output header for reporting forced groups via --groups
option on non grouped events, like:

$ perf record -e 'cycles,instructions'
$ perf report --stdio --group

Before:

# Samples: 24 of event 'anon group { cycles:u, instructions:u }'

After:

# Samples: 24 of events 'cycles:u, instructions:u'

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: ad52b8cb4886 ("perf report: Add support to display group output for non group events")
Link: http://lkml.kernel.org/r/20180307155020.32613-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ad52b8cb 09-Feb-2018 Jiri Olsa <jolsa@redhat.com>

perf report: Add support to display group output for non group events

Add support to display group output for if non grouped events are
detected and user forces --group option. Now for non-group events
recorded like:

$ perf record -e 'cycles,instructions' ls

you can still get group output by using --group option
in report:

$ perf report --group --stdio
...
# Overhead Command Shared Object Symbol
# ................ ....... ................ ......................
#
17.67% 0.00% ls libc-2.25.so [.] _IO_do_write@@GLIB
15.59% 25.94% ls ls [.] calculate_columns
15.41% 31.35% ls libc-2.25.so [.] __strcoll_l
...

Committer note:

We should improve on this by making sure that the first line states that
this is not a group, but since the user doesn't have to force group view
when really using grouped events (e.g. '{cycles,instructions}'), the
user better know what is being done...

Requested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180209092734.GB20449@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8614ada0 06-Feb-2018 Jiri Olsa <jolsa@kernel.org>

perf report: Ask for ordered events for --tasks option

If we have the time in, keep the events in time order.

Committer notes:

Trying to be more verbose, what actual effect this will have in this particular
case?

Before and after this patch shows the artifacts:

--- /tmp/before 2018-02-06 15:40:29.536411625 -0300
+++ /tmp/after 2018-02-06 15:40:51.963403599 -0300
@@ -5,34 +5,34 @@
2540 2540 1818 | gnome-terminal-
3489 3489 2540 | bash
32433 32433 3489 | perf
- 32434 32434 32433 | perf
+ 32434 32434 32433 | make
32441 32441 32434 | make
32514 32514 32441 | make
511 511 32514 | sh
- 512 512 511 | sh
+ 512 512 511 | install
<SNIP>

We don't have 'perf' calling 'perf' calling 'make', etc, the second
'perf' actually is 'make', i.e. there was reordering of the relevant
PERF_RECORD_COMM and PERF_RECORD_FORK records.

Ditto for sh/install later on.

Look for FORK and COMM meta events, for those tids:

# perf report -D | egrep 'PERF_RECORD_(FORK|COMM)' | egrep '3243[34]'
0 14774650990679 0x1a3cd8 [0x38]: PERF_RECORD_FORK(32433:32433):(3489:3489)
1 14774652080381 0x1d6568 [0x30]: PERF_RECORD_COMM exec: perf:32433/32433
1 14774742473340 0x1dbb48 [0x38]: PERF_RECORD_FORK(32434:32434):(32433:32433)
0 14774752005779 0x1a4af8 [0x30]: PERF_RECORD_COMM exec: make:32434/32434
0 14774753997960 0x1a5578 [0x38]: PERF_RECORD_FORK(32435:32435):(32434:32434)
0 14774756070782 0x1a5618 [0x38]: PERF_RECORD_FORK(32438:32438):(32434:32434)
0 14774757772939 0x1a5680 [0x38]: PERF_RECORD_FORK(32440:32440):(32434:32434)
0 14774758230600 0x1a56e8 [0x38]: PERF_RECORD_FORK(32441:32441):(32434:32434)
#

First column is the cpu, second is the timestamp.

So they are on different CPUs, thus ring buffers, and when we don't use
the ordered_events class, we end up mixing that up, use it to take
advantage of the PERF_RECORD_FINISHED_ROUND meta events to go on
ordering the events using the PERF_SAMPLE_TIME present in the
PERF_RECORD_{FORK,COMM,EXIT,SAMPLE,etc} records in the ring buffer.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180206181813.10943-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 06cc1a47 18-Jan-2018 Kan Liang <kan.liang@intel.com>

perf hists browser: Add parameter to disable lost event warning

For overwrite mode, the ringbuffer will be paused. The event lost is
expected. It needs a way to notify the browser not print the warning.

It will be used later for perf top to disable lost event warning in
overwrite mode. There is no behavior change for now.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-15-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0a3cc3ae 10-Jan-2018 Jin Yao <yao.jin@linux.intel.com>

perf report: Remove the time slices number limitation

Previously it was only allowed to use at most 10 time slices in 'perf
report --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf report --stdio --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-8-git-send-email-yao.jin@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7425664b 10-Jan-2018 Jin Yao <yao.jin@linux.intel.com>

perf report: Add an indication of what time slices are used

Add a time slices indication to the perf report header.

For example,

# perf report --stdio --time 10%

# Total Lost Samples: 0
#
# Samples: 9K of event 'cycles:ppp' (time slices: 10%)
# Event count (approx.): 8951288803

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested--by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-6-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# eb0b419e 10-Jan-2018 Jin Yao <yao.jin@linux.intel.com>

perf report: Improve error msg when no first/last sample time found

The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
Please use latest perf binary to execute 'perf record'
(if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# eabad8c6 15-Jan-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf unwind: Do not look just at the global callchain_param.record_mode

When setting up DWARF callchains on specific events, without using
'record' or 'trace' --call-graph, but instead doing it like:

perf trace -e cycles/call-graph=dwarf/

The unwind__prepare_access() call in thread__insert_map() when we
process PERF_RECORD_MMAP(2) metadata events were not being performed,
precluding us from using per-event DWARF callchains, handling them just
when we asked for all events to be DWARF, using "--call-graph dwarf".

We do it in the PERF_RECORD_MMAP because we have to look at one of the
executable maps to figure out the executable type (64-bit, 32-bit) of
the DSO laid out in that mmap. Also to look at the architecture where
the perf.data file was recorded.

All this probably should be deferred to when we process a sample for
some thread that has callchains, so that we do this processing only for
the threads with samples, not for all of them.

For now, fix using DWARF on specific events.

Before:

# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms
0.000 probe_libc:inet_pton:(7fe9597bb350))
Problem processing probe_libc:inet_pton callchain, skipping...
#

After:

# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms
0.000 probe_libc:inet_pton:(7fd4aa930350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa804e51af3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa804e51b379] (/usr/bin/ping)
#
# perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms
0.000 probe_libc:inet_pton:(7f9363b9e350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffa9e8a14e0f3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffa9e8a14e1379] (/usr/bin/ping)
#
# perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms
0.000 probe_libc:inet_pton:(7f4947e1c350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa716d88ef3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa716d88f379] (/usr/bin/ping)
#
# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms
0.000 probe_libc:inet_pton:(7fa157696350))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
getaddrinfo (/usr/lib64/libc-2.26.so)
[0xffffa9ba39c74f40] (/usr/bin/ping)
#

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6439d7d1 09-Jan-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Introduce --mmaps

Similar to --tasks, producing the same output plus /proc/<PID>/maps
similar lines for each mmap record present in a perf.data file.

Please note that not all mmaps are stored, for instance, some of the
non-executable mmaps are only stored when 'perf record --data' is used,
when the user wants to resolve data accesses in addition to asking for
executable mmaps to get the DSO with symtabs.

E.g.:

# perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ]
[root@jouet ~]# perf report --mmaps
# pid tid ppid comm
0 0 -1 |swapper
4137 4137 -1 |sleep
5628a35a1000-5628a37aa000 r-xp 00000000 3147148 /usr/bin/sleep
7fb65ad51000-7fb65b134000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
7fb65b134000-7fb65b35e000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
7ffd94b9f000-7ffd94ba1000 r-xp 00000000 0 [vdso]
#
# perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (8 samples) ]
# perf report --mmaps
# pid tid ppid comm
0 0 -1 |swapper
4161 4161 -1 |sleep
55afae69a000-55afae8a3000 r-xp 00000000 3147148 /usr/bin/sleep
7f569f00d000-7f569f3f0000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
7f569f3f0000-7f569f61a000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
7fff6fffe000-7fff70000000 r-xp 00000000 0 [vdso]
#
# perf record time sleep 1
0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 2156maxresident)k
0inputs+0outputs (0major+73minor)pagefaults 0swaps
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ]
# perf report --mmaps
# pid tid ppid comm
0 0 -1 |swapper
4281 4281 -1 |time
560560dca000-560560fcf000 r-xp 00000000 3190458 /usr/bin/time
7fc175196000-7fc175579000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
7fc175579000-7fc1757a3000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
7ffc924f6000-7ffc924f8000 r-xp 00000000 0 [vdso]
4282 4282 4281 | sleep
560560dca000-560560fcf000 r-xp 00000000 3190458 /usr/bin/time
564b4de3c000-564b4e045000 r-xp 00000000 3147148 /usr/bin/sleep
7f6a5a716000-7f6a5aaf9000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
7f6a5aaf9000-7f6a5ad23000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
7fc175196000-7fc175579000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
7fc175579000-7fc1757a3000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
7ffc924f6000-7ffc924f8000 r-xp 00000000 0 [vdso]
7ffcec7e6000-7ffcec7e8000 r-xp 00000000 0 [vdso]
#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zulwdlg5rfowogr1qznorvvc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 930f8b34 07-Jan-2018 Jiri Olsa <jolsa@kernel.org>

perf report: Add --tasks option to display monitored tasks

Add --tasks option to display monitored tasks stored in perf.data.
Displaying pid/tid/ppid plus the command string aligned to distinguish
parent and child tasks.

$ perf record -a
...
$ perf report --tasks
# pid tid ppid comm
0 0 -1 |swapper
2 2 0 | kthreadd
14080 14080 2 | kworker/u17:1
4 4 2 | kworker/0:0H
6 6 2 | mm_percpu_wq
...
1 1 0 | systemd
23242 23242 1 | firefox
23242 23298 23242 | Cache2 I/O
23242 23304 23242 | GMPThread
...
1195 1195 1 | login
1611 1611 1195 | bash
1639 1639 1611 | startx
1663 1663 1639 | xinit
1673 1673 1663 | xmonad-x86_64-l
23939 23939 1673 | xterm
23941 23941 23939 | bash
23963 23963 23941 | mutt
24954 24954 23963 | offlineimap

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-13-jolsa@kernel.org
[ Make it --tasks, plural, --task works as well, as its unambiguous ]
[ Use machine__find_thread(), not findnew(), as pointed out by Namhyung ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a4a4d0a7 07-Jan-2018 Jiri Olsa <jolsa@kernel.org>

perf report: Add --stats option to display quick data statistics

Add --stats option to display quick data statistics of event numbers,
without any further processing, like the one at the end of the perf
report -D command.

$ perf report --stat

Aggregated stats:
TOTAL events: 4566
MMAP events: 113
LOST events: 19
COMM events: 3
FORK events: 400
SAMPLE events: 3315
MMAP2 events: 32
FINISHED_ROUND events: 681
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1

I found this useful when hunting lost events for another change.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-12-jolsa@kernel.org
[ Rename it to --stats, plural ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5b969bc7 08-Dec-2017 Jin Yao <yao.jin@linux.intel.com>

perf report: Support time percent and multiple time ranges

perf report has a --time option to limit the time range of output. It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

perf report --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

perf report --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
in perf.data. If it's not recorded, returns error message to user.

v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample

v3: Since the definitions of first_sample_time/last_sample_time
are moved from perf_session to perf_evlist so change the
related code.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512738826-2628-6-git-send-email-yao.jin@linux.intel.com
[ Add missing colons at end of examples in the man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 40c39e30 26-Dec-2017 Jin Yao <yao.jin@linux.intel.com>

perf report: Fix a no annotate browser displayed issue

When enabling '-b' option in perf record, for example,

perf record -b ...
perf report

and then browsing the annotate browser from perf report (press 'A'), it
would fail (annotate browser can't be displayed).

It's because the '.add_entry_cb' op of struct report is overwritten by
hist_iter__branch_callback() in builtin-report.c. But this function doesn't do
something like mapping symbols and sources. So next, do_annotate() will return
directly.

notes = symbol__annotation(act->ms.sym);
if (!notes->src)
return 0;

This patch adds the lost code to hist_iter__branch_callback (refer to
hist_iter__report_callback).

v2:

Fix a crash bug when perform 'perf report --stdio'.

The reason is that we init the symbol annotation only in browser mode, it
doesn't allocate/init resources for stdio mode.

So now in hist_iter__branch_callback(), it will return directly if it's not in
browser mode.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1514284963-18587-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 712d36db 04-Dec-2017 Seokho Song <0xdevssh@gmail.com>

perf report: Set browser mode right before setup_browser()

There are codes that print messages to the screen between assignment of
the use_browser variable and setup_browser().

But since the GUI browser is not initialized during that period, all
messages fail to show if the user passed the --gtk option to perf as GTK
is not initialized yet.

Reorder the code to assign use_browser variable right before
setup_browser() is called.

Signed-off-by: Seokho Song <0xdevssh@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171204160244.6332-1-0xdevssh@gmail.com
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3f0a4c87 14-Nov-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Ignore kptr_restrict when not sampling the kernel

If none of the evsels has attr.exclude_kernel set to zero, no kernel
samples, so no point in warning the user about problems in processing
kernel samples, as there will be none.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7dn926v3at8txxkky92aesz2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9c39ed90 14-Nov-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Ignore kptr_restrict when not sampling the kernel

If none of the evsels has attr.exclude_kernel set to zero, no kernel
samples, so no point in warning the user about problems in processing
kernel samples, as there will be none.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7dn926v3at8txxkky92aesz2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# eae8ad80 23-Jan-2017 Jiri Olsa <jolsa@kernel.org>

perf tools: Add struct perf_data_file

Add struct perf_data_file to represent a single file within a perf_data
struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-c3f9p4xzykr845ktqcek6p4t@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8ceb41d7 23-Jan-2017 Jiri Olsa <jolsa@kernel.org>

perf tools: Rename struct perf_data_file to perf_data

Rename struct perf_data_file to perf_data, because we will add the
possibility to have multiple files under perf.data, so the 'perf_data'
name fits better.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-39wn4d77phel3dgkzo3lyan0@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9933183e 24-Aug-2017 Jiri Olsa <jolsa@kernel.org>

perf report: Group stat values on global event id

There's no big value on displaying counts for every event ID, which is
one per every CPU. Rather than that, displaying the whole sum for the
event.

$ perf record -c 100000 -e cycles:u -s test
$ perf report -T

Before:
# PID TID cycles:u cycles:u cycles:u cycles:u ... [20 more columns of 'cycles:u']
3339 3339 0 0 0 0
3340 3340 0 0 0 0
3341 3341 0 0 0 0
3342 3342 0 0 0 0

Now:
# PID TID cycles:u
3339 3339 19678
3340 3340 18744
3341 3341 17335
3342 3342 26414

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# dac7f6b7 24-Aug-2017 Jiri Olsa <jolsa@kernel.org>

perf report: Add dump_read function

Adding dump_read function to gather all the dump output of read
function. Adding output of enabled and running times and id if enabled
(3 new lines with '...' prefix below).

$ perf record -s ...
$ perf report -D

958358311769 0x91f8 [0x40]: PERF_RECORD_READ: 3339 3339 cycles:u 0
... time enabled : 958358313731
... time running : 958358313731
... id : 80

Committer note:

Do not use 'read' as a variable name as it breaks the build on older
systems, such as RHEL6:

CC /tmp/build/perf/util/session.o
cc1: warnings being treated as errors
util/session.c: In function 'dump_read':
util/session.c:1132: error: declaration of 'read' shadows a global declaration
/usr/include/bits/unistd.h:35: error: shadowed declaration is here
mv: cannot stat `/tmp/build/perf/util/.session.o.tmp': No such file or directory

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b49a821e 08-May-2017 Jin Yao <yao.jin@linux.intel.com>

perf report: Make --branch-history work without callgraphs(-g) option in perf record

perf record -b -g <command>
perf report --branch-history

This merges the LBRs with the callgraphs.

However it would be nice if it also works without callgraphs (-g) set in
perf record, so that only the LBRs are displayed. But currently perf
report errors in this case. For example,

perf record -b <command>
perf report --branch-history

Error:
Selected -g or --branch-history but no callchain data. Did
you call 'perf record' without -g?

This patch displays the LBRs only even if callgraphs(-g) is not enabled
in perf record.

Change log:

v2: According to Milian Wolff's comment, change the obsolete error
message. Now the error message is:

┌─Error:─────────────────────────────────────┐
│Selected -g or --branch-history. │
│But no callchain or branch data. │
│Did you call 'perf record' without -g or -b?│
│ │
│ │
│Press any key... │
└────────────────────────────────────────────┘

When passing the last parameter to hists__fprintf,
changes "|" to "||".

hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
symbol_conf.use_callchain || symbol_conf.show_branchflag_count);

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1494240182-28899-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bab89f6a 20-Jul-2017 Taeung Song <treeze.taeung@gmail.com>

perf hists: Pass perf_sample to __symbol__inc_addr_samples()

To pave the way to use perf_sample fields in the annotate code, storing
sample->period in sym_hist->addr->period and its sum in
sym_hist->period.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com
[ split and adjusted from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2d78b189 18-Jul-2017 Jin Yao <yao.jin@linux.intel.com>

perf report: Show branch type statistics for stdio mode

Show the branch type statistics at the end of perf report --stdio.

For example:

perf report --stdio

COND_FWD: 28.5%
COND_BWD: 9.4%
CROSS_4K: 0.7%
CROSS_2M: 14.1%
COND: 37.9%
UNCOND: 0.2%
IND: 6.7%
CALL: 26.5%
RET: 28.7%
SYSRET: 0.0%

The branch types are:

COND_FWD: conditional forward
COND_BWD: conditional backward
COND: conditional branch
UNCOND: unconditional branch
IND: indirect
CALL: function call
IND_CALL: indirect function call
RET: function return
SYSCALL: syscall
SYSRET: syscall return
COND_CALL: conditional function call
COND_RET: conditional function return

CROSS_4K and CROSS_2M:

They are the metrics checking for branches cross 4K or 2MB pages.
It's an approximate computing. We don't know if the area is 4K or
2MB, so always compute both.

To make the output simple, if a branch crosses 2M area, CROSS_4K
will not be incremented.

Change log

v7: Since the common branch type definitions are changed, some
tags/strings are updated accordingly.

v6: Remove branch_type_stat_display() since it's moved to branch.c.

v5: Remove the unnecessary sort__mode checking in
hist_iter__branch_callback().

v4: Comparing to previous version, the major changes are:

Add the computing of JCC forward/JCC backward and cross page checking
by using the from and to addresses.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-7-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e9def1b2 17-Jul-2017 David Carrillo-Cisneros <davidcc@google.com>

perf tools: Add feature header record to pipe-mode

Add header record types to pipe-mode, reusing the functions
used in file-mode and leveraging the new struct feat_fd.

For alignment, check that synthesized events don't exceed
pagesize.

Add the perf_event__synthesize_feature event call back to
process the new header records.

Before this patch:

$ perf record -o - -e cycles sleep 1 | perf report --stdio --header
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
...

After this patch:
$ perf record -o - -e cycles sleep 1 | perf report --stdio --header
# ========
# captured on: Mon May 22 16:33:43 2017
# ========
#
# hostname : my_hostname
# os release : 4.11.0-dbx-up_perf
# perf version : 4.11.rc6.g6277c80
# arch : x86_64
# nrcpus online : 72
# nrcpus avail : 72
# cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
# cpuid : GenuineIntel,6,63,2
# total memory : 263457192 kB
# cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
...

Support added for the subcommands: report, inject, annotate and script.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 114f709e 17-Jul-2017 David Carrillo-Cisneros <davidcc@google.com>

perf tool: Add show_feature_header to perf_tool

Add show_feat_hdr to control level of printed information of feature
headers.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170718042549.145161-15-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 644e0840 26-May-2017 Adrian Hunter <adrian.hunter@intel.com>

perf auxtrace: Add CPU filter support

Decoding auxtrace data can take a long time. To avoid decoding
unnecessarily, filter auxtrace data that is collected per-cpu before it is
decoded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-38-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 25ce4bb8 27-Jun-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf config: Do not die when parsing u64 or int config values

Just warn the user and ignore those values.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-tbf60nj3ierm6hrkhpothymx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7a8ef4c4 19-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove string.h, unistd.h and sys/stat.h from util.h

Not needed in this header, added to the places that need FILE,
putchar(), access() and a few other prototypes.

Link: http://lkml.kernel.org/n/tip-xxtdsl6nsna82j7puwbdjqhs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 58db1d6e 19-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move units conversion/formatting routines to separate object

Out of util.h, to disentangle it a bit more.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vpksyj3w5fk9t8s6mxmkajyr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9607ad3a 19-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add signal.h to places using its definitions

And remove it from util.h, disentangling it a bit more.

Link: http://lkml.kernel.org/n/tip-2zg9s5nx90yde64j3g4z2uhk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1eae20c1 17-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove regex.h and fnmatch.h from util.h

The users of regex and fnmatch functions should include those headers
instead.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ixzm5kuamsq1ixbkuv6kmwzj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8ec20b17 18-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf str{filter,list}: Disentangle headers

There are places where we just need a forward declaration, and others
were we need to include strlist.h and/or strfilter.h, reducing the
impact of changes in headers on the build time, do it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-zab42gbiki88y9k0csorxekb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a43783ae 18-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Include errno.h where needed

Removing it from util.h, part of an effort to disentangle the includes
hell, that makes changes to util.h or something included by it to cause
a complete rebuild of the tools.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ztrjy52q1rqcchuy3rubfgt2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fd20e811 17-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Including missing inttypes.h header

Needed to use the PRI[xu](32,64) formatting macros.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wkbho8kaw24q67dd11q0j39f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f3a60646 25-Mar-2017 Jin Yao <yao.jin@linux.intel.com>

perf report: Introduce --inline option

It takes some time to look for inline stack for callgraph addresses. So
it provides new option "--inline" to let user decide if enable this
feature.

--inline:

If a callgraph address belongs to an inlined function, the inline stack
will be printed. Each entry is the inline function name or file/line.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b0ad8ea6 27-Mar-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove unused 'prefix' from builtin functions

We got it from the git sources but never used it for anything, with the
place where this would be somehow used remaining:

static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
{
prefix = NULL;
if (p->option & RUN_SETUP)
prefix = NULL; /* setup_perf_directory(); */

Ditch it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uw5swz05vol0qpr32c5lpvus@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f3b3614a 07-Mar-2017 Hari Bathini <hbathini@linux.vnet.ibm.com>

perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info

Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
by the kernel when fork, clone, setns or unshare are invoked. And update
perf-record documentation with the new option to record namespace
events.

Committer notes:

Combined it with a later patch to allow printing it via 'perf report -D'
and be able to test the feature introduced in this patch. Had to move
here also perf_ns__name(), that was introduced in another later patch.

Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt:

util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
^
Testing it:

# perf record --namespaces -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
#
# perf report -D
<SNIP>
3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

0x1151e0 [0x30]: event: 9
.
. ... raw event: size 48 bytes
. 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
. 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
. 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................
<SNIP>
NAMESPACES events: 1
<SNIP>
#

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8b53dbef 07-Mar-2017 Namhyung Kim <namhyung@kernel.org>

perf report: Hide tip message when -q option is given

The tip message at the end was printed regardless of the -q option.

Originally, the message suggested only '-s comm,dso' option for higher
level view when no sort option and parent option were given.

Now it shows random help message regardless of the options so the
condition can be simplified to honor the -q option.

Committer notes:

Before:

$ perf report --stdio -q
42.77% ls ls [.] _init
13.21% ls ld-2.24.so [.] match_symbol
12.55% ls libc-2.24.so [.] __strcoll_l
11.94% ls libc-2.24.so [.] _init

#
# (Tip: Show current config key-value pairs: perf config --list)
#
$

After:

$ perf report --stdio -q
42.77% ls ls [.] _init
13.21% ls ld-2.24.so [.] match_symbol
12.55% ls libc-2.24.so [.] __strcoll_l
11.94% ls libc-2.24.so [.] _init

$

We still have those two extra lines tho (that git commit insists in
turning into one, or git commit --amend doesn't make me add), food for
another patch...

Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170307150851.22304-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 27fafab5 17-Feb-2017 Namhyung Kim <namhyung@kernel.org>

perf report: Add -q/--quiet option

The -q/--quiet option is to suppress any message. Sometimes users just
want to see the numbers and it can be used for that case.

Before:

$ perf report | head -15
Failed to open /lib/modules/3.19.3-3-ARCH/kernel/fs/ext4/ext4.ko.gz, continuing without symbols
Failed to open /lib/modules/3.19.3-3-ARCH/kernel/fs/jbd2/jbd2.ko.gz, continuing without symbols
Failed to open /tmp/perf-14507.map, continuing without symbols
...
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 39K of event 'cycles'
# Event count (approx.): 30444796573
#
# Overhead Command Shared Object Symbol
# ........ ........... ................... .........................
#
9.28% swapper [kernel.vmlinux] [k] intel_idle
5.64% swapper [kernel.vmlinux] [k] native_write_msr_safe
1.93% swapper [kernel.vmlinux] [k] __switch_to
1.89% swapper [kernel.vmlinux] [k] menu_select
1.75% sched-pipe [kernel.vmlinux] [k] __switch_to

After:

$ perf report -q | head
9.28% swapper [kernel.vmlinux] [k] intel_idle
5.64% swapper [kernel.vmlinux] [k] native_write_msr_safe
1.93% swapper [kernel.vmlinux] [k] __switch_to
1.89% swapper [kernel.vmlinux] [k] menu_select
1.75% sched-pipe [kernel.vmlinux] [k] __switch_to
1.67% swapper [kernel.vmlinux] [k] cpu_startup_entry
1.48% sched-pipe [kernel.vmlinux] [k] enqueue_entity
1.46% swapper [kernel.vmlinux] [k] __schedule
1.36% swapper [kernel.vmlinux] [k] native_read_tsc
1.34% sched-pipe [kernel.vmlinux] [k] __schedule

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-4-namhyung@kernel.org
[ Removed builtin-report.c verbose > 0 hunk added to the previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bb963e16 17-Feb-2017 Namhyung Kim <namhyung@kernel.org>

perf utils: Check verbose flag properly

It now can have negative value to suppress the message entirely. So it
needs to check it being positive.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-3-namhyung@kernel.org
[ Adjust fuzz on tools/perf/util/pmu.c, add > 0 checks in many other places ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ecc4c561 24-Jan-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Propagate perf_config() errors

Previously these were being ignored, sometimes silently.

Stop doing that, emitting debug messages and handling the errors.

Testing it:

$ cat ~/.perfconfig
cat: /home/acme/.perfconfig: No such file or directory
$ perf stat -e cycles usleep 1

Performance counter stats for 'usleep 1':

938,996 cycles:u

0.003813731 seconds time elapsed

$ perf top --stdio
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
<SNIP>
[ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
[acme@jouet linux]$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
# Overhead Command Shared Object Symbol
# ........ ....... ................. .........................
71.77% usleep libc-2.24.so [.] _dl_addr
27.07% usleep ld-2.24.so [.] _dl_next_ld_env_entry
1.13% usleep [kernel.kallsyms] [k] page_fault
$
$ touch ~/.perfconfig
$ ls -la ~/.perfconfig
-rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
$
$ perf stat -e instructions usleep 1

Performance counter stats for 'usleep 1':

244,610 instructions:u

0.000805383 seconds time elapsed

$
[root@jouet ~]# chown acme.acme ~/.perfconfig
[root@jouet ~]# perf stat -e cycles usleep 1
Warning: File /root/.perfconfig not owned by current user or root, ignoring it.

Performance counter stats for 'usleep 1':

937,615 cycles

0.000836931 seconds time elapsed
#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7e6a7998 12-Dec-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove some needless __maybe_unused

I.e. those parameters/functions _are_ used, so ditch that misleading attribute.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-13cqtjh0yojg5gzvpq1zzpl0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 46690a80 29-Nov-2016 David Ahern <dsa@cumulusnetworks.com>

perf report: Add option to specify time window of interest

Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

Using the perf.data file captured via 'perf kmem record':

# perf report --header-only
# ========
# captured on: Tue Nov 29 16:01:53 2016
# hostname : jouet
# os release : 4.8.8-300.fc25.x86_64
# perf version : 4.9.rc6.g5a6aca
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
# cpuid : GenuineIntel,6,61,4
# total memory : 20254660 kB
# cmdline : /home/acme/bin/perf kmem record usleep 1
# event : name = kmem:kmalloc, , id = { 931980, 931981, 931982, 931983 }, type = 2, size = 112, config = 0x1b9, { sample_period, sample_freq } = 1, sample_typ
# event : name = kmem:kmalloc_node, , id = { 931984, 931985, 931986, 931987 }, type = 2, size = 112, config = 0x1b7, { sample_period, sample_freq } = 1, sampl
# event : name = kmem:kfree, , id = { 931988, 931989, 931990, 931991 }, type = 2, size = 112, config = 0x1b5, { sample_period, sample_freq } = 1, sample_type
# event : name = kmem:kmem_cache_alloc, , id = { 931992, 931993, 931994, 931995 }, type = 2, size = 112, config = 0x1b8, { sample_period, sample_freq } = 1, s
# event : name = kmem:kmem_cache_alloc_node, , id = { 931996, 931997, 931998, 931999 }, type = 2, size = 112, config = 0x1b6, { sample_period, sample_freq } =
# event : name = kmem:kmem_cache_free, , id = { 932000, 932001, 932002, 932003 }, type = 2, size = 112, config = 0x1b4, { sample_period, sample_freq } = 1, sa
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, intel_pt = 7, intel_bts = 6, uncore_arb = 13, cstate_pkg = 15, breakpoint = 5, uncore_cbox_1 = 12, power = 9, software = 1, uncore_im
# HEADER_CACHE info available, use -I to display
# missing features: HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT
# ========
#
# # Looking at just the histogram entries for the first event:
#
# perf report | head -33
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 40 of event 'kmem:kmalloc'
# Event count (approx.): 40
#
# Overhead Trace output
# ........ ...............................................................................................................
#
37.50% call_site=ffffffffb91ad3c7 ptr=0xffff88895fc05000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
10.00% call_site=ffffffffb9258416 ptr=0xffff888a1dc61f00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
7.50% call_site=ffffffffb9258416 ptr=0xffff888a2640ac00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
2.50% call_site=ffffffffb92759ba ptr=0xffff888a26776000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb9276864 ptr=0xffff8886f6b82600 bytes_req=136 bytes_alloc=192 gfp_flags=GFP_KERNEL|__GFP_ZERO
2.50% call_site=ffffffffb9276903 ptr=0xffff888aefcf0460 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad0ce ptr=0xffff888756c98a00 bytes_req=392 bytes_alloc=512 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad0ce ptr=0xffff888756c9ba00 bytes_req=504 bytes_alloc=512 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad301 ptr=0xffff888a31747600 bytes_req=128 bytes_alloc=128 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb92ad511 ptr=0xffff888a9d26a2a0 bytes_req=28 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c11a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c12c0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c1540 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c15a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c15e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c16e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff88873e8c1c20 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb936a7fb ptr=0xffff888a9d26a2a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931240 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931980 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
2.50% call_site=ffffffffb9373e66 ptr=0xffff8889f1931a00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO

#
# # And then limiting using the example for 'perf kmem stat --time' used
# # in the previous changeset committer note we see that there were no
# # kmem:kmalloc in that last part of the file, but there were some
# # kmem:kmem_cache_alloc ones:
#
# perf report --time 20119.782088, --stdio
#
# Total Lost Samples: 0
#
# Samples: 0 of event 'kmem:kmalloc'
# Event count (approx.): 0
#
# Overhead Trace output
# ........ ............
#

# Samples: 0 of event 'kmem:kmalloc_node'
# Event count (approx.): 0
#
# Overhead Trace output
# ........ ............
#

# Samples: 0 of event 'kmem:kfree'
# Event count (approx.): 0
#
# Overhead Trace output
# ........ ............
#

# Samples: 8 of event 'kmem:kmem_cache_alloc'
# Event count (approx.): 8
#
# Overhead Trace output
# ........ ..................................................................................................................
#
75.00% call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
12.50% call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
12.50% call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
#

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-7-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f9a7be7c 30-Oct-2016 Jin Yao <yao.jin@linux.intel.com>

perf report: Create a symbol_conf flag for showing branch flag counting

Create a new flag show_branchflag_count in symbol_conf. The flag is used
to control if showing the branch flag counting information. The flag
depends on if the perf.data has branch data and if user chooses the
"branch-history" option in perf report command line.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/1477876794-30749-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 89973506 14-Oct-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Use normal error reporting when processing PERF_RECORD_READ events

We already have handling for errors when processing PERF_RECORD_ events,
so instead of calling die() when not being able to alloc, propagate the
error, so that the normal UI exit sequence can take place, the user be
warned and possibly the terminal be properly reset to a sane mode.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-r90je3c009a125dvs3525yge@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 30d476ae 13-Sep-2016 Namhyung Kim <namhyung@kernel.org>

perf report: Enable group view with hierarchy

Now that all the missing pieces are implemented, let's enable it. An
example output below:

$ perf record -e '{cycles,instructions}' make
$ perf report --hierarchy --stdio
...
# Overhead Command / Shared Object / Symbol
# ...................... ..................................
#
...
25.74% 27.18% sh
19.96% 24.14% libc-2.24.so
9.55% 14.64% [.] __strcmp_sse2
1.54% 0.00% [.] __tfind
1.07% 1.13% [.] _int_malloc
0.95% 0.00% [.] __strchr_sse2
0.89% 1.39% [.] __tsearch
0.76% 0.00% [.] strlen
...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Requested-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160913074552.13284-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b01141f4 25-Aug-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Initialize the priv are in symbol__new()

We need to initializa some fields (right now just a mutex) when we
allocate the per symbol annotation struct, so do it at the symbol
constructor instead of (ab)using the filter mechanism for that.

This way we remove one of the few cases we have for that symbol filter,
which will eventually led to removing it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-cvz34avlz1lez888lob95390@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fa1f4565 12-Aug-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Allow configuring the default sort order in ~/.perfconfig

Allows changing the default sort order from "comm,dso,symbol" to some
other default, for instance "sym,dso" may be more fitting for kernel
developers.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-pm1h5puxua8nsxksd68fjm8r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 175b968b 05-Jul-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Introduce --stdio-color to setup the color output mode selection

'perf report --stdio' will colorize entries with most hits and possibly
some other aspects of its output, but those colors gets suppressed if we
redirect the output to a non-tty, allow keeping the colors by adding a
new option, --stdio-color, now this use case will also output escape
sequences for colors:

$ perf annotate --stdio-color | more

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3iuawqjldu4i8gziot7e3d5n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e5cadb93 23-Jun-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Rename for_each() macros to for_each_entry()

To match the semantics for list.h in the kernel, that are used to
implement those macros.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qbcjlgj0ffxquxscahbpddi3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 41840d21 23-Jun-2016 Taeung Song <treeze.taeung@gmail.com>

perf config: Move config declarations from util/cache.h to util/config.h

Lately util/config.h has been added but util/cache.h has declarations of
functions and a global variable for config features.

To manage codes about configuration at one spot, move them to
util/config.h and let source files that need config features include
config.h And if the source files that included previous cache.h need
only config.h, remove including cache.h.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1466672119-4852-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d05e3aae 14-Jun-2016 Jiri Olsa <jolsa@kernel.org>

perf stdio: Add use_callchain parameter to hists__fprintf

It will be convenient in following patches to display hists entries
without callchains even if they are defined.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1465928361-2442-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a7066709 19-May-2016 He Kuang <hekuang@huawei.com>

perf tools: Set buildid dir under symfs when --symfs is provided

This patch moves the reference of buildid dir to 'symfs/.debug' and
skips the local buildid dir when '--symfs' is given, so that every
single file opened by perf is relative to symfs directory now.

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463658462-85131-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fe176085 19-May-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Fix usage of max_stack sysctl

We cannot limit processing stacks from the current value of the sysctl,
as we may be processing perf.data files, possibly from other machines.

Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can
be overriden using --max-stack or equivalent.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Fixes: 4cb93446c587 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack")
Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# de7e6a7c 03-May-2016 Jiri Olsa <jolsa@kernel.org>

perf hists: Move sort__has_parent into struct perf_hpp_list

Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__has_parent into struct perf_hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 52225036 03-May-2016 Jiri Olsa <jolsa@kernel.org>

perf hists: Move sort__need_collapse into struct perf_hpp_list

Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.

Moving sort__need_collapse into struct perf_hpp_list.

Adding hists__has macro to easily access this info perf struct hists
object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4cb93446 27-Apr-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack

There is an upper limit to what tooling considers a valid callchain,
and it was tied to the hardcoded value in the kernel,
PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl,
make it read it and use that as the upper limit, falling back to
PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1cc83815 18-Apr-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Use callchain_param.enabled instead of tool specific knob

We have callchain_param.enabled, so no need to have something just for
'perf report' to do the same thing.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wbeisubpualwogwi5u8utnt1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 531d2410 23-Mar-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Do not include stringify.h from the kernel sources

Use instead the copy just made to tools/include/linux/.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q736w12nwy98x5ox2hamp5ow@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bb3eb566 22-Mar-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Rename perf_event__preprocess_sample to machine__resolve

Since we only deal with fields in the passed struct perf_sample move
this method to struct machine, that is where the perf_sample fields
will be resolved to a struct addr_location, i.e. thread, map, symbol,
etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-a1ww2lbm2vbuqsv4p7ilubu9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b8cbb349 26-Feb-2016 Wang Nan <wangnan0@huawei.com>

perf config: Bring perf_default_config to the very beginning at main()

Before this patch each subcommand calls perf_config() by themself,
reading the default configuration together with subcommand specific
options. If a subcommand doesn't have it own options, it needs to call
'perf_config(perf_default_config, NULL)' to ensure .perfconfig is
loaded.

This patch brings perf_config(perf_default_config, NULL) to the very
start of main(), so subcommands don't need to do it.

After this patch, 'llvm.clang-path' works for 'perf trace'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4251446d 24-Feb-2016 Namhyung Kim <namhyung@kernel.org>

perf report: Add --hierarchy option

The --hierarchy option is to show output in hierarchy mode. It extends
folding/unfolding in the TUI and GTK browsers to support sort items as
well as callchains. Users can toggle the items to see the performance
result at wanted level.

$ perf report --hierarchy --tui
Overhead Command / Shared Object / Symbol
--------------------------------------------------
+ 32.96% gnome-shell
- 15.11% swapper
- 14.97% [kernel.vmlinux]
6.82% [k] intel_idle
0.66% [k] menu_select
0.43% [k] __hrtimer_start_range_ns
...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-17-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5b2ea6f2 16-Feb-2016 Namhyung Kim <namhyung@kernel.org>

perf report: Check error during report__collapse_hists()

If it returns an error, warn user and bail out instead of silently
ignoring it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9887804d 18-Jan-2016 Jiri Olsa <jolsa@kernel.org>

perf report: Move UI initialization ahead of sort setup

The ui initialization changes hpp format callbacks, based on the used
browser. Thus we need this init being processed before setup_sorting.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 452ce03b 18-Jan-2016 Jiri Olsa <jolsa@kernel.org>

perf hists: Introduce perf_evsel__output_resort function

Adding evsel specific function to sort hists_evsel based hists. The
hists__output_resort can be now used to sort common hists object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2665b452 27-Jan-2016 Namhyung Kim <namhyung@kernel.org>

perf report: Apply --percent-limit to callchains also

Currently --percent-limit option only works for hist entries. However
it'd be better to have same effect to callchains as well

Requested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 34b7b0f9 09-Jan-2016 Namhyung Kim <namhyung@kernel.org>

perf tools: Fallback to srcdir/Documentation/tips.txt

Some people don't install perf, but just use compiled version in the
source. Fallback to lookup the source directory for those poor guys. :)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452334589-8782-4-git-send-email-namhyung@kernel.org
[ Make perf_tip() return NULL for ENOENT, making the fallback to really take place ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 14cbfbeb 07-Jan-2016 Namhyung Kim <namhyung@kernel.org>

perf report: Show random usage tip on the help line

Currently perf report only shows a help message "For a higher level
overview, try: perf report --sort comm,dso" unconditionally (even if
the sort keys were used). Add more help tips and show randomly.

Load tips from ${prefix}/share/doc/perf-tip/tips.txt file.

$ perf report | tail
0.10% swapper [kernel.vmlinux] [k] irq_exit
0.09% swapper [kernel.vmlinux] [k] flush_smp_call_function_queue
0.08% swapper [kernel.vmlinux] [k] native_write_msr_safe
0.03% swapper [kernel.vmlinux] [k] group_sched_in
0.01% perf [kernel.vmlinux] [k] native_write_msr_safe

#
# (Tip: Search options using a keyword: perf report -h <keyword>)
#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452166913-27046-1-git-send-email-namhyung@kernel.org
[ Renamed it to perf_tip() and the parameter dirname to dirpath to fix the build on older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 053a3989 22-Dec-2015 Namhyung Kim <namhyung@kernel.org>

perf report/top: Add --raw-trace option

The --raw-trace option allows disabling pretty printing by the event's
print_fmt or plugin. Besides that, each dynamic sort key now can
receive a 'raw' suffix separated by '/' to ask for the raw trace of a
specific field.

$ perf report -s comm,kmem:kmalloc.gfp_flags
...
# Overhead Command gfp_flags
# ........ ....... ...................
#
99.89% perf GFP_NOFS|GFP_ZERO
0.06% sleep GFP_KERNEL
0.03% perf GFP_KERNEL|GFP_ZERO
0.01% perf GFP_KERNEL

Now

$ perf report -s comm,kmem:kmalloc.gfp_flags --raw-trace
or
$ perf report -s comm,kmem:kmalloc.gfp_flags/raw
...
# Overhead Command gfp_flags
# ........ ....... ..........
#
99.89% perf 32848
0.06% sleep 208
0.03% perf 32976
0.01% perf 208

Suggested-and-Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 40184c46 22-Dec-2015 Namhyung Kim <namhyung@kernel.org>

perf tools: Pass evlist to setup_sorting()

This is a preparation to support dynamic sort keys for tracepoint
events. Dynamic sort keys can be created for specific fields in trace
events so it needs the event information.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-5-git-send-email-namhyung@kernel.org
[ Moving the evlist creation earlier in top was split to a previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4b6ab94e 15-Dec-2015 Josh Poimboeuf <jpoimboe@redhat.com>

perf subcmd: Create subcmd library

Move the subcommand-related files from perf to a new library named
libsubcmd.a.

Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to
'exec-cmd.*' to be consistent with the naming of all the other files.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b3f38fc2 09-Dec-2015 Namhyung Kim <namhyung@kernel.org>

perf report: Check argument before calling setup_browser()

This is necessary to get rid of the browser dependency from
usage_with_options() and its friends. Because there's no code
changing the argc and argv, it'd be ok to check it early.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 25b1606b 27-Nov-2015 Namhyung Kim <namhyung@kernel.org>

perf report: Show error message when processing sample fails

Currently when perf fails to process samples for some reason, it doesn't
show any message about the failure. This is very inconvenient for users
especially on TUI as screen is reset after the failure.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1448645559-31167-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b49a8fe5 26-Nov-2015 Namhyung Kim <namhyung@kernel.org>

perf callchain: Honor hide_unresolved

If user requested to hide unresolved entries, skip unresolved callchains
as well as hist entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1448521700-32062-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f2af0086 08-Nov-2015 Namhyung Kim <namhyung@kernel.org>

perf report: Add callchain value option

Now -g/--call-graph option supports how to display callchain values.
Possible values are 'percent', 'period' and 'count'. The percent is
same as before and it's the default behavior. The period displays the
raw period value rather than the percentage. The count displays the
number of occurrences.

$ perf report --no-children --stdio -g percent
...
39.93% swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--28.63%-- start_secondary
|
--11.30%-- rest_init

$ perf report --no-children --show-total-period --stdio -g period
...
39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--9334403-- start_secondary
|
--3684302-- rest_init

$ perf report --no-children --show-nr-samples --stdio -g count
...
39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--57-- start_secondary
|
--23-- rest_init

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2059fc7a 12-Nov-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Allow forcing reading of non-root owned files by root

When the root user tries to read a file owned by some other user we get:

# ls -la perf.data
-rw-------. 1 acme acme 20032 Nov 12 15:50 perf.data
# perf report
File perf.data not owned by current user or root (use -f to override)
# perf report -f | grep -v ^# | head -2
30.96% ls [kernel.vmlinux] [k] do_set_pte
28.24% ls libc-2.20.so [.] intel_check_word
#

That wasn't happening when the symbol code tried to read a JIT map,
where the same check was done but no forcing was possible, fix it.

Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://permalink.gmane.org/gmane.linux.kernel.perf.user/2380
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b272a59d 24-Oct-2015 Namhyung Kim <namhyung@kernel.org>

perf report: Rename to --show-cpu-utilization

So that it can be more consistent with other --show-* options. The old
name (--showcpuutilization) is provided only for compatibility.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445701767-12731-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 76a26549 22-Oct-2015 Namhyung Kim <namhyung@kernel.org>

perf tools: Improve call graph documents and help messages

The --call-graph option is complex so we should provide better guide for
users. Also change help message to be consistent with config option
names. Now perf top will show help like below:

$ perf top --call-graph
Error: option `call-graph' requires a value

Usage: perf top [<options>]

--call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]>
setup and enables call-graph (stack chain/backtrace):

record_mode: call graph recording mode (fp|dwarf|lbr)
record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>)
default: 8192 (bytes)
print_type: call graph printing style (graph|flat|fractal|none)
threshold: minimum call graph inclusion threshold (<percent>)
print_limit: maximum number of call graph entry (<number>)
order: call graph order (caller|callee)
sort_key: call graph sort key (function|address)
branch: include last branch info to call graph (branch)

Default: fp,graph,0.5,caller,function

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 792aeafa 22-Oct-2015 Namhyung Kim <namhyung@kernel.org>

perf tools: Defaults to 'caller' callchain order only if --children is enabled

The caller callchain order is useful with --children option since it can
show 'overview' style output, but other commands which don't use
--children feature like 'perf script' or even 'perf report/top' without
--children are better to keep callee order.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445499946-29817-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 21cf6284 22-Oct-2015 Namhyung Kim <namhyung@kernel.org>

perf tools: Move callchain help messages to callchain.h

These messages will be used by 'perf top' in the next patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445495330-25416-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# def02db0 05-Oct-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf callchain: Switch default to 'graph,0.5,caller'

Which is the most common default found in other similar tools.

Requested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://www.youtube.com/watch?v=nXaxk27zwlk
Link: http://lkml.kernel.org/n/tip-v8lq36aispvdwgxdmt9p9jd9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a5e813c6 30-Sep-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Add method for common kernel_map(FUNCTION) operation

And it is also a step in the direction of killing the separation of data
and text maps in map_groups.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rrds86kb3wx5wk8v38v56gw8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 77e65977 30-Sep-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Use machine__kernel_map() thoroughly

In places where we were using its open coded equivalent.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-khkdugcdoqy3tkszm3jdxgbe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 188bb5e2 25-Sep-2015 Adrian Hunter <adrian.hunter@intel.com>

perf report: Make max_stack value allow for synthesized callchains

perf report has an option (--max-stack) to set the maximum stack depth
when processing callchains. The option defaults to the hard-coded
maximum definition PERF_MAX_STACK_DEPTH which is 127. The intention of
the option is to allow the user to reduce the processing time by
reducing the amount of the callchain that is processed.

It is also possible, when processing instruction traces, to synthesize
callchains. Synthesized callchains do not have the kernel size
limitation and are whatever size the user requests, although validation
presently prevents the user requested a value greater that 1024. The
default value is 16.

To allow for synthesized callchains, make the max_stack value at least
the same size as the synthesized callchain size.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f86225db 25-Sep-2015 Adrian Hunter <adrian.hunter@intel.com>

perf report: Skip events with null branch stacks

A non-synthesized event might not have a branch stack if branch stacks
have been synthesized (using itrace options).

An example of that is when Intel PT records sched_switch events for
decoding purposes. Those sched_switch events do not have branch stacks
even though the Intel PT decoder may be synthesizing other events that
do due to the itrace options.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fb9fab66 25-Sep-2015 Adrian Hunter <adrian.hunter@intel.com>

perf report: Also do default setup for synthesized branch stacks

The 'perf report' tool will default to displaying branch stacks (-b
option) if they are present. Make that also happen for synthesized
branch stacks.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c7eced63 25-Sep-2015 Adrian Hunter <adrian.hunter@intel.com>

perf report: Adjust sample type validation for synthesized branch stacks

perf report looks at event sample types to determine if branch stacks
have been sampled. Adjust the validation to know about instruction
tracing options.

This change allows the use of the -b option which otherwise would
complain with an error like:

Error:
Selected -b but no branch data. Did you call perf record without -b?
# To display the perf.data header info,
# please use --header/--header-only options.
#

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d062ac16 25-Sep-2015 Adrian Hunter <adrian.hunter@intel.com>

perf report: Fix sample type validation for synthesized callchains

Processing instruction tracing data (e.g. Intel PT) can synthesize
callchains e.g.

$ perf record -e intel_pt//u uname
$ perf report --stdio --itrace=ige

However perf report's callgraph option gets extra validation, so:

$ perf report --stdio --itrace=ige -gflat
Error:
Selected -g or --branch-history but no callchain data. Did
you call 'perf record' without -g?
# To display the perf.data header info,
# please use --header/--header-only options.
#

Fix the validation to know about instruction tracing options so
above command works.

A side-effect of the change is that the default option to
accumulate the callchain of child functions comes into force.
To get the previous behaviour the --no-children option can be
used e.g.

$ perf report --stdio --itrace=ige -gflat --no-children

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 84734b06 04-Sep-2015 Kan Liang <kan.liang@intel.com>

perf hists browser: Zoom in/out for processor socket

Currently, users can zoom in/out for threads and dso in 'perf top' and
'perf report'.

This patch extends it for the processor sockets.

'S' is the short key to zoom into current Processor Socket.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1441377946-44429-4-git-send-email-kan.liang@intel.com
[ - Made it elide the Socket column when zooming into it,
just like with the other zoom ops;
- Make it use browser->pstack, to unzoom level by level;
- Rename 'socket' variables to 'socket_id' to make it build on
older systems where it shadows a global glibc declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 21394d94 04-Sep-2015 Kan Liang <kan.liang@intel.com>

perf report: Introduce --socket-filter option

Introduce --socket-filter option for 'perf report' to only show entries
for a processor socket that match this filter.

$ perf report --socket-filter 1 --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 752 of event 'cycles'
# Event count (approx.): 350995599
# Processor Socket: 1
#
# Overhead Command Shared Object Symbol
# ........ ......... ................ .................................
#
97.02% test test [.] plusB_c
0.97% test test [.] plusA_c
0.23% swapper [kernel.vmlinux] [k] acpi_idle_do_entry
0.09% rcu_sched [kernel.vmlinux] [k] dyntick_save_progress_counter
0.01% swapper [kernel.vmlinux] [k] task_waking_fair
0.00% swapper [kernel.vmlinux] [k] run_timer_softirq

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1441377946-44429-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9e207ddf 11-Aug-2015 Kan Liang <kan.liang@intel.com>

perf report: Show call graph from reference events

Introduce --show-ref-call-graph for perf report to print reference
callgraph for no callgraph event.

Here is an example.

perf report --show-ref-call-graph --stdio

# To display the perf.data header info, please use
--header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 5 of event 'cpu/cpu-cycles,call-graph=fp/'
# Event count (approx.): 144985
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................ ........................................
#
72.30% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath
|
---entry_SYSCALL_64_fastpath
|
|--22.62%-- __GI___libc_nanosleep
--77.38%-- [...]

......

# Samples: 6 of event 'cpu/instructions,call-graph=no/', show reference callgraph
# Event count (approx.): 172780
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................ ........................................
#
73.16% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath
|
---entry_SYSCALL_64_fastpath
|
|--31.44%-- __GI___libc_nanosleep
--68.56%-- [...]

Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1439289050-40510-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a9710ba0 07-Aug-2015 Andi Kleen <ak@linux.intel.com>

perf tools: Support full source file paths for srcline

For perf report/script srcline currently only the base file name of the
source file is printed. This is a good default because it usually fits
on the screen.

But in some cases we want to know the full file name, for example to
aggregate hits per file.

In the later case we need more than the base file name to resolve file
naming collisions: for example the kernel source has ~70 files named
"core.c"

It's also useful as input to post processing tools which want to point
to the right file.

Add a flag to allow full file name output.

Add an option to perf report/script to enable this option.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438986245-15191-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 57849998 18-Jul-2015 Andi Kleen <ak@linux.intel.com>

perf report: Add processing for cycle histograms

Call the earlier added cycle histogram infrastructure from the perf
report hist iter callback. For this we walk the branch records.

This allows to use cycle histograms when browsing perf report annotate.

v2: Rename flag

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 98df858e 18-Jul-2015 Andi Kleen <ak@linux.intel.com>

perf report: Add flag for non ANY branch mode

Later patches need to cheaply check that the branch mode is in ANY. Add
a new function to check all event attrs and add a flag to the report
state, which is then initialized.

v2: Rename flag

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 07a716ff 30-Jun-2015 Taeung Song <treeze.taeung@gmail.com>

perf report: Fill in the missing session freeing after an error occurs

When an error occurs an error value is just returned without freeing the
session. So allocating and freeing session have to be matched as a pair
even if an error occurs.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1435652124-22414-6-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 36c8bb56 19-Jun-2015 Li Zhang <zhlcindy@linux.vnet.ibm.com>

perf symbols: Check access permission when reading symbol files

There 2 problems when reading symbols files:

* It doesn't report any errors even if when users specify symbol
files which don't exist with --kallsyms or --vmlinux. The result
just shows the address without symbols, which is not what is expected.
So it's better to report errors and exit the program.

* When using command perf report --kallsyms=/proc/kallsyms with a
non-root user, symbols are resolved. Then select one symbol and
annotate it, it reports the error as the following:
Can't annotate __clear_user: No vmlinux file with build id xxx was
found.

The problem is caused by reading /proc/kcore without access permission.
/proc/kcore requires CAP_SYS_RAWIO capability to access, so it needs to
change access permission to allow a specific user to read /proc/kcore or
use root to execute the perf command.

This patch is to report errors when symbol files specified by users
don't exist. And check access permission of /proc/kcore when reading it.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1434704253-2632-1-git-send-email-zhlcindy@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6ba29c2f 10-Jun-2015 He Kuang <hekuang@huawei.com>

perf tools: Fix build failure on 32-bit arch

Failed in 32bit arch build like this:

CC /opt/h00206996/output/perf/arm32/builtin-record.o
util/session.c: In function ‘perf_session__warn_about_errors’:
util/session.c:1304:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’,
but argument 2 has type ‘long long unsigned int’ [-Werror=format=]

builtin-report.c: In function ‘perf_evlist__tty_browse_hists’:
builtin-report.c:323:2: error: format ‘%lu’ expects argument of type ‘long unsigned int’,
but argument 3 has type ‘u64’ [-Werror=format=]

Replace %lu format strings in warning message with PRIu64 for u64
'total_lost_samples' to fix this problem.

Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1434026664-71642-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c4937a91 10-May-2015 Kan Liang <kan.liang@intel.com>

perf tools: handle PERF_RECORD_LOST_SAMPLES

This patch modifies the perf tool to handle the new RECORD type,
PERF_RECORD_LOST_SAMPLES.

The number of lost-sample events is stored in
.nr_events[PERF_RECORD_LOST_SAMPLES]. The exact number of samples
which the kernel dropped is stored in total_lost_samples.

When the percentage of dropped samples is greater than 5%, a warning
is printed.

Here are some examples:

Eg 1, Recording different frequently-occurring events is safe with the
patch. Only a very low drop rate is associated with such actions.

$ perf record -e '{cycles:p,instructions:p}' -c 20003 --no-time ~/tchain ~/tchain

$ perf report -D | tail
SAMPLE events: 120243
MMAP2 events: 5
LOST_SAMPLES events: 24
FINISHED_ROUND events: 15
cycles:p stats:
TOTAL events: 59348
SAMPLE events: 59348
instructions:p stats:
TOTAL events: 60895
SAMPLE events: 60895

$ perf report --stdio --group
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 24
#
# Samples: 120K of event 'anon group { cycles:p, instructions:p }'
# Event count (approx.): 24048600000
#
# Overhead Command Shared Object Symbol
# ................ ........... ................
..................................
#
99.74% 99.86% tchain_edit tchain_edit [.] f3
0.09% 0.02% tchain_edit tchain_edit [.] f2
0.04% 0.00% tchain_edit [kernel.vmlinux] [k] ixgbe_read_reg

Eg 2, Recording the same thing multiple times can lead to high drop
rate, but it is not a useful configuration.

$ perf record -e '{cycles:p,cycles:p}' -c 20003 --no-time ~/tchain
Warning: Processed 600592 samples and lost 99.73% samples!
[perf record: Woken up 148 times to write data]
[perf record: Captured and wrote 36.922 MB perf.data (1206322 samples)]
[perf record: Woken up 1 times to write data]
[perf record: Captured and wrote 0.121 MB perf.data (1629 samples)]

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285195-14269-9-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 063bd936 19-May-2015 Namhyung Kim <namhyung@kernel.org>

perf hists: Reducing arguments of hist_entry_iter__add()

The evsel and sample arguments are to set iter for later use. As it
also receives an iter as another argument, just set them before calling
the function.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1432022650-18205-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 021162cf 11-May-2015 Namhyung Kim <namhyung@kernel.org>

perf report: Do not restrict -T option by other options

It seems there's no reason to suppress per-thread event stat by -T
option when -s or -p option is used. Make it work with those options.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1431351879-23798-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b138f42e 09-May-2015 Namhyung Kim <namhyung@kernel.org>

perf report: Force tty output if -T/--thread option is given

The -T/--thread option is supported only on --stdio mode (at least for
now). So enforce the tty output if the option was requested.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1431184784-30525-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b91fc39f 06-Apr-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Protect the machine->threads with a rwlock

In addition to using refcounts for the struct thread lifetime
management, we need to protect access to machine->threads from
concurrent access.

That happens in 'perf top', where a thread processes events, inserting
and deleting entries from that rb_tree while another thread decays
hist_entries, that end up dropping references and ultimately deleting
threads from the rb_tree and releasing its resources when no further
hist_entry (or other data structures, like in 'perf sched') references
it.

So the rule is the same for refcounts + protected trees in the kernel,
get the tree lock, find object, bump the refcount, drop the tree lock,
return, use object, drop the refcount if no more use of it is needed,
keep it if storing it in some other data structure, drop when releasing
that data structure.

I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
"perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".

The addr_location__put() one is because as we return references to
several data structures, we may end up adding more reference counting
for the other data structures and then we'll drop it at
addr_location__put() time.

Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 520a2ebc 24-Apr-2015 Adrian Hunter <adrian.hunter@intel.com>

perf report: Add Instruction Tracing support

Add support for decoding an AUX area assuming it contains instruction
tracing data.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1429903807-20559-4-git-send-email-adrian.hunter@intel.com
[ Do not use -Z as an alternative to --itrace ]
[ Fixed initialization of itrace_synth_opts struct fields on older gcc versions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e944ec2c 29-Apr-2015 Namhyung Kim <namhyung@kernel.org>

perf report: Fix -T/--threads option to work again

The commit 512ae1bd6acb ("perf tools: Consolidate management of default
sort orders") changed default value of the 'sort_order' variable to NULL
indicating that users don't set any sort keys on the command line.

However it missed to update a check in perf_evlist__tty_browse_hists()
so that 'perf report -T' cannot show the per-thread values after the
normal output. This patch fixes it to work again.

Note that the -T option only works on --stdio and neither --sort nor
--parent option was given.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1430309328-28317-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f6fcc143 08-Apr-2015 Wang Nan <wangnan0@huawei.com>

perf report: Don't call map__kmap if map is NULL.

report__warn_kptr_restrict() calls map__kmap(kernel_map) before checking
kernel_map againest NULL.

Which is dangerous, since map__kmap() will return a invalid and not NULL
address.

It will trigger a warning message in map__kmap() after the patch "perf:
kmaps: enforce usage of kmaps to protect futher bugs." was applied.

This patch fixes it by adding the missing checking.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1428490772-135393-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e03eaa40 24-Mar-2015 David Ahern <dsahern@gmail.com>

perf tools: Add pid/tid filtering to report and script commands

The 'record' and 'top' tools already allow a user to specify a CSV of
pids and/or tids of tasks to collect data.

Add those options to the 'report' and 'script' analysis commands to only
consider samples related to the given pids/tids.

This is also inline with the existing comm option.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1427212361-7066-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 228f14f2 22-Mar-2015 Yunlong Song <yunlong.song@huawei.com>

perf tools: Remove (null) value of "Sort order" for perf mem report

When '--sort' is not set, 'perf mem report" will print a null pointer as
the output value of sort order, so fix it.

Example:

Before this patch:

$ perf mem report
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 18 of event 'cpu/mem-loads/pp'
# Total weight : 188
# Sort order : (null)
#
...

After this patch:

$ perf mem report
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 18 of event 'cpu/mem-loads/pp'
# Total weight : 188
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
...

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427082605-12881-1-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0c8c2077 12-Mar-2015 Wang Nan <wangnan0@huawei.com>

perf report: Don't allow empty argument for '-t'.

Without this patch, perf report cause segfault if pass "" as '-t':

$ perf report -t ""

# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 37 of event 'syscalls:sys_enter_write'
# Event count (approx.): 37
#
# Children SelfCommand Shared Object Symbol
Segmentation fault

Since -t is used to add field-separator for generate table, -t "" is
actually meanless. This patch defines a new OPT_STRING_NOEMPTY() option
generator to ensure user never pass empty string to that option.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1426251114-198991-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b7b61cbe 03-Mar-2015 Arnaldo Carvalho de Melo <acme@redhat.com>

perf ordered_events: Shorten function signatures

By keeping pointers to machines, evlist and tool in ordered_events.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-0c6huyaf59mqtm2ek9pmposl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fefd2d96 14-Feb-2015 He Kuang <hekuang@huawei.com>

perf report: Fix branch stack mode cannot be set

When perf.data file is obtained using 'perf record -b', perf report
should use branch stack mode to generate output. But this function is
broken by improper comparison between boolean and constant -1.

before this patch:

$ perf report -b -i perf.data
Samples: 16 of event 'cycles', Event count (approx.): 3171896
Overhead Command Shared Object Symbol
13.59% ls [kernel.kallsyms] [k] prio_tree_remove
13.16% ls [kernel.kallsyms] [k] change_pte_range
12.09% ls [kernel.kallsyms] [k] page_fault
12.02% ls [kernel.kallsyms] [k] zap_pte_range
...

after this patch:

$ perf report -b -i perf.data
Samples: 256 of event 'cycles', Event count (approx.): 256
Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
9.38% ls [unknown] [k] 0000000000000000 [unknown] [k] 0000000000000000
6.25% ls libc-2.19.so [.] _dl_addr libc-2.19.so [.] _dl_addr
6.25% ls [kernel.kallsyms] [k] zap_pte_range [kernel.kallsyms] [k] zap_pte_range
6.25% ls [kernel.kallsyms] [k] change_pte_range [kernel.kallsyms] [k] change_pte_range
0.39% ls [kernel.kallsyms] [k] prio_tree_remove [kernel.kallsyms] [k] prio_tree_remove
...

Signed-off-by: He Kuang <hekuang@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1423967617-28879-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# aad2b21c 05-Jan-2015 Kan Liang <kan.liang@intel.com>

perf tools: Enable LBR call stack support

Currently, there are two call chain recording options, fp and dwarf.

Haswell has a new feature that utilizes the existing LBR facility to
record call chains. Kernel side LBR support code provides this as a
third option to record call chains. This patch enables the lbr call
stack support on the tooling side.

LBR call stack has some limitations:

- It reuses current LBR facility, so LBR call stack and branch record
can not be enabled at the same time.

- It is only available for user-space callchains.

However, it also offers some advantages:

- LBR call stack can work on user apps which don't have frame-pointers
or dwarf debug info compiled. It is a good alternative when nothing
else works.

Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masanari Iida <standby24x7@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 590cd344 21-Dec-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Get rid of report__inc_stat()

The report__inc_stat() function collects the number of hist entries in
the session in order to calculate the max size of the progess bar.

It'd be better if it does it during the addition of hist entries so that
it can be used by other places too.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1419223455-4362-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 740b97f9 21-Dec-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Show progress bar for output resorting

Sometimes it takes a long time to resort hist entries for output in case
of a large data file. Show a progress bar window and inform user.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1419223455-4362-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 09a6a1b0 17-Nov-2014 Andi Kleen <ak@linux.intel.com>

perf report: In branch stack mode use address history sorting

Enable CCKEY_ADDRESS address history sorting with --branch-history.
This makes get_srcline display the source lines correctly, otherwise all
history entries for a function a hunked into one.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1416275935-20971-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fa94c36c 12-Nov-2014 Andi Kleen <ak@linux.intel.com>

perf report: Add --branch-history option

Add a --branch-history option to perf report that changes all the
settings necessary for using the branches in callstacks.

This is just a short cut to make this nicer to use, it does not enable
any functionality by itself.

v2: Change sort order. Rename option to --branch-history to
be less confusing.
v3: Updates
v4: Fix conflict with newer perf base
v5: Port to latest tip
v6: Add more comments. Remove CCKEY_ADDRESS setting. Remove
unnecessary branch_mode setting. Use a boolean.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8b7bad58 12-Nov-2014 Andi Kleen <ak@linux.intel.com>

perf callchain: Support handling complete branch stacks as histograms

Currently branch stacks can be only shown as edge histograms for
individual branches. I never found this display particularly useful.

This implements an alternative mode that creates histograms over
complete branch traces, instead of individual branches, similar to how
normal callgraphs are handled. This is done by putting it in front of
the normal callgraph and then using the normal callgraph histogram
infrastructure to unify them.

This way in complex functions we can understand the control flow that
lead to a particular sample, and may even see some control flow in the
caller for short functions.

Example (simplified, of course for such simple code this is usually not
needed), please run this after the whole patchkit is in, as at this
point in the patch order there is no --branch-history, that will be
added in a patch after this one:

tcall.c:

volatile a = 10000, b = 100000, c;

__attribute__((noinline)) f2()
{
c = a / b;
}

__attribute__((noinline)) f1()
{
f2();
f2();
}
main()
{
int i;
for (i = 0; i < 1000000; i++)
f1();
}

% perf record -b -g ./tsrc/tcall
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
% perf report --no-children --branch-history
...
54.91% tcall.c:6 [.] f2 tcall
|
|--65.53%-- f2 tcall.c:5
| |
| |--70.83%-- f1 tcall.c:11
| | f1 tcall.c:10
| | main tcall.c:18
| | main tcall.c:18
| | main tcall.c:17
| | main tcall.c:17
| | f1 tcall.c:13
| | f1 tcall.c:13
| | f2 tcall.c:7
| | f2 tcall.c:5
| | f1 tcall.c:12
| | f1 tcall.c:12
| | f2 tcall.c:7
| | f2 tcall.c:5
| | f1 tcall.c:11
| |
| --29.17%-- f1 tcall.c:12
| f1 tcall.c:12
| f2 tcall.c:7
| f2 tcall.c:5
| f1 tcall.c:11
| f1 tcall.c:10
| main tcall.c:18
| main tcall.c:18
| main tcall.c:17
| main tcall.c:17
| f1 tcall.c:13
| f1 tcall.c:13
| f2 tcall.c:7
| f2 tcall.c:5
| f1 tcall.c:12

The default output is unchanged.

This is only implemented in perf report, no change to record or anywhere
else.

This adds the basic code to report:

- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set include the LBR into the callstack in machine.c.

The rest of the history code is unchanged and doesn't know the
difference between LBR entry and normal call entry.

- detect overlaps with the callchain
- remove small loop duplicates in the LBR

Current limitations:

- The LBR flags (mispredict etc.) are not shown in the history
and LBR entries have no special marker.
- It would be nice if annotate marked the LBR entries somehow
(e.g. with arrows)

v2: Various fixes.
v3: Merge further patches into this one. Fix white space.
v4: Improve manpage. Address review feedback.
v5: Rename functions. Better error message without -g. Fix crash without
-b.
v6: Rebase
v7: Rebase. Use NO_ENTRY in memset.
v8: Port to latest tip. Move add_callchain_ip to separate
patch. Skip initial entries in callchain. Minor cleanups.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0cdccac6 05-Oct-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Set callchain_param.record_mode for future use

Normally the callchain_param.record_mode is used only for record path.
But as it might need to prepare something for dwarf unwinding, setup
this info for perf report too.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1412556363-26229-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a635fc51 09-Oct-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove hists from evsel

Now tools that deals want to have an hists per evsel need to call
hists__init() before creating any evsels, which can be as early as when
parsing the command line, so do it before calling parse_options().

The current tools using hists/hist_entries are report, top and annotate,
change them to request per evsel hists.

This is in preparation for making evsels usable by 3rd party tools, that
not necessarily live in perf's source code repository.

Acked-by: Borislav Petkov <bp@suse.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-usjx2la743f10ippj7p1b20x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2a1731fb 10-Oct-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Remove last reference to hists struct

Now perf_session doesn't require that the evsels in its evlist are hists
containing ones.

Tools that are hists based and want to do per evsel events_stats
updates, if at some point this turns into a necessity, should do it in
the tool specific code, keeping the session class hists agnostic.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cli1bgwpo82mdikuhy3djsuy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4ea062ed 09-Oct-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Add hists helper

Not all tools need a hists instance per perf_evsel, so lets pave the way
to remove evsel->hists while leaving a way to access the hists from a
specially allocated evsel, one that comes with space at the end where
lives the evsel.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qlktkhe31w4mgtbd84035sr2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 52e02834 23-Sep-2014 Taeung Song <treeze.taeung@gmail.com>

perf tools: Modify error code for when perf_session__new() fails

Because perf_session__new() can fail for more reasons than just ENOMEM,
modify error code(ENOMEM or EINVAL) to -1.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1411522417-9917-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 763122ad 12-Sep-2014 Avi Kivity <avi@cloudius-systems.com>

perf tools: Disable kernel symbol demangling by default

Some Linux symbols (for example __vt_event_wait) are interpreted by the
demangler as C++ mangled names, which of course they aren't.

Disable kernel symbol demangling by default to avoid this, and allow
enabling it with a new option --demangle-kernel for those who wish it.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1410581705-26968-1-git-send-email-avi@cloudius-systems.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# edd114e2 06-Aug-2014 naota@elisp.net <naota@elisp.net>

perf report: Set proper sort__mode for the branch option

When you specify "--branch-stack"("-b" for short) or
"--no-branch-stack", "branch_mode" variable is set to 1 or 0
respectively. However, the code is just checking if the variable is -1
or not, ignoring "branch_mode == 1" case. Thus "perf report -b" dose not
show its result with the branch sorted mode. This patch fix the problem.

Signed-off-by: Naohiro Aota <naota@elisp.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/87y4v1fylq.fsf@elisp.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0a7e6d1b 12-Aug-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Check recorded kernel version when finding vmlinux

Currently vmlinux_path__init() only tries to find vmlinux file from
current directory, /boot and some canonical directories with version
number of the running kernel. This can be a problem when reporting old
data recorded on a kernel version not running currently.

We can use --symfs option for this but it's annoying for user to do it
always. As we already have the info in the perf.data file, it can be
changed to use it for the search automatically.

Before:

$ perf report
...
# Samples: 4K of event 'cpu-clock'
# Event count (approx.): 1067250000
#
# Overhead Command Shared Object Symbol
# ........ .......... ................. ..............................
71.87% swapper [kernel.kallsyms] [k] recover_probed_instruction

After:

# Overhead Command Shared Object Symbol
# ........ .......... ................. ....................
71.87% swapper [kernel.kallsyms] [k] native_safe_halt

This requires to change signature of symbol__init() to receive struct
perf_session_env *.

Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 94786b67 05-Jun-2014 Jiri Olsa <jolsa@kernel.org>

perf tools: Add report.queue-size config file option

Adding report.queue-size config file option to setup the maximum
allocation size for session's struct ordered_events object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-lm42mbpu0cwljpyy8vw5y26n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0a8cb85c 06-Jul-2014 Jiri Olsa <jolsa@kernel.org>

perf tools: Rename ordered_samples bool to ordered_events

The time ordering is generic for all kinds of events, so using generic
name 'ordered_events' for ordered_samples bool in perf_tool struct.

No functional change was intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-07mrqzcuhsks9wfmxrzsvemz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9d3c02d7 07-Jan-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Add callback function to hist_entry_iter

The new ->add_entry_cb() will be called after an entry was added to
the histogram. It's used for code sharing between perf report and
perf top. Note that ops->add_*_entry() should set iter->he properly
in order to call the ->add_entry_cb.

Also pass @arg to the callback function. It'll be used by perf top
later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/87k393g999.fsf@sejong.aot.lge.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 8d8e645c 22-Jan-2013 Namhyung Kim <namhyung@kernel.org>

perf report: Add report.children config option

Add report.children config option for setting default value of
callchain accumulation. It affects the report output only if
perf.data contains callchain info.

A user can write .perfconfig file like below to enable accumulation
by default:

$ cat ~/.perfconfig
[report]
children = true

And it can be disabled through command line:

$ perf report --no-children

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-17-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 793aaaab 30-Oct-2013 Namhyung Kim <namhyung@kernel.org>

perf report: Add --children option

The --children option is for showing accumulated overhead (period)
value as well as self overhead.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-16-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 7a13aa28 10-Sep-2012 Namhyung Kim <namhyung@kernel.org>

perf hists: Accumulate hist entry stat based on the callchain

Call __hists__add_entry() for each callchain node to get an
accumulated stat for an entry. Introduce new cumulative_iter ops to
process them properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 69bcb019 29-Oct-2013 Namhyung Kim <namhyung@kernel.org>

perf tools: Introduce struct hist_entry_iter

There're some duplicate code when adding hist entries. They are
different in that some have branch info or mem info but generally do
same thing. So introduce new struct hist_entry_iter and add callbacks
to customize each case in general way.

The new perf_evsel__add_entry() function will look like:

iter->prepare_entry();
iter->add_single_entry();

while (iter->next_entry())
iter->add_next_entry();

iter->finish_entry();

This will help further work like the cumulative callchain patchset.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1401335910-16832-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 1844dbcb 27-May-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Introduce hists__inc_nr_samples()

There're some duplicate code for counting number of samples. Add
hists__inc_nr_samples() and reuse it.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1401335910-16832-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# a7d945bc 03-Mar-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Add -F option to specify output fields

The -F/--fields option is to allow user setup output field in any
order. It can receive any sort keys and following (hpp) fields:

overhead, overhead_sys, overhead_us, sample and period

If guest profiling is enabled, overhead_guest_{sys,us} will be
available too.

The output fields also affect sort order unless you give -s/--sort
option. And any keys specified on -s option, will also be added to
the output field list automatically.

$ perf report -F sym,sample,overhead
...
# Symbol Samples Overhead
# .......................... ............ ........
#
[.] __cxa_atexit 2 2.50%
[.] __libc_csu_init 4 5.00%
[.] __new_exitfn 3 3.75%
[.] _dl_check_map_versions 1 1.25%
[.] _dl_name_match_p 4 5.00%
[.] _dl_setup_hash 1 1.25%
[.] _dl_sysdep_start 1 1.25%
[.] _init 5 6.25%
[.] _setjmp 6 7.50%
[.] a 8 10.00%
[.] b 8 10.00%
[.] brk 1 1.25%
[.] c 8 10.00%

Note that, the example output above is captured after applying next
patch which fixes sort/comparing behavior.

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-12-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 22af969e 15-Apr-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Call perf_hpp__init() before setting up GUI browsers

So that it can be set properly prior to set up output fields. That
makes easy to handle/warn errors during the setup since it doesn't
need to be bothered with the GUI.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-11-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 512ae1bd 17-Mar-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Consolidate management of default sort orders

The perf uses different default sort orders for different use-cases,
and this was scattered throughout the code. Add get_default_sort_
order() function to handle this and change initial value of sort_order
to NULL to distinguish it from user-given one.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1400480762-22852-10-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# a2ce067e 03-Mar-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Allow hpp fields to be sort keys

Add overhead{,_sys,_us,_guest_sys,_guest_us}, sample and period sort
keys so that they can be selected with --sort/-s option.

$ perf report -s period,comm --stdio
...
# Overhead Period Command
# ........ ............ ...............
#
47.06% 152 swapper
13.93% 45 qemu-system-arm
12.38% 40 synergys
3.72% 12 firefox
2.48% 8 xchat

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-9-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 820bc81f 21-Apr-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Account entry stats when it's added to the output tree

Currently, accounting each sample is done in multiple places - once
when adding them to the input tree, other when adding them to the
output tree. It's not only confusing but also can cause a subtle
problem since concurrent processing like in perf top might see the
updated stats before adding entries into the output tree - like seeing
more (blank) lines at the end and/or slight inaccurate percentage.

To fix this, only account the entries when it's moved into the output
tree so that they cannot be seen prematurely. There're some
exceptional cases here and there - they should be addressed separately
with comments.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 58c311da 21-Apr-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Count number of entries separately

The hists->nr_entries is counted in multiple places so that they can
confuse readers of the code. This is a preparation of later change
and do not intend any functional difference.

Note that report__collapse_hists() now changed to return nothing since
its return value (nr_samples) is only for checking if there's any data
in the input file and this can be acheived by checking ->nr_entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# cff6bb46 07-Apr-2014 Don Zickus <dzickus@redhat.com>

perf callchain: Add generic report parse callchain callback function

This takes the parse_callchain_opt function and copies it into the
callchain.c file. Now the c2c tool can use it too without duplicating.

Update perf-report to use the new routine too.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1396896924-129847-5-git-send-email-dzickus@redhat.com
[ Adding missing braces to multiline if condition ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>


# 33db4568 06-Feb-2014 Namhyung Kim <namhyung@kernel.org>

perf top: Add --percentage option

The --percentage option is for controlling overhead percentage
displayed. It can only receive either of "relative" or "absolute".
Move the parser callback function into a common location since it's
used by multiple commands now.

For more information, please see previous commit same thing done to
"perf report".

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>


# f2148330 13-Jan-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Add --percentage option

The --percentage option is for controlling overhead percentage
displayed. It can only receive either of "relative" or "absolute".

"relative" means it's relative to filtered entries only so that the
sum of shown entries will be always 100%. "absolute" means it retains
the original value before and after the filter is applied.

$ perf report -s comm
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc
6.11% as
4.35% sh
4.14% make
1.13% fixdep
...

$ perf report -s comm -c cc1,gcc --percentage absolute
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc

$ perf report -s comm -c cc1,gcc --percentage relative
# Overhead Command
# ........ ............
#
90.69% cc1
9.31% gcc

Note that it has zero effect if no filter was applied.

Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>


# 1ab1fa5d 25-Dec-2013 Namhyung Kim <namhyung@kernel.org>

perf hists: Add support for showing relative percentage

When filtering by thread, dso or symbol on TUI it also update total
period so that the output shows different result than no filter - the
percentage changed to relative to filtered entries only. Sometimes
this is not desired since users might expect same results with filter.

So new filtered_* fields to hists->stats to count them separately.
They'll be controlled/used by user later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>


# b9ce0c99 18-Mar-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Use ui__has_annotation()

Since we introduced the ui__has_annotation() for that, don't open code
it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1395124359-11744-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2c86c7ca 17-Mar-2014 Namhyung Kim <namhyung@kernel.org>

perf report: Merge al->filtered with hist_entry->filtered

I.e. don't drop al->filtered entries, create the hist_entries and use
its ->filtered bitmap, that is kept with the same semantics for its
bitmap, leaving the filtering to be done at the hist_entry level, i.e.
in the UIs.

This will allow zooming in/out the filters.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-xeyhkepu7plw716lrtb0zlnu@git.kernel.org
[ yanked this out of a previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 48c65bda 19-Feb-2014 Namhyung Kim <namhyung@kernel.org>

perf annotate: Check availability of annotate when processing samples

The TUI of perf report and top support annotation, but stdio and GTK
don't. So it should be checked before calling hist_entry__inc_addr_
samples() to avoid wasting resources that will never be used.

perf annotate need it regardless of UI and sort keys, so the check
of whether to allocate resources should be on the tools that have
annotate as an option in the TUI, 'report' and 'top', not on the
function called by all of them.

It caused perf annotate on ppc64 to produce zero output, since the
buckets were not being allocated.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1392859976-32760-1-git-send-email-namhyung@kernel.org
[ Renamed (report,top)__needs_annotate() to ui__has_annotation() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 106395df 22-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Remove some needless container_of usage

Since all it wants is to get the 'struct record' from the received
'struct perf_tool', and this is already done at the callers of these
functions, short circuit it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-xz8p659sjpad396vye5t24gx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 644f2df2 22-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Shorten sample symbol resolving function signature

Since two of the parameters come from the same 'struct
addr_location', rename machine__resolve_bstack() to sample__resolve_bstack()
and pass the that addr_location instead.

This is also for consistency with the same change that resulted in the
sample__resolve_mem() function.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-99ecqt8jiyyksiyx3se7l5ia@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e80faac0 22-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Shorten sample symbol resolving function signature

Since three of the parameters come from the same 'struct addr_location',
rename machine__resolve_mem() to sample__resolve_mem() and pass the
that addr_location instead.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-3f5otpssefh9l5hi1t259h8n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 17f22a3f 21-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Use al->cpumode where applicable

We don't need to recalculate cpumode from the perf_event->header field,
as this is already available in the struct addr_location->cpumode field.

Remove the function signature of functions that receive both perf_event
and addr_location parameters but use perf_event just to extract the
cpumode.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-tmct07y7mka54allj82trlnx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2dc9fb1a 13-Jan-2014 Namhyung Kim <namhyung@kernel.org>

perf tools: Factor out sample__resolve_callchain()

The report__resolve_callchain() can be shared with perf top code as it
doesn't really depend on the perf report code. Factor it out as
sample__resolve_callchain(). The same goes to the hist_entry__append_
callchain() too.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Link: http://lkml.kernel.org/r/1389677157-30513-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0050f7aa 10-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Introduce evlist__for_each() & friends

For the common evsel list traversal, so that it becomes more compact.

Use the opportunity to start ditching the 'perf_' from 'perf_evlist__',
as discussed, as the whole conversion touches a lot of places, lets do
it piecemeal when we have the chance due to other work, like in this
case.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qnkx7dzm2h6m6uptkfk03ni6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f6d8b057 08-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Move histogram entries collapsing to separate function

Further uncluttering the main 'report' function by group related code in
separate function.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-b594zsbwke8khir13kudwqmj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8362951b 07-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Move hist browser selection code to separate function

To unclutter the main function.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-agvxwpazlucy6h5sejuttw9t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fad2918e 08-Jan-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Move logic to warn about kptr_restrict'ed kernels to separate function

Its too big, better have a separate function for it so that the main
logic gets shorter/clearer.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ahh6vfzyh8fsygjwrsbroeu0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 150e465a 19-Dec-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Print session information only if --stdio is given

Move those print functions under "if (use_browser == 0)" so that they
don't interfere with TUI output.

Maybe they can handle other UIs later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1387516278-17024-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a4210141 19-Dec-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Use pr_*() functions where applicable

There're some places printing messages to stdout/err directly.

It should be converted to use proper error printing functions instead.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1387516278-17024-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cc22e575 19-Dec-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Add 'machine' member to struct addr_location

The addr_location struct should fully qualify an address, and to do that
it should have in it the machine where the thread was found.

Thus all functions that receive an addr_location now don't need to also
receive a 'machine', those functions just need to access al->machine
instead, just like it does with the other parts of an address location:
al->thread, al->map, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-o51iiee7vyq4r3k362uvuylg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 28b21393 19-Dec-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Rename 'perf_report' to 'report'

Reduce typing, functions use class__method convention, so unlikely to
clash with other libraries.

This actually was discussed in the "Link:" referenced message below.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6dbc8ca9 18-Dec-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Introduce helpers for processing callchains

Continuing to try to remove the code duplication introduced with mem and
branch hist entry code, this time providing prologue and epilogues to
deal with callchains when processing samples.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-js3pour59yk2aibqzb1tpumh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0f4e7a24 18-Dec-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Add inc_samples method to addr_map_symbol

Since there are three calls that could receive just the struct
addr_map_symbol pointer and call the symbol method.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-d728gz1orgkaknac9ppnzd9e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 00e55218 18-Dec-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hists: Leave symbol addr hist bucket auto alloc to symbol layer

Since now symbol__addr_inc_samples() does the auto alloc, no need to do
it prior to calling hist_entry__inc_addr_samples.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6ife7xq2kef1nn017m04b3id@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b66d8c0c 18-Dec-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Auto allocate symbol per addr hist buckets

Instead of open coding it in multiple places in 'report' and 'top'.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ay1ushp57qsva9aw59rha5ve@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5cfe2c82 09-Dec-2013 Jiri Olsa <jolsa@redhat.com>

perf report: Add --header/--header-only options

Currently the perf.data header is always displayed for stdio output,
which is no always useful.

Disabling header information by default and adding following options to
control header output:

--header - display header information (old default)
--header-only - display header information only w/o further
processing, forces stdio output

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386583370-1699-2-git-send-email-jolsa@redhat.com
[ Added single line explaining talking about the new --header* options,
to address David Ahern comment; better man page entry for the new options,
from Namhyung Kim ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 41a4e6e2 31-Oct-2013 Namhyung Kim <namhyung.kim@lge.com>

perf hists: Consolidate __hists__add_*entry()

The __hists__add_{branch,mem}_entry() does almost the same thing that
__hists__add_entry() does. Consolidate them into one.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383202576-28141-2-git-send-email-namhyung@kernel.org
[ Fixup clash with new COMM infrastructure ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 91aba0a6 01-Nov-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Use parse_options_usage() for -s option failure

The -s (--sort) option was processed after normal option parsing so that
it cannot call the parse_options_usage() automatically. Currently it
calls usage_with_options() which shows entire help messages for event
option. Fix it by showing just -s options.

$ perf report -s help
Error: Unknown --sort key: `help'

usage: perf report [<options>]

-s, --sort <key[,key2...]>
sort by key(s): pid, comm, dso, symbol, ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383291195-24386-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4bceffbc 01-Nov-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Postpone setting up browser after parsing options

If setup_browser() called earlier than option parsing, the actual error
message can be discarded during the terminal reset. So move it after
setup_sorting() checks whether the sort keys are valid.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Enthusiastically-Supported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383291195-24386-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c1fb5651 10-Oct-2013 Namhyung Kim <namhyung.kim@lge.com>

perf tools: Show progress on histogram collapsing

It can take quite amount of time so add progress bar UI to inform user.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381468543-25334-4-git-send-email-namhyung@kernel.org
[ perf_progress -> ui_progress ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c824c433 22-Oct-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Stop using 'self' in some more places

As suggested by tglx, 'self' should be replaced by something that is
more useful.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-fmblhc6tbb99tk1q8vowtsbj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 91e95617 18-Oct-2013 Waiman Long <Waiman.Long@hp.com>

perf report: Add --max-stack option to limit callchain stack scan

When callgraph data was included in the perf data file, it may take a
long time to scan all those data and merge them together especially if
the stored callchains are long and the perf data file itself is large,
like a Gbyte or so.

The callchain stack is currently limited to PERF_MAX_STACK_DEPTH (127).
This is a large value. Usually the callgraph data that developers are
most interested in are the first few levels, the rests are usually not
looked at.

This patch adds a new --max-stack option to perf-report to limit the
depth of callchain stack data to look at to reduce the time it takes for
perf-report to finish its processing. It trades the presence of trailing
stack information with faster speed.

The following table shows the elapsed time of doing perf-report on a
perf.data file of size 985,531,828 bytes.

--max_stack Elapsed Time Output data size
----------- ------------ ----------------
not set 88.0s 124,422,651
64 87.5s 116,303,213
32 87.2s 112,023,804
16 86.6s 94,326,380
8 59.9s 33,697,248
4 40.7s 10,116,637
-g none 27.1s 2,555,810

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382107129-2010-4-git-send-email-Waiman.Long@hp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cc9784bd 15-Oct-2013 Jiri Olsa <jolsa@redhat.com>

perf session: Separating data file properties from session

Removing 'fd, fd_pipe, filename, size' from struct perf_session and
replacing them with struct perf_data_file object.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381847254-28809-4-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f5fc1412 15-Oct-2013 Jiri Olsa <jolsa@redhat.com>

perf tools: Add data object to handle perf data file

This patch is adding 'struct perf_data_file' object as a placeholder for
all attributes regarding perf.data file handling. Changing
perf_session__new to take it as an argument.

The rest of the functionality will be added later to keep this change
simple enough, because all the places using perf_session are changed
now.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381847254-28809-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fc67297b 13-Sep-2013 Namhyung Kim <namhyung@kernel.org>

perf tools: Separate out GTK codes to libperf-gtk.so

Separate out GTK codes to a shared object called libperf-gtk.so. This
time only GTK codes are built with -fPIC and libperf remains as is. Now
run GTK hist and annotation browser using libdl.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1379053663-13706-1-git-send-email-namhyung@kernel.org
[ Fix it up wrt Ingo's tools/perf build speedups ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 475eeab9 20-Sep-2013 Andi Kleen <ak@linux.intel.com>

tools/perf: Add support for record transaction flags

Add support for recording and displaying the transaction flags.
They are essentially a new sort key. Also display them
in a nice way to the user.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# f5d05bce 20-Sep-2013 Andi Kleen <ak@linux.intel.com>

tools/perf: Support sorting by in_tx or abort branch flags

Extend the perf branch sorting code to support sorting by in_tx
or abort_tx qualifiers. Also print out those qualifiers.

This also fixes up some of the existing sort key documentation.

We do not support no_tx here, because it's simply not showing
the in_tx flag.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 33e940a2 17-Sep-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Check for SIGINT in more loops

When processing big files we were not checking if session_done was set
by the SIGINT signal handler, for instance in 'perf report'. Fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pyad42lgrtq7xhg2dpsoauq7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5c5e854b 20-Aug-2013 Stephane Eranian <eranian@google.com>

perf tools: Add attr->mmap2 support

This patch adds support for the new PERF_RECORD_MMAP2 record type
exposed by the kernel. This is an extended PERF_RECORD_MMAP record.

It adds for each file-backed mapping the device major, minor number and
the inode number and generation.

This triplet uniquely identifies the source of a file-backed mapping. It
can be used to detect identical virtual mappings between processes, for
instance.

The patch will prefer MMAP2 over MMAP.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1377079825-19057-3-git-send-email-eranian@google.com
[ Cope with 314add6 "Change machine__findnew_thread() to set thread pid",
fix 'perf test' regression test entry affected,
use perf_missing_features.mmap2 to fallback to not using .mmap2 in older kernels,
so that new tools can work with kernels where this feature is not present ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 75562573 27-Aug-2013 Adrian Hunter <adrian.hunter@intel.com>

perf tools: Add support for PERF_SAMPLE_IDENTIFIER

Enable parsing of samples with sample format bit PERF_SAMPLE_IDENTIFIER.
In addition, if the kernel supports it, prefer it to selecting
PERF_SAMPLE_ID thereby allowing non-matching sample types.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e44baa3e 08-Aug-2013 Adrian Hunter <adrian.hunter@intel.com>

perf tools: Remove filter parameter of perf_event__preprocess_sample()

Now that the symbol filter is recorded on the machine there is no need
to pass it to perf_event__preprocess_sample(). So remove it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375961547-30267-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b8681711 08-Aug-2013 Adrian Hunter <adrian.hunter@intel.com>

perf report: Set the machines symbol filter

Take into use the machines' symbol filter member.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375961547-30267-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 99571ab3 18-Jul-2013 Andi Kleen <ak@linux.intel.com>

perf tools: Support callchain sorting based on addresses

With programs with very large functions it can be useful to distinguish
the callgraph nodes on more than just function names. So for example if
you have multiple calls to the same function, it ends up being separate
nodes in the chain.

This patch adds a new key field to the callgraph options, that allows
comparing nodes on functions (as today, default) and addresses.

Longer term it would be nice to also handle src lines, but that would
need more changes and address is a reasonable proxy for it today.

I right now reference the global params, as there was no simple way to
register a params pointer.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-0uskktybf0e7wrnoi5e9b9it@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6065210d 11-Jul-2013 Jiri Olsa <jolsa@redhat.com>

perf tools: Remove event types framework completely

Removing event types framework completely. The only remainder (apart
from few comments) is following enum:

enum perf_user_event_type {
...
PERF_RECORD_HEADER_EVENT_TYPE = 65, /* deprecated */
...
}

It's kept as deprecated, resulting in error when processed in
perf_session__process_user_event function.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1373556513-3000-6-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2b8bfa6b 31-Jan-2013 Jiri Olsa <jolsa@redhat.com>

perf tools: Centralize default columns init in perf_hpp__init

Now when diff command is separated from other standard outputs,
we can use perf_hpp__init to initialize all standard columns.

Moving PERF_HPP__OVERHEAD column init back to perf_hpp__init,
and removing extra enable calls.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-nj2xk89tj972tbqswfs498ex@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b21484f1 06-Dec-2012 Greg Price <price@MIT.EDU>

perf report/top: Add option to collapse undesired parts of call graph

For example, in an application with an expensive function implemented
with deeply nested recursive calls, the default call-graph presentation
is dominated by the different callchains within that function. By
ignoring these callees, we can collect the callchains leading into the
function and compactly identify what to blame for expensive calls.

For example, in this report the callers of garbage_collect() are
scattered across the tree:

$ perf report -d ruby 2>- | grep -m10 ^[^#]*[a-z]
22.03% ruby [.] gc_mark
--- gc_mark
|--59.40%-- mark_keyvalue
| st_foreach
| gc_mark_children
| |--99.75%-- rb_gc_mark
| | rb_vm_mark
| | gc_mark_children
| | gc_marks
| | |--99.00%-- garbage_collect

If we ignore the callees of garbage_collect(), its callers are coalesced:

$ perf report --ignore-callees garbage_collect -d ruby 2>- | grep -m10 ^[^#]*[a-z]
72.92% ruby [.] garbage_collect
--- garbage_collect
vm_xmalloc
|--47.08%-- ruby_xmalloc
| st_insert2
| rb_hash_aset
| |--98.45%-- features_index_add
| | rb_provide_feature
| | rb_require_safe
| | vm_call_method

Signed-off-by: Greg Price <price@mit.edu>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130623031720.GW22203@biohazard-cafe.mit.edu
Link: http://lkml.kernel.org/r/20130708115746.GO22203@biohazard-cafe.mit.edu
Cc: Fengguang Wu <fengguang.wu@intel.com>
[ remove spaces at beginning of line, reported by Fengguang Wu ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d4ae0a6f 25-Jun-2013 Jiri Olsa <jolsa@redhat.com>

perf report: Fix perf_session__delete removal

There's no point of having out_delete label with perf_session__delete
call within __cmd_report function, because it's called at the end of the
cmd_report function.

The speed up due to commenting out the perf_session__delete at the end
does not seem relevant anymore. Measured speedup for ~1GB data file with
222466 FORKS events is around 0.5%.

$ perf report -i perf.data.delete -P perf_session__delete -s parent

+ 99.51% [other]
+ 0.49% perf_session__delete

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1372161253-22081-6-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ad3d6f50 17-Jun-2013 Jiri Olsa <jolsa@redhat.com>

perf tools: Do not elide parent symbol column

I found the parent symbol column data interesting even
if there's another sorting enabled. Switching it on.

Previous behaviour:
$ perf report -i perf.data.delete -p perf_session__delete -x

+ 3.60% perf perf [.] __rb_change_child
+ 1.89% perf perf [.] rb_erase
+ 1.89% perf perf [.] rb_erase
+ 1.83% perf perf [.] free@plt

Current behaviour:
$ perf report -i perf.data.delete -p perf_session__delete -x

+ 3.60% perf perf [.] __rb_change_child perf_session__delete
+ 1.89% perf perf [.] rb_erase perf_session__delete_dead_threads
+ 1.89% perf perf [.] rb_erase perf_session__delete_threads
+ 1.83% perf perf [.] free@plt perf_session__delete

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-r79fn89bhqz16ixa5zmyflrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0276c22a 10-Jun-2013 Jiri Olsa <jolsa@redhat.com>

perf tools: Fix -x/--exclude-other option for report command

Currently we have symbol_conf.exclude_other being set as true every time
so the -x/--exclude-other has nothing to do.

Also we have no way to see the data with symbol_conf.exclude_other being
false which is useful sometimes.

Fixing it by making symbol_conf.exclude_other false by default.

1) Example without -x option:

$ perf report -i perf.data.delete -p perf_session__delete -s parent

+ 99.91% [other]
+ 0.08% perf_session__delete
+ 0.00% perf_session__delete_dead_threads
+ 0.00% perf_session__delete_threads

2) Example with -x option:

$ ./perf report -i perf.data.delete -p perf_session__delete -s parent -x

+ 96.22% perf_session__delete
+ 1.89% perf_session__delete_dead_threads
+ 1.89% perf_session__delete_threads

In Example 1) we get the sorted out data together with the rest
"[other]". This could help us estimate how much time we spent in the
sorted data.

In Example 2) the total is just the sorted data.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-sg8fvu0fyqohf9ur9l38lhkw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# eec574e6 13-May-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Add report.percent-limit config variable

Now an user can set a default value of --percent-limit option into the
perfconfig file.

$ cat ~/.perfconfig
[report]
percent-limit = 0.1

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1368497347-9628-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 064f1981 13-May-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Add --percent-limit option

The --percent-limit option is for not showing small overhead entries in
the output. Maybe we want to set a certain default value like 0.1.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1368497347-9628-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f3dd1981 13-May-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Don't bother locking when adding hist entries

The 'perf report'command is single-threaded, so no need to grab a lock.

Although the fast path of pthread_mutex_[un]lock() is very fast, there's
a ~3% gain by eliminating it when we have huge sample data.

$ perf record -a -F 100000 -o perf.data.bench -- perf bench sched all
$ perf record -e cycles:upp -o perf.data.before -- \
> perf report -i perf.data.bench --stdio > /dev/null
... apply this patch ...
$ perf record -e cycles:upp -o perf.data.after -- \
> perf report -i perf.data.bench --stdio > /dev/null
$ perf diff perf.data.{before,after} | grep pthread
+0.02% libpthread-2.15.so [.] _pthread_cleanup_push_defer
+0.02% libpthread-2.15.so [.] _pthread_cleanup_pop_restore
0.05% -0.05% perf [.] pthread_mutex_unlock@plt
0.05% -0.05% perf [.] pthread_mutex_lock@plt
1.01% -1.01% libpthread-2.15.so [.] pthread_mutex_lock
1.68% -1.68% libpthread-2.15.so [.] __pthread_mutex_unlock_usercnt
0.05% -0.05% libpthread-2.15.so [.] pthread_mutex_unlock

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1368497347-9628-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 27a0dcb7 13-May-2013 Namhyung Kim <namhyung.kim@lge.com>

perf hists: Move locking to its call-sites

It's a preparation patch to eliminate unneeded locking in the perf
report path.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1368497347-9628-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 08e71542 03-Apr-2013 Namhyung Kim <namhyung.kim@lge.com>

perf sort: Consolidate sort_entry__setup_elide()

The same code was duplicate to places, factor them out to common
sort__setup_elide().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364991979-3008-11-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# afab87b9 03-Apr-2013 Namhyung Kim <namhyung.kim@lge.com>

perf sort: Separate out memory-specific sort keys

Since they're used only for perf mem, separate out them to a different
dimension so that normal user cannot access them by any chance.

For global/local weights, I'm not entirely sure to place them into the
memory dimension. But it's the only user at this time.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364991979-3008-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 55369fc1 01-Apr-2013 Namhyung Kim <namhyung.kim@lge.com>

perf sort: Introduce sort__mode variable

It's used for determining current sort mode which can be one of
NORMAL, BRANCH and new MEMORY.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364816125-12212-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 26353a61 01-Apr-2013 Namhyung Kim <namhyung.kim@lge.com>

perf hists: Fix an invalid memory free on he->branch_info

The branch info was allocated for the whole stack and passed matching
hist entry for each level during processing samples. Thus when a hist
entry tries to free its branch info like in hists__collapse_insert_entry
it'll face following error.

*** glibc detected *** perf: munmap_chunk(): invalid pointer: 0x00000000014e9d20 ***
======= Backtrace: =========
/lib64/libc.so.6[0x387d47ae16]
perf[0x4923bd]
perf(cmd_report+0xd68)[0x432a08]
perf[0x41a663]
perf(main+0x58f)[0x419eaf]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x387d421735]
perf[0x419f95]

Fix it by allocating and copying branch info for each new hist entry.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364816125-12212-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6692c262 28-Mar-2013 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove dependency on libnewt

Now that the map browser shares the input routine with the hists
browser, there is no need for using any libnewt routine, so remove all
traces except for honouring NO_NEWT=1 on the makefile command line as an
indication that TUI support is not needed, in fact it just sets
NO_SLANG=1.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wae5o7xca9m52bj1re28jc5j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f4f7e28d 24-Jan-2013 Stephane Eranian <eranian@google.com>

perf report: Add support for mem access profiling

This patch adds the --mem-mode option to perf report.

This mode requires a perf.data file created with memory access samples.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-13-git-send-email-eranian@google.com
[ Removed duplicates in the --sort help, man page needs updating,
Fixed minor conflict with 328ccda "perf report: Add --no-demangle option" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 05484298 24-Jan-2013 Andi Kleen <ak@linux.intel.com>

perf tools: Add support for weight v7 (modified)

perf record has a new option -W that enables weightened sampling.

Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows to both compare relative cost per event
and the total cost over the measurement period.

Add the necessary glue to perf report, record and the library.

v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.

Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 328ccdac 25-Mar-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Add --no-demangle option

It's sometimes useful to see undemangled raw symbol name for example
other tools using the perf output to do manipulation of binaries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: William Cohen <wcohen@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: William Cohen <wcohen@redhat.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=55571
Link: http://lkml.kernel.org/r/1364203098-17741-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# db3c6bf8 12-Mar-2013 Wei Yongjun <yongjun_wei@trendmicro.com.cn>

perf report: Remove duplicated include

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/CAPgLHd9=EXaH1hv4jeVvTa4tZFsjnx+8+g3zqmmUKqQ5qRqTEA@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 759ff497 04-Mar-2013 Namhyung Kim <namhyung.kim@lge.com>

perf evsel: Introduce perf_evsel__is_group_event() helper

The perf_evsel__is_group_event function is for checking whether given
evsel needs event group view support or not. Please note that it's
different to the existing perf_evsel__is_group_leader() which checks
only the given evsel is a leader or a standalone (i.e. non-group) event
regardless of event group feature.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1362462812-30885-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 55309985 05-Feb-2013 Namhyung Kim <namhyung.kim@lge.com>

perf sort: Make setup_sorting returns an error code

Currently the setup_sorting() is called for parsing sort keys and exits
if it failed to add the sort key. As it's included in libperf it'd be
better returning an error code rather than exiting application inside of
the library.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1360130237-9963-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ad0de097 02-Feb-2013 Feng Tang <feng.tang@intel.com>

perf report: Enable the runtime switching of perf data file

This is for tui browser only. This patch will check the returned key of
tui hists browser, if it's K_SWITH_INPUT_DATA, then recreate a session
for the new selected data file.

V2: Move the setup_brower() before the "repeat" jump point.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359873501-24541-2-git-send-email-feng.tang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 00c7e1f1 22-Jan-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Add report.group config option

Add report.group config option for setting default value of event
group view. It affects the report output only if perf.data contains
event group info.

A user can write .perfconfig file like below to enable group view by
default:

$ cat ~/.perfconfig
[report]
group = true

And it can be disabled through command line:

$ perf report --no-group

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1358845787-1350-19-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 01d14f16 22-Jan-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Add --group option

Add --group option to enable event grouping. When enabled, all the
group members information will be shown together with the leader.

$ perf report --group
...
# group: {ref-cycles,cycles}
# ========
#
# Samples: 7K of event 'anon group { ref-cycles, cycles }'
# Event count (approx.): 6876107743
#
# Overhead Command Shared Object Symbol
# ................ ....... ................. ..........................
#
99.84% 99.76% noploop noploop [.] main
0.07% 0.00% noploop ld-2.15.so [.] strcmp
0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del
0.03% 0.03% noploop [kernel.kallsyms] [k] sched_clock_cpu
0.02% 0.00% noploop [kernel.kallsyms] [k] account_user_time
0.01% 0.00% noploop [kernel.kallsyms] [k] __alloc_pages_nodemask
0.00% 0.00% noploop [kernel.kallsyms] [k] native_write_msr_safe
0.00% 0.11% noploop [kernel.kallsyms] [k] _raw_spin_lock
0.00% 0.06% noploop [kernel.kallsyms] [k] find_get_page
0.00% 0.02% noploop [kernel.kallsyms] [k] rcu_check_callbacks
0.00% 0.02% noploop [kernel.kallsyms] [k] __current_kernel_time

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1358845787-1350-18-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 717e263f 22-Jan-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Show group description when event group is enabled

When using event group viewer, it's better to show the group description
rather than the leader information alone.

If a leader did not contain any member, it's a non-group event.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1358845787-1350-17-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fc24d7c2 22-Jan-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Bypass non-leader events when event group is enabled

Since we have all necessary information in the leader events and other
members don't, bypass members. Member events will be shown along with
the leaders if event group is enabled.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1358845787-1350-16-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6e1f601a 22-Jan-2013 Namhyung Kim <namhyung.kim@lge.com>

perf report: Make another loop for linking group hists

Now the event grouping viewing requires linking all member hists in a
group to the leader's. Thus hists__output_resort should be called after
linking all events in evlist.

Introduce symbol_conf.event_group flag to determine whether the feature
is enabled or not.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1358845787-1350-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9811360e 27-Dec-2012 Namhyung Kim <namhyung.kim@lge.com>

perf report: Update documentation for sort keys

Add description of sort keys to the perf-report document and also add
missing cpu and srcline keys to the command line help string.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1356599507-14226-11-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 876650e6 18-Dec-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machine: Introduce struct machines

That consolidates the grouping of host + guests, isolating a bit more of
functionality now centered on 'perf_session' that can be used
independently in tools that don't need a 'perf_session' instance, but
needs to have all the thread/map/symbol machinery.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-c700rsiphpmzv8klogojpfut@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1240005e 12-Oct-2012 Jiri Olsa <jolsa@redhat.com>

perf hists: Introduce perf_hpp__list for period related columns

Adding perf_hpp__list list to register and contain all period related
columns the command is interested in.

This way we get rid of static array holding all possible columns and
enable commands to register their own columns.

It'll be handy for diff command in future to process and display data
for multiple files.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-kiykge4igrcl7etmpmveto1h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 68d80758 01-Nov-2012 Namhyung Kim <namhyung.kim@lge.com>

perf report: Postpone objdump check until annotation requested

David reported that current perf report refused to run on a data file
captured from a different machine because of objdump.

Since the objdump tools won't be used unless annotation was requested,
checking its presence at init time doesn't make sense.

Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Irina Tirdea <irina.tirdea@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1351835406-15208-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9783adf7 01-Nov-2012 Namhyung Kim <namhyung.kim@lge.com>

perf tools: Introduce struct hist_browser_timer

Currently various hist browser functions receive 3 arguments for
refreshing histogram but only used from a few places. Also it's only
for perf top command so that it can be NULL for other (and probably
most) cases. Pack them into a struct in order to reduce number of those
unused arguments.

This is a mechanical change and does not intend a functional change.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Irina Tirdea <irina.tirdea@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1351835406-15208-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 70cb4e96 29-Oct-2012 Feng Tang <feng.tang@intel.com>

perf tools: Add a global variable "const char *input_name"

Currently many perf commands annotate/evlist/report/script/lock etc all
support "-i" option to chose a specific perf data, and all of them
create a local "input_name" to save the file name for that perf data.

Since most of these commands need it, we can add a global variable for
it, also it can some other benefits:

1. When calling script browser inside hists/annotation browser, it needs
to know the perf data file name to run that script.

2. For further feature like runtime switching to another perf data file,
this variable can also help.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1351569369-26732-2-git-send-email-feng.tang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 68e94f4e 15-Oct-2012 Irina Tirdea <irina.tirdea@intel.com>

perf tools: Try to find cross-built objdump path

As we have architecture information of saved perf.data file, we can try
to find cross-built objdump path.

The triplets include support for Android (arm, x86 and mips
architectures).

Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Originally-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1350344020-8071-5-git-send-email-irina.tirdea@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f62d3f0f 06-Oct-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf event: No need to create a thread when handling PERF_RECORD_EXIT

When we were processing a PERF_RECORD_EXIT event we first used
machine__findnew_thread for both the thread exiting and for its parent,
only to use just the thread struct associated with the one exiting, and
to just delete it.

If it existed, i.e. not created at this very moment in
machine__findnew_thread, it will be moved to the machine->dead_threads
linked list, because we may have hist_entries pointing to it, but if it
was created just do be deleted, it will just sit there with no
references at all.

Use the new machine__find_thread() method so that if it is not there, we
don't create it.

As a bonus the parent thread will also not be created at this point.

Create process_fork() and process_exit() helpers to use this and make
the builtins use it instead of the generic process_task(), ditched by
this patch.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-z7n2y98ebjyrvmytaope4vdl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1d77822e 04-Oct-2012 Jiri Olsa <jolsa@redhat.com>

perf tool: Add hpp interface to enable/disable hpp column

Adding perf_hpp__column_enable function to enable/disable hists column
and removing diff command specific stuff 'need_pair and
show_displacement' from hpp code.

The diff command now enables/disables columns separately according to
the user arguments. This will be helpful in future patches where more
columns are added into diff output.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-6-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 41724e4c 04-Oct-2012 Jiri Olsa <jolsa@redhat.com>

perf tools: Removing hists pair argument from output path

The hists pointer is now part of the 'struct hist_entry'.

And since the overhead and baseline columns are split now, there's no
reason to pass it through the output path.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-5-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# dd464345 04-Oct-2012 Jiri Olsa <jolsa@redhat.com>

perf diff: Refactor diff displacement possition info

Moving the position calculation into the diff command, so the position
as prepared inside struct hist_entry data and there's no need to compute
in the output display path.

Removing 'displacement' from struct perf_hpp as it is no longer needed.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-3-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 60e5c706 12-Sep-2012 Namhyung Kim <namhyung.kim@lge.com>

perf report: Add missing perf_hpp__init for pipe-mode

The perf_hpp__init() function was only called from setup_browser() so
that the pipe-mode missed the initialization thus didn't respond to
related options. Fix it.

Reported-by: Robert Richter <robert.richter@amd.com>
Tested-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/87txv28spl.fsf_-_@sejong.aot.lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 034a9265 14-Sep-2012 Namhyung Kim <namhyung.kim@lge.com>

perf report: Enable integrated annotation only if possible

The integrated annotation feature is supported only in TUI mode. Also
it should be enabled with 'symbol' sort key otherwise resulting hist
entry doesn't need to have same symbol as of a sample so that it can
fail on hist_entry__inc_addr_samples with -ERANGE.

You can easily see it when start perf report TUI without symbol* sort
key. This patch fixes the problem.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1347611729-16994-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1d037ca1 10-Sep-2012 Irina Tirdea <irina.tirdea@gmail.com>

perf tools: Use __maybe_used for unused variables

perf defines both __used and __unused variables to use for marking
unused variables. The variable __used is defined to
__attribute__((__unused__)), which contradicts the kernel definition to
__attribute__((__used__)) for new gcc versions. On Android, __used is
also defined in system headers and this leads to warnings like: warning:
'__used__' attribute ignored

__unused is not defined in the kernel and is not a standard definition.
If __unused is included everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has a variables with this name
in its headers.

The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.

Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
[ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7a4ec938 03-Sep-2012 Maciek Borzecki <maciek.borzecki@gmail.com>

perf tools: Allow user to indicate path to objdump in command line

When analyzing perf data from hosts of other architecture than one of
the local host it's useful to call objdump that is part of a toolchain
for that architecture. Instead of calling regular objdump, call one that
user specified in command line.

Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1346754750.16299.3.camel@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 71ad0f5e 07-Aug-2012 Jiri Olsa <jolsa@redhat.com>

perf tools: Support for DWARF CFI unwinding on post processing

This brings the support for DWARF cfi unwinding on perf post
processing. Call frame informations are retrieved and then passed
to libunwind that requests memory and register content from the
applications.

Adding unwind object to handle the user stack backtrace based
on the user register values and user stack dump.

The unwind object access the libunwind via remote interface
and provides to it all the necessary data to unwind the stack.

The unwind interface provides following function:
unwind__get_entries

And callback (specified in above function) to retrieve
the backtrace entries:
typedef int (*unwind_entry_cb_t)(struct unwind_entry *entry,
void *arg);

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1344345647-11536-12-git-send-email-jolsa@redhat.com
[ Replaced use of perf_session by usage of perf_evsel ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7f3be652 01-Aug-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Use perf_evlist__sample_type more extensively

Removing perf_session->sample_type, as it can be obtained from the
evsel/evlist.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mnt1zwlik7sp7z6ljc9kyefg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 300aa941 11-Jun-2012 David Ahern <dsahern@gmail.com>

perf report: Delay sample_type checks in pipe mode

The pipeline:
perf record -a -g -o - sleep 5 |perf inject -v -b | perf report -g -i -

generates the warning:
Selected -g but no callchain data. Did you call 'perf record' without -g?

The problem is that the header data is not written to the pipe, so the
sample_type has not been available when perf_report__setup_sample_type
is called. For pipe mode, record dumps the sample type as part of the
synthesized events stream -- perf_event__synthesize_attrs(). Handle this
be detecting pipe mode and not doing early sanity checks on sample_type.

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1339444121-26236-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a9c34a9f 11-Jun-2012 Jiri Olsa <jolsa@redhat.com>

perf tools: Remove unused evsel parameter from machine__resolve_callchain

Removing unused evsel parameter from machine__resolve_callchain
function. Plus related header file and callers changes.

The evsel parameter is unused since following commit:
perf callchain: Make callchain cursors TLS
commit 472606458f3e1ced5fe3cc5f04e90a6b5a4732cf
Author: Namhyung Kim <namhyung.kim@lge.com>
Date: Thu May 31 14:43:26 2012 +0900

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1339420814-7379-9-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7289f83c 11-Jun-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move all users of event_name to perf_evsel__name

So that we don't use global variables that could make us misreport event
names when having a multi window top, for instance.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mccancovi1u0wdkg8ncth509@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 47260645 30-May-2012 Namhyung Kim <namhyung.kim@lge.com>

perf callchain: Make callchain cursors TLS

perf top -G has a race on callchain cursor between main thread and
display thread. Since the callchain cursors are used locally make them
thread-local data would solve the problem.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Reported-by: Sunjin Yang <fan4326@gmail.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sunjin Yang <fan4326@gmail.com>
Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 91557e84 29-May-2012 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Use the right symbol for annotation

In non symbolic views, i.e. --sort without "symbol", as in:

perf report --sort comm

We're segfaulting in the --tui because we're testing the symbol resolved
and then trying to use the symbol on the histogram entry where we're
coalescing all hits for a COMM, and the first hist_entry for a comm may
have a NULL symbol, i.e. the RIP didn't resolve to any symbol.

In this case we're segfaulting, fix it by testing against the symbol in
the histogram entry.

Reported-by: William Cohen <wcohen@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-8ylwubbcmu27ucc9ffrku3yv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3780f488 28-May-2012 Namhyung Kim <namhyung@gmail.com>

perf tools: Convert critical messages to ui__error()

There were places where use ui__warning (or even fprintf) to show
critical messages. This patch converts them to ui__error so that the
front-end code can implement appropriate behavior.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338265382-6872-3-git-send-email-namhyung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 04480d01 02-May-2012 Jiri Olsa <jolsa@redhat.com>

perf report: Fix format string for x86-32 compilation

Using PRIu64 for printing out u64 nr_events to fix compilation
for x86 32 bits.

Cc: Arun Sharma <asharma@fb.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frank C. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/1335958638-5160-7-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 281ef544 29-Apr-2012 Namhyung Kim <namhyung.kim@lge.com>

perf ui: Add gtk2 support into setup_browser()

Now setup_browser can handle gtk2 front-end so split the TUI code to
ui/tui/setup.c in order to remove dependency.

To this end, make ui__init/exit global symbols and take an argument.
Also split gtk code to ui/gtk/setup.c.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1335761711-31403-5-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 28e62b90 29-Apr-2012 Namhyung Kim <namhyung.kim@lge.com>

perf ui gtk: Rename functions for consistency

We use double underscore characters to distinguish its subsystem and
actual function name.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1335761711-31403-4-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7706f966 29-Apr-2012 Namhyung Kim <namhyung.kim@lge.com>

perf ui gtk: Drop arg[cv] arguments from perf_gtk_setup_browser()

As perf doesn't allow to specify gtk command-line option, drop the
arguments and pass NULL to gtk_init().

This makes the function easier to be called from setup_browser().

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1335761711-31403-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9e755756 15-Apr-2012 David Ahern <dsahern@gmail.com>

perf report: Fix crash showing warning related to kernel maps

While testing https://lkml.org/lkml/2012/4/10/123 I hit this crash:

(gdb) bt
0 0x000000000042000f in __cmd_report (rep=0x7fff80cec580) at builtin-report.c:380
1 cmd_report (argc=0, argv=<optimized out>, prefix=<optimized out>) at builtin-report.c:759
2 0x0000000000414513 in run_builtin (p=0x7724a8, argc=3, argv=0x7fff80ceca70) at perf.c:273
3 0x0000000000413d41 in handle_internal_command (argv=0x7fff80ceca70, argc=3) at perf.c:345
4 run_argv (argv=0x7fff80cec880, argcp=0x7fff80cec88c) at perf.c:389
5 main (argc=3, argv=0x7fff80ceca70) at perf.c:487

kernel_map can be NULL, so need to handle it while dumping a warning
to user.

v2:
- fixed RB_EMPTY_ROOT check -- desc takes the altnerative output when RB_EMPTY_ROOT is false.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Link: http://lkml.kernel.org/r/1334544855-55021-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cc686280 05-Apr-2012 Ashay Rane <ashay.rane@tacc.utexas.edu>

perf report: Correct display of samples and events in header

This patch prints the number of samples and the count of performance
events separately.

This allows comparing performance of different applications with each
other.

Previously, the sample count was displayed against an 'Events:' heading.
With this patch, the header now reads (for example):

Samples: 5K of event 'instructions'
Event count (approx.): 2993026545

The patch covers both the stdio and the browser interface.

Signed-off-by: Ashay Rane <ashay.rane@tacc.utexas.edu>
[ committer note: Fixed wrt e7f01d1 ]
Link: http://lkml.kernel.org/n/tip-h4nfjm8msedlk8gxkzivfh5y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c31a9457 19-Mar-2012 Pekka Enberg <penberg@kernel.org>

perf report: Add a simple GTK2-based 'perf report' browser

This patch adds a simple GTK2-based browser to 'perf report' that's
based on the TTY-based browser in builtin-report.c.

To launch "perf report" using the new GTK interface just type:

$ perf report --gtk

The interface is somewhat limited in features at the moment:

- No callgraph support

- No KVM guest profiling support

- No color coding for percentages

- No sorting from the UI

- ..and many, many more!

That said, I think this patch a reasonable start to build future features on.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Colin Walters <walters@verbum.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202231952410.6689@tux.localdomain
[ committer note: Added #pragma to make gtk no strict prototype problem go
away as suggested by Colin Walters modulo avoiding push/pop ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6db6127c 16-Mar-2012 Namhyung Kim <namhyung.kim@lge.com>

perf report: Treat an argument as a symbol filter

As Ingo requested, it'd be better off treating first (and the only)
argument as a symbol filter, so that user doesn't need to input the
symbol on the dialog window on TUI.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-5-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b14ffaca 16-Mar-2012 Namhyung Kim <namhyung.kim@lge.com>

perf report: Add --symbol-filter option

Add new --symbol-filter command line option to set appropriate filter
string.

Its short version is missing as I couldn't find an ideal one and
--filter option of perf record also has no short version.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-4-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a68c2c58 08-Mar-2012 Stephane Eranian <eranian@google.com>

perf report: Enable TUI in branch view mode

This patch updates perf report to support TUI mode
when the perf.data file contains samples with branch
stacks.

For each row in the report, it is possible to annotate
either the source or target of each branch.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 993ac88d 08-Mar-2012 Stephane Eranian <eranian@google.com>

perf report: Auto-detect branch stack sampling mode

This patch enhances perf report to auto-detect when the
perf.data file contains samples with branch stacks. That way it
is not necessary to use the -b option.

To force branch view mode to off, simply use --no-branch-stack.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b50311dc 09-Feb-2012 Roberto Agostino Vitillo <ravitillo@lbl.gov>

perf report: Add support for taken branch sampling

This patch adds support for taken branch sampling, i.e, the
PERF_SAMPLE_BRANCH_STACK feature to perf report. In other
words, to display histograms based on taken branches rather
than executed instructions addresses.

The new option is called -b and it takes no argument. To
generate meaningful output, the perf.data must have been
obtained using perf record -b xxx ... where xxx is a branch
filter option.

The output shows symbols, modules, sorted by 'who branches
where' the most often. The percentages reported in the first
column refer to the total number of branches captured and
not the usual number of samples.

Here is a quick example.
Here branchy is simple test program which looks as follows:

void f2(void)
{}
void f3(void)
{}
void f1(unsigned long n)
{
if (n & 1UL)
f2();
else
f3();
}
int main(void)
{
unsigned long i;

for (i=0; i < N; i++)
f1(i);
return 0;
}

Here is the output captured on Nehalem, if we are
only interested in user level function calls.

$ perf record -b any_call,u -e cycles:u branchy

$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printf

About half (52%) of the call branches captured are from main()
-> f1(). The second half (24%+23%) is split in two equal shares
between f1() -> f2(), f1() ->f3(). The output is as expected
given the code.

It should be noted, that using -b in perf record does not
eliminate information in the perf.data file. Consequently, a
typical profile can also be obtained by perf report by simply
not using its -b option.

It is possible to sort on branch related columns:

- dso_from, symbol_from
- dso_to, symbol_to
- mispredict

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# efad1415 07-Dec-2011 Robert Richter <robert.richter@amd.com>

perf report: Accept fifos as input file

The default input file for perf report is not handled the same way as
perf record does it for its output file. This leads to unexpected
behavior of perf report, etc. E.g.:

# perf record -a -e cpu-cycles sleep 2 | perf report | cat
failed to open perf.data: No such file or directory (try 'perf record' first)

While perf record writes to a fifo, perf report expects perf.data to be
read. This patch changes this to accept fifos as input file.

Applies to the following commands:

perf annotate
perf buildid-list
perf evlist
perf kmem
perf lock
perf report
perf sched
perf script
perf timechart

Also fixes char const* -> const char* type declaration for filename
strings.

v2:
* Prevent potential null pointer access to input_name in
builtin-report.c. Needed due to removal of patch "perf report: Setup
browser if stdout is a pipe"

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fb2baceb 12-Dec-2011 Namhyung Kim <namhyung@gmail.com>

perf report: Fix usage string

perf report does not take a command from command line.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1323703017-6060-8-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6581f6e3 12-Dec-2011 Namhyung Kim <namhyung@gmail.com>

perf report: Document '--call-graph' for optional print_limit argument

The '--call-graph' command line option can receive undocumented optional
print_limit argument. Besides, use strtoul() to parse the option since
its type is u32.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1323703017-6060-2-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c8e66720 13-Nov-2011 David Ahern <dsahern@gmail.com>

perf tools: make -C consistent across commands (for cpu list arg)

Currently the meaning of -C varies by perf command: for perf-top,
perf-stat, perf-record it means cpu list. For perf-report it means comm
list. Then perf-annotate, perf-report and perf-script use -c for cpu
list.

Fix annotate, report and script to use -C for cpu list to be consistent
with top, stat and record. This means report needs to use -c for comm
list which does introduce a backward compatibility change.

v1 -> v2
- update perf-script.txt too

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1321209008-7004-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 45694aa7 28-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Rename perf_event_ops to perf_tool

To better reflect that it became the base class for all tools, that must
be in each tool struct and where common stuff will be put.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 743eb868 28-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Resolve machine earlier and pass it to perf_event_ops

Reducing the exposure of perf_session further, so that we can use the
classes in cases where no perf.data file is created.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d20deb64 25-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Pass tool context in the the perf_event_ops functions

So that we don't need to have that many globals.

Next steps will remove the 'session' pointer, that in most cases is
not needed.

Then we can rename perf_event_ops to 'perf_tool' that better describes
this class hierarchy.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fa372aae 16-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Group options in a struct

Paving the way to remove these globals when we change the perf_event_ops
to receive as a first parameter a pointer to a perf_event_ops that will
then provide access to perf_report via container_of.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-2eh2vi2nb5z3tg1lvoxv09xu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 246d4ce8 11-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Remove superfluous callchain_cursor member

Since we have it in evsel->hists.callchain_cursor, remove it from
perf_session.

One more step in disentangling several places from requiring a
perf_session pointer.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rxr5dj3di7ckyfmnz0naku1z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d04b35f8 11-Nov-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Add nr_events to symbol_conf

Since symbol__alloc_hists need it, to avoid passing it around in many
functions have it in the symbol_conf struct.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cwv8ysvpywzjq4v3xtbd4zwv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 64c6f0c7 05-Oct-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Make --no-asm-raw the default

And add the annotation output knobs to all the tools that have
integrated annotation (top, report).

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-gnlob67mke6sji2kf4nstp7m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fbe96f29 30-Sep-2011 Stephane Eranian <eranian@google.com>

perf tools: Make perf.data more self-descriptive (v8)

The goal of this patch is to include more information about the host
environment into the perf.data so it is more self-descriptive. Overtime,
profiles are captured on various machines and it becomes hard to track
what was recorded, on what machine and when.

This patch provides a way to solve this by extending the perf.data file
with basic information about the host machine. To add those extensions,
we leverage the feature bits capabilities of the perf.data format. The
change is backward compatible with existing perf.data files.

We define the following useful new extensions:
- HEADER_HOSTNAME: the hostname
- HEADER_OSRELEASE: the kernel release number
- HEADER_ARCH: the hw architecture
- HEADER_CPUDESC: generic CPU description
- HEADER_NRCPUS: number of online/avail cpus
- HEADER_CMDLINE: perf command line
- HEADER_VERSION: perf version
- HEADER_TOPOLOGY: cpu topology
- HEADER_EVENT_DESC: full event description (attrs)
- HEADER_CPUID: easy-to-parse low level CPU identication

The small granularity for the entries is to make it easier to extend
without breaking backward compatiblity. Many entries are provided as
ASCII strings.

Perf report/script have been modified to print the basic information as
easy-to-parse ASCII strings. Extended information about CPU and NUMA
topology may be requested with the -I option.

Thanks to David Ahern for reviewing and testing the many versions of
this patch.

$ perf report --stdio
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
...

$ perf report --stdio -I
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# sibling cores : 0-3
# sibling threads : 0
# sibling threads : 1
# sibling threads : 2
# sibling threads : 3
# node0 meminfo : total = 8320608 kB, free = 7571024 kB
# node0 cpu list : 0-3
# ========
#
...

Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer notes: Use --show-info in the tools as was in the docs, rename
perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
conflict with f69b64f7 "perf: Support setting the disassembler style" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 81cce8de 05-Oct-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf browsers: Add live mode to the hists, annotate browsers

This allows passing a timer to be run periodically, which will update
the hists tree that then gers refreshed on the screen, just like the
Live mode (symbol entries, annotation) we already have in 'perf top
--tui'.

Will be used by the new hist_entry/hists based 'top' tool.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-2r44qd8oe4sagzcgoikl8qzc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3f2728bd 05-Oct-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add option to show total period

Just like --show-nr-samples, to help in diagnosing problems in the
tools.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1lr7ejdjfvy2uwy2wkmatcpq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ef9dfe6e 25-Sep-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hists: Allow limiting the number of rows and columns in fprintf

So that we can reuse hists__fprintf for in the new perf top tool.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-huazw48x05h8r9niz5cf63za@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b53af0f2 21-Sep-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Fix stdio event name header printing

In the past we tried to avoid printing the name of the event when just
one event was found in the perf.data file, after some refactorings it
ended up not printing the event name if just one hist_entry was found in
one of the events.

Fix it by always printing the name of the event, even if just one is
found.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-kikr0c7ou55bd9caok8569rf@git.kernel.org
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f69b64f7 15-Sep-2011 Andi Kleen <ak@linux.intel.com>

perf: Support setting the disassembler style

Add -M option to report/annotate to pass directly to objdump. This
allows to use -M intel for intel style disassembler syntax, which is
useful for people who are very used to the Intel syntax.

Link: http://lkml.kernel.org/r/1316122302-24306-2-git-send-email-andi@firstfloor.org
[committer note: Add missing Documentation bits, fixup conflicts with 3e6a2a7]
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 00894ce9 02-Aug-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Use ui__warning in some more places

So that we get a proper warning in the TUI in cases like:

$ perf report --stdio -g fractal,0.5,caller --sort pid
Selected -g but no callchain data. Did you call 'perf record' without -g?
$

The --stdio case is ok because it uses fprintf, ui__warning is needed to
figure out if --stdio or --tui is being used.

Cc: Arun Sharma <asharma@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Liao <phyomh@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ag9fz2wd17mbbfjsbznq1wms@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5d67be97 04-Jul-2011 Anton Blanchard <anton@samba.org>

perf report/annotate/script: Add option to specify a CPU range

Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.

This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.

The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.

v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentation

v3: Create perf_session__cpu_bitmap()

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# cb1955b8 29-Jun-2011 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Only display parent field if explictly sorted

We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").

However if parent filtering is used in combination with
explicit parent sorting ( -s parent), we want to
display it.

Result with:

perf report -p kernel_thread -s parent

Before:

# Overhead Parent symbol
# ........ .............
#
0.07%
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helpe

After:

# Overhead Parent symbol
# ........ .............
#
0.07% kernel_thread_helper
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helper

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>


# d797fdc5 07-Jun-2011 Sam Liao <phyomh@gmail.com>

perf tools: Add inverted call graph report support.

Add "caller/callee" option to support inverted butterfly report,
in the inverted report (with caller option), the call graph start
from the callee's ancestor. Users can use such view to catch system's
performance bottleneck from a sysprof like view. Using this option
with specified sort order like pid gives us high level view of call
graph statistics.

Also add "-G" alias for inverted call graph.

Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>


# 646aaea6 27-May-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Make sure kptr_restrict warnings fit 80 col terms

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ec80fde7 26-May-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Handle /proc/sys/kernel/kptr_restrict

Perf uses /proc/modules to figure out where kernel modules are loaded.

With the advent of kptr_restrict, non root users get zeroes for all module
start addresses.

So check if kptr_restrict is non zero and don't generate the syntethic
PERF_RECORD_MMAP events for them.

Warn the user about it in perf record and in perf report.

In perf report the reference relocation symbol being zero means that
kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
use it to fixup symbol addresses when using a valid kallsyms (in the buildid
cache) or vmlinux (in the vmlinux path) build-id located automatically or
specified by the user.

Provide an explanation about it in 'perf report' if kernel samples were taken,
checking if a suitable vmlinux or kallsyms was found/specified.

Restricted /proc/kallsyms don't go to the buildid cache anymore.

Example:

[acme@emilia ~]$ perf record -F 100000 sleep 1

WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
/proc/sys/kernel/kptr_restrict.

Samples in kernel functions may not be resolved if a suitable vmlinux file is
not found in the buildid cache or in the vmlinux path.

Samples in kernel modules won't be resolved at all.

If some relocation was applied (e.g. kexec) symbols may be misresolved even
with a suitable vmlinux or kallsyms file.

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
[acme@emilia ~]$

[acme@emilia ~]$ perf report --stdio
Kernel address maps (/proc/{kallsyms,modules}) were restricted,
check /proc/sys/kernel/kptr_restrict before running 'perf record'.

If some relocation was applied (e.g. kexec) symbols may be misresolved.

Samples in kernel modules can't be resolved as well.

# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. .....................
#
20.24% sleep [kernel.kallsyms] [k] page_fault
20.04% sleep [kernel.kallsyms] [k] filemap_fault
19.78% sleep [kernel.kallsyms] [k] __lru_cache_add
19.69% sleep ld-2.12.so [.] memcpy
14.71% sleep [kernel.kallsyms] [k] dput
4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers
0.73% sleep [kernel.kallsyms] [k] perf_event_comm
0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe

#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$

This is because it found a suitable vmlinux (build-id checked) in
/lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
file name).

If we remove that file from the vmlinux path:

[root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
/lib/modules/2.6.39-rc7+/build/vmlinux.OFF
[acme@emilia ~]$ perf report --stdio
[kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
not found, continuing without symbols

Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
/proc/sys/kernel/kptr_restrict before running 'perf record'.

As no suitable kallsyms nor vmlinux was found, kernel samples can't be
resolved.

Samples in kernel modules can't be resolved as well.

# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ......
#
80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a
19.69% sleep ld-2.12.so [.] memcpy

#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$

Reported-by: Stephane Eranian <eranian@google.com>
Suggested-by: David Miller <davem@davemloft.net>
Cc: Dave Jones <davej@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9e69c210 15-Mar-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Pass evsel in event_ops->sample()

Resolving the sample->id to an evsel since the most advanced tools,
report and annotate, and the others will too when they evolve to
properly support multi-event perf.data files.

Good also because it does an extra validation, checking that the ID is
valid when present. When that is not the case, the overhead is just a
branch + function call (perf_evlist__id2evsel).

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a91e5431 10-Mar-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Use evlist/evsel for managing perf.data attributes

So that we can reuse things like the id to attr lookup routine
(perf_evlist__id2evsel) that uses a hash table instead of the linear
lookup done in the older perf_header_attr routines, etc.

Also to make evsels/evlist more pervasive an API, simplyfing using the
emerging perf lib.

cc: Arun Sharma <arun@sharma-home.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7f0030b2 06-Mar-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report tui: Improve multi event session support

When multiple events were used in 'perf record', allow the user to
choose which one is wanted before showing the per event histograms.

Annotations will be performed on the chosen event.

Allow going back and forth from event to event quickly using just the
arrow keys and enter.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: William Cohen <wcohen@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e248de33 05-Mar-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Improve support for sessions with multiple events

By creating an perf_evlist out of the attributes in the perf.data file
header, so that we can use evlists and evsels when reading recorded
sessions in addition to when we record sessions.

More work is needed to allow tools to allow the user to select which
events are wanted when browsing sessions, be it just one or a subset of
them, aggregated or showed at the same time but with different
indications on the UI to allow seeing workloads thru different views at
the same time.

But the overall goal/trend is to more uniformly use evsels and evlists.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 74cfc17d 17-Feb-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Tell the user when a perf.data file has no samples

[root@emilia ~]# perf report --stdio
The perf.data file has no samples!
[root@emilia ~]#

The TUI shows a popup warning message with the same message.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0849327d 10-Feb-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Fix initializion of annotate symbol priv area

We only allocate it when in TUI mode. In --stdio mode unconditionally
initializing this area leads to memory corruption.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ce6f4fab 08-Feb-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Move locking to struct annotation

Since we'll need it when implementing the live annotate TUI browser.

This also simplifies things a bit by having the list head for the source
code to be in the dynamicly allocated part of struct annotation, that
way we don't have to pass it around, it can be found from the struct
symbol that is passed everywhere.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2f525d01 04-Feb-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Support multiple histograms in annotation

The perf annotate tool continues aggregating everything on just one
histograms, but to support the top model add support for one histogram
perf evsel in the evlist.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 78f7defe 04-Feb-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Move annotate functions to util/

They will be used by perf top, so that we have just one set of routines
to do annotation.

Rename "struct sym_priv" to "struct annotation", etc, to clarify this
code a bit.

Rename "struct sym_ext" to "struct source_line", to give it a meaningful
name, that clarifies that it is a the result of an addr2line call, that
is sorted by percentage one particular source code line appeared in the
annotation.

And since we're moving things around also rename 'sym_hist->ip' to
'sym_hist->addr' as we want to do data structure annotation at some
point.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 229ade9b 31-Jan-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Don't fallback to setup_pager unconditionally

Because in tools like 'top' we don't want the pager.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8115d60c 29-Jan-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Kill event_t typedef, use 'union perf_event' instead

And move the event_t methods to the perf_event__ too.

No code changes, just namespace consistency.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8d50e5b4 29-Jan-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Rename 'struct sample_data' to 'struct perf_sample'

Making the namespace more uniform.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 16537f13 13-Jan-2011 Frederic Weisbecker <fweisbec@gmail.com>

perf callchain: Rename register_callchain_param into callchain_register_param

To make the callchain API naming more consistent.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1b3a0e95 13-Jan-2011 Frederic Weisbecker <fweisbec@gmail.com>

perf callchain: Feed callchains into a cursor

The callchains are fed with an array of a fixed size.
As a result we iterate over each callchains three times:

- 1st to resolve symbols
- 2nd to filter out context boundaries
- 3rd for the insertion into the tree

This also involves some pairs of memory allocation/deallocation
everytime we insert a callchain, for the filtered out array of
addresses and for the array of symbols that comes along.

Instead, feed the callchains through a linked list with persistent
allocations. It brings several pros like:

- Merge the 1st and 2nd iterations in one. That was possible before
but in a way that would involve allocating an array slightly taller
than necessary because we don't know in advance the number of context
boundaries to filter out.

- Much lesser allocations/deallocations. The linked list keeps
persistent empty entries for the next usages and is extendable at
will.

- Makes it easier for multiple sources of callchains to feed a
stacktrace together. This is deemed to pave the way for cfi based
callchains wherein traditional frame pointer based kernel
stacktraces will precede cfi based user ones, producing an overall
callchain which size is hardly predictable. This requirement
makes the static array obsolete and makes a linked list based
iterator a much more flexible fit.

Basic testing on a big perf file containing callchains (~ 176 MB)
has shown a throughput gain of about 11% with perf report.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9486aa38 22-Jan-2011 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Fix 64 bit integer format strings

Using %L[uxd] has issues in some architectures, like on ppc64. Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.

Reported by Denis Kirjanov that provided a patch for one case, I went
and changed all cases.

Reported-by: Denis Kirjanov <dkirjanov@kernel.org>
Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
Cc: Denis Kirjanov <dkirjanov@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pingtian Han <phan@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ec5761ea 09-Dec-2010 David Ahern <daahern@cisco.com>

perf symbols: Add symfs option for off-box analysis using specified tree

The symfs argument allows analysis of perf.data file using a locally accessible
filesystem tree with debug symbols - e.g., tree created during image builds,
sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything
with an OS tree can be analyzed from anywhere without the need to populate a
local data store with build-ids.

Commiter notes:

o Fixed up symfs="/" variants handling.

o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files
outside the symfs directory.

LKML-Reference: <1291926427-28846-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# eac23d1c 08-Dec-2010 Ian Munsie <imunsie@au1.ibm.com>

perf record,report,annotate,diff: Process events in order

This patch changes perf report to ask for the ID info on all events be
default if recording from multiple CPUs.

Perf report, annotate and diff will now process the events in order if
the kernel is able to provide timestamps on all events. This ensures
that events such as COMM and MMAP which are necessary to correctly
interpret samples are processed prior to those samples so that they are
attributed correctly.

Before:
# perf record ./cachetest
# perf report

# Events: 6K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ...............................
#
74.11% :3259 [unknown] [k] 0x4a6c
1.50% cachetest ld-2.11.2.so [.] 0x1777c
1.46% :3259 [kernel.kallsyms] [k] .perf_event_mmap_ctx
1.25% :3259 [kernel.kallsyms] [k] restore
0.74% :3259 [kernel.kallsyms] [k] ._raw_spin_lock
0.71% :3259 [kernel.kallsyms] [k] .filemap_fault
0.66% :3259 [kernel.kallsyms] [k] .memset
0.54% cachetest [kernel.kallsyms] [k] .sha_transform
0.54% :3259 [kernel.kallsyms] [k] .copy_4K_page
0.54% :3259 [kernel.kallsyms] [k] .find_get_page
0.52% :3259 [kernel.kallsyms] [k] .trace_hardirqs_off
0.50% :3259 [kernel.kallsyms] [k] .__do_fault
<SNIP>

After:
# perf report

# Events: 6K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ...............................
#
44.28% cachetest cachetest [.] sumArrayNaive
22.53% cachetest cachetest [.] sumArrayOptimal
6.59% cachetest ld-2.11.2.so [.] 0x1777c
2.13% cachetest [unknown] [k] 0x340
1.46% cachetest [kernel.kallsyms] [k] .perf_event_mmap_ctx
1.25% cachetest [kernel.kallsyms] [k] restore
0.74% cachetest [kernel.kallsyms] [k] ._raw_spin_lock
0.71% cachetest [kernel.kallsyms] [k] .filemap_fault
0.66% cachetest [kernel.kallsyms] [k] .memset
0.54% cachetest [kernel.kallsyms] [k] .copy_4K_page
0.54% cachetest [kernel.kallsyms] [k] .find_get_page
0.54% cachetest [kernel.kallsyms] [k] .sha_transform
0.52% cachetest [kernel.kallsyms] [k] .trace_hardirqs_off
0.50% cachetest [kernel.kallsyms] [k] .__do_fault
<SNIP>

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1291872833-839-1-git-send-email-imunsie@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 21ef97f0 09-Dec-2010 Ian Munsie <imunsie@au1.ibm.com>

perf session: Fallback to unordered processing if no sample_id_all

If we are running the new perf on an old kernel without support for
sample_id_all, we should fall back to the old unordered processing of
events. If we didn't than we would *always* process events without
timestamps out of order, whether or not we hit a reordering race. In
other words, instead of there being a chance of not attributing samples
correctly, we would guarantee that samples would not be attributed.

While processing all events without timestamps before events with
timestamps may seem like an intuitive solution, it falls down as
PERF_RECORD_EXIT events would also be processed before any samples.
Even with a workaround for that case, samples before/after an exec would
not be attributed correctly.

This patch allows commands to indicate whether they need to fall back to
unordered processing, so that commands that do not care about timestamps
on every event will not be affected. If we do fallback, this will print
out a warning if report -D was invoked.

This patch adds the test in perf_session__new so that we only need to
test once per session. Commands that do not use an event_ops (such as
record and top) can simply pass NULL in it's place.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1291951882-sup-6069@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b226a5a7 07-Dec-2010 David Ahern <daahern@cisco.com>

perf report: Allow user to specify path to kallsyms file

This is useful for analyzing a perf data file on a different system than
the one data was collected on and still include symbols from loaded
kernel modules in the output.

Commiter note: Updated the man page accordingly.

LKML-Reference: <1291775986-16475-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 640c03ce 02-Dec-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Parse sample earlier

At perf_session__process_event, so that we reduce the number of lines in eache
tool sample processing routine that now receives a sample_data pointer already
parsed.

This will also be useful in the next patch, where we'll allow sample the
identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
timestamp) just after before every event.

Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.

There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 6cb8e561 22-Aug-2010 Frederic Weisbecker <fweisbec@gmail.com>

perf: Rename append_callchain into callchain_append

Do that to start a consistant callchain API namespace.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Hellwig <hch@infradead.org>


# 8b9e74eb 21-Aug-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add --tui and --stdio to choose the UI

Relying just on ~/.perfconfig or rebuilding the tool disabling support
for the TUI is too cumbersome, so allow specifying which UI to use and
make the command line switch override whatever is in ~/.perfconfig.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 71e7cf3a 05-Aug-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Speed up exit path

When cmd_record exits the whole perf binary will exit right after,
so no need to traverse lots of complex data structures freeing them.

Sticked a comment for leak detectives and for a experiment with obstacks
to be performed so that we can speed up freeing that memory.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 80d50cae 05-Aug-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf ui: Add search by name/addr to the map__browser

Only in verbose mode so as not to bloat struct symbol too much.

The key used is '/', just like in vi, less, etc.

More work is needed to allocate space on the symbol in a more clear way.

This experiment shows how to do it for the hist_browser, in the main
window.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 88ca895d 27-Jul-2010 Dave Martin <dave.martin@linaro.org>

perf tools: Remove unneeded code for tracking the cwd in perf sessions

Tidy-up patch to remove some code and struct perf_session data members
which are no longer needed due to the previous patch: "perf tools: Don't
abbreviate file paths relative to the cwd".

LKML-Reference: <new-submission>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 108553e1 07-Jul-2010 Frederic Weisbecker <fweisbec@gmail.com>

perf: Sync callchains with period based hits

Hists have their hits increased by the event period. And this
period based counting is the foundation of all the stats in
perf report.

But callchains still use the raw number of hits, without taking
the period into account. So when we compute the percentage,
absolute based percentages are totally broken, and relative ones
too in the first parent level. Because we pass the number of events
muliplied by their period as the total number of hits to the
callchain filtering, while callchains expect this number to be
the number of raw hits.

perf report -g graph was simply not working, showing no graph unless
the min percent was zero. And even there the percentage of the
branches was always 0. And may be fractal filtering was broken on
the first branch level too.

flat also was broken, but it was hidden because of other breakages.

Anyway fix this by counting using periods on callchains.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>


# 41a37e20 04-Jun-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Make event__preprocess_sample parse the sample

Simplifying the tools that were using both in sequence and allowing
upcoming simplifications, such as Arun's patch to sort by cpus.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 761844b9 27-May-2010 Stephane Eranian <eranian@google.com>

perf report: Make -D print sampled CPU

It is useful to know on which CPU a sample was captured on.
The information is captured with perf record -R but it was
not printed out by perf report -D. This patch adds this.

When -R is not used, cpu is set to -1to indicate that
the CPU is unknown (it is not captured).

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4bff964c.e88cd80a.3106.7d31@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d67f088e 23-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Support multiple events on the TUI

The hists__tty_browse_tree function was created with the loop to print
all events, and its equivalent, hists__tui_browse_tree, was created in a
similar fashion, where it is possible to switch among the multiple
events, if present, using TAB to go the next event, and shift+TAB
(UNTAB) to go to the previous.

The report TUI now shows as the window title the name of the event and a
leak was fixed wrt pstacks.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 46e3e055 22-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Add TUI interface

When annotating multiple entries, for instance, when running simply as:

$ perf annotate

the right and left keys, as well as TAB can be used to cycle thru the
multiple symbols being annotated.

If one doesn't like TUI annotate, disable it by editing ~/.perfconfig
and adding:

[tui]

annotate = off

Just like it is possible for report.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0e8dc259 21-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Don't start the TUI if -D is used

One day we'll have support for the "dump raw trace in ASCII" in the TUI
frontend, but till then, use the tty code.

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5d06e691 20-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tui: Allow disabling the TUI on a per command basis in ~/.perfconfig

Using the same scheme as for git's/perf's pager setup, i.e. if one
doesn't want to, on a newt enabled perf binary, to disable the TUI for
'perf report', its just a matter of doing:

[root@doppio linux-2.6-tip]# printf "[tui]\n\nreport = off\n" >
/root/.perfconfig
[root@doppio linux-2.6-tip]# cat /root/.perfconfig
[tui]

report = off
[root@doppio linux-2.6-tip]#

System wide settings are also possible, by editing /etc/perfconfig, etc,
i.e. the git machinery for config files applies to perf as well, so when
in doubt where to put your settings, consult the git documentation, if
it fails, please let us know.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Discussed-with: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# edb7c60e 17-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf options: Type check all the remaining OPT_ variants

OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
tools is to set something that has an enum type, that is builtin
compatible with unsigned int.

Several string constifications were done to make OPT_STRING require a
const char * type.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c82ee828 14-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Report number of events, not samples

Number of samples is meaningless after we switched to auto-freq, so
report the number of events, i.e. not the sum of the different periods,
but the number PERF_RECORD_SAMPLE emitted by the kernel.

While doing this I noticed that naming "count" to the sum of all the
event periods can be confusing, so rename it to .period, just like in
struct sample.data, so that we become more consistent.

This helps with the next step, that was to record in struct hist_entry
the number of sample events for each instance, we need that because we
use it to generate the number of events when applying filters to the
tree of hist entries like it is being done in the TUI report browser.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cee75ac7 14-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Clarify events_stats fields usage

The events_stats.total field is too generic, rename it to .total_period,
and also add a comment explaining that it is the sum of all the .period
fields in samples, that is needed because we use auto-freq to avoid
sampling artifacts.

Ditto for events_stats.lost, that is the sum of all lost_event.lost
fields, i.e. the number of events the kernel dropped.

Looking at the users, builtin-sched.c can make use of these fields and
stop doing it again.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c8446b9b 14-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Make event__totals per hists

This is one more thing that started global but are more useful per hist
or per session.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ef7b93a1 11-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Librarize the annotation code and use it in the newt browser

Now we don't anymore use popen to run 'perf annotate' for the selected
symbol, instead we collect per address samplings when processing samples
in 'perf report' if we're using the newt browser, then we use this data
directly to do annotation.

Done this way we can actually traverse the objdump_line objects
directly, matching the addresses to the collected samples and colouring
them appropriately using lower level slang routines.

The new ui_browser class will be reused for the main, callchain aware,
histogram browser, when it will be made generic and don't assume that
the objects are always instances of the objdump_line class maintained
using list_heads.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b09e0190 11-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Adopt filter by dso and by thread methods from the newt browser

Those are really not specific to the newt code, can be used by other UI
frontends.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fefb0b94 10-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Calculate max_sym name len and nr_entries

Better done when we are adding entries, be it initially of when we're
re-sorting the histograms.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1c02c4d2 10-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Introduce hists class and move lots of methods to it

In cbbc79a we introduced support for multiple events by introducing a
new "event_stat_id" struct and then made several perf_session methods
receive a point to it instead of a pointer to perf_session, and kept the
event_stats and hists rb_tree in perf_session.

While working on the new newt based browser, I realised that it would be
better to introduce a new class, "hists" (short for "histograms"),
renaming the "event_stat_id" struct and the perf_session methods that
were really "hists" methods, as they manipulate only struct hists
members, not touching anything in the other perf_session members.

Other optimizations, such as calculating the maximum lenght of a symbol
name present in an hists instance will be possible as we add them,
avoiding a re-traversal just for finding that information.

The rationale for the name "hists" to replace "event_stat_id" is that we
may have multiple sets of hists for the same event_stat id, as, for
instance, the 'perf diff' tool has, so event stat id is not what
characterizes what this struct and the functions that manipulate it do.

Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 232a5c94 09-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Allow limiting the number of entries to print in callchains

Works by adding a third parameter to the '-g' argument, after the graph
type and minimum percentage, for example:

[root@doppio linux-2.6-tip]# perf report -g fractal,0.5,2

Will show only the first two symbols where at least 0.5% of the samples
took place.

All the other symbols that don't fall outside these constraints will be
put together in the last entry, prefixed with "[...]" and the total
percentage for them.

Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 28e2a106 09-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Simplify the insertion of new hist_entry instances

And with that fix at least one bug:

The first hit for an entry, the one that calls malloc to create a new
instance in __perf_session__add_hist_entry, wasn't adding the count to
the per cpumode (PERF_RECORD_MISC_USER, etc) total variable.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 39d1e1b1 08-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Fix leak of resolved callchains array on error path

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 139633c6 09-May-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf callchain: Move validate_callchain to callchain lib

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e157eb83 08-May-2010 Pekka Enberg <penberg@cs.helsinki.fi>

perf report: Document '--call-graph' better for usage

This patch improves 'perf report -h' output for the
'--call-graph' command line option by enumerating the
different output types.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1273332783-4268-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 454c407e 01-May-2010 Tom Zanussi <tzanussi@gmail.com>

perf: add perf-inject builtin

Currently, perf 'live mode' writes build-ids at the end of the
session, which isn't actually useful for processing live mode events.

What would be better would be to have the build-ids sent before any of
the samples that reference them, which can be done by processing the
event stream and retrieving the build-ids on the first hit. Doing
that in perf-record itself, however, is off-limits.

This patch introduces perf-inject, which does the same job while
leaving perf-record untouched. Normal mode perf still records the
build-ids at the end of the session as it should, but for live mode,
perf-inject can be injected in between the record and report steps
e.g.:

perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the
event stream - in this case build-ids (-b option) are read and
injected as needed into the event stream.

Build-ids are just the first user of perf-inject - potentially
anything that needs userspace processing to augment the trace stream
with additional information could make use of this facility.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# cbf69680 27-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf machines: Make the machines class adopt the dsos__fprintf methods

Now those methods don't operate on a global list of dsos, but on lists
of machines, so make this clear by renaming the functions.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 23346f21 27-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Rename "kernel_info" to "machine"

struct kernel_info and kerninfo__ are too vague, what they really
describe are machines, virtual ones or hosts.

There are more changes to introduce helpers to shorten function calls
and to make more clear what is really being done, but I left that for
subsequent patches.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# a1645ce1 18-Apr-2010 Zhang, Yanmin <yanmin_zhang@linux.intel.com>

perf: 'perf kvm' tool for monitoring guest performance from host

Here is the patch of userspace perf tool.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>


# c7929e47 01-Apr-2010 Tom Zanussi <tzanussi@gmail.com>

perf: Convert perf header build_ids into build_id events

Bypasses the build_id perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-9-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9215545e 01-Apr-2010 Tom Zanussi <tzanussi@gmail.com>

perf: Convert perf tracing data into a tracing_data event

Bypasses the tracing_data perf header code and replaces it with
a synthesized event and processing function that accomplishes
the same thing, used when reading/writing perf data to/from a
pipe.

The tracing data is pretty large, and this patch doesn't attempt
to break it down into component events. The tracing_data event
itself doesn't actually contain the tracing data, rather it
arranges for the event processing code to skip over it after
it's read, using the skip return value added to the event
processing loop in a previous patch.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-8-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# cd19a035 01-Apr-2010 Tom Zanussi <tzanussi@gmail.com>

perf: Convert perf event types into event type events

Bypasses the event type perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-7-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 2c46dbb5 01-Apr-2010 Tom Zanussi <tzanussi@gmail.com>

perf: Convert perf header attrs into attr events

Bypasses the attr perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Making the attrs into events allows them to be streamed over a
pipe along with the rest of the header data (in later patches).
It also paves the way to allowing events to be added and removed
from perf sessions dynamically.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-6-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 46656ac7 01-Apr-2010 Tom Zanussi <tzanussi@gmail.com>

perf report: Introduce special handling for pipe input

Adds special treatment for stdin - if the user specifies '-i -'
to perf report, the intent is that the event stream be written
to stdin rather than from a disk file.

The actual handling of the '-' filename is done by the session;
this just adds a signal handler to stop reporting, and turns off
interference by the pager.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-4-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c0555642 13-Apr-2010 Ian Munsie <imunsie@au.ibm.com>

perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR()

Parsing an option from the command line with OPT_BOOLEAN on a
bool data type would not work on a big-endian machine due to the
manner in which the boolean was being cast into an int and
incremented. For example, running 'perf probe --list' on a
PowerPC machine would fail to properly set the list_events bool
and would therefore print out the usage information and
terminate.

This patch makes OPT_BOOLEAN work as expected with a bool
datatype. For cases where the original OPT_BOOLEAN was
intentionally being used to increment an int each time it was
passed in on the command line, this patch introduces OPT_INCR
with the old behaviour of OPT_BOOLEAN (the verbose variable is
currently the only such example of this).

I have reviewed every use of OPT_BOOLEAN to verify that a true
C99 bool was passed. Where integers were used, I verified that
they were only being used for boolean logic and changed them to
bools to ensure that they would not be mistakenly used as ints.
The major exception was the verbose variable which now uses
OPT_INCR instead of OPT_BOOLEAN.

Signed-off-by: Ian Munsie <imunsie@au.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: <stable@kernel.org> # NOTE: wont apply to .3[34].x cleanly, please backport
Cc: Git development list <git@vger.kernel.org>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: WANG Cong <amwang@redhat.com>
Cc: Thiago Farina <tfransosi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1271147857-11604-1-git-send-email-imunsie@au.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 533c46c3 03-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf newt: Pass the input_name to perf_session__browse_hists

So that it can use it in the 'perf annotate' command line, otherwise
it'll use the default and not the specified -i filename passed to 'perf
report'.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e206d556 03-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move the prototypes in util/string.h to util.h

So that we avoid conflict with libc's string.h header.

Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2aefa4f7 01-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: sort_dimension__add shouldn't die

Propagate error instead.

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ad5b217b 02-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Remove one more exit() call from library code

Return NULL instead and make the caller propagate the error.

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b9fb9304 02-Apr-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf hist: Only allocate callchain_node if processing callchains

The struct callchain_node size is 120 bytes, that are never used when
there are no callchains or '-g none' is specified, so conditionally
allocate it, reducing sizeof(struct hist_entry) from 210 bytes to only
96, greatly speeding the non-callchain processing.

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5f4d3f88 26-Mar-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add progress bars

For when we are processing the events and inserting the entries in the
browser.

Experimentation here: naming "ui_something" we may be treading into
creating a TUI/GUI set of routines that can then be implemented in terms
of multiple backends.

Also the time it takes for adding things to the "browser" takes, visually
(I guess I should do some profiling here ;-) ), more time than for
processing the events...

That means we probably need to create a custom hist_entry browser, so
that we reuse the structures we have in place instead of duplicating
them in newt.

But progress was made and at least we can see something while long files
are being loaded, that must be one of UI 101 bullet points :-)

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b3c9ac08 24-Mar-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf callchains: Store the map together with the symbol

We need this to know where a symbol in a callchain came from,
for various reasons, among them precise annotation from a
TUI/GUI tool.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1269459619-982-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 301fde27 22-Mar-2010 Frederic Weisbecker <fweisbec@gmail.com>

perf: Fix orphan callchain branches

Callchains have markers inside their capture to tell we
enter a context (kernel, user, ...).

Those are not displayed in the callchains but they are
incidentally an active part of the radix tree where
callchains are stored, just like any other address.

If we have the two following callchains:

addr1 -> addr2 -> user context -> addr3
addr1 -> addr2 -> user context -> addr4
addr1 -> addr2 -> addr 5

This is pretty common if addr1 and addr2 are part of an
interrupt path, addr3 and addr4 are user addresses and
addr5 is a kernel non interrupt path.

This will be stored as follows in the tree:

addr1
addr2
/ \
/ addr5
user context
/ \
addr3 addr4

But we ignore the context markers in the report, hence
the addr3 and addr4 will appear as orphan branches:

|--28.30%-- hrtimer_interrupt
| smp_apic_timer_interrupt
| apic_timer_interrupt
| | <------------- here, no parent!
| | |
| | |--11.11%-- 0x7fae7bccb875
| | |
| | |--11.11%-- 0xffffffffff60013b
| | |
| | |--11.11%-- __pthread_mutex_lock_internal
| | |
| | |--11.11%-- __errno_location

Fix this by removing the context markers when we process the
callchains to the tree.

Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1269274173-20328-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f9224c5c 11-Mar-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Implement initial UI using newt

Newt has widespread availability and provides a rather simple
API as can be seen by the size of this patch.

The work needed to support it will benefit other frontends too.

In this initial patch it just checks if the output is a tty, if
not it falls back to the previous behaviour, also if
newt-devel/libnewt-dev is not installed the previous behaviour
is maintaned.

Pressing enter on a symbol will annotate it, ESC in the
annotation window will return to the report symbol list.

More work will be done to remove the special casing in
color_fprintf, stop using fmemopen/FILE in the printing of
hist_entries, etc.

Also the annotation doesn't need to be done via spawning "perf
annotate" and then browsing its output, we can do better by
calling directly the builtin-annotate.c functions, that would
then be moved to tools/perf/util/annotate.c and shared with perf
top, etc

But lets go by baby steps, this patch already improves perf
usability by allowing to quickly do annotations on symbols from
the report screen and provides a first experimentation with
libnewt/TUI integration of tools.

Tested on RHEL5 and Fedora12 X86_64 and on Debian PARISC64 to
browse a perf.data file collected on a Fedora12 x86_64 box.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268349164-5822-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# cbbc79a5 04-Mar-2010 Eric B Munson <ebmunson@us.ibm.com>

perf report: Add multiple event support

Perf report does not handle multiple events being reported, even
though perf record stores them properly on disk. This patch
addresses that issue by adding the logic to perf report to use
the event stream id that is saved by record and the new data
structures to seperate the event streams and report them
individually.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# eefc465c 04-Mar-2010 Eric B Munson <ebmunson@us.ibm.com>

perf session: Change perf_session post processing functions to take histogram tree

Now that report can store historgrams for multiple events we
need to be able to do the post processing work for each
histogram. This patch changes the post processing functions so
that they can be called individually for each event's histogram.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
[ Guarantee bisectabilty by fixing up builtin-report.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d403d0ac 04-Mar-2010 Eric B Munson <ebmunson@us.ibm.com>

perf session: Change add_hist_entry to take the tree root instead of session

In order to minimize the impact of storing multiple events in a
report this function will now take the root of the histogram
tree so that the logic for selecting the proper tree can be
inserted before the call.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b7ae11b3 21-Jan-2010 Yong Wang <yong.y.wang@linux.intel.com>

perf report: Fix segmentation fault when running with '-g none'

Segmentation fault occurs when running perf report with '-g
none'.

Reported-by: Austin Zhang <austin.zhang@intel.com>
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100122014750.GA4111@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 0d755034 13-Jan-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Don't cast RIP to pointers

Since they can come from another architecture with bigger
pointers, i.e. processing a 64-bit perf.data on a 32-bit arch.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1263478990-8200-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b9a63b9b 05-Jan-2010 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Fix --no-call-chain option handling

To avoid the funny:

[root@doppio ~]# perf record -a -f sleep 2s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.334 MB perf.data (~14572 samples) ]
[root@doppio ~]# perf report --no-call-graph
selected -g but no callchain data. Did you call perf record without -g?

And fix the bug reported by peterz when we do indeed record with
callchains and then ask for a report without:

[root@doppio ~]# perf record -a -g -f sleep 2s
[root@doppio ~]# perf report --no-call-graph
Segmentation fault
[root@doppio ~]#

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1262699685-27820-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 71289be7 28-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add --hide-unresolved/-U command line option

Useful to match the 'overhead' column in 'perf report' with the
'baseline' one in 'perf diff'.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1262047716-23171-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 55aa640f 27-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Remove redundant prefix & suffix from perf_event_ops

Since now all that we have are perf event handlers, leave just
the name of the event.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-9-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f7d87444 27-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move full_paths config to symbol_conf

Now perf_event_ops has just that, event handlers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-8-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d549c769 27-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Remove sample_type_check from event_ops

This is really something tools need to do before asking for the
events to be processed, leaving perf_session__process_events to
do just that, process events.

Also add a msg parameter to perf_session__has_traces() so that
the right message can be printed, fixing a regression added by
me in the previous cset (right timechart message) and also
fixing 'perf kmem', that was not asking if 'perf kmem record'
was ran.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261957026-15580-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 0422a4fc 18-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf diff: Fix usage array, it must end with a NULL entry

Fixing this:

[acme@doppio linux-2.6-tip]$ perf diff --hell
Error: unknown option `hell'

usage: perf diff [<options>] [old_file] [new_file]
Segmentation fault
[acme@doppio linux-2.6-tip]$

Also go over the other such arrays to check if they all were OK,
they are, but there were some minor changes to do like making
one static and renaming another to match the command it refers
to.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261161358-23959-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b5b60fda 18-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Make events_stats u64 to avoid overflow on 32-bit arches

Pekka Enberg reported weird percentages in perf report. It
turns out we are overflowing a 32-bit variables in struct
events_stats on 32-bit architectures.

Before:

[acme@ana linux-2.6-tip]$ perf report -i pekka.perf.data 2> /dev/null | head -10
281.96% Xorg b710a561 [.] 0x000000b710a561
140.15% Xorg [kernel] [k] __initramfs_end
51.56% metacity libgobject-2.0.so.0.2000.1 [.] 0x00000000026e46
35.12% evolution libcairo.so.2.10800.6 [.] 0x000000000203bd
33.84% metacity libpthread-2.9.so [.] 0x00000000007a3d

After:

[acme@ana linux-2.6-tip]$ perf report -i pekka.perf.data 2> /dev/null | head -10
30.04% Xorg b710a561 [.] 0x000000b710a561
14.93% Xorg [kernel] [k] __initramfs_end
5.49% metacity libgobject-2.0.so.0.2000.1 [.] 0x00000000026e46
3.74% evolution libcairo.so.2.10800.6 [.] 0x000000000203bd
3.61% metacity libpthread-2.9.so [.] 0x00000000007a3d

Reported-by: Pekka Enberg <penberg@cs.helsinki.fi>
Tested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261148583-20395-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c351c281 16-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf diff: Use perf_session__fprintf_hists just like 'perf record'

That means that almost everything you can do with 'perf report'
can be done with 'perf diff', for instance:

$ perf record -f find / > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.062 MB perf.data (~2699
samples) ] $ perf record -f find / > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.062 MB perf.data (~2687
samples) ] perf diff | head -8
9.02% +1.00% find libc-2.10.1.so [.] _IO_vfprintf_internal
2.91% -1.00% find [kernel] [k] __kmalloc
2.85% -1.00% find [kernel] [k] ext4_htree_store_dirent
1.99% -1.00% find [kernel] [k] _atomic_dec_and_lock
2.44% find [kernel] [k] half_md4_transform
$

So if you want to zoom into libc:

$ perf diff --dsos libc-2.10.1.so | head -8
37.34% find [.] _IO_vfprintf_internal
10.34% find [.] __GI_memmove
8.25% +2.00% find [.] _int_malloc
5.07% -1.00% find [.] __GI_mempcpy
7.62% +2.00% find [.] _int_free
$

And if there were multiple commands using libc, it is also
possible to aggregate them all by using --sort symbol:

$ perf diff --dsos libc-2.10.1.so --sort symbol | head -8
37.34% [.] _IO_vfprintf_internal
10.34% [.] __GI_memmove
8.25% +2.00% [.] _int_malloc
5.07% -1.00% [.] __GI_mempcpy
7.62% +2.00% [.] _int_free
$

The displacement column now is off by default, to use it:

perf diff -m --dsos libc-2.10.1.so --sort symbol | head -8
37.34% [.] _IO_vfprintf_internal
10.34% [.] __GI_memmove
8.25% +2.00% [.] _int_malloc
5.07% -1.00% +2 [.] __GI_mempcpy
7.62% +2.00% -1 [.] _int_free
$

Using -t/--field-separator can be used for scripting:

$ perf diff -t, -m --dsos libc-2.10.1.so --sort symbol | head -8
37.34, , ,[.] _IO_vfprintf_internal
10.34, , ,[.] __GI_memmove
8.25,+2.00%, ,[.] _int_malloc
5.07,-1.00%, +2,[.] __GI_mempcpy
7.62,+2.00%, -1,[.] _int_free
6.99,+1.00%, -1,[.] _IO_new_file_xsputn
1.89,-2.00%, +4,[.] __readdir64
$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260978567-550-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 125c4fad 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Fix cut'n'paste error recently introduced

Introduced in:

d599db3fc5dd4f1e8432fdbc6d899584b25f4dff
"perf report: Generalize perf_session__fprintf_hists()"

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260973631-28035-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 3e6055ab 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move perf report specific hits out of perf_session__fprintf_hists

Those don't make sense for tools such as 'perf diff'.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260973631-28035-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4ecf84d0 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move hist entries printing routines from perf report

Will be used in other tools such as 'perf diff'.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260973631-28035-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d599db3f 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Generalize perf_session__fprintf_hists()

Pull it out of builtin-report - further changes will be made and it
will then be reusable in 'perf diff' as well.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c410a338 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Move symbol filtering to event__preprocess_sample()

So that --dsos, --comm, --symbols can bem used in more tools,
like in perf diff:

$ perf record -f find / > /dev/null
$ perf record -f find / > /dev/null
$ perf diff --dsos /lib64/libc-2.10.1.so | head -5
1 +22392124 /lib64/libc-2.10.1.so _IO_vfprintf_internal
2 +6410655 /lib64/libc-2.10.1.so __GI_memmove
3 +1 +9192692 /lib64/libc-2.10.1.so _int_malloc
4 -1 -15158605 /lib64/libc-2.10.1.so _int_free
5 +45669 /lib64/libc-2.10.1.so _IO_new_file_xsputn
$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 655000e7 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Adopt the strlists for dso, comm

Will be used in perf diff too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 75be6cf4 15-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Make symbol_conf global

This simplifies a lot of functions, less stuff to be done by
tool writers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c249a4ce 14-Dec-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Make symbol_conf static

perf top, report and annotate all define their own symbol_conf,
it should be static.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260843322-6602-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c8829c7a 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf util: Remove setup_sorting dups

And it is also needed by 'perf diff'.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260828571-3613-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f823e441 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Event statistics also are per session

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260810361-22828-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c019879b 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Adopt the sample_type variable

All tools had copies, and perf diff would have to specify a
sample_type_check method just for copying it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260807780-19377-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# a328626b 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Adopt resolve_callchain

This is really a generic library routine, so declutter
builtin-report.c a bit by moving it to the library.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260807780-19377-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4e4f06e4 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move the hist_entries rb tree to perf_session

As we'll need to sort multiple times for multiple perf sessions,
so that we can then do a diff.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260803439-16783-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b9bf0892 14-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: No need for three rb_trees for sorting hist entries

All hist entries are in only one of them, so use just one and a
temporary rb_root while sorting/collapsing.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260797831-11220-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4aa65636 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move kmaps to perf_session

There is still some more work to do to disentangle map creation
from DSO loading, but this happens only for the kernel, and for
the early adopters of perf diff, where this disentanglement
matters most, we'll be testing different kernels, so no problem
here.

Further clarification: right now we create the kernel maps for
the various modules and discontiguous kernel text maps when
loading the DSO, we should do it as a two step process, first
creating the maps, for multiple mappings with the same DSO
store, then doing the dso load just once, for the first hit on
one of the maps sharing this DSO backing store.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b3165f41 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Move the global threads list to perf_session

So that we can process two perf.data files.

We still need to add a O_MMAP mode for perf_session so that we
can do all the mmap stuff in it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# ec913369 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Reduce the number of parms to perf_session__process_events

By having the cwd/cwdlen in the perf_session struct and
full_paths in perf_event_ops.

Now its just a matter of passing the ops.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 13df45ca 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Register the idle thread in perf_session__process_events

No need for all tools to register it and then immediately call
perf_session__process_events.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 301a0b02 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Ditch register_perf_file_handler

Pass the event_ops to perf_session__process_events instead.

Also move the event_ops definition to session.h, starting to
move things around to their right place, trimming the many
unneeded headers we have.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d8f66248 13-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Pass the perf_session to the event handling operations

They will need it to get the right threads list, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260741029-4430-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 94c744b6 11-Dec-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Introduce perf_session class

That does all the initialization boilerplate, opening the file,
reading the header, checking if it is valid, etc.

And that will as well have the threads list, kmap (now) global
variable, etc, so that we can handle two (or more) perf.data files
describing sessions to compare.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260573842-19720-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 180f95e2 06-Dec-2009 OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

perf: Make common SAMPLE_EVENT parser

Currently, sample event data is parsed for each commands, and it
is assuming that the data is not including other data. (E.g.
timechart, trace, etc. can't parse the event if it has
PERF_SAMPLE_CALLCHAIN)

So, even if we record the superset data for multiple commands at
a time, commands can't parse. etc.

To fix it, this makes common sample event parser, and use it to
parse sample event correctly. (PERF_SAMPLE_READ is unsupported
for now though, it seems to be not using.)

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <87hbs48imv.fsf@devron.myhome.or.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 1ed091c4 27-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Consolidate symbol resolving across all tools

Now we have a very high level routine for simple tools to
process IP sample events:

int event__preprocess_sample(const event_t *self,
struct addr_location *al,
symbol_filter_t filter)

It receives the event itself and will insert new threads in the
global threads list and resolve the map and symbol, filling all
this info into the new addr_location struct, so that tools like
annotate and report can further process the event by creating
hist_entries in their specific way (with or without callgraphs,
etc).

It in turn uses the new next layer function:

void thread__find_addr_location(struct thread *self, u8 cpumode,
enum map_type type, u64 addr,
struct addr_location *al,
symbol_filter_t filter)

This one will, given a thread (userspace or the kernel kthread
one), will find the given type (MAP__FUNCTION now, MAP__VARIABLE
too in the near future) at the given cpumode, taking vdsos into
account (userspace hit, but kernel symbol) and will fill all
these details in the addr_location given.

Tools that need a more compact API for plain function
resolution, like 'kmem', can use this other one:

struct symbol *thread__find_function(struct thread *self, u64 addr,
symbol_filter_t filter)

So, to resolve a kernel symbol, that is all the 'kmem' tool
needs, its just a matter of calling:

sym = thread__find_function(kthread, addr, NULL);

The 'filter' parameter is needed because we do lazy
parsing/loading of ELF symtabs or /proc/kallsyms.

With this we remove more code duplication all around, which is
always good, huh? :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259346563-12568-12-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 62daacb5 27-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Reorganize event processing routines, lotsa dups killed

While implementing event__preprocess_sample, that will do all of
the symbol lookup in one convenient function, I noticed that
util/process_event.[ch] were not being used at all, then started
looking if there were other functions that could be shared
and...

All those functions really don't need to receive offset + head,
the only thing they did was common to all of them, so do it at
one place instead.

Stats about number of each type of event processed now is done
in a central place.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259346563-12568-11-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 95011c60 27-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Support multiple symtabs in struct thread

Making the routines that were so far specific to the kernel maps
useful for all threads.

This is done by making the kernel maps be contained in a kernel
"thread".

This gets the kernel specific routines closer to the userspace
counterparts, which will help in reducing the boilerplate for
resolving a symbol, as will be demonstrated in the next patches.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259346563-12568-9-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6a4694a4 27-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Better support for multiple symbol tables per dso

By using an array of rb_roots in struct dso we can, from a
struct map instance to get the right symbol rb_tree more easily.
This way we can have just one symbol lookup method for struct
map instances, map__find_symbol, instead of one per symtab type
(functions, variables).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259346563-12568-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# fcf1203a 24-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Rename find_symbol routines to find_function

Paving the way for supporting variable in adition to function
symbols.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259074912-5924-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b32d133a 23-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Simplify symbol machinery setup

And also express its configuration toggles via a struct.

Now all one has to do is to call symbol__init(NULL) if the
defaults are OK, or pass a struct symbol_conf pointer with the
desired configuration.

If a tool uses kernel_maps__find_symbol() to look at the kernel
and modules mappings for a symbol but didn't call symbol__init()
first, that will generate a one time warning too, alerting the
subcommand developer that symbol__init() must be called.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259071517-3242-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# e74328d3 24-Nov-2009 John Kacur <jkacur@redhat.com>

perf tools: Use common process_event functions for annotate and report

Prevent bit-rot in perf-annotate by using common functions where
possible. Here we create process_events.[ch] to hold the common
functions.

Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
LKML-Reference: <1259073301-11506-3-git-send-email-jkacur@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# cc612d81 23-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Look for vmlinux in more places

Now that we can check the buildid to see if it really matches,
this can be done safely:

vmlinux
/boot/vmlinux
/boot/vmlinux-<uts.release>
/lib/modules/<uts.release>/build/vmlinux
/usr/lib/debug/lib/modules/%s/vmlinux

More can be added - if you know about distros that put the
vmlinux somewhere else please let us know.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1259001550-8194-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 50e5095a 22-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Do map lookups in resolve_callchain()

Bug introduced in 439d473b4777de510e1322168ac6f2f377ecd5bc,
making the initial map be used for all IPs, so that symbols
outside this initial map would either be erroneously resolved or
not resolve at all.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1258909162-28496-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c338aee8 20-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Do lazy symtab loading for the kernel & modules too

Just like we do with the other DSOs. This also simplifies the
kernel_maps setup process, now all that the tools need to do is
to call kernel_maps__init and the maps for the modules and
kernel will be created, then, later, when
kernel_maps__find_symbol() is used, it will also call
maps__find_symbol that already checks if the symtab was loaded,
loading it if needed.

Now if one does 'perf top --hide_kernel_symbols' we won't pay
the price of loading the (many) symbols in /proc/kallsyms or
vmlinux.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1258757489-5978-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6671cb16 20-Nov-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf symbols: Remove unrelated actions from dso__load_kernel_sym

It should just load kernel symbols, not load the list of
modules. There are more stuff to move to other routines, but
lets do it in several steps.

End goal is to be able to defer symbol table loading till we
find a hit for that map address range. So that the kernel &
modules are handled just like all the other DSOs in the system.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1258757489-5978-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 00a192b3 30-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Simplify the symbol priv area mechanism

Before we were storing this in the DSO, but in fact this is a
property of the 'symbol' class, not something that will vary
among DSOs, so move it to a global variable and initialize it
using the existing symbol__init routine.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256927305-4628-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 66bd8424 28-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Delay loading symtabs till we hit a map with it

So that we can have a quicker start on perf top and even
speedups in the other tools, as we can have maps with no hits,
so no need to load its symtabs.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256773881-4191-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6beba7ad 21-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Unify debug messages mechanisms

We were using eprintf in some places, that looks at a global
'verbose' level, and at other places passing a 'v' parameter to
specify the verbosity level, unify it by introducing
pr_{err,warning,debug,etc}, just like in the kernel.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# a4fb581b 22-Oct-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Bind callchains to the first sort dimension column

Currently, the callchains are displayed using a constant left
margin. So depending on the current sort dimension
configuration, callchains may appear to be well attached to the
first sort dimension column field which is mostly the case,
except when the first dimension of sorting is done by comm,
because these are right aligned.

This patch binds the callchain to the first letter in the first
column, whatever type of column it is (dso, comm, symbol).
Before:

0.80% perf [k] __lock_acquire
__lock_acquire
lock_acquire
|
|--58.33%-- _spin_lock
| |
| |--28.57%-- inotify_should_send_event
| | fsnotify
| | __fsnotify_parent

After:

0.80% perf [k] __lock_acquire
__lock_acquire
lock_acquire
|
|--58.33%-- _spin_lock
| |
| |--28.57%-- inotify_should_send_event
| | fsnotify
| | __fsnotify_parent

Also, for clarity, we don't put anymore the callchain as is but:

- If we have a top level ancestor in the callchain, start it
with a first ascii hook.

Before:

0.80% perf [kernel] [k] __lock_acquire
__lock_acquire
lock_acquire
|
|--58.33%-- _spin_lock
| |
| |--28.57%-- inotify_should_send_event
| | fsnotify
[..] [..]

After:

0.80% perf [kernel] [k] __lock_acquire
|
--- __lock_acquire
lock_acquire
|
|--58.33%-- _spin_lock
| |
| |--28.57%-- inotify_should_send_event
| | fsnotify
[..] [..]

- Otherwise, if we have several top level ancestors, then
display these like we did before:

1.69% Xorg
|
|--21.21%-- vread_hpet
| 0x7fffd85b46fc
| 0x7fffd85b494d
| 0x7f4fafb4e54d
|
|--15.15%-- exaOffscreenAlloc
|
|--9.09%-- I830WaitLpRing

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1256246604-17156-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# af0a6fa4 22-Oct-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Fix missing top level callchain

While recursively printing the branches of each callchains, we
forget to display the root. It is never printed.

Say we have:

symbol
f1
f2
|
-------- f3
| f4
|
---------f5
f6

Actually we never see that, instead it displays:

symbol
|
--------- f3
| f4
|
--------- f5
f6

However f1 is always the same than "symbol" and if we are
sorting by symbols first then "symbol", f1 and f2 will be well
aligned like in the above example, so displaying f1 looks
redundant here.

But if we are sorting by something else first (dso, comm,
etc...), displaying f1 doesn't look redundant but rather
necessary because the symbol is not well aligned anymore with
its callchain:

comm dso symbol
f1
f2
|
--------- [...]

And we want the callchain to be obvious.
So we fix the bug by printing the root branch, but we also
filter its first entry if we are sorting by symbols first.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1256246604-17156-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# e4204992 20-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Use the sym_priv_size area for the histogram

We have this sym_priv_size mechanism for attaching private areas
to struct symbol entries but annotate wasn't using it, adding
private areas to struct symbol in addition to a ->priv pointer.

Scrap all that and use the sym_priv_size mechanism.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256055940-19511-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f39cdf25 17-Oct-2009 Julia Lawall <julia@diku.dk>

perf tools: Move dereference after NULL test

In each case, if the NULL test on thread is needed, then the
dereference should be after the NULL test.

A simplified version of the semantic match that detects this
problem is as follows (http://coccinelle.lip6.fr/):

// <smpl>
@match exists@
expression x, E;
identifier fld;
@@

* x->fld
... when != \(x = E\|&x\)
* x == NULL
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
LKML-Reference: <Pine.LNX.4.64.0910170842500.9213@ask.diku.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d5b889f2 13-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move threads & last_match to threads.c

This was just being copy'n'pasted all over.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091013141629.GD21809@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# da21d1b5 07-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Up the verbose level for some really verbose stuff

Like printing every symbol created.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 016e92fb 06-Oct-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Unify perf.data mapping and events handling

This librarizes the perf.data file mapping and handling in various
perf tools, roughly reducing the amount of code and fixing the
places that mmap from beginning of the file whereas we want to mmap
from the beginning of the data, leading to page fault because the
mmap window is too small since the trace info are written in the
file too.

TODO:

- convert perf timechart too

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20091007104729.GD5043@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c3b32fcb 05-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Use kernel_maps__find_symbol as fallback to find vdsos, etc

In resolve_symbol, as we're moving to breaking the kernel symbols
list per address ranges, i.e. kernel linking sections, so that we
don't have a big kernel_map that in its range covers what is in the
modules.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# ec218fc4 03-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove show_mask bitmask

As it was not being exposed via any command line and with --dsos/--comms
we can do this and even more, like asking for just kernel + some module:

[root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\]
--vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15
# Samples: 619669
#
# Overhead Command Shared Object Symbol
# ........ ............... ............. ......
#
7.12% swapper [kernel] [k] read_hpet
6.86% init [kernel] [k] read_hpet
6.22% init [kernel] [k] mwait_idle_with_hints
5.34% swapper [kernel] [k] mwait_idle_with_hints
3.01% firefox [kernel] [.] vread_hpet
2.14% Xorg [drm] [k] drm_clflush_pages
2.09% pidgin [kernel] [.] vread_hpet
1.58% npviewer.bin [kernel] [.] vread_hpet
1.37% swapper [kernel] [k] hpet_next_event
1.23% Xorg [kernel] [k] read_hpet
[root@doppio linux-2.6-tip]#

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091003233048.GA30535@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9735abf1 03-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move hist_entry__add common code to hist.c

Now perf report and annotate do the callgraph/hit processing in
their specialized hist_entry__add functions.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 439d473b 02-Oct-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Rewrite and improve support for kernel modules

Representing modules as struct map entries, backed by a DSO, etc,
using /proc/modules to find where the module is loaded.

DSOs now can have a short and long name, so that in verbose mode we
can show exactly which .ko or vmlinux image was used.

As kernel modules now are a DSO separate from the kernel, we can
ask for just the hits for a particular set of kernel modules, just
like we can do with shared libraries:

[root@doppio linux-2.6-tip]# perf report -n --vmlinux
/home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15
84.58% 13266 Xorg [k] drm_clflush_pages
4.02% 630 Xorg [k] trace_kmalloc.clone.0
3.95% 619 Xorg [k] drm_ioctl
2.07% 324 Xorg [k] drm_addbufs
1.68% 263 Xorg [k] drm_gem_close_ioctl
0.77% 120 Xorg [k] drm_setmaster_ioctl
0.70% 110 Xorg [k] drm_lastclose
0.68% 106 Xorg [k] drm_open
0.54% 85 Xorg [k] drm_mm_search_free
[root@doppio linux-2.6-tip]#

Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko
would have the same effect. Allowing specifying just 'drm.ko' is left
for another patch.

Processing kallsyms so that per kernel module struct map are
instantiated was also left for another patch. That will allow
removing the module name from each of its symbols.

struct symbol was reduced by removing the ->module backpointer and
moving it (well now the map) to struct symbol_entry in perf top,
that is its only user right now.

The total linecount went down by ~500 lines.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 3d1d07ec 28-Sep-2009 John Kacur <jkacur@redhat.com>

perf tools: Put common histogram functions in their own file

Move histogram related functions into their own files (hist.c and
hist.h) and make use of them in builtin-annotate.c and
builtin-report.c.

Signed-off-by: John Kacur <jkacur@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# dd68ada2 24-Sep-2009 John Kacur <jkacur@redhat.com>

perf tools: Create util/sort.and use it

Create util/sort.[ch] and move common functionality for
builtin-report.c and builtin-annotate.c there, and make use of it.

Signed-off-by: John Kacur <jkacur@redhat.com>
LKML-Reference: <alpine.LFD.2.00.0909241758390.11383@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# cdd6c482 20-Sep-2009 Ingo Molnar <mingo@elte.hu>

perf: Do the big rename: Performance Counters -> Performance Events

Bye-bye Performance Counters, welcome Performance Events!

In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.

Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.

All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)

The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.

Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.

User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)

This patch has been generated via the following script:

FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES

for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done

FILES=$(find . -name perf_event.*)

sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES

... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.

Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.

( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )

Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 5b447a6a 30-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Librarize idle thread registration

Librarize register_idle_thread() used by annotate and report.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1251693921-6579-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 119e7a22 27-Aug-2009 Pierre Habouzit <pierre.habouzit@intersec.com>

perf tools: do not complain if root is owning perf.data

This improves patch fa6963b24 so that perf.data stuff that has
been dumped as root can be read (annotate/report) by a user
without the use of the --force.

Rationale is that root has plenty of ways to screw us (usually)
that do not require twisted schemes involving specially
crafting a perf.data.

Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
LKML-Reference: <20090827075902.GF19653@laphroaig.corp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# fa6963b2 19-Aug-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf tools: Check perf.data owner

Add an owner check to opening perf.data files and a switch to
silence it.

Because perf-report/perf-annotate are binary parsers reading
another users' perf.data file could be a security risk if the
file were explicitly engineered to trigger bugs in the parser
(we hope of course there are non such bugs, but you never
know).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090819092023.896648538@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4273b005 18-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Fix comm column adjusting

The librarization of the thread helpers between annotate and
report lost some perf report specifics.

This patch fixes the thread comm column adjusting that has
been omitted during this export.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250604226-6852-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6ede59c4 17-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Fix spelling mistake in callchain error

While running perf report -g in a perf.data file that hasn't
been recorded in callchain mode, the error reported has a
spelling issue:

./perf report -g
selected -c but no callchain data. Did you call perf record without -g?

Fix it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1250543271-8383-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 8f28827a 16-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Librarize trace_event() helper

Librarize trace_event() helper so that perf trace can use it
too. Also clean up the debug.h includes a bit.

It's not good to have it included in perf.h because it doesn't
make it flexible against other headers it may need (headers
that can also depend on perf.h and then create a recursive
header dependency).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250453149-664-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 0d3a5c88 16-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Librarize sample type and attr finding from headers

Librarize the sample type and attr fetching from perf data file
headers so that we can also use it from perf trace.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250448997-30715-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 0f25bfc8 16-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Put the show mode into the event headers files

Annotate and report share the same flags to filter events
considering their context (kernel, user, hypervisor).

Both tools have their own definitions of these flags. Factorize
them out into the event headers file.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250445414-29237-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 2cec19d9 16-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Factorize the dprintf definition

We have two users of dprintf: report and annotate. Another one
is coming with perf trace. Then factorize it into the debug
file.

While at it, rename dprintf() to dump_printf() so that it
doesn't conflicts with its libc homograph.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250443461-28130-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 83a0944f 14-Aug-2009 Ingo Molnar <mingo@elte.hu>

perf: Enable more compiler warnings

Related to a shadowed variable bug fix Valdis Kletnieks noticed
that perf does not get built with -Wshadow, which could have
helped us avoid the bug.

So enable -Wshadow and also enable the following warnings on
perf builds, in addition to the already enabled -Wall -Wextra
-std=gnu99 warnings:

-Wcast-align
-Wformat=2
-Wshadow
-Winit-self
-Wpacked
-Wredundant-decls
-Wstack-protector
-Wstrict-aliasing=3
-Wswitch-default
-Wswitch-enum
-Wno-system-headers
-Wundef
-Wvolatile-register-var
-Wwrite-strings
-Wbad-function-cast
-Wmissing-declarations
-Wmissing-prototypes
-Wnested-externs
-Wold-style-definition
-Wstrict-prototypes
-Wdeclaration-after-statement

And change/fix the perf code to build cleanly under GCC 4.3.2.

The list of warnings enablement is rather arbitrary: it's based
on my (quick) reading of the GCC manpages and trying them on
perf.

I categorized the warnings based on individually enabling them
and looking whether they trigger something in the perf build.
If i liked those warnings (i.e. if they trigger for something
that arguably could be improved) i enabled the warning.

If the warnings seemed to come from language laywers spamming
the build with tons of nuisance warnings i generally kept them
off. Most of the sign conversion related warnings were in
this category. (A second patch enabling some of the sign
warnings might be welcome - sign bugs can be nasty.)

I also kept warnings that seem to make sense from their manpage
description and which produced no actual warnings on our code
base. These warnings might still be turned off if they end up
being a nuisance.

I also left out a few warnings that are not supported in older
compilers.

[ Note that these changes might break the build on older
compilers i did not test, or on non-x86 architectures that
produce different warnings, so more testing would be welcome. ]

Reported-by: Valdis.Kletnieks@vt.edu
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6baa0a5a 13-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Factorize the thread code in a dedicated file

Factorize the thread management code used by perf-annotate and
perf-report in dedicated source and header files.

v2: pass last_match by address so that it can actually be
modified.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250245313-6995-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 8fd101f2 12-Aug-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Don't show unresolved DSOs and symbols when -S/-d is used

We're interested in just those symbols/DSOs, so filter out the
unresolved ones.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20090812211957.GE3495@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 94a24752 11-Aug-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Show the tid too in -D

This made it easier to find the firefox threading related
bug.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090811192138.GE18061@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 66e274f3 12-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Factorize the map helpers

Factorize the dso mapping helpers into a single purpose common file
"util/map.c"

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>


# 1fe2c106 12-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Factorize the event structure definitions in a single file

Factorize the multiple definition of the events structures into a
single util/event.h file.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>


# cd84c2ac 12-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Factorize high level dso helpers

Factorize multiple definitions of high level dso helpers into the
symbol source file.

The side effect is a general export of the verbose and eprintf
debugging helpers into a new file dedicated to debugging purposes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>


# 9f866697 10-Aug-2009 Brice Goglin <Brice.Goglin@inria.fr>

perf report: Add raw displaying of per-thread counters

If --pretty=raw is given to perf report -T, it now displays one
line per-thread per-counter with the raw event id added.

We get:
# PID TID Name Raw Count
18608 18609 cache-misses 28e 416744
18608 18609 cache-references 28f 6456792
18608 18608 cache-misses 28e 448219
18608 18608 cache-references 28f 7270244
instead of:

# PID TID cache-misses cache-references
18608 18609 416744 6456792
18608 18608 448219 7270244

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Acked-by: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4A802008.5050409@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 8d513270 07-Aug-2009 Brice Goglin <Brice.Goglin@inria.fr>

perf report: Fix and improve the displaying of per-thread event counters

Improve and fix the handling of per-thread counter stats
recorded via perf record -s. Previously we only displayed
it in debug printouts (-D) and even that output was hard
to disambiguate.

I moved everything to utils/values.[ch] so that we may reuse
it in perf stat.

We get something like this now:

# PID TID cache-misses cache-references
4658 4659 495581 3238779
4658 4662 498246 3236823
4658 4663 499531 3243162

Then it'll be easy to add --pretty=raw to display a single line per thread/event.

By the way, -S was also used for --symbol... So I used -T/--thread here.

perf report: Add -T/--threads to display per-thread counter values

We get something like this now:
# PID TID cache-misses cache-references
4658 4659 495581 3238779
4658 4662 498246 3236823
4658 4663 499531 3243162

Per-thread arrays of counter values are managed in utils/values.[ch]

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 25446036 07-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode

When we filter the callchains below a given percentage, we
ignore them and the end result only shows entries that have an
upper percentage than the filter threshold.

It seems to users then that we have an imbalance in the
percentage, as if the sum inside a profiled branch doesn't
reach 100%.

Since in the past there have been real perf report bugs that
showed the same sypmtom, it would be nice to assure the user
that the data is perfect and trustable and it all sums up to
100.00%.

So fix this by displaying the remaining hits that have been
filtered but without more detail than their amount in each
branches. Example while filtering below 50%:

7.73% [k] delay_tsc
|
|--98.22%-- __const_udelay
| |
| |--86.37%-- ath5k_hw_register_timeout
| | ath5k_hw_noise_floor_calibration
| | ath5k_hw_reset
| | ath5k_reset
| | ath5k_config
| | ieee80211_hw_config
| | |
| | |--88.53%-- ieee80211_scan_work
| | | worker_thread
| | | kthread
| | | child_rip
| | --11.47%-- [...]
| --13.63%-- [...]
--1.78%-- [...]

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1249690585-9145-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b1a88349 07-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: callchain: Fix 'perf report' display to be callchain by default

If we recorded with -g option to record the callchain, right now
we require a -g option to perf report as well - and people reported
this as unnecessary complication: the user already specified -g
once, no need to require it a second time.

So if the recording includes call-chains, display the callchain by
default from perf report.

( The user can override this default using "-g none" option from
perf report. )

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 94cb9e38 06-Aug-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add debug help for the finding of symbol bugs - show the symtab origin (DSO, build-id, kernel, etc)

Used with perf report --verbose:

[acme@doppio linux-2.6-tip]$ perf report -v | head -16
5.17% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
2.56% firefox /lib64/libpthread-2.10.1.so 0x0000000000008e02 d [.] __pthread_mutex_lock_internal
1.94% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x0000000000d0af8f f [.] SearchTable
1.75% firefox [kernel] 0xffffffffff60013b k [.] vread_hpet
1.63% firefox /lib64/libpthread-2.10.1.so 0x000000000000a404 d [.] __pthread_mutex_unlock
1.47% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret
1.42% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer
1.24% firefox [kernel] 0xffffffff8102ca4a k [k] read_hpet
1.16% firefox [kernel] 0xffffffff810f3dd4 k [k] fget_light
1.11% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject
0.98% firefox /usr/lib64/firefox-3.5.2/firefox 0x000000000000dd23 b [.] arena_ralloc
[acme@doppio linux-2.6-tip]$

The new field is just after the symbol address. To help in
figuring out symbol resolution bugs.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 8f18aec5 06-Aug-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf report: Fix per task mult-counter stat reporting

Brice Goglin reported:

> I can easily sort them by thread id, but I don't know how to match
> my 4 events with each group of 4 lines.

Also report the counter id and the time running/enabled
stats (in case the counter got time-shared).

Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 1953287b 06-Aug-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf tools: Fix call-chain cumul hit based sub-total (fractal mode)

The callchain fractal mode builds each new total hits in a new
branch of profiling by using the parent's hits of the current
branch plus the hits of the children.

This is wrong, the total hits of a branch should be made of the
sum of every children hits, we must ignore the parent hits in
this scope.

This patch also fixes another mistake with the hit counting.

Now the rates are correct.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 114cfab2 05-Aug-2009 Pekka Enberg <penberg@cs.helsinki.fi>

perf report: Make --sort comm,dso,symbol the default

If you're doing performance testing, you're interested in the
symbols anyway so lets make "--sort comm,dso,symbol" the
default sort option.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: acme@redhat.com
LKML-Reference: <1249467921-10450-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7e030655 02-Aug-2009 Roel Kluin <roel.kluin@gmail.com>

perf: Fix read buffer overflow

Check whether index is within bounds before testing the element.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: a.p.zijlstra@chello.nl
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A757BCF.40101@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 27d028de 23-Jul-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf report: Update for the new FORK/EXIT events

Since FORK is now also issued for threads, detect those by
comparing the parent and child PID.

Teach it about EXIT events and ignore them.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7f453c24 21-Jul-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter: PERF_SAMPLE_ID and inherited counters

Anton noted that for inherited counters the counter-id as provided by
PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID
because each inherited counter gets its own id.

His suggestion was to always return the parent counter id, since that
is the primary counter id as exposed. However, these inherited
counters have a unique identifier so that events like
PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which
counter gets modified, which is important when trying to normalize the
sample streams.

This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD,
which is more useful anyway, since changing periods became a lot more
common than initially thought -- rendering PERF_EVENT_PERIOD the less
useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate
value, since it reports the value used to trigger the overflow,
whereas PERF_EVENT_PERIOD simply reports the requested period changed,
which might only take effect on the next cycle).

This still leaves us PERF_EVENT_THROTTLE to consider, but since that
_should_ be a rare occurrence, and linking it to a primary id is the
most useful bit to diagnose the problem, we introduce a
PERF_SAMPLE_STREAM_ID, for those few cases where the full
reconstruction is important.

[Does change the ABI a little, but I see no other way out]

Suggested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1248095846.15751.8781.camel@twins>


# 1483b19f 16-Jul-2009 Anton Blanchard <anton@samba.org>

perf_counter: Make call graph option consistent

perf record uses -g for logging call graph data but perf report
uses -c to print call graph data. Be consistent and use -g
everywhere for call graph data.

Also update the help text to reflect the current default -
fractal,0.5

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090716104817.803604373@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# e3d7e183 10-Jul-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Introduce -n/--show-nr-samples

[acme@doppio pahole]$ perf report -ns comm,dso,symbol -d /lib64/libc-2.10.1.so -C pahole | head -17
21.94% 32101 [.] _int_malloc
20.10% 29402 [.] __GI_strcmp
16.77% 24533 [.] __tsearch
12.61% 18450 [.] malloc_consolidate
6.42% 9394 [.] _int_free
6.28% 9191 [.] __tfind
4.56% 6678 [.] __GI___libc_free
4.46% 6520 [.] _IO_vfprintf_internal
2.59% 3786 [.] __malloc
1.17% 1716 [.] __GI_memcpy
[acme@doppio pahole]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1247325517-12272-5-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 021191b3 10-Jul-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Make the output more compact

When we filter by column content we may end up with a column
that has the same value for all the lines. So remove that
column and tell its unique value on the top, as a comment.

Example:

[acme@doppio pahole]$ perf report --sort comm,dso,symbol -d ./build/libdwarves.so.1.0.0 -C pahole | head -15
# dso: ./build/libdwarves.so.1.0.0
# comm: pahole
# Samples: 58409
#
# Overhead Symbol
# ........ ......
#
20.93% [.] tag__recode_dwarf_type
14.94% [.] namespace__recode_dwarf_types
10.38% [.] cu__table_add_tag
6.69% [.] __die__process_tag
5.05% [.] die__process_function
4.70% [.] list__for_all_tags
3.68% [.] tag__init
3.48% [.] die__create_new_parameter
[acme@doppio pahole]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1247325517-12272-3-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 60c1baf1 10-Jul-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Tidy up reporting of symbols not found

Always printing the level info about if it is in the kernel,
hypervisor or userspace as that is in the hist_entry.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1247325517-12272-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 52d422de 10-Jul-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Adjust column width to the values sampled

Auto-adjust column width of perf report output to the
longest occuring string length.

Example:

[acme@doppio pahole]$ perf report --sort comm,dso,symbol | head -13

12.79% pahole /usr/lib64/libdw-0.141.so [.] __libdw_find_attr
8.90% pahole /lib64/libc-2.10.1.so [.] _int_malloc
8.68% pahole /usr/lib64/libdw-0.141.so [.] __libdw_form_val_len
8.15% pahole /lib64/libc-2.10.1.so [.] __GI_strcmp
6.80% pahole /lib64/libc-2.10.1.so [.] __tsearch
5.54% pahole ./build/libdwarves.so.1.0.0 [.] tag__recode_dwarf_type
[acme@doppio pahole]$

[acme@doppio pahole]$ perf report --sort comm,dso,symbol -d /lib64/libc-2.10.1.so | head -10

21.92% pahole /lib64/libc-2.10.1.so [.] _int_malloc
20.08% pahole /lib64/libc-2.10.1.so [.] __GI_strcmp
16.75% pahole /lib64/libc-2.10.1.so [.] __tsearch
[acme@doppio pahole]$

Also add these extra options to control the new behaviour:

-w, --field-width

Force each column width to the provided list, for large terminal
readability.

-t, --field-separator:

Use a special separator character and don't pad with spaces, replacing
all occurances of this separator in symbol names (and other output) with
a '.' character, that thus it's the only non valid separator.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090711014728.GH3452@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 805d127d 04-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf report: Add "Fractal" mode output - support callchains with relative overhead rate

The current callchain displays the overhead rates as absolute:
relative to the total overhead.

This patch provides relative overhead percentage, in which each
branch of the callchain tree is a independant instrumentated object.

This provides a 'fractal' view of the call-chain profile: each
sub-graph looks like a profile in itself - relative to its parent.

You can produce such output by using the "fractal" mode
that you can abbreviate via f, fr, fra, frac, etc...

./perf report -s sym -c fractal

Example:

8.46% [k] copy_user_generic_string
|
|--52.01%-- generic_file_aio_read
| do_sync_read
| vfs_read
| |
| |--97.20%-- sys_pread64
| | system_call_fastpath
| | pread64
| |
| --2.81%-- sys_read
| system_call_fastpath
| __read
|
|--39.85%-- generic_file_buffered_write
| __generic_file_aio_write_nolock
| generic_file_aio_write
| do_sync_write
| reiserfs_file_write
| vfs_write
| |
| |--97.05%-- sys_pwrite64
| | system_call_fastpath
| | __pwrite64
| |
| --2.95%-- sys_write
| system_call_fastpath
| __write_nocancel
[...]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 94a8eb02 04-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf report: Change default callchain parameters

The default callchain parameters are set to use the flat mode and never
filter any overhead threshold of backtrace.

But flat mode is boring compared to graph mode.
Also the number of callchains may be very high if none is
filtered.

Let's change this to set the graph view and a minimum overhead of 0.5%
as default parameters.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# be903885 04-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf report: Use a modifiable string for default callchain options

If the user doesn't provide options to tune his callchain output
(ie: if he uses -c without arguments) then the default value passed
in the OPT_CALLBACK_DEFAULT() macro is used.

But it's parsed later by strtok() which will replace comma separators
to a zero. This may segfault as we are using a read-only string.

Use a modifiable one instead, and also fix the "100%" default
minimum threshold value by turning it into a 0 (output every callchains)
as it was intended in the origin.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 91b4eaea 04-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf report: Warn on callchain output request from non-callchain file

perf report segfaults while trying to handle callchains from a non
callchain data file.

Instead of a segfault, print a useful message to the user.

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 029e5b16 03-Jul-2009 Ingo Molnar <mingo@elte.hu>

perf report: Annotate variable initialization

Certain versions of GCC dont see the initialization that is done here:

builtin-report.c: In function ‘__cmd_report’:
builtin-report.c:1038: warning: ‘syms’ may be used uninitialized in this function

So annotate it with a NULL initialization.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 24b57c69 02-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf_counter tools: Display percents of hits in callchain with overhead colors

This adds the use of colors to signal at a glance the important
overhead thresholds in callchains hit rates.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246558475-10624-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 1e11fd82 02-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf_counter tools: Provide helper to print percents color

Among perf annotate, perf report and perf top, we can find the
common colored printing of percents according to the following
rules:

High overhead = > 5%, colored in red
Mid overhead = > 0.5%, colored in green
Low overhead = < 0.5%, default color

Factorize these multiple checks in a single function named
percent_color_fprintf() and also provide a get_percent_color()
for sites which print percentages and other things at the same
time.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246558475-10624-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# c20ab37e 02-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf_counter tools: Set the minimum percent for callchains to be displayed

Callchains output may become a burden on a trace because even
rarely hit site are exposed. This can be too much information.

Let the user set a threshold as a minimum percent of hits using
the new pattern for the -c option:

-c mode,min_percent

Example:

$ perf report -s sym -c flat,4

8.25% [k] copy_user_generic_string
4.19%
copy_user_generic_string
generic_file_aio_read
do_sync_read
vfs_read
sys_pread64
system_call_fastpath
pread64

5.39% [k] search_by_key
4.63% 0x00000000009e0a
2.36% [k] memcpy_c
[...]

$ perf report -s sym -c graph,2

8.25% [k] copy_user_generic_string
|
|--4.31%-- generic_file_aio_read
| do_sync_read
| vfs_read
| |
| --4.19%-- sys_pread64
| system_call_fastpath
| pread64
|
--3.24%-- generic_file_buffered_write
__generic_file_aio_write_nolock
generic_file_aio_write
do_sync_write
reiserfs_file_write
vfs_write
|
--3.14%-- sys_pwrite64
system_call_fastpath
__pwrite64

5.39% [k] search_by_key
|
--2.23%-- reiserfs_update_sd_size

4.63% 0x00000000009e0a

2.36% [k] memcpy_c
[...]

You can also omit it and it will default to 0.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246558475-10624-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4eb3e478 02-Jul-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf report: Add support for callchain graph output

Currently, the printing of callchains is done in a single
vertical level, this is the "flat" mode:

8.25% [k] copy_user_generic_string
4.19%
copy_user_generic_string
generic_file_aio_read
do_sync_read
vfs_read
sys_pread64
system_call_fastpath
pread64

This patch introduces a new "graph" mode which provides a
hierarchical output of factorized paths recursively sorted:

8.25% [k] copy_user_generic_string
|
|--4.31%-- generic_file_aio_read
| do_sync_read
| vfs_read
| |
| |--4.19%-- sys_pread64
| | system_call_fastpath
| | pread64
| |
| --0.12%-- sys_read
| system_call_fastpath
| __read
|
|--3.24%-- generic_file_buffered_write
| __generic_file_aio_write_nolock
| generic_file_aio_write
| do_sync_write
| reiserfs_file_write
| vfs_write
| |
| |--3.14%-- sys_pwrite64
| | system_call_fastpath
| | __pwrite64
| |
| --0.10%-- sys_write
[...]

The command line has then changed.

By providing the -c option, the callchain will output in the
flat mode by default.

But you can override it:

perf report -c graph

or

perf report -c flat

You can also pass the abreviated mode:

perf report -c g

or

perf report -c gra

will both make use of the graph mode.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246550301-8954-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 42976487 02-Jul-2009 Mike Galbraith <efault@gmx.de>

perf_counter tools: Enable kernel module symbol loading in tools

Add the -m/--modules option to perf report and perf annotate,
which enables live module symbol/image loading. To be used
with -k/--vmlinux.

(Also give perf annotate a -P/--full-paths option.)

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246514986.13293.48.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6cfcc53e 02-Jul-2009 Mike Galbraith <efault@gmx.de>

perf_counter tools: Connect module support infrastructure to symbol loading infrastructure

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246514916.13293.46.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9974f496 02-Jul-2009 Mike Galbraith <efault@gmx.de>

perf_counter tools: Make symbol loading consistently return number of loaded symbols

perf_counter tools: Make symbol loading consistently return number of loaded symbols.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246514758.13293.42.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 5da50258 01-Jul-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf_counter tools: Share list.h with the kernel

The copy we were using came from another copy I did for the dwarves
(pahole) package, that came from the kernel years ago.

The only function that is used by the perf tools and that isn't in the
kernel is list_del_range, that I'm leaving in the perf tools only for
now.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090701174608.GA5823@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 43cbcd8a 30-Jun-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf_counter tools: Share rbtree.with the kernel

The tools/perf/util/rbtree.c copy already drifted by three
csets:

4b324126e0c6c3a5080ca3ec0981e8766ed6f1ee
4c60117811171d867d4f27f17ea07d7419d45dae
16c047add3ceaf0ab882e3e094d1ec904d02312d

So remove the copy and use the lib/rbtree.c directly, sharing
the source code while still generating a separate object file,
since tools/perf uses a far more agressive -O6 switch.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090701152837.GG15682@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f37a291c 30-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf_counter tools: Add more warnings and fix/annotate them

Enable -Wextra. This found a few real bugs plus a number
of signed/unsigned type mismatches/uncleanlinesses. It
also required a few annotations

All things considered it was still worth it so lets try with
this enabled for now.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 88a69dfb 01-Jul-2009 Ingo Molnar <mingo@elte.hu>

perf report: Fix HV bit mismerge

Fix:

builtin-report.c: In function ‘hist_entry__add’:
builtin-report.c:1015: error: case label not within a switch statement
builtin-report.c:1017: error: break statement not within loop or switch

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4424961a 30-Jun-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf_counter tools: Resolve symbols in callchains

This patch resolves the names, when possible, of each ip
present in the callchains while using the -c option with perf
report.

Example:

5.40% [k] __d_lookup
5.37%
perf_callchain
perf_counter_overflow
intel_pmu_handle_irq
perf_counter_nmi_handler
notifier_call_chain
atomic_notifier_call_chain
notify_die
do_nmi
nmi
do_lookup
__link_path_walk
path_walk
do_path_lookup
user_path_at
sys_faccessat
sys_access
system_call_fastpath
0x7fb609846f77

0.01%
perf_callchain
perf_counter_overflow
intel_pmu_handle_irq
perf_counter_nmi_handler
notifier_call_chain
atomic_notifier_call_chain
notify_die
do_nmi
nmi
do_lookup
__link_path_walk
path_walk
do_path_lookup
user_path_at
sys_faccessat

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246419315-9968-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# fb9c8188 30-Jun-2009 Anton Blanchard <anton@samba.org>

perf report: Add hypervisor dso

Add a dso for hypervisor samples. We don't get any symbol
information on the ppc64 hypervisor but this at least gives
us a high level summary of the time spent in there.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
LKML-Reference: <20090630230141.182536873@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# d8db1b57 30-Jun-2009 Anton Blanchard <anton@samba.org>

perf report: Fix reporting of hypervisor

PERF_EVENT_MISC_* is not a bitmask, so we have to mask and compare.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
LKML-Reference: <20090630230141.088394681@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7bec7a91 30-Jun-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add --symbols parameter

So that we can filter by symbol name.

The 'pfunct' utility in the 'dwarves' package can be used to
create a file with the functions one wants.

Example:

[acme@doppio pahole]$ pfunct /usr/lib/debug/usr/lib64/libdw-0.141.so.debug | grep dwarf > /tmp/dwarf.symbols
[acme@doppio pahole]$ wc -l /tmp/dwarf.symbols
93 /tmp/dwarf.symbols
[acme@doppio pahole]$ head -3 /tmp/dwarf.symbols
dwfl_addrdwarf
dwfl_module_getdwarf
dwfl_getdwarf
[acme@doppio pahole]$ perf report --sort comm,dso,symbol --comms pahole --dsos /usr/lib64/libdw-0.141.so --symbols file:///tmp/dwarf.symbols

33.99% pahole /usr/lib64/libdw-0.141.so [.] dwarf_tag
29.07% pahole /usr/lib64/libdw-0.141.so [.] dwarf_decl_file
27.71% pahole /usr/lib64/libdw-0.141.so [.] dwarf_getsrclines
4.54% pahole /usr/lib64/libdw-0.141.so 0x00000000007400
3.93% pahole /usr/lib64/libdw-0.141.so [.] dwarf_decl_line
0.46% pahole /usr/lib64/libdw-0.141.so [.] dwarf_getlocation
0.18% pahole /usr/lib64/libdw-0.141.so [.] __libdwarf_next_prime
0.13% pahole /usr/lib64/libdw-0.141.so [.] dwarf_diecu

[acme@doppio pahole]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246399282-20934-4-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# cc8b88b1 30-Jun-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add --comms parameter

So that we can filter by comm. Symbols in other comms won't be
accounted for.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246399282-20934-3-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 25903407 30-Jun-2009 Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add --dsos parameter

So that we can filter by dso. Symbols in other dsos won't be
accounted for.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246399282-20934-2-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f55c5552 26-Jun-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf report: Print sorted callchains per histogram entries

Use the newly created callchains radix tree to gather the chains stats
from the recorded events and then print the callchains for all of them,
sorted by hits, using the "-c" parameter with perf report.

Example:

66.15% [k] atm_clip_exit
63.08%
0xffffffffffffff80
0xffffffff810196a8
0xffffffff810c14c8
0xffffffff8101a79c
0xffffffff810194f3
0xffffffff8106ab7f
0xffffffff8106abe5
0xffffffff8106acde
0xffffffff8100d94b
0xffffffff8153e7ea
[...]

1.54%
0xffffffffffffff80
0xffffffff810196a8
0xffffffff810c14c8
0xffffffff8101a79c
[...]

Symbols are not yet resolved.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246026481-8314-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 8cb76d99 26-Jun-2009 Frederic Weisbecker <fweisbec@gmail.com>

perf_counter tools: Prepare a small callchain framework

We plan to display the callchains depending on some user-configurable
parameters.

To gather the callchains stats from the recorded stream in a fast way,
this patch introduces an ad hoc radix tree adapted for callchains and also
a rbtree to sort these callchains once we have gathered every events
from the stream.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246026481-8314-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# e9ea2fde 24-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf-report: Add bare minimum PERF_EVENT_READ parsing

Provide the basic infrastructure to provide per task stats.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# e6e18ec7 25-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter: Rework the sample ABI

The PERF_EVENT_READ implementation made me realize we don't
actually need the sample_type int the output sample, since
we already have that in the perf_counter_attr information.

Therefore, remove the PERF_EVENT_MISC_OVERFLOW bit and the
event->type overloading, and imply put counter overflow
samples in a PERF_EVENT_SAMPLE type.

This also fixes the issue that event->type was only 32-bit
and sample_type had 64 usable bits.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7c6a1c65 25-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Rework the file format

Create a structured file format that includes the full
perf_counter_attr and all its relevant counter IDs so that
the reporting program has full information.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 3d906ef1 23-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Handle overlapping MMAP events

Martin Schwidefsky reported "perf report" symbol resolution
problems on S390.

Since we only report MMAP, not MUNMAP, we have to deal with
overlapping maps.

We used to simply throw out the old map on the assumption whole
maps got unmapped. This obviously doesn't deal with partial
unmaps. However it appears some dynamic linkers do fancy
partial unmaps (s390), so do something more elaborate and
truncate the old maps, only removing them when they've been
fully covered.

This resolves (part of) the S390 symbol resolution problems.

Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 520f2c34 22-Jun-2009 Peter Zijlstra <peterz@infradead.org>

perf report: Output more symbol related debug data

Print more symbol relocation related info under -vv.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# eadc84cc 19-Jun-2009 Frederic Weisbecker <fweisbec@gmail.com>

perfcounter: Handle some IO return values

Building perfcounter tools raises the following warnings:

builtin-record.c: In function ‘atexit_header’:
builtin-record.c:464: erreur: ignoring return value of ‘pwrite’, declared with attribute warn_unused_result
builtin-record.c: In function ‘__cmd_record’:
builtin-record.c:503: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result

builtin-report.c: In function ‘__cmd_report’:
builtin-report.c:1403: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result

This patch handles these IO return values.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1245456100-5477-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9cffa8d5 19-Jun-2009 Paul Mackerras <paulus@samba.org>

perf_counter tools: Define and use our own u64, s64 etc. definitions

On 64-bit powerpc, __u64 is defined to be unsigned long rather than
unsigned long long. This causes compiler warnings every time we
print a __u64 value with %Lx.

Rather than changing __u64, we define our own u64 to be unsigned long
long on all architectures, and similarly s64 as signed long long.
For consistency we also define u32, s32, u16, s16, u8 and s8. These
definitions are put in a new header, types.h, because these definitions
are needed in util/string.h and util/symbol.h.

The main change here is the mechanical change of __[us]{64,32,16,8}
to remove the "__". The other changes are:

* Create types.h
* Include types.h in perf.h, util/string.h and util/symbol.h
* Add types.h to the LIB_H definition in Makefile
* Added (u64) casts in process_overflow_event() and print_sym_table()
to kill two remaining warnings.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: benh@kernel.crashing.org
LKML-Reference: <19003.33494.495844.956580@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# f5970550 18-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Add a data file header

Add a data file header so we can transfer data between record and report.

LKML-Reference: <new-submission>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 2a0a50fe 18-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter: Update userspace callchain sampling uses

Update the tools to reflect the new callchain sampling format.

LKML-Reference: <new-submission>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b8e6d829 18-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf report: Filter to parent set by default

Make it easier to use parent filtering - default to a filtered
output. Also add the parent column so that we get collapsing but
dont display it by default.

add --no-exclude-other to override this.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 9d91a6f7 18-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Handle lost events

Make use of the new ->data_tail mechanism to tell kernel-space
about user-space draining the data stream. Emit lost events
(and display them) if they happen.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# a73c7d84 18-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Add and use isprint()

Introduce isprint() to print out raw event dumps to ASCII, etc.

(This is an extension to upstream Git's ctype.c.)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
[ removed openssl.h inclusion from util.h - it leaked ctype.h ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 7522060c 18-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf report: Add validation of call-chain entries

Add boundary checks for call-chain events. In case of corrupted
entries we could crash otherwise.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# b25bcf2f 17-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf report: Tidy up the "--parent <regex>" and "--sort parent" call-chain features

Instead of the ambigious 'call' naming use the much more
specific 'parent' naming:

- rename --call <regex> to --parent <regex>

- rename --sort call to --sort parent

- rename [unmatched] to [other] - to signal that this is not
an error but the inverse set

Also add pagefaults to the default parent-symbol pattern too,
as it's a 'syscall overhead category' in a sense.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 5aa75a0f 15-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Replace isprint() with issane()

The Git utils came with a ctype replacement that doesn't provide
isprint(). Add a replacement.

Solves a build bug on certain distros.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 6e7d6fdc 17-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf report: Add --sort <call> --call <$regex>

Implement sorting by callchain symbols, --sort <call>.

It will create a new column which will show a match to
--call $regex or "[unmatched]".

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# e2eae0f5 15-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf report: Fix 32-bit printf format

Yong Wang reported the following compiler warning:

builtin-report.c: In function 'process_overflow_event':
builtin-report.c:984: error: cast to pointer from integer of different size

Which happens because we try to print ->ips[] out with a limited
format, losing the high 32 bits. Print it out using %016Lx instead.

Reported-by: Yong Wang <yong.y.wang@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 3dfabc74 15-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf report: Add per system call overhead histogram

Take advantage of call-graph percounter sampling/recording to
display a non-trivial histogram: the true, collapsed/summarized
cost measurement, on a per system call total overhead basis:

aldebaran:~/linux/linux/tools/perf> ./perf record -g -a -f ~/hackbench 10
aldebaran:~/linux/linux/tools/perf> ./perf report -s symbol --syscalls | head -10
#
# (3536 samples)
#
# Overhead Symbol
# ........ ......
#
40.75% [k] sys_write
40.21% [k] sys_read
4.44% [k] do_nmi
...

This is done by accounting each (reliable) call-chain that chains back
to a given system call to that system call function.

[ So in the above example we can see that hackbench spends about 40% of
its total time somewhere in sys_write() and 40% somewhere in
sys_read(), the rest of the time is spent in user-space. The time
is not spent in sys_write() _itself_ but in one of its many child
functions. ]

Or, a recording of a (source files are already in the page-cache) kernel build:

$ perf record -g -m 512 -f -- make -j32 kernel
$ perf report -s s --syscalls | grep '\[k\]' | grep -v nmi

4.14% [k] do_page_fault
1.20% [k] sys_write
1.10% [k] sys_open
0.63% [k] sys_exit_group
0.48% [k] smp_apic_timer_interrupt
0.37% [k] sys_read
0.37% [k] sys_execve
0.20% [k] sys_mmap
0.18% [k] sys_close
0.14% [k] sys_munmap
0.13% [k] sys_poll
0.09% [k] sys_newstat
0.07% [k] sys_clone
0.06% [k] sys_newfstat
0.05% [k] sys_access
0.05% [k] schedule

Shows the true total cost of each syscall variant that gets used
during a kernel build. This profile reveals it that pagefaults are
the costliest, followed by read()/write().

An interesting detail: timer interrupts cost 0.5% - or 0.5 seconds
per 100 seconds of kernel build-time. (this was done with HZ=1000)

The summary is done in 'perf report', i.e. in the post-processing
stage - so once we have a good call-graph recording, this type of
non-trivial high-level analysis becomes possible.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 3efa1cc9 14-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf record/report: Add call graph / call chain profiling

Add the first steps of call-graph profiling:

- add the -c (--call-graph) option to perf record
- parse the call-graph record and printout out under -D (--dump-trace)

The call-graph data is not put into the histogram yet, but it
can be seen that it's being processed correctly:

0x3ce0 [0x38]: event: 35
.
. ... raw event: size 56 bytes
. 0000: 23 00 00 00 05 00 38 00 d4 df 0e 81 ff ff ff ff #.....8........
. 0010: 60 0b 00 00 60 0b 00 00 03 00 00 00 01 00 02 00 `...`..........
. 0020: d4 df 0e 81 ff ff ff ff a0 61 ed 41 36 00 00 00 .........a.A6..
. 0030: 04 92 e6 41 36 00 00 00 .a.A6..
.
0x3ce0 [0x38]: PERF_EVENT (IP, 5): 2912: 0xffffffff810edfd4 period: 1
... chain: u:2, k:1, nr:3
..... 0: 0xffffffff810edfd4
..... 1: 0x3641ed61a0
..... 2: 0x3641e69204
... thread: perf:2912
...... dso: [kernel]

This shows a 3-entry call-graph: with 1 kernel-space and two user-space
entries

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 8465b050 14-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf report: Print out raw events in hexa

Print out events in hexa dump format, when -D is specified:

0x4868 [0x48]: event: 1
.
. ... raw event: size 72 bytes
. 0000: 01 00 00 00 00 00 48 00 d4 72 00 00 d4 72 00 00 ......H..r...r.
. 0010: 00 00 40 f2 3e 00 00 00 00 30 01 00 00 00 00 00 ..@.>....0.....
. 0020: 00 00 00 00 00 00 00 00 2f 75 73 72 2f 6c 69 62 ......../usr/li
. 0030: 36 34 2f 6c 69 62 65 6c 66 2d 30 2e 31 34 31 2e 64/libelf-0.141
. 0040: 73 6f 00 00 00 00 00 00 f-0.141
.
0x4868 [0x48]: PERF_EVENT_MMAP 29396: [0x3ef2400000(0x13000) @ (nil)]: /usr/lib64/libelf-0.141.so

This helps the debugging of mis-parsing of data files, and helps
the addition of new sample/trace formats.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 729ff5e2 11-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf_counter tools: Clean up u64 usage

A build error slipped in:

builtin-report.c: In function ‘hist_entry__fprintf’:
builtin-report.c:711: error: format ‘%12d’ expects type ‘int’, but argument 3 has type ‘uint64_t’

Because we got a bit sloppy with those types. uint64_t really sucks,
because there's no printf format for it. So standardize on __u64
instead - for all types that go to or come from the ABI (which is __u64),
or for values that need to be large enough even on 32-bit.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# ea1900e5 10-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Normalize data using per sample period data

When we use variable period sampling, add the period to the sample
data and use that to normalize the samples.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 4502d77c 10-Jun-2009 Peter Zijlstra <a.p.zijlstra@chello.nl>

perf_counter tools: Small frequency related fixes

Create the counter in a disabled state and only enable it after we
mmap() the buffer, this allows us to see the first few samples (and
observe the frequency ramp).

Furthermore, print the period in the verbose report.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# aefcf37b 08-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf_counter tools: Standardize color printing

The rule is:

- high overhead: red
- mid overhead: green
- low overhead: normal (white/black)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 80d496be 08-Jun-2009 Pekka Enberg <penberg@cs.helsinki.fi>

perf report: Add support for profiling JIT generated code

This patch adds support for profiling JIT generated code to 'perf
report'. A JIT compiler is required to generate a "/tmp/perf-$PID.map"
symbols map that is parsed when looking and displaying symbols.

Thanks to Peter Zijlstra for his help with this patch!

Example "perf report" output with the Jato JIT:

#
# (40311 samples)
#
# Overhead Command Shared Object Symbol
# ........ ................ ......................... ......
#
97.80% jato /tmp/perf-11915.map [.] Fibonacci.fib(I)I
0.56% jato 00000000b7fa023b 0x000000b7fa023b
0.45% jato /tmp/perf-11915.map [.] Fibonacci.main([Ljava/lang/String;)V
0.38% jato [kernel] [k] get_page_from_freelist
0.06% jato [kernel] [k] kunmap_atomic
0.05% jato ./jato [.] utf8Hash
0.04% jato ./jato [.] executeJava
0.04% jato ./jato [.] defineClass

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: a.p.zijlstra@chello.nl
Cc: acme@redhat.com
LKML-Reference: <Pine.LNX.4.64.0906082111590.12407@melkki.cs.Helsinki.FI>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# a14832ff 07-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf report: Print more expressive message in case of file open error

Before:

$ perf report
failed to open file: No such file or directory

After:

$ perf report
failed to open file: perf.data (try 'perf record' first)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# 86470930 06-Jun-2009 Ingo Molnar <mingo@elte.hu>

perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/

Several people have suggested that 'perf' has become a full-fledged
tool that should be moved out of Documentation/. Move it to the
(new) tools/ directory.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>