#
5b9e4eef |
|
12-Jan-2024 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Display data size on pipe mode Currently pipe mode doesn't set the file size and it results in a misleading message of 0 data size at the end. Although it might miss some accounting for pipe header or more, just displaying the data size would reduce the possible confusion. Before: $ perf record -o- perf test -w noploop | perf report -i- -q --percent-limit=1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] <====== (here) 99.58% perf perf [.] noploop After: $ perf record -o- perf test -w noploop | perf report -i- -q --percent-limit=1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.229 MB - ] 99.46% perf perf [.] noploop Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20240112231340.779469-1-namhyung@kernel.org
|
#
57c8f107 |
|
18-Jan-2024 |
Yang Jihong <yangjihong1@huawei.com> |
perf data: Minor code style alignment cleanup Minor code style alignment cleanup for perf_data__switch() and perf_data__write(). No functional change. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240119040304.3708522-4-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
#
02f9b50e |
|
18-Jan-2024 |
Yang Jihong <yangjihong1@huawei.com> |
perf record: Check conflict between '--timestamp-filename' option and pipe mode before recording In pipe mode, no need to switch perf data output, therefore, '--timestamp-filename' option should not take effect. Check the conflict before recording and output WARNING. In this case, the check pipe mode in perf_data__switch() can be removed. Before: # perf record --timestamp-filename -o- perf test -w noploop | perf report -i- --percent-limit=1 # To display the perf.data header info, please use --header/--header-only options. # [ perf record: Woken up 1 times to write data ] [ perf record: Dump -.2024011812110182 ] # # Total Lost Samples: 0 # # Samples: 4K of event 'cycles:P' # Event count (approx.): 2176784359 # # Overhead Command Shared Object Symbol # ........ ....... .................... ...................................... # 97.83% perf perf [.] noploop # # (Tip: Print event counts in CSV format with: perf stat -x,) # After: # perf record --timestamp-filename -o- perf test -w noploop | perf report -i- --percent-limit=1 WARNING: --timestamp-filename option is not available in pipe mode. # To display the perf.data header info, please use --header/--header-only options. # [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] # # Total Lost Samples: 0 # # Samples: 4K of event 'cycles:P' # Event count (approx.): 2185575421 # # Overhead Command Shared Object Symbol # ........ ....... ..................... ............................................. # 97.75% perf perf [.] noploop # # (Tip: Profiling branch (mis)predictions with: perf record -b / perf report) # Fixes: ecfd7a9c044e ("perf record: Add '--timestamp-filename' option to append timestamp to output file name") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240119040304.3708522-3-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
#
aff10a16 |
|
18-Jan-2024 |
Yang Jihong <yangjihong1@huawei.com> |
perf record: Fix possible incorrect free in record__switch_output() perf_data__switch() may not assign a legal value to 'new_filename'. In this case, 'new_filename' uses the on-stack value, which may cause a incorrect free and unexpected result. Fixes: 03724b2e9c45 ("perf record: Allow to limit number of reported perf.data files") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240119040304.3708522-2-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
#
7bbe8f00 |
|
06-Jan-2024 |
Sun Haiyong <sunhaiyong@loongson.cn> |
perf tools: Fix calloc() arguments to address error introduced in gcc-14 the definition of calloc is as follows: void *calloc(size_t nmemb, size_t size); number of members is in the first parameter and the size is in the second parameter. Fix error messages on gcc 14 20240102: error: 'calloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args] Committer notes: I noticed this on fedora 40 and rawhide. Signed-off-by: Sun Haiyong <sunhaiyong@loongson.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240106094129.3337057-1-siyanteng@loongson.cn Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7d1405c7 |
|
06-Dec-2023 |
Ian Rogers <irogers@google.com> |
perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event Reduce from PERF_SAMPLE_MAX_SIZE to "sizeof(*lost) + session->machines.host.id_hdr_size". Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231207021627.1322884-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a61f89bf |
|
14-Dec-2023 |
Kan Liang <kan.liang@linux.intel.com> |
perf top: Uniform the event name for the hybrid machine It's hard to distinguish the default cycles events among hybrid PMUs. For example, $ perf top Available samples 385 cycles:P 903 cycles:P The other tool, e.g., perf record, uniforms the event name and adds the hybrid PMU name before opening the event. So the events can be easily distinguished. Apply the same methodology for the perf top as well. The evlist__uniquify_name() will be invoked by both record and top. Move it to util/evlist.c With the patch: $ perf top Available samples 148 cpu_atom/cycles:P/ 1K cpu_core/cycles:P/ Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Hector Martin <marcan@marcan.st> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231214144612.1092028-2-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5805c825 |
|
28-Nov-2023 |
Ian Rogers <irogers@google.com> |
libperf cpumap: Add for_each_cpu() that skips the "any CPU" case When iterating CPUs in a CPU map it is often desirable to skip the "any CPU" (aka dummy) case. Add a helper for this and use in builtin-record. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: André Almeida <andrealmeid@igalia.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231129060211.1890454-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
030ac3ca |
|
27-Nov-2023 |
Ian Rogers <irogers@google.com> |
perf record: Be lazier in allocating lost samples buffer Wait until a lost sample occurs to allocate the lost samples buffer, often the buffer isn't necessary. This saves a 64kb allocation and 5.3kb of peak memory consumption. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
eb2eac0c |
|
20-Nov-2023 |
Ian Rogers <irogers@google.com> |
perf evsel: Fallback to "task-clock" when not system wide When the "cycles" event isn't available evsel will fallback to the "cpu-clock" software event. "task-clock" is similar to "cpu-clock" but only runs when the process is running. Falling back to "cpu-clock" when not system wide leads to confusion, by falling back to "task-clock" it is hoped the confusion is less. Pass the target to determine if "task-clock" is more appropriate. Update a nearby comment and debug string for the change. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ajay Kaher <akaher@vmware.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Makhalov <amakhalov@vmware.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231121000420.368075-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5940a20a |
|
02-Nov-2023 |
Ian Rogers <irogers@google.com> |
perf mmap: Lazily initialize zstd streams to save memory when not using it Zstd streams create dictionaries that can require significant RAM, especially when there is one per-CPU. Tools like 'perf record' won't use the streams without the -z option, and so the creation of the streams is pure overhead. Switch to creating the streams on first use. Committer notes: ssize_t comes from sys/types.h, size_t from stddef.h. This worked on glibc as stdlib.h includes both, but not on musl libc. So do what 'man size_t' says and include sys/types.h and stddef.h instead of stdlib.h Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231102175735.2272696-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1a27fc01 |
|
02-Nov-2023 |
Ian Rogers <irogers@google.com> |
perf record: Lazy load kernel symbols Commit 5b7ba82a75915e73 ("perf symbols: Load kernel maps before using") changed it so that loading a kernel DSO would cause the symbols for the DSO to be eagerly loaded. For 'perf record' this is overhead as the symbols won't be used. Add a field to 'struct symbol_conf' to control the behavior and disable it for 'perf record' and 'perf inject'. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231102175735.2272696-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
74b4f3ec |
|
03-Sep-2023 |
Yang Jihong <yangjihong1@huawei.com> |
perf record: Track sideband events for all CPUs when tracing selected CPUs User space tasks can migrate between CPUs, we need to track side-band events for all CPUs. The specific scenarios are as follows: CPU0 CPU1 perf record -C 0 start taskA starts to be created and executed -> PERF_RECORD_COMM and PERF_RECORD_MMAP events only deliver to CPU1 ...... | migrate to CPU0 | Running on CPU0 <----------/ ... perf record -C 0 stop Now perf samples the PC of taskA. However, perf does not record the PERF_RECORD_COMM and PERF_RECORD_MMAP events of taskA. Therefore, the comm and symbols of taskA cannot be parsed. The solution is to record sideband events for all CPUs when tracing selected CPUs. Because this modifies the default behavior, add related comments to the perf record man page. The sys_perf_event_open invoked is as follows: # perf --debug verbose=3 record -e cpu-clock -C 1 true <SNIP> Opening: cpu-clock ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0 (PERF_COUNT_SW_CPU_CLOCK) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID|LOST disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 5 Opening: dummy:u ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0x9 (PERF_COUNT_SW_DUMMY) { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|IDENTIFIER read_format ID|LOST inherit 1 exclude_kernel 1 exclude_hv 1 mmap 1 comm 1 task 1 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 7 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 9 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 10 sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 11 sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 12 sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 13 sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 14 <SNIP> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20230904023340.12707-5-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1285ab30 |
|
03-Sep-2023 |
Yang Jihong <yangjihong1@huawei.com> |
perf record: Move setting tracking events before record__init_thread_masks() User space tasks can migrate between CPUs, so when tracing selected CPUs, sideband for all CPUs is needed. In this case set the cpu map of the evsel to all online CPUs. This may modify the original cpu map of the evlist. Therefore, need to check whether the preceding scenario exists before record__init_thread_masks(). Dummy tracking has been set in record__open(), move it before record__init_thread_masks() and add a helper for unified processing. The sys_perf_event_open invoked is as follows: # perf --debug verbose=3 record -e cpu-clock -D 100 true <SNIP> Opening: cpu-clock ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0 (PERF_COUNT_SW_CPU_CLOCK) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD|IDENTIFIER read_format ID|LOST disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 10318 cpu 0 group_fd -1 flags 0x8 = 5 sys_perf_event_open: pid 10318 cpu 1 group_fd -1 flags 0x8 = 6 sys_perf_event_open: pid 10318 cpu 2 group_fd -1 flags 0x8 = 7 sys_perf_event_open: pid 10318 cpu 3 group_fd -1 flags 0x8 = 9 sys_perf_event_open: pid 10318 cpu 4 group_fd -1 flags 0x8 = 10 sys_perf_event_open: pid 10318 cpu 5 group_fd -1 flags 0x8 = 11 sys_perf_event_open: pid 10318 cpu 6 group_fd -1 flags 0x8 = 12 sys_perf_event_open: pid 10318 cpu 7 group_fd -1 flags 0x8 = 13 Opening: dummy:u ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0x9 (PERF_COUNT_SW_DUMMY) { sample_period, sample_freq } 1 sample_type IP|TID|TIME|IDENTIFIER read_format ID|LOST disabled 1 inherit 1 exclude_kernel 1 exclude_hv 1 mmap 1 comm 1 enable_on_exec 1 task 1 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ sys_perf_event_open: pid 10318 cpu 0 group_fd -1 flags 0x8 = 14 sys_perf_event_open: pid 10318 cpu 1 group_fd -1 flags 0x8 = 15 sys_perf_event_open: pid 10318 cpu 2 group_fd -1 flags 0x8 = 16 sys_perf_event_open: pid 10318 cpu 3 group_fd -1 flags 0x8 = 17 sys_perf_event_open: pid 10318 cpu 4 group_fd -1 flags 0x8 = 18 sys_perf_event_open: pid 10318 cpu 5 group_fd -1 flags 0x8 = 19 sys_perf_event_open: pid 10318 cpu 6 group_fd -1 flags 0x8 = 20 sys_perf_event_open: pid 10318 cpu 7 group_fd -1 flags 0x8 = 21 <SNIP> 'perf test' needs to update base-record & system-wide-dummy attr expected values for test-record-C0: 1. Because a dummy sideband event is added to the sampling of specified CPUs. When evlist contains evsel of different sample_type, evlist__config() will change the default PERF_SAMPLE_ID bit to PERF_SAMPLE_IDENTIFICATION bit. The attr sample_type expected value of base-record and system-wide-dummy in test-record-C0 needs to be updated. 2. The perf record uses evlist__add_aux_dummy() instead of evlist__add_dummy() to add a dummy event. The expected value of system-wide-dummy attr needs to be updated. The 'perf test' result is as follows: # ./perf test list 2>&1 | grep 'Setup struct perf_event_attr' 17: Setup struct perf_event_attr # ./perf test 17 17: Setup struct perf_event_attr : Ok Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20230904023340.12707-4-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9c95e4ef |
|
03-Sep-2023 |
Yang Jihong <yangjihong1@huawei.com> |
perf evlist: Add evlist__findnew_tracking_event() helper Currently, intel-bts, intel-pt, and arm-spe may add tracking event to the evlist. We may need to search for the tracking event for some settings. Therefore, add evlist__findnew_tracking_event() helper. If system_wide is true, evlist__findnew_tracking_event() set the cpu map of the evsel to all online CPUs. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20230904023340.12707-3-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3d6dfae8 |
|
11-Aug-2023 |
Ian Rogers <irogers@google.com> |
perf parse-events: Remove BPF event support New features like the BPF --filter support in perf record have made the BPF event functionality somewhat redundant. As shown by commit fcb027c1a4f6 ("perf tools: Revert enable indices setting syntax for BPF map") and commit 14e4b9f4289a ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility") the BPF event support hasn't been well maintained and it adds considerable complexity in areas like event parsing, not least as '/' is a separator for event modifiers as well as in paths. This patch removes support in the event parser for BPF events and then the associated functions are removed. This leads to the removal of whole source files like bpf-loader.c. Removing support means that augmented syscalls in perf trace is broken, this will be fixed in a later commit adding support using BPF skeletons. The removal of BPF events causes an unused label warning from flex generated code, so update build to ignore it: ``` util/parse-events-flex.c:2704:1: error: label ‘find_rule’ defined but not used [-Werror=unused-label] 2704 | find_rule: /* we branch to this label when backing up */ ``` Committer notes: Extracted from a larger patch that was also removing the support for linking with libllvm and libclang, that were an alternative to using an external clang execution to compile the .c event source code into BPF bytecode. Testing it: # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Bad event or PMU Unabled to find PMU or event on a PMU of 'home' Initial error: event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Cannot find PMU `home'. Missing kernel support? Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events # Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: He Kuang <hekuang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Wang ShaoBo <bobo.shaobowang@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yonghong Song <yhs@fb.com> Cc: YueHaibing <yuehaibing@huawei.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230810184853.2860737-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
94f9eb95 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf pmus: Remove perf_pmus__has_hybrid perf_pmus__has_hybrid was used to detect when there was >1 core PMU, this can be achieved with perf_pmus__num_core_pmus that doesn't depend upon is_pmu_hybrid and PMU name comparisons. When modifying the function calls take the opportunity to improve comments, enable/simplify tests that were previously failing for hybrid but now pass and to simplify generic code. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-34-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1eaf496e |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf pmu: Separate pmu and pmus Separate and hide the pmus list in pmus.[ch]. Move pmus functionality out of pmu.[ch] into pmus.[ch] renaming pmus functions which were prefixed perf_pmu__ to perf_pmus__. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-28-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
597a4276 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf pmu: Remove perf_pmu__hybrid_pmus list Rather than iterate hybrid PMUs, inhererently Intel specific, iterate all PMUs checking whether they are core. To only get hybrid cores, first call perf_pmu__has_hybrid. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-25-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b167b530 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf evlist: Reduce scope of evlist__has_hybrid Function is only used in printout, reduce scope to stat-display.c. Remove the now empty evlist-hybrid.c and evlist-hybrid.h. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7b100989 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf evlist: Remove __evlist__add_default __evlist__add_default adds a cycles event to a typically empty evlist and was extended for hybrid with evlist__add_default_hybrid, as more than 1 PMU was necessary. Rather than have dedicated logic for the cycles event, this change switches to parsing 'cycles:P' which will handle wildcarding the PMUs appropriately for hybrid. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5ac72634 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf tools: Warn if no user requested CPUs match PMU's CPUs In commit 1d3351e631fc ("perf tools: Enable on a list of CPUs for hybrid") perf on hybrid will warn if a user requested CPU doesn't match the PMU of the given event but only for hybrid PMUs. Make the logic generic for all PMUs and remove the hybrid logic. Warn if a CPU is requested that isn't present/offline for events not on the core. Warn if a CPU is requested for a core PMU, but the CPU isn't within the cpu map of that PMU. For example on a 16 (0-15) CPU system: ``` $ perf stat -e imc_free_running/data_read/,cycles -C 16 true WARNING: A requested CPU in '16' is not supported by PMU 'uncore_imc_free_running_1' (CPUs 0-15) for event 'imc_free_running/data_read/' WARNING: A requested CPU in '16' is not supported by PMU 'uncore_imc_free_running_0' (CPUs 0-15) for event 'imc_free_running/data_read/' WARNING: A requested CPU in '16' is not supported by PMU 'cpu' (CPUs 0-15) for event 'cycles' Performance counter stats for 'CPU(s) 16': <not supported> MiB imc_free_running/data_read/ <not supported> cycles 0.000575312 seconds time elapsed ``` Remove evlist__fix_hybrid_cpus that previously produced the warnings and also perf_pmu__cpus_match that worked with evlist__fix_hybrid_cpus to change CPU maps for hybrid CPUs, something that is no longer necessary as CPU map propagation properly intersects user requested CPUs with the core PMU's CPU map. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8ec984d5 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf target: Remove unused hybrid value Previously this was used to modify CPU map propagation, but it is now unnecessary as map propagation ensure core PMUs only have valid PMUs in the CPU map from user requested CPUs. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
411ad22e |
|
02-May-2023 |
Ian Rogers <irogers@google.com> |
perf parse-events: Add pmu filter To support the cputype argument added to "perf stat" for hybrid it is necessary to filter events during wildcard matching. Add a scanner argument for the filter and checking it when wildcard matching. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20230502223851.2234828-30-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9a2d5178 |
|
06-May-2023 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL" This reverts commit a980755beb5aca9002e1c95ba519b83a44242b5b. We need to better polish building with BPF skels, so revert back to making it an experimental feature that has to be explicitely enabled using BUILD_BPF_SKEL=1. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4310551b |
|
14-Mar-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf bpf filter: Show warning for missing sample flags For a BPF filter to work properly, users need to provide appropriate options to enable the sample types. Otherwise the BPF program would see an invalid value (i.e. always 0) and filter won't work well. Show a warning message if sample types are missing like below. $ sudo ./perf record -e cycles --filter 'addr < 100' true Error: cycles event does not have PERF_SAMPLE_ADDR Hint: please add -d option to perf record. failed to set filter "BPF" on event cycles with 22 (Invalid argument) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
27c6f245 |
|
14-Mar-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Record dropped sample count When it uses bpf filters, event might drop some samples. It'd be nice if it can report how many samples it lost. As LOST_SAMPLES event can carry the similar information, let's use it for bpf filters. To indicate it's from BPF filters, add a new misc flag for that and do not display cpu load warnings. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a980755b |
|
10-Mar-2023 |
Ian Rogers <irogers@google.com> |
perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL BPF skeleton support is now key to a number of perf features. Rather than making it so that BPF support must be enabled for the build, make this the default and error if the build lacks a clang and libbpf that are sufficient. To avoid the error and build without BPF skeletons the NO_BPF_SKEL=1 flag can be used. Add a build-options flag to 'perf version' to enable detection of the BPF skeleton support and use this in the offcpu shell test. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Pavithra Gurushankar <gpavithrasha@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Roberto Sassu <roberto.sassu@huawei.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9d2dc632 |
|
11-Mar-2023 |
Ian Rogers <irogers@google.com> |
perf evlist: Remove nr_groups Maintaining the number of groups during event parsing is problematic and since changing to sort/regroup events can only be computed by a linear pass over the evlist. As the value is generally only used in tests, rather than hold it in a variable compute it by passing over the evlist when necessary. This change highlights that libpfm's counting of groups with a single entry disagreed with regular event parsing. The libpfm tests are updated accordingly. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cb4b9e68 |
|
01-Mar-2023 |
Changbin Du <changbin.du@huawei.com> |
perf record: Reuse target::initial_delay This just simply replace record_opts::initial_delay with target::initial_delay. Nothing else is changed. Signed-off-by: Changbin Du <changbin.du@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hui Wang <hw.huiwang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230302031146.2801588-3-changbin.du@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
07d85ba9 |
|
01-Mar-2023 |
Kan Liang <kan.liang@linux.intel.com> |
perf record: Fix "read LOST count failed" msg with sample read Hundreds of "read LOST count failed" error messages may be displayed, when the below command is launched. perf record -e '{cpu/mem-loads-aux/,cpu/event=0xcd,umask=0x1/}:S' -a According to the commit 89e3106fa25fb1b6 ("libperf: Handle read format in perf_evsel__read()"), the PERF_FORMAT_GROUP is only available for the leader. However, the record__read_lost_samples() goes through every entry of an evlist, which includes both leader and member. The member event errors out and triggers the error message. Since there may be hundreds of CPUs on a server, the message will be printed hundreds of times, which is very annoying. The message itself is correct, but the pr_err is a overkill. Other error messages in the record__read_lost_samples() are all pr_debug. To make the output format consistent, change the pr_err("read LOST count failed\n"); to pr_debug("read LOST count failed\n");. User can still get the message via -v option. Fixes: e3a23261ad06d598 ("perf record: Read and inject LOST_SAMPLES events") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230301150413.27011-1-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
91621be6 |
|
14-Feb-2023 |
Yang Jihong <yangjihong1@huawei.com> |
perf record: Fix segfault with --overwrite and --max-size When --overwrite and --max-size options of perf record are used together, a segmentation fault occurs. The following is an example: # perf record -e sched:sched* --overwrite --max-size 1K -a -- sleep 1 [ perf record: Woken up 1 times to write data ] perf: Segmentation fault Obtained 12 stack frames. ./perf/perf(+0x197673) [0x55f99710b673] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef0f) [0x7fa45f3cff0f] ./perf/perf(+0x8eb40) [0x55f997002b40] ./perf/perf(+0x1f6882) [0x55f99716a882] ./perf/perf(+0x794c2) [0x55f996fed4c2] ./perf/perf(+0x7b7c7) [0x55f996fef7c7] ./perf/perf(+0x9074b) [0x55f99700474b] ./perf/perf(+0x12e23c) [0x55f9970a223c] ./perf/perf(+0x12e54a) [0x55f9970a254a] ./perf/perf(+0x7db60) [0x55f996ff1b60] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fa45f3b2c86] ./perf/perf(+0x7dfe9) [0x55f996ff1fe9] Segmentation fault (core dumped) backtrace of the core file is as follows: (gdb) bt #0 record__bytes_written (rec=0x55f99755a200 <record>) at builtin-record.c:234 #1 record__output_max_size_exceeded (rec=0x55f99755a200 <record>) at builtin-record.c:242 #2 record__write (map=0x0, size=12816, bf=0x55f9978da2e0, rec=0x55f99755a200 <record>) at builtin-record.c:263 #3 process_synthesized_event (tool=tool@entry=0x55f99755a200 <record>, event=event@entry=0x55f9978da2e0, sample=sample@entry=0x0, machine=machine@entry=0x55f997893658) at builtin-record.c:618 #4 0x000055f99716a883 in __perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=0x55f9978928b0, machine=machine@entry=0x55f997893658, from=from@entry=0) at util/synthetic-events.c:1895 #5 0x000055f99716a91f in perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=<optimized out>, machine=machine@entry=0x55f997893658) at util/synthetic-events.c:1905 #6 0x000055f996fed4c3 in record__synthesize (tail=tail@entry=true, rec=0x55f99755a200 <record>) at builtin-record.c:1997 #7 0x000055f996fef7c8 in __cmd_record (argc=argc@entry=2, argv=argv@entry=0x7ffc67551260, rec=0x55f99755a200 <record>) at builtin-record.c:2802 #8 0x000055f99700474c in cmd_record (argc=<optimized out>, argv=0x7ffc67551260) at builtin-record.c:4258 #9 0x000055f9970a223d in run_builtin (p=0x55f997564d88 <commands+264>, argc=10, argv=0x7ffc67551260) at perf.c:330 #10 0x000055f9970a254b in handle_internal_command (argc=10, argv=0x7ffc67551260) at perf.c:384 #11 0x000055f996ff1b61 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:428 #12 main (argc=<optimized out>, argv=0x7ffc67551260) at perf.c:562 The reason is that record__bytes_written accesses the freed memory rec->thread_data, The process is as follows: __cmd_record -> record__free_thread_data -> zfree(&rec->thread_data) // free rec->thread_data -> record__synthesize -> perf_event__synthesize_id_index -> process_synthesized_event -> record__write -> record__bytes_written // access rec->thread_data We add a member variable "thread_bytes_written" in the struct "record" to save the data size written by the threads. Fixes: 6d57581659f72299 ("perf record: Add support for limit perf output file size") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/CAM9d7ci_TRrqBQVQNW8=GwakUr7SsZpYxaaty-S4bxF8zJWyqw@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7c0a6144 |
|
19-Dec-2022 |
Yang Jihong <yangjihong1@huawei.com> |
perf tools: Fix usage of the verbose variable The data type of the verbose variable is integer and can be negative, replace improperly used cases in a unified manner: 1. if (verbose) => if (verbose > 0) 2. if (!verbose) => if (verbose <= 0) 3. if (XX && verbose) => if (XX && verbose > 0) 4. if (XX && !verbose) => if (XX && verbose <= 0) Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20221220035702.188413-3-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5f8f9567 |
|
13-Dec-2022 |
Ian Rogers <irogers@google.com> |
perf evlist: Remove group option. The group option predates grouping events using curly braces added in commit 89efb029502d7f2d ("perf tools: Add support to parse event group syntax"). The --group option was retained for legacy support (in August 2012) but keeping it adds complexity. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Eelco Chaudron <echaudro@redhat.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Shaomin Deng <dengshaomin@cdjrlc.com> Cc: Stephane Eranian <eranian@google.com> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20221213232651.1269909-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
378ef0f5 |
|
05-Dec-2022 |
Ian Rogers <irogers@google.com> |
perf build: Use libtraceevent from the system Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command line variables. If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the build, don't compile in libtraceevent and libtracefs support. This also disables CONFIG_TRACE that controls "perf trace". CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles, HAVE_LIBTRACEEVENT is used in C code. Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the commands kmem, kwork, lock, sched and timechart are removed. The majority of commands continue to work including "perf test". Committer notes: Fixed up a tools/perf/util/Build reject and added: #include <traceevent/event-parse.h> to tools/perf/util/scripting-engines/trace-event-perl.c. Committer testing: $ rpm -qi libtraceevent-devel Name : libtraceevent-devel Version : 1.5.3 Release : 2.fc36 Architecture: x86_64 Install Date: Mon 25 Jul 2022 03:20:19 PM -03 Group : Unspecified Size : 27728 License : LGPLv2+ and GPLv2+ Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4 Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm Build Date : Fri 15 Apr 2022 10:57:01 AM -03 Build Host : buildvm-x86-05.iad2.fedoraproject.org Packager : Fedora Project Vendor : Fedora Project URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/ Bug URL : https://bugz.fedoraproject.org/libtraceevent Summary : Development headers of libtraceevent Description : Development headers of libtraceevent-libs $ Default build: $ ldd ~/bin/perf | grep tracee libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000) $ # perf trace -e sched:* --max-events 10 0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1) 0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1) 0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120) 1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120) 1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120) 0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2) 0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2) 0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120) 1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1) 1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120) # Had to tweak tools/perf/util/setup.py to make sure the python binding shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is present in CFLAGS. Building with NO_LIBTRACEEVENT=1 uncovered some more build failures: - Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y - perf-$(CONFIG_LIBTRACEEVENT) += scripts/ - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y - The python binding needed some fixups and util/trace-event.c can't be built and linked with the python binding shared object, so remove it in tools/perf/util/setup.py and exclude it from the list of dependencies in the python/perf.so Makefile.perf target. Building without libtraceevent-devel installed uncovered more build failures: - The python binding tools/perf/util/python.c was assuming that traceevent/parse-events.h was always available, which was the case when we defaulted to using the in-kernel tools/lib/traceevent/ files, now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like the other parts of it that deal with tracepoints. - We have to ifdef the rules in the Build files with CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't detect libtraceevent-devel installed in the system. Simplification here to avoid these two ways of disabling builtin-trace.c and not having CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean way. From Athira: <quote> tools/perf/arch/powerpc/util/Build -perf-y += kvm-stat.o +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o </quote> Then, ditto for arm64 and s390, detected by container cross build tests. - s/390 uses test__checkevent_tracepoint() that is now only available if HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT. Also from Athira: <quote> With this change, I could successfully compile in these environment: - Without libtraceevent-devel installed - With libtraceevent-devel installed - With “make NO_LIBTRACEEVENT=1” </quote> Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for consistency with other libraries detected in tools/perf/. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
49bd97c2 |
|
18-Nov-2022 |
Sean Christopherson <seanjc@google.com> |
perf tools: Use dedicated non-atomic clear/set bit helpers Use the dedicated non-atomic helpers for {clear,set}_bit() and their test variants, i.e. the double-underscore versions. Depsite being defined in atomic.h, and despite the kernel versions being atomic in the kernel, tools' {clear,set}_bit() helpers aren't actually atomic. Move to the double-underscore versions so that the versions that are expected to be atomic (for kernel developers) can be made atomic without affecting users that don't want atomic operations. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: James Morse <james.morse@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: alexandru elisei <alexandru.elisei@arm.com> Cc: kvm@vger.kernel.org Cc: kvmarm@lists.cs.columbia.edu Cc: kvmarm@lists.linux.dev Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20221119013450.2643007-6-seanjc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8ed28c2b |
|
24-Oct-2022 |
Ian Rogers <irogers@google.com> |
perf record: Use sig_atomic_t for signal handlers This removes undefined behavior as described in: https://wiki.sei.cmu.edu/confluence/display/c/SIG31-C.+Do+not+access+shared+objects+in+signal+handlers Suggested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221024181913.630986-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a527c2c1 |
|
18-Oct-2022 |
James Clark <james.clark@arm.com> |
perf tools: Make quiet mode consistent between tools Use the global quiet variable everywhere so that all tools hide warnings in quiet mode and update the documentation to reflect this. 'perf probe' claimed that errors are not printed in quiet mode but I don't see this so remove it from the docs. Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221018094137.783081-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
75d7ba32 |
|
18-Nov-2022 |
Sean Christopherson <seanjc@google.com> |
perf tools: Use dedicated non-atomic clear/set bit helpers Use the dedicated non-atomic helpers for {clear,set}_bit() and their test variants, i.e. the double-underscore versions. Depsite being defined in atomic.h, and despite the kernel versions being atomic in the kernel, tools' {clear,set}_bit() helpers aren't actually atomic. Move to the double-underscore versions so that the versions that are expected to be atomic (for kernel developers) can be made atomic without affecting users that don't want atomic operations. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Message-Id: <20221119013450.2643007-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
#
304f0a2f |
|
23-Oct-2022 |
Ian Rogers <irogers@google.com> |
perf record: Fix event fd races The write call may set errno which is problematic if occurring in a function also setting errno. Save and restore errno around the write call. done_fd may be used after close, clear it as part of the close and check its validity in the signal handler. Suggested-by: <gthelen@google.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anand K Mistry <amistry@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221024011024.462518-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
da406202 |
|
12-Sep-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add debug messages and comments for testing Add debug messages to enable scripts to track aspects of 'perf record' behaviour. The messages will be consumed after 'perf record' has run, with the exception of "perf record has started" which is consequently flushed. Put comments so developers know which messages are also being used by test scripts. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220912083412.7058-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d031a00a |
|
09-Sep-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Fix a segfault in record__read_lost_samples() When it fails to open events record__open() returns without setting the session->evlist. Then it gets a segfault in the function trying to read lost sample counts. You can easily reproduce it as a normal user like: $ perf record -p 1 true ... perf: Segmentation fault ... Skip the function if it has no evlist. And add more protection for evsels which are not properly initialized. Fixes: a49aa8a54e861af1 ("perf record: Read and inject LOST_SAMPLES events") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220909235024.278281-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e3a23261 |
|
01-Sep-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Read and inject LOST_SAMPLES events When there are lost samples, it can read the number of PERF_FORMAT_LOST and convert it to PERF_RECORD_LOST_SAMPLES and write to the data file at the end. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220901195739.668604-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
49c670b17 |
|
26-Aug-2022 |
Ian Rogers <irogers@google.com> |
perf record: Update use of pthread mutex Switch to the use of mutex wrappers that provide better error checking for synth_lock. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: André Almeida <andrealmeid@igalia.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Colin Ian King <colin.king@intel.com> Cc: Dario Petrillo <dario.pk1@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Fangrui Song <maskray@google.com> Cc: Hewenliang <hewenliang4@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jason Wang <wangborong@cdjrlc.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Pavithra Gurushankar <gpavithrasha@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Remi Bernon <rbernon@codeweavers.com> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tom Rix <trix@redhat.com> Cc: Weiguo Li <liwg06@foxmail.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: William Cohen <wcohen@redhat.com> Cc: Zechuan Chen <chenzechuan1@huawei.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Cc: yaowenbin <yaowenbin1@huawei.com> Link: https://lore.kernel.org/r/20220826164242.43412-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6657a099 |
|
24-Aug-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Allow multiple recording time ranges AUX area traces can produce too much data to record successfully or analyze subsequently. Add another means to reduce data collection by allowing multiple recording time ranges. This is useful, for instance, in cases where a workload produces predictably reproducible events in specific time ranges. Today we only have perf record -D <msecs> to start at a specific region, or some complicated approach using snapshot mode and external scripts sending signals or using the fifos. But these approaches are difficult to set up compared with simply having perf do it. Extend perf record option -D/--delay option to specifying relative time stamps for start stop controlled by perf with the right time offset, for instance: perf record -e intel_pt// -D 10-20,30-40 to record 10ms to 20ms into the trace and 30ms to 40ms. Example: The example workload is: $ cat repeat-usleep.c int usleep(useconds_t usec); int usage(int ret, const char *msg) { if (msg) fprintf(stderr, "%s\n", msg); fprintf(stderr, "Usage is: repeat-usleep <microseconds>\n"); return ret; } int main(int argc, char *argv[]) { unsigned long usecs; char *end_ptr; if (argc != 2) return usage(1, "Error: Wrong number of arguments!"); errno = 0; usecs = strtoul(argv[1], &end_ptr, 0); if (errno || *end_ptr || usecs > UINT_MAX) return usage(1, "Error: Invalid argument!"); while (1) { int ret = usleep(usecs); if (ret & errno != EINTR) return usage(1, "Error: usleep() failed!"); } return 0; } $ perf record -e intel_pt//u --delay 10-20,40-70,110-160 -- ./repeat-usleep 500 Events disabled Events enabled Events disabled Events enabled Events disabled Events enabled Events disabled [ perf record: Woken up 5 times to write data ] [ perf record: Captured and wrote 0.204 MB perf.data ] Terminated A dlfilter is used to determine continuous data collection (timestamps less than 1ms apart): $ cat dlfilter-show-delays.c static __u64 start_time; static __u64 last_time; int start(void **data, void *ctx) { printf("%-17s\t%-9s\t%-6s\n", " Time", " Duration", " Delay"); return 0; } int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx) { __u64 delta; if (!sample->time) return 1; if (!last_time) goto out; delta = sample->time - last_time; if (delta < 1000000) goto out2;; printf("%17.9f\t%9.1f\t%6.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0, delta / 1000000.0); out: start_time = sample->time; out2: last_time = sample->time; return 1; } int stop(void *data, void *ctx) { printf("%17.9f\t%9.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0); return 0; } The result shows the times roughly match the --delay option: $ perf script --itrace=qb --dlfilter dlfilter-show-delays.so Time Duration Delay 39215.302317300 9.7 20.5 39215.332480217 30.4 40.9 39215.403837717 49.8 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220824072814.16422-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
feff0b61 |
|
24-Aug-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Change evlist->ctl_fd to use fdarray_flag__non_perf_event Patch "perf record: Fix way of handling non-perf-event pollfds" added a generic way to handle non-perf-event file descriptors like evlist->ctl_fd. Use it instead of handling evlist->ctl_fd separately. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220824072814.16422-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6562c9ac |
|
24-Aug-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Fix way of handling non-perf-event pollfds perf record __cmd_record() does not poll evlist pollfds. Instead it polls thread_data[0].pollfd. That happens whether or not threads are being used. perf record duplicates evlist mmap pollfds as needed for separate threads. The non-perf-event represented by evlist->ctl_fd has to handled separately, which is done explicitly, duplicating it into the thread_data[0] pollfds. That approach neglects any other non-perf-event file descriptors. Currently there is also done_fd which needs the same handling. Add a new generalized approach. Add fdarray_flag__non_perf_event to identify the file descriptors that need the special handling. For those cases, also keep a mapping of the evlist pollfd index and thread pollfd index, so that the evlist revents can be updated. Although this patch adds the new handling, it does not take it into use. There is no functional change, but it is the precursor to a fix, so is marked as a fix. Fixes: 415ccb58f68a6beb ("perf record: Introduce thread specific data array") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220824072814.16422-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ca76d7d2 |
|
15-Sep-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Fix cpu mask bit setting for mixed mmaps With mixed per-thread and (system-wide) per-cpu maps, the "any cpu" value -1 must be skipped when setting CPU mask bits. Prior to commit cbd7bfc7fd99acdd ("tools/perf: Fix out of bound access to cpu mask array") the invalid setting went unnoticed, but since then it causes perf record to fail with an error. Example: Before: $ perf record -e intel_pt// --per-thread uname Failed to initialize parallel data streaming masks After: $ perf record -e intel_pt// --per-thread uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.068 MB perf.data ] Fixes: ae4f8ae16a078964 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220915122612.81738-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
faf59ec8 |
|
07-Sep-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Fix synthesis failure warnings Some calls to synthesis functions set err < 0 but only warn about the failure and continue. However they do not set err back to zero, relying on subsequent code to do that. That changed with the introduction of option --synth. When --synth=no subsequent functions that set err back to zero are not called. Fix by setting err = 0 in those cases. Example: Before: $ perf record --no-bpf-event --synth=all -o /tmp/huh uname Couldn't synthesize bpf events. Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ] $ perf record --no-bpf-event --synth=no -o /tmp/huh uname Couldn't synthesize bpf events. After: $ perf record --no-bpf-event --synth=no -o /tmp/huh uname Couldn't synthesize bpf events. Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ] Fixes: 41b740b6e8a994e5 ("perf record: Add --synth option") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220907162458.72817-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cbd7bfc7 |
|
05-Sep-2022 |
Athira Rajeev <atrajeev@linux.vnet.ibm.com> |
tools/perf: Fix out of bound access to cpu mask array The cpu mask init code in "record__mmap_cpu_mask_init" function access "bits" array part of "struct mmap_cpu_mask". The size of this array is the value from cpu__max_cpu().cpu. This array is used to contain the cpumask value for each cpu. While setting bit for each cpu, it calls "set_bit" function which access index in "bits" array. If we provide a command line option to -C which is greater than the number of CPU's present in the system, the set_bit could access an array member which is out-of the array size. This is because currently, there is no boundary check for the CPU. This will result in seg fault: <<>> ./perf record -C 12341234 ls Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Segmentation fault (core dumped) <<>> Debugging with gdb, points to function flow as below: <<>> set_bit record__mmap_cpu_mask_init record__init_thread_default_masks record__init_thread_masks cmd_record <<>> Fix this by adding boundary check for the array. After the patch: <<>> ./perf record -C 12341234 ls Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Failed to initialize parallel data streaming masks <<>> With this fix, if -C is given a non-exsiting CPU, perf record will fail with: <<>> ./perf record -C 50 ls Failed to initialize parallel data streaming masks <<>> Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220905141929.7171-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1bf7d836 |
|
12-Aug-2022 |
Martin Liška <mliska@suse.cz> |
perf record: Improve error message of -p not_existing_pid When one uses -p $not_existing_pid, the output of --help is printed: $ perf record -p 123456789 2>&1 | head -n3 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] Let's change it something similar what perf top -p $not_existing_pid prints: $ ./perf top -p 123456789 --stdio Error: Couldn't create thread/CPU maps: No such process Newly suggested error message: $ ./perf record -p 123456789 Couldn't create thread/CPU maps: No such process Signed-off-by: Martin Liška <mliska@suse.cz> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/8e00eda1-4de0-2c44-ce67-d4df48ac1f7c@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3812d298 |
|
10-Jun-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add finished init event In preparation for recording sideband events in a virtual machine guest so that they can be injected into a host perf.data file. This is needed to enable injecting events after the initial synthesized user events (that have an all zero id sample) but before regular events. Committer notes: Add entry about PERF_RECORD_FINISHED_INIT to tools/perf/Documentation/perf.data-file-format.txt. Committer testing: Before: # perf report -D | grep FINISHED 0 0x5910 [0x8]: PERF_RECORD_FINISHED_ROUND FINISHED_ROUND events: 1 ( 0.5%) # After: # perf record -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ] # perf report -D | grep FINISHED 0 0x5068 [0x8]: PERF_RECORD_FINISHED_INIT: unhandled! 0 0x5390 [0x8]: PERF_RECORD_FINISHED_ROUND FINISHED_ROUND events: 1 ( 0.5%) FINISHED_INIT events: 1 ( 0.5%) # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220610113316.6682-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
61110883 |
|
14-Jun-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add new option to sample identifier In preparation for recording sideband events in a virtual machine guest so that they can be injected into a host perf.data file. Add an option to always include sample type PERF_SAMPLE_IDENTIFIER. Committer testing: # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ] # perf evlist -v cycles: size: 128, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # # # perf record --sample-identifier sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.022 MB perf.data (7 samples) ] # perf evlist -v cycles: size: 128, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220615052511.4441-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6b080312 |
|
10-Jun-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Always record id index In preparation for recording sideband events in a virtual machine guest so that they can be injected into a host perf.data file. Adjust the logic so that if there are IDs then the id index is recorded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220610113316.6682-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f42c0ce5 |
|
10-Jun-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Always get text_poke events with --kcore option kcore provides a copy of the running kernel including any modified code. A trace that benefits from that also benefits from text_poke events, so enable them. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220610113316.6682-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
685439a7 |
|
18-May-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Add cgroup support for off-cpu profiling This covers two different use cases. The first one is cgroup filtering given by -G/--cgroup option which controls the off-cpu profiling for tasks in the given cgroups only. The other use case is cgroup sampling which is enabled by --all-cgroups option and it adds PERF_SAMPLE_CGROUP to the sample_type to set the cgroup id of the task in the sample data. Example output. $ sudo perf record -a --off-cpu --all-cgroups sleep 1 $ sudo perf report --stdio -s comm,cgroup --call-graph=no ... # Samples: 144 of event 'offcpu-time' # Event count (approx.): 48452045427 # # Children Self Command Cgroup # ........ ........ ............... .......................................... # 61.57% 5.60% Chrome_ChildIOT /user.slice/user-657345.slice/user@657345.service/app.slice/... 29.51% 7.38% Web Content /user.slice/user-657345.slice/user@657345.service/app.slice/... 17.48% 1.59% Chrome_IOThread /user.slice/user-657345.slice/user@657345.service/app.slice/... 16.48% 4.12% pipewire-pulse /user.slice/user-657345.slice/user@657345.service/session.slice/... 14.48% 2.07% perf /user.slice/user-657345.slice/user@657345.service/app.slice/... 14.30% 7.15% CompositorTileW /user.slice/user-657345.slice/user@657345.service/app.slice/... 13.33% 6.67% Timer /user.slice/user-657345.slice/user@657345.service/app.slice/... ... Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220518224725.742882-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
10742d0c |
|
18-May-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Implement basic filtering for off-cpu It should honor cpu and task filtering with -a, -C or -p, -t options. Committer testing: # perf record --off-cpu --cpu 1 perf bench sched messaging -l 1000 # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 1.722 [sec] [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1.446 MB perf.data (7248 samples) ] # # perf script | head -20 perf 97164 [001] 38287.696761: 1 cycles: ffffffffb6070174 native_write_msr+0x4 (vmlinux) perf 97164 [001] 38287.696764: 1 cycles: ffffffffb6070174 native_write_msr+0x4 (vmlinux) perf 97164 [001] 38287.696765: 9 cycles: ffffffffb6070174 native_write_msr+0x4 (vmlinux) perf 97164 [001] 38287.696767: 212 cycles: ffffffffb6070176 native_write_msr+0x6 (vmlinux) perf 97164 [001] 38287.696768: 5130 cycles: ffffffffb6070176 native_write_msr+0x6 (vmlinux) perf 97164 [001] 38287.696770: 123063 cycles: ffffffffb6e0011e syscall_return_via_sysret+0x38 (vmlinux) perf 97164 [001] 38287.696803: 2292748 cycles: ffffffffb636c82d __fput+0xad (vmlinux) swapper 0 [001] 38287.702852: 1927474 cycles: ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux) :97513 97513 [001] 38287.767207: 1172536 cycles: ffffffffb612ff65 newidle_balance+0x5 (vmlinux) swapper 0 [001] 38287.769567: 1073081 cycles: ffffffffb618216d ktime_get_mono_fast_ns+0xd (vmlinux) :97533 97533 [001] 38287.770962: 984460 cycles: ffffffffb65b2900 selinux_socket_sendmsg+0x0 (vmlinux) :97540 97540 [001] 38287.772242: 883462 cycles: ffffffffb6d0bf59 irqentry_exit_to_user_mode+0x9 (vmlinux) swapper 0 [001] 38287.773633: 741963 cycles: ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux) :97552 97552 [001] 38287.774539: 606680 cycles: ffffffffb62eda0a page_add_file_rmap+0x7a (vmlinux) :97556 97556 [001] 38287.775333: 502254 cycles: ffffffffb634f964 get_obj_cgroup_from_current+0xc4 (vmlinux) :97561 97561 [001] 38287.776163: 427891 cycles: ffffffffb61b1522 cgroup_rstat_updated+0x22 (vmlinux) swapper 0 [001] 38287.776854: 359030 cycles: ffffffffb612fc5e load_balance+0x9ce (vmlinux) :97567 97567 [001] 38287.777312: 330371 cycles: ffffffffb6a8d8d0 skb_set_owner_w+0x0 (vmlinux) :97566 97566 [001] 38287.777589: 311622 cycles: ffffffffb614a7a8 native_queued_spin_lock_slowpath+0x148 (vmlinux) :97512 97512 [001] 38287.777671: 307851 cycles: ffffffffb62e0f35 find_vma+0x55 (vmlinux) # # perf record --off-cpu --cpu 4 perf bench sched messaging -l 1000 # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 1.613 [sec] [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1.415 MB perf.data (6729 samples) ] # perf script | head -20 perf 97650 [004] 38323.728036: 1 cycles: ffffffffb6070174 native_write_msr+0x4 (vmlinux) perf 97650 [004] 38323.728040: 1 cycles: ffffffffb6070174 native_write_msr+0x4 (vmlinux) perf 97650 [004] 38323.728041: 9 cycles: ffffffffb6070174 native_write_msr+0x4 (vmlinux) perf 97650 [004] 38323.728042: 208 cycles: ffffffffb6070176 native_write_msr+0x6 (vmlinux) perf 97650 [004] 38323.728044: 5026 cycles: ffffffffb6070176 native_write_msr+0x6 (vmlinux) perf 97650 [004] 38323.728046: 119970 cycles: ffffffffb6d0bebc syscall_exit_to_user_mode+0x1c (vmlinux) perf 97650 [004] 38323.728078: 2190103 cycles: 54b756 perf_tool__process_synth_event+0x16 (/home/acme/bin/perf) swapper 0 [004] 38323.783357: 1593139 cycles: ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux) swapper 0 [004] 38323.785352: 1593139 cycles: ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux) swapper 0 [004] 38323.797330: 1418936 cycles: ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux) swapper 0 [004] 38323.802350: 1418936 cycles: ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux) swapper 0 [004] 38323.806333: 1418936 cycles: ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux) :97996 97996 [004] 38323.807145: 1418936 cycles: 7f5db9be6917 [unknown] ([unknown]) :97959 97959 [004] 38323.807730: 1445074 cycles: ffffffffb6329d36 memcg_slab_post_alloc_hook+0x146 (vmlinux) :97959 97959 [004] 38323.808103: 1341584 cycles: ffffffffb62fd90f get_page_from_freelist+0x112f (vmlinux) :97959 97959 [004] 38323.808451: 1227537 cycles: ffffffffb65b2905 selinux_socket_sendmsg+0x5 (vmlinux) :97959 97959 [004] 38323.808768: 1184321 cycles: ffffffffb6d1ba35 _raw_spin_lock_irqsave+0x15 (vmlinux) :97959 97959 [004] 38323.809073: 1153017 cycles: ffffffffb6a8d92d skb_set_owner_w+0x5d (vmlinux) :97959 97959 [004] 38323.809402: 1126875 cycles: ffffffffb6329c64 memcg_slab_post_alloc_hook+0x74 (vmlinux) :97959 97959 [004] 38323.809695: 1073248 cycles: ffffffffb6e0001d entry_SYSCALL_64+0x1d (vmlinux) # Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220518224725.742882-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
edc41a10 |
|
18-May-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Enable off-cpu analysis with BPF Add --off-cpu option to enable the off-cpu profiling with BPF. It'd use a bpf_output event and rename it to "offcpu-time". Samples will be synthesized at the end of the record session using data from a BPF map which contains the aggregated off-cpu time at context switches. So it needs root privilege to get the off-cpu profiling. Each sample will have a separate user stacktrace so it will skip kernel threads. The sample ip will be set from the stacktrace and other sample data will be updated accordingly. Currently it only handles some basic sample types. The sample timestamp is set to a dummy value just not to bother with other events during the sorting. So it has a very big initial value and increase it on processing each samples. Good thing is that it can be used together with regular profiling like cpu cycles. If you don't want to that, you can use a dummy event to enable off-cpu profiling only. Example output: $ sudo perf record --off-cpu perf bench sched messaging -l 1000 $ sudo perf report --stdio --call-graph=no # Total Lost Samples: 0 # # Samples: 41K of event 'cycles' # Event count (approx.): 42137343851 ... # Samples: 1K of event 'offcpu-time' # Event count (approx.): 587990831640 # # Children Self Command Shared Object Symbol # ........ ........ ............... .................. ......................... # 81.66% 0.00% sched-messaging libc-2.33.so [.] __libc_start_main 81.66% 0.00% sched-messaging perf [.] cmd_bench 81.66% 0.00% sched-messaging perf [.] main 81.66% 0.00% sched-messaging perf [.] run_builtin 81.43% 0.00% sched-messaging perf [.] bench_sched_messaging 40.86% 40.86% sched-messaging libpthread-2.33.so [.] __read 37.66% 37.66% sched-messaging libpthread-2.33.so [.] __write 2.91% 2.91% sched-messaging libc-2.33.so [.] __poll ... As you can see it spent most of off-cpu time in read and write in bench_sched_messaging(). The --call-graph=no was added just to make the output concise here. It uses perf hooks facility to control BPF program during the record session rather than adding new BPF/off-cpu specific calls. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7be1fedd |
|
24-May-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Allow all_cpus to be a superset of user_requested_cpus To support collection of system-wide events with user requested CPUs, all_cpus must be a superset of user_requested_cpus. In order to support all_cpus to be a superset of user_requested_cpus, all_cpus must be used instead of user_requested_cpus when dealing with CPUs of all events instead of CPUs of requested events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20220524075436.29144-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
921e3be5 |
|
24-May-2022 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Use evlist__add_dummy_on_all_cpus() in record__config_text_poke() Use evlist__add_dummy_on_all_cpus() in record__config_text_poke() in preparation for allowing system-wide events on all CPUs while the user requested events are on only user requested CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20220524075436.29144-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0255571a |
|
02-May-2022 |
Ian Rogers <irogers@google.com> |
perf cpumap: Switch to using perf_cpu_map API Switch some raw accesses to the cpu map to using the library API. This can help with reference count checking. Some BPF cases switch from index to CPU for consistency, this shouldn't matter as the CPU map is full. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220503041757.2365696-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
23380e4d |
|
13-Apr-2022 |
Alexey Bayduraev <alexey.bayduraev@gmail.com> |
perf record: Fix per-thread option Per-thread mode doesn't have specific CPUs for events, add checks for this case. Minor fix to a pr_debug by Ian Rogers <irogers@google.com> to avoid an out of bound array access. Fixes: 7954f71689f90cb2 ("perf record: Introduce thread affinity and mmap masks") Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Alexey Bayduraev <alexey.bayduraev@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220414014642.3308206-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0df6ade7 |
|
28-Mar-2022 |
Ian Rogers <irogers@google.com> |
perf evlist: Rename cpus to user_requested_cpus evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps of all evsels. For non-task targets, cpus is set to be cpus requested from the command line, defaulting to all online cpus if no cpus are specified. For an uncore event, all_cpus may be just CPU 0 or every online CPU. This causes all_cpus to have fewer values than the cpus variable which is confusing given the 'all' in the name. To try to make the behavior clearer, rename cpus to user_requested_cpus and add comments on the two struct variables. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328232648.2127340-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
65e7c963 |
|
21-Feb-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf data: Adding error message if perf_data__create_dir() fails Add proper return codes for all cases of data directory creation failure and add error message output based on these codes. Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220222091417.11020-1-alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b5f2511d |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Implement compatibility checks Implement compatibility checks for other modes and related command line options: asynchronous (--aio) trace streaming and affinity (--affinity) modes, pipe mode, AUX area tracing --snapshot and --aux-sample options, --switch-output, --switch-output-event, --switch-max-files and --timestamp-filename options. Parallel data streaming is compatible with Zstd compression (--compression-level) and external control commands (--control). CPU mask provided via -C option filters --threads specification masks. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/fadc1cf74057af4d5766248fcfe5cdde40732aa9.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f466e5ed |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Extend --threads command line option Extend --threads option in perf record command line interface. The option can have a value in the form of masks that specify CPUs to be monitored with data streaming threads and its layout in system topology. The masks can be filtered using CPU mask provided via -C option. The specification value can be user defined list of masks. Masks separated by colon define CPUs to be monitored by one thread and affinity mask of that thread is separated by slash. For example: <cpus mask 1>/<affinity mask 1>:<cpu mask 2>/<affinity mask 2> specifies parallel threads layout that consists of two threads with corresponding assigned CPUs to be monitored. The specification value can be a string e.g. "cpu", "core" or "package" meaning creation of data streaming thread for every CPU or core or package to monitor distinct CPUs or CPUs grouped by core or package. The option provided with no or empty value defaults to per-cpu parallel threads layout creating data streaming thread for every CPU being monitored. Document --threads option syntax and parallel data streaming modes in Documentation/perf-record.txt. Suggested-by: Jiri Olsa <jolsa@kernel.org> Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/079e2619be70c465317cf7c9fdaf5fa069728c32.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
06380a84 |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce --threads command line option Provide --threads option in perf record command line interface. The option creates a data streaming thread for each CPU in the system. Document --threads option in Documentation/perf-record.txt. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/01aeae43b047f428596c4ef9f9342ab94865cedd.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
610fbc01 |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce data transferred and compressed stats Introduce bytes_transferred and bytes_compressed stats so they would capture statistics for the related data buffer transfers. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/b5d598034c507dfb7544d2125500280b7d434764.1642440724.git.alexey.v.bayduraev@linux.intel.com [ Use PRiu64 to print u64 values, fixing the build on 32-bit architectures ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
75f5f1fc |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce compressor at mmap buffer object Introduce compressor object into mmap object so it could be used to pack the data stream from the corresponding kernel data buffer. Initialize and make use of the introduced per mmap compressor. Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Link: https://lore.kernel.org/r/80edc286cf6543139a7d5a91217605123aa0b50d.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ae9c7242b |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce bytes written stats Introduce a function to calculate the total amount of data written and use it to support the --max-size option. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/3e2c69186641446f8ab003ec209bccc762b3394d.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
56f735ff |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce data file at mmap buffer object Introduce data file objects into mmap object so it could be used to process and store data stream from the corresponding kernel data buffer. Initialize data files located at mmap buffer objects so trace data can be written into several data file located at data directory. Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Link: https://lore.kernel.org/r/177077f7734b63e5c999ccd75ac6dc3c694f0d0d.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3217e9fe |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Start threads in the beginning of trace streaming Start thread in detached state because its management is implemented via messaging to avoid any scaling issues. Block signals prior thread start so only main tool thread would be notified on external async signals during data collection. Thread affinity mask is used to assign eligible CPUs for the thread to run. Wait and sync on thread start using thread ack pipe. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/95784dd9f7c81ee408eab27b50b4c09ad4cf7be6.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1e5de7d9 |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Stop threads in the end of trace streaming Signal thread to terminate by closing write fd of msg pipe. Receive THREAD_MSG__READY message as the confirmation of the thread's termination. Stop threads created for parallel trace streaming prior their stats processing. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/55ef8cc5ec3a96360660d9dc1763573225325f8c.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
396b626b |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce thread local variable Introduce thread local variable and use it for threaded trace streaming. Use thread affinity mask instead of record affinity mask in affinity modes. Use evlist__ctlfd_update() to propagate control commands from thread object to global evlist object to enable evlist__ctlfd_* functionality. Move waking and sample statistic to struct record_thread and introduce record__waking function to calculate the total number of wakes. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/0d127555219991c1dcd6c6bb76b24fa6b78d2932.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
415ccb58 |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce thread specific data array Introduce thread specific data object and array of such objects to store and manage thread local data. Implement functions to allocate, initialize, finalize and release thread specific data. Thread local maps and overwrite_maps arrays keep pointers to mmap buffer objects to serve according to maps thread mask. Thread local pollfd array keeps event fds connected to mmaps buffers according to maps thread mask. Thread control commands are delivered via thread local comm pipes and ctlfd_pos fd. External control commands (--control option) are delivered via evlist ctlfd_pos fd and handled by the main tool thread. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/fc9f74af6f822d9c0fa0e145c3564a760dbe3d4b.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7954f716 |
|
17-Jan-2022 |
Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> |
perf record: Introduce thread affinity and mmap masks Introduce affinity and mmap thread masks. Thread affinity mask defines CPUs that a thread is allowed to run on. Thread maps mask defines mmap data buffers the thread serves to stream profiling data from. Reviewed-by: Riccardo Mancini <rickyman7@gmail.com> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/9042bf7daf988e17e17e6acbf5d29590bde869cd.1642440724.git.alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9bce13ea |
|
09-Dec-2021 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Disable debuginfod by default Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to unexpected stalls in perf record exit path, when we try to cache profiled binaries. # DEBUGINFOD_PROGRESS=1 ./perf record -a ^C[ perf record: Woken up 1 times to write data ] Downloading from https://debuginfod.fedoraproject.org/ 447069 Downloading from https://debuginfod.fedoraproject.org/ 1502175 Downloading \^Z Disabling DEBUGINFOD_URLS by default in perf record and adding debuginfod option and .perfconfig variable support to enable id. Default without debuginfo processing: # perf record -a Using system debuginfod setup: # perf record -a --debuginfod Using custom debuginfd url: # perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava' Adding single perf_debuginfod_setup function and using it also in perf buildid-cache command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20211209200425.303561-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6d18804b |
|
04-Jan-2022 |
Ian Rogers <irogers@google.com> |
perf cpumap: Give CPUs their own type A common problem is confusing CPU map indices with the CPU, by wrapping the CPU with a struct then this is avoided. This approach is similar to atomic_t. Committer notes: To make it build with BUILD_BPF_SKEL=1 these files needed the conversions to 'struct perf_cpu' usage: tools/perf/util/bpf_counter.c tools/perf/util/bpf_counter_cgroup.c tools/perf/util/bpf_ftrace.c Also perf_env__get_cpu() was removed back in "perf cpumap: Switch cpu_map__build_map to cpu function". Additionally these needed to be fixed for the ARM builds to complete: tools/perf/arch/arm/util/cs-etm.c tools/perf/arch/arm64/util/pmu.c Suggested-by: John Garry <john.garry@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Vineet Singh <vineet.singh@intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: zhengjun.xing@intel.com Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7248e308 |
|
17-Dec-2021 |
Alexandre Truong <alexandre.truong@arm.com> |
perf tools: Record ARM64 LR register automatically On ARM64, automatically record the link register if the frame pointer mode is on. It will be used to do a dwarf unwind to find the caller of the leaf frame if the frame pointer was omitted. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Alexandre Truong <alexandre.truong@arm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20211217154521.80603-2-german.gomez@arm.com Signed-off-by: German Gomez <german.gomez@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7cc72553 |
|
18-Oct-2021 |
James Clark <james.clark@arm.com> |
perf tools: Check vmlinux/kallsyms arguments in all tools Only perf report checked the validity of these arguments so apply the same check to all tools that read them for consistency. Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Denis Nikitin <denik@chromium.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20211018134844.2627174-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
61750473 |
|
07-Sep-2021 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add support for PERF_RECORD_AUX_OUTPUT_HW_ID The PERF_RECORD_AUX_OUTPUT_HW_ID event provides a way to match AUX output data like Intel PT PEBS-via-PT back to the event that it came from, by providing a hardware ID that is present in the AUX output. Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20210907163903.11820-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
41b740b6 |
|
10-Aug-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Add --synth option Add an option to control the synthesizing behavior. --synth <no|all|task|mmap|cgroup> Fine-tune event synthesis: default=all This can be useful when we know it doesn't need some synthesis like in a specific usecase and/or when using pipe: $ perf record -a --all-cgroups --synth cgroup -o- sleep 1 | \ > perf report -i- -s cgroup Committer notes: Added a clarification to the man page entry for --synth that this is about pre-existing threads. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https //lore.kernel.org/r/20210811044658.1313391-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
84111b9c |
|
10-Aug-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Allow controlling synthesizing PERF_RECORD_ metadata events during record Depending on the use case, it might require some kind of synthesizing and some not. Make it controllable to turn off heavy operations like MMAP for all tasks. Currently all users are converted to enable all the synthesis by default. It'll be updated in the later patch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https //lore.kernel.org/r/20210811044658.1313391-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7fc5b571 |
|
07-Sep-2021 |
Andy Shevchenko <andriy.shevchenko@linux.intel.com> |
tools: rename bitmap_alloc() to bitmap_zalloc() Rename bitmap_alloc() to bitmap_zalloc() in tools to follow the bitmap API in the kernel. No functional changes intended. Link: https://lkml.kernel.org/r/20210814211713.180533-14-yury.norov@gmail.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Suggested-by: Yury Norov <yury.norov@gmail.com> Acked-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Lobakin <alobakin@pm.me> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
bb07d62e |
|
27-Aug-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Fix wrong comm in system-wide mode with delay Stephane found that the name of the forked process in a system-wide mode is wrong when --delay option is used. For example, # perf record -a --delay=1000 noploop 3 The noploop process will run a busy loop for 3 second. And on an idle machine it should show up at the top in the perf report. It works well without the --delay option. But if I add the option, it showed 'perf' not 'noploop'. # perf report -s comm -q | head -3 52.94% perf 16.65% swapper 12.04% chrome It turned out that the dummy event didn't work at all and it missed COMM and MMAP events for the noploop process (and others too). We should enable the dummy event immediately in system-wide mode, as the enable-on-exec would work only for task events. With this change, # perf report -s comm -q | head -3 52.75% noploop 17.03% swapper 12.83% chrome Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210827233212.3121037-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1d3351e6 |
|
23-Jul-2021 |
Jin Yao <yao.jin@linux.intel.com> |
perf tools: Enable on a list of CPUs for hybrid The 'perf record' and 'perf stat' commands have supported the option '-C/--cpus' to count or collect only on the list of CPUs provided. This option needs to be supported for hybrid as well. For hybrid support, it needs to check that the cpu list are available on hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11 is 'cpu_atom'. Before: # perf stat -e cpu_core/cycles/ -C11 -- sleep 1 Performance counter stats for 'CPU(s) 11': <not supported> cpu_core/cycles/ 1.006179431 seconds time elapsed The 'perf stat' command silently returned "<not supported>" without any helpful information. It should error out pointing out that that cpu11 was not 'cpu_core'. After: # perf stat -e cpu_core/cycles/ -C11 -- sleep 1 WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7) failed to use cpu list 11 We also need to support the events without pmu prefix specified. # perf stat -e cycles -C11 -- sleep 1 WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7) Performance counter stats for 'CPU(s) 11': 1,067,373 cpu_atom/cycles/ 1.005544738 seconds time elapsed The perf tool creates two cycles events automatically, cpu_core/cycles/ and cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning for cpu_core/cycles/ and only count the cpu_atom/cycles/. If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', for example, # perf stat -e cycles -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 1,914,704 cpu_core/cycles/ 2,036,983 cpu_atom/cycles/ 1.005815641 seconds time elapsed It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for cpu_atom/cycles/, and output with some warnings. Some more complex examples, # perf stat -e cycles,instructions -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 2,780,387 cpu_core/cycles/ 1,583,432 cpu_atom/cycles/ 3,957,277 cpu_core/instructions/ 1,167,089 cpu_atom/instructions/ 1.006005124 seconds time elapsed # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 3,290,301 cpu_core/cycles/ 1,953,073 cpu_atom/cycles/ 1,407,869 cpu_atom/instructions/ 1.006260912 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https //lore.kernel.org/r/20210723063433.7318-4-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c3a057dc |
|
19-Jul-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf inject: Fix output from a file to a pipe When the input is a regular file but the output is a pipe, it should write a pipe header. But just repiping would write a portion of the existing header which is different in 'size' value. So we need to prevent it and write a new pipe header along with other information like event attributes and features. This can handle something like this: # perf record -a -B sleep 1 # perf inject -b -i perf.data | perf report -i - Factor out perf_event__synthesize_for_pipe() to be shared between perf record and inject. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210719223153.1618812-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2681bd85 |
|
19-Jul-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Remove repipe argument from perf_session__new() The repipe argument is only used by perf inject and the all others passes 'false'. Let's remove it from the function signature and add __perf_session__new() to be called from perf inject directly. This is a preparation of the change the pipe input/output. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org [ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b91e5492 |
|
08-Jul-2021 |
Kan Liang <kan.liang@linux.intel.com> |
perf record: Add a dummy event on hybrid systems to collect metadata records Some symbols may not be resolved if a user only monitors one type of PMU. $ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload $ sudo perf report –stdio # Overhead Command Shared Object Symbol # ........ ......... ................. ..................... # 28.02% perf-exec [unknown] [.] 0x0000000000401cf6 11.32% perf-exec [unknown] [.] 0x0000000000401d04 10.90% perf-exec [unknown] [.] 0x0000000000401d11 10.61% perf-exec [unknown] [.] 0x0000000000401cfc To parse symbols the metadata records, e.g., PERF_RECORD_COMM, which are generated by the kernel, are required. To decide whether to generate the metadata records, the kernel relies on the event_filter_match() to filter the unrelated events. On a hybrid system, event_filter_match() further checks the CPU mask of the current enabled PMU. If an event is collected on the CPU which doesn't have an enabled PMU, it's treated as an unrelated event. The "big_small_workload" is created in a big core, but runs on a small core. The metadata records are filtered, because the user only monitors the PMU of the small core. The big core PMU is not enabled. For a hybrid system, a dummy event is required to generate the complete side-band events. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/1625760212-18441-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3a683120 |
|
06-Jul-2021 |
Jiri Olsa <jolsa@redhat.com> |
libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group interface to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fba7c866 |
|
06-Jul-2021 |
Jiri Olsa <jolsa@redhat.com> |
libperf: Move 'leader' from tools/perf to perf_evsel::leader Move evsel::leader to perf_evsel::leader, so we can move the group interface to libperf. Also add several evsel helpers to ease up the transition: struct evsel *evsel__leader(struct evsel *evsel); - get leader evsel bool evsel__has_leader(struct evsel *evsel, struct evsel *leader); - true if evsel has leader as leader bool evsel__is_leader(struct evsel *evsel); - true if evsel is itw own leader void evsel__set_leader(struct evsel *evsel, struct evsel *leader); - set leader for evsel Committer notes: Fix this when building with 'make BUILD_BPF_SKEL=1' tools/perf/util/bpf_counter.c - if (evsel->leader->core.nr_members > 1) { + if (evsel->core.leader->nr_members > 1) { Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
66286ed3 |
|
03-May-2021 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Set timestamp boundary for AUX area events AUX area data is not processed by 'perf record' and consequently the --timestamp-boundary option may result in no values for "time of first sample" and "time of last sample". However there are non-sample events that can be used instead, namely 'itrace_start' and 'aux'. 'itrace_start' is issued before tracing starts, and 'aux' is issued every time data is ready. Implement tool callbacks for those two for 'perf record', to update the timestamp boundary. Example: $ perf record -e intel_pt//u --timestamp-boundary uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.022 MB perf.data ] $ perf script --header-only | grep "time of" # time of first sample : 4574.835541 # time of last sample : 4574.835907 $ perf script --itrace=be -F-ip | head -1 uname 13752 [001] 4574.835589: 1 branches:uH: $ perf script --itrace=be -F-ip | tail -1 uname 13752 [001] 4574.835867: 1 branches:uH: $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20210503064222.5319-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4f2abe91 |
|
27-May-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Move probing cgroup sampling support I found that checking cgroup sampling support using the missing features doesn't work on old kernels. Because it added both attr.cgroup bit and PERF_SAMPLE_CGROUP bit, it needs to check whichever comes first (usually the actual event, not dummy). But it only checks the attr.cgroup bit which is set only in the dummy event so cannot detect failtures due the sample bits. Also we don't ignore the missing feature and retry, it'd be better checking it with the API probing logic. Committer notes: Extracted the minimal part to check using the new cgroup API probe routine, the part that removes the cgroup member can be left for further discussion. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210527182835.1634339-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
91c0f5ec |
|
27-Apr-2021 |
Jin Yao <yao.jin@linux.intel.com> |
perf record: Uniquify hybrid event name For perf-record, it would be useful to tell user the pmu which the event belongs to. For example, # perf record -a -- sleep 1 # perf report # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 106 of event 'cpu_core/cycles/' # Event count (approx.): 22043448 # # Overhead Command Shared Object Symbol # ........ ............ ....................... ............................ # ... Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-18-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b53a0755 |
|
27-Apr-2021 |
Jin Yao <yao.jin@linux.intel.com> |
perf record: Create two hybrid 'cycles' events by default When evlist is empty, for example no '-e' specified in perf record, one default 'cycles' event is added to evlist. While on hybrid platform, it needs to create two default 'cycles' events. One is for cpu_core, the other is for cpu_atom. This patch actually calls evsel__new_cycles() two times to create two 'cycles' events. # ./perf record -vv -a -- sleep 1 ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 6 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 7 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 9 sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 10 sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 11 sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 12 sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 13 sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 14 sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 15 sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 16 sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 17 sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 18 sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 19 sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 20 sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 21 ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 22 sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 23 sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 24 sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 25 sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 26 sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 27 sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 28 sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 29 ------------------------------------------------------------ We have to create evlist-hybrid.c otherwise due to the symbol dependency the perf test python would be failed. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3535a696 |
|
14-Apr-2021 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Improve 'Workload failed' message printing events + what was exec'ed Before: # perf record -a cycles,instructions,cache-misses Workload failed: No such file or directory # After: # perf record -a cycles,instructions,cache-misses Failed to collect 'cycles' for the 'cycles,instructions,cache-misses' workload: No such file or directory # Helps disambiguating other error scenarios: # perf record -a -e cycles,instructions,cache-misses bla Failed to collect 'cycles,instructions,cache-misses' for the 'bla' workload: No such file or directory # perf record -a cycles,instructions,cache-misses sleep 1 Failed to collect 'cycles' for the 'cycles,instructions,cache-misses' workload: No such file or directory # When all goes well we're back to the usual: # perf record -a -e cycles,instructions,cache-misses sleep 1 [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 3.151 MB perf.data (21242 samples) ] # Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20210414131628.2064862-3-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d58b3f7e |
|
21-Jan-2021 |
Adrian Hunter <adrian.hunter@intel.com> |
perf auxtrace: Automatically group aux-output events aux-output events need to have an AUX area event as the group leader. However, grouping events does not allow the AUX area event to be given an address filter because the --filter option must come after the event, which conflicts with the grouping syntax. To allow filtering in that case, automatically create a group since that is the requirement anyway. Example: (requires Intel Tremont) perf record -c 500 -e 'intel_pt//u' --filter 'filter main @ /bin/ls' -e 'cycles/aux-output/pp' ls Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20210121140418.14705-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e16c2ce7 |
|
04-Feb-2021 |
Yang Jihong <yangjihong1@huawei.com> |
perf record: Fix continue profiling after draining the buffer Commit da231338ec9c0987 ("perf record: Use an eventfd to wakeup when done") uses eventfd() to solve a rare race where the setting and checking of 'done' which add done_fd to pollfd. When draining buffer, revents of done_fd is 0 and evlist__filter_pollfd function returns a non-zero value. As a result, perf record does not stop profiling. The following simple scenarios can trigger this condition: # sleep 10 & # perf record -p $! After the sleep process exits, perf record should stop profiling and exit. However, perf record keeps running. If pollfd revents contains only POLLERR or POLLHUP, perf record indicates that buffer is draining and need to stop profiling. Use fdarray_flag__nonfilterable() to set done eventfd to nonfilterable objects, so that evlist__filter_pollfd() does not filter and check done eventfd. Fixes: da231338ec9c0987 ("perf record: Use an eventfd to wakeup when done") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: zhangjinhao2@huawei.com Link: http://lore.kernel.org/lkml/20210205065001.23252-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
47fddcb4 |
|
26-Dec-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add 'ping' control command Add a control 'ping' command to detect if perf is up and its control interface is operational. It will be used in following daemon patches to synchronize with record session - when control interface is up and running, we know that perf record is monitoring and ready to receive signals. Example session: terminal 1: # mkfifo control ack # perf record --control=fifo:control,ack terminal 2: # echo ping > control # cat ack ack Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f186cd61 |
|
26-Dec-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add 'stop' control command Adding control 'stop' command to stop perf record. When it is received, perf will set the 'done' variable to 1 to stop its mmap ring buffer reading loop. Example session: terminal 1: # mkfifo control ack # perf record --control=fifo:control,ack terminal 2: # echo stop > control terminal 1: [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 3.214 MB perf.data (38280 samples) ] # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
142544a9 |
|
26-Dec-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add 'evlist' control command Add a new 'evlist' control command to display all the evlist events. When it is received, perf will scan and print current evlist into perf record terminal. The interface string for control file is: evlist [-v|-g|-F] The syntax follows perf evlist command: -F Show just the sample frequency used for each event. -v Show all fields. -g Show event group information. Example session: terminal 1: # mkfifo control ack # perf record --control=fifo:control,ack -e '{cycles,instructions}' terminal 2: # echo evlist > control terminal 1: cycles instructions dummy:HG terminal 2: # echo 'evlist -v' > control terminal 1: cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: \ IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, \ sample_id_all: 1, exclude_guest: 1 instructions: size: 120, config: 0x1, { sample_period, sample_freq }: 4000, \ sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, freq: 1, \ sample_id_all: 1, exclude_guest: 1 dummy:HG: type: 1, size: 120, config: 0x9, { sample_period, sample_freq }: 4000, \ sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, mmap: 1, \ comm: 1, freq: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, \ bpf_event: 1 terminal 2: # echo 'evlist -g' > control terminal 1: {cycles,instructions} dummy:HG terminal 2: # echo 'evlist -F' > control terminal 1: cycles: sample_freq=4000 instructions: sample_freq=4000 dummy:HG: sample_freq=4000 This new evlist command is handy to get real event names when wildcards are used. Adding evsel_fprintf.c object to python/perf.so build, because it's now evlist.c dependency. Adding PYTHON_PERF define for python/perf.so compilation, so we can use it to compile in only evsel__fprintf from evsel_fprintf.c object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
991ae4eb |
|
26-Dec-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Allow to enable/disable events via control file Adding new control events to enable/disable specific event. The interface string for control file are: 'enable <EVENT NAME>' 'disable <EVENT NAME>' when received the command, perf will scan the current evlist for <EVENT NAME> and if found it's enabled/disabled. Example session: terminal 1: # mkfifo control ack perf.pipe # perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:*' -o - > perf.pipe terminal 2: # cat perf.pipe | perf --no-pager script -i - terminal 1: Events disabled NOTE Above message will show only after read side of the pipe ('>') is started on 'terminal 2'. The 'terminal 1's bash does not execute perf before that, hence the delyaed perf record message. terminal 3: # echo 'enable sched:sched_process_fork' > control terminal 1: event sched:sched_process_fork enabled terminal 2: bash 33349 [034] 149587.674295: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34056 bash 33349 [034] 149588.239521: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34057 terminal 3: # echo 'enable sched:sched_wakeup_new' > control terminal 1: event sched:sched_wakeup_new enabled terminal 2: bash 33349 [034] 149632.228023: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34059 bash 33349 [034] 149632.228050: sched:sched_wakeup_new: bash:34059 [120] success=1 CPU:036 bash 33349 [034] 149633.950005: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34060 bash 33349 [034] 149633.950030: sched:sched_wakeup_new: bash:34060 [120] success=1 CPU:036 Committer testing: If I use 'sched:*' and then enable all events, I can't get 'perf record' to react to further commands, so I tested it with: [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe Events disabled Events enabled Events disabled And then it works as expected, so we need to fix this pre-existing problem. Another issue, we need to check if a event is already enabled or disabled and change the message to be clearer, i.e.: [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe Events disabled If we receive a 'disable' command, then it should say: [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe Events disabled Events already disabled Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201226232038.390883-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c1de7f3d |
|
05-Jan-2021 |
Kan Liang <kan.liang@linux.intel.com> |
perf record: Add support for PERF_SAMPLE_CODE_PAGE_SIZE Adds the infrastructure to sample the code address page size. Introduce a new --code-page-size option for perf record. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Originally-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210105195752.43489-4-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e29386c8 |
|
14-Dec-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Add --buildid-mmap option to enable PERF_RECORD_MMAP2's build id Add --buildid-mmap option to enable build id in PERF_RECORD_MMAP2 events. It will only work if there's kernel support for that and it disables build id cache (implies --no-buildid). It's also possible to enable it permanently via config option in ~/.perfconfig file: [record] build-id=mmap Also added build_id bit in the verbose output for perf_event_attr: # perf record --buildid-mmap -vv ... perf_event_attr: type 1 size 120 ... build_id 1 Adding also missing text_poke bit. Committer testing: $ perf record -h build Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -B, --no-buildid do not collect buildids in perf.data -N, --no-buildid-cache do not update the buildid cache --buildid-all Record build-id of all DSOs regardless of hits --buildid-mmap Record build-id in map events $ $ perf record --buildid-mmap sleep 1 Failed: no support to record build id in mmap events, update your kernel. $ After adding the needed kernel bits in a test kernel: $ perf record -vv --buildid-mmap sleep 1 |& grep -m1 build Enabling build id in mmap2 events. $ perf evlist -v cycles:u: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, build_id: 1 $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201214105457.543111-16-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
542b88fd |
|
30-Nov-2020 |
Kan Liang <kan.liang@linux.intel.com> |
perf record: Support new sample type for data page size Support new sample type PERF_SAMPLE_DATA_PAGE_SIZE for page size. Add new option --data-page-size to record sample data page size. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/20201130172803.2676-3-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
db0ea13c |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' record methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
25f84702 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' mmap pages parsing method perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
78e1bc25 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' event attribute config methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
64b4778b |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' event group methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7748bb71 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' create maps methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e80db255 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' tracking event methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
08c83997 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' sideband thread methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
24bf91a7 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' 'filter' methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ade9d208 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' 'toggle' methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7b392ef0 |
|
30-Nov-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Use the right prefix for 'struct evlist' 'workload' methods perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/, go on completing this split. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ee7fe31e |
|
03-Sep-2020 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Consolidate close_control_option()'s into one function Consolidate control option fifo closing into one function. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Suggested-by: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200903122937.25691-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d20aff15 |
|
31-Aug-2020 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add 'snapshot' control command Add 'snapshot' control command to create an AUX area tracing snapshot the same as if sending SIGUSR2. The advantage of the FIFO is that access is governed by access to the FIFO. Example: $ mkfifo perf.control $ mkfifo perf.ack $ cat perf.ack & [1] 15235 $ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 & [2] 15243 $ ps -e | grep perf 15244 pts/1 00:00:00 perf $ kill -USR2 15244 bash: kill: (15244) - Operation not permitted $ echo snapshot > perf.control ack $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200901093758.32293-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a8fcbd26 |
|
02-Sep-2020 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add FIFO file names as alternative options to --control Enable the --control option to accept file names as an alternative to file descriptors. Example: $ mkfifo perf.control $ mkfifo perf.ack $ cat perf.ack & [1] 6808 $ perf record --control fifo:perf.control,perf.ack -- sleep 300 & [2] 6810 $ echo disable > perf.control $ Events disabled ack $ echo enable > perf.control $ Events enabled ack $ echo disable > perf.control $ Events disabled ack $ kill %2 [ perf record: Woken up 4 times to write data ] $ [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ] [1]- Done cat perf.ack [2]+ Terminated perf record --control fifo:perf.control,perf.ack -- sleep 300 $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200902105707.11491-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9864a66d |
|
31-Aug-2020 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Consolidate --control option parsing into one function Consolidate --control option parsing into one function, in preparation for adding FIFO file name options. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200901093758.32293-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a060c1f1 |
|
18-Aug-2020 |
Wei Li <liwei391@huawei.com> |
perf record: Correct the help info of option "--no-bpf-event" The help info of option "--no-bpf-event" is wrongly described as "record bpf events", correct it. Committer testing: $ perf record -h bpf Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] --clang-opt <clang options> options passed to clang when compiling BPF scriptlets --clang-path <clang path> clang binary to use for compiling BPF scriptlets --no-bpf-event do not record bpf events $ Fixes: 71184c6ab7e6 ("perf record: Replace option --bpf-event with --no-bpf-event") Signed-off-by: Wei Li <liwei391@huawei.com> Acked-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200819031947.12115-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b784a88e |
|
03-Jun-2020 |
Fam Zheng <famzheng@amazon.com> |
perf: Fix opt help text for --no-bpf-event The opt name was once inverted but the help text didn't reflect the change. Fixes: 71184c6ab7e6 ("perf record: Replace option --bpf-event with --no-bpf-event") Signed-off-by: Fam Zheng <famzheng@amazon.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
#
1101c872 |
|
04-Aug-2020 |
Jin Yao <yao.jin@linux.intel.com> |
perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set We received an error report that perf-record caused 'Segmentation fault' on a newly system (e.g. on the new installed ubuntu). (gdb) backtrace #0 __read_once_size (size=4, res=<synthetic pointer>, p=0x14) at /root/0-jinyao/acme/tools/include/linux/compiler.h:139 #1 atomic_read (v=0x14) at /root/0-jinyao/acme/tools/include/asm/../../arch/x86/include/asm/atomic.h:28 #2 refcount_read (r=0x14) at /root/0-jinyao/acme/tools/include/linux/refcount.h:65 #3 perf_mmap__read_init (map=map@entry=0x0) at mmap.c:177 #4 0x0000561ce5c0de39 in perf_evlist__poll_thread (arg=0x561ce68584d0) at util/sideband_evlist.c:62 #5 0x00007fad78491609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #6 0x00007fad7823c103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 The root cause is, evlist__add_bpf_sb_event() just returns 0 if HAVE_LIBBPF_SUPPORT is not defined (inline function path). So it will not create a valid evsel for side-band event. But perf-record still creates BPF side band thread to process the side-band event, then the error happpens. We can reproduce this issue by removing the libelf-dev. e.g. 1. apt-get remove libelf-dev 2. perf record -a -- sleep 1 root@test:~# ./perf record -a -- sleep 1 perf: Segmentation fault Obtained 6 stack frames. ./perf(+0x28eee8) [0x5562d6ef6ee8] /lib/x86_64-linux-gnu/libc.so.6(+0x46210) [0x7fbfdc65f210] ./perf(+0x342e74) [0x5562d6faae74] ./perf(+0x257e39) [0x5562d6ebfe39] /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fbfdc990609] /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7fbfdc73b103] Segmentation fault (core dumped) To fix this issue, 1. We either install the missing libraries to let HAVE_LIBBPF_SUPPORT be defined. e.g. apt-get install libelf-dev and install other related libraries. 2. Use this patch to skip the side-band event setup if HAVE_LIBBPF_SUPPORT is not set. Committer notes: The side band thread is not used just with BPF, it is also used with --switch-output-event, so narrow the ifdef to the BPF specific part. Fixes: 23cbb41c939a ("perf record: Move side band evlist setup to separate routine") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200805022937.29184-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9d88a1a1 |
|
05-Aug-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Move clockid_res_ns under clock struct Move the clockid_res_ns struct member to the clock struct, so we have the clock related stuff in one place. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Geneviève Bastien <gbastien@versatic.net> Cc: Ian Rogers <irogers@google.com> Cc: Jeremie Galarneau <jgalar@efficios.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20200805093444.314999-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d1e325cf |
|
05-Aug-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf header: Store clock references for -k/--clockid option Add a new CLOCK_DATA feature that stores reference times when -k/--clockid option is specified. It contains the clock id and its reference time together with wall clock time taken at the 'same time', both values are in nanoseconds. The format of data is as below: struct { u32 version; /* version = 1 */ u32 clockid; u64 wall_clock_ns; u64 clockid_time_ns; }; This clock reference times will be used in following changes to display wall clock for perf events. It's available only for recording with clockid specified, because it's the only case where we can get reference time to wallclock time. It's can't do that with perf clock yet. Committer testing: $ perf record -h -k Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -k, --clockid <clockid> clockid to use for events, see clock_gettime() $ perf record -k monotonic sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (8 samples) ] $ perf report --header-only | grep clockid -A1 # event : name = cycles:u, , id = { 88815, 88816, 88817, 88818, 88819, 88820, 88821, 88822 }, size = 120, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, exclude_kernel = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, ksymbol = 1, bpf_event = 1, clockid = 1 # CPU_TOPOLOGY info available, use -I to display -- # clockid frequency: 1000 MHz # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake # clockid: monotonic (1) # reference time: 2020-08-06 09:40:21.619290 = 1596717621.619290 (TOD) = 21931.077673635 (monotonic) $ Original-patch-by: David Ahern <dsahern@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Geneviève Bastien <gbastien@versatic.net> Cc: Ian Rogers <irogers@google.com> Cc: Jeremie Galarneau <jgalar@efficios.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20200805093444.314999-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6953beb4 |
|
05-Aug-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf clockid: Move parse_clockid() to new clockid object Move parse_clockid and all needed clcckid related stuff into clockid object. We are going to add clockid_name function in following change, so it's better it's placed in separated object and not in builtin-record.c. No functional change is intended. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Geneviève Bastien <gbastien@versatic.net> Cc: Ian Rogers <irogers@google.com> Cc: Jeremie Galarneau <jgalar@efficios.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20200805093444.314999-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1d078ccb |
|
17-Jul-2020 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Introduce --control fd:ctl-fd[,ack-fd] options Introduce --control fd:ctl-fd[,ack-fd] options to pass open file descriptors numbers from command line. Extend perf-record.txt file with --control fd:ctl-fd[,ack-fd] options description. Document possible usage model introduced by --control fd:ctl-fd[,ack-fd] options by providing example bash shell script. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/8dc01e1a-3a80-3f67-5385-4bc7112b0dd3@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
acce0223 |
|
17-Jul-2020 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Implement control commands handling Implement handling of 'enable' and 'disable' control commands coming from control file descriptor. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/f0fde590-1320-dca1-39ff-da3322704d3b@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
68cd3b45 |
|
17-Jul-2020 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Extend -D,--delay option with -1 value Extend -D,--delay option with -1 to start collection with events disabled to be enabled later by 'enable' command provided via control file descriptor. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/3e7d362c-7973-ee5d-e81e-c60ea22432c3@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
246eba8e |
|
12-May-2020 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add support for PERF_RECORD_TEXT_POKE Add processing for PERF_RECORD_TEXT_POKE events. When a text poke event is processed, then the kernel dso data cache is updated with the poked bytes. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20200512121922.8997-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
442ad225 |
|
28-Jun-2020 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Fix duplicated sideband events with Intel PT system wide tracing Commit 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis") reveals an issue with Intel PT system wide tracing. Specifically that Intel PT already adds a dummy tracking event, and it is not the first event. Adding another dummy tracking event causes duplicated sideband events. Fix by checking for an existing dummy tracking event first. Example showing duplicated switch events: Before: # perf record -a -e intel_pt//u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.895 MB perf.data ] # perf script --no-itrace --show-switch-events | head swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11 swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11 rcu_sched 11 [007] 6390.516223: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 rcu_sched 11 [007] 6390.516224: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11 swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11 swapper 0 [002] 6390.516415: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559 swapper 0 [002] 6390.516416: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559 After: # perf record -a -e intel_pt//u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.868 MB perf.data ] # perf script --no-itrace --show-switch-events | head swapper 0 [005] 6450.567013: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 7179/7181 perf 7181 [005] 6450.567014: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 perf 7181 [005] 6450.567028: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [005] 6450.567029: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 7179/7181 swapper 0 [005] 6450.571699: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11 rcu_sched 11 [005] 6450.571700: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 rcu_sched 11 [005] 6450.571702: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [005] 6450.571703: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11 swapper 0 [005] 6450.579703: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11 rcu_sched 11 [005] 6450.579704: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200629091955.17090-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8cedf3a5 |
|
17-Jun-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Fix the class prefix for 'struct evlist' sample_id_all methods To differentiate from libperf's 'struct perf_evlist' methods. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e251abee |
|
17-Jun-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Fix the class prefix for 'struct evlist' 'add' evsel methods To differentiate from libperf's 'struct perf_evlist' methods. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
70943490 |
|
05-May-2020 |
Stephane Eranian <eranian@google.com> |
perf tools: Add optional support for libpfm4 This patch links perf with the libpfm4 library if it is available and LIBPFM4 is passed to the build. The libpfm4 library contains hardware event tables for all processors supported by perf_events. It is a helper library that helps convert from a symbolic event name to the event encoding required by the underlying kernel interface. This library is open-source and available from: http://perfmon2.sf.net. With this patch, it is possible to specify full hardware events by name. Hardware filters are also supported. Events must be specified via the --pfm-events and not -e option. Both options are active at the same time and it is possible to mix and match: $ perf stat --pfm-events inst_retired:any_p:c=1:i -e cycles .... One needs to explicitely ask for its inclusion by using the LIBPFM4 make command line option, ie its opt-in rather than opt-out of feature detection and build support. Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Igor Lubashev <ilubashe@akamai.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Cc: yuzhoujian <yuzhoujian@didichuxing.com> Link: http://lore.kernel.org/lkml/20200505182943.218248-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
16b4b4e1 |
|
28-May-2020 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Respect --no-switch-events Context switch events are added automatically by Intel PT and Coresight. Make it possible to suppress them. That is useful for tracing the scheduler without the disturbance that the switch event processing creates. Example: Prerequisites: $ which perf ~/bin/perf $ sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf $ sudo chmod +r /proc/kcore Before: $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.938 MB perf.data ] $ perf script -D | grep PERF_RECORD_SWITCH | wc -l 572 After: $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001 Warning: Intel Processor Trace decoding will not be possible except for kernel tracing! [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.838 MB perf.data ] $ perf script -D | grep PERF_RECORD_SWITCH | wc -l 0 $ sudo chmod go-r /proc/kcore $ sudo setcap -r ~/bin/perf Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lore.kernel.org/lkml/20200528120859.21604-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
da231338 |
|
12-May-2020 |
Anand K Mistry <amistry@google.com> |
perf record: Use an eventfd to wakeup when done The setting and checking of 'done' contains a rare race where the signal handler setting 'done' is run after checking to break the loop, but before waiting in evlist__poll(). In this case, the main loop won't wake up until either another signal is sent, or the perf data fd causes a wake up. The following simple script can trigger this condition (but you might need to run it for several hours): for ((i = 0; i >= 0; i++)) ; do echo "Loop $i" delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc) ./perf record -- sleep 30000000 >/dev/null& pid=$! sleep $delay kill -TERM $pid echo "PID $pid" wait $pid done At some point, the loop will stall. Adding logging, even though perf has received the SIGTERM and set 'done = 1', perf will remain sleeping until a second signal is sent. Committer notes: Make this dependent on HAVE_EVENTFD_SUPPORT, so that we continue building on older systems without the eventfd syscall. Signed-off-by: Anand K Mistry <amistry@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200513122012.v3.1.I4d7421c6bbb1f83ea58419082481082e19097841@changeid Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0a892c1c |
|
22-Apr-2020 |
Ian Rogers <irogers@google.com> |
perf record: Add dummy event during system wide synthesis During the processing of /proc during event synthesis new processes may start. Add a dummy event if /proc is to be processed, to capture mmaps for starting processes. This reuses the existing logic for initial-delay. v3 fixes the attr test of test-record-C0 v2 fixes the dummy event configuration and a branch stack issue. Suggested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200422173615.59436-1-irogers@google.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2bb72dbb |
|
04-May-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Rename perf_evsel__group_idx() to evsel__group_idx() As it is a 'struct evsel' method, not part of tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ae430892 |
|
30-Apr-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Rename perf_evsel__fallback() to evsel__fallback() As it is a 'struct evsel' method, not part of tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8ab2e96d |
|
29-Apr-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Rename *perf_evsel__*name() to *evsel__*name() As they are 'struct evsel' methods or related routines, not part of tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
23cbb41c |
|
28-Apr-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Move side band evlist setup to separate routine It is quite big by now, move that code to a separate record__setup_sb_evlist() routine. Suggested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: http://lore.kernel.org/lkml/20200429131106.27974-9-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
899e5ffb |
|
27-Apr-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Introduce --switch-output-event Now we can use it with --overwrite to have a flight recorder mode that gets snapshot requests from arbitrary events that are processed in the side band thread together with the PERF_RECORD_BPF_EVENT processing. Example: To collect scheduler events until a recvmmsg syscall happens, system wide: [root@five a]# rm -f perf.data.2020042717* [root@five a]# perf record --overwrite -e sched:*switch,syscalls:*recvmmsg --switch-output-event syscalls:sys_enter_recvmmsg [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042717585458 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042717590235 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042717590398 ] ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2020042717590511 ] [ perf record: Captured and wrote 7.244 MB perf.data.<timestamp> ] So in the above case we had 3 snapshots, the fourth was forced by control+C: [root@five a]# ls -la total 20440 drwxr-xr-x. 2 root root 4096 Apr 27 17:59 . dr-xr-x---. 12 root root 4096 Apr 27 17:46 .. -rw-------. 1 root root 3936125 Apr 27 17:58 perf.data.2020042717585458 -rw-------. 1 root root 5074869 Apr 27 17:59 perf.data.2020042717590235 -rw-------. 1 root root 4291037 Apr 27 17:59 perf.data.2020042717590398 -rw-------. 1 root root 7617037 Apr 27 17:59 perf.data.2020042717590511 [root@five a]# One can make this more precise by adding the switch output event to the main -e events list, as since this is done asynchronously, a few events after the signal event will appear in the snapshots, as can be seen with: [root@five a]# rm -f perf.data.20200427175* [root@five a]# perf record --overwrite -e sched:*switch,syscalls:*recvmmsg --switch-output-event syscalls:sys_enter_recvmmsg [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042718024203 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042718024301 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020042718024484 ] ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2020042718024562 ] [ perf record: Captured and wrote 7.337 MB perf.data.<timestamp> ] [root@five a]# perf script -i perf.data.2020042718024203 | tail -15 PacerThread 148586 [005] 122.830729: sched:sched_switch: prev_comm=PacerThread prev_pid=148586... swapper 0 [000] 122.833588: sched:sched_switch: prev_comm=swapper/0 prev_pid=... NetworkManager 1251 [000] 122.833619: syscalls:sys_enter_recvmmsg: fd: 0x0000001c, mmsg: 0x7ffe83054a1... swapper 0 [002] 122.833624: sched:sched_switch: prev_comm=swapper/2 prev_pid=... swapper 0 [003] 122.833624: sched:sched_switch: prev_comm=swapper/3 prev_pid=... NetworkManager 1251 [000] 122.833626: syscalls:sys_exit_recvmmsg: 0x1 kworker/3:3-eve 158946 [003] 122.833628: sched:sched_switch: prev_comm=kworker/3:3 prev_pid=15894... swapper 0 [004] 122.833641: sched:sched_switch: prev_comm=swapper/4 prev_pid=... NetworkManager 1251 [000] 122.833642: sched:sched_switch: prev_comm=NetworkManage... perf 228273 [002] 122.833645: sched:sched_switch: prev_comm=perf prev_pid=22827... swapper 0 [011] 122.833646: sched:sched_switch: prev_comm=swapper/1... swapper 0 [002] 122.833648: sched:sched_switch: prev_comm=swapper/... kworker/0:2-eve 207387 [000] 122.833648: sched:sched_switch: prev_comm=kworker/0:2 prev_pid=20738... kworker/2:3-eve 232038 [002] 122.833652: sched:sched_switch: prev_comm=kworker/2:3 prev_pid=23203... perf 235825 [003] 122.833653: sched:sched_switch: prev_comm=perf prev_pid=23582... [root@five a]# Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20200429131106.27974-8-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b38d85ef |
|
23-Apr-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf bpf: Decouple creating the evlist from adding the SB event Renaming bpf_event__add_sb_event() to evlist__add_sb_event() and requiring that the evlist be allocated beforehand. This will allow using the same side band thread and evlist to be used for multiple purposes in addition to react to PERF_RECORD_BPF_EVENT soon after they are generated. Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: http://lore.kernel.org/lkml/20200429131106.27974-4-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bc477d79 |
|
24-Apr-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Move sb_evlist to 'struct record' Where state related to a 'perf record' session is grouped. Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: http://lore.kernel.org/lkml/20200429131106.27974-2-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
40c7d246 |
|
05-May-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move routines that probe for perf API features to separate file Trying to disentangle this a bit further, unfortunately it uses parse_events(), its interesting to have it separated anyway, so do it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d99c22ea |
|
22-Apr-2020 |
Stephane Eranian <eranian@google.com> |
perf record: Add num-synthesize-threads option To control degree of parallelism of the synthesize_mmap() code which is scanning /proc/PID/task/PID/maps and can be time consuming. Mimic perf top way of handling the option. If not specified will default to 1 thread, i.e. default behavior before this option. On a desktop computer the processing of /proc/PID/task/PID/maps isn't slow enough to warrant parallel processing and the thread creation has some cost - hence the default of 1. On a loaded server with >100 cores it is possible to see synthesis times in the order of seconds and in this case having the option is desirable. As the processing is a synchronization point, it is legitimate to worry if Amdahl's law will apply to this patch. Profiling with this patch in place: https://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com/ shows: ... - 32.59% __perf_event__synthesize_threads - 32.54% __event__synthesize_thread + 22.13% perf_event__synthesize_mmap_events + 6.68% perf_event__get_comm_ids.constprop.0 + 1.49% process_synthesized_event + 1.29% __GI___readdir64 + 0.60% __opendir ... That is the processing is 1.49% of execution time and there is plenty to make parallel. This is shown in the benchmark in this patch: https://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com/ Computing performance of multi threaded perf event synthesis by synthesizing events on CPU 0: Number of synthesis threads: 1 Average synthesis took: 127729.000 usec (+- 3372.880 usec) Average num. events: 21548.600 (+- 0.306) Average time per event 5.927 usec Number of synthesis threads: 2 Average synthesis took: 88863.500 usec (+- 385.168 usec) Average num. events: 21552.800 (+- 0.327) Average time per event 4.123 usec Number of synthesis threads: 3 Average synthesis took: 83257.400 usec (+- 348.617 usec) Average num. events: 21553.200 (+- 0.327) Average time per event 3.863 usec Number of synthesis threads: 4 Average synthesis took: 75093.000 usec (+- 422.978 usec) Average num. events: 21554.200 (+- 0.200) Average time per event 3.484 usec Number of synthesis threads: 5 Average synthesis took: 64896.600 usec (+- 353.348 usec) Average num. events: 21558.000 (+- 0.000) Average time per event 3.010 usec Number of synthesis threads: 6 Average synthesis took: 59210.200 usec (+- 342.890 usec) Average num. events: 21560.000 (+- 0.000) Average time per event 2.746 usec Number of synthesis threads: 7 Average synthesis took: 54093.900 usec (+- 306.247 usec) Average num. events: 21562.000 (+- 0.000) Average time per event 2.509 usec Number of synthesis threads: 8 Average synthesis took: 48938.700 usec (+- 341.732 usec) Average num. events: 21564.000 (+- 0.000) Average time per event 2.269 usec Where average time per synthesized event goes from 5.927 usec with 1 thread to 2.269 usec with 8. This isn't a linear speed up as not all of synthesize code has been made parallel. If the synthesis time was about 10 seconds then using 8 threads may bring this down to less than 4. Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Jones <tonyj@suse.de> Cc: yuzhoujian <yuzhoujian@didichuxing.com> Link: http://lore.kernel.org/lkml/20200422155038.9380-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8fb4b679 |
|
25-Mar-2020 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Add --all-cgroups option The --all-cgroups option is to enable cgroup profiling support. It tells kernel to record CGROUP events in the ring buffer so that perf report can identify task/cgroup association later. [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ] [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 558 of event 'cycles' # Event count (approx.): 458017341 # # Overhead cgroup id (dev/inode) Cgroup Pid:Command # ........ ..................... .......... ............... # 33.15% 4/0xeffffffb /sub 9615:looper0 32.83% 4/0xf00002f5 /sub/cgrp2 9620:looper2 32.79% 4/0xf00002f4 /sub/cgrp1 9619:looper1 0.35% 4/0xf00002f5 /sub/cgrp2 9618:cgtest 0.34% 4/0xf00002f4 /sub/cgrp1 9617:cgtest 0.32% 4/0xeffffffb / 9615:looper0 0.11% 4/0xeffffffb /sub 9617:cgtest 0.10% 4/0xeffffffb /sub 9618:cgtest # # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S') # [root@seventh ~]# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ab64069f |
|
25-Mar-2020 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Support synthesizing cgroup events Synthesize cgroup events by iterating cgroup filesystem directories. The cgroup event only saves the portion of cgroup path after the mount point and the cgroup id (which actually is a file handle). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200325124536.2800725-7-namhyung@kernel.org Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org [ Extracted the HAVE_FILE_HANDLE from the followup patch, added missing __maybe_unused ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8384a260 |
|
03-Dec-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Adapt affinity to machines with #CPUs > 1K Use struct mmap_cpu_mask type for the tool's thread and mmap data buffers to overcome current 1024 CPUs mask size limitation of cpu_set_t type. Currently glibc's cpu_set_t type has an internal mask size limit of 1024 CPUs. Moving to the 'struct mmap_cpu_mask' type allows overcoming that limit. The tools bitmap API is used to manipulate objects of 'struct mmap_cpu_mask' type. Committer notes: To print the 'nbits' struct member we must use %zd, since it is a size_t, this fixes the build in some toolchains/arches. Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/96d7e2ff-ce8b-c1e0-d52c-aa59ea96f0ea@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4804e011 |
|
20-Nov-2019 |
Andi Kleen <ak@linux.intel.com> |
perf stat: Use affinity for opening events Restructure the event opening in perf stat to cycle through the events by CPU after setting affinity to that CPU. This eliminates IPI overhead in the perf API. We have to loop through the CPU in the outter builtin-stat code instead of leaving that to low level functions. It has to change the weak group fallback strategy slightly. Since we cannot easily undo the opens for other CPUs move the weak group retry to a separate loop. Before with a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 42.75 4.050910 67 60046 110 perf_event_open After: 26.86 0.944396 16 58069 110 perf_event_open (the number changes slightly because the weak group retries work differently and the test case relies on weak groups) Committer notes: Added one of the hunks in a patch provided by Andi after I noticed that the "event times" 'perf test' entry was segfaulting. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-10-andi@firstfloor.org Link: http://lore.kernel.org/lkml/20191127232657.GL84886@tassilo.jf.intel.com # Fix Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c0a6de06 |
|
15-Nov-2019 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add support for AUX area sampling Add a 'perf record' option '--aux-sample' to request AUX area sampling. AUX area sampling uses an overwriting buffer much like snapshot mode, so adjust the AUX buffer mmapping accordingly. To make it easy to queue samples for decoding, synthesize an ID index. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6e0a9b3d |
|
13-Nov-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: No need to process the synthesized MMAP events twice At the end of a 'perf record' session, by default, we'll process all samples and populate the threads, maps, etc so as to find out which of the DSOs got samples, to reduce the size of the build-id table we'll add to the perf.data headers. But we don't need to process the PERF_RECORD_MMAP events synthesized for the kernel modules, as we have those already via perf_session__create_kernel_maps(), so add mmap/mmap2 handlers that first look at event->header.misc to see if the event is for a user map, bailing out if not. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-mofoxvcx2dryppcw3o689jdd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6d575816 |
|
22-Oct-2019 |
Jiwei Sun <jiwei.sun@windriver.com> |
perf record: Add support for limit perf output file size The patch adds a new option to limit the output file size, then based on it, we can create a wrapper of the perf command that uses the option to avoid exhausting the disk space by the unconscious user. In order to make the perf.data parsable, we just limit the sample data size, since the perf.data consists of many headers and sample data and other data, the actual size of the recorded file will bigger than the setting value. Testing it: # ./perf record -a -g --max-size=10M Couldn't synthesize bpf events. [ perf record: perf size limit reached (10249 KB), stopping session ] [ perf record: Woken up 32 times to write data ] [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ] # ls -lh perf.data -rw------- 1 root root 11M Oct 22 14:32 perf.data # ./perf record -a -g --max-size=10K [ perf record: perf size limit reached (10 KB), stopping session ] Couldn't synthesize bpf events. [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ] # ls -l perf.data -rw------- 1 root root 1626952 Oct 22 14:36 perf.data Committer notes: Fixed the build in multiple distros by using PRIu64 to print u64 struct members, fixing this: builtin-record.c: In function 'record__write': builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=] rec->bytes_written >> 10); ^ CC /tmp/build/pe Signed-off-by: Jiwei Sun <jiwei.sun@windriver.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Danter <richard.danter@windriver.com> Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
eeb399b5 |
|
04-Oct-2019 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Put a copy of kcore into the perf.data directory Add a new 'perf record' option '--kcore' which will put a copy of /proc/kcore, kallsyms and modules into a perf.data directory. Note, that without the --kcore option, output goes to a file as previously. The tools' -o and -i options work with either a file name or directory name. Example: $ sudo perf record --kcore uname $ sudo tree perf.data perf.data ├── kcore_dir │ ├── kallsyms │ ├── kcore │ └── modules └── data $ sudo perf script -v build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541 Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field. Using CPUID GenuineIntel-6-8E-A Using perf.data/kcore_dir/kcore for kernel data Using perf.data/kcore_dir/kallsyms for symbols perf 19058 506778.423729: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423733: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423734: 7 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) perf 19058 506778.423736: 117 cycles: ffffffffa2caa54a native_write_msr+0xa (vmlinux) perf 19058 506778.423738: 2092 cycles: ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux) perf 19058 506778.423740: 37380 cycles: ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux) uname 19058 506778.423751: 582673 cycles: ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux) uname 19058 506778.423892: 2241841 cycles: ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux) uname 19058 506778.424430: 2457397 cycles: ffffffffa3019232 check_memory_region+0x52 (vmlinux) Committer testing: # rm -rf perf.data* # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] # ls -l perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data # perf record --kcore uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ] ls[root@quaco ~]# ls -lad perf.data* drwx------. 3 root root 4096 Oct 21 11:08 perf.data -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # perf evlist -v -i perf.data/data cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
46e201ef |
|
04-Oct-2019 |
Adrian Hunter <adrian.hunter@intel.com> |
perf data: Support single perf.data file directory Support directory output that contains a regular perf.data file, named "data". By default the directory is named perf.data i.e. perf.data └── data Most of the infrastructure to support a directory is already there. This patch makes the changes needed to support the format above. Presently there is no 'perf record' option to output a directory. This is preparation for adding support for putting a copy of /proc/kcore in the directory. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191004083121.12182-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
80e53d11 |
|
07-Oct-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Adopt perf_mmap__put() function from tools/perf Move perf_mmap__put() from tools/perf to libperf. Once perf_mmap__put() is moved, we need a way to call application related unmap code (AIO and aux related code for eprf), when the map goes away. Add the perf_mmap::unmap callback to do that. The unmap path from perf is: perf_mmap__put (libperf) perf_mmap__munmap (libperf) map->unmap_cb -> perf_mmap__unmap_cb (perf) mmap__munmap (perf) Committer notes: Add missing linux/kernel.h to tools/perf/lib/mmap.c to get the BUG_ON definition. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191007125344.14268-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e75710f0 |
|
07-Oct-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Adopt perf_mmap__get() function from tools/perf Move perf_mmap__get() from tools/perf to libperf in the internal header internal/mmap.h. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191007125344.14268-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bf59b305 |
|
07-Oct-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Adopt perf_mmap__mmap_len() function from tools/perf Move perf_mmap__mmap_len() from tools/perf wto libperf, it will be used in the following patches. And rename the existing perf's function to mmap__mmap_len(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191007125344.14268-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
80ab2987 |
|
31-Aug-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add perf_evlist__poll() function Move perf_evlist__poll() from tools/perf to libperf, it will be used in the following patches. And rename the existing perf's function to evlist__poll(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-39-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f4009e7b |
|
16-Aug-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add perf_evlist__add_pollfd() function Move perf_evlist__add_pollfd() from tools/perf to libperf, it will be used in the following patches. Also rename perf's perf_evlist__add_pollfd()/perf_evlist__filter_pollfd() to evlist__add_pollfd()/evlist__filter_pollfd(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-38-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
515dbe48 |
|
03-Sep-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add perf_evlist__first()/last() functions Add perf_evlist__first()/last() functions to libperf, as internal functions and rename perf's origins to evlist__first/last. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-29-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f6fa4375 |
|
06-Aug-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Move 'mmap_len' from 'struct evlist' to 'struct perf_evlist' Moving 'mmap_len' from 'struct evlist' to 'struct perf_evlist' it will be used in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-22-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c976ee11 |
|
30-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Move 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist' Moving 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist', it will be used in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-21-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
65aa2e6b |
|
27-Aug-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add 'flush' to 'struct perf_mmap' Move 'flush' from tools/perf's mmap to libperf's perf_mmap struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-19-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
547740f7 |
|
27-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add perf_mmap struct Add the perf_mmap struct to libperf. The definition is added into: include/internal/mmap.h which is not to be included by users, but shared within perf and libperf. Committer notes: Remove unnecessary includes from tools/perf/lib/include/internal/mmap.h, those will be readded as they become necessary, later in the series. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e0fcfb08 |
|
22-Sep-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Adopt backwards ring buffer state enum As this isn't used at all in mmap.h but in evlist.h, so to cut down the header dependency tree, move it to where it is used. Also add mmap.h to the places using it but previously getting it indirectly via evlist.h. Add missing pthread.h to evlist.h, as it has a pthread_t struct member and was getting the header via mmap.h. Noticed while processing a Jiri's libperf batch touching mmap.h, where almost everything gets rebuilt because evlist.h is so popular, so cut down't this rebuild the world party. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lkml.kernel.org/n/tip-he0uljeftl0xfveh3d6vtode@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9521b5f2 |
|
27-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Rename perf_evlist__mmap() to evlist__mmap() Rename perf_evlist__mmap() to evlist__mmap(), so we don't have a name clash when we add perf_evlist__mmap() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a5830532 |
|
27-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Rename 'struct perf_mmap' to 'struct mmap' Rename 'struct perf_evlist' to 'struct evlist', so we don't have a name clash when we add 'struct perf_mmap' to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c8b567c8 |
|
23-Sep-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Move restricted maps check to after a possible fallback to not collect kernel samples Before: [acme@quaco ~]$ perf record -b -e cycles date WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. Mon 23 Sep 2019 11:00:59 AM -03 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.005 MB perf.data (14 samples) ] [acme@quaco ~]$ But we did a fallback and exclude_kernel was set, so no need for resolving kernel symbols: $ perf evlist -v cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY $ After: [acme@quaco ~]$ perf record -b -e cycles date Mon 23 Sep 2019 11:07:18 AM -03 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.007 MB perf.data (16 samples) ] [acme@quaco ~]$ perf evlist -v cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY [acme@quaco ~]$ No needless warning is emitted. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/n/tip-5yqnr8xcqwhr15xktj2097ac@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6ef81c55 |
|
21-Aug-2019 |
Mamatha Inamdar <mamatha4@linux.vnet.ibm.com> |
perf session: Return error code for perf_session__new() function on failure This patch is to return error code of perf_new_session function on failure instead of NULL. Test Results: Before Fix: $ perf c2c report -input failed to open nput: No such file or directory $ echo $? 0 $ After Fix: $ perf c2c report -input failed to open nput: No such file or directory $ echo $? 254 $ Committer notes: Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(..., session), i.e. we need to pass zero in case of failure, which was the case before when NULL was returned by perf_session__new() for failure, but now we need to negate the result of IS_ERR(session) to respect that TEST_ASSERT_VAL) expectation of zero meaning failure. Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com> Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shawn Landden <shawn@git.icu> Cc: Song Liu <songliubraving@fb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com> Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
055c67ed |
|
18-Sep-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move event synthesizing routines to separate .c file For better grouping, in time we may end up making most of these static, i.e. generalizing the 'perf record' synthesizing code so that based on the target it can do the right thing and call the needed synthesizers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-s9zxxhk40s95pjng9panet16@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ea49e01c |
|
18-Sep-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move event synthesizing routines to separate header Those are the only routines using the perf_event__handler_t typedef and are all related, so move to a separate header to reduce the header dependency tree, lots of places were getting event.h and even stdio.h, limits.h indirectly, so fix those as well. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8520a98d |
|
29-Aug-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf debug: Remove needless include directives from debug.h All we need there is a forward declaration for 'union perf_event', so remove it from there and add missing header directives in places using things from this indirect include. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c1a604df |
|
29-Aug-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Remove needless perf.h include directive from headers Its not needed there, add it to the places that need it and were getting it via those headers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-5yulx1u16vyd0zmrbg1tjhju@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
72932371 |
|
28-Aug-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Rename the PERF_RECORD_ structs to have a "perf" prefix Even more, to have a "perf_record_" prefix, so that they match the PERF_RECORD_ enum they map to. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190828135717.7245-23-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d06e5fad |
|
26-Aug-2019 |
Igor Lubashev <ilubashe@akamai.com> |
perf tools: Warn that perf_event_paranoid can restrict kernel symbols Warn that /proc/sys/kernel/perf_event_paranoid can also restrict kernel symbols. Signed-off-by: Igor Lubashev <ilubashe@akamai.com> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1566869956-7154-6-git-send-email-ilubashe@akamai.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
aeb00b1a |
|
22-Aug-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Move record_opts and other record decls out of perf.h And into a separate util/record.h, to better isolate things and make sure that those who use record_opts and the other moved declarations are explicitly including the necessary header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-31q8mei1qkh74qvkl9nwidfq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ce7b0e42 |
|
06-Aug-2019 |
Alexander Shishkin <alexander.shishkin@linux.intel.com> |
perf record: Add an option to take an AUX snapshot on exit It is sometimes useful to generate a snapshot when perf record exits; I've been using a wrapper script around the workload that would do a killall -USR2 perf when the workload exits. This patch makes it easier and also works when perf record is attached to a pre-existing task. A new snapshot option 'e' can be specified in -S to enable this behavior: root@elsewhere:~# perf record -e intel_pt// -Se sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.085 MB perf.data ] Co-developed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806144101.62892-1-alexander.shishkin@linux.intel.com [ Fixed up !HAVE_AUXTRACE_SUPPORT build in builtin-record.c, adding 2 missing __maybe_unused ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
03617c22 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add threads to struct perf_evlist Move threads from tools/perf's evlist to libperf's perf_evlist struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-56-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f72f901d |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add cpus to struct perf_evlist Move cpus from tools/perf's evlist to libperf's perf_evlist struct. Committer notes: Fixed up this one: tools/perf/arch/arm/util/cs-etm.c Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-55-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
af663bd0 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add threads to struct perf_evsel Move 'threads' from tools/perf's evsel to libperf's perf_evsel struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-53-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d400bd3a |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add cpus to struct perf_evsel Mov the 'cpus' field from tools/perf's evsel to libperf's perf_evsel. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-51-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1fc632ce |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel Move the perf_event_attr struct fron 'struct evsel' to 'struct perf_evsel'. Committer notes: Fixed up these: tools/perf/arch/arm/util/auxtrace.c tools/perf/arch/arm/util/cs-etm.c tools/perf/arch/arm64/util/arm-spe.c tools/perf/arch/s390/util/auxtrace.c tools/perf/util/cs-etm.c Also cc1: warnings being treated as errors tests/sample-parsing.c: In function 'do_test': tests/sample-parsing.c:162: error: missing initializer tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus') struct evsel evsel = { .needs_swap = false, - .core.attr = { - .sample_type = sample_type, - .read_format = read_format, + .core = { + . attr = { + .sample_type = sample_type, + .read_format = read_format, + }, [perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1 gcc (GCC) 4.4.7 Also we don't need to include perf_event.h in tools/perf/lib/include/perf/evsel.h, forward declaring 'struct perf_event_attr' is enough. And this even fixes the build in some systems where things are used somewhere down the include path from perf_event.h without defining __always_inline. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6484d2f9 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add nr_entries to struct perf_evlist Move nr_entries count from 'struct perf' to into perf_evlist struct. Committer notes: Fix tools/perf/arch/s390/util/auxtrace.c case. And also the comment in tools/perf/util/annotate.h. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-42-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ce9036a6 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Include perf_evlist in evlist object Include perf_evlist in the evlist object, will continue to move other generic things into libperf's perf_evlist. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-37-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7836e52e |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Add perf_thread_map__get()/perf_thread_map__put() Move the following functions: thread_map__get() thread_map__put() thread_map__comm() to libperf with the following names: perf_thread_map__get() perf_thread_map__put() perf_thread_map__comm() Add the perf_thread_map__comm() function for it to work/compile. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-34-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e74676de |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf evlist: Rename perf_evlist__disable() to evlist__disable() Rename perf_evlist__disable() to evlist__disable(), so we don't have a name clash when we add perf_evlist__disable() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-23-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1c87f165 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf evlist: Rename perf_evlist__enable() to evlist__enable() Rename perf_evlist__enable() to evlist__enable(), so we don't have a name clash when we add perf_evlist__enable() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-22-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5972d1e0 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf evsel: Rename perf_evsel__open() to evsel__open() Rename perf_evsel__open() to evsel__open(), so we don't have a name clash when we add perf_evsel__open() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-15-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c12995a5 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf evlist: Rename perf_evlist__delete() to evlist__delete() Rename perf_evlist__delete() to evlist__delete(), so we don't have a name clash when we add perf_evlist__delete() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0f98b11c |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf evlist: Rename perf_evlist__new() to evlist__new() Rename perf_evlist__new() to evlist__new(), so we don't have a name clash when we add perf_evlist__new() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
63503dba |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf evlist: Rename struct perf_evlist to struct evlist Rename struct perf_evlist to struct evlist, so we don't have a name clash when we add struct perf_evlist in libperf. Committer notes: Added fixes to build on arm64, from Jiri and from me (tools/perf/util/cs-etm.c) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
32dcd021 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf evsel: Rename struct perf_evsel to struct evsel Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9749b90e |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Rename struct thread_map to struct perf_thread_map Rename struct thread_map to struct perf_thread_map, so it could be part of libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d8f9da24 |
|
03-Jul-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Use zfree() where applicable In places where the equivalent was already being done, i.e.: free(a); a = NULL; And in placs where struct members are being freed so that if we have some erroneous reference to its struct, then accesses to freed members will result in segfaults, which we can detect faster than use after free to areas that may still have something seemingly valid. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
53651b28 |
|
30-May-2019 |
yuzhoujian <yuzhoujian@didichuxing.com> |
perf record: Add support to collect callchains from kernel or user space only One can just record callchains in the kernel or user space with this new options. We can use it together with "--all-kernel" options. This two options is used just like print_stack(sys) or print_ustack(usr) for systemtap. Shown below is the usage of this new option combined with "--all-kernel" options: 1. Configure all used events to run in kernel space and just collect kernel callchains. $ perf record -a -g --all-kernel --kernel-callchains 2. Configure all used events to run in kernel space and just collect user callchains. $ perf record -a -g --all-kernel --user-callchains Committer notes: Improved documentation to state that asking for kernel callchains really is asking for excluding user callchains, and vice versa. Further mentioned that using both won't get both, but nothing, as both will be excluded. Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
aeea9062 |
|
14-May-2019 |
Kan Liang <kan.liang@linux.intel.com> |
perf parse-regs: Split parse_regs The available registers for --int-regs and --user-regs may be different, e.g. XMM registers. Split parse_regs into two dedicated functions for --int-regs and --user-regs respectively. Modify the warning message. "--user-regs=?" should be applied to show the available registers for --user-regs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557865174-56264-1-git-send-email-kan.liang@linux.intel.com [ Changed docs as suggested by Ravi and agreed by Kan ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
504c1ad1 |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Implement -z,--compression_level[=<n>] option Implemented -z,--compression_level[=<n>] option that enables compression of mmaped kernel data buffers content in runtime during perf record mode collection. Default option value is 1 (fastest compression). Compression overhead has been measured for serial and AIO streaming when profiling matrix multiplication workload: ------------------------------------------------------------- | SERIAL | AIO-1 | ----------------------------------------------------------------| |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) | |---------------------------------------------------------------| | 0 | 1,00 | 1,000 179,424 | 1,00 | 1,000 187,527 | | 1 | 1,04 | 8,427 181,148 | 1,01 | 8,474 188,562 | | 2 | 1,07 | 8,055 186,953 | 1,03 | 7,912 191,773 | | 3 | 1,04 | 8,283 181,908 | 1,03 | 8,220 191,078 | | 5 | 1,09 | 8,101 187,705 | 1,05 | 7,780 190,065 | | 8 | 1,05 | 9,217 179,191 | 1,12 | 6,111 193,024 | ----------------------------------------------------------------- OVH = (Execution time with -z N) / (Execution time with -z 0) ratio - compression ratio size - number of bytes that was compressed size ~= trace size x ratio Committer notes: Testing it I noticed that it failed to disable build id processing when compression is enabled, and as we'd have to uncompress everything to look for the PERF_RECORD_{MMAP,SAMPLE,etc} to figure out which build ids to read from DSOs, we better disable build id processing when compression is enabled, logging with pr_debug() when doing so: Original patch: # perf record -z2 ^C[ perf record: Woken up 1 times to write data ] 0x1746e0 [0x76]: failed to process type: 81 [Invalid argument] [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ] # After auto-disabling build id processing when compression is enabled: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ] $ perf record -v -z2 sleep 1 Compression enabled, disabling build id collection at the end of the session. <SNIP extra -v pr_debug() messages> [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ] $ Also, with parts of the patch originally after this one moved to just before this one we get: $ perf record -z2 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ] $ perf report -D | grep COMPRESS 0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled! 0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled! COMPRESSED events: 2 COMPRESSED events: 0 $ I.e. when faced with PERF_RECORD_COMPRESSED that we still have no code to process, we just show it as not being handled, skip them and continue, while before we had: $ perf report -D | grep COMPRESS 0x1b8 [0x169]: failed to process type: 81 [Invalid argument] Error: failed to process sample 0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED $ Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ef781128 |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Implement compression for AIO trace streaming Compression is implemented using the functions from zstd.c. As the memory to operate on the compression uses mmap->aio.data[] buffers. If Zstd streaming compression API fails for some reason the data to be compressed are just copied into the memory buffers using plain memcpy(). Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the compressed chunk that is decompressed on the loading stage. perf_mmap__aio_push() is replaced by perf_mmap__push() which is now used in the both serial and AIO streaming cases. perf_mmap__push() is extended with positive return values to signify absence of data ready for processing. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/77db2b2c-5d03-dbb0-aeac-c4dd92129ab9@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5d7f4116 |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Implement compression for serial trace streaming Compression is implemented using the functions from zstd.c. As the memory to operate on the compression uses mmap->data buffer. If Zstd streaming compression API fails for some reason the data to be compressed are just copied into the memory buffers using plain memcpy(). Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the compressed chunk that is decompressed on the loading stage. Comitter notes: Undo some unnecessary line breaks, remove some unnecessary () around zstd_data to then just get its address, and fix conflicts with BPF_PROG_INFO/BPF_BTF patchkits. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/744df43f-3932-2594-ddef-1e99a3cad03a@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
51255a8a |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf mmap: Implement dedicated memory buffer for data compression Implemented mmap data buffer that is used as the memory to operate on when compressing data in case of serial trace streaming. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/49b31321-0f70-392b-9a4f-649d3affe090@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
42e1fd80 |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Implement COMPRESSED event record and its attributes Implemented PERF_RECORD_COMPRESSED event, related data types, header feature and functions to write, read and print feature attributes from the trace header section. comp_mmap_len preserves the size of mmaped kernel buffer that was used during collection. comp_mmap_len size is used on loading stage as the size of decomp buffer for decompression of COMPRESSED events content. Committer notes: Fixed up conflict with BPF_PROG_INFO and BTF_BTF header features. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ebbaf031-8dda-3864-ebc6-7922d43ee515@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d3c8c08e |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf session: Define 'bytes_transferred' and 'bytes_compressed' metrics Define 'bytes_transferred' and 'bytes_compressed' metrics to calculate ratio in the end of the data collection: compression ratio = bytes_transferred / bytes_compressed The 'bytes_transferred' metric accumulates the amount of bytes that was extracted from the mmaped kernel buffers for compression, while 'bytes_compressed' accumulates the amount of bytes that was received after applying compression. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1d4bf499-cb03-26dc-6fc6-f14fec7622ce@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8e5bc76f |
|
13-May-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Fix suggestion to get list of registers usable with --user-regs and --intr-regs $ perf record -h -I Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use -I ? to list register names $ m $ perf record -I ? Workload failed: No such file or directory $ After: $ perf record -h -I Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names $ $ perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Fixes: bcc84ec65ad1 ("perf record: Add ability to name registers to record") Link: https://lkml.kernel.org/n/tip-r0xhfhy5radmkhhcbcfs5izf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
470530bb |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Implement --mmap-flush=<number> option Implement a --mmap-flush option that specifies minimal number of bytes that is extracted from mmaped kernel buffer to store into a trace. The default option value is 1 byte what means every time trace writing thread finds some new data in the mmaped buffer the data is extracted, possibly compressed and written to a trace. $ tools/perf/perf record --mmap-flush 1024 -e cycles -- matrix.gcc $ tools/perf/perf record --aio --mmap-flush 1K -e cycles -- matrix.gcc The option is independent from -z setting, doesn't vary with compression level and can serve two purposes. The first purpose is to increase the compression ratio of a trace data. Larger data chunks are compressed more effectively so the implemented option allows specifying data chunk size to compress. Also at some cases executing more write syscalls with smaller data size can take longer than executing less write syscalls with bigger data size due to syscall overhead so extracting bigger data chunks specified by the option value could additionally decrease runtime overhead. The second purpose is to avoid self monitoring live-lock issue in system wide (-a) profiling mode. Profiling in system wide mode with compression (-a -z) can additionally induce data into the kernel buffers along with the data from monitored processes. If performance data rate and volume from the monitored processes is high then trace streaming and compression activity in the tool is also high. High tool process activity can lead to subtle live-lock effect when compression of single new byte from some of mmaped kernel buffer leads to generation of the next single byte at some mmaped buffer. So perf tool process ends up in endless self monitoring. Implemented synch parameter is the mean to force data move independently from the specified flush threshold value. Despite the provided flush value the tool needs capability to unconditionally drain memory buffers, at least in the end of the collection. Committer testing: Running with the default value, i.e. as soon as there is something to read go on consuming, we first write the synthesized events, small chunks of about 128 bytes: # perf trace -m 2048 --call-graph dwarf -e write -- perf record <SNIP> 101.142 ( 0.004 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x210db60, count: 120) = 120 __libc_write (/usr/lib64/libpthread-2.28.so) ion (/home/acme/bin/perf) record__write (inlined) process_synthesized_event (/home/acme/bin/perf) perf_tool__process_synth_event (inlined) perf_event__synthesize_mmap_events (/home/acme/bin/perf) Then we move to reading the mmap buffers consuming the events put there by the kernel perf infrastructure: 107.561 ( 0.005 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02000, count: 336) = 336 __libc_write (/usr/lib64/libpthread-2.28.so) ion (/home/acme/bin/perf) record__write (inlined) record__pushfn (/home/acme/bin/perf) perf_mmap__push (/home/acme/bin/perf) record__mmap_read_evlist (inlined) record__mmap_read_all (inlined) __cmd_record (inlined) cmd_record (/home/acme/bin/perf) 12919.953 ( 0.136 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc83150, count: 184984) = 184984 <SNIP same backtrace as in the 107.561 timestamp> 12920.094 ( 0.155 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02150, count: 261816) = 261816 <SNIP same backtrace as in the 107.561 timestamp> 12920.253 ( 0.093 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befb81120, count: 170832) = 170832 <SNIP same backtrace as in the 107.561 timestamp> If we limit it to write only when more than 16MB are available for reading, it throttles that to a quarter of the --mmap-pages set for 'perf record', which by default get to 528384 bytes, found out using 'record -v': mmap flush: 132096 mmap size 528384B With that in place all the writes coming from record__mmap_read_evlist(), i.e. from the mmap buffers setup by the kernel perf infrastructure were at least 132096 bytes long. Trying with a bigger mmap size: perf trace -e write perf record -v -m 2048 --mmap-flush 16M 74982.928 ( 2.471 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff94a6cc000, count: 3580888) = 3580888 74985.406 ( 2.353 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff949ecb000, count: 3453256) = 3453256 74987.764 ( 2.629 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9496ca000, count: 3859232) = 3859232 74990.399 ( 2.341 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff948ec9000, count: 3769032) = 3769032 74992.744 ( 2.064 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9486c8000, count: 3310520) = 3310520 74994.814 ( 2.619 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff947ec7000, count: 4194688) = 4194688 74997.439 ( 2.787 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9476c6000, count: 4029760) = 4029760 Was again limited to a quarter of the mmap size: mmap flush: 2098176 mmap size 8392704B A warning about that would be good to have but can be added later, something like: "max flush is a quarter of the mmap size, if wanting to bump the mmap flush further, bump the mmap size as well using -m/--mmap-pages" Also rename the 'sync' parameters to 'synch' to keep tools/perf building with older glibcs: cc1: warnings being treated as errors builtin-record.c: In function 'record__mmap_read_evlist': builtin-record.c:775: warning: declaration of 'sync' shadows a global declaration /usr/include/unistd.h:933: warning: shadowed declaration is here builtin-record.c: In function 'record__mmap_read_all': builtin-record.c:856: warning: declaration of 'sync' shadows a global declaration /usr/include/unistd.h:933: warning: shadowed declaration is here Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/f6600d72-ecfa-2eb7-7e51-f6954547d500@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d56354dc |
|
11-Mar-2019 |
Song Liu <songliubraving@fb.com> |
perf tools: Save bpf_prog_info and BTF of new BPF programs To fully annotate BPF programs with source code mapping, 4 different information are needed: 1) PERF_RECORD_KSYMBOL 2) PERF_RECORD_BPF_EVENT 3) bpf_prog_info 4) btf This patch handles 3) and 4) for BPF programs loaded after 'perf record|top'. For timely process of these information, a dedicated event is added to the side band evlist. When PERF_RECORD_BPF_EVENT is received via the side band event, the polling thread gathers 3) and 4) vis sys_bpf and store them in perf_env. This information is saved to perf.data at the end of 'perf record'. Committer testing: The 'wakeup_watermark' member in 'struct perf_event_attr' is inside a unnamed union, so can't be used in a struct designated initialization with older gccs, get it out of that, isolating as 'attr.wakeup_watermark = 1;' to work with all gcc versions. We also need to add '--no-bpf-event' to the 'perf record' perf_event_attr tests in 'perf test', as the way that that test goes is to intercept the events being setup and looking if they match the fields described in the control files, since now it finds first the side band event used to catch the PERF_RECORD_BPF_EVENT, they all fail. With these issues fixed: Same scenario as for testing BPF programs loaded before 'perf record' or 'perf top' starts, only start the BPF programs after 'perf record|top', so that its information get collected by the sideband threads, the rest works as for the programs loaded before start monitoring. Add missing 'inline' to the bpf_event__add_sb_event() when HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without binutils devel files installed. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
657ee553 |
|
11-Mar-2019 |
Song Liu <songliubraving@fb.com> |
perf evlist: Introduce side band thread This patch introduces side band thread that captures extended information for events like PERF_RECORD_BPF_EVENT. This new thread uses its own evlist that uses ring buffer with very low watermark for lower latency. To use side band thread, we need to: 1. add side band event(s) by calling perf_evlist__add_sb_event(); 2. calls perf_evlist__start_sb_thread(); 3. at the end of perf run, perf_evlist__stop_sb_thread(). In the next patch, we use this thread to handle PERF_RECORD_BPF_EVENT. Committer notes: Add fix by Jiri Olsa for when te sb_tread can't get started and then at the end the stop_sb_thread() segfaults when joining the (non-existing) thread. That can happen when running 'perf top' or 'perf record' as a normal user, for instance. Further checks need to be done on top of this to more graciously handle these possible failure scenarios. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-15-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e5416950 |
|
11-Mar-2019 |
Song Liu <songliubraving@fb.com> |
perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool This patch changes the arguments of perf_event__synthesize_bpf_events() to include perf_session* instead of perf_tool*. perf_session will be used in the next patch. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-6-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
71184c6a |
|
11-Mar-2019 |
Song Liu <songliubraving@fb.com> |
perf record: Replace option --bpf-event with --no-bpf-event Currently, monitoring of BPF programs through bpf_event is off by default for 'perf record'. To turn it on, the user need to use option "--bpf-event". As BPF gets wider adoption in different subsystems, this option becomes inconvenient. This patch makes bpf_event on by default, and adds option "--no-bpf-event" to turn it off. Since option --bpf-event is not released yet, it is safe to remove it. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: kernel-team@fb.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-2-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c38dab7d |
|
14-Mar-2019 |
Andi Kleen <ak@linux.intel.com> |
perf record: Clarify help for --switch-output The help description for --switch-output looks like there are multiple comma separated fields. But it's actually a choice of different options. Make it clear and less confusing. Before: % perf record -h ... --switch-output[=<signal,size,time>] Switch output when receive SIGUSR2 or cross size,time threshold After: % perf record -h ... --switch-output[=<signal or size[BKMG] or time[smhd]>] Switch output when receiving SIGUSR2 (signal) or cross a size or time threshold Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-4-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-9yecyuha04nyg8toyd1b2pgi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
03724b2e |
|
14-Mar-2019 |
Andi Kleen <ak@linux.intel.com> |
perf record: Allow to limit number of reported perf.data files When doing long term recording and waiting for some event to snapshot on, we often only care about the last minute or so. The --switch-output command line option supports rotating the perf.data file when the size exceeds a threshold. But the disk would still be filled with unnecessary old files. Add a new option to only keep a number of rotated files, so that the disk space usage can be limited. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-3-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-y5u2lik0ragt4vlktz6qc9ks@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
258031c0 |
|
08-Mar-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf header: Add DIR_FORMAT feature to describe directory data The data files layout is described by HEADER_DIR_FORMAT feature. Currently it holds only version number (1): uint64_t version; The current version holds only version value (1) means that data files: - Follow the 'data.*' name format. - Contain raw events data in standard perf format as read from kernel (and need to be sorted) Future versions are expected to describe different data files layout according to special needs. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cd3dd8dd |
|
08-Mar-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf data: Don't store auxtrace index for directory data file We can't store the auxtrace index when we store into multiple files, because we keep only offset for it, not the file. The auxtrace data will be processed correctly in the 'pipe' mode. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2d4f2799 |
|
21-Feb-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf data: Add global path holder Add a 'path' member to 'struct perf_data'. It will keep the configured path for the data (const char *). The path in struct perf_data_file is now dynamically allocated (duped) from it. This scheme is useful/used in following patches where struct perf_data::path holds the 'configure' directory path and struct perf_data_file::path holds the allocated path for specific files. Also it actually makes the code little simpler. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org [ Fixup data-convert-bt.c missing conversion ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
45112e89 |
|
21-Feb-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf data: Move size to struct perf_data_file We are about to add support for multiple files, so we need each file to keep its size. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f4fe11b7 |
|
22-Jan-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Implement --affinity=node|cpu option Implement --affinity=node|cpu option for the record mode defaulting to system affinity mask bouncing. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/083f5422-ece9-10dd-8305-bf59c860f10f@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f13de660 |
|
22-Jan-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Apply affinity masks when reading mmap buffers Build node cpu masks for mmap data buffers. Apply node cpu masks to tool thread every time it references data buffers cross node or cross cpu. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/b25e4ebc-078d-2c7b-216c-f0bed108d073@linux.intel.com [ Use cpu-set-sched.h to get the CPU_{EQUAL,OR}() fallbacks for older systems ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9d2ed645 |
|
22-Jan-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Allocate affinity masks Allocate affinity option and masks for mmap data buffers and record thread as well as initialize allocated objects. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/526fa2b0-07de-6dbd-a7e9-26ba875593c9@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
159b0da5 |
|
31-Jan-2019 |
Mathieu Poirier <mathieu.poirier@linaro.org> |
perf pmu: Remove set_drv_config API CoreSight was the only client of the PMU's set_drv_config() API. Now that it is no longer needed by CoreSight remove it from the code base. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/20190131184714.20388-8-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7b612e29 |
|
17-Jan-2019 |
Song Liu <songliubraving@fb.com> |
perf tools: Synthesize PERF_RECORD_* for loaded BPF programs This patch synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for BPF programs loaded before perf-record. This is achieved by gathering information about all BPF programs via sys_bpf. Committer notes: Fix the build on some older systems such as amazonlinux:1 where it was breaking with: util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog': util/bpf-event.c:52:9: error: missing initializer for field 'type' of 'struct bpf_prog_info' [-Werror=missing-field-initializers] struct bpf_prog_info info = {}; ^ In file included from /git/linux/tools/lib/bpf/bpf.h:26:0, from util/bpf-event.c:3: /git/linux/tools/include/uapi/linux/bpf.h:2699:8: note: 'type' declared here __u32 type; ^ cc1: all warnings being treated as errors Further fix on a centos:6 system: cc1: warnings being treated as errors util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog': util/bpf-event.c:50: error: 'func_info_rec_size' may be used uninitialized in this function The compiler is wrong, but to silence it, initialize that variable to zero. One more fix, this time for debian:experimental-x-mips, x-mips64 and x-mipsel: util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog': util/bpf-event.c:93:16: error: implicit declaration of function 'calloc' [-Werror=implicit-function-declaration] func_infos = calloc(sub_prog_cnt, func_info_rec_size); ^~~~~~ util/bpf-event.c:93:16: error: incompatible implicit declaration of built-in function 'calloc' [-Werror] util/bpf-event.c:93:16: note: include '<stdlib.h>' or provide a declaration of 'calloc' Add the missing header. Committer testing: # perf record --bpf-event sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.021 MB perf.data (7 samples) ] # perf report -D | grep PERF_RECORD_BPF_EVENT | nl 1 0 0x4b10 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13 2 0 0x4c60 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14 3 0 0x4db0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15 4 0 0x4f00 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16 5 0 0x5050 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17 6 0 0x51a0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18 7 0 0x52f0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21 8 0 0x5440 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22 # bpftool prog 13: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 14: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 15: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 16: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 17: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 18: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 21: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 22: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 # # perf report -D | grep -B22 PERF_RECORD_KSYMBOL . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 ff 44 06 c0 ff ff ff ff ......8..D...... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%......... . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x49d8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc00644ff len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba -- . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 48 6d 06 c0 ff ff ff ff ......8.Hm...... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17 . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........ . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x4b28 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0066d48 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 -- . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 04 cf 03 c0 ff ff ff ff ......8......... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%......... . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x4c78 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc003cf04 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba -- . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 96 28 04 c0 ff ff ff ff ......8..(...... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17 . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........ . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x4dc8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0042896 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 -- . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 05 13 17 c0 ff ff ff ff ......8......... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%......... . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x4f18 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0171305 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba -- . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 0a 8c 23 c0 ff ff ff ff ......8...#..... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17 . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........ . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x5068 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0238c0a len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 -- . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 2a a5 a4 c0 ff ff ff ff ......8.*....... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%......... . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x51b8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0a4a52a len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba -- . ... raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 9b c9 a4 c0 ff ff ff ff ......8......... . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17 . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4............... <SNIP zeroes> . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!....... . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........ . 0130: 00 00 00 00 00 00 00 00 ........ 0 0x5308 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0a4c99b len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174 Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-8-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
45178a92 |
|
17-Jan-2019 |
Song Liu <songliubraving@fb.com> |
perf tools: Handle PERF_RECORD_BPF_EVENT This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to turn it on. Committer notes: Add dummy machine__process_bpf_event() variant that returns zero for systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking the build in such systems. Remove the needless include <machine.h> from bpf->event.h, provide just forward declarations for the structs and unions in the parameters, to reduce compilation time and needless rebuilds when machine.h gets changed. Committer testing: When running with: # perf record --bpf-event On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL is not present, we fallback to removing those two bits from perf_event_attr, making the tool to continue to work on older kernels: perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off bpf_event ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off ksymbol ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ------------------------------------------------------------ And then proceeds to work without those two features. As passing --bpf-event is an explicit action performed by the user, perhaps we should emit a warning telling that the kernel has no such feature, but this can be done on top of this patch. Now with a kernel that supports these events, start the 'record --bpf-event -a' and then run 'perf trace sleep 10000' that will use the BPF augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus should generate PERF_RECORD_BPF_EVENT events: [root@quaco ~]# perf record -e dummy -a --bpf-event ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.713 MB perf.data ] [root@quaco ~]# bpftool prog 13: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 14: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 15: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 16: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 17: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 18: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 21: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 22: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 31: tracepoint name sys_enter tag 12504ba9402f952f gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 512B jited 374B memlock 4096B map_ids 30,29,28 32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 256B jited 191B memlock 4096B map_ids 30,29 # perf report -D | grep PERF_RECORD_BPF_EVENT | nl 1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13 2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14 3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15 4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16 5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17 6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18 7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21 8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22 9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29 10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29 11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30 12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30 13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31 14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32 # There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3fcb10e4 |
|
04-Dec-2018 |
Mark Drayton <mbd@fb.com> |
perf tools: Allow specifying proc-map-timeout in config file The default timeout of 500ms for parsing /proc/<pid>/maps files is too short for profiling many of our services. This can be overridden by passing --proc-map-timeout to the relevant command but it'd be nice to globally increase our default value. This patch permits setting a different default with the core.proc-map-timeout config file parameter. Signed-off-by: Mark Drayton <mbd@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
93f20c0f |
|
05-Nov-2018 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Extend trace writing to multi AIO Multi AIO trace writing allows caching more kernel data into userspace memory postponing trace writing for the sake of overall profiling data thruput increase. It could be seen as kernel data buffer extension into userspace memory. With an --aio option value different from 0 (default value is 1) the tool has capability to cache more and more data into user space along with delegating spill to AIO. That allows avoiding to suspend at record__aio_sync() between calls of record__mmap_read_evlist() and increases profiling data thruput at the cost of userspace memory. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d3d1af6f |
|
05-Nov-2018 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Enable asynchronous trace writing The trace file offset is read once before mmaps iterating loop and written back after all performance data is enqueued for aio writing. The trace file offset is incremented linearly after every successful aio write operation. record__aio_sync() blocks till completion of the started AIO operation and then proceeds. record__aio_mmap_read_sync() implements a barrier for all incomplete aio write requests. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cf99ad14 |
|
01-Oct-2018 |
Andi Kleen <ak@linux.intel.com> |
perf record: Support weak groups Implement a weak group fallback for 'perf record', similar to the existing 'perf stat' support. This allows to use groups that might be longer than the available counters without failing. Before: $ perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}' -a sleep 1 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles). /bin/dmesg | grep -i perf may provide additional information. After: $ ./perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}:W' -a sleep 1 WARNING: No sample_id_all support, falling back to unordered processing [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 8.136 MB perf.data (134069 samples) ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181001195927.14211-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cf790516 |
|
09-Oct-2018 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf record: Encode -k clockid frequency into Perf trace Store -k clockid frequency into Perf trace to enable timestamps derived metrics conversion into wall clock time on reporting stage. Below is the example of perf report output: tools/perf/perf record -k raw -- ../../matrix/linux/matrix.gcc ... [ perf record: Captured and wrote 31.222 MB perf.data (818054 samples) ] tools/perf/perf report --header # ======== ... # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, clockid = 4 ... # clockid frequency: 1000 MHz ... # ======== Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/23a4a1dc-b160-85a0-347d-40a2ed6d007b@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ded2b8fe |
|
13-Sep-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add 'struct perf_mmap' arg to record__write() The struct perf_mmap map argument will hold the file pointer to write the data to. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e035f4ca |
|
13-Sep-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf auxtrace: Pass struct perf_mmap into mmap__read* functions The perf_mmap struct will hold a file pointer to write the mmap's contents, so we need to propagate it down the stack to record__write callers instead of its member the auxtrace_mmap struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
318ec184 |
|
30-Aug-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Switch 'session' argument to 'evlist' in perf_event__synthesize_attrs() To be able to pass in other than session's evlist. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a2015516 |
|
14-Mar-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Synthesize features before events in pipe mode We need to synthesize events first, because some features works on top of them (on report side). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180314092205.23291-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cff17205 |
|
12-Mar-2018 |
Yisheng Xie <xieyisheng1@huawei.com> |
perf record: Avoid duplicate call of perf_default_config() We have brought perf_default_config to the very beginning at main(), so it no need to call perf_default_config() once more for most of config in perf-record but only for record.call-graph. Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1520853957-36106-2-git-send-email-xieyisheng1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
915b4e27 |
|
07-Mar-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Remove progname from struct record It's no longer used. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
20a8a3cf |
|
07-Mar-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Move machine variable down the function It's used far more down to be declared on the top of the __cmd_record. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180307155020.32613-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
07a9461d |
|
06-Mar-2018 |
Kan Liang <kan.liang@linux.intel.com> |
perf mmap: Use the stored scope data in perf_mmap__push() Using the 'start' and 'end' which are stored in struct perf_mmap to replace the temporary 'start' and 'end'. The temporary variables will be discarded later. It doesn't need to pass 'overwrite' to perf_mmap__push(). It's stored in struct perf_mmap. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4b5ea3bd |
|
06-Mar-2018 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Combine some auxtrace initialization into a single function In preparation for adding AUX area sampling support, combine some auxtrace initialization into a single function. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cfacbabd |
|
02-Mar-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Fix crash in pipe mode Currently we can crash perf record when running in pipe mode, like: $ perf record ls | perf report # To display the perf.data header info, please use --header/--header-only options. # perf: Segmentation fault Error: The - file has no samples! The callstack of the crash is: 0x0000000000515242 in perf_event__synthesize_event_update_name 3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]); (gdb) bt #0 0x0000000000515242 in perf_event__synthesize_event_update_name #1 0x00000000005158a4 in perf_event__synthesize_extra_attr #2 0x0000000000443347 in record__synthesize #3 0x00000000004438e3 in __cmd_record #4 0x000000000044514e in cmd_record #5 0x00000000004cbc95 in run_builtin #6 0x00000000004cbf02 in handle_internal_command #7 0x00000000004cc054 in run_argv #8 0x00000000004cc422 in main The reason of the crash is that the evsel does not have ids array allocated and the pipe's synthesize code tries to access it. We don't force evsel ids allocation when we have single event, because it's not needed. However we need it when we are in pipe mode even for single event as a key for evsel update event. Fixing this by forcing evsel ids allocation event for single event, when we are in pipe mode. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ad46e48c |
|
02-Mar-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Fix crash in pipe mode Currently we can crash perf record when running in pipe mode, like: $ perf record ls | perf report # To display the perf.data header info, please use --header/--header-only options. # perf: Segmentation fault Error: The - file has no samples! The callstack of the crash is: 0x0000000000515242 in perf_event__synthesize_event_update_name 3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]); (gdb) bt #0 0x0000000000515242 in perf_event__synthesize_event_update_name #1 0x00000000005158a4 in perf_event__synthesize_extra_attr #2 0x0000000000443347 in record__synthesize #3 0x00000000004438e3 in __cmd_record #4 0x000000000044514e in cmd_record #5 0x00000000004cbc95 in run_builtin #6 0x00000000004cbf02 in handle_internal_command #7 0x00000000004cc054 in run_argv #8 0x00000000004cc422 in main The reason of the crash is that the evsel does not have ids array allocated and the pipe's synthesize code tries to access it. We don't force evsel ids allocation when we have single event, because it's not needed. However we need it when we are in pipe mode even for single event as a key for evsel update event. Fixing this by forcing evsel ids allocation event for single event, when we are in pipe mode. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b09c2364 |
|
01-Mar-2018 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Throttle user defined frequencies to the maximum allowed # perf record -F 200000 sleep 1 warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz. The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate. The kernel will lower it when perf's interrupts take too long. Use --strict-freq to disable this throttling, refusing to record. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 For those wanting that it fails if the desired frequency can't be used: # perf record --strict-freq -F 200000 sleep 1 error: Maximum frequency rate (15,000 Hz) exceeded. Please use -F freq option with a lower value or consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. # Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1nzn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
67230479 |
|
01-Mar-2018 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Allow asking for the maximum allowed sample rate Add the handy '-F max' shortcut to reading and using the kernel.perf_event_max_sample_rate value as the user supplied sampling frequency: # perf record -F max sleep 1 info: Using a maximum frequency rate of 15,000 Hz [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ] # sysctl kernel.perf_event_max_sample_rate kernel.perf_event_max_sample_rate = 15000 # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # perf record -F 10 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (4 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 10, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4y0tiuws62c64gp4cf0hme0m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c3dec27b |
|
06-Feb-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Put new line after target override warning There's no new-line after target-override warning, now: $ perf record -a --per-thread Warning: SYSTEM/CPU switch overriding PER-THREAD^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ] with patch: $ perf record -a --per-thread Warning: SYSTEM/CPU switch overriding PER-THREAD ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ] Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 16ad2ffb822c ("perf tools: Introduce perf_target__strerror()") Link: http://lkml.kernel.org/r/20180206181813.10943-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f290aa1f |
|
01-Feb-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Fix period option handling Stephan reported we don't unset PERIOD sample type when --no-period is specified. Adding the unset check and reset PERIOD if --no-period is specified. Committer notes: Check the sample_type, it shouldn't have PERF_SAMPLE_PERIOD there when --no-period is used. Before: # perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # After: [root@jouet ~]# perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (17 samples) ] [root@jouet ~]# perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [root@jouet ~]# Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180201083812.11359-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
68588baf |
|
08-Dec-2017 |
Jin Yao <yao.jin@linux.intel.com> |
perf record: Record the first and last sample time in the header In the default 'perf record' configuration, all samples are processed, to create the HEADER_BUILD_ID table. So it's very easy to get the first/last samples and save the time to perf file header via the function write_sample_time(). Later, at post processing time, perf report/script will fetch the time from perf file header. Committer testing: # perf record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ] [root@jouet home]# perf report --header | grep "time of " # time of first sample : 22947.909226 # time of last sample : 22948.910704 # # perf report -D | grep PERF_RECORD_SAMPLE\( 0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa21b1af3 period: 1 addr: 0 0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa200d204 period: 1 addr: 0 <SNIP> 3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xffffffffa22071d8 period: 169518 addr: 0 0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 198807 addr: 0 2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 88111 addr: 0 # Changelog: v7: Just update the patch description according to Arnaldo's suggestion. v6: Currently '--buildid-all' is not enabled at default. So the walking on all samples is the default operation. There is no big overhead to calculate the timestamp boundary in process_sample_event handler once we already go through all samples. So the timestamp boundary calculation is enabled by default when '--buildid-all' is not enabled. While if '--buildid-all' is enabled, we creates a new option "--timestamp-boundary" for user to decide if it enables the timestamp boundary calculation. v5: There is an issue that the sample walking can only work when '--buildid-all' is not enabled. So we need to let the walking be able to work even if '--buildid-all' is enabled and let the processing skips the dso hit marking for this case. At first, I want to provide a new option "--record-time-boundaries". While after consideration, I think a new option is not very necessary. v3: Remove the definitions of first_sample_time and last_sample_time from struct record and directly save them in perf_evlist. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ca800068 |
|
13-Dec-2017 |
Mengting Zhang <zhangmengting@huawei.com> |
perf evsel: Enable ignore_missing_thread for pid option While monitoring a multithread process with pid option, perf sometimes may return sys_perf_event_open failure with 3(No such process) if any of the process's threads die before we open the event. However, we want perf continue monitoring the remaining threads and do not exit with error. Here, the patch enables perf_evsel::ignore_missing_thread for -p option to ignore complete failure if any of threads die before we open the event. But it may still return sys_perf_event_open failure with 22(Invalid) if we monitors several event groups. sys_perf_event_open: pid 28960 cpu 40 group_fd 118202 flags 0x8 sys_perf_event_open: pid 28961 cpu 40 group_fd 118203 flags 0x8 WARNING: Ignored open failure for pid 28962 sys_perf_event_open: pid 28962 cpu 40 group_fd [118203] flags 0x8 sys_perf_event_open failed, error -22 That is because when we ignore a missing thread, we change the thread_idx without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd() may return a wrong group_fd for the next thread and sys_perf_event_open() return with 22. sys_perf_event_open(){ ... if (group_fd != -1) perf_fget_light()//to get corresponding group_leader by group_fd ... if (group_leader) if (group_leader->ctx->task != ctx->task)//should on the same task goto err_context ... } This patch also fixes this bug by introducing perf_evsel__remove_fd() and update_fds to allow removing fds for the missing thread. Changes since v1: - Change group_fd__remove() into a more genetic way without changing code logic - Remove redundant condition Changes since v2: - Use a proper function name and add some comment. - Multiline comment style fixes. Committer testing: Before this patch the recently added 'perf stat --per-thread' for system wide counting would race while enumerating all threads using /proc: [root@jouet ~]# perf stat --per-thread failed to parse CPUs map: No such file or directory Usage: perf stat [<options>] [<command>] -C, --cpu <cpu> list of cpus to monitor in system-wide -a, --all-cpus system-wide collection from all CPUs [root@jouet ~]# perf stat --per-thread failed to parse CPUs map: No such file or directory Usage: perf stat [<options>] [<command>] -C, --cpu <cpu> list of cpus to monitor in system-wide -a, --all-cpus system-wide collection from all CPUs [root@jouet ~]# When, say, the kernel was being built, so lots of shortlived threads, after this patch this doesn't happen. Signed-off-by: Mengting Zhang <zhangmengting@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Cheng Jian <cj.chengjian@huawei.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com [ Remove one use 'evlist' alias variable ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3315d14f |
|
06-Dec-2017 |
Pravin Shedge <pravin.shedge4linux@gmail.com> |
perf perf: Remove duplicate includes These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0b72d69a |
|
04-Dec-2017 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and record Remove the backward/forward concept to make it uniform with user interface (the '--overwrite' option). Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Mengting Zhang <zhangmengting@huawei.com> Link: http://lkml.kernel.org/r/20171204165107.95327-4-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ca6a9a05 |
|
02-Dec-2017 |
Wang Nan <wangnan0@huawei.com> |
perf mmap: Remove overwrite from arguments list of perf_mmap__push 'overwrite' argument is always 'false'. Remove it from arguments list of perf_mmap__push(). Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-5-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
144b9a4f |
|
02-Dec-2017 |
Wang Nan <wangnan0@huawei.com> |
perf evlist: Remove evlist->overwrite evlist->overwrite is set to false in all users. It can be removed. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-4-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7a276ff6 |
|
02-Dec-2017 |
Wang Nan <wangnan0@huawei.com> |
perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap_ex All users of perf_evlist__mmap_ex set !overwrite. Remove it from its arguments list. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/20171203020044.81680-3-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
373565d2 |
|
17-Nov-2017 |
Andi Kleen <ak@linux.intel.com> |
perf record: Synthesize thread map and cpu map Synthesize the per attr thread maps and cpu maps in 'perf record'. This allows code from 'perf stat' called from 'perf script' to access this information. Committer testing: Please see the PERF_RECORD_THREAD_MAP and PERF_RECORD_CPU_MAP records, added by this patch: $ perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ] $ perf report -D | grep PERF_RECORD_ | head 0xe8 [0x20]: PERF_RECORD_TIME_CONV: unhandled! 0x108 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 23568 0x130 [0x18]: PERF_RECORD_CPU_MAP: 0-3 0 0x148 [0x28]: PERF_RECORD_COMM: perf:23568/23568 0x570 [0x8]: PERF_RECORD_FINISHED_ROUND 445342677837144 0x170 [0x28]: PERF_RECORD_COMM exec: sleep:23568/23568 445342677847339 0x198 [0x68]: PERF_RECORD_MMAP2 23568/23568: [0x564c943a4000(0x208000) @ 0 fd:00 3147174 2566255743]: r-xp /usr/bin/sleep 445342677862450 0x200 [0x70]: PERF_RECORD_MMAP2 23568/23568: [0x7f25968a8000(0x229000) @ 0 fd:00 3151761 2566238119]: r-xp /usr/lib64/ld-2.25.so 445342677873174 0x270 [0x60]: PERF_RECORD_MMAP2 23568/23568: [0x7ffc98176000(0x2000) @ 0 00:00 0 0]: r-xp [vdso] 445342677891928 0x2d0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4002): 23568/23568: 0xffffffff8f84c7e7 period: 1 addr: 0 $ Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20171117214300.32746-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bfd8f72c |
|
17-Nov-2017 |
Andi Kleen <ak@linux.intel.com> |
perf record: Synthesize unit/scale/... in event update Move the code to synthesize event updates for scale/unit/cpus to a common utility file, and use it both from stat and record. This allows to access scale and other extra qualifiers from perf script. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20171117214300.32746-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b0ebd811 |
|
14-Nov-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Ignore kptr_restrict when not sampling the kernel If we're not sampling the kernel, we shouldn't care about kptr_restrict neither synthesize anything for assisting in resolving kernel samples, like the reference relocation symbol or kernel modules information. Before: $ cat /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid 2 2 $ perf record sleep 1 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. Couldn't record kernel reference relocation symbol Symbol resolution may be skewed if relocation was used (e.g. kexec). Check /proc/kallsyms permission or run as root. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ] $ perf evlist -v cycles:uppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 $ After: $ perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (10 samples) ] $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-t025e9zftbx2b8cq2w01g5e5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
dffdcbdb |
|
03-Nov-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Generate PERF_RECORD_{MMAP,COMM,EXEC} with --delay When we use an initial delay, e.g.: 'perf record --delay 1000', we do not enable the events until that delay has passed after we started the workload, including the tracking event, i.e. the one for which we have attr.mmap, etc, enabled to ask the kernel to generate the PERF_RECORD_{MMAP,COMM,EXEC} metadata events that will then allow us to resolve addresses in samples to the map, dso and symbol. There will be a shadow that even synthesizing samples won't cover, i.e. the workload that we start and other processes forking while we wait for the initial delay to expire. So use a dummy event to be the tracking one and make it be enabled on exec. Before: # perf record --delay 1000 stress --cpu 1 --timeout 5 stress: info: [9029] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd stress: info: [9029] successful run completed in 5s [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.624 MB perf.data (15908 samples) ] # perf script | head :9031 9031 32001.826888: 1 cycles:ppp: ffffffff831aa30d event_function (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826893: 1 cycles:ppp: ffffffff8300d1a0 intel_bts_enable_local (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826895: 7 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826897: 103 cycles:ppp: ffffffff8300c331 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826899: 1615 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826902: 26724 cycles:ppp: ffffffff8384c6a7 native_irq_return_iret (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826913: 329739 cycles:ppp: 7fb2a5410932 [unknown] ([unknown]) :9031 9031 32001.827033: 1225451 cycles:ppp: 7fb2a5410930 [unknown] ([unknown]) :9031 9031 32001.827474: 1391725 cycles:ppp: 7fb2a5410930 [unknown] ([unknown]) :9031 9031 32001.827978: 1233697 cycles:ppp: 7fb2a5410928 [unknown] ([unknown]) # After: # perf record --delay 1000 stress --cpu 1 --timeout 5 stress: info: [9741] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd stress: info: [9741] successful run completed in 5s [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.751 MB perf.data (15976 samples) ] # perf script | head stress 9742 32110.959106: 1 cycles:ppp: ffffffff831b26f6 __perf_event_task_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959110: 1 cycles:ppp: ffffffff8300c2e9 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959112: 7 cycles:ppp: ffffffff830231e0 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959115: 101 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959117: 1533 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959119: 23992 cycles:ppp: ffffffff831b0900 ctx_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959129: 329406 cycles:ppp: 7f4b1b661930 __random_r (/usr/lib64/libc-2.25.so) stress 9742 32110.959249: 1288322 cycles:ppp: 5566e1e7cbc9 hogcpu (/usr/bin/stress) stress 9742 32110.959712: 1464046 cycles:ppp: 7f4b1b66179e __random (/usr/lib64/libc-2.25.so) stress 9742 32110.960241: 1266918 cycles:ppp: 7f4b1b66195b __random_r (/usr/lib64/libc-2.25.so) # Reported-by: Bram Stolk <b.stolk@gmail.com> Tested-by: Bram Stolk <b.stolk@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 6619a53ef757 ("perf record: Add --initial-delay option") Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6c443954 |
|
14-Nov-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Ignore kptr_restrict when not sampling the kernel If we're not sampling the kernel, we shouldn't care about kptr_restrict neither synthesize anything for assisting in resolving kernel samples, like the reference relocation symbol or kernel modules information. Before: $ cat /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid 2 2 $ perf record sleep 1 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. Couldn't record kernel reference relocation symbol Symbol resolution may be skewed if relocation was used (e.g. kexec). Check /proc/kallsyms permission or run as root. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ] $ perf evlist -v cycles:uppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 $ After: $ perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (10 samples) ] $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-t025e9zftbx2b8cq2w01g5e5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d3dbf43c |
|
03-Nov-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Generate PERF_RECORD_{MMAP,COMM,EXEC} with --delay When we use an initial delay, e.g.: 'perf record --delay 1000', we do not enable the events until that delay has passed after we started the workload, including the tracking event, i.e. the one for which we have attr.mmap, etc, enabled to ask the kernel to generate the PERF_RECORD_{MMAP,COMM,EXEC} metadata events that will then allow us to resolve addresses in samples to the map, dso and symbol. There will be a shadow that even synthesizing samples won't cover, i.e. the workload that we start and other processes forking while we wait for the initial delay to expire. So use a dummy event to be the tracking one and make it be enabled on exec. Before: # perf record --delay 1000 stress --cpu 1 --timeout 5 stress: info: [9029] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd stress: info: [9029] successful run completed in 5s [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.624 MB perf.data (15908 samples) ] # perf script | head :9031 9031 32001.826888: 1 cycles:ppp: ffffffff831aa30d event_function (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826893: 1 cycles:ppp: ffffffff8300d1a0 intel_bts_enable_local (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826895: 7 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826897: 103 cycles:ppp: ffffffff8300c331 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826899: 1615 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826902: 26724 cycles:ppp: ffffffff8384c6a7 native_irq_return_iret (/lib/modules/4.14.0-rc6+/build/vmlinux) :9031 9031 32001.826913: 329739 cycles:ppp: 7fb2a5410932 [unknown] ([unknown]) :9031 9031 32001.827033: 1225451 cycles:ppp: 7fb2a5410930 [unknown] ([unknown]) :9031 9031 32001.827474: 1391725 cycles:ppp: 7fb2a5410930 [unknown] ([unknown]) :9031 9031 32001.827978: 1233697 cycles:ppp: 7fb2a5410928 [unknown] ([unknown]) # After: # perf record --delay 1000 stress --cpu 1 --timeout 5 stress: info: [9741] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd stress: info: [9741] successful run completed in 5s [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.751 MB perf.data (15976 samples) ] # perf script | head stress 9742 32110.959106: 1 cycles:ppp: ffffffff831b26f6 __perf_event_task_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959110: 1 cycles:ppp: ffffffff8300c2e9 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959112: 7 cycles:ppp: ffffffff830231e0 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959115: 101 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959117: 1533 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959119: 23992 cycles:ppp: ffffffff831b0900 ctx_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux) stress 9742 32110.959129: 329406 cycles:ppp: 7f4b1b661930 __random_r (/usr/lib64/libc-2.25.so) stress 9742 32110.959249: 1288322 cycles:ppp: 5566e1e7cbc9 hogcpu (/usr/bin/stress) stress 9742 32110.959712: 1464046 cycles:ppp: 7f4b1b66179e __random (/usr/lib64/libc-2.25.so) stress 9742 32110.960241: 1266918 cycles:ppp: 7f4b1b66195b __random_r (/usr/lib64/libc-2.25.so) # Reported-by: Bram Stolk <b.stolk@gmail.com> Tested-by: Bram Stolk <b.stolk@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 6619a53ef757 ("perf record: Add --initial-delay option") Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
eae8ad80 |
|
23-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add struct perf_data_file Add struct perf_data_file to represent a single file within a perf_data struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-c3f9p4xzykr845ktqcek6p4t@git.kernel.org [ Fixup recent changes in 'perf script --per-event-dump' ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8ceb41d7 |
|
23-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Rename struct perf_data_file to perf_data Rename struct perf_data_file to perf_data, because we will add the possibility to have multiple files under perf.data, so the 'perf_data' name fits better. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-39wn4d77phel3dgkzo3lyan0@git.kernel.org [ Fixup recent changes in 'perf script --per-event-dump' ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
73c17d81 |
|
06-Oct-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf mmap: Adopt push method from builtin-record.c The previous prep patch was just to show exactly what changed in that function, now its time to move that method and things only it uses to the right place, mmap.[ch] Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-aaxywfgw3d44x6xlu8zm1avu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d37f1586 |
|
05-Oct-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Make record__mmap_read generic It becomes a perf_mmap method, "push", that build reads from a mmap and "pushes" it to a consumer, that in the initial case, for 'perf record', just writes it to the perf.data file descriptor, but may be used by 'top', etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-u4l1qjbi6l76r2k0nv99220n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
340b47f5 |
|
29-Sep-2017 |
Kan Liang <kan.liang@intel.com> |
perf top: Implement multithreading for perf_event__synthesize_threads The proc files which is sorted with alphabetical order are evenly assigned to several synthesize threads to be processed in parallel. For 'perf top', the threads number hard code to online CPU number. The following patch will introduce an option to set it. For other perf tools, the thread number is 1. Because the process function is not ready for multithreading, e.g. process_synthesized_event. This patch series only support event synthesize multithreading for 'perf top'. For other tools, it can be done separately later. With multithread applied, the total processing time can get up to 1.56x speedup on Knights Mill for 'perf top'. For specific single event processing, the processing time could increase because of the lock contention. So proc_map_timeout may need to be increased. Otherwise some proc maps will be truncated. Based on my test, increasing the proc_map_timeout has small impact on the total processing time. The total processing time still get 1.49x speedup on Knights Mill after increasing the proc_map_timeout. The patch itself doesn't increase the proc_map_timeout. Doesn't need to implement multithreading for per task monitoring, perf_event__synthesize_thread_map. It doesn't have performance issue. Committer testing: # getconf _NPROCESSORS_ONLN 4 # perf trace --no-inherit -e clone -o /tmp/output perf top # tail -4 /tmp/bla 0.124 ( 0.041 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eb3a8f30, parent_tidptr: 0x7fc3eb3a99d0, child_tidptr: 0x7fc3eb3a99d0, tls: 0x7fc3eb3a9700) = 9548 (perf) 0.246 ( 0.023 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eaba7f30, parent_tidptr: 0x7fc3eaba89d0, child_tidptr: 0x7fc3eaba89d0, tls: 0x7fc3eaba8700) = 9549 (perf) 0.286 ( 0.019 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9550 (perf) 246.540 ( 0.047 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9551 (perf) # Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1506696477-146932-4-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
84c41742 |
|
05-Sep-2017 |
Andi Kleen <ak@linux.intel.com> |
perf record: Support direct --user-regs arguments USER_REGS can currently only collected implicitely with call graph recording. Sometimes it is useful to see them separately, and filter them. Add a new --user-regs option to record that is similar to --intr-regs, but acts on user regs. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170905170029.19722-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3b0a5daa |
|
29-Aug-2017 |
Kan Liang <kan.liang@intel.com> |
perf tools: Support new sample type for physical address Support new sample type PERF_SAMPLE_PHYS_ADDR for physical address. Add new option --phys-data to record sample physical address. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-2-git-send-email-kan.liang@intel.com [ Added missing printing in evsel.c patch sent by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e9def1b2 |
|
17-Jul-2017 |
David Carrillo-Cisneros <davidcc@google.com> |
perf tools: Add feature header record to pipe-mode Add header record types to pipe-mode, reusing the functions used in file-mode and leveraging the new struct feat_fd. For alignment, check that synthesized events don't exceed pagesize. Add the perf_event__synthesize_feature event call back to process the new header records. Before this patch: $ perf record -o - -e cycles sleep 1 | perf report --stdio --header [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] ... After this patch: $ perf record -o - -e cycles sleep 1 | perf report --stdio --header # ======== # captured on: Mon May 22 16:33:43 2017 # ======== # # hostname : my_hostname # os release : 4.11.0-dbx-up_perf # perf version : 4.11.rc6.g6277c80 # arch : x86_64 # nrcpus online : 72 # nrcpus avail : 72 # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz # cpuid : GenuineIntel,6,63,2 # total memory : 263457192 kB # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1 # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] ... Support added for the subcommands: report, inject, annotate and script. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4b4cd503 |
|
03-Jul-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Do not ask for precise_ip with --no-samples When the user doesn't specify an event with -e/--event, 'perf record' will use as a default the "cycles" event with the highest level of precision in perf_event_attr.precise_ip, but --no-samples, if present, is incompatible with precise_ip != 0, so use the newly introduced __perf_event__add_default(precise = false) to fix that: Before: # perf record -n usleep 1 Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles:ppp). /bin/dmesg may provide additional information. No CONFIG_PERF_EVENTS=y kernel support configured? # After: # perf record -n usleep 1 Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data ] [root@jouet /]# perf evlist -v cycles: size: 112, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [root@jouet /]# Reported-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-q991fw6s6rhjvrd5ye4t7qom@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
62d94b00 |
|
27-Jun-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Replace error() with pr_err() To consolidate the error reporting facility. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-b41iot1094katoffdf19w9zk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4208735d |
|
19-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Remove poll.h and wait.h from util.h Not needed in this header, added to the places that need poll(), wait() and a few other prototypes. Link: http://lkml.kernel.org/n/tip-i39c7b6xmo1vwd9wxp6fmkl0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c5e4027e |
|
19-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move timestamp routines from util.h to time-utils.h We already have a header for time utilities, so use it. Link: http://lkml.kernel.org/n/tip-sijzpbvutlg0c3oxn49hy9ca@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
58db1d6e |
|
19-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move units conversion/formatting routines to separate object Out of util.h, to disentangle it a bit more. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vpksyj3w5fk9t8s6mxmkajyr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9607ad3a |
|
19-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Add signal.h to places using its definitions And remove it from util.h, disentangling it a bit more. Link: http://lkml.kernel.org/n/tip-2zg9s5nx90yde64j3g4z2uhk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a43783ae |
|
18-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Include errno.h where needed Removing it from util.h, part of an effort to disentangle the includes hell, that makes changes to util.h or something included by it to cause a complete rebuild of the tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ztrjy52q1rqcchuy3rubfgt2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fd20e811 |
|
17-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Including missing inttypes.h header Needed to use the PRI[xu](32,64) formatting macros. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wkbho8kaw24q67dd11q0j39f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b0ad8ea6 |
|
27-Mar-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Remove unused 'prefix' from builtin functions We got it from the git sources but never used it for anything, with the place where this would be somehow used remaining: static int run_builtin(struct cmd_struct *p, int argc, const char **argv) { prefix = NULL; if (p->option & RUN_SETUP) prefix = NULL; /* setup_perf_directory(); */ Ditch it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uw5swz05vol0qpr32c5lpvus@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e907caf3 |
|
07-Mar-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
perf record: Synthesize namespace events for current processes Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior to invocation of perf record. The data for this is taken from /proc/$PID/ns. These changes make way for analyzing events with regard to namespaces. Committer notes: Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the test__mmap_thread_lookup case, i.e. 'perf test Lookup mmap thread". Testing it: # ps axH > /tmp/allthreads # perf record -a --namespaces usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ] # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l 602 # wc -l /tmp/allthreads 601 /tmp/allthreads # tail /tmp/allthreads 16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^ 17176 pts/4 T 0:00 git commit --amend --no-post-rewrite 17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG 18939 ? S 0:00 [kworker/2:1] 18947 ? S 0:00 [kworker/3:0] 18974 ? S 0:00 [kworker/1:0] 19047 ? S 0:00 [kworker/0:1] 19152 pts/6 S+ 0:00 weechat 19153 pts/7 R+ 0:00 ps axH # perf report -D | grep PERF_RECORD_NAMESPACES | tail 0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7 0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7 0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7 0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7 0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7 0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7 0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7 0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7 0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7 # Humm, investigate why we got two record for the 19155 pid/tid... Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f3b3614a |
|
07-Mar-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info Introduce a new option to record PERF_RECORD_NAMESPACES events emitted by the kernel when fork, clone, setns or unshare are invoked. And update perf-record documentation with the new option to record namespace events. Committer notes: Combined it with a later patch to allow printing it via 'perf report -D' and be able to test the feature introduced in this patch. Had to move here also perf_ns__name(), that was introduced in another later patch. Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt: util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=] ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx ^ Testing it: # perf record --namespaces -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ] # # perf report -D <SNIP> 3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] 0x1151e0 [0x30]: event: 9 . . ... raw event: size 48 bytes . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h.... . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c.... . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................ <SNIP> NAMESPACES events: 1 <SNIP> # Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
68ba3235 |
|
17-Feb-2017 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Honor --quiet option properly It should call perf_quiet_option() to suppress messages. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170217081742.17417-7-namhyung@kernel.org [ Fix merge clash with 483635a9d080 ("perf record: Add -a as default target") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bb963e16 |
|
17-Feb-2017 |
Namhyung Kim <namhyung@kernel.org> |
perf utils: Check verbose flag properly It now can have negative value to suppress the message entirely. So it needs to check it being positive. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170217081742.17417-3-namhyung@kernel.org [ Adjust fuzz on tools/perf/util/pmu.c, add > 0 checks in many other places ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
483635a9 |
|
17-Feb-2017 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Add -a as default target Running 'perf record' with no target (-a, -p, -t, etc) will now collect system wide data. Commiter notes: Testing it: [root@jouet ~]# perf record ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.351 MB perf.data (366 samples) ] # is equivalent to: # perf record -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.411 MB perf.data (978 samples) ] # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170217170018.GA15389@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9d6aae72 |
|
14-Feb-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Do not put a variable sized type not at the end of a struct As this is a GNU extension and while harmless in this case, we can do the same thing in a more clearer way by using an existing thread_map constructor. With this we avoid this while compiling with clang: builtin-record.c:659:21: error: field 'map' with variable sized type 'struct thread_map' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end] struct thread_map map; ^ 1 error generated. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-c9drclo52ezxmwa7qxklin2y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d6195a6a |
|
13-Feb-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Inform how to make a sysctl setting permanent When a tool can't open counters due to the kernel.perf_event_paranoit sysctl setting, we inform how to tweak it to allow the operation to succeed, in addition to that, suggest setting /etc/sysctl.conf to make the setting permanent. Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-4gwe99k4a6p12d4u8bbyttj2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ecc4c561 |
|
24-Jan-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Propagate perf_config() errors Previously these were being ignored, sometimes silently. Stop doing that, emitting debug messages and handling the errors. Testing it: $ cat ~/.perfconfig cat: /home/acme/.perfconfig: No such file or directory $ perf stat -e cycles usleep 1 Performance counter stats for 'usleep 1': 938,996 cycles:u 0.003813731 seconds time elapsed $ perf top --stdio Error: You may not have permission to collect system-wide stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, <SNIP> [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ] [acme@jouet linux]$ perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # Overhead Command Shared Object Symbol # ........ ....... ................. ......................... 71.77% usleep libc-2.24.so [.] _dl_addr 27.07% usleep ld-2.24.so [.] _dl_next_ld_env_entry 1.13% usleep [kernel.kallsyms] [k] page_fault $ $ touch ~/.perfconfig $ ls -la ~/.perfconfig -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig $ $ perf stat -e instructions usleep 1 Performance counter stats for 'usleep 1': 244,610 instructions:u 0.000805383 seconds time elapsed $ [root@jouet ~]# chown acme.acme ~/.perfconfig [root@jouet ~]# perf stat -e cycles usleep 1 Warning: File /root/.perfconfig not owned by current user or root, ignoring it. Performance counter stats for 'usleep 1': 937,615 cycles 0.000836931 seconds time elapsed # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bfacbe3b |
|
09-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Add switch-output time option argument It's now possible to specify the threshold time for perf.data like: $ perf record --switch-output=30s ... Once it's reached, the current data are dumped in to the perf.data.<timestamp> file and session does on. $ perf record --switch-output=30s ... [ perf record: dump data: Woken up 44 times ] [ perf record: Dump perf.data.2017010213043746 ] ... The time is expected to be a number with appended unit character - s/m/h/d. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0c582449 |
|
09-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Add switch-output size warning Adding switch-output size warning if the requested size of lower than the wakeup ring buffer size. $ perf record --switch-output=1K ls WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1483955520-29063-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
dc0c6127 |
|
09-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Add switch-output size option argument It's now possible to specify the threshold size for perf.data like: $ perf record --switch-output=2G ... Once it's reached, the current data are dumped in to the perf.data.<timestamp> file and session does on. $ perf record --switch-output=2G ... [ perf record: dump data: Woken up 7244 times ] [ perf record: Dump perf.data.2017010214093746 ] ... The size is expected to be a number with appended unit character - B/K/M/G. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cb4e1ebb |
|
09-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Change switch-output option to take optional argument Next patches will add --switch-output option arguments, changing the option to allow that and adding its default value to 'signal'. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1b43b704 |
|
09-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Add struct switch_output Next patches will add more --switch-output option arguments, so preparing the data holder. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
60437ac0 |
|
03-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Fix --switch-output documentation and comment There's no --signal-trigger option, also adding the code comment into record man page. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483431600-19887-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
efd21307 |
|
03-Jan-2017 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Make __record_options static There's no need for this one to be global. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483431600-19887-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
23dc4f15 |
|
12-Dec-2016 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Force ignore_missing_thread for uid option Enable perf_evsel::ignore_missing_thread for -u option to ignore complete failure if any of the user's processes die between its enumeration and time we open the event. Committer notes: While doing a 'make -j4 allmodconfig' we sometimes get into the race: Before: # perf record -u acme Error: The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles:ppp). /bin/dmesg may provide additional information. No CONFIG_PERF_EVENTS=y kernel support configured? # After: [root@jouet ~]# perf record -u acme WARNING: Ignored open failure for pid 9888 WARNING: Ignored open failure for pid 18059 [root@jouet ~]# Which is an improvement, with the races not preventing the remaining threads for the specified user from being monitored, but the message probably needs further clarification. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1481538943-21874-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a074865e |
|
26-Nov-2016 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Introduce perf hooks Perf hooks allow hooking user code at perf events. They can be used for manipulation of BPF maps, taking snapshot and reporting results. In this patch two perf hook points are introduced: record_start and record_end. To avoid buggy user actions, a SIGSEGV signal handler is introduced into 'perf record'. It turns off perf hook if it causes a segfault and report an error to help debugging. A test case for perf hook is introduced. Test result: $ ./buildperf/perf test -v hook 50: Test perf hooks : --- start --- test child forked, pid 10311 SIGSEGV is observed as expected, try to recover. Fatal error (SEGFAULT) in perf hook 'test' test child finished with 0 ---- end ---- Test perf hooks: Ok Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-5-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1b36c03e |
|
23-Sep-2016 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add support for using symbols in address filters Symbols come from either the DSO or /proc/kallsyms for the kernel. Details of the functionality can be found in Documentation/perf-record.txt. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5c01ad60 |
|
23-Sep-2016 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Fix error paths Some error paths do not tidy-up. Fix that. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
394c01ed |
|
23-Sep-2016 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Rename label 'out_symbol_exit' In preparation for fixing the error paths, rename label 'out_symbol_exit' to be 'out' because that error path can be used irrespective of whether symbols (or anything else) has been initialized. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5d8bb1ec |
|
16-Sep-2016 |
Mathieu Poirier <mathieu.poirier@linaro.org> |
perf tools: Add PMU configuration to tools Now that the required mechanic is there to deal with PMU specific configuration, add the functionality to the tools where events can be selected. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1474041004-13956-7-git-send-email-mathieu.poirier@linaro.org [ Fix the build on XSI-compliant systems, using str_error_r() to make sure we return a string, not an integer ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5e30d55c |
|
22-Aug-2016 |
Colin Ian King <colin.king@canonical.com> |
perf record: Fix spelling mistake "Finshed" -> "Finished" Trivial fix to spelling mistake in pr_debug message. Signed-off-by: Colin King <colin.king@canonical.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160822183008.26368-1-colin.king@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0693e680 |
|
08-Aug-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Use USEC_PER_MSEC Instead of a naked 1000. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-7v6be7jhvstbkvk3rsytjw0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b6f35ed7 |
|
01-Aug-2016 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Add --sample-cpu option Adding --sample-cpu option to be able to explicitly enable CPU sample type. Currently it's only enable implicitly in case the target is cpu related. It will be useful for following c2c record tool. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1470074555-24889-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4ea648ae |
|
14-Jul-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Add --tail-synthesize option When working with overwritable ring buffer there's a inconvenience problem: if perf dumps data after a long period after it starts, non-sample events may lost, which makes following 'perf report' unable to identify proc name and mmap layout. For example: # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output \ dd if=/dev/zero of=/dev/null send SIGUSR2 after dd runs long enough. The resuling perf.data lost correct comm and mmap events: # perf script -i perf.data.2016061522374354 perf 24478 [004] 2581325.601789: raw_syscalls:sys_exit: NR 0 = 512 ^^^^ Should be 'dd' 27b2e8 syscall_slow_exit_work+0xfe2000e3 (/lib/modules/4.6.0-rc3+/build/vmlinux) 203cc7 do_syscall_64+0xfe200117 (/lib/modules/4.6.0-rc3+/build/vmlinux) b18d83 return_from_SYSCALL_64+0xfe200000 (/lib/modules/4.6.0-rc3+/build/vmlinux) 7f47c417edf0 [unknown] ([unknown]) ^^^^^^^^^^^^ Fail to unwind This patch provides a '--tail-synthesize' option, allows perf to collect system status when finalizing output file. In resuling output file, the non-sample events reflect system status when dumping data. After this patch: # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \ dd if=/dev/zero of=/dev/null # perf script -i perf.data.2016061600544998 dd 27364 [004] 2583244.994464: raw_syscalls:sys_enter: NR 1 (1, ... ^^ Correct comm 203a18 syscall_trace_enter_phase2+0xfe2001a8 ([kernel.kallsyms]) 203aa5 syscall_trace_enter+0xfe200055 ([kernel.kallsyms]) 203caa do_syscall_64+0xfe2000fa ([kernel.kallsyms]) b18d83 return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms]) d8e50 __GI___libc_write+0xffff01d9639f4010 (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so) ^^^^^ Correct unwind This option doesn't aim to solve this problem completely. If a process terminates before SIGUSR2, we still lost its COMM and MMAP events. For example, we can't unwind correctly from the final perf.data we get from the previous example, because when perf collects the final output file (when we press C-c), 'dd' has been terminated so its '/proc/<pid>/mmap' becomes empty. However, this is a cheaper choice. To completely solve this problem we need to continously output non-sample events. To satisify the requirement of daemonization, we need to merge them periodically. It is possible but requires much more code and cycles. Automatically select --tail-synthesize when --overwrite is provided. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-16-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
626a6b78 |
|
14-Jul-2016 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Enable overwrite settings This patch allows following config terms and option: Globally setting events to overwrite; # perf record --overwrite ... Set specific events to be overwrite or no-overwrite. # perf record --event cycles/overwrite/ ... # perf record --event cycles/no-overwrite/ ... Add missing config terms and update the config term array size because the longest string length has changed. For overwritable events, it automatically selects attr.write_backward since perf requires it to be backward for reading. Test result: # perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ] # perf evlist -v syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-14-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
05737464 |
|
14-Jul-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Read from overwritable ring buffer Drive the evlist->bkw_mmap_state state machine during draining and when SIGUSR2 is received. Read the backward ring buffer in record__mmap_read_all. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-12-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
54cc54de |
|
14-Jul-2016 |
Wang Nan <wangnan0@huawei.com> |
perf evlist: Setup backward mmap state machine Introduce a bkw_mmap_state state machine to evlist: .________________(forbid)_____________. | V NOTREADY --(0)--> RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY ^ ^ | ^ | | |__(forbid)____/ |___(forbid)___/| | | \_________________(3)_______________/ NOTREADY : Backward ring buffers are not ready RUNNING : Backward ring buffers are recording DATA_PENDING : We are required to collect data from backward ring buffers EMPTY : We have collected data from backward ring buffers. (0): Setup backward ring buffer (1): Pause ring buffers for reading (2): Read from ring buffers (3): Resume ring buffers for recording We can't avoid this complexity. Since we deliberately drop records from overwritable ring buffer, there's no way for us to check remaining from ring buffer itself (by checking head and old pointers). Therefore, we need DATA_PENDING and EMPTY state to help us recording what we have done to the ring buffer. In record__mmap_read_evlist(), drive this state machine from DATA_PENDING to EMPTY. In perf_evlist__mmap_per_evsel(), drive this state machine from NOTREADY to RUNNING when creating backward mmap. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-11-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b2cb615d |
|
14-Jul-2016 |
Wang Nan <wangnan0@huawei.com> |
perf evlist: Introduce backward_mmap array for evlist Add backward_mmap to evlist, free it together with normal mmap. Improve perf_evlist__pick_pc(), search backward_mmap if evlist->mmap is not available. This patch doesn't alloc this array. It will be allocated conditionally in the following commits. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-8-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a4ea0ec4 |
|
14-Jul-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Decouple record__mmap_read() and evlist. Perf evlist will have multiple mmap arrays. Update record__mmap_read(): it should read from 'struct perf_mmap' directly. Also, make record__mmap_read() ready to read from backward ring buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c8b5f2c9 |
|
06-Jul-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools: Introduce str_error_r() The tools so far have been using the strerror_r() GNU variant, that returns a string, be it the buffer passed or something else. But that, besides being tricky in cases where we expect that the function using strerror_r() returns the error formatted in a provided buffer (we have to check if it returned something else and copy that instead), breaks the build on systems not using glibc, like Alpine Linux, where musl libc is used. So, introduce yet another wrapper, str_error_r(), that has the GNU interface, but uses the portable XSI variant of strerror_r(), so that users rest asured that the provided buffer is used and it is what is returned. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-d4t42fnf48ytlk8rjxs822tf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ee667f94 |
|
27-Jun-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Prepare picking perf_event_mmap_page from multiple evlists Following commits introduce new evlists to record. This patch adjusts record__pick_pc() and introduces perf_evlist__pick_pc() to read control page from one specific evlist. record__pick_pc() will be improved to search control page from multiple evlists. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1467023052-146749-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cb21686b |
|
27-Jun-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Prepare reading from multiple evlists in record__mmap_read_all() Following commits introduce new evlists to record. This patch adjusts record__mmap_read_all() and record__mmap_read(): converting original record__mmap_read_all() to record__mmap_read_evlist(), read from one evlist; makes record__mmap_read() reading from specific evlist. record__mmap_read_all() will be improved to read from multiple evlists. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1467023052-146749-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cda57a8c |
|
27-Jun-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Move mmap setup block to separate function Following commits introduce multiple evlists to record. This patch extracts perf_evlist__mmap_ex() processing to a new function, creates record__mmap() and record__mmap_evlist() to wrap perf_evlist__mmap_ex() and its error processing. They will be improvemented to create mmap for all evlists. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1467023052-146749-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e5cadb93 |
|
23-Jun-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Rename for_each() macros to for_each_entry() To match the semantics for list.h in the kernel, that are used to implement those macros. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qbcjlgj0ffxquxscahbpddi3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
41840d21 |
|
23-Jun-2016 |
Taeung Song <treeze.taeung@gmail.com> |
perf config: Move config declarations from util/cache.h to util/config.h Lately util/config.h has been added but util/cache.h has declarations of functions and a global variable for config features. To manage codes about configuration at one spot, move them to util/config.h and let source files that need config features include config.h And if the source files that included previous cache.h need only config.h, remove including cache.h. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1466672119-4852-2-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0aab2136 |
|
16-Jun-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Add --dry-run option to check cmdline options With '--dry-run', 'perf record' doesn't do reall recording. Combine with llvm.dump-obj option, --dry-run can be used to help compile BPF objects for embedded platform. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1466064161-48553-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c45628b0 |
|
23-May-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Robustify perf_event__synth_time_conv() It is possible that all events in an evlist are overwritable. perf_event__synth_time_conv() should not crash in this case. record__pick_pc() is used to check avaliability. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3a62a7b8 |
|
23-May-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Read from backward ring buffer Introduce rb_find_range() to find start and end position from a backward ring buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463987628-163563-5-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
09fa4f40 |
|
23-May-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Rename variable to make code clear record__mmap_read() writes data from ring buffer into perf.data. 'head' is maintained by the kernel, points to the last written record. 'old' is maintained by perf, points to the record read in previous round. record__mmap_read() saves data from 'old' to 'head' to perf.data. The names of these variables are not very intutive. In addition, when dealing with backward writing ring buffer, the md->prev pointer should point to 'head' instead of the last byte it got. Add 'start' and 'end' pointer to make code clear and set md->prev to 'head' instead of the moved 'old' pointer. This patch doesn't change behavior since: buf = &data[old & md->mask]; size = head - old; old += size; <--- Here, old == head Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463987628-163563-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2d11c650 |
|
23-May-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Prevent reading invalid data in record__mmap_read When record__mmap_read() requires data more than the size of ring buffer, drop those data to avoid accessing invalid memory. This can happen when reading from overwritable ring buffer, which should be avoided. However, check this for robustness. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463987628-163563-3-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
be7b0c9e |
|
20-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Generate tracking events for process forked by perf With 'perf record --switch-output' without -a, record__synthesize() in record__switch_output() won't generate tracking events because there's no thread_map in evlist. Which causes newly created perf.data doesn't contain map and comm information. This patch creates a fake thread_map and directly call perf_event__synthesize_thread_map() for those events. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-8-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0c1d46a8 |
|
20-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Disable buildid cache options by default in switch output mode The cost of buildid cache processing is high: reading all events in output perf.data, opening each elf file to read buildids then copying them into ~/.debug directory. In switch output mode, these heavy works block perf from receiving perf events for too long. Enable no-buildid and no-buildid-cache by default if --switch-output is provided. Still allow user use --no-no-buildid to explicitly enable buildid in this case. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-6-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Updated man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
eca857ab |
|
20-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Force enable --timestamp-filename when --switch-output is provided Without this patch, the last output doesn't have timestamp appended if --timestamp-filename is not explicitly provided. For example: # perf record -a --switch-output & [1] 11224 # kill -s SIGUSR2 11224 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622372823 ] # fg perf record -a --switch-output ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ] # ls -l total 836 -rw------- 1 root root 33256 Dec 26 22:37 perf.data <---- *Odd* -rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-5-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Updated man page, that also got an entry for --timestamp-filename ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3c1cb7e3 |
|
20-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Split output into multiple files via '--switch-output' Allow 'perf record' to split its output into multiple files. For example: # ~/perf record -a --timestamp-filename --switch-output & [1] 10763 # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622314468 ] # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622314762 ] # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] #[ perf record: Dump perf.data.2015122622315171 ] # fg perf record -a --timestamp-filename --switch-output ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2015122622315513 ] [ perf record: Captured and wrote 0.014 MB perf.data.<timestamp> (296 samples) ] # ls -l total 920 -rw------- 1 root root 797692 Dec 26 22:31 perf.data.2015122622314468 -rw------- 1 root root 59960 Dec 26 22:31 perf.data.2015122622314762 -rw------- 1 root root 59912 Dec 26 22:31 perf.data.2015122622315171 -rw------- 1 root root 19220 Dec 26 22:31 perf.data.2015122622315513 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Added man page entry, used the re-synthesize patch in this series as a fixup ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5f9cf599 |
|
20-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Derive trigger class from auxtrace_snapshot auxtrace_snapshot_state matches the trigger model. Use trigger to implement it. auxtrace_snapshot_state and auxtrace_snapshot_err are absorbed. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2ddd5c04 |
|
17-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Ditch record_opts.callgraph_set We have callchain_param.enabled for that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-silwqjc2t25ls42dsvg28pp5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0883e820 |
|
15-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Export record_opts based callchain parsing helper To be able to call it outside option parsing, like when setting a default --call-graph parameter in 'perf trace' when just --min-stack is used. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ecfd7a9c |
|
13-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Add '--timestamp-filename' option to append timestamp to output file name This option appends current timestamp to the output file name. For example: # perf record -a --timestamp-filename ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2015122622265847 ] [ perf record: Captured and wrote 0.742 MB perf.data.<timestamp> (90 samples) ] # ls perf.data.201512262226584 The timestamp will be useful for identifying each perf.data after the 'perf record' support for generating multiple output files gets introduced. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-5-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c0bdc1c4 |
|
13-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Turns auxtrace_snapshot_enable into 3 states auxtrace_snapshot_enable has only two states (0/1). Turns it into a triple states enum so SIGUSR2 handler can safely do other works without triggering auxtrace snapshot. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460535673-159866-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e68ae9cf |
|
11-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Do not use globals in config() Instead receive a callchain_param pointer to configure callchain aspects, not doing so if NULL is passed. This will allow fine grained control over which evsels in an evlist gets callchains enabled. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d7888573 |
|
08-Apr-2016 |
Wang Nan <wangnan0@huawei.com> |
perf bpf: Clone bpf stdout events in multiple bpf scripts This patch allows cloning bpf-output event configuration among multiple bpf scripts. If there exist a map named '__bpf_output__' and not configured using 'map:__bpf_output__.event=', this patch clones the configuration of another '__bpf_stdout__' map. For example, following command: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c usleep 100000 equals to: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \ usleep 100000 Signed-off-by: Wang Nan <wangnan0@huawei.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
46bc29b9 |
|
08-Mar-2016 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add time conversion event Intel PT uses the time members from the perf_event_mmap_page to convert between TSC and perf time. Due to a lack of foresight when Intel PT was implemented, those time members were recorded in the (implementation dependent) AUXTRACE_INFO event, the structure of which is generally inaccessible outside of the Intel PT decoder. However now the conversion between TSC and perf time is needed when processing a jitdump file when Intel PT has been used for tracing. So add a user event to record the time members. 'perf record' will synthesize the event if the information is available. And session processing will put a copy of the event on the session so that tools like 'perf inject' can easily access it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
95c36561 |
|
26-Feb-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Ensure return non-zero rc when mmap fail perf_evlist__mmap_ex() can fail without setting errno (for example, fail in condition checking. In this case all syscall is success). If this happen, record__open() incorrectly returns 0. Force setting rc is a quick way to avoid this problem, or we have to follow all possible code path in perf_evlist__mmap_ex() to make sure there's at least one system call before returning an error. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-30-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e1ab48ba |
|
26-Feb-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Introduce record__finish_output() to finish a perf.data Move code for finalizing 'perf.data' to record__finish_output(). It will be used by following commits to split output to multiple files. Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-23-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c45c86eb |
|
26-Feb-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Extract synthesize code to record__synthesize() Create record__synthesize(). It can be used to create tracking events for each perf.data after perf supporting splitting into multiple outputs. Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-20-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d8871ea7 |
|
26-Feb-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Use WARN_ONCE to replace 'if' condition Commits in a BPF patchkit will extract kernel and module synthesizing code into a separated function and call it multiple times. This patch replace 'if (err < 0)' using WARN_ONCE, makes sure the error message show one time. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456479154-136027-19-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8690a2a7 |
|
22-Feb-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Apply config to BPF objects before recording bpf__apply_obj_config() is introduced as the core API to apply object config options to all BPF objects. This patch also does the real work for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value stored in map's private field into the BPF map. This patch is required because we are not always able to set all BPF config during parsing. Further patch will set events created by perf to BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, which is not exist until perf_evsel__open(). bpf_map_foreach_key() is introduced to iterate over each key needs to be configured. This function would be extended to support more map types and different key settings. In perf record, before start recording, call bpf__apply_config() to turn on all BPF config options. Test result: # cat ./test_bpf_map_1.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; static void *(*map_lookup_elem)(struct bpf_map_def *, void *) = (void *)BPF_FUNC_map_lookup_elem; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = 1, }; SEC("func=sys_nanosleep") int func(void *ctx) { int key = 0; char fmt[] = "%d\n"; int *pval = map_lookup_elem(&channel, &key); if (!pval) return 0; trace_printk(fmt, sizeof(fmt), *pval); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ # echo "" > /sys/kernel/debug/tracing/trace # ./perf record -e './test_bpf_map_1.c/map:channel.value=11/' usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:8 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | usleep-18593 [007] d... 2394714.395539: : 11 # ./perf record -e './test_bpf_map_1.c/map:channel.value=101/' usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:8 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | usleep-18593 [007] d... 2394714.395539: : 11 usleep-19000 [006] d... 2394831.057840: : 101 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-6-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
85723885 |
|
15-Feb-2016 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Add --all-user/--all-kernel options Allow user to easily switch all events to user or kernel space with simple --all-user or --all-kernel options. This will be handy within perf mem/c2c wrappers to switch easily monitoring modes. Committer note: Testing it: # perf record --all-kernel --all-user -a sleep 2 Error: option `all-user' cannot be used with all-kernel Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] --all-user Configure all used events to run in user space. --all-kernel Configure all used events to run in kernel space. # perf record --all-user --all-kernel -a sleep 2 Error: option `all-kernel' cannot be used with all-user Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] --all-kernel Configure all used events to run in kernel space. --all-user Configure all used events to run in user space. # perf record --all-user -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.416 MB perf.data (162 samples) ] # perf report | grep '\[k\]' # perf record --all-kernel -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.423 MB perf.data (296 samples) ] # perf report | grep '\[\.\]' # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-2-git-send-email-jolsa@kernel.org [ Made those options to be mutually exclusive ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d2db9a98 |
|
25-Jan-2016 |
Wang Nan <wangnan0@huawei.com> |
perf record: Use OPT_BOOLEAN_SET for buildid cache related options 'perf record' knows whether buildid cache is enabled (via --no-no-buildid-cache) deliberately. Buildid cache can be turned off in some situations. Output switching support needs this feature to turn off buildid cache by default. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1453715801-7732-33-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6156681b |
|
11-Jan-2016 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Add --buildid-all option The --buildid-all option is to record build-id of all DSOs in the file. It might be very costly to postprocess samples to find which DSO hits. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1452519429-31779-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5c0cf224 |
|
07-Jan-2016 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Store data mmaps for dwarf unwind Currently we don't synthesize data mmap by default. It depends on -d option, that enables data address sampling. But we've seen cases (softice) where DWARF unwinder went through non executable mmaps, which we need to lookup in MAP__VARIABLE tree. Making data mmaps to be synthesized for dwarf unwind as well. Reported-by: Noel Grandin <noelgrandin@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20160107133022.GA32115@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ffa517ad |
|
25-Oct-2015 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Introduce stat perf.data header feature Introducing the 'stat' feature to mark a perf.data as created by the 'perf stat record' command. It contains no data. It's needed so that the report tools (report/script) can differentiate sampling data from counting data, because they need to be treated in a different way. In the future it might be used to store the version of the stat storage system used. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445784728-21732-28-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4b6ab94e |
|
15-Dec-2015 |
Josh Poimboeuf <jpoimboe@redhat.com> |
perf subcmd: Create subcmd library Move the subcommand-related files from perf to a new library named libsubcmd.a. Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to 'exec-cmd.*' to be consistent with the naming of all the other files. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7a29c087 |
|
14-Dec-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Add record.build-id config option Post processing at 'perf record' takes a long time on big machines. What it does is to find the build-id of binaries found in the event stream, so that it can make sure, at 'report' time, that the symtabs (be it ELF, kallsyms, etc) being used to resolve symbols are the ones matching the binaries found at 'record' time. Sometimes we just want to skip this processing of events at the end of the session to get quicker results, making sure the binaries haven't changed from 'record' to 'report' time. Add a new config option to control this behavior. The record.build-id config variable can have one of the following values: - cache: post-process data and save/update the binaries into the build-id cache (in ~/.debug). This is the default. - no-cache: post-process the data but not update the build-id cache. Same effect as using the -N option. - skip: skip post-processing and do not update the cache. Same effect as using the -B option. Reported-and-Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/1450144196-22957-1-git-send-email-namhyung@kernel.org [ Added some more text to the documentation ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7efe0e03 |
|
14-Dec-2015 |
He Kuang <hekuang@huawei.com> |
perf record: Support custom vmlinux path Make perf-record command support --vmlinux option if BPF_PROLOGUE is on. 'perf record' needs vmlinux as the source of DWARF info to generate prologue for BPF programs, so path of vmlinux should be specified. Short name 'k' has been taken by 'clockid'. This patch skips the short option name and uses '--vmlinux' for vmlinux path. Documentation is also updated. Test result: In a production (or broken) environment: (by: # rm -rf ~/.debug/ # mv /lib/modules/`uname -r`/build/vmlinux /tmp/ ) # ./perf record -e ./test_bpf_base.c ls Failed to find the path for kernel: No such file or directory event syntax error: './test_bpf_base.c' \___ You need to check probing points in BPF file ... # ./perf record --vmlinux /tmp/vmlinux -e ./test_bpf_base.c ls ... [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data ] Help messages when build with NO_LIBBPF: # ./perf record -h --transaction sample transaction flags (special events only) --vmlinux <file> vmlinux pathname (not built-in because NO_LIBBPF=1) # ./perf record --vmlinux /tmp/vmlinux ls / Warning: option `vmlinux' is being ignored because NO_LIBBPF=1 ... [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (11 samples) ] Help messages when build with NO_DWARF: # ./perf record -h --transaction sample transaction flags (special events only) --vmlinux <file> vmlinux pathname (not built-in because NO_DWARF=1) Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1450089563-122430-15-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
48e1cab1 |
|
14-Dec-2015 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Make options always available, even if required libs not linked This patch keeps options of perf builtins same in all conditions. If one option is disabled because of compiling options, users should be notified. Masami suggested another implementation in [1] that, by adding a OPTION_NEXT_DEPENDS option before those options in the 'struct option' array, options parser knows an option is disabled. However, in some cases this array is reordered (options__order()). In addition, in parse-option.c that array is const, so we can't simply merge information in decorator option into the affacted option. This patch chooses a simpler implementation that, introducing a set_option_nobuild() function and two option parsing flags. Builtins with such options should call set_option_nobuild() before option parsing. The complexity of this patch is because we want some of options can be skipped safely. In this case their arguments should also be consumed. Options in 'perf record' and 'perf probe' are fixed in this patch. [1] http://lkml.kernel.org/g/50399556C9727B4D88A595C8584AAB3752627CD4@GSjpTKYDCembx32.service.hitachi.net Test result: Normal case: # ./perf probe --vmlinux /tmp/vmlinux sys_write Added new event: probe:sys_write (on sys_write) You can now use it in all perf tools, such as: perf record -e probe:sys_write -aR sleep 1 Build with NO_DWARF=1: # ./perf probe -L sys_write Error: switch `L' is not available because NO_DWARF=1 Usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...] or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...] or: perf probe [<options>] --del '[GROUP:]EVENT' ... or: perf probe --list [GROUP:]EVENT ... or: perf probe [<options>] --funcs -L, --line <FUNC[:RLN[+NUM|-RLN2]]|SRC:ALN[+NUM|-ALN2]> Show source code lines. (not built-in because NO_DWARF=1) # ./perf probe -k /tmp/vmlinux sys_write Warning: switch `k' is being ignored because NO_DWARF=1 Added new event: probe:sys_write (on sys_write) You can now use it in all perf tools, such as: perf record -e probe:sys_write -aR sleep 1 # ./perf probe --vmlinux /tmp/vmlinux sys_write Warning: option `vmlinux' is being ignored because NO_DWARF=1 Added new event: [SNIP] # ./perf probe -l Usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...] or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...] ... -k, --vmlinux <file> vmlinux pathname (not built-in because NO_DWARF=1) -L, --line <FUNC[:RLN[+NUM|-RLN2]]|SRC:ALN[+NUM|-ALN2]> Show source code lines. (not built-in because NO_DWARF=1) ... -V, --vars <FUNC[@SRC][+OFF|%return|:RL|;PT]|SRC:AL|SRC;PT> Show accessible variables on PROBEDEF (not built-in because NO_DWARF=1) --externs Show external variables too (with --vars only) (not built-in because NO_DWARF=1) --no-inlines Don't search inlined functions (not built-in because NO_DWARF=1) --range Show variables location range in scope (with --vars only) (not built-in because NO_DWARF=1) Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1450089563-122430-14-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
71dc2326 |
|
13-Oct-2015 |
Wang Nan <wangnan0@huawei.com> |
perf record: Add clang options for compiling BPF scripts Although previous patch allows setting BPF compiler related options in perfconfig, on some ad-hoc situation it still requires passing options through cmdline. This patch introduces 2 options to 'perf record' for this propose: --clang-path and --clang-opt. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-9-git-send-email-wangnan0@huawei.com [ Add the new options to the 'record' man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c7118369 |
|
24-Oct-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Introduce usage_with_options_msg() Now usage_with_options() setup a pager before printing message so normal printf() or pr_err() will not be shown. The usage_with_options_msg() can be used to print some help message before usage strings. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1445701767-12731-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
76a26549 |
|
22-Oct-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Improve call graph documents and help messages The --call-graph option is complex so we should provide better guide for users. Also change help message to be consistent with config option names. Now perf top will show help like below: $ perf top --call-graph Error: option `call-graph' requires a value Usage: perf top [<options>] --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]> setup and enables call-graph (stack chain/backtrace): record_mode: call graph recording mode (fp|dwarf|lbr) record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>) default: 8192 (bytes) print_type: call graph printing style (graph|flat|fractal|none) threshold: minimum call graph inclusion threshold (<percent>) print_limit: maximum number of call graph entry (<number>) order: call graph order (caller|callee) sort_key: call graph sort key (function|address) branch: include last branch info to call graph (branch) Default: fp,graph,0.5,caller,function Requested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
21cf6284 |
|
22-Oct-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Move callchain help messages to callchain.h These messages will be used by 'perf top' in the next patch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445495330-25416-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9f065194 |
|
29-Sep-2015 |
Yang Shi <yang.shi@linaro.org> |
perf record: Change 'record.samples' type to unsigned long long When run "perf record -e", the number of samples showed up is wrong on some 32 bit systems, i.e. powerpc and arm. For example, run the below commands on 32 bit powerpc: perf probe -x /lib/libc.so.6 malloc perf record -e probe_libc:malloc -a ls perf.data [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.036 MB perf.data (13829241621624967218 samples) ] Actually, "perf script" just shows 21 samples. The number of samples is also absurd since samples is long type, but it is printed as PRIu64. Build test ran on x86-64, x86, aarch64, arm, mips, ppc and ppc64. Signed-off-by: Yang Shi <yang.shi@linaro.org> Cc: linaro-kernel@lists.linaro.org Link: http://lkml.kernel.org/r/1443563383-4064-1-git-send-email-yang.shi@linaro.org [ Bumped the 'hits' var used together with record.samples to 'unsigned long long' too ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e5bed564 |
|
29-Sep-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Allocate area for sample_id_hdr in a synthesized comm event A previous patch added a synthesized comm event for forked child process but it missed that the event should contain area for sample_id_hdr at the end. It worked by accident since the perf_event union contains bigger event structs like mmap_events. This patch fixes it by dynamically allocating event struct including those area like in perf_event__synthesize_thread_map(). Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1443577526-3240-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e803cf97 |
|
21-Sep-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Synthesize COMM event for a command line workload When perf creates a new child to profile, the events are enabled on exec(). And in this case, it doesn't synthesize any event for the child since they'll be generated during exec(). But there's an window between the enabling and the event generation. It used to be overcome since samples are only in kernel (so we always have the map) and the comm is overridden by a later COMM event. However it won't work if events are processed and displayed before the COMM event overrides like in 'perf script'. This leads to those early samples (like native_write_msr_safe) not having a comm but pid (like ':15328'). So it needs to synthesize COMM event for the child explicitly before enabling so that it can have a correct comm. But at this time, the comm will be "perf" since it's not exec-ed yet. Committer note: Before this patch: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task-events :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429 usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4. usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so) usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429) After: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task perf 0 0.000000: PERF_RECORD_COMM: perf:8446/8446 perf 8446 30154.038944: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038948: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038949: 9 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038950: 230 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038951: 6772 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446 usleep 8446 30154.038954: 196923 cycles: ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1 usleep 8446 30154.039021: 2292130 cycles: 7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so) usleep 8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446) # Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bcc84ec6 |
|
31-Aug-2015 |
Stephane Eranian <eranian@google.com> |
perf record: Add ability to name registers to record This patch modifies the -I/--int-regs option to enablepassing the name of the registers to sample on interrupt. Registers can be specified by their symbolic names. For instance on x86, --intr-regs=ax,si. The motivation is to reduce the size of the perf.data file and the overhead of sampling by only collecting the registers useful to a specific analysis. For instance, for value profiling, sampling only the registers used to passed arguements to functions. With no parameter, the --intr-regs still records all possible registers based on the architecture. To name registers, it is necessary to use the long form of the option, i.e., --intr-regs: $ perf record --intr-regs=si,di,r8,r9 ..... To record any possible registers: $ perf record -I ..... $ perf report --intr-regs ... To display the register, one can use perf report -D To list the available registers: $ perf record --intr-regs=\? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Signed-off-by: Stephane Eranian <eranian@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1441039273-16260-4-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d988d5ee |
|
21-Aug-2015 |
Kan Liang <kan.liang@intel.com> |
perf evlist: Open event on evsel cpus and threads An evsel may have different cpus and threads than the evlist it is in. Use it's own cpus and threads, when opening the evsel in 'perf record'. Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1440138194-17001-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cca8482c |
|
19-Aug-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Fix buildid processing After recording, 'perf record' post-processes the data to determine which buildids are needed. That processing must process the data in time order, if possible, because otherwise dependent events, like forks and mmaps, will not make sense. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1439994561-27436-4-git-send-email-adrian.hunter@intel.com [ Moved the sample_id_add to after trying to open the events, use pr_warning ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c3a6a8c4 |
|
04-Aug-2015 |
Kan Liang <kan.liang@intel.com> |
perf tools: Refine parse/config callchain functions Pass global callchain_param into parse_callchain_record_opt and perf_evsel__config_callgraph as parameter. So we can reuse these functions to parse/config local param for callchain. Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438677022-34296-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c421e80b |
|
29-Jul-2015 |
Kan Liang <kan.liang@intel.com> |
perf tools: Introduce callgraph_set for callgraph option Introduce callgraph_set to indicate whether the callgraph option was set by user. Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438162936-59698-4-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b757bb09 |
|
20-Jul-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add option --switch-events to select PERF_RECORD_SWITCH events Add an option to select PERF_RECORD_SWITCH events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1437471846-26995-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4ba1faa1 |
|
10-Jul-2015 |
Wang Nan <wangnan0@huawei.com> |
perf record: Allow filtering perf's pid via --exclude-perf This patch allows 'perf record' to exclude events issued by perf itself by '--exclude-perf' option. Before this patch, when doing something like: # perf record -a -e syscalls:sys_enter_write <cmd> One could easily get result like this: # /tmp/perf report --stdio ... # Overhead Command Shared Object Symbol # ........ ....... .................. .................... # 99.99% perf libpthread-2.18.so [.] __write_nocancel 0.01% ls libc-2.18.so [.] write 0.01% sshd libc-2.18.so [.] write ... Where most events are generated by perf itself. A shell trick can be done to filter perf itself out: # cat << EOF > ./tmp > #!/bin/sh > exec perf record -e ... --filter="common_pid != \$\$" -a sleep 10 > EOF # chmod a+x ./tmp # ./tmp However, doing so is user unfriendly. This patch extracts evsel iteration framework introduced by patch 'perf record: Apply filter to all events in a glob matching' into foreach_evsel_in_last_glob(), and makes exclude_perf() function append new filter expression to each evsel selected by a '-e' selector. To avoid losing filters if user pass '--filter' after '--exclude-perf', this patch uses perf_evsel__append_filter() in both case, instead of perf_evsel__set_filter() which removes old filter. As a side effect, now it is possible to use multiple '--filter' option for one selector. They are combinded with '&&'. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1436513770-8896-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3abebc55 |
|
06-Jul-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Let user have timestamps with per-thread recording If the option -T is used with option --per-thread, then time is still not sampled. Fix that by using OPT_BOOLEAN_SET to distinguish when the user used the -T option as opposed to the default case when timestamps are enabled but only for per-cpu recording. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1436183461-1918-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9d9cad76 |
|
17-Jun-2015 |
Kan Liang <kan.liang@intel.com> |
perf tools: Configurable per thread proc map processing time out The time out to limit the individual proc map processing was hard code to 500ms. This patch introduce a new option --proc-map-timeout to make the time limit configurable. Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ying Huang <ying.huang@intel.com> Link: http://lkml.kernel.org/r/1434549071-25611-2-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
56100321 |
|
10-Jun-2015 |
Peter Zijlstra <peterz@infradead.org> |
perf record: Amend option summaries Because there's too many options and I cannot read, I frequently get confused between -c and -P, and try to do things like: perf record -P 50000 -- foo Which does not work; try and make the option description slightly longer and hopefully less confusing. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150610144850.GP19282@twins.programming.kicks-ass.net [ Do those changes on the man page as well ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
457ae94a |
|
28-May-2015 |
He Kuang <hekuang@huawei.com> |
perf record: Fix perf.data size in no-buildid mode The size of perf.data is missing update in no-buildid mode, which gives wrong output result. Before this patch: $ perf.perf record -B -e syscalls:sys_enter_open uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB perf.data ] After this patch: $ perf.perf record -B -e syscalls:sys_enter_open uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data ] Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1432819050-30511-1-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f00898f4 |
|
27-May-2015 |
Andi Kleen <ak@linux.intel.com> |
perf tools: Move branch option parsing to own file .. to allow sharing between builtin-record and builtin-top later. No code changes, just moved code. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1432749114-904-9-git-send-email-andi@firstfloor.org [ Rename too generic branch.[ch] name to parse-branch-options.[ch] ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2dd6d8a1 |
|
30-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add AUX area tracing Snapshot Mode support Add a new option and support for Instruction Tracing Snapshot Mode. When the new option is selected, no AUX area tracing data is captured until a signal (SIGUSR2) is received. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430404667-10593-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e31f0d01 |
|
30-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add build option NO_AUXTRACE to exclude AUX area tracing Add build option NO_AUXTRACE to exclude compiling support for AUX area tracing. Support for both recording and processing is excluded and by implication any future additions such as Intel PT and Intel BTS will also not be compiled in with this option. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430404667-10593-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cd10b289 |
|
30-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Hit all build ids when AUX area tracing We need to include all buildids when a perf.data file contains AUX area tracing data because we do not decode the trace for that purpose because it would take too long. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430404667-10593-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
99fa2984 |
|
30-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add AUX area tracing index Add an index of AUX area tracing events within a perf.data file. perf record uses a special user event PERF_RECORD_FINISHED_ROUND to enable sorting of events in chunks instead of having to sort all events altogether. AUX area tracing events contain data that can span back to the very beginning of the recording period. i.e. they do not obey the rules of PERF_RECORD_FINISHED_ROUND. By adding an index, AUX area tracing events can be found in advance and the PERF_RECORD_FINISHED_ROUND approach works as usual. The index is recorded with the auxtrace feature in the perf.data file. A session reads the index but does not process it. An AUX area decoder can queue all the AUX area data in advance using auxtrace_queues__process_index() or otherwise process the index in some custom manner. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430404667-10593-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e9db1310 |
|
09-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Extend -m option for AUX area tracing mmap pages Extend the -m option so that the number of mmap pages for AUX area tracing can be specified by adding a comma followed by the number of pages. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1428594864-29309-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ef149c25 |
|
09-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add basic AUX area tracing support Amend the perf record tool to read the AUX area tracing mmap and synthesize AUX area tracing events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1428594864-29309-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7b8283b5 |
|
07-Apr-2015 |
David Ahern <dsahern@gmail.com> |
perf evlist: Fix type for references to data_head/tail The data_head and data_tail fields are defined as __u64 in linux/perf_event.h, but perf userspace uses int and unsigned int. Convert all references to u64 for consistency. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1428420037-26599-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
814c8c38 |
|
30-Mar-2015 |
Peter Zijlstra <peterz@infradead.org> |
perf record: Add clockid parameter Teach perf-record about the new perf_event_attr::{use_clockid, clockid} fields. Add a simple parameter to set the clock (if any) to be used for the events to be recorded into the data file. Since we store the entire perf_event_attr in the EVENT_DESC section we also already store the used clockid in the data file. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yunlong Song <yunlong.song@huawei.com> Link: http://lkml.kernel.org/r/20150407154851.GR23123@twins.programming.kicks-ass.net [ Conditionally define CLOCK_BOOTTIME, at least rhel6 doesn't have it - dsahern Ditto for CLOCK_MONOTONIC_RAW, sles11sp2 doesn't have it - yunlong.song ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
23d4aad4 |
|
24-Mar-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Return the first evsel with an invalid filter in apply_filters() Use of a bad filter currently generates the message: Error: failed to set filter with 22 (Invalid argument) Add the event name to make it clear to which event the filter failed to apply: Error: Failed to set filter "foo" on event sched:sg_lb_stats: 22: Invalid argument To test it use something like: # perf record -e sched:sched_switch -e sched:*fork --filter parent_pid==1 -e sched:*wait* --filter bla usleep 1 Error: failed to set filter "bla" on event sched:sched_stat_iowait with 22 (Invalid argument) # Based-on-a-patch-by: David Ahern <dsahern@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-d7gq2fjvaecozp9o2i0siifu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b7b61cbe |
|
03-Mar-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf ordered_events: Shorten function signatures By keeping pointers to machines, evlist and tool in ordered_events. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0c6huyaf59mqtm2ek9pmposl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
85c273d2 |
|
24-Feb-2015 |
Andi Kleen <ak@linux.intel.com> |
perf record: Support recording running/enabled time Add an option to perf record to record running/enabled time for read events, similar to what stat does. This is useful to understand multiplexing problems. Right now the report support is not great, but at least report -D already supports it. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1424819620-16043-1-git-send-email-andi@firstfloor.org [ Fixed the Documentation entry to match the OPT_BOOLEAN one ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
aad2b21c |
|
05-Jan-2015 |
Kan Liang <kan.liang@intel.com> |
perf tools: Enable LBR call stack support Currently, there are two call chain recording options, fp and dwarf. Haswell has a new feature that utilizes the existing LBR facility to record call chains. Kernel side LBR support code provides this as a third option to record call chains. This patch enables the lbr call stack support on the tooling side. LBR call stack has some limitations: - It reuses current LBR facility, so LBR call stack and branch record can not be enabled at the same time. - It is only available for user-space callchains. However, it also offers some advantages: - LBR call stack can work on user apps which don't have frame-pointers or dwarf debug info compiled. It is a good alternative when nothing else works. Tested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masanari Iida <standby24x7@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
42aa276f |
|
29-Jan-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Use perf_data_file__fd() consistently Do not reference file->fd directly since we want hide the implementation details from outside for possible future changes. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1422518843-25818-8-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e3d59112 |
|
29-Jan-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Show precise number of samples After perf record finishes, it prints file size and number of samples in the file but this info is wrong since it assumes typical sample size of 24 bytes and divides file size by the value. However as we post-process recorded samples for build-id, it can show correct number like below. If build-id post-processing is not requested just omit the wrong number of samples. $ perf record noploop 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.159 MB perf.data (3989 samples) ] $ perf report --stdio -n # To display the perf.data header info, please use --header/--header-only options. # # Samples: 3K of event 'cycles' # Event count (approx.): 3771330663 # # Overhead Samples Command Shared Object Symbol # ........ ............ ....... ................ .......................... # 99.90% 3982 noploop noploop [.] main 0.09% 1 noploop ld-2.17.so [.] _dl_check_map_versions 0.01% 1 noploop [kernel.vmlinux] [k] setup_arg_pages 0.00% 5 noploop [kernel.vmlinux] [k] intel_pmu_enable_all Reported-by: Milian Wolff <mail@milianw.de> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1422518843-25818-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4ac30cf7 |
|
29-Jan-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Do not use __perf_session__process_events() directly It's only used for perf record to process build-id because its file size it's not fixed at this time due to remaining header features. However data offset and size is available so that we can use the perf_session__process_events() once we set the file size as the current offset like for now. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1422518843-25818-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4b6c5177 |
|
24-Sep-2014 |
Stephane Eranian <eranian@google.com> |
perf record: Add new -I option to sample interrupted machine state Add -I/--intr-regs option to capture machine state registers at interrupt. Add the corresponding man page description Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1411559322-16548-6-git-send-email-eranian@google.com Cc: cebbert.lkml@gmail.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
00dc8657 |
|
03-Nov-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Do not save pathname in ./debug/.build-id directory for vmlinux When perf record finishes a session, it pre-processes samples in order to write build-id info from DSOs that had samples. During this process it'll call map__load() for the kernel map, and it ends up calling dso__load_vmlinux_path() which replaces dso->long_name. But this function checks kernel's build-id before searching vmlinux path so it'll end up with a cryptic name, the pathname for the entry in the ~/.debug cache, which can be confusing to users. This patch adds a flag to skip the build-id check during record, so that it'll have the original vmlinux path for the kernel dso->long_name, not the entry in the ~/.debug cache. Before: # perf record -va sleep 3 mmap size 528384B [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.196 MB perf.data (~8545 samples) ] Looking at the vmlinux_path (7 entries long) Using /home/namhyung/.debug/.build-id/f0/6e17aa50adf4d00b88925e03775de107611551 for symbols After: # perf record -va sleep 3 mmap size 528384B [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.193 MB perf.data (~8432 samples) ] Looking at the vmlinux_path (7 entries long) Using /lib/modules/3.16.4-1-ARCH/build/vmlinux for symbols Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1415063674-17206-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e5b2c207 |
|
22-Oct-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Export usage string and option table of perf record Those are shared with other builtin commands like kvm, script. So make it accessable from them. This is a preparation of later change that limiting possible options. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1413990949-13953-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f14d5707 |
|
16-Oct-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: No need to drag util/cgroup.h The only thing we need is a forward declaration for 'struct cgroup_sel', that is inside 'struct perf_evsel'. Include cgroup.h instead on the tools that support cgroups. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-b7kuymbgf0zxi5viyjjtu5hk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8f651eae |
|
09-Oct-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf callchain: Move the callchain_param extern to callchain.h It was lost in hist.h, move it to where it belongs, callchain.h, as there are places that gets hist.h by means of evsel.h, and since evsel.h is being untangled from hist.h... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0rg7ji1jnbm6q6gj35j37jby@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5a2e5e85 |
|
22-Sep-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Convert {record,top}.call-graph option to call-graph.record-mode So that it'll be passed to perf_callchain_config(). Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1411434104-5307-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f7f084f4 |
|
22-Sep-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf callchain: Move some parser functions to callchain.c And rename record_callchain_parse() to parse_callchain_record_opt() in accordance to parse_callchain_report_opt(). Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1411434104-5307-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
72a128aa |
|
22-Sep-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Move callchain config from record_opts to callchain_param So that all callchain config parameters can be read/written to a single place. It's a preparation to consolidate handling of all callchain options. Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1411434104-5307-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6dcf45ef |
|
13-Aug-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Filter out POLLHUP'ed file descriptors So that we don't continue polling on vanished file descriptors, i.e. file descriptors for events monitoring threads that exited. I.e. the following 'perf record' command now exits as expected, instead of staying in an eternal loop: $ sleep 5s & $ perf record -p `pidof sleep` Reported-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-8dg8o21t2ntzly2bfh53p3sg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f66a889d |
|
18-Aug-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Introduce poll method for common code idiom Since we have access two evlist members in all these poll calls, provide a helper. This will also help to make the patch introducing the pollfd class more clear, as the evlist specific uses will be hiden away perf_evlist__poll(). Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-jr9d4aop4lvy9453qahbcgp0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e5685730 |
|
17-Sep-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Use ring buffer consume method to look like other tools All builtins that consume events from perf's ring buffer now end up calling perf_evlist__mmap_consume(), which will allow unmapping the ring buffer when all the fds gets closed and all events in the buffer consumed. This is in preparation for the patchkit that will notice POLLHUP on perf events file descriptors. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-8vhaeeoq11ppz0713el4xcps@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
35550da3 |
|
13-Aug-2014 |
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> |
perf record: Use strerror_r instead of strerror Use strerror_r instead of strerror in error messages for thread-safety. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naohiro Aota <naota@elisp.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20140814022243.3545.7411.stgit@kbuild-fedora.novalocal Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0a7e6d1b |
|
12-Aug-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Check recorded kernel version when finding vmlinux Currently vmlinux_path__init() only tries to find vmlinux file from current directory, /boot and some canonical directories with version number of the running kernel. This can be a problem when reporting old data recorded on a kernel version not running currently. We can use --symfs option for this but it's annoying for user to do it always. As we already have the info in the perf.data file, it can be changed to use it for the search automatically. Before: $ perf report ... # Samples: 4K of event 'cpu-clock' # Event count (approx.): 1067250000 # # Overhead Command Shared Object Symbol # ........ .......... ................. .............................. 71.87% swapper [kernel.kallsyms] [k] recover_probed_instruction After: # Overhead Command Shared Object Symbol # ........ .......... ................. .................... 71.87% swapper [kernel.kallsyms] [k] native_safe_halt This requires to change signature of symbol__init() to receive struct perf_session_env *. Reported-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8affc2b8 |
|
31-Jul-2014 |
Andi Kleen <ak@linux.intel.com> |
perf record: Honour --no-time command line option Time stamps are always implicitely enabled for record currently. The old --time/-T option is a nop. Allow the user to disable timestamps by using --no-time, honouring the existing option. The defaults are unchanged. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1406789104-25863-10-git-send-email-zheng.z.yan@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
dcabb507 |
|
25-Jul-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds Currently we store PERF_RECORD_FINISHED_ROUND event each time we go throught mmap buffers no matter if it contains any data, which is useless. Forcing the PERF_RECORD_FINISHED_ROUND event to be stored any time we finished the round AND wrote at least one event. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1406300177-31805-19-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
33bf7481 |
|
25-Jul-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Always force PERF_RECORD_FINISHED_ROUND event The PERF_RECORD_FINISHED_ROUND synthetic record governs queue flushing in reporting, so it needs to be stored for any kind of event. The lack of such periodic flushing made the tools use more memory than needed, as the reordering was being done only after processing all events. This was the case when no tracepoints were in the mix. Forcing the PERF_RECORD_FINISHED_ROUND event to be stored for all event types. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1406300177-31805-18-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0fffa5df |
|
21-May-2014 |
Anshuman Khandual <khandual@linux.vnet.ibm.com> |
perf/tool: Add conditional branch filter 'cond' to perf record Adding perf record support for new branch stack filter criteria PERF_SAMPLE_BRANCH_COND. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1400743210-32289-2-git-send-email-khandual@linux.vnet.ibm.com Cc: mpe@ellerman.id.au Cc: benh@kernel.crashing.org Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
a515114f |
|
02-Jun-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf record: Fix poll return value propagation If the perf record command is interrupted in record__mmap_read_all function, the 'done' is set and err has the latest poll return value, which is most likely positive number (= number of pollfds ready to read). This 'positive err' is then propagated to the exit code, resulting in not finishing the perf.data header properly, causing following error in report: # perf record -F 50000 -a --- make the system real busy, so there's more chance to interrupt perf in event writing code --- ^C[ perf record: Woken up 16 times to write data ] [ perf record: Captured and wrote 30.292 MB perf.data (~1323468 samples) ] # perf report --stdio > /dev/null WARNING: The perf.data file's data size field is 0 which is unexpected. Was the 'perf record' command properly terminated? Fixing this by checking for positive poll return value and setting err to 0. Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401732126-19465-1-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
|
#
bac1e4d1 |
|
11-May-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Get rid of on_exit() feature test The on_exit() function was only used in perf record but it's gone in previous patch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Stephane Eranian <eranian@google.com> Cc: Bernhard Rosenkraenzer <Bernhard.Rosenkranzer@linaro.org> Cc: Irina Tirdea <irina.tirdea@intel.com> Link: http://lkml.kernel.org/r/1399855645-25815-2-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
|
#
45604710 |
|
11-May-2014 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Propagate exit status of a command line workload Currently perf record doesn't propagate the exit status of a workload given by the command line. But sometimes it'd useful if it's propagated so that a monitoring script can handle errors appropriately. To do that, it moves most of logic out of the exit handlers and run them directly in the __cmd_record(). The only thing needs to be done in the handler is propagating terminating signal so that the shell can terminate its loop properly when Ctrl-C was pressed. Also it cleaned up the resource management code in record__exit(). With this change, perf record returns the child exit status in case of normal termination and send signal to itself when terminated by signal. Example run of Stephane's case: $ perf record true && echo yes || echo no [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data (~589 samples) ] yes $ perf record false && echo yes || echo no [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data (~589 samples) ] no Jiri's case (error in parent): $ perf record -m 10G true && echo yes || echo no rounding mmap pages size to 17179869184 bytes (4194304 pages) failed to mmap with 12 (Cannot allocate memory) no $ ulimit -n 6 $ perf record sleep 1 && echo yes || echo no failed to create 'go' pipe: Too many open files Couldn't run the workload! no And Peter's case (interrupted by signal): $ while :; do perf record sleep 1; done ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB perf.data (~593 samples) ] Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1399855645-25815-1-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
|
#
ffa91880 |
|
17-Apr-2014 |
Adrien BAK <adrien.bak@metascale.org> |
perf tools: Improve error reporting In the current version, when using perf record, if something goes wrong in tools/perf/builtin-record.c:375 session = perf_session__new(file, false, NULL); The error message: "Not enough memory for reading per file header" is issued. This error message seems to be outdated and is not very helpful. This patch proposes to replace this error message by "Perf session creation failed" I believe this issue has been brought to lkml: https://lkml.org/lkml/2014/2/24/458 although this patch only tackles a (small) part of the issue. Additionnaly, this patch improves error reporting in tools/perf/util/data.c open_file_write. Currently, if the call to open fails, the user is unaware of it. This patch logs the error, before returning the error code to the caller. Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Adrien BAK <adrien.bak@metascale.org> Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast [ Reorganize the changelog into paragraphs ] [ Added empty line after fd declaration in open_file_write ] Signed-off-by: Jiri Olsa <jolsa@redhat.com>
|
#
9ff125d1 |
|
07-Jan-2014 |
Jiri Olsa <jolsa@redhat.com> |
perf callchain: Introduce HAVE_DWARF_UNWIND_SUPPORT macro Introducing global macro HAVE_DWARF_UNWIND_SUPPORT to indicate we have dwarf unwind support. Any library providing the dwarf post unwind support will enable this macro. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jean Pihet <jean.pihet@linaro.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1389098853-14466-12-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a601fdff |
|
02-Feb-2014 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Add readable output for callchain debug Adding people readable output for callchain debug, to get following '-v' output: $ perf record -v -g ls callchain: type DWARF callchain: stack dump size 4096 ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1391427883-13443-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
eb853e80 |
|
02-Feb-2014 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Add call-graph option support into .perfconfig Adding call-graph option support into .perfconfig file, so it's now possible use call-graph option like: [top] call-graph = fp [record] call-graph = dwarf,8192 Above options ONLY setup the unwind method. To enable perf record/top to actually use it the command line option -g/-G must be specified. The --call-graph option overloads .perfconfig setup. Assuming above configuration: $ perf record -g ls - enables dwarf unwind with user stack size dump 8192 bytes $ perf top -G - enables frame pointer unwind $ perf record --call-graph=fp ls - enables frame pointer unwind $ perf top --call-graph=dwarf,4096 ls - enables dwarf unwind with user stack size dump 4096 bytes Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1391427883-13443-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0ae617be |
|
29-Jan-2014 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Get ref_reloc_sym from kernel map Now that ref_reloc_sym is set up when the kernel map is created, 'perf record' does not need to pass the symbol names to perf_event__synthesize_kernel_mmap() which can read the values needed from ref_reloc_sym directly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1391004884-10334-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a6205a35 |
|
14-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Rename --initial-delay to --delay To be consistent with the equivalent option in 'stat', also, for the same reason, use -D as the one letter alias. Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-p5yjnopajb3a8x0xha7yl5w8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
509051ea |
|
14-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Rename --no-delay to --no-buffering That is how the option summary describes it and so that we can free --delay to replace --initial-delay and then be consistent with stat's --delay equivalent option. Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-f8hd2010uhjl2zzb34hepbmi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6619a53e |
|
11-Jan-2014 |
Andi Kleen <ak@linux.intel.com> |
perf record: Add --initial-delay option perf stat has a --delay option to delay measuring the workload. This is useful to skip measuring the startup phase of the program, which is often very different from the main workload. The same is useful for perf record when sampling. --no-delay was already taken, so add a --initial-delay to perf record too. -D was already taken for record, so there is only a long option. v2: Don't disable group members (Namhyung Kim) v3: port to latest perf/core rename to --initial-delay to avoid conflict with --no-delay Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1389476307-2124-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0050f7aa |
|
10-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Introduce evlist__for_each() & friends For the common evsel list traversal, so that it becomes more compact. Use the opportunity to start ditching the 'perf_' from 'perf_evlist__', as discussed, as the whole conversion touches a lot of places, lets do it piecemeal when we have the chance due to other work, like in this case. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qnkx7dzm2h6m6uptkfk03ni6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
983874d1 |
|
03-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Auto unmap on destructor Removing further boilerplate after making sure perf_evlist__munmap can be called multiple times for the same evlist. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-o0luenuld4abupm4nmrgzm6f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f26e1c7c |
|
03-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Close fds on destructor Since it is safe to call perf_evlist__close() multiple times, autoclose it and remove the calls to the close from existing tools, reducing the tooling boilerplate. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2kq9v7p1rude1tqxa0aue2tk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
03ad9747 |
|
03-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Move destruction of maps to evlist destructor Instead of requiring tools to do an extra destructor call just before calling perf_evlist__delete. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0jd2ptzyikxb5wp7inzz2ah2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3e2be2da |
|
03-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Remove old evsel_list usage To be consistent with other places, use just 'evlist' for the evsel list variable, and since we have it in 'struct record', use it directly from there. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-396bnfvmlxrsj3o2tk47b8t1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
735f7e0b |
|
03-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Move the SIGUSR1 error reporting logic to prepare_workload So that we have the boilerplate in the preparation method, instead of open coded in tools wanting the reporting when the exec fails. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-purbdzcphdveskh7wwmnm4t7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f33cbe72 |
|
02-Jan-2014 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Send the errno in the signal when workload fails When a tool uses perf_evlist__start_workload and the supplied workload fails (e.g.: its binary wasn't found), perror was being used to print the error reason. This is undesirable, as the caller may be a GUI, when it wants to have total control of the error reporting process. So move to using sigaction(SA_SIGINFO) + siginfo_t->sa_value->sival_int to communicate to the caller the errno and let it print it using the UI of its choosing. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-epgcv7kjq8ll2udqfken92pz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b4006796 |
|
19-Dec-2013 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Rename 'perf_record_opts' to 'record_opts Reduce typing, functions use class__method convention, so unlikely to clash with other libraries. This actually was discussed in the "Link:" referenced message below. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8c6f45a7 |
|
19-Dec-2013 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Rename 'perf_record' to plain 'record' Its a local struct and the functions use the __ separator from the class name to the method name, so its unlikely that this will clash with other namespaces. Save some typing then. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-r011tdv7ianars9jr9ur2n4q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cf8b2e69 |
|
19-Dec-2013 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Simplify perf_record__write 1. Since all callers either test if it is less than zero or assign its result to an int variable, convert it from ssize_t to int; 2. There is just one use for the 'session' variable, so use rec->session directly instead; 3. No need to store the result of perf_data_file__write, since that result is either 'size' or -1, the later making the error result to be stored in 'errno' and accessed thru printf's %m in the pr_err call. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-xwsk964dp681fica3xlqhjin@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
50a9b868 |
|
22-Nov-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Use perf_data_file__write for output file Changing the file output code to use the newly added perf_data_file__write interface. No functional change intended. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Mike Galbraith <efault@gmx.de> Cc: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
53653d70 |
|
09-Dec-2013 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Fix display of incorrect mmap pages 'mmap_pages' is 'unsigned int' not 'int' e.g. perf record -m2147483648 uname Permission error mapping pages. Consider increasing /proc/sys/kernel/perf_event_mlock_kb, or try again with a smaller value of -m/--mmap_pages. (current value: -2147483648) Fixed: perf record -m2147483648 uname Permission error mapping pages. Consider increasing /proc/sys/kernel/perf_event_mlock_kb, or try again with a smaller value of -m/--mmap_pages. (current value: 2147483648) Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1386595120-22978-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6233dd5e |
|
28-Nov-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Unify data output code into perf_record__write function Unifying current 2 data output functions do_write_output and write_output into single one perf_record__write. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1385634619-8129-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
69e7e5b0 |
|
18-Nov-2013 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Default -t option to no inheritance The change to per-cpu mmaps causes the -p, -t and -u options now to have inheritance enabled by default. Change that back to no inheritance but for the -t option only. Requested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1384768557-23331-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3aa5939d |
|
15-Nov-2013 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Make per-cpu mmaps the default. This affects the -p, -t and -u options that previously defaulted to per-thread mmaps. Consequently add an option to select per-thread mmaps to support the old behaviour. Note that per-thread can be used with a workload-only (i.e. none of -p, -t, -u, -a or -C is selected) to get a per-thread mmap with no inheritance. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/5286271D.3020808@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
539e6bb7 |
|
01-Nov-2013 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Add an option to force per-cpu mmaps By default, when tasks are specified (i.e. -p, -t or -u options) per-thread mmaps are created. Add an option to override that and force per-cpu mmaps. Further comments by peterz: So this option allows -t/-p/-u to create one buffer per cpu and attach all the various thread/process/user tasks' their counters to that one buffer? As opposed to the current state where each such counter would have its own buffer. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383313899-15987-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
602ad878 |
|
12-Nov-2013 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf target: Shorten perf_target__ to target__ Getting unwieldly long, for this app domain should be descriptive enough and the use of __ to separate the class from the method names should help with avoiding clashes with other code bases. Reported-by: David Ahern <dsahern@gmail.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a9986fad |
|
07-Nov-2013 |
David Ahern <dsahern@gmail.com> |
perf record: Move existing write_output into helper function Code move only; no logic changes. In preparation for the mmap based output option in the next patch. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383884605-30968-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
410f1786 |
|
07-Nov-2013 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Use correct return type for write() write() returns a 'ssize_t' not an 'int'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383906470-21002-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a33fbd56 |
|
11-Nov-2013 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf machine: Simplify synthesize_threads method Several tools (top, kvm) don't need to be called back to process each of the syntheiszed records, instead relying on the machine__process_event function to change the per machine data structures that represent threads and mmaps, so provide a way to ask for this common idiom. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-pusqibp8n3c4ynegd1frn4zd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
58d925dc |
|
11-Nov-2013 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf machine: Introduce synthesize_threads method out of open coded equivalent Further simplifications to be done on following patch, as most tools don't use the callback, using instead just the canned machine__process_event one. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-r1m0vuuj3cat4bampno9yc8d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
62605dc50 |
|
11-Nov-2013 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Synthesize non-exec MMAP records when --data used When perf_event_attr.mmap_data is set the kernel will generate PERF_RECORD_MMAP events when non-exec (data, SysV mem) mmaps are created, so we need to synthesize from /proc/pid/maps for existing threads, as we do for exec mmaps. Right now just 'perf record' does it, but any other tool that uses perf_event__synthesize_thread(s|map) can request it. Reported-by: Don Zickus <dzickus@redhat.com> Tested-by: Don Zickus <dzickus@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Bill Gray <bgray@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Fowles <rfowles@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ihwzraikx23ian9txinogvv2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7ab75cff |
|
06-Nov-2013 |
David Ahern <dsahern@gmail.com> |
perf record: Remove post_processing_offset variable Duplicates the data_offset from header in the session. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383763297-27066-4-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f34b9001 |
|
06-Nov-2013 |
David Ahern <dsahern@gmail.com> |
perf record: Remove advance_output function 1 line function with only 1 user; might as well embed directly. Signed-off-by: David Ahern <dsahern@gmail.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383763297-27066-3-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
57706abc |
|
06-Nov-2013 |
David Ahern <dsahern@gmail.com> |
perf record: Refactor feature handling into a separate function Code move only. No logic changes. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1383763297-27066-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
714647bd |
|
05-Nov-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Check maximum frequency rate for record/top Adding the check for maximum allowed frequency rate defined in following file: /proc/sys/kernel/perf_event_max_sample_rate When we cross the maximum value we fail and display detailed error message with advise. $ perf record -F 3000 ls Maximum frequency rate (2000) reached. Please use -F freq option with lower value or consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. In case user does not specify the frequency and the default value cross the maximum, we display warning and set the frequency value to the current maximum. $ perf record ls Lowering default frequency rate to 2000. Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. Same messages are used for 'perf top'. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1383660887-1734-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
09b0fd45 |
|
26-Oct-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Split -g and --call-graph Splitting -g and --call-graph for record command, so we could use '-g' with no option. The '-g' option now takes NO argument and enables the configured unwind method, which is currently the frame pointers method. It will be possible to configure unwind method via config file in upcoming patches. All current '-g' arguments is overtaken by --call-graph option. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382797536-32303-2-git-send-email-jolsa@redhat.com [ reordered -g/--call-graph on --help and expanded the man page according to comments by David Ahern and Namhyung Kim ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cc9784bd |
|
15-Oct-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf session: Separating data file properties from session Removing 'fd, fd_pipe, filename, size' from struct perf_session and replacing them with struct perf_data_file object. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381847254-28809-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6a4d98d7 |
|
15-Oct-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Add perf_data_file__open interface to data object Adding perf_data_file__open interface to data object to open the perf.data file for both read and write. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381847254-28809-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f5fc1412 |
|
15-Oct-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Add data object to handle perf data file This patch is adding 'struct perf_data_file' object as a placeholder for all attributes regarding perf.data file handling. Changing perf_session__new to take it as an argument. The rest of the functionality will be added later to keep this change simple enough, because all the places using perf_session are changed now. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381847254-28809-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4f624685 |
|
18-Oct-2013 |
Adrian Hunter <adrian.hunter@intel.com> |
perf record: Improve write_output error message Improve the error message from write_output() to say what failed to write and give the error number. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
994a1f78 |
|
31-Aug-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Check mmap pages value early Move the check of the mmap_pages value to the options parsing time, so we could rely on this value on other parts of code. Related changes come in the next patches. Also changes perf_evlist::mmap_len to proper size_t type. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1378031796-17892-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
918512b4 |
|
12-Sep-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Unify page_size usage Making page_size global from the util object. Removing the not needed one. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1379003976-5839-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
89fe808a |
|
29-Sep-2013 |
Ingo Molnar <mingo@kernel.org> |
tools/perf: Standardize feature support define names to: HAVE_{FEATURE}_SUPPORT Standardize all the feature flags based on the HAVE_{FEATURE}_SUPPORT naming convention: HAVE_ARCH_X86_64_SUPPORT HAVE_BACKTRACE_SUPPORT HAVE_CPLUS_DEMANGLE_SUPPORT HAVE_DWARF_SUPPORT HAVE_ELF_GETPHDRNUM_SUPPORT HAVE_GTK2_SUPPORT HAVE_GTK_INFO_BAR_SUPPORT HAVE_LIBAUDIT_SUPPORT HAVE_LIBELF_MMAP_SUPPORT HAVE_LIBELF_SUPPORT HAVE_LIBNUMA_SUPPORT HAVE_LIBUNWIND_SUPPORT HAVE_ON_EXIT_SUPPORT HAVE_PERF_REGS_SUPPORT HAVE_SLANG_SUPPORT HAVE_STRLCPY_SUPPORT Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/n/tip-u3zvqejddfZhtrbYbfhi3spa@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
475eeab9 |
|
20-Sep-2013 |
Andi Kleen <ak@linux.intel.com> |
tools/perf: Add support for record transaction flags Add support for recording and displaying the transaction flags. They are essentially a new sort key. Also display them in a nice way to the user. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
0126d493 |
|
20-Sep-2013 |
Andi Kleen <ak@linux.intel.com> |
tools/perf/record: Add abort_tx,no_tx,in_tx branch filter options to perf record -j Make perf record -j aware of the new in_tx,no_tx,abort_tx branch qualifiers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-5-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
6065210d |
|
11-Jul-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Remove event types framework completely Removing event types framework completely. The only remainder (apart from few comments) is following enum: enum perf_user_event_type { ... PERF_RECORD_HEADER_EVENT_TYPE = 65, /* deprecated */ ... } It's kept as deprecated, resulting in error when processed in perf_session__process_user_event function. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1373556513-3000-6-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
30d35079 |
|
11-Jul-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Remove event types pushing Removing event types data pushing from record command. It's no longer needed, because this data is ignored. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1373556513-3000-5-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4a4d371a |
|
05-Jun-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Remove -f/--force option It no longer have any affect on the processing and is marked as obsolete anyway. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-tvwyspiqr4getzfib2lw06ty@git.kernel.org Link: http://lkml.kernel.org/r/1372307120-737-1-git-send-email-namhyung@kernel.org [ combined patch removing the -f usage in various sub-commands, such as 'perf sched', etc, by Namhyung Kim ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
563aecb2 |
|
05-Jun-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Remove -A/--append option It's no longer working and needed. Quite straightforward discussion/vote was in here: http://marc.info/?t=137028288300004&r=1&w=2 Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-8fgdva12hl8w3xzzpsvvg7nx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
804f7ac7 |
|
06-May-2013 |
David Ahern <dsahern@gmail.com> |
perf record: handle death by SIGTERM Perf data files cannot be processed until the header is updated which is done via an on_exit handler. If perf is killed due to a SIGTERM it does not run the on_exit hooks leaving the perf.data file in a random state which perf-report will happily spin on trying to read. As noted by Mike an easy reproducer is: perf record -a -g & sleep 1; killall perf Fix by catching SIGTERM like it does SIGINT. Also need to remove the kill which was added via commit f7b7c26e. Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1367864663-1309-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
05484298 |
|
24-Jan-2013 |
Andi Kleen <ak@linux.intel.com> |
perf tools: Add support for weight v7 (modified) perf record has a new option -W that enables weightened sampling. Add sorting support in top/report for the average weight per sample and the total weight sum. This allows to both compare relative cost per event and the total cost over the measurement period. Add the necessary glue to perf report, record and the library. v2: Merge with new hist refactoring. v3: Fix manpage. Remove value check. Rename global_weight to weight and weight to local_weight. v4: Readd sort keys to manpage v5: Move weight to end v6: Move weight to template v7: Rename weight key. Original patch from Andi modified by Stephane Eranian <eranian@google.com> to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
62baca8a |
|
19-Mar-2013 |
Namhyung Kim <namhyung.kim@lge.com> |
perf tools: Get rid of redundant _FILE_OFFSET_BITS definition We define it in the Makefile so no need to duplicate it. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1363686376-29525-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8fa60e1f |
|
14-Mar-2013 |
Namhyung Kim <namhyung.kim@lge.com> |
perf record: Fixup return path of cmd_record() The error path of calling perf_target__parse_uid wrongly went to out_free_fd. Also add missing evlist cleanup routines. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1363326533-3310-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
55e162ea |
|
11-Mar-2013 |
Namhyung Kim <namhyung.kim@lge.com> |
perf evlist: Add want_signal parameter to perf_evlist__prepare_workload() In case a caller doesn't want to receive SIGUSR1 when the child failed to exec(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362987798-24969-6-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
119fa3c9 |
|
11-Mar-2013 |
Namhyung Kim <namhyung.kim@lge.com> |
perf evlist: Do not pass struct record_opts to perf_evlist__prepare_workload() Since it's only used for checking ->pipe_output, we can pass the result directly. Now the perf_evlist__prepare_workload() don't have a dependency of struct perf_record_opts, it can be called from other places like perf stat. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362987798-24969-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6ef73ec4 |
|
11-Mar-2013 |
Namhyung Kim <namhyung.kim@lge.com> |
perf evlist: Pass struct perf_target to perf_evlist__prepare_workload() It's a preparation step of removing @opts arg from the function so that it can be used more widely. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362987798-24969-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
334fe7a3 |
|
11-Mar-2013 |
Namhyung Kim <namhyung.kim@lge.com> |
perf evlist: Remove cpus and threads arguments from perf_evlist__new() It's almost always used with NULL for both arguments. Get rid of the arguments from the signature and use perf_evlist__set_maps() if needed. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1362987798-24969-1-git-send-email-namhyung@kernel.org [ committer note: replaced spaces with tabs in some of the affected lines ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e4dd45fe |
|
25-Feb-2013 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Fix -C option Currently the -C option does not work for record command, because of the targets mismatch when synthesizing threads. Fixing this by using proper target interface for the synthesize decision. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1361785972-7431-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a8bb559b |
|
22-Jan-2013 |
Namhyung Kim <namhyung.kim@lge.com> |
perf header: Add HEADER_GROUP_DESC feature Save group relationship information so that it can be restored when perf report is running. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1358845787-1350-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
34ba5122 |
|
19-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf machine: Simplify accessing the host machine It is always there, no sense in calling a function named "perf_session__find_host_machine". Also no sense in checking if that function return is NULL, so ditch needless error handling. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-a6a3zx3afbrxo8p2zqm5mxo8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
876650e6 |
|
18-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf machine: Introduce struct machines That consolidates the grouping of host + guests, isolating a bit more of functionality now centered on 'perf_session' that can be used independently in tools that don't need a 'perf_session' instance, but needs to have all the thread/map/symbol machinery. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-c700rsiphpmzv8klogojpfut@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7e383de4 |
|
18-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Don't pass host machine to guest synthesizer We were calling perf_session__process_machines(), that would first pass the struct machine associated with the host to the provided callback, perf_event__synthesize_guest_os() that would test if it was the host and if so wouldn't do anything. Ditch this contraption, just call directly machines__process with the list of guests. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-x65vsxgzg4dvo3zqohtrrb9o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
56e52e85 |
|
13-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Introduce perf_evsel__open_strerror method That consolidates the error messages in 'record', 'stat' and 'top', that now get a consistent set of messages and allow other tools to use the new method to report problems using whatever UI toolkit. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-1cudb7wl996kz7ilz83ctvhr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c0a54341 |
|
13-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Introduce event fallback method The only fallback right now is for HW cpu-cycles -> SW cpu-clock, that was done in the same way in both 'top' and 'record'. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-58l1mgibh9oa9m0pd3fasxa5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
594ac61a |
|
13-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Do missing feature fallbacks in just one place Instead of doing it in stat, top, record or any other tool that opens event descriptors. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vr8hzph83d5t2mdlkf565h84@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
75d9a108 |
|
11-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Export the callchain parsing routine and help Will be used by perf top, that will first setup the symbol system to deal with callchains and then call these routines to ask the kernel for callchains. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-jg0dh8rmlx7x11e7u7mnasvd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c5ff78c3 |
|
11-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Pass perf_record_opts to the callchain cmdline parsing callback Its all it uses and makes the parsing callback suitable for use by 'perf top', which will happen in a followup patch. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wb9eti78bk2jd7wpasro8hsz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f77a9518 |
|
10-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Set the leader in the perf_evlist__config method Since we need to ensure the leader is set before configuring the evsel perf_event_attrs. Reducing the boilerplate needed by tools, helping, for instance, 'perf trace', that wasn't setting the leader. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-22shm0ptkch2kgl7rtqlligx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7be5ebe8 |
|
10-Dec-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Update sample_size when setting sample_type bits We use evsel->sample_size to detect underflows in perf_evsel__parse_sample, but we were failing to update it after perf_evsel__init(), i.e. when we decide, after creating an evsel, that we want some extra field bit set. Fix it by introducing methods to set a bit that will take care of correctly adjusting evsel->sample_size. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2ny5pzsing0dcth7hws48x9c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2711926a |
|
12-Nov-2012 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Ensure single disable call per event in record comand It's possible we issue the event disable ioctl multiple times until we read the final portion of the mmap buffer. Ensuring just single disable ioctl call for event, because there's no need to do that more than once. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1352741644-16809-4-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
774cb499 |
|
12-Nov-2012 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Fix 'disabled' attribute config for record command Currently the record command sets all events initially as disabled. There's non conditional perf_evlist__enable call, that enables all events before we exec tracee program. That actually screws whole enable_on_exec logic, because the event is enabled before the traced program got executed. What we actually want is: 1) For any type of traced program: - all independent events and group leaders are disabled - all group members are enabled Group members are ruled by group leaders. They need to be enabled, because the group scheduling relies on that. 2) For traced programs executed by perf: - all independent events and group leaders have enable_on_exec set - we don't specifically enable or disable any event during the record command Independent events and group leaders are initially disabled and get enabled by exec. Group members are ruled by group leaders as stated in 1). 3) For traced programs attached by perf (pid/tid): - we specifically enable or disable all events during the record command When attaching events to already running traced we enable/disable events specifically, as there's no initial traced exec call. Fixing appropriate perf_event_attr test case to cover this change. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1352741644-16809-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cac21425 |
|
12-Nov-2012 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Fix attributes for '{}' defined event groups Fixing events attributes for groups defined via '{}'. Currently 'enable_on_exec' attribute in record command and both 'disabled ' and 'enable_on_exec' attributes in stat command are set based on the 'group' option. This eliminates proper setup for '{}' defined groups as they don't set 'group' option. Making above attributes values based on the 'evsel->leader' as this is common to both group definition. Moving perf_evlist__set_leader call within builtin-record ahead perf_evlist__config_attrs call, because the latter needs possible group leader links in place. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1352741644-16809-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0089fa98 |
|
20-Oct-2012 |
Jiri Olsa <jolsa@redhat.com> |
perf record: Fix mmap error output condition The mmap_pages default value is not power of 2 (UINT_MAX). Together with perf_evlist__mmap function returning error value different from EPERM, we get misleading error message: "--mmap_pages/-m value must be a power of two." Fixing this by adding extra check for UINT_MAX value for this error condition. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1350743599-4805-12-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2305c82f |
|
13-Sep-2012 |
David Ahern <dsahern@gmail.com> |
perf tools: Give user better message if precise is not supported Platforms (e.g., VM's) without support for precise mode get a confusing error message. e.g., $ perf record -e cycles:p -a -- sleep 1 Error: sys_perf_event_open() syscall returned with 95 (Operation not supported). /bin/dmesg may provide additional information. No hardware sampling interrupt available. No APIC? If so then you can boot the kernel with the "lapic" boot parameter to force-enable it. sleep: Terminated which is not clear that precise mode might be the root problem. With this patch: $ perf record -e cycles:p -fo /tmp/perf.data -- sleep 1 Error: 'precise' request may not be supported. Try removing 'p' modifier sleep: Terminated v2: softened message to 'may not be' supported per Robert's suggestion Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/r/1347569955-54626-4-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
78da39fa |
|
08-Oct-2012 |
Bernhard Rosenkraenzer <Bernhard.Rosenkranzer@linaro.org> |
perf tools: Add on_exit implementation on_exit() is only available in new versions of glibc. It is not implemented in Bionic and will lead to linking errors when compiling for Android. Implement a wrapper for on_exit using atexit. The implementation for on_exit is the one sent by Bernhard Rosenkraenzer in https://lkml.org/lkml/2012/8/23/316. The configuration part from the Makefile is different than the one from the original patch. Signed-off-by: Bernhard Rosenkraenzer <Bernhard.Rosenkranzer@linaro.org> Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Irina Tirdea <irina.tirdea@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1349678613-7045-2-git-send-email-irina.tirdea@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
61eaa3be |
|
01-Oct-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Don't use globals where not needed to Some variables were global but used in just one function, so move it to where it belongs. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2ce3v9qheiobs3sz6pxf4tud@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
95485b1c |
|
28-Sep-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf tools: Convert to LIBUNWIND_SUPPORT For building perf without libunwind, we can set NO_LIBUNWIND=1 as a argument of make. It then defines NO_LIBUNWIND_SUPPORT macro for C code to do the proper handling. However it usually used in a negative semantics - e.g. #ifndef - so we saw double negations which can be misleading. Convert it to a positive form to make it more readable. Also change NO_PERF_REGS macro to HAVE_PERF_REGS for the same reason. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1348824728-14025-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1491a632 |
|
26-Sep-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Renane set_filters method to apply_filters Because that is what it really does, i.e. it applies the filters that were parsed from the command line and stashed into the evsels they refer to. We'll need the set_filter method name to actually apply a filter to all the evsels in an evlist, for instance, to ask that a syswide tracer doesn't trace itself. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-v9x3q9rv4caxtox7wtjpchq5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1863fbbb |
|
20-Sep-2012 |
Stephane Eranian <eranian@google.com> |
perf record: Print event causing perf_event_open() to fail Got tired of not getting the event that caused the perf_event_open() syscall to fail. So I fixed the error message. This is very useful when monitoring lots of events in a single run. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120920161945.GA7064@quad Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1d037ca1 |
|
10-Sep-2012 |
Irina Tirdea <irina.tirdea@gmail.com> |
perf tools: Use __maybe_used for unused variables perf defines both __used and __unused variables to use for marking unused variables. The variable __used is defined to __attribute__((__unused__)), which contradicts the kernel definition to __attribute__((__used__)) for new gcc versions. On Android, __used is also defined in system headers and this leads to warnings like: warning: '__used__' attribute ignored __unused is not defined in the kernel and is not a standard definition. If __unused is included everywhere instead of __used, this leads to conflicts with glibc headers, since glibc has a variables with this name in its headers. The best approach is to use __maybe_unused, the definition used in the kernel for __attribute__((unused)). In this way there is only one definition in perf sources (instead of 2 definitions that point to the same thing: __used and __unused) and it works on both Linux and Android. This patch simply replaces all instances of __used and __unused with __maybe_unused. Signed-off-by: Irina Tirdea <irina.tirdea@intel.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com [ committer note: fixed up conflict with a116e05 in builtin-sched.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8d3eca20 |
|
26-Aug-2012 |
David Ahern <dsahern@gmail.com> |
perf record: Remove use of die/exit Allows perf to clean up properly on exit. If perf-record is exiting due to failure, the on_exit should not run as the session has been deleted. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1346005487-62961-8-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0c21f736 |
|
14-Aug-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Introduce evsel list accessors To replace the longer list_entry constructs for things that are widely used: perf_evlist__{first,last}(evlist) perf_evsel__next(evsel) Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ng7azq26wg1jd801qqpcozwp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
63dab225 |
|
14-Aug-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Rename __group method to __set_leader Just like was done for parse_events__set_leader. Also we need to have the list_entry set_leader method in evlist.c so that we don't grow another dep in the python binding: # ~acme/git/linux/tools/perf/python/twatch.py Traceback (most recent call last): File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module> import perf ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: parse_events__set_leader And also remove a pr_debug from evsel.c so that we avoid this one too: # ~acme/git/linux/tools/perf/python/twatch.py Traceback (most recent call last): File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module> import perf ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0hk9dazg9pora9jylkqngovm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6a4bb04c |
|
07-Aug-2012 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Enable grouping logic for parsed events This patch adds a functionality that allows to create event groups based on the way they are specified on the command line. Adding functionality to the '{}' group syntax introduced in earlier patch. The current '--group/-g' option behaviour remains intact. If you specify it for record/stat/top command, all the specified events become members of a single group with the first event as a group leader. With the new '{}' group syntax you can create group like: # perf record -e '{cycles,faults}' ls resulting in single event group containing 'cycles' and 'faults' events, with cycles event as group leader. All groups are created with regards to threads and cpus. Thus recording an event group within a 2 threads on server with 4 CPUs will create 8 separate groups. Examples (first event in brackets is group leader): # 1 group (cpu-clock,task-clock) perf record --group -e cpu-clock,task-clock ls perf record -e '{cpu-clock,task-clock}' ls # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults) perf record -e '{cpu-clock,task-clock},{minor-faults,major-faults}' ls # 1 group (cpu-clock,task-clock,minor-faults,major-faults) perf record --group -e cpu-clock,task-clock -e minor-faults,major-faults ls perf record -e '{cpu-clock,task-clock,minor-faults,major-faults}' ls # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults) perf record -e '{cpu-clock,task-clock} -e '{minor-faults,major-faults}' \ -e instructions ls # 1 group # (cpu-clock,task-clock,minor-faults,major-faults,instructions) perf record --group -e cpu-clock,task-clock \ -e minor-faults,major-faults -e instructions ls perf record -e '{cpu-clock,task-clock,minor-faults,major-faults,instructions}' ls It's possible to use standard event modifier for a group, which spans over all events in the group and updates each event modifier settings, for example: # perf record -r '{faults:k,cache-references}:p' resulting in ':kp' modifier being used for 'faults' and ':p' modifier being used for 'cache-references' event. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/n/tip-ho42u0wcr8mn1otkalqi13qp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
26d33022 |
|
07-Aug-2012 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: Support for DWARF mode callchain This patch enables perf to use the DWARF unwind code. It extends the perf record '-g' option with following arguments: 'fp' - provides framepointer based user stack backtrace 'dwarf[,size]' - provides DWARF (libunwind) based user stack backtrace. The size specifies the size of the user stack dump. If omitted it is 8192 by default. If libunwind is found during the perf build, then the 'dwarf' argument becomes available for record command. The 'fp' stays as default option in any case. Examples: (perf compiled with libunwind) perf record -g dwarf ls - provides dwarf unwind with 8192 as stack dump size perf record -g dwarf,4096 ls - provides dwarf unwind with 4096 as stack dump size perf record -g -- ls perf record -g fp ls - provides frame pointer unwind Signed-off-by: Jiri Olsa <jolsa@redhat.com> Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Arun Sharma <asharma@fb.com> Cc: Benjamin Redelings <benjamin.redelings@nescent.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/1344345647-11536-13-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
56e6f602 |
|
29-Jul-2012 |
David Ahern <dsahern@gmail.com> |
perf tool: Save cmdline from user in file header vs what is passed to record A number of builtin commands process some user args and then pass the rest to cmd_record. cmd_record then saves argc/argv that it receives into the header of the perf data file. But this loses the arguments handled by the first command -- ie., the real command line from the user. This patch saves the command line as typed by the user rather than what was passed to cmd_record. As an example consider the command: $ perf kvm --guest --host --guestmount=/tmp/guest-mount record -fo /tmp/perf.data -ag -- sleep 10 Currently the command saved to the header is: cmdline : /tmp/p3.5/perf record -o perf.data.kvm -fo /tmp/perf.data -ag -- sleep 1 (ignore the duplicated -o -- the first would be yet another bug with perf-kvm). With this patch the command line saved to the header is: cmdline : /tmp/p3.5/perf kvm --guest --host --guestmount=/tmp/guest-mount record -fo /tmp/perf.data -ag -- sleep 1 v2: simplified to saving the command in parse_options per Stephane's suggestion Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1343616831-6408-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7b56cce2 |
|
01-Aug-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf session: Use perf_evlist__id_hdr_size more extensively Removing perf_session->id_hdr_size, as it can be obtained from the evsel/evlist. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-1nwc2kslu7gsfblu98xbqbll@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7289f83c |
|
11-Jun-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move all users of event_name to perf_evsel__name So that we don't use global variables that could make us misreport event names when having a multi window top, for instance. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-mccancovi1u0wdkg8ncth509@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3780f488 |
|
28-May-2012 |
Namhyung Kim <namhyung@gmail.com> |
perf tools: Convert critical messages to ui__error() There were places where use ui__warning (or even fprintf) to show critical messages. This patch converts them to ui__error so that the front-end code can implement appropriate behavior. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1338265382-6872-3-git-send-email-namhyung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
447a6013 |
|
22-May-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Bump default sample freq to 4 kHz Quoting Ingo: "While at it I'd also suggest increasing the default sampling frequency, from 1000 Hz per CPU to at least 4Khz auto-freq or so - this should work well all across the board I think. CPUs are getting faster and command/app run times are getting shorter, 1Khz is a bit low IMO." Requested-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2jafa6mkrufyekny9ei59lpu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2eeaaa09 |
|
15-May-2012 |
Stephane Eranian <eranian@google.com> |
perf tools: rename HEADER_TRACE_INFO to HEADER_TRACING_DATA To match the PERF_RECORD_HEADER_TRACING_DATA record type. This is the same info as the one used for pipe mode whereas the other one is for regular file output. This will help in the later patch to add meta-data infos in pipe mode. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1337081295-10303-4-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d1cb9fce |
|
16-May-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf target: Add uses_mmap field If perf doesn't mmap on event (like perf stat), it should not create per-task-per-cpu events. So just use a dummy cpu map to create a per-task event for this case. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337161549-9870-3-git-send-email-namhyung.kim@lge.com [ committer note: renamed .need_mmap to .uses_mmap ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d1cae34d |
|
08-May-2012 |
David Ahern <dsahern@gmail.com> |
perf record: Reset event name when falling back to cpu-clock perf-record defaults to the H/W cycles event and if it is not supported falls back to cpu-clock. Reset the event name as well. Signed-off-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1336495811-58461-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
028d455b |
|
08-May-2012 |
David Ahern <dsahern@gmail.com> |
perf record: Fix fallback to cpu-clock on ppc perf-record on PPC is not falling back to cpu-clock: $ perf record -ag -fo /tmp/perf.data -- sleep 1 Error: sys_perf_event_open() syscall returned with 6 (No such device or address). /bin/dmesg may provide additional information. Fatal: No CONFIG_PERF_EVENTS=y kernel support configured? The problem is that until 2.6.37 (behavior changed with commit b0a873e) perf on PPC returns ENXIO when hw_perf_event_init() fails. With this patch we get the expected behavior: $ perf record -ag -fo /tmp/perf.data -v -- sleep 1 Old kernel, cannot exclude guest or host samples. The cycles event is not supported, trying to fall back to cpu-clock-ticks [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.151 MB /tmp/perf.data (~6592 samples) ] Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1336490937-57106-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d67356e7 |
|
06-May-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf target: Consolidate target task/cpu checking There are places that check whether target task/cpu is given or not and some of them didn't check newly introduced uid or cpu list. Add and use three of helper functions to treat them properly. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1336367344-28071-7-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
16ad2ffb |
|
06-May-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf tools: Introduce perf_target__strerror() The perf_target__strerror() sets @buf to a string that describes the (perf_target-specific) error condition that is passed via @errnum. This is similar to strerror_r() and does same thing if @errnum has a standard errno value. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1336367344-28071-6-git-send-email-namhyung.kim@lge.com [ committer note: No need to use PERF_ERRNO_TARGET__SUCCESS, use shorter idiom ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
dfe78ada |
|
06-May-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf target: Introduce perf_target__parse_uid() Add and use the modern perf_target__parse_uid() and get rid of the old parse_target_uid(). Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1336367344-28071-5-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b809ac10 |
|
25-Apr-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf evlist: Make create_maps() take struct perf_target Now we have all information that needed to create cpu/thread maps in struct perf_target, it'd be better using it as an argument. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1335417327-11796-6-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4bd0f2d2 |
|
25-Apr-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf tools: Introduce perf_target__validate() helper The perf_target__validate function is used to check given PID/TID/UID/CPU target options and warn if some combination is impossible. Also this can make some arguments of parse_target_uid() function useless as it is checked before the call via our new helper. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1335417327-11796-5-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bea03405 |
|
25-Apr-2012 |
Namhyung Kim <namhyung.kim@lge.com> |
perf tools: Introduce struct perf_target The perf_target struct will be used for taking care of cpu/thread maps based on user's input. Since it is used on various subcommands it'd better factoring it out. Thanks to Arnaldo for suggesting the better name. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1335417327-11796-2-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5a7ed29c |
|
05-Apr-2012 |
Robert Richter <robert.richter@amd.com> |
perf record: Use sw counter only if hw pmu is not detected Use cpu-clock-tick sw counter for cpu-cycles only if there is no hw pmu available. This is the case if the syscall reports ENOENT. In other cases (e.g. invalid attributes) we don't want the sw counter to be used. Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1333643188-26895-5-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
330aa675 |
|
08-Mar-2012 |
Stephane Eranian <eranian@google.com> |
perf record: Add HEADER_BRANCH_STACK tag This patch adds a new feature bit, namely, HEADER_BRANCH_STACK. When present, it indicates that sample records may contain branch stack. This could be useful to a viewer to switch to branch mode without having to parse all the samples or without a specific cmdline option. This will be used in a subsequent patch to enhance perf report with branch stacks. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: asharma@fb.com Cc: ravitillo@lbl.gov Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1331246868-19905-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
a5aabdac |
|
08-Mar-2012 |
Stephane Eranian <eranian@google.com> |
perf record: Provide default branch stack sampling mode option This patch chanegs the logic of the -b, --branch-stack options of perf record. Based on users' request, the patch provides a default filter mode with the -b (or --branch-any) option. With the option, any type of taken branches is sampled. With -j (or --branch-filter), the user can specify any valid combination of branch types and privilege levels if supported by the underlying hardware. The -b (--branch any) is a shortcut for: --branch-filter any. $ perf record -b foo or: $ perf record --branch-filter any foo For more specific filtering: $ perf record --branch-filter ind_call,u foo Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: asharma@fb.com Cc: ravitillo@lbl.gov Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1331246868-19905-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
bdfebd84 |
|
09-Feb-2012 |
Roberto Agostino Vitillo <ravitillo@lbl.gov> |
perf record: Add support for sampling taken branch This patch adds a new option to enable taken branch stack sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature of perf_events. There is a new option to active this mode: -b. It is possible to pass a set of filters to select the type of branches to sample. The following filters are available: - any : any type of branches - any_call : any function call or system call - any_ret : any function return or system call return - any_ind : any indirect branch - u: only when the branch target is at the user level - k: only when the branch target is in the kernel - hv: only when the branch target is in the hypervisor Filters can be combined by passing a comma separated list to the option: $ perf record -b any_call,u -e cycles:u branchy Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: robert.richter@amd.com Cc: ming.m.lin@intel.com Cc: andi@firstfloor.org Cc: asharma@fb.com Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
bc76efe6 |
|
14-Feb-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Handle kernels that don't support attr.exclude_{guest,host} Just fall back to resetting those fields, if set, warning the user that that feature is not available. If guest samples appear they will just be discarded because no struct machine will be found and thus the event will be accounted as not handled and dropped, see 0c09571. Reported-by: Namhyung Kim <namhyung@gmail.com> Tested-by: Joerg Roedel <joerg.roedel@amd.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6e557a6a |
|
06-Feb-2012 |
David Ahern <dsahern@gmail.com> |
perf record: No build id option fails A recent refactoring of perf-record introduced the following: perf record -a -B Couldn't generating buildids. Use --no-buildid to profile anyway. sleep: Terminated I believe the triple negative was meant to be only a double negative. :-) While I'm there, fixed the grammar on the error message. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
808e1226 |
|
14-Feb-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Invert the sample_id_all logic Instead of requiring that users of perf_record_opts set .sample_id_all_avail to true, just invert the logic, using .sample_id_all_missing, that doesn't need to be explicitely initialized since gcc will zero members ommitted in a struct initialization. Just like the newly introduced .exclude_{guest,host} feature test. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0c978128 |
|
14-Feb-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Handle kernels that don't support attr.exclude_{guest,host} Just fall back to resetting those fields, if set, warning the user that that feature is not available. If guest samples appear they will just be discarded because no struct machine will be found and thus the event will be accounted as not handled and dropped, see 0c09571. Reported-by: Namhyung Kim <namhyung@gmail.com> Tested-by: Joerg Roedel <joerg.roedel@amd.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b52956c9 |
|
08-Feb-2012 |
David Ahern <dsahern@gmail.com> |
perf tools: Allow multiple threads or processes in record, stat, top Allow a user to collect events for multiple threads or processes using a comma separated list. e.g., collect data on a VM and its vhost thread: perf top -p 21483,21485 perf stat -p 21483,21485 -ddd perf record -p 21483,21485 or monitoring vcpu threads perf top -t 21488,21489 perf stat -t 21488,21489 -ddd perf record -t 21488,21489 Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1328718772-16688-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d3665498 |
|
06-Feb-2012 |
David Ahern <dsahern@gmail.com> |
perf record: No build id option fails A recent refactoring of perf-record introduced the following: perf record -a -B Couldn't generating buildids. Use --no-buildid to profile anyway. sleep: Terminated I believe the triple negative was meant to be only a double negative. :-) While I'm there, fixed the grammar on the error message. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
781ba9d2 |
|
15-Dec-2011 |
Robert Richter <robert.richter@amd.com> |
perf record: Make feature initialization generic Loop over all features to enable it instead of explicitly enabling every single feature. Reducing duplicate code and making it more robust to later changes e.g. when adding more features. Cc: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/1323966762-8574-3-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0d37aa34 |
|
19-Jan-2012 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Introduce per user view The new --uid command line option will show only the tasks for a given user, using the proc interface to figure out the existing tasks. Kernel work is needed to close races at startup, but this should already be useful in many use cases. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-bdnspm000gw2l984a2t53o8z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e20960c0 |
|
07-Dec-2011 |
Robert Richter <robert.richter@amd.com> |
perf tools: Unify handling of features when writing feature section The features HEADER_TRACE_INFO and HEADER_BUILD_ID are handled different when writing the feature section. All other features are simply disabled on failure and writing the section goes on without returning an error. There is no reason for these special cases. This patch unifies handling of the features. This should be ok since all features can be parsed independently. Offset and size of a feature's block is stored in struct perf_file_ section right after the data block of perf.data (see perf_session__ write_header()). Thus, if a feature does not exist then other features can be processed anyway. Also moving special code for HEADER_BUILD_ID out to write_build_id(). v2: * perf record throws an error now if buildids may not be generated, which can be disabled with the --no-buildid option. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-6-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
41d0d933 |
|
19-Dec-2011 |
Nelson Elhage <nelhage@nelhage.com> |
perf: builtin-record: Document and check that mmap_pages must be a power of two. Now that we automatically point users at it, let's provide them some guidance so that they hopefully don't just get mysterious EINVAL's from the kernel. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1324301972-22740-4-git-send-email-nelhage@nelhage.com Signed-off-by: Nelson Elhage <nelhage@nelhage.com> [ committer note: Made it work after 50a682c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
18e60939 |
|
19-Dec-2011 |
Nelson Elhage <nelhage@nelhage.com> |
perf: builtin-record: Provide advice if mmap'ing fails with EPERM. This failure is most likely due to running up against the kernel.perf_event_mlock_kb sysctl, so we can tell the user what to do to fix the issue. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1324301972-22740-3-git-send-email-nelhage@nelhage.com Signed-off-by: Nelson Elhage <nelhage@nelhage.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3e76ac78 |
|
20-Dec-2011 |
Andrew Vagin <avagin@openvz.org> |
perf record: Add ability to record event period The problem is that when SAMPLE_PERIOD is not set, the kernel generates a number of samples in proportion to an event's period. Number of these samples may be too big and the kernel throttles all samples above a defined limit. E.g.: I want to trace when a process sleeps. I created a process which sleeps for 1ms and for 4ms. perf got 100 events in both cases. swapper 0 [000] 1141.371830: sched_stat_sleep: comm=foo pid=1801 delay=1386750 [ns] swapper 0 [000] 1141.369444: sched_stat_sleep: comm=foo pid=1801 delay=4499585 [ns] In the first case a kernel want to send 4499585 events and in the second case it wants to send 1386750 events. perf-reports shows that process sleeps in both places equal time. Instead of this we can get only one sample with an attribute period. As result we have less data transferring between kernel and user-space and we avoid throttling of samples. The patch "events: Don't divide events if it has field period" added a kernel part of this functionality. Acked-by: Arun Sharma <asharma@fb.com> Cc: Arun Sharma <asharma@fb.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: devel@openvz.org Link: http://lkml.kernel.org/r/1324391565-1369947-1-git-send-email-avagin@openvz.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
806fb630 |
|
29-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Always do automatic allocation of pollfd and mmap structures At first tools were required to do that, but while writing the python bindings to simplify the API I made them auto-allocate when needed. This just makes record, stat and top use that auto allocation, simplifying them a bit. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-iokhcvkzzijr3keioubx8hlq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
45694aa7 |
|
28-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Rename perf_event_ops to perf_tool To better reflect that it became the base class for all tools, that must be in each tool struct and where common stuff will be put. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
743eb868 |
|
28-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Resolve machine earlier and pass it to perf_event_ops Reducing the exposure of perf_session further, so that we can use the classes in cases where no perf.data file is created. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d20deb64 |
|
25-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Pass tool context in the the perf_event_ops functions So that we don't need to have that many globals. Next steps will remove the 'session' pointer, that in most cases is not needed. Then we can rename perf_event_ops to 'perf_tool' that better describes this class hierarchy. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ed80f581 |
|
11-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Move 'group' to perf_event_ops Will be used in other tools to share the command line parsing code. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-8x0yr77r6lrd2t699s499m8n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
01c2d99b |
|
09-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Move mmap_pages to perf_record_opts Tools being developed will need this to allow the user to override this value. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-zydc1yhxfm0z35fuy95bsn1l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
50a682ce |
|
09-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Handle default value for 'pages' on mmap method Every tool that calls this and allows the user to override the value needs this logic. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-lwscxpg57xfzahz5dmdfp9uz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
35b9d88e |
|
09-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Introduce {prepare,start}_workload refactored from 'perf record' So that we can easily start a workload in other tools. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-zdsksd4aphu0nltg2lpwsw3x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0f82ebc4 |
|
08-Nov-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Introduce config attr method Out of the code in 'perf record', so that we can share option parsing, etc. Eventually will be used by 'perf top', but first 'trace' will use it. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-hzjqsgnte1esk90ytq0ap98v@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b8631e6e |
|
26-Oct-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf ui: Rename ui__warning_paranoid to ui__error_paranoid As it will exit the tool after the user is notified. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vy06m8xzlvkhr8tk7nylhbng@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
727ab04e |
|
25-Oct-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Fix grouping of multiple events The __perf_evsel__open routing was grouping just the threads for that specific events per cpu when we want to group all threads in all events to the first fd opened on that cpu. So pass the xyarray with the first event, where the other events will be able to get that first per cpu fd. At some point top and record will switch to using perf_evlist__open that takes care of this detail and probably will also handle the fallback from hw to soft counters, etc. Reported-by: Deng-Cheng Zhu <dczhu@mips.com> Tested-by: Deng-Cheng Zhu <dczhu@mips.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ebm34rh098i9y9v4cytfdp0x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fbe96f29 |
|
30-Sep-2011 |
Stephane Eranian <eranian@google.com> |
perf tools: Make perf.data more self-descriptive (v8) The goal of this patch is to include more information about the host environment into the perf.data so it is more self-descriptive. Overtime, profiles are captured on various machines and it becomes hard to track what was recorded, on what machine and when. This patch provides a way to solve this by extending the perf.data file with basic information about the host machine. To add those extensions, we leverage the feature bits capabilities of the perf.data format. The change is backward compatible with existing perf.data files. We define the following useful new extensions: - HEADER_HOSTNAME: the hostname - HEADER_OSRELEASE: the kernel release number - HEADER_ARCH: the hw architecture - HEADER_CPUDESC: generic CPU description - HEADER_NRCPUS: number of online/avail cpus - HEADER_CMDLINE: perf command line - HEADER_VERSION: perf version - HEADER_TOPOLOGY: cpu topology - HEADER_EVENT_DESC: full event description (attrs) - HEADER_CPUID: easy-to-parse low level CPU identication The small granularity for the entries is to make it easier to extend without breaking backward compatiblity. Many entries are provided as ASCII strings. Perf report/script have been modified to print the basic information as easy-to-parse ASCII strings. Extended information about CPU and NUMA topology may be requested with the -I option. Thanks to David Ahern for reviewing and testing the many versions of this patch. $ perf report --stdio # ======== # captured on : Mon Sep 26 15:22:14 2011 # hostname : quad # os release : 3.1.0-rc4-tip # perf version : 3.1.0-rc4 # arch : x86_64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz # cpuid : GenuineIntel,6,15,11 # total memory : 8105360 kB # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31, # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # ======== # ... $ perf report --stdio -I # ======== # captured on : Mon Sep 26 15:22:14 2011 # hostname : quad # os release : 3.1.0-rc4-tip # perf version : 3.1.0-rc4 # arch : x86_64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz # cpuid : GenuineIntel,6,15,11 # total memory : 8105360 kB # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31, # sibling cores : 0-3 # sibling threads : 0 # sibling threads : 1 # sibling threads : 2 # sibling threads : 3 # node0 meminfo : total = 8320608 kB, free = 7571024 kB # node0 cpu list : 0-3 # ======== # ... Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad Signed-off-by: Stephane Eranian <eranian@google.com> [ committer notes: Use --show-info in the tools as was in the docs, rename perf_header_fprintf_info to perf_file_section__fprintf_info, fixup conflict with f69b64f7 "perf: Support setting the disassembler style" ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
33e49ea7 |
|
15-Sep-2011 |
Andi Kleen <ak@linux.intel.com> |
perf tools: Make stat/record print fatal signals of the target program When a program crashes under perf there is no message about it, unlike when running it from bash. This can be confusing and lead to wrong actions during debugging. Print fatal signals in perf stat/record. Thanks to Furat Afram for finding the problem originally Link: http://lkml.kernel.org/r/1316122302-24306-1-git-send-email-andi@firstfloor.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
764e16a3 |
|
25-Aug-2011 |
David Ahern <dsahern@gmail.com> |
perf record: Create events initially disabled and enable after init perf-record currently creates events enabled. When doing a system wide collection (-a arg) this causes data collection for perf's initialization activities -- eg., perf_event__synthesize_threads(). For some events (e.g., context switch S/W event or tracepoints like syscalls) perf's initialization causes a lot of events to be captured frequently generating "Check IO/CPU overload!" warnings on larger systems (e.g., 2 socket, quad core, hyperthreading). perf's initialization phase can be skipped by creating events disabled and then enabling them once the initialization is done. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1314289075-14706-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
43bece79 |
|
17-Aug-2011 |
Lin Ming <ming.m.lin@intel.com> |
perf tools: Add group event scheduling option to perf record/stat Group event scheduling command line option is missing in perf record/stat. Add it to perf record/stat, which is same as in perf top. Reported-by: Andi Kleen <andi@firstfloor.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1313577727.2754.5.camel@hp6530s Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4152ab37 |
|
25-Jul-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Introduce 'disable' method To remove the last case of access to the FD() macro outside the library. Inspired by a patch by Borislav that moved the FD() macro to util.h, for namespace concerns I rather preferred to constrain it to ev{sel,list}.c. Cc: Borislav Petkov <bp@amd64.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qn893qsstcg366tkucu649qj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f120f9d5 |
|
14-Jul-2011 |
Jiri Olsa <jolsa@redhat.com> |
perf tools: De-opt the parse_events function Moving out the option parameter from parse_events function, and adding new parse_events_option function instead. The option parameter is used only to carry "struct perf_evlist" pointer for chaining new events. Putting it away, enable us to call parse_events from other places without using the option parameter. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: acme@redhat.com Cc: a.p.zijlstra@chello.nl Cc: paulus@samba.org Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
646aaea6 |
|
27-May-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Make sure kptr_restrict warnings fit 80 col terms Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ec80fde7 |
|
26-May-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf symbols: Handle /proc/sys/kernel/kptr_restrict Perf uses /proc/modules to figure out where kernel modules are loaded. With the advent of kptr_restrict, non root users get zeroes for all module start addresses. So check if kptr_restrict is non zero and don't generate the syntethic PERF_RECORD_MMAP events for them. Warn the user about it in perf record and in perf report. In perf report the reference relocation symbol being zero means that kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't use it to fixup symbol addresses when using a valid kallsyms (in the buildid cache) or vmlinux (in the vmlinux path) build-id located automatically or specified by the user. Provide an explanation about it in 'perf report' if kernel samples were taken, checking if a suitable vmlinux or kallsyms was found/specified. Restricted /proc/kallsyms don't go to the buildid cache anymore. Example: [acme@emilia ~]$ perf record -F 100000 sleep 1 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ] [acme@emilia ~]$ [acme@emilia ~]$ perf report --stdio Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. If some relocation was applied (e.g. kexec) symbols may be misresolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ..................... # 20.24% sleep [kernel.kallsyms] [k] page_fault 20.04% sleep [kernel.kallsyms] [k] filemap_fault 19.78% sleep [kernel.kallsyms] [k] __lru_cache_add 19.69% sleep ld-2.12.so [.] memcpy 14.71% sleep [kernel.kallsyms] [k] dput 4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers 0.73% sleep [kernel.kallsyms] [k] perf_event_comm 0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ This is because it found a suitable vmlinux (build-id checked) in /lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long file name). If we remove that file from the vmlinux path: [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \ /lib/modules/2.6.39-rc7+/build/vmlinux.OFF [acme@emilia ~]$ perf report --stdio [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562 not found, continuing without symbols Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. As no suitable kallsyms nor vmlinux was found, kernel samples can't be resolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ...... # 80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a 19.69% sleep ld-2.12.so [.] memcpy # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ Reported-by: Stephane Eranian <eranian@google.com> Suggested-by: David Miller <davem@davemloft.net> Cc: Dave Jones <davej@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
aece948f |
|
15-May-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Fix per thread mmap setup The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using --pid when monitoring multithreaded apps, as we can only share a ring buffer for events on the same thread if not doing per cpu. Fix it by using per thread ring buffers. Tested with: [root@felicio ~]# tuna -t 26131 -CP | nl 1 thread ctxt_switches 2 pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 3 26131 OTHER 0 0,1 10814276 2397830 chromium-browse 4 642 OTHER 0 0,1 14688 0 chromium-browse 5 26148 OTHER 0 0,1 713602 115479 chromium-browse 6 26149 OTHER 0 0,1 801958 2262 chromium-browse 7 26150 OTHER 0 0,1 1271128 248 chromium-browse 8 26151 OTHER 0 0,1 3 0 chromium-browse 9 27049 OTHER 0 0,1 36796 9 chromium-browse 10 618 OTHER 0 0,1 14711 0 chromium-browse 11 661 OTHER 0 0,1 14593 0 chromium-browse 12 29048 OTHER 0 0,1 28125 0 chromium-browse 13 26143 OTHER 0 0,1 2202789 781 chromium-browse [root@felicio ~]# So 11 threads under pid 26131, then: [root@felicio ~]# perf record -F 50000 --pid 26131 [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# 11 mmaps, one per thread since we didn't specify any CPU list, so we need one mmap per thread and: [root@felicio ~]# perf record -F 50000 --pid 26131 ^M ^C[ perf record: Woken up 79 times to write data ] [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 371310 26131 2 96516 26148 3 95694 26149 4 95203 26150 5 7291 26143 6 87 27049 7 76 661 8 60 29048 9 47 618 10 43 642 [root@felicio ~]# Ok, one of the threads, 26151 was quiescent, so no samples there, but all the others are there. Then, if I specify one CPU: [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1 ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 8444 26131 2 2584 26149 3 2518 26148 4 2324 26150 5 123 26143 6 9 661 7 9 29048 [root@felicio ~]# This machine has two cores, so fewer threads appeared on the radar, and: [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# Just one mmap, as now we can use just one per-cpu buffer instead of the per-thread needed in the previous case. For global profiling: [root@felicio ~]# perf record -F 50000 -a ^C[ perf record: Woken up 26 times to write data ] [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ] [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 2 7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# It uses per-cpu buffers. For just one thread: [root@felicio ~]# perf record -F 50000 --tid 26148 ^C[ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 9969 26148 [root@felicio ~]# [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Lin Ming <ming.m.lin@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5d2cd909 |
|
14-Apr-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Fix use of inherit perf stat doesn't mmap and its perfectly fine for it to use task-bound counters with inheritance. So set the attr.inherit on the caller and leave the syscall itself to validate it. When the mmap fails perf_evlist__mmap will just emit a warning if this is the failure reason. Reported-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
800cd25c |
|
30-Mar-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf: mmap 512 kiB by default The default setting of perf record is to mmap 128 pages if the user did not override with -m. However the page size may vary accross different architecture settings, giving different default size between each. Moreover the kernel side still has a default max number of mlocked pages of 512 kiB + 1 page for unprivileged users. 128 + 1 pages with page size > 4096 overlaps this threshold. Thus, better adapt to this limitation and set the default number of pages to fit those 512 kiB + 1 page. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1301535324-9735-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
ca6a4258 |
|
25-Mar-2011 |
David Ahern <daahern@cisco.com> |
perf tools: Emit clearer message for sys_perf_event_open ENOENT return Resend of patch sent back in January 2011 in light of recent confusion around unsupported events for a given platform. Improve sys_perf_event_open ENOENT return handling in top and record, just like 5a3446b does for stat. Retry of Arnaldo's patch using ui_warning instead of die which allows the fallback from hardware cycles to software clock. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org LKML-Reference: <1301080271-20945-1-git-send-email-daahern@cisco.com> Signed-off-by: David Ahern <daahern@cisco.com> [ committer note: Some adjustments to make it apply to newer codebase ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c286c419 |
|
28-Mar-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Fixup exit path when not able to open events We have to deal with the TUI mode in perf top, so that we don't end up with a garbled screen when, say, a non root user on a machine with a paranoid setting (the default) tries to use 'perf top'. Introduce a ui__warning_paranoid() routine shared by top and record that tells the user the valid values for /proc/sys/kernel/perf_event_paranoid. Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a91e5431 |
|
10-Mar-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf session: Use evlist/evsel for managing perf.data attributes So that we can reuse things like the id to attr lookup routine (perf_evlist__id2evsel) that uses a hash table instead of the linear lookup done in the older perf_header_attr routines, etc. Also to make evsels/evlist more pervasive an API, simplyfing using the emerging perf lib. cc: Arun Sharma <arun@sharma-home.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0a102479 |
|
25-Feb-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf: Set filters before mmaping events We currently set the filters after we mmap the events, this is a race that let undesired events record themselves in the buffer before we had the time to set the filters. So set the filters before they can be recorded. That also librarizes the filters setting so that filtering can be done more easily from other tools than perf record later. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org>
|
#
712a4b60 |
|
16-Feb-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Delay setting the header writing atexit call While testing the --filter option I noticed that we were writing lots of unneeded stuff to the perf.data header when the filter ioctl fails, so move the atexit(atexit_header) call to after we create the counters successfully. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
023695d9 |
|
14-Feb-2011 |
Stephane Eranian <eranian@google.com> |
perf tool: Add cgroup support This patch adds the ability to filter monitoring based on container groups (cgroups) for both perf stat and perf record. It is possible to monitor multiple cgroup in parallel. There is one cgroup per event. The cgroups to monitor are passed via a new -G option followed by a comma separated list of cgroup names. The cgroup filesystem has to be mounted. Given a cgroup name, the perf tool finds the corresponding directory in the cgroup filesystem and opens it. It then passes that file descriptor to the kernel. Example: $ perf stat -B -a -e cycles:u,cycles:u,cycles:u -G test1,,test2 -- sleep 1 Performance counter stats for 'sleep 1': 2,368,667,414 cycles test1 2,369,661,459 cycles <not counted> cycles test2 1.001856890 seconds time elapsed Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d590290.825bdf0a.7d0a.4890@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
401b8e13 |
|
09-Feb-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Fix thread_map event synthesizing in top and record Jeff Moyer reported these messages: Warning: ... trying to fall back to cpu-clock-ticks couldn't open /proc/-1/status couldn't open /proc/-1/maps [ls output] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ] That lead me and David Ahern to see that something was fishy on the thread synthesizing routines, at least for the case where the workload is started from 'perf record', as -1 is the default for target_tid in 'perf record --tid' parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and PERF_RECORD_COMM events for the thread -1, a bug. So I investigated this and noticed that when we introduced support for recording a process and its threads using --pid some bugs were introduced and that the way to fix it was to instead of passing the target_tid to the event synthesizing routines we should better pass the thread_map that has the list of threads for a --pid or just the single thread for a --tid. Checked in the following ways: On a 8-way machine run cyclictest: [root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50 policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798 T: 0 (28791) P:99 I:100 C: 25072 Min: 4 Act: 5 Avg: 6 Max: 122 T: 1 (28792) P:98 I:150 C: 16715 Min: 4 Act: 6 Avg: 5 Max: 27 T: 2 (28793) P:97 I:200 C: 12534 Min: 4 Act: 5 Avg: 4 Max: 8 T: 3 (28794) P:96 I:250 C: 10028 Min: 4 Act: 5 Avg: 5 Max: 96 T: 4 (28795) P:95 I:300 C: 8357 Min: 5 Act: 6 Avg: 5 Max: 12 T: 5 (28796) P:94 I:350 C: 7163 Min: 5 Act: 6 Avg: 5 Max: 12 T: 6 (28797) P:93 I:400 C: 6267 Min: 4 Act: 5 Avg: 5 Max: 9 T: 7 (28798) P:92 I:450 C: 5571 Min: 4 Act: 5 Avg: 5 Max: 9 ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ] [root@emilia ~]# This will create one extra thread per CPU: [root@emilia ~]# tuna -t cyclictest -CP thread ctxt_switches pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 28825 OTHER 0 0xff 2169 671 cyclictest 28832 FIFO 93 6 52338 1 cyclictest 28833 FIFO 92 7 46524 1 cyclictest 28826 FIFO 99 0 209360 1 cyclictest 28827 FIFO 98 1 139577 1 cyclictest 28828 FIFO 97 2 104686 0 cyclictest 28829 FIFO 96 3 83751 1 cyclictest 28830 FIFO 95 4 69794 1 cyclictest 28831 FIFO 94 5 59825 1 cyclictest [root@emilia ~]# So we should expect only samples for the above 9 threads when using the --dump-raw-trace|-D perf report switch to look at the column with the tid: [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 629 28825 110 28826 491 28827 308 28828 198 28829 621 28830 225 28831 203 28832 89 28833 [root@emilia ~]# So for workloads started by 'perf record' seems to work, now for existing workloads, just run cyclictest first, without 'perf record': [root@emilia ~]# tuna -t cyclictest -CP thread ctxt_switches pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 28859 OTHER 0 0xff 594 200 cyclictest 28864 FIFO 95 4 16587 1 cyclictest 28865 FIFO 94 5 14219 1 cyclictest 28866 FIFO 93 6 12443 0 cyclictest 28867 FIFO 92 7 11062 1 cyclictest 28860 FIFO 99 0 49779 1 cyclictest 28861 FIFO 98 1 33190 1 cyclictest 28862 FIFO 97 2 24895 1 cyclictest 28863 FIFO 96 3 19918 1 cyclictest [root@emilia ~]# and then later did: [root@emilia ~]# perf record --pid 28859 sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ] [root@emilia ~]# To collect 3 seconds worth of samples for pid 28859 and its children: [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 15 28859 33 28860 19 28861 13 28862 13 28863 10 28864 11 28865 9 28866 255 28867 [root@emilia ~]# Works, last thing is to check if looking at just one of those threads also works: [root@emilia ~]# perf record --tid 28866 sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ] [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c 3 28866 [root@emilia ~]# Works too. Reported-by: Jeff Moyer <jmoyer@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7e2ed097 |
|
30-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Store pointer to the cpu and thread maps So that we don't have to pass it around to the several methods that needs it, simplifying usage. There is one case where we don't have the thread/cpu map in advance, which is in the parsing routines used by top, stat, record, that we have to wait till all options are parsed to know if a cpu or thread list was passed to then create those maps. For that case consolidate the cpu and thread map creation via perf_evlist__create_maps() out of the code in top and record, while also providing a perf_evlist__set_maps() for cases where multiple evlists share maps or for when maps that represent CPU sockets, for instance, get crafted out of topology information or subsets of threads in a particular application are to be monitored, providing more granularity in specifying which cpus and threads to monitor. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8115d60c |
|
29-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Kill event_t typedef, use 'union perf_event' instead And move the event_t methods to the perf_event__ too. No code changes, just namespace consistency. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8d50e5b4 |
|
29-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Rename 'struct sample_data' to 'struct perf_sample' Making the namespace more uniform. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
dc82009a |
|
28-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: No need to check for overwrites As we open the mmap with (PROT_READ | PROT_WRITE), signalling the kernel with perf_mmap__write_tail() when consuming data, so the kernel will not overwrite. Suggested-by: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fd78260b |
|
18-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf threads: Move thread_map to separate file To untangle it from struct thread handling, that is tied to symbols, etc. Right now in the python bindings I'm working on I need just a subset of the util/ files, untangling it allows me to do that. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d7065adb |
|
16-Jan-2011 |
Franck Bui-Huu <fbuihuu@gmail.com> |
perf record: auto detect when stdout is a pipe This patch gives the ability to 'perf record' to detect when its stdout has been redirected to a pipe. There's now no more need to add '-o -' switch in this case. However '-o <path>' option has always precedence, that is if specified and stdout has been connected via a pipe then the output will go into the specified output. LKML-Reference: <m3ipxo966i.fsf@gmail.com> Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0a27d7f9 |
|
14-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Use perf_evlist__mmap There is more stuff that can go to the perf_ev{sel,list} layer, like detecting if sample_id_all is available, etc, but lets try using this in 'perf test' first. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
115d2d89 |
|
12-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Move perf_mmap__write_tail to perf.h Close to perf_mmap__read_head() and the perf_mmap struct definition. This is useful for any recorder, and we will need it in 'perf test'. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
744bd8aa |
|
12-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Use struct perf_mmap and helpers Paving the way to using perf_evsel->mmap, do this to reduce the patch noise in the next ones. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
dd7927f4 |
|
12-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Use perf_evsel__open Now its time to factor out the mmap handling bits into the perf_evsel class. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5c581041 |
|
11-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Adopt the pollfd array Allocating just the space needed for nr_cpus * nr_threads * nr_evsels, not the MAX_NR_CPUS and counters. LKML-Reference: <new-submission> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
361c99a6 |
|
11-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Introduce perf_evlist Killing two more perf wide global variables: nr_counters and evsel_list as a list_head. There are more operations that will need more fields in perf_evlist, like the pollfd for polling all the fds in a list of evsel instances. Use option->value to pass the evsel_list to parse_{events,filters}. LKML-Reference: <new-submission> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9486aa38 |
|
22-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Fix 64 bit integer format strings Using %L[uxd] has issues in some architectures, like on ppc64. Fix it by making our 64 bit integers typedefs of stdint.h types and using PRI[ux]64 like, for instance, git does. Reported by Denis Kirjanov that provided a patch for one case, I went and changed all cases. Reported-by: Denis Kirjanov <dkirjanov@kernel.org> Tested-by: Denis Kirjanov <dkirjanov@kernel.org> LKML-Reference: <20110120093246.GA8031@hera.kernel.org> Cc: Denis Kirjanov <dkirjanov@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pingtian Han <phan@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ad7f4e3f |
|
17-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Fix tracepoint id to string perf.data header table It was broken by f006d25 that passed just the event name, not the complete sys:event that it expected to open the /sys/.../sys/sys:event/id file to get the id. Fix it by moving it to after parse_events in cmd_record, as at that point we can just traverse the evsel_list and use evsel->attr.config + event_name(evsel) instead of re-opening the /id file. Reported-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Cc: Franck Bui-Huu <vagabon.xyz@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Han Pingtian <phan@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <20110117202801.GG2085@ghostprotocols.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
acac03fa |
|
12-Jan-2011 |
Kirill Smelkov <kirr@mns.spb.ru> |
perf record: Add "nodelay" mode, disabled by default Sometimes there is a need to use perf in "live-log" mode. The problem is, for seldom events, actual info output is largely delayed because perf-record reads sample data in whole pages. So for such scenarious, add flag for perf-record to go in "nodelay" mode. To track e.g. what's going on in icmp_rcv while ping is running Use it with something like this: (1) $ perf probe -L icmp_rcv | grep -U8 '^ *43\>' goto error; } 38 if (!pskb_pull(skb, sizeof(*icmph))) goto error; icmph = icmp_hdr(skb); 43 ICMPMSGIN_INC_STATS_BH(net, icmph->type); /* * 18 is the highest 'known' ICMP type. Anything else is a mystery * * RFC 1122: 3.2.2 Unknown ICMP messages types MUST be silently * discarded. */ 50 if (icmph->type > NR_ICMP_TYPES) goto error; $ perf probe icmp_rcv:43 'type=icmph->type' (2) $ cat trace-icmp.py [...] def trace_begin(): print "in trace_begin" def trace_end(): print "in trace_end" def probe__icmp_rcv(event_name, context, common_cpu, common_secs, common_nsecs, common_pid, common_comm, __probe_ip, type): print_header(event_name, common_cpu, common_secs, common_nsecs, common_pid, common_comm) print "__probe_ip=%u, type=%u\n" % \ (__probe_ip, type), [...] (3) $ perf record -a -D -e probe:icmp_rcv -o - | \ perf script -i - -s trace-icmp.py Thanks to Peter Zijlstra for pointing how to do it. Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <20110112140613.GA11698@tugrik.mns.mnsspb.ru> Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4ad9f594 |
|
11-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
Revert "perf tools: Emit clearer message for sys_perf_event_open ENOENT return" This reverts commit aa7bc7ef73efc46d7c3a0e185eefaf85744aec98. It removed the fallback from hardware profiling to software profiling. .e.g., in a VM with no PMU. Reported-by: David Ahern <daahern@cisco.com> Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bd3bfe9e |
|
10-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Fix order of event list deletion We need to defer calling perf_evsel_list__delete() till after atexit registered routines, because we need to traverse the events being recorded at that time at least on 'perf record'. This fixes the problem reported by Thomas Renninger where cmd_record called by cmd_timechart would not write the tracing data to the perf.data file header because the evsel_list at atexit (control+C on 'perf timechart record') time would be empty, being already deleted by run_builtin(), and thus 'perf timechart' when trying to process such perf.data file would die with: "no trace data in the file" Problem introduced in 70d544d. Reported-by: Thomas Renninger <trenn@suse.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
aa7bc7ef |
|
10-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Emit clearer message for sys_perf_event_open ENOENT return Improve sys_perf_event_open ENOENT return handling in top and record, just like 5a3446b does for stat. Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5c98d466 |
|
03-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Refactor all_tids to hold nr and the map So that later, we can pass the thread_map instance instead of (thread_num, thread_map) for things like perf_evsel__open and friends, just like was done with cpu_map. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
60d567e2 |
|
03-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Refactor cpumap to hold nr and the map So that later, we can pass the cpu_map instance instead of (nr_cpus, cpu_map) for things like perf_evsel__open and friends. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
70d544d0 |
|
03-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Delete the event selectors at exit Freeing all the possibly allocated resources, reducing complexity on each tool exit path. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
69aad6f1 |
|
03-Jan-2011 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Introduce event selectors Out of ad-hoc code and global arrays with hard coded sizes. This is the first step on having a library that will be first used on regression tests in the 'perf test' tool. [acme@felicio linux]$ size /tmp/perf.before text data bss dec hex filename 1273776 97384 5104416 6475576 62cf38 /tmp/perf.before [acme@felicio linux]$ size /tmp/perf.new text data bss dec hex filename 1275422 97416 1392416 2765254 2a31c6 /tmp/perf.new Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a43d3f08 |
|
24-Dec-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Fix use of sample_id_all userspace with !sample_id_all kernels Check if parse_single_tracepoint_event has already asked for PERF_SAMPLE_TIME. This is kludgy but short term fix for problems introduced by eac23d1c that broke 'perf script' by having different sample_types when using multiple tracepoint events when we use a perf binary that tries to use sample_id_all on an older kernel. We need to move counter creation to perf_session, support different sample_types, etc. Ongoing work on the perf test infrastructure needs this so that we can create counters to monitor threads generating specific events, etc. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Torok Edwin <edwintorok@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
eac23d1c |
|
08-Dec-2010 |
Ian Munsie <imunsie@au1.ibm.com> |
perf record,report,annotate,diff: Process events in order This patch changes perf report to ask for the ID info on all events be default if recording from multiple CPUs. Perf report, annotate and diff will now process the events in order if the kernel is able to provide timestamps on all events. This ensures that events such as COMM and MMAP which are necessary to correctly interpret samples are processed prior to those samples so that they are attributed correctly. Before: # perf record ./cachetest # perf report # Events: 6K cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 74.11% :3259 [unknown] [k] 0x4a6c 1.50% cachetest ld-2.11.2.so [.] 0x1777c 1.46% :3259 [kernel.kallsyms] [k] .perf_event_mmap_ctx 1.25% :3259 [kernel.kallsyms] [k] restore 0.74% :3259 [kernel.kallsyms] [k] ._raw_spin_lock 0.71% :3259 [kernel.kallsyms] [k] .filemap_fault 0.66% :3259 [kernel.kallsyms] [k] .memset 0.54% cachetest [kernel.kallsyms] [k] .sha_transform 0.54% :3259 [kernel.kallsyms] [k] .copy_4K_page 0.54% :3259 [kernel.kallsyms] [k] .find_get_page 0.52% :3259 [kernel.kallsyms] [k] .trace_hardirqs_off 0.50% :3259 [kernel.kallsyms] [k] .__do_fault <SNIP> After: # perf report # Events: 6K cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 44.28% cachetest cachetest [.] sumArrayNaive 22.53% cachetest cachetest [.] sumArrayOptimal 6.59% cachetest ld-2.11.2.so [.] 0x1777c 2.13% cachetest [unknown] [k] 0x340 1.46% cachetest [kernel.kallsyms] [k] .perf_event_mmap_ctx 1.25% cachetest [kernel.kallsyms] [k] restore 0.74% cachetest [kernel.kallsyms] [k] ._raw_spin_lock 0.71% cachetest [kernel.kallsyms] [k] .filemap_fault 0.66% cachetest [kernel.kallsyms] [k] .memset 0.54% cachetest [kernel.kallsyms] [k] .copy_4K_page 0.54% cachetest [kernel.kallsyms] [k] .find_get_page 0.54% cachetest [kernel.kallsyms] [k] .sha_transform 0.52% cachetest [kernel.kallsyms] [k] .trace_hardirqs_off 0.50% cachetest [kernel.kallsyms] [k] .__do_fault <SNIP> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1291872833-839-1-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
21ef97f0 |
|
09-Dec-2010 |
Ian Munsie <imunsie@au1.ibm.com> |
perf session: Fallback to unordered processing if no sample_id_all If we are running the new perf on an old kernel without support for sample_id_all, we should fall back to the old unordered processing of events. If we didn't than we would *always* process events without timestamps out of order, whether or not we hit a reordering race. In other words, instead of there being a chance of not attributing samples correctly, we would guarantee that samples would not be attributed. While processing all events without timestamps before events with timestamps may seem like an intuitive solution, it falls down as PERF_RECORD_EXIT events would also be processed before any samples. Even with a workaround for that case, samples before/after an exec would not be attributed correctly. This patch allows commands to indicate whether they need to fall back to unordered processing, so that commands that do not care about timestamps on every event will not be affected. If we do fallback, this will print out a warning if report -D was invoked. This patch adds the test in perf_session__new so that we only need to test once per session. Commands that do not use an event_ops (such as record and top) can simply pass NULL in it's place. Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1291951882-sup-6069@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
18483b81 |
|
06-Dec-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Fix eternal wait for stillborn child When execvp fails to find the specified command on the path we won't get SIGCHLD, so send a SIGUSR1 and exit right away. Current situation would require a SIGINT performed by the user and would produce meaningless summary. Now: [acme@emilia linux]$ ./foo -bash: ./foo: No such file or directory [acme@emilia linux]$ perf record ./foo ./foo: No such file or directory [acme@emilia linux]$ Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9c90a61c |
|
02-Dec-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
640c03ce |
|
02-Dec-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf session: Parse sample earlier At perf_session__process_event, so that we reduce the number of lines in eache tool sample processing routine that now receives a sample_data pointer already parsed. This will also be useful in the next patch, where we'll allow sample the identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu, timestamp) just after before every event. Also validate callchains in perf_session__process_event, i.e. as early as possible, and keep a counter of the number of events discarded due to invalid callchains, warning the user about it if it happens. There is an assumption that was kept that all events have the same sample_type, that will be dealt with in the future, when this preexisting limitation will be removed. Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
baa2f6ce |
|
26-Nov-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Add option to disable collecting build-ids Collecting build-ids for long running sessions may take a long time because it needs to traverse the whole just collected perf.data stream of events, marking the DSOs that had hits and then looking for the .note.gnu.build-id ELF section. For things like the 'trace' tool that records and right away consumes the data on systems where its unlikely that the DSOs being monitored will change while 'trace' runs, it is desirable to remove build id collection, so add a -B/--no-buildid option to perf record to allow such use case. Longer term we'll avoid all this if we, at DSO load time, in the kernel, take advantage of this slow code path to collect the build-id and stash it somewhere, so that we can insert it in the PERF_RECORD_MMAP event. Reported-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c1a3a4b9 |
|
22-Nov-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Handle restrictive permissions in /proc/{kallsyms,modules} The 59365d1 commit, even being reverted by 33e0d57, showed a non robust behavior in 'perf record': it really should just warn the user that some functionality will not be available. The new behavior then becomes: [acme@felicio linux]$ ls -la /proc/{kallsyms,modules} -r-------- 1 root root 0 Nov 22 12:19 /proc/kallsyms -r-------- 1 root root 0 Nov 22 12:19 /proc/modules [acme@felicio linux]$ perf record ls -R > /dev/null Couldn't record kernel reference relocation symbol Symbol resolution may be skewed if relocation was used (e.g. kexec). Check /proc/kallsyms permission or run as root. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.004 MB perf.data (~161 samples) ] [acme@felicio linux]$ perf report --stdio [kernel.kallsyms] with build id 77b05e00e64e4de1c9347d83879779b540d69f00 not found, continuing without symbols # Events: 98 cycles # # Overhead Command Shared Object Symbol # ........ ....... ............... .................... # 48.26% ls [kernel] [k] ffffffff8102b92b 22.49% ls libc-2.12.90.so [.] __strlen_sse2 8.35% ls libc-2.12.90.so [.] __GI___strcoll_l 8.17% ls ls [.] 11580 3.35% ls libc-2.12.90.so [.] _IO_new_file_xsputn 3.33% ls libc-2.12.90.so [.] _int_malloc 1.88% ls libc-2.12.90.so [.] _int_free 0.84% ls libc-2.12.90.so [.] malloc_consolidate 0.84% ls libc-2.12.90.so [.] __readdir64 0.83% ls ls [.] strlen@plt 0.83% ls libc-2.12.90.so [.] __GI_fwrite_unlocked 0.83% ls libc-2.12.90.so [.] __memcpy_sse2 # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@felicio linux]$ It still has the build-ids for DSOs in the maps with hits: [acme@felicio linux]$ perf buildid-list 77b05e00e64e4de1c9347d83879779b540d69f00 [kernel.kallsyms] 09c4a431a4a8b648fcfc2c2bdda70f56050ddff1 /bin/ls af75ea9ad951d25e0f038901a11b3846dccb29a4 /lib64/libc-2.12.90.so [acme@felicio linux]$ That can be used in another machine to resolve kernel symbols. Cc: Eugene Teo <eugeneteo@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Marcus Meissner <meissner@suse.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d9cf837e |
|
19-Nov-2010 |
Corey Ashford <cjashfor@linux.vnet.ibm.com> |
perf stat: Change and clean up sys_perf_event_open error handling This patch makes several changes to "perf stat": - "perf stat" will no longer go ahead and run the application when one or more of the specified events could not be opened. - Use error() and die() instead of pr_err() so that the output is more consistent with "perf top" and "perf record". - Handle permission errors in a more robust way, and in a similar way to "perf record" and "perf top". In addition, the sys_perf_event_open() error handling of "perf top" and "perf record" is made more consistent and adds the following phrase when an event doesn't open (with something ther than an access or permission error): "/bin/dmesg may provide additional information." This is added because kernel code doesn't have a good way of expressing detailed errors to user space, so its only avenue is to use printk's. However, many users may not think of looking at dmesg to find out why an event is being rejected. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <fweisbec@gmail.com> Cc: Ian Munsie <ianmunsi@au1.ibm.com> Cc: Michael Ellerman <michaele@au1.ibm.com> LKML-Reference: <1290217044-26293-1-git-send-email-cjashfor@linux.vnet.ibm.com> Signed-off-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bca647aa |
|
10-Nov-2010 |
Tom Zanussi <tom.zanussi@linux.intel.com> |
perf record: make the record options available outside perf record Other perf commands that invoke perf record, such as perf trace, may want to reuse the options used by perf record. This makes them non-static and renames them to avoid clashes with other 'options' variables. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
|
#
b44308f5 |
|
26-Oct-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf scripting: Shut up 'perf record' final status We want just the script output, not internal details about the record phase. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0ab7368f |
|
28-Aug-2010 |
Matt Fleming <matt@console-pimps.org> |
perf record: Remove newline character from perror() argument If we include a newline character in the string argument to perror() then the output will be split across two lines like so, Unable to read perf file descriptor : No space left on device Deleting the newline character prints a much more readable error, Unable to read perf file descriptor: No space left on device Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <89e77b54659bc3798b23a5596c2debb7f6f4cf27.1283010281.git.matt@console-pimps.org> Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@felicio.ghostprotocols.net>
|
#
d65a458b |
|
30-Jul-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Release session and symbol resources on exit So that we reduce the noise when looking for leaks using tools such as valgrind. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
39d17dac |
|
29-Jul-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Release resources at exit So that we can reduce the noise on valgrind when looking for memory leaks. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
830f4c80 |
|
24-Jun-2010 |
Gui Jianfeng <guijianfeng@cn.fujitsu.com> |
perf kvm: Get rid of unused guest_kallsyms guest_kallsyms is redundant here, remove it. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> LKML-Reference: <4C241140.9090008@cn.fujitsu.com> Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5ffc8881 |
|
09-Jun-2010 |
Ian Munsie <imunsie@au1.ibm.com> |
perf record: prevent kill(0, SIGTERM); At exit, perf record will kill the process it was profiling by sending a SIGTERM to child_pid (if it had been initialised), but in certain situations child_pid may be 0 and perf would mistakenly kill more processes than intended. child_pid is set to the return of fork() to either 0 or the pid of the child. Ordinarily this would not present an issue as the child calls execvp to spawn the process to be profiled and would therefore never run it's sig_atexit and never attempt to kill pid 0. However, if a nonexistant binary had been passed in to perf record the call to execvp would fail and child_pid would be left set to 0. The child would then exit and it's atexit handler, finding that child_pid was initialised to 0, would call kill(0, SIGTERM), resulting in every process within it's process group being killed. In the case that perf was being run directly from the shell this typically would not be an issue as the shell isolates the process. However, if perf was being called from another program it could kill unexpected processes, which may even include X. This patch changes the logic of the test for whether child_pid was initialised to only consider positive pids as valid, thereby never attempting to kill pid 0. Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1276072680-17378-1-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a1ac1d3c |
|
17-Jun-2010 |
Stephane Eranian <eranian@google.com> |
perf record: Add option to avoid updating buildid cache There are situations where there is enough information in the perf.data to process the samples. Updating the buildid cache may add unecessary overhead in terms of disk space and time (copying large elf images). A persistent option to do this already exists via the perfconfig file, simply do: [buildid] dir = /dev/null This patch provides a way to suppress builid cache updates on a per-run basis. It addds a new option, -N, to perf record. Buildids are still generated in the perf.data file. Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4c19ef89.93ecd80a.40dc.fffff8e9@mx.google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cf103a14 |
|
16-Jun-2010 |
Stephane Eranian <eranian@google.com> |
perf record: Avoid synthesizing mmap() for all processes in per-thread mode A bug was introduced by commit c45c6ea2e5c57960dc67e00294c2b78e9540c007. Perf record was scanning /proc/PID to create synthetic PERF_RECOR_MMAP entries even though it was running in per-thread mode. There was a bogus check to select what mmaps to synthesize. We only need all processes in system-wide mode. Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <4c192107.4f1ee30a.4316.fffff98e@mx.google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3af9e859 |
|
18-May-2010 |
Eric B Munson <ebmunson@us.ibm.com> |
perf: Add non-exec mmap() tracking Add the capacility to track data mmap()s. This can be used together with PERF_SAMPLE_ADDR for data profiling. Signed-off-by: Anton Blanchard <anton@samba.org> [Updated code for stable perf ABI] Signed-off-by: Eric B Munson <ebmunson@us.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1274193049-25997-1-git-send-email-ebmunson@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f60f3593 |
|
04-Jun-2010 |
Arun Sharma <aruns@google.com> |
perf report: Implement --sort cpu In a shared multi-core environment, users want to analyze why their program was slow. In particular, if the code ran slower only on certain CPUs due to interference from other programs or kernel threads, the user should be able to notice that. Sample usage: perf record -f -a -- sleep 3 perf report --sort cpu,comm Workload: program is running on 16 CPUs Experiencing interference from an antagonist only on 4 CPUs. Samples: 106218177676 cycles Overhead CPU Command ........ ... ............... 6.25% 2 program 6.24% 6 program 6.24% 11 program 6.24% 5 program 6.24% 9 program 6.24% 10 program 6.23% 15 program 6.23% 7 program 6.23% 3 program 6.23% 14 program 6.22% 1 program 6.20% 13 program 3.17% 12 program 3.15% 8 program 3.14% 0 program 3.13% 4 program 3.11% 4 antagonist 3.11% 0 antagonist 3.10% 8 antagonist 3.07% 12 antagonist Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <20100505181612.GA5091@sharma-home.net> Signed-off-by: Arun Sharma <aruns@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c45c6ea2 |
|
27-May-2010 |
Stephane Eranian <eranian@google.com> |
perf tools: Add the ability to specify list of cpus to monitor This patch adds a -C option to stat, record, top to designate a list of CPUs to monitor. CPUs can be specified as a comma-separated list or ranges, no space allowed. Examples: $ perf record -a -C0-1,4-7 sleep 1 $ perf top -C0-4 $ perf stat -a -C1,2,3,4 sleep 1 With perf record in per-thread mode with inherit mode on, samples are collected only when the thread runs on the designated CPUs. The -C option does not turn on system-wide mode automatically. Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <4bff9496.d345d80a.41fe.7b00@mx.google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2fb750e8 |
|
31-May-2010 |
Borislav Petkov <bp@alien8.de> |
perf-record: Check correct pid when forking When forking the child to be traced, we should check the correct return value from fork() and not a local variable which is otherwise unused. Signed-off-by: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <20100531211818.GA30175@liondog.tnic> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
#
0e2e63dd |
|
20-May-2010 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf-record: Share per-cpu buffers It seems a waste of space to create a buffer per event, share it per-cpu. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.634824884@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
57adc51d |
|
20-May-2010 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf-record: Remove -M Since it is not allowed to create cross-cpu (or cross-task) buffers, this option is no longer valid. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.582740993@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
ef365cef |
|
18-May-2010 |
Russ Anderson <rja@sgi.com> |
perf record: remove unneeded gettimeofday() call Perf record repeatedly calls gettimeofday() which adds noise to the performance measurements. Since gettimeofday() is only used for the error printf, delete it. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <20100518225240.GC25589@sgi.com> Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a41794cd |
|
18-May-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Remove some unused functions Without the bloated cplus_demangle from binutils, i.e building with: $ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install Before: text data bss dec hex filename 471851 29280 4025056 4526187 45106b /home/acme/bin/perf After: [acme@doppio linux-2.6-tip]$ size ~/bin/perf text data bss dec hex filename 446886 29232 4008576 4484694 446e56 /home/acme/bin/perf So its a 5.3% size reduction in code, but the interesting part is in the git diff --stat output: 19 files changed, 20 insertions(+), 1909 deletions(-) If we ever need some of the things we got from git but weren't using, we just have to go to the git repo and get fresh, uptodate source code bits. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1967936d |
|
17-May-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf options: Check v type in OPT_U?INTEGER To avoid problems like the one fixed by Stephane Eranian in 3de29ca, now we'll got this instead: bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’ bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’ Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel hackers should be already used to this. With it in place found some problems, fixed by changing the affected variables to sensible types or changed some OPT_INTEGER to OPT_UINTEGER. Next csets will go thru converting each of the remaining OPT_ so that review can be made easier by grouping changes per type per patch. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3de29cab |
|
16-May-2010 |
Stephane Eranian <eranian@google.com> |
perf record: Fix bug mismatch with -c option definition The -c option defines the user requested sampling period. It was implemented using an unsigned int variable but the type of the option was OPT_LONG. Thus, the option parser was overwriting memory belonging to other variables, namely the mmap_pages leading to a zero page sampling buffer. The bug was exposed only when compiling at -O0, probably because the compiler was padding variables at higher optimization levels. This patch fixes this problem by declaring user_interval as u64. This also avoids wrap-around issues for large period on 32-bit systems. Commiter note: Made it use OPT_U64(user_interval) after implementing OPT_U64 in the previous patch. Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4bf11ae9.e88cd80a.06b0.ffffa8e3@mx.google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2e6cdf99 |
|
12-May-2010 |
Stephane Eranian <eranian@google.com> |
perf tools: change event inheritance logic in stat and record By default, event inheritance across fork and pthread_create was on but the -i option of stat and record, which enabled inheritance, led to believe it was off by default. This patch fixes this logic by inverting the meaning of the -i option. By default inheritance is on whether you attach to a process (-p), a thread (-t) or start a process. If you pass -i, then you turn off inheritance. Turning off inheritance if you don't need it, helps limit perf resource usage as well. The patch also fixes perf stat -t xxxx and perf record -t xxxx which did not start the counters. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4bea9d2f.d60ce30a.0b5b.08e1@mx.google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
46db2c32 |
|
30-Mar-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Add a fallback to the reference relocation symbol Usually "_text" is enough, but I received reports that its not always available, so fallback to "_stext" for the symbol we use to check if we need to apply any relocation to all the symbols in the kernel symtab, for when, for instance, kexec is being used. Reported-by: Darren Hart <dvhltc@us.ibm.com> Reported-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
98402807 |
|
02-May-2010 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf: Introduce a new "round of buffers read" pseudo event In order to provide a more rubust and deterministic reordering algorithm, we need to know when we reach a point where we just did a pass through over every counter buffers to read every thing they had. This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event that only consist in an event header and doesn't need to contain anything. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com>
|
#
db620b1c |
|
04-May-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf/record: simplify TRACE_INFO tracepoint check Fix a couple of inefficiencies and redundancies related to have_tracepoints() and its use when checking whether to write TRACE_INFO. First, there's no need to use get_tracepoints_path() in have_tracepoints() - we really just want the part that checks whether any attributes correspondo to tracepoints. Second, we really don't care about raw_samples per se - tracepoints are always raw_samples. In any case, the have_tracepoints() check should be sufficient to decide whether or not to write TRACE_INFO. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu>, Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1273030770.6383.6.camel@tropicana> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
63e0c771 |
|
02-May-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf: record TRACE_INFO only if using tracepoints and SAMPLE_RAW The current perf code implicitly assumes SAMPLE_RAW means tracepoints are being used, but doesn't check for that. It happily records the TRACE_INFO even if SAMPLE_RAW is used without tracepoints, but when the perf data is read it won't go any further when it finds TRACE_INFO but no tracepoints, and displays misleading errors. This adds a check for both in perf-record, and won't record TRACE_INFO unless both are true. This at least allows perf report -D to dump raw events, and avoids triggering a misleading error condition in perf trace. It doesn't actually enable the non-tracepoint raw events to be displayed in perf trace, since perf trace currently only deals with tracepoint events. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1272865861.7932.16.camel@tropicana> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2c9faa06 |
|
02-May-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Don't exit in live mode when no tracepoints are enabled With this I was able to actually test Tom Zanussi's two previous patches in my usual perf testing ways, i.e. without any tracepoints activated. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
454c407e |
|
01-May-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf: add perf-inject builtin Currently, perf 'live mode' writes build-ids at the end of the session, which isn't actually useful for processing live mode events. What would be better would be to have the build-ids sent before any of the samples that reference them, which can be done by processing the event stream and retrieving the build-ids on the first hit. Doing that in perf-record itself, however, is off-limits. This patch introduces perf-inject, which does the same job while leaving perf-record untouched. Normal mode perf still records the build-ids at the end of the session as it should, but for live mode, perf-inject can be injected in between the record and report steps e.g.: perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i - perf-inject reads a perf-record event stream and repipes it to stdout. At any point the processing code can inject other events into the event stream - in this case build-ids (-b option) are read and injected as needed into the event stream. Build-ids are just the first user of perf-inject - potentially anything that needs userspace processing to augment the trace stream with additional information could make use of this facility. Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frédéric Weisbecker <fweisbec@gmail.com> LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
789688fa |
|
01-May-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf/live: don't synthesize build ids at the end of a live mode trace It doesn't really make sense to record the build ids at the end of a live mode session - live mode samples need that information during the trace rather than at the end. Leave event__synthesize_build_id() in place, however; we'll still be using that to synthesize build ids in a more timely fashion in a future patch. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1272696080-16435-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
23346f21 |
|
27-Apr-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Rename "kernel_info" to "machine" struct kernel_info and kerninfo__ are too vague, what they really describe are machines, virtual ones or hosts. There are more changes to introduce helpers to shorten function calls and to make more clear what is really being done, but I left that for subsequent patches. Cc: Avi Kivity <avi@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a1645ce1 |
|
18-Apr-2010 |
Zhang, Yanmin <yanmin_zhang@linux.intel.com> |
perf: 'perf kvm' tool for monitoring guest performance from host Here is the patch of userspace perf tool. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
|
#
f9212819 |
|
14-Apr-2010 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf: Make the trace events sample period default to 1 Trace events are mostly used for tracing and then require not to be lost when possible. As opposite to hardware events that really require to trigger after a given sample period, trace events mostly need to trigger everytime. It is a frustrating experience to trace with perf and realize we lost a lot of events because we forgot the "-c 1" option. Then default sample_period to 1 for trace events but let the user override it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>
|
#
7865e817 |
|
14-Apr-2010 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf: Make -f the default for perf record Force the overwriting mode by default if append mode is not explicit. Adding -f every time one uses perf on a daily basis quickly becomes a burden. Keep the -f among the options though to avoid breaking some random users scripts. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>
|
#
c7929e47 |
|
01-Apr-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf: Convert perf header build_ids into build_id events Bypasses the build_id perf header code and replaces it with a synthesized event and processing function that accomplishes the same thing, used when reading/writing perf data to/from a pipe. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: k-keiichi@bx.jp.nec.com Cc: acme@ghostprotocols.net LKML-Reference: <1270184365-8281-9-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
9215545e |
|
01-Apr-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf: Convert perf tracing data into a tracing_data event Bypasses the tracing_data perf header code and replaces it with a synthesized event and processing function that accomplishes the same thing, used when reading/writing perf data to/from a pipe. The tracing data is pretty large, and this patch doesn't attempt to break it down into component events. The tracing_data event itself doesn't actually contain the tracing data, rather it arranges for the event processing code to skip over it after it's read, using the skip return value added to the event processing loop in a previous patch. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: k-keiichi@bx.jp.nec.com Cc: acme@ghostprotocols.net LKML-Reference: <1270184365-8281-8-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
cd19a035 |
|
01-Apr-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf: Convert perf event types into event type events Bypasses the event type perf header code and replaces it with a synthesized event and processing function that accomplishes the same thing, used when reading/writing perf data to/from a pipe. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: k-keiichi@bx.jp.nec.com Cc: acme@ghostprotocols.net LKML-Reference: <1270184365-8281-7-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
2c46dbb5 |
|
01-Apr-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf: Convert perf header attrs into attr events Bypasses the attr perf header code and replaces it with a synthesized event and processing function that accomplishes the same thing, used when reading/writing perf data to/from a pipe. Making the attrs into events allows them to be streamed over a pipe along with the rest of the header data (in later patches). It also paves the way to allowing events to be added and removed from perf sessions dynamically. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: k-keiichi@bx.jp.nec.com Cc: acme@ghostprotocols.net LKML-Reference: <1270184365-8281-6-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
529870e3 |
|
01-Apr-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf record: Introduce special handling for pipe output Adds special treatment for stdout - if the user specifies '-o -' to perf record, the intent is that the event stream be written to stdout rather than to a disk file. Also, redirect stdout of forked child to stderr - in pipe mode, stdout of the forked child interferes with the stdout perf stream, so redirect it to stderr where it can still be seen but won't be mixed in with the perf output. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: k-keiichi@bx.jp.nec.com Cc: acme@ghostprotocols.net LKML-Reference: <1270184365-8281-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
8dc58101 |
|
01-Apr-2010 |
Tom Zanussi <tzanussi@gmail.com> |
perf: Add pipe-specific header read/write and event processing code This patch makes several changes to allow the perf event stream to be sent and received over a pipe: - adds pipe-specific versions of the header read/write code - adds pipe-specific version of the event processing code - adds a range of event types to be used for header or other pseudo events, above the range used by the kernel - checks the return value of event handlers, which they can use to skip over large events during event processing rather than actually reading them into event objects. - unifies the multiple do_read() functions and updates its users. Note that none of these changes affect the existing perf data file format or processing - this code only comes into play if perf output is sent to stdout (or is read from stdin). Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: k-keiichi@bx.jp.nec.com Cc: acme@ghostprotocols.net LKML-Reference: <1270184365-8281-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
c0555642 |
|
13-Apr-2010 |
Ian Munsie <imunsie@au.ibm.com> |
perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR() Parsing an option from the command line with OPT_BOOLEAN on a bool data type would not work on a big-endian machine due to the manner in which the boolean was being cast into an int and incremented. For example, running 'perf probe --list' on a PowerPC machine would fail to properly set the list_events bool and would therefore print out the usage information and terminate. This patch makes OPT_BOOLEAN work as expected with a bool datatype. For cases where the original OPT_BOOLEAN was intentionally being used to increment an int each time it was passed in on the command line, this patch introduces OPT_INCR with the old behaviour of OPT_BOOLEAN (the verbose variable is currently the only such example of this). I have reviewed every use of OPT_BOOLEAN to verify that a true C99 bool was passed. Where integers were used, I verified that they were only being used for boolean logic and changed them to bools to ensure that they would not be mistakenly used as ints. The major exception was the verbose variable which now uses OPT_INCR instead of OPT_BOOLEAN. Signed-off-by: Ian Munsie <imunsie@au.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: <stable@kernel.org> # NOTE: wont apply to .3[34].x cleanly, please backport Cc: Git development list <git@vger.kernel.org> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Valdis.Kletnieks@vt.edu Cc: WANG Cong <amwang@redhat.com> Cc: Thiago Farina <tfransosi@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Mike Galbraith <efault@gmx.de> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: John Kacur <jkacur@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1271147857-11604-1-git-send-email-imunsie@au.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
e206d556 |
|
03-Apr-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move the prototypes in util/string.h to util.h So that we avoid conflict with libc's string.h header. Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
70162138 |
|
30-Mar-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Add a fallback to the reference relocation symbol Usually "_text" is enough, but I received reports that its not always available, so fallback to "_stext" for the symbol we use to check if we need to apply any relocation to all the symbols in the kernel symtab, for when, for instance, kexec is being used. Reported-by: Darren Hart <dvhltc@us.ibm.com> Cc: Darren Hart <dvhltc@us.ibm.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5a103174 |
|
25-Mar-2010 |
Zhang, Yanmin <yanmin_zhang@linux.intel.com> |
perf record: Zero out mmap_array to fix segfault Reported-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1269557941-15617-6-git-send-email-acme@infradead.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
d6d901c2 |
|
18-Mar-2010 |
Zhang, Yanmin <yanmin_zhang@linux.intel.com> |
perf events: Change perf parameter --pid to process-wide collection instead of thread-wide Parameter --pid (or -p) of perf currently means a thread-wide collection. For exmaple, if a process whose id is 8888 has 10 threads, 'perf top -p 8888' just collects the main thread statistics. That's misleading. Users are used to attach a whole process when debugging a process by gdb. To follow normal usage style, the patch change --pid to process-wide collection and add --tid (-t) to mean a thread-wide collection. Usage example is: # perf top -p 8888 # perf record -p 8888 -f sleep 10 # perf stat -p 8888 -f sleep 10 Above commands collect the statistics of all threads of process 8888. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sheng Yang <sheng@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jes Sorensen <Jes.Sorensen@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: zhiteng.huang@intel.com Cc: Zachary Amsden <zamsden@redhat.com> LKML-Reference: <1268922965-14774-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
46be604b |
|
18-Mar-2010 |
Zhang, Yanmin <yanmin_zhang@linux.intel.com> |
perf record: Enable counters only when kernel is execing subcommand 'perf record' starts counters before subcommand is execed, so the statistics is not precise because it includes data of some preparation steps. I fix it with the patch. In addition, change the condition to fork/exec subcommand. If there is a subcommand parameter, perf always fork/exec it. The usage example is: # perf record -f -a sleep 10 So this command could collect statistics for 10 seconds precisely. User still could stop it by CTRL+C. Without the new capability, user could only input CTRL+C to stop it without precise time clock. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sheng Yang <sheng@linux.intel.com> Cc: oerg Roedel <joro@8bytes.org> Cc: Jes Sorensen <Jes.Sorensen@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: <zhiteng.huang@intel.com> Cc: Zachary Amsden <zamsden@redhat.com> LKML-Reference: <1268922965-14774-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
bedbfdea |
|
15-Mar-2010 |
Eric B Munson <ebmunson@us.ibm.com> |
perf record: Enable the enable_on_exec flag if record forks the target When forking its target, perf record can capture data from before the target application is started. Perf stat uses the enable_on_exec flag in the event attributes to keep from displaying events from before the target program starts, this patch adds the same functionality to perf record when it is will fork the target process. Signed-off-by: Eric B Munson <ebmunson@us.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1268664418-28328-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
6230f2c7 |
|
11-Mar-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Mention paranoid sysctl when failing to create counter [acme@mica linux-2.6-tip]$ perf record -a -f Fatal: Permission error - are you root? Consider tweaking /proc/sys/kernel/perf_event_paranoid. [acme@mica linux-2.6-tip]$ Suggested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1268333592-30872-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
9f591fd7 |
|
11-Mar-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Don't try to find buildids in a zero sized file Fixing this symptom: [acme@mica linux-2.6-tip]$ perf record -a -f Fatal: Permission error - are you root? Bus error [acme@mica linux-2.6-tip]$ I.e. if for some reason no data is collected, in this case a non root user trying to do systemwide profiling, no data will be collected, and then we end up trying to mmap a zero sized file and access the file header, b00m. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: <stable@kernel.org> LKML-Reference: <1268333592-30872-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
a12b51c4 |
|
10-Mar-2010 |
Paul Mackerras <paulus@samba.org> |
perf tools: Fix sparse CPU numbering related bugs At present, the perf subcommands that do system-wide monitoring (perf stat, perf record and perf top) don't work properly unless the online cpus are numbered 0, 1, ..., N-1. These tools ask for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN) and then try to create events for cpus 0, 1, ..., N-1. This creates problems for systems where the online cpus are numbered sparsely. For example, a POWER6 system in single-threaded mode (i.e. only running 1 hardware thread per core) will have only even-numbered cpus online. This fixes the problem by reading the /sys/devices/system/cpu/online file to find out which cpus are online. The code that does that is in tools/perf/util/cpumap.[ch], and consists of a read_cpu_map() function that sets up a cpumap[] array and returns the number of online cpus. If /sys/devices/system/cpu/online can't be read or can't be parsed successfully, it falls back to using sysconf to ask how many cpus are online and sets up an identity map in cpumap[]. The perf record, perf stat and perf top code then calls read_cpu_map() in the system-wide monitoring case (instead of sysconf) and uses cpumap[] to get the cpu numbers to pass to perf_event_open. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
8907fd60 |
|
04-Mar-2010 |
Eric B Munson <ebmunson@us.ibm.com> |
perf record: Add ID and to recorded event data when recording multiple events Currently perf record does not write the ID or the to disk for events. This doesn't allow report to tell if an event stream contains one or more types of events. This patch adds this entry to the list of data that record will write to disk if more than one event was requested. Signed-off-by: Eric B Munson <ebmunson@us.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1267804269-22660-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f7e7ee36 |
|
05-Feb-2010 |
austin_zhang@linux.intel.com <austin_zhang@linux.intel.com> |
perf record: Fix existing process callgraph symbol When 'perf record -g' a existing process, even with debuginfo packages, still cannnot get symbol from 'perf report'. try: perf record -g -p `pidof xxx` -f perf report 68.26% :1181 b74870f2 [.] 0x000000b74870f2 | |--32.09%-- 0xb73b5b44 | 0xb7487102 | 0xb748a4e2 | 0xb748633d | 0xb73b41cd | 0xb73b4467 | 0xb747d531 The reason is: for existing process, in __cmd_record(), the pid is 0 rather than the existing process id. Signed-off-by: Austin Zhang <austin_zhang@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4710.10.255.24.35.1265389362.squirrel@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f887f301 |
|
04-Feb-2010 |
Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> |
perf tools: Clean up O_LARGEFILE et al usage Setting _FILE_OFFSET_BITS and using O_LARGEFILE, lseek64, etc, is redundant. Thanks H. Peter Anvin for pointing it out. So, this patch removes O_LARGEFILE, lseek64, etc. Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4B6A8972.3070605@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
6122e4e4 |
|
03-Feb-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Stop intercepting events, use postprocessing to get build-ids We want to stream events as fast as possible to perf.data, and also in the future we want to have splice working, when no interception will be possible. Using build_id__mark_dso_hit_ops to create the list of DSOs that back MMAPs we also optimize disk usage in the build-id cache by only caching DSOs that had hits. Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1265223128-11786-6-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
b8f46c5a |
|
02-Feb-2010 |
Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> |
perf tools: Use O_LARGEFILE to open perf data file Open perf data file with O_LARGEFILE flag since its size is easily larger that 2G. For example: # rm -rf perf.data # ./perf kmem record sleep 300 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 3142.147 MB perf.data (~137282513 samples) ] # ll -h perf.data -rw------- 1 root root 3.1G ..... Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4B68F32A.9040203@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
a8e6f734 |
|
30-Jan-2010 |
Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> |
Revert "perf record: Intercept all events" This reverts commit f5a2c3dce03621b55f84496f58adc2d1a87ca16f. This patch is required for making "perf lock rec" work. The commit f5a2c3dce0 changes write_event() of builtin-record.c . And changed write_event() sometimes doesn't stop with perf lock rec. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> [ that commit also causes perf record to not be Ctrl-C-able, and it's concetually wrong to parse the data at record time (unconditionally - even when not needed), as we eventually want to be able to do zero-copy recording, at least for non-archive recordings. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
64abebf7 |
|
27-Jan-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf session: Create kernel maps in the constructor Removing one extra step needed in the tools that need this, fixing a bug in 'perf probe' where this was not being done. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1264633557-17597-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f5a2c3dc |
|
15-Jan-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Intercept all events The event interception we need to do in 'perf record' to create a list of all DSOs in PERF_RECORD_MMAP events wasn't seeing all events, make sure that happens by checking size agains event_t->header.size. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1263586107-1756-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
18c3daa4 |
|
14-Jan-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Encode the domain while synthesizing MMAP events In the past 'perf record' had to process only userspace MMAP events, the ones generated in the kernel, but after we reused the MMAP events to encode the module mapings we ended up adding them first to the list of userspace DSOs (dsos__user) and to the kernel one (dsos__kernel). Fix this by encoding the header.misc field and then using it, like other parts to decide the right DSOs list to insert/find. The gotcha here is that since the kernel puts zero in .misc, which isn't PERF_RECORD_MISC_KERNEL (1 << 1), to differentiate, we put 1 in .misc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1263519930-22803-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
b7cece76 |
|
13-Jan-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Encode kernel module mappings in perf.data We were always looking at the running machine /proc/modules, even when processing a perf.data file, which only makes sense when we're doing 'perf record' and 'perf report' on the same machine, and in close sucession, or if we don't use modules at all, right Peter? ;-) Now, at 'perf record' time we read /proc/modules, find the long path for modules, and put them as PERF_MMAP events, just like we did to encode the reloc reference symbol for vmlinux. Talking about that now it is encoded in .pgoff, so that we can use .{start,len} to store the address boundaries for the kernel so that when we reconstruct the kmaps tree we can do lookups right away, without having to fixup the end of the kernel maps like we did in the past (and now only in perf record). One more step in the 'perf archive' direction when we'll finally be able to collect data in one machine and analyse in another. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1263396139-4798-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
56b03f3c |
|
05-Jan-2010 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Handle relocatable kernels DSOs don't have this problem because the kernel emits a PERF_MMAP for each new executable mapping it performs on monitored threads. To fix the kernel case we simulate the same behaviour, by having 'perf record' to synthesize a PERF_MMAP for the kernel, encoded like this: [root@doppio ~]# perf record -a -f sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.344 MB perf.data (~15038 samples) ] [root@doppio ~]# perf report -D | head -10 0xd0 [0x40]: event: 1 . . ... raw event: size 64 bytes . 0000: 01 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 ......@........ . 0010: 00 00 00 81 ff ff ff ff 00 00 00 00 00 00 00 00 ............... . 0020: 00 00 00 00 00 00 00 00 5b 6b 65 72 6e 65 6c 2e ........ [kernel . 0030: 6b 61 6c 6c 73 79 6d 73 2e 5f 74 65 78 74 5d 00 kallsyms._text] . 0xd0 [0x40]: PERF_RECORD_MMAP 0/0: [0xffffffff81000000((nil)) @ (nil)]: [kernel.kallsyms._text] I.e. we identify such event as having: .pid = 0 .filename = [kernel.kallsyms.REFNAME] .start = REFNAME addr in /proc/kallsyms at 'perf record' time and use now a hardcoded value of '.text' for REFNAME. Then, later, in 'perf report', if there are any kernel hits and thus we need to resolve kernel symbols, we search for REFNAME and if its address changed, relocation happened and we thus must change the kernel mapping routines to one that uses .pgoff as the relocation to apply. This way we use the same mechanism used for the other DSOs and don't have to do a two pass in all the kernel symbols. Reported-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> LKML-Reference: <1262717431-1246-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
d4db3f16 |
|
27-Dec-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: We should fork only if a program was specified to run IOW: Now 'perf record -a' works, this was a bug introduced in: 856e96608a72412d319e498a3a7c557571f811bd "perf record: Properly synchronize child creation" Also fix the -C usage, i.e. allow for profiling all the tasks in one CPU. Reported-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1261957026-15580-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
60ab2716 |
|
16-Dec-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf record: Use per-task-per-cpu events for inherited events Create events with a pid and cpu contraint for inherited events so that we get a stream per cpu, instead of all cpus contending on a single stream. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: fweisbec@gmail.com Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091216165904.987643843@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
856e9660 |
|
16-Dec-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf record: Properly synchronize child creation Remove that ugly usleep and provide proper serialization between parent and child just like perf-stat does. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: fweisbec@gmail.com Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091216165904.908184135@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
655000e7 |
|
15-Dec-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf symbols: Adopt the strlists for dso, comm Will be used in perf diff too. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260914682-29652-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
75be6cf4 |
|
15-Dec-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf symbols: Make symbol_conf global This simplifies a lot of functions, less stuff to be done by tool writers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260914682-29652-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
b38d3464 |
|
14-Dec-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Rename perf.data to perf.data.old if --force/-f is used Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260828571-3613-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
4aa65636 |
|
13-Dec-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf session: Move kmaps to perf_session There is still some more work to do to disentangle map creation from DSO loading, but this happens only for the kernel, and for the early adopters of perf diff, where this disentanglement matters most, we'll be testing different kernels, so no problem here. Further clarification: right now we create the kernel maps for the various modules and discontiguous kernel text maps when loading the DSO, we should do it as a two step process, first creating the maps, for multiple mappings with the same DSO store, then doing the dso load just once, for the first hit on one of the maps sharing this DSO backing store. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260741029-4430-6-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
d8f66248 |
|
13-Dec-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf session: Pass the perf_session to the event handling operations They will need it to get the right threads list, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260741029-4430-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
94c744b6 |
|
11-Dec-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Introduce perf_session class That does all the initialization boilerplate, opening the file, reading the header, checking if it is valid, etc. And that will as well have the threads list, kmap (now) global variable, etc, so that we can handle two (or more) perf.data files describing sessions to compare. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260573842-19720-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
bfd45118 |
|
15-Nov-2009 |
Simon Kaempflein <s.kaempflein@gmx.de> |
perf record, x86: Print more intelligent error message when sampling fails Print more accurate error message when "perf record" fails because there is no APIC support, on x86. Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
d5eed904 |
|
19-Nov-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Eliminate some more die() uses in library functions This time in perf_header__adds_write, propagating the do_write error returns. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1258649757-17554-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
4dc0a04b |
|
19-Nov-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: perf_header__read() shouldn't die() And also don't call the constructor in it, this way it adheres to the model the other methods follow. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1258649757-17554-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
a9a70bbc |
|
16-Nov-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Don't die() in perf_header__new() Propagate the errors instead, the users are the ones to decide what to do if a library call fails. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1258427892-16312-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
58754121 |
|
16-Nov-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Don't die() in perf_header_attr__add_id() Propagate the errors instead, the users are the ones to decide what to do if a library call fails. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1258427892-16312-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
11deb1f9 |
|
16-Nov-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Don't die() in perf_header__add_attr() Propagate the errors instead, the users are the ones to decide what to do if a library call fails. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1258427892-16312-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
dc79c0fc |
|
16-Nov-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Don't die in perf_header_attr__new() We really should propagate such kinds of errors so that users of these library functions decide what to do in such cases instead of exiting in random places like now. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1258407027-384-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
3e13ab2d |
|
10-Nov-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Use perf_header__set/has_feat whenever possible And drop the alternate checks/sets using set_bit or other kind of helpers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> LKML-Reference: <1257911467-28276-5-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
8671dab9 |
|
10-Nov-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Move the build-id storage operations to headers So that it makes easier to control it. Especially because we plan to give it a feature section. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> LKML-Reference: <1257911467-28276-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
de896721 |
|
10-Nov-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Synthetize the targeted process Don't forget to also synthetize the targeted process from perf record or we'll miss its dso in the events and then we won't be able to deal with its build-id. We are missing it because it is created after the existing synthetized tasks but before the counters are enabled and can send its mapping event. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> LKML-Reference: <1257911467-28276-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
c10edee2 |
|
08-Nov-2009 |
Pekka Enberg <penberg@cs.helsinki.fi> |
perf tools: Fix permission checks The perf_event_open() system call returns EACCES if the user is not root which results in a very confusing error message: $ perf record -A -a -f Error: perfcounter syscall returned with -1 (Permission denied) Fatal: No CONFIG_PERF_EVENTS=y kernel support configured? It turns out that's because perf tools are checking only for EPERM. Fix that up to get a much better error message: $ perf record -A -a -f Fatal: Permission error - are you root? Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1257696066-4046-1-git-send-email-penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
8d06367f |
|
04-Nov-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf symbols: Use the buildids if present With this change 'perf record' will intercept PERF_RECORD_MMAP calls, creating a linked list of DSOs, then when the session finishes, it will traverse this list and read the buildids, stashing them at the end of the file and will set up a new feature bit in the header bitmask. 'perf report' will then notice this feature and populate the 'dsos' list and set the build ids. When reading the symtabs it will refuse to load from a file that doesn't have the same build id. This improves the reliability of the profiler output, as symbols and profiling data is more guaranteed to match. Example: [root@doppio ~]# perf report | head /home/acme/bin/perf with build id b1ea544ac3746e7538972548a09aadecc5753868 not found, continuing without symbols # Samples: 2621434559 # # Overhead Command Shared Object Symbol # ........ ............... ............................. ...... # 7.91% init [kernel] [k] read_hpet 7.64% init [kernel] [k] mwait_idle_with_hints 7.60% swapper [kernel] [k] read_hpet 7.60% swapper [kernel] [k] mwait_idle_with_hints 3.65% init [kernel] [k] 0xffffffffa02339d9 [root@doppio ~]# In this case the 'perf' binary was an older one, vanished, so its symbols probably wouldn't match or would cause subtly different (and misleading) output. Next patches will support the kernel as well, reading the build id notes for it and the modules from /sys. Another patch should also introduce a new plumbing command: 'perf list-buildids' that will then be used in porcelain that is distro specific to fetch -debuginfo packages where such buildids are present. This will in turn allow for one to run 'perf record' in one machine and 'perf report' in another. Future work on having the buildid sent directly from the kernel in the PERF_RECORD_MMAP event is needed to close races, as the DSO can be changed during a 'perf record' session, but this patch at least helps with non-corner cases and current/older kernels. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roland McGrath <roland@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1257367843-26224-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
234fbbf5 |
|
26-Oct-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Generalize event synthesizing routines Because we will need it in 'perf top' to support userspace symbols for existing threads. Now we pass a callback that will receive the synthesized event and then write it to the output file in 'perf record' and in the upcoming patch for 'perf top' we will just immediatelly create the in memory representation of threads and maps. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1256592199-9608-2-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
7f3bedcc |
|
26-Oct-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Fix race where process can disappear while reading its /proc/pid/tasks Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1256592199-9608-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
6beba7ad |
|
21-Oct-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Unify debug messages mechanisms We were using eprintf in some places, that looks at a global 'verbose' level, and at other places passing a 'v' parameter to specify the verbosity level, unify it by introducing pr_{err,warning,debug,etc}, just like in the kernel. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
2ba08250 |
|
17-Oct-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Introduce bitmask'ed additional headers This provides a new set of bitmasked headers. A new field is added in the perf headers that implements a bitmap storing optional features present in the perf.data file. The layout can be pictured like this: (Usual perf headers)(Features bitmap)[Feature 0][Feature n][Feature 255] If the bit n is set, then the feature n is used in this file. They are all set in order. This brings a backward and forward compatibility. The trace_info section has moved into such optional features, this is the first and only one for now. This is backward compatible with the .32 file version although it doesn't support the previous separate trace.info file. And finally it doesn't support the current interim development version. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1255792354-11304-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
5a116dd2 |
|
17-Oct-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Use kernel bitmap library Use the kernel bitmap library for internal perf tools uses. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
c171b552 |
|
14-Oct-2009 |
Li Zefan <lizf@cn.fujitsu.com> |
perf trace: Add filter Suppport Add a new option "--filter <filter_str>" to perf record, and it should be right after "-e trace_point": #./perf record -R -f -e irq:irq_handler_entry --filter irq==18 ^C # ./perf trace perf-4303 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... irq_handler_entry: irq=18 handler=eth0 See Documentation/trace/events.txt for the syntax of filter expressions. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <4AD6955F.90602@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
7e4ff9e3 |
|
11-Oct-2009 |
Mike Galbraith <efault@gmx.de> |
perf tools: Fix counter sample frequency breakage Commit 42e59d7d19dc4b4 switched to a default sample frequency of 1KHz, which overrides any user supplied count, causing sched, top and timechart to miss events due to their discrete events being flagged PERF_SAMPLE_PERIOD. Override default sample frequency when the user profides a period count, and make both record and top honor that user supplied option. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <1255326963.15107.2.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
03456a15 |
|
06-Oct-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Merge trace.info content into perf.data This drops the trace.info file and move its contents into the common perf.data file. This is done by creating a new trace_info section into this file. A user of perf headers needs to call perf_header__set_trace_info() to save the trace meta informations into the perf.data file. A file created by perf after his patch is unsupported by previous version because the size of the headers have increased. That said, it's two new fields that have been added in the end of the headers, and those could be ignored by previous versions if they just handled the dynamic header size and then ignore the unknow part. The offsets guarantee the compatibility. We'll do a -stable fix for that. But current previous versions handle the header size using its static size, not dynamic, then it's not backward compatible with trace records. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091006213643.GA5343@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
42e59d7d |
|
06-Oct-2009 |
Ingo Molnar <mingo@elte.hu> |
perf tools: Default to 1 KHz auto-sampling freq events Use auto-freq events by default in perf record and perf top. This allows more consistent hardware event sampling, regardless of the intensity of the underlying event. It also keeps us from over-sampling on larger/busier systems. (also make surrounding initializations more consistent) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
933da83a |
|
03-Oct-2009 |
Chris Wilson <chris@chris-wilson.co.uk> |
perf: Propagate term signal to child If we launch the child on behalf of the user, ensure that it dies along with ourselves when we are interrupted. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> LKML-Reference: <1254616502-4728-1-git-send-email-chris@chris-wilson.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
cdd6c482 |
|
20-Sep-2009 |
Ingo Molnar <mingo@elte.hu> |
perf: Do the big rename: Performance Counters -> Performance Events Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
8b412664 |
|
17-Sep-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf record: Disable profiling before draining the buffer I noticed that perf-record continues profiling itself after the child terminated and we're draining the buffer. This can cause a _lot_ of overhead with --all recording - we keep and keep recording, which produces new and new events. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
ea57c4f5 |
|
13-Sep-2009 |
Ingo Molnar <mingo@elte.hu> |
perf tools: Implement counter output multiplexing Finish the -M/--multiplex option implementation: - separate it out from group_fd - correctly set it via the ioctl and dont mmap counters that are multiplexed - modify the perf record event loop to deal with buffer-less counters. - remove the -g option from perf sched record - account for unordered events in perf sched latency - (add -f to perf sched record to ease measurements) - skip idle threads (pid==0) in latency output The result is better latency output by 'perf sched latency': ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- ksoftirqd/8 | 0.071 ms | 2 | avg: 0.458 ms | max: 0.913 ms | at-spi-registry | 0.609 ms | 19 | avg: 0.013 ms | max: 0.023 ms | perf | 3.316 ms | 16 | avg: 0.013 ms | max: 0.054 ms | Xorg | 0.392 ms | 19 | avg: 0.011 ms | max: 0.018 ms | sleep | 0.537 ms | 2 | avg: 0.009 ms | max: 0.009 ms | ----------------------------------------------------------------------------------- TOTAL: | 4.925 ms | 58 | --------------------------------------------- Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
d1302522 |
|
14-Sep-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Add an option to multiplex counters in a single channel Add an option to multiplex counters output in the channel of the group leader, ie: the first counter opened: -M --multiplex The effect is better serialized samples. This is especially useful for tracepoint samples that need to be well serialized for their post-processing. Also make use of this option in 'perf sched'. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
6ddf259d |
|
02-Sep-2009 |
Ingo Molnar <mingo@elte.hu> |
perf trace: Sample timestamps as well Before: perf-21082 [013] 0.000000: sched_wakeup_new: task perf:21083 [120] success=1 [015] perf-21082 [013] 0.000000: sched_migrate_task: task perf:21082 [120] from: 13 to: 15 perf-21082 [013] 0.000000: sched_process_fork: parent perf:21082 child perf:21083 true-21083 [015] 0.000000: sched_wakeup: task migration/15:33 [0] success=1 [015] perf-21082 [013] 0.000000: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140] true-21083 [015] 0.000000: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0] true-21083 [011] 0.000000: sched_process_exit: task true:21083 [120] After: perf-21082 [013] 14674.797613: sched_wakeup_new: task perf:21083 [120] success=1 [015] perf-21082 [013] 14674.797506: sched_migrate_task: task perf:21082 [120] from: 13 to: 15 perf-21082 [013] 14674.797610: sched_process_fork: parent perf:21082 child perf:21083 true-21083 [015] 14674.797725: sched_wakeup: task migration/15:33 [0] success=1 [015] perf-21082 [013] 14674.797722: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140] true-21083 [015] 14674.797729: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0] true-21083 [011] 14674.798159: sched_process_exit: task true:21083 [120] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
cd6feeea |
|
02-Sep-2009 |
Ingo Molnar <mingo@elte.hu> |
perf trace: Sample the CPU too Sample, record, parse and print the CPU field - it had all zeroes before. Before (watch the second column, the CPU values): perf-32685 [000] 0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011] perf-32685 [000] 0.000000: sched_migrate_task: task perf:32685 [120] from: 1 to: 11 perf-32685 [000] 0.000000: sched_process_fork: parent perf:32685 child perf:32686 true-32686 [000] 0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011] true-32686 [000] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] true-32686 [000] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] perf-32685 [000] 0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140] true-32686 [000] 0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0] true-32686 [000] 0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125] true-32686 [000] 0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125] true-32686 [000] 0.000000: sched_process_exit: task true:32686 [120] true-32686 [000] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns] true-32686 [000] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns] true-32686 [000] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns] true-32686 [000] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns] After: perf-32685 [001] 0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011] perf-32685 [001] 0.000000: sched_migrate_task: task perf:32685 [120] from: 1 to: 11 perf-32685 [001] 0.000000: sched_process_fork: parent perf:32685 child perf:32686 true-32686 [011] 0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011] true-32686 [015] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] true-32686 [015] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] perf-32685 [001] 0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140] true-32686 [011] 0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0] true-32686 [015] 0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125] true-32686 [015] 0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125] true-32686 [015] 0.000000: sched_process_exit: task true:32686 [120] true-32686 [015] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns] true-32686 [015] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns] true-32686 [015] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns] true-32686 [015] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns] So we can now see how this workload migrated between CPUs. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
1ef2ed10 |
|
27-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Only save the event formats we need While opening a trace event counter, every events are saved in the trace.info file. But we only want to save the specifications of the events we are using. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1251421798-9101-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
9df37ddd |
|
17-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Record events info also when :record suffix is used. You can enable a counter's PERF_SAMPLE_RAW attribute in two fashions: - using the -R option (every counters get PERF_SAMPLE_RAW) - using the :record suffix in a trace event counter name Currently we record the events info in a trace.info file from perf record when the former method is used but we omit it with the latter. Check both situations. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1250543271-8383-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
5f9c39dc |
|
17-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Add perf trace This adds perf trace into the set of perf tools. It is written to fetch the tracepoint samples from perf events and display them, according to the events information given by the debugfs files through the util/trace* tools. It is a rough first shot and doesn't yet handle the cpu, timestamps fields and some other things. Example: perf record -f -e workqueue:workqueue_execution:record -F 1 -a perf trace kblockd/0-236 [000] 0.000000: workqueue_execution: thread=:236 func=cfq_kick_queue+0x0 kondemand/0-360 [000] 0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0 kondemand/0-360 [000] 0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 kondemand/1-361 [000] 0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0 Todo: - A lot of things! Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org> Cc: Clark Williams <williams@redhat.com> Cc: Jon Masters <jonathan@jonmasters.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Christoph Hellwig <hch@infradead.org> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Zhaolei <zhaolei@cn.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Anton Blanchard <anton@samba.org> LKML-Reference: <1250518688-7207-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
8f28827a |
|
16-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Librarize trace_event() helper Librarize trace_event() helper so that perf trace can use it too. Also clean up the debug.h includes a bit. It's not good to have it included in perf.h because it doesn't make it flexible against other headers it may need (headers that can also depend on perf.h and then create a recursive header dependency). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250453149-664-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
39e6dd73 |
|
14-Aug-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Fix typo in pid_synthesize_comm_event We were using 'fd' locally, but there was a global 'fd' too, so when converting from open to fopen the test made against fd should be made against 'fp', but since we have that global it didnt get discovered ... Reported-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090814182632.GF3490@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
daac07b2 |
|
13-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Add a general option to enable raw sample records While we can enable the perf sample records per tracepoint counter, we may also want to enable this option for every tracepoint counters to open, so that we don't need to add a :record flag for all of them. Add the -R, --raw-samples options for this purpose. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250152039-7284-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
3a9f131f |
|
13-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Add a per tracepoint counter attribute to get raw sample Add a new flag field while opening a tracepoint perf counter: -e tracepoint_subsystem:tracepoint_name:flags This is intended to be generic although for now it only supports the r[e[c[o[r[d]]]]] flag: ./perf record -e workqueue:workqueue_insertion:record ./perf record -e workqueue:workqueue_insertion:r will have the same effect: enabling the raw samples record for the given tracepoint counter. In the future, we may want to support further flags, separated by commas. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250152039-7284-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
0a5ac846 |
|
12-Aug-2009 |
Jens Axboe <jens.axboe@oracle.com> |
perf record: Add missing -C option support for specifying profile cpu perf top supports a -C for setting the profile CPU, but perf record does not. This adds the same option for perf record, allowing the user to specify a specific target profile CPU. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090812091801.GC12579@kernel.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
2a8083f0 |
|
11-Aug-2009 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf record: Fix .tid and .pid fill-in when synthesizing events Noticed when trying to record events for a firefox thread. We were synthesizing both .tid and .pid with the pid passed via --pid. Fix it by reading /proc/PID/status and getting the tgid to use in .pid, .tid gets the specified "pid". Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20090811192200.GF18061@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
66e274f3 |
|
12-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Factorize the map helpers Factorize the dso mapping helpers into a single purpose common file "util/map.c" Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Brice Goglin <Brice.Goglin@inria.fr>
|
#
1fe2c106 |
|
12-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Factorize the event structure definitions in a single file Factorize the multiple definition of the events structures into a single util/event.h file. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Brice Goglin <Brice.Goglin@inria.fr>
|
#
cd84c2ac |
|
12-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf tools: Factorize high level dso helpers Factorize multiple definitions of high level dso helpers into the symbol source file. The side effect is a general export of the verbose and eprintf debugging helpers into a new file dedicated to debugging purposes. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Brice Goglin <Brice.Goglin@inria.fr>
|
#
266e0e21 |
|
07-Aug-2009 |
Pierre Habouzit <pierre.habouzit@intersec.com> |
perf record: Fix the -A UI for empty or non-existent perf.data 1. Ignore the -A argument if there is no perf.data file 2. Treat an empty file like a non existent file. Else, perf will try to read the perf.data header, and fail with an error. Treating an empty file like a non-existent file makes sense, since an interupted (as in SIGKILLed) perf could leave such files around, and you don't want to annoy the user with errors for files with no data in it. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f413cdb8 |
|
06-Aug-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf_counter: Fix/complete ftrace event records sampling This patch implements the kernel side support for ftrace event record sampling. A new counter sampling attribute is added: PERF_SAMPLE_TP_RECORD which requests ftrace events record sampling. In this case if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint fires, we emit the tracepoint binary record to the perfcounter event buffer, as a sample. Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf record: perf record -f -F 1 -a -e workqueue:workqueue_execution perf report -D 0x21e18 [0x48]: event: 9 . . ... raw event: size 72 bytes . 0000: 09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff ......H........ . 0010: 0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00 ........!...... . 0020: 2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e +...........eve . 0030: 74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00 ts/1........... . 0040: e0 b1 31 81 ff ff ff ff ....... . 0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33 The raw ftrace binary record starts at offset 0020. Translation: struct trace_entry { type = 0x2b = 43; flags = 1; preempt_count = 2; pid = 0xa = 10; tgid = 0xa = 10; } thread_comm = "events/1" thread_pid = 0xa = 10; func = 0xffffffff8131b1e0 = flush_to_ldisc() What will come next? - Userspace support ('perf trace'), 'flight data recorder' mode for perf trace, etc. - The unconditional copy from the profiling callback brings some costs however if someone wants no such sampling to occur, and needs to be fixed in the future. For that we need to have an instant access to the perf counter attribute. This is a matter of a flag to add in the struct ftrace_event. - Take care of the events recursivity! Don't ever try to record a lock event for example, it seems some locking is used in the profiling fast path and lead to a tracing recursivity. That will be fixed using raw spinlock or recursivity protection. - [...] - Profit! :-) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
a0541234 |
|
22-Jul-2009 |
Anton Blanchard <anton@samba.org> |
perf_counter: Improve perf stat and perf record option parsing perf stat and perf record currently look for all options on the command line. This can lead to some confusion: # perf stat ls -l Error: unknown switch `l' While we can work around this by adding '--' before the command, the git option parsing code can stop at the first non option: # perf stat ls -l Performance counter stats for 'ls -l': .... Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090722130412.GD9029@kryten>
|
#
4bba828d |
|
16-Jul-2009 |
Anton Blanchard <anton@samba.org> |
perf_counter: Add perf record option to log addresses Add the -d or --data option to log event addresses (eg page faults). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.697698033@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
11b5f81e |
|
16-Jul-2009 |
Anton Blanchard <anton@samba.org> |
perf_counter: Synthesize VDSO mmap event perf record synthesizes mmap events for the running process. Right now it just catches file mappings, but we can check for the vdso symbol and add that too. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.517264409@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f37a291c |
|
30-Jun-2009 |
Ingo Molnar <mingo@elte.hu> |
perf_counter tools: Add more warnings and fix/annotate them Enable -Wextra. This found a few real bugs plus a number of signed/unsigned type mismatches/uncleanlinesses. It also required a few annotations All things considered it was still worth it so lets try with this enabled for now. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
3928ddbe |
|
25-Jun-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perf record: Fix unhandled io return value Building latest perfcounter fails on the following error: builtin-record.c: In function ‘create_counter’: builtin-record.c:451: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result make: *** [builtin-record.o] Erreur 1 Just check if we successfully read the perf file descriptor. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1245961287-5327-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
649c48a9 |
|
24-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf-report: Add modes for inherited stats and no-samples Now that we can collect per task statistics, add modes that make use of that facility. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
7c6a1c65 |
|
25-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf_counter tools: Rework the file format Create a structured file format that includes the full perf_counter_attr and all its relevant counter IDs so that the reporting program has full information. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
76c64c5e |
|
24-Jun-2009 |
Johannes Weiner <hannes@cmpxchg.org> |
perf record: Fix filemap pathname parsing in /proc/pid/maps Looking backward for the first space from the end of a line in /proc/pid/maps does not find the start of the pathname of the mapped file if it contains a space. Since the only slashes we have in this file occur in the (absolute!) pathname column of file mappings, looking for the first slash in a line is a safe method to find the name. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Stefani Seibold <stefani@seibold.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090624190835.GA25548@cmpxchg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
eadc84cc |
|
19-Jun-2009 |
Frederic Weisbecker <fweisbec@gmail.com> |
perfcounter: Handle some IO return values Building perfcounter tools raises the following warnings: builtin-record.c: In function ‘atexit_header’: builtin-record.c:464: erreur: ignoring return value of ‘pwrite’, declared with attribute warn_unused_result builtin-record.c: In function ‘__cmd_record’: builtin-record.c:503: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result builtin-report.c: In function ‘__cmd_report’: builtin-report.c:1403: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result This patch handles these IO return values. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1245456100-5477-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
9cffa8d5 |
|
19-Jun-2009 |
Paul Mackerras <paulus@samba.org> |
perf_counter tools: Define and use our own u64, s64 etc. definitions On 64-bit powerpc, __u64 is defined to be unsigned long rather than unsigned long long. This causes compiler warnings every time we print a __u64 value with %Lx. Rather than changing __u64, we define our own u64 to be unsigned long long on all architectures, and similarly s64 as signed long long. For consistency we also define u32, s32, u16, s16, u8 and s8. These definitions are put in a new header, types.h, because these definitions are needed in util/string.h and util/symbol.h. The main change here is the mechanical change of __[us]{64,32,16,8} to remove the "__". The other changes are: * Create types.h * Include types.h in perf.h, util/string.h and util/symbol.h * Add types.h to the LIB_H definition in Makefile * Added (u64) casts in process_overflow_event() and print_sym_table() to kill two remaining warnings. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: benh@kernel.crashing.org LKML-Reference: <19003.33494.495844.956580@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f5970550 |
|
18-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf_counter tools: Add a data file header Add a data file header so we can transfer data between record and report. LKML-Reference: <new-submission> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
9d91a6f7 |
|
18-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf_counter tools: Handle lost events Make use of the new ->data_tail mechanism to tell kernel-space about user-space draining the data stream. Emit lost events (and display them) if they happen. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
613d8602 |
|
15-Jun-2009 |
Ingo Molnar <mingo@elte.hu> |
perf record: Fix fast task-exit race Recording with -a (or with -p) can race with tasks going away: couldn't open /proc/8440/maps Causing an early exit() and no recording done. Do not abort the recording session - instead just skip that task. Also, only print the warnings under -v. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
3efa1cc9 |
|
14-Jun-2009 |
Ingo Molnar <mingo@elte.hu> |
perf record/report: Add call graph / call chain profiling Add the first steps of call-graph profiling: - add the -c (--call-graph) option to perf record - parse the call-graph record and printout out under -D (--dump-trace) The call-graph data is not put into the histogram yet, but it can be seen that it's being processed correctly: 0x3ce0 [0x38]: event: 35 . . ... raw event: size 56 bytes . 0000: 23 00 00 00 05 00 38 00 d4 df 0e 81 ff ff ff ff #.....8........ . 0010: 60 0b 00 00 60 0b 00 00 03 00 00 00 01 00 02 00 `...`.......... . 0020: d4 df 0e 81 ff ff ff ff a0 61 ed 41 36 00 00 00 .........a.A6.. . 0030: 04 92 e6 41 36 00 00 00 .a.A6.. . 0x3ce0 [0x38]: PERF_EVENT (IP, 5): 2912: 0xffffffff810edfd4 period: 1 ... chain: u:2, k:1, nr:3 ..... 0: 0xffffffff810edfd4 ..... 1: 0x3641ed61a0 ..... 2: 0x3641e69204 ... thread: perf:2912 ...... dso: [kernel] This shows a 3-entry call-graph: with 1 kernel-space and two user-space entries Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
bbd36e5e |
|
11-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf record: Explicity program a default counter Up until now record has worked on the assumption that type=0, config=0 was a suitable configuration - which it is. Lets make this a little more explicit and more readable via the use of proper symbols. [ Impact: cleanup ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f4dbfa8f |
|
11-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf_counter: Standardize event names Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
729ff5e2 |
|
11-Jun-2009 |
Ingo Molnar <mingo@elte.hu> |
perf_counter tools: Clean up u64 usage A build error slipped in: builtin-report.c: In function ‘hist_entry__fprintf’: builtin-report.c:711: error: format ‘%12d’ expects type ‘int’, but argument 3 has type ‘uint64_t’ Because we got a bit sloppy with those types. uint64_t really sucks, because there's no printf format for it. So standardize on __u64 instead - for all types that go to or come from the ABI (which is __u64), or for values that need to be large enough even on 32-bit. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
ea1900e5 |
|
10-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf_counter tools: Normalize data using per sample period data When we use variable period sampling, add the period to the sample data and use that to normalize the samples. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
f7b7c26e |
|
10-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf_counter tools: Propagate signals properly Currently report and stat catch SIGINT (and others) without altering their exit state. This means that things like: while :; do perf stat ./foo ; done Loops become hard-to-interrupt, because bash never sees perf terminate due to interruption. Fix this. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
4502d77c |
|
10-Jun-2009 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
perf_counter tools: Small frequency related fixes Create the counter in a disabled state and only enable it after we mmap() the buffer, this allows us to see the first few samples (and observe the frequency ramp). Furthermore, print the period in the verbose report. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
30c806a0 |
|
07-Jun-2009 |
Ingo Molnar <mingo@elte.hu> |
perf_counter tools: Handle kernels with !CONFIG_PERF_COUNTER If perf is run on a !CONFIG_PERF_COUNTER kernel right now it bails out with no messages or with confusing messages. Standardize this case some more and explain the situation. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
3da297a6 |
|
07-Jun-2009 |
Ingo Molnar <mingo@elte.hu> |
perf record: Fall back to cpu-clock-ticks if no PMU On architectures/CPUs without PMU support but with perfcounters enabled 'perf record' currently fails because it cannot create a cycle based hw-perfcounter. Fall back to the cpu-clock-tick sw-perfcounter in this case, which is hrtimer based and will always work (as long as perfcounters are enabled). Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
#
86470930 |
|
06-Jun-2009 |
Ingo Molnar <mingo@elte.hu> |
perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/ Several people have suggested that 'perf' has become a full-fledged tool that should be moved out of Documentation/. Move it to the (new) tools/ directory. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|