#
93bb5b0d |
|
29-Feb-2024 |
Ian Rogers <irogers@google.com> |
perf threads: Move threads to its own files Move threads out of machine and into its own file. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240301053646.1449657-6-irogers@google.com
|
#
8f0ec15f |
|
17-Feb-2024 |
Changbin Du <changbin.du@huawei.com> |
perf: util: use capstone disasm engine to show assembly instructions Currently, the instructions of samples are shown as raw hex strings which are hard to read. x86 has a special option '--xed' to disassemble the hex string via intel XED tool. Here we use capstone as our disassembler engine to give more friendly instructions. We select libcapstone because capstone can provide more insn details. Perf will fallback to raw instructions if libcapstone is not available. The advantages compared to XED tool: * Support arm, arm64, x86-32, x86_64 (more could be supported), xed only for x86_64. * Immediate address operands are shown as symbol+offs. Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: changbin.du@gmail.com Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240217074046.4100789-3-changbin.du@huawei.com
|
#
b9c87f53 |
|
12-Dec-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf annotate-data: Add find_data_type() to get type from memory access The find_data_type() is to get a data type from the memory access at the given address (IP) using a register and an offset. It requires DWARF debug info in the DSO and searches the list of variables and function parameters in the scope. In a pseudo code, it does basically the following: find_data_type(dso, ip, reg, offset) { pc = map__rip_2objdump(ip); CU = dwarf_addrdie(dso->dwarf, pc); scopes = die_get_scopes(CU, pc); for_each_scope(S, scopes) { V = die_find_variable_by_reg(S, pc, reg); if (V && V.type == pointer_type) { T = die_get_real_type(V); if (offset < T.size) return T; } } return NULL; } Committer notes: The 'size' variable in check_variable() is 64-bit, so use PRIu64 and inttypes.h to debug it. Ditto at find_data_type_die(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231213001323.718046-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6f1b6291 |
|
09-Nov-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Add util/debuginfo.[ch] files Split debuginfo data structure and related functions into a separate file so that it can be used by other components than the probe-finder. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a29ee6ae |
|
21-Nov-2023 |
Oliver Upton <oliver.upton@linux.dev> |
perf build: Ensure sysreg-defs Makefile respects output dir Currently the sysreg-defs are written out to the source tree unconditionally, ignoring the specified output directory. Correct the build rule to emit the header to the output directory. Opportunistically reorganize the rules to avoid interleaving with the set of beauty make rules. Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20231121192956.919380-3-oliver.upton@linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
#
56e144fe |
|
24-Oct-2023 |
Ian Rogers <irogers@google.com> |
perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit Fix leak where mem_info__put wouldn't release the maps/map as used by perf mem. Add exit functions and use elsewhere that the maps and map are released. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
#
8c984209 |
|
12-Aug-2023 |
Yang Jihong <yangjihong1@huawei.com> |
perf kwork top: Implements BPF-based cpu usage statistics Use BPF to collect statistics on the CPU usage based on perf BPF skeletons. Example usage: # perf kwork top -h Usage: perf kwork top [<options>] -b, --use-bpf Use BPF to measure task cpu usage -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): rate, runtime, tid --time <str> Time span for analysis (start,stop) # # perf kwork -k sched top -b Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 160702.425 ms, 8 cpus %Cpu(s): 36.00% id, 0.00% hi, 0.00% si %Cpu0 [|||||||||||||||||| 61.66%] %Cpu1 [|||||||||||||||||| 61.27%] %Cpu2 [||||||||||||||||||| 66.40%] %Cpu3 [|||||||||||||||||| 61.28%] %Cpu4 [|||||||||||||||||| 61.82%] %Cpu5 [||||||||||||||||||||||| 77.41%] %Cpu6 [|||||||||||||||||| 61.73%] %Cpu7 [|||||||||||||||||| 63.25%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 38.72 8089.463 ms [swapper/1] 0 0 38.71 8084.547 ms [swapper/3] 0 0 38.33 8007.532 ms [swapper/0] 0 0 38.26 7992.985 ms [swapper/6] 0 0 38.17 7971.865 ms [swapper/4] 0 0 36.74 7447.765 ms [swapper/7] 0 0 33.59 6486.942 ms [swapper/2] 0 0 22.58 3771.268 ms [swapper/5] 9545 9351 2.48 447.136 ms sched-messaging 9574 9351 2.09 418.583 ms sched-messaging 9724 9351 2.05 372.407 ms sched-messaging 9531 9351 2.01 368.804 ms sched-messaging 9512 9351 2.00 362.250 ms sched-messaging 9514 9351 1.95 357.767 ms sched-messaging 9538 9351 1.86 384.476 ms sched-messaging 9712 9351 1.84 386.490 ms sched-messaging 9723 9351 1.83 380.021 ms sched-messaging 9722 9351 1.82 382.738 ms sched-messaging 9517 9351 1.81 354.794 ms sched-messaging 9559 9351 1.79 344.305 ms sched-messaging 9725 9351 1.77 365.315 ms sched-messaging <SNIP> # perf kwork -k sched top -b -n perf Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 151563.332 ms, 8 cpus %Cpu(s): 26.49% id, 0.00% hi, 0.00% si %Cpu0 [ 0.01%] %Cpu1 [ 0.00%] %Cpu2 [ 0.00%] %Cpu3 [ 0.00%] %Cpu4 [ 0.00%] %Cpu5 [ 0.00%] %Cpu6 [ 0.00%] %Cpu7 [ 0.00%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 9754 9754 0.01 2.303 ms perf # # perf kwork -k sched top -b -C 2,3,4 Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 48016.721 ms, 3 cpus %Cpu(s): 27.82% id, 0.00% hi, 0.00% si %Cpu2 [|||||||||||||||||||||| 74.68%] %Cpu3 [||||||||||||||||||||| 71.06%] %Cpu4 [||||||||||||||||||||| 70.91%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 29.08 4734.998 ms [swapper/4] 0 0 28.93 4710.029 ms [swapper/3] 0 0 25.31 3912.363 ms [swapper/2] 10248 10158 1.62 264.931 ms sched-messaging 10253 10158 1.62 265.136 ms sched-messaging 10158 10158 1.60 263.013 ms bash 10360 10158 1.49 243.639 ms sched-messaging 10413 10158 1.48 238.604 ms sched-messaging 10531 10158 1.47 234.067 ms sched-messaging 10400 10158 1.47 240.631 ms sched-messaging 10355 10158 1.47 230.586 ms sched-messaging 10377 10158 1.43 234.835 ms sched-messaging 10526 10158 1.42 232.045 ms sched-messaging 10298 10158 1.41 222.396 ms sched-messaging 10410 10158 1.38 221.853 ms sched-messaging 10364 10158 1.38 226.042 ms sched-messaging 10480 10158 1.36 213.633 ms sched-messaging 10370 10158 1.36 223.620 ms sched-messaging 10553 10158 1.34 217.169 ms sched-messaging 10291 10158 1.34 211.516 ms sched-messaging 10251 10158 1.34 218.813 ms sched-messaging 10522 10158 1.33 218.498 ms sched-messaging 10288 10158 1.33 216.787 ms sched-messaging <SNIP> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/r/20230812084917.169338-15-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e2bdd172 |
|
11-Oct-2023 |
Oliver Upton <oliver.upton@linux.dev> |
perf build: Generate arm64's sysreg-defs.h and add to include path Start generating sysreg-defs.h in anticipation of updating sysreg.h to a version that needs the generated output. Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231011195740.3349631-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
#
5000e7f6 |
|
05-Jun-2023 |
Leo Yan <leo.yan@linaro.org> |
perf parse-regs: Refactor arch register parsing functions Every architecture has a specific register parsing function for returning register name based on register index, to support cross analysis (e.g. we use perf x86 binary to parse Arm64's perf data), we build all these register parsing functions into the tool, this is why we place all related functions into util/perf_regs.c. Unfortunately, since util/perf_regs.c needs to include every arch's perf_regs.h, this easily introduces duplicated definitions coming from multiple headers, finally it's fragile for building and difficult for maintenance. We cannot simply move these register parsing functions into the corresponding 'arch' folder, the folder is only conditionally built based on the target architecture. Therefore, this commit creates a new folder util/perf-regs-arch/ and uses a dedicated source file to keep every architecture's register parsing function to avoid definition conflicts. This is only a refactoring, no functionality change is expected. Committer notes: Had to add util/perf-regs-arch/*.c to tools/perf/util/python-ext-sources to keep 'perf test python' passing. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eric Lin <eric.lin@sifive.com> Cc: Fangrui Song <maskray@google.com> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20230606014559.21783-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3d6dfae8 |
|
11-Aug-2023 |
Ian Rogers <irogers@google.com> |
perf parse-events: Remove BPF event support New features like the BPF --filter support in perf record have made the BPF event functionality somewhat redundant. As shown by commit fcb027c1a4f6 ("perf tools: Revert enable indices setting syntax for BPF map") and commit 14e4b9f4289a ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility") the BPF event support hasn't been well maintained and it adds considerable complexity in areas like event parsing, not least as '/' is a separator for event modifiers as well as in paths. This patch removes support in the event parser for BPF events and then the associated functions are removed. This leads to the removal of whole source files like bpf-loader.c. Removing support means that augmented syscalls in perf trace is broken, this will be fixed in a later commit adding support using BPF skeletons. The removal of BPF events causes an unused label warning from flex generated code, so update build to ignore it: ``` util/parse-events-flex.c:2704:1: error: label ‘find_rule’ defined but not used [-Werror=unused-label] 2704 | find_rule: /* we branch to this label when backing up */ ``` Committer notes: Extracted from a larger patch that was also removing the support for linking with libllvm and libclang, that were an alternative to using an external clang execution to compile the .c event source code into BPF bytecode. Testing it: # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Bad event or PMU Unabled to find PMU or event on a PMU of 'home' Initial error: event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Cannot find PMU `home'. Missing kernel support? Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events # Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: He Kuang <hekuang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Wang ShaoBo <bobo.shaobowang@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yonghong Song <yhs@fb.com> Cc: YueHaibing <yuehaibing@huawei.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230810184853.2860737-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
56b11a21 |
|
10-Aug-2023 |
Ian Rogers <irogers@google.com> |
perf bpf: Remove support for embedding clang for compiling BPF events (-e foo.c) This never was in the default build for perf, is difficult to maintain as it uses clang/llvm internals so ditch it, keeping, for now, the external compilation of .c BPF into .o bytecode and its subsequent loading, that is also going to be removed, do it separately to help bisection and to properly document what is being removed and why. Committer notes: Extracted from a larger patch and removed some leftovers, namely deleting these now unused feature tests: tools/build/feature/test-clang.cpp tools/build/feature/test-cxx.cpp tools/build/feature/test-llvm-version.cpp tools/build/feature/test-llvm.cpp Testing the use of BPF events after applying this patch: To use the external clang/llvm toolchain to compile a .c event and then use libbpf to load it, to get the syscalls:sys_enter_open* tracepoints and read the filename pointer, putting it into the ring buffer right after the usual tracepoint payload for 'perf trace' to then print it: [root@quaco ~]# perf trace -e /home/acme/git/perf-tools-next/tools/perf/examples/bpf/augmented_raw_syscalls.c,open* --max-events=10 0.000 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 0.083 abrt-dump-jour/1453 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.063 abrt-dump-jour/1454 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.082 abrt-dump-jour/1455 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 250.124 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 250.521 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.pressure", flags: RDONLY|CLOEXEC) = 12 251.047 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.current", flags: RDONLY|CLOEXEC) = 12 251.162 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.min", flags: RDONLY|CLOEXEC) = 12 251.242 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.low", flags: RDONLY|CLOEXEC) = 12 251.353 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.swap.current", flags: RDONLY|CLOEXEC) = 12 [root@quaco ~]# Same thing, but with a prebuilt .o BPF bytecode: [root@quaco ~]# perf trace -e /home/acme/git/perf-tools-next/tools/perf/examples/bpf/augmented_raw_syscalls.o,open* --max-events=10 0.000 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 0.083 abrt-dump-jour/1453 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.083 abrt-dump-jour/1455 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.062 abrt-dump-jour/1454 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 249.985 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 466.763 thermald/1234 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/energy_uj") = 13 467.145 thermald/1234 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj") = 13 467.311 thermald/1234 openat(dfd: CWD, filename: "/sys/class/thermal/thermal_zone2/temp") = 13 500.040 cgroupify/24006 openat(dfd: 4, filename: ".", flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 5 500.295 cgroupify/24006 openat(dfd: 4, filename: "24616/cgroup.procs") = 5 [root@quaco ~]# Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: He Kuang <hekuang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Wang ShaoBo <bobo.shaobowang@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yonghong Song <yhs@fb.com> Cc: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/lkml/ZNZWsAXg2px1sm2h@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
878460e8 |
|
10-Aug-2023 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf build: Remove -Wno-unused-but-set-variable from the flex flags when building with clang < 13.0.0 clang < 13.0.0 doesn't grok -Wno-unused-but-set-variable, so just remove it to avoid: error: unknown warning option '-Wno-unused-but-set-variable'; did you mean '-Wno-unused-const-variable'? [-Werror,-Wunknown-warning-option] make[4]: *** [/git/perf-6.5.0-rc4/tools/build/Makefile.build:128: /tmp/build/perf/util/pmu-flex.o] Error 1 make[4]: *** Waiting for unfinished jobs.... Fixes: ddc8e4c966923ad1 ("perf build: Disable fewer bison warnings") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZNUSWr52jUnVaaa%2F@kernel.org/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f776b043 |
|
28-Jul-2023 |
Ian Rogers <irogers@google.com> |
perf build: Remove -Wno-redundant-decls in 2 cases Properly fix a warning and remove the -Wno-redundant-decls C flag. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230728064917.767761-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ddc8e4c9 |
|
28-Jul-2023 |
Ian Rogers <irogers@google.com> |
perf build: Disable fewer bison warnings If bison is version 3.8.2, reduce the number of bison C warnings disabled. Earlier bison versions have all C warnings disabled. Avoid implicit declarations of yylex by adding the declaration in the C file. A header can't be included as a circular dependency would occur due to the lexer using the bison defined tokens. Committer notes: Some recent versions of gcc and clang (noticed on Alpine Linux 3.17, edge, clearlinux, fedora 37, etc. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230728064917.767761-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
10c775af |
|
28-Jul-2023 |
Ian Rogers <irogers@google.com> |
perf build: Disable fewer flex warnings If flex is version 2.6.4, reduce the number of flex C warnings disabled. Earlier flex versions have all C warnings disabled. Committer notes: Added this to the list of ignored warnings to get it building on a Fedora 36 machine with flex 2.6.4: -Wno-misleading-indentation Noticed when building with: $ make LLVM=1 -C tools/perf NO_BPF_SKEL=1 DEBUG=1 Take two: We can't just try to canonicalize flex versions by just removing the dots, as we end up with: 2.6.4 >= 2.5.37 becoming: 264 >= 2537 Failing the build on flex 2.5.37, so instead use the back to the past added $(call version_ge3,$(FLEX_VERSION),2.6.4) variant to check for that. Making sure $(FLEX_VERSION) keeps the dots as we may want to use 'sort -V' or something nicer when available everywhere. Some other tweaks for other flex versions and combinations with gcc and clang versions were added, notes on the patch. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230728064917.767761-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9462e4de |
|
27-Jun-2023 |
Ian Rogers <irogers@google.com> |
perf parse-event: Add memory allocation test for name terms If the name memory allocation fails then propagate to the parser. Committer notes: Use $(BISON_FALLBACK_FLAGS) on the bison call so that we continue building with older bison versions, before 3.81, where YYNOMEM isn't present. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230627181030.95608-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
88cc47e2 |
|
28-Jul-2023 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf build: Define YYNOMEM as YYNOABORT for bison < 3.81 YYNOMEM was introduced in bison 3.81, so define it as YYABORT for older versions, which should provide the previous perf behaviour. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0650b2b2 |
|
14-Jun-2023 |
Ian Rogers <irogers@google.com> |
perf sharded_mutex: Introduce sharded_mutex Per object mutexes may come with significant memory cost while a global mutex can suffer from unnecessary contention. A sharded mutex is a compromise where objects are hashed and then a particular mutex for the hash of the object used. Contention can be controlled by the number of shards. v2. Use hashmap.h's hash_bits in case of contention from alignment of objects. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andres Freund <andres@anarazel.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Yuan Can <yuancan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230615040715.2064350-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
#
620be847 |
|
08-Jun-2023 |
Ian Rogers <irogers@google.com> |
perf addr_location: Move to its own header addr_location is a common abstraction, move it into its own header and source file in preparation for wider clean up. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230608232823.4027869-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
597a4276 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf pmu: Remove perf_pmu__hybrid_pmus list Rather than iterate hybrid PMUs, inhererently Intel specific, iterate all PMUs checking whether they are core. To only get hybrid cores, first call perf_pmu__has_hybrid. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-25-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b167b530 |
|
27-May-2023 |
Ian Rogers <irogers@google.com> |
perf evlist: Reduce scope of evlist__has_hybrid Function is only used in printout, reduce scope to stat-display.c. Remove the now empty evlist-hybrid.c and evlist-hybrid.h. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
996e54bb |
|
02-May-2023 |
Ian Rogers <irogers@google.com> |
perf parse-events: Remove now unused hybrid logic The event parser no longer needs to recurse in case of a legacy cache event in a PMU, the necessary wild card logic has moved to perf_pmu__supports_legacy_cache and perf_pmu__supports_wildcard_numeric. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20230502223851.2234828-28-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
65cd8e55 |
|
17-Apr-2023 |
Ian Rogers <irogers@google.com> |
perf build: Don't compile demangle-cxx.cpp if not necessary demangle-cxx.cpp requires a C++ compiler, but feature checks may fail because of the absence of this. Add a CONFIG_CXX_DEMANGLE so that the source isn't built if not supported. Copy libbfd and cplus demangle variants to a weak symbol-elf.c version so they aren't dependent on C++. These variants are only built with the build option BUILD_NONDISTRO=1. Committer note: This also handles this build break when a C++ compiler isn't available: CXX /tmp/build/perf/util/demangle-cxx.o /bin/sh: g++: command not found Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi115@huawei.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20230417192546.99923-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
990a71e9 |
|
14-Mar-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf bpf filter: Introduce basic BPF filter expression This implements a tiny parser for the filter expressions used for BPF. Each expression will be converted to struct perf_bpf_filter_expr and be passed to a BPF map. For now, I'd like to start with the very basic comparisons like EQ or GT. The LHS should be a term for sample data and the RHS is a number. The expressions are connected by a comma. For example, period > 10000 ip < 0x1000000000000, cpu == 3 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
80c3a7d9 |
|
15-Mar-2023 |
Adrian Hunter <adrian.hunter@intel.com> |
perf script: Fix Python support when no libtraceevent Python scripting can be used without libtraceevent. In particular, scripting for Intel PT does not use tracepoints, and so does not need libtraceevent support. Alter the build and employ conditional compilation to allow Python scripting without libtraceevent. Example: Before: $ ldd `which perf` | grep -i python $ ldd `which perf` | grep -i libtraceevent $ perf record -e intel_pt//u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.031 MB perf.data ] $ perf script intel-pt-events.py |& head -3 Error: Couldn't find script `intel-pt-events.py' See perf script -l for available scripts. After: $ ldd `which perf` | grep -i python libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000) $ ldd `which perf` | grep -i libtraceevent $ perf script intel-pt-events.py | head Intel PT Branch Trace, Power Events, Event Trace and PTWRITE Switch In 8021/8021 [000] 11234.097713404 0/0 perf-exec 8021/8021 [000] 11234.098041726 psb offset: 0x0 0 [unknown] ([unknown]) perf-exec 8021/8021 [000] 11234.098041726 cbr 45 freq: 4505 MHz (161%) 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098082170 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098082379 branches:uH tr end 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098083629 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083629 branches:uH call 7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083837 branches:uH tr end 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) IPC: 0.01 (9/938) uname 8021/8021 [000] 11234.098084670 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230315084321.14563-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3b4e4efe |
|
10-Mar-2023 |
Ian Rogers <irogers@google.com> |
perf symbol: Add abi::__cxa_demangle C++ demangling support Refactor C++ demangling out of symbol-elf into its own files similar to other languages. Add abi::__cxa_demangle support. As the other demanglers are not shippable with distributions, this brings back C++ demangling in a common case. It isn't perfect as the support for optionally demangling arguments and modifiers isn't present. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Pavithra Gurushankar <gpavithrasha@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Roberto Sassu <roberto.sassu@huawei.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
16cad1d3 |
|
02-Feb-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf lock contention: Use lock_stat_find{,new} This is a preparation work to support complex keys of BPF maps. Now it has single value key according to the aggregation mode like stack_id or pid. But we want to use a combination of those keys. Then lock_contention_read() should still aggregate the result based on the key that was requested by user. The other key info will be used for filtering. So instead of creating a lock_stat entry always, Check if it's already there using lock_stat_find() first. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230203021324.143540-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
55c1de99 |
|
12-Dec-2022 |
James Clark <james.clark@arm.com> |
perf cs-etm: Print auxtrace info even if OpenCSD isn't linked Printing the info doesn't have any dependency on OpenCSD, and neither does recording Coresight data. Because it's sometimes useful to look at the info for debugging, it makes sense to be able to see it on the same platform that the recording was made on. So pull the auxtrace info printing parts into a new file that is always compiled into Perf. Signed-off-by: James Clark <james.clark@arm.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20221212155513.2259623-6-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
378ef0f5 |
|
05-Dec-2022 |
Ian Rogers <irogers@google.com> |
perf build: Use libtraceevent from the system Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command line variables. If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the build, don't compile in libtraceevent and libtracefs support. This also disables CONFIG_TRACE that controls "perf trace". CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles, HAVE_LIBTRACEEVENT is used in C code. Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the commands kmem, kwork, lock, sched and timechart are removed. The majority of commands continue to work including "perf test". Committer notes: Fixed up a tools/perf/util/Build reject and added: #include <traceevent/event-parse.h> to tools/perf/util/scripting-engines/trace-event-perl.c. Committer testing: $ rpm -qi libtraceevent-devel Name : libtraceevent-devel Version : 1.5.3 Release : 2.fc36 Architecture: x86_64 Install Date: Mon 25 Jul 2022 03:20:19 PM -03 Group : Unspecified Size : 27728 License : LGPLv2+ and GPLv2+ Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4 Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm Build Date : Fri 15 Apr 2022 10:57:01 AM -03 Build Host : buildvm-x86-05.iad2.fedoraproject.org Packager : Fedora Project Vendor : Fedora Project URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/ Bug URL : https://bugz.fedoraproject.org/libtraceevent Summary : Development headers of libtraceevent Description : Development headers of libtraceevent-libs $ Default build: $ ldd ~/bin/perf | grep tracee libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000) $ # perf trace -e sched:* --max-events 10 0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1) 0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1) 0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120) 1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120) 1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120) 0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2) 0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2) 0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120) 1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1) 1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120) # Had to tweak tools/perf/util/setup.py to make sure the python binding shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is present in CFLAGS. Building with NO_LIBTRACEEVENT=1 uncovered some more build failures: - Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y - perf-$(CONFIG_LIBTRACEEVENT) += scripts/ - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y - The python binding needed some fixups and util/trace-event.c can't be built and linked with the python binding shared object, so remove it in tools/perf/util/setup.py and exclude it from the list of dependencies in the python/perf.so Makefile.perf target. Building without libtraceevent-devel installed uncovered more build failures: - The python binding tools/perf/util/python.c was assuming that traceevent/parse-events.h was always available, which was the case when we defaulted to using the in-kernel tools/lib/traceevent/ files, now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like the other parts of it that deal with tracepoints. - We have to ifdef the rules in the Build files with CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't detect libtraceevent-devel installed in the system. Simplification here to avoid these two ways of disabling builtin-trace.c and not having CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean way. From Athira: <quote> tools/perf/arch/powerpc/util/Build -perf-y += kvm-stat.o +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o </quote> Then, ditto for arm64 and s390, detected by container cross build tests. - s/390 uses test__checkevent_tracepoint() that is now only available if HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT. Also from Athira: <quote> With this change, I could successfully compile in these environment: - Without libtraceevent-devel installed - With libtraceevent-devel installed - With “make NO_LIBTRACEEVENT=1” </quote> Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for consistency with other libraries detected in tools/perf/. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
336b92da |
|
05-Dec-2022 |
Ravi Bangoria <ravi.bangoria@amd.com> |
perf tool: Move pmus list variable to a new file The 'pmus' list variable is defined as static variable under pmu.c file. Introduce a new pmus.c file and migrate this variable to it. Also make it non static so that it can be accessed from outside. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: carsten.haitzler@arm.com Link: https://lore.kernel.org/r/20221206043237.12159-2-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
84bec6f0 |
|
09-Nov-2022 |
Ian Rogers <irogers@google.com> |
perf build: Install libsymbol locally when building The perf build currently has a '-Itools/lib' on the CC command line. This causes issues as the libapi, libsubcmd, libtraceevent, libbpf and libsymbol headers are all found via this path, making it impossible to override include behavior. Change the libsymbol build mirroring the libbpf, libsubcmd, libapi, libperf and libtraceevent build, so that it is installed in a directory along with its headers. A later change will modify the include behavior. Don't build kallsyms.o as part of util as this will lead to duplicate definitions. Add kallsym's directory to the MANIFEST rather than individual files, so that the Build and Makefile are added to a source tar ball. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221109184914.1357295-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b018899e |
|
04-Nov-2022 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf bpf: Rename perf_include_dir to libbpf_include_dir As this is where we expect to find bpf/bpf_helpers.h, etc. This needs more work to make it follow LIBBPF_DYNAMIC=1 usage, i.e. when not using the system libbpf it should use the headers in the in-kernel sources libbpf in tools/lib/bpf. We need to do that anyway to avoid this mixup system libbpf and in-kernel files, so we'll get this sorted out that way. And this also may become moot as we move to using BPF skels for this feature. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5e91e57e |
|
27-Sep-2022 |
Qi Liu <liuqi115@huawei.com> |
perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet Add support for using 'perf report --dump-raw-trace' to parse PTT packet. Example usage: Output will contain raw PTT data and its textual representation, such as (8DW format): 0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x400000 offset: 0 ref: 0xa5d50c725 idx: 0 tid: -1 cpu: 0 . . ... HISI PTT data: size 4194304 bytes . 00000000: 00 00 00 00 Prefix . 00000004: 08 20 00 60 Header DW0 . 00000008: ff 02 00 01 Header DW1 . 0000000c: 20 08 00 00 Header DW2 . 00000010: 10 e7 44 ab Header DW3 . 00000014: 2a a8 1e 01 Time . 00000020: 00 00 00 00 Prefix . 00000024: 01 00 00 60 Header DW0 . 00000028: 0f 1e 00 01 Header DW1 . 0000002c: 04 00 00 00 Header DW2 . 00000030: 40 00 81 02 Header DW3 . 00000034: ee 02 00 00 Time .... This patch only add basic parsing support according to the definition of the PTT packet described in Documentation/trace/hisi-ptt.rst. And the fields of each packet can be further decoded following the PCIe Spec's definition of TLP packet. Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-4-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e57d8977 |
|
26-Aug-2022 |
Pavithra Gurushankar <gpavithrasha@gmail.com> |
perf mutex: Wrapped usage of mutex and cond Added a new header file mutex.h that wraps the usage of pthread_mutex_t and pthread_cond_t. By abstracting these it is possible to introduce error checking. Signed-off-by: Pavithra Gurushankar <gpavithrasha@gmail.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: André Almeida <andrealmeid@igalia.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Colin Ian King <colin.king@intel.com> Cc: Dario Petrillo <dario.pk1@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Fangrui Song <maskray@google.com> Cc: Hewenliang <hewenliang4@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jason Wang <wangborong@cdjrlc.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Remi Bernon <rbernon@codeweavers.com> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tom Rix <trix@redhat.com> Cc: Weiguo Li <liwg06@foxmail.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: William Cohen <wcohen@redhat.com> Cc: Zechuan Chen <chenzechuan1@huawei.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Cc: yaowenbin <yaowenbin1@huawei.com> Link: https://lore.kernel.org/r/20220826164242.43412-2-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5149a427 |
|
29-Sep-2022 |
Jiri Olsa <jolsa@kernel.org> |
perf parse-events: Ignore clang 15 warning about variable set but unused in bison produced code clang 15 now warns: 46 65.20 fedora:rawhide : FAIL clang version 15.0.0 (Fedora 15.0.0-3.fc38) util/parse-events-bison.c:1401:9: error: variable 'parse_events_nerrs' set but not used [-Werror,-Wunused-but-set-variable] int yynerrs = 0; ^ #define yynerrs parse_events_nerrs ^ 1 error generated. make[3]: *** [/git/perf-6.0.0-rc7/tools/build/Makefile.build:139: util] Error 2 Just ignore one more compiler warning for the bison generated C code. Committer notes: Older clangs don't know about -Wunused-but-set-variable, so we need to add -Wno-unknown-warning-option to avoid this: 37 44.92 fedora:32 : FAIL clang version 10.0.1 (Fedora 10.0.1-3.fc32) error: unknown warning option '-Wno-unused-but-set-variable'; did you mean '-Wno-unused-const-variable'? [-Werror,-Wunknown-warning-option] make[3]: *** [/git/perf-6.0.0-rc7/tools/build/Makefile.build:139: util] Error 2 Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/20220929140514.226807-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4a88c4ec |
|
11-Aug-2022 |
Leo Yan <leo.yan@linaro.org> |
perf arm64: Add missing -I for tools/arch/arm64/include/ to find asm/sysreg.h when building arm_spe.h This cures a current problem where tools/perf/util/arm-spe.c isn't finding a ARM64 specific asm header, so lets add it for now to make progress. Adding a .o specific rule seems clunky, lets try and find if this is really the right solution. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/lkml/20220811124825.GA868014@leoy-huanghe.lan Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9b7c7728 |
|
29-Jul-2022 |
Ian Rogers <irogers@google.com> |
perf parse-events: Break out tracepoint and printing Move print_*_events functions out of parse-events.c into a new print-events.c. Move tracepoint code into tracepoint.c or trace-event-info.c (sole user). This reduces the dependencies of parse-events.c and makes it more amenable to being a library in the future. Remove some unnecessary definitions from parse-events.h. Fix a checkpatch.pl warning on using unsigned rather than unsigned int. Fix some line length warnings too. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com [ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
407b36f6 |
|
29-Jul-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf lock: Use BPF for lock contention analysis Add -b/--use-bpf option to use BPF to collect lock contention stats. For simplicity it now runs system-wide and requires C-c to stop. Upcoming changes will add the usual filtering. $ sudo perf lock con -b ^C contended total wait max wait avg wait type caller 42 192.67 us 13.64 us 4.59 us spinlock queue_work_on+0x20 23 85.54 us 10.28 us 3.72 us spinlock worker_thread+0x14a 6 13.92 us 6.51 us 2.32 us mutex kernfs_iop_permission+0x30 3 11.59 us 10.04 us 3.86 us mutex kernfs_dop_revalidate+0x3c 1 7.52 us 7.52 us 7.52 us spinlock kthread+0x115 1 7.24 us 7.24 us 7.24 us rwlock:W sys_epoll_wait+0x148 2 7.08 us 3.99 us 3.54 us spinlock delayed_work_timer_fn+0x1b 1 6.41 us 6.41 us 6.41 us spinlock idle_balance+0xa06 2 2.50 us 1.83 us 1.25 us mutex kernfs_iop_lookup+0x2f 1 1.71 us 1.71 us 1.71 us mutex kernfs_iop_getattr+0x2c Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
daf07d22 |
|
08-Jul-2022 |
Yang Jihong <yangjihong1@huawei.com> |
perf kwork: Implement BPF trace 'perf record' generates perf.data, which generates extra interrupts for hard disk, amount of data to be collected increases with time. Using eBPF trace can process the data in kernel, which solves the preceding two problems. Add -b/--use-bpf option for latency and report to support tracing kwork events using eBPF: 1. Create bpf prog and attach to tracepoints, 2. Start tracing after command is entered, 3. After user hit "ctrl+c", stop tracing and report, 4. Support CPU and name filtering. This commit implements the framework code and does not add specific event support. Test cases: # perf kwork rep -h Usage: perf kwork report [<options>] -b, --use-bpf Use BPF to measure kwork runtime -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): runtime, max, count -S, --with-summary Show summary with statistics --time <str> Time span for analysis (start,stop) # perf kwork lat -h Usage: perf kwork latency [<options>] -b, --use-bpf Use BPF to measure kwork latency -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): avg, max, count --time <str> Time span for analysis (start,stop) # perf kwork lat -b Unsupported bpf trace class irq # perf kwork rep -b Unsupported bpf trace class irq Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com [ Simplify work_findnew() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
edc41a10 |
|
18-May-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf record: Enable off-cpu analysis with BPF Add --off-cpu option to enable the off-cpu profiling with BPF. It'd use a bpf_output event and rename it to "offcpu-time". Samples will be synthesized at the end of the record session using data from a BPF map which contains the aggregated off-cpu time at context switches. So it needs root privilege to get the off-cpu profiling. Each sample will have a separate user stacktrace so it will skip kernel threads. The sample ip will be set from the stacktrace and other sample data will be updated accordingly. Currently it only handles some basic sample types. The sample timestamp is set to a dummy value just not to bother with other events during the sorting. So it has a very big initial value and increase it on processing each samples. Good thing is that it can be used together with regular profiling like cpu cycles. If you don't want to that, you can use a dummy event to enable off-cpu profiling only. Example output: $ sudo perf record --off-cpu perf bench sched messaging -l 1000 $ sudo perf report --stdio --call-graph=no # Total Lost Samples: 0 # # Samples: 41K of event 'cycles' # Event count (approx.): 42137343851 ... # Samples: 1K of event 'offcpu-time' # Event count (approx.): 587990831640 # # Children Self Command Shared Object Symbol # ........ ........ ............... .................. ......................... # 81.66% 0.00% sched-messaging libc-2.33.so [.] __libc_start_main 81.66% 0.00% sched-messaging perf [.] cmd_bench 81.66% 0.00% sched-messaging perf [.] main 81.66% 0.00% sched-messaging perf [.] run_builtin 81.43% 0.00% sched-messaging perf [.] bench_sched_messaging 40.86% 40.86% sched-messaging libpthread-2.33.so [.] __read 37.66% 37.66% sched-messaging libpthread-2.33.so [.] __write 2.91% 2.91% sched-messaging libc-2.33.so [.] __poll ... As you can see it spent most of off-cpu time in read and write in bench_sched_messaging(). The --call-graph=no was added just to make the output concise here. It uses perf hooks facility to control BPF program during the record session rather than adding new BPF/off-cpu specific calls. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Blake Jones <blakejones@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9d31d18b |
|
11-Feb-2022 |
Ian Rogers <irogers@google.com> |
perf maps: Move maps code to own C file The maps code has its own header, move the corresponding C function definitions to their own C file. In the process tidy and minimize includes. Committer notes: Add back the 'static' for maps__init() and maps__exit(). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: André Almeida <andrealmeid@collabora.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: http://lore.kernel.org/lkml/20220211103415.2737789-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b9f6fbb3 |
|
17-Dec-2021 |
Alexandre Truong <alexandre.truong@arm.com> |
perf arm64: Inject missing frames when using 'perf record --call-graph=fp' When unwinding using frame pointers on ARM64, the return address of the current function may not have been pushed into the stack when a function was interrupted, which makes perf show an incorrect call graph to the user. Consider the following example program: void leaf() { /* long computation */ } void parent() { // (1) leaf(); // (2) } ... could be compiled into (using gcc -fno-inline -fno-omit-frame-pointer): leaf: /* long computation */ nop ret parent: // (1) stp x29, x30, [sp, -16]! mov x29, sp bl parent nop ldp x29, x30, [sp], 16 // (2) ret If the program is interrupted at (1), (2), or any point in "leaf:", the call graph will skip the callers of the current function. We can unwind using the dwarf info and check if the return addr is the same as the LR register, and inject the missing frame into the call graph. Before this patch, the above example shows the following call-graph when recording using "--call-graph fp" mode in ARM64: # Children Self Command Shared Object Symbol # ........ ........ ........ ................ ...................... # 99.86% 99.86% program3 program3 [.] leaf | ---_start __libc_start_main main leaf As can be seen, the "parent" function is missing. This is specially problematic in "leaf" because for leaf functions the compiler may always omit pushing the return addr into the stack. After this patch, it shows the correct graph: # Children Self Command Shared Object Symbol # ........ ........ ........ ................ ...................... # 99.86% 99.86% program3 program3 [.] leaf | ---_start __libc_start_main main parent leaf Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Alexandre Truong <alexandre.truong@arm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20211217154521.80603-7-german.gomez@arm.com Signed-off-by: German Gomez <german.gomez@arm.com> [ Rename machine__normalize_is() to machine__normalized_is(), as suggested by James Clark ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
177f4eac |
|
15-Dec-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf ftrace: Add -b/--use-bpf option for latency subcommand The -b/--use-bpf option is to use BPF to get latency info of kernel functions. It'd have better performance impact and I observed that latency of same function is smaller than before when using BPF. Committer testing: # strace -e bpf perf ftrace latency -b -T __handle_mm_fault -a sleep 1 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914e00, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 3 bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 128) = 3 bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3 bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 128) = 3 bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\7\0\0\0\1\0\0\0\0\0\0\20"..., btf_log_buf=NULL, btf_size=43, btf_log_size=0, btf_log_level=0}, 128) = 3 bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3 bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=77, btf_log_size=0, btf_log_level=0}, 128) = -1 EINVAL (Invalid argument) bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\350\2\0\0\350\2\0\0\353\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1515, btf_log_size=0, btf_log_level=0}, 128) = 3 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7fff51914c30, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 5 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 4 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=8, value_size=8, max_entries=10000, map_flags=0, inner_map_fd=0, map_name="functime", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="cpu_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 5 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="task_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 7 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=8, max_entries=22, map_flags=0, inner_map_fd=0, map_name="latency", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 8 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="func_lat.bss", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=30, btf_vmlinux_value_type_id=0}, 128) = 9 bpf(BPF_MAP_UPDATE_ELEM, {map_fd=9, key=0x7fff51914c40, value=0x7f6e99be2000, flags=BPF_ANY}, 128) = 0 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x11e4160, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_begin", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc50, func_info_cnt=1, line_info_rec_size=16, line_info=0x11e04c0, line_info_cnt=9, attach_btf_id=0, attach_prog_fd=0}, 128) = 10 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=99, insns=0x11ded70, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_end", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc70, func_info_cnt=1, line_info_rec_size=16, line_info=0x11f6e10, line_info_cnt=20, attach_btf_id=0, attach_prog_fd=0}, 128) = 11 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 13 bpf(BPF_LINK_CREATE, {link_create={prog_fd=13, target_fd=-1, attach_type=0x29 /* BPF_??? */, flags=0}}, 128) = -1 EINVAL (Invalid argument) --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1699992, si_uid=0, si_status=0, si_utime=0, si_stime=0} --- bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0 # DURATION | COUNT | GRAPH | 0 - 1 us | 52 | ################### | 1 - 2 us | 36 | ############# | 2 - 4 us | 24 | ######### | 4 - 8 us | 7 | ## | 8 - 16 us | 1 | | 16 - 32 us | 0 | | 32 - 64 us | 0 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - 2 ms | 0 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 0 | | 256 - 512 ms | 0 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | +++ exited with 0 +++ # Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20211215185154.360314-5-namhyung@kernel.org [ Add missing util/cpumap.h include and removed unused 'fd' variable ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
199e06fe |
|
01-Nov-2021 |
Dave Marchevsky <davemarchevsky@fb.com> |
perf: Pull in bpf_program__get_prog_info_linear To prepare for impending deprecation of libbpf's bpf_program__get_prog_info_linear, pull in the function and associated helpers into the perf codebase and migrate existing uses to the perf copy. Since libbpf's deprecated definitions will still be visible to perf, it is necessary to rename perf's definitions. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211101224357.2651181-4-davemarchevsky@fb.com
|
#
6ac22d03 |
|
11-Oct-2021 |
Dave Marchevsky <davemarchevsky@fb.com> |
perf bpf: Pull in bpf_program__get_prog_info_linear() To prepare for impending deprecation of libbpf's bpf_program__get_prog_info_linear(), pull in the function and associated helpers into the perf codebase and migrate existing uses to the perf copy. Since libbpf's deprecated definitions will still be visible to perf, it is necessary to rename perf's definitions. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20211011082031.4148337-4-davemarchevsky@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
92ec3cc9 |
|
15-Oct-2021 |
Ian Rogers <irogers@google.com> |
tools lib: Adopt list_sort() from the kernel sources Add list_sort.[ch] from the main kernel tree. The linux/bug.h #include is removed due to conflicting definitions. Add check-headers and modify perf build accordingly. MANIFEST and python-ext-sources fixes suggested by Arnaldo. Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Changbin Du <changbin.du@intel.com> Cc: Denys Zagorui <dzagorui@cisco.com> Cc: Fabian Hemmer <copy@copy.sh> Cc: Felix Fietkau <nbd@nbd.name> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joakim Zhang <qiangqing.zhang@nxp.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nicholas Fraser <nfraser@codeweavers.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: ShihCheng Tu <mrtoastcheng@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Wan Jiabing <wanjiabing@vivo.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20211015172132.1162559-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
291dcb98 |
|
17-Aug-2021 |
Kim Phillips <kim.phillips@amd.com> |
perf report: Add support to print a textual representation of IBS raw sample data Perf records IBS (Instruction Based Sampling) extra sample data when 'perf record --raw-samples' is used with an IBS-compatible event, on a machine that supports IBS. IBS support is indicated in CPUID_Fn80000001_ECX bit #10. Up until now, users have been able to see the extra sample data solely in raw hex format using 'perf report --dump-raw-trace'. From there, users could decode the data either manually, or by using an external script. Enable the built-in 'perf report --dump-raw-trace' to do the decoding of the extra sample data bits, so manual or external script decoding isn't necessary. Example usage: $ sudo perf record -c 10000001 -a --raw-samples -e ibs_fetch/rand_en=1/,ibs_op/cnt_ctl=1/ -C 0,1 taskset -c 0,1 7za b -mmt2 | perf report --dump-raw-trace Stdout contains IBS Fetch samples, e.g.: ibs_fetch_ctl: 02170007ffffffff MaxCnt 1048560 Cnt 1048560 Lat 7 En 1 Val 1 Comp 1 IcMiss 0 PhyAddrValid 1 L1TlbPgSz 4KB L1TlbMiss 0 L2TlbMiss 0 RandEn 1 L2Miss 0 IbsFetchLinAd: 000056016b2ead40 IbsFetchPhysAd: 000000115cedfd40 c_ibs_ext_ctl: 0000000000000000 IbsItlbRefillLat 0 ..and IBS Op samples, e.g.: ibs_op_ctl: 0000009e009e8968 MaxCnt 10000000 En 1 Val 1 CntCtl 1=uOps CurCnt 158 IbsOpRip: 000056016b2ea73d ibs_op_data: 00000000000b0002 CompToRetCtr 2 TagToRetCtr 11 BrnRet 0 RipInvalid 0 BrnFuse 0 Microcode 0 ibs_op_data2: 0000000000000002 CacheHitSt 0=M-state RmtNode 0 DataSrc 2=Local node cache ibs_op_data3: 0000000000c60002 LdOp 0 StOp 1 DcL1TlbMiss 0 DcL2TlbMiss 0 DcL1TlbHit2M 0 DcL1TlbHit1G 0 DcL2TlbHit2M 0 DcMiss 0 DcMisAcc 0 DcWcMemAcc 0 DcUcMemAcc 0 DcLockedOp 0 DcMissNoMabAlloc 0 DcLinAddrValid 1 DcPhyAddrValid 1 DcL2TlbHit1G 0 L2Miss 0 SwPf 0 OpMemWidth 4 bytes OpDcMissOpenMemReqs 0 DcMissLat 0 TlbRefillLat 0 IbsDCLinAd: 00007f133c319ce0 IbsDCPhysAd: 0000000270485ce0 Committer notes: Fixed up this: util/amd-sample-raw.c: In function ‘evlist__amd_sample_raw’: util/amd-sample-raw.c:125:42: error: ‘ bytes’ directive output may be truncated writing 6 bytes into a region of size between 4 and 7 [-Werror=format-truncation=] 125 | " OpMemWidth %2d bytes", 1 << (reg.op_mem_width - 1)); | ^~~~~~ In file included from /usr/include/stdio.h:866, from util/amd-sample-raw.c:7: /usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 21 and 24 bytes into a destination of size 21 71 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 72 | __glibc_objsize (__s), __fmt, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 73 | __va_arg_pack ()); | ~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors As that %2d won't limit the number of chars to 2, just state that 2 is the minimal width: $ cat printf.c #include <stdio.h> #include <stdlib.h> int main(int argc, char *argv[]) { char bf[64]; int len = snprintf(bf, sizeof(bf), "%2d", atoi(argv[1])); printf("strlen(%s): %u\n", bf, len); return 0; } $ ./printf 1 strlen( 1): 2 $ ./printf 12 strlen(12): 2 $ ./printf 123 strlen(123): 3 $ ./printf 1234 strlen(1234): 4 $ ./printf 12345 strlen(12345): 5 $ ./printf 123456 strlen(123456): 6 $ And since we probably don't want that output to be truncated, just assume the worst case, as the compiler did, and add a few more chars to that buffer. Also use sizeof(var) instead of sizeof(dup-of-wanted-format-string) to avoid bugs when changing one but not the other. I also had to change this: -#include <asm/amd-ibs.h> +#include "../../arch/x86/include/asm/amd-ibs.h" To make it build on other architectures, just like intel-pt does. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https //lore.kernel.org/r/20210817221509.88391-4-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
944138f0 |
|
01-Jul-2021 |
Namhyung Kim <namhyung@kernel.org> |
perf stat: Enable BPF counter with --for-each-cgroup Recently bperf was added to use BPF to count perf events for various purposes. This is an extension for the approach and targetting to cgroup usages. Unlike the other bperf, it doesn't share the events with other processes but it'd reduce unnecessary events (and the overhead of multiplexing) for each monitored cgroup within the perf session. When --for-each-cgroup is used with --bpf-counters, it will open cgroup-switches event per cpu internally and attach the new BPF program to read given perf_events and to aggregate the results for cgroups. It's only called when task is switched to a task in a different cgroup. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210701211227.1403788-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
291961fc |
|
27-Jun-2021 |
Adrian Hunter <adrian.hunter@intel.com> |
perf script: Add API for filtering via dynamically loaded shared object In some cases, users want to filter very large amounts of data (e.g. from AUX area tracing like Intel PT) looking for something specific. While scripting such as Python can be used, Python is 10 to 20 times slower than C. So define a C API so that custom filters can be written and loaded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6793672a |
|
24-May-2021 |
Denys Zagorui <dzagorui@cisco.com> |
perf parse-events: Add bison --file-prefix-map option During a perf build with O= bison stores full paths in generated files and those paths are stored in resulting perf binary. Starting from bison v3.7.1 those paths can be remapped by using the --file-prefix-map option. Use this option if possible to make perf binary more reproducible. Signed-off-by: Denys Zagorui <dzagorui@cisco.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210524111514.65713-3-dzagorui@cisco.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ad1237c3 |
|
08-May-2021 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Fix dynamic libbpf link Justin reported broken build with LIBBPF_DYNAMIC=1. When linking libbpf dynamically we need to use perf's hashmap object, because it's not exported in libbpf.so (only in libbpf.a). Following build is now passing: $ make LIBBPF_DYNAMIC=1 BUILD: Doing 'make -j8' parallel build ... $ ldd perf | grep libbpf libbpf.so.0 => /lib64/libbpf.so.0 (0x00007fa7630db000) Fixes: eee19501926d ("perf tools: Grab a copy of libbpf's hashmap") Reported-by: Justin M. Forbes <jforbes@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210508205020.617984-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b53a0755 |
|
27-Apr-2021 |
Jin Yao <yao.jin@linux.intel.com> |
perf record: Create two hybrid 'cycles' events by default When evlist is empty, for example no '-e' specified in perf record, one default 'cycles' event is added to evlist. While on hybrid platform, it needs to create two default 'cycles' events. One is for cpu_core, the other is for cpu_atom. This patch actually calls evsel__new_cycles() two times to create two 'cycles' events. # ./perf record -vv -a -- sleep 1 ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 6 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 7 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 9 sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 10 sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 11 sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 12 sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 13 sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 14 sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 15 sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 16 sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 17 sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 18 sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 19 sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 20 sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 21 ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 22 sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 23 sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 24 sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 25 sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 26 sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 27 sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 28 sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 29 ------------------------------------------------------------ We have to create evlist-hybrid.c otherwise due to the symbol dependency the perf test python would be failed. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9cbfa2f6 |
|
27-Apr-2021 |
Jin Yao <yao.jin@linux.intel.com> |
perf parse-events: Create two hybrid hardware events Current hardware events has special perf types PERF_TYPE_HARDWARE. But it doesn't pass the PMU type in the user interface. For a hybrid system, the perf kernel doesn't know which PMU the events belong to. So now this type is extended to be PMU aware type. The PMU type ID is stored at attr.config[63:32]. PMU type ID is retrieved from sysfs. root@lkp-adl-d01:/sys/devices/cpu_atom# cat type 8 root@lkp-adl-d01:/sys/devices/cpu_core# cat type 4 When enabling a hybrid hardware event without specified pmu, such as, 'perf stat -e cycles -a', two events are created automatically. One is for atom, the other is for core. # perf stat -e cycles -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 cycles: 0: 836272 1001525722 1001525722 cycles: 1: 628564 1001580453 1001580453 cycles: 2: 872693 1001605997 1001605997 cycles: 3: 70417 1001641369 1001641369 cycles: 4: 88593 1001726722 1001726722 cycles: 5: 470495 1001752993 1001752993 cycles: 6: 484733 1001840440 1001840440 cycles: 7: 1272477 1001593105 1001593105 cycles: 8: 209185 1001608616 1001608616 cycles: 9: 204391 1001633962 1001633962 cycles: 10: 264121 1001661745 1001661745 cycles: 11: 826104 1001689904 1001689904 cycles: 12: 89935 1001728861 1001728861 cycles: 13: 70639 1001756757 1001756757 cycles: 14: 185266 1001784810 1001784810 cycles: 15: 171094 1001825466 1001825466 cycles: 0: 129624 1001854843 1001854843 cycles: 1: 122533 1001840421 1001840421 cycles: 2: 90055 1001882506 1001882506 cycles: 3: 139607 1001896463 1001896463 cycles: 4: 141791 1001907838 1001907838 cycles: 5: 530927 1001883880 1001883880 cycles: 6: 143246 1001852529 1001852529 cycles: 7: 667769 1001872626 1001872626 cycles: 6744979 16026956922 16026956922 cycles: 1965552 8014991106 8014991106 Performance counter stats for 'system wide': 6,744,979 cpu_core/cycles/ 1,965,552 cpu_atom/cycles/ 1.001882711 seconds time elapsed 0x4 in 0x400000000 indicates the cpu_core pmu. 0x8 in 0x800000000 indicates the cpu_atom pmu. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
44462430 |
|
27-Apr-2021 |
Jin Yao <yao.jin@linux.intel.com> |
perf pmu: Save detected hybrid pmus to a global pmu list We identify the cpu_core pmu and cpu_atom pmu by explicitly checking following files: For cpu_core, checks: "/sys/bus/event_source/devices/cpu_core/cpus" For cpu_atom, checks: "/sys/bus/event_source/devices/cpu_atom/cpus" If the 'cpus' file exists and it has data, the pmu exists. But in order not to hardcode the "cpu_core" and "cpu_atom", and make the code in a generic way. So if the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the hybrid pmu exists. All the detected hybrid pmus are linked to a global list 'perf_pmu__hybrid_pmus' and then next we just need to iterate the list to get all hybrid pmu by using perf_pmu__for_each_hybrid_pmu. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d0713d4c |
|
26-Apr-2021 |
Nicholas Fraser <nfraser@codeweavers.com> |
perf data: Add JSON export This adds a feature to export perf data to JSON. The resolved symbols are exported into the JSON so that external tools don't need to load the dsos themselves (or even have access to them at all.) This makes it easy to load and analyze perf data with standalone tools where direct perf or libbabeltrace integration is impractical. The exporter uses a minimal inline JSON encoding without any external dependencies. Currently it only outputs some headers and sample metadata but it's easily extensible. Use it like this: $ perf data convert --to-json out.json Committer notes: Fixup a __printf() bug that broke the build: util/data-convert-json.c:103:11: error: expected ‘)’ before numeric constant 103 | __(printf, 5, 6) | ^~ | ) util/data-convert-json.c: In function ‘output_sample_callchain_entry’: util/data-convert-json.c:124:2: error: implicit declaration of function ‘output_json_key_format’; did you mean ‘output_json_format’? [-Werror=implicit-function-declaration] 124 | output_json_key_format(out, false, 5, "ip", "\"0x%" PRIx64 "\"", ip); | ^~~~~~~~~~~~~~~~~~~~~~ | output_json_format Also had to add this patch to fix errors reported by various versions of clang: - if (al && al->sym && al->sym->name && strlen(al->sym->name) > 0) { + if (al && al->sym && al->sym->namelen) { al->sym->name is a zero sized array, to avoid one extra alloc in the symbol__new() constructor, sym->namelen carries its strlen. Committer testing: $ ls -la out.json ls: cannot access 'out.json': No such file or directory $ perf record sleep 0.1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ] $ perf report --stats | grep -w SAMPLE SAMPLE events: 8 $ perf data convert --to-json out.json [ perf data convert: Converted 'perf.data' into JSON data 'out.json' ] [ perf data convert: Converted and wrote 0.002 MB (8 samples) ] $ ls -la out.json -rw-rw-r--. 1 acme acme 2017 Apr 26 17:29 out.json $ cat out.json { "linux-perf-json-version": 1, "headers": { "header-version": 1, "captured-on": "2021-04-26T20:28:57Z", "data-offset": 432, "data-size": 1016, "feat-offset": 1448, "hostname": "five", "os-release": "5.11.14-200.fc33.x86_64", "arch": "x86_64", "cpu-desc": "AMD Ryzen 9 3900X 12-Core Processor", "cpuid": "AuthenticAMD,23,113,0", "nrcpus-online": 24, "nrcpus-avail": 24, "perf-version": "5.12.gee134f3189bd", "cmdline": [ "/home/acme/bin/perf", "record", "sleep", "0.1" ] }, "samples": [ { "timestamp": 170517539043684, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6268827" } ] }, { "timestamp": 170517539048443, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa661359d" } ] }, { "timestamp": 170517539051018, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6311e18" } ] }, { "timestamp": 170517539053652, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb77b4812b", "symbol": "_dl_start", "dso": "ld-2.32.so" } ] }, { "timestamp": 170517539055306, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6269286" } ] }, { "timestamp": 170517539057590, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa62abd8b" } ] }, { "timestamp": 170517539067559, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb77b5e9e9", "symbol": "__GI___tunables_init", "dso": "ld-2.32.so" } ] }, { "timestamp": 170517539282452, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb779978d2", "symbol": "getenv", "dso": "libc-2.32.so" } ] } ] } $ Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tan Xiaojun <tanxiaojun@huawei.com> Cc: Ulrich Czekalla <uczekalla@codeweavers.com> Link: http://lore.kernel.org/lkml/3884969f-804d-2f53-c648-e2b0bd85edff@codeweavers.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f07952b1 |
|
18-Apr-2021 |
Alexander Antonov <alexander.antonov@linux.intel.com> |
perf stat: Basic support for iostat in perf Add basic flow for a new iostat mode in perf. Mode is intended to provide four I/O performance metrics per each PCIe root port: Inbound Read, Inbound Write, Outbound Read, Outbound Write. The actual code to compute the metrics and attribute it to root port is in follow-on patches. Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210419094147.15909-2-alexander.antonov@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cef7af25 |
|
03-Feb-2021 |
Fabian Hemmer <copy@copy.sh> |
perf tools: Add OCaml demangling Detect symbols generated by the OCaml compiler based on their prefix. Demangle OCaml symbols, returning a newly allocated string (like the existing Java demangling functionality). Move a helper function (hex) from tests/code-reading.c to util/string.c To test: echo 'Printf.printf "%d\n" (Random.int 42)' > test.ml perf record ocamlopt.opt test.ml perf report -d ocamlopt.opt Signed-off-by: Fabian Hemmer <copy@copy.sh> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> LPU-Reference: 20210203211537.b25ytjb6dq5jfbwx@nyu Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fa853c4b |
|
29-Dec-2020 |
Song Liu <songliubraving@fb.com> |
perf stat: Enable counting events for BPF programs Introduce 'perf stat -b' option, which counts events for BPF programs, like: [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 1.487903822 115,200 ref-cycles 1.487903822 86,012 cycles 2.489147029 80,560 ref-cycles 2.489147029 73,784 cycles 3.490341825 60,720 ref-cycles 3.490341825 37,797 cycles 4.491540887 37,120 ref-cycles 4.491540887 31,963 cycles The example above counts 'cycles' and 'ref-cycles' of BPF program of id 254. This is similar to bpftool-prog-profile command, but more flexible. 'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF programs (monitor-progs) to the target BPF program (target-prog). The monitor-progs read perf_event before and after the target-prog, and aggregate the difference in a BPF map. Then the user space reads data from these maps. A new 'struct bpf_counter' is introduced to provide a common interface that uses BPF programs/maps to count perf events. Committer notes: Removed all but bpf_counter.h includes from evsel.h, not needed at all. Also BPF map lookups for PERCPU_ARRAYs need to have as its value receive buffer passed to the kernel libbpf_num_possible_cpus() entries, not evsel__nr_cpus(evsel), as the former uses /sys/devices/system/cpu/possible while the later uses /sys/devices/system/cpu/online, which may be less than the 'possible' number making the bpf map lookup overwrite memory and cause hard to debug memory corruption. We need to continue using evsel__nr_cpus(evsel) when accessing the perf_counts array tho, not to overwrite another are of memory :-) Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/lkml/20210120163031.GU12699@kernel.org/ Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Link: http://lore.kernel.org/lkml/20201229214214.3413833-4-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
480accbb |
|
08-Oct-2020 |
Jin Yao <yao.jin@linux.intel.com> |
perf streams: Introduce branch history "streams" We define a stream as the branch history which is aggregated by the branch records from perf samples. For example, the callchains aggregated from the branch records are considered as streams. By browsing the hot stream, we can understand the hot code path. Now we only support the callchain for stream. For measuring the hot level for a stream, we use the callchain_node->hit, higher is hotter. There may be many callchains sampled so we only focus on the top N hottest callchains. N is a user defined parameter or predefined default value (nr_streams_max). This patch creates an evsel_streams array per event, and saves the top N hottest streams in a stream array. So now we can get the per-event top N hottest streams. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20201009022845.13141-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
687986bb |
|
11-Sep-2020 |
Kan Liang <kan.liang@linux.intel.com> |
perf tools: Rename group to topdown The group.h/c only include TopDown group related functions. The name "group" is too generic and inaccurate. Use the name "topdown" to replace it. Move topdown related functions to a dedicated file, topdown.c. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200911144808.27603-2-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a80abe2a |
|
07-Aug-2020 |
Changbin Du <changbin.du@intel.com> |
perf tools: Add general function to parse sublevel options This factors out a general function perf_parse_sublevel_options() to parse sublevel options. The 'sublevel' options is something like the '--debug' options which allow more sublevel options. Signed-off-by: Changbin Du <changbin.du@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: http://lore.kernel.org/lkml/20200808023141.14227-8-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
6953beb4 |
|
05-Aug-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf clockid: Move parse_clockid() to new clockid object Move parse_clockid and all needed clcckid related stuff into clockid object. We are going to add clockid_name function in following change, so it's better it's placed in separated object and not in builtin-record.c. No functional change is intended. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Geneviève Bastien <gbastien@versatic.net> Cc: Ian Rogers <irogers@google.com> Cc: Jeremie Galarneau <jgalar@efficios.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20200805093444.314999-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1f16fcad |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf parse-events: Disable a subset of bison warnings Rather than disable all warnings with -w, disable specific warnings. Predicate enabling the warnings on a recent version of bison. Tested with GCC 9.3.0 and clang 9.0.1. Committer testing: The full set of compilers, gcc and clang that this will be tested on will be on the signed tag when this change goes upstream. Had to add -Wno-switch-enum to build on opensuse tumbleweed: /tmp/build/perf/util/parse-events-bison.c: In function 'yydestruct': /tmp/build/perf/util/parse-events-bison.c:1200:3: error: enumeration value 'YYSYMBOL_YYEMPTY' not handled in switch [-Werror=switch-enum] 1200 | switch (yykind) | ^~~~~~ /tmp/build/perf/util/parse-events-bison.c:1200:3: error: enumeration value 'YYSYMBOL_YYEOF' not handled in switch [-Werror=switch-enum] Also replace -Wno-error=implicit-function-declaration with -Wno-implicit-function-declaration. Also needed to check just the first two levels of the bison version, as the patch was assuming that all versions were of the form x.y.z, and there are several cases where it is just x.y, breaking the build. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
304d7a90 |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf parse-events: Disable a subset of flex warnings Rather than disable all warnings with -w, disable specific warnings. Predicate enabling the warnings on more recent flex versions. Tested with GCC 9.3.0 and clang 9.0.1. Committer notes: The full set of compilers, gcc and clang that this will be tested on will be on the signed tag when this change goes upstream. Added -Wno-misleading-indentation to the flex_flags to overcome this on opensuse tumbleweed when building with clang: CC /tmp/build/perf/util/parse-events-flex.o CC /tmp/build/perf/util/pmu.o /tmp/build/perf/util/parse-events-flex.c:5038:13: error: misleading indentation; statement is not part of the previous 'if' [-Werror,-Wmisleading-indentation] if ( ! yyg->yy_state_buf ) ^ /tmp/build/perf/util/parse-events-flex.c:5036:9: note: previous statement is here if ( ! yyg->yy_state_buf ) ^ And we need to use this to redirect stderr to stdin and then grep in a way that is acceptable for BusyBox shell: 2>&1 | Previously I was using: |& Which seems to be bash specific. Added -Wno-sign-compare to overcome this on systems such as centos:7: CC /tmp/build/perf/util/parse-events-flex.o CC /tmp/build/perf/util/pmu.o CC /tmp/build/perf/util/pmu-flex.o util/parse-events.l: In function 'parse_events_lex': /tmp/build/perf/util/parse-events-flex.c:193:36: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] for ( yyl = n; yyl < yyleng; ++yyl )\ ^ /tmp/build/perf/util/parse-events-flex.c:204:9: note: in expansion of macro 'YY_LESS_LINENO' Added -Wno-unused-parameter to overcome this in systems such as centos:7: CC /tmp/build/perf/util/parse-events-flex.o CC /tmp/build/perf/util/pmu.o /tmp/build/perf/util/parse-events-flex.c: In function 'yy_fatal_error': /tmp/build/perf/util/parse-events-flex.c:6265:58: error: unused parameter 'yyscanner' [-Werror=unused-parameter] static void yy_fatal_error (yyconst char* msg , yyscan_t yyscanner) ^ Added -Wno-missing-declarations to build in systems such as centos:6: /tmp/build/perf/util/parse-events-flex.c:6313: error: no previous prototype for 'parse_events_get_column' /tmp/build/perf/util/parse-events-flex.c:6389: error: no previous prototype for 'parse_events_set_column' And -Wno-missing-prototypes to cover older compilers: -Wmissing-prototypes (C only) Warn if a global function is defined without a previous prototype declaration. This warning is issued even if the definition itself provides a prototype. The aim is to detect global functions that fail to be declared in header files. -Wmissing-declarations (C only) Warn if a global function is defined without a previous declaration. Do so even if the definition itself provides a prototype. Use this option to detect global functions that are not declared in header files. Older C compilers lack -Wno-misleading-indentation, check if it is available before using it. Also needed to check just the first two levels of the flex version, as the patch was assuming that all versions were of the form x.y.z, and there are several cases where it is just x.y, breaking the build. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ef9894d9 |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf parse-events: Declare bison header file output Declare bison header file output so that C files can depend upon them. As there are multiple output targets $@ is replaced by the target name. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4b971df9 |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf parse-events: Declare flex header file output Declare flex header file output so that bison C files can depend upon them. As there are multiple output targets $@ is replaced by the target name. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
970a4a34 |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf pmu: Add flex debug build flag Allow pmu parser's flex to be debugged as the parse-events and expr currently are. Enabling this requires the C code to call perf_pmu__flex_debug. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5011a52f |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf pmu: Add bison debug build flag Allow pmu parser to be debugged as the parse-events and expr currently are. Enabling this requires the C code to set perf_pmu_debug. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
da77a14d |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf parse-events: Use automatic variable for yacc input This reduces the command line size slightly. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8d54c308 |
|
18-Jun-2020 |
Ian Rogers <irogers@google.com> |
perf parse-events: Use automatic variable for flex input This reduces the command line size slightly. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200619043356.90024-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4db25f66 |
|
30-May-2020 |
Tan Xiaojun <tanxiaojun@huawei.com> |
perf tools: Move arm-spe-pkt-decoder.h/c to the new dir Create a new arm-spe-decoder directory for subsequent extensions and move arm-spe-pkt-decoder.h/c to this directory. No code changes. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Tested-by: James Clark <james.clark@arm.com> Tested-by: Qi Liu <liuqi115@hisilicon.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20200530122442.490-2-leo.yan@linaro.org Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
70943490 |
|
05-May-2020 |
Stephane Eranian <eranian@google.com> |
perf tools: Add optional support for libpfm4 This patch links perf with the libpfm4 library if it is available and LIBPFM4 is passed to the build. The libpfm4 library contains hardware event tables for all processors supported by perf_events. It is a helper library that helps convert from a symbolic event name to the event encoding required by the underlying kernel interface. This library is open-source and available from: http://perfmon2.sf.net. With this patch, it is possible to specify full hardware events by name. Hardware filters are also supported. Events must be specified via the --pfm-events and not -e option. Both options are active at the same time and it is possible to mix and match: $ perf stat --pfm-events inst_retired:any_p:c=1:i -e cycles .... One needs to explicitely ask for its inclusion by using the LIBPFM4 make command line option, ie its opt-in rather than opt-out of feature detection and build support. Signed-off-by: Stephane Eranian <eranian@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Igor Lubashev <ilubashe@akamai.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Cc: yuzhoujian <yuzhoujian@didichuxing.com> Link: http://lore.kernel.org/lkml/20200505182943.218248-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
eee19501 |
|
15-May-2020 |
Ian Rogers <irogers@google.com> |
perf tools: Grab a copy of libbpf's hashmap Allow use of hashmap in perf. Modify perf's check-headers.sh script to check that the files are kept in sync, in the same way kernel headers are checked. This will warn if they are out of sync at the start of a perf build. Committer note: This starts out of synch as a fix went thru the bpf tree, namely the one removing the needless libbpf_internal.h include in hashmap.h. There is also another change related to __WORDSIZE, that as is in tools/lib/bpf/hashmap.h causes the tools/perf/ build to fail in systems such as Alpine Linus, that uses the Musl libc, so we need an alternative way of having __WORDSIZE available, use the one used by tools/include/linux/bitops.h, that builds in all the systems I have build containers for. These differences will be resolved at some point, so keep the warning in check-headers.sh as a reminder. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: kp singh <kpsingh@chromium.org> Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20200515221732.44078-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9a399944 |
|
04-May-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evlist: Move the sideband thread routines to separate object To avoid dragging more stuff into the perf python binding in the following csets. Reported-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
40c7d246 |
|
05-May-2020 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move routines that probe for perf API features to separate file Trying to disentangle this a bit further, unfortunately it uses parse_events(), its interesting to have it separated anyway, so do it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
26226a97 |
|
28-Feb-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf expr: Move expr lexer to flex Adding expr flex code instead of the manual parser code. So it's easily extensible in upcoming changes. The new flex code is in flex.l object and gets compiled like all the other flexers we use. It's defined as flex reentrant parser. It's used by both expr__parse and expr__find_other interfaces by separating the starting point. There's no intended change of functionality ;-) the test expr is passing. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lore.kernel.org/lkml/20200228093616.67125-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
576a65b6 |
|
28-Feb-2020 |
Jiri Olsa <jolsa@kernel.org> |
perf expr: Add expr.c object Add generic expr code into new expr.c object. The expr.c object will be mainly used in following change that will get rid of the manual flex code, Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lore.kernel.org/lkml/20200228093616.67125-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
267ed5d8 |
|
20-Nov-2019 |
Andi Kleen <ak@linux.intel.com> |
perf affinity: Add infrastructure to save/restore affinity The kernel perf subsystem has to IPI to the target CPU for many operations. On systems with many CPUs and when managing many events the overhead can be dominated by lots of IPIs. An alternative is to set up CPU affinity in the perf tool, then set up all the events for that CPU, and then move on to the next CPU. Add some affinity management infrastructure to enable such a model. Used in followon patches. Committer notes: Use zfree() in some places, add missing stdbool.h header, some minor coding style changes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d9664582 |
|
20-Nov-2019 |
Andi Kleen <ak@linux.intel.com> |
perf pmu: Use file system cache to optimize sysfs access pmu.c does a lot of redundant /sys accesses while parsing aliases and probing for PMUs. On large systems with a lot of PMUs this can get expensive (>2s): % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 27.25 1.227847 8 160888 16976 openat 26.42 1.190481 7 164224 164077 stat Add a cache to remember if specific file names exist or don't exist, which eliminates most of this overhead. Also optimize some stat() calls to be slightly cheaper access() Resulting in: 0.18 0.004166 2 1851 305 open 0.08 0.001970 2 829 622 access Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
60414418 |
|
07-Nov-2019 |
Jin Yao <yao.jin@linux.intel.com> |
perf block: Cleanup and refactor block info functions We have already implemented some block-info related functions. Now it's time to do some cleanup, refactoring and move the functions and structures to new block-info.h/block-info.c. v4: --- Move code for skipping column length calculation to patch: 'perf diff: Don't use hack to skip column length calculation' v3: --- 1. Rename the patch title 2. Rename from block.h/block.c to block-info.h/block-info.c 3. Move more common part to block-info, such as block_info__process_sym. 4. Remove the nasty hack for skipping calculation of column length Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191107074719.26139-3-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cebf7d51 |
|
24-Sep-2019 |
Jin Yao <yao.jin@linux.intel.com> |
perf diff: Report noisy for cycles diff This patch prints the stddev and hist for the cycles diff of program block. It can help us to understand if the cycles is noisy or not. This patch is inspired by Andi Kleen's patch: https://lwn.net/Articles/600471/ We create new option '--cycles-hist'. Example: perf record -b ./div perf record -b ./div perf diff -c cycles # Baseline [Program Block Range] Cycles Diff Shared Object Symbol # ........ .......................................................... .... ................. ............................ # 46.72% [div.c:40 -> div.c:40] 0 div [.] main 46.72% [div.c:42 -> div.c:44] 0 div [.] main 46.72% [div.c:42 -> div.c:39] 0 div [.] main 20.54% [random_r.c:357 -> random_r.c:394] 1 libc-2.27.so [.] __random_r 20.54% [random_r.c:357 -> random_r.c:380] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:391] 0 libc-2.27.so [.] __random_r 17.04% [random.c:288 -> random.c:291] 0 libc-2.27.so [.] __random 17.04% [random.c:291 -> random.c:291] 0 libc-2.27.so [.] __random 17.04% [random.c:293 -> random.c:293] 0 libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:298 -> random.c:298] 0 libc-2.27.so [.] __random 8.40% [div.c:22 -> div.c:25] 0 div [.] compute_flag 8.40% [div.c:27 -> div.c:28] 0 div [.] compute_flag 5.14% [rand.c:26 -> rand.c:27] 0 libc-2.27.so [.] rand 5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand 2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt 0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax 0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap 0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15 0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 [kernel.kallsyms] [k] native_sched_clock 0.00% [native_write_msr+0 -> native_write_msr+16] -13 [kernel.kallsyms] [k] native_write_msr When we enable the option '--cycles-hist', the output is perf diff -c cycles --cycles-hist # Baseline [Program Block Range] Cycles Diff stddev/Hist Shared Object Symbol # ........ .......................................................... .... ................. ................. ............................ # 46.72% [div.c:40 -> div.c:40] 0 ± 37.8% ▁█▁▁██▁█ div [.] main 46.72% [div.c:42 -> div.c:44] 0 ± 49.4% ▁▁▂█▂▂▂▂ div [.] main 46.72% [div.c:42 -> div.c:39] 0 ± 24.1% ▃█▂▄▁▃▂▁ div [.] main 20.54% [random_r.c:357 -> random_r.c:394] 1 ± 33.5% ▅▂▁█▃▁▂▁ libc-2.27.so [.] __random_r 20.54% [random_r.c:357 -> random_r.c:380] 0 ± 39.4% ▁▁█▁██▅▁ libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:391] 0 ± 41.2% ▁▃▁▂█▄▃▁ libc-2.27.so [.] __random_r 17.04% [random.c:288 -> random.c:291] 0 ± 48.8% ▁▁▁▁███▁ libc-2.27.so [.] __random 17.04% [random.c:291 -> random.c:291] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:293 -> random.c:293] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:298 -> random.c:298] 0 ± 75.6% ▃█▁▁▁▁▁▁ libc-2.27.so [.] __random 8.40% [div.c:22 -> div.c:25] 0 ± 42.1% ▁▃▁▁███▁ div [.] compute_flag 8.40% [div.c:27 -> div.c:28] 0 ± 41.8% ██▁▁▄▁▁▄ div [.] compute_flag 5.14% [rand.c:26 -> rand.c:27] 0 ± 37.8% ▁▁▁████▁ libc-2.27.so [.] rand 5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand 2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt 0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax 0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap 0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15 0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 ± 38.5% ▄█▁ [kernel.kallsyms] [k] native_sched_clock 0.00% [native_write_msr+0 -> native_write_msr+16] -13 ± 47.1% ▁█▇▃▁▁ [kernel.kallsyms] [k] native_write_msr v8: --- Rebase to perf/core branch v7: --- 1. v6 got Jiri's ACK. 2. Rebase to latest perf/core branch. v6: --- 1. Jiri provides better code for using data__hpp_register() in ui_init(). Use this code in v6. v5: --- 1. Refine the use of data__hpp_register() in ui_init() according to Jiri's suggestion. v4: --- 1. Rename the new option from '--noisy' to '--cycles-hist' 2. Remove the option '-n'. 3. Only update the spark value and stats when '--cycles-hist' is enabled. 4. Remove the code of printing '..'. v3: --- 1. Move the histogram to a separate column 2. Move the svals[] out of struct stats v2: --- Jiri got a compile error, CC builtin-diff.o builtin-diff.c: In function ‘compute_cycles_diff’: builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value] 712 | labs(pair->block_info->cycles_spark[i] - | ^~~~ Because the result of u64 - u64 is still u64. Now we change the type of cycles_spark[] to s64. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ca125277 |
|
24-Sep-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Introduce evsel_fprintf.h We already had evsel_fprintf.c, add its counterpart, so that we can reduce evsel.h a bit more. We needed a new perf_event_attr_fprintf.c file so as to have a separate object to link with the python binding in tools/perf/util/python-ext-sources and not drag symbol_conf, etc into the python binding. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-06bdmt1062d9unzgqmxwlv88@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
32ff3fec |
|
24-Sep-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf copyfile: Move copyfile routines to separate files Further reducing the util.c hodgepodge files. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0i62zh7ok25znibyebgq0qs4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
055c67ed |
|
18-Sep-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move event synthesizing routines to separate .c file For better grouping, in time we may end up making most of these static, i.e. generalizing the 'perf record' synthesizing code so that based on the target it can do the right thing and call the needed synthesizers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-s9zxxhk40s95pjng9panet16@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4a3cec84 |
|
30-Aug-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf dsos: Move the dsos struct and its methods to separate source files So that we can reduce the header dependency tree further, in the process noticed that lots of places were getting even things like build-id routines and 'struct perf_tool' definition indirectly, so fix all those too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
12500902 |
|
22-Aug-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf cacheline: Move cacheline related routines to separate files To disentangle util/sort.h a bit more. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-6kbf2cauas06rbqp15pyter5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8829e56f |
|
15-Aug-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evswitch: Move switch logic to use in other tools Now other tools that want switching can use an evswitch for that, just set it up and add it to the PERF_RECORD_SAMPLE processing function. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-b1trj1q97qwfv251l66q3noj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c22e150e |
|
07-Aug-2019 |
Igor Lubashev <ilubashe@akamai.com> |
perf tools: Add helpers to use capabilities if present Add utilities to help checking capabilities of the running procss. Make perf link with libcap, if it is available. If no libcap-dev[el], fallback to the geteuid() == 0 test used before. Committer notes: $ perf test python 18: 'import perf' in python : FAILED! $ perf test -v python Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 18: 'import perf' in python : --- start --- test child forked, pid 23288 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.so: undefined symbol: cap_get_flag test child finished with -1 ---- end ---- 'import perf' in python: FAILED! $ This happens because differently from the perf binary generated with this patch applied: $ ldd /tmp/build/perf/perf | grep libcap libcap.so.2 => /lib64/libcap.so.2 (0x00007f724a4ef000) $ The python binding isn't linking with libcap: $ ldd /tmp/build/perf/python/perf.so | grep libcap $ So add 'cap' to the 'extra_libraries' variable in tools/perf/util/setup.py, and rebuild: $ perf test python 18: 'import perf' in python : Ok $ If we explicitely disable libcap it also continues to work: $ make NO_LIBCAP=1 -C tools/perf O=/tmp/build/perf install-bin $ ldd /tmp/build/perf/perf | grep libcap $ ldd /tmp/build/perf/python/perf.so | grep libcap $ perf test python 18: 'import perf' in python : Ok $ Signed-off-by: Igor Lubashev <ilubashe@akamai.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: James Morris <jmorris@namei.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org [ split from a larger patch ] Link: http://lkml.kernel.org/r/8a1e76cf5c7c9796d0d4d240fbaa85305298aafa.1565188228.git.ilubashe@akamai.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4b247fa7 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Adopt xyarray class from perf Move the xyarray class from perf to libperf, because it's going to be used in both. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-58-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
93bce7e5 |
|
21-Jul-2019 |
Jiri Olsa <jolsa@kernel.org> |
libperf: Move zalloc.o into libperf We need it in both perf and libperf, thus moving it to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-45-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4975223b |
|
09-Jul-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Introduce rlimit__bump_memlock() helper Just like the BPF guys did when faced with failures with map creation, etc, i.e. their solution is: tools/testing/selftests/bpf/bpf_rlimit.h For perf use this function in 'perf test' and in 'perf trace'. Make it bump to 4 times the current value, if it fails twice the current value and if it still fails, warn that things like BPF map creation may fail, to help in diagnosing the problem. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-muvqef2i7n6pzqbmu7tn2d2y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
7f7c536f |
|
04-Jul-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools lib: Adopt zalloc()/zfree() from tools/perf Eroding a bit more the tools/perf/util/util.h hodpodge header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9c10548c |
|
26-Jun-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools lib: Move argv_{split,free} from tools/perf/util/ This came from the kernel lib/argv_split.c, so move it to tools/lib/argv_split.c, to get it closer to the kernel structure. We need to audit the usage of argv_split() to figure out if it is really necessary to do have one allocation per argv[] entry, looking at one of its users I guess that is not the case and we probably are even leaking those allocations by not using argv_free() judiciously, for later. With this we further remove stuff from tools/perf/util/, reducing the perf specific codebase and encouraging other tools/ code to use these routines so as to keep the style and constructs used with the kernel. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-j479s1ive9h75w5lfg16jroz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3052ba56 |
|
25-Jun-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools perf: Move from sane_ctype.h obtained from git to the Linux's original We got the sane_ctype.h headers from git and kept using it so far, but since that code originally came from the kernel sources to the git sources, perhaps its better to just use the one in the kernel, so that we can leverage tools/perf/check_headers.sh to be notified when our copy gets out of sync, i.e. when fixes or goodies are added to the code we've copied. This will help with things like tools/lib/string.c where we want to have more things in common with the kernel, such as strim(), skip_spaces(), etc so as to go on removing the things that we have in tools/perf/util/ and instead using the code in the kernel, indirectly and removing things like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements are made to the original code. Hopefully this also should help with reducing the difference of code hosted in tools/ to the one in the kernel proper. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f24c1d75 |
|
18-Mar-2019 |
Alexey Budankov <alexey.budankov@linux.intel.com> |
perf tools: Introduce Zstd streaming based compression API Implemented functions are based on Zstd streaming compression API. The functions are used in runtime to compress data that come from mmaped kernel buffer. zstd_init(), zstd_fini() are used for initialization and finalization to allocate and deallocate internal zstd objects. zstd_compress_stream_to_records() is used to convert parts of mmaped kernel buffer into an array of PERF_RECORD_COMPRESSED records. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/18bf36f3-b85a-1fe2-dd83-10e0c6069568@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d19f8564 |
|
19-Feb-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf bpf: Add bpf_map dumper At some point I'll suggest moving this to libbpf, for now I'll experiment with ways to dump BPF maps set by events in 'perf trace', starting with a very basic dumper for the current very limited needs of the augmented_raw_syscalls code: dumping booleans. Having functions that apply to the map keys and values and do table lookup in things like syscall id to string tables should come next. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-lz14w0esqyt1333aon05jpwc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5135d5ef |
|
19-Feb-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add cpu_topology object Make struct cpu_topo global and rename it to 'struct cpu_topology', so that it can be used from the 'perf record' command in the following patches. Add the following interface functions to load/free cpu topology details: struct cpu_topology *cpu_topology__new(void); void cpu_topology__delete(struct cpu_topology *tp); Move it to a separate source file cputopo.c together with numa related object in the following patches. No functional change, the new interface will be used in upcoming changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190219095815.15931-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5ff32883 |
|
13-Feb-2019 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Rename build libperf to perf Rename build libperf to perf, because it's used to build perf. The libperf build object name will be used for libperf library. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190213123246.4015-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
159b0da5 |
|
31-Jan-2019 |
Mathieu Poirier <mathieu.poirier@linaro.org> |
perf pmu: Remove set_drv_config API CoreSight was the only client of the PMU's set_drv_config() API. Now that it is no longer needed by CoreSight remove it from the code base. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/20190131184714.20388-8-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
32e9136e |
|
21-Jan-2019 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf utils: Move perf_config using routines from color.c to separate object To untangle objects a bit more, avoiding rebuilding the color_fprintf routines when changes are made to the perf config headers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lkml.kernel.org/n/tip-8qvu2ek26antm3a8jyl4ocbq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
45178a92 |
|
17-Jan-2019 |
Song Liu <songliubraving@fb.com> |
perf tools: Handle PERF_RECORD_BPF_EVENT This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to turn it on. Committer notes: Add dummy machine__process_bpf_event() variant that returns zero for systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking the build in such systems. Remove the needless include <machine.h> from bpf->event.h, provide just forward declarations for the structs and unions in the parameters, to reduce compilation time and needless rebuilds when machine.h gets changed. Committer testing: When running with: # perf record --bpf-event On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL is not present, we fallback to removing those two bits from perf_event_attr, making the tool to continue to work on older kernels: perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off bpf_event ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 ------------------------------------------------------------ sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off ksymbol ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ------------------------------------------------------------ And then proceeds to work without those two features. As passing --bpf-event is an explicit action performed by the user, perhaps we should emit a warning telling that the kernel has no such feature, but this can be done on top of this patch. Now with a kernel that supports these events, start the 'record --bpf-event -a' and then run 'perf trace sleep 10000' that will use the BPF augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus should generate PERF_RECORD_BPF_EVENT events: [root@quaco ~]# perf record -e dummy -a --bpf-event ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.713 MB perf.data ] [root@quaco ~]# bpftool prog 13: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 14: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 15: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 16: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 17: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 18: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 21: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 22: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 31: tracepoint name sys_enter tag 12504ba9402f952f gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 512B jited 374B memlock 4096B map_ids 30,29,28 32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl loaded_at 2019-01-19T09:19:56-0300 uid 0 xlated 256B jited 191B memlock 4096B map_ids 30,29 # perf report -D | grep PERF_RECORD_BPF_EVENT | nl 1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13 2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14 3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15 4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16 5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17 6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18 7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21 8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22 9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29 10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29 11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30 12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30 13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31 14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32 # There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
93115d32 |
|
17-Jan-2019 |
Thomas Richter <tmricht@linux.ibm.com> |
perf report: Display arch specific diagnostic counter sets, starting with s390 On s390 the event bc000 (also named CF_DIAG) extracts the CPU Measurement Facility diagnostic counter sets and displays them as counter number and counter value pairs sorted by counter set number. Output: [root@s35lp76 perf]# ./perf report -D --stdio [00000000] Counterset:0 Counters:6 Counter:000 Value:0x000000000085ec36 Counter:001 Value:0x0000000000796c94 Counter:002 Value:0x0000000000005ada Counter:003 Value:0x0000000000092460 Counter:004 Value:0x0000000000006073 Counter:005 Value:0x00000000001a9a73 [0x000038] Counterset:1 Counters:2 Counter:000 Value:0x000000000007c59f Counter:001 Value:0x000000000002fad6 [0x000050] Counterset:2 Counters:16 Counter:000 Value:000000000000000000 Counter:001 Value:000000000000000000 Counter:002 Value:000000000000000000 Counter:003 Value:000000000000000000 Counter:004 Value:000000000000000000 Counter:005 Value:000000000000000000 Counter:006 Value:000000000000000000 Counter:007 Value:000000000000000000 Counter:008 Value:000000000000000000 Counter:009 Value:000000000000000000 Counter:010 Value:000000000000000000 Counter:011 Value:000000000000000000 Counter:012 Value:000000000000000000 Counter:013 Value:000000000000000000 Counter:014 Value:000000000000000000 Counter:015 Value:000000000000000000 [0x0000d8] Counterset:3 Counters:128 Counter:000 Value:0x000000000000020f Counter:001 Value:0x00000000000001d8 Counter:002 Value:0x000000000000d7fa Counter:003 Value:0x000000000000008b ... The number in brackets is the offset into the raw data field of the sample. New functions trace_event_sample_raw__init() and s390_sample_raw() are introduced in the code path to enable interpretation on non s390 platforms. This event bc000 attached raw data is generated only on s390 platform. Correct display on other platforms requires correct endianness handling. Committer notes: Added a init function that sets up a evlist function pointer to avoid repeated tests on evlist->env and calls to perf_env__name() that involves normalizing, etc, for each PERF_RECORD_SAMPLE. Removed needless __maybe_unused from the trace_event_raw() prototype in session.h, move it to be an static function in evlist. The 'offset' variable is a size_t, not an u64, fix it to avoid this on some arches: CC /tmp/build/perf/util/s390-sample-raw.o util/s390-sample-raw.c: In function 's390_cpumcfdg_testctr': util/s390-sample-raw.c:77:4: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'size_t' [-Werror=format=] pr_err("Invalid counter set entry at %#" PRIx64 "\n", ^ cc1: all warnings being treated as errors Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Link: https://lkml.kernel.org/r/9c856ac0-ef23-72b5-901d-a1f815508976@linux.ibm.com Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Link: https://lkml.kernel.org/n/tip-s3jhif06et9ug78qhclw41z1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
dd2e18e9 |
|
03-Dec-2018 |
Andi Kleen <ak@linux.intel.com> |
perf tools: Support 'srccode' output When looking at PT or brstackinsn traces with 'perf script' it can be very useful to see the source code. This adds a simple facility to print them with 'perf script', if the information is available through dwarf % perf record ... % perf script -F insn,ip,sym,srccode ... 4004c6 main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004c6 main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004b3 main 6 v++; % perf record -b ... % perf script -F insn,ip,sym,srccode,brstackinsn ... main+22: 0000000000400543 insn: e8 ca ff ff ff # PRED |18 f1(); f1: 0000000000400512 insn: 55 |10 { 0000000000400513 insn: 48 89 e5 0000000000400516 insn: b8 00 00 00 00 |11 f2(); 000000000040051b insn: e8 d6 ff ff ff # PRED f2: 00000000004004f6 insn: 55 |5 { 00000000004004f7 insn: 48 89 e5 00000000004004fa insn: 8b 05 2c 0b 20 00 |6 c = a / b; 0000000000400500 insn: 8b 0d 2a 0b 20 00 0000000000400506 insn: 99 0000000000400507 insn: f7 f9 0000000000400509 insn: 89 05 29 0b 20 00 000000000040050f insn: 90 |7 } 0000000000400510 insn: 5d 0000000000400511 insn: c3 # PRED f1+14: 0000000000400520 insn: b8 00 00 00 00 |12 f2(); 0000000000400525 insn: e8 cc ff ff ff # PRED f2: 00000000004004f6 insn: 55 |5 { 00000000004004f7 insn: 48 89 e5 00000000004004fa insn: 8b 05 2c 0b 20 00 |6 c = a / b; Not supported for callchains currently, would need some layout changes there. Committer notes: Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this warning: In file included from util/srccode.c:19:0: /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp] #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> ^~~~~~~ cc1: all warnings being treated as errors Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8feb8efe |
|
19-Nov-2018 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools build feature: Check if get_current_dir_name() is available As the namespace support code will use this, which is not available in some non _GNU_SOURCE libraries such as Android's bionic used in my container build tests (r12b and r15c at the moment). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-x56ypm940pwclwu45d7jfj47@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
088519f3 |
|
30-Aug-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf stat: Move the display functions to stat-display.c Move perf_evlist__print_counters() with all its dependency functions to the stat-display.c object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-44-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b96e6615 |
|
02-Aug-2018 |
Thomas Richter <tmricht@linux.ibm.com> |
perf auxtrace: Support for perf report -D for s390 Add initial support for s390 auxiliary traces using the CPU-Measurement Sampling Facility. Support and ignore PERF_REPORT_AUXTRACE_INFO records in the perf data file. Later patches will show the contents of the auxiliary traces. Setup the auxtrace queues and data structures for s390. A raw dump of the perf.data file now does not show an error when an auxtrace event is encountered. Output before: [root@s35lp76 perf]# ./perf report -D -i perf.data.auxtrace 0x128 [0x10]: failed to process type: 70 Error: failed to process sample 0x128 [0x10]: event: 70 . . ... raw event: size 16 bytes . 0000: 00 00 00 46 00 00 00 10 00 00 00 00 00 00 00 00 ...F............ 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 0 [root@s35lp76 perf]# Output after: # ./perf report -D -i perf.data.auxtrace |fgrep PERF_RECORD_AUXTRACE 0 0 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 5 0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000 offset: 0 ref: 0 idx: 4 tid: -1 cpu: 4 .... Additional notes about the underlying hardware and software implementation, provided by Hendrik Brueckner (see Link: below). ============================================================================= The CPU-Measurement Facility (CPU-MF) provides a set of functions to obtain performance information on the mainframe. Basically, it was introduced with System z10 years ago for the z/Architecture, that means, 64-bit. For Linux, there are two facilities of interest, counter facility and sampling facility. The counter facility provides hardware counters for instructions, cycles, crypto-activities, and many more. The sampling facility is a hardware sampler that when started will write samples at a particular interval into a sampling buffer. At some point, for example, if a sample block is full, it generates an interrupt to collect samples (while the sampler continues to run). Few years ago, I started to provide the a perf PMU to use the counter and sampling facilities. Recently, the device driver was updated to also "export" the sampling buffer into the AUX area. Thomas now completed the related perf work to interpret and process these AUX data. If people are more interested in the sampling facility, they can have a look into: - The Load-Program-Parameter and the CPU-Measurement Facilities, SA23-2260-05 http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a and to learn how-to use it for Linux on Z, have look at chapter 54, "Using the CPU-measurement facilities" in the: - Device Drivers, Features, and Commands, SC33-8411-34 http://public.dhe.ibm.com/software/dw/linux390/docu/l416dd34.pdf ============================================================================= Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Link: http://lkml.kernel.org/r/20180803100758.GA28475@linux.ibm.com Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180802074622.13641-2-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4f5aeecd |
|
24-May-2018 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Remove dead quote.[ch] code In c68677014bac ("perf tools: Remove support for command aliases") we removed the only remaining use of a function provided by these files, so ditch it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mgnzqbi46gucs48d7bzfwr55@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1b16fffa |
|
04-May-2018 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf llvm-utils: Add bpf include path to clang command line We'll start putting headers for helpers to be used in eBPF proggies in there: # perf trace -v --no-syscalls -e empty.c |& grep "llvm compiling command : " llvm compiling command : /usr/lib64/ccache/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41100 -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.17.0-rc3-00034-gf4ef6a438cee/build -c /home/acme/bpf/empty.c -target bpf -O2 -o - # Notice the "-I/home/acme/lib/include/perf/bpf" Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-6xq94xro8xlb5s9urznh3f9k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4acf6142 |
|
09-Mar-2018 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add mem2node object Adding mem2node object to allow the easy lookup of the node for the physical address. It has following interface: int mem2node__init(struct mem2node *map, struct perf_env *env); void mem2node__exit(struct mem2node *map); int mem2node__node(struct mem2node *map, u64 addr); The mem2node__toolsinit initialize object from the perf data file MEM_TOPOLOGY feature data. Following calls to mem2node__node will return node number for given physical address. The mem2node__exit function frees the object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
68ffe390 |
|
17-Jan-2018 |
Mathieu Poirier <mathieu.poirier@linaro.org> |
perf tools: Add decoder mechanic to support dumping trace data This patch adds the required interface to the openCSD library to support dumping CoreSight trace packet using the "report --dump" command. The information conveyed is related to the type of packets gathered by a trace session rather than full decoding. Co-authored-by: Tor Jeremiassen <tor@ti.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-5-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
440a23b3 |
|
17-Jan-2018 |
Mathieu Poirier <mathieu.poirier@linaro.org> |
perf tools: Add initial entry point for decoder CoreSight traces This patch adds the entry point for CoreSight trace decoding, serving as a jumping board for furhter expansions. Co-authored-by: Tor Jeremiassen <tor@ti.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1516211539-5166-3-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b3fa3896 |
|
19-Jan-2018 |
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> |
perf trace: Remove audit-libs dependency if syscall tables are present Change the Makefile and build process to no longer require audit-libs interfaces when the architecture provides system call tables. Committer notes: Its not enough to hook into the NO_LIBAUDIT makefile block, we need to define a CONFIG_TRACE that gets selected by both architectures generating the syscall tables from the kernel headers and from detecting the availability of libaudit. With that in place we will not link against libaudit even if the necessary files are available for that, in fact we will not even try to detect its availability, speeding up a bit the feature detection phase. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: linux-s390@vger.kernel.org LPU-Reference: 1516352177-11106-6-git-send-email-brueckner@linux.vnet.ibm.com Link: https://lkml.kernel.org/n/tip-j68lub6ipm8apvy52vd3l4cm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ffd3d18c |
|
14-Jan-2018 |
Kim Phillips <kim.phillips@arm.com> |
perf tools: Add ARM Statistical Profiling Extensions (SPE) support 'perf record' and 'perf report --dump-raw-trace' supported in this release. Example usage: # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000 # perf report --dump-raw-trace Note that the perf.data file is portable, so the report can be run on another architecture host if necessary. Output will contain raw SPE data and its textual representation, such as: 0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1 . . ... ARM SPE data: size 2097152 bytes . 00000000: 49 00 LD . 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0 . 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1 . 00000014: 9a 00 00 LAT 0 XLAT . 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1 . 00000022: 98 00 00 LAT 0 TOT . 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000002e: 49 00 LD . 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80 . 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1 . 00000042: 9a 00 00 LAT 0 XLAT . 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1 . 00000050: 98 00 00 LAT 0 TOT . 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000005c: 48 00 INSN-OTHER . 0000005e: 42 02 EV RETIRED . 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1 . 00000069: 98 00 00 LAT 0 TOT . 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481 ... Other release notes: - applies to acme's perf/{core,urgent} branches, likely elsewhere - Report is self-contained within the tool. Record requires enabling the kernel SPE driver by setting CONFIG_ARM_SPE_PMU. - The intel-bts implementation was used as a starting point; its min/default/max buffer sizes and power of 2 pages granularity need to be revisited for ARM SPE - Recording across multiple SPE clusters/domains not supported - Snapshot support (record -S), and conversion to native perf events (e.g., via 'perf inject --itrace'), are also not supported - Technically both cs-etm and spe can be used simultaneously, however disabled for simplicity in this release Signed-off-by: Kim Phillips <kim.phillips@arm.com> Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
16958497 |
|
06-Oct-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf mmap: Move perf_mmap and methods to separate mmap.[ch] files To better organize the sources, and we may end up even using it directly, without evlists and evsels. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-oiqrm7grflurnnzo2ovfnslg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0a7c74ea |
|
04-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Provide mutex wrappers for pthreads rwlocks Andi reported a performance drop in single threaded perf tools such as 'perf script' due to the growing number of locks being put in place to allow for multithreaded tools, so wrap the POSIX threads rwlock routines with the names used for such kinds of locks in the Linux kernel and then allow for tools to ask for those locks to be used or not. I.e. a tool may have a multithreaded phase and then switch to single threaded, like the upcoming patches for the synthesizing of PERF_RECORD_{FORK,MMAP,etc} for pre-existing processes to then switch to single threaded mode in 'perf top'. The init routines will not be conditional, this way starting as single threaded to then move to multi threaded mode should be possible. Reported-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170404161739.GH12903@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b18f3e36 |
|
31-Aug-2017 |
Andi Kleen <ak@linux.intel.com> |
perf stat: Support JSON metrics in perf stat Add generic support for standalone metrics specified in JSON files to perf stat. A metric is a formula that uses multiple events to compute a higher level result (e.g. IPC). Previously metrics were always tied to an event and automatically enabled with that event. But now change it that we can have standalone metrics. They are in the same JSON data structure as events, but don't have an event name. We also allow to organize the metrics in metric groups, which allows a short cut to select several related metrics at once. Add a new -M / --metrics option to perf stat that adds the metrics or metric groups specified. Add the core code to manage and parse the metric groups. They are collected from the JSON data structures into a separate rblist. When computing shadow values look for metrics in that list. Then they are computed using the existing saved values infrastructure in stat-shadow.c The actual JSON metrics are in a separate pull request. % perf stat -M Summary --metric-only -a sleep 1 Performance counter stats for 'system wide': Instructions CLKS CPU_Utilization GFLOPs SMT_2T_Utilization Kernel_Utilization 317614222.0 1392930775.0 0.0 0.0 0.2 0.1 1.001497549 seconds time elapsed % perf stat -M GFLOPs flops Performance counter stats for 'flops': 3,999,541,471 fp_comp_ops_exe.sse_scalar_single # 1.2 GFLOPs (66.65%) 14 fp_comp_ops_exe.sse_scalar_double (66.65%) 0 fp_comp_ops_exe.sse_packed_double (66.67%) 0 fp_comp_ops_exe.sse_packed_single (66.70%) 0 simd_fp_256.packed_double (66.70%) 0 simd_fp_256.packed_single (66.67%) 0 duration_time 3.238372845 seconds time elapsed v2: Add missing header file v3: Move find_map to pmu.c Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
de5077c4 |
|
11-Aug-2017 |
Andi Kleen <ak@linux.intel.com> |
perf tools: Add utility function to detect SMT status Add an smt_on() function to return if SMT is enabled or disabled. Used in the next patch. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170811232634.30465-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
992c7e92 |
|
18-Jul-2017 |
Jin Yao <yao.jin@linux.intel.com> |
perf util: Create branch.c/.h for common branch functions Create new util/branch.c and util/branch.h to contain the common branch functions. Such as: branch_type_count(): Count the numbers of branch types branch_type_name() : Return the name of branch type branch_type_stat_display(): Display branch type statistics info branch_type_str(): Construct the branch type string. The branch type is saved in branch_flags. Change log: v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN. v7: Since the common branch type name is changed (e.g. JCC->COND), this patch is performed the modification accordingly. v6: Move that multiline conditional code inside {} brackets. Move branch_type_stat_display() from builtin-report.c to branch.c. Move branch_type_str() from callchain.c to branch.c. v5: It's a new patch in v5 patch series. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com [ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
86bcdb5a |
|
18-Jul-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools build: Add test for setns() And provide an alternative implementation to keep perf building on older distros as we're about to add initial support for namespaces. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-bqdwijunhjlvps1ardykhw1i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
98521b38 |
|
25-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf memswap: Split the byteswap memory range wrappers from util.[ch] Just one more step into splitting util.[ch] to reduce the includes hell. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-navarr9mijkgwgbzu464dwam@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
58db1d6e |
|
19-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move units conversion/formatting routines to separate object Out of util.h, to disentangle it a bit more. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vpksyj3w5fk9t8s6mxmkajyr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fea01392 |
|
17-Apr-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Move print_binary definitions to separate files Continuing the split of util.[ch] into more manageable bits. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5eu367rwcwnvvn7fz09l7xpb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c6867701 |
|
28-Mar-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Remove support for command aliases This came from 'git', but isn't documented anywhere in tools/perf/Documentation/, looks like baggage we can do without, ditch it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-e7uwkn60t4hmlnwj99ba4t2s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
07516736 |
|
20-Mar-2017 |
Andi Kleen <ak@linux.intel.com> |
perf tools: Add a simple expression parser for JSON Add a simple expression parser good enough to parse JSON relation expressions. The parser is implemented using bison. This is just intended as an simple parser for internal usage in the event lists, not the beginning of a "perf scripting language" v2: Use expr__ prefix instead of expr_ Support multiple free variables for parser Committer note: The v2 patch had: %define api.pure full In expr.y, that is a feature introduced in bison 2.7, to have reentrant parsers, not using global variables, which would make tools/perf stop building with the bison version shipped in older distros, so Andi realised that the other parsers (e.g. parse-events.y) were using: %pure-parser Which is present in older versions of bison and fits the bill. I added: CFLAGS_expr-bison.o += -DYYENABLE_NLS=0 -DYYLTYPE_IS_TRIVIAL=0 -w To finally make it build, copying what was there for pmu-bison.o, another parser. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170320201711.14142-8-andi@firstfloor.org [ stdlib.h is needed in tests/expr.c for free() fixing build in systems such as ubuntu:16.04-x-s390 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
48d02a1d |
|
23-Feb-2017 |
Andi Kleen <ak@linux.intel.com> |
perf script: Add 'brstackinsn' for branch stacks Implement printing instruction sequences as hex dump for branch stacks. This relies on the x86 instruction decoder used by the PT decoder to find the lengths of instructions to dump them individually. This is good enough for pattern matching. This allows to study hot paths for individual samples, together with branch misprediction and cycle count / IPC information if available (on Skylake systems). % perf record -b ... % perf script -F brstackinsn ... read_hpet+67: ffffffff9905b843 insn: 74 ea # PRED ffffffff9905b82f insn: 85 c9 ffffffff9905b831 insn: 74 12 ffffffff9905b833 insn: f3 90 ffffffff9905b835 insn: 48 8b 0f ffffffff9905b838 insn: 48 89 ca ffffffff9905b83b insn: 48 c1 ea 20 ffffffff9905b83f insn: 39 f2 ffffffff9905b841 insn: 89 d0 ffffffff9905b843 insn: 74 ea # PRED Only works when no special branch filters are specified. Occasionally the path does not reach up to the sample IP, as the LBRs may be frozen before executing a final jump. In this case we print a special message. The instruction dumper piggy backs on the existing infrastructure from the IP PT decoder. An earlier iteration of this patch relied on a disassembler, but this version only uses the existing instruction decoder. Committer note: Added hint about how to get suitable perf.data files for use with '-F brstackinsm': $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ] $ $ perf script -F brstackinsn Display of branch stack assembler requested, but non all-branch filter set Hint: run 'perf record -b ...' $ Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f3b3614a |
|
07-Mar-2017 |
Hari Bathini <hbathini@linux.vnet.ibm.com> |
perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info Introduce a new option to record PERF_RECORD_NAMESPACES events emitted by the kernel when fork, clone, setns or unshare are invoked. And update perf-record documentation with the new option to record namespace events. Committer notes: Combined it with a later patch to allow printing it via 'perf report -D' and be able to test the feature introduced in this patch. Had to move here also perf_ns__name(), that was introduced in another later patch. Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt: util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=] ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx ^ Testing it: # perf record --namespaces -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ] # # perf report -D <SNIP> 3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7 [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc, 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb] 0x1151e0 [0x30]: event: 9 . . ... raw event: size 48 bytes . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h.... . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c.... . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................ <SNIP> NAMESPACES events: 1 <SNIP> # Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d25ed5d9 |
|
16-Jan-2017 |
Soramichi AKIYAMA <akiyama@m.soramichi.jp> |
perf tools: Move two variables usied in libperf from perf.c The use_browser and perf_version_string variables are both declared in perf.c but they are also referenced by other functions of libperf.a. Therefore a user linking an own main() with libperf.a must declare those two variables in their files even if the files never use the browser or the version information. This patch fixes this issue by moving use_browser and perf_version_string out of perf.c to some other files. Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
00b86691 |
|
26-Nov-2016 |
Wang Nan <wangnan0@huawei.com> |
perf clang: Add builtin clang support ant test case Add basic clang support in clang.cpp and test__clang() testcase. The first testcase checks if builtin clang is able to generate LLVM IR. tests/clang.c is a proxy. Real testcase resides in utils/c++/clang-test.cpp in c++ and exports C interface to perf test subsystem. Test result: $ perf test -v clang 51: builtin clang support : 51.1: Test builtin clang compile C source to IR : --- start --- test child forked, pid 13215 test child finished with 0 ---- end ---- Test builtin clang support subtest 0: Ok Committer note: Make sure you've enabled CLANG and LLVM builtin support by setting the LIBCLANGLLVM variable on the make command line, e.g.: make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin Otherwise you'll get this when trying to do the 'perf test' call above: # perf test clang 51: builtin clang support : Skip (not compiled in) # Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-11-wangnan0@huawei.com [ Removed "Test" from descriptions, redundant and already removed from all the other entries ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fdf9dc4b |
|
29-Nov-2016 |
David Ahern <dsa@cumulusnetworks.com> |
perf tools: Add time-based utility functions Add function to parse a user time string of the form <start>,<stop> where start and stop are time in sec.nsec format. Both start and stop times are optional. Add function to determine if a sample time is within a given time time window of interest. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1480439746-42695-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a074865e |
|
26-Nov-2016 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Introduce perf hooks Perf hooks allow hooking user code at perf events. They can be used for manipulation of BPF maps, taking snapshot and reporting results. In this patch two perf hook points are introduced: record_start and record_end. To avoid buggy user actions, a SIGSEGV signal handler is introduced into 'perf record'. It turns off perf hook if it causes a segfault and report an error to help debugging. A test case for perf hook is introduced. Test result: $ ./buildperf/perf test -v hook 50: Test perf hooks : --- start --- test child forked, pid 10311 SIGSEGV is observed as expected, try to recover. Fatal error (SEGFAULT) in perf hook 'test' test child finished with 0 ---- end ---- Test perf hooks: Ok Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-5-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
621cb4e7 |
|
13-Oct-2016 |
Maciej Debski <maciejd@google.com> |
perf jit: Enable jitdump support without dwarf This patch modifies the build dependencies on the jitdump support in perf. As it stands jitdump was wrongfully made dependent 100% on using DWARF. However, the dwarf dependency, only exist if generating the source line table in genelf_debug.c. The rest of the support does not need DWARF. This patch removes the dependency on DWARF for the entire jitdump support. It keeps it only for the genelf_debug.c support. Signed-off-by: Maciej Debski <maciejd@google.com> Reviewed-by: Stephane Eranian <eranian@google.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1476356383-30100-3-git-send-email-eranian@google.com Fixes: e12b202f8fb9 ("perf jitdump: Build only on supported archs") [ Make it build only if NO_LIBELF isn't defined, as jitdump.o will only be built in that case ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
859442bd |
|
16-Sep-2016 |
Mathieu Poirier <mathieu.poirier@linaro.org> |
perf pmu: Push configuration down to PMU driver This patch adds a PMU callback and the required mechanic so that drivers can process the command line configuration elements found in evsel::config_terms. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1474041004-13956-6-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
70fbe057 |
|
05-Sep-2016 |
Peter Zijlstra <peterz@infradead.org> |
perf annotate: Add branch stack / basic block I wanted to know the hottest path through a function and figured the branch-stack (LBR) information should be able to help out with that. The below uses the branch-stack to create basic blocks and generate statistics from them. from to branch_i * ----> * | | block v * ----> * from to branch_i+1 The blocks are broken down into non-overlapping ranges, while tracking if the start of each range is an entry point and/or the end of a range is a branch. Each block iterates all ranges it covers (while splitting where required to exactly match the block) and increments the 'coverage' count. For the range including the branch we increment the taken counter, as well as the pred counter if flags.predicted. Using these number we can find if an instruction: - had coverage; given by: br->coverage / br->sym->max_coverage This metric ensures each symbol has a 100% spot, which reflects the observation that each symbol must have a most covered/hottest block. - is a branch target: br->is_target && br->start == add - for targets, how much of a branch's coverages comes from it: target->entry / branch->coverage - is a branch: br->is_branch && br->end == addr - for branches, how often it was taken: br->taken / br->coverage after all, all execution that didn't take the branch would have incremented the coverage and continued onward to a later branch. - for branches, how often it was predicted: br->pred / br->taken The coverage percentage is used to color the address and asm sections; for low (<1%) coverage we use NORMAL (uncolored), indicating that these instructions are not 'important'. For high coverage (>75%) we color the address RED. For each branch, we add an asm comment after the instruction with information on how often it was taken and predicted. Output looks like (sans color, which does loose a lot of the information :/) $ perf record --branch-filter u,any -e cycles:p ./branches 27 $ perf annotate branches Percent | Source code & Disassembly of branches for cycles:pu (217 samples) --------------------------------------------------------------------------------- : branches(): 0.00 : 40057a: push %rbp 0.00 : 40057b: mov %rsp,%rbp 0.00 : 40057e: sub $0x20,%rsp 0.00 : 400582: mov %rdi,-0x18(%rbp) 0.00 : 400586: mov %rsi,-0x20(%rbp) 0.00 : 40058a: mov -0x18(%rbp),%rax 0.00 : 40058e: mov %rax,-0x10(%rbp) 0.00 : 400592: movq $0x0,-0x8(%rbp) 0.00 : 40059a: jmpq 400656 <branches+0xdc> 1.84 : 40059f: mov -0x10(%rbp),%rax # +100.00% 3.23 : 4005a3: and $0x1,%eax 1.84 : 4005a6: test %rax,%rax 0.00 : 4005a9: je 4005bf <branches+0x45> # -54.50% (p:42.00%) 0.46 : 4005ab: mov 0x200bbe(%rip),%rax # 601170 <acc> 12.90 : 4005b2: add $0x1,%rax 2.30 : 4005b6: mov %rax,0x200bb3(%rip) # 601170 <acc> 0.46 : 4005bd: jmp 4005d1 <branches+0x57> # -100.00% (p:100.00%) 0.92 : 4005bf: mov 0x200baa(%rip),%rax # 601170 <acc> # +49.54% 13.82 : 4005c6: sub $0x1,%rax 0.46 : 4005ca: mov %rax,0x200b9f(%rip) # 601170 <acc> 2.30 : 4005d1: mov -0x10(%rbp),%rax # +50.46% 0.46 : 4005d5: mov %rax,%rdi 0.46 : 4005d8: callq 400526 <lfsr> # -100.00% (p:100.00%) 0.00 : 4005dd: mov %rax,-0x10(%rbp) # +100.00% 0.92 : 4005e1: mov -0x18(%rbp),%rax 0.00 : 4005e5: and $0x1,%eax 0.00 : 4005e8: test %rax,%rax 0.00 : 4005eb: je 4005ff <branches+0x85> # -100.00% (p:100.00%) 0.00 : 4005ed: mov 0x200b7c(%rip),%rax # 601170 <acc> 0.00 : 4005f4: shr $0x2,%rax 0.00 : 4005f8: mov %rax,0x200b71(%rip) # 601170 <acc> 0.00 : 4005ff: mov -0x10(%rbp),%rax # +100.00% 7.37 : 400603: and $0x1,%eax 3.69 : 400606: test %rax,%rax 0.00 : 400609: jne 400612 <branches+0x98> # -59.25% (p:42.99%) 1.84 : 40060b: mov $0x1,%eax 14.29 : 400610: jmp 400617 <branches+0x9d> # -100.00% (p:100.00%) 1.38 : 400612: mov $0x0,%eax # +57.65% 10.14 : 400617: test %al,%al # +42.35% 0.00 : 400619: je 40062f <branches+0xb5> # -57.65% (p:100.00%) 0.46 : 40061b: mov 0x200b4e(%rip),%rax # 601170 <acc> 2.76 : 400622: sub $0x1,%rax 0.00 : 400626: mov %rax,0x200b43(%rip) # 601170 <acc> 0.46 : 40062d: jmp 400641 <branches+0xc7> # -100.00% (p:100.00%) 0.92 : 40062f: mov 0x200b3a(%rip),%rax # 601170 <acc> # +56.13% 2.30 : 400636: add $0x1,%rax 0.92 : 40063a: mov %rax,0x200b2f(%rip) # 601170 <acc> 0.92 : 400641: mov -0x10(%rbp),%rax # +43.87% 2.30 : 400645: mov %rax,%rdi 0.00 : 400648: callq 400526 <lfsr> # -100.00% (p:100.00%) 0.00 : 40064d: mov %rax,-0x10(%rbp) # +100.00% 1.84 : 400651: addq $0x1,-0x8(%rbp) 0.92 : 400656: mov -0x8(%rbp),%rax 5.07 : 40065a: cmp -0x20(%rbp),%rax 0.00 : 40065e: jb 40059f <branches+0x25> # -100.00% (p:100.00%) 0.00 : 400664: nop 0.00 : 400665: leaveq 0.00 : 400666: retq (Note: the --branch-filter u,any was used to avoid spurious target and branch points due to interrupts/faults, they show up as very small -/+ annotations on 'weird' locations) Committer note: Please take a look at: http://vger.kernel.org/~acme/perf/annotate_basic_blocks.png To see the colors. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> [ Moved sym->max_coverage to 'struct annotate', aka symbol__annotate(sym) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
293d5b43 |
|
25-Aug-2016 |
Masami Hiramatsu <mhiramat@kernel.org> |
perf probe: Support probing on offline cross-arch binary Support probing on offline cross-architecture binary by adding getting the target machine arch from ELF and choose correct register string for the machine. Here is an example: ----- $ perf probe --vmlinux=./vmlinux-arm --definition 'do_sys_open $params' p:probe/do_sys_open do_sys_open+0 dfd=%r5:s32 filename=%r1:u32 flags=%r6:s32 mode=%r3:u16 ----- Here, we can get probe/do_sys_open from above and append it to to the target machine's tracing/kprobe_events file in the tracefs mountput, usually /sys/kernel/debug/tracing/kprobe_events (or /sys/kernel/tracing/kprobe_events). Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/147214229717.23638.6440579792548044658.stgit@devbox [ Add definition for EM_AARCH64 to fix the build on at least centos 6, debian 7 & ubuntu 12.04.5 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8149a774 |
|
27-Jul-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools lib api: Add str_error_c to libapi Because it uses that function, which would lead every tool using it to need to link against tools/lib/str_error_r.o. This fixes building tools/vm/, that links with libapi. Reported-by: Arjan van de Ven <arjan@linux.intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: b31e3e3316a7 ("tools lib api fs: Use str_error_r()") Link: http://lkml.kernel.org/n/tip-aedt3qzibhnhaov2j4caqi61@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
cae15db7 |
|
09-Jul-2016 |
David Tolnay <dtolnay@gmail.com> |
perf symbols: Add Rust demangling Rust demangling is another step after bfd demangling. Add a diagnosis to identify mangled Rust symbols based on the hash that the Rust mangler appends as the last path component, as well as other characteristics. Add a demangler to reconstruct the original symbol. Committer notes: How I tested it: Enabled COPR on Fedora 24 and then installed the 'rust-binary' package, with it: $ cat src/main.rs fn main() { println!("Hello, world!"); } $ cat Cargo.toml [package] name = "hello_world" version = "0.0.1" authors = [ "Arnaldo Carvalho de Melo <acme@kernel.org>" ] $ perf record cargo bench Compiling hello_world v0.0.1 (file:///home/acme/projects/hello_world) Running target/release/hello_world-d4b9dab4b2a47d75 running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.096 MB perf.data (1457 samples) ] $ Before this patch: $ perf report --stdio --dsos librbml-e8edd0fd.so # dso: librbml-e8edd0fd.so # # Total Lost Samples: 0 # # Samples: 1K of event 'cycles:u' # Event count (approx.): 979599126 # # Overhead Command Symbol # ........ ....... ............................................................................................................. # 1.78% rustc [.] rbml::reader::maybe_get_doc::hb9d387df6024b15b 1.50% rustc [.] _$LT$reader..DocsIterator$LT$$u27$a$GT$$u20$as$u20$std..iter..Iterator$GT$::next::hd9af9e60d79a35c8 1.20% rustc [.] rbml::reader::doc_at::hc88107fba445af31 0.46% rustc [.] _$LT$reader..TaggedDocsIterator$LT$$u27$a$GT$$u20$as$u20$std..iter..Iterator$GT$::next::h0cb40e696e4bb489 0.35% rustc [.] rbml::reader::Decoder::_next_int::h66eef7825a398bc3 0.29% rustc [.] rbml::reader::Decoder::_next_sub::h8e5266005580b836 0.15% rustc [.] rbml::reader::get_doc::h094521c645459139 0.14% rustc [.] _$LT$reader..Decoder$LT$$u27$doc$GT$$u20$as$u20$serialize..Decoder$GT$::read_u32::h0acea2fff9669327 0.07% rustc [.] rbml::reader::Decoder::next_doc::h6714d469c9dfaf91 0.07% rustc [.] _ZN4rbml6reader10doc_as_u6417h930b740aa94f1d3aE@plt 0.06% rustc [.] _fini $ After: $ perf report --stdio --dsos librbml-e8edd0fd.so # dso: librbml-e8edd0fd.so # # Total Lost Samples: 0 # # Samples: 1K of event 'cycles:u' # Event count (approx.): 979599126 # # Overhead Command Symbol # ........ ....... ................................................................. # 1.78% rustc [.] rbml::reader::maybe_get_doc 1.50% rustc [.] <reader::DocsIterator<'a> as std::iter::Iterator>::next 1.20% rustc [.] rbml::reader::doc_at 0.46% rustc [.] <reader::TaggedDocsIterator<'a> as std::iter::Iterator>::next 0.35% rustc [.] rbml::reader::Decoder::_next_int 0.29% rustc [.] rbml::reader::Decoder::_next_sub 0.15% rustc [.] rbml::reader::get_doc 0.14% rustc [.] <reader::Decoder<'doc> as serialize::Decoder>::read_u32 0.07% rustc [.] rbml::reader::Decoder::next_doc 0.07% rustc [.] _ZN4rbml6reader10doc_as_u6417h930b740aa94f1d3aE@plt 0.06% rustc [.] _fini $ Signed-off-by: David Tolnay <dtolnay@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5780B7FA.3030602@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d0761e37 |
|
07-Jul-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Uninline scnprintf() and vscnprint() They were in tools/include/linux/kernel.h, requiring that it in turn included stdio.h, which is way too heavy. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-855h8olnkot9v0dajuee1lo3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c8b5f2c9 |
|
06-Jul-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools: Introduce str_error_r() The tools so far have been using the strerror_r() GNU variant, that returns a string, be it the buffer passed or something else. But that, besides being tricky in cases where we expect that the function using strerror_r() returns the error formatted in a provided buffer (we have to check if it returned something else and copy that instead), breaks the build on systems not using glibc, like Alpine Linux, where musl libc is used. So, introduce yet another wrapper, str_error_r(), that has the GNU interface, but uses the portable XSI variant of strerror_r(), so that users rest asured that the provided buffer is used and it is what is returned. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-d4t42fnf48ytlk8rjxs822tf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
057fbfb2 |
|
02-Jun-2016 |
He Kuang <hekuang@huawei.com> |
perf callchain: Support aarch64 cross-platform Support aarch64 cross platform callchain unwind. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1464924803-22214-15-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
52ffe0ff |
|
02-Jun-2016 |
He Kuang <hekuang@huawei.com> |
perf callchain: Support x86 target platform Support x86(32-bit) cross platform callchain unwind. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1464924803-22214-14-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f6d72532 |
|
02-Jun-2016 |
He Kuang <hekuang@huawei.com> |
perf tools: Extract common API out of unwind-libunwind-local.c This patch extracts common unwind-libunwind APIs out of unwind-libunwind-local.c, this part will be used by both local and remote libunwind. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1464924803-22214-9-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a597b547 |
|
02-Jun-2016 |
He Kuang <hekuang@huawei.com> |
perf unwind: Rename unwind-libunwind.c to unwind-libunwind-local.c Since unwind-libunwind.c contains code for specific arithecture, we change it's name to unwind-libunwind-local.c, and let it only be built if local libunwind is supported. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1464924803-22214-8-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
452e8401 |
|
09-May-2016 |
Masami Hiramatsu <mhiramat@kernel.org> |
perf tools: Remove xrealloc and ALLOC_GROW Remove unused xrealloc() and ALLOC_GROW() from libperf. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160510054801.6158.6204.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
451db126 |
|
28-Apr-2016 |
Chris Phlipot <cphlipot0@gmail.com> |
perf tools: Refactor code to move call path handling out of thread-stack Move the call path handling code out of thread-stack.c and thread-stack.h to allow other components that are not part of thread-stack to create call paths. Summary: - Create call-path.c and call-path.h and add them to the build. - Move all call path related code out of thread-stack.c and thread-stack.h and into call-path.c and call-path.h. - A small subset of structures and functions are now visible through call-path.h, which is required for thread-stack.c to continue to compile. This change is a prerequisite for subsequent patches in this change set and by itself contains no user-visible changes. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2cc46669 |
|
18-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf build: Remove x86 references from arch-neutral Build It will already be dealt with generating the syscalltbl.c file in the x86 arch specific Build files, namely via 'archheaders'. This fixes the build on !x86 arches, as reported for powerpcle Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 1b700c997500 ("perf tools: Build syscall table .c header from kernel's syscall_64.tbl") Link: http://lkml.kernel.org/r/20160415212831.GT9056@kernel.org [ Removed the syscalltbl.o altogether, as per Jiri's suggestion ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
25da4fab |
|
14-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf evsel: Move fprintf methods to separate source file They still use functions that would drag more stuff to the python binding, where these fprintf methods are not used, so separate it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xfp0mgq3hh3px61di6ixi1jk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bfbba189 |
|
14-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf symbols: Move fprintf routines to separate object file To disentangle symbol printing from all the code related to symbol tables, resolution of addresses to symbols, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-eik9g3hbtdc7ddv57f1d4v3p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1b700c99 |
|
04-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tools: Build syscall table .c header from kernel's syscall_64.tbl We used libaudit to map ids to syscall names and vice-versa, but that imposes a delay in supporting new syscalls, having to wait for libaudit to get those new syscalls on its tables. To remove that delay, for x86_64 initially, grab a copy of arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those tables. Syscalls currently not available in audit-libs: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd Error: Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd Hint: try 'perf list syscalls:sys_enter_*' Hint: and: 'man syscalls' # With this patch: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd 8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36 8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40 30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096 31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096 31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org [ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
fd0db102 |
|
04-Apr-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf trace: Move syscall table id <-> name routines to separate class We're using libaudit for doing name to id and id to syscall name translations, but that makes 'perf trace' to have to wait for newer libaudit versions supporting recently added syscalls, such as "userfaultfd" at the time of this changeset. We have all the information right there, in the kernel sources, so move this code to a separate place, wrapped behind functions that will progressively use the kernel source files to extract the syscall table for use in 'perf trace'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i38opd09ow25mmyrvfwnbvkj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
2a28e230 |
|
08-Mar-2016 |
Adrian Hunter <adrian.hunter@intel.com> |
perf jit: Add support for using TSC as a timestamp Intel PT uses TSC as a timestamp, so add support for using TSC instead of the monotonic clock. Use of TSC is selected by an environment variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file with flag JITDUMP_FLAGS_ARCH_TIMESTAMP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com [ Added the fixup from He Kuang to make it build on other arches, ] [ such as aarch64, to avoid inserting this bisectiong breakage upstream ] Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
78478269 |
|
23-Mar-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf llvm: Use realpath to canonicalize paths To kill the last user of make_nonrelative_path(), that gets ditched, one more panicking function killed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3hu56rvyh4q5gxogovb6ko8a@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e12b202f |
|
10-Mar-2016 |
Jiri Olsa <jolsa@redhat.com> |
perf jitdump: Build only on supported archs Build jitdump only on architectures defined in util/genelf.h file, to avoid breaking the build on such arches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@suse.de> Cc: Colin Ian King <colin.king@canonical.com> Cc: David Ahern <dsahern@gmail.com> Cc: Davidlohr Bueso <dbueso@suse.com> Cc: He Kuang <hekuang@huawei.com> Cc: Mel Gorman <mgorman@suse.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20160310164113.GA11357@krava.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
46dad054 |
|
07-Mar-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf jitdump: DWARF is also needed While building on a Docker container for ubuntu and installing package by package one ends up with: MKDIR /tmp/build/util/ CC /tmp/build/util/genelf.o util/genelf.c:22:19: fatal error: dwarf.h: No such file or directory #include <dwarf.h> ^ compilation terminated. mv: cannot stat '/tmp/build/util/.genelf.o.tmp': No such file or directory Because the jitdump code needs the DWARF related development packages to be installed. So make it dependent on that so that the build can succeed without jitdump support. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-le498robnmxd40237wej3w62@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
acbe613e |
|
15-Feb-2016 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add monitored events array It will ease up configuration of memory events and addition of other memory events in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1455525293-8671-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
598b7c69 |
|
30-Nov-2015 |
Stephane Eranian <eranian@google.com> |
perf jit: add source line info support This patch adds source line information support to perf for jitted code. The source line info must be emitted by the runtime, such as JVMTI. Perf injects extract the source line info from the jitdump file and adds the corresponding .debug_lines section in the ELF image generated for each jitted function. The source line enables matching any address in the profile with a source file and line number. The improvement is visible in perf annotate with the source code displayed alongside the assembly code. The dwarf code leverages the support from OProfile which is also released under GPLv2. Copyright 2007 OProfile authors. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Carl Love <cel@us.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John McCutchan <johnmccutchan@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sonny Rao <sonnyrao@chromium.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1448874143-7269-5-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9b07e27f |
|
30-Nov-2015 |
Stephane Eranian <eranian@google.com> |
perf inject: Add jitdump mmap injection support This patch adds a --jit/-j option to perf inject. This options injects MMAP records into the perf.data file to cover the jitted code mmaps. It also emits ELF images for each function in the jidump file. Those images are created where the jitdump file is. The MMAP records point to that location as well. Typical flow: $ perf record -k mono -- java -agentpath:libpjvmti.so java_class $ perf inject --jit -i perf.data -o perf.data.jitted $ perf report -i perf.data.jitted Note that jitdump.h support is not limited to Java, it works with any jitted environment modified to emit the jitdump file format, include those where code can be jitted multiple times and moved around. The jitdump.h format is adapted from the Oprofile project. The genelf.c (ELF binary generation) depends on MD5 hash encoding for the buildid. To enable this, libssl-dev must be installed. If not, then genelf.c defaults to using urandom to generate the buildid, which is not ideal. The Makefile auto-detects the presence on libssl-dev. This version mmaps the jitdump file to create a marker MMAP record in the perf.data file. The marker is used to detect jitdump and cause perf inject to inject the jitted mmaps and generate ELF images for jitted functions. In V8, the following fixes and changes were made among other things: - the jidump header format include a new flags field to be used to carry information about the configuration of the runtime agent. Contributed by: Adrian Hunter <adrian.hunter@intel.com> - Fix mmap pgoff: MMAP event pgoff must be the offset within the ELF file at which the code resides. Contributed by: Adrian Hunter <adrian.hunter@intel.com> - Fix ELF virtual addresses: perf tools expect the ELF virtual addresses of dynamic objects to match the file offset. Contributed by: Adrian Hunter <adrian.hunter@intel.com> - JIT MMAP injection does not obey finished_round semantics. JIT MMAP injection injects all MMAP events in one go, so it does not obey finished_round semantics, so drop the finished_round events from the output perf.data file. Contributed by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Carl Love <cel@us.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John McCutchan <johnmccutchan@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sonny Rao <sonnyrao@chromium.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com [ Moved inject.build_ids ordering bits to a separate patch, fixed the NO_LIBELF=1 build ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e9c4bcdd |
|
30-Nov-2015 |
Stephane Eranian <eranian@google.com> |
perf symbols: add Java demangling support Add Java function descriptor demangling support. Something bfd cannot do. Use the JAVA_DEMANGLE_NORET flag to avoid decoding the return type of functions. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Carl Love <cel@us.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John McCutchan <johnmccutchan@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sonny Rao <sonnyrao@chromium.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1448874143-7269-2-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
915b0882 |
|
07-Jan-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/ So that lib/find_bit.c doesn't requires anything inside tools/perf/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: George Spelvin <linux@horizon.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yury Norov <yury.norov@gmail.com> Link: http://lkml.kernel.org/n/tip-7lxe7jgohaac5faodndhdmvk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
552eb975 |
|
08-Jan-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools lib: Move find_next_bit.c to tools/lib/ The commit that introduced it should've moved it to the same place, plus the 'tools/' prefix, but instead moved it to a bogus tools/lib/util/ directory, being the only file there. Move it to tools/lib/find_bit.c, picking the name for the file where these routines live since: 8f6f19dd5143 ("lib: move find_last_bit to lib/find_next_bit.c") Next step is to make tools/lib/find_bit.c to differ from lib/find_bit.c just in removing what is not used by tools/. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: George Spelvin <linux@horizon.com Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yury Norov <yury.norov@gmail.com> Link: http://lkml.kernel.org/n/tip-p391cex5mqvahp4pwrton87n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4b6ab94e |
|
15-Dec-2015 |
Josh Poimboeuf <jpoimboe@redhat.com> |
perf subcmd: Create subcmd library Move the subcommand-related files from perf to a new library named libsubcmd.a. Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to 'exec-cmd.*' to be consistent with the naming of all the other files. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
096d3558 |
|
15-Dec-2015 |
Josh Poimboeuf <jpoimboe@redhat.com> |
perf tools: Provide subcmd configuration at runtime Create init functions for exec_cmd.c and pager.c. This allows their configuration to be specified at runtime so they can be split out into a separate library which can be used by other programs. Their configuration is stored in a shared subcmd_config struct. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/21f5f6b38da72c985a8dcfa185700d03e7eecd1d.1450193761.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
5feaac24 |
|
13-Dec-2015 |
Josh Poimboeuf <jpoimboe@redhat.com> |
perf tools: Move help_unknown_cmd() to its own file help_unknown_cmd() is quite perf-specific because it relies on some perf_config*() functions. Move it and its supporting functions out into a separate file so that help.c can be moved to a library. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/562d918bcaaf340c1ae3e47586b3f0ae33b9918b.1449965119.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1fe143c5 |
|
07-Dec-2015 |
Josh Poimboeuf <jpoimboe@redhat.com> |
perf tools: Move term functions out of util.c The term functions are needed by help.c which is going to be moved into a separate library. Move them out of util.c and into their own file. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/9a39c854dd156b55ebda57e427594c9a59dcb40f.1449548395.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
de7cf7ca |
|
07-Dec-2015 |
Josh Poimboeuf <jpoimboe@redhat.com> |
perf tools: Remove unused pager_use_color variable Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/e540c61b3068761181db6d9b1b3411990bafdb2f.1449548395.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bfc077b4 |
|
15-Nov-2015 |
He Kuang <hekuang@huawei.com> |
perf bpf: Add prologue for BPF programs for fetching arguments This patch generates a prologue for a BPF program which fetches arguments for it. With this patch, the program can have arguments as follow: SEC("lock_page=__lock_page page->flags") int lock_page(struct pt_regs *ctx, int err, unsigned long flags) { return 1; } This patch passes at most 3 arguments from r3, r4 and r5. r1 is still the ctx pointer. r2 is used to indicate if dereferencing was done successfully. This patch uses r6 to hold ctx (struct pt_regs) and r7 to hold stack pointer for result. Result of each arguments first store on stack: low address BPF_REG_FP - 24 ARG3 BPF_REG_FP - 16 ARG2 BPF_REG_FP - 8 ARG1 BPF_REG_FP high address Then loaded into r3, r4 and r5. The output prologue for offn(...off2(off1(reg)))) should be: r6 <- r1 // save ctx into a callee saved register r7 <- fp r7 <- r7 - stack_offset // pointer to result slot /* load r3 with the offset in pt_regs of 'reg' */ (r7) <- r3 // make slot valid r3 <- r3 + off1 // prepare to read unsafe pointer r2 <- 8 r1 <- r7 // result put onto stack call probe_read // read unsafe pointer jnei r0, 0, err // error checking r3 <- (r7) // read result r3 <- r3 + off2 // prepare to read unsafe pointer r2 <- 8 r1 <- r7 call probe_read jnei r0, 0, err ... /* load r2, r3, r4 from stack */ goto success err: r2 <- 1 /* load r3, r4, r5 with 0 */ goto usercode success: r2 <- 0 usercode: r1 <- r6 // restore ctx // original user code If all of arguments reside in register (dereferencing is not required), gen_prologue_fastpath() will be used to create fast prologue: r3 <- (r1 + offset of reg1) r4 <- (r1 + offset of reg2) r5 <- (r1 + offset of reg3) r2 <- 0 P.S. eBPF calling convention is defined as: * r0 - return value from in-kernel function, and exit value for eBPF program * r1 - r5 - arguments from eBPF program to in-kernel function * r6 - r9 - callee saved registers that in-kernel function will preserve * r10 - read-only frame pointer to access stack Committer note: At least testing if it builds and loads: # cat test_probe_arg.c struct pt_regs; __attribute__((section("lock_page=__lock_page page->flags"), used)) int func(struct pt_regs *ctx, int err, unsigned long flags) { return 1; } char _license[] __attribute__((section("license"), used)) = "GPL"; int _version __attribute__((section("version"), used)) = 0x40300; # perf record -e ./test_probe_arg.c usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data ] # perf evlist perf_bpf_probe:lock_page # Signed-off-by: He Kuang <hekuang@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1447675815-166222-11-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
4ddd3274 |
|
16-Nov-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools: Adopt memdup() from tools/perf, moving it to tools/lib/string.c That will contain more string functions with counterparts, sometimes verbatim copies, in the kernel. Acked-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-rah6g97kn21vfgmlramorz6o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
84c86ca1 |
|
13-Oct-2015 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Enable passing bpf object file to --event By introducing new rules in tools/perf/util/parse-events.[ly], this patch enables 'perf record --event bpf_file.o' to select events by an eBPF object file. It calls parse_events_load_bpf() to load that file, which uses bpf__prepare_load() and finally calls bpf_object__open() for the object files. After applying this patch, commands like: # perf record --event foo.o sleep become possible. However, at this point it is unable to link any useful things onto the evsel list because the creating of probe points and BPF program attaching have not been implemented. Before real events are possible to be extracted, to avoid perf report error because of empty evsel list, this patch link a dummy evsel. The dummy event related code will be removed when probing and extracting code is ready. Commiter notes: Using it: $ ls -la foo.o ls: cannot access foo.o: No such file or directory $ perf record --event foo.o sleep libbpf: failed to open foo.o: No such file or directory event syntax error: 'foo.o' \___ BPF object file 'foo.o' is invalid (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events $ $ file /tmp/build/perf/perf.o /tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped $ perf record --event /tmp/build/perf/perf.o sleep libbpf: /tmp/build/perf/perf.o is not an eBPF object file event syntax error: '/tmp/build/perf/perf.o' \___ BPF object file '/tmp/build/perf/perf.o' is invalid (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events $ $ file /tmp/foo.o /tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped $ perf record --event /tmp/foo.o sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data ] $ perf evlist /tmp/foo.o $ perf evlist -v /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 $ So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok. $ perf report --stdio Error: The perf.data file has no samples! # To display the perf.data header info, please use --header/--header-only options. # $ Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9fb47654 |
|
24-Sep-2015 |
Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> |
perf tools: Fix build break on powerpc due to sample_reg_masks perf_regs.c does not get built on Powerpc as CONFIG_PERF_REGS is false. So the weak definition for 'sample_regs_masks' doesn't get picked up. Adding perf_regs.o to util/Build unconditionally, exposes a redefinition error for 'perf_reg_value()' function (due to the static inline version in util/perf_regs.h). So use #ifdef HAVE_PERF_REGS_SUPPORT' around that function. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20150930182836.GA27858@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
eb56db54 |
|
24-Sep-2015 |
Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> |
perf tools: Fix build break on powerpc due to sample_reg_masks The perf_regs.c file does not get built on Powerpc as CONFIG_PERF_REGS is false. So the weak definition for 'sample_regs_masks' doesn't get picked up. Adding perf_regs.o to util/Build unconditionally, exposes a redefinition error for 'perf_reg_value()' function (due to the static inline version in util/perf_regs.h). So use #ifdef HAVE_PERF_REGS_SUPPORT' around that function. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20150930182836.GA27858@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f0ce888c |
|
08-Sep-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf env: Move perf_env out of header.h and session.c into separate object Since it can be used separately from 'perf_session' and 'perf_header', move it to separate include file and object, next csets will try to move a perf_env__init() routine. Tested-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ff2rw99tsn670y1b6gxbwdsi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
bcc84ec6 |
|
31-Aug-2015 |
Stephane Eranian <eranian@google.com> |
perf record: Add ability to name registers to record This patch modifies the -I/--int-regs option to enablepassing the name of the registers to sample on interrupt. Registers can be specified by their symbolic names. For instance on x86, --intr-regs=ax,si. The motivation is to reduce the size of the perf.data file and the overhead of sampling by only collecting the registers useful to a specific analysis. For instance, for value profiling, sampling only the registers used to passed arguements to functions. With no parameter, the --intr-regs still records all possible registers based on the architecture. To name registers, it is necessary to use the long form of the option, i.e., --intr-regs: $ perf record --intr-regs=si,di,r8,r9 ..... To record any possible registers: $ perf record -I ..... $ perf report --intr-regs ... To display the register, one can use perf report -D To list the available registers: $ perf record --intr-regs=\? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Signed-off-by: Stephane Eranian <eranian@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1441039273-16260-4-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
97db6206 |
|
31-Aug-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Fix build on powerpc broken by pt/bts It is theoretically possible to process perf.data files created on x86 and that contain Intel PT or Intel BTS data, on any other architecture, which is why it is possible for there to be build errors on powerpc caused by pt/bts. The errors were: util/intel-pt-decoder/intel-pt-insn-decoder.c: In function ‘intel_pt_insn_decoder’: util/intel-pt-decoder/intel-pt-insn-decoder.c:138:3: error: switch missing default case [-Werror=switch-default] switch (insn->immediate.nbytes) { ^ cc1: all warnings being treated as errors linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_synth_branch_sample': sources/linux-acme.git/tools/perf/util/intel-pt.c:871: undefined reference to `tsc_to_perf_time' linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_sample': sources/linux-acme.git/tools/perf/util/intel-pt.c:915: undefined reference to `tsc_to_perf_time' sources/linux-acme.git/tools/perf/util/intel-pt.c:962: undefined reference to `tsc_to_perf_time' linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_process_event': sources/linux-acme.git/tools/perf/util/intel-pt.c:1454: undefined reference to `perf_time_to_tsc' Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1441046384-28663-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d0170af7 |
|
17-Jul-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add Intel BTS support Intel BTS support fits within the new auxtrace infrastructure. Recording is supporting by identifying the Intel BTS PMU, parsing options and setting up events. Decoding is supported by queuing up trace data by thread and then decoding synchronously delivering synthesized event samples into the session processing for tools to consume. Committer note: E.g: [root@felicio ~]# perf record --per-thread -e intel_bts// ls anaconda-ks.cfg apctest.output bin kernel-rt-3.10.0-298.rt56.171.el7.x86_64.rpm libexec lock_page.bpf.c perf.data perf.data.old [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 4.367 MB perf.data ] [root@felicio ~]# perf evlist -v intel_bts//: type: 6, size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1 dummy:u: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1 [root@felicio ~]# perf script # the navigate in the pager to some interesting place: ls 1843 1 branches: ffffffff810a60cb flush_signal_handlers ([kernel.kallsyms]) => ffffffff8121a522 setup_new_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8121a529 setup_new_exec ([kernel.kallsyms]) => ffffffff8122fa30 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa5d do_close_on_exec ([kernel.kallsyms]) => ffffffff81767ae0 _raw_spin_lock ([kernel.kallsyms]) ls 1843 1 branches: ffffffff81767af4 _raw_spin_lock ([kernel.kallsyms]) => ffffffff8122fa62 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fac9 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fad2 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fadd do_close_on_exec ([kernel.kallsyms]) => ffffffff8120fc80 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcaf filp_close ([kernel.kallsyms]) => ffffffff8120fcb6 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcc2 filp_close ([kernel.kallsyms]) => ffffffff812547f0 dnotify_flush ([kernel.kallsyms]) ls 1843 1 branches: ffffffff81254823 dnotify_flush ([kernel.kallsyms]) => ffffffff8120fcc7 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fccd filp_close ([kernel.kallsyms]) => ffffffff81261790 locks_remove_posix ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812617a3 locks_remove_posix ([kernel.kallsyms]) => ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) => ffffffff8120fcd2 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcd5 filp_close ([kernel.kallsyms]) => ffffffff812142c0 fput ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812142d6 fput ([kernel.kallsyms]) => ffffffff812142df fput ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8121430c fput ([kernel.kallsyms]) => ffffffff810b6580 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65ad task_work_add ([kernel.kallsyms]) => ffffffff810b65b1 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65c1 task_work_add ([kernel.kallsyms]) => ffffffff810bc710 kick_process ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810bc725 kick_process ([kernel.kallsyms]) => ffffffff810bc742 kick_process ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810bc742 kick_process ([kernel.kallsyms]) => ffffffff810b65c6 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65c9 task_work_add ([kernel.kallsyms]) => ffffffff81214311 fput ([kernel.kallsyms]) Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-9-git-send-email-adrian.hunter@intel.com [ Merged sample->time fix for bug found after first round of testing on slightly older kernel ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
90e457f7 |
|
17-Jul-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add Intel PT support Add support for Intel Processor Trace. Intel PT support fits within the new auxtrace infrastructure. Recording is supporting by identifying the Intel PT PMU, parsing options and setting up events. Decoding is supported by queuing up trace data by cpu or thread and then decoding synchronously delivering synthesized event samples into the session processing for tools to consume. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
a4e92590 |
|
17-Jul-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add Intel PT packet decoder Add support for decoding Intel Processor Trace packets. This essentially provides intel_pt_get_packet() which takes a buffer of binary data and returns the decoded packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
d809560b |
|
06-Aug-2015 |
Jiri Olsa <jolsa@redhat.com> |
perf stat: Move perf_counts struct and functions into separate object Moving 'struct perf_counts' and associated functions into separate object, so we could remove stat.c object dependency from python build. It makes the python code to build properly, because it fails to load due to missing stat-shadow.c object dependency if some patches from Kan Liang are applied. So apply this one, then Kan's. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20150807105103.GB8624@krava.brq.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
aa61fd05 |
|
21-Jul-2015 |
Wang Nan <wangnan0@huawei.com> |
perf tools: Introduce llvm config options This patch introduces [llvm] config section with 5 options. Following patches will use then to config llvm dynamica compiling. 'llvm-utils.[ch]' is introduced in this patch for holding all llvm/clang related stuffs. Example: [llvm] # Path to clang. If omit, search it from $PATH. clang-path = "/path/to/clang" # Cmdline template. Following line shows its default value. # Environment variable is used to passing options. # # *NOTE*: -D__KERNEL__ MUST appears before $CLANG_OPTIONS, # so user have a chance to use -U__KERNEL__ in $CLANG_OPTIONS # to cancel it. clang-bpf-cmd-template = "$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS \ $KERNEL_INC_OPTIONS -Wno-unused-value \ -Wno-pointer-sign -working-directory \ $WORKING_DIR -c $CLANG_SOURCE -target \ bpf -O2 -o -" # Options passed to clang, will be passed to cmdline by # $CLANG_OPTIONS. clang-opt = "-Wno-unused-value -Wno-pointer-sign" # kbuild directory. If not set, use /lib/modules/`uname -r`/build. # If set to "" deliberately, skip kernel header auto-detector. kbuild-dir = "/path/to/kernel/build" # Options passed to 'make' when detecting kernel header options. kbuild-opts = "ARCH=x86_64" Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1437477214-149684-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
92f6c72e |
|
15-Jul-2015 |
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> |
perf probe: Move ftrace probe-event operations to probe-file.c Move ftrace probe-event operations to probe-file.c from probe-event.c. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20150715091407.8915.14316.stgit@localhost.localdomain [ Fixed up strlist__new() calls wrt 4a77e2183fc0 ("perf strlist: Make dupstr be the...") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
0aefc359 |
|
09-Jul-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools: Copy lib/hweight.c from the kernel sources Instead of accessing it directly, as it uses EXPORT_SYMBOL, that has no meaning in tools/perf and because we removed the stubs for it, i.e. we removed the tools/include/linux/export.h file. This fixes the build for the detached tarball sources cases and removes one more source of entanglement with the kernel sources. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-oyqx541o7apa2cskjhcxi6nx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3f735377 |
|
05-Jul-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
tools: Copy lib/rbtree.c to tools/lib/ So that we can remove kernel specific stuff we've been stubbing out via a tools/include/linux/export.h that gets removed in this patch and to avoid breakages in the future like the one fixed recently where rcupdate.h started being used in rbtree.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-rxuzfsozpb8hv1emwpx06rm6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f87027b9 |
|
03-Jun-2015 |
Jiri Olsa <jolsa@kernel.org> |
perf stat: Move shadow stat counters into separate object Separating shadow counters code into separate object as a cleanup, but mainly for upcomming changes, so could use it from script command context. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1433341559-31848-10-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
f00898f4 |
|
27-May-2015 |
Andi Kleen <ak@linux.intel.com> |
perf tools: Move branch option parsing to own file .. to allow sharing between builtin-record and builtin-top later. No code changes, just moved code. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1432749114-904-9-git-send-email-andi@firstfloor.org [ Rename too generic branch.[ch] name to parse-branch-options.[ch] ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
70923bd2 |
|
13-May-2015 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Make flex/bison calls honour V=1 Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-dnc2ggwhffdpuvijwq4rkic9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
e31f0d01 |
|
30-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tools: Add build option NO_AUXTRACE to exclude AUX area tracing Add build option NO_AUXTRACE to exclude compiling support for AUX area tracing. Support for both recording and processing is excluded and by implication any future additions such as Intel PT and Intel BTS will also not be compiled in with this option. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430404667-10593-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
539f3aa2 |
|
28-Apr-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Fix bison-related build failure on CentOS 6 The YYLTYPE_IS_TRIVIAL is defined in the Build file, but unlike pmu-bison.c, gcc complained about it for parse-events-bison.c: CC util/parse-events-bison.o In file included from util/parse-events.y:16: util/parse-events-bison.h:101:1: error: "YYLTYPE_IS_TRIVIAL" redefined <command-line>: error: this is the location of the previous definition make[3]: *** [util/parse-events-bison.o] Error 1 Comments from Jiri Olsa: "Reason is the parse error handling that was added just recently: it adds YYLTYPE type (which is not present in pmu-bison.h), so YYLTYPE_IS_TRIVIAL gets redefined, which is ok in F20 that handle the error via '-w' option, but it's not ok for RHEL6 where the '-w' does not work for this kind of error." Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1430322871-18107-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
718c602d |
|
09-Apr-2015 |
Adrian Hunter <adrian.hunter@intel.com> |
perf evlist: Add support for mmapping an AUX area buffer This patch supports the addition to the kernel of AUX area buffers that can be mmapped separately from the perf-events buffer. The AUX buffer can be configured to contain hardware-produced trace information. The first implementation will support Intel BTS and Intel PT. One auxtrace buffer is mmapped per perf-events buffer. If the requested auxtrace buffer size is zero, which it will be until further support is added, then no auxtrace mmapping is attempted. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1428594864-29309-3-git-send-email-adrian.hunter@intel.com [ Fixed conflict in evlist.h ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
80a32e5b |
|
29-Jan-2015 |
Jiri Olsa <jolsa@kernel.org> |
perf tools: Add lzma decompression support for kernel module In short, Fedora compresses kernel modules now (since version 21) with lzma compression. Adding lzma decompress support into the dso.c:compressions array introduced by Namhyung earlier. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2glp65kdtbrk0gblmirsjsnt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
ecefde62 |
|
19-Feb-2015 |
David Ahern <david.ahern@oracle.com> |
perf tools: Only include tsc file for x86 The perf_time_to_tsc and tsc_to_perf_time functions are only used for x86. Make inclusion of tsc.c dependent on x86 as well. Signed-off-by: David Ahern <david.ahern@oracle.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <david.ahern@oracle.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1424370153-128274-1-git-send-email-david.ahern@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
edbe9817 |
|
20-Feb-2015 |
Jiri Olsa <jolsa@kernel.org> |
perf data: Add perf data to CTF conversion support Adding 'perf data convert' to convert perf data file into different format. This patch adds support for CTF format conversion. To convert perf.data into CTF run: $ perf data convert --to-ctf=./ctf-data/ [ perf data convert: Converted 'perf.data' into CTF data './ctf-data/' ] [ perf data convert: Converted and wrote 11.268 MB (100230 samples) ] The command will create CTF metadata out of perf.data file (or one specified via -i option) and then convert all sample events into single CTF stream. Each sample_type bit is translated into separated CTF event field apart from following exceptions: PERF_SAMPLE_RAW - added in next patch PERF_SAMPLE_READ - TODO PERF_SAMPLE_CALLCHAIN - TODO PERF_SAMPLE_BRANCH_STACK - TODO PERF_SAMPLE_REGS_USER - TODO PERF_SAMPLE_STACK_USER - TODO $ perf --debug=data-convert=2 data convert ... The converted CTF data could be analyzed by CTF tools, like babletrace or tracecompass [1]. $ babeltrace ./ctf-data/ [03:19:13.962125533] (+?.?????????) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 } [03:19:13.962130001] (+0.000004468) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 } [03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 } [03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 } [03:19:13.962135557] (+0.000001825) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 2087 } [03:19:13.962137627] (+0.000002070) cycles: { }, { ip = 0xFFFFFFFF81361938, tid = 20714, pid = 20714, period = 37582 } [03:19:13.962161091] (+0.000023464) cycles: { }, { ip = 0xFFFFFFFF8124218F, tid = 20714, pid = 20714, period = 600246 } [03:19:13.962517569] (+0.000356478) cycles: { }, { ip = 0xFFFFFFFF811A75DB, tid = 20714, pid = 20714, period = 1325731 } [03:19:13.969518008] (+0.007000439) cycles: { }, { ip = 0x34080917B2, tid = 20714, pid = 20714, period = 1144298 } The following members to the ctf-environment were decided to be added to distinguish and specify perf CTF data: - domain It says "kernel" because it contains a kernel trace (not to be confused with a user space like lttng-ust does) - tracer_name It says perf. This can be used to distinguish between lttng and perf CTF based trace. - version The kernel version from stream. In addition to release, this is what it looks like on a Debian kernel: release = "3.14-1-amd64"; version = "3.14.0"; [1] http://projects.eclipse.org/projects/tools.tracecompass Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jeremie Galarneau <jgalar@efficios.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1424470628-5969-4-git-send-email-jolsa@kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1999307b |
|
30-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add single target build framework support Add support to build single targets, like: $ make util/map.o # objects $ make util/map.i # preprocessor $ make util/map.s # assembly Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-tt10y0dmweq6rjaod937rpb4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
1571b695 |
|
30-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add zlib objects building Move the zlib objects building under build framework to be included in the libperf build object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-cpbb47g82ahpa4yqfr9dcobq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
3bc3374c |
|
30-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add perf regs objects building Move the regs objects building under build framework to be included in the libperf build object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-hgny792g5x5iaklc34aa57uh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
c7355f84 |
|
30-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add scripts objects building Move the scripts objects building under build framework to be included in the libperf build object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ry8pd41ahwpq9h46i8te33c7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
b2e45c32 |
|
29-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add dwarf unwind objects building Move the dwarf unwind objects building under build framework to be included in the libperf build object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-7f7dmhkhs0e7jnqiu9ibzqia@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
8379fce4 |
|
29-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add dwarf objects building Move the dwarf objects building under build framework to be included in the libperf build object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-5ody6tnfnkt4rezvpem8n7rm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
709e6791 |
|
29-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add probe objects building Move the probe objects building under build framework to be included in the libperf build object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-p39iitiu2ltgmtbn48bsh7nz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
9352aaba |
|
29-Dec-2014 |
Jiri Olsa <jolsa@kernel.org> |
perf build: Add libperf objects building Move the util objects building under build framework. Add the new libperf build object so it's separated from the rest of the perf code and could be librarized. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-574tgt9t23tnxo9td8qjiibc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|