History log of /linux-master/tools/perf/util/unwind-libdw.c
Revision Date Author Comments
# ff0bd799 09-Feb-2024 Ian Rogers <irogers@google.com>

perf maps: Hide maps internals

Move the struct into the C file. Add maps__equal to work around
exposing the struct for reference count checking. Add accessors for
the unwind_libunwind_ops. Move maps_list_node to its only use in
symbol.c.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-6-irogers@google.com


# c966d23a 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf unwind-libdw: Handle JIT-generated DSOs properly

Usually DSOs are mapped from the beginning of the file, so the base
address of the DSO can be calculated by map->start - map->pgoff.

However, JIT DSOs which are generated by `perf inject -j`, are mapped
only the code segment. This makes unwind-libdw code confusing and
rejects processing unwinds in the JIT DSOs. It should use the map
start address as base for them to fix the confusion.

Fixes: 1fe627da30331024 ("perf unwind: Take pgoff into account when reporting elf to libdwfl")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231212070547.612536-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d8f69fb6 05-Jun-2023 Leo Yan <leo.yan@linaro.org>

perf unwind: Use perf_arch_reg_{ip|sp}() to substitute macros

We use perf_arch_reg_ip() and perf_arch_reg_sp() to substitute macros
for obtaining the register numbers of SP and IP. This modification
enables cross analysis in the unwinding, therefore, the unwinding is
not restricted to the predefined values by the macros.

Consequently, the macros LIBUNWIND__ARCH_REG_{IP|SP} are removed since
they are no longer used.

Committer notes:

Add missing "util/env.h" header to make sure we have the definition for
perf_env__arch(), that when built with NO_LIBUNWIND=1 isn't available,
i.e. it was being included by sheer luck.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eric Lin <eric.lin@sifive.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20230606014559.21783-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5f06267b 30-Jun-2023 Vincent Whitchurch <vincent.whitchurch@axis.com>

perf: unwind: Fix symfs with libdw

Pass the full path including the symfs (if any) to libdw. Without this
unwinding fails with errors like this when a symfs is used:

unwind: failed with 'No such file or directory'"

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: kernel@axis.com
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: https://lore.kernel.org/r/20230630-perf-libdw-symfs-v2-1-469760dd4d5b@axis.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# 0dd5041c 08-Jun-2023 Ian Rogers <irogers@google.com>

perf addr_location: Add init/exit/copy functions

struct addr_location holds references to multiple reference counted
objects. Add init/exit functions to make maintenance of those more
consistent with the rest of the code and to try to avoid
leaks. Modification of thread reference counts isn't included in this
change.

Committer notes:

I needed to initialize result to sample->ip to make sure is set to
something, fixing a compile time error, mostly keeping the previous
logic as build_alloc_func_list() already does debugging/error prints
about what went wrong if it takes the 'goto out'.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# ee84a303 08-Jun-2023 Ian Rogers <irogers@google.com>

perf thread: Add accessor functions for thread

Using accessors will make it easier to add reference count checking in
later patches.

Committer notes:

thread->nsinfo wasn't wrapped as it is used together with
nsinfo__zput(), where does a trick to set the field with a refcount
being dropped to NULL, and that doesn't work well with using
thread__nsinfo(thread), that loses the &thread->nsinfo pointer.

When refcount checking is added to 'struct thread', later in this
series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to
check the thread pointer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8f12692b 19-Apr-2023 Ian Rogers <irogers@google.com>

perf maps: Add reference count checking

Add reference count checking to make sure of good use of get and put.
Add and use accessors to reduce RC_CHK clutter.

The only significant issue was in tests/thread-maps-share.c where
reference counts were released in the reverse order to acquisition,
leading to a use after put. This was fixed by reversing the put order.

Committer notes:

Extracted from a larger patch removing bits that were covered by the use
of pre-existing maps__ accessors (e.g. maps__nr_maps()) and new ones
added (maps__refcnt()) to reduce RC_CHK_ACCESS(maps)-> source code
pollution.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/lkml/20230407230405.2931830-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2a6e5e8a 04-Apr-2023 Ian Rogers <irogers@google.com>

perf map: Add accessors for ->pgoff and ->reloc

Later changes will add reference count checking for 'struct map'. Add
accessors so that the reference count check is only necessary in one
place.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 78a1f7cd 04-Apr-2023 Ian Rogers <irogers@google.com>

perf map: Add helper for ->map_ip() and ->unmap_ip()

Later changes will add reference count checking for struct map, add a
helper function to invoke the map_ip and unmap_ip function pointers. The
helper allows the reference count check to be in fewer places.

Committer notes:

Add missing conversions to:

tools/perf/util/map.c
tools/perf/util/cs-etm.c
tools/perf/util/annotate.c
tools/perf/arch/powerpc/util/sym-handling.c
tools/perf/arch/s390/annotate/instructions.c

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e5116f46 20-Mar-2023 Ian Rogers <irogers@google.com>

perf map: Add accessor for start and end

Later changes will add reference count checking for struct map, start
and end are frequently accessed variables. Add an accessor so that the
reference count check is only necessary in one place.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 63df0e4b 20-Mar-2023 Ian Rogers <irogers@google.com>

perf map: Add accessor for dso

Later changes will add reference count checking for struct map, with
dso being the most frequently accessed variable. Add an accessor so
that the reference count check is only necessary in one place.

Additional changes:
- add a dso variable to avoid repeated map__dso calls.
- in builtin-mem.c dump_raw_samples, code only partially tested for
dso == NULL. Make the possibility of NULL consistent.
- in thread.c thread__memcpy fix use of spaces and use tabs.

Committer notes:

Did missing conversions on these files:

tools/perf/arch/powerpc/util/skip-callchain-idx.c
tools/perf/arch/powerpc/util/sym-handling.c
tools/perf/ui/browsers/hists.c
tools/perf/ui/gtk/annotate.c
tools/perf/util/cs-etm.c
tools/perf/util/thread.c
tools/perf/util/unwind-libunwind-local.c
tools/perf/util/unwind-libunwind.c

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fa7095c5 06-Apr-2022 James Clark <james.clark@arm.com>

perf unwind: Don't show unwind error messages when augmenting frame pointer stack

Commit Fixes: b9f6fbb3b2c29736 ("perf arm64: Inject missing frames when
using 'perf record --call-graph=fp'") intended to add a 'best effort'
DWARF unwind that improved the frame pointer stack in most scenarios.

It's expected that the unwind will fail sometimes, but this shouldn't be
reported as an error. It only works when the return address can be
determined from the contents of the link register alone.

Fix the error shown when the unwinder requires extra registers by adding
a new flag that suppresses error messages. This flag is not set in the
normal --call-graph=dwarf unwind mode so that behavior is not changed.

Fixes: b9f6fbb3b2c29736 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'")
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220406145651.1392529-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4e148144 18-Feb-2021 Dave Rigby <d.rigby@me.com>

perf unwind: Set userdata for all __report_module() paths

When locating the DWARF module for a given address, __find_debuginfo()
requires a 'struct dso' passed via the userdata argument.

However, this field is only set in __report_module() if the module is
found in via dwfl_addrmodule(), not if it is found later via
dwfl_report_elf().

Set userdata irrespective of how the DWARF module was found, as long as
we found a module.

Fixes: bf53fc6b5f41 ("perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder")
Signed-off-by: Dave Rigby <d.rigby@me.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211801
Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/linux-perf-users/20210218165654.36604-1-d.rigby@me.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bf53fc6b 04-Dec-2020 Jan Kratochvil <jan.kratochvil@redhat.com>

perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder

elfutils needs to be provided main binary and separate debug info file
respectively. Providing separate debug info file instead of the main
binary is not sufficient.

One needs to try both supplied filename and its possible cache by its
build-id depending on the use case.

Signed-off-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f2eaea09 25-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf map_symbol: Rename ms->mg to ms->maps

One more step on the merge of 'struct maps' with 'struct map_groups'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-61rra2wg392rhvdgw421wzpt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 694520df 25-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf addr_location: Rename al->mg to al->maps

One more step on the merge of 'struct maps' with 'struct map_groups'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-foo95pyyp3bhocbt7yd8qrvq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fe87797d 25-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread: Rename thread->mg to thread->maps

One more step on the merge of 'struct maps' with 'struct map_groups'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-69vcr8pubpym90skxhmbwhiw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 08f6680e 04-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add a 'struct map_groups' pointer to 'struct map_symbol'

And fill it whenever we setup a a 'struct map_symbol', now we need to
use it, next cset.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fzwfcnddenz1o7uj1fzw3g46@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# c1529738 04-Nov-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf unwind: Use 'struct map_symbol' in 'struct unwind_entry'

To help in passing that info around to callchain routines that, for the
same reason, are moving to use 'struct map_symbol'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-epsiibeprpxa8qpwji47uskc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fb71c86c 03-Sep-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Remove util.h from where it is not needed

Check that it is not needed and remove, fixing up some fallout for
places where it was only serving to get something else.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9h6dg6lsqe2usyqjh5rrues4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4a3cec84 30-Aug-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf dsos: Move the dsos struct and its methods to separate source files

So that we can reduce the header dependency tree further, in the process
noticed that lots of places were getting even things like build-id
routines and 'struct perf_tool' definition indirectly, so fix all those
too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 7f7c536f 04-Jul-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib: Adopt zalloc()/zfree() from tools/perf

Eroding a bit more the tools/perf/util/util.h hodpodge header.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# daecf9e0 27-Jan-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add missing include for symbols.h

Several places were using definitions found in symbols.h but not
including it, getting it by sheer luck from some other headers that now
are in the process of removing that include because they don't need it
or because simply having struct forward declarations is enough, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-xbcvvx296d70kpg9wb0qmeq9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1101f69a 27-Jan-2019 Arnaldo Carvalho de Melo <acme@redhat.com>

pref tools: Add missing map.h includes

Lots of places get the map.h file indirectly, and since we're going to
remove it from machine.h, then those need to include it directly, do it
now, before we remove that dep.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1fe627da 29-Oct-2018 Milian Wolff <milian.wolff@kdab.com>

perf unwind: Take pgoff into account when reporting elf to libdwfl

libdwfl parses an ELF file itself and creates mappings for the
individual sections. perf on the other hand sees raw mmap events which
represent individual sections. When we encounter an address pointing
into a mapping with pgoff != 0, we must take that into account and
report the file at the non-offset base address.

This fixes unwinding with libdwfl in some cases. E.g. for a file like:

```

using namespace std;

mutex g_mutex;

double worker()
{
lock_guard<mutex> guard(g_mutex);
uniform_real_distribution<double> uniform(-1E5, 1E5);
default_random_engine engine;
double s = 0;
for (int i = 0; i < 1000; ++i) {
s += norm(complex<double>(uniform(engine), uniform(engine)));
}
cout << s << endl;
return s;
}

int main()
{
vector<std::future<double>> results;
for (int i = 0; i < 10000; ++i) {
results.push_back(async(launch::async, worker));
}
return 0;
}
```

Compile it with `g++ -g -O2 -lpthread cpp-locking.cpp -o cpp-locking`,
then record it with `perf record --call-graph dwarf -e
sched:sched_switch`.

When you analyze it with `perf script` and libunwind, you should see:

```
cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120
ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux)
7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so)
7f38e4252d0b arena_get2.part.4+0x2fb (/usr/lib/libc-2.28.so)
7f38e4255b1c tcache_init.part.6+0xec (/usr/lib/libc-2.28.so)
7f38e42569e5 __GI___libc_malloc+0x115 (inlined)
7f38e4241790 __GI__IO_file_doallocate+0x90 (inlined)
7f38e424fbbf __GI__IO_doallocbuf+0x4f (inlined)
7f38e424ee47 __GI__IO_file_overflow+0x197 (inlined)
7f38e424df36 _IO_new_file_xsputn+0x116 (inlined)
7f38e4242bfb __GI__IO_fwrite+0xdb (inlined)
7f38e463fa6d std::basic_streambuf<char, std::char_traits<char> >::sputn(char const*, long)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> >::_M_put(char const*, long)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::__write<char>(std::ostreambuf_iterator<char, std::char_traits<char> >, char const*, int)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<c>
7f38e464bd70 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, double) const+0x90 (inl>
7f38e464bd70 std::ostream& std::ostream::_M_insert<double>(double)+0x90 (/usr/lib/libstdc++.so.6.0.25)
563b9cb502f7 std::ostream::operator<<(double)+0xb7 (inlined)
563b9cb502f7 worker()+0xb7 (/ssd/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-locking/cpp-locking)
563b9cb506fb double std::__invoke_impl<double, double (*)()>(std::__invoke_other, double (*&&)())+0x2b (inlined)
563b9cb506fb std::__invoke_result<double (*)()>::type std::__invoke<double (*)()>(double (*&&)())+0x2b (inlined)
563b9cb506fb decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<double (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x2b (inlined)
563b9cb506fb std::thread::_Invoker<std::tuple<double (*)()> >::operator()()+0x2b (inlined)
563b9cb506fb std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<double>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<double (*)()> >, dou>
563b9cb506fb std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_>
563b9cb507e8 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const+0x28 (inlined)
563b9cb507e8 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*)+0x28 (/ssd/milian/>
7f38e46d24fe __pthread_once_slow+0xbe (/usr/lib/libpthread-2.28.so)
563b9cb51149 __gthread_once+0xe9 (inlined)
563b9cb51149 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*)>
563b9cb51149 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool)+0xe9 (inlined)
563b9cb51149 std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double (*)()> >&&)::{lambda()#1}::op>
563b9cb51149 void std::__invoke_impl<void, std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double>
563b9cb51149 std::__invoke_result<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<double (*)()> >>
563b9cb51149 decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_>
563b9cb51149 std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread::_Invoker<std::tuple<dou>
563b9cb51149 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<double (*)()> >, double>::_Async_state_impl(std::thread>
7f38e45f0062 execute_native_thread_routine+0x12 (/usr/lib/libstdc++.so.6.0.25)
7f38e46caa9c start_thread+0xfc (/usr/lib/libpthread-2.28.so)
7f38e42ccb22 __GI___clone+0x42 (inlined)
```

Before this patch, using libdwfl, you would see:

```
cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120
ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux)
7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so)
a041161e77950c5c [unknown] ([unknown])
```

With this patch applied, we get a bit further in unwinding:

```
cpp-locking 20038 [005] 54830.236589: sched:sched_switch: prev_comm=cpp-locking prev_pid=20038 prev_prio=120 prev_state=T ==> next_comm=swapper/5 next_pid=0 next_prio=120
ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb166fec5 __sched_text_start+0x545 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1670208 schedule+0x28 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb16737cc rwsem_down_read_failed+0xec (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1665e04 call_rwsem_down_read_failed+0x14 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb1672a03 down_read+0x13 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb106bd85 __do_page_fault+0x445 (/lib/modules/4.14.78-1-lts/build/vmlinux)
ffffffffb18015f5 page_fault+0x45 (/lib/modules/4.14.78-1-lts/build/vmlinux)
7f38e4252591 new_heap+0x101 (/usr/lib/libc-2.28.so)
7f38e4252d0b arena_get2.part.4+0x2fb (/usr/lib/libc-2.28.so)
7f38e4255b1c tcache_init.part.6+0xec (/usr/lib/libc-2.28.so)
7f38e42569e5 __GI___libc_malloc+0x115 (inlined)
7f38e4241790 __GI__IO_file_doallocate+0x90 (inlined)
7f38e424fbbf __GI__IO_doallocbuf+0x4f (inlined)
7f38e424ee47 __GI__IO_file_overflow+0x197 (inlined)
7f38e424df36 _IO_new_file_xsputn+0x116 (inlined)
7f38e4242bfb __GI__IO_fwrite+0xdb (inlined)
7f38e463fa6d std::basic_streambuf<char, std::char_traits<char> >::sputn(char const*, long)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> >::_M_put(char const*, long)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::__write<char>(std::ostreambuf_iterator<char, std::char_traits<char> >, char const*, int)+0x1cd (inlined)
7f38e463fa6d std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<c>
7f38e464bd70 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, double) const+0x90 (inl>
7f38e464bd70 std::ostream& std::ostream::_M_insert<double>(double)+0x90 (/usr/lib/libstdc++.so.6.0.25)
563b9cb502f7 std::ostream::operator<<(double)+0xb7 (inlined)
563b9cb502f7 worker()+0xb7 (/ssd/milian/projects/kdab/rnd/hotspot/build/tests/test-clients/cpp-locking/cpp-locking)
6eab825c1ee3e4ff [unknown] ([unknown])
```

Note that the backtrace is still stopping too early, when compared to
the nice results obtained via libunwind. It's unclear so far what the
reason for that is.

Committer note:

Further comment by Milian on the thread started on the Link: tag below:

---
The remaining issue is due to a bug in elfutils:

https://sourceware.org/ml/elfutils-devel/2018-q4/msg00089.html

With both patches applied, libunwind and elfutils produce the same output for
the above scenario.
---

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181029141644.3907-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2a9d5050 03-Jul-2018 Sandipan Das <sandipan@linux.ibm.com>

perf script: Show correct offsets for DWARF-based unwinding

When perf/data is recorded with the dwarf call-graph option, the
callchain shown by 'perf script' still shows the binary offsets of the
userspace symbols instead of their virtual addresses. Since the symbol
offset calculation is based on using virtual address as the ip, we see
incorrect offsets as well.

The use of virtual addresses affects the ability to find out the
line number in the corresponding source file to which an address
maps to as described in commit 67540759151a ("perf unwind: Use
addr_location::addr instead of ip for entries").

This has also been addressed by temporarily converting the virtual
address to the correponding binary offset so that it can be mapped
to the source line number correctly.

This is a follow-up for commit 19610184693c ("perf script: Show
virtual addresses instead of offsets").

This can be verified on a powerpc64le system running Fedora 27 as
shown below:

# perf probe -x /usr/lib64/libc-2.26.so -a inet_pton
# perf record -e probe_libc:inet_pton --call-graph=dwarf ping -6 -c 1 ::1

Before:

# perf report --stdio --no-children -s sym,srcline -g address

# Samples: 1 of event 'probe_libc:inet_pton'
# Event count (approx.): 1
#
# Overhead Symbol Source:Line
# ........ .................... ...........
#
100.00% [.] __GI___inet_pton inet_pton.c
|
---gaih_inet getaddrinfo.c:537 (inlined)
__GI_getaddrinfo getaddrinfo.c:2304 (inlined)
main ping.c:519
generic_start_main libc-start.c:308 (inlined)
__libc_start_main libc-start.c:102
...

# perf script -F comm,ip,sym,symoff,srcline,dso

ping
15af28 __GI___inet_pton+0xffff000099160008 (/usr/lib64/libc-2.26.so)
libc-2.26.so[ffff80004ca0af28]
10fa53 gaih_inet+0xffff000099160f43
libc-2.26.so[ffff80004c9bfa53] (inlined)
1105b3 __GI_getaddrinfo+0xffff000099160163
libc-2.26.so[ffff80004c9c05b3] (inlined)
2d6f main+0xfffffffd9f1003df (/usr/bin/ping)
ping[fffffffecf882d6f]
2369f generic_start_main+0xffff00009916013f
libc-2.26.so[ffff80004c8d369f] (inlined)
23897 __libc_start_main+0xffff0000991600b7 (/usr/lib64/libc-2.26.so)
libc-2.26.so[ffff80004c8d3897]

After:

# perf report --stdio --no-children -s sym,srcline -g address

# Samples: 1 of event 'probe_libc:inet_pton'
# Event count (approx.): 1
#
# Overhead Symbol Source:Line
# ........ .................... ...........
#
100.00% [.] __GI___inet_pton inet_pton.c
|
---gaih_inet.constprop.7 getaddrinfo.c:537
getaddrinfo getaddrinfo.c:2304
main ping.c:519
generic_start_main.isra.0 libc-start.c:308
__libc_start_main libc-start.c:102
...

# perf script -F comm,ip,sym,symoff,srcline,dso

ping
7fffb38aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
inet_pton.c:68
7fffb385fa53 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so)
getaddrinfo.c:537
7fffb38605b3 getaddrinfo+0x163 (/usr/lib64/libc-2.26.so)
getaddrinfo.c:2304
130782d6f main+0x3df (/usr/bin/ping)
ping.c:519
7fffb377369f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so)
libc-start.c:308
7fffb3773897 __libc_start_main+0xb7 (/usr/lib64/libc-2.26.so)
libc-start.c:102

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: 67540759151a ("perf unwind: Use addr_location::addr instead of ip for entries")
Link: http://lkml.kernel.org/r/20180703120555.32971-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 404eb5a4 26-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread: Make thread__find_map() search all maps

We still have the split internally, but users don't see it anymore,
simplifying the growing number of cases where we end up searching
in the MAP__VARIABLE maps.

This further paves the way for ditching the split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-86mfxrztf310konutxvhr5ua@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 71a84b5a 24-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread: Make thread__find_map() return the map

It was returning the searched map just on the addr_location passed, with
the function itself returning void.

Make it return the map so that we can make the code more compact.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tzlrrzdeoof4i6ktyqv1t6ks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4546263d 24-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread: Introduce thread__find_symbol()

Out of thread__find_addr_location(..., MAP__FUNCTION, ...), idea here is to
continue removing references to MAP__{FUNCTION,VARIABLE} ahead of
getting both types of symbols in the same rbtree, as various places do
two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE.

So thread__find_symbol() will eventually do just that, and 'struct
symbol' will have the symbol type, for code that cares about that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-n7528en9e08yd3flzmb26tth@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# f07a2d32 24-Apr-2018 Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread: Introduce thread__find_map()

Out of thread__find_add_map(..., MAP__FUNCTION, ...), idea here is to
continue removing references to MAP__{FUNCTION,VARIABLE} ahead of
getting both types of symbols in the same rbtree, as various places do
two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE.

So thread__find_map() will eventually do just that, and 'struct symbol'
will have the symbol type, for code that cares about that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-q27xee34l4izpfau49w103s6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 555fc3b1 18-Mar-2018 Martin Vuille <jpmv27@aim.com>

perf unwind: Report error from dwfl_attach_state

In verbose level 2, errors returned by libdw are reported in most cases,
but not when calling dwfl_attach_state.

Since elfutils v 0.160 (2014), dwfl_attach_state sets the error code to
report failure cause. On failure, log the reported error.

Signed-off-by: Martin Vuille <jpmv27@aim.com>
Reviewed-by: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180318175053.4222-1-jpmv27@aim.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 3d20c624 11-Feb-2018 Martin Vuille <jpmv27@aim.com>

perf unwind: Unwind with libdw doesn't take symfs into account

Path passed to libdw for unwinding doesn't include symfs path
if specified, so unwinding fails because ELF file is not found.

Similar to unwinding with libunwind, pass symsrc_filename instead
of long_name. If there is no symsrc_filename, fallback to long_name.

Signed-off-by: Martin Vuille <jpmv27@aim.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180211212420.18388-1-jpmv27@aim.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 9126cbba 02-Jun-2017 Milian Wolff <milian.wolff@kdab.com>

perf unwind: Report module before querying isactivation in dwfl unwind

The PC returned by dwfl_frame_pc() may map into a not-yet-reported
module. We have to report it before we continue unwinding. But when we
query for the isactivation flag in dwfl_frame_pc, libdw will actually do
one more unwinding step internally which can then break and lead to
missed frames or broken stacks.

With libunwind we get e.g.:

~~~~~
heaptrack_gui 2228 135073.400474: 613969 cycles:
108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
20439 __libc_start_main (/usr/lib/libc-2.25.so)
78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)

heaptrack_gui 2228 135073.401156: 569521 cycles:
131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0)
f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0)
298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
20439 __libc_start_main (/usr/lib/libc-2.25.so)
78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
~~~~~

Note the two frames 1589e8 and 78622 in the first sample. These are
missing when unwinding with libdw. The second sample's breakage is
more obvious:

~~~~~
heaptrack_gui 2228 135073.400474: 613969 cycles:
108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
20439 __libc_start_main (/usr/lib/libc-2.25.so)
78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)

heaptrack_gui 2228 135073.401156: 569521 cycles:
131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
723dbf [unknown] ([unknown])
~~~~~

This patch fixes this issue and the libdw unwinder mimicks the libunwind
behavior more closely.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20170602143753.16907-2-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 2538b9e2 02-Jun-2017 Milian Wolff <milian.wolff@kdab.com>

perf report: Ensure the perf DSO mapping matches what libdw sees

In some situations the libdw unwinder stopped working properly. I.e.
with libunwind we see:

~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
f199 call_init.part.0 (/usr/lib/ld-2.25.so)
f2a5 _dl_init (/usr/lib/ld-2.25.so)
db9 _dl_start_user (/usr/lib/ld-2.25.so)
~~~~~

But with libdw and without this patch this sample is not properly
unwound:

~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
~~~~~

Debug output showed me that libdw found a module for the last frame
address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
double-checks what libdw sees and what perf knows. If the mappings
mismatch, we now report the elf known to perf. This fixes the situation
above, and the libdw unwinder produces the same stack as libunwind.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 5ea0416f 01-Jun-2017 Milian Wolff <milian.wolff@kdab.com>

perf report: Include partial stacks unwound with libdw

So far the whole stack was thrown away when any error occurred before
the maximum stack depth was unwound. This is actually a very common
scenario though. The stacks that got unwound so far are still
interesting. This removes a large chunk of differences when comparing
perf script output for libunwind and libdw perf unwinding.

E.g. with libunwind:

~~~~~
heaptrack_gui 2228 135073.388524: 479408 cycles:
ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms])
ffffffff81181662 perf_event_mmap ([kernel.kallsyms])
ffffffff811cf5ed mmap_region ([kernel.kallsyms])
ffffffff811cfe6b do_mmap ([kernel.kallsyms])
ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms])
ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
ffffffff81033acb sys_mmap ([kernel.kallsyms])
ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])
192ca mmap64 (/usr/lib/ld-2.25.so)
59a9 _dl_map_object_from_fd (/usr/lib/ld-2.25.so)
83d0 _dl_map_object (/usr/lib/ld-2.25.so)
cda1 openaux (/usr/lib/ld-2.25.so)
1834f _dl_catch_error (/usr/lib/ld-2.25.so)
cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
3481 dl_main (/usr/lib/ld-2.25.so)
17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
4d37 _dl_start (/usr/lib/ld-2.25.so)
d87 _start (/usr/lib/ld-2.25.so)

heaptrack_gui 2228 135073.388677: 611329 cycles:
1a3e0 strcmp (/usr/lib/ld-2.25.so)
82b2 _dl_map_object (/usr/lib/ld-2.25.so)
cda1 openaux (/usr/lib/ld-2.25.so)
1834f _dl_catch_error (/usr/lib/ld-2.25.so)
cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
3481 dl_main (/usr/lib/ld-2.25.so)
17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
4d37 _dl_start (/usr/lib/ld-2.25.so)
d87 _start (/usr/lib/ld-2.25.so)
~~~~~

With libdw without this patch:

~~~~~
heaptrack_gui 2228 135073.388524: 479408 cycles:
ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms])
ffffffff81181662 perf_event_mmap ([kernel.kallsyms])
ffffffff811cf5ed mmap_region ([kernel.kallsyms])
ffffffff811cfe6b do_mmap ([kernel.kallsyms])
ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms])
ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
ffffffff81033acb sys_mmap ([kernel.kallsyms])
ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])

heaptrack_gui 2228 135073.388677: 611329 cycles:
~~~~~

With this patch applied, the libdw unwinder will produce the same
output as the libunwind unwinder.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20170601210021.20046-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 1982ad48 24-May-2017 Milian Wolff <milian.wolff@kdab.com>

perf report: Fix off-by-one for non-activation frames

As the documentation for dwfl_frame_pc says, frames that
are no activation frames need to have their program counter
decremented by one to properly find the function of the caller.

This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:

~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>

using namespace std;

int main()
{
this_thread::sleep_for(chrono::milliseconds(1000));
this_thread::sleep_for(chrono::milliseconds(100));
this_thread::sleep_for(chrono::milliseconds(10));

return 0;
}
~~~~~~~~~~~~~~~

Now compile and record it:

~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~

Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:

~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........

100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~

With this patch here applied, the issue is fixed. The report becomes
much more usable:

~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........

100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~

Similarly it works for signal frames:

~~~~~~~~~~~~~~~
__noinline void bar(void)
{
volatile long cnt = 0;

for (cnt = 0; cnt < 100000000; cnt++);
}

__noinline void foo(void)
{
bar();
}

void sig_handler(int sig)
{
foo();
}

int main(void)
{
signal(SIGUSR1, sig_handler);
raise(SIGUSR1);

foo();
return 0;
}
~~~~~~~~~~~~~~~~

Before, the report wrongly points to `signal.c:29` after raise():

~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~

With this patch in, the issue is fixed and we instead get:

~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~

Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 9a3993d4 18-Apr-2017 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Move path related functions to util/path.h

Disentangling util.h header mess a bit more.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-aj6je8ly377i4upedmjzdsq6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 67540759 16-Aug-2016 Milian Wolff <milian.wolff@kdab.com>

perf unwind: Use addr_location::addr instead of ip for entries

This fixes the srcline translation for call chains of user space
applications.

Before we got:

perf report --stdio --no-children -s sym,srcline -g address
8.92% [.] main mandelbrot.h:41
|
|--3.70%--main +8390240
| __libc_start_main +139950056726769
| _start +8388650
|
|--2.74%--main +8390189
|
--2.08%--main +8390296
__libc_start_main +139950056726769
_start +8388650

7.59% [.] main complex:1326
|
|--4.79%--main +8390203
| __libc_start_main +139950056726769
| _start +8388650
|
--2.80%--main +8390219

7.12% [.] __muldc3 libgcc2.c:1945
|
|--3.76%--__muldc3 +139950060519490
| main +8390224
| __libc_start_main +139950056726769
| _start +8388650
|
--3.32%--__muldc3 +139950060519512
main +8390224

With this patch applied, we instead get:

perf report --stdio --no-children -s sym,srcline -g address
8.92% [.] main mandelbrot.h:41
|
|--3.70%--main mandelbrot.h:41
| __libc_start_main +241
| _start +4194346
|
|--2.74%--main mandelbrot.h:41
|
--2.08%--main mandelbrot.h:41
__libc_start_main +241
_start +4194346

7.59% [.] main complex:1326
|
|--4.79%--main complex:1326
| __libc_start_main +241
| _start +4194346
|
--2.80%--main complex:1326

7.12% [.] __muldc3 libgcc2.c:1945
|
|--3.76%--__muldc3 libgcc2.c:1945
| main mandelbrot.h:39
| __libc_start_main +241
| _start +4194346
|
--3.32%--__muldc3 libgcc2.c:1945
main mandelbrot.h:39

Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LPU-Reference: 20160816153926.11288-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 0ba98149 07-Jan-2016 Jiri Olsa <jolsa@kernel.org>

perf libdw: Check for mmaps also in MAP__VARIABLE tree

We've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.

Reported-and-Tested-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 8bd508b0 19-Nov-2015 Jiri Olsa <jolsa@redhat.com>

perf callchain: Add order support for libdw DWARF unwinder

As reported by Milian, currently for DWARF unwind (both libdw and
libunwind) we display callchain in callee order only.

Adding the support to follow callchain order setup to libdw DWARF
unwinder, so we could get following output for report:

$ perf record --call-graph dwarf ls
...

$ perf report --no-children --stdio

21.12% ls libc-2.21.so [.] __strcoll_l
|
---__strcoll_l
mpsort_with_tmp
mpsort_with_tmp
mpsort_with_tmp
sort_files
main
__libc_start_main
_start

$ perf report --stdio --no-children -g caller

21.12% ls libc-2.21.so [.] __strcoll_l
|
---_start
__libc_start_main
main
sort_files
mpsort_with_tmp
mpsort_with_tmp
mpsort_with_tmp
__strcoll_l

Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jan Kratochvil <jkratoch@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20151119130119.GA26617@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# dd8c17a5 23-Oct-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf callchains: Use thread->mg->machine

The unwind__get_entries() already receives the thread parameter, from where it can
obtain the matching machine structure, shorten the signature.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-isjc6bm8mv4612mhi6af64go@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# bb871a9c 22-Oct-2014 Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: A thread's machine can be found via thread->mg->machine

So stop passing both machine and thread to several thread methods,
reducing function signature length.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ckcy19dcp1jfkmdihdjcqdn1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 84f5d36f 14-Jul-2014 Jiri Olsa <jolsa@kernel.org>

perf tools: Move pr_* debug macros into debug object

Moving pr_* debug macros to have it with in same object as debug
variables, becase we will change them to use verbose variable in next
patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1405374411-29012-3-git-send-email-jolsa@kernel.org
[ Add missing debug.h include in python scripting glue and in the libdw unwind lib ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# d944c4ee 25-Apr-2014 Borislav Petkov <bp@suse.de>

tools: Consolidate types.h

Combine all definitions into a common tools/include/linux/types.h and
kill the wild growth elsewhere. Move DECLARE_BITMAP to its proper
bitmap.h header.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/n/tip-azczs7qcv6h9xek9od10hiv2@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>


# 5ea84154 19-Feb-2014 Jiri Olsa <jolsa@redhat.com>

perf tools: Add libdw DWARF post unwind support

Adding libdw DWARF post unwind support, which is part of
elfutils-devel/libdw-dev package from version 0.158.

The new code is contained in unwin-libdw.c object, and implements
unwind__get_entries unwind interface function.

New Makefile variable NO_LIBDW_DWARF_UNWIND was added to control its
compilation, and is marked as disabled now. It's factored with the rest
of the Makefile unwind build code in the next patch.

Arch specific code was added for x86.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1392825179-5228-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>