History log of /linux-master/tools/perf/util/annotate-data.c
Revision Date Author Comments
# bc10db8e 16-Jan-2024 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Support stack variables

Local variables are allocated in the stack and the location list
should look like base register(s) and an offset. Extend the
die_find_variable_by_reg() to handle the following expressions

* DW_OP_breg{0..31}
* DW_OP_bregx
* DW_OP_fbreg

Ususally DWARF subprogram entries have frame base information and
use it to locate stack variable like below:

<2><43d1575>: Abbrev Number: 62 (DW_TAG_variable)
<43d1576> DW_AT_location : 2 byte block: 91 7c (DW_OP_fbreg: -4) <--- here
<43d1579> DW_AT_name : (indirect string, offset: 0x2c00c9): i
<43d157d> DW_AT_decl_file : 1
<43d157e> DW_AT_decl_line : 78
<43d157f> DW_AT_type : <0x43d19d7>

I found some differences on saving the frame base between gcc and clang.
The gcc uses the CFA to get the base so it needs to check the current
frame's CFI info. In this case, stack offset needs to be adjusted from
the start of the CFA.

<1><1bb8d>: Abbrev Number: 102 (DW_TAG_subprogram)
<1bb8e> DW_AT_name : (indirect string, offset: 0x74d41): kernel_init
<1bb92> DW_AT_decl_file : 2
<1bb92> DW_AT_decl_line : 1440
<1bb94> DW_AT_decl_column : 18
<1bb95> DW_AT_prototyped : 1
<1bb95> DW_AT_type : <0xcc>
<1bb99> DW_AT_low_pc : 0xffffffff81bab9e0
<1bba1> DW_AT_high_pc : 0x1b2
<1bba9> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) <------ here
<1bbab> DW_AT_call_all_calls: 1
<1bbab> DW_AT_sibling : <0x1bf5a>

While clang sets it to a register directly and it can check the register
and offset in the instruction directly.

<1><43d1542>: Abbrev Number: 60 (DW_TAG_subprogram)
<43d1543> DW_AT_low_pc : 0xffffffff816a7c60
<43d154b> DW_AT_high_pc : 0x98
<43d154f> DW_AT_frame_base : 1 byte block: 56 (DW_OP_reg6 (rbp)) <---------- here
<43d1551> DW_AT_GNU_all_call_sites: 1
<43d1551> DW_AT_name : (indirect string, offset: 0x3bce91): foo
<43d1555> DW_AT_decl_file : 1
<43d1556> DW_AT_decl_line : 75
<43d1557> DW_AT_prototyped : 1
<43d1557> DW_AT_type : <0x43c7332>
<43d155b> DW_AT_external : 1

Also it needs to update the offset after finding the type like global
variables since the offset was from the frame base. Factor out
match_var_offset() to check global and local variables in the same way.

The type stats are improved too:

Annotate data type stats:
total 294, ok 160 (54.4%), bad 134 (45.6%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
51 : no_var
14 : no_typeinfo
7 : bad_offset

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-9-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# 5f7cdde8 16-Jan-2024 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Support global variables

Global variables are accessed using PC-relative address so it needs to
be handled separately. The PC-rel addressing is detected by using
DWARF_REG_PC. On x86, %rip register would be used.

The address can be calculated using the ip and offset in the
instruction. But it should start from the next instruction so add
calculate_pcrel_addr() to do it properly.

But global variables defined in a different file would only have a
declaration which doesn't include a location list. So it first tries
to get the type info using the address, and then looks up the variable
declarations using name. The name of global variables should be get
from the symbol table. The declaration would have the type info.

So extend find_var_type() to take both address and name for global
variables.

The stat is now looks like:

Annotate data type stats:
total 294, ok 153 (52.0%), bad 141 (48.0%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
61 : no_var
10 : no_typeinfo
8 : bad_offset

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# 83bfa06d 16-Jan-2024 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Handle PC-relative addressing

Extend find_data_type_die() to find data type from PC-relative address
using die_find_variable_by_addr(). Users need to pass the address for
the (global) variable.

The offset for the variable should be updated after finding the type
because the offset in the instruction is just to calcuate the address
for the variable. So it changed to pass a pointer to offset and renamed
it to 'poffset'.

First it searches variables in the CU DIE as it's likely that the global
variables are defined in the file level. And then it iterates the scope
DIEs to find a local (static) variable.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# d3030191 16-Jan-2024 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Handle array style accesses

On x86, instructions for array access often looks like below.

mov 0x1234(%rax,%rbx,8), %rcx

Usually the first register holds the type information and the second one
has the index. And the current code only looks up a variable for the
first register. But it's possible to be in the other way around so it
needs to check the second register if the first one failed.

The stat changed like this.

Annotate data type stats:
total 294, ok 148 (50.3%), bad 146 (49.7%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
66 : no_var
10 : no_typeinfo
8 : bad_offset

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>


# 61a9741e 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate: Add --type-stat option for debugging

The --type-stat option is to be used with --data-type and to print
detailed failure reasons for the data type annotation.

$ perf annotate --data-type --type-stat
Annotate data type stats:
total 294, ok 116 (39.5%), bad 178 (60.5%)
-----------------------------------------------------------
30 : no_sym
40 : no_insn_ops
33 : no_mem_ops
63 : no_var
4 : no_typeinfo
8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-17-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 263925bf 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate: Add --data-type option

Support data type annotation with new --data-type option. It internally
uses type sort key to collect sample histogram for the type and display
every members like below.

$ perf annotate --data-type
...
Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
============================================================================
samples offset size field
13 0 640 struct cfs_rq {
2 0 16 struct load_weight load {
2 0 8 unsigned long weight;
0 8 4 u32 inv_weight;
};
0 16 8 unsigned long runnable_weight;
0 24 4 unsigned int nr_running;
1 28 4 unsigned int h_nr_running;
...

For simplicity it prints the number of samples per field for now.
But it should be easy to show the overhead percentage instead.

The number at the outer struct is a sum of the numbers of the inner
members. For example, struct cfs_rq got total 13 samples, and 2 came
from the load (struct load_weight) and 1 from h_nr_running. Similarly,
the struct load_weight got total 2 samples and they all came from the
weight field.

I've added two new flags in the symbol_conf for this. The
annotate_data_member is to get the members of the type. This is also
needed for perf report with typeoff sort key. The annotate_data_sample
is to update sample stats for each offset and used only in annotate.

Currently it only support stdio output mode, TUI support can be added
later.

Committer testing:

With the perf.data from the previous csets, a very simple, short
duration one:

# perf annotate --data-type
Annotate type: 'struct list_head' in [kernel.kallsyms] (1 samples):
============================================================================
samples offset size field
1 0 16 struct list_head {
0 0 8 struct list_head* next;
1 8 8 struct list_head* prev;
};

Annotate type: 'char' in [kernel.kallsyms] (1 samples):
============================================================================
samples offset size field
1 0 1 char ;

#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-15-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 9bd7ddd1 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Update sample histogram for type

The annotated_data_type__update_samples() to get histogram for data type
access.

It'll be called by perf annotate to show which fields in the data type
are accessed frequently.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 4a111cad 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Add member field in the data type

Add child member field if the current type is a composite type like a
struct or union. The member fields are linked in the children list and
do the same recursively if the child itself is a composite type.

Add 'self' member to the annotated_data_type to handle the members in
the same way.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# fc044c53 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Add dso->data_types tree

To aggregate accesses to the same data type, add 'data_types' tree in
DSO to maintain data types and find it by name and size.

It might have different data types that happen to have the same name,
so it also compares the size of the type.

Even if it doesn't 100% guarantee, it reduces the possibility of
mis-handling of such conflicts.

And I don't think it's common to have different types with the same
name.

Committer notes:

Very few cases on the Linux kernel, but there are some different types
with the same name, unsure if there is a debug mode in libbpf dedup that
warns about such cases, but there are provisions in pahole for that,
see:

"emit: Notice type shadowing, i.e. multiple types with the same name (enum, struct, union, etc)"
https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=4f332dbfd02072e4f410db7bdcda8d6e3422974b

$ pahole --compile > vmlinux.h
$ rm -f a ; make a
cc a.c -o a
$ grep __[0-9] vmlinux.h
union irte__1 {
struct map_info__1;
struct map_info__1 {
struct map_info__1 * next; /* 0 8 */
$

drivers/iommu/amd/amd_iommu_types.h 'union irte'
include/linux/dmar.h 'struct irte'

include/linux/device-mapper.h:

union map_info {
void *ptr;
};

include/linux/mtd/map.h:

struct map_info {
const char *name;
unsigned long size;
resource_size_t phys;
<SNIP>

kernel/events/uprobes.c:

struct map_info {
struct map_info *next;
struct mm_struct *mm;
unsigned long vaddr;
};

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# b9c87f53 12-Dec-2023 Namhyung Kim <namhyung@kernel.org>

perf annotate-data: Add find_data_type() to get type from memory access

The find_data_type() is to get a data type from the memory access at the
given address (IP) using a register and an offset.

It requires DWARF debug info in the DSO and searches the list of
variables and function parameters in the scope.

In a pseudo code, it does basically the following:

find_data_type(dso, ip, reg, offset)
{
pc = map__rip_2objdump(ip);
CU = dwarf_addrdie(dso->dwarf, pc);
scopes = die_get_scopes(CU, pc);
for_each_scope(S, scopes) {
V = die_find_variable_by_reg(S, pc, reg);
if (V && V.type == pointer_type) {
T = die_get_real_type(V);
if (offset < T.size)
return T;
}
}
return NULL;
}

Committer notes:

The 'size' variable in check_variable() is 64-bit, so use PRIu64 and
inttypes.h to debug it.

Ditto at find_data_type_die().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>