History log of /linux-master/tools/objtool/elf.c
Revision Date Author Comments
# f404a58d 04-Oct-2023 Aaron Plattner <aplattner@nvidia.com>

objtool: Remove max symbol name length limitation

If one of the symbols processed by read_symbols() happens to have a
.cold variant with a name longer than objtool's MAX_NAME_LEN limit, the
build fails.

Avoid this problem by just using strndup() to copy the parent function's
name, rather than strncpy()ing it onto the stack.

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Link: https://lore.kernel.org/r/41e94cfea1d9131b758dd637fecdeacd459d4584.1696355111.git.aplattner@nvidia.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 9f71fbcd 28-Jun-2023 Michal Kubecek <mkubecek@suse.cz>

objtool: initialize all of struct elf

Function elf_open_read() only zero initializes the initial part of
allocated struct elf; num_relocs member was recently added outside the
zeroed part so that it was left uninitialized, resulting in build failures
on some systems.

The partial initialization is a relic of times when struct elf had large
hash tables embedded. This is no longer the case so remove the trap and
initialize the whole structure instead.

Fixes: eb0481bbc4ce ("objtool: Fix reloc_hash size")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20230629102051.42E8360467@lion.mk-sys.cz


# b4c96ef0 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Skip reading DWARF section data

Objtool doesn't use DWARF at all, and the DWARF sections' data take up a
lot of memory. Skip reading them.

Note this only skips the DWARF base sections, not the rela sections.
The relas are needed because their symbol references may need to be
reindexed if any local symbols get added by elf_create_symbol().

Also note the DWARF data will eventually be read by libelf anyway, when
writing the object file. But that's fine, the goal here is to reduce
*peak* memory usage, and the previous patch (which freed insn memory)
gave some breathing room. So the allocation gets shifted to a later
time, resulting in lower peak memory usage.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 29.93G
- After: peak heap memory consumption: 25.47G

Link: https://lore.kernel.org/r/52a9698835861dd35f2ec35c49f96d0bb39fb177.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# ec24b927 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Get rid of reloc->rel[a]

Get the relocation entry info from the underlying rsec->data.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 35.12G
- After: peak heap memory consumption: 29.93G

Link: https://lore.kernel.org/r/2be32323de6d8cc73179ee0ff14b71f4e7cefaa0.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 02b54001 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Shrink elf hash nodes

Instead of using hlist for the 'struct elf' hashes, use a custom
single-linked list scheme.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 36.89G
- After: peak heap memory consumption: 35.12G

Link: https://lore.kernel.org/r/6e8cd305ed22e743c30d6e72cfdc1be20fb94cd4.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 890f10a4 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Shrink reloc->sym_reloc_entry

Convert it to a singly-linked list.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 38.64G
- After: peak heap memory consumption: 36.89G

Link: https://lore.kernel.org/r/a51f0a6f9bbf2494d5a3a449807307e78a940988.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 0696b6e3 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Get rid of reloc->addend

Get the addend from the embedded GElf_Rel[a] struct.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 42.10G
- After: peak heap memory consumption: 40.37G

Link: https://lore.kernel.org/r/ad2354f95d9ddd86094e3f7687acfa0750657784.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# fcee899d 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Get rid of reloc->type

Get the type from the embedded GElf_Rel[a] struct.

Link: https://lore.kernel.org/r/d1c1f8da31e4f052a2478aea585fcf355cacc53a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# e4cbb9b8 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Get rid of reloc->offset

Get the offset from the embedded GElf_Rel[a] struct.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 43.83G
- After: peak heap memory consumption: 42.10G

Link: https://lore.kernel.org/r/2b9ec01178baa346a99522710bf2e82159412e3a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# be9a4c11 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Get rid of reloc->idx

Use the array offset to calculate the reloc index.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 45.56G
- After: peak heap memory consumption: 43.83G

Link: https://lore.kernel.org/r/7351d2ebad0519027db14a32f6204af84952574a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# ebcef730 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Get rid of reloc->list

Now that all relocs are allocated in an array, the linked list is no
longer needed.

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 49.02G
- After: peak heap memory consumption: 45.56G

Link: https://lore.kernel.org/r/71e7a2c017dbc46bb497857ec97d67214f832d10.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# e0a9349b 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Allocate relocs in advance for new rela sections

Similar to read_relocs(), allocate the reloc structs all together in an
array rather than allocating them one at a time.

Link: https://lore.kernel.org/r/5332d845c5a2d6c2d052075b381bfba8bcb67ed5.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 5201a9bc 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Don't free memory in elf_close()

It's not necessary, objtool's about to exit anyway.

Link: https://lore.kernel.org/r/74bdb3058b8f029db8d5b3b5175f2a200804196d.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# fcf93355 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Keep GElf_Rel[a] structs synced

Keep the GElf_Rela structs synced with their 'struct reloc' counterparts
instead of having to go back and "rebuild" them later.

Link: https://lore.kernel.org/r/156d8a3e528a11e5c8577cf552890ed1f2b9567b.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 6342a20e 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Add elf_create_section_pair()

When creating an annotation section, allocate the reloc section data at
the beginning. This simplifies the data model a bit and also saves
memory due to the removal of malloc() in elf_rebuild_reloc_section().

With allyesconfig + CONFIG_DEBUG_INFO:

- Before: peak heap memory consumption: 53.49G
- After: peak heap memory consumption: 49.02G

Link: https://lore.kernel.org/r/048e908f3ede9b66c15e44672b6dda992b1dae3e.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# ff408273 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Add mark_sec_changed()

Ensure elf->changed always gets set when sec->changed gets set.

Link: https://lore.kernel.org/r/9a810a8d2e28af6ba07325362d0eb4703bb09d3a.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# eb0481bb 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Fix reloc_hash size

With CONFIG_DEBUG_INFO, DWARF creates a lot of relocations and
reloc_hash is woefully undersized, which can affect performance
significantly. Fix that.

Link: https://lore.kernel.org/r/38ef60dc8043270bf3b9dfd139ae2a30ca3f75cc.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 53257a97 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Consolidate rel/rela handling

The GElf_Rel[a] structs have more similarities than differences. It's
safe to hard-code the assumptions about their shared fields as they will
never change. Consolidate their handling where possible, getting rid of
duplicated code.

Also, at least for now we only ever create rela sections, so simplify
the relocation creation code to be rela-only.

Link: https://lore.kernel.org/r/dcabf6df400ca500ea929f1e4284f5e5ec0b27c8.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# a5bd6236 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Improve reloc naming

- The term "reloc" is overloaded to mean both "an instance of struct
reloc" and "a reloc section". Change the latter to "rsec".

- For variable names, use "sec" for regular sections and "rsec" for rela
sections to prevent them getting mixed up.

- For struct reloc variables, use "reloc" instead of "rel" everywhere
for consistency.

Link: https://lore.kernel.org/r/8b790e403df46f445c21003e7893b8f53b99a6f3.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 2707579d 30-May-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Remove flags argument from elf_create_section()

Simplify the elf_create_section() interface a bit by removing the flags
argument. Most callers don't care about changing the section header
flags. If needed, they can be modified afterwards, just like any other
section header field.

Link: https://lore.kernel.org/r/515235d9cf62637a14bee37bfa9169ef20065471.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 9290e772 12-Apr-2023 Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Add symbol iteration helpers

Add [sec_]for_each_sym() and use them.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/59023e5886ab125aa30702e633be7732b1acaa7e.1681325924.git.jpoimboe@kernel.org


# 8045b8f0 27-Dec-2022 Thomas Weißschuh <linux@weissschuh.net>

objtool: Allocate multiple structures with calloc()

By using calloc() instead of malloc() in a loop, libc does not have to
keep around bookkeeping information for each single structure.

This reduces maximum memory usage while processing vmlinux.o from
3153325 KB to 3035668 KB (-3.7%) on my notebooks "localmodconfig".

Note this introduces memory leaks, because some additional structs get
added to the lists later after reading the symbols and sections from the
original object. Luckily we don't really care about memory leaks in
objtool.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-3-17968f85a464@weissschuh.net
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>


# 86ea7f36 14-Nov-2022 Christophe Leroy <christophe.leroy@csgroup.eu>

objtool: Use target file class size instead of a compiled constant

In order to allow using objtool on cross-built kernels,
determine size of long from elf data instead of using
sizeof(long) at build time.

For the time being this covers only mcount.

[Sathvika Vasireddy: Rename variable "size" to "addrsize" and function
"elf_class_size()" to "elf_class_addrsize()", and modify
create_mcount_loc_sections() function to follow reverse christmas tree
format to order local variable declarations.]

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-11-sv@linux.ibm.com


# 19526717 02-Nov-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize elf_dirty_reloc_sym()

When moving a symbol in the symtab its index changes and any reloc
referring that symtol-table-index will need to be rewritten too.

In order to facilitate this, objtool simply marks the whole reloc
section 'changed' which will cause the whole section to be
re-generated.

However, finding the relocs that use any given symbol is implemented
rather crudely -- a fully iteration of all sections and their relocs.
Given that some builds have over 20k sections (kallsyms etc..)
iterating all that for *each* symbol moved takes a bit of time.

Instead have each symbol keep a list of relocs that reference it.

This *vastly* improves build times for certain configs.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Y2LlRA7x+8UsE1xf@hirez.programming.kicks-ass.net


# 9f2899fe 28-Oct-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Add option to generate prefix symbols

When code is compiled with:

-fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}

functions will have PADDING_BYTES of NOP in front of them. Unwinders
and other things that symbolize code locations will typically
attribute these bytes to the preceding function.

Given that these bytes nominally belong to the following symbol this
mis-attribution is confusing.

Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to
claim these bytes, allow objtool to emit __pfx_##name symbols to do
the same.

Therefore add the objtool --prefix=N argument, to conditionally place
a __pfx_##name symbol at N bytes ahead of symbol 'name' when: all
these preceding bytes are NOP and name-N is an instruction boundary.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.526899822@infradead.org


# 13f60e80 28-Oct-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf

Due to how gelf_update_sym*() requires an Elf_Data pointer, and how
libelf keeps Elf_Data in a linked list per section,
elf_update_symbol() ends up having to iterate this list on each
update to find the correct Elf_Data for the index'ed symbol.

By allocating one Elf_Data per new symbol, the list grows per new
symbol, giving an effective O(n^2) insertion time. This is obviously
bloody terrible.

Therefore over-allocate the Elf_Data when an extention is needed.
Except it turns out libelf disregards Elf_Scn::sh_size in favour of
the sum of Elf_Data::d_size. IOW it will happily write out all the
unused space and fill it with:

0000000000000000 0 NOTYPE LOCAL DEFAULT UND

entries (aka zeros). Which obviously violates the STB_LOCAL placement
rule, and is a general pain in the backside for not being the desired
behaviour.

Manually fix-up the Elf_Data size to avoid this problem before calling
elf_update().

This significantly improves performance when adding a significant
number of symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.461658986@infradead.org


# 4c91be8e 28-Oct-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Slice up elf_create_section_symbol()

In order to facilitate creation of more symbol types, slice up
elf_create_section_symbol() to extract a generic helper that deals
with adding ELF symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.396634875@infradead.org


# 5da6aea3 15-Sep-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Fix find_{symbol,func}_containing()

The current find_{symbol,func}_containing() functions are broken in
the face of overlapping symbols, exactly the case that is needed for a
new ibt/endbr supression.

Import interval_tree_generic.h into the tools tree and convert the
symbol tree to an interval tree to support proper range stabs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111146.330203761@infradead.org


# 5141d3a0 08-Sep-2022 Sami Tolvanen <samitolvanen@google.com>

objtool: Preserve special st_shndx indexes in elf_update_symbol

elf_update_symbol fails to preserve the special st_shndx values
between [SHN_LORESERVE, SHN_HIRESERVE], which results in it
converting SHN_ABS entries into SHN_UNDEF, for example. Explicitly
check for the special indexes and ensure these symbols are not
marked undefined.

Fixes: ead165fa1042 ("objtool: Fix symbol creation")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-17-samitolvanen@google.com


# 22682a07 16-May-2022 Mikulas Patocka <mpatocka@redhat.com>

objtool: Fix objtool regression on x32 systems

Commit c087c6e7b551 ("objtool: Fix type of reloc::addend") failed to
appreciate cross building from ILP32 hosts, where 'int' == 'long' and
the issue persists.

As such, use s64/int64_t/Elf64_Sxword for this field and suffer the
pain that is ISO C99 printf formats for it.

Fixes: c087c6e7b551 ("objtool: Fix type of reloc::addend")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
[peterz: reword changelog, s/long long/s64/]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2205161041260.11556@file01.intranet.prod.int.rdu2.redhat.com


# ead165fa 17-May-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Fix symbol creation

Nathan reported objtool failing with the following messages:

warning: objtool: no non-local symbols !?
warning: objtool: gelf_update_symshndx: invalid section index

The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs
vs weak symbols") failing to consider the case where an object would
have no non-local symbols.

The problem that commit tries to address is adding a STB_LOCAL symbol
to the symbol table in light of the ELF spec's requirement that:

In each symbol table, all symbols with STB_LOCAL binding preced the
weak and global symbols. As ``Sections'' above describes, a symbol
table section's sh_info section header member holds the symbol table
index for the first non-local symbol.

The approach taken is to find this first non-local symbol, move that
to the end and then re-use the freed spot to insert a new local symbol
and increment sh_info.

Except it never considered the case of object files without global
symbols and got a whole bunch of details wrong -- so many in fact that
it is a wonder it ever worked :/

Specifically:

- It failed to re-hash the symbol on the new index, so a subsequent
find_symbol_by_index() would not find it at the new location and a
query for the old location would now return a non-deterministic
choice between the old and new symbol.

- It failed to appreciate that the GElf wrappers are not a valid disk
format (it works because GElf is basically Elf64 and we only
support x86_64 atm.)

- It failed to fully appreciate how horrible the libelf API really is
and got the gelf_update_symshndx() call pretty much completely
wrong; with the direct consequence that if inserting a second
STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
again it would completely come unstuck.

Write a new elf_update_symbol() function that wraps all the magic
required to update or create a new symbol at a given index.

Specifically, gelf_update_sym*() require an @ndx argument that is
relative to the @data argument; this means you have to manually
iterate the section data descriptor list and update @ndx.

Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YoPCTEYjoPqE4ZxB@hirez.programming.kicks-ass.net


# 753da417 18-Apr-2022 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Remove --lto and --vmlinux in favor of --link

The '--lto' option is a confusing way of telling objtool to do stack
validation despite it being a linked object. It's no longer needed now
that an explicit '--stackval' option exists. The '--vmlinux' option is
also redundant.

Remove both options in favor of a straightforward '--link' option which
identifies a linked object.

Also, implicitly set '--link' with a warning if the user forgets to do
so and we can tell that it's a linked object. This makes it easier for
manual vmlinux runs.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/dcd3ceffd15a54822c6183e5766d21ad06082b45.1650300597.git.jpoimboe@redhat.com


# 2daf7fab 18-Apr-2022 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Reorganize cmdline options

Split the existing options into two groups: actions, which actually do
something; and options, which modify the actions in some way.

Also there's no need to have short flags for all the non-action options.
Reserve short flags for the more important actions.

While at it:

- change a few of the short flags to be more intuitive

- make option descriptions more consistently descriptive

- sort options in the source like they are when printed

- move options to a global struct

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/9dcaa752f83aca24b1b21f0b0eeb28a0c181c0b0.1650300597.git.jpoimboe@redhat.com


# 4abff6d4 17-Apr-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Fix code relocs vs weak symbols

Occasionally objtool driven code patching (think .static_call_sites
.retpoline_sites etc..) goes sideways and it tries to patch an
instruction that doesn't match.

Much head-scatching and cursing later the problem is as outlined below
and affects every section that objtool generates for us, very much
including the ORC data. The below uses .static_call_sites because it's
convenient for demonstration purposes, but as mentioned the ORC
sections, .retpoline_sites and __mount_loc are all similarly affected.

Consider:

foo-weak.c:

extern void __SCT__foo(void);

__attribute__((weak)) void foo(void)
{
return __SCT__foo();
}

foo.c:

extern void __SCT__foo(void);
extern void my_foo(void);

void foo(void)
{
my_foo();
return __SCT__foo();
}

These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):

foo-weak.o:
0000000000000000 <foo>:
0: e9 00 00 00 00 jmpq 5 <foo+0x5> 1: R_X86_64_PLT32 __SCT__foo-0x4

foo.o:
0000000000000000 <foo>:
0: 48 83 ec 08 sub $0x8,%rsp
4: e8 00 00 00 00 callq 9 <foo+0x9> 5: R_X86_64_PLT32 my_foo-0x4
9: 48 83 c4 08 add $0x8,%rsp
d: e9 00 00 00 00 jmpq 12 <foo+0x12> e: R_X86_64_PLT32 __SCT__foo-0x4

Now, when we link these two files together, you get something like
(ld -r -o foos.o foo-weak.o foo.o):

foos.o:
0000000000000000 <foo-0x10>:
0: e9 00 00 00 00 jmpq 5 <foo-0xb> 1: R_X86_64_PLT32 __SCT__foo-0x4
5: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1)
f: 90 nop

0000000000000010 <foo>:
10: 48 83 ec 08 sub $0x8,%rsp
14: e8 00 00 00 00 callq 19 <foo+0x9> 15: R_X86_64_PLT32 my_foo-0x4
19: 48 83 c4 08 add $0x8,%rsp
1d: e9 00 00 00 00 jmpq 22 <foo+0x12> 1e: R_X86_64_PLT32 __SCT__foo-0x4

Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when ran on
linked objects, but that seems to be working (fingers crossed).

So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + d
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 1d
000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1

So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.

*HOWEVER*, when the toolchain strips unused section symbols it
generates things like this (using new enough binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + d
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 foo + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 foo + d
000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1

And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.

This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!

As such, teach objtool to create a section symbol when there isn't
one.

Fixes: 44f6a7c0755d ("objtool: Fix seg fault with Clang non-section symbols")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.org


# c087c6e7 17-Apr-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Fix type of reloc::addend

Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means
that our reloc::addend needs to be long or face tuncation issues when
we do elf_rebuild_reloc_section():

- 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt+0x80000067
+ 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt-0x7fffff99

Fixes: 627fce14809b ("objtool: Add ORC unwind table generation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.596871927@infradead.org


# 4adb2368 08-Mar-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Ignore extra-symbol code

There's a fun implementation detail on linking STB_WEAK symbols. When
the linker combines two translation units, where one contains a weak
function and the other an override for it. It simply strips the
STB_WEAK symbol from the symbol table, but doesn't actually remove the
code.

The result is that when objtool is ran in a whole-archive kind of way,
it will encounter *heaps* of unused (and unreferenced) code. All
rudiments of weak functions.

Additionally, when a weak implementation is split into a .cold
subfunction that .cold symbol is left in place, even though completely
unused.

Teach objtool to ignore such rudiments by searching for symbol holes;
that is, code ranges that fall outside the given symbol bounds.
Specifically, ignore a sequence of unreachable instruction iff they
occupy a single hole, additionally ignore any .cold subfunctions
referenced.

Both ld.bfd and ld.lld behave like this. LTO builds otoh can (and do)
properly DCE weak functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.232019347@infradead.org


# f2d3a250 08-Mar-2022 Peter Zijlstra <peterz@infradead.org>

objtool: Add --dry-run

Add a --dry-run argument to skip writing the modifications. This is
convenient for debugging.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.282720146@infradead.org


# 988f0168 02-Dec-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Fix pv_ops noinstr validation

Boris reported that in one of his randconfig builds, objtool got
infinitely stuck. Turns out there's trivial list corruption in the
pv_ops tracking when a function is both in a static table and in a code
assignment.

Avoid re-adding function to the pv_ops[] lists when they're already on
it.

Fixes: db2b0c5d7b6f ("objtool: Support pv_opsindirect calls for noinstr")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20211202204534.GA16608@worktop.programming.kicks-ass.net


# 134ab5bd 26-Oct-2021 Peter Zijlstra <peterz@infradead.org>

objtool,x86: Replace alternatives with .retpoline_sites

Instead of writing complete alternatives, simply provide a list of all
the retpoline thunk calls. Then the kernel is free to do with them as
it pleases. Simpler code all-round.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211026120309.850007165@infradead.org


# 86e1e054 08-May-2021 Michael Forney <mforney@mforney.org>

objtool: Update section header before relocations

The libelf implementation from elftoolchain has a safety check in
gelf_update_rel[a] to check that the data corresponds to a section
that has type SHT_REL[A] [0]. If the relocation is updated before
the section header is updated with the proper type, this check
fails.

To fix this, update the section header first, before the relocations.
Previously, the section size was calculated in elf_rebuild_reloc_section
by counting the number of entries in the reloc_list. However, we
now need the size during elf_write so instead keep a running total
and add to it for every new relocation.

[0] https://sourceforge.net/p/elftoolchain/mailman/elftoolchain-developers/thread/CAGw6cBtkZro-8wZMD2ULkwJ39J+tHtTtAWXufMjnd3cQ7XG54g@mail.gmail.com/

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-2-mforney@mforney.org


# b46179d6 08-May-2021 Michael Forney <mforney@mforney.org>

objtool: Check for gelf_update_rel[a] failures

Otherwise, if these fail we end up with garbage data in the
.rela.orc_unwind_ip section, leading to errors like

ld: fs/squashfs/namei.o: bad reloc symbol index (0x7f16 >= 0x12) for offset 0x7f16d5c82cc8 in section `.orc_unwind_ip'

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-1-mforney@mforney.org


# fe255fe6 22-Aug-2021 Joe Lawrence <joe.lawrence@redhat.com>

objtool: Remove redundant 'len' field from struct section

The section structure already contains sh_size, so just remove the extra
'len' member that requires extra mirroring and potential confusion.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-3-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org


# d33b9035 11-Jun-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Improve reloc hash size guestimate

Nathan reported that LLVM ThinLTO builds have a performance regression
with commit 25cf0d8aa2a3 ("objtool: Rewrite hashtable sizing"). Sami
was quick to note that this is due to their use of -ffunction-sections.

As a result the .text section is small and basing the number of relocs
off of that no longer works. Instead have read_sections() compute the
sum of all SHF_EXECINSTR sections and use that.

Fixes: 25cf0d8aa2a3 ("objtool: Rewrite hashtable sizing")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Debugged-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lkml.kernel.org/r/YMJpGLuGNsGtA5JJ@hirez.programming.kicks-ass.net


# 25cf0d8a 06-May-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Rewrite hashtable sizing

Currently objtool has 5 hashtables and sizes them 16 or 20 bits
depending on the --vmlinux argument.

However, a single side doesn't really work well for the 5 tables,
which among them, cover 3 different uses. Also, while vmlinux is
larger, there is still a very wide difference between a defconfig and
allyesconfig build, which again isn't optimally covered by a single
size.

Another aspect is the cost of elf_hash_init(), which for large tables
dominates the runtime for small input files. It turns out that all it
does it assign NULL, something that is required when using malloc().
However, when we allocate memory using mmap(), we're guaranteed to get
zero filled pages.

Therefore, rewrite the whole thing to:

1) use more dynamic sized tables, depending on the input file,
2) avoid the need for elf_hash_init() entirely by using mmap().

This speeds up a regular kernel build (100s to 98s for
x86_64-defconfig), and potentially dramatically speeds up vmlinux
processing.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210506194157.452881700@infradead.org


# 584fd3b3 07-Jun-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Fix .symtab_shndx handling for elf_create_undef_symbol()

When an ELF object uses extended symbol section indexes (IOW it has a
.symtab_shndx section), these must be kept in sync with the regular
symbol table (.symtab).

So for every new symbol we emit, make sure to also emit a
.symtab_shndx value to keep the arrays of equal size.

Note: since we're writing an UNDEF symbol, most GElf_Sym fields will
be 0 and we can repurpose one (st_size) to host the 0 for the xshndx
value.

Fixes: 2f2f7e47f052 ("objtool: Add elf_create_undef_symbol()")
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net


# 46c7405d 12-May-2021 Vasily Gorbik <gor@linux.ibm.com>

objtool: Fix elf_create_undef_symbol() endianness

Currently x86 cross-compilation fails on big endian system with:

x86_64-cross-ld: init/main.o: invalid string offset 488112128 >= 6229 for section `.strtab'

Mark new ELF data in elf_create_undef_symbol() as symbol, so that libelf
does endianness handling correctly.

Fixes: 2f2f7e47f052 ("objtool: Add elf_create_undef_symbol()")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/patch-1.thread-6c9df9.git-d39264656387.your-ad-here.call-01620841104-ext-2554@work.hours


# 2f2f7e47 26-Mar-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Add elf_create_undef_symbol()

Allow objtool to create undefined symbols; this allows creating
relocations to symbols not currently in the symbol table.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.064743095@infradead.org


# 9a7827b7 26-Mar-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Extract elf_symbol_add()

Create a common helper to add symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.003468981@infradead.org


# 417a4dc9 26-Mar-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Extract elf_strtab_concat()

Create a common helper to append strings to a strtab.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.941474004@infradead.org


# d0c5c4cc 26-Mar-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Create reloc sections implicitly

Have elf_add_reloc() create the relocation section implicitly.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.880174448@infradead.org


# ef47cc01 26-Mar-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Add elf_create_reloc() helper

We have 4 instances of adding a relocation. Create a common helper
to avoid growing even more.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.817438847@infradead.org


# 3a647607 26-Mar-2021 Peter Zijlstra <peterz@infradead.org>

objtool: Rework the elf_rebuild_reloc_section() logic

Instead of manually calling elf_rebuild_reloc_section() on sections
we've called elf_add_reloc() on, have elf_write() DTRT.

This makes it easier to add random relocations in places without
carefully tracking when we're done and need to flush what section.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.754213408@infradead.org


# 7786032e 12-Nov-2020 Vasily Gorbik <gor@linux.ibm.com>

objtool: Rework header include paths

Currently objtool headers are being included either by their base name
or included via ../ from a parent directory. In case of a base name usage:

#include "warn.h"
#include "arch_elf.h"

it does not make it apparent from which directory the file comes from.
To make it slightly better, and actually to avoid name clashes some arch
specific files have "arch_" suffix. And files from an arch folder have
to revert to including via ../ e.g:
#include "../../elf.h"

With additional architectures support and the code base growth there is
a need for clearer headers naming scheme for multiple reasons:
1. to make it instantly obvious where these files come from (objtool
itself / objtool arch|generic folders / some other external files),
2. to avoid name clashes of objtool arch specific headers, potential
obtool arch generic headers and the system header files (there is
/usr/include/elf.h already),
3. to avoid ../ includes and improve code readability.
4. to give a warm fuzzy feeling to developers who are mostly kernel
developers and are accustomed to linux kernel headers arranging
scheme.

Doesn't this make it instantly obvious where are these files come from?

#include <objtool/warn.h>
#include <arch/elf.h>

And doesn't it look nicer to avoid ugly ../ includes? Which also
guarantees this is elf.h from the objtool and not /usr/include/elf.h.

#include <objtool/elf.h>

This patch defines and implements new objtool headers arranging
scheme. Which is:
- all generic headers go to include/objtool (similar to include/linux)
- all arch headers go to arch/$(SRCARCH)/include/arch (to get arch
prefix). This is similar to linux arch specific "asm/*" headers but we
are not abusing "asm" name and calling it what it is. This also helps
to prevent name clashes (arch is not used in system headers or kernel
exports).

To bring objtool to this state the following things are done:
1. current top level tools/objtool/ headers are moved into
include/objtool/ subdirectory,
2. arch specific headers, currently only arch/x86/include/ are moved into
arch/x86/include/arch/ and were stripped of "arch_" suffix,
3. new -I$(srctree)/tools/objtool/include include path to make
includes like <objtool/warn.h> possible,
4. rewriting file includes,
5. make git not to ignore include/objtool/ subdirectory.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# a1a664ec 12-Nov-2020 Martin Schwidefsky <schwidefsky@de.ibm.com>

objtool: Fix reloc generation on big endian cross-compiles

Relocations generated in elf_rebuild_rel[a]_reloc_section() are broken
if objtool is built and run on a big endian system.

The following errors pop up during x86 cross-compilation:

x86_64-9.1.0-ld: fs/efivarfs/inode.o: bad reloc symbol index (0x2000000 >= 0x22) for offset 0 in section `.orc_unwind_ip'
x86_64-9.1.0-ld: final link failed: bad value

Convert those functions to use gelf_update_rel[a](), similar to what
elf_write_reloc() does.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# 2d24dd57 29-Apr-2020 Peter Zijlstra <peterz@infradead.org>

rbtree: Add generic add and find helpers

I've always been bothered by the endless (fragile) boilerplate for
rbtree, and I recently wrote some rbtree helpers for objtool and
figured I should lift them into the kernel and use them more widely.

Provide:

partial-order; less() based:
- rb_add(): add a new entry to the rbtree
- rb_add_cached(): like rb_add(), but for a rb_root_cached

total-order; cmp() based:
- rb_find(): find an entry in an rbtree
- rb_find_add(): find an entry, and add if not found

- rb_find_first(): find the first (leftmost) matching entry
- rb_next_match(): continue from rb_find_first()
- rb_for_each(): iterate a sub-tree using the previous two

Inlining and constant propagation should see the compiler inline the
whole thing, including the various compare functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>


# 1d489151 14-Jan-2021 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Don't fail on missing symbol table

Thanks to a recent binutils change which doesn't generate unused
symbols, it's now possible for thunk_64.o be completely empty without
CONFIG_PREEMPTION: no text, no data, no symbols.

We could edit the Makefile to only build that file when
CONFIG_PREEMPTION is enabled, but that will likely create confusion
if/when the thunks end up getting used by some other code again.

Just ignore it and move on.

Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1254
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# a2e38dff 05-Jan-2021 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Don't add empty symbols to the rbtree

Building with the Clang assembler shows the following warning:

arch/x86/kernel/ftrace_64.o: warning: objtool: missing symbol for insn at offset 0x16

The Clang assembler strips section symbols. That ends up giving
objtool's find_func_containing() much more test coverage than normal.
Turns out, find_func_containing() doesn't work so well for overlapping
symbols:

2: 000000000000000e 0 NOTYPE LOCAL DEFAULT 2 fgraph_trace
3: 000000000000000f 0 NOTYPE LOCAL DEFAULT 2 trace
4: 0000000000000000 165 FUNC GLOBAL DEFAULT 2 __fentry__
5: 000000000000000e 0 NOTYPE GLOBAL DEFAULT 2 ftrace_stub

The zero-length NOTYPE symbols are inside __fentry__(), confusing the
rbtree search for any __fentry__() offset coming after a NOTYPE.

Try to avoid this problem by not adding zero-length symbols to the
rbtree. They're rare and aren't needed in the rbtree anyway.

One caveat, this actually might not end up being the right fix.
Non-empty overlapping symbols, if they exist, could have the same
problem. But that would need bigger changes, let's see if we can get
away with the easy fix for now.

Reported-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# 44f6a7c0 14-Dec-2020 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Fix seg fault with Clang non-section symbols

The Clang assembler likes to strip section symbols, which means objtool
can't reference some text code by its section. This confuses objtool
greatly, causing it to seg fault.

The fix is similar to what was done before, for ORC reloc generation:

e81e07244325 ("objtool: Support Clang non-section symbols in ORC generation")

Factor out that code into a common helper and use it for static call
reloc generation as well.

Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://github.com/ClangBuiltLinux/linux/issues/1207
Link: https://lkml.kernel.org/r/ba6b6c0f0dd5acbba66e403955a967d9fdd1726a.1607983452.git.jpoimboe@redhat.com


# 1e7e4788 18-Aug-2020 Josh Poimboeuf <jpoimboe@redhat.com>

x86/static_call: Add inline static call implementation for x86-64

Add the inline static call implementation for x86-64. The generated code
is identical to the out-of-line case, except we move the trampoline into
it's own section.

Objtool uses the trampoline naming convention to detect all the call
sites. It then annotates those call sites in the .static_call_sites
section.

During boot (and module init), the call sites are patched to call
directly into the destination function. The temporary trampoline is
then no longer used.

[peterz: merged trampolines, put trampoline in section]

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org


# fdabdd0b 12-Jun-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Provide elf_write_{insn,reloc}()

This provides infrastructure to rewrite instructions; this is
immediately useful for helping out with KCOV-vs-noinstr, but will
also come in handy for a bunch of variable sized jump-label patches
that are still on ice.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>


# 2b10be23 17-Apr-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Clean up elf_write() condition

With there being multiple ways to change the ELF data, let's more
concisely track modification.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>


# fb414783 29-May-2020 Matt Helsley <mhelsley@vmware.com>

objtool: Add support for relocations without addends

Currently objtool only collects information about relocations with
addends. In recordmcount, which we are about to merge into objtool,
some supported architectures do not use rela relocations.

Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Reviewed-by: Julien Thierry <jthierry@redhat.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# f1974222 29-May-2020 Matt Helsley <mhelsley@vmware.com>

objtool: Rename rela to reloc

Before supporting additional relocation types rename the relevant
types and functions from "rela" to "reloc". This work be done with
the following regex:

sed -e 's/struct rela/struct reloc/g' \
-e 's/\([_\*]\)rela\(s\{0,1\}\)/\1reloc\2/g' \
-e 's/tmprela\(s\{0,1\}\)/tmpreloc\1/g' \
-e 's/relasec/relocsec/g' \
-e 's/rela_list/reloc_list/g' \
-e 's/rela_hash/reloc_hash/g' \
-e 's/add_rela/add_reloc/g' \
-e 's/rela->/reloc->/g' \
-e '/rela[,\.]/{ s/\([^\.>]\)rela\([\.,]\)/\1reloc\2/g ; }' \
-e 's/rela =/reloc =/g' \
-e 's/relas =/relocs =/g' \
-e 's/relas\[/relocs[/g' \
-e 's/relaname =/relocname =/g' \
-e 's/= rela\;/= reloc\;/g' \
-e 's/= relas\;/= relocs\;/g' \
-e 's/= relaname\;/= relocname\;/g' \
-e 's/, rela)/, reloc)/g' \
-e 's/\([ @]\)rela\([ "]\)/\1reloc\2/g' \
-e 's/ rela$/ reloc/g' \
-e 's/, relaname/, relocname/g' \
-e 's/sec->rela/sec->reloc/g' \
-e 's/(\(!\{0,1\}\)rela/(\1reloc/g' \
-i \
arch.h \
arch/x86/decode.c \
check.c \
check.h \
elf.c \
elf.h \
orc_gen.c \
special.c

Notable exceptions which complicate the regex include gelf_*
library calls and standard/expected section names which still use
"rela" because they encode the type of relocation expected. Also, keep
"rela" in the struct because it encodes a specific type of relocation
we currently expect.

It will eventually turn into a member of an anonymous union when a
susequent patch adds implicit addend, or "rel", relocation support.

Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# 1e968bf5 21-Apr-2020 Sami Tolvanen <samitolvanen@google.com>

objtool: Use sh_info to find the base for .rela sections

ELF doesn't require .rela section names to match the base section. Use
the section index in sh_info to find the section instead of looking it
up by name.

LLD, for example, generates a .rela section that doesn't match the base
section name when we merge sections in a linker script for a binary
compiled with -ffunction-sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>


# e000acc1 15-Apr-2020 Kristen Carlson Accardi <kristen@linux.intel.com>

objtool: Do not assume order of parent/child functions

If a .cold function is examined prior to it's parent, the link
to the parent/child function can be overwritten when the parent
is examined. Only update pfunc and cfunc if they were previously
nil to prevent this from happening.

This fixes an issue seen when compiling with -ffunction-sections.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# 28fe1d7b 21-Apr-2020 Sami Tolvanen <samitolvanen@google.com>

objtool: use gelf_getsymshndx to handle >64k sections

Currently, objtool fails to load the correct section for symbols when
the index is greater than SHN_LORESERVE. Use gelf_getsymshndx instead
of gelf_getsym to handle >64k sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200421220843.188260-2-samitolvanen@google.com


# b490f453 24-Apr-2020 Miroslav Benes <mbenes@suse.cz>

objtool: Move the IRET hack into the arch decoder

Quoting Julien:

"And the other suggestion is my other email was that you don't even
need to add INSN_EXCEPTION_RETURN. You can keep IRET as
INSN_CONTEXT_SWITCH by default and x86 decoder lookups the symbol
conaining an iret. If it's a function symbol, it can just set the type
to INSN_OTHER so that it caries on to the next instruction after
having handled the stack_op."

Suggested-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200428191659.913283807@infradead.org


# bc359ff2 21-Apr-2020 Ingo Molnar <mingo@kernel.org>

objtool: Rename elf_read() to elf_open_read()

'struct elf *' handling is an open/close paradigm, make sure the naming
matches that:

elf_open_read()
elf_write()
elf_close()

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200422103205.61900-3-mingo@kernel.org


# 894e48ca 21-Apr-2020 Ingo Molnar <mingo@kernel.org>

objtool: Constify 'struct elf *' parameters

In preparation to parallelize certain parts of objtool, map out which uses
of various data structures are read-only vs. read-write.

As a first step constify 'struct elf' pointer passing, most of the secondary
uses of it in find_symbol_*() methods are read-only.

Also, while at it, better group the 'struct elf' handling methods in elf.h.

Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200422103205.61900-2-mingo@kernel.org


# 7f9b34f3 03-Apr-2020 Julien Thierry <jthierry@redhat.com>

objtool: Fix off-by-one in symbol_by_offset()

Sometimes, WARN_FUNC() and other users of symbol_by_offset() will
associate the first instruction of a symbol with the symbol preceding
it. This is because symbol->offset + symbol->len is already outside of
the symbol's range.

Fixes: 2a362ecc3ec9 ("objtool: Optimize find_symbol_*() and read_symbols()")
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>


# 34f7c96d 12-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize !vmlinux.o again

When doing kbuild tests to see if the objtool changes affected those I
found that there was a measurable regression:

pre post

real 1m13.594 1m16.488s
user 34m58.246s 35m23.947s
sys 4m0.393s 4m27.312s

Perf showed that for small files the increased hash-table sizes were a
measurable difference. Since we already have -l "vmlinux" to
distinguish between the modes, make it also use a smaller portion of
the hash-tables.

This flips it into a small win:

real 1m14.143s
user 34m49.292s
sys 3m44.746s

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200416115119.167588731@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 5377cae9 03-Apr-2020 Julien Thierry <jthierry@redhat.com>

objtool: Fix off-by-one in symbol_by_offset()

Sometimes, WARN_FUNC() and other users of symbol_by_offset() will
associate the first instruction of a symbol with the symbol preceding
it. This is because symbol->offset + symbol->len is already outside of
the symbol's range.

Fixes: 2a362ecc3ec9 ("objtool: Optimize find_symbol_*() and read_symbols()")
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 74b873e4 12-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize find_rela_by_dest_range()

Perf shows there is significant time in find_rela_by_dest(); this is
because we have to iterate the address space per byte, looking for
relocation entries.

Optimize this by reducing the address space granularity.

This reduces objtool on vmlinux.o runtime from 4.8 to 4.4 seconds.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.861321325@infradead.org


# 8b5fa6bc 12-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize read_sections()

Perf showed that __hash_init() is a significant portion of
read_sections(), so instead of doing a per section rela_hash, use an
elf-wide rela_hash.

Statistics show us there are about 1.1 million relas, so size it
accordingly.

This reduces the objtool on vmlinux.o runtime to a third, from 15 to 5
seconds.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.739153726@infradead.org


# cdb3d057 12-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize find_symbol_by_name()

Perf showed that find_symbol_by_name() takes time; add a symbol name
hash.

This shaves another second off of objtool on vmlinux.o runtime, down
to 15 seconds.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.676865656@infradead.org


# 53d20720 16-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Rename find_containing_func()

For consistency; we have:

find_symbol_by_offset() / find_symbol_containing()
find_func_by_offset() / find_containing_func()

fix that.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.558470724@infradead.org


# 2a362ecc 12-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize find_symbol_*() and read_symbols()

All of:

read_symbols(), find_symbol_by_offset(), find_symbol_containing(),
find_containing_func()

do a linear search of the symbols. Add an RB tree to make it go
faster.

This about halves objtool runtime on vmlinux.o, from 34s to 18s.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.499016559@infradead.org


# ae358196 12-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize find_section_by_name()

In order to avoid yet another linear search of (20k) sections, add a
name based hash.

This reduces objtool runtime on vmlinux.o by some 10s to around 35s.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.440174280@infradead.org


# 53038996 10-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize find_section_by_index()

In order to avoid a linear search (over 20k entries), add an
section_hash to the elf object.

This reduces objtool on vmlinux.o from a few minutes to around 45
seconds.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.381249993@infradead.org


# 1e11f3fd 12-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Add a statistics mode

Have it print a few numbers which can be used to size the hashtables.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.321381240@infradead.org


# 65fb11a7 10-Mar-2020 Peter Zijlstra <peterz@infradead.org>

objtool: Optimize find_symbol_by_index()

The symbol index is object wide, not per section, so it makes no sense
to have the symbol_hash be part of the section object. By moving it to
the elf object we avoid the linear sections iteration.

This reduces the runtime of objtool on vmlinux.o from over 3 hours (I
gave up) to a few minutes. The defconfig vmlinux.o has around 20k
sections.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200324160924.261852348@infradead.org


# 7acfe531 17-Feb-2020 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Improve call destination function detection

A recent clang change, combined with a binutils bug, can trigger a
situation where a ".Lprintk$local" STT_NOTYPE symbol gets created at the
same offset as the "printk" STT_FUNC symbol. This confuses objtool:

kernel/printk/printk.o: warning: objtool: ignore_loglevel_setup()+0x10: can't find call dest symbol at .text+0xc67

Improve the call destination detection by looking specifically for an
STT_FUNC symbol.

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/872
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25551
Link: https://lkml.kernel.org/r/0a7ee320bc0ea4469bd3dc450a7b4725669e0ea9.1581997059.git.jpoimboe@redhat.com


# e7c2bc37 17-Jul-2019 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Refactor jump table code

Now that C jump tables are supported, call them "jump tables" instead of
"switch tables". Also rename some other variables, add comments, and
simplify the code flow a bit.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/cf951b0c0641628e0b9b81f7ceccd9bcabcb4bd8.1563413318.git.jpoimboe@redhat.com


# e10cd8fe 17-Jul-2019 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Refactor function alias logic

- Add an alias check in validate_functions(). With this change, aliases
no longer need uaccess_safe set.

- Add an alias check in decode_instructions(). With this change, the
"if (!insn->func)" check is no longer needed.

- Don't create aliases for zero-length functions, as it can have
unexpected results. The next patch will spit out a warning for
zero-length functions anyway.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/26a99c31426540f19c9a58b9e10727c385a147bc.1563413318.git.jpoimboe@redhat.com


# 8e144797 10-Jul-2019 Michael Forney <mforney@mforney.org>

objtool: Rename elf_open() to prevent conflict with libelf from elftoolchain

The elftoolchain version of libelf has a function named elf_open().

The function name isn't quite accurate anyway, since it also reads all
the ELF data. Rename it to elf_read(), which is more accurate.

[ jpoimboe: rename to elf_read(); write commit description ]

Signed-off-by: Michael Forney <mforney@mforney.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/7ce2d1b35665edf19fd0eb6fbc0b17b81a48e62f.1562793604.git.jpoimboe@redhat.com


# 3c3ea503 10-Jul-2019 Michael Forney <mforney@mforney.org>

objtool: Use Elf_Scn typedef instead of assuming struct name

The libelf implementation might use a different struct name, and the
Elf_Scn typedef is already used throughout the rest of objtool.

Signed-off-by: Michael Forney <mforney@mforney.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/d270e1be2835fc2a10acf67535ff2ebd2145bf43.1562793448.git.jpoimboe@redhat.com


# 1ccea77e 19-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not see http www gnu org licenses

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details [based]
[from] [clk] [highbank] [c] you should have received a copy of the
gnu general public license along with this program if not see http
www gnu org licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 355 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 09f30d83 28-Feb-2019 Peter Zijlstra <peterz@infradead.org>

objtool: Handle function aliases

Function aliases result in different symbols for the same set of
instructions; track a canonical symbol so there is a unique point of
access.

This again prepares the way for function attributes. And in particular
the need for aliases comes from how KASAN uses them.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 22566c16 20-Nov-2018 Artem Savkov <asavkov@redhat.com>

objtool: Fix segfault in .cold detection with -ffunction-sections

Because find_symbol_by_name() traverses the same lists as
read_symbols(), changing sym->name in place without copying it affects
the result of find_symbol_by_name(). In the case where a ".cold"
function precedes its parent in sec->symbol_list, it can result in a
function being considered a parent of itself. This leads to function
length being set to 0 and other consequent side-effects including a
segfault in add_switch_table(). The effects of this bug are only
visible when building with -ffunction-sections in KCFLAGS.

Fix by copying the search string instead of modifying it in place.

Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions")
Link: http://lkml.kernel.org/r/910abd6b5a4945130fd44f787c24e07b9e07c8da.1542736240.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 0b9301fb 20-Nov-2018 Artem Savkov <asavkov@redhat.com>

objtool: Fix double-free in .cold detection error path

If read_symbols() fails during second list traversal (the one dealing
with ".cold" subfunctions) it frees the symbol, but never deletes it
from the list/hash_table resulting in symbol being freed again in
elf_close(). Fix it by just returning an error, leaving cleanup to
elf_close().

Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions")
Link: http://lkml.kernel.org/r/beac5a9b7da9e8be90223459dcbe07766ae437dd.1542736240.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# bcb6fb5d 31-Oct-2018 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Support GCC 9 cold subfunction naming scheme

Starting with GCC 8, a lot of unlikely code was moved out of line to
"cold" subfunctions in .text.unlikely.

For example, the unlikely bits of:

irq_do_set_affinity()

are moved out to the following subfunction:

irq_do_set_affinity.cold.49()

Starting with GCC 9, the numbered suffix has been removed. So in the
above example, the cold subfunction is instead:

irq_do_set_affinity.cold()

Tweak the objtool subfunction detection logic so that it detects both
GCC 8 and GCC 9 naming schemes.

Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/015e9544b1f188d36a7f02fa31e9e95629aa5f50.1541040800.git.jpoimboe@redhat.com


# 4a60aa05 07-Sep-2018 Allan Xavier <allan.x.xavier@oracle.com>

objtool: Support per-function rodata sections

Add support for processing switch jump tables in objects with multiple
.rodata sections, such as those created by '-ffunction-sections' and
'-fdata-sections'. Currently, objtool always looks in .rodata for jump
table information, which results in many "sibling call from callable
instruction with modified stack frame" warnings with objects compiled
using those flags.

The fix is comprised of three parts:

1. Flagging all .rodata sections when importing ELF information for
easier checking later.

2. Keeping a reference to the section each relocation is from in order
to get the list_head for the other relocations in that section.

3. Finding jump tables by following relocations to .rodata sections,
rather than always referencing a single global .rodata section.

The patch has been tested without data sections enabled and no
differences in the resulting orc unwind information were seen.

Note that as objtool adds terminators to end of each .text section the
unwind information generated between a function+data sections build and
a normal build aren't directly comparable. Manual inspection suggests
that objtool is now generating the correct information, or at least
making more of an effort to do so than it did previously.

Signed-off-by: Allan Xavier <allan.x.xavier@oracle.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/099bdc375195c490dda04db777ee0b95d566ded1.1536325914.git.jpoimboe@redhat.com


# 6d77d3b4 09-Jul-2018 Simon Ser <contact@emersion.fr>

objtool: Use '.strtab' if '.shstrtab' doesn't exist, to support ORC tables on Clang

Clang puts its section header names in the '.strtab' section instead of
'.shstrtab', which causes objtool to fail with a "can't find
.shstrtab section" warning when attempting to write ORC metadata to an
object file.

If '.shstrtab' doesn't exist, use '.strtab' instead.

Signed-off-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/d1c1c3fe55872be433da7bc5e1860538506229ba.1531153015.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 08b393d0 27-Jun-2018 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Support GCC 8 '-fnoreorder-functions'

Since the following commit:

cd77849a69cf ("objtool: Fix GCC 8 cold subfunction detection for aliased functions")

... if the kernel is built with EXTRA_CFLAGS='-fno-reorder-functions',
objtool can get stuck in an infinite loop.

That flag causes the new GCC 8 cold subfunctions to be placed in .text
instead of .text.unlikely. But it also has an unfortunate quirk: in the
symbol table, the subfunction (e.g., nmi_panic.cold.7) is nested inside
the parent (nmi_panic).

That function overlap confuses objtool, and causes it to get into an
infinite loop in next_insn_same_func(). Here's Allan's description of
the loop:

"Objtool iterates through the instructions in nmi_panic using
next_insn_same_func. Once it reaches the end of nmi_panic at 0x534 it
jumps to 0x528 as that's the start of nmi_panic.cold.7. However, since
the instructions starting at 0x528 are still associated with nmi_panic
objtool will get stuck in a loop, continually jumping back to 0x528
after reaching 0x534."

Fix it by shortening the length of the parent function so that the
functions no longer overlap.

Reported-and-analyzed-by: Allan Xavier <allan.x.xavier@oracle.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Allan Xavier <allan.x.xavier@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9e704c52bee651129b036be14feda317ae5606ae.1530136978.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 13810435 09-May-2018 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Support GCC 8's cold subfunctions

GCC 8 moves a lot of unlikely code out of line to "cold" subfunctions in
.text.unlikely. Properly detect the new subfunctions and treat them as
extensions of the original functions.

This fixes a bunch of warnings like:

kernel/cgroup/cgroup.o: warning: objtool: parse_cgroup_root_flags()+0x33: sibling call from callable instruction with modified stack frame
kernel/cgroup/cgroup.o: warning: objtool: cgroup_addrm_files()+0x290: sibling call from callable instruction with modified stack frame
kernel/cgroup/cgroup.o: warning: objtool: cgroup_apply_control_enable()+0x25b: sibling call from callable instruction with modified stack frame
kernel/cgroup/cgroup.o: warning: objtool: rebind_subsystems()+0x325: sibling call from callable instruction with modified stack frame

Reported-and-tested-by: damian <damian.tometzki@icloud.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0965e7fcfc5f31a276f0c7f298ff770c19b68706.1525923412.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 385d11b1 15-Jan-2018 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Improve error message for bad file argument

If a nonexistent file is supplied to objtool, it complains with a
non-helpful error:

open: No such file or directory

Improve it to:

objtool: Can't open 'foo': No such file or directory

Reported-by: Markus <M4rkusXXL@web.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/406a3d00a21225eee2819844048e17f68523ccf6.1516025651.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 97dab2ae 15-Sep-2017 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Fix object file corruption

Arnd Bergmann reported that a randconfig build was failing with the
following link error:

built-in.o: member arch/x86/kernel/time.o in archive is not an object

It turns out the link failed because the time.o file had been corrupted
by objtool:

nm: arch/x86/kernel/time.o: File format not recognized

In certain rare cases, when a .o file's ORC table is very small, the
.data section size doesn't change because it's page aligned. Because
all the existing sections haven't changed size, libelf doesn't detect
any section header changes, and so it doesn't update the section header
table properly. Instead it writes junk in the section header entries
for the new ORC sections.

Make sure libelf properly updates the section header table by setting
the ELF_F_DIRTY flag in the top level elf struct.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 627fce14809b ("objtool: Add ORC unwind table generation")
Link: http://lkml.kernel.org/r/e650fd0f2d8a209d1409a9785deb101fdaed55fb.1505459813.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# df968c93 15-Sep-2017 Petr Vandrovec <petr@vandrovec.name>

objtool: Do not retrieve data from empty sections

Binutils 2.29-9 in Debian return an error when elf_getdata is invoked
on empty section (.note.GNU-stack in all kernel files), causing
immediate failure of kernel build with:

elf_getdata: can't manipulate null section

As nothing is done with sections that have zero size, just do not
retrieve their data at all.

Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2ce30a44349065b70d0f00e71e286dc0cbe745e6.1505459652.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 0998b7a0 14-Sep-2017 Martin Kepplinger <martink@posteo.de>

objtool: Fix memory leak in elf_create_rela_section()

Let's free the allocated char array 'relaname' before returning,
in order to avoid leaking memory.

Signed-off-by: Martin Kepplinger <martink@posteo.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: mingo.kernel.org@gmail.com
Link: http://lkml.kernel.org/r/20170914060138.26472-1-martink@posteo.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 627fce14 11-Jul-2017 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Add ORC unwind table generation

Now that objtool knows the states of all registers on the stack for each
instruction, it's straightforward to generate debuginfo for an unwinder
to use.

Instead of generating DWARF, generate a new format called ORC, which is
more suitable for an in-kernel unwinder. See
Documentation/x86/orc-unwinder.txt for a more detailed description of
this new debuginfo format and why it's preferable to DWARF.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/c9b9f01ba6c5ed2bdc9bb0957b78167fdbf9632e.1499786555.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# baa41469 28-Jun-2017 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Implement stack validation 2.0

This is a major rewrite of objtool. Instead of only tracking frame
pointer changes, it now tracks all stack-related operations, including
all register saves/restores.

In addition to making stack validation more robust, this also paves the
way for undwarf generation.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/678bd94c0566c6129bcc376cddb259c4c5633004.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 5c51f4ae 02-Mar-2017 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Fix another GCC jump table detection issue

Arnd Bergmann reported a (false positive) objtool warning:

drivers/infiniband/sw/rxe/rxe_resp.o: warning: objtool: rxe_responder()+0xfe: sibling call from callable instruction with changed frame pointer

The issue is in find_switch_table(). It tries to find a switch
statement's jump table by walking backwards from an indirect jump
instruction, looking for a relocation to the .rodata section. In this
case it stopped walking prematurely: the first .rodata relocation it
encountered was for a variable (resp_state_name) instead of a jump
table, so it just assumed there wasn't a jump table.

The fix is to ignore any .rodata relocation which refers to an ELF
object symbol. This works because the jump tables are anonymous and
have no symbols associated with them.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection")
Link: http://lkml.kernel.org/r/20170302225723.3ndbsnl4hkqbne7a@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 774bec3f 13-Jul-2016 Arnaldo Carvalho de Melo <acme@redhat.com>

objtool: Add fallback from ELF_C_READ_MMAP to ELF_C_READ

Not all libelf implementations have this "Please, ELF_C_READ, but use
mmap if possible" elf_begin() cmd, so provide a fallback to plain old
ELF_C_READ.

Case in point: Alpine Linux 3.4.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-1fctuknrawgoi5xqon4mu9dv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 042ba73f 08-Mar-2016 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Add several performance improvements

Use hash tables for instruction and rela lookups (and keep the linked
lists around for sequential access).

Also cache the section struct for the "__func_stack_frame_non_standard"
section.

With this change, "objtool check net/wireless/nl80211.o" goes from:

real 0m1.168s
user 0m1.163s
sys 0m0.005s

to:

real 0m0.059s
user 0m0.042s
sys 0m0.017s

for a 20x speedup.

With the same object, it should be noted that the memory heap usage grew
from 8MB to 62MB. Reducing the memory usage is on the TODO list.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/dd0d8e1449506cfa7701b4e7ba73577077c44253.1457502970.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# a196e171 08-Mar-2016 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Rename some variables and functions

Rename some list heads to distinguish them from hash node heads, which
are added later in the patch series.

Also rename the get_*() functions to add_*(), which is more descriptive:
they "add" data to the objtool_file struct.

Also rename rodata_rela and text_rela to be clearer:
- text_rela refers to a rela entry in .rela.text.
- rodata_rela refers to a rela entry in .rela.rodata.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/ee0eca2bba8482aa45758958c5586c00a7b71e62.1457502970.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 442f04c3 28-Feb-2016 Josh Poimboeuf <jpoimboe@redhat.com>

objtool: Add tool to perform compile-time stack metadata validation

This adds a host tool named objtool which has a "check" subcommand which
analyzes .o files to ensure the validity of stack metadata. It enforces
a set of rules on asm code and C inline assembly code so that stack
traces can be reliable.

For each function, it recursively follows all possible code paths and
validates the correct frame pointer state at each instruction.

It also follows code paths involving kernel special sections, like
.altinstructions, __jump_table, and __ex_table, which can add
alternative execution paths to a given instruction (or set of
instructions). Similarly, it knows how to follow switch statements, for
which gcc sometimes uses jump tables.

Here are some of the benefits of validating stack metadata:

a) More reliable stack traces for frame pointer enabled kernels

Frame pointers are used for debugging purposes. They allow runtime
code and debug tools to be able to walk the stack to determine the
chain of function call sites that led to the currently executing
code.

For some architectures, frame pointers are enabled by
CONFIG_FRAME_POINTER. For some other architectures they may be
required by the ABI (sometimes referred to as "backchain pointers").

For C code, gcc automatically generates instructions for setting up
frame pointers when the -fno-omit-frame-pointer option is used.

But for asm code, the frame setup instructions have to be written by
hand, which most people don't do. So the end result is that
CONFIG_FRAME_POINTER is honored for C code but not for most asm code.

For stack traces based on frame pointers to be reliable, all
functions which call other functions must first create a stack frame
and update the frame pointer. If a first function doesn't properly
create a stack frame before calling a second function, the *caller*
of the first function will be skipped on the stack trace.

For example, consider the following example backtrace with frame
pointers enabled:

[<ffffffff81812584>] dump_stack+0x4b/0x63
[<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
[<ffffffff8127f568>] seq_read+0x108/0x3e0
[<ffffffff812cce62>] proc_reg_read+0x42/0x70
[<ffffffff81256197>] __vfs_read+0x37/0x100
[<ffffffff81256b16>] vfs_read+0x86/0x130
[<ffffffff81257898>] SyS_read+0x58/0xd0
[<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76

It correctly shows that the caller of cmdline_proc_show() is
seq_read().

If we remove the frame pointer logic from cmdline_proc_show() by
replacing the frame pointer related instructions with nops, here's
what it looks like instead:

[<ffffffff81812584>] dump_stack+0x4b/0x63
[<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
[<ffffffff812cce62>] proc_reg_read+0x42/0x70
[<ffffffff81256197>] __vfs_read+0x37/0x100
[<ffffffff81256b16>] vfs_read+0x86/0x130
[<ffffffff81257898>] SyS_read+0x58/0xd0
[<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76

Notice that cmdline_proc_show()'s caller, seq_read(), has been
skipped. Instead the stack trace seems to show that
cmdline_proc_show() was called by proc_reg_read().

The benefit of "objtool check" here is that because it ensures that
*all* functions honor CONFIG_FRAME_POINTER, no functions will ever[*]
be skipped on a stack trace.

[*] unless an interrupt or exception has occurred at the very
beginning of a function before the stack frame has been created,
or at the very end of the function after the stack frame has been
destroyed. This is an inherent limitation of frame pointers.

b) 100% reliable stack traces for DWARF enabled kernels

This is not yet implemented. For more details about what is planned,
see tools/objtool/Documentation/stack-validation.txt.

c) Higher live patching compatibility rate

This is not yet implemented. For more details about what is planned,
see tools/objtool/Documentation/stack-validation.txt.

To achieve the validation, "objtool check" enforces the following rules:

1. Each callable function must be annotated as such with the ELF
function type. In asm code, this is typically done using the
ENTRY/ENDPROC macros. If objtool finds a return instruction
outside of a function, it flags an error since that usually indicates
callable code which should be annotated accordingly.

This rule is needed so that objtool can properly identify each
callable function in order to analyze its stack metadata.

2. Conversely, each section of code which is *not* callable should *not*
be annotated as an ELF function. The ENDPROC macro shouldn't be used
in this case.

This rule is needed so that objtool can ignore non-callable code.
Such code doesn't have to follow any of the other rules.

3. Each callable function which calls another function must have the
correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
the architecture's back chain rules. This can by done in asm code
with the FRAME_BEGIN/FRAME_END macros.

This rule ensures that frame pointer based stack traces will work as
designed. If function A doesn't create a stack frame before calling
function B, the _caller_ of function A will be skipped on the stack
trace.

4. Dynamic jumps and jumps to undefined symbols are only allowed if:

a) the jump is part of a switch statement; or

b) the jump matches sibling call semantics and the frame pointer has
the same value it had on function entry.

This rule is needed so that objtool can reliably analyze all of a
function's code paths. If a function jumps to code in another file,
and it's not a sibling call, objtool has no way to follow the jump
because it only analyzes a single file at a time.

5. A callable function may not execute kernel entry/exit instructions.
The only code which needs such instructions is kernel entry code,
which shouldn't be be in callable functions anyway.

This rule is just a sanity check to ensure that callable functions
return normally.

It currently only supports x86_64. I tried to make the code generic so
that support for other architectures can hopefully be plugged in
relatively easily.

On my Lenovo laptop with a i7-4810MQ 4-core/8-thread CPU, building the
kernel with objtool checking every .o file adds about three seconds of
total build time. It hasn't been optimized for performance yet, so
there are probably some opportunities for better build performance.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/f3efb173de43bd067b060de73f856567c0fa1174.1456719558.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>