#
f404a58d |
|
04-Oct-2023 |
Aaron Plattner <aplattner@nvidia.com> |
objtool: Remove max symbol name length limitation If one of the symbols processed by read_symbols() happens to have a .cold variant with a name longer than objtool's MAX_NAME_LEN limit, the build fails. Avoid this problem by just using strndup() to copy the parent function's name, rather than strncpy()ing it onto the stack. Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Link: https://lore.kernel.org/r/41e94cfea1d9131b758dd637fecdeacd459d4584.1696355111.git.aplattner@nvidia.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
9f71fbcd |
|
28-Jun-2023 |
Michal Kubecek <mkubecek@suse.cz> |
objtool: initialize all of struct elf Function elf_open_read() only zero initializes the initial part of allocated struct elf; num_relocs member was recently added outside the zeroed part so that it was left uninitialized, resulting in build failures on some systems. The partial initialization is a relic of times when struct elf had large hash tables embedded. This is no longer the case so remove the trap and initialize the whole structure instead. Fixes: eb0481bbc4ce ("objtool: Fix reloc_hash size") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/20230629102051.42E8360467@lion.mk-sys.cz
|
#
b4c96ef0 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Skip reading DWARF section data Objtool doesn't use DWARF at all, and the DWARF sections' data take up a lot of memory. Skip reading them. Note this only skips the DWARF base sections, not the rela sections. The relas are needed because their symbol references may need to be reindexed if any local symbols get added by elf_create_symbol(). Also note the DWARF data will eventually be read by libelf anyway, when writing the object file. But that's fine, the goal here is to reduce *peak* memory usage, and the previous patch (which freed insn memory) gave some breathing room. So the allocation gets shifted to a later time, resulting in lower peak memory usage. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 29.93G - After: peak heap memory consumption: 25.47G Link: https://lore.kernel.org/r/52a9698835861dd35f2ec35c49f96d0bb39fb177.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
ec24b927 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Get rid of reloc->rel[a] Get the relocation entry info from the underlying rsec->data. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 35.12G - After: peak heap memory consumption: 29.93G Link: https://lore.kernel.org/r/2be32323de6d8cc73179ee0ff14b71f4e7cefaa0.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
02b54001 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Shrink elf hash nodes Instead of using hlist for the 'struct elf' hashes, use a custom single-linked list scheme. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 36.89G - After: peak heap memory consumption: 35.12G Link: https://lore.kernel.org/r/6e8cd305ed22e743c30d6e72cfdc1be20fb94cd4.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
890f10a4 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Shrink reloc->sym_reloc_entry Convert it to a singly-linked list. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 38.64G - After: peak heap memory consumption: 36.89G Link: https://lore.kernel.org/r/a51f0a6f9bbf2494d5a3a449807307e78a940988.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
0696b6e3 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Get rid of reloc->addend Get the addend from the embedded GElf_Rel[a] struct. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 42.10G - After: peak heap memory consumption: 40.37G Link: https://lore.kernel.org/r/ad2354f95d9ddd86094e3f7687acfa0750657784.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
fcee899d |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Get rid of reloc->type Get the type from the embedded GElf_Rel[a] struct. Link: https://lore.kernel.org/r/d1c1f8da31e4f052a2478aea585fcf355cacc53a.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
e4cbb9b8 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Get rid of reloc->offset Get the offset from the embedded GElf_Rel[a] struct. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 43.83G - After: peak heap memory consumption: 42.10G Link: https://lore.kernel.org/r/2b9ec01178baa346a99522710bf2e82159412e3a.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
be9a4c11 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Get rid of reloc->idx Use the array offset to calculate the reloc index. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 45.56G - After: peak heap memory consumption: 43.83G Link: https://lore.kernel.org/r/7351d2ebad0519027db14a32f6204af84952574a.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
ebcef730 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Get rid of reloc->list Now that all relocs are allocated in an array, the linked list is no longer needed. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 49.02G - After: peak heap memory consumption: 45.56G Link: https://lore.kernel.org/r/71e7a2c017dbc46bb497857ec97d67214f832d10.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
e0a9349b |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Allocate relocs in advance for new rela sections Similar to read_relocs(), allocate the reloc structs all together in an array rather than allocating them one at a time. Link: https://lore.kernel.org/r/5332d845c5a2d6c2d052075b381bfba8bcb67ed5.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
5201a9bc |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Don't free memory in elf_close() It's not necessary, objtool's about to exit anyway. Link: https://lore.kernel.org/r/74bdb3058b8f029db8d5b3b5175f2a200804196d.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
fcf93355 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Keep GElf_Rel[a] structs synced Keep the GElf_Rela structs synced with their 'struct reloc' counterparts instead of having to go back and "rebuild" them later. Link: https://lore.kernel.org/r/156d8a3e528a11e5c8577cf552890ed1f2b9567b.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
6342a20e |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Add elf_create_section_pair() When creating an annotation section, allocate the reloc section data at the beginning. This simplifies the data model a bit and also saves memory due to the removal of malloc() in elf_rebuild_reloc_section(). With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 53.49G - After: peak heap memory consumption: 49.02G Link: https://lore.kernel.org/r/048e908f3ede9b66c15e44672b6dda992b1dae3e.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
ff408273 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Add mark_sec_changed() Ensure elf->changed always gets set when sec->changed gets set. Link: https://lore.kernel.org/r/9a810a8d2e28af6ba07325362d0eb4703bb09d3a.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
eb0481bb |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Fix reloc_hash size With CONFIG_DEBUG_INFO, DWARF creates a lot of relocations and reloc_hash is woefully undersized, which can affect performance significantly. Fix that. Link: https://lore.kernel.org/r/38ef60dc8043270bf3b9dfd139ae2a30ca3f75cc.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
53257a97 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Consolidate rel/rela handling The GElf_Rel[a] structs have more similarities than differences. It's safe to hard-code the assumptions about their shared fields as they will never change. Consolidate their handling where possible, getting rid of duplicated code. Also, at least for now we only ever create rela sections, so simplify the relocation creation code to be rela-only. Link: https://lore.kernel.org/r/dcabf6df400ca500ea929f1e4284f5e5ec0b27c8.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
a5bd6236 |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Improve reloc naming - The term "reloc" is overloaded to mean both "an instance of struct reloc" and "a reloc section". Change the latter to "rsec". - For variable names, use "sec" for regular sections and "rsec" for rela sections to prevent them getting mixed up. - For struct reloc variables, use "reloc" instead of "rel" everywhere for consistency. Link: https://lore.kernel.org/r/8b790e403df46f445c21003e7893b8f53b99a6f3.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
2707579d |
|
30-May-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Remove flags argument from elf_create_section() Simplify the elf_create_section() interface a bit by removing the flags argument. Most callers don't care about changing the section header flags. If needed, they can be modified afterwards, just like any other section header field. Link: https://lore.kernel.org/r/515235d9cf62637a14bee37bfa9169ef20065471.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
9290e772 |
|
12-Apr-2023 |
Josh Poimboeuf <jpoimboe@kernel.org> |
objtool: Add symbol iteration helpers Add [sec_]for_each_sym() and use them. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/59023e5886ab125aa30702e633be7732b1acaa7e.1681325924.git.jpoimboe@kernel.org
|
#
8045b8f0 |
|
27-Dec-2022 |
Thomas Weißschuh <linux@weissschuh.net> |
objtool: Allocate multiple structures with calloc() By using calloc() instead of malloc() in a loop, libc does not have to keep around bookkeeping information for each single structure. This reduces maximum memory usage while processing vmlinux.o from 3153325 KB to 3035668 KB (-3.7%) on my notebooks "localmodconfig". Note this introduces memory leaks, because some additional structs get added to the lists later after reading the symbols and sections from the original object. Luckily we don't really care about memory leaks in objtool. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221216-objtool-memory-v2-3-17968f85a464@weissschuh.net Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
#
86ea7f36 |
|
14-Nov-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
objtool: Use target file class size instead of a compiled constant In order to allow using objtool on cross-built kernels, determine size of long from elf data instead of using sizeof(long) at build time. For the time being this covers only mcount. [Sathvika Vasireddy: Rename variable "size" to "addrsize" and function "elf_class_size()" to "elf_class_addrsize()", and modify create_mcount_loc_sections() function to follow reverse christmas tree format to order local variable declarations.] Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221114175754.1131267-11-sv@linux.ibm.com
|
#
19526717 |
|
02-Nov-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize elf_dirty_reloc_sym() When moving a symbol in the symtab its index changes and any reloc referring that symtol-table-index will need to be rewritten too. In order to facilitate this, objtool simply marks the whole reloc section 'changed' which will cause the whole section to be re-generated. However, finding the relocs that use any given symbol is implemented rather crudely -- a fully iteration of all sections and their relocs. Given that some builds have over 20k sections (kallsyms etc..) iterating all that for *each* symbol moved takes a bit of time. Instead have each symbol keep a list of relocs that reference it. This *vastly* improves build times for certain configs. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/Y2LlRA7x+8UsE1xf@hirez.programming.kicks-ass.net
|
#
9f2899fe |
|
28-Oct-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Add option to generate prefix symbols When code is compiled with: -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES} functions will have PADDING_BYTES of NOP in front of them. Unwinders and other things that symbolize code locations will typically attribute these bytes to the preceding function. Given that these bytes nominally belong to the following symbol this mis-attribution is confusing. Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to claim these bytes, allow objtool to emit __pfx_##name symbols to do the same. Therefore add the objtool --prefix=N argument, to conditionally place a __pfx_##name symbol at N bytes ahead of symbol 'name' when: all these preceding bytes are NOP and name-N is an instruction boundary. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yujie Liu <yujie.liu@intel.com> Link: https://lkml.kernel.org/r/20221028194453.526899822@infradead.org
|
#
13f60e80 |
|
28-Oct-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf Due to how gelf_update_sym*() requires an Elf_Data pointer, and how libelf keeps Elf_Data in a linked list per section, elf_update_symbol() ends up having to iterate this list on each update to find the correct Elf_Data for the index'ed symbol. By allocating one Elf_Data per new symbol, the list grows per new symbol, giving an effective O(n^2) insertion time. This is obviously bloody terrible. Therefore over-allocate the Elf_Data when an extention is needed. Except it turns out libelf disregards Elf_Scn::sh_size in favour of the sum of Elf_Data::d_size. IOW it will happily write out all the unused space and fill it with: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND entries (aka zeros). Which obviously violates the STB_LOCAL placement rule, and is a general pain in the backside for not being the desired behaviour. Manually fix-up the Elf_Data size to avoid this problem before calling elf_update(). This significantly improves performance when adding a significant number of symbols. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yujie Liu <yujie.liu@intel.com> Link: https://lkml.kernel.org/r/20221028194453.461658986@infradead.org
|
#
4c91be8e |
|
28-Oct-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Slice up elf_create_section_symbol() In order to facilitate creation of more symbol types, slice up elf_create_section_symbol() to extract a generic helper that deals with adding ELF symbols. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Yujie Liu <yujie.liu@intel.com> Link: https://lkml.kernel.org/r/20221028194453.396634875@infradead.org
|
#
5da6aea3 |
|
15-Sep-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Fix find_{symbol,func}_containing() The current find_{symbol,func}_containing() functions are broken in the face of overlapping symbols, exactly the case that is needed for a new ibt/endbr supression. Import interval_tree_generic.h into the tools tree and convert the symbol tree to an interval tree to support proper range stabs. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111146.330203761@infradead.org
|
#
5141d3a0 |
|
08-Sep-2022 |
Sami Tolvanen <samitolvanen@google.com> |
objtool: Preserve special st_shndx indexes in elf_update_symbol elf_update_symbol fails to preserve the special st_shndx values between [SHN_LORESERVE, SHN_HIRESERVE], which results in it converting SHN_ABS entries into SHN_UNDEF, for example. Explicitly check for the special indexes and ensure these symbols are not marked undefined. Fixes: ead165fa1042 ("objtool: Fix symbol creation") Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220908215504.3686827-17-samitolvanen@google.com
|
#
22682a07 |
|
16-May-2022 |
Mikulas Patocka <mpatocka@redhat.com> |
objtool: Fix objtool regression on x32 systems Commit c087c6e7b551 ("objtool: Fix type of reloc::addend") failed to appreciate cross building from ILP32 hosts, where 'int' == 'long' and the issue persists. As such, use s64/int64_t/Elf64_Sxword for this field and suffer the pain that is ISO C99 printf formats for it. Fixes: c087c6e7b551 ("objtool: Fix type of reloc::addend") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> [peterz: reword changelog, s/long long/s64/] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2205161041260.11556@file01.intranet.prod.int.rdu2.redhat.com
|
#
ead165fa |
|
17-May-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Fix symbol creation Nathan reported objtool failing with the following messages: warning: objtool: no non-local symbols !? warning: objtool: gelf_update_symshndx: invalid section index The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols") failing to consider the case where an object would have no non-local symbols. The problem that commit tries to address is adding a STB_LOCAL symbol to the symbol table in light of the ELF spec's requirement that: In each symbol table, all symbols with STB_LOCAL binding preced the weak and global symbols. As ``Sections'' above describes, a symbol table section's sh_info section header member holds the symbol table index for the first non-local symbol. The approach taken is to find this first non-local symbol, move that to the end and then re-use the freed spot to insert a new local symbol and increment sh_info. Except it never considered the case of object files without global symbols and got a whole bunch of details wrong -- so many in fact that it is a wonder it ever worked :/ Specifically: - It failed to re-hash the symbol on the new index, so a subsequent find_symbol_by_index() would not find it at the new location and a query for the old location would now return a non-deterministic choice between the old and new symbol. - It failed to appreciate that the GElf wrappers are not a valid disk format (it works because GElf is basically Elf64 and we only support x86_64 atm.) - It failed to fully appreciate how horrible the libelf API really is and got the gelf_update_symshndx() call pretty much completely wrong; with the direct consequence that if inserting a second STB_LOCAL symbol would require moving the same STB_GLOBAL symbol again it would completely come unstuck. Write a new elf_update_symbol() function that wraps all the magic required to update or create a new symbol at a given index. Specifically, gelf_update_sym*() require an @ndx argument that is relative to the @data argument; this means you have to manually iterate the section data descriptor list and update @ndx. Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/YoPCTEYjoPqE4ZxB@hirez.programming.kicks-ass.net
|
#
753da417 |
|
18-Apr-2022 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Remove --lto and --vmlinux in favor of --link The '--lto' option is a confusing way of telling objtool to do stack validation despite it being a linked object. It's no longer needed now that an explicit '--stackval' option exists. The '--vmlinux' option is also redundant. Remove both options in favor of a straightforward '--link' option which identifies a linked object. Also, implicitly set '--link' with a warning if the user forgets to do so and we can tell that it's a linked object. This makes it easier for manual vmlinux runs. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/dcd3ceffd15a54822c6183e5766d21ad06082b45.1650300597.git.jpoimboe@redhat.com
|
#
2daf7fab |
|
18-Apr-2022 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Reorganize cmdline options Split the existing options into two groups: actions, which actually do something; and options, which modify the actions in some way. Also there's no need to have short flags for all the non-action options. Reserve short flags for the more important actions. While at it: - change a few of the short flags to be more intuitive - make option descriptions more consistently descriptive - sort options in the source like they are when printed - move options to a global struct Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/9dcaa752f83aca24b1b21f0b0eeb28a0c181c0b0.1650300597.git.jpoimboe@redhat.com
|
#
4abff6d4 |
|
17-Apr-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Fix code relocs vs weak symbols Occasionally objtool driven code patching (think .static_call_sites .retpoline_sites etc..) goes sideways and it tries to patch an instruction that doesn't match. Much head-scatching and cursing later the problem is as outlined below and affects every section that objtool generates for us, very much including the ORC data. The below uses .static_call_sites because it's convenient for demonstration purposes, but as mentioned the ORC sections, .retpoline_sites and __mount_loc are all similarly affected. Consider: foo-weak.c: extern void __SCT__foo(void); __attribute__((weak)) void foo(void) { return __SCT__foo(); } foo.c: extern void __SCT__foo(void); extern void my_foo(void); void foo(void) { my_foo(); return __SCT__foo(); } These generate the obvious code (gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c): foo-weak.o: 0000000000000000 <foo>: 0: e9 00 00 00 00 jmpq 5 <foo+0x5> 1: R_X86_64_PLT32 __SCT__foo-0x4 foo.o: 0000000000000000 <foo>: 0: 48 83 ec 08 sub $0x8,%rsp 4: e8 00 00 00 00 callq 9 <foo+0x9> 5: R_X86_64_PLT32 my_foo-0x4 9: 48 83 c4 08 add $0x8,%rsp d: e9 00 00 00 00 jmpq 12 <foo+0x12> e: R_X86_64_PLT32 __SCT__foo-0x4 Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o): foos.o: 0000000000000000 <foo-0x10>: 0: e9 00 00 00 00 jmpq 5 <foo-0xb> 1: R_X86_64_PLT32 __SCT__foo-0x4 5: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1) f: 90 nop 0000000000000010 <foo>: 10: 48 83 ec 08 sub $0x8,%rsp 14: e8 00 00 00 00 callq 19 <foo+0x9> 15: R_X86_64_PLT32 my_foo-0x4 19: 48 83 c4 08 add $0x8,%rsp 1d: e9 00 00 00 00 jmpq 22 <foo+0x12> 1e: R_X86_64_PLT32 __SCT__foo-0x4 Noting that ld preserves the weak function text, but strips the symbol off of it (hence objdump doing that funny negative offset thing). This does lead to 'interesting' unused code issues with objtool when ran on linked objects, but that seems to be working (fingers crossed). So far so good.. Now lets consider the objtool static_call output section (readelf output, old binutils): foo-weak.o: Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foo.o: Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + d 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foos.o: Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 1d 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 So we have two patch sites, one in the dead code of the weak foo and one in the real foo. All is well. *HOWEVER*, when the toolchain strips unused section symbols it generates things like this (using new enough binutils): foo-weak.o: Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foo.o: Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + d 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foos.o: Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 foo + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 foo + d 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 And now we can see how that foos.o .static_call_sites goes side-ways, we now have _two_ patch sites in foo. One for the weak symbol at foo+0 (which is no longer a static_call site!) and one at foo+d which is in fact the right location. This seems to happen when objtool cannot find a section symbol, in which case it falls back to any other symbol to key off of, however in this case that goes terribly wrong! As such, teach objtool to create a section symbol when there isn't one. Fixes: 44f6a7c0755d ("objtool: Fix seg fault with Clang non-section symbols") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.org
|
#
c087c6e7 |
|
17-Apr-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Fix type of reloc::addend Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means that our reloc::addend needs to be long or face tuncation issues when we do elf_rebuild_reloc_section(): - 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt+0x80000067 + 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt-0x7fffff99 Fixes: 627fce14809b ("objtool: Add ORC unwind table generation") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220419203807.596871927@infradead.org
|
#
4adb2368 |
|
08-Mar-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Ignore extra-symbol code There's a fun implementation detail on linking STB_WEAK symbols. When the linker combines two translation units, where one contains a weak function and the other an override for it. It simply strips the STB_WEAK symbol from the symbol table, but doesn't actually remove the code. The result is that when objtool is ran in a whole-archive kind of way, it will encounter *heaps* of unused (and unreferenced) code. All rudiments of weak functions. Additionally, when a weak implementation is split into a .cold subfunction that .cold symbol is left in place, even though completely unused. Teach objtool to ignore such rudiments by searching for symbol holes; that is, code ranges that fall outside the given symbol bounds. Specifically, ignore a sequence of unreachable instruction iff they occupy a single hole, additionally ignore any .cold subfunctions referenced. Both ld.bfd and ld.lld behave like this. LTO builds otoh can (and do) properly DCE weak functions. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154319.232019347@infradead.org
|
#
f2d3a250 |
|
08-Mar-2022 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Add --dry-run Add a --dry-run argument to skip writing the modifications. This is convenient for debugging. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154317.282720146@infradead.org
|
#
988f0168 |
|
02-Dec-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Fix pv_ops noinstr validation Boris reported that in one of his randconfig builds, objtool got infinitely stuck. Turns out there's trivial list corruption in the pv_ops tracking when a function is both in a static table and in a code assignment. Avoid re-adding function to the pv_ops[] lists when they're already on it. Fixes: db2b0c5d7b6f ("objtool: Support pv_opsindirect calls for noinstr") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20211202204534.GA16608@worktop.programming.kicks-ass.net
|
#
134ab5bd |
|
26-Oct-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool,x86: Replace alternatives with .retpoline_sites Instead of writing complete alternatives, simply provide a list of all the retpoline thunk calls. Then the kernel is free to do with them as it pleases. Simpler code all-round. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120309.850007165@infradead.org
|
#
86e1e054 |
|
08-May-2021 |
Michael Forney <mforney@mforney.org> |
objtool: Update section header before relocations The libelf implementation from elftoolchain has a safety check in gelf_update_rel[a] to check that the data corresponds to a section that has type SHT_REL[A] [0]. If the relocation is updated before the section header is updated with the proper type, this check fails. To fix this, update the section header first, before the relocations. Previously, the section size was calculated in elf_rebuild_reloc_section by counting the number of entries in the reloc_list. However, we now need the size during elf_write so instead keep a running total and add to it for every new relocation. [0] https://sourceforge.net/p/elftoolchain/mailman/elftoolchain-developers/thread/CAGw6cBtkZro-8wZMD2ULkwJ39J+tHtTtAWXufMjnd3cQ7XG54g@mail.gmail.com/ Signed-off-by: Michael Forney <mforney@mforney.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20210509000103.11008-2-mforney@mforney.org
|
#
b46179d6 |
|
08-May-2021 |
Michael Forney <mforney@mforney.org> |
objtool: Check for gelf_update_rel[a] failures Otherwise, if these fail we end up with garbage data in the .rela.orc_unwind_ip section, leading to errors like ld: fs/squashfs/namei.o: bad reloc symbol index (0x7f16 >= 0x12) for offset 0x7f16d5c82cc8 in section `.orc_unwind_ip' Signed-off-by: Michael Forney <mforney@mforney.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20210509000103.11008-1-mforney@mforney.org
|
#
fe255fe6 |
|
22-Aug-2021 |
Joe Lawrence <joe.lawrence@redhat.com> |
objtool: Remove redundant 'len' field from struct section The section structure already contains sh_size, so just remove the extra 'len' member that requires extra mirroring and potential confusion. Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20210822225037.54620-3-joe.lawrence@redhat.com Cc: Andy Lavr <andy.lavr@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org
|
#
d33b9035 |
|
11-Jun-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Improve reloc hash size guestimate Nathan reported that LLVM ThinLTO builds have a performance regression with commit 25cf0d8aa2a3 ("objtool: Rewrite hashtable sizing"). Sami was quick to note that this is due to their use of -ffunction-sections. As a result the .text section is small and basing the number of relocs off of that no longer works. Instead have read_sections() compute the sum of all SHF_EXECINSTR sections and use that. Fixes: 25cf0d8aa2a3 ("objtool: Rewrite hashtable sizing") Reported-by: Nathan Chancellor <nathan@kernel.org> Debugged-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lkml.kernel.org/r/YMJpGLuGNsGtA5JJ@hirez.programming.kicks-ass.net
|
#
25cf0d8a |
|
06-May-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Rewrite hashtable sizing Currently objtool has 5 hashtables and sizes them 16 or 20 bits depending on the --vmlinux argument. However, a single side doesn't really work well for the 5 tables, which among them, cover 3 different uses. Also, while vmlinux is larger, there is still a very wide difference between a defconfig and allyesconfig build, which again isn't optimally covered by a single size. Another aspect is the cost of elf_hash_init(), which for large tables dominates the runtime for small input files. It turns out that all it does it assign NULL, something that is required when using malloc(). However, when we allocate memory using mmap(), we're guaranteed to get zero filled pages. Therefore, rewrite the whole thing to: 1) use more dynamic sized tables, depending on the input file, 2) avoid the need for elf_hash_init() entirely by using mmap(). This speeds up a regular kernel build (100s to 98s for x86_64-defconfig), and potentially dramatically speeds up vmlinux processing. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210506194157.452881700@infradead.org
|
#
584fd3b3 |
|
07-Jun-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Fix .symtab_shndx handling for elf_create_undef_symbol() When an ELF object uses extended symbol section indexes (IOW it has a .symtab_shndx section), these must be kept in sync with the regular symbol table (.symtab). So for every new symbol we emit, make sure to also emit a .symtab_shndx value to keep the arrays of equal size. Note: since we're writing an UNDEF symbol, most GElf_Sym fields will be 0 and we can repurpose one (st_size) to host the 0 for the xshndx value. Fixes: 2f2f7e47f052 ("objtool: Add elf_create_undef_symbol()") Reported-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net
|
#
46c7405d |
|
12-May-2021 |
Vasily Gorbik <gor@linux.ibm.com> |
objtool: Fix elf_create_undef_symbol() endianness Currently x86 cross-compilation fails on big endian system with: x86_64-cross-ld: init/main.o: invalid string offset 488112128 >= 6229 for section `.strtab' Mark new ELF data in elf_create_undef_symbol() as symbol, so that libelf does endianness handling correctly. Fixes: 2f2f7e47f052 ("objtool: Add elf_create_undef_symbol()") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/patch-1.thread-6c9df9.git-d39264656387.your-ad-here.call-01620841104-ext-2554@work.hours
|
#
2f2f7e47 |
|
26-Mar-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Add elf_create_undef_symbol() Allow objtool to create undefined symbols; this allows creating relocations to symbols not currently in the symbol table. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/20210326151300.064743095@infradead.org
|
#
9a7827b7 |
|
26-Mar-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Extract elf_symbol_add() Create a common helper to add symbols. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/20210326151300.003468981@infradead.org
|
#
417a4dc9 |
|
26-Mar-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Extract elf_strtab_concat() Create a common helper to append strings to a strtab. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/20210326151259.941474004@infradead.org
|
#
d0c5c4cc |
|
26-Mar-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Create reloc sections implicitly Have elf_add_reloc() create the relocation section implicitly. Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/20210326151259.880174448@infradead.org
|
#
ef47cc01 |
|
26-Mar-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Add elf_create_reloc() helper We have 4 instances of adding a relocation. Create a common helper to avoid growing even more. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/20210326151259.817438847@infradead.org
|
#
3a647607 |
|
26-Mar-2021 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Rework the elf_rebuild_reloc_section() logic Instead of manually calling elf_rebuild_reloc_section() on sections we've called elf_add_reloc() on, have elf_write() DTRT. This makes it easier to add random relocations in places without carefully tracking when we're done and need to flush what section. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/20210326151259.754213408@infradead.org
|
#
7786032e |
|
12-Nov-2020 |
Vasily Gorbik <gor@linux.ibm.com> |
objtool: Rework header include paths Currently objtool headers are being included either by their base name or included via ../ from a parent directory. In case of a base name usage: #include "warn.h" #include "arch_elf.h" it does not make it apparent from which directory the file comes from. To make it slightly better, and actually to avoid name clashes some arch specific files have "arch_" suffix. And files from an arch folder have to revert to including via ../ e.g: #include "../../elf.h" With additional architectures support and the code base growth there is a need for clearer headers naming scheme for multiple reasons: 1. to make it instantly obvious where these files come from (objtool itself / objtool arch|generic folders / some other external files), 2. to avoid name clashes of objtool arch specific headers, potential obtool arch generic headers and the system header files (there is /usr/include/elf.h already), 3. to avoid ../ includes and improve code readability. 4. to give a warm fuzzy feeling to developers who are mostly kernel developers and are accustomed to linux kernel headers arranging scheme. Doesn't this make it instantly obvious where are these files come from? #include <objtool/warn.h> #include <arch/elf.h> And doesn't it look nicer to avoid ugly ../ includes? Which also guarantees this is elf.h from the objtool and not /usr/include/elf.h. #include <objtool/elf.h> This patch defines and implements new objtool headers arranging scheme. Which is: - all generic headers go to include/objtool (similar to include/linux) - all arch headers go to arch/$(SRCARCH)/include/arch (to get arch prefix). This is similar to linux arch specific "asm/*" headers but we are not abusing "asm" name and calling it what it is. This also helps to prevent name clashes (arch is not used in system headers or kernel exports). To bring objtool to this state the following things are done: 1. current top level tools/objtool/ headers are moved into include/objtool/ subdirectory, 2. arch specific headers, currently only arch/x86/include/ are moved into arch/x86/include/arch/ and were stripped of "arch_" suffix, 3. new -I$(srctree)/tools/objtool/include include path to make includes like <objtool/warn.h> possible, 4. rewriting file includes, 5. make git not to ignore include/objtool/ subdirectory. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
a1a664ec |
|
12-Nov-2020 |
Martin Schwidefsky <schwidefsky@de.ibm.com> |
objtool: Fix reloc generation on big endian cross-compiles Relocations generated in elf_rebuild_rel[a]_reloc_section() are broken if objtool is built and run on a big endian system. The following errors pop up during x86 cross-compilation: x86_64-9.1.0-ld: fs/efivarfs/inode.o: bad reloc symbol index (0x2000000 >= 0x22) for offset 0 in section `.orc_unwind_ip' x86_64-9.1.0-ld: final link failed: bad value Convert those functions to use gelf_update_rel[a](), similar to what elf_write_reloc() does. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Co-developed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
2d24dd57 |
|
29-Apr-2020 |
Peter Zijlstra <peterz@infradead.org> |
rbtree: Add generic add and find helpers I've always been bothered by the endless (fragile) boilerplate for rbtree, and I recently wrote some rbtree helpers for objtool and figured I should lift them into the kernel and use them more widely. Provide: partial-order; less() based: - rb_add(): add a new entry to the rbtree - rb_add_cached(): like rb_add(), but for a rb_root_cached total-order; cmp() based: - rb_find(): find an entry in an rbtree - rb_find_add(): find an entry, and add if not found - rb_find_first(): find the first (leftmost) matching entry - rb_next_match(): continue from rb_find_first() - rb_for_each(): iterate a sub-tree using the previous two Inlining and constant propagation should see the compiler inline the whole thing, including the various compare functions. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Michel Lespinasse <walken@google.com> Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
#
1d489151 |
|
14-Jan-2021 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Don't fail on missing symbol table Thanks to a recent binutils change which doesn't generate unused symbols, it's now possible for thunk_64.o be completely empty without CONFIG_PREEMPTION: no text, no data, no symbols. We could edit the Makefile to only build that file when CONFIG_PREEMPTION is enabled, but that will likely create confusion if/when the thunks end up getting used by some other code again. Just ignore it and move on. Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1254 Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
a2e38dff |
|
05-Jan-2021 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Don't add empty symbols to the rbtree Building with the Clang assembler shows the following warning: arch/x86/kernel/ftrace_64.o: warning: objtool: missing symbol for insn at offset 0x16 The Clang assembler strips section symbols. That ends up giving objtool's find_func_containing() much more test coverage than normal. Turns out, find_func_containing() doesn't work so well for overlapping symbols: 2: 000000000000000e 0 NOTYPE LOCAL DEFAULT 2 fgraph_trace 3: 000000000000000f 0 NOTYPE LOCAL DEFAULT 2 trace 4: 0000000000000000 165 FUNC GLOBAL DEFAULT 2 __fentry__ 5: 000000000000000e 0 NOTYPE GLOBAL DEFAULT 2 ftrace_stub The zero-length NOTYPE symbols are inside __fentry__(), confusing the rbtree search for any __fentry__() offset coming after a NOTYPE. Try to avoid this problem by not adding zero-length symbols to the rbtree. They're rare and aren't needed in the rbtree anyway. One caveat, this actually might not end up being the right fix. Non-empty overlapping symbols, if they exist, could have the same problem. But that would need bigger changes, let's see if we can get away with the easy fix for now. Reported-by: Arnd Bergmann <arnd@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
44f6a7c0 |
|
14-Dec-2020 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Fix seg fault with Clang non-section symbols The Clang assembler likes to strip section symbols, which means objtool can't reference some text code by its section. This confuses objtool greatly, causing it to seg fault. The fix is similar to what was done before, for ORC reloc generation: e81e07244325 ("objtool: Support Clang non-section symbols in ORC generation") Factor out that code into a common helper and use it for static call reloc generation as well. Reported-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://github.com/ClangBuiltLinux/linux/issues/1207 Link: https://lkml.kernel.org/r/ba6b6c0f0dd5acbba66e403955a967d9fdd1726a.1607983452.git.jpoimboe@redhat.com
|
#
1e7e4788 |
|
18-Aug-2020 |
Josh Poimboeuf <jpoimboe@redhat.com> |
x86/static_call: Add inline static call implementation for x86-64 Add the inline static call implementation for x86-64. The generated code is identical to the out-of-line case, except we move the trampoline into it's own section. Objtool uses the trampoline naming convention to detect all the call sites. It then annotates those call sites in the .static_call_sites section. During boot (and module init), the call sites are patched to call directly into the destination function. The temporary trampoline is then no longer used. [peterz: merged trampolines, put trampoline in section] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org
|
#
fdabdd0b |
|
12-Jun-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Provide elf_write_{insn,reloc}() This provides infrastructure to rewrite instructions; this is immediately useful for helping out with KCOV-vs-noinstr, but will also come in handy for a bunch of variable sized jump-label patches that are still on ice. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
#
2b10be23 |
|
17-Apr-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Clean up elf_write() condition With there being multiple ways to change the ELF data, let's more concisely track modification. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
#
fb414783 |
|
29-May-2020 |
Matt Helsley <mhelsley@vmware.com> |
objtool: Add support for relocations without addends Currently objtool only collects information about relocations with addends. In recordmcount, which we are about to merge into objtool, some supported architectures do not use rela relocations. Signed-off-by: Matt Helsley <mhelsley@vmware.com> Reviewed-by: Julien Thierry <jthierry@redhat.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
f1974222 |
|
29-May-2020 |
Matt Helsley <mhelsley@vmware.com> |
objtool: Rename rela to reloc Before supporting additional relocation types rename the relevant types and functions from "rela" to "reloc". This work be done with the following regex: sed -e 's/struct rela/struct reloc/g' \ -e 's/\([_\*]\)rela\(s\{0,1\}\)/\1reloc\2/g' \ -e 's/tmprela\(s\{0,1\}\)/tmpreloc\1/g' \ -e 's/relasec/relocsec/g' \ -e 's/rela_list/reloc_list/g' \ -e 's/rela_hash/reloc_hash/g' \ -e 's/add_rela/add_reloc/g' \ -e 's/rela->/reloc->/g' \ -e '/rela[,\.]/{ s/\([^\.>]\)rela\([\.,]\)/\1reloc\2/g ; }' \ -e 's/rela =/reloc =/g' \ -e 's/relas =/relocs =/g' \ -e 's/relas\[/relocs[/g' \ -e 's/relaname =/relocname =/g' \ -e 's/= rela\;/= reloc\;/g' \ -e 's/= relas\;/= relocs\;/g' \ -e 's/= relaname\;/= relocname\;/g' \ -e 's/, rela)/, reloc)/g' \ -e 's/\([ @]\)rela\([ "]\)/\1reloc\2/g' \ -e 's/ rela$/ reloc/g' \ -e 's/, relaname/, relocname/g' \ -e 's/sec->rela/sec->reloc/g' \ -e 's/(\(!\{0,1\}\)rela/(\1reloc/g' \ -i \ arch.h \ arch/x86/decode.c \ check.c \ check.h \ elf.c \ elf.h \ orc_gen.c \ special.c Notable exceptions which complicate the regex include gelf_* library calls and standard/expected section names which still use "rela" because they encode the type of relocation expected. Also, keep "rela" in the struct because it encodes a specific type of relocation we currently expect. It will eventually turn into a member of an anonymous union when a susequent patch adds implicit addend, or "rel", relocation support. Signed-off-by: Matt Helsley <mhelsley@vmware.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
1e968bf5 |
|
21-Apr-2020 |
Sami Tolvanen <samitolvanen@google.com> |
objtool: Use sh_info to find the base for .rela sections ELF doesn't require .rela section names to match the base section. Use the section index in sh_info to find the section instead of looking it up by name. LLD, for example, generates a .rela section that doesn't match the base section name when we merge sections in a linker script for a binary compiled with -ffunction-sections. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org>
|
#
e000acc1 |
|
15-Apr-2020 |
Kristen Carlson Accardi <kristen@linux.intel.com> |
objtool: Do not assume order of parent/child functions If a .cold function is examined prior to it's parent, the link to the parent/child function can be overwritten when the parent is examined. Only update pfunc and cfunc if they were previously nil to prevent this from happening. This fixes an issue seen when compiling with -ffunction-sections. Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
28fe1d7b |
|
21-Apr-2020 |
Sami Tolvanen <samitolvanen@google.com> |
objtool: use gelf_getsymshndx to handle >64k sections Currently, objtool fails to load the correct section for symbols when the index is greater than SHN_LORESERVE. Use gelf_getsymshndx instead of gelf_getsym to handle >64k sections. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200421220843.188260-2-samitolvanen@google.com
|
#
b490f453 |
|
24-Apr-2020 |
Miroslav Benes <mbenes@suse.cz> |
objtool: Move the IRET hack into the arch decoder Quoting Julien: "And the other suggestion is my other email was that you don't even need to add INSN_EXCEPTION_RETURN. You can keep IRET as INSN_CONTEXT_SWITCH by default and x86 decoder lookups the symbol conaining an iret. If it's a function symbol, it can just set the type to INSN_OTHER so that it caries on to the next instruction after having handled the stack_op." Suggested-by: Julien Thierry <jthierry@redhat.com> Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200428191659.913283807@infradead.org
|
#
bc359ff2 |
|
21-Apr-2020 |
Ingo Molnar <mingo@kernel.org> |
objtool: Rename elf_read() to elf_open_read() 'struct elf *' handling is an open/close paradigm, make sure the naming matches that: elf_open_read() elf_write() elf_close() Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200422103205.61900-3-mingo@kernel.org
|
#
894e48ca |
|
21-Apr-2020 |
Ingo Molnar <mingo@kernel.org> |
objtool: Constify 'struct elf *' parameters In preparation to parallelize certain parts of objtool, map out which uses of various data structures are read-only vs. read-write. As a first step constify 'struct elf' pointer passing, most of the secondary uses of it in find_symbol_*() methods are read-only. Also, while at it, better group the 'struct elf' handling methods in elf.h. Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200422103205.61900-2-mingo@kernel.org
|
#
7f9b34f3 |
|
03-Apr-2020 |
Julien Thierry <jthierry@redhat.com> |
objtool: Fix off-by-one in symbol_by_offset() Sometimes, WARN_FUNC() and other users of symbol_by_offset() will associate the first instruction of a symbol with the symbol preceding it. This is because symbol->offset + symbol->len is already outside of the symbol's range. Fixes: 2a362ecc3ec9 ("objtool: Optimize find_symbol_*() and read_symbols()") Signed-off-by: Julien Thierry <jthierry@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
#
34f7c96d |
|
12-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize !vmlinux.o again When doing kbuild tests to see if the objtool changes affected those I found that there was a measurable regression: pre post real 1m13.594 1m16.488s user 34m58.246s 35m23.947s sys 4m0.393s 4m27.312s Perf showed that for small files the increased hash-table sizes were a measurable difference. Since we already have -l "vmlinux" to distinguish between the modes, make it also use a smaller portion of the hash-tables. This flips it into a small win: real 1m14.143s user 34m49.292s sys 3m44.746s Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200416115119.167588731@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
5377cae9 |
|
03-Apr-2020 |
Julien Thierry <jthierry@redhat.com> |
objtool: Fix off-by-one in symbol_by_offset() Sometimes, WARN_FUNC() and other users of symbol_by_offset() will associate the first instruction of a symbol with the symbol preceding it. This is because symbol->offset + symbol->len is already outside of the symbol's range. Fixes: 2a362ecc3ec9 ("objtool: Optimize find_symbol_*() and read_symbols()") Signed-off-by: Julien Thierry <jthierry@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
74b873e4 |
|
12-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize find_rela_by_dest_range() Perf shows there is significant time in find_rela_by_dest(); this is because we have to iterate the address space per byte, looking for relocation entries. Optimize this by reducing the address space granularity. This reduces objtool on vmlinux.o runtime from 4.8 to 4.4 seconds. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.861321325@infradead.org
|
#
8b5fa6bc |
|
12-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize read_sections() Perf showed that __hash_init() is a significant portion of read_sections(), so instead of doing a per section rela_hash, use an elf-wide rela_hash. Statistics show us there are about 1.1 million relas, so size it accordingly. This reduces the objtool on vmlinux.o runtime to a third, from 15 to 5 seconds. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.739153726@infradead.org
|
#
cdb3d057 |
|
12-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize find_symbol_by_name() Perf showed that find_symbol_by_name() takes time; add a symbol name hash. This shaves another second off of objtool on vmlinux.o runtime, down to 15 seconds. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.676865656@infradead.org
|
#
53d20720 |
|
16-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Rename find_containing_func() For consistency; we have: find_symbol_by_offset() / find_symbol_containing() find_func_by_offset() / find_containing_func() fix that. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.558470724@infradead.org
|
#
2a362ecc |
|
12-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize find_symbol_*() and read_symbols() All of: read_symbols(), find_symbol_by_offset(), find_symbol_containing(), find_containing_func() do a linear search of the symbols. Add an RB tree to make it go faster. This about halves objtool runtime on vmlinux.o, from 34s to 18s. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.499016559@infradead.org
|
#
ae358196 |
|
12-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize find_section_by_name() In order to avoid yet another linear search of (20k) sections, add a name based hash. This reduces objtool runtime on vmlinux.o by some 10s to around 35s. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.440174280@infradead.org
|
#
53038996 |
|
10-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize find_section_by_index() In order to avoid a linear search (over 20k entries), add an section_hash to the elf object. This reduces objtool on vmlinux.o from a few minutes to around 45 seconds. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.381249993@infradead.org
|
#
1e11f3fd |
|
12-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Add a statistics mode Have it print a few numbers which can be used to size the hashtables. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.321381240@infradead.org
|
#
65fb11a7 |
|
10-Mar-2020 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Optimize find_symbol_by_index() The symbol index is object wide, not per section, so it makes no sense to have the symbol_hash be part of the section object. By moving it to the elf object we avoid the linear sections iteration. This reduces the runtime of objtool on vmlinux.o from over 3 hours (I gave up) to a few minutes. The defconfig vmlinux.o has around 20k sections. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20200324160924.261852348@infradead.org
|
#
7acfe531 |
|
17-Feb-2020 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Improve call destination function detection A recent clang change, combined with a binutils bug, can trigger a situation where a ".Lprintk$local" STT_NOTYPE symbol gets created at the same offset as the "printk" STT_FUNC symbol. This confuses objtool: kernel/printk/printk.o: warning: objtool: ignore_loglevel_setup()+0x10: can't find call dest symbol at .text+0xc67 Improve the call destination detection by looking specifically for an STT_FUNC symbol. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://github.com/ClangBuiltLinux/linux/issues/872 Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25551 Link: https://lkml.kernel.org/r/0a7ee320bc0ea4469bd3dc450a7b4725669e0ea9.1581997059.git.jpoimboe@redhat.com
|
#
e7c2bc37 |
|
17-Jul-2019 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Refactor jump table code Now that C jump tables are supported, call them "jump tables" instead of "switch tables". Also rename some other variables, add comments, and simplify the code flow a bit. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/cf951b0c0641628e0b9b81f7ceccd9bcabcb4bd8.1563413318.git.jpoimboe@redhat.com
|
#
e10cd8fe |
|
17-Jul-2019 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Refactor function alias logic - Add an alias check in validate_functions(). With this change, aliases no longer need uaccess_safe set. - Add an alias check in decode_instructions(). With this change, the "if (!insn->func)" check is no longer needed. - Don't create aliases for zero-length functions, as it can have unexpected results. The next patch will spit out a warning for zero-length functions anyway. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/26a99c31426540f19c9a58b9e10727c385a147bc.1563413318.git.jpoimboe@redhat.com
|
#
8e144797 |
|
10-Jul-2019 |
Michael Forney <mforney@mforney.org> |
objtool: Rename elf_open() to prevent conflict with libelf from elftoolchain The elftoolchain version of libelf has a function named elf_open(). The function name isn't quite accurate anyway, since it also reads all the ELF data. Rename it to elf_read(), which is more accurate. [ jpoimboe: rename to elf_read(); write commit description ] Signed-off-by: Michael Forney <mforney@mforney.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/7ce2d1b35665edf19fd0eb6fbc0b17b81a48e62f.1562793604.git.jpoimboe@redhat.com
|
#
3c3ea503 |
|
10-Jul-2019 |
Michael Forney <mforney@mforney.org> |
objtool: Use Elf_Scn typedef instead of assuming struct name The libelf implementation might use a different struct name, and the Elf_Scn typedef is already used throughout the rest of objtool. Signed-off-by: Michael Forney <mforney@mforney.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/d270e1be2835fc2a10acf67535ff2ebd2145bf43.1562793448.git.jpoimboe@redhat.com
|
#
1ccea77e |
|
19-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details [based] [from] [clk] [highbank] [c] you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 355 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
09f30d83 |
|
28-Feb-2019 |
Peter Zijlstra <peterz@infradead.org> |
objtool: Handle function aliases Function aliases result in different symbols for the same set of instructions; track a canonical symbol so there is a unique point of access. This again prepares the way for function attributes. And in particular the need for aliases comes from how KASAN uses them. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
22566c16 |
|
20-Nov-2018 |
Artem Savkov <asavkov@redhat.com> |
objtool: Fix segfault in .cold detection with -ffunction-sections Because find_symbol_by_name() traverses the same lists as read_symbols(), changing sym->name in place without copying it affects the result of find_symbol_by_name(). In the case where a ".cold" function precedes its parent in sec->symbol_list, it can result in a function being considered a parent of itself. This leads to function length being set to 0 and other consequent side-effects including a segfault in add_switch_table(). The effects of this bug are only visible when building with -ffunction-sections in KCFLAGS. Fix by copying the search string instead of modifying it in place. Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions") Link: http://lkml.kernel.org/r/910abd6b5a4945130fd44f787c24e07b9e07c8da.1542736240.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
0b9301fb |
|
20-Nov-2018 |
Artem Savkov <asavkov@redhat.com> |
objtool: Fix double-free in .cold detection error path If read_symbols() fails during second list traversal (the one dealing with ".cold" subfunctions) it frees the symbol, but never deletes it from the list/hash_table resulting in symbol being freed again in elf_close(). Fix it by just returning an error, leaving cleanup to elf_close(). Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions") Link: http://lkml.kernel.org/r/beac5a9b7da9e8be90223459dcbe07766ae437dd.1542736240.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
bcb6fb5d |
|
31-Oct-2018 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Support GCC 9 cold subfunction naming scheme Starting with GCC 8, a lot of unlikely code was moved out of line to "cold" subfunctions in .text.unlikely. For example, the unlikely bits of: irq_do_set_affinity() are moved out to the following subfunction: irq_do_set_affinity.cold.49() Starting with GCC 9, the numbered suffix has been removed. So in the above example, the cold subfunction is instead: irq_do_set_affinity.cold() Tweak the objtool subfunction detection logic so that it detects both GCC 8 and GCC 9 naming schemes. Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/015e9544b1f188d36a7f02fa31e9e95629aa5f50.1541040800.git.jpoimboe@redhat.com
|
#
4a60aa05 |
|
07-Sep-2018 |
Allan Xavier <allan.x.xavier@oracle.com> |
objtool: Support per-function rodata sections Add support for processing switch jump tables in objects with multiple .rodata sections, such as those created by '-ffunction-sections' and '-fdata-sections'. Currently, objtool always looks in .rodata for jump table information, which results in many "sibling call from callable instruction with modified stack frame" warnings with objects compiled using those flags. The fix is comprised of three parts: 1. Flagging all .rodata sections when importing ELF information for easier checking later. 2. Keeping a reference to the section each relocation is from in order to get the list_head for the other relocations in that section. 3. Finding jump tables by following relocations to .rodata sections, rather than always referencing a single global .rodata section. The patch has been tested without data sections enabled and no differences in the resulting orc unwind information were seen. Note that as objtool adds terminators to end of each .text section the unwind information generated between a function+data sections build and a normal build aren't directly comparable. Manual inspection suggests that objtool is now generating the correct information, or at least making more of an effort to do so than it did previously. Signed-off-by: Allan Xavier <allan.x.xavier@oracle.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/099bdc375195c490dda04db777ee0b95d566ded1.1536325914.git.jpoimboe@redhat.com
|
#
6d77d3b4 |
|
09-Jul-2018 |
Simon Ser <contact@emersion.fr> |
objtool: Use '.strtab' if '.shstrtab' doesn't exist, to support ORC tables on Clang Clang puts its section header names in the '.strtab' section instead of '.shstrtab', which causes objtool to fail with a "can't find .shstrtab section" warning when attempting to write ORC metadata to an object file. If '.shstrtab' doesn't exist, use '.strtab' instead. Signed-off-by: Simon Ser <contact@emersion.fr> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/d1c1c3fe55872be433da7bc5e1860538506229ba.1531153015.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
08b393d0 |
|
27-Jun-2018 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Support GCC 8 '-fnoreorder-functions' Since the following commit: cd77849a69cf ("objtool: Fix GCC 8 cold subfunction detection for aliased functions") ... if the kernel is built with EXTRA_CFLAGS='-fno-reorder-functions', objtool can get stuck in an infinite loop. That flag causes the new GCC 8 cold subfunctions to be placed in .text instead of .text.unlikely. But it also has an unfortunate quirk: in the symbol table, the subfunction (e.g., nmi_panic.cold.7) is nested inside the parent (nmi_panic). That function overlap confuses objtool, and causes it to get into an infinite loop in next_insn_same_func(). Here's Allan's description of the loop: "Objtool iterates through the instructions in nmi_panic using next_insn_same_func. Once it reaches the end of nmi_panic at 0x534 it jumps to 0x528 as that's the start of nmi_panic.cold.7. However, since the instructions starting at 0x528 are still associated with nmi_panic objtool will get stuck in a loop, continually jumping back to 0x528 after reaching 0x534." Fix it by shortening the length of the parent function so that the functions no longer overlap. Reported-and-analyzed-by: Allan Xavier <allan.x.xavier@oracle.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Allan Xavier <allan.x.xavier@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/9e704c52bee651129b036be14feda317ae5606ae.1530136978.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
13810435 |
|
09-May-2018 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Support GCC 8's cold subfunctions GCC 8 moves a lot of unlikely code out of line to "cold" subfunctions in .text.unlikely. Properly detect the new subfunctions and treat them as extensions of the original functions. This fixes a bunch of warnings like: kernel/cgroup/cgroup.o: warning: objtool: parse_cgroup_root_flags()+0x33: sibling call from callable instruction with modified stack frame kernel/cgroup/cgroup.o: warning: objtool: cgroup_addrm_files()+0x290: sibling call from callable instruction with modified stack frame kernel/cgroup/cgroup.o: warning: objtool: cgroup_apply_control_enable()+0x25b: sibling call from callable instruction with modified stack frame kernel/cgroup/cgroup.o: warning: objtool: rebind_subsystems()+0x325: sibling call from callable instruction with modified stack frame Reported-and-tested-by: damian <damian.tometzki@icloud.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0965e7fcfc5f31a276f0c7f298ff770c19b68706.1525923412.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
385d11b1 |
|
15-Jan-2018 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Improve error message for bad file argument If a nonexistent file is supplied to objtool, it complains with a non-helpful error: open: No such file or directory Improve it to: objtool: Can't open 'foo': No such file or directory Reported-by: Markus <M4rkusXXL@web.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/406a3d00a21225eee2819844048e17f68523ccf6.1516025651.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
97dab2ae |
|
15-Sep-2017 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Fix object file corruption Arnd Bergmann reported that a randconfig build was failing with the following link error: built-in.o: member arch/x86/kernel/time.o in archive is not an object It turns out the link failed because the time.o file had been corrupted by objtool: nm: arch/x86/kernel/time.o: File format not recognized In certain rare cases, when a .o file's ORC table is very small, the .data section size doesn't change because it's page aligned. Because all the existing sections haven't changed size, libelf doesn't detect any section header changes, and so it doesn't update the section header table properly. Instead it writes junk in the section header entries for the new ORC sections. Make sure libelf properly updates the section header table by setting the ELF_F_DIRTY flag in the top level elf struct. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 627fce14809b ("objtool: Add ORC unwind table generation") Link: http://lkml.kernel.org/r/e650fd0f2d8a209d1409a9785deb101fdaed55fb.1505459813.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
df968c93 |
|
15-Sep-2017 |
Petr Vandrovec <petr@vandrovec.name> |
objtool: Do not retrieve data from empty sections Binutils 2.29-9 in Debian return an error when elf_getdata is invoked on empty section (.note.GNU-stack in all kernel files), causing immediate failure of kernel build with: elf_getdata: can't manipulate null section As nothing is done with sections that have zero size, just do not retrieve their data at all. Signed-off-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2ce30a44349065b70d0f00e71e286dc0cbe745e6.1505459652.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
0998b7a0 |
|
14-Sep-2017 |
Martin Kepplinger <martink@posteo.de> |
objtool: Fix memory leak in elf_create_rela_section() Let's free the allocated char array 'relaname' before returning, in order to avoid leaking memory. Signed-off-by: Martin Kepplinger <martink@posteo.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: mingo.kernel.org@gmail.com Link: http://lkml.kernel.org/r/20170914060138.26472-1-martink@posteo.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
627fce14 |
|
11-Jul-2017 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Add ORC unwind table generation Now that objtool knows the states of all registers on the stack for each instruction, it's straightforward to generate debuginfo for an unwinder to use. Instead of generating DWARF, generate a new format called ORC, which is more suitable for an in-kernel unwinder. See Documentation/x86/orc-unwinder.txt for a more detailed description of this new debuginfo format and why it's preferable to DWARF. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/c9b9f01ba6c5ed2bdc9bb0957b78167fdbf9632e.1499786555.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
baa41469 |
|
28-Jun-2017 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Implement stack validation 2.0 This is a major rewrite of objtool. Instead of only tracking frame pointer changes, it now tracks all stack-related operations, including all register saves/restores. In addition to making stack validation more robust, this also paves the way for undwarf generation. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/678bd94c0566c6129bcc376cddb259c4c5633004.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
5c51f4ae |
|
02-Mar-2017 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Fix another GCC jump table detection issue Arnd Bergmann reported a (false positive) objtool warning: drivers/infiniband/sw/rxe/rxe_resp.o: warning: objtool: rxe_responder()+0xfe: sibling call from callable instruction with changed frame pointer The issue is in find_switch_table(). It tries to find a switch statement's jump table by walking backwards from an indirect jump instruction, looking for a relocation to the .rodata section. In this case it stopped walking prematurely: the first .rodata relocation it encountered was for a variable (resp_state_name) instead of a jump table, so it just assumed there wasn't a jump table. The fix is to ignore any .rodata relocation which refers to an ELF object symbol. This works because the jump tables are anonymous and have no symbols associated with them. Reported-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection") Link: http://lkml.kernel.org/r/20170302225723.3ndbsnl4hkqbne7a@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
774bec3f |
|
13-Jul-2016 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
objtool: Add fallback from ELF_C_READ_MMAP to ELF_C_READ Not all libelf implementations have this "Please, ELF_C_READ, but use mmap if possible" elf_begin() cmd, so provide a fallback to plain old ELF_C_READ. Case in point: Alpine Linux 3.4. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-1fctuknrawgoi5xqon4mu9dv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
#
042ba73f |
|
08-Mar-2016 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Add several performance improvements Use hash tables for instruction and rela lookups (and keep the linked lists around for sequential access). Also cache the section struct for the "__func_stack_frame_non_standard" section. With this change, "objtool check net/wireless/nl80211.o" goes from: real 0m1.168s user 0m1.163s sys 0m0.005s to: real 0m0.059s user 0m0.042s sys 0m0.017s for a 20x speedup. With the same object, it should be noted that the memory heap usage grew from 8MB to 62MB. Reducing the memory usage is on the TODO list. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/dd0d8e1449506cfa7701b4e7ba73577077c44253.1457502970.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
a196e171 |
|
08-Mar-2016 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Rename some variables and functions Rename some list heads to distinguish them from hash node heads, which are added later in the patch series. Also rename the get_*() functions to add_*(), which is more descriptive: they "add" data to the objtool_file struct. Also rename rodata_rela and text_rela to be clearer: - text_rela refers to a rela entry in .rela.text. - rodata_rela refers to a rela entry in .rela.rodata. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/ee0eca2bba8482aa45758958c5586c00a7b71e62.1457502970.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
442f04c3 |
|
28-Feb-2016 |
Josh Poimboeuf <jpoimboe@redhat.com> |
objtool: Add tool to perform compile-time stack metadata validation This adds a host tool named objtool which has a "check" subcommand which analyzes .o files to ensure the validity of stack metadata. It enforces a set of rules on asm code and C inline assembly code so that stack traces can be reliable. For each function, it recursively follows all possible code paths and validates the correct frame pointer state at each instruction. It also follows code paths involving kernel special sections, like .altinstructions, __jump_table, and __ex_table, which can add alternative execution paths to a given instruction (or set of instructions). Similarly, it knows how to follow switch statements, for which gcc sometimes uses jump tables. Here are some of the benefits of validating stack metadata: a) More reliable stack traces for frame pointer enabled kernels Frame pointers are used for debugging purposes. They allow runtime code and debug tools to be able to walk the stack to determine the chain of function call sites that led to the currently executing code. For some architectures, frame pointers are enabled by CONFIG_FRAME_POINTER. For some other architectures they may be required by the ABI (sometimes referred to as "backchain pointers"). For C code, gcc automatically generates instructions for setting up frame pointers when the -fno-omit-frame-pointer option is used. But for asm code, the frame setup instructions have to be written by hand, which most people don't do. So the end result is that CONFIG_FRAME_POINTER is honored for C code but not for most asm code. For stack traces based on frame pointers to be reliable, all functions which call other functions must first create a stack frame and update the frame pointer. If a first function doesn't properly create a stack frame before calling a second function, the *caller* of the first function will be skipped on the stack trace. For example, consider the following example backtrace with frame pointers enabled: [<ffffffff81812584>] dump_stack+0x4b/0x63 [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30 [<ffffffff8127f568>] seq_read+0x108/0x3e0 [<ffffffff812cce62>] proc_reg_read+0x42/0x70 [<ffffffff81256197>] __vfs_read+0x37/0x100 [<ffffffff81256b16>] vfs_read+0x86/0x130 [<ffffffff81257898>] SyS_read+0x58/0xd0 [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76 It correctly shows that the caller of cmdline_proc_show() is seq_read(). If we remove the frame pointer logic from cmdline_proc_show() by replacing the frame pointer related instructions with nops, here's what it looks like instead: [<ffffffff81812584>] dump_stack+0x4b/0x63 [<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30 [<ffffffff812cce62>] proc_reg_read+0x42/0x70 [<ffffffff81256197>] __vfs_read+0x37/0x100 [<ffffffff81256b16>] vfs_read+0x86/0x130 [<ffffffff81257898>] SyS_read+0x58/0xd0 [<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76 Notice that cmdline_proc_show()'s caller, seq_read(), has been skipped. Instead the stack trace seems to show that cmdline_proc_show() was called by proc_reg_read(). The benefit of "objtool check" here is that because it ensures that *all* functions honor CONFIG_FRAME_POINTER, no functions will ever[*] be skipped on a stack trace. [*] unless an interrupt or exception has occurred at the very beginning of a function before the stack frame has been created, or at the very end of the function after the stack frame has been destroyed. This is an inherent limitation of frame pointers. b) 100% reliable stack traces for DWARF enabled kernels This is not yet implemented. For more details about what is planned, see tools/objtool/Documentation/stack-validation.txt. c) Higher live patching compatibility rate This is not yet implemented. For more details about what is planned, see tools/objtool/Documentation/stack-validation.txt. To achieve the validation, "objtool check" enforces the following rules: 1. Each callable function must be annotated as such with the ELF function type. In asm code, this is typically done using the ENTRY/ENDPROC macros. If objtool finds a return instruction outside of a function, it flags an error since that usually indicates callable code which should be annotated accordingly. This rule is needed so that objtool can properly identify each callable function in order to analyze its stack metadata. 2. Conversely, each section of code which is *not* callable should *not* be annotated as an ELF function. The ENDPROC macro shouldn't be used in this case. This rule is needed so that objtool can ignore non-callable code. Such code doesn't have to follow any of the other rules. 3. Each callable function which calls another function must have the correct frame pointer logic, if required by CONFIG_FRAME_POINTER or the architecture's back chain rules. This can by done in asm code with the FRAME_BEGIN/FRAME_END macros. This rule ensures that frame pointer based stack traces will work as designed. If function A doesn't create a stack frame before calling function B, the _caller_ of function A will be skipped on the stack trace. 4. Dynamic jumps and jumps to undefined symbols are only allowed if: a) the jump is part of a switch statement; or b) the jump matches sibling call semantics and the frame pointer has the same value it had on function entry. This rule is needed so that objtool can reliably analyze all of a function's code paths. If a function jumps to code in another file, and it's not a sibling call, objtool has no way to follow the jump because it only analyzes a single file at a time. 5. A callable function may not execute kernel entry/exit instructions. The only code which needs such instructions is kernel entry code, which shouldn't be be in callable functions anyway. This rule is just a sanity check to ensure that callable functions return normally. It currently only supports x86_64. I tried to make the code generic so that support for other architectures can hopefully be plugged in relatively easily. On my Lenovo laptop with a i7-4810MQ 4-core/8-thread CPU, building the kernel with objtool checking every .o file adds about three seconds of total build time. It hasn't been optimized for performance yet, so there are probably some opportunities for better build performance. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/f3efb173de43bd067b060de73f856567c0fa1174.1456719558.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|