#
360784 |
|
07-May-2020 |
dim |
Merge llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and openmp llvmorg-10.0.0-0-gd32170dbd5b (aka 10.0.0 release), and a number of follow-ups.
MFC r356479 (by bdragon):
[PowerPC] Fix libllvmminimal build when building from powerpc64 ELFv1.
When bootstrapping on powerpc64 ELFv1, it is necessary to use binutils ld.bfd from ports for the bootstrap, as this is the only modern linker for ELFv1 host tools.
As binutils ld.bfd is rather strict in its handling of undefined symbols, it is necessary to pull in Support/Atomic.cpp to avoid an undefined symbol.
Reviewed by: dim, emaste Sponsored by: Tag1 Consulting, Inc. Differential Revision: https://reviews.freebsd.org/D23072
MFC r356930:
Add more Subversion mergeinfo bootstrap information, to hopefully increase the probability of merging in vendor changes.
MFC r358408 (by brooks):
Merge commit 7214f7a79 from llvm git (by Sam Elliott):
[RISCV] Lower llvm.trap and llvm.debugtrap
Summary: Until this commit, these have lowered to a call to abort().
`llvm.trap()` now lowers to `unimp`, which should trap on all systems.
`llvm.debugtrap()` now lowers to `ebreak`, which is exactly what this instruction is for.
Reviewers: asb, luismarques
Reviewed By: asb
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69390
This fixes miscompilation resulting in linking failures with INVARIANTS disabled.
Reviewed by: dim Differential Revision: https://reviews.freebsd.org/D23857
MFC r358851:
Merge llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and openmp 10.0.0-rc3 c290cb61fdc.
Release notes for llvm, clang, lld and libc++ 10.0.0 will become available here:
https://releases.llvm.org/10.0.0/docs/ReleaseNotes.html https://releases.llvm.org/10.0.0/tools/clang/docs/ReleaseNotes.html https://releases.llvm.org/10.0.0/tools/lld/docs/ReleaseNotes.html https://releases.llvm.org/10.0.0/projects/libcxx/docs/ReleaseNotes.html
PR: 244251
MFC r358854:
Add one additional file to libllvmminimal, to help the ppc64 bootstrap.
Reported by: bdragon PR: 244251
MFC r358857:
Move another file in libllvm from sources required for world, to sources required for bootstrap, as the PowerPC builds need this.
Reported by: bdragon PR: 244251
MFC r359082:
Merge llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and openmp llvmorg-10.0.0-rc4-5-g52c365aa9ca. The actual release should follow Real Soon Now.
PR: 244251
MFC r359084:
Merge commit 00925aadb from llvm git (by Fangrui Song):
[ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D75394
This is needed to fix miscompiled canonical PLTs on ppc32/lld10.
Requested by: bdragon Differential Revision: https://reviews.freebsd.org/D24109
MFC r359085:
Merge commit 315f8a55f from llvm git (by Fangrui Song):
[ELF][PPC32] Don't report "relocation refers to a discarded section" for .got2
Similar to D63182 [ELF][PPC64] Don't report "relocation refers to a discarded section" for .toc
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D75419
This is needed to fix compile errors when building for ppc32/lld10.
Requested by: bdragon Differential Revision: https://reviews.freebsd.org/D24110
MFC r359086:
Merge commit b8ebc11f0 from llvm git (by Sanjay Patel):
[EarlyCSE] avoid crashing when detecting min/max/abs patterns (PR41083)
As discussed in PR41083: https://bugs.llvm.org/show_bug.cgi?id=41083 ...we can assert/crash in EarlyCSE using the current hashing scheme and instructions with flags.
ValueTracking's matchSelectPattern() may rely on overflow (nsw, etc) or other flags when detecting patterns such as min/max/abs composed of compare+select. But the value numbering / hashing mechanism used by EarlyCSE intersects those flags to allow more CSE.
Several alternatives to solve this are discussed in the bug report. This patch avoids the issue by doing simple matching of min/max/abs patterns that never requires instruction flags. We give up some CSE power because of that, but that is not expected to result in much actual performance difference because InstCombine will canonicalize these patterns when possible. It even has this comment for abs/nabs:
/// Canonicalize all these variants to 1 pattern. /// This makes CSE more likely.
(And this patch adds PhaseOrdering tests to verify that the expected transforms are still happening in the standard optimization pipelines.
I left this code to use ValueTracking's "flavor" enum values, so we don't have to change the callers' code. If we decide to go back to using the ValueTracking call (by changing the hashing algorithm instead), it should be obvious how to replace this chunk.
Differential Revision: https://reviews.llvm.org/D74285
This fixes an assertion when building the math/gsl port on PowerPC64.
Requested by: pkubja
MFC r359087:
Merge commit 585a3cc31 from llvm git (by me):
Fix -Wdeprecated-copy-dtor and -Wdeprecated-dynamic-exception-spec warnings.
Summary: The former are like:
libcxx/include/typeinfo:322:11: warning: definition of implicit copy constructor for 'bad_cast' is deprecated because it has a user-declared destructor [-Wdeprecated-copy-dtor] virtual ~bad_cast() _NOEXCEPT; ^ libcxx/include/typeinfo:344:11: note: in implicit copy constructor for 'std::bad_cast' first required here throw bad_cast(); ^
Fix these by adding an explicitly defaulted copy constructor.
The latter are like:
libcxx/include/codecvt:105:37: warning: dynamic exception specifications are deprecated [-Wdeprecated-dynamic-exception-spec] virtual int do_encoding() const throw(); ^~~~~~~
Fix these by using the _NOEXCEPT macro instead.
Reviewers: EricWF, mclow.lists, ldionne, #libc
Reviewed By: EricWF, #libc
Subscribers: dexonsmith, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D76150
This is because we use -Wsystem-headers during buildworld, and the two warnings above are now triggered by default with clang 10, preventing most C++ code from compiling without NO_WERROR.
Requested by: brooks Differential Revision: https://reviews.freebsd.org/D24049
MFC r359333:
Merge commit f0990e104 from llvm git (by Justin Hibbits):
[PowerPC]: e500 target can't use lwsync, use msync instead
The e500 core has a silicon bug that triggers an illegal instruction program trap on any sync other than msync. Other cores will typically ignore illegal sync types, and the documentation even implies that the 'illegal' bits are ignored.
Address this hardware deficiency by only using msync, like the PPC440.
Differential Revision: https://reviews.llvm.org/D76614
Requested by: jhibbits
MFC r359334:
Merge commit 459e8e948 from llvm git (by Justin Hibbits):
[PowerPC]: Don't allow r0 as a target for LD_GOT_TPREL_L/32
Summary: The linker is free to relax this (relocation R_PPC_GOT_TPREL16) against R_PPC_TLS, if it sees fit (initial exec to local exec). If r0 is used, this can generate execution-invalid code (converts to 'addi %rX, %r0, FOO, which translates in PPC-lingo to li %rX, FOO). Forbid this instead.
This fixes static binaries using locales on FreeBSD/powerpc (tested on FreeBSD/powerpcspe).
Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D76662
Requested by: jhibbits
MFC r359338:
Merge llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and openmp llvmorg-10.0.0-0-gd32170dbd5b (aka 10.0.0 release).
PR: 244251
MFC r359506 (by emaste):
lldb: stop excluding bindings/ subdir
With liblua in the tree we should be able to enable lldb's lua scripting. We'll need the files in bindings/, so start by allowing them to come in with the next import.
Approved by: dim Sponsored by: The FreeBSD Foundation
MFC r359578:
Merge once more from ^/vendor/llvm-project/release-10.x, to get the lldb/bindings directory, which will be used to provide lua bindings for lldb.
Requested by: emaste
MFC r359826:
Merge commit 30588a739 from llvm git (by Erich Keane):
Make target features check work with ctor and dtor-
The problem was reported in PR45468, applying target features to an always_inline constructor/destructor runs afoul of GlobalDecl construction assert when checking for target-feature compatibility.
The core problem is fixed by using the version of the check that takes a FunctionDecl rather than the GlobalDecl. However, while writing the test, I discovered that source locations weren't properly set for this check on ctors/dtors. This patch also fixes constructors and CALLED destructors.
Unfortunately, it doesn't seem too possible to get a meaningful source location for a 'cleanup' destructor, so those are still 'frontend' level errors unfortunately. A fixme was added to the test to cover that situation.
This should fix 'Assertion failed: (!isa<CXXConstructorDecl>(D) && "Use other ctor with ctor decls!"), function Init, file /usr/src/contrib/llvm-project/clang/include/clang/AST/GlobalDecl.h, line 45' when compiling the security/botan2 port.
PR: 245550
MFC r359981:
Revert commit a9ad65a2b from llvm git (by Nemanja Ivanovic):
[PowerPC] Change default for unaligned FP access for older subtargets
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554
Some CPU's trap to the kernel on unaligned floating point access and there are kernels that do not handle the interrupt. The program then fails with a SIGBUS according to the PR. This just switches the default for unaligned access to only allow it on recent server CPUs that are known to allow this.
Differential revision: https://reviews.llvm.org/D71954
This upstream commit causes a compiler hang when building certain ports (e.g. security/nss, multimedia/x264) for powerpc64. The hang has been reported in https://bugs.llvm.org/show_bug.cgi?id=45186, but in the mean time it is more convenient to revert the commit.
Requested by: jhibbits
MFC r359994:
Revert commit b6cf400aa fro llvm git (by Nemanja Ivanovic):
Fix bots after a9ad65a2b34f
In the last commit, I neglected to initialize the new subtarget feature I added which caused failures on a few bots. This should fix that.
This unbreaks the build after r359981, which reverted upstream commit a9ad65a2b34f.
Reported by: jhibbits (and jenkins :)
MFC r360129:
Merge commit ce5173c0e from llvm git (by Reid Kleckner):
Use FinishThunk to finish musttail thunks
FinishThunk, and the invariant of setting and then unsetting CurCodeDecl, was added in 7f416cc42638 (2015). The invariant didn't exist when I added this musttail codepath in ab2090d10765 (2014). Recently in 28328c3771, I started using this codepath on non-Windows platforms, and users reported problems during release testing (PR44987).
The issue was already present for users of EH on i686-windows-msvc, so I added a test for that case as well.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D76444
This should fix 'Assertion failed: (!empty() && "popping exception stack when not empty"), function popTerminate, file /usr/src/contrib/llvm-project/clang/lib/CodeGen/CGCleanup.h, line 583' when building the net-p2p/libtorrent-rasterbar
PR: 244830 Reported by: jbeich, yuri
MFC r360134:
Merge commit 64b31d96d from llvm git (by Nemanja Ivanovic):
[PowerPC] Do not attempt to reuse load for 64-bit FP_TO_UINT without FPCVT
We call the function that attempts to reuse the conversion without checking whether the target matches the constraints that the callee expects. This patch adds the check prior to the call.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=43976
Differential revision: https://reviews.llvm.org/D77564
This should fix 'Assertion failed: ((Op.getOpcode() == ISD::FP_TO_SINT || Subtarget.hasFPCVT()) && "i64 FP_TO_UINT is supported only with FPCVT"), function LowerFP_TO_INTForReuse, file /usr/src/contrib/llvm/lib/Target/PowerPC/PPCISelLowering.cpp, line 7276' when building the devel/libslang2 port (and a few others) for PowerPC64.
Requested by: pkubaj
MFC r360350:
Tentatively apply https://reviews.llvm.org/D78877 (by Dave Green):
[ARM] Only produce qadd8b under hasV6Ops
When compiling for a arm5te cpu from clang, the +dsp attribute is set. This meant we could try and generate qadd8 instructions where we would end up having no pattern. I've changed the condition here to be hasV6Ops && hasDSP, which is what other parts of ARMISelLowering seem to use for similar instructions.
Fixed PR45677.
This fixes "fatal error: error in backend: Cannot select: t37: i32 = ARMISD::QADD8b t43, t44" when compiling sys/dev/sound/pcm/feeder_mixer.c for armv5. For some reason we do not encounter this on head, but this error popped up while building universes for stable/12.
MFC r360697:
In r358396 I merged llvm upstream commit 2e24219d3, which fixed "error: unsupported relocation on symbol" when assembling arm 'adr' pseudo instructions. However, the upstream commit did not take big-endian arm into account.
Applying the same changes to the big-endian handling is straightforward, thanks to Andrew Turner and Peter Smith for the hint. This will also be submitted upstream.
|
#
328817 |
|
02-Feb-2018 |
dim |
Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to 6.0.0 (branches/release_60 r324090).
This introduces retpoline support, with the -mretpoline flag. The upstream initial commit message (r323155 by Chandler Carruth) contains quite a bit of explanation. Quoting:
Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.
Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction.
There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typic al workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel.
When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline.
We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
MFC after: 3 months X-MFC-With: r327952 PR: 224669
|