1//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9/// \file
10/// This file is a part of MemorySanitizer, a detector of uninitialized
11/// reads.
12///
13/// The algorithm of the tool is similar to Memcheck
14/// (http://goo.gl/QKbem). We associate a few shadow bits with every
15/// byte of the application memory, poison the shadow of the malloc-ed
16/// or alloca-ed memory, load the shadow bits on every memory read,
/// propagate the shadow bits through some of the arithmetic
/// instructions (including MOV), store the shadow bits on every memory
19/// write, report a bug on some other instructions (e.g. JMP) if the
20/// associated shadow is poisoned.
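///
/// As a conceptual sketch (not the exact IR this pass emits; "_s" marks the
/// shadow of a value and shadow(p) the shadow address computed for p):
///
///   %c = add i32 %a, %b        ; shadow: %c_s = or i32 %a_s, %b_s
///   store i32 %c, ptr %p       ; shadow: store i32 %c_s, ptr shadow(%p)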
21///
/// But there are differences too. The first and most important one is that
/// we use compiler instrumentation instead of binary instrumentation. This
24/// gives us much better register allocation, possible compiler
25/// optimizations and a fast start-up. But this brings the major issue
26/// as well: msan needs to see all program events, including system
27/// calls and reads/writes in system libraries, so we either need to
28/// compile *everything* with msan or use a binary translation
29/// component (e.g. DynamoRIO) to instrument pre-built libraries.
30/// Another difference from Memcheck is that we use 8 shadow bits per
31/// byte of application memory and use a direct shadow mapping. This
32/// greatly simplifies the instrumentation code and avoids races on
33/// shadow updates (Memcheck is single-threaded so races are not a
34/// concern there. Memcheck uses 2 shadow bits per byte with a slow
35/// path storage that uses 8 bits per byte).
36///
37/// The default value of shadow is 0, which means "clean" (not poisoned).
38///
39/// Every module initializer should call __msan_init to ensure that the
40/// shadow memory is ready. On error, __msan_warning is called. Since
41/// parameters and return values may be passed via registers, we have a
42/// specialized thread-local shadow for return values
43/// (__msan_retval_tls) and parameters (__msan_param_tls).
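///
/// A rough sketch of the runtime's side of this contract (the real
/// declarations live in compiler-rt and may differ in exact types and
/// alignment; the array sizes correspond to kParamTLSSize/kRetvalTLSSize
/// below):
///
///   extern "C" {
///   thread_local uint64_t __msan_param_tls[800 / 8];
///   thread_local uint64_t __msan_retval_tls[800 / 8];
///   void __msan_init();
///   void __msan_warning();
///   }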
44///
45///                           Origin tracking.
46///
47/// MemorySanitizer can track origins (allocation points) of all uninitialized
48/// values. This behavior is controlled with a flag (msan-track-origins) and is
49/// disabled by default.
50///
51/// Origins are 4-byte values created and interpreted by the runtime library.
52/// They are stored in a second shadow mapping, one 4-byte value for 4 bytes
53/// of application memory. Propagation of origins is basically a bunch of
54/// "select" instructions that pick the origin of a dirty argument, if an
55/// instruction has one.
56///
57/// Every 4 aligned, consecutive bytes of application memory have one origin
58/// value associated with them. If these bytes contain uninitialized data
59/// coming from 2 different allocations, the last store wins. Because of this,
60/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
61/// practice.
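///
/// For illustration (hypothetical user code, not part of this file): if
/// uninitialized bytes from two different allocations end up in the same
/// aligned 4-byte word, the origin recorded for that word is the one of the
/// later store:
///
///   char *p = (char *)malloc(1), *q = (char *)malloc(1);
///   char buf[4];
///   buf[0] = p[0];  // word origin = allocation of p
///   buf[1] = q[0];  // word origin = allocation of q ("last store wins")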
62///
63/// Origins are meaningless for fully initialized values, so MemorySanitizer
64/// avoids storing origin to memory when a fully initialized value is stored.
/// This way it avoids needlessly overwriting the origin of the 4-byte region
/// on a short (i.e. 1 byte) clean store, and it is also good for performance.
67///
68///                            Atomic handling.
69///
/// Ideally, every atomic store of an application value should update the
/// corresponding shadow location in an atomic way. Unfortunately, an atomic
/// store to two disjoint locations cannot be done without severe slowdown.
73///
74/// Therefore, we implement an approximation that may err on the safe side.
75/// In this implementation, every atomically accessed location in the program
76/// may only change from (partially) uninitialized to fully initialized, but
77/// not the other way around. We load the shadow _after_ the application load,
78/// and we store the shadow _before_ the app store. Also, we always store clean
79/// shadow (if the application store is atomic). This way, if the store-load
80/// pair constitutes a happens-before arc, shadow store and load are correctly
81/// ordered such that the load will get either the value that was stored, or
82/// some later value (which is always clean).
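///
/// Conceptually, the emitted ordering is (a sketch, not the literal IR):
///
///   ; atomic store of %v to %p:
///   store clean shadow to shadow(%p)   ; shadow store first, always clean
///   atomic store %v to %p              ; application store second
///
///   ; atomic load from %p:
///   %v   = atomic load %p              ; application load first
///   %v_s = load shadow(%p)             ; shadow load second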
83///
84/// This does not work very well with Compare-And-Swap (CAS) and
85/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
86/// must store the new shadow before the app operation, and load the shadow
/// after the app operation. Computers don't work this way. The current
/// implementation ignores the load aspect of CAS/RMW, always returning a clean
89/// value. It implements the store part as a simple atomic store by storing a
90/// clean shadow.
91///
92///                      Instrumenting inline assembly.
93///
94/// For inline assembly code LLVM has little idea about which memory locations
/// become initialized depending on the arguments. It may be possible to figure
/// out which arguments are meant to point to inputs and outputs, but the
/// actual semantics may only be visible at runtime. In the Linux kernel it's
98/// also possible that the arguments only indicate the offset for a base taken
99/// from a segment register, so it's dangerous to treat any asm() arguments as
/// pointers. We take a conservative approach, generating calls to
///   __msan_instrument_asm_store(ptr, size),
/// which defer the memory unpoisoning to the runtime library.
103/// The latter can perform more complex address checks to figure out whether
104/// it's safe to touch the shadow memory.
105/// Like with atomic operations, we call __msan_instrument_asm_store() before
106/// the assembly call, so that changes to the shadow memory will be seen by
107/// other threads together with main memory initialization.
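///
/// For example, given a hypothetical asm statement with a memory output
///
///   uint32_t x;
///   asm("..." : "=m"(x));
///
/// the instrumentation inserts, before the asm statement,
///
///   __msan_instrument_asm_store(&x, sizeof(x));
///
/// and the runtime decides whether it is safe to touch the corresponding
/// shadow and unpoison those bytes.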
108///
109///                  KernelMemorySanitizer (KMSAN) implementation.
110///
111/// The major differences between KMSAN and MSan instrumentation are:
112///  - KMSAN always tracks the origins and implies msan-keep-going=true;
113///  - KMSAN allocates shadow and origin memory for each page separately, so
114///    there are no explicit accesses to shadow and origin in the
115///    instrumentation.
116///    Shadow and origin values for a particular X-byte memory location
117///    (X=1,2,4,8) are accessed through pointers obtained via the
118///      __msan_metadata_ptr_for_load_X(ptr)
119///      __msan_metadata_ptr_for_store_X(ptr)
///    functions. The corresponding functions check that the X-byte accesses
///    are possible and return the pointers to shadow and origin memory (see
///    the sketch after this list).
122///    Arbitrary sized accesses are handled with:
123///      __msan_metadata_ptr_for_load_n(ptr, size)
124///      __msan_metadata_ptr_for_store_n(ptr, size);
125///  - TLS variables are stored in a single per-task struct. A call to a
126///    function __msan_get_context_state() returning a pointer to that struct
127///    is inserted into every instrumented function before the entry block;
128///  - __msan_warning() takes a 32-bit origin parameter;
129///  - local variables are poisoned with __msan_poison_alloca() upon function
130///    entry and unpoisoned with __msan_unpoison_alloca() before leaving the
131///    function;
132///  - the pass doesn't declare any global variables or add global constructors
133///    to the translation unit.
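///
/// Sketch of the resulting instrumentation for a 4-byte load (conceptual
/// only; each __msan_metadata_ptr_for_* call returns a {shadow pointer,
/// origin pointer} pair):
///
///   {%sp, %op} = call __msan_metadata_ptr_for_load_4(%p)
///   %shadow = load from %sp      ; propagated like in userspace MSan
///   %origin = load from %op
///   %v = load i32 from %p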
134///
135/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
136/// calls, making sure we're on the safe side wrt. possible false positives.
137///
138///  KernelMemorySanitizer only supports X86_64 at the moment.
139///
140//
141// FIXME: This sanitizer does not yet handle scalable vectors
142//
143//===----------------------------------------------------------------------===//
144
145#include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
146#include "llvm/ADT/APInt.h"
147#include "llvm/ADT/ArrayRef.h"
148#include "llvm/ADT/DenseMap.h"
149#include "llvm/ADT/DepthFirstIterator.h"
150#include "llvm/ADT/SetVector.h"
151#include "llvm/ADT/SmallString.h"
152#include "llvm/ADT/SmallVector.h"
153#include "llvm/ADT/StringExtras.h"
154#include "llvm/ADT/StringRef.h"
155#include "llvm/ADT/Triple.h"
156#include "llvm/Analysis/GlobalsModRef.h"
157#include "llvm/Analysis/TargetLibraryInfo.h"
158#include "llvm/Analysis/ValueTracking.h"
159#include "llvm/IR/Argument.h"
160#include "llvm/IR/Attributes.h"
161#include "llvm/IR/BasicBlock.h"
162#include "llvm/IR/CallingConv.h"
163#include "llvm/IR/Constant.h"
164#include "llvm/IR/Constants.h"
165#include "llvm/IR/DataLayout.h"
166#include "llvm/IR/DerivedTypes.h"
167#include "llvm/IR/Function.h"
168#include "llvm/IR/GlobalValue.h"
169#include "llvm/IR/GlobalVariable.h"
170#include "llvm/IR/IRBuilder.h"
171#include "llvm/IR/InlineAsm.h"
172#include "llvm/IR/InstVisitor.h"
173#include "llvm/IR/InstrTypes.h"
174#include "llvm/IR/Instruction.h"
175#include "llvm/IR/Instructions.h"
176#include "llvm/IR/IntrinsicInst.h"
177#include "llvm/IR/Intrinsics.h"
178#include "llvm/IR/IntrinsicsX86.h"
179#include "llvm/IR/MDBuilder.h"
180#include "llvm/IR/Module.h"
181#include "llvm/IR/Type.h"
182#include "llvm/IR/Value.h"
183#include "llvm/IR/ValueMap.h"
184#include "llvm/Support/Alignment.h"
185#include "llvm/Support/AtomicOrdering.h"
186#include "llvm/Support/Casting.h"
187#include "llvm/Support/CommandLine.h"
188#include "llvm/Support/Debug.h"
189#include "llvm/Support/DebugCounter.h"
190#include "llvm/Support/ErrorHandling.h"
191#include "llvm/Support/MathExtras.h"
192#include "llvm/Support/raw_ostream.h"
193#include "llvm/Transforms/Utils/BasicBlockUtils.h"
194#include "llvm/Transforms/Utils/Local.h"
195#include "llvm/Transforms/Utils/ModuleUtils.h"
196#include <algorithm>
197#include <cassert>
198#include <cstddef>
199#include <cstdint>
200#include <memory>
201#include <string>
202#include <tuple>
203
204using namespace llvm;
205
206#define DEBUG_TYPE "msan"
207
208DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
209              "Controls which checks to insert");
210
211static const unsigned kOriginSize = 4;
212static const Align kMinOriginAlignment = Align(4);
213static const Align kShadowTLSAlignment = Align(8);
214
215// These constants must be kept in sync with the ones in msan.h.
216static const unsigned kParamTLSSize = 800;
217static const unsigned kRetvalTLSSize = 800;
218
// Access sizes are powers of two: 1, 2, 4, 8.
220static const size_t kNumberOfAccessSizes = 4;
221
222/// Track origins of uninitialized values.
223///
224/// Adds a section to MemorySanitizer report that points to the allocation
225/// (stack or heap) the uninitialized bits came from originally.
226static cl::opt<int> ClTrackOrigins(
227    "msan-track-origins",
228    cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
229    cl::init(0));
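
// Note: at the clang driver level, origin tracking is normally enabled with
// the corresponding -fsanitize-memory-track-origins[=<level>] flag, e.g.
//   clang -fsanitize=memory -fsanitize-memory-track-origins=2 test.c
// rather than by setting this cl::opt directly.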
230
231static cl::opt<bool> ClKeepGoing("msan-keep-going",
232                                 cl::desc("keep going after reporting a UMR"),
233                                 cl::Hidden, cl::init(false));
234
235static cl::opt<bool>
236    ClPoisonStack("msan-poison-stack",
237                  cl::desc("poison uninitialized stack variables"), cl::Hidden,
238                  cl::init(true));
239
240static cl::opt<bool> ClPoisonStackWithCall(
241    "msan-poison-stack-with-call",
242    cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
243    cl::init(false));
244
245static cl::opt<int> ClPoisonStackPattern(
246    "msan-poison-stack-pattern",
247    cl::desc("poison uninitialized stack variables with the given pattern"),
248    cl::Hidden, cl::init(0xff));
249
250static cl::opt<bool>
251    ClPrintStackNames("msan-print-stack-names",
252                      cl::desc("Print name of local stack variable"),
253                      cl::Hidden, cl::init(true));
254
255static cl::opt<bool> ClPoisonUndef("msan-poison-undef",
256                                   cl::desc("poison undef temps"), cl::Hidden,
257                                   cl::init(true));
258
259static cl::opt<bool>
260    ClHandleICmp("msan-handle-icmp",
261                 cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
262                 cl::Hidden, cl::init(true));
263
264static cl::opt<bool>
265    ClHandleICmpExact("msan-handle-icmp-exact",
266                      cl::desc("exact handling of relational integer ICmp"),
267                      cl::Hidden, cl::init(false));
268
269static cl::opt<bool> ClHandleLifetimeIntrinsics(
270    "msan-handle-lifetime-intrinsics",
271    cl::desc(
272        "when possible, poison scoped variables at the beginning of the scope "
273        "(slower, but more precise)"),
274    cl::Hidden, cl::init(true));
275
276// When compiling the Linux kernel, we sometimes see false positives related to
277// MSan being unable to understand that inline assembly calls may initialize
278// local variables.
279// This flag makes the compiler conservatively unpoison every memory location
// passed into an assembly call. Note that this may cause false negatives
// (real bugs hidden by the conservative unpoisoning).
281// Because it's impossible to figure out the array sizes, we can only unpoison
282// the first sizeof(type) bytes for each type* pointer.
283// The instrumentation is only enabled in KMSAN builds, and only if
284// -msan-handle-asm-conservative is on. This is done because we may want to
285// quickly disable assembly instrumentation when it breaks.
286static cl::opt<bool> ClHandleAsmConservative(
287    "msan-handle-asm-conservative",
288    cl::desc("conservative handling of inline assembly"), cl::Hidden,
289    cl::init(true));
290
// This flag controls whether we check the shadow of the address
// operand of a load or store. Such bugs are very rare, since a load from
// a garbage address typically results in SEGV, but they still happen
// (e.g. when only the lower bits of the address are garbage, or when the
// access happens early at program startup, where malloc-ed memory is more
// likely to be zeroed). As of 2012-08-28 this flag adds a 20% slowdown.
297static cl::opt<bool> ClCheckAccessAddress(
298    "msan-check-access-address",
299    cl::desc("report accesses through a pointer which has poisoned shadow"),
300    cl::Hidden, cl::init(true));
301
302static cl::opt<bool> ClEagerChecks(
303    "msan-eager-checks",
304    cl::desc("check arguments and return values at function call boundaries"),
305    cl::Hidden, cl::init(false));
306
307static cl::opt<bool> ClDumpStrictInstructions(
308    "msan-dump-strict-instructions",
309    cl::desc("print out instructions with default strict semantics"),
310    cl::Hidden, cl::init(false));
311
312static cl::opt<int> ClInstrumentationWithCallThreshold(
313    "msan-instrumentation-with-call-threshold",
314    cl::desc(
315        "If the function being instrumented requires more than "
316        "this number of checks and origin stores, use callbacks instead of "
317        "inline checks (-1 means never use callbacks)."),
318    cl::Hidden, cl::init(3500));
319
320static cl::opt<bool>
321    ClEnableKmsan("msan-kernel",
322                  cl::desc("Enable KernelMemorySanitizer instrumentation"),
323                  cl::Hidden, cl::init(false));
324
325static cl::opt<bool>
326    ClDisableChecks("msan-disable-checks",
327                    cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
328                    cl::init(false));
329
330static cl::opt<bool>
331    ClCheckConstantShadow("msan-check-constant-shadow",
332                          cl::desc("Insert checks for constant shadow values"),
333                          cl::Hidden, cl::init(true));
334
335// This is off by default because of a bug in gold:
336// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
337static cl::opt<bool>
338    ClWithComdat("msan-with-comdat",
339                 cl::desc("Place MSan constructors in comdat sections"),
340                 cl::Hidden, cl::init(false));
341
// These options allow specifying custom memory map parameters.
// See MemoryMapParams for details.
344static cl::opt<uint64_t> ClAndMask("msan-and-mask",
345                                   cl::desc("Define custom MSan AndMask"),
346                                   cl::Hidden, cl::init(0));
347
348static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
349                                   cl::desc("Define custom MSan XorMask"),
350                                   cl::Hidden, cl::init(0));
351
352static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
353                                      cl::desc("Define custom MSan ShadowBase"),
354                                      cl::Hidden, cl::init(0));
355
356static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
357                                      cl::desc("Define custom MSan OriginBase"),
358                                      cl::Hidden, cl::init(0));
359
360static cl::opt<int>
361    ClDisambiguateWarning("msan-disambiguate-warning-threshold",
362                          cl::desc("Define threshold for number of checks per "
363                                   "debug location to force origin update."),
364                          cl::Hidden, cl::init(3));
365
366const char kMsanModuleCtorName[] = "msan.module_ctor";
367const char kMsanInitName[] = "__msan_init";
368
369namespace {
370
371// Memory map parameters used in application-to-shadow address calculation.
372// Offset = (Addr & ~AndMask) ^ XorMask
373// Shadow = ShadowBase + Offset
374// Origin = OriginBase + Offset
375struct MemoryMapParams {
376  uint64_t AndMask;
377  uint64_t XorMask;
378  uint64_t ShadowBase;
379  uint64_t OriginBase;
380};
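
// For example, with the x86_64 Linux parameters below (AndMask and ShadowBase
// unused, XorMask = 0x500000000000, OriginBase = 0x100000000000), an
// application address A maps to:
//   Shadow(A) = A ^ 0x500000000000
//   Origin(A) = (A ^ 0x500000000000) + 0x100000000000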
381
382struct PlatformMemoryMapParams {
383  const MemoryMapParams *bits32;
384  const MemoryMapParams *bits64;
385};
386
387} // end anonymous namespace
388
389// i386 Linux
390static const MemoryMapParams Linux_I386_MemoryMapParams = {
391    0x000080000000, // AndMask
392    0,              // XorMask (not used)
393    0,              // ShadowBase (not used)
394    0x000040000000, // OriginBase
395};
396
397// x86_64 Linux
398static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
399    0,              // AndMask (not used)
400    0x500000000000, // XorMask
401    0,              // ShadowBase (not used)
402    0x100000000000, // OriginBase
403};
404
405// mips64 Linux
406static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
407    0,              // AndMask (not used)
408    0x008000000000, // XorMask
409    0,              // ShadowBase (not used)
410    0x002000000000, // OriginBase
411};
412
413// ppc64 Linux
414static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
415    0xE00000000000, // AndMask
416    0x100000000000, // XorMask
417    0x080000000000, // ShadowBase
418    0x1C0000000000, // OriginBase
419};
420
421// s390x Linux
422static const MemoryMapParams Linux_S390X_MemoryMapParams = {
423    0xC00000000000, // AndMask
424    0,              // XorMask (not used)
425    0x080000000000, // ShadowBase
426    0x1C0000000000, // OriginBase
427};
428
429// aarch64 Linux
430static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
431    0,               // AndMask (not used)
432    0x0B00000000000, // XorMask
433    0,               // ShadowBase (not used)
434    0x0200000000000, // OriginBase
435};
436
437// aarch64 FreeBSD
438static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
439    0x1800000000000, // AndMask
440    0x0400000000000, // XorMask
441    0x0200000000000, // ShadowBase
442    0x0700000000000, // OriginBase
443};
444
445// i386 FreeBSD
446static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
447    0x000180000000, // AndMask
448    0x000040000000, // XorMask
449    0x000020000000, // ShadowBase
450    0x000700000000, // OriginBase
451};
452
453// x86_64 FreeBSD
454static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
455    0xc00000000000, // AndMask
456    0x200000000000, // XorMask
457    0x100000000000, // ShadowBase
458    0x380000000000, // OriginBase
459};
460
461// x86_64 NetBSD
462static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
463    0,              // AndMask
464    0x500000000000, // XorMask
465    0,              // ShadowBase
466    0x100000000000, // OriginBase
467};
468
469static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
470    &Linux_I386_MemoryMapParams,
471    &Linux_X86_64_MemoryMapParams,
472};
473
474static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
475    nullptr,
476    &Linux_MIPS64_MemoryMapParams,
477};
478
479static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
480    nullptr,
481    &Linux_PowerPC64_MemoryMapParams,
482};
483
484static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
485    nullptr,
486    &Linux_S390X_MemoryMapParams,
487};
488
489static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
490    nullptr,
491    &Linux_AArch64_MemoryMapParams,
492};
493
494static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
495    nullptr,
496    &FreeBSD_AArch64_MemoryMapParams,
497};
498
499static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
500    &FreeBSD_I386_MemoryMapParams,
501    &FreeBSD_X86_64_MemoryMapParams,
502};
503
504static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
505    nullptr,
506    &NetBSD_X86_64_MemoryMapParams,
507};
508
509namespace {
510
511/// Instrument functions of a module to detect uninitialized reads.
512///
513/// Instantiating MemorySanitizer inserts the msan runtime library API function
514/// declarations into the module if they don't exist already. Instantiating
515/// ensures the __msan_init function is in the list of global constructors for
516/// the module.
517class MemorySanitizer {
518public:
519  MemorySanitizer(Module &M, MemorySanitizerOptions Options)
520      : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
521        Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
522    initializeModule(M);
523  }
524
525  // MSan cannot be moved or copied because of MapParams.
526  MemorySanitizer(MemorySanitizer &&) = delete;
527  MemorySanitizer &operator=(MemorySanitizer &&) = delete;
528  MemorySanitizer(const MemorySanitizer &) = delete;
529  MemorySanitizer &operator=(const MemorySanitizer &) = delete;
530
531  bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);
532
533private:
534  friend struct MemorySanitizerVisitor;
535  friend struct VarArgAMD64Helper;
536  friend struct VarArgMIPS64Helper;
537  friend struct VarArgAArch64Helper;
538  friend struct VarArgPowerPC64Helper;
539  friend struct VarArgSystemZHelper;
540
541  void initializeModule(Module &M);
542  void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
543  void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
544  void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);
545
546  /// True if we're compiling the Linux kernel.
547  bool CompileKernel;
548  /// Track origins (allocation points) of uninitialized values.
549  int TrackOrigins;
550  bool Recover;
551  bool EagerChecks;
552
553  LLVMContext *C;
554  Type *IntptrTy;
555  Type *OriginTy;
556
557  // XxxTLS variables represent the per-thread state in MSan and per-task state
558  // in KMSAN.
559  // For the userspace these point to thread-local globals. In the kernel land
560  // they point to the members of a per-task struct obtained via a call to
561  // __msan_get_context_state().
562
563  /// Thread-local shadow storage for function parameters.
564  Value *ParamTLS;
565
566  /// Thread-local origin storage for function parameters.
567  Value *ParamOriginTLS;
568
569  /// Thread-local shadow storage for function return value.
570  Value *RetvalTLS;
571
572  /// Thread-local origin storage for function return value.
573  Value *RetvalOriginTLS;
574
575  /// Thread-local shadow storage for in-register va_arg function
576  /// parameters (x86_64-specific).
577  Value *VAArgTLS;
578
579  /// Thread-local shadow storage for in-register va_arg function
580  /// parameters (x86_64-specific).
581  Value *VAArgOriginTLS;
582
583  /// Thread-local shadow storage for va_arg overflow area
584  /// (x86_64-specific).
585  Value *VAArgOverflowSizeTLS;
586
587  /// Are the instrumentation callbacks set up?
588  bool CallbacksInitialized = false;
589
590  /// The run-time callback to print a warning.
591  FunctionCallee WarningFn;
592
593  // These arrays are indexed by log2(AccessSize).
594  FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
595  FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];
596
597  /// Run-time helper that generates a new origin value for a stack
598  /// allocation.
599  FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
600  // No description version
601  FunctionCallee MsanSetAllocaOriginNoDescriptionFn;
602
603  /// Run-time helper that poisons stack on function entry.
604  FunctionCallee MsanPoisonStackFn;
605
606  /// Run-time helper that records a store (or any event) of an
607  /// uninitialized value and returns an updated origin id encoding this info.
608  FunctionCallee MsanChainOriginFn;
609
610  /// Run-time helper that paints an origin over a region.
611  FunctionCallee MsanSetOriginFn;
612
613  /// MSan runtime replacements for memmove, memcpy and memset.
614  FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
615
616  /// KMSAN callback for task-local function argument shadow.
617  StructType *MsanContextStateTy;
618  FunctionCallee MsanGetContextStateFn;
619
620  /// Functions for poisoning/unpoisoning local variables
621  FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;
622
623  /// Each of the MsanMetadataPtrXxx functions returns a pair of shadow/origin
624  /// pointers.
625  FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
626  FunctionCallee MsanMetadataPtrForLoad_1_8[4];
627  FunctionCallee MsanMetadataPtrForStore_1_8[4];
628  FunctionCallee MsanInstrumentAsmStoreFn;
629
630  /// Helper to choose between different MsanMetadataPtrXxx().
631  FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);
632
633  /// Memory map parameters used in application-to-shadow calculation.
634  const MemoryMapParams *MapParams;
635
  /// Custom memory map parameters used when -msan-shadow-base or
  /// -msan-origin-base is provided.
638  MemoryMapParams CustomMapParams;
639
640  MDNode *ColdCallWeights;
641
642  /// Branch weights for origin store.
643  MDNode *OriginStoreWeights;
644};
645
646void insertModuleCtor(Module &M) {
647  getOrCreateSanitizerCtorAndInitFunctions(
648      M, kMsanModuleCtorName, kMsanInitName,
649      /*InitArgTypes=*/{},
650      /*InitArgs=*/{},
651      // This callback is invoked when the functions are created the first
652      // time. Hook them into the global ctors list in that case:
653      [&](Function *Ctor, FunctionCallee) {
654        if (!ClWithComdat) {
655          appendToGlobalCtors(M, Ctor, 0);
656          return;
657        }
658        Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
659        Ctor->setComdat(MsanCtorComdat);
660        appendToGlobalCtors(M, Ctor, 0, Ctor);
661      });
662}
663
664template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
665  return (Opt.getNumOccurrences() > 0) ? Opt : Default;
666}
667
668} // end anonymous namespace
669
670MemorySanitizerOptions::MemorySanitizerOptions(int TO, bool R, bool K,
671                                               bool EagerChecks)
672    : Kernel(getOptOrDefault(ClEnableKmsan, K)),
673      TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
674      Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
675      EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
676
677PreservedAnalyses MemorySanitizerPass::run(Module &M,
678                                           ModuleAnalysisManager &AM) {
679  bool Modified = false;
680  if (!Options.Kernel) {
681    insertModuleCtor(M);
682    Modified = true;
683  }
684
685  auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
686  for (Function &F : M) {
687    if (F.empty())
688      continue;
689    MemorySanitizer Msan(*F.getParent(), Options);
690    Modified |=
691        Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
692  }
693
694  if (!Modified)
695    return PreservedAnalyses::all();
696
697  PreservedAnalyses PA = PreservedAnalyses::none();
698  // GlobalsAA is considered stateless and does not get invalidated unless
699  // explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
700  // make changes that require GlobalsAA to be invalidated.
701  PA.abandon<GlobalsAA>();
702  return PA;
703}
704
705void MemorySanitizerPass::printPipeline(
706    raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
707  static_cast<PassInfoMixin<MemorySanitizerPass> *>(this)->printPipeline(
708      OS, MapClassName2PassName);
709  OS << "<";
710  if (Options.Recover)
711    OS << "recover;";
712  if (Options.Kernel)
713    OS << "kernel;";
714  if (Options.EagerChecks)
715    OS << "eager-checks;";
716  OS << "track-origins=" << Options.TrackOrigins;
717  OS << ">";
718}
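
// For reference, the resulting textual pipeline element looks like (a sketch,
// assuming the pass is scheduled through the new pass manager):
//   msan<track-origins=2;recover>
// which is also the spelling accepted by `opt -passes=...`.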
719
720/// Create a non-const global initialized with the given string.
721///
722/// Creates a writable global for Str so that we can pass it to the
/// run-time lib. The runtime uses the first 4 bytes of the string to store
/// the frame ID, so the string needs to be mutable.
725static GlobalVariable *createPrivateConstGlobalForString(Module &M,
726                                                         StringRef Str) {
727  Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
728  return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
729                            GlobalValue::PrivateLinkage, StrConst, "");
730}
731
732/// Create KMSAN API callbacks.
733void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
734  IRBuilder<> IRB(*C);
735
736  // These will be initialized in insertKmsanPrologue().
737  RetvalTLS = nullptr;
738  RetvalOriginTLS = nullptr;
739  ParamTLS = nullptr;
740  ParamOriginTLS = nullptr;
741  VAArgTLS = nullptr;
742  VAArgOriginTLS = nullptr;
743  VAArgOverflowSizeTLS = nullptr;
744
745  WarningFn = M.getOrInsertFunction("__msan_warning",
746                                    TLI.getAttrList(C, {0}, /*Signed=*/false),
747                                    IRB.getVoidTy(), IRB.getInt32Ty());
748
749  // Requests the per-task context state (kmsan_context_state*) from the
750  // runtime library.
751  MsanContextStateTy = StructType::get(
752      ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
753      ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
754      ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
755      ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
756      IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
757      OriginTy);
758  MsanGetContextStateFn = M.getOrInsertFunction(
759      "__msan_get_context_state", PointerType::get(MsanContextStateTy, 0));
760
761  Type *RetTy = StructType::get(PointerType::get(IRB.getInt8Ty(), 0),
762                                PointerType::get(IRB.getInt32Ty(), 0));
763
764  for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
765    std::string name_load =
766        "__msan_metadata_ptr_for_load_" + std::to_string(size);
767    std::string name_store =
768        "__msan_metadata_ptr_for_store_" + std::to_string(size);
769    MsanMetadataPtrForLoad_1_8[ind] = M.getOrInsertFunction(
770        name_load, RetTy, PointerType::get(IRB.getInt8Ty(), 0));
771    MsanMetadataPtrForStore_1_8[ind] = M.getOrInsertFunction(
772        name_store, RetTy, PointerType::get(IRB.getInt8Ty(), 0));
773  }
774
775  MsanMetadataPtrForLoadN = M.getOrInsertFunction(
776      "__msan_metadata_ptr_for_load_n", RetTy,
777      PointerType::get(IRB.getInt8Ty(), 0), IRB.getInt64Ty());
778  MsanMetadataPtrForStoreN = M.getOrInsertFunction(
779      "__msan_metadata_ptr_for_store_n", RetTy,
780      PointerType::get(IRB.getInt8Ty(), 0), IRB.getInt64Ty());
781
782  // Functions for poisoning and unpoisoning memory.
783  MsanPoisonAllocaFn =
784      M.getOrInsertFunction("__msan_poison_alloca", IRB.getVoidTy(),
785                            IRB.getInt8PtrTy(), IntptrTy, IRB.getInt8PtrTy());
786  MsanUnpoisonAllocaFn = M.getOrInsertFunction(
787      "__msan_unpoison_alloca", IRB.getVoidTy(), IRB.getInt8PtrTy(), IntptrTy);
788}
789
790static Constant *getOrInsertGlobal(Module &M, StringRef Name, Type *Ty) {
791  return M.getOrInsertGlobal(Name, Ty, [&] {
792    return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
793                              nullptr, Name, nullptr,
794                              GlobalVariable::InitialExecTLSModel);
795  });
796}
797
798/// Insert declarations for userspace-specific functions and globals.
799void MemorySanitizer::createUserspaceApi(Module &M, const TargetLibraryInfo &TLI) {
800  IRBuilder<> IRB(*C);
801
802  // Create the callback.
803  // FIXME: this function should have "Cold" calling conv,
804  // which is not yet implemented.
805  if (TrackOrigins) {
806    StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
807                                      : "__msan_warning_with_origin_noreturn";
808    WarningFn = M.getOrInsertFunction(WarningFnName,
809                                      TLI.getAttrList(C, {0}, /*Signed=*/false),
810                                      IRB.getVoidTy(), IRB.getInt32Ty());
811  } else {
812    StringRef WarningFnName =
813        Recover ? "__msan_warning" : "__msan_warning_noreturn";
814    WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
815  }
816
817  // Create the global TLS variables.
818  RetvalTLS =
819      getOrInsertGlobal(M, "__msan_retval_tls",
820                        ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));
821
822  RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);
823
824  ParamTLS =
825      getOrInsertGlobal(M, "__msan_param_tls",
826                        ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
827
828  ParamOriginTLS =
829      getOrInsertGlobal(M, "__msan_param_origin_tls",
830                        ArrayType::get(OriginTy, kParamTLSSize / 4));
831
832  VAArgTLS =
833      getOrInsertGlobal(M, "__msan_va_arg_tls",
834                        ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
835
836  VAArgOriginTLS =
837      getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
838                        ArrayType::get(OriginTy, kParamTLSSize / 4));
839
840  VAArgOverflowSizeTLS =
841      getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls", IRB.getInt64Ty());
842
843  for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
844       AccessSizeIndex++) {
845    unsigned AccessSize = 1 << AccessSizeIndex;
846    std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
847    MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
848        FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
849        IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
850
851    FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
852    MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
853        FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
854        IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt8PtrTy(),
855        IRB.getInt32Ty());
856  }
857
858  MsanSetAllocaOriginWithDescriptionFn = M.getOrInsertFunction(
859      "__msan_set_alloca_origin_with_descr", IRB.getVoidTy(),
860      IRB.getInt8PtrTy(), IntptrTy, IRB.getInt8PtrTy(), IRB.getInt8PtrTy());
861  MsanSetAllocaOriginNoDescriptionFn = M.getOrInsertFunction(
862      "__msan_set_alloca_origin_no_descr", IRB.getVoidTy(), IRB.getInt8PtrTy(),
863      IntptrTy, IRB.getInt8PtrTy());
864  MsanPoisonStackFn = M.getOrInsertFunction(
865      "__msan_poison_stack", IRB.getVoidTy(), IRB.getInt8PtrTy(), IntptrTy);
866}
867
868/// Insert extern declaration of runtime-provided functions and globals.
869void MemorySanitizer::initializeCallbacks(Module &M, const TargetLibraryInfo &TLI) {
870  // Only do this once.
871  if (CallbacksInitialized)
872    return;
873
874  IRBuilder<> IRB(*C);
875  // Initialize callbacks that are common for kernel and userspace
876  // instrumentation.
877  MsanChainOriginFn = M.getOrInsertFunction(
878      "__msan_chain_origin",
879      TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
880      IRB.getInt32Ty());
881  MsanSetOriginFn = M.getOrInsertFunction(
882      "__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
883      IRB.getVoidTy(), IRB.getInt8PtrTy(), IntptrTy, IRB.getInt32Ty());
884  MemmoveFn =
885      M.getOrInsertFunction("__msan_memmove", IRB.getInt8PtrTy(),
886                            IRB.getInt8PtrTy(), IRB.getInt8PtrTy(), IntptrTy);
887  MemcpyFn =
888      M.getOrInsertFunction("__msan_memcpy", IRB.getInt8PtrTy(),
889                            IRB.getInt8PtrTy(), IRB.getInt8PtrTy(), IntptrTy);
890  MemsetFn = M.getOrInsertFunction(
891      "__msan_memset", TLI.getAttrList(C, {1}, /*Signed=*/true),
892      IRB.getInt8PtrTy(), IRB.getInt8PtrTy(), IRB.getInt32Ty(), IntptrTy);
893
894  MsanInstrumentAsmStoreFn =
895      M.getOrInsertFunction("__msan_instrument_asm_store", IRB.getVoidTy(),
896                            PointerType::get(IRB.getInt8Ty(), 0), IntptrTy);
897
898  if (CompileKernel) {
899    createKernelApi(M, TLI);
900  } else {
901    createUserspaceApi(M, TLI);
902  }
903  CallbacksInitialized = true;
904}
905
906FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
907                                                             int size) {
908  FunctionCallee *Fns =
909      isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
910  switch (size) {
911  case 1:
912    return Fns[0];
913  case 2:
914    return Fns[1];
915  case 4:
916    return Fns[2];
917  case 8:
918    return Fns[3];
919  default:
920    return nullptr;
921  }
922}
923
924/// Module-level initialization.
925///
/// Sets up the memory map parameters, commonly used types, and the
/// module-level globals (__msan_track_origins, __msan_keep_going).
927void MemorySanitizer::initializeModule(Module &M) {
928  auto &DL = M.getDataLayout();
929
930  bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
931  bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
932  // Check the overrides first
933  if (ShadowPassed || OriginPassed) {
934    CustomMapParams.AndMask = ClAndMask;
935    CustomMapParams.XorMask = ClXorMask;
936    CustomMapParams.ShadowBase = ClShadowBase;
937    CustomMapParams.OriginBase = ClOriginBase;
938    MapParams = &CustomMapParams;
939  } else {
940    Triple TargetTriple(M.getTargetTriple());
941    switch (TargetTriple.getOS()) {
942    case Triple::FreeBSD:
943      switch (TargetTriple.getArch()) {
944      case Triple::aarch64:
945        MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
946        break;
947      case Triple::x86_64:
948        MapParams = FreeBSD_X86_MemoryMapParams.bits64;
949        break;
950      case Triple::x86:
951        MapParams = FreeBSD_X86_MemoryMapParams.bits32;
952        break;
953      default:
954        report_fatal_error("unsupported architecture");
955      }
956      break;
957    case Triple::NetBSD:
958      switch (TargetTriple.getArch()) {
959      case Triple::x86_64:
960        MapParams = NetBSD_X86_MemoryMapParams.bits64;
961        break;
962      default:
963        report_fatal_error("unsupported architecture");
964      }
965      break;
966    case Triple::Linux:
967      switch (TargetTriple.getArch()) {
968      case Triple::x86_64:
969        MapParams = Linux_X86_MemoryMapParams.bits64;
970        break;
971      case Triple::x86:
972        MapParams = Linux_X86_MemoryMapParams.bits32;
973        break;
974      case Triple::mips64:
975      case Triple::mips64el:
976        MapParams = Linux_MIPS_MemoryMapParams.bits64;
977        break;
978      case Triple::ppc64:
979      case Triple::ppc64le:
980        MapParams = Linux_PowerPC_MemoryMapParams.bits64;
981        break;
982      case Triple::systemz:
983        MapParams = Linux_S390_MemoryMapParams.bits64;
984        break;
985      case Triple::aarch64:
986      case Triple::aarch64_be:
987        MapParams = Linux_ARM_MemoryMapParams.bits64;
988        break;
989      default:
990        report_fatal_error("unsupported architecture");
991      }
992      break;
993    default:
994      report_fatal_error("unsupported operating system");
995    }
996  }
997
998  C = &(M.getContext());
999  IRBuilder<> IRB(*C);
1000  IntptrTy = IRB.getIntPtrTy(DL);
1001  OriginTy = IRB.getInt32Ty();
1002
1003  ColdCallWeights = MDBuilder(*C).createBranchWeights(1, 1000);
1004  OriginStoreWeights = MDBuilder(*C).createBranchWeights(1, 1000);
1005
1006  if (!CompileKernel) {
1007    if (TrackOrigins)
1008      M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
1009        return new GlobalVariable(
1010            M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
1011            IRB.getInt32(TrackOrigins), "__msan_track_origins");
1012      });
1013
1014    if (Recover)
1015      M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
1016        return new GlobalVariable(M, IRB.getInt32Ty(), true,
1017                                  GlobalValue::WeakODRLinkage,
1018                                  IRB.getInt32(Recover), "__msan_keep_going");
1019      });
1020  }
1021}
1022
1023namespace {
1024
1025/// A helper class that handles instrumentation of VarArg
1026/// functions on a particular platform.
1027///
1028/// Implementations are expected to insert the instrumentation
1029/// necessary to propagate argument shadow through VarArg function
1030/// calls. Visit* methods are called during an InstVisitor pass over
1031/// the function, and should avoid creating new basic blocks. A new
1032/// instance of this class is created for each instrumented function.
1033struct VarArgHelper {
1034  virtual ~VarArgHelper() = default;
1035
1036  /// Visit a CallBase.
1037  virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;
1038
1039  /// Visit a va_start call.
1040  virtual void visitVAStartInst(VAStartInst &I) = 0;
1041
1042  /// Visit a va_copy call.
1043  virtual void visitVACopyInst(VACopyInst &I) = 0;
1044
1045  /// Finalize function instrumentation.
1046  ///
1047  /// This method is called after visiting all interesting (see above)
1048  /// instructions in a function.
1049  virtual void finalizeInstrumentation() = 0;
1050};
1051
1052struct MemorySanitizerVisitor;
1053
1054} // end anonymous namespace
1055
1056static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
1057                                        MemorySanitizerVisitor &Visitor);
1058
1059static unsigned TypeSizeToSizeIndex(unsigned TypeSize) {
1060  if (TypeSize <= 8)
1061    return 0;
1062  return Log2_32_Ceil((TypeSize + 7) / 8);
1063}
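
// For example (sizes are in bits): TypeSizeToSizeIndex(8) == 0,
// TypeSizeToSizeIndex(16) == 1, TypeSizeToSizeIndex(32) == 2 and
// TypeSizeToSizeIndex(64) == 3, matching the 1-, 2-, 4- and 8-byte slots of
// the MaybeWarningFn and MaybeStoreOriginFn arrays.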
1064
1065namespace {
1066
1067/// Helper class to attach debug information of the given instruction onto new
1068/// instructions inserted after.
1069class NextNodeIRBuilder : public IRBuilder<> {
1070public:
1071  explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
1072    SetCurrentDebugLocation(IP->getDebugLoc());
1073  }
1074};
1075
1076/// This class does all the work for a given function. Store and Load
1077/// instructions store and load corresponding shadow and origin
1078/// values. Most instructions propagate shadow from arguments to their
1079/// return values. Certain instructions (most importantly, BranchInst)
1080/// test their argument shadow and print reports (with a runtime call) if it's
1081/// non-zero.
1082struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
1083  Function &F;
1084  MemorySanitizer &MS;
1085  SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
1086  ValueMap<Value *, Value *> ShadowMap, OriginMap;
1087  std::unique_ptr<VarArgHelper> VAHelper;
1088  const TargetLibraryInfo *TLI;
1089  Instruction *FnPrologueEnd;
1090
1091  // The following flags disable parts of MSan instrumentation based on
1092  // exclusion list contents and command-line options.
1093  bool InsertChecks;
1094  bool PropagateShadow;
1095  bool PoisonStack;
1096  bool PoisonUndef;
1097
1098  struct ShadowOriginAndInsertPoint {
1099    Value *Shadow;
1100    Value *Origin;
1101    Instruction *OrigIns;
1102
1103    ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
1104        : Shadow(S), Origin(O), OrigIns(I) {}
1105  };
1106  SmallVector<ShadowOriginAndInsertPoint, 16> InstrumentationList;
1107  DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
1108  bool InstrumentLifetimeStart = ClHandleLifetimeIntrinsics;
1109  SmallSetVector<AllocaInst *, 16> AllocaSet;
1110  SmallVector<std::pair<IntrinsicInst *, AllocaInst *>, 16> LifetimeStartList;
1111  SmallVector<StoreInst *, 16> StoreList;
1112  int64_t SplittableBlocksCount = 0;
1113
1114  MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
1115                         const TargetLibraryInfo &TLI)
1116      : F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
1117    bool SanitizeFunction =
1118        F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
1119    InsertChecks = SanitizeFunction;
1120    PropagateShadow = SanitizeFunction;
1121    PoisonStack = SanitizeFunction && ClPoisonStack;
1122    PoisonUndef = SanitizeFunction && ClPoisonUndef;
1123
1124    // In the presence of unreachable blocks, we may see Phi nodes with
1125    // incoming nodes from such blocks. Since InstVisitor skips unreachable
1126    // blocks, such nodes will not have any shadow value associated with them.
1127    // It's easier to remove unreachable blocks than deal with missing shadow.
1128    removeUnreachableBlocks(F);
1129
1130    MS.initializeCallbacks(*F.getParent(), TLI);
1131    FnPrologueEnd = IRBuilder<>(F.getEntryBlock().getFirstNonPHI())
1132                        .CreateIntrinsic(Intrinsic::donothing, {}, {});
1133
1134    if (MS.CompileKernel) {
1135      IRBuilder<> IRB(FnPrologueEnd);
1136      insertKmsanPrologue(IRB);
1137    }
1138
1139    LLVM_DEBUG(if (!InsertChecks) dbgs()
1140               << "MemorySanitizer is not inserting checks into '"
1141               << F.getName() << "'\n");
1142  }
1143
1144  bool instrumentWithCalls(Value *V) {
1145    // Constants likely will be eliminated by follow-up passes.
1146    if (isa<Constant>(V))
1147      return false;
1148
1149    ++SplittableBlocksCount;
1150    return ClInstrumentationWithCallThreshold >= 0 &&
1151           SplittableBlocksCount > ClInstrumentationWithCallThreshold;
1152  }
1153
1154  bool isInPrologue(Instruction &I) {
1155    return I.getParent() == FnPrologueEnd->getParent() &&
1156           (&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
1157  }
1158
1159  // Creates a new origin and records the stack trace. In general we can call
1160  // this function for any origin manipulation we like. However it will cost
1161  // runtime resources. So use this wisely only if it can provide additional
1162  // information helpful to a user.
1163  Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
1164    if (MS.TrackOrigins <= 1)
1165      return V;
1166    return IRB.CreateCall(MS.MsanChainOriginFn, V);
1167  }
1168
1169  Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
1170    const DataLayout &DL = F.getParent()->getDataLayout();
1171    unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1172    if (IntptrSize == kOriginSize)
1173      return Origin;
1174    assert(IntptrSize == kOriginSize * 2);
1175    Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
1176    return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
1177  }
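
  // For example, on a 64-bit target originToIntptr() turns origin id
  // 0xAABBCCDD into 0xAABBCCDDAABBCCDD, so that a single intptr-sized store
  // in paintOrigin() paints two adjacent 4-byte origin slots at once.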
1178
1179  /// Fill memory range with the given origin value.
1180  void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
1181                   unsigned Size, Align Alignment) {
1182    const DataLayout &DL = F.getParent()->getDataLayout();
1183    const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
1184    unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1185    assert(IntptrAlignment >= kMinOriginAlignment);
1186    assert(IntptrSize >= kOriginSize);
1187
1188    unsigned Ofs = 0;
1189    Align CurrentAlignment = Alignment;
1190    if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
1191      Value *IntptrOrigin = originToIntptr(IRB, Origin);
1192      Value *IntptrOriginPtr =
1193          IRB.CreatePointerCast(OriginPtr, PointerType::get(MS.IntptrTy, 0));
1194      for (unsigned i = 0; i < Size / IntptrSize; ++i) {
1195        Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
1196                       : IntptrOriginPtr;
1197        IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
1198        Ofs += IntptrSize / kOriginSize;
1199        CurrentAlignment = IntptrAlignment;
1200      }
1201    }
1202
1203    for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
1204      Value *GEP =
1205          i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
1206      IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
1207      CurrentAlignment = kMinOriginAlignment;
1208    }
1209  }
1210
1211  void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
1212                   Value *OriginPtr, Align Alignment) {
1213    const DataLayout &DL = F.getParent()->getDataLayout();
1214    const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1215    unsigned StoreSize = DL.getTypeStoreSize(Shadow->getType());
1216    Value *ConvertedShadow = convertShadowToScalar(Shadow, IRB);
1217    if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1218      if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1219        // Origin is not needed: value is initialized or const shadow is
1220        // ignored.
1221        return;
1222      }
1223      if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1224        // Copy origin as the value is definitely uninitialized.
1225        paintOrigin(IRB, updateOrigin(Origin, IRB), OriginPtr, StoreSize,
1226                    OriginAlignment);
1227        return;
1228      }
1229      // Fallback to runtime check, which still can be optimized out later.
1230    }
1231
1232    unsigned TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1233    unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1234    if (instrumentWithCalls(ConvertedShadow) &&
1235        SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1236      FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
1237      Value *ConvertedShadow2 =
1238          IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1239      CallBase *CB = IRB.CreateCall(
1240          Fn, {ConvertedShadow2,
1241               IRB.CreatePointerCast(Addr, IRB.getInt8PtrTy()), Origin});
1242      CB->addParamAttr(0, Attribute::ZExt);
1243      CB->addParamAttr(2, Attribute::ZExt);
1244    } else {
1245      Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1246      Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1247          Cmp, &*IRB.GetInsertPoint(), false, MS.OriginStoreWeights);
1248      IRBuilder<> IRBNew(CheckTerm);
1249      paintOrigin(IRBNew, updateOrigin(Origin, IRBNew), OriginPtr, StoreSize,
1250                  OriginAlignment);
1251    }
1252  }
1253
1254  void materializeStores() {
1255    for (StoreInst *SI : StoreList) {
1256      IRBuilder<> IRB(SI);
1257      Value *Val = SI->getValueOperand();
1258      Value *Addr = SI->getPointerOperand();
1259      Value *Shadow = SI->isAtomic() ? getCleanShadow(Val) : getShadow(Val);
1260      Value *ShadowPtr, *OriginPtr;
1261      Type *ShadowTy = Shadow->getType();
1262      const Align Alignment = SI->getAlign();
1263      const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1264      std::tie(ShadowPtr, OriginPtr) =
1265          getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
1266
1267      StoreInst *NewSI = IRB.CreateAlignedStore(Shadow, ShadowPtr, Alignment);
1268      LLVM_DEBUG(dbgs() << "  STORE: " << *NewSI << "\n");
1269      (void)NewSI;
1270
1271      if (SI->isAtomic())
1272        SI->setOrdering(addReleaseOrdering(SI->getOrdering()));
1273
1274      if (MS.TrackOrigins && !SI->isAtomic())
1275        storeOrigin(IRB, Addr, Shadow, getOrigin(Val), OriginPtr,
1276                    OriginAlignment);
1277    }
1278  }
1279
  // Returns true if the debug location corresponds to multiple warnings.
1281  bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
1282    if (MS.TrackOrigins < 2)
1283      return false;
1284
1285    if (LazyWarningDebugLocationCount.empty())
1286      for (const auto &I : InstrumentationList)
1287        ++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
1288
1289    return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
1290  }
1291
1292  /// Helper function to insert a warning at IRB's current insert point.
1293  void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
1294    if (!Origin)
1295      Origin = (Value *)IRB.getInt32(0);
1296    assert(Origin->getType()->isIntegerTy());
1297
1298    if (shouldDisambiguateWarningLocation(IRB.getCurrentDebugLocation())) {
1299      // Try to create additional origin with debug info of the last origin
1300      // instruction. It may provide additional information to the user.
1301      if (Instruction *OI = dyn_cast_or_null<Instruction>(Origin)) {
1302        assert(MS.TrackOrigins);
1303        auto NewDebugLoc = OI->getDebugLoc();
1304        // Origin update with missing or the same debug location provides no
1305        // additional value.
1306        if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
1307          // Insert update just before the check, so we call runtime only just
1308          // before the report.
1309          IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
1310          IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
1311          Origin = updateOrigin(Origin, IRBOrigin);
1312        }
1313      }
1314    }
1315
1316    if (MS.CompileKernel || MS.TrackOrigins)
1317      IRB.CreateCall(MS.WarningFn, Origin)->setCannotMerge();
1318    else
1319      IRB.CreateCall(MS.WarningFn)->setCannotMerge();
1320    // FIXME: Insert UnreachableInst if !MS.Recover?
1321    // This may invalidate some of the following checks and needs to be done
1322    // at the very end.
1323  }
1324
1325  void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
1326                           Value *Origin) {
1327    const DataLayout &DL = F.getParent()->getDataLayout();
1328    unsigned TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1329    unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1330    if (instrumentWithCalls(ConvertedShadow) &&
1331        SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1332      FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
1333      Value *ConvertedShadow2 =
1334          IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1335      CallBase *CB = IRB.CreateCall(
1336          Fn, {ConvertedShadow2,
1337               MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1338      CB->addParamAttr(0, Attribute::ZExt);
1339      CB->addParamAttr(1, Attribute::ZExt);
1340    } else {
1341      Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1342      Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1343          Cmp, &*IRB.GetInsertPoint(),
1344          /* Unreachable */ !MS.Recover, MS.ColdCallWeights);
1345
1346      IRB.SetInsertPoint(CheckTerm);
1347      insertWarningFn(IRB, Origin);
1348      LLVM_DEBUG(dbgs() << "  CHECK: " << *Cmp << "\n");
1349    }
1350  }
1351
1352  void materializeInstructionChecks(
1353      ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
1354    const DataLayout &DL = F.getParent()->getDataLayout();
    // Disable combining in some cases. With origin tracking, each shadow must
    // be checked separately to pick the correct origin.
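    // Note: with combining enabled (below), every shadow checked for this
    // instruction is converted to an i1 and OR'ed into a single value, so only
    // one branch-and-warn sequence is materialized per instruction.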
1357    bool Combine = !MS.TrackOrigins;
1358    Instruction *Instruction = InstructionChecks.front().OrigIns;
1359    Value *Shadow = nullptr;
1360    for (const auto &ShadowData : InstructionChecks) {
1361      assert(ShadowData.OrigIns == Instruction);
1362      IRBuilder<> IRB(Instruction);
1363
1364      Value *ConvertedShadow = ShadowData.Shadow;
1365
1366      if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1367        if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1368          // Skip, value is initialized or const shadow is ignored.
1369          continue;
1370        }
1371        if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1372          // Report as the value is definitely uninitialized.
1373          insertWarningFn(IRB, ShadowData.Origin);
1374          if (!MS.Recover)
            return; // Always fail and stop here, no need to check the rest.
          // A warning has already been emitted for this check; move on to
          // the next one.
1377          continue;
1378        }
        // Fall back to a runtime check, which may still be optimized out
        // later.
1380      }
1381
1382      if (!Combine) {
1383        materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
1384        continue;
1385      }
1386
1387      if (!Shadow) {
1388        Shadow = ConvertedShadow;
1389        continue;
1390      }
1391
1392      Shadow = convertToBool(Shadow, IRB, "_mscmp");
1393      ConvertedShadow = convertToBool(ConvertedShadow, IRB, "_mscmp");
1394      Shadow = IRB.CreateOr(Shadow, ConvertedShadow, "_msor");
1395    }
1396
1397    if (Shadow) {
1398      assert(Combine);
1399      IRBuilder<> IRB(Instruction);
1400      materializeOneCheck(IRB, Shadow, nullptr);
1401    }
1402  }
1403
1404  void materializeChecks() {
1405    llvm::stable_sort(InstrumentationList,
1406                      [](const ShadowOriginAndInsertPoint &L,
1407                         const ShadowOriginAndInsertPoint &R) {
1408                        return L.OrigIns < R.OrigIns;
1409                      });
1410
1411    for (auto I = InstrumentationList.begin();
1412         I != InstrumentationList.end();) {
1413      auto J =
1414          std::find_if(I + 1, InstrumentationList.end(),
1415                       [L = I->OrigIns](const ShadowOriginAndInsertPoint &R) {
1416                         return L != R.OrigIns;
1417                       });
      // Process all checks of the instruction at once.
1419      materializeInstructionChecks(ArrayRef<ShadowOriginAndInsertPoint>(I, J));
1420      I = J;
1421    }
1422
1423    LLVM_DEBUG(dbgs() << "DONE:\n" << F);
1424  }
1425
  // Insert the KMSAN prologue: fetch the context state from the runtime and
  // cache pointers to its parameter/retval/va_arg shadow and origin fields.
1427  void insertKmsanPrologue(IRBuilder<> &IRB) {
1428    Value *ContextState = IRB.CreateCall(MS.MsanGetContextStateFn, {});
1429    Constant *Zero = IRB.getInt32(0);
1430    MS.ParamTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1431                                {Zero, IRB.getInt32(0)}, "param_shadow");
1432    MS.RetvalTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1433                                 {Zero, IRB.getInt32(1)}, "retval_shadow");
1434    MS.VAArgTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1435                                {Zero, IRB.getInt32(2)}, "va_arg_shadow");
1436    MS.VAArgOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1437                                      {Zero, IRB.getInt32(3)}, "va_arg_origin");
1438    MS.VAArgOverflowSizeTLS =
1439        IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1440                      {Zero, IRB.getInt32(4)}, "va_arg_overflow_size");
1441    MS.ParamOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1442                                      {Zero, IRB.getInt32(5)}, "param_origin");
1443    MS.RetvalOriginTLS =
1444        IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1445                      {Zero, IRB.getInt32(6)}, "retval_origin");
1446  }
1447
1448  /// Add MemorySanitizer instrumentation to a function.
1449  bool runOnFunction() {
1450    // Iterate all BBs in depth-first order and create shadow instructions
1451    // for all instructions (where applicable).
1452    // For PHI nodes we create dummy shadow PHIs which will be finalized later.
1453    for (BasicBlock *BB : depth_first(FnPrologueEnd->getParent()))
1454      visit(*BB);
1455
1456    // Finalize PHI nodes.
1457    for (PHINode *PN : ShadowPHINodes) {
1458      PHINode *PNS = cast<PHINode>(getShadow(PN));
1459      PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(getOrigin(PN)) : nullptr;
1460      size_t NumValues = PN->getNumIncomingValues();
1461      for (size_t v = 0; v < NumValues; v++) {
1462        PNS->addIncoming(getShadow(PN, v), PN->getIncomingBlock(v));
1463        if (PNO)
1464          PNO->addIncoming(getOrigin(PN, v), PN->getIncomingBlock(v));
1465      }
1466    }
1467
1468    VAHelper->finalizeInstrumentation();
1469
    // Poison allocas at their llvm.lifetime.start, unless we have fallen back
    // to instrumenting only the allocas themselves.
1472    if (InstrumentLifetimeStart) {
1473      for (auto Item : LifetimeStartList) {
1474        instrumentAlloca(*Item.second, Item.first);
1475        AllocaSet.remove(Item.second);
1476      }
1477    }
1478    // Poison the allocas for which we didn't instrument the corresponding
1479    // lifetime intrinsics.
1480    for (AllocaInst *AI : AllocaSet)
1481      instrumentAlloca(*AI);
1482
1483    // Insert shadow value checks.
1484    materializeChecks();
1485
    // Delayed instrumentation of StoreInst.
    // This must not add new address checks, as all checks have already been
    // materialized above.
1488    materializeStores();
1489
1490    return true;
1491  }
1492
1493  /// Compute the shadow type that corresponds to a given Value.
1494  Type *getShadowTy(Value *V) { return getShadowTy(V->getType()); }
1495
1496  /// Compute the shadow type that corresponds to a given Type.
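  ///
  /// For illustration (all cases follow directly from the code below):
  /// i32 keeps shadow type i32, <4 x float> maps to <4 x i32>,
  /// [8 x i16] maps to [8 x i16], and { i64, double } maps to { i64, i64 }.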
1497  Type *getShadowTy(Type *OrigTy) {
1498    if (!OrigTy->isSized()) {
1499      return nullptr;
1500    }
1501    // For integer type, shadow is the same as the original type.
1502    // This may return weird-sized types like i1.
1503    if (IntegerType *IT = dyn_cast<IntegerType>(OrigTy))
1504      return IT;
1505    const DataLayout &DL = F.getParent()->getDataLayout();
1506    if (VectorType *VT = dyn_cast<VectorType>(OrigTy)) {
1507      uint32_t EltSize = DL.getTypeSizeInBits(VT->getElementType());
1508      return FixedVectorType::get(IntegerType::get(*MS.C, EltSize),
1509                                  cast<FixedVectorType>(VT)->getNumElements());
1510    }
1511    if (ArrayType *AT = dyn_cast<ArrayType>(OrigTy)) {
1512      return ArrayType::get(getShadowTy(AT->getElementType()),
1513                            AT->getNumElements());
1514    }
1515    if (StructType *ST = dyn_cast<StructType>(OrigTy)) {
1516      SmallVector<Type *, 4> Elements;
1517      for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1518        Elements.push_back(getShadowTy(ST->getElementType(i)));
1519      StructType *Res = StructType::get(*MS.C, Elements, ST->isPacked());
1520      LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
1521      return Res;
1522    }
1523    uint32_t TypeSize = DL.getTypeSizeInBits(OrigTy);
1524    return IntegerType::get(*MS.C, TypeSize);
1525  }
1526
1527  /// Flatten a vector type.
1528  Type *getShadowTyNoVec(Type *ty) {
1529    if (VectorType *vt = dyn_cast<VectorType>(ty))
1530      return IntegerType::get(*MS.C,
1531                              vt->getPrimitiveSizeInBits().getFixedValue());
1532    return ty;
1533  }
1534
1535  /// Extract combined shadow of struct elements as a bool
1536  Value *collapseStructShadow(StructType *Struct, Value *Shadow,
1537                              IRBuilder<> &IRB) {
1538    Value *FalseVal = IRB.getIntN(/* width */ 1, /* value */ 0);
1539    Value *Aggregator = FalseVal;
1540
1541    for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
1542      // Combine by ORing together each element's bool shadow
1543      Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1544      Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
1545      Value *ShadowBool = convertToBool(ShadowInner, IRB);
1546
1547      if (Aggregator != FalseVal)
1548        Aggregator = IRB.CreateOr(Aggregator, ShadowBool);
1549      else
1550        Aggregator = ShadowBool;
1551    }
1552
1553    return Aggregator;
1554  }
1555
1556  // Extract combined shadow of array elements
1557  Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
1558                             IRBuilder<> &IRB) {
1559    if (!Array->getNumElements())
1560      return IRB.getIntN(/* width */ 1, /* value */ 0);
1561
1562    Value *FirstItem = IRB.CreateExtractValue(Shadow, 0);
1563    Value *Aggregator = convertShadowToScalar(FirstItem, IRB);
1564
1565    for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
1566      Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1567      Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
1568      Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
1569    }
1570    return Aggregator;
1571  }
1572
  /// Convert a shadow value to its flattened variant. The resulting shadow may
  /// not necessarily have the same bit width as the input value, but it will
  /// always be comparable to zero.
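  ///
  /// For example, a <4 x i32> vector shadow is bitcast to a single i128,
  /// while a struct shadow collapses to an i1 and an array shadow to the OR
  /// of its elements' scalar shadows (see the helpers above).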
1576  Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
1577    if (StructType *Struct = dyn_cast<StructType>(V->getType()))
1578      return collapseStructShadow(Struct, V, IRB);
1579    if (ArrayType *Array = dyn_cast<ArrayType>(V->getType()))
1580      return collapseArrayShadow(Array, V, IRB);
1581    Type *Ty = V->getType();
1582    Type *NoVecTy = getShadowTyNoVec(Ty);
1583    if (Ty == NoVecTy)
1584      return V;
1585    return IRB.CreateBitCast(V, NoVecTy);
1586  }
1587
1588  // Convert a scalar value to an i1 by comparing with 0
1589  Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
1590    Type *VTy = V->getType();
1591    if (!VTy->isIntegerTy())
1592      return convertToBool(convertShadowToScalar(V, IRB), IRB, name);
1593    if (VTy->getIntegerBitWidth() == 1)
1594      // Just converting a bool to a bool, so do nothing.
1595      return V;
1596    return IRB.CreateICmpNE(V, ConstantInt::get(VTy, 0), name);
1597  }
1598
1599  Type *ptrToIntPtrType(Type *PtrTy) const {
1600    if (FixedVectorType *VectTy = dyn_cast<FixedVectorType>(PtrTy)) {
1601      return FixedVectorType::get(ptrToIntPtrType(VectTy->getElementType()),
1602                                  VectTy->getNumElements());
1603    }
1604    assert(PtrTy->isIntOrPtrTy());
1605    return MS.IntptrTy;
1606  }
1607
1608  Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
1609    if (FixedVectorType *VectTy = dyn_cast<FixedVectorType>(IntPtrTy)) {
1610      return FixedVectorType::get(
1611          getPtrToShadowPtrType(VectTy->getElementType(), ShadowTy),
1612          VectTy->getNumElements());
1613    }
1614    assert(IntPtrTy == MS.IntptrTy);
1615    return ShadowTy->getPointerTo();
1616  }
1617
1618  Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
1619    if (FixedVectorType *VectTy = dyn_cast<FixedVectorType>(IntPtrTy)) {
1620      return ConstantDataVector::getSplat(
1621          VectTy->getNumElements(), constToIntPtr(VectTy->getElementType(), C));
1622    }
1623    assert(IntPtrTy == MS.IntptrTy);
1624    return ConstantInt::get(MS.IntptrTy, C);
1625  }
1626
1627  /// Compute the integer shadow offset that corresponds to a given
1628  /// application address.
1629  ///
1630  /// Offset = (Addr & ~AndMask) ^ XorMask
  /// Addr can be a ptr or <N x ptr>.
  /// Returns the offset as an intptr-sized integer, or a vector of such
  /// integers for a vector of pointers.
1634  Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
1635    Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1636    Value *OffsetLong = IRB.CreatePointerCast(Addr, IntptrTy);
1637
1638    if (uint64_t AndMask = MS.MapParams->AndMask)
1639      OffsetLong = IRB.CreateAnd(OffsetLong, constToIntPtr(IntptrTy, ~AndMask));
1640
1641    if (uint64_t XorMask = MS.MapParams->XorMask)
1642      OffsetLong = IRB.CreateXor(OffsetLong, constToIntPtr(IntptrTy, XorMask));
1643    return OffsetLong;
1644  }
1645
1646  /// Compute the shadow and origin addresses corresponding to a given
1647  /// application address.
1648  ///
1649  /// Shadow = ShadowBase + Offset
1650  /// Origin = (OriginBase + Offset) & ~3ULL
  /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type
  /// of a single pointee.
1653  /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
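  ///
  /// Illustrative example, assuming the default x86_64 Linux mapping
  /// parameters (AndMask == 0, XorMask == 0x500000000000, ShadowBase == 0,
  /// OriginBase == 0x100000000000): an access at 0x700000001234 has its
  /// shadow at 0x200000001234 and its origin slot at
  /// 0x300000001234 & ~3ULL == 0x300000001234.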
1654  std::pair<Value *, Value *>
1655  getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
1656                              MaybeAlign Alignment) {
1657    Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1658    Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
1659    Value *ShadowLong = ShadowOffset;
1660    if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
1661      ShadowLong =
1662          IRB.CreateAdd(ShadowLong, constToIntPtr(IntptrTy, ShadowBase));
1663    }
1664    Value *ShadowPtr = IRB.CreateIntToPtr(
1665        ShadowLong, getPtrToShadowPtrType(IntptrTy, ShadowTy));
1666
1667    Value *OriginPtr = nullptr;
1668    if (MS.TrackOrigins) {
1669      Value *OriginLong = ShadowOffset;
1670      uint64_t OriginBase = MS.MapParams->OriginBase;
1671      if (OriginBase != 0)
1672        OriginLong =
1673            IRB.CreateAdd(OriginLong, constToIntPtr(IntptrTy, OriginBase));
1674      if (!Alignment || *Alignment < kMinOriginAlignment) {
1675        uint64_t Mask = kMinOriginAlignment.value() - 1;
1676        OriginLong = IRB.CreateAnd(OriginLong, constToIntPtr(IntptrTy, ~Mask));
1677      }
1678      OriginPtr = IRB.CreateIntToPtr(
1679          OriginLong, getPtrToShadowPtrType(IntptrTy, MS.OriginTy));
1680    }
1681    return std::make_pair(ShadowPtr, OriginPtr);
1682  }
1683
1684  std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
1685                                                            IRBuilder<> &IRB,
1686                                                            Type *ShadowTy,
1687                                                            bool isStore) {
1688    Value *ShadowOriginPtrs;
1689    const DataLayout &DL = F.getParent()->getDataLayout();
1690    int Size = DL.getTypeStoreSize(ShadowTy);
1691
1692    FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, Size);
1693    Value *AddrCast =
1694        IRB.CreatePointerCast(Addr, PointerType::get(IRB.getInt8Ty(), 0));
1695    if (Getter) {
1696      ShadowOriginPtrs = IRB.CreateCall(Getter, AddrCast);
1697    } else {
1698      Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
1699      ShadowOriginPtrs = IRB.CreateCall(isStore ? MS.MsanMetadataPtrForStoreN
1700                                                : MS.MsanMetadataPtrForLoadN,
1701                                        {AddrCast, SizeVal});
1702    }
1703    Value *ShadowPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 0);
1704    ShadowPtr = IRB.CreatePointerCast(ShadowPtr, PointerType::get(ShadowTy, 0));
1705    Value *OriginPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 1);
1706
1707    return std::make_pair(ShadowPtr, OriginPtr);
1708  }
1709
  /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type
  /// of a single pointee.
1712  /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1713  std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
1714                                                       IRBuilder<> &IRB,
1715                                                       Type *ShadowTy,
1716                                                       bool isStore) {
1717    FixedVectorType *VectTy = dyn_cast<FixedVectorType>(Addr->getType());
1718    if (!VectTy) {
1719      assert(Addr->getType()->isPointerTy());
1720      return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
1721    }
1722
    // TODO: Support callbacks with vectors of addresses.
1724    unsigned NumElements = VectTy->getNumElements();
1725    Value *ShadowPtrs = ConstantInt::getNullValue(
1726        FixedVectorType::get(ShadowTy->getPointerTo(), NumElements));
1727    Value *OriginPtrs = nullptr;
1728    if (MS.TrackOrigins)
1729      OriginPtrs = ConstantInt::getNullValue(
1730          FixedVectorType::get(MS.OriginTy->getPointerTo(), NumElements));
1731    for (unsigned i = 0; i < NumElements; ++i) {
1732      Value *OneAddr =
1733          IRB.CreateExtractElement(Addr, ConstantInt::get(IRB.getInt32Ty(), i));
1734      auto [ShadowPtr, OriginPtr] =
1735          getShadowOriginPtrKernelNoVec(OneAddr, IRB, ShadowTy, isStore);
1736
1737      ShadowPtrs = IRB.CreateInsertElement(
1738          ShadowPtrs, ShadowPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1739      if (MS.TrackOrigins)
1740        OriginPtrs = IRB.CreateInsertElement(
1741            OriginPtrs, OriginPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1742    }
1743    return {ShadowPtrs, OriginPtrs};
1744  }
1745
1746  std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
1747                                                 Type *ShadowTy,
1748                                                 MaybeAlign Alignment,
1749                                                 bool isStore) {
1750    if (MS.CompileKernel)
1751      return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
1752    return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
1753  }
1754
1755  /// Compute the shadow address for a given function argument.
1756  ///
1757  /// Shadow = ParamTLS+ArgOffset.
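  ///
  /// For example, for a function f(i32 %a, i64 %b), %a's shadow is read from
  /// ParamTLS + 0 and %b's from ParamTLS + 8, since argument slots are rounded
  /// up to kShadowTLSAlignment (8 bytes).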
1758  Value *getShadowPtrForArgument(Value *A, IRBuilder<> &IRB, int ArgOffset) {
1759    Value *Base = IRB.CreatePointerCast(MS.ParamTLS, MS.IntptrTy);
1760    if (ArgOffset)
1761      Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1762    return IRB.CreateIntToPtr(Base, PointerType::get(getShadowTy(A), 0),
1763                              "_msarg");
1764  }
1765
1766  /// Compute the origin address for a given function argument.
1767  Value *getOriginPtrForArgument(Value *A, IRBuilder<> &IRB, int ArgOffset) {
1768    if (!MS.TrackOrigins)
1769      return nullptr;
1770    Value *Base = IRB.CreatePointerCast(MS.ParamOriginTLS, MS.IntptrTy);
1771    if (ArgOffset)
1772      Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1773    return IRB.CreateIntToPtr(Base, PointerType::get(MS.OriginTy, 0),
1774                              "_msarg_o");
1775  }
1776
1777  /// Compute the shadow address for a retval.
1778  Value *getShadowPtrForRetval(Value *A, IRBuilder<> &IRB) {
1779    return IRB.CreatePointerCast(MS.RetvalTLS,
1780                                 PointerType::get(getShadowTy(A), 0), "_msret");
1781  }
1782
1783  /// Compute the origin address for a retval.
1784  Value *getOriginPtrForRetval(IRBuilder<> &IRB) {
1785    // We keep a single origin for the entire retval. Might be too optimistic.
1786    return MS.RetvalOriginTLS;
1787  }
1788
1789  /// Set SV to be the shadow value for V.
1790  void setShadow(Value *V, Value *SV) {
1791    assert(!ShadowMap.count(V) && "Values may only have one shadow");
1792    ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
1793  }
1794
1795  /// Set Origin to be the origin value for V.
1796  void setOrigin(Value *V, Value *Origin) {
1797    if (!MS.TrackOrigins)
1798      return;
1799    assert(!OriginMap.count(V) && "Values may only have one origin");
1800    LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << "  ==> " << *Origin << "\n");
1801    OriginMap[V] = Origin;
1802  }
1803
1804  Constant *getCleanShadow(Type *OrigTy) {
1805    Type *ShadowTy = getShadowTy(OrigTy);
1806    if (!ShadowTy)
1807      return nullptr;
1808    return Constant::getNullValue(ShadowTy);
1809  }
1810
1811  /// Create a clean shadow value for a given value.
1812  ///
1813  /// Clean shadow (all zeroes) means all bits of the value are defined
1814  /// (initialized).
1815  Constant *getCleanShadow(Value *V) { return getCleanShadow(V->getType()); }
1816
1817  /// Create a dirty shadow of a given shadow type.
1818  Constant *getPoisonedShadow(Type *ShadowTy) {
1819    assert(ShadowTy);
1820    if (isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy))
1821      return Constant::getAllOnesValue(ShadowTy);
1822    if (ArrayType *AT = dyn_cast<ArrayType>(ShadowTy)) {
1823      SmallVector<Constant *, 4> Vals(AT->getNumElements(),
1824                                      getPoisonedShadow(AT->getElementType()));
1825      return ConstantArray::get(AT, Vals);
1826    }
1827    if (StructType *ST = dyn_cast<StructType>(ShadowTy)) {
1828      SmallVector<Constant *, 4> Vals;
1829      for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1830        Vals.push_back(getPoisonedShadow(ST->getElementType(i)));
1831      return ConstantStruct::get(ST, Vals);
1832    }
1833    llvm_unreachable("Unexpected shadow type");
1834  }
1835
1836  /// Create a dirty shadow for a given value.
1837  Constant *getPoisonedShadow(Value *V) {
1838    Type *ShadowTy = getShadowTy(V);
1839    if (!ShadowTy)
1840      return nullptr;
1841    return getPoisonedShadow(ShadowTy);
1842  }
1843
1844  /// Create a clean (zero) origin.
1845  Value *getCleanOrigin() { return Constant::getNullValue(MS.OriginTy); }
1846
1847  /// Get the shadow value for a given Value.
1848  ///
  /// This function either returns the value set earlier with setShadow,
  /// or extracts it from ParamTLS (for function arguments).
1851  Value *getShadow(Value *V) {
1852    if (Instruction *I = dyn_cast<Instruction>(V)) {
1853      if (!PropagateShadow || I->getMetadata(LLVMContext::MD_nosanitize))
1854        return getCleanShadow(V);
1855      // For instructions the shadow is already stored in the map.
1856      Value *Shadow = ShadowMap[V];
1857      if (!Shadow) {
1858        LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
1859        (void)I;
1860        assert(Shadow && "No shadow for a value");
1861      }
1862      return Shadow;
1863    }
1864    if (UndefValue *U = dyn_cast<UndefValue>(V)) {
1865      Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
1866                                                        : getCleanShadow(V);
1867      LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
1868      (void)U;
1869      return AllOnes;
1870    }
1871    if (Argument *A = dyn_cast<Argument>(V)) {
1872      // For arguments we compute the shadow on demand and store it in the map.
1873      Value *&ShadowPtr = ShadowMap[V];
1874      if (ShadowPtr)
1875        return ShadowPtr;
1876      Function *F = A->getParent();
1877      IRBuilder<> EntryIRB(FnPrologueEnd);
1878      unsigned ArgOffset = 0;
1879      const DataLayout &DL = F->getParent()->getDataLayout();
1880      for (auto &FArg : F->args()) {
1881        if (!FArg.getType()->isSized()) {
1882          LLVM_DEBUG(dbgs() << "Arg is not sized\n");
1883          continue;
1884        }
1885
1886        unsigned Size = FArg.hasByValAttr()
1887                            ? DL.getTypeAllocSize(FArg.getParamByValType())
1888                            : DL.getTypeAllocSize(FArg.getType());
1889
1890        if (A == &FArg) {
1891          bool Overflow = ArgOffset + Size > kParamTLSSize;
1892          if (FArg.hasByValAttr()) {
1893            // ByVal pointer itself has clean shadow. We copy the actual
1894            // argument shadow to the underlying memory.
1895            // Figure out maximal valid memcpy alignment.
1896            const Align ArgAlign = DL.getValueOrABITypeAlignment(
1897                FArg.getParamAlign(), FArg.getParamByValType());
1898            Value *CpShadowPtr, *CpOriginPtr;
1899            std::tie(CpShadowPtr, CpOriginPtr) =
1900                getShadowOriginPtr(V, EntryIRB, EntryIRB.getInt8Ty(), ArgAlign,
1901                                   /*isStore*/ true);
1902            if (!PropagateShadow || Overflow) {
1903              // ParamTLS overflow.
1904              EntryIRB.CreateMemSet(
1905                  CpShadowPtr, Constant::getNullValue(EntryIRB.getInt8Ty()),
1906                  Size, ArgAlign);
1907            } else {
1908              Value *Base = getShadowPtrForArgument(&FArg, EntryIRB, ArgOffset);
1909              const Align CopyAlign = std::min(ArgAlign, kShadowTLSAlignment);
1910              Value *Cpy = EntryIRB.CreateMemCpy(CpShadowPtr, CopyAlign, Base,
1911                                                 CopyAlign, Size);
1912              LLVM_DEBUG(dbgs() << "  ByValCpy: " << *Cpy << "\n");
1913              (void)Cpy;
1914
1915              if (MS.TrackOrigins) {
1916                Value *OriginPtr =
1917                    getOriginPtrForArgument(&FArg, EntryIRB, ArgOffset);
1918                // FIXME: OriginSize should be:
1919                // alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
1920                unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
1921                EntryIRB.CreateMemCpy(
1922                    CpOriginPtr,
1923                    /* by getShadowOriginPtr */ kMinOriginAlignment, OriginPtr,
1924                    /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
1925                    OriginSize);
1926              }
1927            }
1928          }
1929
1930          if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
1931              (MS.EagerChecks && FArg.hasAttribute(Attribute::NoUndef))) {
1932            ShadowPtr = getCleanShadow(V);
1933            setOrigin(A, getCleanOrigin());
1934          } else {
1935            // Shadow over TLS
1936            Value *Base = getShadowPtrForArgument(&FArg, EntryIRB, ArgOffset);
1937            ShadowPtr = EntryIRB.CreateAlignedLoad(getShadowTy(&FArg), Base,
1938                                                   kShadowTLSAlignment);
1939            if (MS.TrackOrigins) {
1940              Value *OriginPtr =
1941                  getOriginPtrForArgument(&FArg, EntryIRB, ArgOffset);
1942              setOrigin(A, EntryIRB.CreateLoad(MS.OriginTy, OriginPtr));
1943            }
1944          }
1945          LLVM_DEBUG(dbgs()
1946                     << "  ARG:    " << FArg << " ==> " << *ShadowPtr << "\n");
1947          break;
1948        }
1949
1950        ArgOffset += alignTo(Size, kShadowTLSAlignment);
1951      }
1952      assert(ShadowPtr && "Could not find shadow for an argument");
1953      return ShadowPtr;
1954    }
1955    // For everything else the shadow is zero.
1956    return getCleanShadow(V);
1957  }
1958
1959  /// Get the shadow for i-th argument of the instruction I.
1960  Value *getShadow(Instruction *I, int i) {
1961    return getShadow(I->getOperand(i));
1962  }
1963
1964  /// Get the origin for a value.
1965  Value *getOrigin(Value *V) {
1966    if (!MS.TrackOrigins)
1967      return nullptr;
1968    if (!PropagateShadow || isa<Constant>(V) || isa<InlineAsm>(V))
1969      return getCleanOrigin();
1970    assert((isa<Instruction>(V) || isa<Argument>(V)) &&
1971           "Unexpected value type in getOrigin()");
1972    if (Instruction *I = dyn_cast<Instruction>(V)) {
1973      if (I->getMetadata(LLVMContext::MD_nosanitize))
1974        return getCleanOrigin();
1975    }
1976    Value *Origin = OriginMap[V];
1977    assert(Origin && "Missing origin");
1978    return Origin;
1979  }
1980
1981  /// Get the origin for i-th argument of the instruction I.
1982  Value *getOrigin(Instruction *I, int i) {
1983    return getOrigin(I->getOperand(i));
1984  }
1985
1986  /// Remember the place where a shadow check should be inserted.
1987  ///
  /// This location will be later instrumented with a check that will print a
  /// UMR warning at runtime if the shadow value is not 0.
1990  void insertShadowCheck(Value *Shadow, Value *Origin, Instruction *OrigIns) {
1991    assert(Shadow);
1992    if (!InsertChecks)
1993      return;
1994
1995    if (!DebugCounter::shouldExecute(DebugInsertCheck)) {
1996      LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
1997                        << *OrigIns << "\n");
1998      return;
1999    }
2000#ifndef NDEBUG
2001    Type *ShadowTy = Shadow->getType();
2002    assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
2003            isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
2004           "Can only insert checks for integer, vector, and aggregate shadow "
2005           "types");
2006#endif
2007    InstrumentationList.push_back(
2008        ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
2009  }
2010
2011  /// Remember the place where a shadow check should be inserted.
2012  ///
  /// This location will be later instrumented with a check that will print a
  /// UMR warning at runtime if the value is not fully defined.
2015  void insertShadowCheck(Value *Val, Instruction *OrigIns) {
2016    assert(Val);
2017    Value *Shadow, *Origin;
2018    if (ClCheckConstantShadow) {
2019      Shadow = getShadow(Val);
2020      if (!Shadow)
2021        return;
2022      Origin = getOrigin(Val);
2023    } else {
2024      Shadow = dyn_cast_or_null<Instruction>(getShadow(Val));
2025      if (!Shadow)
2026        return;
2027      Origin = dyn_cast_or_null<Instruction>(getOrigin(Val));
2028    }
2029    insertShadowCheck(Shadow, Origin, OrigIns);
2030  }
2031
2032  AtomicOrdering addReleaseOrdering(AtomicOrdering a) {
2033    switch (a) {
2034    case AtomicOrdering::NotAtomic:
2035      return AtomicOrdering::NotAtomic;
2036    case AtomicOrdering::Unordered:
2037    case AtomicOrdering::Monotonic:
2038    case AtomicOrdering::Release:
2039      return AtomicOrdering::Release;
2040    case AtomicOrdering::Acquire:
2041    case AtomicOrdering::AcquireRelease:
2042      return AtomicOrdering::AcquireRelease;
2043    case AtomicOrdering::SequentiallyConsistent:
2044      return AtomicOrdering::SequentiallyConsistent;
2045    }
2046    llvm_unreachable("Unknown ordering");
2047  }
2048
2049  Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
2050    constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2051    uint32_t OrderingTable[NumOrderings] = {};
2052
2053    OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2054        OrderingTable[(int)AtomicOrderingCABI::release] =
2055            (int)AtomicOrderingCABI::release;
2056    OrderingTable[(int)AtomicOrderingCABI::consume] =
2057        OrderingTable[(int)AtomicOrderingCABI::acquire] =
2058            OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2059                (int)AtomicOrderingCABI::acq_rel;
2060    OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2061        (int)AtomicOrderingCABI::seq_cst;
2062
2063    return ConstantDataVector::get(IRB.getContext(),
2064                                   ArrayRef(OrderingTable, NumOrderings));
2065  }
2066
2067  AtomicOrdering addAcquireOrdering(AtomicOrdering a) {
2068    switch (a) {
2069    case AtomicOrdering::NotAtomic:
2070      return AtomicOrdering::NotAtomic;
2071    case AtomicOrdering::Unordered:
2072    case AtomicOrdering::Monotonic:
2073    case AtomicOrdering::Acquire:
2074      return AtomicOrdering::Acquire;
2075    case AtomicOrdering::Release:
2076    case AtomicOrdering::AcquireRelease:
2077      return AtomicOrdering::AcquireRelease;
2078    case AtomicOrdering::SequentiallyConsistent:
2079      return AtomicOrdering::SequentiallyConsistent;
2080    }
2081    llvm_unreachable("Unknown ordering");
2082  }
2083
2084  Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
2085    constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2086    uint32_t OrderingTable[NumOrderings] = {};
2087
2088    OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2089        OrderingTable[(int)AtomicOrderingCABI::acquire] =
2090            OrderingTable[(int)AtomicOrderingCABI::consume] =
2091                (int)AtomicOrderingCABI::acquire;
2092    OrderingTable[(int)AtomicOrderingCABI::release] =
2093        OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2094            (int)AtomicOrderingCABI::acq_rel;
2095    OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2096        (int)AtomicOrderingCABI::seq_cst;
2097
2098    return ConstantDataVector::get(IRB.getContext(),
2099                                   ArrayRef(OrderingTable, NumOrderings));
2100  }
2101
2102  // ------------------- Visitors.
2103  using InstVisitor<MemorySanitizerVisitor>::visit;
2104  void visit(Instruction &I) {
2105    if (I.getMetadata(LLVMContext::MD_nosanitize))
2106      return;
2107    // Don't want to visit if we're in the prologue
2108    if (isInPrologue(I))
2109      return;
2110    InstVisitor<MemorySanitizerVisitor>::visit(I);
2111  }
2112
2113  /// Instrument LoadInst
2114  ///
2115  /// Loads the corresponding shadow and (optionally) origin.
2116  /// Optionally, checks that the load address is fully defined.
2117  void visitLoadInst(LoadInst &I) {
2118    assert(I.getType()->isSized() && "Load type must have size");
2119    assert(!I.getMetadata(LLVMContext::MD_nosanitize));
2120    NextNodeIRBuilder IRB(&I);
2121    Type *ShadowTy = getShadowTy(&I);
2122    Value *Addr = I.getPointerOperand();
2123    Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2124    const Align Alignment = I.getAlign();
2125    if (PropagateShadow) {
2126      std::tie(ShadowPtr, OriginPtr) =
2127          getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2128      setShadow(&I,
2129                IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
2130    } else {
2131      setShadow(&I, getCleanShadow(&I));
2132    }
2133
2134    if (ClCheckAccessAddress)
2135      insertShadowCheck(I.getPointerOperand(), &I);
2136
2137    if (I.isAtomic())
2138      I.setOrdering(addAcquireOrdering(I.getOrdering()));
2139
2140    if (MS.TrackOrigins) {
2141      if (PropagateShadow) {
2142        const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
2143        setOrigin(
2144            &I, IRB.CreateAlignedLoad(MS.OriginTy, OriginPtr, OriginAlignment));
2145      } else {
2146        setOrigin(&I, getCleanOrigin());
2147      }
2148    }
2149  }
2150
2151  /// Instrument StoreInst
2152  ///
2153  /// Stores the corresponding shadow and (optionally) origin.
2154  /// Optionally, checks that the store address is fully defined.
2155  void visitStoreInst(StoreInst &I) {
2156    StoreList.push_back(&I);
2157    if (ClCheckAccessAddress)
2158      insertShadowCheck(I.getPointerOperand(), &I);
2159  }
2160
2161  void handleCASOrRMW(Instruction &I) {
2162    assert(isa<AtomicRMWInst>(I) || isa<AtomicCmpXchgInst>(I));
2163
2164    IRBuilder<> IRB(&I);
2165    Value *Addr = I.getOperand(0);
2166    Value *Val = I.getOperand(1);
2167    Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, getShadowTy(Val), Align(1),
2168                                          /*isStore*/ true)
2169                           .first;
2170
2171    if (ClCheckAccessAddress)
2172      insertShadowCheck(Addr, &I);
2173
    // Only test the conditional argument of the cmpxchg instruction.
    // The other argument can potentially be uninitialized, but we cannot
    // detect this situation reliably without possible false positives.
2177    if (isa<AtomicCmpXchgInst>(I))
2178      insertShadowCheck(Val, &I);
2179
2180    IRB.CreateStore(getCleanShadow(Val), ShadowPtr);
2181
2182    setShadow(&I, getCleanShadow(&I));
2183    setOrigin(&I, getCleanOrigin());
2184  }
2185
2186  void visitAtomicRMWInst(AtomicRMWInst &I) {
2187    handleCASOrRMW(I);
2188    I.setOrdering(addReleaseOrdering(I.getOrdering()));
2189  }
2190
2191  void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
2192    handleCASOrRMW(I);
2193    I.setSuccessOrdering(addReleaseOrdering(I.getSuccessOrdering()));
2194  }
2195
2196  // Vector manipulation.
2197  void visitExtractElementInst(ExtractElementInst &I) {
2198    insertShadowCheck(I.getOperand(1), &I);
2199    IRBuilder<> IRB(&I);
2200    setShadow(&I, IRB.CreateExtractElement(getShadow(&I, 0), I.getOperand(1),
2201                                           "_msprop"));
2202    setOrigin(&I, getOrigin(&I, 0));
2203  }
2204
2205  void visitInsertElementInst(InsertElementInst &I) {
2206    insertShadowCheck(I.getOperand(2), &I);
2207    IRBuilder<> IRB(&I);
2208    auto *Shadow0 = getShadow(&I, 0);
2209    auto *Shadow1 = getShadow(&I, 1);
2210    setShadow(&I, IRB.CreateInsertElement(Shadow0, Shadow1, I.getOperand(2),
2211                                          "_msprop"));
2212    setOriginForNaryOp(I);
2213  }
2214
2215  void visitShuffleVectorInst(ShuffleVectorInst &I) {
2216    IRBuilder<> IRB(&I);
2217    auto *Shadow0 = getShadow(&I, 0);
2218    auto *Shadow1 = getShadow(&I, 1);
2219    setShadow(&I, IRB.CreateShuffleVector(Shadow0, Shadow1, I.getShuffleMask(),
2220                                          "_msprop"));
2221    setOriginForNaryOp(I);
2222  }
2223
2224  // Casts.
2225  void visitSExtInst(SExtInst &I) {
2226    IRBuilder<> IRB(&I);
2227    setShadow(&I, IRB.CreateSExt(getShadow(&I, 0), I.getType(), "_msprop"));
2228    setOrigin(&I, getOrigin(&I, 0));
2229  }
2230
2231  void visitZExtInst(ZExtInst &I) {
2232    IRBuilder<> IRB(&I);
2233    setShadow(&I, IRB.CreateZExt(getShadow(&I, 0), I.getType(), "_msprop"));
2234    setOrigin(&I, getOrigin(&I, 0));
2235  }
2236
2237  void visitTruncInst(TruncInst &I) {
2238    IRBuilder<> IRB(&I);
2239    setShadow(&I, IRB.CreateTrunc(getShadow(&I, 0), I.getType(), "_msprop"));
2240    setOrigin(&I, getOrigin(&I, 0));
2241  }
2242
2243  void visitBitCastInst(BitCastInst &I) {
2244    // Special case: if this is the bitcast (there is exactly 1 allowed) between
2245    // a musttail call and a ret, don't instrument. New instructions are not
2246    // allowed after a musttail call.
2247    if (auto *CI = dyn_cast<CallInst>(I.getOperand(0)))
2248      if (CI->isMustTailCall())
2249        return;
2250    IRBuilder<> IRB(&I);
2251    setShadow(&I, IRB.CreateBitCast(getShadow(&I, 0), getShadowTy(&I)));
2252    setOrigin(&I, getOrigin(&I, 0));
2253  }
2254
2255  void visitPtrToIntInst(PtrToIntInst &I) {
2256    IRBuilder<> IRB(&I);
2257    setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2258                                    "_msprop_ptrtoint"));
2259    setOrigin(&I, getOrigin(&I, 0));
2260  }
2261
2262  void visitIntToPtrInst(IntToPtrInst &I) {
2263    IRBuilder<> IRB(&I);
2264    setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2265                                    "_msprop_inttoptr"));
2266    setOrigin(&I, getOrigin(&I, 0));
2267  }
2268
2269  void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
2270  void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
2271  void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
2272  void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
2273  void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
2274  void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
2275
2276  /// Propagate shadow for bitwise AND.
2277  ///
  /// This code is exact, i.e. if, for example, a bit in the left argument
  /// is defined and 0, then neither the value nor the definedness of the
  /// corresponding bit in the right argument affects the resulting shadow.
2281  void visitAnd(BinaryOperator &I) {
2282    IRBuilder<> IRB(&I);
    //  "And" of 0 and a poisoned value results in an unpoisoned value.
2284    //  1&1 => 1;     0&1 => 0;     p&1 => p;
2285    //  1&0 => 0;     0&0 => 0;     p&0 => 0;
2286    //  1&p => p;     0&p => 0;     p&p => p;
2287    //  S = (S1 & S2) | (V1 & S2) | (S1 & V2)
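    //  Worked example on 4-bit values: if V1 == 0b1100 is fully defined
    //  (S1 == 0) and V2 is fully uninitialized (S2 == 0b1111), then
    //  S = (V1 & S2) == 0b1100: bits AND-ed with a defined 0 are defined 0
    //  regardless of V2, while bits AND-ed with a defined 1 inherit V2's
    //  shadow.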
2288    Value *S1 = getShadow(&I, 0);
2289    Value *S2 = getShadow(&I, 1);
2290    Value *V1 = I.getOperand(0);
2291    Value *V2 = I.getOperand(1);
2292    if (V1->getType() != S1->getType()) {
2293      V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2294      V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2295    }
2296    Value *S1S2 = IRB.CreateAnd(S1, S2);
2297    Value *V1S2 = IRB.CreateAnd(V1, S2);
2298    Value *S1V2 = IRB.CreateAnd(S1, V2);
2299    setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
2300    setOriginForNaryOp(I);
2301  }
2302
2303  void visitOr(BinaryOperator &I) {
2304    IRBuilder<> IRB(&I);
    //  "Or" of 1 and a poisoned value results in an unpoisoned value.
2306    //  1|1 => 1;     0|1 => 1;     p|1 => 1;
2307    //  1|0 => 1;     0|0 => 0;     p|0 => p;
2308    //  1|p => 1;     0|p => p;     p|p => p;
2309    //  S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
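    //  Worked example on 4-bit values: if V1 == 0b1100 is fully defined
    //  (S1 == 0) and V2 is fully uninitialized (S2 == 0b1111), then
    //  S = (~V1 & S2) == 0b0011: bits OR-ed with a defined 1 are defined 1
    //  regardless of V2, while bits OR-ed with a defined 0 inherit V2's
    //  shadow.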
2310    Value *S1 = getShadow(&I, 0);
2311    Value *S2 = getShadow(&I, 1);
2312    Value *V1 = IRB.CreateNot(I.getOperand(0));
2313    Value *V2 = IRB.CreateNot(I.getOperand(1));
2314    if (V1->getType() != S1->getType()) {
2315      V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2316      V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2317    }
2318    Value *S1S2 = IRB.CreateAnd(S1, S2);
2319    Value *V1S2 = IRB.CreateAnd(V1, S2);
2320    Value *S1V2 = IRB.CreateAnd(S1, V2);
2321    setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
2322    setOriginForNaryOp(I);
2323  }
2324
2325  /// Default propagation of shadow and/or origin.
2326  ///
2327  /// This class implements the general case of shadow propagation, used in all
2328  /// cases where we don't know and/or don't care about what the operation
2329  /// actually does. It converts all input shadow values to a common type
2330  /// (extending or truncating as necessary), and bitwise OR's them.
2331  ///
2332  /// This is much cheaper than inserting checks (i.e. requiring inputs to be
2333  /// fully initialized), and less prone to false positives.
2334  ///
2335  /// This class also implements the general case of origin propagation. For a
  /// Nary operation, the result origin is set to the origin of an argument
  /// that is not entirely initialized. If there is more than one such
  /// argument, the rightmost of them is picked. It does not matter which one
  /// is picked if all arguments are initialized.
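  ///
  /// Roughly, for %c = add i32 %a, %b this yields Sc = Sa | Sb, and the
  /// origin of %c becomes Ob when Sb is non-zero and Oa otherwise (the
  /// rightmost dirty argument wins).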
2340  template <bool CombineShadow> class Combiner {
2341    Value *Shadow = nullptr;
2342    Value *Origin = nullptr;
2343    IRBuilder<> &IRB;
2344    MemorySanitizerVisitor *MSV;
2345
2346  public:
2347    Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
2348        : IRB(IRB), MSV(MSV) {}
2349
2350    /// Add a pair of shadow and origin values to the mix.
2351    Combiner &Add(Value *OpShadow, Value *OpOrigin) {
2352      if (CombineShadow) {
2353        assert(OpShadow);
2354        if (!Shadow)
2355          Shadow = OpShadow;
2356        else {
2357          OpShadow = MSV->CreateShadowCast(IRB, OpShadow, Shadow->getType());
2358          Shadow = IRB.CreateOr(Shadow, OpShadow, "_msprop");
2359        }
2360      }
2361
2362      if (MSV->MS.TrackOrigins) {
2363        assert(OpOrigin);
2364        if (!Origin) {
2365          Origin = OpOrigin;
2366        } else {
2367          Constant *ConstOrigin = dyn_cast<Constant>(OpOrigin);
2368          // No point in adding something that might result in 0 origin value.
2369          if (!ConstOrigin || !ConstOrigin->isNullValue()) {
2370            Value *FlatShadow = MSV->convertShadowToScalar(OpShadow, IRB);
2371            Value *Cond =
2372                IRB.CreateICmpNE(FlatShadow, MSV->getCleanShadow(FlatShadow));
2373            Origin = IRB.CreateSelect(Cond, OpOrigin, Origin);
2374          }
2375        }
2376      }
2377      return *this;
2378    }
2379
2380    /// Add an application value to the mix.
2381    Combiner &Add(Value *V) {
2382      Value *OpShadow = MSV->getShadow(V);
2383      Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
2384      return Add(OpShadow, OpOrigin);
2385    }
2386
2387    /// Set the current combined values as the given instruction's shadow
2388    /// and origin.
2389    void Done(Instruction *I) {
2390      if (CombineShadow) {
2391        assert(Shadow);
2392        Shadow = MSV->CreateShadowCast(IRB, Shadow, MSV->getShadowTy(I));
2393        MSV->setShadow(I, Shadow);
2394      }
2395      if (MSV->MS.TrackOrigins) {
2396        assert(Origin);
2397        MSV->setOrigin(I, Origin);
2398      }
2399    }
2400  };
2401
2402  using ShadowAndOriginCombiner = Combiner<true>;
2403  using OriginCombiner = Combiner<false>;
2404
  /// Propagate origin for an arbitrary operation.
2406  void setOriginForNaryOp(Instruction &I) {
2407    if (!MS.TrackOrigins)
2408      return;
2409    IRBuilder<> IRB(&I);
2410    OriginCombiner OC(this, IRB);
2411    for (Use &Op : I.operands())
2412      OC.Add(Op.get());
2413    OC.Done(&I);
2414  }
2415
2416  size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
2417    assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
2418           "Vector of pointers is not a valid shadow type");
2419    return Ty->isVectorTy() ? cast<FixedVectorType>(Ty)->getNumElements() *
2420                                  Ty->getScalarSizeInBits()
2421                            : Ty->getPrimitiveSizeInBits();
2422  }
2423
2424  /// Cast between two shadow types, extending or truncating as
2425  /// necessary.
2426  Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
2427                          bool Signed = false) {
2428    Type *srcTy = V->getType();
2429    size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(srcTy);
2430    size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(dstTy);
2431    if (srcSizeInBits > 1 && dstSizeInBits == 1)
2432      return IRB.CreateICmpNE(V, getCleanShadow(V));
2433
2434    if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
2435      return IRB.CreateIntCast(V, dstTy, Signed);
2436    if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
2437        cast<FixedVectorType>(dstTy)->getNumElements() ==
2438            cast<FixedVectorType>(srcTy)->getNumElements())
2439      return IRB.CreateIntCast(V, dstTy, Signed);
2440    Value *V1 = IRB.CreateBitCast(V, Type::getIntNTy(*MS.C, srcSizeInBits));
2441    Value *V2 =
2442        IRB.CreateIntCast(V1, Type::getIntNTy(*MS.C, dstSizeInBits), Signed);
2443    return IRB.CreateBitCast(V2, dstTy);
2444    // TODO: handle struct types.
2445  }
2446
2447  /// Cast an application value to the type of its own shadow.
2448  Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
2449    Type *ShadowTy = getShadowTy(V);
2450    if (V->getType() == ShadowTy)
2451      return V;
2452    if (V->getType()->isPtrOrPtrVectorTy())
2453      return IRB.CreatePtrToInt(V, ShadowTy);
2454    else
2455      return IRB.CreateBitCast(V, ShadowTy);
2456  }
2457
  /// Propagate shadow for an arbitrary operation.
2459  void handleShadowOr(Instruction &I) {
2460    IRBuilder<> IRB(&I);
2461    ShadowAndOriginCombiner SC(this, IRB);
2462    for (Use &Op : I.operands())
2463      SC.Add(Op.get());
2464    SC.Done(&I);
2465  }
2466
2467  void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
2468
2469  // Handle multiplication by constant.
2470  //
  // Handle a special case of multiplication by a constant that may have one or
  // more zeros in the lower bits. This makes the corresponding number of lower
  // bits of the result zero as well. We model it by shifting the other operand
2474  // shadow left by the required number of bits. Effectively, we transform
2475  // (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
2476  // We use multiplication by 2**N instead of shift to cover the case of
2477  // multiplication by 0, which may occur in some elements of a vector operand.
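  //
  // For example, X * 24 == X * (3 * 2**3): the low 3 bits of the result are
  // always zero, so the shadow of the result is Sx * 8, i.e. Sx with its bits
  // shifted left by 3 positions.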
2478  void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
2479                           Value *OtherArg) {
2480    Constant *ShadowMul;
2481    Type *Ty = ConstArg->getType();
2482    if (auto *VTy = dyn_cast<VectorType>(Ty)) {
2483      unsigned NumElements = cast<FixedVectorType>(VTy)->getNumElements();
2484      Type *EltTy = VTy->getElementType();
2485      SmallVector<Constant *, 16> Elements;
2486      for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
2487        if (ConstantInt *Elt =
2488                dyn_cast<ConstantInt>(ConstArg->getAggregateElement(Idx))) {
2489          const APInt &V = Elt->getValue();
2490          APInt V2 = APInt(V.getBitWidth(), 1) << V.countTrailingZeros();
2491          Elements.push_back(ConstantInt::get(EltTy, V2));
2492        } else {
2493          Elements.push_back(ConstantInt::get(EltTy, 1));
2494        }
2495      }
2496      ShadowMul = ConstantVector::get(Elements);
2497    } else {
2498      if (ConstantInt *Elt = dyn_cast<ConstantInt>(ConstArg)) {
2499        const APInt &V = Elt->getValue();
2500        APInt V2 = APInt(V.getBitWidth(), 1) << V.countTrailingZeros();
2501        ShadowMul = ConstantInt::get(Ty, V2);
2502      } else {
2503        ShadowMul = ConstantInt::get(Ty, 1);
2504      }
2505    }
2506
2507    IRBuilder<> IRB(&I);
2508    setShadow(&I,
2509              IRB.CreateMul(getShadow(OtherArg), ShadowMul, "msprop_mul_cst"));
2510    setOrigin(&I, getOrigin(OtherArg));
2511  }
2512
2513  void visitMul(BinaryOperator &I) {
2514    Constant *constOp0 = dyn_cast<Constant>(I.getOperand(0));
2515    Constant *constOp1 = dyn_cast<Constant>(I.getOperand(1));
2516    if (constOp0 && !constOp1)
2517      handleMulByConstant(I, constOp0, I.getOperand(1));
2518    else if (constOp1 && !constOp0)
2519      handleMulByConstant(I, constOp1, I.getOperand(0));
2520    else
2521      handleShadowOr(I);
2522  }
2523
2524  void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
2525  void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
2526  void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
2527  void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
2528  void visitSub(BinaryOperator &I) { handleShadowOr(I); }
2529  void visitXor(BinaryOperator &I) { handleShadowOr(I); }
2530
2531  void handleIntegerDiv(Instruction &I) {
2532    IRBuilder<> IRB(&I);
2533    // Strict on the second argument.
2534    insertShadowCheck(I.getOperand(1), &I);
2535    setShadow(&I, getShadow(&I, 0));
2536    setOrigin(&I, getOrigin(&I, 0));
2537  }
2538
2539  void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2540  void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2541  void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
2542  void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
2543
  // Floating point division is side-effect free. We cannot require that the
  // divisor be fully initialized and must propagate shadow. See PR37523.
2546  void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
2547  void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
2548
2549  /// Instrument == and != comparisons.
2550  ///
2551  /// Sometimes the comparison result is known even if some of the bits of the
2552  /// arguments are not.
2553  void handleEqualityComparison(ICmpInst &I) {
2554    IRBuilder<> IRB(&I);
2555    Value *A = I.getOperand(0);
2556    Value *B = I.getOperand(1);
2557    Value *Sa = getShadow(A);
2558    Value *Sb = getShadow(B);
2559
2560    // Get rid of pointers and vectors of pointers.
2561    // For ints (and vectors of ints), types of A and Sa match,
2562    // and this is a no-op.
2563    A = IRB.CreatePointerCast(A, Sa->getType());
2564    B = IRB.CreatePointerCast(B, Sb->getType());
2565
2566    // A == B  <==>  (C = A^B) == 0
2567    // A != B  <==>  (C = A^B) != 0
2568    // Sc = Sa | Sb
2569    Value *C = IRB.CreateXor(A, B);
2570    Value *Sc = IRB.CreateOr(Sa, Sb);
    // Now we are dealing with the i = (C == 0) comparison (or C != 0, it does
    // not matter which here).
2572    // Result is defined if one of the following is true
2573    // * there is a defined 1 bit in C
2574    // * C is fully defined
2575    // Si = !(C & ~Sc) && Sc
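    // Worked example: for A == 0b10?? (Sa == 0b0011) and a fully defined
    // B == 0b0100, C == A ^ B has a defined 1 in bit 3, so (C & ~Sc) != 0
    // and Si is false: the result ("not equal") is known despite A's
    // uninitialized low bits.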
2576    Value *Zero = Constant::getNullValue(Sc->getType());
2577    Value *MinusOne = Constant::getAllOnesValue(Sc->getType());
2578    Value *LHS = IRB.CreateICmpNE(Sc, Zero);
2579    Value *RHS =
2580        IRB.CreateICmpEQ(IRB.CreateAnd(IRB.CreateXor(Sc, MinusOne), C), Zero);
2581    Value *Si = IRB.CreateAnd(LHS, RHS);
2582    Si->setName("_msprop_icmp");
2583    setShadow(&I, Si);
2584    setOriginForNaryOp(I);
2585  }
2586
2587  /// Build the lowest possible value of V, taking into account V's
2588  ///        uninitialized bits.
2589  Value *getLowestPossibleValue(IRBuilder<> &IRB, Value *A, Value *Sa,
2590                                bool isSigned) {
2591    if (isSigned) {
2592      // Split shadow into sign bit and other bits.
2593      Value *SaOtherBits = IRB.CreateLShr(IRB.CreateShl(Sa, 1), 1);
2594      Value *SaSignBit = IRB.CreateXor(Sa, SaOtherBits);
      // If the sign bit is undefined, set it (smallest value); clear all other
      // undefined bits (also towards the smallest value).
2596      return IRB.CreateOr(IRB.CreateAnd(A, IRB.CreateNot(SaOtherBits)),
2597                          SaSignBit);
2598    } else {
2599      // Minimize undefined bits.
2600      return IRB.CreateAnd(A, IRB.CreateNot(Sa));
2601    }
2602  }
2603
2604  /// Build the highest possible value of V, taking into account V's
2605  ///        uninitialized bits.
2606  Value *getHighestPossibleValue(IRBuilder<> &IRB, Value *A, Value *Sa,
2607                                 bool isSigned) {
2608    if (isSigned) {
2609      // Split shadow into sign bit and other bits.
2610      Value *SaOtherBits = IRB.CreateLShr(IRB.CreateShl(Sa, 1), 1);
2611      Value *SaSignBit = IRB.CreateXor(Sa, SaOtherBits);
      // If the sign bit is undefined, clear it (largest value); set all other
      // undefined bits (also towards the largest value).
2613      return IRB.CreateOr(IRB.CreateAnd(A, IRB.CreateNot(SaSignBit)),
2614                          SaOtherBits);
2615    } else {
2616      // Maximize undefined bits.
2617      return IRB.CreateOr(A, Sa);
2618    }
2619  }
2620
2621  /// Instrument relational comparisons.
2622  ///
2623  /// This function does exact shadow propagation for all relational
2624  /// comparisons of integers, pointers and vectors of those.
2625  /// FIXME: output seems suboptimal when one of the operands is a constant
2626  void handleRelationalComparisonExact(ICmpInst &I) {
2627    IRBuilder<> IRB(&I);
2628    Value *A = I.getOperand(0);
2629    Value *B = I.getOperand(1);
2630    Value *Sa = getShadow(A);
2631    Value *Sb = getShadow(B);
2632
2633    // Get rid of pointers and vectors of pointers.
2634    // For ints (and vectors of ints), types of A and Sa match,
2635    // and this is a no-op.
2636    A = IRB.CreatePointerCast(A, Sa->getType());
2637    B = IRB.CreatePointerCast(B, Sb->getType());
2638
2639    // Let [a0, a1] be the interval of possible values of A, taking into account
2640    // its undefined bits. Let [b0, b1] be the interval of possible values of B.
2641    // Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
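    // Worked example (unsigned): A == 0b10?? (Sa == 0b0011) gives
    // [a0, a1] == [8, 11]. For a fully defined B == 4, both 8 ugt 4 and
    // 11 ugt 4 hold, so (A ugt B) is defined (and true). For B == 9 the
    // bounds disagree (8 ugt 9 vs. 11 ugt 9), so the result is poisoned.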
2642    bool IsSigned = I.isSigned();
2643    Value *S1 = IRB.CreateICmp(I.getPredicate(),
2644                               getLowestPossibleValue(IRB, A, Sa, IsSigned),
2645                               getHighestPossibleValue(IRB, B, Sb, IsSigned));
2646    Value *S2 = IRB.CreateICmp(I.getPredicate(),
2647                               getHighestPossibleValue(IRB, A, Sa, IsSigned),
2648                               getLowestPossibleValue(IRB, B, Sb, IsSigned));
2649    Value *Si = IRB.CreateXor(S1, S2);
2650    setShadow(&I, Si);
2651    setOriginForNaryOp(I);
2652  }
2653
2654  /// Instrument signed relational comparisons.
2655  ///
2656  /// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
2657  /// bit of the shadow. Everything else is delegated to handleShadowOr().
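  ///
  /// For example, for (x < 0) the result depends only on the sign bit of x,
  /// so the shadow of the result is just "is the sign bit of x's shadow set",
  /// computed below as a signed comparison of the shadow against zero.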
2658  void handleSignedRelationalComparison(ICmpInst &I) {
2659    Constant *constOp;
2660    Value *op = nullptr;
2661    CmpInst::Predicate pre;
2662    if ((constOp = dyn_cast<Constant>(I.getOperand(1)))) {
2663      op = I.getOperand(0);
2664      pre = I.getPredicate();
2665    } else if ((constOp = dyn_cast<Constant>(I.getOperand(0)))) {
2666      op = I.getOperand(1);
2667      pre = I.getSwappedPredicate();
2668    } else {
2669      handleShadowOr(I);
2670      return;
2671    }
2672
2673    if ((constOp->isNullValue() &&
2674         (pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
2675        (constOp->isAllOnesValue() &&
2676         (pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
2677      IRBuilder<> IRB(&I);
2678      Value *Shadow = IRB.CreateICmpSLT(getShadow(op), getCleanShadow(op),
2679                                        "_msprop_icmp_s");
2680      setShadow(&I, Shadow);
2681      setOrigin(&I, getOrigin(op));
2682    } else {
2683      handleShadowOr(I);
2684    }
2685  }
2686
2687  void visitICmpInst(ICmpInst &I) {
2688    if (!ClHandleICmp) {
2689      handleShadowOr(I);
2690      return;
2691    }
2692    if (I.isEquality()) {
2693      handleEqualityComparison(I);
2694      return;
2695    }
2696
2697    assert(I.isRelational());
2698    if (ClHandleICmpExact) {
2699      handleRelationalComparisonExact(I);
2700      return;
2701    }
2702    if (I.isSigned()) {
2703      handleSignedRelationalComparison(I);
2704      return;
2705    }
2706
2707    assert(I.isUnsigned());
2708    if ((isa<Constant>(I.getOperand(0)) || isa<Constant>(I.getOperand(1)))) {
2709      handleRelationalComparisonExact(I);
2710      return;
2711    }
2712
2713    handleShadowOr(I);
2714  }
2715
2716  void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
2717
2718  void handleShift(BinaryOperator &I) {
2719    IRBuilder<> IRB(&I);
2720    // If any of the S2 bits are poisoned, the whole thing is poisoned.
2721    // Otherwise perform the same shift on S1.
2722    Value *S1 = getShadow(&I, 0);
2723    Value *S2 = getShadow(&I, 1);
2724    Value *S2Conv =
2725        IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
2726    Value *V2 = I.getOperand(1);
2727    Value *Shift = IRB.CreateBinOp(I.getOpcode(), S1, V2);
2728    setShadow(&I, IRB.CreateOr(Shift, S2Conv));
2729    setOriginForNaryOp(I);
2730  }
2731
2732  void visitShl(BinaryOperator &I) { handleShift(I); }
2733  void visitAShr(BinaryOperator &I) { handleShift(I); }
2734  void visitLShr(BinaryOperator &I) { handleShift(I); }
2735
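  // Instrument funnel shifts (llvm.fshl / llvm.fshr), which concatenate their
  // first two operands and shift by the third. Shadow is propagated by
  // applying the same funnel shift to the operand shadows and OR-ing in an
  // all-ones mask when any bit of the shift amount's shadow is set.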
2736  void handleFunnelShift(IntrinsicInst &I) {
2737    IRBuilder<> IRB(&I);
2738    // If any of the S2 bits are poisoned, the whole thing is poisoned.
2739    // Otherwise perform the same shift on S0 and S1.
2740    Value *S0 = getShadow(&I, 0);
2741    Value *S1 = getShadow(&I, 1);
2742    Value *S2 = getShadow(&I, 2);
2743    Value *S2Conv =
2744        IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
2745    Value *V2 = I.getOperand(2);
2746    Function *Intrin = Intrinsic::getDeclaration(
2747        I.getModule(), I.getIntrinsicID(), S2Conv->getType());
2748    Value *Shift = IRB.CreateCall(Intrin, {S0, S1, V2});
2749    setShadow(&I, IRB.CreateOr(Shift, S2Conv));
2750    setOriginForNaryOp(I);
2751  }
2752
2753  /// Instrument llvm.memmove
2754  ///
2755  /// At this point we don't know if llvm.memmove will be inlined or not.
2756  /// If we don't instrument it and it gets inlined,
2757  /// our interceptor will not kick in and we will lose the memmove.
2758  /// If we instrument the call here, but it does not get inlined,
  /// we will memmove the shadow twice, which is bad in the case
  /// of overlapping regions. So, we simply lower the intrinsic to a call.
2761  ///
2762  /// Similar situation exists for memcpy and memset.
2763  void visitMemMoveInst(MemMoveInst &I) {
2764    getShadow(I.getArgOperand(1)); // Ensure shadow initialized
2765    IRBuilder<> IRB(&I);
2766    IRB.CreateCall(
2767        MS.MemmoveFn,
2768        {IRB.CreatePointerCast(I.getArgOperand(0), IRB.getInt8PtrTy()),
2769         IRB.CreatePointerCast(I.getArgOperand(1), IRB.getInt8PtrTy()),
2770         IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
2771    I.eraseFromParent();
2772  }
2773
2774  /// Instrument memcpy
2775  ///
2776  /// Similar to memmove: avoid copying shadow twice. This is somewhat
  /// unfortunate as it may slow down small constant memcpys.
2778  /// FIXME: consider doing manual inline for small constant sizes and proper
2779  /// alignment.
2780  ///
2781  /// Note: This also handles memcpy.inline, which promises no calls to external
2782  /// functions as an optimization. However, with instrumentation enabled this
2783  /// is difficult to promise; additionally, we know that the MSan runtime
2784  /// exists and provides __msan_memcpy(). Therefore, we assume that with
2785  /// instrumentation it's safe to turn memcpy.inline into a call to
2786  /// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
2787  /// itself, instrumentation should be disabled with the no_sanitize attribute.
2788  void visitMemCpyInst(MemCpyInst &I) {
2789    getShadow(I.getArgOperand(1)); // Ensure shadow initialized
2790    IRBuilder<> IRB(&I);
2791    IRB.CreateCall(
2792        MS.MemcpyFn,
2793        {IRB.CreatePointerCast(I.getArgOperand(0), IRB.getInt8PtrTy()),
2794         IRB.CreatePointerCast(I.getArgOperand(1), IRB.getInt8PtrTy()),
2795         IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
2796    I.eraseFromParent();
2797  }
2798
2799  // Same as memcpy.
2800  void visitMemSetInst(MemSetInst &I) {
2801    IRBuilder<> IRB(&I);
2802    IRB.CreateCall(
2803        MS.MemsetFn,
2804        {IRB.CreatePointerCast(I.getArgOperand(0), IRB.getInt8PtrTy()),
2805         IRB.CreateIntCast(I.getArgOperand(1), IRB.getInt32Ty(), false),
2806         IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
2807    I.eraseFromParent();
2808  }
2809
2810  void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
2811
2812  void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
2813
2814  /// Handle vector store-like intrinsics.
2815  ///
2816  /// Instrument intrinsics that look like a simple SIMD store: writes memory,
2817  /// has 1 pointer argument and 1 vector argument, returns void.
2818  bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
2819    IRBuilder<> IRB(&I);
2820    Value *Addr = I.getArgOperand(0);
2821    Value *Shadow = getShadow(&I, 1);
2822    Value *ShadowPtr, *OriginPtr;
2823
2824    // We don't know the pointer alignment (could be unaligned SSE store!).
    // Have to assume the worst case.
2826    std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
2827        Addr, IRB, Shadow->getType(), Align(1), /*isStore*/ true);
2828    IRB.CreateAlignedStore(Shadow, ShadowPtr, Align(1));
2829
2830    if (ClCheckAccessAddress)
2831      insertShadowCheck(Addr, &I);
2832
2833    // FIXME: factor out common code from materializeStores
2834    if (MS.TrackOrigins)
2835      IRB.CreateStore(getOrigin(&I, 1), OriginPtr);
2836    return true;
2837  }
2838
2839  /// Handle vector load-like intrinsics.
2840  ///
2841  /// Instrument intrinsics that look like a simple SIMD load: reads memory,
2842  /// has 1 pointer argument, returns a vector.
2843  bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
2844    IRBuilder<> IRB(&I);
2845    Value *Addr = I.getArgOperand(0);
2846
2847    Type *ShadowTy = getShadowTy(&I);
2848    Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2849    if (PropagateShadow) {
2850      // We don't know the pointer alignment (could be unaligned SSE load!).
      // Have to assume the worst case.
2852      const Align Alignment = Align(1);
2853      std::tie(ShadowPtr, OriginPtr) =
2854          getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2855      setShadow(&I,
2856                IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
2857    } else {
2858      setShadow(&I, getCleanShadow(&I));
2859    }
2860
2861    if (ClCheckAccessAddress)
2862      insertShadowCheck(Addr, &I);
2863
2864    if (MS.TrackOrigins) {
2865      if (PropagateShadow)
2866        setOrigin(&I, IRB.CreateLoad(MS.OriginTy, OriginPtr));
2867      else
2868        setOrigin(&I, getCleanOrigin());
2869    }
2870    return true;
2871  }
2872
2873  /// Handle (SIMD arithmetic)-like intrinsics.
2874  ///
2875  /// Instrument intrinsics with any number of arguments of the same type,
2876  /// equal to the return type. The type should be simple (no aggregates or
2877  /// pointers; vectors are fine).
2878  /// Caller guarantees that this intrinsic does not access memory.
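  ///
  /// The shadows of all arguments are simply OR-ed together (via the shadow
  /// and origin combiner) to form the shadow of the result.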
2879  bool maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I) {
2880    Type *RetTy = I.getType();
2881    if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy() ||
2882          RetTy->isX86_MMXTy()))
2883      return false;
2884
2885    unsigned NumArgOperands = I.arg_size();
2886    for (unsigned i = 0; i < NumArgOperands; ++i) {
2887      Type *Ty = I.getArgOperand(i)->getType();
2888      if (Ty != RetTy)
2889        return false;
2890    }
2891
2892    IRBuilder<> IRB(&I);
2893    ShadowAndOriginCombiner SC(this, IRB);
2894    for (unsigned i = 0; i < NumArgOperands; ++i)
2895      SC.Add(I.getArgOperand(i));
2896    SC.Done(&I);
2897
2898    return true;
2899  }
2900
2901  /// Heuristically instrument unknown intrinsics.
2902  ///
2903  /// The main purpose of this code is to do something reasonable with all
  /// random intrinsics we might encounter, most importantly SIMD intrinsics.
2905  /// We recognize several classes of intrinsics by their argument types and
  /// ModRefBehavior and apply special instrumentation when we are reasonably
2907  /// sure that we know what the intrinsic does.
2908  ///
2909  /// We special-case intrinsics where this approach fails. See llvm.bswap
2910  /// handling as an example of that.
2911  bool handleUnknownIntrinsic(IntrinsicInst &I) {
2912    unsigned NumArgOperands = I.arg_size();
2913    if (NumArgOperands == 0)
2914      return false;
2915
2916    if (NumArgOperands == 2 && I.getArgOperand(0)->getType()->isPointerTy() &&
2917        I.getArgOperand(1)->getType()->isVectorTy() &&
2918        I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
2919      // This looks like a vector store.
2920      return handleVectorStoreIntrinsic(I);
2921    }
2922
2923    if (NumArgOperands == 1 && I.getArgOperand(0)->getType()->isPointerTy() &&
2924        I.getType()->isVectorTy() && I.onlyReadsMemory()) {
2925      // This looks like a vector load.
2926      return handleVectorLoadIntrinsic(I);
2927    }
2928
2929    if (I.doesNotAccessMemory())
2930      if (maybeHandleSimpleNomemIntrinsic(I))
2931        return true;
2932
2933    // FIXME: detect and handle SSE maskstore/maskload
2934    return false;
2935  }
2936
2937  void handleInvariantGroup(IntrinsicInst &I) {
2938    setShadow(&I, getShadow(&I, 0));
2939    setOrigin(&I, getOrigin(&I, 0));
2940  }
2941
2942  void handleLifetimeStart(IntrinsicInst &I) {
2943    if (!PoisonStack)
2944      return;
2945    AllocaInst *AI = llvm::findAllocaForValue(I.getArgOperand(1));
2946    if (!AI)
2947      InstrumentLifetimeStart = false;
2948    LifetimeStartList.push_back(std::make_pair(&I, AI));
2949  }
2950
2951  void handleBswap(IntrinsicInst &I) {
2952    IRBuilder<> IRB(&I);
2953    Value *Op = I.getArgOperand(0);
2954    Type *OpType = Op->getType();
2955    Function *BswapFunc = Intrinsic::getDeclaration(
2956        F.getParent(), Intrinsic::bswap, ArrayRef(&OpType, 1));
2957    setShadow(&I, IRB.CreateCall(BswapFunc, getShadow(Op)));
2958    setOrigin(&I, getOrigin(Op));
2959  }
2960
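  // Instrument llvm.ctlz / llvm.cttz: the count is treated as fully poisoned
  // if any bit of the source is poisoned and, when the second (is-zero-poison)
  // argument is true, also when the source value is zero (the result would
  // then be poison).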
2961  void handleCountZeroes(IntrinsicInst &I) {
2962    IRBuilder<> IRB(&I);
2963    Value *Src = I.getArgOperand(0);
2964
    // Set the output shadow based on the input shadow.
2966    Value *BoolShadow = IRB.CreateIsNotNull(getShadow(Src), "_mscz_bs");
2967
    // If zero poison is requested, mix it in with the shadow.
2969    Constant *IsZeroPoison = cast<Constant>(I.getOperand(1));
2970    if (!IsZeroPoison->isZeroValue()) {
2971      Value *BoolZeroPoison = IRB.CreateIsNull(Src, "_mscz_bzp");
2972      BoolShadow = IRB.CreateOr(BoolShadow, BoolZeroPoison, "_mscz_bs");
2973    }
2974
2975    Value *OutputShadow =
2976        IRB.CreateSExt(BoolShadow, getShadowTy(Src), "_mscz_os");
2977
2978    setShadow(&I, OutputShadow);
2979    setOriginForNaryOp(I);
2980  }
2981
2982  // Instrument vector convert intrinsic.
2983  //
2984  // This function instruments intrinsics like cvtsi2ss:
2985  // %Out = int_xxx_cvtyyy(%ConvertOp)
2986  // or
2987  // %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
  // The intrinsic converts \p NumUsedElements elements of \p ConvertOp to the
  // same number of \p Out elements, and (if it has 2 arguments) copies the
  // rest of the elements from \p CopyOp.
  // In most cases the conversion involves a floating-point value, which may
  // trigger a hardware exception when not fully initialized. For this reason
  // we require
2993  // \p ConvertOp[0:NumUsedElements] to be fully initialized and trap otherwise.
2994  // We copy the shadow of \p CopyOp[NumUsedElements:] to \p
2995  // Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
2996  // return a fully initialized value.
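  //
  // For example, for %Out = @llvm.x86.sse2.cvtsd2ss(%CopyOp, %ConvertOp) with
  // NumUsedElements == 1, element 0 of %ConvertOp is checked for being fully
  // initialized, and the shadow of %Out is the shadow of %CopyOp with element
  // 0 cleared.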
2997  void handleVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
2998                                    bool HasRoundingMode = false) {
2999    IRBuilder<> IRB(&I);
3000    Value *CopyOp, *ConvertOp;
3001
3002    assert((!HasRoundingMode ||
3003            isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
3004           "Invalid rounding mode");
3005
3006    switch (I.arg_size() - HasRoundingMode) {
3007    case 2:
3008      CopyOp = I.getArgOperand(0);
3009      ConvertOp = I.getArgOperand(1);
3010      break;
3011    case 1:
3012      ConvertOp = I.getArgOperand(0);
3013      CopyOp = nullptr;
3014      break;
3015    default:
3016      llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
3017    }
3018
3019    // The first *NumUsedElements* elements of ConvertOp are converted to the
3020    // same number of output elements. The rest of the output is copied from
3021    // CopyOp, or (if not available) filled with zeroes.
3022    // Combine shadow for elements of ConvertOp that are used in this operation,
3023    // and insert a check.
3024    // FIXME: consider propagating shadow of ConvertOp, at least in the case of
3025    // int->any conversion.
3026    Value *ConvertShadow = getShadow(ConvertOp);
3027    Value *AggShadow = nullptr;
3028    if (ConvertOp->getType()->isVectorTy()) {
3029      AggShadow = IRB.CreateExtractElement(
3030          ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), 0));
3031      for (int i = 1; i < NumUsedElements; ++i) {
3032        Value *MoreShadow = IRB.CreateExtractElement(
3033            ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), i));
3034        AggShadow = IRB.CreateOr(AggShadow, MoreShadow);
3035      }
3036    } else {
3037      AggShadow = ConvertShadow;
3038    }
3039    assert(AggShadow->getType()->isIntegerTy());
3040    insertShadowCheck(AggShadow, getOrigin(ConvertOp), &I);
3041
3042    // Build result shadow by zero-filling parts of CopyOp shadow that come from
3043    // ConvertOp.
3044    if (CopyOp) {
3045      assert(CopyOp->getType() == I.getType());
3046      assert(CopyOp->getType()->isVectorTy());
3047      Value *ResultShadow = getShadow(CopyOp);
3048      Type *EltTy = cast<VectorType>(ResultShadow->getType())->getElementType();
3049      for (int i = 0; i < NumUsedElements; ++i) {
3050        ResultShadow = IRB.CreateInsertElement(
3051            ResultShadow, ConstantInt::getNullValue(EltTy),
3052            ConstantInt::get(IRB.getInt32Ty(), i));
3053      }
3054      setShadow(&I, ResultShadow);
3055      setOrigin(&I, getOrigin(CopyOp));
3056    } else {
3057      setShadow(&I, getCleanShadow(&I));
3058      setOrigin(&I, getCleanOrigin());
3059    }
3060  }
3061
  // Given a scalar or vector, extract the lower 64 bits (or fewer), and return
  // all zeroes if it is zero, and all ones otherwise.
3064  Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3065    if (S->getType()->isVectorTy())
3066      S = CreateShadowCast(IRB, S, IRB.getInt64Ty(), /* Signed */ true);
3067    assert(S->getType()->getPrimitiveSizeInBits() <= 64);
3068    Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3069    return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3070  }
3071
3072  // Given a vector, extract its first element, and return all
3073  // zeroes if it is zero, and all ones otherwise.
3074  Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3075    Value *S1 = IRB.CreateExtractElement(S, (uint64_t)0);
3076    Value *S2 = IRB.CreateICmpNE(S1, getCleanShadow(S1));
3077    return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3078  }
3079
3080  Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
3081    Type *T = S->getType();
3082    assert(T->isVectorTy());
3083    Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3084    return IRB.CreateSExt(S2, T);
3085  }
3086
3087  // Instrument vector shift intrinsic.
3088  //
3089  // This function instruments intrinsics like int_x86_avx2_psll_w.
3090  // Intrinsic shifts %In by %ShiftSize bits.
3091  // %ShiftSize may be a vector. In that case the lower 64 bits determine shift
3092  // size, and the rest is ignored. Behavior is defined even if shift size is
3093  // greater than register (or field) width.
3094  void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
3095    assert(I.arg_size() == 2);
3096    IRBuilder<> IRB(&I);
3097    // If any of the S2 bits are poisoned, the whole thing is poisoned.
3098    // Otherwise perform the same shift on S1.
3099    Value *S1 = getShadow(&I, 0);
3100    Value *S2 = getShadow(&I, 1);
3101    Value *S2Conv = Variable ? VariableShadowExtend(IRB, S2)
3102                             : Lower64ShadowExtend(IRB, S2, getShadowTy(&I));
3103    Value *V1 = I.getOperand(0);
3104    Value *V2 = I.getOperand(1);
3105    Value *Shift = IRB.CreateCall(I.getFunctionType(), I.getCalledOperand(),
3106                                  {IRB.CreateBitCast(S1, V1->getType()), V2});
3107    Shift = IRB.CreateBitCast(Shift, getShadowTy(&I));
3108    setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3109    setOriginForNaryOp(I);
3110  }
3111
3112  // Get an X86_MMX-sized vector type.
3113  Type *getMMXVectorTy(unsigned EltSizeInBits) {
3114    const unsigned X86_MMXSizeInBits = 64;
3115    assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
3116           "Illegal MMX vector element size");
3117    return FixedVectorType::get(IntegerType::get(*MS.C, EltSizeInBits),
3118                                X86_MMXSizeInBits / EltSizeInBits);
3119  }
3120
3121  // Returns a signed counterpart for an (un)signed-saturate-and-pack
3122  // intrinsic.
3123  Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
3124    switch (id) {
3125    case Intrinsic::x86_sse2_packsswb_128:
3126    case Intrinsic::x86_sse2_packuswb_128:
3127      return Intrinsic::x86_sse2_packsswb_128;
3128
3129    case Intrinsic::x86_sse2_packssdw_128:
3130    case Intrinsic::x86_sse41_packusdw:
3131      return Intrinsic::x86_sse2_packssdw_128;
3132
3133    case Intrinsic::x86_avx2_packsswb:
3134    case Intrinsic::x86_avx2_packuswb:
3135      return Intrinsic::x86_avx2_packsswb;
3136
3137    case Intrinsic::x86_avx2_packssdw:
3138    case Intrinsic::x86_avx2_packusdw:
3139      return Intrinsic::x86_avx2_packssdw;
3140
3141    case Intrinsic::x86_mmx_packsswb:
3142    case Intrinsic::x86_mmx_packuswb:
3143      return Intrinsic::x86_mmx_packsswb;
3144
3145    case Intrinsic::x86_mmx_packssdw:
3146      return Intrinsic::x86_mmx_packssdw;
3147    default:
3148      llvm_unreachable("unexpected intrinsic id");
3149    }
3150  }
3151
3152  // Instrument vector pack intrinsic.
3153  //
3154  // This function instruments intrinsics like x86_mmx_packsswb, that
3155  // packs elements of 2 input vectors into half as many bits with saturation.
3156  // Shadow is propagated with the signed variant of the same intrinsic applied
3157  // to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
3158  // EltSizeInBits is used only for x86mmx arguments.
3159  void handleVectorPackIntrinsic(IntrinsicInst &I, unsigned EltSizeInBits = 0) {
3160    assert(I.arg_size() == 2);
3161    bool isX86_MMX = I.getOperand(0)->getType()->isX86_MMXTy();
3162    IRBuilder<> IRB(&I);
3163    Value *S1 = getShadow(&I, 0);
3164    Value *S2 = getShadow(&I, 1);
3165    assert(isX86_MMX || S1->getType()->isVectorTy());
3166
3167    // SExt and ICmpNE below must apply to individual elements of input vectors.
3168    // In case of x86mmx arguments, cast them to appropriate vector types and
3169    // back.
3170    Type *T = isX86_MMX ? getMMXVectorTy(EltSizeInBits) : S1->getType();
3171    if (isX86_MMX) {
3172      S1 = IRB.CreateBitCast(S1, T);
3173      S2 = IRB.CreateBitCast(S2, T);
3174    }
3175    Value *S1_ext =
3176        IRB.CreateSExt(IRB.CreateICmpNE(S1, Constant::getNullValue(T)), T);
3177    Value *S2_ext =
3178        IRB.CreateSExt(IRB.CreateICmpNE(S2, Constant::getNullValue(T)), T);
3179    if (isX86_MMX) {
3180      Type *X86_MMXTy = Type::getX86_MMXTy(*MS.C);
3181      S1_ext = IRB.CreateBitCast(S1_ext, X86_MMXTy);
3182      S2_ext = IRB.CreateBitCast(S2_ext, X86_MMXTy);
3183    }
3184
3185    Function *ShadowFn = Intrinsic::getDeclaration(
3186        F.getParent(), getSignedPackIntrinsic(I.getIntrinsicID()));
3187
3188    Value *S =
3189        IRB.CreateCall(ShadowFn, {S1_ext, S2_ext}, "_msprop_vector_pack");
3190    if (isX86_MMX)
3191      S = IRB.CreateBitCast(S, getShadowTy(&I));
3192    setShadow(&I, S);
3193    setOriginForNaryOp(I);
3194  }
3195
3196  // Instrument sum-of-absolute-differences intrinsic.
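  //
  // psadbw computes sums of absolute differences of unsigned bytes, storing
  // the sum for each group of 8 bytes in the low 16 bits of the corresponding
  // 64-bit result element. Any poisoned bit in either operand's 8-byte group
  // therefore poisons the low 16 bits of that result element, while the high
  // bits remain clean.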
3197  void handleVectorSadIntrinsic(IntrinsicInst &I) {
3198    const unsigned SignificantBitsPerResultElement = 16;
3199    bool isX86_MMX = I.getOperand(0)->getType()->isX86_MMXTy();
3200    Type *ResTy = isX86_MMX ? IntegerType::get(*MS.C, 64) : I.getType();
3201    unsigned ZeroBitsPerResultElement =
3202        ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
3203
3204    IRBuilder<> IRB(&I);
3205    auto *Shadow0 = getShadow(&I, 0);
3206    auto *Shadow1 = getShadow(&I, 1);
3207    Value *S = IRB.CreateOr(Shadow0, Shadow1);
3208    S = IRB.CreateBitCast(S, ResTy);
3209    S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
3210                       ResTy);
3211    S = IRB.CreateLShr(S, ZeroBitsPerResultElement);
3212    S = IRB.CreateBitCast(S, getShadowTy(&I));
3213    setShadow(&I, S);
3214    setOriginForNaryOp(I);
3215  }
3216
3217  // Instrument multiply-add intrinsic.
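  //
  // pmadd-style intrinsics multiply adjacent pairs of input elements and add
  // the products into result elements that are twice as wide, so any poisoned
  // bit in a pair poisons the entire corresponding result element.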
3218  void handleVectorPmaddIntrinsic(IntrinsicInst &I,
3219                                  unsigned EltSizeInBits = 0) {
3220    bool isX86_MMX = I.getOperand(0)->getType()->isX86_MMXTy();
3221    Type *ResTy = isX86_MMX ? getMMXVectorTy(EltSizeInBits * 2) : I.getType();
3222    IRBuilder<> IRB(&I);
3223    auto *Shadow0 = getShadow(&I, 0);
3224    auto *Shadow1 = getShadow(&I, 1);
3225    Value *S = IRB.CreateOr(Shadow0, Shadow1);
3226    S = IRB.CreateBitCast(S, ResTy);
3227    S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
3228                       ResTy);
3229    S = IRB.CreateBitCast(S, getShadowTy(&I));
3230    setShadow(&I, S);
3231    setOriginForNaryOp(I);
3232  }
3233
3234  // Instrument compare-packed intrinsic.
  // Basically, an OR followed by sext(icmp ne 0) to end up with an all-zeros
  // or all-ones shadow.
3237  void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
3238    IRBuilder<> IRB(&I);
3239    Type *ResTy = getShadowTy(&I);
3240    auto *Shadow0 = getShadow(&I, 0);
3241    auto *Shadow1 = getShadow(&I, 1);
3242    Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
3243    Value *S = IRB.CreateSExt(
3244        IRB.CreateICmpNE(S0, Constant::getNullValue(ResTy)), ResTy);
3245    setShadow(&I, S);
3246    setOriginForNaryOp(I);
3247  }
3248
3249  // Instrument compare-scalar intrinsic.
3250  // This handles both cmp* intrinsics which return the result in the first
3251  // element of a vector, and comi* which return the result as i32.
3252  void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
3253    IRBuilder<> IRB(&I);
3254    auto *Shadow0 = getShadow(&I, 0);
3255    auto *Shadow1 = getShadow(&I, 1);
3256    Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
3257    Value *S = LowerElementShadowExtend(IRB, S0, getShadowTy(&I));
3258    setShadow(&I, S);
3259    setOriginForNaryOp(I);
3260  }
3261
3262  // Instrument generic vector reduction intrinsics
3263  // by ORing together all their fields.
3264  void handleVectorReduceIntrinsic(IntrinsicInst &I) {
3265    IRBuilder<> IRB(&I);
3266    Value *S = IRB.CreateOrReduce(getShadow(&I, 0));
3267    setShadow(&I, S);
3268    setOrigin(&I, getOrigin(&I, 0));
3269  }
3270
3271  // Instrument vector.reduce.or intrinsic.
3272  // Valid (non-poisoned) set bits in the operand pull low the
3273  // corresponding shadow bits.
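  //
  // For example, reducing <2 x i8> <0x01 (fully initialized), 0x?? (fully
  // poisoned)> yields a result whose bit 0 is clean (it is 1 regardless of the
  // poisoned element) while its remaining bits are poisoned.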
3274  void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
3275    IRBuilder<> IRB(&I);
3276    Value *OperandShadow = getShadow(&I, 0);
3277    Value *OperandUnsetBits = IRB.CreateNot(I.getOperand(0));
3278    Value *OperandUnsetOrPoison = IRB.CreateOr(OperandUnsetBits, OperandShadow);
    // Bit N is clean if any field's bit N is 1 and unpoisoned
3280    Value *OutShadowMask = IRB.CreateAndReduce(OperandUnsetOrPoison);
    // Otherwise, it is clean if every field's bit N is unpoisoned
3282    Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
3283    Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
3284
3285    setShadow(&I, S);
3286    setOrigin(&I, getOrigin(&I, 0));
3287  }
3288
3289  // Instrument vector.reduce.and intrinsic.
3290  // Valid (non-poisoned) unset bits in the operand pull down the
3291  // corresponding shadow bits.
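  //
  // Dually to vector.reduce.or: a field whose bit N is an initialized 0 forces
  // bit N of the result to 0, so that bit is clean regardless of the other
  // fields.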
3292  void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
3293    IRBuilder<> IRB(&I);
3294    Value *OperandShadow = getShadow(&I, 0);
3295    Value *OperandSetOrPoison = IRB.CreateOr(I.getOperand(0), OperandShadow);
    // Bit N is clean if any field's bit N is 0 and unpoisoned
3297    Value *OutShadowMask = IRB.CreateAndReduce(OperandSetOrPoison);
    // Otherwise, it is clean if every field's bit N is unpoisoned
3299    Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
3300    Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
3301
3302    setShadow(&I, S);
3303    setOrigin(&I, getOrigin(&I, 0));
3304  }
3305
3306  void handleStmxcsr(IntrinsicInst &I) {
3307    IRBuilder<> IRB(&I);
3308    Value *Addr = I.getArgOperand(0);
3309    Type *Ty = IRB.getInt32Ty();
3310    Value *ShadowPtr =
3311        getShadowOriginPtr(Addr, IRB, Ty, Align(1), /*isStore*/ true).first;
3312
3313    IRB.CreateStore(getCleanShadow(Ty),
3314                    IRB.CreatePointerCast(ShadowPtr, Ty->getPointerTo()));
3315
3316    if (ClCheckAccessAddress)
3317      insertShadowCheck(Addr, &I);
3318  }
3319
3320  void handleLdmxcsr(IntrinsicInst &I) {
3321    if (!InsertChecks)
3322      return;
3323
3324    IRBuilder<> IRB(&I);
3325    Value *Addr = I.getArgOperand(0);
3326    Type *Ty = IRB.getInt32Ty();
3327    const Align Alignment = Align(1);
3328    Value *ShadowPtr, *OriginPtr;
3329    std::tie(ShadowPtr, OriginPtr) =
3330        getShadowOriginPtr(Addr, IRB, Ty, Alignment, /*isStore*/ false);
3331
3332    if (ClCheckAccessAddress)
3333      insertShadowCheck(Addr, &I);
3334
3335    Value *Shadow = IRB.CreateAlignedLoad(Ty, ShadowPtr, Alignment, "_ldmxcsr");
3336    Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(MS.OriginTy, OriginPtr)
3337                                    : getCleanOrigin();
3338    insertShadowCheck(Shadow, Origin, &I);
3339  }
3340
3341  void handleMaskedExpandLoad(IntrinsicInst &I) {
3342    IRBuilder<> IRB(&I);
3343    Value *Ptr = I.getArgOperand(0);
3344    Value *Mask = I.getArgOperand(1);
3345    Value *PassThru = I.getArgOperand(2);
3346
3347    if (ClCheckAccessAddress) {
3348      insertShadowCheck(Ptr, &I);
3349      insertShadowCheck(Mask, &I);
3350    }
3351
3352    if (!PropagateShadow) {
3353      setShadow(&I, getCleanShadow(&I));
3354      setOrigin(&I, getCleanOrigin());
3355      return;
3356    }
3357
3358    Type *ShadowTy = getShadowTy(&I);
3359    Type *ElementShadowTy = cast<FixedVectorType>(ShadowTy)->getElementType();
3360    auto [ShadowPtr, OriginPtr] =
3361        getShadowOriginPtr(Ptr, IRB, ElementShadowTy, {}, /*isStore*/ false);
3362
3363    Value *Shadow = IRB.CreateMaskedExpandLoad(
3364        ShadowTy, ShadowPtr, Mask, getShadow(PassThru), "_msmaskedexpload");
3365
3366    setShadow(&I, Shadow);
3367
3368    // TODO: Store origins.
3369    setOrigin(&I, getCleanOrigin());
3370  }
3371
3372  void handleMaskedCompressStore(IntrinsicInst &I) {
3373    IRBuilder<> IRB(&I);
3374    Value *Values = I.getArgOperand(0);
3375    Value *Ptr = I.getArgOperand(1);
3376    Value *Mask = I.getArgOperand(2);
3377
3378    if (ClCheckAccessAddress) {
3379      insertShadowCheck(Ptr, &I);
3380      insertShadowCheck(Mask, &I);
3381    }
3382
3383    Value *Shadow = getShadow(Values);
3384    Type *ElementShadowTy =
3385        getShadowTy(cast<FixedVectorType>(Values->getType())->getElementType());
3386    auto [ShadowPtr, OriginPtrs] =
3387        getShadowOriginPtr(Ptr, IRB, ElementShadowTy, {}, /*isStore*/ true);
3388
3389    IRB.CreateMaskedCompressStore(Shadow, ShadowPtr, Mask);
3390
3391    // TODO: Store origins.
3392  }
3393
3394  void handleMaskedGather(IntrinsicInst &I) {
3395    IRBuilder<> IRB(&I);
3396    Value *Ptrs = I.getArgOperand(0);
3397    const Align Alignment(
3398        cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
3399    Value *Mask = I.getArgOperand(2);
3400    Value *PassThru = I.getArgOperand(3);
3401
3402    Type *PtrsShadowTy = getShadowTy(Ptrs);
3403    if (ClCheckAccessAddress) {
3404      insertShadowCheck(Mask, &I);
3405      Value *MaskedPtrShadow = IRB.CreateSelect(
3406          Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
3407          "_msmaskedptrs");
3408      insertShadowCheck(MaskedPtrShadow, getOrigin(Ptrs), &I);
3409    }
3410
3411    if (!PropagateShadow) {
3412      setShadow(&I, getCleanShadow(&I));
3413      setOrigin(&I, getCleanOrigin());
3414      return;
3415    }
3416
3417    Type *ShadowTy = getShadowTy(&I);
3418    Type *ElementShadowTy = cast<FixedVectorType>(ShadowTy)->getElementType();
3419    auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
3420        Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ false);
3421
3422    Value *Shadow =
3423        IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
3424                               getShadow(PassThru), "_msmaskedgather");
3425
3426    setShadow(&I, Shadow);
3427
3428    // TODO: Store origins.
3429    setOrigin(&I, getCleanOrigin());
3430  }
3431
3432  void handleMaskedScatter(IntrinsicInst &I) {
3433    IRBuilder<> IRB(&I);
3434    Value *Values = I.getArgOperand(0);
3435    Value *Ptrs = I.getArgOperand(1);
3436    const Align Alignment(
3437        cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
3438    Value *Mask = I.getArgOperand(3);
3439
3440    Type *PtrsShadowTy = getShadowTy(Ptrs);
3441    if (ClCheckAccessAddress) {
3442      insertShadowCheck(Mask, &I);
3443      Value *MaskedPtrShadow = IRB.CreateSelect(
3444          Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
3445          "_msmaskedptrs");
3446      insertShadowCheck(MaskedPtrShadow, getOrigin(Ptrs), &I);
3447    }
3448
3449    Value *Shadow = getShadow(Values);
3450    Type *ElementShadowTy =
3451        getShadowTy(cast<FixedVectorType>(Values->getType())->getElementType());
3452    auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
3453        Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ true);
3454
3455    IRB.CreateMaskedScatter(Shadow, ShadowPtrs, Alignment, Mask);
3456
3457    // TODO: Store origin.
3458  }
3459
3460  void handleMaskedStore(IntrinsicInst &I) {
3461    IRBuilder<> IRB(&I);
3462    Value *V = I.getArgOperand(0);
3463    Value *Ptr = I.getArgOperand(1);
3464    const Align Alignment(
3465        cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
3466    Value *Mask = I.getArgOperand(3);
3467    Value *Shadow = getShadow(V);
3468
3469    if (ClCheckAccessAddress) {
3470      insertShadowCheck(Ptr, &I);
3471      insertShadowCheck(Mask, &I);
3472    }
3473
3474    Value *ShadowPtr;
3475    Value *OriginPtr;
3476    std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
3477        Ptr, IRB, Shadow->getType(), Alignment, /*isStore*/ true);
3478
3479    IRB.CreateMaskedStore(Shadow, ShadowPtr, Alignment, Mask);
3480
3481    if (!MS.TrackOrigins)
3482      return;
3483
3484    auto &DL = F.getParent()->getDataLayout();
3485    paintOrigin(IRB, getOrigin(V), OriginPtr,
3486                DL.getTypeStoreSize(Shadow->getType()),
3487                std::max(Alignment, kMinOriginAlignment));
3488  }
3489
3490  void handleMaskedLoad(IntrinsicInst &I) {
3491    IRBuilder<> IRB(&I);
3492    Value *Ptr = I.getArgOperand(0);
3493    const Align Alignment(
3494        cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
3495    Value *Mask = I.getArgOperand(2);
3496    Value *PassThru = I.getArgOperand(3);
3497
3498    if (ClCheckAccessAddress) {
3499      insertShadowCheck(Ptr, &I);
3500      insertShadowCheck(Mask, &I);
3501    }
3502
3503    if (!PropagateShadow) {
3504      setShadow(&I, getCleanShadow(&I));
3505      setOrigin(&I, getCleanOrigin());
3506      return;
3507    }
3508
3509    Type *ShadowTy = getShadowTy(&I);
3510    Value *ShadowPtr, *OriginPtr;
3511    std::tie(ShadowPtr, OriginPtr) =
3512        getShadowOriginPtr(Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
3513    setShadow(&I, IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
3514                                       getShadow(PassThru), "_msmaskedld"));
3515
3516    if (!MS.TrackOrigins)
3517      return;
3518
3519    // Choose between PassThru's and the loaded value's origins.
3520    Value *MaskedPassThruShadow = IRB.CreateAnd(
3521        getShadow(PassThru), IRB.CreateSExt(IRB.CreateNeg(Mask), ShadowTy));
3522
3523    Value *ConvertedShadow = convertShadowToScalar(MaskedPassThruShadow, IRB);
3524    Value *NotNull = convertToBool(ConvertedShadow, IRB, "_mscmp");
3525
3526    Value *PtrOrigin = IRB.CreateLoad(MS.OriginTy, OriginPtr);
3527    Value *Origin = IRB.CreateSelect(NotNull, getOrigin(PassThru), PtrOrigin);
3528
3529    setOrigin(&I, Origin);
3530  }
3531
3532  // Instrument BMI / BMI2 intrinsics.
3533  // All of these intrinsics are Z = I(X, Y)
3534  // where the types of all operands and the result match, and are either i32 or
3535  // i64. The following instrumentation happens to work for all of them:
3536  //   Sz = I(Sx, Y) | (sext (Sy != 0))
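  // The rationale: Y is a mask/control operand that selects or moves bits of
  // X, so applying the same intrinsic to Sx propagates each shadow bit to
  // wherever its data bit goes, while any poison in Y conservatively poisons
  // the entire result.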
3537  void handleBmiIntrinsic(IntrinsicInst &I) {
3538    IRBuilder<> IRB(&I);
3539    Type *ShadowTy = getShadowTy(&I);
3540
3541    // If any bit of the mask operand is poisoned, then the whole thing is.
3542    Value *SMask = getShadow(&I, 1);
3543    SMask = IRB.CreateSExt(IRB.CreateICmpNE(SMask, getCleanShadow(ShadowTy)),
3544                           ShadowTy);
3545    // Apply the same intrinsic to the shadow of the first operand.
3546    Value *S = IRB.CreateCall(I.getCalledFunction(),
3547                              {getShadow(&I, 0), I.getOperand(1)});
3548    S = IRB.CreateOr(SMask, S);
3549    setShadow(&I, S);
3550    setOriginForNaryOp(I);
3551  }
3552
3553  SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
3554    SmallVector<int, 8> Mask;
3555    for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
3556      Mask.append(2, X);
3557    }
3558    return Mask;
3559  }
3560
3561  // Instrument pclmul intrinsics.
3562  // These intrinsics operate either on odd or on even elements of the input
3563  // vectors, depending on the constant in the 3rd argument, ignoring the rest.
  // Replace the unused elements with copies of the used ones, e.g.:
3565  //   (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
3566  // or
3567  //   (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
3568  // and then apply the usual shadow combining logic.
3569  void handlePclmulIntrinsic(IntrinsicInst &I) {
3570    IRBuilder<> IRB(&I);
3571    unsigned Width =
3572        cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
3573    assert(isa<ConstantInt>(I.getArgOperand(2)) &&
3574           "pclmul 3rd operand must be a constant");
3575    unsigned Imm = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
3576    Value *Shuf0 = IRB.CreateShuffleVector(getShadow(&I, 0),
3577                                           getPclmulMask(Width, Imm & 0x01));
3578    Value *Shuf1 = IRB.CreateShuffleVector(getShadow(&I, 1),
3579                                           getPclmulMask(Width, Imm & 0x10));
3580    ShadowAndOriginCombiner SOC(this, IRB);
3581    SOC.Add(Shuf0, getOrigin(&I, 0));
3582    SOC.Add(Shuf1, getOrigin(&I, 1));
3583    SOC.Done(&I);
3584  }
3585
3586  // Instrument _mm_*_sd|ss intrinsics
3587  void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
3588    IRBuilder<> IRB(&I);
3589    unsigned Width =
3590        cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
3591    Value *First = getShadow(&I, 0);
3592    Value *Second = getShadow(&I, 1);
3593    // First element of second operand, remaining elements of first operand
3594    SmallVector<int, 16> Mask;
3595    Mask.push_back(Width);
3596    for (unsigned i = 1; i < Width; i++)
3597      Mask.push_back(i);
3598    Value *Shadow = IRB.CreateShuffleVector(First, Second, Mask);
3599
3600    setShadow(&I, Shadow);
3601    setOriginForNaryOp(I);
3602  }
3603
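  // Instrument the ptest/vtest family of intrinsics, which reduce their vector
  // inputs to a scalar flag value: we poison the scalar result if any bit of
  // either operand's shadow is set.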
3604  void handleVtestIntrinsic(IntrinsicInst &I) {
3605    IRBuilder<> IRB(&I);
3606    Value *Shadow0 = getShadow(&I, 0);
3607    Value *Shadow1 = getShadow(&I, 1);
3608    Value *Or = IRB.CreateOr(Shadow0, Shadow1);
3609    Value *NZ = IRB.CreateICmpNE(Or, Constant::getNullValue(Or->getType()));
3610    Value *Scalar = convertShadowToScalar(NZ, IRB);
3611    Value *Shadow = IRB.CreateZExt(Scalar, getShadowTy(&I));
3612
3613    setShadow(&I, Shadow);
3614    setOriginForNaryOp(I);
3615  }
3616
3617  void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
3618    IRBuilder<> IRB(&I);
3619    unsigned Width =
3620        cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
3621    Value *First = getShadow(&I, 0);
3622    Value *Second = getShadow(&I, 1);
3623    Value *OrShadow = IRB.CreateOr(First, Second);
3624    // First element of both OR'd together, remaining elements of first operand
3625    SmallVector<int, 16> Mask;
3626    Mask.push_back(Width);
3627    for (unsigned i = 1; i < Width; i++)
3628      Mask.push_back(i);
3629    Value *Shadow = IRB.CreateShuffleVector(First, OrShadow, Mask);
3630
3631    setShadow(&I, Shadow);
3632    setOriginForNaryOp(I);
3633  }
3634
3635  // Instrument abs intrinsic.
3636  // handleUnknownIntrinsic can't handle it because of the last
  // is_int_min_poison argument, which does not match the result type.
3638  void handleAbsIntrinsic(IntrinsicInst &I) {
3639    assert(I.getType()->isIntOrIntVectorTy());
3640    assert(I.getArgOperand(0)->getType() == I.getType());
3641
3642    // FIXME: Handle is_int_min_poison.
3643    IRBuilder<> IRB(&I);
3644    setShadow(&I, getShadow(&I, 0));
3645    setOrigin(&I, getOrigin(&I, 0));
3646  }
3647
3648  void visitIntrinsicInst(IntrinsicInst &I) {
3649    switch (I.getIntrinsicID()) {
3650    case Intrinsic::abs:
3651      handleAbsIntrinsic(I);
3652      break;
3653    case Intrinsic::lifetime_start:
3654      handleLifetimeStart(I);
3655      break;
3656    case Intrinsic::launder_invariant_group:
3657    case Intrinsic::strip_invariant_group:
3658      handleInvariantGroup(I);
3659      break;
3660    case Intrinsic::bswap:
3661      handleBswap(I);
3662      break;
3663    case Intrinsic::ctlz:
3664    case Intrinsic::cttz:
3665      handleCountZeroes(I);
3666      break;
3667    case Intrinsic::masked_compressstore:
3668      handleMaskedCompressStore(I);
3669      break;
3670    case Intrinsic::masked_expandload:
3671      handleMaskedExpandLoad(I);
3672      break;
3673    case Intrinsic::masked_gather:
3674      handleMaskedGather(I);
3675      break;
3676    case Intrinsic::masked_scatter:
3677      handleMaskedScatter(I);
3678      break;
3679    case Intrinsic::masked_store:
3680      handleMaskedStore(I);
3681      break;
3682    case Intrinsic::masked_load:
3683      handleMaskedLoad(I);
3684      break;
3685    case Intrinsic::vector_reduce_and:
3686      handleVectorReduceAndIntrinsic(I);
3687      break;
3688    case Intrinsic::vector_reduce_or:
3689      handleVectorReduceOrIntrinsic(I);
3690      break;
3691    case Intrinsic::vector_reduce_add:
3692    case Intrinsic::vector_reduce_xor:
3693    case Intrinsic::vector_reduce_mul:
3694      handleVectorReduceIntrinsic(I);
3695      break;
3696    case Intrinsic::x86_sse_stmxcsr:
3697      handleStmxcsr(I);
3698      break;
3699    case Intrinsic::x86_sse_ldmxcsr:
3700      handleLdmxcsr(I);
3701      break;
3702    case Intrinsic::x86_avx512_vcvtsd2usi64:
3703    case Intrinsic::x86_avx512_vcvtsd2usi32:
3704    case Intrinsic::x86_avx512_vcvtss2usi64:
3705    case Intrinsic::x86_avx512_vcvtss2usi32:
3706    case Intrinsic::x86_avx512_cvttss2usi64:
3707    case Intrinsic::x86_avx512_cvttss2usi:
3708    case Intrinsic::x86_avx512_cvttsd2usi64:
3709    case Intrinsic::x86_avx512_cvttsd2usi:
3710    case Intrinsic::x86_avx512_cvtusi2ss:
3711    case Intrinsic::x86_avx512_cvtusi642sd:
3712    case Intrinsic::x86_avx512_cvtusi642ss:
3713      handleVectorConvertIntrinsic(I, 1, true);
3714      break;
3715    case Intrinsic::x86_sse2_cvtsd2si64:
3716    case Intrinsic::x86_sse2_cvtsd2si:
3717    case Intrinsic::x86_sse2_cvtsd2ss:
3718    case Intrinsic::x86_sse2_cvttsd2si64:
3719    case Intrinsic::x86_sse2_cvttsd2si:
3720    case Intrinsic::x86_sse_cvtss2si64:
3721    case Intrinsic::x86_sse_cvtss2si:
3722    case Intrinsic::x86_sse_cvttss2si64:
3723    case Intrinsic::x86_sse_cvttss2si:
3724      handleVectorConvertIntrinsic(I, 1);
3725      break;
3726    case Intrinsic::x86_sse_cvtps2pi:
3727    case Intrinsic::x86_sse_cvttps2pi:
3728      handleVectorConvertIntrinsic(I, 2);
3729      break;
3730
3731    case Intrinsic::x86_avx512_psll_w_512:
3732    case Intrinsic::x86_avx512_psll_d_512:
3733    case Intrinsic::x86_avx512_psll_q_512:
3734    case Intrinsic::x86_avx512_pslli_w_512:
3735    case Intrinsic::x86_avx512_pslli_d_512:
3736    case Intrinsic::x86_avx512_pslli_q_512:
3737    case Intrinsic::x86_avx512_psrl_w_512:
3738    case Intrinsic::x86_avx512_psrl_d_512:
3739    case Intrinsic::x86_avx512_psrl_q_512:
3740    case Intrinsic::x86_avx512_psra_w_512:
3741    case Intrinsic::x86_avx512_psra_d_512:
3742    case Intrinsic::x86_avx512_psra_q_512:
3743    case Intrinsic::x86_avx512_psrli_w_512:
3744    case Intrinsic::x86_avx512_psrli_d_512:
3745    case Intrinsic::x86_avx512_psrli_q_512:
3746    case Intrinsic::x86_avx512_psrai_w_512:
3747    case Intrinsic::x86_avx512_psrai_d_512:
3748    case Intrinsic::x86_avx512_psrai_q_512:
3749    case Intrinsic::x86_avx512_psra_q_256:
3750    case Intrinsic::x86_avx512_psra_q_128:
3751    case Intrinsic::x86_avx512_psrai_q_256:
3752    case Intrinsic::x86_avx512_psrai_q_128:
3753    case Intrinsic::x86_avx2_psll_w:
3754    case Intrinsic::x86_avx2_psll_d:
3755    case Intrinsic::x86_avx2_psll_q:
3756    case Intrinsic::x86_avx2_pslli_w:
3757    case Intrinsic::x86_avx2_pslli_d:
3758    case Intrinsic::x86_avx2_pslli_q:
3759    case Intrinsic::x86_avx2_psrl_w:
3760    case Intrinsic::x86_avx2_psrl_d:
3761    case Intrinsic::x86_avx2_psrl_q:
3762    case Intrinsic::x86_avx2_psra_w:
3763    case Intrinsic::x86_avx2_psra_d:
3764    case Intrinsic::x86_avx2_psrli_w:
3765    case Intrinsic::x86_avx2_psrli_d:
3766    case Intrinsic::x86_avx2_psrli_q:
3767    case Intrinsic::x86_avx2_psrai_w:
3768    case Intrinsic::x86_avx2_psrai_d:
3769    case Intrinsic::x86_sse2_psll_w:
3770    case Intrinsic::x86_sse2_psll_d:
3771    case Intrinsic::x86_sse2_psll_q:
3772    case Intrinsic::x86_sse2_pslli_w:
3773    case Intrinsic::x86_sse2_pslli_d:
3774    case Intrinsic::x86_sse2_pslli_q:
3775    case Intrinsic::x86_sse2_psrl_w:
3776    case Intrinsic::x86_sse2_psrl_d:
3777    case Intrinsic::x86_sse2_psrl_q:
3778    case Intrinsic::x86_sse2_psra_w:
3779    case Intrinsic::x86_sse2_psra_d:
3780    case Intrinsic::x86_sse2_psrli_w:
3781    case Intrinsic::x86_sse2_psrli_d:
3782    case Intrinsic::x86_sse2_psrli_q:
3783    case Intrinsic::x86_sse2_psrai_w:
3784    case Intrinsic::x86_sse2_psrai_d:
3785    case Intrinsic::x86_mmx_psll_w:
3786    case Intrinsic::x86_mmx_psll_d:
3787    case Intrinsic::x86_mmx_psll_q:
3788    case Intrinsic::x86_mmx_pslli_w:
3789    case Intrinsic::x86_mmx_pslli_d:
3790    case Intrinsic::x86_mmx_pslli_q:
3791    case Intrinsic::x86_mmx_psrl_w:
3792    case Intrinsic::x86_mmx_psrl_d:
3793    case Intrinsic::x86_mmx_psrl_q:
3794    case Intrinsic::x86_mmx_psra_w:
3795    case Intrinsic::x86_mmx_psra_d:
3796    case Intrinsic::x86_mmx_psrli_w:
3797    case Intrinsic::x86_mmx_psrli_d:
3798    case Intrinsic::x86_mmx_psrli_q:
3799    case Intrinsic::x86_mmx_psrai_w:
3800    case Intrinsic::x86_mmx_psrai_d:
3801      handleVectorShiftIntrinsic(I, /* Variable */ false);
3802      break;
3803    case Intrinsic::x86_avx2_psllv_d:
3804    case Intrinsic::x86_avx2_psllv_d_256:
3805    case Intrinsic::x86_avx512_psllv_d_512:
3806    case Intrinsic::x86_avx2_psllv_q:
3807    case Intrinsic::x86_avx2_psllv_q_256:
3808    case Intrinsic::x86_avx512_psllv_q_512:
3809    case Intrinsic::x86_avx2_psrlv_d:
3810    case Intrinsic::x86_avx2_psrlv_d_256:
3811    case Intrinsic::x86_avx512_psrlv_d_512:
3812    case Intrinsic::x86_avx2_psrlv_q:
3813    case Intrinsic::x86_avx2_psrlv_q_256:
3814    case Intrinsic::x86_avx512_psrlv_q_512:
3815    case Intrinsic::x86_avx2_psrav_d:
3816    case Intrinsic::x86_avx2_psrav_d_256:
3817    case Intrinsic::x86_avx512_psrav_d_512:
3818    case Intrinsic::x86_avx512_psrav_q_128:
3819    case Intrinsic::x86_avx512_psrav_q_256:
3820    case Intrinsic::x86_avx512_psrav_q_512:
3821      handleVectorShiftIntrinsic(I, /* Variable */ true);
3822      break;
3823
3824    case Intrinsic::x86_sse2_packsswb_128:
3825    case Intrinsic::x86_sse2_packssdw_128:
3826    case Intrinsic::x86_sse2_packuswb_128:
3827    case Intrinsic::x86_sse41_packusdw:
3828    case Intrinsic::x86_avx2_packsswb:
3829    case Intrinsic::x86_avx2_packssdw:
3830    case Intrinsic::x86_avx2_packuswb:
3831    case Intrinsic::x86_avx2_packusdw:
3832      handleVectorPackIntrinsic(I);
3833      break;
3834
3835    case Intrinsic::x86_mmx_packsswb:
3836    case Intrinsic::x86_mmx_packuswb:
3837      handleVectorPackIntrinsic(I, 16);
3838      break;
3839
3840    case Intrinsic::x86_mmx_packssdw:
3841      handleVectorPackIntrinsic(I, 32);
3842      break;
3843
3844    case Intrinsic::x86_mmx_psad_bw:
3845    case Intrinsic::x86_sse2_psad_bw:
3846    case Intrinsic::x86_avx2_psad_bw:
3847      handleVectorSadIntrinsic(I);
3848      break;
3849
3850    case Intrinsic::x86_sse2_pmadd_wd:
3851    case Intrinsic::x86_avx2_pmadd_wd:
3852    case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
3853    case Intrinsic::x86_avx2_pmadd_ub_sw:
3854      handleVectorPmaddIntrinsic(I);
3855      break;
3856
3857    case Intrinsic::x86_ssse3_pmadd_ub_sw:
3858      handleVectorPmaddIntrinsic(I, 8);
3859      break;
3860
3861    case Intrinsic::x86_mmx_pmadd_wd:
3862      handleVectorPmaddIntrinsic(I, 16);
3863      break;
3864
3865    case Intrinsic::x86_sse_cmp_ss:
3866    case Intrinsic::x86_sse2_cmp_sd:
3867    case Intrinsic::x86_sse_comieq_ss:
3868    case Intrinsic::x86_sse_comilt_ss:
3869    case Intrinsic::x86_sse_comile_ss:
3870    case Intrinsic::x86_sse_comigt_ss:
3871    case Intrinsic::x86_sse_comige_ss:
3872    case Intrinsic::x86_sse_comineq_ss:
3873    case Intrinsic::x86_sse_ucomieq_ss:
3874    case Intrinsic::x86_sse_ucomilt_ss:
3875    case Intrinsic::x86_sse_ucomile_ss:
3876    case Intrinsic::x86_sse_ucomigt_ss:
3877    case Intrinsic::x86_sse_ucomige_ss:
3878    case Intrinsic::x86_sse_ucomineq_ss:
3879    case Intrinsic::x86_sse2_comieq_sd:
3880    case Intrinsic::x86_sse2_comilt_sd:
3881    case Intrinsic::x86_sse2_comile_sd:
3882    case Intrinsic::x86_sse2_comigt_sd:
3883    case Intrinsic::x86_sse2_comige_sd:
3884    case Intrinsic::x86_sse2_comineq_sd:
3885    case Intrinsic::x86_sse2_ucomieq_sd:
3886    case Intrinsic::x86_sse2_ucomilt_sd:
3887    case Intrinsic::x86_sse2_ucomile_sd:
3888    case Intrinsic::x86_sse2_ucomigt_sd:
3889    case Intrinsic::x86_sse2_ucomige_sd:
3890    case Intrinsic::x86_sse2_ucomineq_sd:
3891      handleVectorCompareScalarIntrinsic(I);
3892      break;
3893
3894    case Intrinsic::x86_avx_cmp_pd_256:
3895    case Intrinsic::x86_avx_cmp_ps_256:
3896    case Intrinsic::x86_sse2_cmp_pd:
3897    case Intrinsic::x86_sse_cmp_ps:
3898      handleVectorComparePackedIntrinsic(I);
3899      break;
3900
3901    case Intrinsic::x86_bmi_bextr_32:
3902    case Intrinsic::x86_bmi_bextr_64:
3903    case Intrinsic::x86_bmi_bzhi_32:
3904    case Intrinsic::x86_bmi_bzhi_64:
3905    case Intrinsic::x86_bmi_pdep_32:
3906    case Intrinsic::x86_bmi_pdep_64:
3907    case Intrinsic::x86_bmi_pext_32:
3908    case Intrinsic::x86_bmi_pext_64:
3909      handleBmiIntrinsic(I);
3910      break;
3911
3912    case Intrinsic::x86_pclmulqdq:
3913    case Intrinsic::x86_pclmulqdq_256:
3914    case Intrinsic::x86_pclmulqdq_512:
3915      handlePclmulIntrinsic(I);
3916      break;
3917
3918    case Intrinsic::x86_sse41_round_sd:
3919    case Intrinsic::x86_sse41_round_ss:
3920      handleUnarySdSsIntrinsic(I);
3921      break;
3922    case Intrinsic::x86_sse2_max_sd:
3923    case Intrinsic::x86_sse_max_ss:
3924    case Intrinsic::x86_sse2_min_sd:
3925    case Intrinsic::x86_sse_min_ss:
3926      handleBinarySdSsIntrinsic(I);
3927      break;
3928
3929    case Intrinsic::x86_avx_vtestc_pd:
3930    case Intrinsic::x86_avx_vtestc_pd_256:
3931    case Intrinsic::x86_avx_vtestc_ps:
3932    case Intrinsic::x86_avx_vtestc_ps_256:
3933    case Intrinsic::x86_avx_vtestnzc_pd:
3934    case Intrinsic::x86_avx_vtestnzc_pd_256:
3935    case Intrinsic::x86_avx_vtestnzc_ps:
3936    case Intrinsic::x86_avx_vtestnzc_ps_256:
3937    case Intrinsic::x86_avx_vtestz_pd:
3938    case Intrinsic::x86_avx_vtestz_pd_256:
3939    case Intrinsic::x86_avx_vtestz_ps:
3940    case Intrinsic::x86_avx_vtestz_ps_256:
3941    case Intrinsic::x86_avx_ptestc_256:
3942    case Intrinsic::x86_avx_ptestnzc_256:
3943    case Intrinsic::x86_avx_ptestz_256:
3944    case Intrinsic::x86_sse41_ptestc:
3945    case Intrinsic::x86_sse41_ptestnzc:
3946    case Intrinsic::x86_sse41_ptestz:
3947      handleVtestIntrinsic(I);
3948      break;
3949
3950    case Intrinsic::fshl:
3951    case Intrinsic::fshr:
3952      handleFunnelShift(I);
3953      break;
3954
3955    case Intrinsic::is_constant:
3956      // The result of llvm.is.constant() is always defined.
3957      setShadow(&I, getCleanShadow(&I));
3958      setOrigin(&I, getCleanOrigin());
3959      break;
3960
3961    default:
3962      if (!handleUnknownIntrinsic(I))
3963        visitInstruction(I);
3964      break;
3965    }
3966  }
3967
3968  void visitLibAtomicLoad(CallBase &CB) {
3969    // Since we use getNextNode here, we can't have CB terminate the BB.
3970    assert(isa<CallInst>(CB));
3971
3972    IRBuilder<> IRB(&CB);
3973    Value *Size = CB.getArgOperand(0);
3974    Value *SrcPtr = CB.getArgOperand(1);
3975    Value *DstPtr = CB.getArgOperand(2);
3976    Value *Ordering = CB.getArgOperand(3);
3977    // Convert the call to have at least Acquire ordering to make sure
3978    // the shadow operations aren't reordered before it.
3979    Value *NewOrdering =
3980        IRB.CreateExtractElement(makeAddAcquireOrderingTable(IRB), Ordering);
3981    CB.setArgOperand(3, NewOrdering);
3982
3983    NextNodeIRBuilder NextIRB(&CB);
3984    Value *SrcShadowPtr, *SrcOriginPtr;
3985    std::tie(SrcShadowPtr, SrcOriginPtr) =
3986        getShadowOriginPtr(SrcPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
3987                           /*isStore*/ false);
3988    Value *DstShadowPtr =
3989        getShadowOriginPtr(DstPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
3990                           /*isStore*/ true)
3991            .first;
3992
3993    NextIRB.CreateMemCpy(DstShadowPtr, Align(1), SrcShadowPtr, Align(1), Size);
3994    if (MS.TrackOrigins) {
3995      Value *SrcOrigin = NextIRB.CreateAlignedLoad(MS.OriginTy, SrcOriginPtr,
3996                                                   kMinOriginAlignment);
3997      Value *NewOrigin = updateOrigin(SrcOrigin, NextIRB);
3998      NextIRB.CreateCall(MS.MsanSetOriginFn, {DstPtr, Size, NewOrigin});
3999    }
4000  }
4001
4002  void visitLibAtomicStore(CallBase &CB) {
4003    IRBuilder<> IRB(&CB);
4004    Value *Size = CB.getArgOperand(0);
4005    Value *DstPtr = CB.getArgOperand(2);
4006    Value *Ordering = CB.getArgOperand(3);
4007    // Convert the call to have at least Release ordering to make sure
4008    // the shadow operations aren't reordered after it.
4009    Value *NewOrdering =
4010        IRB.CreateExtractElement(makeAddReleaseOrderingTable(IRB), Ordering);
4011    CB.setArgOperand(3, NewOrdering);
4012
4013    Value *DstShadowPtr =
4014        getShadowOriginPtr(DstPtr, IRB, IRB.getInt8Ty(), Align(1),
4015                           /*isStore*/ true)
4016            .first;
4017
4018    // Atomic store always paints clean shadow/origin. See file header.
4019    IRB.CreateMemSet(DstShadowPtr, getCleanShadow(IRB.getInt8Ty()), Size,
4020                     Align(1));
4021  }
4022
4023  void visitCallBase(CallBase &CB) {
4024    assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
4025    if (CB.isInlineAsm()) {
4026      // For inline asm (either a call to asm function, or callbr instruction),
4027      // do the usual thing: check argument shadow and mark all outputs as
4028      // clean. Note that any side effects of the inline asm that are not
4029      // immediately visible in its constraints are not handled.
4030      if (ClHandleAsmConservative && MS.CompileKernel)
4031        visitAsmInstruction(CB);
4032      else
4033        visitInstruction(CB);
4034      return;
4035    }
4036    LibFunc LF;
4037    if (TLI->getLibFunc(CB, LF)) {
4038      // libatomic.a functions need to have special handling because there isn't
4039      // a good way to intercept them or compile the library with
4040      // instrumentation.
4041      switch (LF) {
4042      case LibFunc_atomic_load:
4043        if (!isa<CallInst>(CB)) {
          llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load. "
                          "Ignoring!\n";
4046          break;
4047        }
4048        visitLibAtomicLoad(CB);
4049        return;
4050      case LibFunc_atomic_store:
4051        visitLibAtomicStore(CB);
4052        return;
4053      default:
4054        break;
4055      }
4056    }
4057
4058    if (auto *Call = dyn_cast<CallInst>(&CB)) {
4059      assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
4060
4061      // We are going to insert code that relies on the fact that the callee
4062      // will become a non-readonly function after it is instrumented by us. To
4063      // prevent this code from being optimized out, mark that function
4064      // non-readonly in advance.
4065      // TODO: We can likely do better than dropping memory() completely here.
4066      AttributeMask B;
4067      B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
4068
4069      Call->removeFnAttrs(B);
4070      if (Function *Func = Call->getCalledFunction()) {
4071        Func->removeFnAttrs(B);
4072      }
4073
4074      maybeMarkSanitizerLibraryCallNoBuiltin(Call, TLI);
4075    }
4076    IRBuilder<> IRB(&CB);
4077    bool MayCheckCall = MS.EagerChecks;
4078    if (Function *Func = CB.getCalledFunction()) {
4079      // __sanitizer_unaligned_{load,store} functions may be called by users
      // and always expect shadows in the TLS. So don't check them.
4081      MayCheckCall &= !Func->getName().startswith("__sanitizer_unaligned_");
4082    }
4083
4084    unsigned ArgOffset = 0;
4085    LLVM_DEBUG(dbgs() << "  CallSite: " << CB << "\n");
4086    for (const auto &[i, A] : llvm::enumerate(CB.args())) {
4087      if (!A->getType()->isSized()) {
4088        LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
4089        continue;
4090      }
4091      unsigned Size = 0;
4092      const DataLayout &DL = F.getParent()->getDataLayout();
4093
4094      bool ByVal = CB.paramHasAttr(i, Attribute::ByVal);
4095      bool NoUndef = CB.paramHasAttr(i, Attribute::NoUndef);
4096      bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
4097
4098      if (EagerCheck) {
4099        insertShadowCheck(A, &CB);
4100        Size = DL.getTypeAllocSize(A->getType());
4101      } else {
4102        Value *Store = nullptr;
4103        // Compute the Shadow for arg even if it is ByVal, because
4104        // in that case getShadow() will copy the actual arg shadow to
4105        // __msan_param_tls.
4106        Value *ArgShadow = getShadow(A);
4107        Value *ArgShadowBase = getShadowPtrForArgument(A, IRB, ArgOffset);
4108        LLVM_DEBUG(dbgs() << "  Arg#" << i << ": " << *A
4109                          << " Shadow: " << *ArgShadow << "\n");
4110        if (ByVal) {
4111          // ByVal requires some special handling as it's too big for a single
4112          // load
4113          assert(A->getType()->isPointerTy() &&
4114                 "ByVal argument is not a pointer!");
4115          Size = DL.getTypeAllocSize(CB.getParamByValType(i));
4116          if (ArgOffset + Size > kParamTLSSize)
4117            break;
4118          const MaybeAlign ParamAlignment(CB.getParamAlign(i));
4119          MaybeAlign Alignment = std::nullopt;
4120          if (ParamAlignment)
4121            Alignment = std::min(*ParamAlignment, kShadowTLSAlignment);
4122          Value *AShadowPtr, *AOriginPtr;
4123          std::tie(AShadowPtr, AOriginPtr) =
4124              getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), Alignment,
4125                                 /*isStore*/ false);
4126          if (!PropagateShadow) {
4127            Store = IRB.CreateMemSet(ArgShadowBase,
4128                                     Constant::getNullValue(IRB.getInt8Ty()),
4129                                     Size, Alignment);
4130          } else {
4131            Store = IRB.CreateMemCpy(ArgShadowBase, Alignment, AShadowPtr,
4132                                     Alignment, Size);
4133            if (MS.TrackOrigins) {
4134              Value *ArgOriginBase = getOriginPtrForArgument(A, IRB, ArgOffset);
4135              // FIXME: OriginSize should be:
4136              // alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
4137              unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
4138              IRB.CreateMemCpy(
4139                  ArgOriginBase,
4140                  /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
4141                  AOriginPtr,
4142                  /* by getShadowOriginPtr */ kMinOriginAlignment, OriginSize);
4143            }
4144          }
4145        } else {
4146          // Any other parameters mean we need bit-grained tracking of uninit
4147          // data
4148          Size = DL.getTypeAllocSize(A->getType());
4149          if (ArgOffset + Size > kParamTLSSize)
4150            break;
4151          Store = IRB.CreateAlignedStore(ArgShadow, ArgShadowBase,
4152                                         kShadowTLSAlignment);
4153          Constant *Cst = dyn_cast<Constant>(ArgShadow);
4154          if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
4155            IRB.CreateStore(getOrigin(A),
4156                            getOriginPtrForArgument(A, IRB, ArgOffset));
4157          }
4158        }
4159        (void)Store;
4160        assert(Store != nullptr);
4161        LLVM_DEBUG(dbgs() << "  Param:" << *Store << "\n");
4162      }
4163      assert(Size != 0);
4164      ArgOffset += alignTo(Size, kShadowTLSAlignment);
4165    }
4166    LLVM_DEBUG(dbgs() << "  done with call args\n");
4167
4168    FunctionType *FT = CB.getFunctionType();
4169    if (FT->isVarArg()) {
4170      VAHelper->visitCallBase(CB, IRB);
4171    }
4172
4173    // Now, get the shadow for the RetVal.
4174    if (!CB.getType()->isSized())
4175      return;
4176    // Don't emit the epilogue for musttail call returns.
4177    if (isa<CallInst>(CB) && cast<CallInst>(CB).isMustTailCall())
4178      return;
4179
4180    if (MayCheckCall && CB.hasRetAttr(Attribute::NoUndef)) {
4181      setShadow(&CB, getCleanShadow(&CB));
4182      setOrigin(&CB, getCleanOrigin());
4183      return;
4184    }
4185
4186    IRBuilder<> IRBBefore(&CB);
4187    // Until we have full dynamic coverage, make sure the retval shadow is 0.
4188    Value *Base = getShadowPtrForRetval(&CB, IRBBefore);
4189    IRBBefore.CreateAlignedStore(getCleanShadow(&CB), Base,
4190                                 kShadowTLSAlignment);
4191    BasicBlock::iterator NextInsn;
4192    if (isa<CallInst>(CB)) {
4193      NextInsn = ++CB.getIterator();
4194      assert(NextInsn != CB.getParent()->end());
4195    } else {
4196      BasicBlock *NormalDest = cast<InvokeInst>(CB).getNormalDest();
4197      if (!NormalDest->getSinglePredecessor()) {
4198        // FIXME: this case is tricky, so we are just conservative here.
4199        // Perhaps we need to split the edge between this BB and NormalDest,
4200        // but a naive attempt to use SplitEdge leads to a crash.
4201        setShadow(&CB, getCleanShadow(&CB));
4202        setOrigin(&CB, getCleanOrigin());
4203        return;
4204      }
4205      // FIXME: NextInsn is likely in a basic block that has not been visited
4206      // yet. Anything inserted there will be instrumented by MSan later!
4207      NextInsn = NormalDest->getFirstInsertionPt();
4208      assert(NextInsn != NormalDest->end() &&
4209             "Could not find insertion point for retval shadow load");
4210    }
4211    IRBuilder<> IRBAfter(&*NextInsn);
4212    Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
4213        getShadowTy(&CB), getShadowPtrForRetval(&CB, IRBAfter),
4214        kShadowTLSAlignment, "_msret");
4215    setShadow(&CB, RetvalShadow);
4216    if (MS.TrackOrigins)
4217      setOrigin(&CB, IRBAfter.CreateLoad(MS.OriginTy,
4218                                         getOriginPtrForRetval(IRBAfter)));
4219  }
4220
4221  bool isAMustTailRetVal(Value *RetVal) {
4222    if (auto *I = dyn_cast<BitCastInst>(RetVal)) {
4223      RetVal = I->getOperand(0);
4224    }
4225    if (auto *I = dyn_cast<CallInst>(RetVal)) {
4226      return I->isMustTailCall();
4227    }
4228    return false;
4229  }
4230
4231  void visitReturnInst(ReturnInst &I) {
4232    IRBuilder<> IRB(&I);
4233    Value *RetVal = I.getReturnValue();
4234    if (!RetVal)
4235      return;
4236    // Don't emit the epilogue for musttail call returns.
4237    if (isAMustTailRetVal(RetVal))
4238      return;
4239    Value *ShadowPtr = getShadowPtrForRetval(RetVal, IRB);
4240    bool HasNoUndef = F.hasRetAttribute(Attribute::NoUndef);
4241    bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
4242    // FIXME: Consider using SpecialCaseList to specify a list of functions that
4243    // must always return fully initialized values. For now, we hardcode "main".
4244    bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
4245
4246    Value *Shadow = getShadow(RetVal);
4247    bool StoreOrigin = true;
4248    if (EagerCheck) {
4249      insertShadowCheck(RetVal, &I);
4250      Shadow = getCleanShadow(RetVal);
4251      StoreOrigin = false;
4252    }
4253
4254    // Even if the eager check passed, the caller may still expect shadow to be
4255    // passed over TLS.
4256    if (StoreShadow) {
4257      IRB.CreateAlignedStore(Shadow, ShadowPtr, kShadowTLSAlignment);
4258      if (MS.TrackOrigins && StoreOrigin)
4259        IRB.CreateStore(getOrigin(RetVal), getOriginPtrForRetval(IRB));
4260    }
4261  }
4262
4263  void visitPHINode(PHINode &I) {
4264    IRBuilder<> IRB(&I);
4265    if (!PropagateShadow) {
4266      setShadow(&I, getCleanShadow(&I));
4267      setOrigin(&I, getCleanOrigin());
4268      return;
4269    }
4270
4271    ShadowPHINodes.push_back(&I);
4272    setShadow(&I, IRB.CreatePHI(getShadowTy(&I), I.getNumIncomingValues(),
4273                                "_msphi_s"));
4274    if (MS.TrackOrigins)
4275      setOrigin(
4276          &I, IRB.CreatePHI(MS.OriginTy, I.getNumIncomingValues(), "_msphi_o"));
4277  }
4278
4279  Value *getLocalVarIdptr(AllocaInst &I) {
4280    ConstantInt *IntConst =
4281        ConstantInt::get(Type::getInt32Ty((*F.getParent()).getContext()), 0);
4282    return new GlobalVariable(*F.getParent(), IntConst->getType(),
4283                              /*isConstant=*/false, GlobalValue::PrivateLinkage,
4284                              IntConst);
4285  }
4286
4287  Value *getLocalVarDescription(AllocaInst &I) {
4288    return createPrivateConstGlobalForString(*F.getParent(), I.getName());
4289  }
4290
4291  void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
4292    if (PoisonStack && ClPoisonStackWithCall) {
4293      IRB.CreateCall(MS.MsanPoisonStackFn,
4294                     {IRB.CreatePointerCast(&I, IRB.getInt8PtrTy()), Len});
4295    } else {
4296      Value *ShadowBase, *OriginBase;
4297      std::tie(ShadowBase, OriginBase) = getShadowOriginPtr(
4298          &I, IRB, IRB.getInt8Ty(), Align(1), /*isStore*/ true);
4299
4300      Value *PoisonValue = IRB.getInt8(PoisonStack ? ClPoisonStackPattern : 0);
4301      IRB.CreateMemSet(ShadowBase, PoisonValue, Len, I.getAlign());
4302    }
4303
4304    if (PoisonStack && MS.TrackOrigins) {
4305      Value *Idptr = getLocalVarIdptr(I);
4306      if (ClPrintStackNames) {
4307        Value *Descr = getLocalVarDescription(I);
4308        IRB.CreateCall(MS.MsanSetAllocaOriginWithDescriptionFn,
4309                       {IRB.CreatePointerCast(&I, IRB.getInt8PtrTy()), Len,
4310                        IRB.CreatePointerCast(Idptr, IRB.getInt8PtrTy()),
4311                        IRB.CreatePointerCast(Descr, IRB.getInt8PtrTy())});
4312      } else {
4313        IRB.CreateCall(MS.MsanSetAllocaOriginNoDescriptionFn,
4314                       {IRB.CreatePointerCast(&I, IRB.getInt8PtrTy()), Len,
4315                        IRB.CreatePointerCast(Idptr, IRB.getInt8PtrTy())});
4316      }
4317    }
4318  }
4319
4320  void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
4321    Value *Descr = getLocalVarDescription(I);
4322    if (PoisonStack) {
4323      IRB.CreateCall(MS.MsanPoisonAllocaFn,
4324                     {IRB.CreatePointerCast(&I, IRB.getInt8PtrTy()), Len,
4325                      IRB.CreatePointerCast(Descr, IRB.getInt8PtrTy())});
4326    } else {
4327      IRB.CreateCall(MS.MsanUnpoisonAllocaFn,
4328                     {IRB.CreatePointerCast(&I, IRB.getInt8PtrTy()), Len});
4329    }
4330  }
4331
4332  void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
4333    if (!InsPoint)
4334      InsPoint = &I;
4335    NextNodeIRBuilder IRB(InsPoint);
4336    const DataLayout &DL = F.getParent()->getDataLayout();
4337    uint64_t TypeSize = DL.getTypeAllocSize(I.getAllocatedType());
4338    Value *Len = ConstantInt::get(MS.IntptrTy, TypeSize);
4339    if (I.isArrayAllocation())
4340      Len = IRB.CreateMul(Len,
4341                          IRB.CreateZExtOrTrunc(I.getArraySize(), MS.IntptrTy));
4342
4343    if (MS.CompileKernel)
4344      poisonAllocaKmsan(I, IRB, Len);
4345    else
4346      poisonAllocaUserspace(I, IRB, Len);
4347  }
4348
4349  void visitAllocaInst(AllocaInst &I) {
4350    setShadow(&I, getCleanShadow(&I));
4351    setOrigin(&I, getCleanOrigin());
4352    // We'll get to this alloca later unless it's poisoned at the corresponding
4353    // llvm.lifetime.start.
4354    AllocaSet.insert(&I);
4355  }
4356
4357  void visitSelectInst(SelectInst &I) {
4358    IRBuilder<> IRB(&I);
4359    // a = select b, c, d
4360    Value *B = I.getCondition();
4361    Value *C = I.getTrueValue();
4362    Value *D = I.getFalseValue();
4363    Value *Sb = getShadow(B);
4364    Value *Sc = getShadow(C);
4365    Value *Sd = getShadow(D);
4366
4367    // Result shadow if condition shadow is 0.
4368    Value *Sa0 = IRB.CreateSelect(B, Sc, Sd);
4369    Value *Sa1;
4370    if (I.getType()->isAggregateType()) {
4371      // To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
4372      // an extra "select". This results in much more compact IR.
4373      // Sa = select Sb, poisoned, (select b, Sc, Sd)
4374      Sa1 = getPoisonedShadow(getShadowTy(I.getType()));
4375    } else {
4376      // Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
4377      // If Sb (condition is poisoned), look for bits in c and d that are equal
4378      // and both unpoisoned.
4379      // If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
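      // For example (a sketch): with fully initialized c = 0b1100 and
      // d = 0b1010 and a poisoned condition bit, Sa1 = (c ^ d) | 0 | 0 =
      // 0b0110, i.e. only the bits where c and d agree stay unpoisoned.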
4380
4381      // Cast arguments to shadow-compatible type.
4382      C = CreateAppToShadowCast(IRB, C);
4383      D = CreateAppToShadowCast(IRB, D);
4384
4385      // Result shadow if condition shadow is 1.
4386      Sa1 = IRB.CreateOr({IRB.CreateXor(C, D), Sc, Sd});
4387    }
4388    Value *Sa = IRB.CreateSelect(Sb, Sa1, Sa0, "_msprop_select");
4389    setShadow(&I, Sa);
4390    if (MS.TrackOrigins) {
4391      // Origins are always i32, so any vector conditions must be flattened.
4392      // FIXME: consider tracking vector origins for app vectors?
4393      if (B->getType()->isVectorTy()) {
4394        Type *FlatTy = getShadowTyNoVec(B->getType());
4395        B = IRB.CreateICmpNE(IRB.CreateBitCast(B, FlatTy),
4396                             ConstantInt::getNullValue(FlatTy));
4397        Sb = IRB.CreateICmpNE(IRB.CreateBitCast(Sb, FlatTy),
4398                              ConstantInt::getNullValue(FlatTy));
4399      }
4400      // a = select b, c, d
4401      // Oa = Sb ? Ob : (b ? Oc : Od)
4402      setOrigin(
4403          &I, IRB.CreateSelect(Sb, getOrigin(I.getCondition()),
4404                               IRB.CreateSelect(B, getOrigin(I.getTrueValue()),
4405                                                getOrigin(I.getFalseValue()))));
4406    }
4407  }
4408
4409  void visitLandingPadInst(LandingPadInst &I) {
4410    // Do nothing.
4411    // See https://github.com/google/sanitizers/issues/504
4412    setShadow(&I, getCleanShadow(&I));
4413    setOrigin(&I, getCleanOrigin());
4414  }
4415
4416  void visitCatchSwitchInst(CatchSwitchInst &I) {
4417    setShadow(&I, getCleanShadow(&I));
4418    setOrigin(&I, getCleanOrigin());
4419  }
4420
4421  void visitFuncletPadInst(FuncletPadInst &I) {
4422    setShadow(&I, getCleanShadow(&I));
4423    setOrigin(&I, getCleanOrigin());
4424  }
4425
4426  void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
4427
4428  void visitExtractValueInst(ExtractValueInst &I) {
4429    IRBuilder<> IRB(&I);
4430    Value *Agg = I.getAggregateOperand();
4431    LLVM_DEBUG(dbgs() << "ExtractValue:  " << I << "\n");
4432    Value *AggShadow = getShadow(Agg);
4433    LLVM_DEBUG(dbgs() << "   AggShadow:  " << *AggShadow << "\n");
4434    Value *ResShadow = IRB.CreateExtractValue(AggShadow, I.getIndices());
4435    LLVM_DEBUG(dbgs() << "   ResShadow:  " << *ResShadow << "\n");
4436    setShadow(&I, ResShadow);
4437    setOriginForNaryOp(I);
4438  }
4439
4440  void visitInsertValueInst(InsertValueInst &I) {
4441    IRBuilder<> IRB(&I);
4442    LLVM_DEBUG(dbgs() << "InsertValue:  " << I << "\n");
4443    Value *AggShadow = getShadow(I.getAggregateOperand());
4444    Value *InsShadow = getShadow(I.getInsertedValueOperand());
4445    LLVM_DEBUG(dbgs() << "   AggShadow:  " << *AggShadow << "\n");
4446    LLVM_DEBUG(dbgs() << "   InsShadow:  " << *InsShadow << "\n");
4447    Value *Res = IRB.CreateInsertValue(AggShadow, InsShadow, I.getIndices());
4448    LLVM_DEBUG(dbgs() << "   Res:        " << *Res << "\n");
4449    setShadow(&I, Res);
4450    setOriginForNaryOp(I);
4451  }
4452
4453  void dumpInst(Instruction &I) {
4454    if (CallInst *CI = dyn_cast<CallInst>(&I)) {
4455      errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
4456    } else {
4457      errs() << "ZZZ " << I.getOpcodeName() << "\n";
4458    }
4459    errs() << "QQQ " << I << "\n";
4460  }
4461
4462  void visitResumeInst(ResumeInst &I) {
4463    LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
4464    // Nothing to do here.
4465  }
4466
4467  void visitCleanupReturnInst(CleanupReturnInst &CRI) {
4468    LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
4469    // Nothing to do here.
4470  }
4471
4472  void visitCatchReturnInst(CatchReturnInst &CRI) {
4473    LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
4474    // Nothing to do here.
4475  }
4476
4477  void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
4478                             IRBuilder<> &IRB, const DataLayout &DL,
4479                             bool isOutput) {
4480    // For each assembly argument, we check its value for being initialized.
4481    // If the argument is a pointer, we assume it points to a single element
4482    // of the corresponding type (or to an 8-byte word, if the type is unsized).
4483    // Each such pointer is instrumented with a call to the runtime library.
4484    Type *OpType = Operand->getType();
4485    // Check the operand value itself.
4486    insertShadowCheck(Operand, &I);
4487    if (!OpType->isPointerTy() || !isOutput) {
4488      assert(!isOutput);
4489      return;
4490    }
4491    if (!ElemTy->isSized())
4492      return;
4493    int Size = DL.getTypeStoreSize(ElemTy);
4494    Value *Ptr = IRB.CreatePointerCast(Operand, IRB.getInt8PtrTy());
4495    Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
4496    IRB.CreateCall(MS.MsanInstrumentAsmStoreFn, {Ptr, SizeVal});
4497  }
4498
4499  /// Get the number of output arguments returned by pointers.
4500  int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
4501    int NumRetOutputs = 0;
4502    int NumOutputs = 0;
4503    Type *RetTy = cast<Value>(CB)->getType();
4504    if (!RetTy->isVoidTy()) {
4505      // Register outputs are returned via the CallInst return value.
4506      auto *ST = dyn_cast<StructType>(RetTy);
4507      if (ST)
4508        NumRetOutputs = ST->getNumElements();
4509      else
4510        NumRetOutputs = 1;
4511    }
4512    InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
4513    for (const InlineAsm::ConstraintInfo &Info : Constraints) {
4514      switch (Info.Type) {
4515      case InlineAsm::isOutput:
4516        NumOutputs++;
4517        break;
4518      default:
4519        break;
4520      }
4521    }
4522    return NumOutputs - NumRetOutputs;
4523  }
4524
4525  void visitAsmInstruction(Instruction &I) {
4526    // Conservative inline assembly handling: check for poisoned shadow of
4527    // asm() arguments, then unpoison the result and all the memory locations
4528    // pointed to by those arguments.
4529    // An inline asm() statement in C++ contains lists of input and output
4530    // arguments used by the assembly code. These are mapped to operands of the
4531    // CallInst as follows:
4532    //  - nR register outputs ("=r") are returned by value in a single structure
4533    //  (SSA value of the CallInst);
4534    //  - nO other outputs ("=m" and others) are returned by pointer as first
4535    // nO operands of the CallInst;
4536    //  - nI inputs ("r", "m" and others) are passed to CallInst as the
4537    // remaining nI operands.
4538    // The total number of asm() arguments in the source is nR+nO+nI, and the
4539    // corresponding CallInst has nO+nI+1 operands (the last operand is the
4540    // function to be called).
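    // For example (illustrative only):
    //   asm("..." : "=r"(a), "=m"(b) : "r"(c));
    // has nR = 1, nO = 1 and nI = 1: the "=r" output is the SSA value of the
    // CallInst, &b is its first operand, c is the second, and the inline asm
    // callee is the last operand.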
4541    const DataLayout &DL = F.getParent()->getDataLayout();
4542    CallBase *CB = cast<CallBase>(&I);
4543    IRBuilder<> IRB(&I);
4544    InlineAsm *IA = cast<InlineAsm>(CB->getCalledOperand());
4545    int OutputArgs = getNumOutputArgs(IA, CB);
4546    // The last operand of a CallInst is the function itself.
4547    int NumOperands = CB->getNumOperands() - 1;
4548
4549    // Check input arguments. Do this before unpoisoning output arguments, so
4550    // that we don't overwrite uninit values before checking them.
4551    for (int i = OutputArgs; i < NumOperands; i++) {
4552      Value *Operand = CB->getOperand(i);
4553      instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
4554                            /*isOutput*/ false);
4555    }
4556    // Unpoison output arguments. This must happen before the actual InlineAsm
4557    // call, so that the shadow for memory published in the asm() statement
4558    // remains valid.
4559    for (int i = 0; i < OutputArgs; i++) {
4560      Value *Operand = CB->getOperand(i);
4561      instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
4562                            /*isOutput*/ true);
4563    }
4564
4565    setShadow(&I, getCleanShadow(&I));
4566    setOrigin(&I, getCleanOrigin());
4567  }
4568
4569  void visitFreezeInst(FreezeInst &I) {
4570    // Freeze always returns a fully defined value.
4571    setShadow(&I, getCleanShadow(&I));
4572    setOrigin(&I, getCleanOrigin());
4573  }
4574
4575  void visitInstruction(Instruction &I) {
4576    // Everything else: stop propagating and check for poisoned shadow.
4577    if (ClDumpStrictInstructions)
4578      dumpInst(I);
4579    LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
4580    for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
4581      Value *Operand = I.getOperand(i);
4582      if (Operand->getType()->isSized())
4583        insertShadowCheck(Operand, &I);
4584    }
4585    setShadow(&I, getCleanShadow(&I));
4586    setOrigin(&I, getCleanOrigin());
4587  }
4588};
4589
4590/// AMD64-specific implementation of VarArgHelper.
4591struct VarArgAMD64Helper : public VarArgHelper {
4592  // An unfortunate workaround for asymmetric lowering of va_arg stuff.
4593  // See a comment in visitCallBase for more details.
4594  static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
4595  static const unsigned AMD64FpEndOffsetSSE = 176;
4596  // If SSE is disabled, fp_offset in va_list is zero.
4597  static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
4598
4599  unsigned AMD64FpEndOffset;
4600  Function &F;
4601  MemorySanitizer &MS;
4602  MemorySanitizerVisitor &MSV;
4603  Value *VAArgTLSCopy = nullptr;
4604  Value *VAArgTLSOriginCopy = nullptr;
4605  Value *VAArgOverflowSize = nullptr;
4606
4607  SmallVector<CallInst *, 16> VAStartInstrumentationList;
4608
4609  enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
4610
4611  VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
4612                    MemorySanitizerVisitor &MSV)
4613      : F(F), MS(MS), MSV(MSV) {
4614    AMD64FpEndOffset = AMD64FpEndOffsetSSE;
4615    for (const auto &Attr : F.getAttributes().getFnAttrs()) {
4616      if (Attr.isStringAttribute() &&
4617          (Attr.getKindAsString() == "target-features")) {
4618        if (Attr.getValueAsString().contains("-sse"))
4619          AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
4620        break;
4621      }
4622    }
4623  }
4624
4625  ArgKind classifyArgument(Value *arg) {
4626    // A very rough approximation of X86_64 argument classification rules.
4627    Type *T = arg->getType();
4628    if (T->isFPOrFPVectorTy() || T->isX86_MMXTy())
4629      return AK_FloatingPoint;
4630    if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
4631      return AK_GeneralPurpose;
4632    if (T->isPointerTy())
4633      return AK_GeneralPurpose;
4634    return AK_Memory;
4635  }
4636
4637  // For VarArg functions, store the argument shadow in an ABI-specific format
4638  // that corresponds to va_list layout.
4639  // We do this because Clang lowers va_arg in the frontend, and this pass
4640  // only sees the low level code that deals with va_list internals.
4641  // A much easier alternative (provided that Clang emits va_arg instructions)
4642  // would have been to associate each live instance of va_list with a copy of
4643  // MSanParamTLS, and extract shadow on va_arg() call in the argument list
4644  // order.
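  // Roughly, the layout used here mirrors the SysV AMD64 register save area:
  // bytes [0, AMD64GpEndOffset) of __msan_va_arg_tls hold the shadow for
  // arguments passed in general-purpose registers (6 x 8 bytes), bytes
  // [AMD64GpEndOffset, AMD64FpEndOffset) hold the shadow for arguments passed
  // in xmm0-xmm7 (8 x 16 bytes when SSE is enabled), and bytes from
  // AMD64FpEndOffset onwards hold the shadow for arguments passed in memory
  // (the overflow area).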
4645  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
4646    unsigned GpOffset = 0;
4647    unsigned FpOffset = AMD64GpEndOffset;
4648    unsigned OverflowOffset = AMD64FpEndOffset;
4649    const DataLayout &DL = F.getParent()->getDataLayout();
4650    for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
4651      bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
4652      bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
4653      if (IsByVal) {
4654        // ByVal arguments always go to the overflow area.
4655        // Fixed arguments passed through the overflow area will be stepped
4656        // over by va_start, so don't count them towards the offset.
4657        if (IsFixed)
4658          continue;
4659        assert(A->getType()->isPointerTy());
4660        Type *RealTy = CB.getParamByValType(ArgNo);
4661        uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
4662        Value *ShadowBase = getShadowPtrForVAArgument(
4663            RealTy, IRB, OverflowOffset, alignTo(ArgSize, 8));
4664        Value *OriginBase = nullptr;
4665        if (MS.TrackOrigins)
4666          OriginBase = getOriginPtrForVAArgument(RealTy, IRB, OverflowOffset);
4667        OverflowOffset += alignTo(ArgSize, 8);
4668        if (!ShadowBase)
4669          continue;
4670        Value *ShadowPtr, *OriginPtr;
4671        std::tie(ShadowPtr, OriginPtr) =
4672            MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), kShadowTLSAlignment,
4673                                   /*isStore*/ false);
4674
4675        IRB.CreateMemCpy(ShadowBase, kShadowTLSAlignment, ShadowPtr,
4676                         kShadowTLSAlignment, ArgSize);
4677        if (MS.TrackOrigins)
4678          IRB.CreateMemCpy(OriginBase, kShadowTLSAlignment, OriginPtr,
4679                           kShadowTLSAlignment, ArgSize);
4680      } else {
4681        ArgKind AK = classifyArgument(A);
4682        if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
4683          AK = AK_Memory;
4684        if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
4685          AK = AK_Memory;
4686        Value *ShadowBase, *OriginBase = nullptr;
4687        switch (AK) {
4688        case AK_GeneralPurpose:
4689          ShadowBase =
4690              getShadowPtrForVAArgument(A->getType(), IRB, GpOffset, 8);
4691          if (MS.TrackOrigins)
4692            OriginBase = getOriginPtrForVAArgument(A->getType(), IRB, GpOffset);
4693          GpOffset += 8;
4694          break;
4695        case AK_FloatingPoint:
4696          ShadowBase =
4697              getShadowPtrForVAArgument(A->getType(), IRB, FpOffset, 16);
4698          if (MS.TrackOrigins)
4699            OriginBase = getOriginPtrForVAArgument(A->getType(), IRB, FpOffset);
4700          FpOffset += 16;
4701          break;
4702        case AK_Memory:
4703          if (IsFixed)
4704            continue;
4705          uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
4706          ShadowBase =
4707              getShadowPtrForVAArgument(A->getType(), IRB, OverflowOffset, 8);
4708          if (MS.TrackOrigins)
4709            OriginBase =
4710                getOriginPtrForVAArgument(A->getType(), IRB, OverflowOffset);
4711          OverflowOffset += alignTo(ArgSize, 8);
4712        }
4713        // Take fixed arguments into account for GpOffset and FpOffset,
4714        // but don't actually store shadows for them.
4715        // TODO(glider): don't call get*PtrForVAArgument() for them.
4716        if (IsFixed)
4717          continue;
4718        if (!ShadowBase)
4719          continue;
4720        Value *Shadow = MSV.getShadow(A);
4721        IRB.CreateAlignedStore(Shadow, ShadowBase, kShadowTLSAlignment);
4722        if (MS.TrackOrigins) {
4723          Value *Origin = MSV.getOrigin(A);
4724          unsigned StoreSize = DL.getTypeStoreSize(Shadow->getType());
4725          MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
4726                          std::max(kShadowTLSAlignment, kMinOriginAlignment));
4727        }
4728      }
4729    }
4730    Constant *OverflowSize =
4731        ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AMD64FpEndOffset);
4732    IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
4733  }
4734
4735  /// Compute the shadow address for a given va_arg.
4736  Value *getShadowPtrForVAArgument(Type *Ty, IRBuilder<> &IRB,
4737                                   unsigned ArgOffset, unsigned ArgSize) {
4738    // Make sure we don't overflow __msan_va_arg_tls.
4739    if (ArgOffset + ArgSize > kParamTLSSize)
4740      return nullptr;
4741    Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
4742    Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
4743    return IRB.CreateIntToPtr(Base, PointerType::get(MSV.getShadowTy(Ty), 0),
4744                              "_msarg_va_s");
4745  }
4746
4747  /// Compute the origin address for a given va_arg.
4748  Value *getOriginPtrForVAArgument(Type *Ty, IRBuilder<> &IRB, int ArgOffset) {
4749    Value *Base = IRB.CreatePointerCast(MS.VAArgOriginTLS, MS.IntptrTy);
4750    // getOriginPtrForVAArgument() is always called after
4751    // getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
4752    // overflow.
4753    Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
4754    return IRB.CreateIntToPtr(Base, PointerType::get(MS.OriginTy, 0),
4755                              "_msarg_va_o");
4756  }
4757
4758  void unpoisonVAListTagForInst(IntrinsicInst &I) {
4759    IRBuilder<> IRB(&I);
4760    Value *VAListTag = I.getArgOperand(0);
4761    Value *ShadowPtr, *OriginPtr;
4762    const Align Alignment = Align(8);
4763    std::tie(ShadowPtr, OriginPtr) =
4764        MSV.getShadowOriginPtr(VAListTag, IRB, IRB.getInt8Ty(), Alignment,
4765                               /*isStore*/ true);
4766
4767    // Unpoison the whole __va_list_tag.
4768    // FIXME: magic ABI constants.
4769    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
4770                     /* size */ 24, Alignment, false);
4771    // We shouldn't need to zero out the origins, as they're only checked for
4772    // nonzero shadow.
4773  }
4774
4775  void visitVAStartInst(VAStartInst &I) override {
4776    if (F.getCallingConv() == CallingConv::Win64)
4777      return;
4778    VAStartInstrumentationList.push_back(&I);
4779    unpoisonVAListTagForInst(I);
4780  }
4781
4782  void visitVACopyInst(VACopyInst &I) override {
4783    if (F.getCallingConv() == CallingConv::Win64)
4784      return;
4785    unpoisonVAListTagForInst(I);
4786  }
4787
4788  void finalizeInstrumentation() override {
4789    assert(!VAArgOverflowSize && !VAArgTLSCopy &&
4790           "finalizeInstrumentation called twice");
4791    if (!VAStartInstrumentationList.empty()) {
4792      // If there is a va_start in this function, make a backup copy of
4793      // va_arg_tls somewhere in the function entry block.
4794      IRBuilder<> IRB(MSV.FnPrologueEnd);
4795      VAArgOverflowSize =
4796          IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
4797      Value *CopySize = IRB.CreateAdd(
4798          ConstantInt::get(MS.IntptrTy, AMD64FpEndOffset), VAArgOverflowSize);
4799      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
4800      IRB.CreateMemCpy(VAArgTLSCopy, Align(8), MS.VAArgTLS, Align(8), CopySize);
4801      if (MS.TrackOrigins) {
4802        VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
4803        IRB.CreateMemCpy(VAArgTLSOriginCopy, Align(8), MS.VAArgOriginTLS,
4804                         Align(8), CopySize);
4805      }
4806    }
4807
4808    // Instrument va_start.
4809    // Copy va_list shadow from the backup copy of the TLS contents.
4810    for (size_t i = 0, n = VAStartInstrumentationList.size(); i < n; i++) {
4811      CallInst *OrigInst = VAStartInstrumentationList[i];
4812      NextNodeIRBuilder IRB(OrigInst);
4813      Value *VAListTag = OrigInst->getArgOperand(0);
4814
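      // For reference (a sketch of the SysV AMD64 va_list layout that the
      // constant offsets below rely on):
      //   struct __va_list_tag {
      //     unsigned int gp_offset;    // offset 0
      //     unsigned int fp_offset;    // offset 4
      //     void *overflow_arg_area;   // offset 8
      //     void *reg_save_area;       // offset 16
      //   };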
4815      Type *RegSaveAreaPtrTy = Type::getInt64PtrTy(*MS.C);
4816      Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
4817          IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
4818                        ConstantInt::get(MS.IntptrTy, 16)),
4819          PointerType::get(RegSaveAreaPtrTy, 0));
4820      Value *RegSaveAreaPtr =
4821          IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
4822      Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
4823      const Align Alignment = Align(16);
4824      std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
4825          MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
4826                                 Alignment, /*isStore*/ true);
4827      IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
4828                       AMD64FpEndOffset);
4829      if (MS.TrackOrigins)
4830        IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
4831                         Alignment, AMD64FpEndOffset);
4832      Type *OverflowArgAreaPtrTy = Type::getInt64PtrTy(*MS.C);
4833      Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
4834          IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
4835                        ConstantInt::get(MS.IntptrTy, 8)),
4836          PointerType::get(OverflowArgAreaPtrTy, 0));
4837      Value *OverflowArgAreaPtr =
4838          IRB.CreateLoad(OverflowArgAreaPtrTy, OverflowArgAreaPtrPtr);
4839      Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
4840      std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
4841          MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
4842                                 Alignment, /*isStore*/ true);
4843      Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
4844                                             AMD64FpEndOffset);
4845      IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
4846                       VAArgOverflowSize);
4847      if (MS.TrackOrigins) {
4848        SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
4849                                        AMD64FpEndOffset);
4850        IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
4851                         VAArgOverflowSize);
4852      }
4853    }
4854  }
4855};
4856
4857/// MIPS64-specific implementation of VarArgHelper.
4858struct VarArgMIPS64Helper : public VarArgHelper {
4859  Function &F;
4860  MemorySanitizer &MS;
4861  MemorySanitizerVisitor &MSV;
4862  Value *VAArgTLSCopy = nullptr;
4863  Value *VAArgSize = nullptr;
4864
4865  SmallVector<CallInst *, 16> VAStartInstrumentationList;
4866
4867  VarArgMIPS64Helper(Function &F, MemorySanitizer &MS,
4868                     MemorySanitizerVisitor &MSV)
4869      : F(F), MS(MS), MSV(MSV) {}
4870
4871  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
4872    unsigned VAArgOffset = 0;
4873    const DataLayout &DL = F.getParent()->getDataLayout();
4874    for (Value *A :
4875         llvm::drop_begin(CB.args(), CB.getFunctionType()->getNumParams())) {
4876      Triple TargetTriple(F.getParent()->getTargetTriple());
4877      Value *Base;
4878      uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
4879      if (TargetTriple.getArch() == Triple::mips64) {
4880        // Adjust the shadow for arguments with size < 8 to match the
4881        // placement of bits in a big-endian system.
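        // For example (illustrative only): an i32 argument occupies the
        // high-addressed 4 bytes of its 8-byte slot on big-endian mips64, so
        // we also skip the first 8 - 4 = 4 bytes of its shadow slot.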
4882        if (ArgSize < 8)
4883          VAArgOffset += (8 - ArgSize);
4884      }
4885      Base = getShadowPtrForVAArgument(A->getType(), IRB, VAArgOffset, ArgSize);
4886      VAArgOffset += ArgSize;
4887      VAArgOffset = alignTo(VAArgOffset, 8);
4888      if (!Base)
4889        continue;
4890      IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
4891    }
4892
4893    Constant *TotalVAArgSize = ConstantInt::get(IRB.getInt64Ty(), VAArgOffset);
4894    // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
4895    // class member; here it holds the total size of all varargs.
4896    IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
4897  }
4898
4899  /// Compute the shadow address for a given va_arg.
4900  Value *getShadowPtrForVAArgument(Type *Ty, IRBuilder<> &IRB,
4901                                   unsigned ArgOffset, unsigned ArgSize) {
4902    // Make sure we don't overflow __msan_va_arg_tls.
4903    if (ArgOffset + ArgSize > kParamTLSSize)
4904      return nullptr;
4905    Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
4906    Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
4907    return IRB.CreateIntToPtr(Base, PointerType::get(MSV.getShadowTy(Ty), 0),
4908                              "_msarg");
4909  }
4910
4911  void visitVAStartInst(VAStartInst &I) override {
4912    IRBuilder<> IRB(&I);
4913    VAStartInstrumentationList.push_back(&I);
4914    Value *VAListTag = I.getArgOperand(0);
4915    Value *ShadowPtr, *OriginPtr;
4916    const Align Alignment = Align(8);
4917    std::tie(ShadowPtr, OriginPtr) = MSV.getShadowOriginPtr(
4918        VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
4919    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
4920                     /* size */ 8, Alignment, false);
4921  }
4922
4923  void visitVACopyInst(VACopyInst &I) override {
4924    IRBuilder<> IRB(&I);
4925    VAStartInstrumentationList.push_back(&I);
4926    Value *VAListTag = I.getArgOperand(0);
4927    Value *ShadowPtr, *OriginPtr;
4928    const Align Alignment = Align(8);
4929    std::tie(ShadowPtr, OriginPtr) = MSV.getShadowOriginPtr(
4930        VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
4931    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
4932                     /* size */ 8, Alignment, false);
4933  }
4934
4935  void finalizeInstrumentation() override {
4936    assert(!VAArgSize && !VAArgTLSCopy &&
4937           "finalizeInstrumentation called twice");
4938    IRBuilder<> IRB(MSV.FnPrologueEnd);
4939    VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
4940    Value *CopySize =
4941        IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, 0), VAArgSize);
4942
4943    if (!VAStartInstrumentationList.empty()) {
4944      // If there is a va_start in this function, make a backup copy of
4945      // va_arg_tls somewhere in the function entry block.
4946      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
4947      IRB.CreateMemCpy(VAArgTLSCopy, Align(8), MS.VAArgTLS, Align(8), CopySize);
4948    }
4949
4950    // Instrument va_start.
4951    // Copy va_list shadow from the backup copy of the TLS contents.
4952    for (size_t i = 0, n = VAStartInstrumentationList.size(); i < n; i++) {
4953      CallInst *OrigInst = VAStartInstrumentationList[i];
4954      NextNodeIRBuilder IRB(OrigInst);
4955      Value *VAListTag = OrigInst->getArgOperand(0);
4956      Type *RegSaveAreaPtrTy = Type::getInt64PtrTy(*MS.C);
4957      Value *RegSaveAreaPtrPtr =
4958          IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
4959                             PointerType::get(RegSaveAreaPtrTy, 0));
4960      Value *RegSaveAreaPtr =
4961          IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
4962      Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
4963      const Align Alignment = Align(8);
4964      std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
4965          MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
4966                                 Alignment, /*isStore*/ true);
4967      IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
4968                       CopySize);
4969    }
4970  }
4971};
4972
4973/// AArch64-specific implementation of VarArgHelper.
4974struct VarArgAArch64Helper : public VarArgHelper {
4975  static const unsigned kAArch64GrArgSize = 64;
4976  static const unsigned kAArch64VrArgSize = 128;
4977
4978  static const unsigned AArch64GrBegOffset = 0;
4979  static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
4980  // Make VR space aligned to 16 bytes.
4981  static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
4982  static const unsigned AArch64VrEndOffset =
4983      AArch64VrBegOffset + kAArch64VrArgSize;
4984  static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
4985
4986  Function &F;
4987  MemorySanitizer &MS;
4988  MemorySanitizerVisitor &MSV;
4989  Value *VAArgTLSCopy = nullptr;
4990  Value *VAArgOverflowSize = nullptr;
4991
4992  SmallVector<CallInst *, 16> VAStartInstrumentationList;
4993
4994  enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
4995
4996  VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
4997                      MemorySanitizerVisitor &MSV)
4998      : F(F), MS(MS), MSV(MSV) {}
4999
5000  ArgKind classifyArgument(Value *arg) {
5001    Type *T = arg->getType();
5002    if (T->isFPOrFPVectorTy())
5003      return AK_FloatingPoint;
5004    if ((T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64) ||
5005        (T->isPointerTy()))
5006      return AK_GeneralPurpose;
5007    return AK_Memory;
5008  }
5009
5010  // The instrumentation stores the argument shadow in a non-ABI-specific
5011  // format because it does not know which arguments are named (Clang, as in
5012  // the x86_64 case, lowers va_arg in the frontend, and this pass only sees
5013  // the low-level code that deals with va_list internals).
5014  // The first eight GR registers are saved in the first 64 bytes of the
5015  // va_arg TLS array, followed by the first eight FP/SIMD registers, and then
5016  // the remaining arguments.
5017  // Using a constant offset within the va_arg TLS array allows fast copying
5018  // in finalizeInstrumentation().
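  // Roughly, the resulting __msan_va_arg_tls layout is: bytes [0, 64) hold
  // the shadow for arguments passed in x0-x7 (8 x 8 bytes), bytes [64, 192)
  // hold the shadow for arguments passed in q0-q7 (8 x 16 bytes), and bytes
  // from AArch64VAEndOffset onwards hold the shadow for arguments passed on
  // the stack.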
5019  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
5020    unsigned GrOffset = AArch64GrBegOffset;
5021    unsigned VrOffset = AArch64VrBegOffset;
5022    unsigned OverflowOffset = AArch64VAEndOffset;
5023
5024    const DataLayout &DL = F.getParent()->getDataLayout();
5025    for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
5026      bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
5027      ArgKind AK = classifyArgument(A);
5028      if (AK == AK_GeneralPurpose && GrOffset >= AArch64GrEndOffset)
5029        AK = AK_Memory;
5030      if (AK == AK_FloatingPoint && VrOffset >= AArch64VrEndOffset)
5031        AK = AK_Memory;
5032      Value *Base;
5033      switch (AK) {
5034      case AK_GeneralPurpose:
5035        Base = getShadowPtrForVAArgument(A->getType(), IRB, GrOffset, 8);
5036        GrOffset += 8;
5037        break;
5038      case AK_FloatingPoint:
5039        Base = getShadowPtrForVAArgument(A->getType(), IRB, VrOffset, 8);
5040        VrOffset += 16;
5041        break;
5042      case AK_Memory:
5043        // Don't count fixed arguments in the overflow area - va_start will
5044        // skip right over them.
5045        if (IsFixed)
5046          continue;
5047        uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
5048        Base = getShadowPtrForVAArgument(A->getType(), IRB, OverflowOffset,
5049                                         alignTo(ArgSize, 8));
5050        OverflowOffset += alignTo(ArgSize, 8);
5051        break;
5052      }
5053      // Count Gp/Vr fixed arguments to their respective offsets, but don't
5054      // bother to actually store a shadow.
5055      if (IsFixed)
5056        continue;
5057      if (!Base)
5058        continue;
5059      IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
5060    }
5061    Constant *OverflowSize =
5062        ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AArch64VAEndOffset);
5063    IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
5064  }
5065
5066  /// Compute the shadow address for a given va_arg.
5067  Value *getShadowPtrForVAArgument(Type *Ty, IRBuilder<> &IRB,
5068                                   unsigned ArgOffset, unsigned ArgSize) {
5069    // Make sure we don't overflow __msan_va_arg_tls.
5070    if (ArgOffset + ArgSize > kParamTLSSize)
5071      return nullptr;
5072    Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
5073    Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
5074    return IRB.CreateIntToPtr(Base, PointerType::get(MSV.getShadowTy(Ty), 0),
5075                              "_msarg");
5076  }
5077
5078  void visitVAStartInst(VAStartInst &I) override {
5079    IRBuilder<> IRB(&I);
5080    VAStartInstrumentationList.push_back(&I);
5081    Value *VAListTag = I.getArgOperand(0);
5082    Value *ShadowPtr, *OriginPtr;
5083    const Align Alignment = Align(8);
5084    std::tie(ShadowPtr, OriginPtr) = MSV.getShadowOriginPtr(
5085        VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
5086    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
5087                     /* size */ 32, Alignment, false);
5088  }
5089
5090  void visitVACopyInst(VACopyInst &I) override {
5091    IRBuilder<> IRB(&I);
5092    VAStartInstrumentationList.push_back(&I);
5093    Value *VAListTag = I.getArgOperand(0);
5094    Value *ShadowPtr, *OriginPtr;
5095    const Align Alignment = Align(8);
5096    std::tie(ShadowPtr, OriginPtr) = MSV.getShadowOriginPtr(
5097        VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
5098    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
5099                     /* size */ 32, Alignment, false);
5100  }
5101
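  // For reference (a sketch of the AAPCS64 va_list layout that getVAField64()
  // and getVAField32() read from):
  //   struct __va_list {
  //     void *__stack;   // offset 0
  //     void *__gr_top;  // offset 8
  //     void *__vr_top;  // offset 16
  //     int __gr_offs;   // offset 24
  //     int __vr_offs;   // offset 28
  //   };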
5102  // Retrieve a va_list field of 'void*' size.
5103  Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
5104    Value *SaveAreaPtrPtr = IRB.CreateIntToPtr(
5105        IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
5106                      ConstantInt::get(MS.IntptrTy, offset)),
5107        Type::getInt64PtrTy(*MS.C));
5108    return IRB.CreateLoad(Type::getInt64Ty(*MS.C), SaveAreaPtrPtr);
5109  }
5110
5111  // Retrieve a va_list field of 'int' size.
5112  Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
5113    Value *SaveAreaPtr = IRB.CreateIntToPtr(
5114        IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
5115                      ConstantInt::get(MS.IntptrTy, offset)),
5116        Type::getInt32PtrTy(*MS.C));
5117    Value *SaveArea32 = IRB.CreateLoad(IRB.getInt32Ty(), SaveAreaPtr);
5118    return IRB.CreateSExt(SaveArea32, MS.IntptrTy);
5119  }
5120
5121  void finalizeInstrumentation() override {
5122    assert(!VAArgOverflowSize && !VAArgTLSCopy &&
5123           "finalizeInstrumentation called twice");
5124    if (!VAStartInstrumentationList.empty()) {
5125      // If there is a va_start in this function, make a backup copy of
5126      // va_arg_tls somewhere in the function entry block.
5127      IRBuilder<> IRB(MSV.FnPrologueEnd);
5128      VAArgOverflowSize =
5129          IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
5130      Value *CopySize = IRB.CreateAdd(
5131          ConstantInt::get(MS.IntptrTy, AArch64VAEndOffset), VAArgOverflowSize);
5132      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
5133      IRB.CreateMemCpy(VAArgTLSCopy, Align(8), MS.VAArgTLS, Align(8), CopySize);
5134    }
5135
5136    Value *GrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64GrArgSize);
5137    Value *VrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64VrArgSize);
5138
5139    // Instrument va_start, copy va_list shadow from the backup copy of
5140    // the TLS contents.
5141    for (size_t i = 0, n = VAStartInstrumentationList.size(); i < n; i++) {
5142      CallInst *OrigInst = VAStartInstrumentationList[i];
5143      NextNodeIRBuilder IRB(OrigInst);
5144
5145      Value *VAListTag = OrigInst->getArgOperand(0);
5146
5147      // The variadic ABI for AArch64 creates two areas to save the incoming
5148      // argument registers (one for the 64-bit general registers x0-x7 and
5149      // another for the 128-bit FP/SIMD registers v0-v7).
5150      // We then need to propagate the shadow arguments to both regions,
5151      // 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs'.
5152      // The remaining arguments get their shadow from 'va::__stack'.
5153      // One caveat is that only the unnamed (variadic) arguments need to be
5154      // propagated, but the call-site instrumentation saved shadow for all
5155      // the arguments. So to copy the shadow values from the va_arg TLS array
5156      // we need to adjust the offsets for both the GR and VR regions based on
5157      // the __{gr,vr}_offs values (which are set according to the incoming
5158      // named arguments).
5159
5160      // Read the stack pointer from the va_list.
5161      Value *StackSaveAreaPtr = getVAField64(IRB, VAListTag, 0);
5162
5163      // Read both the __gr_top and __gr_off and add them up.
5164      Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 8);
5165      Value *GrOffSaveArea = getVAField32(IRB, VAListTag, 24);
5166
5167      Value *GrRegSaveAreaPtr = IRB.CreateAdd(GrTopSaveAreaPtr, GrOffSaveArea);
5168
5169      // Read both the __vr_top and __vr_off and add them up.
5170      Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 16);
5171      Value *VrOffSaveArea = getVAField32(IRB, VAListTag, 28);
5172
5173      Value *VrRegSaveAreaPtr = IRB.CreateAdd(VrTopSaveAreaPtr, VrOffSaveArea);
5174
5175      // We do not know how many named arguments were used and, at the call
5176      // site, all the arguments were saved.  Since __gr_offs is defined as
5177      // '0 - ((8 - named_gr) * 8)', the idea is to propagate only the variadic
5178      // arguments by skipping the bytes of shadow from the named arguments.
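      // For example (illustrative only): with 3 named GR arguments,
      // __gr_offs == 0 - ((8 - 3) * 8) == -40, so GrRegSaveAreaShadowPtrOff
      // below becomes 64 + (-40) == 24, i.e. we skip the 3 x 8 bytes of
      // shadow that belong to the named arguments.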
5179      Value *GrRegSaveAreaShadowPtrOff =
5180          IRB.CreateAdd(GrArgSize, GrOffSaveArea);
5181
5182      Value *GrRegSaveAreaShadowPtr =
5183          MSV.getShadowOriginPtr(GrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
5184                                 Align(8), /*isStore*/ true)
5185              .first;
5186
5187      Value *GrSrcPtr = IRB.CreateInBoundsGEP(IRB.getInt8Ty(), VAArgTLSCopy,
5188                                              GrRegSaveAreaShadowPtrOff);
5189      Value *GrCopySize = IRB.CreateSub(GrArgSize, GrRegSaveAreaShadowPtrOff);
5190
5191      IRB.CreateMemCpy(GrRegSaveAreaShadowPtr, Align(8), GrSrcPtr, Align(8),
5192                       GrCopySize);
5193
5194      // Again, but for FP/SIMD values.
5195      Value *VrRegSaveAreaShadowPtrOff =
5196          IRB.CreateAdd(VrArgSize, VrOffSaveArea);
5197
5198      Value *VrRegSaveAreaShadowPtr =
5199          MSV.getShadowOriginPtr(VrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
5200                                 Align(8), /*isStore*/ true)
5201              .first;
5202
5203      Value *VrSrcPtr = IRB.CreateInBoundsGEP(
5204          IRB.getInt8Ty(),
5205          IRB.CreateInBoundsGEP(IRB.getInt8Ty(), VAArgTLSCopy,
5206                                IRB.getInt32(AArch64VrBegOffset)),
5207          VrRegSaveAreaShadowPtrOff);
5208      Value *VrCopySize = IRB.CreateSub(VrArgSize, VrRegSaveAreaShadowPtrOff);
5209
5210      IRB.CreateMemCpy(VrRegSaveAreaShadowPtr, Align(8), VrSrcPtr, Align(8),
5211                       VrCopySize);
5212
5213      // And finally for remaining arguments.
5214      Value *StackSaveAreaShadowPtr =
5215          MSV.getShadowOriginPtr(StackSaveAreaPtr, IRB, IRB.getInt8Ty(),
5216                                 Align(16), /*isStore*/ true)
5217              .first;
5218
5219      Value *StackSrcPtr = IRB.CreateInBoundsGEP(
5220          IRB.getInt8Ty(), VAArgTLSCopy, IRB.getInt32(AArch64VAEndOffset));
5221
5222      IRB.CreateMemCpy(StackSaveAreaShadowPtr, Align(16), StackSrcPtr,
5223                       Align(16), VAArgOverflowSize);
5224    }
5225  }
5226};
5227
5228/// PowerPC64-specific implementation of VarArgHelper.
5229struct VarArgPowerPC64Helper : public VarArgHelper {
5230  Function &F;
5231  MemorySanitizer &MS;
5232  MemorySanitizerVisitor &MSV;
5233  Value *VAArgTLSCopy = nullptr;
5234  Value *VAArgSize = nullptr;
5235
5236  SmallVector<CallInst *, 16> VAStartInstrumentationList;
5237
5238  VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
5239                        MemorySanitizerVisitor &MSV)
5240      : F(F), MS(MS), MSV(MSV) {}
5241
5242  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
5243    // For PowerPC, we need to deal with the alignment of stack arguments -
5244    // they are mostly aligned to 8 bytes, but vectors and i128 arrays
5245    // are aligned to 16 bytes and byvals can be aligned to 8 or 16 bytes.
5246    // For that reason, we compute the current offset from the stack pointer
5247    // (which is always properly aligned) and the offset for the first vararg,
5248    // then subtract them.
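    // For example (illustrative only): with the ABIv2 base of 32, an i64
    // vararg occupies offsets [32, 40) and a following <4 x i32> vararg is
    // placed at offset 48 (rounded up to its natural 16-byte alignment), i.e.
    // its shadow lands at offset 48 - 32 = 16 within __msan_va_arg_tls.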
5249    unsigned VAArgBase;
5250    Triple TargetTriple(F.getParent()->getTargetTriple());
5251    // The parameter save area starts at 48 bytes from the frame pointer for
5252    // ABIv1, and at 32 bytes for ABIv2.  This is usually determined by target
5253    // endianness, but in theory could be overridden by a function attribute.
5254    if (TargetTriple.getArch() == Triple::ppc64)
5255      VAArgBase = 48;
5256    else
5257      VAArgBase = 32;
5258    unsigned VAArgOffset = VAArgBase;
5259    const DataLayout &DL = F.getParent()->getDataLayout();
5260    for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
5261      bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
5262      bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
5263      if (IsByVal) {
5264        assert(A->getType()->isPointerTy());
5265        Type *RealTy = CB.getParamByValType(ArgNo);
5266        uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
5267        Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(8));
5268        if (ArgAlign < 8)
5269          ArgAlign = Align(8);
5270        VAArgOffset = alignTo(VAArgOffset, ArgAlign);
5271        if (!IsFixed) {
5272          Value *Base = getShadowPtrForVAArgument(
5273              RealTy, IRB, VAArgOffset - VAArgBase, ArgSize);
5274          if (Base) {
5275            Value *AShadowPtr, *AOriginPtr;
5276            std::tie(AShadowPtr, AOriginPtr) =
5277                MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
5278                                       kShadowTLSAlignment, /*isStore*/ false);
5279
5280            IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
5281                             kShadowTLSAlignment, ArgSize);
5282          }
5283        }
5284        VAArgOffset += alignTo(ArgSize, Align(8));
5285      } else {
5286        Value *Base;
5287        uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
5288        Align ArgAlign = Align(8);
5289        if (A->getType()->isArrayTy()) {
5290          // Arrays are aligned to element size, except for long double
5291          // arrays, which are aligned to 8 bytes.
5292          Type *ElementTy = A->getType()->getArrayElementType();
5293          if (!ElementTy->isPPC_FP128Ty())
5294            ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
5295        } else if (A->getType()->isVectorTy()) {
5296          // Vectors are naturally aligned.
5297          ArgAlign = Align(ArgSize);
5298        }
5299        if (ArgAlign < 8)
5300          ArgAlign = Align(8);
5301        VAArgOffset = alignTo(VAArgOffset, ArgAlign);
5302        if (DL.isBigEndian()) {
5303          // Adjust the shadow for arguments with size < 8 to match the
5304          // placement of bits in a big-endian system.
5305          if (ArgSize < 8)
5306            VAArgOffset += (8 - ArgSize);
5307        }
5308        if (!IsFixed) {
5309          Base = getShadowPtrForVAArgument(A->getType(), IRB,
5310                                           VAArgOffset - VAArgBase, ArgSize);
5311          if (Base)
5312            IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
5313        }
5314        VAArgOffset += ArgSize;
5315        VAArgOffset = alignTo(VAArgOffset, Align(8));
5316      }
5317      if (IsFixed)
5318        VAArgBase = VAArgOffset;
5319    }
5320
5321    Constant *TotalVAArgSize =
5322        ConstantInt::get(IRB.getInt64Ty(), VAArgOffset - VAArgBase);
5323    // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
5324    // class member; here it holds the total size of all varargs.
    IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
  }

  /// Compute the shadow address for a given va_arg.
  Value *getShadowPtrForVAArgument(Type *Ty, IRBuilder<> &IRB,
                                   unsigned ArgOffset, unsigned ArgSize) {
    // Make sure we don't overflow __msan_va_arg_tls.
    if (ArgOffset + ArgSize > kParamTLSSize)
      return nullptr;
    Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
    Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
    return IRB.CreateIntToPtr(Base, PointerType::get(MSV.getShadowTy(Ty), 0),
                              "_msarg");
  }

  void visitVAStartInst(VAStartInst &I) override {
    IRBuilder<> IRB(&I);
    VAStartInstrumentationList.push_back(&I);
    Value *VAListTag = I.getArgOperand(0);
    Value *ShadowPtr, *OriginPtr;
    const Align Alignment = Align(8);
    std::tie(ShadowPtr, OriginPtr) = MSV.getShadowOriginPtr(
        VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
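    // Unpoison the whole __va_list_tag (expected to be a single 8-byte
    // pointer on PPC64, matching the 8-byte memset below).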
    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
                     /* size */ 8, Alignment, false);
  }

  void visitVACopyInst(VACopyInst &I) override {
    IRBuilder<> IRB(&I);
    Value *VAListTag = I.getArgOperand(0);
    Value *ShadowPtr, *OriginPtr;
    const Align Alignment = Align(8);
    std::tie(ShadowPtr, OriginPtr) = MSV.getShadowOriginPtr(
        VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
    // Unpoison the whole __va_list_tag.
    // FIXME: magic ABI constants.
    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
                     /* size */ 8, Alignment, false);
  }

  void finalizeInstrumentation() override {
    assert(!VAArgSize && !VAArgTLSCopy &&
           "finalizeInstrumentation called twice");
    IRBuilder<> IRB(MSV.FnPrologueEnd);
    VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
    Value *CopySize =
        IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, 0), VAArgSize);

    if (!VAStartInstrumentationList.empty()) {
      // If there is a va_start in this function, make a backup copy of
      // va_arg_tls somewhere in the function entry block.
      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
      IRB.CreateMemCpy(VAArgTLSCopy, Align(8), MS.VAArgTLS, Align(8), CopySize);
    }

    // Instrument va_start.
    // Copy va_list shadow from the backup copy of the TLS contents.
    for (size_t i = 0, n = VAStartInstrumentationList.size(); i < n; i++) {
      CallInst *OrigInst = VAStartInstrumentationList[i];
      NextNodeIRBuilder IRB(OrigInst);
      Value *VAListTag = OrigInst->getArgOperand(0);
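      // The PPC64 va_list is assumed to be a single pointer to the argument
      // save area; load it and copy the backed-up vararg shadow over that
      // area's shadow.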
      Type *RegSaveAreaPtrTy = Type::getInt64PtrTy(*MS.C);
      Value *RegSaveAreaPtrPtr =
          IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
                             PointerType::get(RegSaveAreaPtrTy, 0));
      Value *RegSaveAreaPtr =
          IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
      Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
      const Align Alignment = Align(8);
      std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
          MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
                                 Alignment, /*isStore*/ true);
      IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
                       CopySize);
    }
  }
};

/// SystemZ-specific implementation of VarArgHelper.
struct VarArgSystemZHelper : public VarArgHelper {
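  // Offsets and sizes (in bytes) of the register save area and __va_list_tag
  // fields used below, as laid out by the SystemZ ELF ABI.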
  static const unsigned SystemZGpOffset = 16;
  static const unsigned SystemZGpEndOffset = 56;
  static const unsigned SystemZFpOffset = 128;
  static const unsigned SystemZFpEndOffset = 160;
  static const unsigned SystemZMaxVrArgs = 8;
  static const unsigned SystemZRegSaveAreaSize = 160;
  static const unsigned SystemZOverflowOffset = 160;
  static const unsigned SystemZVAListTagSize = 32;
  static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
  static const unsigned SystemZRegSaveAreaPtrOffset = 24;

  Function &F;
  MemorySanitizer &MS;
  MemorySanitizerVisitor &MSV;
  Value *VAArgTLSCopy = nullptr;
  Value *VAArgTLSOriginCopy = nullptr;
  Value *VAArgOverflowSize = nullptr;

  SmallVector<CallInst *, 16> VAStartInstrumentationList;

  enum class ArgKind {
    GeneralPurpose,
    FloatingPoint,
    Vector,
    Memory,
    Indirect,
  };

  enum class ShadowExtension { None, Zero, Sign };

  VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
                      MemorySanitizerVisitor &MSV)
      : F(F), MS(MS), MSV(MSV) {}

  ArgKind classifyArgument(Type *T, bool IsSoftFloatABI) {
    // T is a SystemZABIInfo::classifyArgumentType() output, and there are
    // only a few possibilities of what it can be. In particular, enums, single
    // element structs and large types have already been taken care of.

    // Some i128 and fp128 arguments are converted to pointers only in the
    // back end.
    if (T->isIntegerTy(128) || T->isFP128Ty())
      return ArgKind::Indirect;
    if (T->isFloatingPointTy())
      return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
    if (T->isIntegerTy() || T->isPointerTy())
      return ArgKind::GeneralPurpose;
    if (T->isVectorTy())
      return ArgKind::Vector;
    return ArgKind::Memory;
  }

  ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
    // ABI says: "One of the simple integer types no more than 64 bits wide.
    // ... If such an argument is shorter than 64 bits, replace it by a full
    // 64-bit integer representing the same number, using sign or zero
    // extension". Shadow for an integer argument has the same type as the
    // argument itself, so it can be sign or zero extended as well.
    bool ZExt = CB.paramHasAttr(ArgNo, Attribute::ZExt);
    bool SExt = CB.paramHasAttr(ArgNo, Attribute::SExt);
    if (ZExt) {
      assert(!SExt);
      return ShadowExtension::Zero;
    }
    if (SExt) {
      assert(!ZExt);
      return ShadowExtension::Sign;
    }
    return ShadowExtension::None;
  }

  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
    bool IsSoftFloatABI = CB.getCalledFunction()
                              ->getFnAttribute("use-soft-float")
                              .getValueAsBool();
    unsigned GpOffset = SystemZGpOffset;
    unsigned FpOffset = SystemZFpOffset;
    unsigned VrIndex = 0;
    unsigned OverflowOffset = SystemZOverflowOffset;
    const DataLayout &DL = F.getParent()->getDataLayout();
    for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
      bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
      // SystemZABIInfo does not produce ByVal parameters.
      assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
      Type *T = A->getType();
      ArgKind AK = classifyArgument(T, IsSoftFloatABI);
      if (AK == ArgKind::Indirect) {
        T = PointerType::get(T, 0);
        AK = ArgKind::GeneralPurpose;
      }
      if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
        AK = ArgKind::Memory;
      if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
        AK = ArgKind::Memory;
      if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
        AK = ArgKind::Memory;
      Value *ShadowBase = nullptr;
      Value *OriginBase = nullptr;
      ShadowExtension SE = ShadowExtension::None;
      switch (AK) {
      case ArgKind::GeneralPurpose: {
        // Always keep track of GpOffset, but store shadow only for varargs.
        uint64_t ArgSize = 8;
        if (GpOffset + ArgSize <= kParamTLSSize) {
          if (!IsFixed) {
            SE = getShadowExtension(CB, ArgNo);
            uint64_t GapSize = 0;
            if (SE == ShadowExtension::None) {
              uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
              assert(ArgAllocSize <= ArgSize);
              GapSize = ArgSize - ArgAllocSize;
            }
            ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
            if (MS.TrackOrigins)
              OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
          }
          GpOffset += ArgSize;
        } else {
          GpOffset = kParamTLSSize;
        }
        break;
      }
      case ArgKind::FloatingPoint: {
        // Always keep track of FpOffset, but store shadow only for varargs.
        uint64_t ArgSize = 8;
        if (FpOffset + ArgSize <= kParamTLSSize) {
          if (!IsFixed) {
            // PoP says: "A short floating-point datum requires only the
            // left-most 32 bit positions of a floating-point register".
            // Therefore, in contrast to ArgKind::GeneralPurpose and
            // ArgKind::Memory, don't extend shadow and don't mind the gap.
            ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
            if (MS.TrackOrigins)
              OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
          }
          FpOffset += ArgSize;
        } else {
          FpOffset = kParamTLSSize;
        }
        break;
      }
      case ArgKind::Vector: {
        // Keep track of VrIndex. No need to store shadow, since vector varargs
        // go through ArgKind::Memory.
        assert(IsFixed);
        VrIndex++;
        break;
      }
      case ArgKind::Memory: {
        // Keep track of OverflowOffset and store shadow only for varargs.
        // Ignore fixed args, since we need to copy only the vararg portion of
        // the overflow area shadow.
        if (!IsFixed) {
          uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
          uint64_t ArgSize = alignTo(ArgAllocSize, 8);
          if (OverflowOffset + ArgSize <= kParamTLSSize) {
            SE = getShadowExtension(CB, ArgNo);
            uint64_t GapSize =
                SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
            ShadowBase =
                getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
            if (MS.TrackOrigins)
              OriginBase =
                  getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
            OverflowOffset += ArgSize;
          } else {
            OverflowOffset = kParamTLSSize;
          }
        }
        break;
      }
      case ArgKind::Indirect:
        llvm_unreachable("Indirect must be converted to GeneralPurpose");
      }
      if (ShadowBase == nullptr)
        continue;
      Value *Shadow = MSV.getShadow(A);
      if (SE != ShadowExtension::None)
        Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
                                      /*Signed*/ SE == ShadowExtension::Sign);
      ShadowBase = IRB.CreateIntToPtr(
          ShadowBase, PointerType::get(Shadow->getType(), 0), "_msarg_va_s");
      IRB.CreateStore(Shadow, ShadowBase);
      if (MS.TrackOrigins) {
        Value *Origin = MSV.getOrigin(A);
        unsigned StoreSize = DL.getTypeStoreSize(Shadow->getType());
        MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
                        kMinOriginAlignment);
      }
    }
    Constant *OverflowSize = ConstantInt::get(
        IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
    IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
  }

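  /// Compute the shadow address for a given offset into __msan_va_arg_tls.
  /// The result is an integer; visitCallBase() casts it to a shadow pointer.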
  Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
    Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
    return IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
  }

  Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
    Value *Base = IRB.CreatePointerCast(MS.VAArgOriginTLS, MS.IntptrTy);
    Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
    return IRB.CreateIntToPtr(Base, PointerType::get(MS.OriginTy, 0),
                              "_msarg_va_o");
  }

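  /// va_start() and va_copy() fill the __va_list_tag outside of instrumented
  /// code, so mark its contents as fully initialized.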
  void unpoisonVAListTagForInst(IntrinsicInst &I) {
    IRBuilder<> IRB(&I);
    Value *VAListTag = I.getArgOperand(0);
    Value *ShadowPtr, *OriginPtr;
    const Align Alignment = Align(8);
    std::tie(ShadowPtr, OriginPtr) =
        MSV.getShadowOriginPtr(VAListTag, IRB, IRB.getInt8Ty(), Alignment,
                               /*isStore*/ true);
    IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
                     SystemZVAListTagSize, Alignment, false);
  }

  void visitVAStartInst(VAStartInst &I) override {
    VAStartInstrumentationList.push_back(&I);
    unpoisonVAListTagForInst(I);
  }

  void visitVACopyInst(VACopyInst &I) override { unpoisonVAListTagForInst(I); }

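  /// Copy the vararg shadow (and origins, if tracked) from the backup of
  /// __msan_va_arg_tls over the shadow of the register save area that the
  /// va_list points to.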
  void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
    Type *RegSaveAreaPtrTy = Type::getInt64PtrTy(*MS.C);
    Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
        IRB.CreateAdd(
            IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
            ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
        PointerType::get(RegSaveAreaPtrTy, 0));
    Value *RegSaveAreaPtr = IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
    Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
    const Align Alignment = Align(8);
    std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
        MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
                               /*isStore*/ true);
    // TODO(iii): copy only fragments filled by visitCallBase()
    IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
                     SystemZRegSaveAreaSize);
    if (MS.TrackOrigins)
      IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
                       Alignment, SystemZRegSaveAreaSize);
  }

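  /// Copy the shadow (and origins, if tracked) of the overflow argument area
  /// from the backup of __msan_va_arg_tls into the overflow area that the
  /// va_list points to.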
  void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
    Type *OverflowArgAreaPtrTy = Type::getInt64PtrTy(*MS.C);
    Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
        IRB.CreateAdd(
            IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
            ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
        PointerType::get(OverflowArgAreaPtrTy, 0));
    Value *OverflowArgAreaPtr =
        IRB.CreateLoad(OverflowArgAreaPtrTy, OverflowArgAreaPtrPtr);
    Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
    const Align Alignment = Align(8);
    std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
        MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
                               Alignment, /*isStore*/ true);
    Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
                                           SystemZOverflowOffset);
    IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
                     VAArgOverflowSize);
    if (MS.TrackOrigins) {
      SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
                                      SystemZOverflowOffset);
      IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
                       VAArgOverflowSize);
    }
  }

  void finalizeInstrumentation() override {
    assert(!VAArgOverflowSize && !VAArgTLSCopy &&
           "finalizeInstrumentation called twice");
    if (!VAStartInstrumentationList.empty()) {
      // If there is a va_start in this function, make a backup copy of
      // va_arg_tls somewhere in the function entry block.
      IRBuilder<> IRB(MSV.FnPrologueEnd);
      VAArgOverflowSize =
          IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
      Value *CopySize =
          IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
                        VAArgOverflowSize);
      VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
      IRB.CreateMemCpy(VAArgTLSCopy, Align(8), MS.VAArgTLS, Align(8), CopySize);
      if (MS.TrackOrigins) {
        VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
        IRB.CreateMemCpy(VAArgTLSOriginCopy, Align(8), MS.VAArgOriginTLS,
                         Align(8), CopySize);
      }
    }

    // Instrument va_start.
    // Copy va_list shadow from the backup copy of the TLS contents.
    for (size_t VaStartNo = 0, VaStartNum = VAStartInstrumentationList.size();
         VaStartNo < VaStartNum; VaStartNo++) {
      CallInst *OrigInst = VAStartInstrumentationList[VaStartNo];
      NextNodeIRBuilder IRB(OrigInst);
      Value *VAListTag = OrigInst->getArgOperand(0);
      copyRegSaveArea(IRB, VAListTag);
      copyOverflowArea(IRB, VAListTag);
    }
  }
};

/// A no-op implementation of VarArgHelper.
struct VarArgNoOpHelper : public VarArgHelper {
  VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
                   MemorySanitizerVisitor &MSV) {}

  void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}

  void visitVAStartInst(VAStartInst &I) override {}

  void visitVACopyInst(VACopyInst &I) override {}

  void finalizeInstrumentation() override {}
};

} // end anonymous namespace

static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
                                        MemorySanitizerVisitor &Visitor) {
  // VarArg handling is only implemented for the targets handled below. On the
  // remaining targets varargs are not instrumented, so false positives are
  // possible.
  Triple TargetTriple(Func.getParent()->getTargetTriple());
  if (TargetTriple.getArch() == Triple::x86_64)
    return new VarArgAMD64Helper(Func, Msan, Visitor);
  else if (TargetTriple.isMIPS64())
    return new VarArgMIPS64Helper(Func, Msan, Visitor);
  else if (TargetTriple.getArch() == Triple::aarch64)
    return new VarArgAArch64Helper(Func, Msan, Visitor);
  else if (TargetTriple.getArch() == Triple::ppc64 ||
           TargetTriple.getArch() == Triple::ppc64le)
    return new VarArgPowerPC64Helper(Func, Msan, Visitor);
  else if (TargetTriple.getArch() == Triple::systemz)
    return new VarArgSystemZHelper(Func, Msan, Visitor);
  else
    return new VarArgNoOpHelper(Func, Msan, Visitor);
}

bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
  if (!CompileKernel && F.getName() == kMsanModuleCtorName)
    return false;

  if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
    return false;

  MemorySanitizerVisitor Visitor(F, *this, TLI);

  // Clear out memory attributes: the instrumented function reads and writes
  // shadow and origin memory, so attributes like memory(none) or speculatable
  // would no longer be accurate.
  AttributeMask B;
  B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
  F.removeFnAttrs(B);

  return Visitor.runOnFunction();
}