path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Basic MTE stack tagging instrumentation.Evgeniy Stepanov2019-07-174-0/+351
| | | | | | | | | | | | | | | | Summary: Use MTE intrinsics to tag stack variables in functions with sanitize_memtag attribute. Reviewers: pcc, vitalybuka, hctim, ostannard Subscribers: srhines, mgorny, javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64173 llvm-svn: 366361
* Basic codegen for MTE stack tagging.Evgeniy Stepanov2019-07-1714-8/+372
| | | | | | | | | | | | Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now. Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register. Differential Revision: https://reviews.llvm.org/D64172 llvm-svn: 366360
* Revert [AArch64] Add support for Transactional Memory Extension (TME)Momchil Velikov2019-07-174-77/+12
| | | | | | This reverts r366322 (git commit 4b8da3a503e434ddbc08ecf66582475765f449bc) llvm-svn: 366355
* [AMDGPU] Tune inlining parameters for AMDGPU targetDaniil Fukalov2019-07-174-12/+9
| | | | | | | | | | | | | | | | | | | Summary: Since the target gains no significant advantage from vectorization, the vector instructions threshold bonus should be optional. The default value of the amdgpu-inline-arg-alloca-cost parameter and the target's InliningThresholdMultiplier value are tuned accordingly. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64642 llvm-svn: 366348
* [ORC] Add deprecation warnings to ORCv1 layers and utilities.Lang Hames2019-07-174-27/+41
| | | | | | | | | | | | | | | | | Summary: ORCv1 is deprecated. The current aim is to remove it before the LLVM 10.0 release. This patch adds deprecation attributes to the ORCv1 layers and utilities to warn clients of the change. Reviewers: dblaikie, sgraenitz, AlexDenisov Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64609 llvm-svn: 366344
* AMDGPU: Use getTargetConstantMatt Arsenault2019-07-171-2/+2
| | | | | | Avoids creating an extra intermediate mov. llvm-svn: 366340
* [Attributor] Deduce "willreturn" function attributeHideto Ueno2019-07-171-0/+120
| | | | | | | | | | | | | | | | | Summary: Deduce the "willreturn" attribute for functions. For now, intrinsics are not willreturn. More annotation will be done in another patch. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63046 llvm-svn: 366335
* [AsmPrinter] Make the encoding of call sites in .gcc_except_table ↵Alex Bradbury2019-07-174-6/+29
| | | | | | | | | | | | | | | | | | | configurable and use for RISC-V The original behavior was to always emit the offsets to each call site in the call site table as uleb128 values, however on some architectures (e.g. RISCV) these uleb128 offsets into the code cannot always be resolved until link time (because relaxation will invalidate any calculated offsets), and there are no appropriate relocations for uleb128 values. As a consequence it needs to be possible to specify an alternative. This also switches RISCV to use DW_EH_PE_udata4 for call site encodings in .gcc_except_table Differential Revision: https://reviews.llvm.org/D63415 Patch by Edward Jones. llvm-svn: 366329
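The difference between the two encodings mentioned above can be sketched in a few lines of Python. This is an illustration only, not code from the patch: the helper names `encode_uleb128` and `encode_udata4` are hypothetical, and little-endian byte order is an assumption. It shows that a ULEB128 value changes size with its magnitude, whereas a udata4 value is always four bytes, which is one way to see why a fixed-width field is easier to patch after relaxation moves code around.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def encode_udata4(value: int) -> bytes:
    """Encode an unsigned value in a fixed 4-byte field (little-endian assumed)."""
    return value.to_bytes(4, "little")


# An offset that shrinks from 128 to 127 changes the ULEB128 length,
# but the udata4 length stays at four bytes.
for offset in (127, 128):
    print(offset, len(encode_uleb128(offset)), len(encode_udata4(offset)))
```

With a variable-length field, a late change to one offset can shift every later entry in the table; a fixed-width field avoids that at the cost of a few extra bytes per entry.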
* [RISCV] Set correct encodings for DWARF exception handlingAlex Bradbury2019-07-171-0/+8
| | | | | | | | | | | | This patch sets correct encodings for DWARF exception handling for RISC-V (other than call site encoding, which must be udata4 rather than uleb128 and is handled by D63415). This has the same intent as D63409, except this version matches GCC/binutils behaviour which uses the same encodings regardless of PIC/non-PIC and medlow/medany code model. llvm-svn: 366327
* [AMDGPU] Optimize atomic AND/OR/XORJay Foad2019-07-171-16/+55
| | | | | | | | | | | | | | Summary: Extend the atomic optimizer to handle AND, OR and XOR. Reviewers: arsenm, sheredom Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64809 llvm-svn: 366323
* [AArch64] Add support for Transactional Memory Extension (TME)Momchil Velikov2019-07-174-12/+77
| | | | | | | | | | | | | | | | | | | | | | | TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Patch by Javed Absar and Momchil Velikov Differential Revision: https://reviews.llvm.org/D64416 llvm-svn: 366322
* PowerPC: Fix register spilling for SPE registersJustin Hibbits2019-07-173-25/+47
| | | | | | | | | | | | | | | | | | Summary: Missed in the original commit, use the correct callee-saved register list for spilling, instead of the standard SVR432 list. This avoids needlessly spilling the SPE non-volatile registers when they're not used. As part of this, also add where missing, and sort, the spill opcode checks for SPE and SPE4 register classes. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D56703 llvm-svn: 366319
* PowerPC/SPE: Fix load/store handling for SPEJustin Hibbits2019-07-173-1/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: As pointed out in a comment for D49754, register spilling will currently spill SPE registers at almost any offset. However, the instructions `evstdd` and `evldd` require a) 8-byte alignment, and b) a limit of 256 (unsigned) bytes from the base register, as the offset must fit into a 5-bit offset, which ranges from 0-31 (indexed in double-words). The update to the register spill test is taken partially from the test case shown in D49754. Additionally, as pointed out by Kei Thomsen, globals will currently use evldd/evstdd, though the offset isn't known at compile time, so may exceed the 8-bit (unsigned) offset permitted. This fixes that as well, by forcing it to always use evlddx/evstddx when accessing globals. Part of the patch contributed by Kei Thomsen. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54409 llvm-svn: 366318
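The offset constraint described above can be written as a small predicate. This is a sketch based only on the commit text (8-byte alignment, 5-bit field counted in double-words); the function name `fits_evldd_offset` is hypothetical and is not taken from the PowerPC backend.

```python
DOUBLEWORD_BYTES = 8
MAX_FIELD_VALUE = 31  # largest value of a 5-bit unsigned field


def fits_evldd_offset(offset_bytes: int) -> bool:
    """Return True if a byte offset can be encoded directly in evldd/evstdd."""
    if offset_bytes < 0:
        return False  # the field is unsigned
    if offset_bytes % DOUBLEWORD_BYTES != 0:
        return False  # must be 8-byte aligned
    return offset_bytes // DOUBLEWORD_BYTES <= MAX_FIELD_VALUE


for offset in (0, 12, 248, 256):
    print(offset, fits_evldd_offset(offset))
```

Under these assumptions the largest directly encodable offset is 31 * 8 = 248 bytes; anything else would need the indexed forms (evlddx/evstddx) that the commit mentions.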
* [MIPS GlobalISel] ClampScalar and select pointer G_ICMPPetar Avramovic2019-07-172-1/+38
| | | | | | | | | | | Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's operands 2 and 3 to s32. Select G_ICMP for pointers for MIPS32. Pointer comparison is the same as for integers; it is enough to declare pointers as a legal type. Differential Revision: https://reviews.llvm.org/D64856 llvm-svn: 366317
* AMDGPU/GFX10: Apply the VMEM-to-scalar-write hazard also to writes to EXECNicolai Haehnle2019-07-171-1/+1
| | | | | | | | | | | | | | | Summary: Change-Id: I854fbf7d48e937bef9f8f3f5d0c8aeb970652630 Reviewers: rampitec, mareko Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64807 Change-Id: I4405b3a7f84186acea5a78d291bff71056e745fc llvm-svn: 366314
* AMDGPU: Improve alias analysis for GDSNicolai Haehnle2019-07-171-4/+4
| | | | | | | | | | | | | | | | | Summary: GDS cannot alias anything else. Original patch by: Marek Olšák Reviewers: arsenm, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64114 Change-Id: I07bfbd96f5d5c37a6dfba7997df12f291dd794b0 llvm-svn: 366313
* [ARM GlobalISel] Cleanup CallLowering. NFCDiana Picus2019-07-172-71/+20
| | | | | | | | | | Migrate CallLowering::lowerReturnVal to use the same infrastructure as lowerCall/FormalArguments and remove the now obsolete code path from splitToValueTypes. Forgot to push this earlier. llvm-svn: 366308
* [mips] Use mult/mflo pattern on 64-bit targets prior to MIPS64Simon Atanasyan2019-07-171-1/+1
| | | | | | The `MUL` instruction is available starting from the MIPS32/MIPS64 targets. llvm-svn: 366301
* [mips] Implement .cplocal directiveSimon Atanasyan2019-07-173-33/+105
| | | | | | | | | | | | | | This directive forces the use of an alternate register as the context pointer. For example, this code: .cplocal $4 jal foo expands to: ld $25, %call16(foo)($4) jalr $25 Differential Revision: https://reviews.llvm.org/D64743 llvm-svn: 366300
* [mips] Support the "o" inline asm constraintSimon Atanasyan2019-07-172-0/+3
| | | | | | | | | | | | | Like other LLVM targets, we do not handle "offsettable" memory addresses in any special way. In other words, the "o" constraint is an exact equivalent of the "m" one. But some existing code requires "o" constraint support. This fixes PR42589. Differential Revision: https://reviews.llvm.org/D64792 llvm-svn: 366299
* [AMDGPU] Autogenerate register asm namesStanislav Mekhanoshin2019-07-165-721/+139
| | | | | | Differential Revision: https://reviews.llvm.org/D64839 llvm-svn: 366283
* GlobalISel: Add overload of handleAssignments with CCStateMatt Arsenault2019-07-161-2/+11
| | | | | | | | | | | AMDGPU needs to allocate special argument registers separately from the user function argument list, so needs direct control over the CCState. The ArgLocs argument is only really necessary because CCState doesn't allow access to it. llvm-svn: 366279
* [WebAssembly] Compile all TLS on Emscripten as local-execGuanzhong Chen2019-07-161-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, on Emscripten, dynamic linking is not supported with threads. This means that if thread-local storage is used, it must be used in a statically-linked executable. Hence, local-exec is the only possible model. This diff compiles all TLS variables to use local-exec on Emscripten as a temporary measure until dynamic linking is supported with threads. The goal for this is to allow C++ types with constructors to be thread-local. Currently, when `clang` compiles a `thread_local` variable with a constructor, it generates `__tls_guard` variable: @__tls_guard = internal thread_local global i8 0, align 1 As no TLS model is specified, this is treated as general-dynamic, which we do not support (and cannot support without implementing dynamic linking support with threads in Emscripten). As a result, any C++ constructor in `thread_local` variables would not compile. By compiling all `thread_local` as local-exec, `__tls_guard` will compile and we can support C++ constructors with TLS without implementing dynamic linking with threads. Depends on D64537 Reviewers: tlively, aheejin, sbc100 Reviewed By: aheejin Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64776 llvm-svn: 366275
* [WebAssembly] Implement thread-local storage (local-exec model)Guanzhong Chen2019-07-164-10/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Thread local variables are placed inside a `.tdata` segment. Their symbols are offsets from the start of the segment. The address of a thread local variable is computed as `__tls_base` + the offset from the start of the segment. `.tdata` segment is a passive segment and `memory.init` is used once per thread to initialize the thread local storage. `__tls_base` is a wasm global. Since each thread has its own wasm instance, it is effectively thread local. Currently, `__tls_base` must be initialized at thread startup, and so cannot be used with dynamic libraries. `__tls_base` is to be initialized with a new linker-synthesized function, `__wasm_init_tls`, which takes as an argument a block of memory to use as the storage for thread locals. It then initializes the block of memory and sets `__tls_base`. As `__wasm_init_tls` will handle the memory initialization, the memory does not have to be zeroed. To help allocate memory for thread-local storage, a new compiler intrinsic is introduced: `__builtin_wasm_tls_size()`. This intrinsic function returns the size of the thread-local storage for the current function. The expected usage is to run something like the following upon thread startup: __wasm_init_tls(malloc(__builtin_wasm_tls_size())); Reviewers: tlively, aheejin, kripken, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64537 llvm-svn: 366272
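The addressing scheme described above can be modelled in Python as a rough mental aid. Everything in this sketch is an assumption made for illustration: the `ThreadInstance` class, the `TDATA_IMAGE` contents, and the symbol offsets are hypothetical, and real WebAssembly instances, `memory.init`, and `__wasm_init_tls` are only imitated by plain Python operations on a shared bytearray.

```python
# Hypothetical initial contents of the .tdata segment and symbol offsets within it.
TDATA_IMAGE = bytes([1, 2, 3, 4, 9, 9, 9, 9])
SYMBOL_OFFSETS = {"counter": 0, "flag": 4}


class ThreadInstance:
    """Models one wasm instance per thread, all sharing one linear memory."""

    def __init__(self, shared_memory: bytearray):
        self.memory = shared_memory
        self.tls_base = None  # stands in for the __tls_base wasm global

    def tls_size(self) -> int:
        # Stands in for __builtin_wasm_tls_size().
        return len(TDATA_IMAGE)

    def init_tls(self, block_addr: int) -> None:
        # Stands in for __wasm_init_tls: copy the segment image, then set the base.
        end = block_addr + len(TDATA_IMAGE)
        self.memory[block_addr:end] = TDATA_IMAGE
        self.tls_base = block_addr

    def address_of(self, symbol: str) -> int:
        # Address = __tls_base + offset of the symbol from the segment start.
        return self.tls_base + SYMBOL_OFFSETS[symbol]


memory = bytearray(64)
thread_a = ThreadInstance(memory)
thread_b = ThreadInstance(memory)
thread_a.init_tls(0)
thread_b.init_tls(16)
print(thread_a.address_of("flag"), thread_b.address_of("flag"))
```

Because each modelled thread has its own `tls_base`, the same symbol resolves to a different address per thread even though the memory is shared, which is the property the commit relies on.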
* [x86] use more phadd for reductionsSanjay Patel2019-07-161-0/+54
| | | | | | | | | | | | | | | This is part of what is requested by PR42023: https://bugs.llvm.org/show_bug.cgi?id=42023 There's an extension needed for FP add, but exactly how we would specify that using flags is not clear to me, so I left that as a TODO. We're still missing patterns for partial reductions when the input vector is 256-bit or 512-bit, but I think that's a failure of vector narrowing. If we can reduce the widths, then this matching should work on those tests. Differential Revision: https://reviews.llvm.org/D64760 llvm-svn: 366268
* DWARF: Skip zero column for inline call sitesDavid Blaikie2019-07-161-1/+2
| | | | | | | | | | | | | | D64033 <https://reviews.llvm.org/D64033> added DW_AT_call_column for inline sites. However, that change wasn't aware of "-gno-column-info". To avoid adding column info when "-gno-column-info" is used, now DW_AT_call_column is only added when we have non-zero column (when "-gno-column-info" is used, column will be zero). Patch by Wenlei He! Differential Revision: https://reviews.llvm.org/D64784 llvm-svn: 366264
* AMDGPU/GlobalISel: Select G_ASHRMatt Arsenault2019-07-164-13/+4
| | | | llvm-svn: 366257
* AMDGPU/GlobalISel: Select G_LSHRMatt Arsenault2019-07-163-4/+4
| | | | llvm-svn: 366256
* [PowerPC][HTM] Fix impossible reg-to-reg copy assert with ttest builtinJinsong Ji2019-07-161-1/+3
| | | | | | | | | | | | | | | | | | | | Summary: This is exposed by our internal testing. The reduced testcase will assert with "Impossible reg-to-reg copy" We can't use COPY to do 32-bit to 64-bit conversion. Reviewers: kbarton, hfinkel, nemanjai Reviewed By: hfinkel Subscribers: hiraditya, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64499 llvm-svn: 366255
* AMDGPU/GlobalISel: Select G_SHLMatt Arsenault2019-07-163-4/+4
| | | | | | | | | | I think this manages to not break the DAG handling with the divergent predicates because the standalone divergent patterns end up with a higher priority than the pattern on the instruction definition. The 16-bit versions don't work yet. llvm-svn: 366254
* [AMDGPU] Change register type for v32 vectorsStanislav Mekhanoshin2019-07-161-2/+2
| | | | | | | | | | When it is AReg_1024 this results in unnecessary copying into AGPRs of 32-element vectors even though they are not intended for an mfma instruction. Differential Revision: https://reviews.llvm.org/D64815 llvm-svn: 366252
* Fix -Wreturn-type warning. NFC.Michael Liao2019-07-161-0/+1
| | | | llvm-svn: 366251
* AMDGPU/GlobalISel: Fix selection of private storesMatt Arsenault2019-07-161-6/+7
| | | | llvm-svn: 366249
* AMDGPU/GlobalISel: Select private loadsMatt Arsenault2019-07-163-1/+147
| | | | llvm-svn: 366248
* AMDGPU/GlobalISel: Select flat storesMatt Arsenault2019-07-161-2/+4
| | | | llvm-svn: 366246
* AMDGPU: Add register classes to flat store patternsMatt Arsenault2019-07-161-25/+25
| | | | | | | For some reason GlobalISelEmitter needs register classes to import these, although it works for the load patterns. llvm-svn: 366242
* [IndVars] Speculative fix for an assertion failure seen in botsPhilip Reames2019-07-161-1/+6
| | | | | | I don't have an IR sample which is actually failing, but the issue described in the comment is theoretically possible, and should be guarded against even if there's a different root cause for the bot failures. llvm-svn: 366241
* AMDGPU: Replace store PatFragsMatt Arsenault2019-07-162-14/+34
| | | | | | Convert the easy cases to formats understood for GlobalISel. llvm-svn: 366240
* AMDGPU/GlobalISel: Select flat loadsMatt Arsenault2019-07-167-56/+102
| | | | | | | | Now that the patterns use the new PatFrag address space support, the only blocker to importing most load patterns is the addressing mode complex patterns. llvm-svn: 366237
* Teach `llvm-pdbutil pretty -native` about `-injected-sources`Nico Weber2019-07-165-11/+247
| | | | | | | | | `pretty -native -injected-sources -injected-source-content` works with this patch, and produces identical output to the dia version. Differential Revision: https://reviews.llvm.org/D64428 llvm-svn: 366236
* [AMDGPU] Optimize atomic max/minJay Foad2019-07-161-36/+141
| | | | | | | | | | | | | | | | Summary: Extend the atomic optimizer to handle signed and unsigned max and min operations, as well as add and subtract. Reviewers: arsenm, sheredom, critson, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64328 llvm-svn: 366235
* AMDGPU: Redefine load PatFragsMatt Arsenault2019-07-164-76/+105
| | | | | | | Rewrite PatFrags using the new PatFrag address space matching in tablegen. These will now work with both SelectionDAG and GlobalISel. llvm-svn: 366234
* [AMDGPU] Add the adjusted FP as a livein register.Michael Liao2019-07-163-34/+41
| | | | | | | | | | | | Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64145 llvm-svn: 366223
* [Strict FP] Allow more relaxed schedulingUlrich Weigand2019-07-161-10/+21
| | | | | | | | | | | | Reimplement scheduling constraints for strict FP instructions in ScheduleDAGInstrs::buildSchedGraph to allow for more relaxed scheduling. Specifically, allow one strict FP instruction to be scheduled across another, as long as it is not moved across any global barrier. Differential Revision: https://reviews.llvm.org/D64412 Reviewed By: cameron.mcinally llvm-svn: 366222
* [Remarks] Simplify and refactor the RemarkParser interfaceFrancis Visoiu Mistrih2019-07-166-415/+338
| | | | | | | | | | | | | | | | | | | | Before, everything was based on some kind of type erased parser implementation which contained a lot of boilerplate code when multiple formats were to be supported. This simplifies it by: * the remark now owns its arguments * *always* returning an error from the implementation side * working around the way the YAML parser reports errors: catch them through callbacks and re-insert them in a proper llvm::Error * add a CParser wrapper that is used when implementing the C API to avoid cluttering the C++ API with useless state * LLVMRemarkParserGetNext now returns an object that needs to be released to avoid leaking resources * add a new API to dispose of a remark entry: LLVMRemarkEntryDispose llvm-svn: 366217
* [Remarks][NFC] Combine ParserFormat and SerializerFormatFrancis Visoiu Mistrih2019-07-167-37/+58
| | | | | | It's useless to have both. llvm-svn: 366216
* [ADCE] Fix non-deterministic behaviour due to iterating over a pointer set.Amara Emerson2019-07-161-3/+8
| | | | | | | | Original patch by Yann Laigle-Chapuy Differential Revision: https://reviews.llvm.org/D64785 llvm-svn: 366215
* [DAGCombiner] fold (addcarry (xor a, -1), b, c) -> (subcarry b, a, !c) and ↵Amaury Sechet2019-07-161-16/+28
| | | | | | | | | | | | | | | | | | | flip carry. Summary: As per title. DAGCombiner only matches the special case where b = 0; this patch extends the pattern to match any value of b. Depends on D57302 Reviewers: hfinkel, RKSimon, craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59208 llvm-svn: 366214
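The identity behind this fold can be checked exhaustively on a small bit width. The sketch below is not DAGCombiner code; it models `addcarry` and `subcarry` on 4-bit integers with hypothetical helper functions, assuming the usual semantics (carry-out for addition, borrow-out for subtraction), and confirms that the results match and the carry is inverted.

```python
BITS = 4
MASK = (1 << BITS) - 1


def addcarry(x: int, y: int, carry_in: int):
    """Return (x + y + carry_in) truncated to BITS, plus the carry-out."""
    total = x + y + carry_in
    return total & MASK, total >> BITS


def subcarry(x: int, y: int, borrow_in: int):
    """Return (x - y - borrow_in) truncated to BITS, plus the borrow-out."""
    diff = x - y - borrow_in
    return diff & MASK, int(diff < 0)


def fold_holds(a: int, b: int, c: int) -> bool:
    """Check addcarry(a ^ -1, b, c) against subcarry(b, a, !c) with flipped carry."""
    sum_value, carry = addcarry(a ^ MASK, b, c)
    diff_value, borrow = subcarry(b, a, 1 - c)
    return sum_value == diff_value and carry == 1 - borrow


all_ok = all(
    fold_holds(a, b, c)
    for a in range(MASK + 1)
    for b in range(MASK + 1)
    for c in (0, 1)
)
print(all_ok)
```

The reasoning is that `~a` equals `2^n - 1 - a`, so `~a + b + c` equals `2^n + (b - a - (1 - c))`: the low bits agree with the subtraction, and the addition carries out exactly when the subtraction does not borrow.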
* AMDGPU/GlobalISel: Fix test failures in release buildMatt Arsenault2019-07-162-3/+6
| | | | | | | | | | | | Apparently the check for legal instructions during instruction select does not happen without an asserts build, so these would successfully select in release, and fail in debug. Make s16 and/or/xor legal. These can just be selected directly to the 32-bit operation, as is already done in SelectionDAG, so just make them legal. llvm-svn: 366210
* [AArch64] Implement __jcvt intrinsic from Armv8.3-AKyrylo Tkachov2019-07-161-1/+3
| | | | | | | | | | | | | | The jcvt intrinsic defined in ACLE [1] is available when __ARM_FEATURE_JCVT is defined. This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function. The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used. I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example). make check-all didn't show any new failures. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics Differential Revision: https://reviews.llvm.org/D64495 llvm-svn: 366197