path: root/llvm/lib
* Changes to display code view debug info type records in hex format
  Nilanjana Basu, 2019-07-17 (4 files changed, -7/+19)
  llvm-svn: 366390

* Make DT a transitive dependency of LI.
  Evgeniy Stepanov, 2019-07-17 (1 file changed, -1/+1)
  Summary: LoopInfoWrapperPass::verify uses DT, which means DT must be alive
  even if it has no direct users. Fixes a crash in expensive checks mode.
  Reviewers: pcc, leonardchan
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64896
  llvm-svn: 366388

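  Note: in the legacy pass manager, a transitive requirement is what keeps an
  analysis alive for indirect users. A minimal sketch of the kind of change
  described, assuming the fix lives in LoopInfoWrapperPass::getAnalysisUsage
  (illustrative, not quoted from the patch):

      void LoopInfoWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
        AU.setPreservesAll();
        // addRequiredTransitive (rather than addRequired) keeps the dominator
        // tree alive for as long as LoopInfo itself is alive, even when no
        // pass requests DT directly, so LoopInfoWrapperPass::verify can use it.
        AU.addRequiredTransitive<DominatorTreeWrapperPass>();
      }
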
* [llvm-bcanalyzer] Fixed error 'Expected<T> must be checked before access or destruction'
  Denis Bakhvalov, 2019-07-17 (1 file changed, -2/+5)
  After rL365286 I had a failing test:
    LLVM :: tools/gold/X86/v1.12/thinlto_emit_linked_objects.ll
  It was failing with the output:
    $ llvm-bcanalyzer --dump llvm/test/tools/gold/X86/v1.12/Output/thinlto_emit_linked_objects.ll.tmp3.o.thinlto.bc
    Expected<T> must be checked before access or destruction.
    Unchecked Expected<T> contained error:
    Unexpected end of file reading 0 of 0 bytes
    Stack dump:
  Change-Id: I07e03262074ea5e0aae7a8d787d5487c87f914a2
  llvm-svn: 366387

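  Note: llvm::Expected<T> aborts (in builds with assertions) if it is destroyed
  without its error state being checked. A minimal sketch of the usual handling
  pattern using the public Expected/Error API (illustrative; not the actual
  llvm-bcanalyzer change):

      Expected<BitcodeFileContents> ContentsOrErr = getBitcodeFileContents(Buffer);
      if (!ContentsOrErr) {
        // Check and consume the error instead of letting the unchecked
        // Expected<T> abort at destruction.
        logAllUnhandledErrors(ContentsOrErr.takeError(), errs(), "llvm-bcanalyzer: ");
        return 1;
      }
      BitcodeFileContents &Contents = *ContentsOrErr;
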
* llvm-pdbdump: Fix several smaller issues with injected source compression handling
  Nico Weber, 2019-07-17 (3 files changed, -10/+12)
  - getCompression() used to return a PDB_SourceCompression even though the docs
    for IDiaInjectedSource are explicit about the return value being
    compiler-dependent. Return a uint32_t instead, and make the printing code
    handle unknown values better by printing "Unknown" and the int value
    instead of not printing any compression.
  - Print compressed contents as hex dump, not as string.
  - Add compression type "DotNet", which is used (at least) by csc.exe, the C#
    compiler. Also add a lengthy comment describing the stream contents
    (derived from looking at the raw hex contents long enough to see the GUIDs,
    which led me to the roslyn and mono implementations for handling this).
  - The native injected source dumper was dumping the contents of the whole
    data stream -- but csc.exe writes a stream that's padded with zero bytes to
    the next 512-byte boundary, and the dia api doesn't display those padding
    bytes. So make NativeInjectedSource::getCode() do the same thing.
  Differential Revision: https://reviews.llvm.org/D64879
  llvm-svn: 366386

* [AMDGPU] Simplify AMDGPUInstPrinter::printRegOperand()
  Stanislav Mekhanoshin, 2019-07-17 (2 files changed, -157/+37)
  Differential Revision: https://reviews.llvm.org/D64892
  llvm-svn: 366385

* [X86] Make sure we mark 128/256 MLOAD as Legal with VLX when min-legal-vector-width=256 is in effect.
  Craig Topper, 2019-07-17 (1 file changed, -5/+7)
  This started triggering an assertion after r364718 when we made these Custom
  under AVX2.
  llvm-svn: 366382

* hwasan: Initialize the pass only once.
  Peter Collingbourne, 2019-07-17 (2 files changed, -8/+21)
  This will let us instrument globals during initialization. This required
  making the new PM pass a module pass, which should still provide access to
  analyses via the ModuleAnalysisManager.
  Differential Revision: https://reviews.llvm.org/D64843
  llvm-svn: 366379

* [AMDGPU] Stop special casing flat_scratch for register name
  Stanislav Mekhanoshin, 2019-07-17 (2 files changed, -13/+1)
  Differential Revision: https://reviews.llvm.org/D64885
  llvm-svn: 366376

* Speculative fix for stack-tagging.ll failure.
  Evgeniy Stepanov, 2019-07-17 (1 file changed, -2/+2)
  Depending on the evaluation order of function call arguments, the current
  code may insert a use before def.
  llvm-svn: 366375

* [Attributor][NFC] Remove unnecessary debug output
  Hideto Ueno, 2019-07-17 (1 file changed, -1/+0)
  llvm-svn: 366373

* Adding inline comments to code view type record directives for better readability
  Nilanjana Basu, 2019-07-17 (3 files changed, -127/+172)
  llvm-svn: 366372

* [PEI] Don't re-allocate a pre-allocated stack protector slot
  Francis Visoiu Mistrih, 2019-07-17 (2 files changed, -2/+27)
  The LocalStackSlotPass pre-allocates a stack protector and makes sure that it
  comes before the local variables on the stack.
  We need to make sure that later during PEI we don't re-allocate a new stack
  protector slot. If that happens, the new stack protector slot will end up
  being **after** the local variables that it should be protecting. Therefore,
  we would have two slots assigned for two different stack protectors, one at
  the top of the stack, and one at the bottom. Since PEI will overwrite the
  assigned slot for the stack protector, the load that is used to compare the
  value of the stack protector will use the slot assigned by PEI, which is
  wrong.
  For this, we need to check if the object is pre-allocated, and re-use that
  pre-allocated slot.
  Differential Revision: https://reviews.llvm.org/D64757
  llvm-svn: 366371

* [CodeGen][NFC] Simplify checks for stack protector index checking
  Francis Visoiu Mistrih, 2019-07-17 (2 files changed, -13/+11)
  Use `hasStackProtectorIndex()` instead of `getStackProtectorIndex() >= 0`.
  llvm-svn: 366369

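  Note: a before/after fragment illustrating the simplification on
  MachineFrameInfo; the surrounding helper call is hypothetical, only the two
  accessor names come from the commit message:

      // Before: compare the raw frame index against the "no slot" sentinel.
      if (MFI.getStackProtectorIndex() >= 0)
        spillStackProtector(MFI);   // hypothetical caller

      // After: ask MachineFrameInfo directly whether a protector slot exists.
      if (MFI.hasStackProtectorIndex())
        spillStackProtector(MFI);
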
* GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources
  Matt Arsenault, 2019-07-17 (2 files changed, -48/+87)
  Extract the sources to the GCD of the original size and target size, padding
  with implicit_def as necessary.
  Also fix the case where the requested source type is wider than the original
  result type. This was ignoring the type, and just using the destination. Do
  the operation in the requested type and truncate back.
  llvm-svn: 366367

* GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES
  Matt Arsenault, 2019-07-17 (1 file changed, -4/+23)
  Use an anyext to the requested type for the leftover operand to produce a
  slightly wider type, and then truncate the final merge.
  I have another implementation almost ready which handles arbitrary widens,
  but I think it produces worse code in this example (which I think is 90% due
  to not folding redundant copies or folding out implicit_def users), so I
  wanted to add this as a baseline first.
  llvm-svn: 366366

* Basic MTE stack tagging instrumentation.
  Evgeniy Stepanov, 2019-07-17 (4 files changed, -0/+351)
  Summary: Use MTE intrinsics to tag stack variables in functions with
  sanitize_memtag attribute.
  Reviewers: pcc, vitalybuka, hctim, ostannard
  Subscribers: srhines, mgorny, javed.absar, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64173
  llvm-svn: 366361

* Basic codegen for MTE stack tagging.
  Evgeniy Stepanov, 2019-07-17 (14 files changed, -8/+372)
  Implement IR intrinsics for stack tagging. Generated code is very unoptimized
  for now.
  Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp, are used
  to implement a tagged stack frame pointer in a virtual register.
  Differential Revision: https://reviews.llvm.org/D64172
  llvm-svn: 366360

* Revert [AArch64] Add support for Transactional Memory Extension (TME)
  Momchil Velikov, 2019-07-17 (4 files changed, -77/+12)
  This reverts r366322 (git commit 4b8da3a503e434ddbc08ecf66582475765f449bc).
  llvm-svn: 366355

* [AMDGPU] Tune inlining parameters for AMDGPU target
  Daniil Fukalov, 2019-07-17 (4 files changed, -12/+9)
  Summary: Since the target gains no significant advantage from vectorization,
  the threshold bonus for vector instructions should be optional. The default
  value of the amdgpu-inline-arg-alloca-cost parameter and the target's
  InliningThresholdMultiplier value were then tuned accordingly.
  Reviewers: arsenm, rampitec
  Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr,
  t-tye, eraman, hiraditya, haicheng, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64642
  llvm-svn: 366348

* [ORC] Add deprecation warnings to ORCv1 layers and utilities.
  Lang Hames, 2019-07-17 (4 files changed, -27/+41)
  Summary: ORCv1 is deprecated. The current aim is to remove it before the LLVM
  10.0 release. This patch adds deprecation attributes to the ORCv1 layers and
  utilities to warn clients of the change.
  Reviewers: dblaikie, sgraenitz, AlexDenisov
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64609
  llvm-svn: 366344

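  Note: as an illustration of the general mechanism, the standard C++14
  attribute below produces the kind of client-side warning described; whether
  the patch uses this attribute or LLVM's own deprecation macro is not stated
  here, and the class name is a hypothetical stand-in for an ORCv1 layer:

      // Any client that instantiates or names the legacy class gets a
      // compile-time warning pointing at the replacement API.
      class [[deprecated("ORCv1 layers are deprecated; migrate to the ORCv2 "
                         "APIs before LLVM 10.0")]] LegacyExampleLayer {
        // ...
      };
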
* AMDGPU: Use getTargetConstant
  Matt Arsenault, 2019-07-17 (1 file changed, -2/+2)
  Avoids creating an extra intermediate mov.
  llvm-svn: 366340

* [Attributor] Deduce "willreturn" function attributeHideto Ueno2019-07-171-0/+120
| | | | | | | | | | | | | | | | | Summary: Deduce the "willreturn" attribute for functions. For now, intrinsics are not willreturn. More annotation will be done in another patch. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63046 llvm-svn: 366335
* [AsmPrinter] Make the encoding of call sites in .gcc_except_table configurable and use for RISC-V
  Alex Bradbury, 2019-07-17 (4 files changed, -6/+29)
  The original behavior was to always emit the offsets to each call site in the
  call site table as uleb128 values. However, on some architectures (e.g.
  RISC-V) these uleb128 offsets into the code cannot always be resolved until
  link time (because relaxation will invalidate any calculated offsets), and
  there are no appropriate relocations for uleb128 values. As a consequence it
  needs to be possible to specify an alternative.
  This also switches RISC-V to use DW_EH_PE_udata4 for call site encodings in
  .gcc_except_table.
  Differential Revision: https://reviews.llvm.org/D63415
  Patch by Edward Jones.
  llvm-svn: 366329

* [RISCV] Set correct encodings for DWARF exception handling
  Alex Bradbury, 2019-07-17 (1 file changed, -0/+8)
  This patch sets correct encodings for DWARF exception handling for RISC-V
  (other than call site encoding, which must be udata4 rather than uleb128 and
  is handled by D63415).
  This has the same intent as D63409, except this version matches GCC/binutils
  behaviour, which uses the same encodings regardless of PIC/non-PIC and
  medlow/medany code model.
  llvm-svn: 366327

* [AMDGPU] Optimize atomic AND/OR/XOR
  Jay Foad, 2019-07-17 (1 file changed, -16/+55)
  Summary: Extend the atomic optimizer to handle AND, OR and XOR.
  Reviewers: arsenm, sheredom
  Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr,
  t-tye, hiraditya, jfb, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64809
  llvm-svn: 366323

* [AArch64] Add support for Transactional Memory Extension (TME)
  Momchil Velikov, 2019-07-17 (4 files changed, -12/+77)
  TME is a future architecture technology, documented in
    https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools
    https://developer.arm.com/docs/ddi0601/a
  More about the future architectures:
    https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture
  This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and
  TCANCEL and the target feature/arch extension "tme". It also implements TME
  builtin functions, defined in ACLE Q2 2019
  (https://developer.arm.com/docs/101028/latest).
  Patch by Javed Absar and Momchil Velikov
  Differential Revision: https://reviews.llvm.org/D64416
  llvm-svn: 366322

* PowerPC: Fix register spilling for SPE registers
  Justin Hibbits, 2019-07-17 (3 files changed, -25/+47)
  Summary: Missed in the original commit, use the correct callee-saved register
  list for spilling, instead of the standard SVR432 list. This avoids
  needlessly spilling the SPE non-volatile registers when they're not used.
  As part of this, also add where missing, and sort, the spill opcode checks
  for SPE and SPE4 register classes.
  Reviewers: nemanjai, hfinkel, joerg
  Subscribers: kbarton, jsji, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56703
  llvm-svn: 366319

* PowerPC/SPE: Fix load/store handling for SPE
  Justin Hibbits, 2019-07-17 (3 files changed, -1/+35)
  Summary: Pointed out in a comment for D49754, register spilling will
  currently spill SPE registers at almost any offset. However, the instructions
  `evstdd` and `evldd` require a) 8-byte alignment, and b) a limit of 256
  (unsigned) bytes from the base register, as the offset must fit into a 5-bit
  offset, which ranges from 0-31 (indexed in double-words).
  The update to the register spill test is taken partially from the test case
  shown in D49754.
  Additionally, pointed out by Kei Thomsen, globals will currently use
  evldd/evstdd, though the offset isn't known at compile time, so may exceed
  the 8-bit (unsigned) offset permitted. This fixes that as well, by forcing it
  to always use evlddx/evstddx when accessing globals.
  Part of the patch contributed by Kei Thomsen.
  Reviewers: nemanjai, hfinkel, joerg
  Subscribers: kbarton, jsji, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54409
  llvm-svn: 366318

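  Note: a small standalone sketch of the encodability rule described above
  (8-byte alignment and a 5-bit double-word index); the helper name is
  illustrative, not from the patch:

      #include <cstdint>

      // evldd/evstdd encode the displacement as a 5-bit count of double-words,
      // so legal offsets are 0, 8, 16, ..., 248 from the base register.
      bool isLegalSPEMemOffset(int64_t Offset) {
        return Offset >= 0 && Offset % 8 == 0 && (Offset / 8) <= 31;
      }
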
* [MIPS GlobalISel] ClampScalar and select pointer G_ICMP
  Petar Avramovic, 2019-07-17 (2 files changed, -1/+38)
  Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's
  operands 2 and 3 to s32. Select G_ICMP for pointers for MIPS32. Pointer
  compare is the same as for integers; it is enough to declare them as a legal
  type.
  Differential Revision: https://reviews.llvm.org/D64856
  llvm-svn: 366317

* AMDGPU/GFX10: Apply the VMEM-to-scalar-write hazard also to writes to EXEC
  Nicolai Haehnle, 2019-07-17 (1 file changed, -1/+1)
  Summary:
  Change-Id: I854fbf7d48e937bef9f8f3f5d0c8aeb970652630
  Reviewers: rampitec, mareko
  Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye,
  llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64807
  Change-Id: I4405b3a7f84186acea5a78d291bff71056e745fc
  llvm-svn: 366314

* AMDGPU: Improve alias analysis for GDS
  Nicolai Haehnle, 2019-07-17 (1 file changed, -4/+4)
  Summary: GDS cannot alias anything else.
  Original patch by: Marek Olšák
  Reviewers: arsenm, mareko
  Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye,
  Petar.Avramovic, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64114
  Change-Id: I07bfbd96f5d5c37a6dfba7997df12f291dd794b0
  llvm-svn: 366313

* [ARM GlobalISel] Cleanup CallLowering. NFC
  Diana Picus, 2019-07-17 (2 files changed, -71/+20)
  Migrate CallLowering::lowerReturnVal to use the same infrastructure as
  lowerCall/FormalArguments and remove the now obsolete code path from
  splitToValueTypes.
  Forgot to push this earlier.
  llvm-svn: 366308

* [mips] Use mult/mflo pattern on 64-bit targets prior to MIPS64
  Simon Atanasyan, 2019-07-17 (1 file changed, -1/+1)
  The `MUL` instruction is available starting from the MIPS32/MIPS64 targets.
  llvm-svn: 366301

* [mips] Implement .cplocal directive
  Simon Atanasyan, 2019-07-17 (3 files changed, -33/+105)
  This directive forces use of the alternate register for the context pointer.
  For example, this code:
    .cplocal $4
    jal foo
  expands to:
    ld $25, %call16(foo)($4)
    jalr $25
  Differential Revision: https://reviews.llvm.org/D64743
  llvm-svn: 366300

* [mips] Support the "o" inline asm constraintSimon Atanasyan2019-07-172-0/+3
| | | | | | | | | | | | | As well as other LLVM targets we do not handle "offsettable" memory addresses in any special way. In other words, the "o" constraint is an exact equivalent of the "m" one. But some existing code require the "o" constraint support. This fixes PR42589. Differential Revision: https://reviews.llvm.org/D64792 llvm-svn: 366299
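  Note: a minimal GCC-style inline-asm snippet that exercises the "o"
  constraint (illustrative user code, not from the patch; the instruction
  choice assumes a MIPS64 target):

      // "o" asks for an offsettable memory operand; on MIPS it is now
      // accepted and treated exactly like the generic "m" memory constraint.
      long load_word(long *p) {
        long value;
        asm("ld %0, %1" : "=r"(value) : "o"(*p));
        return value;
      }
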
* [AMDGPU] Autogenerate register asm names
  Stanislav Mekhanoshin, 2019-07-16 (5 files changed, -721/+139)
  Differential Revision: https://reviews.llvm.org/D64839
  llvm-svn: 366283

* GlobalISel: Add overload of handleAssignments with CCState
  Matt Arsenault, 2019-07-16 (1 file changed, -2/+11)
  AMDGPU needs to allocate special argument registers separately from the user
  function argument list, so needs direct control over the CCState. The ArgLocs
  argument is only really necessary because CCState doesn't allow access to it.
  llvm-svn: 366279

* [WebAssembly] Compile all TLS on Emscripten as local-exec
  Guanzhong Chen, 2019-07-16 (1 file changed, -2/+10)
  Summary: Currently, on Emscripten, dynamic linking is not supported with
  threads. This means that if thread-local storage is used, it must be used in
  a statically-linked executable. Hence, local-exec is the only possible model.
  This diff compiles all TLS variables to use local-exec on Emscripten as a
  temporary measure until dynamic linking is supported with threads.
  The goal for this is to allow C++ types with constructors to be thread-local.
  Currently, when `clang` compiles a `thread_local` variable with a
  constructor, it generates a `__tls_guard` variable:
    @__tls_guard = internal thread_local global i8 0, align 1
  As no TLS model is specified, this is treated as general-dynamic, which we do
  not support (and cannot support without implementing dynamic linking support
  with threads in Emscripten). As a result, any C++ constructor in
  `thread_local` variables would not compile.
  By compiling all `thread_local` as local-exec, `__tls_guard` will compile and
  we can support C++ constructors with TLS without implementing dynamic linking
  with threads.
  Depends on D64537
  Reviewers: tlively, aheejin, sbc100
  Reviewed By: aheejin
  Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64776
  llvm-svn: 366275

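  Note: a minimal C++ example of the pattern in question; a thread_local
  variable with a dynamic initializer is what makes clang emit the __tls_guard
  flag described above (the class itself is illustrative):

      #include <cstdio>

      struct Logger {
        Logger() { std::puts("per-thread init"); }  // dynamic initializer
        int count = 0;
      };

      // Needs a guarded, lazily-run constructor per thread, which is what
      // forces the __tls_guard variable and a TLS wrapper function.
      thread_local Logger tls_logger;

      int bump() { return ++tls_logger.count; }
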
* [WebAssembly] Implement thread-local storage (local-exec model)
  Guanzhong Chen, 2019-07-16 (4 files changed, -10/+74)
  Summary: Thread-local variables are placed inside a `.tdata` segment. Their
  symbols are offsets from the start of the segment. The address of a
  thread-local variable is computed as `__tls_base` + the offset from the
  start of the segment.
  The `.tdata` segment is a passive segment and `memory.init` is used once per
  thread to initialize the thread-local storage.
  `__tls_base` is a wasm global. Since each thread has its own wasm instance,
  it is effectively thread-local. Currently, `__tls_base` must be initialized
  at thread startup, and so cannot be used with dynamic libraries.
  `__tls_base` is to be initialized with a new linker-synthesized function,
  `__wasm_init_tls`, which takes as an argument a block of memory to use as
  the storage for thread locals. It then initializes the block of memory and
  sets `__tls_base`. As `__wasm_init_tls` will handle the memory
  initialization, the memory does not have to be zeroed.
  To help allocate memory for thread-local storage, a new compiler intrinsic
  is introduced: `__builtin_wasm_tls_size()`. This intrinsic function returns
  the size of the thread-local storage for the current function.
  The expected usage is to run something like the following upon thread
  startup:
    __wasm_init_tls(malloc(__builtin_wasm_tls_size()));
  Reviewers: tlively, aheejin, kripken, sbc100
  Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits,
  llvm-commits
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D64537
  llvm-svn: 366272

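  Note: a slightly expanded version of the startup snippet from the message,
  with the error check real code would want; the prototype of the
  linker-synthesized function and the helper name are assumptions for
  illustration:

      #include <cstdlib>

      // Linker-synthesized per the description above; exact prototype assumed.
      extern "C" void __wasm_init_tls(void *tls_block);

      // Illustrative helper to run at thread startup, before any TLS access.
      bool init_thread_tls() {
        // __builtin_wasm_tls_size() is the new clang intrinsic described above.
        void *block = std::malloc(__builtin_wasm_tls_size());
        if (!block)
          return false;          // allocation failed
        __wasm_init_tls(block);  // initializes the block and sets __tls_base
        return true;
      }
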
* [x86] use more phadd for reductions
  Sanjay Patel, 2019-07-16 (1 file changed, -0/+54)
  This is part of what is requested by PR42023:
  https://bugs.llvm.org/show_bug.cgi?id=42023
  There's an extension needed for FP add, but exactly how we would specify that
  using flags is not clear to me, so I left that as a TODO.
  We're still missing patterns for partial reductions when the input vector is
  256-bit or 512-bit, but I think that's a failure of vector narrowing. If we
  can reduce the widths, then this matching should work on those tests.
  llvm-svn: 366268

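  Note: for readers who want to see the shape of the input, a typical 128-bit
  integer horizontal reduction looks like the following; the source is
  illustrative, and whether this exact sequence is matched to phaddd by the new
  patterns is an assumption:

      #include <immintrin.h>

      // Sum the four 32-bit lanes of v via pairwise shuffle+add steps; this is
      // the kind of partial-reduction shape the backend tries to map to phaddd
      // on targets with SSSE3.
      int sum_lanes(__m128i v) {
        __m128i hi = _mm_shuffle_epi32(v, _MM_SHUFFLE(1, 0, 3, 2));
        __m128i s1 = _mm_add_epi32(v, hi);                  // {a0+a2, a1+a3, ...}
        __m128i lo = _mm_shuffle_epi32(s1, _MM_SHUFFLE(2, 3, 0, 1));
        __m128i s2 = _mm_add_epi32(s1, lo);                 // full sum in lane 0
        return _mm_cvtsi128_si32(s2);
      }
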
* DWARF: Skip zero column for inline call sites
  David Blaikie, 2019-07-16 (1 file changed, -1/+2)
  D64033 <https://reviews.llvm.org/D64033> added DW_AT_call_column for inline
  sites. However, that change wasn't aware of "-gno-column-info". To avoid
  adding column info when "-gno-column-info" is used, DW_AT_call_column is now
  only added when we have a non-zero column (when "-gno-column-info" is used,
  the column will be zero).
  Patch by Wenlei He!
  Differential Revision: https://reviews.llvm.org/D64784
  llvm-svn: 366264

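  Note: the change boils down to guarding the attribute emission on the column
  value; a rough sketch against the DwarfUnit-style API, where the variable
  names and the exact addUInt signature are assumptions:

      // While building the DW_TAG_inlined_subroutine DIE: with -gno-column-info
      // the column is 0, so the attribute is simply skipped.
      if (CallColumn)
        addUInt(ScopeDIE, dwarf::DW_AT_call_column, None, CallColumn);
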
* AMDGPU/GlobalISel: Select G_ASHR
  Matt Arsenault, 2019-07-16 (4 files changed, -13/+4)
  llvm-svn: 366257

* AMDGPU/GlobalISel: Select G_LSHR
  Matt Arsenault, 2019-07-16 (3 files changed, -4/+4)
  llvm-svn: 366256

* [PowerPC][HTM] Fix impossible reg-to-reg copy assert with ttest builtin
  Jinsong Ji, 2019-07-16 (1 file changed, -1/+3)
  Summary: This is exposed by our internal testing. The reduced testcase will
  assert with "Impossible reg-to-reg copy". We can't use COPY to do a 32-bit
  to 64-bit conversion.
  Reviewers: kbarton, hfinkel, nemanjai
  Reviewed By: hfinkel
  Subscribers: hiraditya, MaskRay, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64499
  llvm-svn: 366255

* AMDGPU/GlobalISel: Select G_SHL
  Matt Arsenault, 2019-07-16 (3 files changed, -4/+4)
  I think this manages to not break the DAG handling with the divergent
  predicates because the standalone divergent patterns end up with a higher
  priority than the pattern on the instruction definition.
  The 16-bit versions don't work yet.
  llvm-svn: 366254

* [AMDGPU] Change register type for v32 vectors
  Stanislav Mekhanoshin, 2019-07-16 (1 file changed, -2/+2)
  When it is AReg_1024 this results in unnecessary copying of 32-element
  vectors into AGPRs even though they are not intended for an mfma instruction.
  Differential Revision: https://reviews.llvm.org/D64815
  llvm-svn: 366252

* Fix -Wreturn-type warning. NFC.
  Michael Liao, 2019-07-16 (1 file changed, -0/+1)
  llvm-svn: 366251

* AMDGPU/GlobalISel: Fix selection of private stores
  Matt Arsenault, 2019-07-16 (1 file changed, -6/+7)
  llvm-svn: 366249

* AMDGPU/GlobalISel: Select private loads
  Matt Arsenault, 2019-07-16 (3 files changed, -1/+147)
  llvm-svn: 366248

* AMDGPU/GlobalISel: Select flat stores
  Matt Arsenault, 2019-07-16 (1 file changed, -2/+4)
  llvm-svn: 366246