summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [Hexagon] Add helper functions to identify single/pair vector types, NFCKrzysztof Parzyszek2018-02-062-3/+17
| | | | llvm-svn: 324349
* [Hexagon] Handle lowering of SETCC via setCondCodeActionKrzysztof Parzyszek2018-02-064-72/+82
| | | | | | | | | | It was expanded directly into instructions earlier. That was to avoid loads from a constant pool for a vector negation: "xor x, splat(i1 -1)". Implement ISD opcodes QTRUE and QFALSE to denote logical vectors of all true and all false values, and handle setcc with negations through selection patterns. llvm-svn: 324348
* [X86][SSE] Add PACKUS support for truncation of clamped valuesSimon Pilgrim2018-02-061-3/+13
| | | | | | Followup to D42544 that matches PACKUSWB cases for non-AVX512, SSE and PACKUSDW cases will have to wait until we can add support for general SMIN/SMAX matching. llvm-svn: 324347
* [AMDGPU] do not generate .AMDGPU.config for amdpal os typeTim Renouf2018-02-061-16/+12
| | | | | | | | | | | | | | | Summary: Now we generate PAL metadata for the amdpal os type, there is no need to generate the .AMDGPU.config section. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37760 Change-Id: I303c5fad66656ce97293da60621afac6595b4c18 llvm-svn: 324346
* [AArch64][SVE] Asm: Add AND_ZI instructions and aliasesSander de Smalen2018-02-064-0/+136
| | | | | | | | | | | | | | Summary: Adds support for the SVE AND instruction with vector and logical-immediate operands, and their corresponding aliases. Reviewers: fhahn, rengolin, samparker, echristo, aadg, kristof.beyls Reviewed By: fhahn Subscribers: aemerson, javed.absar, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D42295 llvm-svn: 324343
* [MergeICmps] Handle chains with several complex BCE basic blocks.Clement Courbet2018-02-061-3/+4
| | | | | | | | | | - Fix condition for detecting that a complex basic block was the first in the chain. - Add tests. This was caught by buildbots when submitting rL324319. llvm-svn: 324341
* [X86][SSE] Add PACKSS support for truncation of clamped valuesSimon Pilgrim2018-02-061-2/+8
| | | | | | Followup to D42544 that matches PACKSSWB cases for non-AVX512, SSE and PACKSSDW cases will have to wait until we can add support for general SMIN/SMAX matching. llvm-svn: 324339
* [PowerPC] fix up in rL324229, NFCHiroshi Inoue2018-02-061-1/+1
| | | | | | This patch fixes up my previous commit (add initialization of local variables). llvm-svn: 324336
* [DeadArgumentElim] Set pointer to DISubprogram before calling RAUW. NFCPetar Jovanovic2018-02-061-3/+3
| | | | | | | | | | | | It is better to update pointer of the DISuprogram before we call RAUW for still live arguments of the function, because with the change reviewed in D42541 in RAUW we compare DISubprograms rather than functions itself. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D42794 llvm-svn: 324335
* Fix unused variable warning in release mode. NFC.Alexander Ivchenko2018-02-061-0/+1
| | | | llvm-svn: 324330
* [AArch64] Fix spelling of ICH_ELRSR_EL2 system registerOliver Stannard2018-02-061-1/+1
| | | | | | | This register was mis-spelled as ICH_ELSR_EL2, but has the correct encoding for ICH_ELRSR_EL2. llvm-svn: 324325
* [ARM][AArch64] Add CSDB speculation barrier instructionOliver Stannard2018-02-064-11/+18
| | | | | | | | | | | | | | | This adds the CSDB instruction, which is a new barrier instruction described by the whitepaper at [1]. This is in encoding space which was previously executed as a NOP, so it is available for all targets that have the relevant NOP encoding space. This matches the binutils behaviour for these instructions [2][3]. [1] https://developer.arm.com/support/security-update [2] https://sourceware.org/ml/binutils/2018-01/msg00116.html [3] https://sourceware.org/ml/binutils/2018-01/msg00120.html llvm-svn: 324324
* [MergeICmps][NFC] Add more assertions.Clement Courbet2018-02-061-0/+4
| | | | llvm-svn: 324323
* [ARM] Armv8.2-A FP16 code generation (part 3/3)Sjoerd Meijer2018-02-063-34/+107
| | | | | | | | | | | | | | | This adds most of the FP16 codegen support, but these areas need further work: - FP16 literals and immediates are not properly supported yet (e.g. literal pool needs work), - Instructions that are generated from intrinsics (e.g. vabs) haven't been added. This will be addressed in follow-up patches. Differential Revision: https://reviews.llvm.org/D42849 llvm-svn: 324321
* Revert "[MergeICmps] Enable the MergeICmps Pass by default."Clement Courbet2018-02-061-4/+5
| | | | | | | | Breaks clang-ppc64be-linux-multistage buildbot. This reverts commit 515bab711f308c2e8299c49dd8c84ea6a2e0b60e. llvm-svn: 324319
* [MergeICmps] Enable the MergeICmps Pass by default.Clement Courbet2018-02-061-5/+4
| | | | | | | | | | | | Summary: Now that PR33325 is fixed, this should always improve the generated code. Reviewers: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42793 llvm-svn: 324317
* AMDGPU/MemoryModel: Fix monotonic atomic loadsKonstantin Zhuravlyov2018-02-061-1/+2
| | | | | | Those should have glc bit set for system and agent synchronization scopes llvm-svn: 324314
* ThinLTOBitcodeWriter: Do not include module-level inline asm in the merged ↵Peter Collingbourne2018-02-061-0/+1
| | | | | | | | | | | module. If the inline asm provides the definition of a symbol, this can result in duplicate symbol errors. Differential Revision: https://reviews.llvm.org/D42944 llvm-svn: 324313
* [DAGCombiner] Pass the original load to ExtendSetCCUses not the turncate.Craig Topper2018-02-061-11/+12
| | | | | | | | | | | | | | | | | | | | | | | Summary: This method is trying to use the truncate node to find which SETCC operand should be replaced directly with the extended load. This used to work correctly because all uses of the original load were replaced by the truncate before this function was called. So this was used to effectively bypass the truncate and find the load under it. All but one of the callers now call this before the truncate has replaced the laod so the setcc doesn't yet use the truncate. To account for this we should pass the original load instead. I changed the order of that one caller to make this work there too. I don't have a test case because this is probably hidden by later DAG combines causing the extend and truncate to cancel out. I assume this way is a little more efficient and matches what was originally intended. Reviewers: RKSimon, spatel, niravd Reviewed By: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42878 llvm-svn: 324311
* [RISCV] Add support for %pcrel_lo.Ahmed Charles2018-02-067-12/+36
| | | | llvm-svn: 324303
* Revert "Don't assume a null GV is local for ELF and MachO."Reid Kleckner2018-02-061-14/+7
| | | | | | | | This reverts r323297. It breaks building grub. llvm-svn: 324301
* [ThinLTO] Remove dead and dropped symbol declarations when possibleTeresa Johnson2018-02-061-6/+15
| | | | | | | | | | | | | | | | | | | | | Summary: Removing the dropped symbols will prevent indirect call promotion in the ThinLTO Backend from adding a new reference to a symbol, which can result in linker unsats. This can happen when we compile with a sample profile collected from one binary by used for another, which may have profiled targets that aren't used in the new binary. Note that until dropDeadSymbols handles variables and aliases (in progress), we may not be able to remove the declaration and can still have an issue. Reviewers: grimar, davidxl Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D42816 llvm-svn: 324299
* [X86] Relax restrictions on what setcc condition codes can be folded with a ↵Craig Topper2018-02-051-2/+1
| | | | | | | | sext when AVX512 is enabled. We now allow all signed comparisons and not equal. The complement that needs to be added for this is no worse than the extend. And the vector output forms of pcmpeq/pcmpgt have better latency than the k-register version on SKX. llvm-svn: 324294
* LTO: Also include dso-local bit for calls in ThinLTO cache key.Peter Collingbourne2018-02-051-1/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D42934 llvm-svn: 324291
* [LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused ↵Sanjay Patel2018-02-054-2/+12
| | | | | | | | | | | | | | | (PR35681) In the motivating case from PR35681 and represented by the macro-fuse-cmp test: https://bugs.llvm.org/show_bug.cgi?id=35681 ...there's a 37 -> 31 byte size win for the loop because we eliminate the big base address offsets. SPEC2017 on Ryzen shows no significant perf difference. Differential Revision: https://reviews.llvm.org/D42607 llvm-svn: 324289
* [PEI][NFC] Move StackSize opt-remark code next to -warn-stack codeFrancis Visoiu Mistrih2018-02-051-7/+6
| | | | | | | This allows us to make sure we're always having the same sizes in both remarks and warnings. llvm-svn: 324283
* [LowerMemIntrinsics] Update uses of deprecated MemIntrinsic::getAlignment ↵Daniel Neilson2018-02-051-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | API (NFC) Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the LowerMemIntrinsics pass to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324278
* [InstCombine] don't try to evaluate instructions with >1 use (revert r324014)Sanjay Patel2018-02-051-17/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This example causes a compile-time explosion: define i16 @foo(i16 %in) { %x = zext i16 %in to i32 %a1 = mul i32 %x, %x %a2 = mul i32 %a1, %a1 %a3 = mul i32 %a2, %a2 %a4 = mul i32 %a3, %a3 %a5 = mul i32 %a4, %a4 %a6 = mul i32 %a5, %a5 %a7 = mul i32 %a6, %a6 %a8 = mul i32 %a7, %a7 %a9 = mul i32 %a8, %a8 %a10 = mul i32 %a9, %a9 %a11 = mul i32 %a10, %a10 %a12 = mul i32 %a11, %a11 %a13 = mul i32 %a12, %a12 %a14 = mul i32 %a13, %a13 %a15 = mul i32 %a14, %a14 %a16 = mul i32 %a15, %a15 %a17 = mul i32 %a16, %a16 %a18 = mul i32 %a17, %a17 %a19 = mul i32 %a18, %a18 %a20 = mul i32 %a19, %a19 %a21 = mul i32 %a20, %a20 %a22 = mul i32 %a21, %a21 %a23 = mul i32 %a22, %a22 %a24 = mul i32 %a23, %a23 %T = trunc i32 %a24 to i16 ret i16 %T } llvm-svn: 324276
* [SDAG] Legalize all CondCodes by inverting them and/or swapping operandsKrzysztof Parzyszek2018-02-051-12/+19
| | | | | | Differential Revision: https://reviews.llvm.org/D42788 llvm-svn: 324274
* [SimplifyLibCalls] Update from deprecated IRBuilder API for creating memory ↵Daniel Neilson2018-02-051-25/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | intrinsics (NFC) Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the SimplifyLibCalls pass to cease using the old IRBuilder createMemCpy/createMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, r3L24148 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324273
* [DWARF] Regularize dumping strings from line tables.Paul Robinson2018-02-053-25/+46
| | | | | | | | | | | | | | | | | The major visible difference here is that in line-table dumps, directory and file names are wrapped in double-quotes; previously, directory names got single quotes and file names were not quoted at all. The improvement in this patch is that when a DWARF v5 line table header has indirect strings, in a verbose dump these will all have their section[offset] printed as well as the name itself. This matches the format used for dumping strings in the .debug_info section. Differential Revision: https://reviews.llvm.org/D42802 llvm-svn: 324270
* [X86] Teach DAG unfoldMemoryOperand to reconvert CMPs to testsNirav Dave2018-02-051-0/+24
| | | | | | | | | | | | | | | | Summary: Copy MI-level cmp->test conversion to SelectionDAG-level memory unfold. This fixes a regression from upcoming D41293 change. Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D42808 llvm-svn: 324261
* [X86] Artificially lower the complexity of the scalar ANDN patterns so that ↵Craig Topper2018-02-051-2/+3
| | | | | | | | | | | | AND with immediate will match first. This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate. This also allows us to match an and to a movzx after a not. This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT. llvm-svn: 324260
* [InstCombine] add unsigned saturation subtraction canonicalizationsSanjay Patel2018-02-051-1/+56
| | | | | | | | | | | | | | | | | | | | | | | | This is the instcombine part of unsigned saturation canonicalization. Backend patches already commited: https://reviews.llvm.org/D37510 https://reviews.llvm.org/D37534 It converts unsigned saturated subtraction patterns to forms recognized by the backend: (a > b) ? a - b : 0 -> ((a > b) ? a : b) - b) (b < a) ? a - b : 0 -> ((a > b) ? a : b) - b) (b > a) ? 0 : a - b -> ((a > b) ? a : b) - b) (a < b) ? 0 : a - b -> ((a > b) ? a : b) - b) ((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b) ((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b) ((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b) ((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b) Patch by Yulia Koval! Differential Revision: https://reviews.llvm.org/D41480 llvm-svn: 324255
* LTO: Include dso-local bit in ThinLTO cache key.Peter Collingbourne2018-02-053-11/+14
| | | | | | Differential Revision: https://reviews.llvm.org/D42713 llvm-svn: 324253
* [InstCombine] only allow narrow/wide evaluation of values with >1 use if ↵Sanjay Patel2018-02-051-4/+6
| | | | | | | | | | | that user is a binop There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi instructions that might have repeated operands. This is likely a source of an infinite loop. I haven't manufactured a test case to prove that, but it should be safe to speculatively limit this transform to binops while we try to create that test. llvm-svn: 324252
* [Hexagon] Memoize instruction positions in BitTrackerKrzysztof Parzyszek2018-02-052-10/+22
| | | | llvm-svn: 324250
* [X86] Teach X86DAGToDAGISel::shrinkAndImmediate to preserve upper 32 zeroes ↵Craig Topper2018-02-051-3/+19
| | | | | | | | | | | | of a 64 bit mask. If the upper 32 bits of a 64 bit mask are all zeros, we have special isel patterns to use a 32-bit and instead of a 64-bit and by relying on the impliciting zeroing of 32 bit ops. This patch teachs shrinkAndImmediate not to break that optimization. Differential Revision: https://reviews.llvm.org/D42899 llvm-svn: 324249
* Revert r323472 "[Debug] Add dbg.value intrinsics for PHIs created during LCSSA."Hans Wennborg2018-02-051-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the Chromium build; see PR36238. > This patch is an enhancement to propagate dbg.value information when > Phis are created on behalf of LCSSA. I noticed a case where a value > carried across a loop was reported as <optimized out>. > > Specifically this case: > > int bar(int x, int y) { > return x + y; > } > > int foo(int size) { > int val = 0; > for (int i = 0; i < size; ++i) { > val = bar(val, i); // Both val and i are correct > } > return val; // <optimized out> > } > > In the above case, after all of the interesting computation completes > our value is reported as "optimized out." This change will add a > dbg.value to correct this. > > This patch also moves the dbg.value insertion routine from > LoopRotation.cpp into Local.cpp, so that we can share it in both places > (LoopRotation and LCSSA). > > Patch by Matt Davis! > > Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 324247
* BitTracker.h needs a full definition of MachineInstr, so include the ↵Benjamin Kramer2018-02-051-1/+1
| | | | | | | | | | defining file. Patch by Dean Sturtevant! Differential Revision: https://reviews.llvm.org/D42907 llvm-svn: 324245
* [Hexagon] Forgot about HexagonISD::VZERO in selecting const vectorsKrzysztof Parzyszek2018-02-051-1/+1
| | | | llvm-svn: 324244
* [Hexagon] Don't use garbage mask in HvxSelector::shuffp2Krzysztof Parzyszek2018-02-051-0/+2
| | | | | | | | The function shuffp2 was breaking up a wide shuffle into a pair of narrower ones, except that the narrower shuffle masks were actually uninitialized. llvm-svn: 324243
* [ThinLTO] Convert dead alias to declarationsTeresa Johnson2018-02-052-22/+32
| | | | | | | | | | | | | | | | Summary: This complements the fixes in r323633 and r324075 which drop the definitions of dead functions and variables, respectively. Fixes PR36208. Reviewers: grimar, rafael Subscribers: mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D42856 llvm-svn: 324242
* [Hexagon] Use V6_vmpyih for halfword multiplicationKrzysztof Parzyszek2018-02-051-5/+6
| | | | | | | Unlike V6_vmpyhv, it produces the result in the exact form that is expected without the need for a shuffle. llvm-svn: 324241
* [AMDGPU][MC] Corrected dst/data size for MIMG opcodes with d16 modifierDmitry Preobrazhensky2018-02-054-15/+44
| | | | | | | | | See bug 36154: https://bugs.llvm.org/show_bug.cgi?id=36154 Differential Revision: https://reviews.llvm.org/D42847 Reviewers: cfang, artem.tamazov, arsenm llvm-svn: 324237
* [AMDGPU][MC] Added validation of d16 and r128 modifiers of MIMG opcodesDmitry Preobrazhensky2018-02-056-3/+67
| | | | | | | | | | | See bugs 36094, 36095: https://bugs.llvm.org/show_bug.cgi?id=36094 https://bugs.llvm.org/show_bug.cgi?id=36095 Differential Revision: https://reviews.llvm.org/D42692 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 324231
* [PowerPC] Check hot loop exit edge in PPCCTRLoopsHiroshi Inoue2018-02-051-0/+21
| | | | | | | | | | | | | | | | PPCCTRLoops transform loops using mtctr/bdnz instructions if loop trip count is known and big enough to compensate for the cost of mtctr. But if there is a loop exit edge which is known to be frequently taken (by builtin_expect or by PGO), we should not transform the loop to avoid the cost of mtctr instruction. Here is an example of a loop with hot exit edge: for (unsigned i = 0; i < TripCount; i++) { // do something if (__builtin_expect(check(), 1)) break; // do something } Differential Revision: https://reviews.llvm.org/D42637 llvm-svn: 324229
* [llvm-opt-fuzzer] Avoid adding incorrect inputs to the fuzzer corpusIgor Laevsky2018-02-051-0/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D42414 llvm-svn: 324225
* Fix more print format specifiers in debug_rnglists dumpingJames Henderson2018-02-051-4/+6
| | | | | | | | | See also r324096. I have made the assumption that DWARF64 is not an issue for the time being with these fixes. llvm-svn: 324223
* Revert [SimplifyCFG] Relax restriction for folding unconditional branchesSerguei Katkov2018-02-051-4/+1
| | | | | | | | | | | The patch causes the failure of the test compiler-rt/test/profile/Linux/counter_promo_nest.c To unblock buildbot, revert the patch while investigation is in progress. Differential Revision: https://reviews.llvm.org/D42691 llvm-svn: 324214
OpenPOWER on IntegriCloud