summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [X86][SSE] Add srem x, (1 << c) combine testsSimon Pilgrim2018-07-051-0/+227
| | | | | | Now that D45806 has landed we can start trying to avoid scalarizing srem by constant - these tests demonstrate some example cases. llvm-svn: 336360
* [AArch64, PowerPC, x86] add tests for signbit bit hacks; NFCSanjay Patel2018-07-053-0/+459
| | | | llvm-svn: 336348
* [AMDGPU] Add VALU to V_INTERP InstructionsRyan Taylor2018-07-051-0/+19
| | | | | | | | | | | | Wait states are not properly being inserted after buffer_store for v_interp instructions. Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can check and insert the appropriate wait states when needed. Differential Revision: https://reviews.llvm.org/D48772 Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64 llvm-svn: 336339
* Partially revert r336268 in address-offsets.llKrasimir Georgiev2018-07-051-40/+40
| | | | | | | | | | | | | | Summary: There the typos are intentional, explicitly introduced to disable these cases in r280285. Reviewers: bkramer Reviewed By: bkramer Subscribers: dschuff, sbc100, jgravelle-google, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D48962 llvm-svn: 336336
* [X86][SSE] Add extra v16i16 shl x,c -> pmullw testSimon Pilgrim2018-07-051-0/+27
| | | | | | We want to compare shifts with repeated vs non-repeated v8i16 shuffle masks (for PBLENDW ymm) llvm-svn: 336333
* [mips] Fix atomic operations at O0, v3Aleksandar Beserminji2018-07-054-428/+9037
| | | | | | | | | | | | | | | | | | | | | | | Similar to PR/25526, fast-regalloc introduces spills at the end of basic blocks. When this occurs in between an ll and sc, the stores can cause the atomic sequence to fail. This patch fixes the issue by introducing more pseudos to represent atomic operations and moving their lowering to after the expansion of postRA pseudos. This version addresses issues with the initial implementation and covers all atomic operations. This resolves PR/32020. Thanks to James Cowgill for reporting the issue! Patch By: Simon Dardis Differential Revision: https://reviews.llvm.org/D31287 llvm-svn: 336328
* [NEON] Fix combining of vldx_dup intrinsics with updating of base addressesIvan A. Kosarev2018-07-051-0/+43
| | | | | | | | | | | | | Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
* Partial revert of "NFC - Various typo fixes in tests"Mikael Holmen2018-07-053-15/+19
| | | | | | | | This partially reverts r336268 since it causes buildbot failures. Added FIXME at the places where the CHECKs are misspelled. llvm-svn: 336323
* [ARM] ParallelDSP: only support i16 loads for nowSjoerd Meijer2018-07-051-1/+46
| | | | | | | | | We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
* [Power9] Optimize codgen for conversions of int to float128Lei Huang2018-07-052-10/+136
| | | | | | | | | | | | Optimize code sequences for integer conversion to fp128 when the integer is a result of: * float->int * float->long * double->int * double->long Differential Revision: https://reviews.llvm.org/D48429 llvm-svn: 336316
* [X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart ↵Craig Topper2018-07-054-88/+104
| | | | | | independent FMA and extractelement/insertelement. llvm-svn: 336315
* [Power9][NFC] add back-end tests for passing homogeneous fp128 aggregates by ↵Lei Huang2018-07-051-3/+187
| | | | | | | | | | | value Tests to verify that we are passing fp128 via VSX registers as per ABI. These are related to clang commit rL336308. Differential Revision: https://reviews.llvm.org/D48310 llvm-svn: 336314
* [Power9] Add tests for passing float128 in VSX reg for non-homogenous aggregatesLei Huang2018-07-051-0/+206
| | | | | | Add missing testcase for rL336310 llvm-svn: 336313
* [Power9]Legalize and emit code for quad-precision convert from single-precisionLei Huang2018-07-052-27/+162
| | | | | | | | | Legalize and emit code for quad-precision floating point operation conversion of single-precision value to quad-precision. Differential Revision: https://reviews.llvm.org/D47569 llvm-svn: 336307
* [Power9] Implement float128 parameter passing and return valuesLei Huang2018-07-051-0/+268
| | | | | | | | | | This patch enable parameter passing and return by value for float128 types. Passing aggregate/union which contain float128 members will be submitted in subsequent patches. Differential Revision: https://reviews.llvm.org/D47552 llvm-svn: 336306
* [X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg ↵Craig Topper2018-07-053-38/+14
| | | | | | | | | | input. Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)). This patch fixes that so we can also combine things like (fmsub (fneg)). llvm-svn: 336304
* [X86] Remove some of the packed FMA3 intrinsics since we no longer use them ↵Craig Topper2018-07-052-6/+18
| | | | | | | | | | in clang. There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first. llvm-svn: 336303
* [Power9]Legalize and emit code for round & convert quad-precision valuesLei Huang2018-07-041-0/+154
| | | | | | | | | Legalize and emit code for round & convert float128 to double precision and single precision. Differential Revision: https://reviews.llvm.org/D46997 llvm-svn: 336299
* [mips] Warn when crc, ginv, virt flags are used with too old revisionVladimir Stefanovic2018-07-041-0/+45
| | | | | | | | | CRC and GINV ASE require revision 6, Virtualization requires revision 5. Print a warning when revision is older than required. Differential Revision: https://reviews.llvm.org/D48843 llvm-svn: 336296
* [PowerPC] Replace the Post RA List Scheduler with the Machine SchedulerStefan Pintilie2018-07-0476-482/+457
| | | | | | | | | | | We want to run the Machine Scheduler instead of the List Scheduler after RA. Checked with a performance run on a Power 9 machine with SPEC 2006 and while some benchmarks improved and others degraded the geomean was slightly improved with the Machine Scheduler. Differential Revision: https://reviews.llvm.org/D45265 llvm-svn: 336295
* [X86][SSE] Add v16i16 shl x,c -> pmullw testSimon Pilgrim2018-07-041-3/+26
| | | | llvm-svn: 336277
* [X86][SSE] Add SSE2 target to some shift testsSimon Pilgrim2018-07-043-275/+638
| | | | | | | | Show the difference in behaviour cf SSE41 (no PMULLD, PBLENDW etc.) Raised by D48936 llvm-svn: 336271
* NFC - Various typo fixes in testsGabor Buella2018-07-0416-77/+77
| | | | llvm-svn: 336268
* [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (REAPPLIED)Simon Pilgrim2018-07-041-22/+12
| | | | | | | | We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case. Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247. llvm-svn: 336250
* [X86][SSE] Add reduced crash test case for r336113 - [X86][SSE] Blend any ↵Simon Pilgrim2018-07-041-0/+25
| | | | | | | | v8i16/v4i32 shift with 2 shift unique values The patch was reverted at r336189 due to crashes llvm-svn: 336247
* [ImplicitNullChecks] Check for rewrite of register used in 'test' instructionMax Kazantsev2018-07-041-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | The following code pattern: mov %rax, %rcx test %rax, %rax %rax = .... je throw_npe mov(%rcx), %r9 mov(%rax), %r10 gets transformed into the following incorrect code after implicit null check pass: mov %rax, %rcx %rax = .... faulting_load_op("movl (%rax), %r10", throw_npe) mov(%rcx), %r9 For implicit null check pass, if the register that is checked for null value (ie, the register used in the 'test' instruction) is written into before the condition jump, we should avoid doing the optimization. Patch by Surya Kumari Jangala! Differential Revision: https://reviews.llvm.org/D48627 Reviewed By: skatkov llvm-svn: 336241
* [NVPTX] Expand v2f16 INSERT_VECTOR_ELTBenjamin Kramer2018-07-031-0/+8
| | | | | | Vectorization can create them. llvm-svn: 336227
* [X86] Add tests for low/high bit clearing with different attributes.Roman Lebedev2018-07-032-0/+2776
| | | | | | | | | | | | | | D48768 may turn some of these into shifts. Reviewers: spatel Reviewed By: spatel Subscribers: spatel, RKSimon, llvm-commits, craig.topper Differential Revision: https://reviews.llvm.org/D48767 llvm-svn: 336224
* [X86][AsmParser] Don't consider %eip as a valid register outside of 32-bit mode.Craig Topper2018-07-031-1/+1
| | | | | | | | This might make the error message added in r335668 unneeded, but I'm not sure yet. The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit. llvm-svn: 336217
* [AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to ↵Amara Emerson2018-07-032-4/+8
| | | | | | | | | | unselectable stores. r336120 resulted in falling back to SelectionDAG more often due to the G_STORE MMOs not matching the vreg size. This fixes that by explicitly any-extending the value. llvm-svn: 336209
* [DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegenSimon Pilgrim2018-07-031-177/+206
| | | | | | Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code llvm-svn: 336198
* Revert "[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values"Benjamin Kramer2018-07-031-12/+22
| | | | | | This reverts commit r336113. It causes crashes. llvm-svn: 336189
* [MIPS GlobalISel] Lower arguments using stackPetar Jovanovic2018-07-031-0/+33
| | | | | | | | | | | Lower more than 4 arguments using stack. This patch targets MIPS32. It supports only functions with arguments of type i32. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D47934 llvm-svn: 336185
* [X86] Add avx512vl command line to break-false-dep.llCraig Topper2018-07-031-2/+23
| | | | llvm-svn: 336169
* [WebAssembly] Support for atomic storesHeejin Ahn2018-07-024-1/+408
| | | | | | | | | | | | Summary: Add support for atomic store instructions. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D48839 llvm-svn: 336145
* [ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m.Vadzim Dambrouski2018-07-021-0/+5
| | | | | | | | | | | | Reviewers: efriedma, rogfer01, javed.absar Reviewed By: efriedma, rogfer01 Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D48846 llvm-svn: 336144
* [WebAssembly] Fix fast-isel optimization of branch conditions.Dan Gohman2018-07-021-0/+48
| | | | | | | | | | | LLVM doesn't guarantee anything about the high bits of a register holding an i1 value at the IR level, so don't translate LLVM IR i1 values directly into WebAssembly conditional branch operands. WebAssembly's conditional branches do demand all 32 bits be valid. Fixes PR38019. llvm-svn: 336138
* [X86] Add phony registers for high halves of regs with low halvesKrzysztof Parzyszek2018-07-024-4/+4
| | | | | | | | | | Add registers still missing after r328016 (D43353): - for bits 15-8 of SI, DI, BP, SP (*H), and R8-R15 (*BH), - for bits 31-16 of R8-R15 (*WH). Thanks to Craig Topper for pointing it out. llvm-svn: 336134
* [X86] Don't use aligned load/store instructions for fp128 if the load/store ↵Craig Topper2018-07-022-3/+3
| | | | | | | | | | isn't aligned. Similarily, don't fold fp128 loads into SSE instructions if the load isn't aligned. Unless we're targeting an AMD CPU that doesn't check alignment on arithmetic instructions. Should fix PR38001 llvm-svn: 336121
* [AArch64][GlobalISel] Any-extend vararg parameters to stack slot size on Darwin.Amara Emerson2018-07-021-3/+3
| | | | | | | | | | | We currently don't any-extend vararg parameters before storing them to the stack locations on Darwin. However, SelectionDAG however does this, and so user code is in the wild which inadvertently relies on this extension. This can manifest in cases where the value stored is (int)0, but the actual parameter is interpreted by va_arg as a pointer, and so not extending to 64 bits causes the callee to load additional undefined bits. llvm-svn: 336120
* [WebAssembly] Convert remaining tests from elf to wasm output formatSam Clegg2018-07-023-11/+12
| | | | | | Differential Revision: https://reviews.llvm.org/D48748 llvm-svn: 336116
* [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique valuesSimon Pilgrim2018-07-021-22/+12
| | | | | | We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case. llvm-svn: 336113
* [X86][SSE] Add v8i16 shift test for 2 shift values that doesn't match basic ↵Simon Pilgrim2018-07-021-0/+32
| | | | | | | | blend We have special case support for 2 shift values for basic blends, but irregular shift patterns end up using the generic lowering, despite shuffle lowering being good enough to handle more complex blends. llvm-svn: 336112
* [Mips][FastISel] Do not duplicate condition while lowering branchesPetar Jovanovic2018-07-021-0/+28
| | | | | | | | | | | | | | | | This change fixes the issue that arises when we duplicate condition from the predecessor block. If the condition's arguments are not considered alive across the blocks, fast regalloc gets confused and starts generating reloads from the slots that have never been spilled to. This change also leads to smaller code given that, unlike on architectures with condition codes, on Mips we can branch directly on register value, thus we gain nothing by duplication. Patch by Dragan Mladjenovic. Differential Revision: https://reviews.llvm.org/D48642 llvm-svn: 336084
* [PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's ↵QingShan Zhang2018-07-021-0/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | multiple for i64 pre-inc load/store For the below case, pre-inc prep think it's a good candidate to use pre-inc for the bucket, but 64bit integer load/store update (pre-inc) instruction on Power requires the displacement field should be DS-form (4's multiple). Since it can't satisfy the constraint, we have to do some fix ups later. As below, the original load/stores could be well-form, it makes things worse. unsigned long long result = 0; unsigned long long foo(char *p, unsigned long long n) { for (unsigned long long i = 0; i < n; i++) { unsigned long long x1 = *(unsigned long long *)(p - 50000 + i); unsigned long long x2 = *(unsigned long long *)(p - 61024 + i); unsigned long long x3 = *(unsigned long long *)(p - 62048 + i); unsigned long long x4 = *(unsigned long long *)(p - 64096 + i); result *= x1 * x2 * x3 * x4; } return result; } Patch by jedilyn(Kewen Lin). Differential Revision: https://reviews.llvm.org/D48813 --This line, and those below, will be ignored-- M lib/Target/PowerPC/PPCLoopPreIncPrep.cpp A test/CodeGen/PowerPC/preincprep-i64-check.ll llvm-svn: 336074
* Implement strip.invariant.groupPiotr Padlewski2018-07-021-1/+9
| | | | | | | | | | | | | | | | Summary: This patch introduce new intrinsic - strip.invariant.group that was described in the RFC: Devirtualization v2 Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47103 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 336073
* [X86] Fix a few test names in avx512-intrinsics-fast-isel.ll to match their ↵Craig Topper2018-07-011-8/+8
| | | | | | | | clang intrinsic names. I thought I fixed these yesterday, but I guess I missed a few. llvm-svn: 336071
* [DAGCombiner] Handle correctly non-splat power of 2 -1 divisor (PR37119)Simon Pilgrim2018-06-301-268/+80
| | | | | | | | | | The combine added in commit 329525 overlooked the case where one, but not all, of the divisor elements is -1, -1 is the only power of two value for which the sdiv expansion recipe breaks. Thanks to @zvi for the original patch. Differential Revision: https://reviews.llvm.org/D45806 llvm-svn: 336048
* [X86] Update some avx512 fast-isel tests to match their real clang IRgen.Craig Topper2018-06-302-332/+353
| | | | | | | | | | Especially of note was the test_mm_mask_set1_epi64 and other set1 tests that were truncating the element to be broadcasted to i8 and broadcasting that instead of a whole 64 bit value. Some of the others were just correcting mask sizes on parameters due to bugs in the clang test case they were generated from that have now been fixed. Some were converting i8 to <4 x i1>/<2 x i1> by truncating to i4/i2 and then bitcasting. But the clang codegen is bitcast to <8 x i1>, then extract to <4 x i1>/<2 x i1>. This is likely to incur less trouble from the integer type legalizer in the backend. llvm-svn: 336045
* [X86] Change some chec-prefixes from X32 to X86 to match the FileCheck ↵Craig Topper2018-06-301-48/+48
| | | | | | | | command line. I think this test changed and these test cases were created around the same time and missed the change. llvm-svn: 336044
OpenPOWER on IntegriCloud