path: root/llvm/test/CodeGen/X86
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Add knownbits vector MUL testSimon Pilgrim2016-11-101-0/+23
| | | | | | In preparation for demandedelts support llvm-svn: 286463
* [SelectionDAG] Add support for vector demandedelts in SRA opcodesSimon Pilgrim2016-11-101-8/+2
| | | | llvm-svn: 286461
* [X86] Add knownbits vector arithmetic shift testSimon Pilgrim2016-11-101-0/+23
| | | | | | In preparation for demandedelts support llvm-svn: 286457
* [DAGCombiner] Correctly extract the ConstOrConstSplat shift value for SHL nodesSimon Pilgrim2016-11-101-2/+0
| | | | | | | | We were failing to extract a constant splat shift value if the shifted value was being masked. The (shl (and (setcc) N01CV) N1CV) -> (and (setcc) N01CV<<N1CV) combine was unnecessarily preventing this. llvm-svn: 286454
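The identity behind the combine named above can be checked numerically. The following is a minimal sketch, not the DAGCombiner code: it assumes 32-bit lanes and assumes the setcc result is either all zeros or all ones (the case in which masking then shifting equals masking with the pre-shifted constant). All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: model a 32-bit lane


def shl_of_masked_setcc(setcc, mask_const, shift):
    """(shl (and setcc, C1), C2), truncated to 32 bits."""
    return ((setcc & mask_const) << shift) & MASK32


def masked_setcc_with_shifted_const(setcc, mask_const, shift):
    """(and setcc, C1 << C2), with the constant shifted ahead of time."""
    return setcc & ((mask_const << shift) & MASK32)


def forms_agree(mask_const, shift):
    """Compare both forms for the two assumed setcc results: 0 and all ones."""
    return all(
        shl_of_masked_setcc(s, mask_const, shift)
        == masked_setcc_with_shifted_const(s, mask_const, shift)
        for s in (0, MASK32)
    )
```

Because the rewritten form no longer contains a shift by N1CV, a later check that looks for a constant shift amount on the SHL node has nothing to match once this combine has fired.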
* [DAGCombiner] Show missed opportunity to UNDEF out-of-range SHLSimon Pilgrim2016-11-101-0/+15
| | | | | | Fails to match constant shift value due to presence of AND mask. llvm-svn: 286452
* [SelectionDAG] Add support for vector demandedelts in SHL/SRL opcodesSimon Pilgrim2016-11-101-8/+2
| | | | llvm-svn: 286448
* [X86] Add knownbits vector logical shift testSimon Pilgrim2016-11-101-0/+23
| | | | | | In preparation for demandedelts support llvm-svn: 286447
* [AVX-512] Allow legacy cvtpd2dq intrinsics to select EVEX encoded ↵Craig Topper2016-11-102-10/+20
| | | | | | instruction when available. llvm-svn: 286435
* [AVX-512][X86] Convert avx_cvtt_ps2dq_256 and sse2_cvttps2dq intrinsics to ↵Craig Topper2016-11-102-12/+27
| | | | | | | | ISD::FP_TO_SINT in the intrinsics table and delete patterns. While nearby also move CVTDQ2PS patterns into their instructions. This allows these intrinsics to also use EVEX instructions. llvm-svn: 286434
* [X86] Convert int_x86_avx_cvtt_pd2dq_256 to fp_to_sint using the intrinsics ↵Craig Topper2016-11-101-1/+1
| | | | | | table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available. llvm-svn: 286433
* [AVX-512] Add test cases to show missed opportunities for using VALIGND/Q to ↵Craig Topper2016-11-104-0/+145
| | | | | | handle shuffles. llvm-svn: 286425
* Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which ↵Peter Collingbourne2016-11-091-0/+22
| | | | | | | | | represents a relocatable immediate.", with a fix for 32-bit x86. Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions that take a global address operand. llvm-svn: 286420
* [AVX-512] Add lowering to cvttpd2udq/cvttps2udq for fptoui v2f64/2f32 to 2i32Craig Topper2016-11-091-43/+27
| | | | | | | | | | | | This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions. If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result. It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result. Differential Revision: https://reviews.llvm.org/D26331 llvm-svn: 286345
* [X86] Lower AVX512 and SSE intrinsics for CVTTPD2DQ to X86ISD::CVTTPD2DQ.Craig Topper2016-11-092-17/+19
| | | | | | | | | | | | Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves. Reviewers: delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26330 llvm-svn: 286344
* [AVX-512] Add more varied alignments to tests for storing the lower 128-bits ↵Craig Topper2016-11-091-0/+132
| | | | | | of a 256 or 512-bit subvector extract. llvm-svn: 286343
* [AVX-512] Use alignedstore256 in patterns that look for stores of the lower ↵Craig Topper2016-11-091-5/+5
| | | | | | | | 256-bits of a 512-bit vector to use a 256-bit aligned store. Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947. llvm-svn: 286342
* [AVX-512] Add test cases to demonstrate PR30947. We accidentally use 32 byte ↵Craig Topper2016-11-091-0/+144
| | | | | | aligned store instructions when the original store was only 16 byte aligned if the store is from the lower bits of a subvector extract. llvm-svn: 286341
* [AVX-512] Make VBMI instruction set enabling imply that the BWI instruction ↵Craig Topper2016-11-092-26/+26
| | | | | | | | | | | | | | | set is also enabled. Summary: This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912. Reviewers: delena, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26322 llvm-svn: 286339
* [ValueTracking] recognize obfuscated variants of umin/umaxSanjay Patel2016-11-091-16/+4
The smallest tests that expose this are codegen tests (because SelectionDAGBuilder::visitSelect() uses matchSelectPattern to create UMAX/UMIN nodes), but it's also possible to see the effects in IR alone with folds of min/max pairs.

If these were written as unsigned compares in IR, InstCombine canonicalizes the unsigned compares to signed compares. Ie, running the optimizer pessimizes the codegen for this case without this patch:

    define <4 x i32> @umax_vec(<4 x i32> %x) {
      %cmp = icmp ugt <4 x i32> %x, <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
      %sel = select <4 x i1> %cmp, <4 x i32> %x, <4 x i32> <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
      ret <4 x i32> %sel
    }

    $ ./opt umax.ll -S | ./llc -o - -mattr=avx
    vpmaxud LCPI0_0(%rip), %xmm0, %xmm0

    $ ./opt -instcombine umax.ll -S | ./llc -o - -mattr=avx
    vpxor %xmm1, %xmm1, %xmm1
    vpcmpgtd %xmm0, %xmm1, %xmm1
    vmovaps LCPI0_0(%rip), %xmm2 ## xmm2 = [2147483647,2147483647,2147483647,2147483647]
    vblendvps %xmm1, %xmm0, %xmm2, %xmm0

Differential Revision: https://reviews.llvm.org/D26096
llvm-svn: 286318
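The "obfuscated" form in this commit rests on a scalar identity: for a 32-bit value, `x >u 2147483647` holds exactly when `x <s 0`. The sketch below illustrates that identity only; it is not the ValueTracking implementation, the function names are hypothetical, and inputs are assumed to be unsigned 32-bit integers.

```python
INT32_MAX = 0x7FFFFFFF  # 2147483647, the splat constant from the IR example


def as_signed32(x):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x > INT32_MAX else x


def umax_via_unsigned_compare(x):
    """select (icmp ugt x, INT32_MAX), x, INT32_MAX"""
    return x if x > INT32_MAX else INT32_MAX


def umax_via_signed_compare(x):
    """select (icmp slt x, 0), x, INT32_MAX: the signed-compare form."""
    return x if as_signed32(x) < 0 else INT32_MAX
```

Both functions compute the same unsigned maximum, which is why recognizing the signed-compare select as a umax lets codegen return to the single `vpmaxud`.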
* [X86][SSE] Regenerate test (just adds missing header)Simon Pilgrim2016-11-081-0/+1
| | | | llvm-svn: 286241
* [TargetLowering] Fix undef vector element issue with true/false result handlingSimon Pilgrim2016-11-083-19/+12
| | | | | | | | | | | | | | Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching. The comment said we shouldn't handle constant splat vectors with undef elements. But the actual code was returning false if the build vector contained no undef elements.... This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs). The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed. Differential Revision: https://reviews.llvm.org/D26031 llvm-svn: 286238
* [VectorLegalizer] Expansion of CTLZ using CTPOP when possibleSimon Pilgrim2016-11-082-709/+604
| | | | | | | | | | This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when it's useful. Differential Revision: https://reviews.llvm.org/D25910 llvm-svn: 286233
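The CTLZ-via-CTPOP trick referenced above can be sketched on a single scalar. This is an illustration of the "Hacker's Delight" technique under the assumption of 32-bit elements, not the VectorLegalizer code; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit elements


def ctpop32(x):
    """Population count of a 32-bit value."""
    return bin(x & MASK32).count("1")


def ctlz32_via_ctpop(x):
    """Count leading zeros using only shifts, ORs, a NOT and one CTPOP."""
    x &= MASK32
    # Smear the highest set bit into every lower position.
    for shift in (1, 2, 4, 8, 16):
        x |= x >> shift
    # The bits still clear are exactly the leading zeros.
    return ctpop32(~x & MASK32)
```

Each step maps onto an ordinary element-wise vector operation, so nothing has to be scalarized as long as the target supports the shift, OR and CTPOP forms involved.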
* [AVX-512] Add an avx512f without avx512vl command line to vec_fp_to_int.ll ↵Craig Topper2016-11-081-129/+454
| | | | | | and regenerate. This will make a change in a future patch easier to see. NFC llvm-svn: 286216
* [X86][AVX512] Add AVX512VL/AVX512BWVL vector truncation testsSimon Pilgrim2016-11-071-65/+313
| | | | llvm-svn: 286105
* [X86][SSE] Drop unnecessary -mcpu argument from trunc testsSimon Pilgrim2016-11-071-7/+7
| | | | | | cpu/triple duplication llvm-svn: 286104
* [AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to ↵Craig Topper2016-11-078-779/+778
| | | | | | | | selects and native zext/sext. This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select. llvm-svn: 286092
* [AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to ↵Craig Topper2016-11-072-32/+33
| | | | | | legacy intrinsics and a select. llvm-svn: 286089
* [SelectionDAG] Add support for vector demandedelts in XOR opcodesSimon Pilgrim2016-11-061-10/+2
| | | | llvm-svn: 286075
* [X86] Add knownbits vector xor testSimon Pilgrim2016-11-061-0/+31
| | | | | | In preparation for demandedelts support llvm-svn: 286074
* [AVX-512] Remove intrinsics for 128/256-bit masked variable shift. Instead ↵Craig Topper2016-11-062-215/+215
| | | | | | upgrade them to a select and the older AVX2 intrinsic. llvm-svn: 286073
* [AVX-512] Remove intrinsics for 128/256-bit masked shift by immediate. ↵Craig Topper2016-11-064-325/+244
| | | | | | Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286072
* [SelectionDAG] Add support for vector demandedelts in OR opcodesSimon Pilgrim2016-11-061-10/+2
| | | | llvm-svn: 286071
* [AVX-512] Remove intrinsics for 128/256-bit masked shift by single element ↵Craig Topper2016-11-064-303/+304
| | | | | | in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286070
* [AVX-512] Remove a 512-bit test case from the avx512vl test file. It ↵Craig Topper2016-11-061-20/+0
| | | | | | already exists in the avx512f test file. llvm-svn: 286069
* [X86] Add knownbits vector or testSimon Pilgrim2016-11-061-0/+31
| | | | | | In preparation for demandedelts support llvm-svn: 286068
* [X86] Add a few more fptoui test cases to the vec_fp_to_int.ll. The codegen ↵Craig Topper2016-11-061-0/+135
| | | | | | for these test cases will be improved for AVX512 in a future commit. llvm-svn: 286063
* [AVX-512] Add missing EVEX version of pattern for (v2f64 (extloadv2f32 ↵Craig Topper2016-11-061-2/+2
| | | | | | addr:)) -> VCVTPS2PDZ128rm llvm-svn: 286059
* [AVX-512] Add avx512vl command line to the fpext test and add ↵Craig Topper2016-11-061-80/+146
| | | | | | -show-mc-encoding to show where we aren't using EVEX instructions. llvm-svn: 286058
* [AVX-512] Lower AVX cvtpd2ps intrinsic to ISD::FP_ROUND so it can use EVEX ↵Craig Topper2016-11-061-1/+1
| | | | | | instruction when available. llvm-svn: 286057
* [AVX-512] Lower SSE/AVX cvtdq2ps intrinsics directly to ISD::SINT_TO_FP so ↵Craig Topper2016-11-062-12/+27
| | | | | | they can use EVEX instructions when available. llvm-svn: 286056
* [AVX-512] Add -show-mc-encoding to legacy vector intrinsic tests so we can ↵Craig Topper2016-11-068-4060/+4018
| | | | | | see when VEX or EVEX encoded instructions are being emitted. Make sure the tests all have an avx2 command line and an skx command line. llvm-svn: 286055
* [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any ↵Justin Lebar2016-11-051-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | valid int64_t to the set. Summary: SmallSetVector uses DenseSet, but that means we need to reserve some values for the empty and tombstone keys. It seems to me we should have a general way to let us store full-range ints inside of DenseSets, and furthermore that we probably shouldn't silently let you add ints into DenseSets without explicitly promising that they're in range. But that's a battle for another day; for now, just fix this code, since we currently do something Very Bad when compiling ffmpeg. Fixes PR30914. Reviewers: jeremyhu Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D26323 llvm-svn: 286038
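The hazard described in this commit can be shown with a toy open-addressing set that, like the scheme the message describes, reserves key values to mark empty and erased slots. This is a sketch only: the class, its capacity, and the specific sentinel values (the extremes of the signed 64-bit range) are assumptions for illustration, not LLVM's DenseSet.

```python
# Assumed sentinels: the two extreme int64_t values.
EMPTY_KEY = 2**63 - 1
TOMBSTONE_KEY = -(2**63)  # reserved for erased slots; erase is omitted here


class ToySentinelSet:
    """Fixed-capacity linear-probing set that marks free slots in-band."""

    def __init__(self, capacity=8):
        self.slots = [EMPTY_KEY] * capacity

    def insert(self, key):
        # A real key equal to a sentinel cannot be stored unambiguously.
        # This toy rejects it loudly instead of misbehaving silently.
        if key in (EMPTY_KEY, TOMBSTONE_KEY):
            raise ValueError("key collides with a reserved sentinel")
        n = len(self.slots)
        i = hash(key) % n
        for _ in range(n):
            if self.slots[i] == key:
                return False  # already present
            if self.slots[i] == EMPTY_KEY:
                self.slots[i] = key
                return True
            i = (i + 1) % n
        raise RuntimeError("set is full")

    def contains(self, key):
        n = len(self.slots)
        i = hash(key) % n
        for _ in range(n):
            if self.slots[i] == key:
                return True
            if self.slots[i] == EMPTY_KEY:
                return False
            i = (i + 1) % n
        return False
```

Any code that may insert an arbitrary 64-bit value therefore needs a container that stores occupancy out-of-band, or an explicit range check before insertion.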
* [X86] Broadcast from memory instructions aren't unfoldableZvi Rackover2016-11-041-1/+0
| | | | | | | | Broadcast from memory instructions should be treated as moves. They can't be unfolded. Fixes pr30693. llvm-svn: 285998
* Add bugpoint-reduced reproducer for pr30693Zvi Rackover2016-11-041-0/+148
| | | | llvm-svn: 285997
* [AVX-512] Use 'vnot' instead of 'not' in patterns involving vXi1 vectors.Craig Topper2016-11-032-9/+8
| | | | | | | | | | | | This fixes selection of KANDN instructions and allows us to remove an extra set of patterns for KNOT and KXNOR. Reviewers: delena, igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26134 llvm-svn: 285878
* Expandload and Compressstore intrinsicsElena Demikhovsky2016-11-031-0/+247
| | | | | | | | 2 new intrinsics covering AVX-512 compress/expand functionality. This implementation includes syntax, DAG builder, operation lowering and tests. Does not include: handling of illegal data types, codegen prepare pass and the cost model. llvm-svn: 285876
* [DAG] disable nsw/nuw for add/sub/mul when simplifying based on demanded ↵Sanjay Patel2016-10-311-0/+25
| | | | | | | | | | | | | | | | bits (PR30841) This bug was exposed by using nsw/nuw for more aggressive folds in: https://reviews.llvm.org/rL284844 The changes mimic the IR demanded bits logic in InstCombiner::SimplifyDemandedUseBits(), but we can't just flip flag bits in the DAG; we have to create a new node that has the bits cleared. This should fix: https://llvm.org/bugs/show_bug.cgi?id=30841 llvm-svn: 285656
* Add triple to test so it does not fail on windows.Manuel Klimek2016-10-311-1/+1
| | | | llvm-svn: 285560
* Delete .s file that did not test anything, and check in test that works.Manuel Klimek2016-10-312-20/+27
| | | | | | | In D26098, Davide Italiano submitted a .s file instead of the .ll file that was the last stage of the review. llvm-svn: 285559
* [AVX-512] Add missing patterns for selecting masked vector extracts that ↵Craig Topper2016-10-311-0/+229
| | | | | | started from shuffles. llvm-svn: 285546