path: root/llvm/lib/Target/X86/X86InstrFragmentsSIMD.td
Commit message | Author | Age | Files | Lines
...
* AVX-512: added arithmetic and logical operations. | Elena Demikhovsky | 2013-08-19 | 1 | -4/+14
    ADD, SUB, MUL integer and FP types. OR, AND, XOR. Added embedded broadcast form for these instructions.
    llvm-svn: 188673
* AVX-512: Added VMOVD, VMOVQ, VMOVSS, VMOVSD instructions. | Elena Demikhovsky | 2013-08-18 | 1 | -1/+3
    llvm-svn: 188637
* Don't use v16i32 for load pattern matching. All 512-bit loads are cast to v8i64. | Craig Topper | 2013-08-16 | 1 | -5/+5
    llvm-svn: 188534
* AVX-512: Added CMP and BLEND instructions. | Elena Demikhovsky | 2013-08-13 | 1 | -0/+9
    Lowering for SETCC.
    llvm-svn: 188265
* AVX-512: Added VPERM* instructions and MOV* zmm-to-zmm instructions. | Elena Demikhovsky | 2013-08-11 | 1 | -1/+39
    Added a test for shuffles using VPERM.
    llvm-svn: 188147
* AVX-512 set: Added BROADCAST instructions | Elena Demikhovsky | 2013-08-07 | 1 | -1/+4
    with lowering logic and a test.
    llvm-svn: 187884
* AVX-512 set: added mask operations, lowering BUILD_VECTOR for i1 vector types. | Elena Demikhovsky | 2013-08-05 | 1 | -0/+2
    Added intrinsics and tests.
    llvm-svn: 187717
* X86: Turn fp selects into mask operations. | Benjamin Kramer | 2013-08-04 | 1 | -0/+2
    double test(double a, double b, double c, double d) { return a<b ? c : d; }

    before:
    _test:
            ucomisd %xmm0, %xmm1
            ja      LBB0_2
            movaps  %xmm3, %xmm2
    LBB0_2:
            movaps  %xmm2, %xmm0

    after:
    _test:
            cmpltsd %xmm1, %xmm0
            andpd   %xmm0, %xmm2
            andnpd  %xmm3, %xmm0
            orpd    %xmm2, %xmm0

    Small speedup on Benchmarks/SmallPT
    llvm-svn: 187706
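The branch-free sequence in the commit above (compare to a mask, AND, AND-NOT, OR) can be modelled in scalar C. This is only an illustrative sketch: the helper names bits_of, double_of and select_lt are hypothetical and do not come from the LLVM sources, and a compiler is free to lower the mask computation however it likes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: move a double's bit pattern to and from a 64-bit integer. */
static uint64_t bits_of(double x) { uint64_t u; memcpy(&u, &x, sizeof u); return u; }
static double double_of(uint64_t u) { double x; memcpy(&x, &u, sizeof x); return x; }

/* Scalar model of the cmpltsd/andpd/andnpd/orpd sequence: a < b ? c : d. */
double select_lt(double a, double b, double c, double d) {
    uint64_t mask = (a < b) ? ~UINT64_C(0) : 0;  /* cmpltsd: all ones or all zeros */
    uint64_t keep_c = mask & bits_of(c);         /* andpd: c where the mask is set */
    uint64_t keep_d = ~mask & bits_of(d);        /* andnpd: d where the mask is clear */
    return double_of(keep_c | keep_d);           /* orpd: merge the two halves */
}
```

Because exactly one of keep_c and keep_d is nonzero for any mask, the OR reproduces the selected operand bit for bit.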
* Added INSERT and EXTRACT instructions from AVX-512 ISA. | Elena Demikhovsky | 2013-07-31 | 1 | -14/+40
    All insertf*/extractf* functions replaced with insert/extract since we have insertf and inserti forms.
    Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
    Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
    Added a test.
    llvm-svn: 187491
* Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction. | Craig Topper | 2013-01-28 | 1 | -1/+1
    llvm-svn: 173667
* X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics | Benjamin Kramer | 2012-12-21 | 1 | -0/+5
    This is very mechanical, no functionality change. Preparation for PR14667.
    llvm-svn: 170898
* X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible. | Benjamin Kramer | 2012-12-15 | 1 | -0/+1
    We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases.
    The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613.
    llvm-svn: 170273
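The pattern named in the commit above is an unsigned saturating subtraction. A minimal scalar sketch for 16-bit elements follows; the function name subus16 is hypothetical, and the element width is an assumption chosen for illustration rather than something the commit message specifies.

```c
#include <stdint.h>

/* "x >= y ? x-y : 0": unsigned subtraction that clamps at zero instead of wrapping. */
uint16_t subus16(uint16_t x, uint16_t y) {
    return x >= y ? (uint16_t)(x - y) : 0;
}
```

Applied element-wise over an array in a loop, this is the shape of select that a vectorizer can hand to the backend as a vector select, which the combine then folds into one saturating subtract per vector.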
* Simplified BLEND pattern matching for shuffles. | Elena Demikhovsky | 2012-12-05 | 1 | -3/+1
    Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.
    llvm-svn: 169366
* Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1 | Michael Liao | 2012-10-23 | 1 | -0/+8
    llvm-svn: 166486
* Add support for FP_ROUND from v2f64 to v2f32 | Michael Liao | 2012-10-10 | 1 | -0/+3
    - Due to the current matching vector elements constraints in ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from v2f32) is scalarized. Add a customized v2f32 widening to convert it into a target-specific X86ISD::VFPROUND to work around this constraint.
    llvm-svn: 165631
* Enhance PR11334 fix to support extload from v2f32/v4f32 | Michael Liao | 2012-09-10 | 1 | -0/+4
    - Fix a remaining issue of PR11674 as well
    llvm-svn: 163528
* Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. | Craig Topper | 2012-08-29 | 1 | -2/+2
    llvm-svn: 162829
* When unsafe math is used, we can use commutative FMAX and FMIN. In some cases | Nadav Rotem | 2012-08-19 | 1 | -2/+10
    this allows for better code generation.
    Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and FMINC, which are commutative.

    For example:
            movaps %xmm0, %xmm1
            movsd  LC(%rip), %xmm0
            minsd  %xmm1, %xmm0
    becomes:
            minsd  LC(%rip), %xmm0

    llvm-svn: 162187
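The reason the plain min/max nodes cannot be commuted without unsafe math is visible in a scalar model of minsd, which yields its second operand whenever the comparison fails, including when either input is NaN. The function name min_like below is hypothetical; it is only a sketch of that operand-order dependence.

```c
#include <math.h>

/* Scalar model of minsd-style semantics: the second operand wins unless a < b holds. */
double min_like(double a, double b) {
    return a < b ? a : b;
}
```

With one NaN input, min_like(NAN, 1.0) gives 1.0 while min_like(1.0, NAN) gives NaN, so swapping operands changes the result. Under unsafe math the compiler may assume NaNs do not occur, which is what allows the operands to be swapped and the load folded as in the example above.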
* fix PR11334 | Michael Liao | 2012-08-14 | 1 | -0/+5
    - FP_EXTEND only supports extending from vectors with matching elements. This results in the scalarization of extending to v2f64 from v2f32, which will be legalized to v4f32 not matching with v2f64.
    - add X86-specific VFPEXT supporting extending from v4f32 to v2f64.
    - add BUILD_VECTOR lowering helper to recover back the original extending from v4f32 to v2f64.
    - test case is enhanced to include different vector width.
    llvm-svn: 161894
* Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. | Craig Topper | 2012-08-06 | 1 | -0/+11
    llvm-svn: 161318
* Added FMA functionality to X86 target. | Elena Demikhovsky | 2012-08-01 | 1 | -4/+13
    llvm-svn: 161110
* Remove tabs. | Bill Wendling | 2012-07-19 | 1 | -2/+2
    llvm-svn: 160477
* Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type. | Craig Topper | 2012-06-09 | 1 | -7/+0
    llvm-svn: 158279
* ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 | Elena Demikhovsky | 2012-04-22 | 1 | -1/+6
    llvm-svn: 155309
* Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps. | Craig Topper | 2012-04-16 | 1 | -3/+1
    llvm-svn: 154798
* Merge vpermps/vpermd and vpermpd/vpermq SD nodes. | Craig Topper | 2012-04-16 | 1 | -4/+2
    llvm-svn: 154782
* Fix SDTypeProfile for vpermps. The mask operand should be v8i32. | Craig Topper | 2012-04-16 | 1 | -2/+4
    llvm-svn: 154781
* Added VPERM optimization for AVX2 shuffles | Elena Demikhovsky | 2012-04-15 | 1 | -0/+4
    llvm-svn: 154761
* Reapply 154396 after fixing a test. | Nadav Rotem | 2012-04-11 | 1 | -0/+6
    Original message:
    Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
    blendV uses a register for the selection while Vblend uses an immediate.
    On sandybridge they still have the same latency and execute on the same execution ports.
    llvm-svn: 154483
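The difference between the two blend forms mentioned above (selection from a register versus selection from an immediate) can be sketched with a four-lane scalar model. The function names blend_imm4 and blend_var4 are hypothetical, and the four-float layout is an assumption for illustration; the lane rules follow the blendps-style convention that a set control bit takes the lane from the second source.

```c
#include <stdint.h>
#include <string.h>

/* Immediate-controlled blend: bit i of imm picks lane i from b, otherwise from a. */
void blend_imm4(float dst[4], const float a[4], const float b[4], unsigned imm) {
    for (int i = 0; i < 4; ++i)
        dst[i] = ((imm >> i) & 1u) ? b[i] : a[i];
}

/* Variable blend: the sign bit of mask lane i picks lane i from b, otherwise from a. */
void blend_var4(float dst[4], const float a[4], const float b[4], const float mask[4]) {
    for (int i = 0; i < 4; ++i) {
        uint32_t m;
        memcpy(&m, &mask[i], sizeof m);  /* inspect the raw bits of the mask lane */
        dst[i] = (m >> 31) ? b[i] : a[i];
    }
}
```

A shuffle whose lane choices are known at compile time can encode them directly in the immediate, so no mask register has to be materialized.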
* Temporarily revert this patch to see if it brings the buildbots back. | Eric Christopher | 2012-04-10 | 1 | -6/+0
    llvm-svn: 154425
* Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. | Nadav Rotem | 2012-04-10 | 1 | -0/+6
    blendv uses a register for the selection while vblend uses an immediate.
    On sandybridge they still have the same latency and execute on the same execution ports.
    llvm-svn: 154396
* Fix a regression from r147481. | Chad Rosier | 2012-03-09 | 1 | -0/+5
    Original commit message from r147481:
    DAGCombine for transforming 128->256 casts into a vmovaps, rather than a vxorps + vinsertf128 pair if the original vector came from a load.
    Fix: Unaligned loads need to generate a vmovups.
    rdar://10974078
    llvm-svn: 152366
* some comment fix for X86 and ARM | Jia Liu | 2012-02-19 | 1 | -1/+1
    llvm-svn: 150902
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. | Jia Liu | 2012-02-18 | 1 | -2/+2
    llvm-svn: 150878
* Remove the last of the old vector_shuffle patterns from X86 isel. | Craig Topper | 2012-02-17 | 1 | -26/+0
    llvm-svn: 150795
* Move old movl vector_shuffle patterns. Not needed anymore since vector_shuffles shouldn't reach isel. | Craig Topper | 2012-02-14 | 1 | -5/+0
    llvm-svn: 150462
* Still more vector_shuffle pattern removal. | Craig Topper | 2012-02-13 | 1 | -10/+0
    llvm-svn: 150365
* Recommit r150328. Previous test failures should be fixed by r150360. | Craig Topper | 2012-02-13 | 1 | -27/+0
    llvm-svn: 150362
* Revert r150328, "Remove more vector_shuffle patterns." | NAKAMURA Takumi | 2012-02-13 | 1 | -0/+27
    It caused 3 failures on pre-penryn and non-x86(generic) hosts.
    llvm-svn: 150357
* Remove more vector_shuffle patterns. | Craig Topper | 2012-02-12 | 1 | -27/+0
    llvm-svn: 150328
* Remove more vector_shuffle patterns. | Craig Topper | 2012-02-12 | 1 | -5/+0
    llvm-svn: 150321
* Remove some patterns for matching vector_shuffle instructions since vector_shuffles should be custom lowered before isel. | Craig Topper | 2012-02-11 | 1 | -11/+0
    llvm-svn: 150299
* Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies. | Craig Topper | 2012-02-05 | 1 | -0/+4
    llvm-svn: 149807
* Optimization for SIGN_EXTEND operation on AVX. | Elena Demikhovsky | 2012-02-02 | 1 | -0/+3
    Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32 extensions.
    llvm-svn: 149600
* Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic patterns with custom lowering to target specific nodes. | Craig Topper | 2012-01-30 | 1 | -0/+7
    llvm-svn: 149216
* Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns. | Craig Topper | 2012-01-25 | 1 | -1/+1
    llvm-svn: 148933
* Add comments near load pattern fragments indicating that all integer vector loads are promoted to v2i64 or v4i64 so that no one tries to reintroduce pattern fragments for other types. | Craig Topper | 2012-01-24 | 1 | -0/+6
    llvm-svn: 148771
* Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32 loads. All integer vector loads are promoted to v2i64 or v4i64 so these pattern fragments can never match. Fix or remove patterns that used these fragments. | Craig Topper | 2012-01-23 | 1 | -12/+0
    llvm-svn: 148672
* Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching. | Craig Topper | 2012-01-22 | 1 | -2/+1
    llvm-svn: 148670
* Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. Simplifying opcode selection and pattern matching. | Craig Topper | 2012-01-22 | 1 | -8/+2
    llvm-svn: 148667