path: root/llvm/lib/Target/X86/X86InstrFragmentsSIMD.td
Commit log (most recent first): message, then Author | Date | Files | Lines | Revision
...
* Add target-specific ISD node types for SSE/AVX vector shuffle instructions and change
  all the code that used to create intrinsic nodes to create the new nodes instead.
  Craig Topper | 2012-01-22 | 1 file | -2/+16 | llvm-svn: 148664
* Merge 128-bit and 256-bit SHUFPS/SHUFPD handling.
  Craig Topper | 2012-01-19 | 1 file | -2/+2 | llvm-svn: 148466
* Merge X86 SHUFPS and SHUFPD node types.
  Craig Topper | 2011-12-31 | 1 file | -2/+1 | llvm-svn: 147394
* Remove an unused X86ISD node type.
  Craig Topper | 2011-12-17 | 1 file | -1/+0 | llvm-svn: 146833
* Remove some remnants of the old palign pattern fragment that were still hanging
  around. Also remove a cast from inside getShuffleVPERM2X128Immediate and
  getShuffleVPERMILPImmediate, since the only caller had already done the cast.
  Craig Topper | 2011-12-11 | 1 file | -6/+0 | llvm-svn: 146344
* Merge floating-point and integer UNPCK X86ISD node types.
  Craig Topper | 2011-12-06 | 1 file | -5/+2 | llvm-svn: 145926
* Merge VPERM2F128/VPERM2I128 ISD node types.
  Craig Topper | 2011-11-30 | 1 file | -2/+1 | llvm-svn: 145485
* Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge the X86ISD node type
  for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.
  Craig Topper | 2011-11-30 | 1 file | -2/+1 | llvm-svn: 145483
* Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge
  VPERMILPS/VPERMILPD detection since they are pretty similar.
  Craig Topper | 2011-11-28 | 1 file | -0/+1 | llvm-svn: 145238
* Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify
  some shuffle lowering code, since V1 can never be UNDEF due to the canonicalizing
  that occurs when shuffle nodes are created.
  Craig Topper | 2011-11-26 | 1 file | -2/+0 | llvm-svn: 145153
* Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* so they are
  not type specific. Now we just have integer high and low and floating-point high and
  low. Pattern matching will choose the correct instruction based on the vector type.
  Craig Topper | 2011-11-26 | 1 file | -14/+4 | llvm-svn: 145148
* Remove 256-bit specific node types for UNPCKHPS/D; instead use the 128-bit versions
  and let the operand type distinguish. Also fix the load form of the v8i32 patterns
  for these to account for the load being promoted to v4i64.
  Craig Topper | 2011-11-24 | 1 file | -4/+0 | llvm-svn: 145126
* Remove AVX2-specific X86ISD node types for PUNPCKH/L; instead just reuse the 128-bit
  versions and let the vector type distinguish.
  Craig Topper | 2011-11-24 | 1 file | -8/+0 | llvm-svn: 145125
* Add lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
  Craig Topper | 2011-11-21 | 1 file | -0/+2 | llvm-svn: 145028
* Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, and i64 if
  AVX2 is enabled.
  Craig Topper | 2011-11-21 | 1 file | -2/+8 | llvm-svn: 145026
* Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of
  appropriate shuffle vectors.
  Craig Topper | 2011-11-19 | 1 file | -0/+2 | llvm-svn: 144989
* Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
  Craig Topper | 2011-11-19 | 1 file | -7/+1 | llvm-svn: 144988
* Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
  Craig Topper | 2011-11-19 | 1 file | -3/+3 | llvm-svn: 144987
* Add a bunch more X86 AVX2 instructions and their corresponding intrinsics.
  Craig Topper | 2011-11-02 | 1 file | -1/+4 | llvm-svn: 143529
* Synthesize SSE3/AVX 128-bit horizontal add/sub instructions from floating-point
  add/sub of appropriate shuffle vectors. Does not synthesize the 256-bit AVX versions
  because they work differently.
  Duncan Sands | 2011-09-22 | 1 file | -0/+2 | llvm-svn: 140332
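  For illustration, a minimal sketch (LLVM IR, not part of the commit; the function
  name is made up) of the shuffle/fadd shape this combine recognizes. With SSE3
  available, a pattern like this should select haddps:

    define <4 x float> @hadd(<4 x float> %a, <4 x float> %b) {
      %lhs = shufflevector <4 x float> %a, <4 x float> %b,
                           <4 x i32> <i32 0, i32 2, i32 4, i32 6>
      %rhs = shufflevector <4 x float> %a, <4 x float> %b,
                           <4 x i32> <i32 1, i32 3, i32 5, i32 7>
      ; pairwise sums: a0+a1, a2+a3, b0+b1, b2+b3
      %sum = fadd <4 x float> %lhs, %rhs
      ret <4 x float> %sum
    }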
* Add 256-bit versions of alignedstore and alignedload, to be more strict about the
  alignment checking. This was found by inspection and I don't have any testcases so
  far, although the llvm testsuite runs without any problem.
  Bruno Cardoso Lopes | 2011-09-13 | 1 file | -6/+17 | llvm-svn: 139625
* Format patterns, remove unused X86blend patterns.
  Nadav Rotem | 2011-09-12 | 1 file | -3/+0 | llvm-svn: 139491
* Implement vector-select support for avx256. Refactor the vblend implementation to
  have tablegen match the instruction by the node type.
  Nadav Rotem | 2011-09-09 | 1 file | -8/+2 | llvm-svn: 139400
* Add AVX versions of blend vector operations and fix some issues noticed in Nadav's
  r139285 and r139287 commits:
  1) Rename vsel.ll to a more descriptive name.
  2) Change the order of BLEND operands to "Op1, Op2, Cond"; this is necessary because
     PBLENDVB is already used in different places with this order, and it was being
     emitted in the wrong way for vselect.
  3) Add AVX patterns and tests for the same SSE41 instructions.
  Bruno Cardoso Lopes | 2011-09-08 | 1 file | -2/+2 | llvm-svn: 139305
* Add X86-SSE4 codegen support for vector-select.
  Nadav Rotem | 2011-09-08 | 1 file | -1/+7 | llvm-svn: 139285
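  For illustration, a sketch (LLVM IR, not from the commit; the function name is made
  up) of the generic vector-select this lowering targets; with SSE4.1, a select of
  this shape can be matched to blendvps rather than scalarized:

    define <4 x float> @vsel(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
      ; per-element select: take %a where %mask is true, else %b
      %r = select <4 x i1> %mask, <4 x float> %a, <4 x float> %b
      ret <4 x float> %r
    }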
* Introduce matching patterns for the vbroadcast AVX instruction. The idea is to match
  splats of the form (splat (scalar_to_vector (load ...))) whenever the load can be
  folded. All the logic and instruction emission is working, but because of PR8156
  there is no way to match the loads, since they can never be folded for splats. Thus
  the tests are XFAILed, but I've tested and exercised all the logic using a relaxed
  version of the foldable-load check, as if the bug were already fixed. This should
  work out of the box once PR8156 gets fixed, since MayFoldLoad will work as expected.
  Bruno Cardoso Lopes | 2011-08-17 | 1 file | -0/+4 | llvm-svn: 137810
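  For illustration, a sketch (era-appropriate LLVM IR, not from the commit; names are
  made up) of the splat-of-a-loaded-scalar shape these patterns target, which
  vbroadcast can implement once the load can be folded:

    define <4 x float> @splat_load(float* %p) {
      %x = load float* %p
      %v = insertelement <4 x float> undef, float %x, i32 0
      ; splat element 0 across the whole vector
      %s = shufflevector <4 x float> %v, <4 x float> undef, <4 x i32> zeroinitializer
      ret <4 x float> %s
    }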
* VPERM2F128 is an AVX instruction which permutes between two 256-bit vectors. It
  operates on 128-bit elements instead of regular scalar types. Recognize shuffles
  that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them.
  Bruno Cardoso Lopes | 2011-08-12 | 1 file | -0/+2 | llvm-svn: 137519
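  For illustration, a sketch (LLVM IR, not from the commit; the function name is made
  up) of a shuffle whose mask moves whole 128-bit lanes, which is the shape VPERM2F128
  can implement directly:

    define <8 x float> @lane_permute(<8 x float> %a, <8 x float> %b) {
      ; high lane of %a followed by low lane of %b
      %r = shufflevector <8 x float> %a, <8 x float> %b,
           <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11>
      ret <8 x float> %r
    }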
* Clean up PALIGNR handling and remove the old palign pattern fragment. Also make the
  PALIGNR masks not match 256-bit shuffles, which aren't supported. This is also a
  step toward solving PR10489.
  Bruno Cardoso Lopes | 2011-07-29 | 1 file | -5/+0 | llvm-svn: 136448
* vpermilps and vpermilpd behave differently with regard to the shuffle bitmask. Both
  work in 128-bit lanes without crossing, but in the former the mask of the high lane
  is the same as the one used by the low lane, while in the latter both lanes have
  independent masks. Handle this properly and add support for vpermilpd.
  Bruno Cardoso Lopes | 2011-07-27 | 1 file | -1/+4 | llvm-svn: 136200
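  For illustration, a sketch (LLVM IR, not from the commit; the function name is made
  up) of the distinction: this shuffle swaps the low lane but leaves the high lane
  alone, so the two lanes need different masks. vpermilpd's one-bit-per-element
  immediate allows that; vpermilps, whose immediate is reused by both lanes, does not:

    define <4 x double> @swap_low_lane_only(<4 x double> %v) {
      %r = shufflevector <4 x double> %v, <4 x double> undef,
                         <4 x i32> <i32 1, i32 0, i32 2, i32 3>
      ret <4 x double> %r
    }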
* Remove more dead code!
  Bruno Cardoso Lopes | 2011-07-27 | 1 file | -15/+5 | llvm-svn: 136199
* Recognize unpckh* masks and match 256-bit versions. The new versions are different
  from the previous 128-bit ones because they work in lanes. Update a few comments and
  add testcases.
  Bruno Cardoso Lopes | 2011-07-26 | 1 file | -4/+7 | llvm-svn: 136157
* Clean up movsldup/movshdup matching. 27 insertions(+), 62 deletions(-).
  Bruno Cardoso Lopes | 2011-07-26 | 1 file | -10/+0 | llvm-svn: 136047
* Three changes:
  - Handle a special scalar_to_vector case: splats, using a native 128-bit shuffle
    before inserting into a 256-bit vector.
  - Add AVX versions of the movd/movq instructions.
  - Introduce a few COPY patterns to match insert_subvector instructions. This turns a
    trivial insert_subvector instruction into a register copy, coalescing the xmm into
    a ymm and avoiding the emission of one more instruction.
  Bruno Cardoso Lopes | 2011-07-25 | 1 file | -0/+1 | llvm-svn: 136002
* Add support for 256-bit versions of the VPERMIL instruction. This is a new
  instruction introduced in AVX, which can operate on 128- and 256-bit vectors. It
  considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32-
  or 64-bit elements inside a lane, and restricts the second lane to have the same
  permutation as the first one. With the improved splat support introduced earlier
  today, adding codegen for this instruction enables more efficient 256-bit code.
  Instead of:
    vextractf128 $0, %ymm0, %xmm0
    punpcklbw    %xmm0, %xmm0
    punpckhbw    %xmm0, %xmm0
    vinsertf128  $0, %xmm0, %ymm0, %ymm1
    vinsertf128  $1, %xmm0, %ymm1, %ymm0
    vextractf128 $1, %ymm0, %xmm1
    shufps       $1, %xmm1, %xmm1
    movss        %xmm1, 28(%rsp)
    movss        %xmm1, 24(%rsp)
    movss        %xmm1, 20(%rsp)
    movss        %xmm1, 16(%rsp)
    vextractf128 $0, %ymm0, %xmm0
    shufps       $1, %xmm0, %xmm0
    movss        %xmm0, 12(%rsp)
    movss        %xmm0, 8(%rsp)
    movss        %xmm0, 4(%rsp)
    movss        %xmm0, (%rsp)
    vmovaps      (%rsp), %ymm0
  we get:
    vextractf128 $0, %ymm0, %xmm0
    punpcklbw    %xmm0, %xmm0
    punpckhbw    %xmm0, %xmm0
    vinsertf128  $0, %xmm0, %ymm0, %ymm1
    vinsertf128  $1, %xmm0, %ymm1, %ymm0
    vpermilps    $85, %ymm0, %ymm0
  Bruno Cardoso Lopes | 2011-07-21 | 1 file | -0/+2 | llvm-svn: 135662
* Port operand types for ARM and X86 over from EDIS to the .td files.
  Benjamin Kramer | 2011-07-14 | 1 file | -0/+2 | llvm-svn: 135198
* Make X86ISD::ANDNP more general and codegen 256-bit VANDNP. A more general version
  of X86ISD::ANDNP also opened room for a little bit of refactoring.
  Bruno Cardoso Lopes | 2011-07-13 | 1 file | -1/+1 | llvm-svn: 135088
* The name of the target-specific node PANDN is misleading, because the node is later
  selected to an ANDNPD/ANDNPS instruction instead of the PANDN instruction. Rename it.
  Bruno Cardoso Lopes | 2011-07-13 | 1 file | -1/+1 | llvm-svn: 135087
* AVX codegen support for the 256-bit versions of vandps, vandpd, vorps, vorpd,
  vxorps, and vxorpd.
  Bruno Cardoso Lopes | 2011-07-13 | 1 file | -0/+1 | llvm-svn: 135023
* Reapply 132424 with fixes. This fixes PR10068. rdar://problem/5993888
  Stuart Hastings | 2011-06-03 | 1 file | -0/+2 | llvm-svn: 132606
* Revert 132424 to fix PR10068.
  Rafael Espindola | 2011-06-02 | 1 file | -2/+0 | llvm-svn: 132479
* Recommit 132404 with fixes. rdar://problem/5993888
  Stuart Hastings | 2011-06-01 | 1 file | -0/+2 | llvm-svn: 132424
* Revert 132404 to appease a buildbot. rdar://problem/5993888
  Stuart Hastings | 2011-06-01 | 1 file | -2/+0 | llvm-svn: 132419
* Add support for x86 CMPEQSS and friends. These instructions do a floating-point
  comparison, generate a mask of 0s or 1s, and generally do the right thing with NaNs.
  Only profitable when the user wants a materialized 0 or 1 at runtime.
  rdar://problem/5993888
  Stuart Hastings | 2011-06-01 | 1 file | -0/+2 | llvm-svn: 132404
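  For illustration, a sketch (LLVM IR, not from the commit; the function name is made
  up) of the "materialized 0 or 1" case: the compare can become a cmpeqss mask
  followed by an and with 1.0, instead of a compare and branch:

    define float @is_equal(float %a, float %b) {
      ; ordered equal: false if either operand is NaN
      %cmp = fcmp oeq float %a, %b
      %r = select i1 %cmp, float 1.0, float 0.0
      ret float %r
    }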
* FGETSIGN support for x86, using movmskps/pd. Will be enabled with a patch to
  TargetLowering.cpp. rdar://problem/5660695
  Stuart Hastings | 2011-06-01 | 1 file | -0/+1 | llvm-svn: 132388
* [AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement missing patterns for
  them. Add a SIMD test subdirectory to hold tests for SIMD instruction selection
  correctness and quality.
  David Greene | 2011-03-02 | 1 file | -0/+2 | llvm-svn: 126845
* [AVX] Support VINSERTF128 with more patterns and appropriate infrastructure. This
  makes lowering 256-bit vectors to 128-bit vectors simple when 256-bit vector support
  is not available.
  David Greene | 2011-02-04 | 1 file | -0/+13 | llvm-svn: 124868
* [AVX] VEXTRACTF128 support. This commit includes patterns for matching
  EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines to examine and
  translate index values. VINSERTF128 comes next. With these two in place we can begin
  supporting more AVX operations, as INSERT/EXTRACT can be used as a fallback when
  256-bit support is not available.
  David Greene | 2011-02-03 | 1 file | -0/+12 | llvm-svn: 124797
* Implement feedback from Bruno on making pblendvb an x86-specific ISD node in
  addition to being an intrinsic, and convert lowering to use it. Hopefully the
  pattern fragment is doing the right thing with XMM0; it looks correct in testing.
  Nate Begeman | 2010-12-20 | 1 file | -0/+3 | llvm-svn: 122277
* Add support for matching psign & pblendvb to the x86 target. Remove unnecessary
  pandn patterns; the 'vnot' patfrag looks through bitcasts.
  Nate Begeman | 2010-12-17 | 1 file | -0/+12 | llvm-svn: 122098
* Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and
  return values where these use MMX registers, and is also supported in load, store,
  and bitcast. Only the above operations generate MMX instructions, and optimizations
  do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are
  lowered to XMM or split into smaller pieces. Optimizations may occur on these forms,
  and the result is cast back to x86_mmx, provided the result feeds into a
  pre-existing x86_mmx operation. The point of all this is to prevent optimizations
  from introducing MMX operations, which is unsafe due to the EMMS problem.
  Dale Johannesen | 2010-09-30 | 1 file | -48/+3 | llvm-svn: 115243
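  For illustration, a minimal sketch (LLVM IR of this era, not from the commit;
  assumes the MMX padd.w intrinsic) of the contract described above: x86_mmx values
  flow only through intrinsics, load/store, and bitcast, so the optimizer cannot
  introduce MMX operations on its own:

    define x86_mmx @mmx_add(x86_mmx %a, x86_mmx %b) {
      ; only an explicit MMX intrinsic call produces an MMX instruction
      %r = call x86_mmx @llvm.x86.mmx.padd.w(x86_mmx %a, x86_mmx %b)
      ret x86_mmx %r
    }
    declare x86_mmx @llvm.x86.mmx.padd.w(x86_mmx, x86_mmx)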