path: root/llvm/lib
...
* extractelement(undef,x) -> undef (Chris Lattner, 2006-03-31, 1 file, -6/+8)
  llvm-svn: 27300
* Do not endian swap split vector loads. (Chris Lattner, 2006-03-31, 1 file, -2/+0)
  This fixes UnitTests/Vector/sumarray-dbl on PPC. Now all UnitTests/Vector/*
  tests pass on PPC.
  llvm-svn: 27299
* Do not endian swap the operands to a store if the operands came from a vector. (Chris Lattner, 2006-03-31, 1 file, -3/+3)
  This fixes UnitTests/Vector/simple.c with altivec.
  llvm-svn: 27298
* Remove dead *extloads. (Chris Lattner, 2006-03-31, 1 file, -0/+19)
  This allows us to codegen vector.ll:test_extract_elt to:

        test_extract_elt:
                alloc r3 = ar.pfs,0,1,0,0
                adds r8 = 12, r32
                ;;
                ldfs f8 = [r8]
                mov ar.pfs = r3
                br.ret.sptk.many rp

  instead of:

        test_extract_elt:
                alloc r3 = ar.pfs,0,1,0,0
                adds r8 = 28, r32
                adds r9 = 24, r32
                adds r10 = 20, r32
                adds r11 = 16, r32
                ;;
                ldfs f6 = [r8]
                ;;
                ldfs f6 = [r9]
                adds r8 = 12, r32
                adds r9 = 8, r32
                adds r14 = 4, r32
                ;;
                ldfs f6 = [r10]
                ;;
                ldfs f6 = [r11]
                ldfs f8 = [r8]
                ;;
                ldfs f6 = [r9]
                ;;
                ldfs f6 = [r14]
                ;;
                ldfs f6 = [r32]
                mov ar.pfs = r3
                br.ret.sptk.many rp

  llvm-svn: 27297
* Delete dead loads in the dag. (Chris Lattner, 2006-03-31, 1 file, -0/+5)
  This allows us to compile vector.ll:test_extract_elt2 into:

        _test_extract_elt2:
                lfd f1, 32(r3)
                blr

  instead of:

        _test_extract_elt2:
                lfd f0, 56(r3)
                lfd f0, 48(r3)
                lfd f0, 40(r3)
                lfd f1, 32(r3)
                lfd f0, 24(r3)
                lfd f0, 16(r3)
                lfd f0, 8(r3)
                lfd f0, 0(r3)
                blr

  llvm-svn: 27296
* Implement PromoteOp for VEXTRACT_VECTOR_ELT. (Chris Lattner, 2006-03-31, 1 file, -42/+54)
  This fixes Generic/vector.ll:test_extract_elt on non-SSE X86 systems.
  llvm-svn: 27294
* Scalarized vector stores need not be legal, e.g. if the vector element type needs to be promoted or expanded. (Chris Lattner, 2006-03-31, 1 file, -0/+3)
  Relegalize the scalar store once created. This fixes
  CodeGen/Generic/vector.ll:test1 on non-SSE x86 targets.
  llvm-svn: 27293
* Fix build breakage. (Jeff Cohen, 2006-03-31, 1 file, -0/+1)
  llvm-svn: 27292
* note to self: *save* file, then check it in (Chris Lattner, 2006-03-31, 1 file, -1/+1)
  llvm-svn: 27291
* Implement an item from the README: fold vcmp/vcmp. instruction pairs with identical operands into a single instruction. (Chris Lattner, 2006-03-31, 2 files, -9/+29)
  For example, for:

        void test(vector float *x, vector float *y, int *P) {
          int v = vec_any_out(*x, *y);
          *x = (vector float)vec_cmpb(*x, *y);
          *P = v;
        }

  we now generate:

        _test:
                mfspr r2, 256
                oris r6, r2, 49152
                mtspr 256, r6
                lvx v0, 0, r4
                lvx v1, 0, r3
                vcmpbfp. v0, v1, v0
                mfcr r4, 2
                stvx v0, 0, r3
                rlwinm r3, r4, 27, 31, 31
                xori r3, r3, 1
                stw r3, 0(r5)
                mtspr 256, r2
                blr

  instead of:

        _test:
                mfspr r2, 256
                oris r6, r2, 57344
                mtspr 256, r6
                lvx v0, 0, r4
                lvx v1, 0, r3
                vcmpbfp. v2, v1, v0
                mfcr r4, 2
        ***     vcmpbfp v0, v1, v0
                rlwinm r4, r4, 27, 31, 31
                stvx v0, 0, r3
                xori r3, r4, 1
                stw r3, 0(r5)
                mtspr 256, r2
                blr

  Testcase here: CodeGen/PowerPC/vcmp-fold.ll
  llvm-svn: 27290
* compactify some more instruction definitions (Chris Lattner, 2006-03-31, 1 file, -61/+15)
  llvm-svn: 27288
* Compactify comparisons. (Chris Lattner, 2006-03-31, 1 file, -104/+34)
  llvm-svn: 27287
* Lower vector compares to VCMP nodes, just like we lower vector comparison predicates to VCMPo nodes. (Chris Lattner, 2006-03-31, 4 files, -43/+72)
  llvm-svn: 27285
* These are done (Chris Lattner, 2006-03-31, 1 file, -5/+0)
  llvm-svn: 27284
* Add a new method to verify intrinsic function prototypes. (Chris Lattner, 2006-03-31, 1 file, -2/+60)
  llvm-svn: 27282
* Make sure to pass enough values to phi nodes when we are dealing with decimated vectors. (Chris Lattner, 2006-03-31, 1 file, -2/+10)
  This fixes UnitTests/Vector/sumarray-dbl.c
  llvm-svn: 27280
* Significantly improve handling of vectors that are live across basic blocks. (Chris Lattner, 2006-03-31, 3 files, -52/+101)
  This handles cases where the vector elements need promotion or expansion,
  and cases where the vector type itself needs to be decimated.
  llvm-svn: 27278
* Was returning the wrong type. (Chris Lattner, 2006-03-31, 1 file, -4/+5)
  llvm-svn: 27277
* Mark INSERT_VECTOR_ELT as expand (Chris Lattner, 2006-03-31, 1 file, -0/+1)
  llvm-svn: 27276
* Expand all INSERT_VECTOR_ELT (obviously bad) for now. (Evan Cheng, 2006-03-31, 1 file, -0/+1)
  llvm-svn: 27275
* Expand INSERT_VECTOR_ELT to: store vec, sp; store elt, sp+k; vec = load sp. (Evan Cheng, 2006-03-31, 1 file, -1/+24)
  llvm-svn: 27274
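  A minimal C sketch of this stack-slot expansion, using GCC/Clang vector
  extensions (the helper name, array type, and memcpy spelling are
  illustrative, not the legalizer's actual code):

        #include <string.h>

        typedef float v4sf __attribute__((vector_size(16)));

        /* Spill the vector to a stack slot, overwrite one element in
           memory, then reload the whole vector. */
        v4sf insert_elt(v4sf vec, float elt, int k) {
            float slot[4];                     /* the stack slot "sp"  */
            memcpy(slot, &vec, sizeof slot);   /* store vec, sp        */
            slot[k] = elt;                     /* store elt, sp + k*4  */
            memcpy(&vec, slot, sizeof slot);   /* vec = load sp        */
            return vec;
        }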
* Modify the TargetLowering::getPackedTypeBreakdown method to also return the unpromoted element type. (Chris Lattner, 2006-03-31, 1 file, -4/+7)
  llvm-svn: 27273
* Typo (Evan Cheng, 2006-03-31, 1 file, -2/+2)
  llvm-svn: 27272
* Ok for vector_shuffle mask to contain undef elements. (Evan Cheng, 2006-03-31, 1 file, -56/+120)
  llvm-svn: 27271
* Implement TargetLowering::getPackedTypeBreakdown (Chris Lattner, 2006-03-31, 1 file, -0/+41)
  llvm-svn: 27270
* Add the rest of the vmul instructions and the vmulsum* instructions. (Chris Lattner, 2006-03-30, 1 file, -1/+15)
  llvm-svn: 27268
* Use a new tblgen feature to significantly shrinkify instruction definitions that directly correspond to intrinsics. (Chris Lattner, 2006-03-30, 1 file, -108/+46)
  llvm-svn: 27266
* Add a bunch of new instructions for intrinsics. (Chris Lattner, 2006-03-30, 1 file, -6/+87)
  llvm-svn: 27265
* Fix Transforms/InstCombine/2006-03-30-ExtractElement.ll (Chris Lattner, 2006-03-30, 1 file, -3/+7)
  llvm-svn: 27261
* Make sure all possible shuffles are matched. (Evan Cheng, 2006-03-30, 2 files, -30/+89)
  Use pshufd, pshufhw, and pshuflw to shuffle v4f32 if shufps doesn't match.
  Use shufps to shuffle v4i32 if pshufd, pshufhw, and pshuflw don't match.
  llvm-svn: 27259
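  For reference, a v4i32 shuffle that should select a single pshufd, written
  with SSE2 intrinsics (the helper name is illustrative):

        #include <emmintrin.h>

        /* Broadcast lane 0 into all four lanes; an integer-domain
           shuffle like this maps to pshufd rather than shufps. */
        __m128i splat_lane0(__m128i v) {
            return _mm_shuffle_epi32(v, _MM_SHUFFLE(0, 0, 0, 0));
        }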
* More logical ops patterns (Evan Cheng, 2006-03-30, 1 file, -0/+106)
  llvm-svn: 27257
* Add support for _mm_cmp{cc}_ss and _mm_cmp{cc}_ps intrinsics (Evan Cheng, 2006-03-30, 1 file, -9/+35)
  llvm-svn: 27256
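  A usage sketch of two of the cmp{cc} forms this enables (helper names are
  illustrative):

        #include <xmmintrin.h>

        /* cmpltps: each lane of the result is all-ones if a < b, else zero. */
        __m128 lanes_lt(__m128 a, __m128 b) {
            return _mm_cmplt_ps(a, b);
        }

        /* cmpless: compares only lane 0; lanes 1-3 pass through from a. */
        __m128 lane0_le(__m128 a, __m128 b) {
            return _mm_cmple_ss(a, b);
        }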
* Add 128-bit pmovmskb intrinsic support. (Evan Cheng, 2006-03-30, 1 file, -7/+12)
  llvm-svn: 27255
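  The 128-bit pmovmskb in intrinsic form (helper name is illustrative):

        #include <emmintrin.h>

        /* Gather the sign bit of each of the 16 bytes into the low
           16 bits of the integer result. */
        int byte_signs(__m128i v) {
            return _mm_movemask_epi8(v);
        }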
* Change SSE pack operation definitions to fit what the intrinsics expect. (Evan Cheng, 2006-03-29, 1 file, -20/+20)
  For example, packsswb actually creates a v16i8 from a pair of v8i16s, but
  the intrinsic specification forces the output type to match the operands.
  llvm-svn: 27254
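  The corresponding C intrinsic, which narrows two v8i16 values into one
  v16i8 with signed saturation (helper name is illustrative):

        #include <emmintrin.h>

        /* packsswb: a's eight words fill bytes 0-7, b's fill bytes 8-15,
           each clamped to the signed 8-bit range [-128, 127]. */
        __m128i pack_words_to_bytes(__m128i a, __m128i b) {
            return _mm_packs_epi16(a, b);
        }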
* - Added some SSE2 128-bit packed integer ops. (Evan Cheng, 2006-03-29, 3 files, -25/+314)
  - Added SSE2 128-bit integer pack with signed saturation ops.
  - Added pshufhw and pshuflw ops.
  llvm-svn: 27252
* Need to special case splat after all. Make the second operand of splat vector_shuffle undef. (Evan Cheng, 2006-03-29, 2 files, -15/+18)
  llvm-svn: 27250
* Floating point logical operation patterns should match bit_convert; otherwise integer vector logical operations would match andp{s|d} instead of pand. (Evan Cheng, 2006-03-29, 1 file, -29/+53)
  llvm-svn: 27248
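  The domain distinction in intrinsic terms (helper names are illustrative):
  both functions compute a bitwise AND, but the first should select pand and
  the second andps:

        #include <emmintrin.h>

        /* Integer vectors: should match pand, not andps. */
        __m128i and_int(__m128i a, __m128i b) {
            return _mm_and_si128(a, b);
        }

        /* Float vectors: should match andps. */
        __m128 and_fp(__m128 a, __m128 b) {
            return _mm_and_ps(a, b);
        }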
* - More shuffle related bug fixes. (Evan Cheng, 2006-03-29, 2 files, -47/+30)
  - Whenever possible use ops of the right packed types for vector shuffles /
    splats.
  llvm-svn: 27246
* Another entry about shuffles. (Evan Cheng, 2006-03-29, 1 file, -0/+6)
  llvm-svn: 27245
* - Only use pshufd for v4i32 vector shuffles. (Evan Cheng, 2006-03-29, 2 files, -61/+83)
  - Other shuffle related fixes.
  llvm-svn: 27244
* add a note (Chris Lattner, 2006-03-29, 1 file, -0/+4)
  llvm-svn: 27243
* Bug fixes: handle constantexpr insert/extract element operations. (Chris Lattner, 2006-03-29, 1 file, -16/+6)
  Handle constantpacked vectors with constantexpr elements.
  This fixes CodeGen/Generic/vector-constantexpr.ll
  llvm-svn: 27241
* Added aliases to scalar SSE instructions, e.g. addss, to match x86 intrinsics. (Evan Cheng, 2006-03-28, 1 file, -47/+201)
  The source operand types are v4sf, with the upper bits passed through.
  Added matching code for these.
  llvm-svn: 27240
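  The pass-through behavior in intrinsic form (helper name is illustrative):

        #include <xmmintrin.h>

        /* addss: lane 0 of the result is a0 + b0; lanes 1-3 are copied
           unchanged from a, which is why the pattern operates on v4f32. */
        __m128 add_lane0(__m128 a, __m128 b) {
            return _mm_add_ss(a, b);
        }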
* Fixing buggy code. (Evan Cheng, 2006-03-28, 1 file, -6/+6)
  llvm-svn: 27239
* When building a VVECTOR_SHUFFLE node from extract_element operations, make sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask). (Chris Lattner, 2006-03-28, 1 file, -1/+11)
  The latter is not canonical form, and prevents the PPC splat pattern from
  matching. For a particular splat, we go from generating this:

        li r10, lo16(LCPI1_0)
        lis r11, ha16(LCPI1_0)
        lvx v3, r11, r10
        vperm v3, v2, v2, v3

  to generating:

        vspltw v3, v2, 3

  llvm-svn: 27236
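  In Altivec C terms, the kind of splat that now matches the
  single-instruction pattern (helper name is illustrative):

        #include <altivec.h>

        /* vec_splat replicates element 3 into every lane; with the
           canonical SHUFFLE(X, undef, mask) form this selects one vspltw. */
        vector float splat3(vector float v) {
            return vec_splat(v, 3);
        }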
* Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X, undef, Y') (Chris Lattner, 2006-03-28, 1 file, -0/+30)
  llvm-svn: 27235
* Turn a series of extract_element's feeding a build_vector into a vector_shuffle node. (Chris Lattner, 2006-03-28, 1 file, -0/+86)
  For this:

        void test(__m128 *res, __m128 *A, __m128 *B) {
          *res = _mm_unpacklo_ps(*A, *B);
        }

  we now produce this code:

        _test:
                movl 8(%esp), %eax
                movaps (%eax), %xmm0
                movl 12(%esp), %eax
                unpcklps (%eax), %xmm0
                movl 4(%esp), %eax
                movaps %xmm0, (%eax)
                ret

  instead of this:

        _test:
                subl $76, %esp
                movl 88(%esp), %eax
                movaps (%eax), %xmm0
                movaps %xmm0, (%esp)
                movaps %xmm0, 32(%esp)
                movss 4(%esp), %xmm0
                movss 32(%esp), %xmm1
                unpcklps %xmm0, %xmm1
                movl 84(%esp), %eax
                movaps (%eax), %xmm0
                movaps %xmm0, 16(%esp)
                movaps %xmm0, 48(%esp)
                movss 20(%esp), %xmm0
                movss 48(%esp), %xmm2
                unpcklps %xmm0, %xmm2
                unpcklps %xmm1, %xmm2
                movl 80(%esp), %eax
                movaps %xmm2, (%eax)
                addl $76, %esp
                ret

  GCC produces this (with -fomit-frame-pointer):

        _test:
                subl $12, %esp
                movl 20(%esp), %eax
                movaps (%eax), %xmm0
                movl 24(%esp), %eax
                unpcklps (%eax), %xmm0
                movl 16(%esp), %eax
                movaps %xmm0, (%eax)
                addl $12, %esp
                ret

  llvm-svn: 27233
* Teach Legalize how to pack VVECTOR_SHUFFLE nodes into VECTOR_SHUFFLE nodes. (Chris Lattner, 2006-03-28, 1 file, -0/+21)
  llvm-svn: 27232
* new node (Chris Lattner, 2006-03-28, 1 file, -0/+1)
  llvm-svn: 27231
* Don't crash on X^X if X is a vector. Instead, produce a vector of zeros. (Chris Lattner, 2006-03-28, 1 file, -2/+10)
  llvm-svn: 27229
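  The source pattern in question, sketched with GCC/Clang vector extensions
  (typedef and helper name are illustrative):

        typedef int v4si __attribute__((vector_size(16)));

        /* A vector self-xor; the optimizer now folds this to the
           all-zeros vector instead of crashing. */
        v4si zero_out(v4si x) {
            return x ^ x;
        }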