path: root/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
Commit message (Author, Date; Files changed, Lines -/+)
...
* Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors. (Chris Lattner, 2006-04-16; 1 file changed, -47/+96)
    Remove some done items from the todo list. llvm-svn: 27729
* Fix a crash when faced with a shuffle vector that has an undef in its mask. (Chris Lattner, 2006-04-15; 1 file changed, -1/+5)
    llvm-svn: 27726
* Allow undef in a shuffle mask (Chris Lattner, 2006-04-14; 1 file changed, -0/+1)
    llvm-svn: 27714
* Move the rest of the PPCTargetLowering::LowerOperation cases out into separate functions, for simplicity and code clarity. (Chris Lattner, 2006-04-14; 1 file changed, -468/+529)
    llvm-svn: 27693
* Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate functions, which makes the code much cleaner :) (Chris Lattner, 2006-04-14; 1 file changed, -147/+155)
    llvm-svn: 27692
* Force non-darwin targets to use a static relo model. (Chris Lattner, 2006-04-13; 1 file changed, -7/+8)
    This fixes PR734, tested by CodeGen/Generic/vector.ll. llvm-svn: 27657
* Add a new way to match vector constants, which makes it easier to bang bits of different types. (Chris Lattner, 2006-04-12; 1 file changed, -4/+85)
    Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool
    load, implementing PowerPC/vec_constants.ll:test1. This compiles:

        typedef float vf __attribute__ ((vector_size (16)));
        typedef int vi __attribute__ ((vector_size (16)));
        void test(vi *P1, vi *P2, vf *P3) {
          *P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
          *P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
          *P3 = vec_abs((vector float)*P3);
        }

    to:

        _test:
                mfspr r2, 256
                oris r6, r2, 49152
                mtspr 256, r6
                vspltisw v0, -1
                vslw v0, v0, v0
                lvx v1, 0, r3
                vand v1, v1, v0
                stvx v1, 0, r3
                lvx v1, 0, r4
                vandc v1, v1, v0
                stvx v1, 0, r4
                lvx v1, 0, r5
                vandc v0, v1, v0
                stvx v0, 0, r5
                mtspr 256, r2
                blr

    instead of (with two constant pool entries):

        _test:
                mfspr r2, 256
                oris r6, r2, 49152
                mtspr 256, r6
                li r6, lo16(LCPI1_0)
                lis r7, ha16(LCPI1_0)
                li r8, lo16(LCPI1_1)
                lis r9, ha16(LCPI1_1)
                lvx v0, r7, r6
                lvx v1, 0, r3
                vand v0, v1, v0
                stvx v0, 0, r3
                lvx v0, r9, r8
                lvx v1, 0, r4
                vand v1, v1, v0
                stvx v1, 0, r4
                lvx v1, 0, r5
                vand v0, v1, v0
                stvx v0, 0, r5
                mtspr 256, r2
                blr

    GCC produces (with 2 cp entries):

        _test:
                mfspr r0,256
                stw r0,-4(r1)
                oris r0,r0,0xc00c
                mtspr 256,r0
                lis r2,ha16(LC0)
                lis r9,ha16(LC1)
                la r2,lo16(LC0)(r2)
                lvx v0,0,r3
                lvx v1,0,r5
                la r9,lo16(LC1)(r9)
                lwz r12,-4(r1)
                lvx v12,0,r2
                lvx v13,0,r9
                vand v0,v0,v12
                stvx v0,0,r3
                vspltisw v0,-1
                vslw v12,v0,v0
                vandc v1,v1,v12
                stvx v1,0,r5
                lvx v0,0,r4
                vand v0,v0,v13
                stvx v0,0,r4
                mtspr 256,r12
                blr

    llvm-svn: 27624
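    [A scalar illustration of the vspltisw/vslw trick above (illustrative C,
    not from the commit): splatting -1 puts 0xFFFFFFFF in every lane, and
    vslw shifts each lane left by the low 5 bits of the corresponding lane
    of its second operand, here 31, leaving 0x80000000 everywhere; vandc
    then gives the complementary 0x7FFFFFFF mask for free.

        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            uint32_t lane = (uint32_t)-1;  /* vspltisw v0, -1: each lane 0xFFFFFFFF */
            uint32_t sh   = lane & 31;     /* vslw uses only the low 5 bits: 31 */
            uint32_t hi   = lane << sh;    /* 0xFFFFFFFF << 31 == 0x80000000 */
            uint32_t lo   = ~hi;           /* complement: 0x7FFFFFFF (the vandc case) */
            printf("%08x %08x\n", hi, lo); /* prints 80000000 7fffffff */
            return 0;
        }
    ]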
* Rename get_VSPLI_elt -> get_VSPLTI_elt (Chris Lattner, 2006-04-12; 1 file changed, -7/+28)
    Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for
    each form, eliminating a bunch of Pat patterns in the .td file and
    allowing us to CSE stuff more aggressively. This implements
    PowerPC/buildvec_canonicalize.ll:VSPLTI. llvm-svn: 27614
* Ensure that zero vectors are always v4i32, which forces them to CSE with each other. (Chris Lattner, 2006-04-12; 1 file changed, -3/+11)
    This implements CodeGen/PowerPC/vxor-canonicalize.ll. llvm-svn: 27609
* Vector function results go into V2 according to GCC. (Chris Lattner, 2006-04-11; 1 file changed, -1/+10)
    The darwin ABI doc doesn't say where they go :-/ llvm-svn: 27579
* Move some return-handling code from LowerArguments to the ISD::RET handling stuff. (Chris Lattner, 2006-04-11; 1 file changed, -20/+9)
    No functionality change. llvm-svn: 27577
* properly mark vector selects as expanded to select_cc (Chris Lattner, 2006-04-08; 1 file changed, -0/+4)
    llvm-svn: 27544
* Add VRRC select support (Chris Lattner, 2006-04-08; 1 file changed, -1/+2)
    llvm-svn: 27543
* Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vspltish instead of a constant pool load. (Chris Lattner, 2006-04-08; 1 file changed, -0/+57)
    llvm-svn: 27538
* Change the interface to the predicate that determines if vsplti* can be used. (Chris Lattner, 2006-04-08; 1 file changed, -16/+17)
    No functionality changes. llvm-svn: 27536
* Make sure to return the result in the right type. (Chris Lattner, 2006-04-06; 1 file changed, -4/+6)
    llvm-svn: 27469
* Match vpku[hw]um(x,x). (Chris Lattner, 2006-04-06; 1 file changed, -45/+46)
    Convert vsldoi(x,x) to work the same way other (x,x) cases work.
    llvm-svn: 27467
* Add support for matching vmrg(x,x) patterns (Chris Lattner, 2006-04-06; 1 file changed, -33/+39)
    llvm-svn: 27463
* Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles. (Chris Lattner, 2006-04-06; 1 file changed, -3/+49)
    llvm-svn: 27457
* Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to lower it and LLVM to have one fewer intrinsic. (Chris Lattner, 2006-04-06; 1 file changed, -33/+62)
    This implements CodeGen/PowerPC/vec_shuffle.ll. llvm-svn: 27450
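    [For reference, vsldoi's semantics can be modeled as below (hypothetical
    helper for illustration, not LLVM code): the two inputs are concatenated
    and 16 bytes are taken starting at the shift amount, so vsldoi(x,x) is a
    byte rotate of a single vector, which is why both forms match the same way.

        #include <stdint.h>
        #include <stdio.h>

        /* Model of AltiVec vsldoi: take 16 bytes from the concatenation of
           a and b, starting at byte offset sh (0..15). */
        static void vsldoi_model(const uint8_t a[16], const uint8_t b[16],
                                 int sh, uint8_t out[16]) {
            for (int i = 0; i < 16; ++i)
                out[i] = (i + sh < 16) ? a[i + sh] : b[i + sh - 16];
        }

        int main(void) {
            uint8_t x[16], r[16];
            for (int i = 0; i < 16; ++i) x[i] = (uint8_t)i;
            vsldoi_model(x, x, 5, r);           /* vsldoi(x,x): rotate left 5 bytes */
            for (int i = 0; i < 16; ++i) printf("%d ", r[i]);
            printf("\n");                       /* 5 6 ... 15 0 1 2 3 4 */
            return 0;
        }
    ]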
* Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into vperm with a perm mask lvx'd from the constant pool. (Chris Lattner, 2006-04-06; 1 file changed, -0/+54)
    llvm-svn: 27448
* Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll (Chris Lattner, 2006-04-05; 1 file changed, -2/+2)
    llvm-svn: 27439
* Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered. (Evan Cheng, 2006-04-05; 1 file changed, -1/+1)
    llvm-svn: 27433
* Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'. (Chris Lattner, 2006-04-04; 1 file changed, -2/+2)
    llvm-svn: 27413
* Ask legalize to promote all vector shuffles to be v16i8 instead of having to handle all 4 PPC vector types. (Chris Lattner, 2006-04-04; 1 file changed, -17/+39)
    This simplifies the matching code and allows us to eliminate a bunch of
    patterns. This also adds cases we were missing, such as
    CodeGen/PowerPC/vec_splat.ll:splat_h. llvm-svn: 27400
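    [A sketch of the canonicalization above (hypothetical helper, not the
    LLVM code): a shuffle mask over wider elements expands to the v16i8
    form by turning each element index into a run of byte indices, so one
    byte-level matcher covers all four vector types.

        #include <stdio.h>

        /* Expand a shuffle mask over elem_size-byte elements into the
           equivalent 16-entry byte mask (big-endian lane order, as on PPC). */
        static void expand_to_bytes(const int *mask, int elems, int elem_size,
                                    int bytes_out[16]) {
            for (int i = 0; i < elems; ++i)
                for (int b = 0; b < elem_size; ++b)
                    bytes_out[i * elem_size + b] = mask[i] * elem_size + b;
        }

        int main(void) {
            int word_mask[4] = {2, 2, 2, 2};   /* splat word 2 of a v4i32 */
            int bytes[16];
            expand_to_bytes(word_mask, 4, 4, bytes);
            for (int i = 0; i < 16; ++i)
                printf("%d ", bytes[i]);       /* 8 9 10 11 repeated 4x: vspltw-able */
            printf("\n");
            return 0;
        }
    ]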
* Revert accidentally committed hunks. (Chris Lattner, 2006-04-03; 1 file changed, -3/+1)
    llvm-svn: 27386
* Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand. (Chris Lattner, 2006-04-03; 1 file changed, -1/+5)
    llvm-svn: 27385
* Inform the dag combiner that the predicate compares only return a low bit. (Chris Lattner, 2006-04-02; 1 file changed, -1/+34)
    llvm-svn: 27359
* Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into "vspltisb v0, 8" instead of a constant pool load. (Chris Lattner, 2006-04-02; 1 file changed, -0/+2)
    llvm-svn: 27335
* Rearrange code a bit (Chris Lattner, 2006-03-31; 1 file changed, -21/+25)
    llvm-svn: 27306
* Add, sub and shuffle are legal for all vector types (Chris Lattner, 2006-03-31; 1 file changed, -8/+9)
    llvm-svn: 27305
* note to self: *save* file, then check it in (Chris Lattner, 2006-03-31; 1 file changed, -1/+1)
    llvm-svn: 27291
* Implement an item from the readme, folding vcmp/vcmp. instructions with identical instructions into a single instruction. (Chris Lattner, 2006-03-31; 1 file changed, -0/+29)
    For example, for:

        void test(vector float *x, vector float *y, int *P) {
          int v = vec_any_out(*x, *y);
          *x = (vector float)vec_cmpb(*x, *y);
          *P = v;
        }

    we now generate:

        _test:
                mfspr r2, 256
                oris r6, r2, 49152
                mtspr 256, r6
                lvx v0, 0, r4
                lvx v1, 0, r3
                vcmpbfp. v0, v1, v0
                mfcr r4, 2
                stvx v0, 0, r3
                rlwinm r3, r4, 27, 31, 31
                xori r3, r3, 1
                stw r3, 0(r5)
                mtspr 256, r2
                blr

    instead of:

        _test:
                mfspr r2, 256
                oris r6, r2, 57344
                mtspr 256, r6
                lvx v0, 0, r4
                lvx v1, 0, r3
                vcmpbfp. v2, v1, v0
                mfcr r4, 2
        ***     vcmpbfp v0, v1, v0
                rlwinm r4, r4, 27, 31, 31
                stvx v0, 0, r3
                xori r3, r4, 1
                stw r3, 0(r5)
                mtspr 256, r2
                blr

    Testcase here: CodeGen/PowerPC/vcmp-fold.ll. llvm-svn: 27290
* Lower vector compares to VCMP nodes, just like we lower vector comparison predicates to VCMPo nodes. (Chris Lattner, 2006-03-31; 1 file changed, -15/+38)
    llvm-svn: 27285
* Mark INSERT_VECTOR_ELT as expand (Chris Lattner, 2006-03-31; 1 file changed, -0/+1)
    llvm-svn: 27276
* Add a few more altivec intrinsics (Nate Begeman, 2006-03-28; 1 file changed, -2/+2)
    llvm-svn: 27215
* Use normal lvx for scalar_to_vector instead of lve*x. (Chris Lattner, 2006-03-28; 1 file changed, -4/+2)
    They do the exact same thing and we have a dag node for the former.
    llvm-svn: 27205
* Tblgen doesn't like multiple SDNode<> definitions that map to the same enum value. (Chris Lattner, 2006-03-28; 1 file changed, -2/+2)
    Split them into separate enums. llvm-svn: 27201
* SelectionDAGISel can now natively handle Switch instructions, in the same manner that the LowerSwitch LLVM-to-LLVM pass does: emitting a binary search tree of basic blocks. (Nate Begeman, 2006-03-27; 1 file changed, -1/+9)
    The new approach has several advantages: it is faster, it generates
    significantly smaller code in many cases, and it paves the way for
    implementing dense switch tables as a jump table by handling switches
    directly in the instruction selector. This functionality is currently
    only enabled on x86, but should be safe for every target. In
    anticipation of making it the default, the cfg is now properly updated
    in the x86, ppc, and sparc select lowering code. llvm-svn: 27156
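    [A minimal sketch of the binary-search lowering described above
    (hypothetical case values and results, not the SelectionDAGISel code):
    the sorted case values are split at the median, so each compare halves
    the remaining range, giving log2(N) compares instead of N.

        #include <stdio.h>

        /* switch (v) over cases 1, 5, 9, 13 as a binary search tree of
           blocks: the root block compares against the median value. */
        static int dispatch(int v) {
            if (v < 9) {                  /* left subtree: {1, 5} */
                if (v == 1) return 100;
                if (v == 5) return 200;
            } else {                      /* right subtree: {9, 13} */
                if (v == 9)  return 300;
                if (v == 13) return 400;
            }
            return -1;                    /* default destination */
        }

        int main(void) {
            printf("%d %d\n", dispatch(5), dispatch(7));  /* 200 -1 */
            return 0;
        }
    ]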
* Codegen vector predicate compares. (Chris Lattner, 2006-03-26; 1 file changed, -0/+80)
    llvm-svn: 27151
* Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead (Evan Cheng, 2006-03-26; 1 file changed, -22/+2)
    llvm-svn: 27149
* Add some basic patterns for other datatypes (Chris Lattner, 2006-03-25; 1 file changed, -4/+2)
    llvm-svn: 27116
* Codegen splatted constant vectors with splat-immediate instructions. (Chris Lattner, 2006-03-25; 1 file changed, -0/+63)
    Codegen things like:

        <int -1, int -1, int -1, int -1>
    and
        <int 65537, int 65537, int 65537, int 65537>

    using things like:

        vspltisb v0, -1
    and:
        vspltish v0, 1

    instead of using constant pool loads. This implements
    CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}. llvm-svn: 27106
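    [A quick scalar check of why these immediates work (illustration only,
    not from the commit): a byte splat of -1 fills each 32-bit word with
    0xFFFFFFFF, and a halfword splat of 1 fills it with 0x00010001 = 65537.

        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            /* vspltisb v0, -1: every byte 0xFF, so every word is -1. */
            uint8_t  b  = (uint8_t)-1;
            uint32_t w1 = b | (uint32_t)b << 8 | (uint32_t)b << 16
                            | (uint32_t)b << 24;

            /* vspltish v0, 1: every halfword 0x0001, so every word 0x00010001. */
            uint16_t h  = 1;
            uint32_t w2 = h | (uint32_t)h << 16;

            printf("%d %u\n", (int32_t)w1, w2);   /* prints -1 65537 */
            return 0;
        }
    ]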
* Disable the i32->float G5 optimization. (Chris Lattner, 2006-03-24; 1 file changed, -1/+7)
    It is unsafe, as documented in the comment. This fixes 177.mesa, and
    McCat/09-vor with the td scheduler. llvm-svn: 27060
* add support for using vxor to build zero vectors. (Chris Lattner, 2006-03-24; 1 file changed, -3/+33)
    This implements Regression/CodeGen/PowerPC/vec_zero.ll. llvm-svn: 27059
* When possible, custom lower 32-bit SINT_TO_FP. (Chris Lattner, 2006-03-22; 1 file changed, -27/+59)
    When possible, custom lower 32-bit SINT_TO_FP to this:

        _foo2:
                extsw r2, r3
                std r2, -8(r1)
                lfd f0, -8(r1)
                fcfid f0, f0
                frsp f1, f0
                blr

    instead of this:

        _foo2:
                lis r2, ha16(LCPI2_0)
                lis r4, 17200
                xoris r3, r3, 32768
                stw r3, -4(r1)
                stw r4, -8(r1)
                lfs f0, lo16(LCPI2_0)(r2)
                lfd f1, -8(r1)
                fsub f0, f1, f0
                frsp f1, f0
                blr

    This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
    with llcbeta (16.7% and 38.1% respectively). llvm-svn: 26943
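    [For reference, the old sequence implements the classic bit trick below
    (a scalar C sketch; assumes IEEE-754 doubles and two's-complement ints,
    as on PowerPC): storing 0x43300000 in the high word and x ^ 0x80000000
    in the low word builds the double 2^52 + 2^31 + x, so subtracting
    2^52 + 2^31 recovers (double)x; fcfid makes all of this unnecessary.

        #include <stdint.h>
        #include <stdio.h>
        #include <string.h>

        static double int_to_double_trick(int32_t x) {
            uint64_t bits = ((uint64_t)0x43300000 << 32)   /* exponent of 2^52 */
                          | ((uint32_t)x ^ 0x80000000u);   /* bias x into mantissa */
            double d;
            memcpy(&d, &bits, sizeof d);                   /* d == 2^52 + 2^31 + x */
            return d - (4503599627370496.0 + 2147483648.0); /* minus 2^52 + 2^31 */
        }

        int main(void) {
            printf("%f %f\n", int_to_double_trick(-7), int_to_double_trick(42));
            return 0;                                       /* -7.000000 42.000000 */
        }
    ]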
* These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will. (Chris Lattner, 2006-03-21; 1 file changed, -0/+1)
    llvm-svn: 26930
* remove dead variable (Chris Lattner, 2006-03-20; 1 file changed, -1/+1)
    llvm-svn: 26907
* Fix a couple of bugs in permute/splat generation, thanks to Nate for actually figuring these out! :) (Chris Lattner, 2006-03-20; 1 file changed, -2/+0)
    llvm-svn: 26904
* Add support for generating vspltw, instead of a vperm instruction with a constant pool load. (Chris Lattner, 2006-03-20; 1 file changed, -9/+19)
    This generates significantly nicer code for splats. When tblgen gets
    bugfixed, we can remove the custom selection code. llvm-svn: 26898