Commit message | Author | Age | Files | Lines
* Factor out a whole bunch of code into its own method. | Chris Lattner | 2007-08-04 | 1 | -65/+82
  llvm-svn: 40824
* Use getNumPreds(BB) instead of computing them manually. | Chris Lattner | 2007-08-04 | 1 | -4/+4
  This is a very small but measurable speedup.
  llvm-svn: 40823
* Change the rename pass to be "tail recursive", only adding N-1 successors to the worklist, and handling the last one with a 'tail call'. | Chris Lattner | 2007-08-04 | 1 | -21/+35
  This speeds up PR1432 from 2.0578s to 2.0012s (2.8%)
  llvm-svn: 40822
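The "tail recursive" worklist idea from the commit above can be sketched on a toy graph. This is a minimal illustration under stated assumptions, not the PromoteMem2Reg code: `Graph` and `visitAll` are hypothetical names, and blocks are plain integer ids.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy CFG: succs[b] lists the successor block ids of block b.
using Graph = std::vector<std::vector<int>>;

// Visits every block reachable from `entry` and returns the visit order.
// For each block, all successors but the last are pushed on the worklist;
// the last one is handled by looping again, much like a tail call.
std::vector<int> visitAll(const Graph &succs, int entry) {
  std::vector<int> order;
  std::vector<bool> visited(succs.size(), false);
  std::vector<int> worklist{entry};

  while (!worklist.empty()) {
    int bb = worklist.back();
    worklist.pop_back();

    // Inner loop: keep following the last successor without a round trip
    // through the worklist.
    while (!visited[bb]) {
      visited[bb] = true;
      order.push_back(bb);

      const std::vector<int> &s = succs[bb];
      if (s.empty())
        break;

      // Queue the first N-1 successors.
      for (std::size_t i = 0; i + 1 < s.size(); ++i)
        worklist.push_back(s[i]);

      // Continue directly with the last one.
      bb = s.back();
    }
  }
  return order;
}
```

For a straight-line chain of blocks the worklist is never touched after the entry block, which is where the saving would come from in this sketch.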
* cache computation of #preds for a BB. | Chris Lattner | 2007-08-04 | 1 | -3/+14
  This speeds up mem2reg from 2.0742->2.0522s on PR1432.
  llvm-svn: 40821
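Caching a per-block predecessor count can be sketched as a simple memo table. This is an assumption-laden illustration, not the code from the commit: `PredCounter` and its fields are hypothetical, and the trivial `.size()` call stands in for whatever more expensive counting the real pass performs.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical toy CFG: preds[b] lists the predecessor block ids of block b.
struct PredCounter {
  const std::vector<std::vector<int>> &preds;
  std::unordered_map<int, unsigned> cache; // block id -> cached count
  unsigned computations = 0;               // how often we really counted

  explicit PredCounter(const std::vector<std::vector<int>> &p) : preds(p) {}

  // Returns the number of predecessors of bb, counting at most once per block.
  unsigned getNumPreds(int bb) {
    auto it = cache.find(bb);
    if (it != cache.end())
      return it->second;
    ++computations;
    unsigned n = static_cast<unsigned>(preds[bb].size());
    cache.emplace(bb, n);
    return n;
  }
};
```

The `computations` counter exists only so the effect of the cache can be observed; repeated queries for the same block leave it unchanged.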
* reserve operand space for phi nodes when we insert them. | Chris Lattner | 2007-08-04 | 1 | -0/+1
  llvm-svn: 40820
* use continue to avoid nesting, no functionality change. | Chris Lattner | 2007-08-04 | 1 | -14/+15
  llvm-svn: 40819
* Promoting allocas with the 'single store' fastpath is faster than with the 'local to a block' fastpath. | Chris Lattner | 2007-08-04 | 1 | -10/+9
  This speeds up PR1432 from 2.1232 to 2.0686s (2.6%)
  llvm-svn: 40818
* When PromoteLocallyUsedAllocas promoted allocas, it didn't remember to increment NumLocalPromoted, and didn't actually delete the dead alloca, leading to an extra iteration of mem2reg. | Chris Lattner | 2007-08-04 | 1 | -2/+13
  llvm-svn: 40817
* std::map -> DenseMap | Chris Lattner | 2007-08-04 | 1 | -3/+3
  llvm-svn: 40816
* Clean up comments, fix up some confusing code logic. | Nick Lewycky | 2007-08-04 | 1 | -30/+47
  Predsimplify fails llvm-gcc bootstrap.
  llvm-svn: 40815
* fix a logic bug where we wouldn't promote single store allocas if the stored value was a non-instruction value. | Chris Lattner | 2007-08-04 | 1 | -2/+2
  Doh. This increases the # single store allocas from 8982 to 9026, and speeds up mem2reg on the testcase in PR1432 from 2.17 to 2.13s.
  llvm-svn: 40813
* When we do the single-store optimization, delete both the store and the alloca so they don't get reprocessed. | Chris Lattner | 2007-08-04 | 1 | -2/+8
  This speeds up PR1432 from 2.20s to 2.17s.
  llvm-svn: 40812
* Three improvements: | Chris Lattner | 2007-08-04 | 1 | -6/+16
  1. Check for revisiting a block before checking domination, which is faster.
  2. If the stored value isn't an instruction, we don't have to check for domination.
  3. If we have a value used in the same block more than once, make sure to remove the block from the UsingBlocks vector. Not doing so forces us to go through the slow path for the alloca.
  The combination of these improvements increases the number of allocas on the fastpath from 8935 to 8982 on PR1432. This speeds it up from 2.90s to 2.20s (31%)
  llvm-svn: 40811
* switch from using a std::set to using a SmallPtrSet. | Chris Lattner | 2007-08-04 | 1 | -3/+3
  This speeds up the testcase in PR1432 from 6.33s to 2.90s (2.22x)
  llvm-svn: 40810
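The general idea behind a small-size-optimized pointer set can be sketched as follows. This is not LLVM's SmallPtrSet implementation; `TinyPtrSet` is a hypothetical stand-in that keeps the first N pointers in an inline array (searched linearly, with no heap allocation) and only moves to a hash set once that array overflows.

```cpp
#include <array>
#include <cstddef>
#include <unordered_set>

// Hypothetical sketch of a small-size-optimized pointer set (assumes N >= 1).
template <typename T, std::size_t N>
class TinyPtrSet {
  std::array<const T *, N> small_{};    // inline storage for the common case
  std::size_t smallCount_ = 0;          // number of valid entries in small_
  std::unordered_set<const T *> large_; // used only after overflow

public:
  // Returns true if p was newly inserted, false if it was already present.
  bool insert(const T *p) {
    if (!large_.empty())
      return large_.insert(p).second;
    for (std::size_t i = 0; i < smallCount_; ++i)
      if (small_[i] == p)
        return false;
    if (smallCount_ < N) {
      small_[smallCount_++] = p;
      return true;
    }
    // Overflow: the inline array is full, so move everything to the hash set.
    large_.insert(small_.begin(), small_.end());
    smallCount_ = 0;
    return large_.insert(p).second;
  }

  bool contains(const T *p) const {
    if (!large_.empty())
      return large_.count(p) != 0;
    for (std::size_t i = 0; i < smallCount_; ++i)
      if (small_[i] == p)
        return true;
    return false;
  }

  std::size_t size() const {
    return large_.empty() ? smallCount_ : large_.size();
  }
};
```

Compared with a node-based `std::set`, a structure like this avoids a heap allocation per element while the set stays small, which is one plausible reason for a speedup of the kind the commit reports.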
* In mem2reg, when handling the single-store case, make sure to remove a using block from the list if we handle it. | Chris Lattner | 2007-08-04 | 1 | -8/+10
  Not doing this caused us to not be able to promote (with the fast path) allocas which have uses (whoops). This increases the # allocas hitting this fastpath from 4042 to 8935 on the testcase in PR1432, speeding up mem2reg by 2.6x
  llvm-svn: 40809
* Regenerating. | Chandler Carruth | 2007-08-04 | 6 | -4835/+6441
  llvm-svn: 40808
* This is the patch to provide clean intrinsic function overloading support in LLVM. | Chandler Carruth | 2007-08-04 | 27 | -323/+607
  It cleans up the intrinsic definitions and generally smooths the process for more complicated intrinsic writing. It will be used by the upcoming atomic intrinsics as well as vector and float intrinsics in the future.
  This also changes the syntax for llvm.bswap, llvm.part.set, llvm.part.select, and llvm.ct* intrinsics. They are automatically upgraded by both the LLVM ASM reader and the bitcode reader. The test cases have been updated, with special tests added to ensure the automatic upgrading is supported.
  llvm-svn: 40807
* split rewriting of single-store allocas into its own method. | Chris Lattner | 2007-08-04 | 1 | -39/+57
  llvm-svn: 40806
* refactor some code to shrink PromoteMem2Reg::run a bit | Chris Lattner | 2007-08-04 | 1 | -63/+96
  llvm-svn: 40805
* add a typedef, no other change. | Chris Lattner | 2007-08-04 | 1 | -7/+8
  llvm-svn: 40804
* avoid an unneeded vector copy. | Chris Lattner | 2007-08-04 | 1 | -1/+9
  This speeds up mem2reg on the testcase in PR1432 by 6%
  llvm-svn: 40803
* make RenamePassWorkList a local var instead of an ivar. | Chris Lattner | 2007-08-04 | 1 | -8/+8
  llvm-svn: 40802
* Implement codegen for __builtin_choose_expr. For example: | Chris Lattner | 2007-08-04 | 4 | -3/+17

    struct X { int A; };
    void foo() {
      struct X s;
      int i;
      i = __builtin_choose_expr(0, s, i);
    }

  compiles to:

    %tmp = load i32* %i ; <i32> [#uses=1]
    store i32 %tmp, i32* %i

  wow :)
  llvm-svn: 40801
* the sse intrinsics are missing, leading to errors. | Chris Lattner | 2007-08-04 | 1 | -1/+1
  llvm-svn: 40800
* fix hang in testsuite | Chris Lattner | 2007-08-04 | 1 | -1/+1
  llvm-svn: 40799
* fix constness issues. | Chris Lattner | 2007-08-04 | 1 | -4/+8
  llvm-svn: 40798
* Make x86 long double alignment 32 for everything but Darwin (which makes size within a struct==96) | Dale Johannesen | 2007-08-03 | 1 | -1/+3
  llvm-svn: 40796
* Restrict vector component access (using "." and "[]") to variables. | Steve Naroff | 2007-08-03 | 3 | -0/+13
  Chris suggested this, since it simplifies the code generator. If this feature is needed (and we don't think it is), we can revisit.
  The following test case now produces an error.

    [dylan:~/llvm/tools/clang] admin% cat t.c
    typedef __attribute__(( ocu_vector_type(4) )) float float4;
    static void test() {
      float4 vec4;
      vec4.rg.g;
      vec4.rg[1];
    }
    [dylan:~/llvm/tools/clang] admin% ../../Debug/bin/clang t.c
    t.c:8:12: error: vector component access limited to variables
      vec4.rg.g;
      ^~
    t.c:9:12: error: vector component access limited to variables
      vec4.rg[1];
      ^~~
    2 diagnostics generated.

  llvm-svn: 40795
* Implement __builtin_choose_expr. | Steve Naroff | 2007-08-03 | 9 | -11/+109
  llvm-svn: 40794
* long double patch 3 of N. Add to MVT. | Dale Johannesen | 2007-08-03 | 3 | -82/+92
  llvm-svn: 40793
* long double patch 2 of N. Handle it in TargetData. | Dale Johannesen | 2007-08-03 | 6 | -7/+19
  (I've tried to get the info right for all targets, but I'm not expert on all of them - check yours.)
  llvm-svn: 40792
* Fix a subtle miscompilation. This allows 197.parser to be compiled correctly. | Owen Anderson | 2007-08-03 | 1 | -6/+8
  llvm-svn: 40791
* Add a test case to validate code gen for typeof/builtin_types_compatible. | Steve Naroff | 2007-08-03 | 2 | -1/+36
  This test case currently generates the following unexpected warnings (when compared with gcc).

    [dylan:clang/test/Parser] admin% ../../../../Debug/bin/clang -parse-ast-check builtin_types_compatible.c
    Warnings seen but not expected:
      Line 28: expression result unused
      Line 29: expression result unused
      Line 30: expression result unused
      Line 31: expression result unused
      Line 32: expression result unused
      Line 33: expression result unused

  llvm-svn: 40789
* implement codegen support for __builtin_types_compatible_p | Chris Lattner | 2007-08-03 | 2 | -0/+11
  llvm-svn: 40788
* fix a buggy comment I added | Chris Lattner | 2007-08-03 | 1 | -1/+1
  llvm-svn: 40787
* Rename AddrLabel and OCUVectorComponent -> AddrLabelExpr and OCUVectorElementExpr respectively. | Chris Lattner | 2007-08-03 | 7 | -66/+69
  This is for consistency with other expr nodes, which end with *Expr.
  llvm-svn: 40785
* testcase for vector element access stuff. | Chris Lattner | 2007-08-03 | 1 | -0/+24
  llvm-svn: 40783
* implement codegen for multidest ocuvector expressions, like: | Chris Lattner | 2007-08-03 | 1 | -5/+14

    vec2.yx = vec2; // reverse

  llvm-svn: 40782
* add codegen support for storing into a single-element ocu lvalue, such as: | Chris Lattner | 2007-08-03 | 2 | -10/+40

    vec2.x = f;

  llvm-svn: 40781
* refactor handling of ocuvector lvalue->rvalue codegen into its own method. | Chris Lattner | 2007-08-03 | 2 | -43/+52
  llvm-svn: 40780
* In the common case where we are shuffling a vector, emit an llvm vector shuffle instead of a bunch of insert/extract operations. | Chris Lattner | 2007-08-03 | 1 | -2/+19
  For:

    vec4 = vec4.yyyy; // splat

  Emit:

    %tmp1 = shufflevector <4 x float> %tmp, <4 x float> undef, <4 x i32> < i32 1, i32 1, i32 1, i32 1 >

  instead of:

    %tmp1 = extractelement <4 x float> %tmp, i32 1
    %tmp2 = insertelement <4 x float> undef, float %tmp1, i32 0
    %tmp3 = extractelement <4 x float> %tmp, i32 1
    %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 1
    %tmp5 = extractelement <4 x float> %tmp, i32 1
    %tmp6 = insertelement <4 x float> %tmp4, float %tmp5, i32 2
    %tmp7 = extractelement <4 x float> %tmp, i32 1
    %tmp8 = insertelement <4 x float> %tmp6, float %tmp7, i32 3

  llvm-svn: 40779
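The shuffle mask in the example above follows directly from the swizzle letters: each of `x`, `y`, `z`, `w` names element 0 through 3 of the source vector. A minimal sketch of that mapping, assuming only the xyzw naming (`swizzleToMask` is a hypothetical helper, not a clang function):

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: turns a swizzle such as "yyyy" into the list of source
// element indices that a shufflevector mask would carry.
std::vector<int> swizzleToMask(const std::string &swizzle) {
  std::vector<int> mask;
  mask.reserve(swizzle.size());
  for (char c : swizzle) {
    switch (c) {
    case 'x': mask.push_back(0); break;
    case 'y': mask.push_back(1); break;
    case 'z': mask.push_back(2); break;
    case 'w': mask.push_back(3); break;
    default:
      throw std::invalid_argument("unknown vector component");
    }
  }
  return mask;
}
```

With this mapping, `"yyyy"` yields the indices 1, 1, 1, 1, matching the `<4 x i32> < i32 1, i32 1, i32 1, i32 1 >` mask in the commit message.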
* add OCUVectorComponent::getNumComponents() | Chris Lattner | 2007-08-03 | 2 | -1/+9
  llvm-svn: 40778
* Add support for scalar-returning element accesses like V.x | Chris Lattner | 2007-08-03 | 1 | -2/+12
  llvm-svn: 40777
* Fix a subtle iterator invalidation bug in a recursive algorithm. | Owen Anderson | 2007-08-03 | 1 | -5/+7
  llvm-svn: 40776
* Prepare for "core" website. | Reid Spencer | 2007-08-03 | 1 | -0/+26
  llvm-svn: 40775
* Long double, part 1 of N. Support in IR. | Dale Johannesen | 2007-08-03 | 15 | -4515/+3750
  llvm-svn: 40774
* add an observation | Chris Lattner | 2007-08-03 | 1 | -0/+27
  llvm-svn: 40772
* implement lvalue to rvalue conversion for ocuvector components. | Chris Lattner | 2007-08-03 | 2 | -1/+28
  We can now compile stuff like this:

    typedef __attribute__(( ocu_vector_type(4) )) float float4;
    float4 test1(float4 V) {
      return V.wzyx+V;
    }

  to:

    _test1:
            pshufd $27, %xmm0, %xmm1
            addps %xmm0, %xmm1
            movaps %xmm1, %xmm0
            ret

  and:

    _test1:
            mfspr r2, 256
            oris r3, r2, 4096
            mtspr 256, r3
            li r3, lo16(LCPI1_0)
            lis r4, ha16(LCPI1_0)
            lvx v3, r4, r3
            vperm v3, v2, v2, v3
            vaddfp v2, v3, v2
            mtspr 256, r2
            blr

  llvm-svn: 40771
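The `$27` immediate in the x86 output is consistent with the `wzyx` swizzle. `pshufd` selects each destination element with a 2-bit field of its immediate, with element 0 in the lowest bits, so the indices 3, 2, 1, 0 pack to 3 + (2 << 2) + (1 << 4) + (0 << 6) = 27. A small sketch of that arithmetic (`pshufdImmediate` is a hypothetical helper written for this illustration):

```cpp
#include <array>
#include <stdexcept>

// Hypothetical helper: packs four source-element indices (each 0..3) into a
// pshufd-style 8-bit immediate, two bits per destination element, with
// destination element 0 in the lowest bits.
unsigned pshufdImmediate(const std::array<int, 4> &mask) {
  unsigned imm = 0;
  for (int i = 0; i < 4; ++i) {
    if (mask[i] < 0 || mask[i] > 3)
      throw std::out_of_range("element index must be in 0..3");
    imm |= static_cast<unsigned>(mask[i]) << (2 * i);
  }
  return imm;
}
```

Passing the `wzyx` indices {3, 2, 1, 0} gives 27, the value seen in the generated `pshufd`.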
* add support for codegen of an OCUVectorComponent as an lvalue. | Chris Lattner | 2007-08-02 | 2 | -4/+34
  We can now codegen:

    vec4.xy;

  as nothing!
  llvm-svn: 40769
* Add support for encoding an OCUVectorComponent into a single integer. | Chris Lattner | 2007-08-02 | 3 | -5/+37
  llvm-svn: 40768