path: root/llvm/test/Transforms
Commit message | Author | Age | Files | Lines
* filecheckize this. | Chris Lattner | 2010-01-18 | 1 | -14/+22
| | | | llvm-svn: 93776
* filecheckize | Chris Lattner | 2010-01-18 | 1 | -13/+19
| | | | llvm-svn: 93775
* remove a redundant test, filecheckize another. | Chris Lattner | 2010-01-18 | 2 | -43/+26
| | | | llvm-svn: 93774
* Reduce fsub-fadd.ll and merge it into fsub-fsub.ll. Rename fsub-fsub.ll to | Bill Wendling | 2010-01-17 | 3 | -47/+23
| | | | | | fsub.ll and FileCheckify it. llvm-svn: 93669
* When the visitSub method was split into visitSub and visitFSub, this xform was | Bill Wendling | 2010-01-13 | 1 | -0/+39
  added to the FSub version. However, the original version of this xform guarded
  against doing this for floating point (!Op0->getType()->isFPOrFPVector()).
  This is causing LLVM to perform incorrect xforms for code like:

    void func(double *rhi, double *rlo, double xh, double xl, double yh, double yl){
      double mh, ml;
      double c = 134217729.0;
      double up, u1, u2, vp, v1, v2;

      up = xh*c;
      u1 = (xh - up) + up;
      u2 = xh - u1;

      vp = yh*c;
      v1 = (yh - vp) + vp;
      v2 = yh - v1;

      mh = xh*yh;
      ml = (((u1*v1 - mh) + (u1*v2)) + (u2*v1)) + (u2*v2);
      ml += xh*yl + xl*yh;

      *rhi = mh + ml;
      *rlo = (mh - (*rhi)) + ml;
    }

  The last line was optimized away, but rlo is intended to be the difference
  between the infinitely precise result of mh + ml and that result after it has
  been rounded to double precision. llvm-svn: 93369
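A minimal C sketch of why the guard matters. It assumes the fold in question simplifies an expression of the shape x - (x + y) as if floating-point addition were exact; the commit does not spell the pattern out, so treat that as an assumption. The helper name `rounding_error` and the sample values are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: recovers what rounding discarded from hi + lo,
 * the same way the last line of func() computes *rlo.
 * volatile keeps the compiler from simplifying the expression algebraically. */
static double rounding_error(double hi, double lo) {
    volatile double sum = hi + lo;    /* rounded to double precision */
    volatile double diff = hi - sum;  /* negated part of lo that survived in sum */
    return diff + lo;                 /* part of lo that was lost to rounding */
}
```

With hi = 1.0 and lo = 1e-17 the sum rounds back to 1.0, so the function returns 1e-17. Treating the arithmetic as exact would give 0, which is the wrong answer the commit describes.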
* 1) Use the new SimplifyInstructionsInBlock routine instead of the copy | Chris Lattner | 2010-01-12 | 1 | -6/+1
| | | | | | | | | | | | in JT. 2) When cloning blocks for PHI or xor conditions, use instsimplify to simplify the code as we go. This allows us to squish common cases early in JT which opens up opportunities for subsequent iterations, and allows it to completely simplify the testcase. llvm-svn: 93253
* Make several tests less fragile. | Dan Gohman | 2010-01-12 | 2 | -5/+9
| | | | llvm-svn: 93230
* Teach jump threading to duplicate small blocks when the branch | Chris Lattner | 2010-01-12 | 1 | -5/+18
  condition is an xor with a phi node. This eliminates nonsense like this from
  176.gcc in several places:

     LBB166_84:
            testl   %eax, %eax
    -       setne   %al
    -       xorb    %cl, %al
    -       notb    %al
    -       testb   $1, %al
    -       je      LBB166_85
    +       je      LBB166_69
    +       jmp     LBB166_85

  This is rdar://7391699 llvm-svn: 93221
* disable this testcase, PR5997 | Chris Lattner | 2010-01-11 | 1 | -6/+8
| | | | llvm-svn: 93206
* add one more bitfield optimization, allowing clang to generate | Chris Lattner | 2010-01-11 | 1 | -0/+16
  good code on PR4216:

    _test_bitfield:                         ## @test_bitfield
            orl     $32962, %edi
            movl    $4294941946, %eax
            andq    %rdi, %rax
            ret

  instead of:

    _test_bitfield:
            movl    $4294941696, %ecx
            movl    %edi, %eax
            orl     $194, %edi
            orl     $32768, %eax
            andq    $250, %rdi
            andq    %rax, %rcx
            movq    %rdi, %rax
            orq     %rcx, %rax
            ret

  Evan is looking into the remaining andq+imm -> andl optimization. llvm-svn: 93147
* Extend CanEvaluateZExtd to handle and/or/xor more aggressively in the | Chris Lattner | 2010-01-11 | 1 | -0/+31
  BitsToClear case. This allows it to promote expressions which have an
  and/or/xor after the lshr, promoting cases like test2 (from PR4216) and test3
  (random example extracted from a spec benchmark). clang now compiles the code
  in PR4216 into:

    _test_bitfield:                         ## @test_bitfield
            movl    %edi, %eax
            orl     $194, %eax
            movl    $4294902010, %ecx
            andq    %rax, %rcx
            orl     $32768, %edi
            andq    $39936, %rdi
            movq    %rdi, %rax
            orq     %rcx, %rax
            ret

  instead of:

    _test_bitfield:                         ## @test_bitfield
            movl    %edi, %eax
            orl     $194, %eax
            movl    $4294902010, %ecx
            andq    %rax, %rcx
            shrl    $8, %edi
            orl     $128, %edi
            shlq    $8, %rdi
            andq    $39936, %rdi
            movq    %rdi, %rax
            orq     %rcx, %rax
            ret

  which is still not great, but is progress. llvm-svn: 93145
* Remove the dead TD argument to CanEvaluateZExtd, and add a | Chris Lattner | 2010-01-11 | 1 | -1/+24
| | | | | | | | | new BitsToClear result which allows us to start promoting expressions that end with a lshr-by-constant. This is conservatively correct and better than what we had before (see testcases) but still needs to be extended further. llvm-svn: 93144
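A rough C sketch of the idea behind promoting an expression that ends with a lshr-by-constant. It assumes an 8-bit narrow type and a 32-bit wide type, and it reads "BitsToClear" as the high bits that must be masked off after shifting in the wide type; both are assumptions of this sketch, and the function names are hypothetical.

```c
#include <stdint.h>

/* Narrow form: trunc to 8 bits, logical shift right, then zext to 32 bits. */
static uint32_t narrow_then_zext(uint32_t x, unsigned c) {
    uint8_t t = (uint8_t)x;         /* trunc i32 -> i8 */
    uint8_t s = (uint8_t)(t >> c);  /* lshr in the narrow type, c in 0..7 */
    return (uint32_t)s;             /* zext i8 -> i32 */
}

/* Promoted form: shift in the wide type, then clear the bits that the
 * narrow shift would have filled with zeros. */
static uint32_t promoted(uint32_t x, unsigned c) {
    uint32_t mask = 0xFFu >> c;     /* only the low (8 - c) bits survive */
    return (x >> c) & mask;
}
```

Both functions agree for every x and every shift amount from 0 to 7, which is the property the promotion relies on.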
* teach sext optimization to handle truncs from types that are not | Chris Lattner | 2010-01-10 | 1 | -0/+26
| | | | | | the dest of the sext. llvm-svn: 93128
* teach zext optimization how to deal with truncs that don't come from | Chris Lattner | 2010-01-10 | 1 | -1/+25
  the zext dest type. This allows us to handle test52/53 in cast.ll, and allows
  llvm-gcc to generate much better code for PR4216 in -m64 mode:

    _test_bitfield:                         ## @test_bitfield
            orl     $32962, %edi
            movl    %edi, %eax
            andl    $-25350, %eax
            ret

  This also fixes a bug handling vector extends, ensuring that the mask produced
  is a vector constant, not an integer constant. llvm-svn: 93127
* now that the cost model has changed, we can always consider | Chris Lattner | 2010-01-10 | 1 | -2/+44
| | | | | | | | elimination of a sign extend to be a win, which simplifies the client of CanEvaluateSExtd, and allows us to eliminate more casts (examples taken from real code). llvm-svn: 93109
* change the preferred canonical form for a sign extension to be | Chris Lattner | 2010-01-10 | 1 | -7/+0
| | | | | | | | shl+ashr instead of trunc+sext. We want to avoid type conversions whenever possible; it is easier to codegen expressions without truncates and extensions. llvm-svn: 93107
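For illustration, the two forms of a sign extension can be compared in C. This sketch reads the shift pair as a left shift followed by an arithmetic right shift, which is an assumption about the intended form, and it uses an 8-bit value inside a 32-bit register. The function names are hypothetical.

```c
#include <stdint.h>

/* trunc+sext form: narrow to 8 bits, then sign-extend back to 32 bits.
 * NOTE: converting an out-of-range value to int8_t is implementation-defined
 * in C; this assumes the usual two's-complement wraparound. */
static int32_t sext_via_casts(uint32_t x) {
    return (int32_t)(int8_t)(uint8_t)x;
}

/* Shift-pair form: move the low byte to the top, then shift it back down.
 * NOTE: right-shifting a negative signed value is implementation-defined
 * in C; this assumes the usual arithmetic shift. */
static int32_t sext_via_shifts(uint32_t x) {
    return (int32_t)(x << 24) >> 24;
}
```

The shift-pair form never leaves the 32-bit type, which is the property the commit message asks for.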
* two changes: | Chris Lattner | 2010-01-10 | 1 | -0/+26
| | | | | | | | | | | 1) don't try to optimize a sext or zext that is only used by a trunc, let the trunc get optimized first. This avoids some pointless effort in some common cases since instcombine scans down a block in the first pass. 2) Change the cost model for zext elimination to consider an 'and' cheaper than a zext. This allows us to do it more aggressively, and for the next patch to simplify the code quite a bit. llvm-svn: 93097
* enhance CanEvaluateZExtd to handle shift left and sext, allowing | Chris Lattner | 2010-01-10 | 1 | -0/+28
| | | | | | more expressions to be promoted and casts eliminated. llvm-svn: 93096
* Use WriteAsOperand instead of getName() to print loop header names, | Dan Gohman | 2010-01-09 | 2 | -2/+2
| | | | | | so that unnamed blocks are handled. llvm-svn: 93059
* only factor from expressions whose uses are empty and whose | Chris Lattner | 2010-01-09 | 1 | -1/+19
| | | | | | base is the right expression type. This fixes PR5981. llvm-svn: 93045
* teach instcombine to delete sign extending shift pairs (sra(shl X, C), C) when | Chris Lattner | 2010-01-08 | 1 | -0/+19
| | | | | | the input is already sign extended. llvm-svn: 93019
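A small C sketch of the condition under which such a shift pair is a no-op: if the input already has more than C copies of its sign bit at the top, shifting left by C and arithmetically right by C returns the input unchanged. The helper names are hypothetical, and the code assumes right shifts of negative values are arithmetic (implementation-defined in C).

```c
#include <stdint.h>

/* Count how many of the top bits are copies of the sign bit (1..32). */
static unsigned num_sign_bits(int32_t v) {
    uint32_t u = (uint32_t)v;
    uint32_t sign = u >> 31;
    unsigned n = 1;
    while (n < 32 && ((u >> (31 - n)) & 1u) == sign)
        ++n;
    return n;
}

/* sra(shl x, c), c: the sign-extending shift pair, for c in 0..31. */
static int32_t shift_pair(int32_t x, unsigned c) {
    return (int32_t)((uint32_t)x << c) >> c;
}
```

For example, 100 has 25 sign bits, so the pair is redundant for any shift amount up to 24 but changes the value at 25.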
* fix PR5978 by peeling the loop so that we avoid shifting the | Chris Lattner | 2010-01-08 | 1 | -0/+10
| | | | | | | result int by 8 for the first byte. While normally harmless, if the result is smaller than a byte, this shift is invalid. llvm-svn: 93018
* teach ComputeNumSignBits to look through PHI nodes. | Chris Lattner | 2010-01-07 | 1 | -0/+18
| | | | llvm-svn: 92964
* filecheckize | Chris Lattner | 2010-01-07 | 1 | -3/+5
| | | | llvm-svn: 92963
* Enhance instcombine to reason more strongly about promoting computation | Chris Lattner | 2010-01-07 | 1 | -0/+11
| | | | | | | that feeds into a zext, similar to the patch I did yesterday for sext. There is a lot of room for extension beyond this patch. llvm-svn: 92962
* fix a globalopt crash on 'bullet' (handling evaluation of a store | Chris Lattner | 2010-01-07 | 1 | -0/+16
| | | | | | | | | | to an element of a vector in a static ctor) which occurs with an unrelated patch I'm testing. Annoyingly, EvaluateStoreInto basically does exactly the same stuff as InsertElement constant folding, but it now handles vectors, and you can't insertelement into a vector. It would be 'really nice' if GEP into a vector were not legal. llvm-svn: 92889
* Fix a README item: have functionattrs look through selects and | Duncan Sands | 2010-01-06 | 1 | -2/+27
| | | | | | | | | phi nodes when deciding which pointers point to local memory. I actually checked long ago how useful this is, and it isn't very: it hardly ever fires in the testsuite, but since Chris wants it here it is! llvm-svn: 92836
* Partially address a README by having functionattrs consider calls to | Duncan Sands | 2010-01-06 | 1 | -2/+31
| | | | | | | | | | memcpy, memset and other intrinsics that only access their arguments to be readnone if the intrinsic's arguments all point to local memory. This improves the testcase in the README to readonly, but it could in theory be made readnone, however this would involve more sophisticated analysis that looks through the memcpy. llvm-svn: 92829
* Teach instcombine's sext elimination logic to be more aggressive. | Chris Lattner | 2010-01-06 | 2 | -21/+11
| | | | | | | | | | | | | | Previously, instcombine would only promote an expression tree to the larger type if doing so eliminated two casts. This is because of the need to manually do the sign extend after the promoted expression tree with two shifts. Now, we keep track of whether the result of the computation is going to be properly sign extended already. If so, we can unconditionally promote the expression, which allows us to zap more sext's. This implements rdar://6598839 (aka gcc pr38751) llvm-svn: 92815
* Move this test from test/Transforms/IndVarSimplify to | Dan Gohman | 2010-01-05 | 1 | -19/+0
| | | | | | | test/CodeGen/X86, as it doesn't use -indvars, and it does use llc -march=x86-64. llvm-svn: 92799
* more rearrangement and cleanup, fix my test failure. | Chris Lattner | 2010-01-05 | 1 | -4/+4
| | | | llvm-svn: 92792
* remove two trunc xforms that are subsumed by EvaluateInDifferentType. | Chris Lattner | 2010-01-05 | 1 | -0/+2
| | | | | | | The only difference is that EvaluateInDifferentType checks to ensure they are profitable before doing them :) llvm-svn: 92788
* merge some tests. | Chris Lattner | 2010-01-05 | 5 | -49/+39
| | | | llvm-svn: 92786
* merge cast2 into cast.ll | Chris Lattner | 2010-01-05 | 2 | -37/+37
| | | | llvm-svn: 92784
* remove useless test. | Chris Lattner | 2010-01-05 | 1 | -35/+0
| | | | llvm-svn: 92782
* another example. | Chris Lattner | 2010-01-05 | 1 | -0/+8
| | | | llvm-svn: 92781
* remove a useless negative test, add a rdar # to an xfail that I'm working on. | Chris Lattner | 2010-01-05 | 2 | -46/+1
| | | | llvm-svn: 92777
* clean up tests. | Chris Lattner | 2010-01-05 | 3 | -28/+9
| | | | llvm-svn: 92776
* just remove this xform which is subsumed by others. | Chris Lattner | 2010-01-05 | 1 | -2/+2
| | | | llvm-svn: 92775
* optimize comparisons against cttz/ctlz/ctpop, patch by Alastair Lynn! | Chris Lattner | 2010-01-05 | 1 | -4/+26
| | | | llvm-svn: 92745
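A hedged C sketch of the kind of equivalence such a comparison fold can exploit: a population count of zero, or a trailing-zero count equal to the bit width, both mean the input is zero. The commit text does not list the exact patterns, so these two are assumptions. The helpers are portable stand-ins for the intrinsics, and defining cttz of zero as 32 is a choice made for this sketch.

```c
#include <stdint.h>

/* Portable population count (stand-in for ctpop). */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Count trailing zeros (stand-in for cttz); returns 32 for x == 0
 * as an assumption of this sketch. */
static unsigned cttz32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}
```

Under these definitions `popcount32(x) == 0` and `cttz32(x) == 32` can each be replaced by the cheaper test `x == 0`.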
* Delete useless trailing semicolons. | Dan Gohman | 2010-01-05 | 19 | -25/+25
| | | | llvm-svn: 92740
* optimize cttz and ctlz when we can prove something about the | Chris Lattner | 2010-01-05 | 1 | -1/+24
| | | | | | leading/trailing bits. Patch by Alastair Lynn! llvm-svn: 92706
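As an illustration of proving something about the leading bits, here is a C sketch in which the result of a leading-zero count is a constant no matter what the input is. The expression and the helper names are made up for this example and are not taken from the patch.

```c
#include <stdint.h>

/* Count leading zeros (stand-in for ctlz); returns 32 for x == 0
 * as an assumption of this sketch. */
static unsigned ctlz32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

/* The shift clears the top four bits and the OR sets bit 27, so the
 * highest set bit is always bit 27 and the count is always 4. */
static unsigned ctlz_with_known_bits(uint32_t x) {
    return ctlz32((x >> 4) | 0x08000000u);
}
```

Because the top four bits are known zero and bit 27 is known one, an optimizer with that knowledge could replace the whole call with the constant 4.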
* fix an infinite loop in reassociate building emacs. | Chris Lattner | 2010-01-05 | 1 | -0/+15
| | | | llvm-svn: 92679
* Remove dead debug info intrinsics. | Devang Patel | 2010-01-05 | 1 | -55/+0
| | | | | | | | | | Intrinsic::dbg_stoppoint Intrinsic::dbg_region_start Intrinsic::dbg_region_end Intrinsic::dbg_func_start AutoUpgrade simply ignores these intrinsics now. llvm-svn: 92557
* Truncate GEP indexes larger than the pointer size down to pointer size | Chris Lattner | 2010-01-04 | 1 | -9/+9
| | | | | | | | | | | when doing this transform if the GEP is not inbounds. No testcase because it is very difficult to trigger this: instcombine already canonicalizes GEP indices to pointer size, so it relies on specific permutations of the instcombine worklist. Thanks to Duncan for pointing this possible problem out. llvm-svn: 92495
* implement an instcombine xform needed by clang's codegen | Chris Lattner | 2010-01-04 | 1 | -0/+13
| | | | | | | | on the example in PR4216. This doesn't trigger in the testsuite, so I'd really appreciate someone scrutinizing the logic for correctness. llvm-svn: 92458
* generalize the previous transformation to handle indexing into | Chris Lattner | 2010-01-03 | 1 | -0/+18
| | | | | | | | | | | | | | arrays of structs and other arrays, so long as all the subsequent indexes are constants. This triggers frequently for stuff like: @divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]*> [#uses=50] %623 = getelementptr inbounds [29 x [2 x i32]]* @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1] %684 = icmp eq i32 %683, 999 also for the "my_defs" table in 'gs', etc. llvm-svn: 92444
* teach instcombine to optimize idioms like A[i]&42 == 0. This | Chris Lattner | 2010-01-02 | 1 | -0/+12
| | | | | | | occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which is copied in multiple apps) in _sch_istable, etc. llvm-svn: 92427
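One way to picture this kind of table-lookup fold in C: when the table is a small constant, the answer to "is A[i] & 42 zero?" can be precomputed for every index and packed into a single bitmask, so the load disappears. The table contents and all names below are hypothetical, and the sketch only covers tables of up to 32 entries.

```c
#include <stdint.h>

/* Hypothetical 8-entry constant table; the values are made up. */
static const uint8_t table[8] = { 0, 42, 2, 5, 8, 40, 16, 1 };

/* Original idiom: load from the table, mask, compare with zero. */
static int test_by_lookup(unsigned i) {
    return (table[i] & 42) == 0;
}

/* Precompute one bit per index: bit i is set iff (table[i] & 42) == 0. */
static uint32_t build_magic(void) {
    uint32_t magic = 0;
    for (unsigned i = 0; i < 8; ++i)
        if ((table[i] & 42) == 0)
            magic |= 1u << i;
    return magic;
}

/* Folded form: test bit i of the precomputed constant, no memory access. */
static int test_by_magic(uint32_t magic, unsigned i) {
    return (int)((magic >> i) & 1u);
}
```

For this table the mask works out to 0xC9 (indices 0, 3, 6 and 7 satisfy the predicate).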
* Teach the table lookup optimization to generate range compares | Chris Lattner | 2010-01-02 | 1 | -3/+26
  when a consecutive sequence of elements all satisfies the predicate. Like the
  double compare case, this generates better code than the magic constant case
  and generalizes to more than 32/64 element array lookups.

  Here are some examples where it triggers. From 403.gcc, most accesses to the
  rtx_class array are handled, e.g.:

    @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
    %142 = icmp eq i8 %141, 105

    @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
    %165 = icmp eq i8 %164, 60

  Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) optimized
  before are actually range compares. This lets 32-bit machines optimize them.
400.perlbmk has stuff like this: 400.perlbmk: PL_regkind, even for 32-bit: @PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4] %811 = icmp ne i8 %810, 33 @PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94] %12 = icmp ult i8 %10, 2 etc. llvm-svn: 92426
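A minimal C sketch of a range compare standing in for a table lookup. The table here is hypothetical and much smaller than rtx_class: every entry equal to 'i' sits in the consecutive index range 3..5, so the lookup-and-compare can be replaced by a single unsigned range check on the index.

```c
/* Hypothetical 10-entry class table; 'i' occupies indices 3, 4 and 5 only. */
static const char classes[10] = { 'x', 'x', 'm', 'i', 'i', 'i', 'x', 'o', 'o', 'x' };

/* Original form: load from the table and compare. */
static int is_i_by_lookup(unsigned idx) {
    return classes[idx] == 'i';
}

/* Range-compare form: idx - 3 wraps to a huge value when idx < 3, so one
 * unsigned comparison checks 3 <= idx <= 5. */
static int is_i_by_range(unsigned idx) {
    return idx - 3u < 3u;
}
```

Unlike a bitmask constant, this form does not depend on the table fitting in 32 or 64 bits, which matches the commit's remark about generalizing to larger arrays.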
* Fix logic error in previous commit. The != case needs to become an or, not an | Nick Lewycky | 2010-01-02 | 1 | -0/+14
| | | | | | and. llvm-svn: 92419