summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar/CodeGenPrepare.cpp
Commit message (Collapse)AuthorAgeFilesLines
* CodeGenPrepare: Don't crash when TLI is not available.Benjamin Kramer2012-06-291-1/+2
| | | | | | This happens when codegenprepare is invoked via opt. llvm-svn: 159457
* Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.hChandler Carruth2012-06-291-12/+12
| | | | | | | | | | | | | | | | | This was always part of the VMCore library out of necessity -- it deals entirely in the IR. The .cpp file in fact was already part of the VMCore library. This is just a mechanical move. I've tried to go through and re-apply the coding standard's preferred header sort, but at 40-ish files, I may have gotten some wrong. Please let me know if so. I'll be committing the corresponding updates to Clang and Polly, and Duncan has DragonEgg. Thanks to Bill and Eric for giving the green light for this bit of cleanup. llvm-svn: 159421
* Switch the select to branch transformation on by default.Benjamin Kramer2012-05-061-3/+4
| | | | | | | | | The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258
* CodeGenPrepare: Add a transform to turn selects into branches in some cases.Benjamin Kramer2012-05-051-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic, it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-Order architectures are unlikely to benefit as well, those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234
* Refactor the interface to recursively simplifying instructions to be tadChandler Carruth2012-03-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | bit simpler by handling a common case explicitly. Also, refactor the implementation to use a worklist based walk of the recursive users, rather than trying to use value handles to detect and recover from RAUWs during the recursive descent. This fixes a very subtle bug in the previous implementation where degenerate control flow structures could cause mutually recursive instructions (PHI nodes) to collapse in just such a way that From became equal to To after some amount of recursion. At that point, we hit the inf-loop that the assert at the top attempted to guard against. This problem is defined away when not using value handles in this manner. There are lots of comments claiming that the WeakVH will protect against just this sort of error, but they're not accurate about the actual implementation of WeakVHs, which do still track RAUWs. I don't have any test case for the bug this fixes because it requires running the recursive simplification on unreachable phi nodes. I've no way to either run this or easily write an input that triggers it. It was found when using instruction simplification inside the inliner when running over the nightly test-suite. llvm-svn: 153393
* Target override to allow CodeGenPrepare to sink address operands to ↵Pete Cooper2012-03-131-0/+9
| | | | | | intrinsics in the same way it current does for loads and stores llvm-svn: 152666
* Do trivial CSE of dead BBs during codegen preparation.Bill Wendling2012-03-041-1/+20
| | | | | | | | Some BBs can become dead after codegen preparation. If we delete them here, it could help enable tail-call optimizations later on. <rdar://problem/10256573> llvm-svn: 152002
* Extend Attributes to 64 bitsKostya Serebryany2012-01-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: LLVM needs more function attributes than currently available (32 bits). One such proposed attribute is "address_safety", which shows that a function is being checked for address safety (by AddressSanitizer, SAFECode, etc). Solution: - extend the Attributes from 32 bits to 64-bits - wrap the object into a class so that unsigned is never erroneously used instead - change "unsigned" to "Attributes" throughout the code, including one place in clang. - the class has no "operator uint64 ()", but it has "uint64_t Raw() " to support packing/unpacking. - the class has "safe operator bool()" to support the common idiom: if (Attributes attr = getAttrs()) useAttrs(attr); - The CTOR from uint64_t is marked explicit, so I had to add a few explicit CTOR calls - Add the new attribute "address_safety". Doing it in the same commit to check that attributes beyond first 32 bits actually work. - Some of the functions from the Attribute namespace are worth moving inside the class, but I'd prefer to have it as a separate commit. Tested: "make check" on Linux (32-bit and 64-bit) and Mac (10.6) built/run spec CPU 2006 on Linux with clang -O2. This change will break clang build in lib/CodeGen/CGCall.cpp. The following patch will fix it. llvm-svn: 148553
* Propagate TargetLibraryInfo throughout ConstantFolding.cpp and Chad Rosier2011-12-011-2/+9
| | | | | | | InstructionSimplify.cpp. Other fixups as needed. Part of rdar://10500969 llvm-svn: 145559
* Fold two identical set lookups into one. No functionality change.Nick Lewycky2011-09-291-4/+2
| | | | llvm-svn: 140821
* Stop emitting instructions with the name "tmp" they eat up memory and have ↵Benjamin Kramer2011-09-271-1/+1
| | | | | | | | to be uniqued, without any benefit. If someone prefers %tmp42 to %42, run instnamer. llvm-svn: 140634
* Use IRBuilder.Devang Patel2011-09-061-17/+14
| | | | llvm-svn: 139156
* Dramatically speedup codegen prepare by a) avoiding use of dominator tree ↵Devang Patel2011-08-181-16/+38
| | | | | | and b) doing a separate pass over dbg.value instructions. llvm-svn: 137908
* Use the getFirstInsertionPt() method instead of getFirstNonPHI + an 'isa<>'Bill Wendling2011-08-161-13/+6
| | | | | | check for a LandingPadInst. llvm-svn: 137745
* In places where it's using "getFirstNonPHI", skip the landingpad instruction ↵Bill Wendling2011-08-151-5/+8
| | | | | | if necessary. llvm-svn: 137679
* Skip the insertion iterator past the landingpad instruction if there.Bill Wendling2011-08-151-0/+1
| | | | llvm-svn: 137626
* land David Blaikie's patch to de-constify Type, with a few tweaks.Chris Lattner2011-07-181-4/+4
| | | | llvm-svn: 135375
* Fix warnings due to 132263; Thanks rdivacky.Nadav Rotem2011-05-291-2/+4
| | | | llvm-svn: 132285
* Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'Nadav Rotem2011-05-271-2/+2
| | | | | | code in one place. Re-apply 131534 and fix the multi-step promotion of integers. llvm-svn: 132217
* Fix warning about || and && without explicit grouping.Chandler Carruth2011-05-261-2/+2
| | | | | | | | This looks like it flagged an actual bug. Devang, please review. I added the parentheses that change behavior, but make the behavior more closely match commit log's intent. llvm-svn: 132165
* Do not insert anything after terminator.Devang Patel2011-05-261-1/+2
| | | | llvm-svn: 132164
* Do not move DBG_VALUE in middle of PHI nodes.Devang Patel2011-05-261-1/+4
| | | | llvm-svn: 132161
* If llvm.dbg.value and the value instruction it refers to are far apart then ↵Devang Patel2011-05-261-1/+13
| | | | | | iSel may not be able to find corresponding Node for llvm.dbg.value during DAG construction. Make iSel's life easier by removing this distance between llvm.dbg.value and its value instruction. llvm-svn: 132151
* Add a parameter to ConstantFoldTerminator() that callers can use to ask it ↵Frits van Bommel2011-05-221-1/+1
| | | | | | | | to also clean up the condition of any conditional terminator it folds to be unconditional, if that turns the condition into dead code. This just means it calls RecursivelyDeleteTriviallyDeadInstructions() in strategic spots. It defaults to the old behavior. I also changed -simplifycfg, -jump-threading and -codegenprepare to use this to produce slightly better code without any extra cleanup passes (AFAICT this was the only place in -simplifycfg where now-dead conditions of replaced terminators weren't being cleaned up). The only other user of this function is -sccp, but I didn't read that thoroughly enough to figure out whether it might be holding pointers to instructions that could be deleted by this. llvm-svn: 131855
* Revert commit 131534 since it seems to have broken several buildbots.Duncan Sands2011-05-181-2/+2
| | | | | | | | Original log entry: Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. llvm-svn: 131536
* Refactor getActionType and getTypeToTransformTo ; place all of the 'decision'Nadav Rotem2011-05-181-2/+2
| | | | | | code in one place. llvm-svn: 131534
* Fix a bug where RecursivelyDeleteTriviallyDeadInstructions couldChris Lattner2011-04-091-3/+18
| | | | | | | delete the instruction pointed to by CGP's current instruction iterator, leading to a crash on the testcase. This fixes PR9578. llvm-svn: 129200
* Debug intrinsics must be skipped at the beginning and ends of blocks, lest theyCameron Zwarich2011-03-241-2/+6
| | | | | | affect the generated code. llvm-svn: 128217
* It is enough for the CallInst to have no uses to be made a tail call with a retCameron Zwarich2011-03-241-1/+1
| | | | | | void; it doesn't need to have a void type. llvm-svn: 128212
* s/UpdateDT/ModifiedDT/gDevang Patel2011-03-241-8/+8
| | | | llvm-svn: 128211
* Do early taildup of ret in CodeGenPrepare for potential tail calls that have aCameron Zwarich2011-03-241-17/+37
| | | | | | void return type. This fixes PR9487. llvm-svn: 128197
* Use an early return instead of a long if block.Cameron Zwarich2011-03-241-51/+51
| | | | llvm-svn: 128196
* When UpdateDT is set, DT is invalid, which could cause problems when trying toCameron Zwarich2011-03-241-2/+3
| | | | | | use it later. I couldn't make a test that hits this with the current code. llvm-svn: 128195
* Check for TLI so that -codegenprepare can be used from opt.Cameron Zwarich2011-03-241-0/+3
| | | | llvm-svn: 128194
* Re-apply r127953 with fixes: eliminate empty return block if it has no ↵Evan Cheng2011-03-211-10/+122
| | | | | | predecessors; update dominator tree if cfg is modified. llvm-svn: 127981
* Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessorsDaniel Dunbar2011-03-191-99/+4
| | | | | | to canonicalize IR", it broke a lot of things. llvm-svn: 127954
* SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IREvan Cheng2011-03-191-4/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953
* Roll r127459 back in:Cameron Zwarich2011-03-111-0/+14
| | | | | | | | | | | Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127498
* Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often getDaniel Dunbar2011-03-111-14/+0
| | | | | | created from the", it broke some GCC test suite tests. llvm-svn: 127477
* Optimize trivial branches in CodeGenPrepare, which often get created from theCameron Zwarich2011-03-111-0/+14
| | | | | | | | | | lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127459
* Fix PR9398 - 10% of llc compile time is spent in Value::getNumUses. This reducesCameron Zwarich2011-03-051-7/+22
| | | | | | | the percentage of time spent in CodeGenPrepare when llcing 403.gcc from 12.6% to 1.8% of total llc time. llvm-svn: 127069
* Remove some more unused code that I missed.Cameron Zwarich2011-03-021-19/+0
| | | | llvm-svn: 126826
* Eliminate the unused CodeGenPrepare option to split critical edges.Cameron Zwarich2011-03-021-128/+1
| | | | llvm-svn: 126825
* Stop computing the number of uses twice per value in CodeGenPrepare's sinking ofCameron Zwarich2011-03-011-3/+4
| | | | | | | | addressing code. On 403.gcc this almost halves CodeGenPrepare time and reduces total llc time by 9.5%. Unfortunately, getNumUses() is still the hottest function in llc. llvm-svn: 126782
* fix rdar://8878965, a regression I introduced with the recentChris Lattner2011-01-181-1/+3
| | | | | | llvm.objectsize changes. llvm-svn: 123771
* temporarily revert r123526. While working on a follow-on patch IChris Lattner2011-01-151-3/+0
| | | | | | realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527
* fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient codeChris Lattner2011-01-151-0/+3
| | | | | | | | | The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526
* simplify code, no functionality change.Chris Lattner2011-01-151-30/+37
| | | | llvm-svn: 123525
* Now that instruction optzns can update the iterator as they go, we can Chris Lattner2011-01-151-10/+16
| | | | | | | | have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524
* make the current instruction iterator an ivar, allowing xforms thatChris Lattner2011-01-151-35/+38
| | | | | | | potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code. llvm-svn: 123523
OpenPOWER on IntegriCloud