path: root/llvm/lib/Transforms/Scalar/CodeGenPrepare.cpp
Commit message [Author, Date, Files changed, Lines removed/added]
* Fix Doxygen issues: [Dmitri Gribenko, 2012-09-13, 1 file, -1/+4]
  * wrap code blocks in \code ... \endcode;
  * refer to parameter names in paragraphs correctly (\arg is not what most people want -- it starts a new paragraph).
  llvm-svn: 163790
* Generic Bypass Slow Div [Preston Gurd, 2012-09-04, 1 file, -1/+14]
  - CodeGenPrepare pass for identifying div/rem ops
  - Backend specifies the type mapping using addBypassSlowDivType
  - Enabled only for Intel Atom with O2 32-bit -> 8-bit
  - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256.
  - In the case when the quotient and remainder of a divide are used, a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands.
  - Test cases check for the presence of the optimization when calculating either the quotient, remainder, or both.
  Patch by Tyler Nowicki!
  llvm-svn: 163150
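The bypass described in the commit above can be pictured at the source level. The following is only a sketch of the idea, not the pass itself: `DivRem` and `bypassSlowDiv` are hypothetical names, the 8-bit C++ division merely stands in for DIVB, and the real transform rewrites IR into separate basic blocks rather than calling a helper.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result pair (not an LLVM type). Both values are produced
// together, mirroring how the commit computes DIV and REM in each new block.
struct DivRem {
  int32_t Quotient;
  int32_t Remainder;
};

// Divide a by b, taking a narrow path when both operands are non-negative
// and less than 256. Precondition: b != 0.
inline DivRem bypassSlowDiv(int32_t a, int32_t b) {
  // A single test covers both operands: any bit above bit 7 (including the
  // sign bit of a negative value) sends us to the wide division.
  if ((static_cast<uint32_t>(a | b) & ~uint32_t{0xFF}) == 0) {
    uint8_t na = static_cast<uint8_t>(a);
    uint8_t nb = static_cast<uint8_t>(b);
    // Narrow path: stands in for the cheap 8-bit divide.
    return DivRem{static_cast<int32_t>(na / nb), static_cast<int32_t>(na % nb)};
  }
  // Wide path: stands in for the slow 32-bit IDIV.
  return DivRem{a / b, a % b};
}
```

Both paths return the same values; the only difference is which divide would be executed, which is why the optimization is safe whenever the range test passes.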
* Not all targets have efficient ISel code generation for select instructions. [Nadav Rotem, 2012-09-02, 1 file, -7/+22]
  For example, the ARM target does not have efficient ISel handling for vector selects with scalar conditions. This patch adds a TLI hook which allows the different targets to report which selects are supported well and which selects should be converted to CF during codegen prepare.
  llvm-svn: 163093
* Make MemoryBuiltins aware of TargetLibraryInfo. [Benjamin Kramer, 2012-08-29, 1 file, -1/+1]
  This disables malloc-specific optimization when -fno-builtin (or -ffreestanding) is specified. This has been a problem for a long time but became more severe with the recent memory builtin improvements.
  Since the memory builtin functions are used everywhere, this required passing TLI in many places. This means that functions that now have an optional TLI argument, like RecursivelyDeleteTriviallyDeadInstructions, won't remove dead mallocs anymore if the TLI argument is missing. I've updated most passes to do the right thing.
  Fixes PR13694 and probably others.
  llvm-svn: 162841
* revise debug output to avoid dangling pointer [Michael Liao, 2012-08-21, 1 file, -1/+1]
  llvm-svn: 162256
* Remove dead flag. [Bill Wendling, 2012-08-15, 1 file, -9/+3]
  llvm-svn: 161990
* During the CodeGenPrepare we often lower intrinsics (such as objsize) and allow some optimizations to turn conditional branches into unconditional ones. [Nadav Rotem, 2012-08-14, 1 file, -0/+39]
  This commit adds a simple control-flow optimization which merges two consecutive basic blocks which are connected by a single edge. This allows the codegen to operate on larger basic blocks.
  rdar://11973998
  llvm-svn: 161852
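The block merging described in the commit above can be illustrated on a toy control-flow graph. This is a sketch under loud assumptions: `Block`, `countPreds`, and `mergeSingleEdgeBlocks` are hypothetical names, blocks are identified by vector index with index 0 as the entry block, and none of this is LLVM's actual data structure or algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy CFG node; real LLVM uses BasicBlock instead.
struct Block {
  std::vector<std::string> Insts;  // straight-line instructions
  std::vector<std::size_t> Succs;  // indices of successor blocks
  bool Dead = false;               // set once the block has been merged away
};

// Count the edges from live blocks that enter block Target.
inline std::size_t countPreds(const std::vector<Block> &CFG, std::size_t Target) {
  std::size_t N = 0;
  for (const Block &B : CFG)
    if (!B.Dead)
      for (std::size_t S : B.Succs)
        if (S == Target)
          ++N;
  return N;
}

// Merge B into A whenever A's only successor is B and B's only predecessor
// is A. Block 0 is assumed to be the entry and is never merged away.
// Returns true if any merge happened.
inline bool mergeSingleEdgeBlocks(std::vector<Block> &CFG) {
  bool Changed = false;
  for (std::size_t A = 0; A < CFG.size(); ++A) {
    while (!CFG[A].Dead && CFG[A].Succs.size() == 1) {
      std::size_t B = CFG[A].Succs[0];
      if (B == A || B == 0 || countPreds(CFG, B) != 1)
        break;
      // Append B's instructions to A and let A inherit B's successors.
      CFG[A].Insts.insert(CFG[A].Insts.end(), CFG[B].Insts.begin(),
                          CFG[B].Insts.end());
      CFG[A].Succs = CFG[B].Succs;
      CFG[B].Insts.clear();
      CFG[B].Succs.clear();
      CFG[B].Dead = true;
      Changed = true;
    }
  }
  return Changed;
}
```

The inner loop keeps merging into the same block, so a straight chain of blocks collapses into one larger block in a single call.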
* Teach CodeGenPrep to look past bitcast when it's duplicating a return instruction into predecessor blocks to enable tail call optimization. [Evan Cheng, 2012-07-27, 1 file, -3/+14]
  rdar://11958338
  llvm-svn: 160894
* Make all Emit*() functions consult the TargetLibraryInfo information before creating a call to a library function. [Nuno Lopes, 2012-07-25, 1 file, -1/+1]
  Update all clients to pass the TLI information around. Previous draft reviewed by Eli.
  llvm-svn: 160733
* Clean whitespaces. [Nadav Rotem, 2012-07-24, 1 file, -29/+29]
  llvm-svn: 160668
* CodeGenPrepare: Don't crash when TLI is not available. [Benjamin Kramer, 2012-06-29, 1 file, -1/+2]
  This happens when codegenprepare is invoked via opt.
  llvm-svn: 159457
* Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h [Chandler Carruth, 2012-06-29, 1 file, -12/+12]
  This was always part of the VMCore library out of necessity -- it deals entirely in the IR. The .cpp file in fact was already part of the VMCore library. This is just a mechanical move.
  I've tried to go through and re-apply the coding standard's preferred header sort, but at 40-ish files, I may have gotten some wrong. Please let me know if so.
  I'll be committing the corresponding updates to Clang and Polly, and Duncan has DragonEgg.
  Thanks to Bill and Eric for giving the green light for this bit of cleanup.
  llvm-svn: 159421
* Switch the select to branch transformation on by default. [Benjamin Kramer, 2012-05-06, 1 file, -3/+4]
  The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know!
  llvm-svn: 156258
* CodeGenPrepare: Add a transform to turn selects into branches in some cases. [Benjamin Kramer, 2012-05-05, 1 file, -0/+84]
  This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%:

      ucomisd (%rdi), %xmm0
      cmovbel %edx, %esi

  cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice.
  This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit as well; those are included in the "predictableSelectIsExpensive" flag.
  It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform.
  Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator.
  Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :)
  llvm-svn: 156234
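The heuristic in the commit above reads naturally as a small predicate. The sketch below only restates the conditions named in the message; `SelectInfo`, `TargetInfo`, and `shouldFormBranch` are hypothetical names, and the boolean fields are assumptions standing in for whatever checks the pass performs on real IR.

```cpp
#include <cassert>

// Hypothetical summary of one select candidate (not an LLVM type).
struct SelectInfo {
  bool HasOneUse;            // "cmovs that have one use"
  bool HasDirectMemOperand;  // the compare is fed directly by a load
};

// Hypothetical summary of the compilation context (not an LLVM type).
struct TargetInfo {
  bool PredictableSelectIsExpensive;  // flag named in the commit message
  bool OptimizeForSize;               // -Os mode
};

// Decide whether a select should be rewritten as a branch, following the
// conditions listed in the commit message.
inline bool shouldFormBranch(const SelectInfo &S, const TargetInfo &T) {
  if (T.OptimizeForSize)
    return false;  // cmovs usually save code size
  if (!T.PredictableSelectIsExpensive)
    return false;  // e.g. in-order cores, which are unlikely to benefit
  return S.HasOneUse && S.HasDirectMemOperand;
}
```

Keeping the target checks ahead of the per-select checks mirrors the message's framing: the transform is first gated on the target and size mode, then applied conservatively per select.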
* Refactor the interface to recursively simplifying instructions to be a tad bit simpler by handling a common case explicitly. [Chandler Carruth, 2012-03-24, 1 file, -2/+2]
  Also, refactor the implementation to use a worklist based walk of the recursive users, rather than trying to use value handles to detect and recover from RAUWs during the recursive descent. This fixes a very subtle bug in the previous implementation where degenerate control flow structures could cause mutually recursive instructions (PHI nodes) to collapse in just such a way that From became equal to To after some amount of recursion. At that point, we hit the inf-loop that the assert at the top attempted to guard against. This problem is defined away when not using value handles in this manner. There are lots of comments claiming that the WeakVH will protect against just this sort of error, but they're not accurate about the actual implementation of WeakVHs, which do still track RAUWs.
  I don't have any test case for the bug this fixes because it requires running the recursive simplification on unreachable phi nodes. I've no way to either run this or easily write an input that triggers it. It was found when using instruction simplification inside the inliner when running over the nightly test-suite.
  llvm-svn: 153393
* Target override to allow CodeGenPrepare to sink address operands to intrinsics in the same way it currently does for loads and stores [Pete Cooper, 2012-03-13, 1 file, -0/+9]
  llvm-svn: 152666
* Do trivial CSE of dead BBs during codegen preparation. [Bill Wendling, 2012-03-04, 1 file, -1/+20]
  Some BBs can become dead after codegen preparation. If we delete them here, it could help enable tail-call optimizations later on.
  <rdar://problem/10256573>
  llvm-svn: 152002
* Extend Attributes to 64 bits [Kostya Serebryany, 2012-01-20, 1 file, -2/+2]
  Problem: LLVM needs more function attributes than currently available (32 bits). One such proposed attribute is "address_safety", which shows that a function is being checked for address safety (by AddressSanitizer, SAFECode, etc).
  Solution:
  - extend the Attributes from 32 bits to 64-bits
  - wrap the object into a class so that unsigned is never erroneously used instead
  - change "unsigned" to "Attributes" throughout the code, including one place in clang.
  - the class has no "operator uint64()", but it has "uint64_t Raw()" to support packing/unpacking.
  - the class has "safe operator bool()" to support the common idiom: if (Attributes attr = getAttrs()) useAttrs(attr);
  - The CTOR from uint64_t is marked explicit, so I had to add a few explicit CTOR calls
  - Add the new attribute "address_safety". Doing it in the same commit to check that attributes beyond first 32 bits actually work.
  - Some of the functions from the Attribute namespace are worth moving inside the class, but I'd prefer to have it as a separate commit.
  Tested:
  "make check" on Linux (32-bit and 64-bit) and Mac (10.6)
  built/run spec CPU 2006 on Linux with clang -O2.
  This change will break clang build in lib/CodeGen/CGCall.cpp. The following patch will fix it.
  llvm-svn: 148553
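The wrapper design listed in the commit above (explicit constructor, `Raw()` accessor, boolean test, no implicit integer conversion) can be sketched in a few lines. This is an illustration, not the class from the commit: it uses a C++11 `explicit operator bool` in place of the "safe operator bool" idiom the message mentions, and the bit position chosen for `AddressSafety` is a hypothetical value picked only to land above the first 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for a 64-bit attribute set.
class Attributes {
public:
  Attributes() : Bits(0) {}
  // Explicit, so a plain integer is never silently treated as attributes.
  explicit Attributes(uint64_t Val) : Bits(Val) {}

  // Raw access for packing/unpacking; there is deliberately no
  // conversion operator to an integer type.
  uint64_t Raw() const { return Bits; }

  // Supports: if (Attributes attr = getAttrs()) useAttrs(attr);
  explicit operator bool() const { return Bits != 0; }

  Attributes operator|(Attributes RHS) const { return Attributes(Bits | RHS.Bits); }
  Attributes operator&(Attributes RHS) const { return Attributes(Bits & RHS.Bits); }

private:
  uint64_t Bits;
};

// Hypothetical bit position beyond the old 32-bit range.
const Attributes AddressSafety(1ULL << 32);

// Uses the declaration-in-condition idiom quoted in the commit message.
inline bool hasAddressSafety(Attributes A) {
  if (Attributes Hit = A & AddressSafety)
    return Hit.Raw() != 0;
  return false;
}
```

Truncating `AddressSafety.Raw()` to 32 bits yields zero, which is the kind of silent loss the class wrapper is meant to make impossible when callers would otherwise pass `unsigned`.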
* Propagate TargetLibraryInfo throughout ConstantFolding.cpp and InstructionSimplify.cpp. [Chad Rosier, 2011-12-01, 1 file, -2/+9]
  Other fixups as needed.
  Part of rdar://10500969
  llvm-svn: 145559
* Fold two identical set lookups into one. No functionality change. [Nick Lewycky, 2011-09-29, 1 file, -4/+2]
  llvm-svn: 140821
* Stop emitting instructions with the name "tmp"; they eat up memory and have to be uniqued, without any benefit. [Benjamin Kramer, 2011-09-27, 1 file, -1/+1]
  If someone prefers %tmp42 to %42, run instnamer.
  llvm-svn: 140634
* Use IRBuilder. [Devang Patel, 2011-09-06, 1 file, -17/+14]
  llvm-svn: 139156
* Dramatically speed up codegen prepare by a) avoiding use of dominator tree and b) doing a separate pass over dbg.value instructions. [Devang Patel, 2011-08-18, 1 file, -16/+38]
  llvm-svn: 137908
* Use the getFirstInsertionPt() method instead of getFirstNonPHI + an 'isa<>' check for a LandingPadInst. [Bill Wendling, 2011-08-16, 1 file, -13/+6]
  llvm-svn: 137745
* In places where it's using "getFirstNonPHI", skip the landingpad instruction if necessary. [Bill Wendling, 2011-08-15, 1 file, -5/+8]
  llvm-svn: 137679
* Skip the insertion iterator past the landingpad instruction if there. [Bill Wendling, 2011-08-15, 1 file, -0/+1]
  llvm-svn: 137626
* land David Blaikie's patch to de-constify Type, with a few tweaks. [Chris Lattner, 2011-07-18, 1 file, -4/+4]
  llvm-svn: 135375
* Fix warnings due to 132263; Thanks rdivacky. [Nadav Rotem, 2011-05-29, 1 file, -2/+4]
  llvm-svn: 132285
* Refactor getActionType and getTypeToTransformTo; place all of the 'decision' code in one place. [Nadav Rotem, 2011-05-27, 1 file, -2/+2]
  Re-apply 131534 and fix the multi-step promotion of integers.
  llvm-svn: 132217
* Fix warning about || and && without explicit grouping. [Chandler Carruth, 2011-05-26, 1 file, -2/+2]
  This looks like it flagged an actual bug. Devang, please review. I added the parentheses that change behavior, but make the behavior more closely match commit log's intent.
  llvm-svn: 132165
* Do not insert anything after terminator. [Devang Patel, 2011-05-26, 1 file, -1/+2]
  llvm-svn: 132164
* Do not move DBG_VALUE in middle of PHI nodes. [Devang Patel, 2011-05-26, 1 file, -1/+4]
  llvm-svn: 132161
* If llvm.dbg.value and the value instruction it refers to are far apart then iSel may not be able to find the corresponding Node for llvm.dbg.value during DAG construction. [Devang Patel, 2011-05-26, 1 file, -1/+13]
  Make iSel's life easier by removing this distance between llvm.dbg.value and its value instruction.
  llvm-svn: 132151
* Add a parameter to ConstantFoldTerminator() that callers can use to ask it to also clean up the condition of any conditional terminator it folds to be unconditional, if that turns the condition into dead code. [Frits van Bommel, 2011-05-22, 1 file, -1/+1]
  This just means it calls RecursivelyDeleteTriviallyDeadInstructions() in strategic spots. It defaults to the old behavior.
  I also changed -simplifycfg, -jump-threading and -codegenprepare to use this to produce slightly better code without any extra cleanup passes (AFAICT this was the only place in -simplifycfg where now-dead conditions of replaced terminators weren't being cleaned up). The only other user of this function is -sccp, but I didn't read that thoroughly enough to figure out whether it might be holding pointers to instructions that could be deleted by this.
  llvm-svn: 131855
* Revert commit 131534 since it seems to have broken several buildbots. [Duncan Sands, 2011-05-18, 1 file, -2/+2]
  Original log entry: Refactor getActionType and getTypeToTransformTo; place all of the 'decision' code in one place.
  llvm-svn: 131536
* Refactor getActionType and getTypeToTransformTo; place all of the 'decision' code in one place. [Nadav Rotem, 2011-05-18, 1 file, -2/+2]
  llvm-svn: 131534
* Fix a bug where RecursivelyDeleteTriviallyDeadInstructions could delete the instruction pointed to by CGP's current instruction iterator, leading to a crash on the testcase. [Chris Lattner, 2011-04-09, 1 file, -3/+18]
  This fixes PR9578.
  llvm-svn: 129200
* Debug intrinsics must be skipped at the beginning and ends of blocks, lest they affect the generated code. [Cameron Zwarich, 2011-03-24, 1 file, -2/+6]
  llvm-svn: 128217
* It is enough for the CallInst to have no uses to be made a tail call with a ret void; it doesn't need to have a void type. [Cameron Zwarich, 2011-03-24, 1 file, -1/+1]
  llvm-svn: 128212
* s/UpdateDT/ModifiedDT/g [Devang Patel, 2011-03-24, 1 file, -8/+8]
  llvm-svn: 128211
* Do early taildup of ret in CodeGenPrepare for potential tail calls that have a void return type. [Cameron Zwarich, 2011-03-24, 1 file, -17/+37]
  This fixes PR9487.
  llvm-svn: 128197
* Use an early return instead of a long if block. [Cameron Zwarich, 2011-03-24, 1 file, -51/+51]
  llvm-svn: 128196
* When UpdateDT is set, DT is invalid, which could cause problems when trying to use it later. [Cameron Zwarich, 2011-03-24, 1 file, -2/+3]
  I couldn't make a test that hits this with the current code.
  llvm-svn: 128195
* Check for TLI so that -codegenprepare can be used from opt. [Cameron Zwarich, 2011-03-24, 1 file, -0/+3]
  llvm-svn: 128194
* Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. [Evan Cheng, 2011-03-21, 1 file, -10/+122]
  llvm-svn: 127981
* Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR", it broke a lot of things. [Daniel Dunbar, 2011-03-19, 1 file, -99/+4]
  llvm-svn: 127954
* SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR to have single return block (at least getting there) for optimizations. [Evan Cheng, 2011-03-19, 1 file, -4/+99]
  This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this:

      int f1(void);
      int f2(void);
      int f3(void);
      int f4(void);
      int f5(void);
      int f6(void);
      int foo(int x) {
        switch(x) {
        case 1: return f1();
        case 2: return f2();
        case 3: return f3();
        case 4: return f4();
        case 5: return f5();
        case 6: return f6();
        }
      }

  =>

      LBB0_2: ## %sw.bb
        callq _f1
        popq %rbp
        ret
      LBB0_3: ## %sw.bb1
        callq _f2
        popq %rbp
        ret
      LBB0_4: ## %sw.bb3
        callq _f3
        popq %rbp
        ret

  This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch:

      sw.bb7: ; preds = %entry
        %call8 = tail call i32 @f5() nounwind
        br label %return
      sw.bb9: ; preds = %entry
        %call10 = tail call i32 @f6() nounwind
        br label %return
      return:
        %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
        ret i32 %retval.0

  This allows codegen to generate better code like this:

      LBB0_2: ## %sw.bb
        jmp _f1 ## TAILCALL
      LBB0_3: ## %sw.bb1
        jmp _f2 ## TAILCALL
      LBB0_4: ## %sw.bb3
        jmp _f3 ## TAILCALL

  rdar://9147433
  llvm-svn: 127953
* Roll r127459 back in: [Cameron Zwarich, 2011-03-11, 1 file, -0/+14]
  Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested.
  This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
  llvm-svn: 127498
* Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get created from the", it broke some GCC test suite tests. [Daniel Dunbar, 2011-03-11, 1 file, -14/+0]
  llvm-svn: 127477
* Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. [Cameron Zwarich, 2011-03-11, 1 file, -0/+14]
  Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested.
  This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
  llvm-svn: 127459