summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Utils
Commit message (Collapse)AuthorAgeFilesLines
* Make sure the instruction right after an inlined function has aAdrian Prantl2013-04-231-4/+10
| | | | | | | | | | debug location. This solves a problem where range of an inlined subroutine is emitted wrongly. Patch by Manman Ren. Fixes rdar://problem/12415623 llvm-svn: 180140
* Move C++ code out of the C headers and into either C++ headersEric Christopher2013-04-221-0/+1
| | | | | | | or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063
* Revert "SimplifyCFG: If convert single conditional stores"Arnold Schwaighofer2013-04-211-88/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is the temptation to make this tranform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up." llvm-svn: 179980
* SimplifyCFG: If convert single conditional storesArnold Schwaighofer2013-04-201-4/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up. llvm-svn: 179957
* Do not optimise fprintf() calls if its return value is used.Peter Collingbourne2013-04-171-9/+12
| | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D620 llvm-svn: 179661
* simplifycfg: Fix integer overflow converting switch into icmp.Hans Wennborg2013-04-161-1/+6
| | | | | | | | | | | If a switch instruction has a case for every possible value of its type, with the same successor, SimplifyCFG would replace it with an icmp ult, but the computation of the bound overflows in that case, which inverts the test. Patch by Jed Davis! llvm-svn: 179587
* Change CloneFunctionInto to always clone Argument attributes induvidually,Joey Gouly2013-04-101-22/+19
| | | | | | | rather than checking if the source and destination have the same number of arguments and copying the attributes over directly. llvm-svn: 179169
* Add all clauses when merging the landing pads. Duplicates will be handled ↵Bill Wendling2013-03-221-24/+14
| | | | | | later on. llvm-svn: 177757
* Don't use the removed API.Bill Wendling2013-03-221-5/+2
| | | | llvm-svn: 177749
* Fix llvm::removeUnreachableBlocks to handle unreachable loops.Evgeniy Stepanov2013-03-221-12/+7
| | | | llvm-svn: 177713
* Always forward 'resume' instructions to the outter landing pad.Bill Wendling2013-03-211-16/+39
| | | | | | | | | | | | | | | | | How did this ever work? Basically, if you have a function that's inlined into the caller, it may not have any 'call' instructions, but any 'resume' instructions it may have should still be forwarded to the outer (caller's) landing pad. This requires that all of the 'landingpad' instructions in the callee have their clauses merged with the caller's outer 'landingpad' instruction (hence the bit of ugly code in the `forwardResume' method). Testcase in a follow commit to the test-suite repository. <rdar://problem/13360379> & PR15555 llvm-svn: 177680
* LibCallSimplifier: optimize speed for short-lived instancesMeador Inge2013-03-121-177/+225
| | | | | | | | | | | | | | | | | | | | | | | | | | Nadav reported a performance regression due to the work I did to merge the library call simplifier into instcombine [1]. The issue is that a new LibCallSimplifier object is being created whenever InstCombiner::runOnFunction is called. Every time a LibCallSimplifier object is used to optimize a call it creates a hash table to map from a function name to an object that optimizes functions of that name. For short-lived LibCallSimplifier instances this is quite inefficient. Especially for cases where no calls are actually simplified. This patch fixes the issue by dropping the hash table and implementing an explicit lookup function to correlate the function name to the object that optimizes functions of that name. This avoids the cost of always building and destroying the hash table in cases where the LibCallSimplifier object is short-lived and avoids the cost of building the table when no simplifications are actually preformed. On a benchmark containing 100,000 calls where none of them are simplified I noticed a 30% speedup. On a benchmark containing 100,000 calls where all of them are simplified I noticed an 8% speedup. [1] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130304/167639.html llvm-svn: 176840
* Don't remove a landing pad if the invoke requires a table entry.Bill Wendling2013-03-111-3/+17
| | | | | | | | An invoke may require a table entry. For instance, when the function it calls is expected to throw. <rdar://problem/13360379> llvm-svn: 176827
* Fixed a crash when cloning a function into a function withPekka Jaaskelainen2013-03-071-3/+6
| | | | | | | different size argument list and without attributes in the arguments. llvm-svn: 176632
* SimplifyCFG fix for volatile load/store.Andrew Trick2013-03-071-2/+4
| | | | | | | | | | | | | Fixes rdar:13349374. Volatile loads and stores need to be preserved even if the language standard says they are undefined. "volatile" in this context means "get out of the way compiler, let my platform handle it". Additionally, this is the only way I know of with llvm to write to the first page (when hardware allows) without dropping to assembly. llvm-svn: 176599
* Bypass Slow DividesPreston Gurd2013-03-041-2/+2
| | | | | | | | | | | | | * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442
* Modify {Call,Invoke}Inst::addAttribute to take an AttrKind.Peter Collingbourne2013-03-021-2/+1
| | | | llvm-svn: 176397
* For each function that we optimize we initialize a new list of lib ↵Nadav Rotem2013-02-271-1/+2
| | | | | | functions. For each function name we malloc memory. This patch changes the Libcall map to use BumpPtrAllocator. Now we malloc only once. This speeds up instcombine by a few % on a large c++ program. llvm-svn: 176170
* Enhance integer division emulation support to handle types smaller than 32 bits,Pedro Artigas2013-02-261-0/+104
| | | | | | | | enhancement done the trivial way; by extending inputs and truncating outputs which is addequate for targets with little or no support for integer arithmetic on integer types less than 32 bits. llvm-svn: 176139
* Implement the NoBuiltin attribute.Bill Wendling2013-02-221-0/+1
| | | | | | | | The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code. llvm-svn: 175835
* Temporarily revert r175470 for more review.Bill Wendling2013-02-191-3/+0
| | | | llvm-svn: 175476
* Check to see if the 'no-builtin' attribute is set before simplifying a ↵Bill Wendling2013-02-181-0/+3
| | | | | | library call. llvm-svn: 175470
* Remove #includes from the commonly used LoopInfo.h.Jakub Staszak2013-02-091-0/+1
| | | | llvm-svn: 174786
* [SimplifyLibCalls] Library call simplification doen't work if the call site Chad Rosier2013-02-081-1/+7
| | | | | | | | isn't using the default calling convention. However, if the transformation is from a call to inline IR, then the calling convention doesn't matter. rdar://13157990 llvm-svn: 174724
* [SjLj Prepare] When demoting an invoke instructions to the stack, if the normalChad Rosier2013-02-051-5/+15
| | | | | | | edge is critical, then split it so we can insert the store. rdar://13126179 llvm-svn: 174418
* Linker: correctly link in dbg.declareManman Ren2013-01-311-2/+17
| | | | | | | | | | | | | | | | | | | This is a re-worked version of r174048. Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. rdar://problem/13089880 llvm-svn: 174093
* Revert r173946. This breaks compilation of googletest with ClangAlexey Samsonov2013-01-311-11/+2
| | | | llvm-svn: 174048
* Remove addRetAttributes and addFnAttributes, which aren't useful abstractions.Bill Wendling2013-01-301-4/+6
| | | | llvm-svn: 173992
* Linker: correctly link in dbg.declareManman Ren2013-01-301-2/+11
| | | | | | | | | | | | | | | | Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. llvm-svn: 173946
* Re-revert r173342, without losing the compile time improvements, flatChandler Carruth2013-01-271-27/+12
| | | | | | out bug fixes, or functionality preserving refactorings. llvm-svn: 173610
* Convert BuildLibCalls.cpp to using the AttributeSet methods instead of ↵Bill Wendling2013-01-261-66/+66
| | | | | | AttributeWithIndex. llvm-svn: 173536
* Switch this code away from Value::isUsedInBasicBlock. That code eitherChandler Carruth2013-01-251-7/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | loops over instructions in the basic block or the use-def list of the value, neither of which are really efficient when repeatedly querying about values in the same basic block. What's more, we already know that the CondBB is small, and so we can do a much more efficient test by counting the uses in CondBB, and seeing if those account for all of the uses. Finally, we shouldn't blanket fail on any such instruction, instead we should conservatively assume that those instructions are part of the cost. Note that this actually fixes a bug in the pass because isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my next commit, but the fix for it would make this code suddenly take the compile time hit I thought it already was taking, so I wanted to go ahead and migrate this code to a faster & better pattern. The bug in isUsedInBasicBlock was also causing other tests to test the wrong thing entirely: for example we weren't actually disabling speculation for floating point operations as intended (and tested), but the test passed because we failed to speculate them due to the isUsedInBasicBlock failure. llvm-svn: 173417
* Reapply chandlerc's r173342 now that the miscompile it was triggering is fixed.Benjamin Kramer2013-01-241-9/+18
| | | | | | | | | | | | | | | | | | | | Original commit message: Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173357
* Revert r173342 temporarily. It appears to cause a very late miscompileChandler Carruth2013-01-241-18/+9
| | | | | | of stage2 in a bootstrap. Still investigating.... llvm-svn: 173343
* Plug TTI into the speculation logic, giving it a real cost interfaceChandler Carruth2013-01-241-9/+18
| | | | | | | | | | | | | | | | | | that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173342
* Address a large chunk of this FIXME by accumulating the cost forChandler Carruth2013-01-241-8/+6
| | | | | | | unfolded constant expressions rather than checking each one independently. llvm-svn: 173341
* Switch the constant expression speculation cost evaluation away fromChandler Carruth2013-01-241-7/+14
| | | | | | | | | | | | | | | | | | | | a cost fuction that seems both a bit ad-hoc and also poorly suited to evaluating constant expressions. Notably, it is missing any support for trivial expressions such as 'inttoptr'. I could fix this routine, but it isn't clear to me all of the constraints its other users are operating under. The core protection that seems relevant here is avoiding the formation of a select instruction wich a further chain of select operations in a constant expression operand. Just explicitly encode that constraint. Also, update the comments and organization here to make it clear where this needs to go -- this should be driven off of real cost measurements which take into account the number of constants expressions and the depth of the constant expression tree. llvm-svn: 173340
* Rephrase the speculating scan of the conditional BB to be phrased inChandler Carruth2013-01-241-19/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | terms of cost rather than hoisting a single instruction. This does *not* change the cost model! We still set the cost threshold at 1 here, it's just that we track it by accumulating cost rather than by storing an instruction. The primary advantage is that we no longer leave no-op intrinsics in the basic block. For example, this will now move both debug info intrinsics and a single instruction, instead of only moving the instruction and leaving a basic block with nothing bug debug info intrinsics in it, and those intrinsics now no longer ordered correctly with the hoisted value. Instead, we now splice the entire conditional basic block's instruction sequence. This also places the code for checking the safety of hoisting next to the code computing the cost. Currently, the only observable side-effect of this change is that debug info intrinsics are no longer abandoned. I'm not sure how to craft a test case for this, and my real goal was the refactoring, but I'll talk to Dave or Eric about how to add a test case for this. llvm-svn: 173339
* Simplify the PHI node operand rewriting.Chandler Carruth2013-01-241-42/+35
| | | | | | | | | | | | | | | | | | Previously, the code would scan the PHI nodes and build up a small setvector of candidate value pairs in phi nodes to go and rewrite. Once certain the rewrite could be performed, the code walks the set, and for each one re-scans the entire PHI node list looking for nodes to rewrite operands. Instead, scan the PHI nodes once to check for hazards, and then scan it a second time to rewrite the operands to selects. No set vector, and a max of two scans. The only downside is that we might form identical selects, but instcombine or anything else should fold those easily, and it seems unlikely to happen often. llvm-svn: 173337
* Give the basic block variables here names based on the if-then-endChandler Carruth2013-01-241-32/+33
| | | | | | structure being analyzed. No functionality changed. llvm-svn: 173334
* Lift a cheap early exit test above loops and other complex early exitChandler Carruth2013-01-241-5/+5
| | | | | | | | | tests. No need to pay the high cost when we're never going to do anything. No functionality changed. llvm-svn: 173331
* Spiff up the comment on this method, making the example a bit moreChandler Carruth2013-01-241-16/+35
| | | | | | | | | | | pretty in doxygen, adding some of the details actually present in a classic example where this matters (a loop from gzip and many other compression algorithms), and a cautionary note about the risks inherent in the transform. This has come up on the mailing lists recently, and I suspect folks reading this code could benefit from going and looking at the MI pass that can really deal with these issues. llvm-svn: 173329
* Make sure metarenamer won't rename special stuff (intrinsics and explicitly ↵Anton Korobeynikov2013-01-231-3/+17
| | | | | | | | renamed stuff). Otherwise this might hide the problems. llvm-svn: 173265
* Initialize the components of this class. Otherwise GCC thinks that Array may beDuncan Sands2013-01-231-1/+2
| | | | | | | | used uninitialized, since it fails to understand that Array is only used when SingleValue is not, and outputs a warning. It also seems generally safer given that the constructor is non-trivial and has plenty of early exits. llvm-svn: 173242
* Remove the last of uses that use the Attribute object as a collection of ↵Bill Wendling2013-01-231-1/+1
| | | | | | | | | attributes. Collections of attributes are handled via the AttributeSet class now. This finally frees us up to make significant changes to how attributes are structured. llvm-svn: 173228
* Use AttributeSet accessor methods instead of Attribute accessor methods.Bill Wendling2013-01-181-4/+2
| | | | | | | Further encapsulation of the Attribute object. Don't allow direct access to the Attribute object as an aggregate. llvm-svn: 172853
* Push some more methods down to hide the use of the Attribute class.Bill Wendling2013-01-181-4/+2
| | | | | | | | Because the Attribute class is going to stop representing a collection of attributes, limit the use of it as an aggregate in favor of using AttributeSet. This replaces some of the uses for querying the function attributes. llvm-svn: 172844
* Remove trailing spaces.Jakub Staszak2013-01-141-38/+38
| | | | llvm-svn: 172489
* Make sure we don't emit instructions before a landingpad instruction.Bill Wendling2013-01-081-1/+6
| | | | | | PR14782 llvm-svn: 171846
* Move TypeFinder.h into the IR tree, it clearly belongs with the IR library.Chandler Carruth2013-01-071-1/+1
| | | | llvm-svn: 171749
OpenPOWER on IntegriCloud