path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* InstCombine: Fix infinite loop when encountering switch on trivial icmp. | Benjamin Kramer | 2012-05-28 | 1 | -1/+1
    The test case feeds the following into InstCombine's visitSelect:
      %tobool8 = icmp ne i32 0, 0
      %phitmp = select i1 %tobool8, i32 3, i32 0
    Then instcombine replaces the right side of the switch with 0, doesn't notice that nothing changes and tries again indefinitely.
    This fixes PR12897.
    llvm-svn: 157587
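The failure mode described in this entry, a driver that retries whenever a visitor claims to have changed something, can be sketched in standalone C++. This is a hypothetical illustration rather than InstCombine's code: `SelectSketch`, `simplifyFalseArm`, and `runToFixpoint` are invented names, and the iteration cap exists only so the sketch always terminates.

```cpp
#include <cassert>

// Hypothetical stand-in for `select i1 %cond, i32 3, i32 0`.
struct SelectSketch {
    int trueVal;
    int falseVal;
};

// Replace the false arm with `replacement`. A change is reported only when
// the operand really differs; reporting "changed" unconditionally is what
// would make the driver below retry forever.
bool simplifyFalseArm(SelectSketch &s, int replacement) {
    if (s.falseVal == replacement)
        return false;  // nothing changed, so no further iteration is needed
    s.falseVal = replacement;
    return true;
}

// Re-run the simplification until it reports no change. Returns the number
// of iterations that actually modified the select.
int runToFixpoint(SelectSketch &s, int replacement, int maxIters = 100) {
    int iters = 0;
    while (iters < maxIters && simplifyFalseArm(s, replacement))
        ++iters;
    return iters;
}
```

With a false arm that already equals the replacement, `runToFixpoint` returns after zero modifying iterations instead of spinning.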
* PR1255: Case Ranges | Stepan Dyatkovskiy | 2012-05-28 | 2 | -2/+8
    Implemented IntItem, a wrapper around APInt. Why not use APInt directly right now?
    1. It would be very difficult to implement case ranges as a series of small patches. We got several large and heavy patches, each about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst, you need to change all Readers, Writers, and absolutely all passes that use SwitchInst at the same time.
    2. We can implement an APInt pool inside and save memory. E.g. we use several switches that work with 256-bit items (switch on signatures, or strings). We can avoid value duplicates in this case.
    3. IntItem can easily be replaced with APInt.
    4. Currently we can interpret IntItem both as ConstantInt and as APInt. This allows providing SwitchInst methods that work with ConstantInt for non-updated passes.
    Why do I need it right now? Currently I need to update the SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so pieces of code like
      ConstantInt *V = ...;
      if (V->getValue().ugt(AnotherV->getValue())) { ... }
    will look awful. Much better this way:
      IntItem V = ConstantIntVal->getValue();
      if (AnotherV < V) { }
    Of course any reviews are welcome.
    P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (allows mapping individual subsets of integers to the BasicBlocks). Since in future these classes will be founded on APInt, it will be possible to use them in more generic ways.
    llvm-svn: 157576
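The readability argument in this entry (comparison operators on a thin wrapper instead of `getValue().ugt(...)` chains) can be illustrated with a small standalone sketch. Everything here is an assumption for illustration: `IntItemSketch` is an invented name, and `std::uint64_t` stands in for APInt so that the example needs no LLVM headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of the wrapper idea. The class described in the
// commit wraps APInt; a 64-bit unsigned integer is used here only to keep
// the sketch self-contained.
class IntItemSketch {
    std::uint64_t value_;

public:
    explicit IntItemSketch(std::uint64_t v) : value_(v) {}

    std::uint64_t getValue() const { return value_; }

    // Unsigned comparisons expressed as operators, so call sites can write
    // `a < b` rather than spelling out a named comparison method.
    friend bool operator<(const IntItemSketch &a, const IntItemSketch &b) {
        return a.value_ < b.value_;
    }
    friend bool operator==(const IntItemSketch &a, const IntItemSketch &b) {
        return a.value_ == b.value_;
    }
};
```

A call site then reads `if (anotherV < v) { ... }`, which is the shape the commit message argues for.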
* Implement the indirect counter increment code in a better way. Instead of | Bill Wendling | 2012-05-28 | 1 | -53/+72
    replicating the code for every place it's needed, we instead generate a function that does that for us. This function is local to the executable, so there shouldn't be any writing violations.
    llvm-svn: 157564
* switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients. | Chris Lattner | 2012-05-28 | 4 | -27/+20
    llvm-svn: 157556
* PR12967: Don't crash when trying to fold a shift that's larger than the type's size. | Benjamin Kramer | 2012-05-27 | 1 | -1/+1
    llvm-svn: 157548
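The kind of guard this entry implies can be sketched as follows. This is an assumption-laden illustration, not the patched LLVM code: `foldShl32` is an invented helper that folds a constant left shift on a 32-bit value and declines when the shift amount is not smaller than the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant folder for `value << amount` on a 32-bit type.
// Returns false, leaving `result` untouched, when the shift amount is at
// least the type's width, since such a shift has no well-defined result.
bool foldShl32(std::uint32_t value, unsigned amount, std::uint32_t &result) {
    const unsigned kBitWidth = 32;
    if (amount >= kBitWidth)
        return false;  // oversized shift: decline to fold
    result = value << amount;
    return true;
}
```

Callers check the return value and leave the instruction alone when folding is declined.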
* Reimplement the intrinsic verifier to use the same table as Intrinsic::getDefinition, making it stronger and more sane. | Chris Lattner | 2012-05-27 | 1 | -1/+1
    Delete the code from tblgen that produced the old code. Besides being a path forward in intrinsic sanity, this also eliminates a bunch of machine generated code that was compiled into Function.o
    llvm-svn: 157545
* Since commit 157467, if reassociate isn't actually going to change an expression | Duncan Sands | 2012-05-26 | 1 | -17/+20
    then it doesn't alter the instructions composing it, however it would continue to move the instructions to just before the expression root. Ensure it doesn't move them either, so now it really does nothing if there is nothing to do. That commit also ensured that nsw etc flags weren't cleared if the expression was not being changed. Tweak this a bit so that it doesn't clear flags on the initial part of a computation either if that part didn't change but later bits did.
    llvm-svn: 157518
* SimplifyCFG: Turn the ad-hoc std::pair that represents switch cases into an explicit struct. | Benjamin Kramer | 2012-05-26 | 1 | -39/+54
    llvm-svn: 157516
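The general refactoring named in this entry, replacing a `std::pair` with a struct that has named fields, might look like the sketch below. The struct and field names are assumptions made for illustration and are not taken from SimplifyCFG.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// With std::pair<std::int64_t, int>, call sites read `.first` and `.second`,
// which say nothing about meaning. A small struct names the fields and can
// carry its own ordering.
struct SwitchCaseSketch {
    std::int64_t value;  // the case constant
    int destBlock;       // hypothetical id of the destination block

    bool operator<(const SwitchCaseSketch &rhs) const {
        return value < rhs.value;
    }
};

// Sort cases by their constant, using the struct's own comparison.
std::vector<SwitchCaseSketch> sortedCases(std::vector<SwitchCaseSketch> cases) {
    std::sort(cases.begin(), cases.end());
    return cases;
}
```

Code that previously compared `a.first < b.first` can now rely on `operator<` and refer to `value` and `destBlock` by name.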
* Add support for branch weight metadata to MDBuilder and use it in various places. | Benjamin Kramer | 2012-05-26 | 2 | -32/+23
    llvm-svn: 157515
* Move this debug statement earlier so it is easy to see the order in | Duncan Sands | 2012-05-26 | 1 | -2/+2
    which operands come flying out of the linearization stage.
    llvm-svn: 157512
* The llvm_gcda_increment_indirect_counter function writes to the arguments that | Bill Wendling | 2012-05-25 | 1 | -17/+57
    are passed in. However, those arguments may be in a write-protected area, as far as the runtime library is concerned. For instance, the data could be placed into a 'linkedit' section, which isn't writable. Emit the code from llvm_gcda_increment_indirect_counter directly into the function instead.
    Note: The code for this is ugly, and can lead to bloat. We should look into simplifying this code instead of having all of these branches.
    <rdar://problem/11181370>
    llvm-svn: 157505
* bounds checking: add support for byval arguments | Nuno Lopes | 2012-05-25 | 1 | -1/+9
    llvm-svn: 157498
* boundschecking: | Nuno Lopes | 2012-05-25 | 1 | -50/+104
    add support for select
    add experimental support for alloc_size metadata
    llvm-svn: 157481
* Make the reassociation pass more powerful so that it can handle expressions | Duncan Sands | 2012-05-25 | 1 | -255/+405
    with arbitrary topologies (previously it would give up when hitting a diamond in the use graph for example). The testcase from PR12764 is now reduced from a pile of additions to the optimal 1617*%x0+208. In doing this I changed the previous strategy of dropping all uses for expression leaves to one of dropping all but one use. This works out more neatly (but required a bunch of tweaks) and is also safer: some recently fixed bugs during recursive linearization were because the linearization code thinks it completely owns a node if it has no uses outside the expression it is linearizing. But if the node was also in another expression that had been linearized (and thus all uses of the node from that expression dropped) then the conclusion that it is completely owned by the expression currently being linearized is wrong. Keeping one use from within each linearized expression avoids this kind of mistake.
    llvm-svn: 157467
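What linearizing an expression with a diamond-shaped use graph produces can be shown with a toy model. The sketch below is not the pass's algorithm (which tracks use counts and ownership as described in the message); it only demonstrates, under invented names such as `Node` and `linearize`, how a tree of additions with shared subexpressions collapses into one weight per leaf.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy expression node: either a named leaf or an addition of two operands.
struct Node {
    std::string leaf;               // non-empty for leaves
    std::shared_ptr<Node> lhs, rhs; // set for addition nodes
};
typedef std::shared_ptr<Node> NodePtr;

NodePtr makeLeaf(const std::string &name) {
    NodePtr n = std::make_shared<Node>();
    n->leaf = name;
    return n;
}

NodePtr makeAdd(const NodePtr &a, const NodePtr &b) {
    NodePtr n = std::make_shared<Node>();
    n->lhs = a;
    n->rhs = b;
    return n;
}

// Accumulate, for every leaf, how many times it contributes to the sum.
// A node reached along several paths (a diamond) is counted once per path.
void linearize(const NodePtr &n, unsigned long weight,
               std::map<std::string, unsigned long> &weights) {
    if (!n->leaf.empty()) {
        weights[n->leaf] += weight;
        return;
    }
    linearize(n->lhs, weight, weights);
    linearize(n->rhs, weight, weights);
}
```

For `t = x + x`, `u = t + t`, `root = u + y`, the result is a weight of 4 for `x` and 1 for `y`, that is, `4*x + y`.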
* PR1255 related changes (case ranges): | Stepan Dyatkovskiy | 2012-05-24 | 1 | -40/+18
    LowerSwitch::Clusterify: the main functionality was replaced with CRSBuilder::optimize, so a large part of Clusterify's code was removed.
    test/Transform/LowerSwitch/feature.ll: this test was refactored; grep + count was replaced with FileCheck usage.
    llvm-svn: 157384
* BoundsChecking: add a couple of simple tests and fix a bug in branch emission | Nuno Lopes | 2012-05-23 | 1 | -7/+19
    llvm-svn: 157329
* Fix the inliner so that the optsize function attribute doesn't alter the | Patrik Hägglund | 2012-05-23 | 1 | -8/+11
    inline threshold if the global inline threshold is lower (as for -Oz). Reviewed by Chandler Carruth and Bill Wendling.
    llvm-svn: 157323
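The rule stated in this entry can be written as a small decision function. This is a sketch under stated assumptions: the function name and parameters are invented, and the numbers used with it are illustrative values, not thresholds quoted from the commit.

```cpp
#include <cassert>

// Hypothetical threshold selection: an optsize caller may lower the inline
// threshold, but must never raise it above a global threshold that is
// already lower (for example when optimizing aggressively for size).
int effectiveInlineThreshold(int globalThreshold, int optSizeThreshold,
                             bool callerHasOptSize) {
    if (callerHasOptSize && optSizeThreshold < globalThreshold)
        return optSizeThreshold;
    return globalThreshold;
}
```

The pre-fix behaviour described in the message corresponds to returning `optSizeThreshold` for every optsize caller, which would raise the limit whenever the global threshold was already smaller.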
* Use zero-based shadow by default on Android. | Evgeniy Stepanov | 2012-05-23 | 1 | -2/+7
    llvm-svn: 157317
* PR1255 (case ranges) related changes in Local Transformations. | Stepan Dyatkovskiy | 2012-05-23 | 1 | -10/+14
    llvm-svn: 157315
* address some of John Criswell's comments | Nuno Lopes | 2012-05-22 | 1 | -31/+84
    teach computeAllocSize about realloc, reallocf, and valloc
    llvm-svn: 157298
* hopefully fix the CMake build. sorry for breakage | Nuno Lopes | 2012-05-22 | 1 | -0/+1
    llvm-svn: 157264
* add a new pass to instrument loads and stores for run-time bounds checking | Nuno Lopes | 2012-05-22 | 5 | -62/+286
    move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h
    (a draft of this) patch reviewed by Andrew, thanks.
    llvm-svn: 157261
* revert my previous patches that introduced an additional parameter to the objectsize intrinsic. | Nuno Lopes | 2012-05-22 | 1 | -106/+60
    After a lot of discussion, we realized it's not the best option for run-time bounds checking
    llvm-svn: 157255
* Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the | Duncan Sands | 2012-05-22 | 1 | -6/+12
    leader table. That's because it wasn't expecting instructions to turn up as leader for a value number that is not its own, but equality propagation could create this situation. One solution is to have the leader table use a WeakVH but this slows down GVN by about 5%. Instead just have equality propagation not add instructions to the leader table, only constants and arguments. In theory this might cause GVN to run more (each time it changes something it runs again) but it doesn't seem to occur enough to cause a slow down.
    llvm-svn: 157251
* Mark an unreachable region of code with llvm_unreachable. | Dan Gohman | 2012-05-21 | 1 | -1/+1
    llvm-svn: 157197
* Do not pass an invalid domtree to SimplifyInstruction from | Peter Collingbourne | 2012-05-20 | 1 | -2/+2
    LoopUnswitch. Fixes PR12887.
    llvm-svn: 157140
* Do not eliminate allocas whose alignment exceeds that of the | Peter Collingbourne | 2012-05-19 | 1 | -12/+35
    copied-in constant, as a subsequent user may rely on over alignment. Fixes PR12885.
    llvm-svn: 157134
* Fix replacing all the users of objc weak runtime routines | Dan Gohman | 2012-05-18 | 1 | -2/+12
    when deleting them. rdar://11434915.
    llvm-svn: 157080
* Teach SimplifyLibCalls about stpcpy. | David Majnemer | 2012-05-15 | 1 | -7/+54
    llvm-svn: 156815
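For readers unfamiliar with `stpcpy`, its semantics are what make it amenable to library-call simplification: it copies like `strcpy` but returns a pointer to the terminating NUL in the destination. The sketch below restates those semantics in terms of `strlen` and `memcpy`. `stpcpySketch` is an invented name, and the code illustrates the library function's contract rather than the transformation the commit implements.

```cpp
#include <cassert>
#include <cstring>

// stpcpy semantics expressed with two simpler calls: copy `src`, including
// its terminating NUL, into `dst`, then return a pointer to that NUL in
// `dst`. When the source length is a known constant, both the copy size and
// the returned offset are constants too.
char *stpcpySketch(char *dst, const char *src) {
    std::size_t len = std::strlen(src);
    std::memcpy(dst, src, len + 1);
    return dst + len;
}
```

Copying `"abc"` into a buffer yields a return value of `buf + 3`, pointing at the NUL byte.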
* Move the capture analysis from MemoryDependencyAnalysis to a more general place | Chad Rosier | 2012-05-14 | 1 | -1/+5
    so that it can be reused in MemCpyOptimizer. This analysis is needed to remove an unnecessary memcpy when returning a struct into a local variable.
    rdar://11341081
    PR12686
    llvm-svn: 156776
* Teach Function::hasAddressTaken that BlockAddress doesn't really take | Jay Foad | 2012-05-12 | 1 | -0/+4
    the address of a function.
    llvm-svn: 156703
* objectsize: add a few more tests and fix a bug | Nuno Lopes | 2012-05-11 | 1 | -1/+1
    llvm-svn: 156625
* Fix a minor logic mistake transforming compares in instcombine. PR12514. | Eli Friedman | 2012-05-11 | 1 | -1/+1
    llvm-svn: 156600
* objectsize: add support for GEPs with non-constant indexes | Nuno Lopes | 2012-05-10 | 3 | -34/+34
    add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag
    llvm-svn: 156585
* Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. | Dan Gohman | 2012-05-10 | 1 | -3/+19
    llvm-svn: 156558
* teach DSE and isInstructionTriviallyDead() about calloc | Nuno Lopes | 2012-05-10 | 2 | -4/+17
    llvm-svn: 156553
* Fix the objc_storeStrong recognizer to stop before walking off the | Dan Gohman | 2012-05-09 | 1 | -1/+4
    end of a basic block if there's no store.
    llvm-svn: 156520
* objectsize: | Nuno Lopes | 2012-05-09 | 1 | -55/+96
    refactor code a bit to enable future changes to support run-time information
    add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)
    llvm-svn: 156515
* Remove unused variable to get rid of warning. | Craig Topper | 2012-05-09 | 1 | -1/+1
    llvm-svn: 156466
* Miscellaneous accumulated cleanups. | Dan Gohman | 2012-05-08 | 1 | -104/+78
    llvm-svn: 156445
* Fix objc_storeStrong pattern matching to catch a potential use of the | Dan Gohman | 2012-05-08 | 1 | -9/+29
    old value after the store but before it is released. This fixes rdar:/11116986.
    llvm-svn: 156442
* Calling ReassociateExpression recursively is extremely dangerous since it will | Duncan Sands | 2012-05-08 | 1 | -7/+7
    replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW. There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead. Fixes PR12169.
    llvm-svn: 156379
* Allow NULL LoopPassManager argument in UnrollLoop. PR12734. | Andrew Trick | 2012-05-08 | 2 | -20/+26
    llvm-svn: 156358
* Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. | Owen Anderson | 2012-05-07 | 1 | -4/+28
    This allows for greater CSE opportunities.
    llvm-svn: 156323
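Why a canonical operand order helps common-subexpression elimination can be seen in a toy model. The sketch below is an illustration under assumptions: `BinExpr`, `canonicalize`, and `key` are invented, and lexicographic ordering of operand names stands in for whatever ranking the pass actually uses. Only commuting is shown, which is safe for floating-point add and multiply, unlike reassociating them.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy binary expression over named operands.
struct BinExpr {
    char op;          // '+', '*', '-', ...
    std::string lhs;
    std::string rhs;
};

// Put the operands of a commutative operator into a fixed order so that
// `a * b` and `b * a` become structurally identical.
BinExpr canonicalize(BinExpr e) {
    bool commutative = (e.op == '+' || e.op == '*');
    if (commutative && e.rhs < e.lhs)
        std::swap(e.lhs, e.rhs);
    return e;
}

// A structural key of the kind a CSE table could hash on.
std::string key(const BinExpr &e) {
    return e.lhs + " " + e.op + " " + e.rhs;
}
```

After canonicalization, `b * a` and `a * b` share one key and can be recognized as the same computation, while `b - a` and `a - b` remain distinct.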
* Switch the select to branch transformation on by default. | Benjamin Kramer | 2012-05-06 | 1 | -3/+4
    The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know!
    llvm-svn: 156258
* Remove trailing spaces. | Jakub Staszak | 2012-05-06 | 1 | -60/+60
    llvm-svn: 156257
* CodeGenPrepare: Add a transform to turn selects into branches in some cases. | Benjamin Kramer | 2012-05-05 | 1 | -0/+84
    This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%:
      ucomisd (%rdi), %xmm0
      cmovbel %edx, %esi
    cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice.
    This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit as well; those are included in the "predictableSelectIsExpensive" flag.
    It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform.
    Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator.
    Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :)
    llvm-svn: 156234
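The heuristic described in this message can be paraphrased as a predicate. The sketch below is a reading of the prose, not the code from CodeGenPrepare: `SelectInfoSketch` and `shouldFormBranch` are invented names, and the two boolean fields are assumptions that mirror the "one use" and "direct memory operand" conditions in the text.

```cpp
#include <cassert>

// Hypothetical summary of the facts the heuristic looks at.
struct SelectInfoSketch {
    bool hasOneUse;              // the value feeding the select has one use
    bool hasDirectMemoryOperand; // the compare reads directly from memory
};

// Convert a select into a branch only when the target says predictable
// selects are expensive, the function is not optimized for size, and the
// select matches the conservative pattern from the commit message.
bool shouldFormBranch(const SelectInfoSketch &s, bool optimizeForSize,
                      bool predictableSelectIsExpensive) {
    if (optimizeForSize || !predictableSelectIsExpensive)
        return false;
    return s.hasOneUse && s.hasDirectMemoryOperand;
}
```

Keeping the size and target checks first makes the policy gates explicit and separate from the pattern match.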
* Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw". | Stepan Dyatkovskiy | 2012-05-05 | 1 | -1/+1
    Also added fix to 2011-06-13-nsw-alloca.ll test.
    llvm-svn: 156231
* Teach the code extractor how to extract a sequence of blocks from | Chandler Carruth | 2012-05-04 | 1 | -7/+32
    RegionInfo's RegionNode. This mirrors the logic for automating the extraction from a Loop.
    llvm-svn: 156208
* Factor the computation of input and output sets into a public interface | Chandler Carruth | 2012-05-04 | 1 | -35/+34
    of the CodeExtractor utility. This allows speculatively computing input and output sets to measure the likely size impact of the code extraction.
    These sets cannot be reused sadly -- we mutate the function prior to forming the final sets used by the actual extraction.
    The interface has been revamped slightly to make it easier to use correctly by making the interface const and sinking the computation of the number of exit blocks into the full extraction function and away from the rest of this logic which just computed two output parameters.
    llvm-svn: 156168
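The notion of input and output sets for a region being extracted can be sketched without LLVM. In this hypothetical model, an instruction defines one named value and uses others; inputs are values used inside the region but defined outside it, and outputs are values defined inside and used outside. `InstSketch` and `findInputsOutputs` are invented names, and the function takes its argument by const reference to echo the read-only, speculative use the message describes.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction: defines `def`, uses `uses`, and is either inside or
// outside the region being considered for extraction.
struct InstSketch {
    std::string def;
    std::vector<std::string> uses;
    bool inRegion;
};

// Compute the region's inputs and outputs without modifying anything.
void findInputsOutputs(const std::vector<InstSketch> &insts,
                       std::set<std::string> &inputs,
                       std::set<std::string> &outputs) {
    std::set<std::string> definedInside;
    for (std::size_t i = 0; i < insts.size(); ++i)
        if (insts[i].inRegion)
            definedInside.insert(insts[i].def);

    for (std::size_t i = 0; i < insts.size(); ++i) {
        for (std::size_t j = 0; j < insts[i].uses.size(); ++j) {
            const std::string &used = insts[i].uses[j];
            bool fromInside = definedInside.count(used) != 0;
            if (insts[i].inRegion && !fromInside)
                inputs.insert(used);   // flows into the region
            if (!insts[i].inRegion && fromInside)
                outputs.insert(used);  // flows out of the region
        }
    }
}
```

The sizes of the two sets give a rough estimate of how many parameters an extracted function would need, which is the kind of cost measurement the message mentions.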