path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.).Hal Finkel2012-04-271-0/+6
| | | | | | | | | | Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729
* Change recurse depth limit to uint32 to fix warning.David Blaikie2012-04-271-1/+1
| | | | llvm-svn: 155727
* Miscellaneous accumulated cleanups.Dan Gohman2012-04-271-71/+57
| | | | llvm-svn: 155725
* Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks.Mon P Wang2012-04-271-3/+12
| | | | | | | The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow issues. <rdar://problem/11286839>. llvm-svn: 155722
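The depth-limited bailout described in this commit can be sketched as follows. This is a minimal illustration, not the GVN code: the function name `is_fully_available`, the dictionary-based CFG, and the choice to answer `False` once the limit is exceeded are all assumptions. The commit uses a limit of 1000; the limit is a parameter here because Python's own default recursion limit is in the same range.

```python
def is_fully_available(block, preds, defined_in, max_depth, depth=0):
    """Hypothetical helper: True if a value is available on every path into `block`.

    preds      -- dict mapping a block name to its predecessor block names
    defined_in -- set of block names in which the value is defined
    max_depth  -- recursion depth after which we give up conservatively
    """
    # Early bailout: stop recursing and answer conservatively.
    if depth > max_depth:
        return False
    if block in defined_in:
        return True
    block_preds = preds.get(block, [])
    if not block_preds:
        # Reached a block with no predecessors without finding a definition.
        return False
    for pred in block_preds:
        if not is_fully_available(pred, preds, defined_in, max_depth, depth + 1):
            return False
    return True


# A straight-line chain of 50 blocks, with the value defined only in b0.
chain_preds = {f"b{i}": [f"b{i - 1}"] for i in range(1, 50)}
deep_enough = is_fully_available("b49", chain_preds, {"b0"}, max_depth=100)
cut_off = is_fully_available("b49", chain_preds, {"b0"}, max_depth=10)
```

With a generous limit the query walks the whole chain; with a small limit it stops early and reports the value as unavailable, which costs an optimization opportunity but avoids a stack overflow.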
* [asan] small optimization: do not emit "x+0" instructions Kostya Serebryany2012-04-271-3/+4
| | | | llvm-svn: 155701
* [tsan] Atomic support for ThreadSanitizer, patch by Dmitry VyukovKostya Serebryany2012-04-271-33/+152
| | | | llvm-svn: 155698
* Break up getProfitableChainIncrement().Jakob Stoklund Olesen2012-04-261-39/+47
| | | | | | | | | | | The required checks are moved to ChainInstruction() itself and the policy decisions are moved to IVChain::isProfitableInc(). Also cache the ExprBase in IVChain to avoid frequent recomputations. No functional change intended. llvm-svn: 155676
* Turn IVChain into a struct.Jakob Stoklund Olesen2012-04-261-19/+42
| | | | | | No functional change intended. llvm-svn: 155675
* Add instcombine patterns for the following transformations:Chad Rosier2012-04-262-0/+19
| | | | | | | | | | (x & y) | (x ^ y) -> x | y (x & y) + (x ^ y) -> x | y Patch by Manman Ren. rdar://10770603 llvm-svn: 155674
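Both patterns rest on the fact that `x & y` and `x ^ y` never have a set bit in the same position, so OR-ing or adding them yields `x | y` with no carries. A brute-force check over small bit widths is sketched below; the function names are illustrative and the 8-bit width is an arbitrary choice.

```python
from itertools import product


def or_form(x, y):
    # Left-hand side of the first pattern: (x & y) | (x ^ y)
    return (x & y) | (x ^ y)


def add_form(x, y, bits):
    # Left-hand side of the second pattern: (x & y) + (x ^ y),
    # truncated to the given bit width like a fixed-width integer add.
    return ((x & y) + (x ^ y)) & ((1 << bits) - 1)


def patterns_hold(bits=8):
    """Exhaustively compare both left-hand sides with x | y."""
    values = range(1 << bits)
    return all(
        or_form(x, y) == (x | y) and add_form(x, y, bits) == (x | y)
        for x, y in product(values, repeat=2)
    )
```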
* Teach the reassociate pass to fold chains of multiplies with repeatedChandler Carruth2012-04-261-10/+247
| | | | | | | | | | | | | | | | | elements to minimize the number of multiplies required to compute the final result. This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to at least do a decent job on a very diverse range of inputs. Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up! Credit to Richard Smith for the core algorithm, and helping code the patch itself. llvm-svn: 155616
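To illustrate what a binary exponentiation-style multiply chain saves, here is a sketch of textbook square-and-multiply for a single repeated factor, counting the multiplications it performs. It is not the heuristic from the Reassociate pass, which has to handle several distinct factors at once; the function name is hypothetical.

```python
def pow_by_squaring(x, n):
    """Return (x**n, multiplications used) for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    result = None  # no factor accumulated yet
    base = x       # holds x**(2**k) after k squarings
    muls = 0
    while n:
        if n & 1:
            if result is None:
                result = base          # first factor: a copy, not a multiply
            else:
                result = result * base
                muls += 1
        n >>= 1
        if n:
            base = base * base         # square only if more bits remain
            muls += 1
    return result, muls
```

For `x**8` this uses 3 multiplies instead of the 7 a naive chain needs, and for `x**15` it uses 6 instead of 14. The `x**15` case can be done in 5, which is the kind of gap the commit's "near-optimal" wording allows for.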
* Print IV chain numbers while collecting them.Jakob Stoklund Olesen2012-04-251-4/+5
| | | | llvm-svn: 155567
* Reverting r155468. Chris and Chandler have convinced me that it's dangerous andLang Hames2012-04-251-35/+0
| | | | | | | | in poor taste. Talking through some alternate solutions with Chandler. llvm-svn: 155530
* Simplify the known retain count tracking; use a boolean state insteadDan Gohman2012-04-251-41/+34
| | | | | | | of a precise count. Also, move RRInfo's Partial field into PtrState, now that it won't increase the size. llvm-svn: 155513
* Build custom predecessor and successor lists for each basic block.Dan Gohman2012-04-241-115/+101
| | | | | | | | These lists exclude invoke unwind edges and loop backedges which are being ignored. This makes it easier to ignore them consistently. llvm-svn: 155500
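One way to picture the filtered lists is the sketch below. It is an assumption-laden illustration, not the pass's code: the CFG is a plain dictionary, backedges are found with a depth-first search from the entry block, and unwind edges are supplied by the caller as a set of `(from, to)` pairs.

```python
def build_filtered_cfg(succs, entry, unwind_edges=frozenset()):
    """Return (preds, succs) dictionaries that omit backedges and unwind edges."""
    on_stack, finished, backedges = set(), set(), set()

    def dfs(block):
        on_stack.add(block)
        for succ in succs.get(block, []):
            if succ in on_stack:
                backedges.add((block, succ))  # edge to a block still being visited
            elif succ not in finished:
                dfs(succ)
        on_stack.remove(block)
        finished.add(block)

    dfs(entry)
    ignored = backedges | set(unwind_edges)

    blocks = set(succs) | {s for targets in succs.values() for s in targets}
    filtered_succs = {b: [] for b in blocks}
    filtered_preds = {b: [] for b in blocks}
    for block, targets in succs.items():
        for succ in targets:
            if (block, succ) not in ignored:
                filtered_succs[block].append(succ)
                filtered_preds[succ].append(block)
    return filtered_preds, filtered_succs


# A loop whose body branches back to the header and unwinds to a landing pad.
example_cfg = {
    "entry": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop", "lpad"],
    "exit": [],
    "lpad": [],
}
example_preds, example_succs = build_filtered_cfg(
    example_cfg, "entry", unwind_edges={("body", "lpad")}
)
```

Because every later traversal reads the same two dictionaries, the ignored edges cannot be skipped in one place and followed in another.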
* Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixesLang Hames2012-04-241-0/+35
| | | | | | <rdar://problem/11291436>. llvm-svn: 155468
* Reapply r155136 after fixing PR12599.Jakob Stoklund Olesen2012-04-231-39/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Original commit message: Defer some shl transforms to DAGCombine. The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication. Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'. These transformations are deferred: (X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses) (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1) (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2) The corresponding exact transformations are preserved, just like div-exact + mul: (X >>?,exact C) << C --> X (X >>?,exact C1) << C2 --> X << (C2-C1) (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2) The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155362
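The three deferred rewrites in the message above are arithmetic identities on fixed-width integers, which the following sketch checks exhaustively for 8-bit values. `>>?` is read here as "either logical or arithmetic shift right"; the helper names and the 8-bit width are assumptions, and the "multiple uses" condition is not modelled.

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(x, c):
    return (x << c) & MASK


def lshr(x, c):
    return x >> c


def ashr(x, c):
    # Interpret x as a signed BITS-wide value, shift, then re-wrap.
    signed = x - (1 << BITS) if x & (1 << (BITS - 1)) else x
    return (signed >> c) & MASK


def high_mask(c):
    # The constant written as (-1 << C) in the commit message.
    return (MASK << c) & MASK


def deferred_identities_hold():
    for shr in (lshr, ashr):
        for x in range(1 << BITS):
            for c1 in range(BITS):
                for c2 in range(BITS):
                    lhs = shl(shr(x, c1), c2)
                    if c1 == c2:
                        rhs = x & high_mask(c2)
                    elif c2 > c1:
                        rhs = shl(x, c2 - c1) & high_mask(c2)
                    else:
                        rhs = shr(x, c1 - c2) & high_mask(c2)
                    if lhs != rhs:
                        return False
    return True
```

The right-hand sides are equivalent but replace the left shift with a mask, which is the form the commit says makes a constant multiplication harder to recognize later.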
* Fix issue 67 by checking that the interface functions weren't redefined in ↵Alexander Potapenko2012-04-231-4/+18
| | | | | | the compiled source file. llvm-svn: 155346
* [tsan] use llvm/ADT/Statistic.h for tsan statsKostya Serebryany2012-04-231-40/+17
| | | | llvm-svn: 155341
* Revert r155136 "Defer some shl transforms to DAGCombine."Jakob Stoklund Olesen2012-04-201-35/+39
| | | | | | | | | While the patch was perfect and defect free, it exposed a really nasty bug in X86 SelectionDAG that caused an llc crash when compiling lencod. I'll put the patch back in after fixing the SelectionDAG problem. llvm-svn: 155181
* Put this expensive check below the less expensive ones.Bill Wendling2012-04-191-9/+9
| | | | llvm-svn: 155166
* Avoid a bug in the path count computation, preventing an infiniteDan Gohman2012-04-191-1/+1
| | | | | | loop repeatedly making the same change. This is for rdar://11256239. llvm-svn: 155160
* Defer some shl transforms to DAGCombine.Jakob Stoklund Olesen2012-04-191-39/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication. Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'. These transformations are deferred: (X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses) (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1) (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2) The corresponding exact transformations are preserved, just like div-exact + mul: (X >>?,exact C) << C --> X (X >>?,exact C1) << C2 --> X << (C2-C1) (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2) The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155136
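The exact variants that stay enabled need no mask, because an exact right shift only ever discards zero bits. The sketch below checks this for 8-bit values by restricting `x` to multiples of `2**C1`; the helper names and bit width are assumptions made for illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(x, c):
    return (x << c) & MASK


def lshr(x, c):
    return x >> c


def ashr(x, c):
    signed = x - (1 << BITS) if x & (1 << (BITS - 1)) else x
    return (signed >> c) & MASK


def exact_identities_hold():
    for shr in (lshr, ashr):
        for c1 in range(BITS):
            # Only values whose low c1 bits are zero make the shift exact.
            for x in range(0, 1 << BITS, 1 << c1):
                for c2 in range(BITS):
                    lhs = shl(shr(x, c1), c2)
                    if c2 >= c1:
                        rhs = shl(x, c2 - c1)  # C1 == C2 gives back X itself
                    else:
                        rhs = shr(x, c1 - c2)
                    if lhs != rhs:
                        return False
    return True


# Without exactness the round trip loses the low bit: (5 >> 1) << 1 is 4.
inexact_round_trip = shl(lshr(5, 1), 1)
```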
* Don't crash on code where the user put __attribute__((constructor)) onDan Gohman2012-04-181-1/+5
| | | | | | a function with arguments. This fixes rdar://11265785. llvm-svn: 155073
* Use a heavy hammer to fix PR12573.Bill Wendling2012-04-181-0/+9
| | | | | | | | | | | If the loop contains invoke instructions, whose unwind edge escapes the loop, then don't try to unswitch the loop. Doing so may cause the unwind edge to be split, which not only is non-trivial but doesn't preserve loop simplify information. Fixes PR12573 llvm-svn: 154987
* loop-reduce: Add an early bailout to catch extremely large loops.Andrew Trick2012-04-181-0/+17
| | | | | | | | | | | | | | This introduces a threshold of 200 IV Users, which is very conservative but should be sufficient to avoid serious compile time sink or stack overflow. The llvm test-suite with LTO never exceeds 190 users per loop. The bug doesn't relate to a specific type of loop. Checking in an arbitrary giant loop as a unit test would be silly. Fixes rdar://11262507. llvm-svn: 154983
* fix pr12559: mark unavailable win32 math libcallsJoe Groff2012-04-171-15/+10
| | | | | | also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint llvm-svn: 154960
* Fix style violation in BBVectorize (pointed out by Bill Wendling)Hal Finkel2012-04-161-3/+3
| | | | llvm-svn: 154810
* Add a Fixme.Bill Wendling2012-04-161-0/+2
| | | | llvm-svn: 154793
* Simplify checking for pointer types in BBVectorize (this change was ↵Hal Finkel2012-04-161-5/+2
| | | | | | suggested by Duncan). llvm-svn: 154787
* Fix an error in BBVectorize important for vectorizing pointer types.Hal Finkel2012-04-141-0/+31
| | | | | | | | | | When vectorizing pointer types it is important to realize that potential pairs cannot be connected via the address pointer argument of a load or store. This is because even after vectorization, the address is still a scalar because the address of the higher half of the pair is implicit from the address of the lower half (it need not be, and should not be, explicitly computed). llvm-svn: 154735
* Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs.Hal Finkel2012-04-141-2/+27
| | | | llvm-svn: 154734
* Add support to BBVectorize for vectorizing selects.Hal Finkel2012-04-131-0/+8
| | | | llvm-svn: 154700
* Add some comments, and fix a few places that missed setting Changed.Dan Gohman2012-04-131-2/+24
| | | | llvm-svn: 154687
* Consider ObjC runtime calls objc_storeWeak and others which make a copy ofDan Gohman2012-04-131-14/+29
| | | | | | | their argument as "escape" points for objc_retainBlock optimization. This fixes rdar://11229925. llvm-svn: 154682
* By default, use Early-CSE instead of GVN for vectorization cleanup.Hal Finkel2012-04-131-2/+9
| | | | | | | | | | As has been suggested by Duncan and others, Early-CSE and GVN should do similar redundancy elimination, but Early-CSE is much less expensive. Most of my autovectorization benchmarks show a performance regression, but all of these are < 0.1%, and so I think that it is still worth using the less expensive pass. llvm-svn: 154673
* Use the new Use-aware dominates method to apply the objc runtimeDan Gohman2012-04-131-8/+5
| | | | | | | library return value optimization for phi uses. Even when the phi itself is not dominated, the specific use may be dominated. llvm-svn: 154647
* Code-gen may inject code into the IR before it emits the ASM. The linkerBill Wendling2012-04-131-0/+6
| | | | | | | | obviously cannot know that this code is present, let alone used. So prevent the internalize pass from internalizing those global values which code-gen may insert. llvm-svn: 154645
* Don't move objc_autorelease calls past autorelease pool boundaries whenDan Gohman2012-04-131-3/+43
| | | | | | | optimizing autorelease calls on phi nodes with null operands. This fixes rdar://11207070. llvm-svn: 154642
* Typo.Chad Rosier2012-04-111-1/+1
| | | | llvm-svn: 154522
* Add two statistics to help track how we are computing the inline cost.Chandler Carruth2012-04-111-0/+6
| | | | | | Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome. llvm-svn: 154492
* [tsan] two more compile-time optimizations:Kostya Serebryany2012-04-101-11/+42
| | | | | | | | | | | | | - don't instrument reads from constant globals. Saves ~1.5% of instrumented instructions on CPU2006 (counting static instructions, not their execution). - don't instrument reads from vtable (which is a global constant too). Saves ~5%. I did not measure the run-time impact of this, but it is certainly non-negative. llvm-svn: 154444
* [tsan] compile-time instrumentation: do not instrument a read ifKostya Serebryany2012-04-101-5/+82
| | | | | | | | | | | | | a write to the same temp follows in the same BB. Also add stats printing. On Spec CPU2006 this optimization saves roughly 4% of instrumented reads (which is 3% of all instrumented accesses): Writes : 161216 Reads : 446458 Reads-before-write: 18295 llvm-svn: 154418
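The read-before-write filter can be sketched as a single backward walk over a basic block. This is an illustration under simplifying assumptions, not the ThreadSanitizer pass: accesses are `(kind, address)` tuples, addresses compare by name, and the block contains no calls or synchronization. The last lines recompute the percentages quoted in the commit from its own counts.

```python
def select_accesses_to_instrument(block):
    """Drop reads that are followed by a write to the same address in the block."""
    written_later = set()
    kept = []
    for kind, addr in reversed(block):
        if kind == "write":
            written_later.add(addr)
            kept.append((kind, addr))
        elif addr not in written_later:
            kept.append((kind, addr))
        # else: a later write to addr is instrumented, so this read is skipped
    kept.reverse()
    return kept


example_block = [("read", "a"), ("read", "b"), ("write", "a"), ("read", "a")]
instrumented = select_accesses_to_instrument(example_block)

# Figures reported in the commit message for Spec CPU2006.
writes, reads, reads_before_write = 161216, 446458, 18295
share_of_reads = 100 * reads_before_write / reads
share_of_all_accesses = 100 * reads_before_write / (writes + reads)
```

The first read of `a` is dropped because a write to `a` follows it, while the final read of `a` is kept because nothing writes `a` afterwards.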
* Fix 12513: Loop unrolling breaks with indirect branches.Andrew Trick2012-04-102-29/+18
| | | | | | | | Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386
* whitespaceAndrew Trick2012-04-101-140/+140
| | | | llvm-svn: 154385
* Teach InstCombine to nuke a common alloca pattern -- an alloca which hasChandler Carruth2012-04-081-1/+70
| | | | | | | | | | | | GEPs, bit casts, and stores reaching it but no other instructions. These often show up during the iterative processing of the inliner, SROA, and DCE. Once we hit this point, we can completely remove the alloca. These were actually showing up in the final, fully optimized code in a bunch of inliner tests I've been working on, and notably they show up after LLVM finishes optimizing away all function calls involved in hash_combine(a, b). llvm-svn: 154285
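The removability test described above amounts to a walk over the alloca's transitive users. The sketch below uses a hypothetical encoding in which `users` maps a value name to `(opcode, result, role)` tuples; treating a store as acceptable only when the tracked value is the store's pointer operand is an assumption added here, since storing the pointer itself would let it escape.

```python
def alloca_is_removable(alloca, users):
    """True if only GEPs, bitcasts, and stores into the memory reach the alloca."""
    worklist = [alloca]
    seen = set()
    while worklist:
        value = worklist.pop()
        if value in seen:
            continue
        seen.add(value)
        for opcode, result, role in users.get(value, []):
            if opcode in ("getelementptr", "bitcast"):
                worklist.append(result)  # derived pointer: check its users too
            elif opcode == "store" and role == "pointer":
                continue                 # writes into memory nobody reads
            else:
                return False             # load, call, escaping store, ...
    return True


dead_pattern = {
    "%a": [("getelementptr", "%p", "pointer"), ("bitcast", "%q", "pointer")],
    "%p": [("store", None, "pointer")],
    "%q": [("store", None, "pointer")],
}
with_load = {"%a": [("load", "%v", "pointer")]}
escaping = {"%a": [("store", None, "value")]}
```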
* Refactor: Use positive field names in VectorizeConfig.Hongbin Zheng2012-04-071-13/+15
| | | | llvm-svn: 154249
* Sink the collection of return instructions until after *all*Chandler Carruth2012-04-061-7/+9
| | | | | | | | | | | simplification has been performed. This is a bit less efficient (requires another ilist walk of the basic blocks) but shouldn't matter in practice. More importantly, it's just too much work to keep track of all the various ways the return instructions can be mutated while simplifying them. This fixes yet another crasher, reported by Daniel Dunbar. llvm-svn: 154179
* Make GVN's propagateEquality non-recursive. No intended functionality change.Duncan Sands2012-04-061-98/+105
| | | | | | The modifications are a lot more trivial than they appear to be in the diff! llvm-svn: 154174
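Turning a recursive propagation into a loop usually means replacing the call stack with an explicit worklist. The sketch below shows that shape on a toy problem; the expression encoding and the two rules (an `and` that is true makes both operands true, an `or` that is false makes both operands false) are assumptions chosen for illustration and are not taken from the commit.

```python
def propagate_equality(expr, value):
    """Return the set of (name, bool) facts implied by `expr == value`.

    expr is either a variable name (str) or a tuple (op, lhs, rhs)
    with op in {"and", "or"}.
    """
    facts = set()
    worklist = [(expr, value)]          # pending equalities, replaces recursion
    while worklist:
        current, known = worklist.pop()
        if isinstance(current, str):
            facts.add((current, known))
            continue
        op, lhs, rhs = current
        if op == "and" and known is True:
            worklist.append((lhs, True))
            worklist.append((rhs, True))
        elif op == "or" and known is False:
            worklist.append((lhs, False))
            worklist.append((rhs, False))
        # Other combinations say nothing about the individual operands.
    return facts


# Build a 5000-deep nest of "and" operations iteratively.
nested = "x0"
for i in range(1, 5001):
    nested = ("and", f"x{i}", nested)
deep_facts = propagate_equality(nested, True)
```

The deeply nested input would exceed Python's default recursion limit in a recursive version, while the worklist version only grows a list.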
* Sink the return instruction collection until after we're done deletingChandler Carruth2012-04-061-7/+9
| | | | | | | | | | | | | | dead code, including dead return instructions in some cases. Otherwise, we end up having a bogus pointer to a return instruction that blows up much further down the road. It turns out that this pattern is both simpler to code, easier to update in the face of enhancements to the inliner cleanup, and likely cheaper given that it won't add dead instructions to the list. Thanks to John Regehr's numerous test cases for teasing this out. llvm-svn: 154157
* Fix accidentally inverted logic from r152803, and make theDan Gohman2012-04-051-1/+1
| | | | | | testcase slightly less trivial. This fixes rdar://11171718. llvm-svn: 154118