summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [unroll] Don't use a map from pointer to bool. Use a set.Chandler Carruth2015-02-131-4/+4
| | | | | | | | | This is much more efficient. In particular, the query with the user instruction has to insert a false for every missing instruction into the set. This is just a cleanup a long the way to fixing the underlying algorithm problems here. llvm-svn: 228994
* Prevent division by 0.Michael Zolotukhin2015-02-131-1/+1
| | | | | | | | When we try to estimate number of potentially removed instructions in loop unroller, we analyze first N iterations and then scale the computed number by TripCount/N. We should bail out early if N is 0. llvm-svn: 228988
* [unroll] Update the new analysis logic from r228265 to use modern codingChandler Carruth2015-02-131-10/+10
| | | | | | | conventions for function names consistently. Some were already using this but not all. llvm-svn: 228987
* InstCombine: Allow folding of xor into icmp by changing the predicate for ↵Benjamin Kramer2015-02-121-3/+4
| | | | | | | | vectors The loop vectorizer can create this pattern. llvm-svn: 228954
* [LoopRerolling] Be more forgiving with instruction order.James Molloy2015-02-121-17/+82
| | | | | | | | | We can't solve the full subgraph isomorphism problem. But we can allow obvious cases, where for example two instructions of different types are out of order. Due to them having different types/opcodes, there is no ambiguity. llvm-svn: 228931
* tsan: do not instrument not captured valuesDmitry Vyukov2015-02-121-0/+15
| | | | | | | | | | | | | | | | I've built some tests in WebRTC with and without this change. With this change number of __tsan_read/write calls is reduced by 20-40%, binary size decreases by 5-10% and execution time drops by ~5%. For example: $ ls -l old/modules_unittests new/modules_unittests -rwxr-x--- 1 dvyukov 41708976 Jan 20 18:35 old/modules_unittests -rwxr-x--- 1 dvyukov 38294008 Jan 20 18:29 new/modules_unittests $ objdump -d old/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l 239871 $ objdump -d new/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l 148365 http://reviews.llvm.org/D7069 llvm-svn: 228917
* [slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.Chandler Carruth2015-02-121-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apparently some code finally started to tickle this after my canonicalization changes to instcombine. The bug stems from trying to form a vector type out of scalars that aren't compatible at all. In this example, from x86_mmx values. The code in the vectorizer that checks for reasonable types whas checking for aggregates or vectors, but there are lots of other types that should just never reach the vectorizer. Debugging this was made more confusing by the lie in an assert in VectorType::get() -- it isn't that the types are *primitive*. The types must be integer, pointer, or floating point types. No other types are allowed. I've improved the assert and added a helper to the vectorizer to handle the element type validity checks. It now re-uses the VectorType static function and then further excludes weird target-specific types that we probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of these are really reachable anyways (neither 80-bit nor 128-bit things will get vectorized) but it seems better to just eagerly exclude such nonesense. I've added a test case, but while it definitely covers two of the paths through this code there may be more paths that would benefit from test coverage. I'm not familiar enough with the SLP vectorizer to synthesize test cases for all of these, but was able to update the code itself by inspection. llvm-svn: 228899
* DeadArgElim: aggregate Return assessment properly.Tim Northover2015-02-111-4/+7
| | | | | | | | | I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It actually depends on the index too, which means we need to be careful about how the results are combined before return. In particular if a single Use returns Live, that counts for the entire object, at the granularity we're considering. llvm-svn: 228885
* Reassociate: cannot negate a INT_MIN valueMehdi Amini2015-02-111-1/+1
| | | | | | | | | | | | | | | | | | Summary: When trying to canonicalize negative constants out of multiplication expressions, we need to check that the constant is not INT_MIN which cannot be negated. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7286 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 228872
* [SimplifyCFG] Swap to using TargetTransformInfo for costJames Molloy2015-02-111-50/+28
| | | | | | | | | | | | | | | | | | analysis. We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness" heuristic and use TTI directly. Generally NFC intended, but we're using a slightly different heuristic now so there is a slight test churn. Test changes: * combine-comparisons-by-cse.ll: Removed unneeded branch check. * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq. * coalesce-subregs.ll: Superfluous block check. * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv. * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present. * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI. llvm-svn: 228826
* [LoopReroll] Introduce the concept of DAGRootSets.James Molloy2015-02-111-202/+376
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A DAGRootSet models an induction variable being used in a rerollable loop. For example: x[i*3+0] = y1 x[i*3+1] = y2 x[i*3+2] = y3 Base instruction -> i*3 +---+----+ / | \ ST[y1] +1 +2 <-- Roots | | ST[y2] ST[y3] There may be multiple DAGRootSets, for example: x[i*2+0] = ... (1) x[i*2+1] = ... (1) x[i*2+4] = ... (2) x[i*2+5] = ... (2) x[(i+1234)*2+5678] = ... (3) x[(i+1234)*2+5679] = ... (3) This concept is similar to the "Scale" member used previously, but allows multiple independent sets of roots based off the same induction variable. llvm-svn: 228821
* Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects.Zachary Turner2015-02-117-0/+25
| | | | | | | | | | This allows IDEs to recognize the entire set of header files for each of the core LLVM projects. Differential Revision: http://reviews.llvm.org/D7526 Reviewed By: Chris Bieneman llvm-svn: 228798
* InstrProf: Lower coverage mappings by setting their sections appropriatelyJustin Bogner2015-02-111-0/+41
| | | | | | | | | | | | | | | Add handling for __llvm_coverage_mapping to the InstrProfiling pass. We need to make sure the constant and any profile names it refers to are in the correct sections, which is easier and cleaner to do here where we have to know about profiling sections anyway. This is really tricky to test without a frontend, so I'm committing the test for the fix in clang. If anyone knows a good way to test this within LLVM, please let me know. Fixes PR22531. llvm-svn: 228793
* Don't promote asynch EH invokes of nounwind functions to callsReid Kleckner2015-02-113-3/+7
| | | | | | | | | | | If the landingpad of the invoke is using a personality function that catches asynch exceptions, then it can catch a trap. Also add some landingpads to invalid LLVM IR test cases that lack them. Over-the-shoulder reviewed by David Majnemer. llvm-svn: 228782
* EarlyCSE: It isn't safe to CSE across synchronization boundariesDavid Majnemer2015-02-101-0/+3
| | | | | | This fixes PR22514. llvm-svn: 228760
* DeadArgElim: arguments affect all returned sub-values by default.Tim Northover2015-02-101-4/+16
| | | | | | | | | | Unless we meet an insertvalue on a path from some value to a return, that value will be live if *any* of the return's components are live, so all of those components must be added to the MaybeLiveUses. Previously we were deleting arguments if sub-value 0 turned out to be dead. llvm-svn: 228731
* Revert r228556: InstCombine: propagate nonNull through assumeChandler Carruth2015-02-101-8/+1
| | | | | | | | | This commit isn't using the correct context, and is transfoming calls that are operands to loads rather than calls that are operands to an icmp feeding into an assume. I've replied on the original review thread with a very reduced test case and some thoughts on how to rework this. llvm-svn: 228677
* Adjust how we avoid poll insertion inside the poll function (NFC)Philip Reames2015-02-101-5/+11
| | | | | | | | I realized that my early fix for this was overly complicated. Rather than scatter checks around in a bunch of places, just exit early when we visit the poll function itself. Thinking about it a bit, the whole inlining mechanism used with gc.safepoint_poll could probably be cleaned up a bit. Originally, poll insertion was fused with gc relocation rewriting. It might be worth going back to see if we can simplify the chain of events now that these two are seperated. As one thought, maybe it makes sense to rewrite calls inside the helper function before inlining it to the many callers. This would require us to visit the poll function before any other functions though.. llvm-svn: 228634
* Debug info: When updating debug info during SROA, do not emit debug infoAdrian Prantl2015-02-091-8/+18
| | | | | | | | | | for any padding introduced by SROA. In particular, do not emit debug info for an alloca that represents only the padding introduced by a previous iteration. Fixes PR22495. llvm-svn: 228632
* Debug info: Use DW_OP_bit_piece instead of DW_OP_piece in theAdrian Prantl2015-02-091-5/+5
| | | | | | | | | | | intermediate representation. This - increases consistency by using the same granularity everywhere - allows for pieces < 1 byte - DW_OP_piece didn't actually allow storing an offset. Part of PR22495. llvm-svn: 228631
* [Statepoint] Improve two asserts, fix some style (NFC)Ramkumar Ramachandra2015-02-091-1/+2
| | | | | | | | | | | | | | | Summary: It's important that our users immediately know what gc.safepoint_poll is. Also fix the style of the declaration of CreateGCStatepoint, in preparation for another change that will wrap it. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7517 llvm-svn: 228626
* PlaceSafepoints: modernize gc.result.* -> gc.resultRamkumar Ramachandra2015-02-091-12/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D7516 llvm-svn: 228625
* Update file comment to clarify points highlighted in review (NFC)Philip Reames2015-02-091-31/+30
| | | | llvm-svn: 228621
* Use range for loops in PlaceSafepoints (NFC)Philip Reames2015-02-091-8/+4
| | | | llvm-svn: 228620
* IR: Take uint64_t in DIBuilder::createExpression()Duncan P. N. Exon Smith2015-02-091-1/+1
| | | | | | | | | | | `DIExpression` deals with `uint64_t`, so it doesn't make sense that `createExpression()` is created from `int64_t`. Switch to `uint64_t` to unify them. I've temporarily left in the `int64_t` version, which forwards to the `uint64_t` version. I'll delete it once I've updated the callers. llvm-svn: 228619
* Add basic tests for PlaceSafepointsPhilip Reames2015-02-091-5/+21
| | | | | | This is just adding really simple tests which should have been part of the original submission. When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission. I fixed two such bugs in this submission. llvm-svn: 228610
* Fix a bug in DemoteRegToStack where a reload instruction was inserted into theAkira Hatanaka2015-02-091-18/+16
| | | | | | | | | | | | | | | | | | wrong basic block. This would happen when the result of an invoke was used by a phi instruction in the invoke's normal destination block. An instruction to reload the invoke's value would get inserted before the critical edge was split and a new basic block (which is the correct insertion point for the reload) was created. This commit fixes the bug by splitting the critical edge before all the reload instructions are inserted. Also, hoist up the code which computes the insertion point to the only place that need that computation. rdar://problem/15978721 llvm-svn: 228566
* DeadArgElim: fix mismatch in accounting of array return types.Tim Northover2015-02-091-39/+47
| | | | | | | | | | | | Some parts of DeadArgElim were only considering the individual fields of StructTypes separately, but others (where insertvalue & extractvalue instructions occur) also looked into ArrayTypes. This one is an actual bug; the mismatch can lead to an argument being considered used by a return sub-value that isn't being tracked (and hence is dead by default). It then gets incorrectly eliminated. llvm-svn: 228559
* DeadArgElim: assess uses of entire return value aggregate.Tim Northover2015-02-091-26/+26
| | | | | | | | | | | | | | | | | | | | | | | Previously, a non-extractvalue use of an aggregate return value meant the entire return was considered live (the algorithm gave up entirely). This was correct, but conservative. It's better to actually look at that Use, making the analysis results apply to all sub-values under consideration. E.g. %val = call { i32, i32 } @whatever() [...] ret { i32, i32 } %val The return is using the entire aggregate (sub-values 0 and 1). We can still simplify @whatever if we can prove that this return is itself unused. Also unifies the logic slightly between aggregate and non-aggregate cases.. llvm-svn: 228558
* InstCombine: propagate nonNull through assumeRamkumar Ramachandra2015-02-091-1/+8
| | | | | | | | | Make assume (load (call|invoke) != null) set nonNull return attribute for the call and invoke. Also include tests. Differential Revision: http://reviews.llvm.org/D7107 llvm-svn: 228556
* Correctly combine alias.scope metadata by a union instead of intersectingBjorn Steinbrink2015-02-082-0/+4
| | | | | | | | | | | | | | | | | Summary: The alias.scope metadata represents sets of things an instruction might alias with. When generically combining the metadata from two instructions the result must be the union of the original sets, because the new instruction might alias with anything any of the original instructions aliased with. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7490 llvm-svn: 228525
* LoopIdiom: Use utility functions.Benjamin Kramer2015-02-071-54/+15
| | | | | | | | | The only difference between deleteIfDeadInstruction and RecursivelyDeleteTriviallyDeadInstructions is that the former also manually invalidates SCEV. That's unnecessary because SCEV automatically gets informed when an instruction is deleted via a ValueHandle. NFC. llvm-svn: 228508
* Properly update AA metadata when performing call slot optimizationBjorn Steinbrink2015-02-071-0/+10
| | | | | | | | Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7482 llvm-svn: 228500
* [msan] Fix "missing origin" in atomic store.Evgeniy Stepanov2015-02-061-1/+1
| | | | | | | | | | An atomic store always make the target location fully initialized (in the current implementation). It should not store origin. Initialized memory can't have meaningful origin, and, due to origin granularity (4 bytes) there is a chance that this extra store would overwrite meaningfull origin for an adjacent location. llvm-svn: 228444
* Use estimated number of optimized insns in unroll-threshold computation.Michael Zolotukhin2015-02-061-2/+44
| | | | | | | | | | If complete-unroll could help us to optimize away N% of instructions, we might want to do this even if the final size would exceed loop-unroll threshold. However, we don't want to unroll huge loop, and we are add AbsoluteThreshold to avoid that - this threshold will never be crossed, even if we expect to optimize 99% instructions after that. llvm-svn: 228434
* [InstSimplify] Add SimplifyFPBinOp function.Michael Zolotukhin2015-02-061-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | It is a variation of SimplifyBinOp, but it takes into account FastMathFlags. It is needed in inliner and loop-unroller to accurately predict the transformation's outcome (previously we dropped the flags and were too conservative in some cases). Example: float foo(float *a, float b) { float r; if (a[1] * b) r = /* a lot of expensive computations */; else r = 1; return r; } float boo(float *a) { return foo(a, 0.0); } Without this patch, we don't inline 'foo' into 'boo'. llvm-svn: 228432
* [LV] Move addRuntimeCheck to LoopAccessAnalysisAdam Nemet2015-02-061-104/+5
| | | | | | | | | | | | | This will allow it to be shared with the new Loop Distribution pass. getFirstInst is currently duplicated across LoopVectorize.cpp and LoopAccessAnalysis.cpp. This is a short-term work-around until we figure out a better solution. NFC. (The code moved is adjusted a bit for the name of the Loop member and that PtrRtCheck is now a reference rather than a pointer.) llvm-svn: 228418
* Make helper functions/classes/globals static. NFC.Benjamin Kramer2015-02-062-5/+5
| | | | llvm-svn: 228410
* InstCombine: Combine select sequences into a single selectMatthias Braun2015-02-061-0/+18
| | | | | | | | | | | | | | Normalize select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b) select(C0, a, select(C1, a, b)) -> select((C0 | C1), a, b) This normal form may enable further combines on the And/Or and shortens paths for the values. Many targets prefer the other but can go back easily in CodeGen. Differential Revision: http://reviews.llvm.org/D7399 llvm-svn: 228409
* IRCE: Demote template to ArrayRef and SmallVector to array.Benjamin Kramer2015-02-061-26/+15
| | | | | | NFC. llvm-svn: 228398
* [ASan] Enable -asan-stack-dynamic-alloca by default.Alexey Samsonov2015-02-051-1/+1
| | | | | | | | | | | By default, store all local variables in dynamic alloca instead of static one. It reduces the stack space usage in use-after-return mode (dynamic alloca will not be called if the local variables are stored in a fake stack), and improves the debug info quality for local variables (they will not be described relatively to %rbp/%rsp, which are assumed to be clobbered by function calls). llvm-svn: 228336
* LowerSwitch: Use ConstantInt for CaseRange::{Low,High}Hans Wennborg2015-02-051-20/+20
| | | | | | | Case values are always ConstantInt. This allows us to remove a bunch of casts. NFC. llvm-svn: 228312
* LowerSwitch: remove default args from CaseRange ctor; NFCHans Wennborg2015-02-051-3/+2
| | | | llvm-svn: 228311
* Removing an unused variable warning I accidentally introduced with my last ↵Aaron Ballman2015-02-051-1/+1
| | | | | | warning fix; NFC. llvm-svn: 228295
* Silencing an MSVC warning about a switch statement with no cases; NFC.Aaron Ballman2015-02-051-8/+5
| | | | llvm-svn: 228294
* Implement new heuristic for complete loop unrolling.Michael Zolotukhin2015-02-051-2/+332
| | | | | | | | | | | | | | | | | | | | | | | | | Complete loop unrolling can make some loads constant, thus enabling a lot of other optimizations. To catch such cases, we look for loads that might become constants and estimate number of instructions that would be simplified or become dead after substitution. Example: Suppose we have: int a[] = {0, 1, 0}; v = 0; for (i = 0; i < 3; i ++) v += b[i]*a[i]; If we completely unroll the loop, we would get: v = b[0]*a[0] + b[1]*a[1] + b[2]*a[2] Which then will be simplified to: v = b[0]* 0 + b[1]* 1 + b[2]* 0 And finally: v = b[1] llvm-svn: 228265
* StructurizeCFG: Remove obsolete fix for loop backedge detectionTom Stellard2015-02-041-1/+1
| | | | | | | This is no longer needed now that we are using a reverse post-order traversal. llvm-svn: 228187
* StructurizeCFG: Use a reverse post-order traversalTom Stellard2015-02-041-4/+64
| | | | | | | | | | | | | | We were previously doing a post-order traversal and operating on the list in reverse, however this would occasionaly cause backedges for loops to be visited before some of the other blocks in the loop. We know use a reverse post-order traversal, which avoids this issue. The reverse post-order traversal is not completely ideal, so we need to manually fixup the list to ensure that inner loop backedges are visited before outer loop backedges. llvm-svn: 228186
* Utils: Resolve cycles under distinct MDNodesDuncan P. N. Exon Smith2015-02-041-20/+45
| | | | | | | | Track unresolved nodes under distinct `MDNode`s during `MapMetadata()`, and resolve them at the end. Previously, these cycles wouldn't get resolved. llvm-svn: 228180
* Add range adapters predecessors() and successors() for BBsReid Kleckner2015-02-042-7/+6
| | | | | | | Use them in two isolated transforms so we know they work and aren't dead code. llvm-svn: 228173
OpenPOWER on IntegriCloud