summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [SimplifyLibCalls] Factor out common unsafe-math checks.Davide Italiano2015-10-291-29/+23
| | | | llvm-svn: 251595
* Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using ↵Cong Hou2015-10-291-28/+93
| | | | | | | | larger vectorization factor. To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers. llvm-svn: 251592
* SamplePGO - Add flag to check sampling coverage.Diego Novillo2015-10-281-3/+83
| | | | | | | | | | | | | | | | This adds the flag -mllvm -sample-profile-check-coverage=N to the SampleProfile pass. N is the percent of input sample records that the user expects to apply. If the pass does not use N% (or more) of the sample records in the input, it emits a warning. This is useful to detect some forms of stale profiles. If the code has drifted enough from the original profile, there will be records that do not match the IR anymore. This will not detect cases where a sample profile record for line L is referring to some other instructions that also used to be at line L. llvm-svn: 251568
* [JumpThreading] Use dominating conditions to prove implicationsSanjoy Das2015-10-281-2/+40
| | | | | | | | | | | | | | | Summary: If P branches to Q conditional on C and Q branches to R conditional on C' and C => C' then the branch conditional on C' can be folded to an unconditional branch. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13972 llvm-svn: 251557
* SamplePGO - Clear per-function data after applying a profile.Diego Novillo2015-10-281-4/+21
| | | | | | | | | | The pass was keeping around a lot of per-function data (visited blocks, edges, dominance, etc) that is just taking up memory for no reason. In fact, from function to function it could potentially confuse the propagator since some maps are indexed by line offsets which can be common between functions. llvm-svn: 251531
* Typo.Chad Rosier2015-10-281-1/+1
| | | | llvm-svn: 251521
* Reapply: [LIR] Add support for creating memsets from loops with a negative ↵Chad Rosier2015-10-281-24/+32
| | | | | | | | stride. The simple fix is to prevent forming memcpy from loops with a negative stride. llvm-svn: 251518
* [GlobalOpt] Add newlines to DEBUG messagesJames Molloy2015-10-281-4/+4
| | | | | | | | I think these were affected by a change way back when to stop printing newlines in Value::dump() by default. This change simply allows the debug output to be readable. NFC. llvm-svn: 251517
* Revert "[LIR] Add support for creating memsets from loops with a negative ↵Chad Rosier2015-10-281-28/+24
| | | | | | | | stride." This reverts commit r251512. This is causing LNT/chomp to fail. llvm-svn: 251513
* [LIR] Add support for creating memsets from loops with a negative stride.Chad Rosier2015-10-281-24/+28
| | | | | | http://reviews.llvm.org/D14125 llvm-svn: 251512
* Revert r251492 "[IndVarSimplify] Rewrite loop exit values with their Chen Li2015-10-281-71/+0
| | | | | | initial values from loop preheader", because it broke some bots. llvm-svn: 251498
* [IndVarSimplify] Rewrite loop exit values with their initial values from ↵Chen Li2015-10-281-0/+71
| | | | | | | | | | | | | | | | | loop preheader Summary: This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13974 llvm-svn: 251492
* [SimplifyCFG] Don't DCE catchret because the successor is unreachableDavid Majnemer2015-10-271-2/+1
| | | | | | | CatchReturnInst has side-effects: it runs a destructor. This destructor could conceivably run forever/call exit/etc. and should not be removed. llvm-svn: 251461
* Whitespace.NAKAMURA Takumi2015-10-271-1/+1
| | | | llvm-svn: 251437
* Revert r251291, "Loop Vectorizer - skipping "bitcast" before GEP"NAKAMURA Takumi2015-10-271-16/+3
| | | | | | | It causes miscompilation of llvm/lib/ExecutionEngine/Interpreter/Execution.cpp. See also PR25324. llvm-svn: 251436
* Tidy a comment. NFC.Diego Novillo2015-10-271-1/+1
| | | | llvm-svn: 251434
* [SLP] Be more aggressive about reduction width selection.Charlie Turner2015-10-271-12/+35
| | | | | | | | | | | | | | | | | | | Summary: This change could be way off-piste, I'm looking for any feedback on whether it's an acceptable approach. It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling which SLP was not vectorizing (on neither ARM nor X86) before this patch, but with it does vectorize. I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT. The more principled approach I thought of was to generate several candidate tree's and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something! Reviewers: nadav, jmolloy Subscribers: mssimpso, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D14116 llvm-svn: 251428
* [SLP] Try a bit harder to find reduction PHIsCharlie Turner2015-10-271-5/+43
| | | | | | | | | | | | | | | Summary: Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phi's whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads. There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT. Reviewers: jmolloy, mcrosier, nadav Subscribers: mssimpso, nadav, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14063 llvm-svn: 251425
* [SLP] Treat SelectInsts as reduction values.Charlie Turner2015-10-271-6/+7
| | | | | | | | | | | | | | | Summary: Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values. The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here. Reviewers: jmolloy, mzolotukhin, spatel, nadav Subscribers: nadav, mssimpso, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D13949 llvm-svn: 251424
* Fix SamplePGO segfault when debug info is missing.Diego Novillo2015-10-271-2/+4
| | | | | | | | | | When emitting a remark for a conditional branch annotation, the remark uses the line location information of the conditional branch in the message. In some cases, that information is unavailable and the optimization would segfaul. I'm still not sure whether this is a bug or WAI, but the optimizer should not die because of this. llvm-svn: 251420
* [SimplifyLibCalls] Use range-based loop. No functional change.Davide Italiano2015-10-271-4/+2
| | | | llvm-svn: 251383
* [function-attrs] Refactor code to handle shorter code with early exits.Chandler Carruth2015-10-271-31/+37
| | | | | | | | | | | No functionality changed here, but the indentation is substantially reduced and IMO the code is much easier to read. I've also added some helpful comments. This is just a clean-up I wrote while studying the code, and that has been in my backlog for a while. llvm-svn: 251381
* Remove unused local variable. NFC.Diego Novillo2015-10-261-2/+0
| | | | llvm-svn: 251344
* [RS4GC] Strip noalias attribute after statepoint rewriteIgor Laevsky2015-10-261-1/+4
| | | | | | | | | We should remove noalias along with dereference and dereference_or_null attributes because statepoint could potentially touch the entire heap including noalias objects. Differential Revision: http://reviews.llvm.org/D14032 llvm-svn: 251333
* SamplePGO - Add optimization reports.Diego Novillo2015-10-261-6/+30
| | | | | | | | | | | | | | | | | | | | | | | This adds a couple of optimization remarks to the SamplePGO transformation. When it decides to inline a hot function (to mimic the inline stack and repeat useful inline decisions in the original build). It will also report branch destinations. For instance, given the code fragment: 6 if (i < 1000) 7 sum -= i; 8 else 9 sum += -i * rand(); If the 'else' branch is taken most of the time, building this code with -Rpass=sample-profile will produce: a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile] sum += -i * rand(); ^ llvm-svn: 251330
* Move the canonical header to the top of its matching cpp file as per coding ↵David Blaikie2015-10-261-1/+2
| | | | | | | | | convention This ensures that the header will be verified to be standalone (and avoid mistakes like the one fixed in r251178) llvm-svn: 251326
* [safestack] Fast access to the unsafe stack pointer on AArch64/Android.Evgeniy Stepanov2015-10-261-47/+25
| | | | | | | | | | | | | | | | | | | | | Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324
* Refactor: Simplify boolean conditional return statements in ↵Alexey Samsonov2015-10-261-4/+2
| | | | | | | | | | | | lib/Transforms/Instrumentation Summary: Use clang-tidy to simplify boolean conditional return statements. Differential Revision: http://reviews.llvm.org/D9996 Patch by Richard (legalize@xmission.com)! llvm-svn: 251318
* Loop Vectorizer - skipping "bitcast" before GEPElena Demikhovsky2015-10-261-3/+16
| | | | | | | | | | Vectorization of memory instruction (Load/Store) is possible when the pointer is coming from GEP. The GEP analysis allows to estimate the profit. In some cases we have a "bitcast" between GEP and memory instruction. I added code that skips the "bitcast". http://reviews.llvm.org/D13886 llvm-svn: 251291
* [InstCombine] Teach instcombine not to create extra PHI nodes when folding GEPsSilviu Baranga2015-10-261-1/+6
| | | | | | | | | | | | | | | | | | Summary: InstCombine tries to transform GEP(PHI(GEP1, GEP2, ..)) into GEP(GEP(PHI(...)) when possible. However, this may leave the old PHI node around. Even if we do end up folding the GEPs, having an extra PHI node might not be beneficial. This change makes the transformation more conservative. We now only do this if the PHI has only one use, and can therefore be removed after the transformation. Reviewers: jmolloy, majnemer Subscribers: mcrosier, mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D13887 llvm-svn: 251281
* Convert assert(false) into llvm_unreachable where it makes sense.Benjamin Kramer2015-10-251-1/+1
| | | | llvm-svn: 251266
* [LCSSA] Unbreak build, don't reuse L; NFCSanjoy Das2015-10-251-2/+2
| | | | | | The build broke in r251248. llvm-svn: 251251
* [LCSSA] Use range for loops; NFCSanjoy Das2015-10-251-28/+21
| | | | llvm-svn: 251248
* Refactor: Simplify boolean conditional return statements in ↵Michael Zolotukhin2015-10-242-11/+5
| | | | | | | | | | | | lib/Transforms/Vectorize (NFC). Summary: Use clang-tidy to simplify boolean conditional return statements Differential Revision: http://reviews.llvm.org/D10003 Patch by Richard<legalize@xmission.com> llvm-svn: 251206
* ScalarReplAggregates.cpp: Try to appease clash of anonymous::SROA in modules ↵NAKAMURA Takumi2015-10-241-0/+1
| | | | | | build. llvm-svn: 251181
* [RS4GC] Rename stripDereferenceabilityInfo into stripNonValidAttributes.Igor Laevsky2015-10-231-18/+18
| | | | llvm-svn: 251157
* Revert rL251061 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial ↵Chen Li2015-10-231-65/+11
| | | | | | landing pad. llvm-svn: 251149
* Handle non-constant shifts in computeKnownBits, and use computeKnownBits for ↵Hal Finkel2015-10-231-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constant folding in InstCombine/Simplify First, the motivation: LLVM currently does not realize that: ((2072 >> (L == 0)) >> 7) & 1 == 0 where L is some arbitrary value. Whether you right-shift 2072 by 7 or by 8, the lowest-order bit is always zero. There are obviously several ways to go about fixing this, but the generic solution pursued in this patch is to teach computeKnownBits something about shifts by a non-constant amount. Previously, we would give up completely on these. Instead, in cases where we know something about the low-order bits of the shift-amount operand, we can combine (and together) the associated restrictions for all shift amounts consistent with that knowledge. As a further generalization, I refactored all of the logic for all three kinds of shifts to have this capability. This works well in the above case, for example, because the dynamic shift amount can only be 0 or 1, and thus we can say a lot about the known bits of the result. This brings us to the second part of this change: Even when we know all of the bits of a value via computeKnownBits, nothing used to constant-fold the result. This introduces the necessary code into InstCombine and InstSimplify. I've added it into both because: 1. InstCombine won't automatically pick up the associated logic in InstSimplify (InstCombine uses InstSimplify, but not via the API that passes in the original instruction). 2. Putting the logic in InstCombine allows the resulting simplifications to become part of the iterative worklist 3. Putting the logic in InstSimplify allows the resulting simplifications to be used by everywhere else that calls SimplifyInstruction (inlining, unrolling, and many others). And this requires a small change to our definition of an ephemeral value so that we don't break the rest case from r246696 (where the icmp feeding the @llvm.assume, is also feeding a br). Under the old definition, the icmp would not be considered ephemeral (because it is used by the br), but this causes the assume to remove itself (in addition to simplifying the branch structure), and it seems more-useful to prevent that from happening. llvm-svn: 251146
* GVN: don't try to replace instruction with itself.Tim Northover2015-10-231-5/+9
| | | | | | | | | | | After some look-ahead PRE was added for GEPs, an instruction could end up in the table of candidates before it was actually inspected. When this happened the pass might decide it was the best candidate to replace itself. This didn't go well. Should fix PR25291 llvm-svn: 251145
* [Inliner] Don't inline through callsites with operand bundlesSanjoy Das2015-10-231-0/+4
| | | | | | | | | | | | | | | | Summary: This change teaches the LLVM inliner to not inline through callsites with unknown operand bundles. Currently all operand bundles are "unknown" operand bundles but in the near future we will add support for inlining through some select kinds of operand bundles. Reviewers: reames, chandlerc, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14001 llvm-svn: 251141
* Add more intrumentation/runtime helper interfaces (NFC)Xinliang David Li2015-10-231-25/+17
| | | | | | | | | | | This patch converts the remaining references to literal strings for names of profile runtime entites (such as profile runtime hook, runtime hook use function, profile init method, register function etc). Also added documentation for all the new interfaces. llvm-svn: 251093
* SLPVectorizer: AllSameOpcode* starts "true" only for instructionsMehdi Amini2015-10-231-3/+4
| | | | | | | r251085 wasn't as NFC as intended... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251087
* SLPVectorizer: refactor reorderInputsAccordingToOpcode (NFC)Mehdi Amini2015-10-231-52/+81
| | | | | | | This is intended to simplify the changes needed to solve PR25247. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251085
* LoopPass: Simplify the API for adding a new loop. NFCJustin Bogner2015-10-221-5/+4
| | | | | | | The insertLoop() API is only used to add new loops, and has confusing ownership semantics. Simplify it by replacing it with addLoop(). llvm-svn: 251064
* [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.Chen Li2015-10-221-11/+65
| | | | | | | | | | | | Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction. Reviewers: hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13718 llvm-svn: 251061
* Add helper functions and remove hard coded references to instProf related ↵Xinliang David Li2015-10-221-12/+16
| | | | | | | | | | | | name/name-prefixes This is a clean up patch that defines instr prof section and variable name prefixes in a common header with access helper functions. clang FE change will be done as a follow up once this patch is in. Differential Revision: http://reviews.llvm.org/D13919 llvm-svn: 251058
* [Sink] Don't check BB.empty()David Majnemer2015-10-221-1/+1
| | | | | | | | As an invariant, BasicBlocks cannot be empty when passed to a transform. This is not the case for MachineBasicBlocks and the Sink pass was ported from the MachineSink pass which would explain the check's existence. llvm-svn: 251057
* [ASan] Enable instrumentation of dynamic allocas by default.Alexey Samsonov2015-10-221-1/+1
| | | | llvm-svn: 251056
* [ASan] Minor fixes to dynamic allocas handling:Alexey Samsonov2015-10-221-12/+11
| | | | | | | | | | | | | | | * Don't instrument promotable dynamic allocas: We already have a test that checks that promotable dynamic allocas are ignored, as well as static promotable allocas. Make sure this test will still pass if/when we enable dynamic alloca instrumentation by default. * Handle lifetime intrinsics before handling dynamic allocas: lifetime intrinsics may refer to dynamic allocas, so we need to emit instrumentation before these dynamic allocas would be replaced. Differential Revision: http://reviews.llvm.org/D12704 llvm-svn: 251045
* Use ArrayRef instead of pointer and size. NFCCraig Topper2015-10-221-2/+1
| | | | llvm-svn: 251029
OpenPOWER on IntegriCloud