summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [SimplifyCFG] Don't DCE catchret because the successor is unreachableDavid Majnemer2015-10-271-2/+1
| | | | | | | CatchReturnInst has side-effects: it runs a destructor. This destructor could conceivably run forever/call exit/etc. and should not be removed. llvm-svn: 251461
* Whitespace.NAKAMURA Takumi2015-10-271-1/+1
| | | | llvm-svn: 251437
* Revert r251291, "Loop Vectorizer - skipping "bitcast" before GEP"NAKAMURA Takumi2015-10-271-16/+3
| | | | | | | It causes miscompilation of llvm/lib/ExecutionEngine/Interpreter/Execution.cpp. See also PR25324. llvm-svn: 251436
* Tidy a comment. NFC.Diego Novillo2015-10-271-1/+1
| | | | llvm-svn: 251434
* [SLP] Be more aggressive about reduction width selection.Charlie Turner2015-10-271-12/+35
| | | | | | | | | | | | | | | | | | | Summary: This change could be way off-piste, I'm looking for any feedback on whether it's an acceptable approach. It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling which SLP was not vectorizing (on neither ARM nor X86) before this patch, but with it does vectorize. I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT. The more principled approach I thought of was to generate several candidate tree's and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something! Reviewers: nadav, jmolloy Subscribers: mssimpso, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D14116 llvm-svn: 251428
* [SLP] Try a bit harder to find reduction PHIsCharlie Turner2015-10-271-5/+43
| | | | | | | | | | | | | | | Summary: Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phi's whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads. There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT. Reviewers: jmolloy, mcrosier, nadav Subscribers: mssimpso, nadav, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14063 llvm-svn: 251425
* [SLP] Treat SelectInsts as reduction values.Charlie Turner2015-10-271-6/+7
| | | | | | | | | | | | | | | Summary: Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values. The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here. Reviewers: jmolloy, mzolotukhin, spatel, nadav Subscribers: nadav, mssimpso, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D13949 llvm-svn: 251424
* Fix SamplePGO segfault when debug info is missing.Diego Novillo2015-10-271-2/+4
| | | | | | | | | | When emitting a remark for a conditional branch annotation, the remark uses the line location information of the conditional branch in the message. In some cases, that information is unavailable and the optimization would segfaul. I'm still not sure whether this is a bug or WAI, but the optimizer should not die because of this. llvm-svn: 251420
* [SimplifyLibCalls] Use range-based loop. No functional change.Davide Italiano2015-10-271-4/+2
| | | | llvm-svn: 251383
* [function-attrs] Refactor code to handle shorter code with early exits.Chandler Carruth2015-10-271-31/+37
| | | | | | | | | | | No functionality changed here, but the indentation is substantially reduced and IMO the code is much easier to read. I've also added some helpful comments. This is just a clean-up I wrote while studying the code, and that has been in my backlog for a while. llvm-svn: 251381
* Remove unused local variable. NFC.Diego Novillo2015-10-261-2/+0
| | | | llvm-svn: 251344
* [RS4GC] Strip noalias attribute after statepoint rewriteIgor Laevsky2015-10-261-1/+4
| | | | | | | | | We should remove noalias along with dereference and dereference_or_null attributes because statepoint could potentially touch the entire heap including noalias objects. Differential Revision: http://reviews.llvm.org/D14032 llvm-svn: 251333
* SamplePGO - Add optimization reports.Diego Novillo2015-10-261-6/+30
| | | | | | | | | | | | | | | | | | | | | | | This adds a couple of optimization remarks to the SamplePGO transformation. When it decides to inline a hot function (to mimic the inline stack and repeat useful inline decisions in the original build). It will also report branch destinations. For instance, given the code fragment: 6 if (i < 1000) 7 sum -= i; 8 else 9 sum += -i * rand(); If the 'else' branch is taken most of the time, building this code with -Rpass=sample-profile will produce: a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile] sum += -i * rand(); ^ llvm-svn: 251330
* Move the canonical header to the top of its matching cpp file as per coding ↵David Blaikie2015-10-261-1/+2
| | | | | | | | | convention This ensures that the header will be verified to be standalone (and avoid mistakes like the one fixed in r251178) llvm-svn: 251326
* [safestack] Fast access to the unsafe stack pointer on AArch64/Android.Evgeniy Stepanov2015-10-261-47/+25
| | | | | | | | | | | | | | | | | | | | | Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324
* Refactor: Simplify boolean conditional return statements in ↵Alexey Samsonov2015-10-261-4/+2
| | | | | | | | | | | | lib/Transforms/Instrumentation Summary: Use clang-tidy to simplify boolean conditional return statements. Differential Revision: http://reviews.llvm.org/D9996 Patch by Richard (legalize@xmission.com)! llvm-svn: 251318
* Loop Vectorizer - skipping "bitcast" before GEPElena Demikhovsky2015-10-261-3/+16
| | | | | | | | | | Vectorization of memory instruction (Load/Store) is possible when the pointer is coming from GEP. The GEP analysis allows to estimate the profit. In some cases we have a "bitcast" between GEP and memory instruction. I added code that skips the "bitcast". http://reviews.llvm.org/D13886 llvm-svn: 251291
* [InstCombine] Teach instcombine not to create extra PHI nodes when folding GEPsSilviu Baranga2015-10-261-1/+6
| | | | | | | | | | | | | | | | | | Summary: InstCombine tries to transform GEP(PHI(GEP1, GEP2, ..)) into GEP(GEP(PHI(...)) when possible. However, this may leave the old PHI node around. Even if we do end up folding the GEPs, having an extra PHI node might not be beneficial. This change makes the transformation more conservative. We now only do this if the PHI has only one use, and can therefore be removed after the transformation. Reviewers: jmolloy, majnemer Subscribers: mcrosier, mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D13887 llvm-svn: 251281
* Convert assert(false) into llvm_unreachable where it makes sense.Benjamin Kramer2015-10-251-1/+1
| | | | llvm-svn: 251266
* [LCSSA] Unbreak build, don't reuse L; NFCSanjoy Das2015-10-251-2/+2
| | | | | | The build broke in r251248. llvm-svn: 251251
* [LCSSA] Use range for loops; NFCSanjoy Das2015-10-251-28/+21
| | | | llvm-svn: 251248
* Refactor: Simplify boolean conditional return statements in ↵Michael Zolotukhin2015-10-242-11/+5
| | | | | | | | | | | | lib/Transforms/Vectorize (NFC). Summary: Use clang-tidy to simplify boolean conditional return statements Differential Revision: http://reviews.llvm.org/D10003 Patch by Richard<legalize@xmission.com> llvm-svn: 251206
* ScalarReplAggregates.cpp: Try to appease clash of anonymous::SROA in modules ↵NAKAMURA Takumi2015-10-241-0/+1
| | | | | | build. llvm-svn: 251181
* [RS4GC] Rename stripDereferenceabilityInfo into stripNonValidAttributes.Igor Laevsky2015-10-231-18/+18
| | | | llvm-svn: 251157
* Revert rL251061 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial ↵Chen Li2015-10-231-65/+11
| | | | | | landing pad. llvm-svn: 251149
* Handle non-constant shifts in computeKnownBits, and use computeKnownBits for ↵Hal Finkel2015-10-231-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constant folding in InstCombine/Simplify First, the motivation: LLVM currently does not realize that: ((2072 >> (L == 0)) >> 7) & 1 == 0 where L is some arbitrary value. Whether you right-shift 2072 by 7 or by 8, the lowest-order bit is always zero. There are obviously several ways to go about fixing this, but the generic solution pursued in this patch is to teach computeKnownBits something about shifts by a non-constant amount. Previously, we would give up completely on these. Instead, in cases where we know something about the low-order bits of the shift-amount operand, we can combine (and together) the associated restrictions for all shift amounts consistent with that knowledge. As a further generalization, I refactored all of the logic for all three kinds of shifts to have this capability. This works well in the above case, for example, because the dynamic shift amount can only be 0 or 1, and thus we can say a lot about the known bits of the result. This brings us to the second part of this change: Even when we know all of the bits of a value via computeKnownBits, nothing used to constant-fold the result. This introduces the necessary code into InstCombine and InstSimplify. I've added it into both because: 1. InstCombine won't automatically pick up the associated logic in InstSimplify (InstCombine uses InstSimplify, but not via the API that passes in the original instruction). 2. Putting the logic in InstCombine allows the resulting simplifications to become part of the iterative worklist 3. Putting the logic in InstSimplify allows the resulting simplifications to be used by everywhere else that calls SimplifyInstruction (inlining, unrolling, and many others). And this requires a small change to our definition of an ephemeral value so that we don't break the rest case from r246696 (where the icmp feeding the @llvm.assume, is also feeding a br). Under the old definition, the icmp would not be considered ephemeral (because it is used by the br), but this causes the assume to remove itself (in addition to simplifying the branch structure), and it seems more-useful to prevent that from happening. llvm-svn: 251146
* GVN: don't try to replace instruction with itself.Tim Northover2015-10-231-5/+9
| | | | | | | | | | | After some look-ahead PRE was added for GEPs, an instruction could end up in the table of candidates before it was actually inspected. When this happened the pass might decide it was the best candidate to replace itself. This didn't go well. Should fix PR25291 llvm-svn: 251145
* [Inliner] Don't inline through callsites with operand bundlesSanjoy Das2015-10-231-0/+4
| | | | | | | | | | | | | | | | Summary: This change teaches the LLVM inliner to not inline through callsites with unknown operand bundles. Currently all operand bundles are "unknown" operand bundles but in the near future we will add support for inlining through some select kinds of operand bundles. Reviewers: reames, chandlerc, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14001 llvm-svn: 251141
* Add more intrumentation/runtime helper interfaces (NFC)Xinliang David Li2015-10-231-25/+17
| | | | | | | | | | | This patch converts the remaining references to literal strings for names of profile runtime entites (such as profile runtime hook, runtime hook use function, profile init method, register function etc). Also added documentation for all the new interfaces. llvm-svn: 251093
* SLPVectorizer: AllSameOpcode* starts "true" only for instructionsMehdi Amini2015-10-231-3/+4
| | | | | | | r251085 wasn't as NFC as intended... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251087
* SLPVectorizer: refactor reorderInputsAccordingToOpcode (NFC)Mehdi Amini2015-10-231-52/+81
| | | | | | | This is intended to simplify the changes needed to solve PR25247. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251085
* LoopPass: Simplify the API for adding a new loop. NFCJustin Bogner2015-10-221-5/+4
| | | | | | | The insertLoop() API is only used to add new loops, and has confusing ownership semantics. Simplify it by replacing it with addLoop(). llvm-svn: 251064
* [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.Chen Li2015-10-221-11/+65
| | | | | | | | | | | | Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction. Reviewers: hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13718 llvm-svn: 251061
* Add helper functions and remove hard coded references to instProf related ↵Xinliang David Li2015-10-221-12/+16
| | | | | | | | | | | | name/name-prefixes This is a clean up patch that defines instr prof section and variable name prefixes in a common header with access helper functions. clang FE change will be done as a follow up once this patch is in. Differential Revision: http://reviews.llvm.org/D13919 llvm-svn: 251058
* [Sink] Don't check BB.empty()David Majnemer2015-10-221-1/+1
| | | | | | | | As an invariant, BasicBlocks cannot be empty when passed to a transform. This is not the case for MachineBasicBlocks and the Sink pass was ported from the MachineSink pass which would explain the check's existence. llvm-svn: 251057
* [ASan] Enable instrumentation of dynamic allocas by default.Alexey Samsonov2015-10-221-1/+1
| | | | llvm-svn: 251056
* [ASan] Minor fixes to dynamic allocas handling:Alexey Samsonov2015-10-221-12/+11
| | | | | | | | | | | | | | | * Don't instrument promotable dynamic allocas: We already have a test that checks that promotable dynamic allocas are ignored, as well as static promotable allocas. Make sure this test will still pass if/when we enable dynamic alloca instrumentation by default. * Handle lifetime intrinsics before handling dynamic allocas: lifetime intrinsics may refer to dynamic allocas, so we need to emit instrumentation before these dynamic allocas would be replaced. Differential Revision: http://reviews.llvm.org/D12704 llvm-svn: 251045
* Use ArrayRef instead of pointer and size. NFCCraig Topper2015-10-221-2/+1
| | | | llvm-svn: 251029
* [SimplifyCFG] Don't use-after-free an SSA valueDavid Majnemer2015-10-211-1/+2
| | | | | | | | | SimplifyTerminatorOnSelect didn't consider the possibility that the condition might be related to one of PHI nodes. This fixes PR25267. llvm-svn: 250922
* Tolerate negative offset when matching sample profile.Dehao Chen2015-10-211-9/+20
| | | | | | | | In some cases (as illustrated in the unittest), lineno can be less than the heade_lineno because the function body are included from some other files. In this case, offset will be negative. This patch makes clang still able to match the profile to IR in this situation. http://reviews.llvm.org/D13914 llvm-svn: 250873
* [MemorySanitizer] NFC. Do not use GET_INTRINSIC_MODREF_BEHAVIOR table.Igor Laevsky2015-10-201-28/+3
| | | | | | | | | It is now possible to infer intrinsic modref behaviour purely from intrinsic attributes. This change will allow to completely remove GET_INTRINSIC_MODREF_BEHAVIOR table. Differential Revision: http://reviews.llvm.org/D13907 llvm-svn: 250860
* Fix missing INITIALIZE_PASS_DEPENDENCY for AddressSanitizerKeno Fischer2015-10-201-0/+1
| | | | | | | | | | | | | | Summary: In r231241, TargetLibraryInfoWrapperPass was added to `getAnalysisUsage` for `AddressSanitizer`, but the corresponding `INITIALIZE_PASS_DEPENDENCY` was not added. Reviewers: dvyukov, chandlerc, kcc Subscribers: kcc, llvm-commits Differential Revision: http://reviews.llvm.org/D13629 llvm-svn: 250813
* [RS4GC] Remove a redundant linear search, NFCISanjoy Das2015-10-201-2/+1
| | | | | | | Since LiveVariables is uniqued (we just created it from a `DenseSet`), `FindIndex(LiveVariables, LiveVariables[i])` is always `i`. llvm-svn: 250786
* [RS4GC] Clean up `find_index`; NFCSanjoy Das2015-10-201-11/+11
| | | | | | | - Bring it up to the LLVM Coding Style - Sink it inside `CreateGCRelocates`, which is its only user llvm-svn: 250785
* [RS4GC] Re-purpose `normalizeForInvokeSafepoint`; NFC.Sanjoy Das2015-10-201-9/+9
| | | | | | | | | | | | `normalizeForInvokeSafepoint` in RewriteStatepointsForGC.cpp, as it is written today, deals with `gc.relocate` and `gc.result` uses of a statepoint equally well. This change documents this fact and adds a test case. There is no functional change here -- only documentation of existing functionality. llvm-svn: 250784
* [RS4GC] Minor cleanup to `normalizeForInvokeSafepoint`; NFCSanjoy Das2015-10-201-3/+3
| | | | llvm-svn: 250783
* ObjCARC: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-195-53/+48
| | | | llvm-svn: 250756
* [InstCombine] Optimize icmp of inc/dec at RHSMichael Liao2015-10-191-0/+20
| | | | | | | | | | | | | | | | Allow LLVM to optimize the sequence like the following: %inc = add nsw i32 %i, 1 %cmp = icmp slt %n, %inc into: %cmp = icmp sle i32 %n, %i The case is not handled previously due to the complexity of compuation of %n. Hence, LLVM cannot swap operands of icmp accordingly. llvm-svn: 250746
* Vectorize: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-193-85/+91
| | | | | | | | | | | | | Besides the usual, I finally added an overload to `BasicBlock::splitBasicBlock()` that accepts an `Instruction*` instead of `BasicBlock::iterator`. Someone can go back and remove this overload later (after updating the callers I'm going to skip going forward), but the most common call seems to be `BB->splitBasicBlock(BB->getTerminator(), ...)` and I'm not sure it's better to add `->getIterator()` to every one than have the overload. It's pretty hard to get the usage wrong. llvm-svn: 250745
* Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore().Elena Demikhovsky2015-10-191-2/+2
| | | | | | | | | | Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case. Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces. Differential Revision: http://reviews.llvm.org/D13850 llvm-svn: 250686
OpenPOWER on IntegriCloud