summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [LoopVectorize] Address post-commit feedback on r250032James Molloy2015-11-091-3/+4
| | | | | | | | | | Implemented as many of Michael's suggestions as were possible: * clang-format the added code while it is still fresh. * tried to change Value* to Instruction* in many places in computeMinimumValueSizes - unfortunately there are several places where Constants need to be handled so this wasn't possible. * Reduce the pass list on loop-vectorization-factors.ll. * Fix a bug where we were querying MinBWs for I->getOperand(0) but using MinBWs[I]. llvm-svn: 252469
* Allow LLE/LD and the loop versioning infrastructure to use SCEV predicatesSilviu Baranga2015-11-093-28/+90
| | | | | | | | | | | | | | | | | | | Summary: LAA currently generates a set of SCEV predicates that must be checked by users. In the case of Loop Distribute/Loop Load Elimination, no such predicates could have been emitted, since we don't allow stride versioning. However, in the future there could be SCEV predicates that will need to be checked. This change adds support for SCEV predicate versioning in the Loop Distribute, Loop Load Eliminate and the loop versioning infrastructure. Reviewers: anemet Subscribers: mssimpso, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14240 llvm-svn: 252467
* [LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad predsDavid Majnemer2015-11-081-0/+3
| | | | | | | | We cannot really insert fixup code into a PHI's predecessor. This fixes PR25445. llvm-svn: 252416
* Unbreak the buildSanjoy Das2015-11-071-1/+1
| | | | | | | My code clashed with some ilist iterator changes upstream. Fix by adding an explicit "&*" coercion. llvm-svn: 252392
* [FunctionAttrs] Add comment and clarify assertion message; NFCSanjoy Das2015-11-071-1/+6
| | | | llvm-svn: 252389
* [FunctionAttrs] Add handling for operand bundlesSanjoy Das2015-11-071-4/+31
| | | | | | | | | | | | | | Summary: Teach the FunctionAttrs to do the right thing for IR with operand bundles. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14408 llvm-svn: 252387
* [FunctionAttrs] Fix an iterator wraparound bugSanjoy Das2015-11-071-18/+19
| | | | | | | | | | | | | | | | | | | Summary: This change fixes an iterator wraparound bug in `determinePointerReadAttrs`. Ideally, ++'ing off the `end()` of an iplist should result in a failed assert, but currently iplist seems to silently wrap to the head of the list on `end()++`. This is why the bad behavior is difficult to demonstrate. Reviewers: chandlerc, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14350 llvm-svn: 252386
* [InstCombine] Teach FoldPHIArgZextsIntoPHI about EHPadsDavid Majnemer2015-11-071-0/+6
| | | | | | | | FoldPHIArgZextsIntoPHI cannot insert an instruction after the PHI if there is an EHPad in the BB. Doing so would result in an instruction inserted after a terminator. llvm-svn: 252377
* ADT: Remove last implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-11-073-6/+7
| | | | | | | | | | Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371
* [InstCombine] Don't insert an instruction after a terminatorDavid Majnemer2015-11-061-0/+6
| | | | | | | | We tried to insert a cast of a phi in a block whose terminator is an EHPad. This is invalid. Do not attempt the transform in these circumstances. llvm-svn: 252370
* Add 'notail' marker for call instructions.Akira Hatanaka2015-11-061-2/+4
| | | | | | | | | | | | This marker prevents optimization passes from adding 'tail' or 'musttail' markers to a call. Is is used to prevent tail call optimization from being performed on the call. rdar://problem/22667622 Differential Revision: http://reviews.llvm.org/D12923 llvm-svn: 252368
* [InstCombine] Don't RAUW tokens with undefDavid Majnemer2015-11-061-2/+3
| | | | | | Let SimplifyCFG remove unreachable BBs which define token instructions. llvm-svn: 252343
* [SimplifyLibCalls] Don't hardcode the function name.Davide Italiano2015-11-061-1/+2
| | | | llvm-svn: 252342
* Fix SLPVectorizer commutativity reorderingMehdi Amini2015-11-061-76/+69
| | | | | | | | | | | | | | | | | | | The SLPVectorizer had a very crude way of trying to benefit from associativity: it tried to optimize for splat/broadcast or in order to have the same operator on the same side. This is benefitial to the cost model and allows more vectorization to occur. This patch improve the logic and make the detection optimal (locally, we don't look at the full tree but only at the immediate children). Should fix https://llvm.org/bugs/show_bug.cgi?id=25247 Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D13996 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 252337
* [ValueTracking] Add parameters to isImpliedCondition; NFCSanjoy Das2015-11-062-2/+4
| | | | | | | | | | | | | | | | Summary: This change makes the `isImpliedCondition` interface similar to the rest of the functions in ValueTracking (in that it takes a DataLayout, AssumptionCache etc.). This is an NFC, intended to make a later diff less noisy. Depends on D14369 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14391 llvm-svn: 252333
* [LIR] Simplify code by making DataLayout globally accessible. NFC.Chad Rosier2015-11-061-11/+10
| | | | llvm-svn: 252317
* DI: Reverse direction of subprogram -> function edge.Peter Collingbourne2015-11-056-52/+24
| | | | | | | | | | | | | | | | | | | | | | | Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219
* [ASan] Disable instrumentation for inalloca variables.Alexey Samsonov2015-11-051-1/+4
| | | | | | | | inalloca variables were not treated as static allocas, therefore didn't participate in regular stack instrumentation. We don't want them to participate in dynamic alloca instrumentation as well. llvm-svn: 252213
* [SimplifyLibCalls] Use hasFloatVersion(). NFCI.Davide Italiano2015-11-051-15/+13
| | | | llvm-svn: 252186
* [SimplifyCFG] Tweak heuristic for merging conditional storesJames Molloy2015-11-051-7/+13
| | | | | | We were correctly skipping dbginfo intrinsics and terminators, but the initial bailout wasn't, causing it to bail out on almost any block. llvm-svn: 252152
* [FunctionAttrs] Remove a loop, NFC refactorSanjoy Das2015-11-051-16/+14
| | | | | | | | | | | | | | | | | | | | | | Summary: Remove the loop over the uses of the CallSite in ArgumentUsesTracker. Since we have the `Use *` for actual argument operand, we can just use pointer subtraction. The time complexity remains the same though (except for a vararg argument) -- `std::advance` is O(UseIndex) for the ArgumentList iterator. The real motivation is to make a later change adding support for operand bundles simpler. Reviewers: reames, chandlerc, nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14363 llvm-svn: 252141
* [PGO] Use template file to define runtime structuresXinliang David Li2015-11-051-26/+24
| | | | | | | | | | | With this change, instrumentation code and reader/write code related to profile data structs are kept strictly in-sync. THis will be extended to cfe and compile-rt references as well. Differential Revision: http://reviews.llvm.org/D13843 llvm-svn: 252113
* [SimplifyLibCalls] New transformation: tan(atan(x)) -> xDavide Italiano2015-11-041-1/+38
| | | | | | | | | | | | | | | | | | | | This is enabled only under -ffast-math. So, instead of emitting: 4007b0: 50 push %rax 4007b1: e8 8a fd ff ff callq 400540 <atanf@plt> 4007b6: 58 pop %rax 4007b7: e9 94 fd ff ff jmpq 400550 <tanf@plt> 4007bc: 0f 1f 40 00 nopl 0x0(%rax) for: float mytan(float x) { return tanf(atanf(x)); } we emit a single retq. Differential Revision: http://reviews.llvm.org/D14302 llvm-svn: 252098
* Fix some Clang-tidy modernize warnings, other minor fixes.Eugene Zelenko2015-11-043-17/+15
| | | | | | | | Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg. Differential revision: http://reviews.llvm.org/D14312 llvm-svn: 252087
* [SimplifyCFG] Merge conditional storesJames Molloy2015-11-041-3/+312
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code: if (c & flag1) *a = x; if (c & flag2) *a = y; ... There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to. It is, however, legal to merge the stores together and do the store once: tmp = undef; if (c & flag1) tmp = x; if (c & flag2) tmp = y; if (c & flag1 || c & flag2) *a = tmp; The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too. This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure. The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate. This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite. llvm-svn: 252051
* [CVP] Fold return values if possiblePhilip Reames2015-11-041-0/+51
| | | | | | | | | | | | In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being agressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns. The motivation for this is two fold: * The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified. * It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of it's block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particular important when combined with http://reviews.llvm.org/D14263. Differential Revision: http://reviews.llvm.org/D14271 llvm-svn: 252032
* Fix unused variable warning from r252017Adam Nemet2015-11-041-3/+2
| | | | llvm-svn: 252019
* LLE 6/6: Add LoopLoadElimination passAdam Nemet2015-11-034-0/+566
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The goal of this pass is to perform store-to-load forwarding across the backedge of a loop. E.g.: for (i) A[i + 1] = A[i] + B[i] => T = A[0] for (i) T = T + B[i] A[i + 1] = T The pass relies on loop dependence analysis via LoopAccessAnalisys to find opportunities of loop-carried dependences with a distance of one between a store and a load. Since it's using LoopAccessAnalysis, it was easy to also add support for versioning away may-aliasing intervening stores that would otherwise prevent this transformation. This optimization is also performed by Load-PRE in GVN without the option of multi-versioning. As was discussed with Daniel Berlin in http://reviews.llvm.org/D9548, this is inferior to a more loop-aware solution applied here. Hopefully, we will be able to remove some complexity from GVN/MemorySSA as a consequence. In the long run, we may want to extend this pass (or create a new one if there is little overlap) to also eliminate loop-indepedent redundant loads and store that *require* versioning due to may-aliasing intervening stores/loads. I have some motivating cases for store elimination. My plan right now is to wait for MemorySSA to come online first rather than using memdep for this. The main motiviation for this pass is the 456.hmmer loop in SPECint2006 where after distributing the original loop and vectorizing the top part, we are left with the critical path exposed in the bottom loop. Being able to promote the memory dependence into a register depedence (even though the HW does perform store-to-load fowarding as well) results in a major gain (~20%). This gain also transfers over to x86: it's around 8-10%. Right now the pass is off by default and can be enabled with -enable-loop-load-elim. On the LNT testsuite, there are two performance changes (negative number -> improvement): 1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the critical paths is reduced 2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this outside of LNT The pass is scheduled after the loop vectorizer (which is after loop distribution). The rational is to try to reuse LAA state, rather than recomputing it. The order between LV and LLE is not critical because normally LV does not touch scalar st->ld forwarding cases where vectorizing would inhibit the CPU's st->ld forwarding to kick in. LoopLoadElimination requires LAA to provide the full set of dependences (including forward dependences). LAA is known to omit loop-independent dependences in certain situations. The big comment before removeDependencesFromMultipleStores explains why this should not occur for the cases that we're interested in. Reviewers: dberlin, hfinkel Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13259 llvm-svn: 252017
* InstCombine: fix sinking of convergent callsFiona Glaser2015-11-031-0/+6
| | | | llvm-svn: 251991
* [LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFCAdam Nemet2015-11-031-6/+5
| | | | | | | | | | | | | | Summary: We now collect all types of dependences including lexically forward deps not just "interesting" ones. Reviewers: hfinkel Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13256 llvm-svn: 251985
* [SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)Davide Italiano2015-11-031-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | This one is enabled only under -ffast-math (due to rounding/overflows) but allows us to emit shorter code. Before (on FreeBSD x86-64): 4007f0: 50 push %rax 4007f1: f2 0f 11 0c 24 movsd %xmm1,(%rsp) 4007f6: e8 75 fd ff ff callq 400570 <exp2@plt> 4007fb: f2 0f 10 0c 24 movsd (%rsp),%xmm1 400800: 58 pop %rax 400801: e9 7a fd ff ff jmpq 400580 <pow@plt> 400806: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40080d: 00 00 00 After: 4007b0: f2 0f 59 c1 mulsd %xmm1,%xmm0 4007b4: e9 87 fd ff ff jmpq 400540 <exp2@plt> 4007b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Differential Revision: http://reviews.llvm.org/D14045 llvm-svn: 251976
* LoopVectorizer - skip 'bitcast' between GEP and load.Elena Demikhovsky2015-11-031-2/+28
| | | | | | | | | | | | Skipping 'bitcast' in this case allows to vectorize load: %arrayidx = getelementptr inbounds double*, double** %in, i64 %indvars.iv %tmp53 = bitcast double** %arrayidx to i64* %tmp54 = load i64, i64* %tmp53, align 8 Differential Revision http://reviews.llvm.org/D14112 llvm-svn: 251907
* Revert "[IndVarSimplify] Rewrite loop exit values with their initial values ↵Tobias Grosser2015-11-031-73/+0
| | | | | | | | | | | | | | | | | | | from loop preheader" Commit 251839 triggers miscompiles on some bots: http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723 (The commit is listed in 13722, but due to an existing failure introduced in 13721 and reverted in 13723 the failure is only visible in 13723) To verify r251839 is indeed the only change that triggered the buildbot failures and to ensure the buildbots remain green while investigating I temporarily revert this commit. At the current state it is unclear if this commit introduced some miscompile or if it only exposed code to Polly that is subsequently miscompiled by Polly. llvm-svn: 251901
* Restore "Support for ThinLTO function importing and symbol linking."Teresa Johnson2015-11-031-0/+39
| | | | | | | This restores commit r251837, with the new library dependence added to llvm-link/Makefile to address bot failures. llvm-svn: 251866
* [SimplifyLibCalls] Remove variables that are not used. NFC.Davide Italiano2015-11-021-7/+2
| | | | llvm-svn: 251852
* Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using ↵Cong Hou2015-11-021-28/+102
| | | | | | | | | | | | larger vectorization factor. To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers. This is the second attempt to submit this patch. The first attempt got a test failure on ARM. This patch is updated to try to fix the failure (more specifically, by handling the case when VF=1). Differential revision: http://reviews.llvm.org/D8943 llvm-svn: 251850
* don't repeat function names in comments; NFCSanjay Patel2015-11-021-2/+2
| | | | llvm-svn: 251846
* [SimplifyLibCalls] Merge two if statements. NFC.Davide Italiano2015-11-021-4/+1
| | | | llvm-svn: 251845
* Revert "Support for ThinLTO function importing and symbol linking."Teresa Johnson2015-11-021-39/+0
| | | | | | | | | | | | | | | | | | | | This reverts commit r251837, due to a number of bot failures of the form: /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined reference to 'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::Module const*, bool)' /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()' I'm not sure why these are happening - I added Object to the requred libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these symbols come out of libLLVMObject.a. What am I missing? llvm-svn: 251841
* [IndVarSimplify] Rewrite loop exit values with their initial values from ↵Chen Li2015-11-021-0/+73
| | | | | | | | | | | | | | | | | loop preheader Summary: This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13974 llvm-svn: 251839
* Support for ThinLTO function importing and symbol linking.Teresa Johnson2015-11-021-0/+39
| | | | | | | | | | | | | | | | | | | | | Summary: Support for necessary linkage changes and symbol renaming during ThinLTO function importing. Also includes llvm-link support for manually importing functions and associated llvm-link based tests. Note that this does not include support for intelligently importing metadata, which is currently imported duplicate times. That support will be in the follow-on patch, and currently is ignored by the tests. Reviewers: dexonsmith, joker.eph, davidxl Subscribers: tobiasvk, tejohnson, llvm-commits Differential Revision: http://reviews.llvm.org/D13515 llvm-svn: 251837
* StringRef-ify DiagnosticInfoSampleProfile::FilenameDavid Blaikie2015-11-021-3/+2
| | | | llvm-svn: 251823
* Clang format a few prior patches (NFC)Teresa Johnson2015-11-021-13/+14
| | | | | | | I had clang formatted my earlier patches using the wrong style. Reformatted with the LLVM style. llvm-svn: 251812
* Preserve load alignment and dereferenceable metadata during some transformationsArtur Pilipenko2015-11-024-7/+32
| | | | | | | | Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D13953 llvm-svn: 251809
* [SCEV][LV] Add SCEV Predicates and use them to re-implement stride versioningSilviu Baranga2015-11-021-97/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: SCEV Predicates represent conditions that typically cannot be derived from static analysis, but can be used to reduce SCEV expressions to forms which are usable for different optimizers. ScalarEvolution now has the rewriteUsingPredicate method which can simplify a SCEV expression using a SCEVPredicateSet. The normal workflow of a pass using SCEVPredicates would be to hold a SCEVPredicateSet and every time assumptions need to be made a new SCEV Predicate would be created and added to the set. Each time after calling getSCEV, the user will call the rewriteUsingPredicate method. We add two types of predicates SCEVPredicateSet - implements a set of predicates SCEVEqualPredicate - tests for equality between two SCEV expressions We use the SCEVEqualPredicate to re-implement stride versioning. Every time we version a stride, we will add a SCEVEqualPredicate to the context. Instead of adding specific stride checks, LoopVectorize now adds a more generic SCEV check. We only need to add support for this in the LoopVectorizer since this is the only pass that will do stride versioning. Reviewers: mzolotukhin, anemet, hfinkel, sanjoy Subscribers: sanjoy, hfinkel, rengolin, jmolloy, llvm-commits Differential Revision: http://reviews.llvm.org/D13595 llvm-svn: 251800
* Simplify a check. NFC.Davide Italiano2015-11-011-2/+2
| | | | llvm-svn: 251757
* [SimplifyLibCalls] Factor out other common code.Davide Italiano2015-10-311-21/+10
| | | | llvm-svn: 251754
* SamplePGO - Count sample records in embedded profiles when computing coverage.Diego Novillo2015-10-311-30/+54
| | | | | | | The initial coverage checking code for sample records failed to count records inside inlined profiles. This change fixes the oversight. llvm-svn: 251752
* [SimplifyLibCalls] Remove dead code.Davide Italiano2015-10-311-6/+0
| | | | llvm-svn: 251737
* [FunctionAttrs] Inline the prototype attribute inference to an existingChandler Carruth2015-10-311-21/+6
| | | | | | | | loop over the SCC. The separate function wasn't really adding much, NFC. llvm-svn: 251728
OpenPOWER on IntegriCloud