summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis
Commit message (Collapse)AuthorAgeFilesLines
...
* [LAA] Hide NeedRTCheck logic completely inside canCheckPtrAtRT, NFCAdam Nemet2015-07-091-31/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | Currently canCheckPtrAtRT returns two flags NeedRTCheck and CanDoRT. NeedRTCheck says whether we need checks and CanDoRT whether we can generate the checks. The idea is to encode three states with these: Need/Can: (1) false/dont-care: no checks are needed (2) true/false: we need checks but can't generate them (3) true/true: we need checks and we can generate them This is pretty unnecessary since the caller (analyzeLoop) is only interested in whether we can generate the checks if we actually need them (i.e. 1 or 3). So this change cleans up to return just that (CanDoRTIfNeeded) and pulls all the underlying logic into canCheckPtrAtRT. By doing all this, we simplify analyzeLoop which is the complex function in LAA. There is further room for improvement here by using RtCheck.Need directly rather than a new local variable NeedRTCheck but that's for a later patch. llvm-svn: 241866
* Don't rely on the DepCands iteration order when constructing checking ↵Silviu Baranga2015-07-091-4/+26
| | | | | | | | | | | | | | | | | | | pointer groups Summary: The checking pointer group construction algorithm relied on the iteration on DepCands. We would need the same leaders across runs and the same iteration order over the underlying std::set for determinism. This changes the algorithm to process the pointers in the order in which they were added to the runtime check, which is deterministic. We need to update the tests, since the order in which pointers appear has changed. No new tests were added, since it is impossible to test for non-determinism. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11064 llvm-svn: 241809
* [LAA] Fix line break in commentAdam Nemet2015-07-091-1/+1
| | | | llvm-svn: 241785
* [LAA] Rename IsRTNeeded to IsRTCheckAnalysisNeededAdam Nemet2015-07-091-6/+17
| | | | | | | | | | | | The original name was too close to NeedRTCheck which is what the actual memcheck analysis returns. This flag, as the new name suggests, is only used to whether to initiate that analysis. Also a comment is added to answer one question I had about this code for a long time. Namely, how does this flag differ from isDependencyCheckNeeded since they are seemingly set at the same time. llvm-svn: 241784
* Make TargetTransformInfo keeping a reference to the Module DataLayoutMehdi Amini2015-07-091-3/+3
| | | | | | | | | | | | | | | | | | | | DataLayout is no longer optional. It was initialized with or without a DataLayout, and the DataLayout when supplied could have been the one from the TargetMachine. Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11021 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241774
* [LAA] Fix misleading use of word 'consecutive'Adam Nemet2015-07-091-3/+3
| | | | | | | Fix some places where the word consecutive is used but the code really means constant-stride (i.e. not just unit stride). llvm-svn: 241763
* [LAA] Revert a small part of r239295Adam Nemet2015-07-081-6/+20
| | | | | | | | | | | | | | | | This commit ([LAA] Fix estimation of number of memchecks) regressed the logic a bit. We shouldn't quit the analysis if we encounter a pointer without known bounds *unless* we actually need to emit a memcheck for it. The original code was using NumComparisons which is now computed differently. Instead I compute NeedRTCheck from NumReadPtrChecks and NumWritePtrChecks. As side note, I find the separation of NeedRTCheck and CanDoRT confusing, so I will try to merge them in a follow-up patch. llvm-svn: 241756
* [LAA] Add missing debug output after r239285Adam Nemet2015-07-081-1/+3
| | | | | | | | | r239285 ([LoopAccessAnalysis] Teach LAA to check the memory dependence between strided accesses.) introduced a new case under MemoryDepChecker::isDependent. We normally have debug output for each case. llvm-svn: 241707
* [LAA] Merge memchecks for accesses separated by a constant offsetSilviu Baranga2015-07-081-38/+215
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Often filter-like loops will do memory accesses that are separated by constant offsets. In these cases it is common that we will exceed the threshold for the allowable number of checks. However, it should be possible to merge such checks, sice a check of any interval againt two other intervals separated by a constant offset (a,b), (a+c, b+c) will be equivalent with a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap. Assuming the loop will be executed for a sufficient number of iterations, this will be true. If not true, checking against (a, b+c) is still safe (although not equivalent). As long as there are no dependencies between two accesses, we can merge their checks into a single one. We use this technique to construct groups of accesses, and then check the intervals associated with the groups instead of checking the accesses directly. Reviewers: anemet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10386 llvm-svn: 241673
* Allow constfolding of llvm.sin.* and llvm.cos.* intrinsicsKarthik Bhat2015-07-081-0/+6
| | | | | | | | This patch const folds llvm.sin.* and llvm.cos.* intrinsics whenever feasible. Differential Revision: http://reviews.llvm.org/D10836 llvm-svn: 241665
* Rename llvm.frameescape and llvm.framerecover to localescape and localrecoverReid Kleckner2015-07-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Initially, these intrinsics seemed like part of a family of "frame" related intrinsics, but now I think that's more confusing than helpful. Initially, the LangRef specified that this would create a new kind of allocation that would be allocated at a fixed offset from the frame pointer (EBP/RBP). We ended up dropping that design, and leaving the stack frame layout alone. These intrinsics are really about sharing local stack allocations, not frame pointers. I intend to go further and add an `llvm.localaddress()` intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being used to address locals, which should not be confused with the frame pointer. Naming suggestions at this point are welcome, I'm happy to re-run sed. Reviewers: majnemer, nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11011 llvm-svn: 241633
* IR: Do not consider available_externally linkage to be linker-weak.Peter Collingbourne2015-07-051-1/+1
| | | | | | | | | | | | | | | From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413
* Delete whitespace at start of line.Yaron Keren2015-07-021-1/+1
| | | | llvm-svn: 241265
* Add a routine to TargetTransformInfo that will allow targets to lookEric Christopher2015-07-022-4/+10
| | | | | | | at the attributes on a function to determine whether or not to allow inlining. llvm-svn: 241220
* Move delinearization from SCEVAddRecExpr to ScalarEvolutionTobias Grosser2015-06-293-21/+25
| | | | | | | | | | The expressions we delinearize do not necessarily have to have a SCEVAddRecExpr at the outermost level. At this moment, the additional flexibility is not exploited in LLVM itself, but in Polly we will soon soonish use this functionality. For LLVM, this change should not affect existing functionality (which is covered by test/Analysis/Delinearization/) llvm-svn: 240952
* Teach InlineCost to account for a null check which can be folded awayPhilip Reames2015-06-261-17/+56
| | | | | | | | | | If we have a caller that knows a particular argument can never be null, we can exploit this fact while simplifying values in the inline cost analysis. This has the effect of reducing the cost for inlining when a null check is present in the callee, but the value is known non null in the caller. In particular, any dependent control flow can be discounted from the cost estimate. Note that we use the parameter attributes at the call site to memoize the analysis within the caller's code. The setting of this attribute is done in InstCombine, the inline cost analysis just consumes it. This is intentional and important because we want the inline cost analysis results to be easily cachable themselves. We're not currently doing so, but initial results on LTO indicate this will quickly become important. Differential Revision: http://reviews.llvm.org/D9129 llvm-svn: 240828
* Move VectorUtils from Transforms to Analysis to correct layering violationDavid Blaikie2015-06-263-1/+215
| | | | llvm-svn: 240804
* [LAA] Try to prove non-wrapping of pointers if SCEV cannotAdam Nemet2015-06-261-1/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Scalar evolution does not propagate the non-wrapping flags to values that are derived from a non-wrapping induction variable because the non-wrapping property could be flow-sensitive. This change is a first attempt to establish the non-wrapping property in some simple cases. The main idea is to look through the operations defining the pointer. As long as we arrive to a non-wrapping AddRec via a small chain of non-wrapping instruction, the pointer should not wrap either. I believe that this essentially is what Andy described in http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way forward. Reviewers: aschwaighofer, nadav, sanjoy, atrick Reviewed By: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10472 llvm-svn: 240798
* Take alignment into account in isSafeToLoadUnconditionallyArtur Pilipenko2015-06-251-6/+20
| | | | | | | | Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D10475 llvm-svn: 240636
* [LSR] canonicalize Prod*(1<<C) to Prod<<CJingyue Wu2015-06-241-5/+12
| | | | | | | | | | | | | | | | | | | | | Summary: Because LSR happens at a late stage where mul of a power of 2 is typically canonicalized to shl, this canonicalization emits code that can be better CSE'ed. Test Plan: Transforms/LoopStrengthReduce/shl.ll shows how this change makes GVN more powerful. Fixes some existing tests due to this change. Reviewers: sanjoy, majnemer, atrick Reviewed By: majnemer, atrick Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D10448 llvm-svn: 240573
* [CaptureTracking] Avoid long compilation time on large basic blocksBruno Cardoso Lopes2015-06-242-25/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CaptureTracking becomes very expensive in large basic blocks while calling PointerMayBeCaptured. PointerMayBeCaptured scans the BB the number of times equal to the number of uses of 'BeforeHere', which is currently capped at 20 and bails out with Tracker->tooManyUses(). The bottleneck here is the number of calls to PointerMayBeCaptured * the basic block scan. In a testcase with a 82k instruction BB, PointerMayBeCaptured is called 130k times, leading to 'shouldExplore' taking 527k runs, this currently takes ~12min. To fix this we locally (within PointerMayBeCaptured) number the instructions in the basic block using a DenseMap to cache instruction positions/numbers. We build the cache incrementally every time we need to scan an unexplored part of the BB, improving compile time to only take ~2min. This triggers in the flow: DeadStoreElimination -> MepDepAnalysis -> CaptureTracking. Side note: after multiple runs in the test-suite I've seen no performance nor compile time regressions, but could note a couple of compile time improvements: Performance Improvements - Compile Time Delta Previous Current StdDev SingleSource/Benchmarks/Misc-C++/bigfib -4.48% 0.8547 0.8164 0.0022 MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl -1.47% 1.3912 1.3707 0.0056 Differential Revision: http://reviews.llvm.org/D7010 llvm-svn: 240560
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-2325-38/+38
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* [PM/AA] Hoist the AliasResult enum out of the AliasAnalysis class.Chandler Carruth2015-06-2213-100/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow classes to implement the AA interface without deriving from the class or referencing an internal enum of some other class as their return types. Also, to a pretty fundamental extent, concepts such as 'NoAlias', 'MayAlias', and 'MustAlias' are first class concepts in LLVM and we aren't saving anything by scoping them heavily. My mild preference would have been to use a scoped enum, but that feature is essentially completely broken AFAICT. I'm extremely disappointed. For example, we cannot through any reasonable[1] means construct an enum class (or analog) which has scoped names but converts to a boolean in order to test for the possibility of aliasing. [1]: Richard Smith came up with a "solution", but it requires class templates, and lots of boilerplate setting up the enumeration multiple times. Something like Boost.PP could potentially bundle this up, but even that would be quite painful and it doesn't seem realistically worth it. The enum class solution would probably work without the need for a bool conversion. Differential Revision: http://reviews.llvm.org/D10495 llvm-svn: 240255
* [PM/AA] Rework the names and comments in AliasSetTracker to moreChandler Carruth2015-06-221-24/+24
| | | | | | | | | | | | | | | | | | | | accurately describe what is being tracked. While these two enums do track mod/ref information and aliasing information, they don't represent the exact same things as either the mod/ref enums or the alias result enum in AA. They're definitions are dominated by the structure of their lattice and the bit's various semantics. This patch just calls them what they are and tries to spell out usefully distinct names for these things. This will clear the path for using a raw unscoped enum to represent some of these concepts across LLVM's analysis library. No functionality changed here. Differential Revision: http://reviews.llvm.org/D10494 llvm-svn: 240254
* [CallGraph] Given -print-callgraph a stable printing order.Sanjoy Das2015-06-191-2/+20
| | | | | | | | | | | | | | | | Summary: Since FunctionMap has llvm::Function pointers as keys, the order in which the traversal happens can differ from run to run, causing spurious FileCheck failures. Have CallGraph::print sort the CallGraphNodes by name before printing them. Reviewers: bogner, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10575 llvm-svn: 240191
* Typo. NFC.Chad Rosier2015-06-191-2/+1
| | | | llvm-svn: 240141
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-1925-38/+38
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* Fix "the the" in comments.Eric Christopher2015-06-191-1/+1
| | | | llvm-svn: 240112
* [CallGraph] Teach the CallGraph about non-leaf intrinsics.Sanjoy Das2015-06-182-3/+7
| | | | | | | | | | | | | | | | | | | | Summary: Currently intrinsics don't affect the creation of the call graph. This is not accurate with respect to statepoint and patchpoint intrinsics -- these do call (or invoke) LLVM level functions. This change fixes this inconsistency by adding a call to the external node for call sites that call these non-leaf intrinsics. This coupled with the fact that these intrinsics also escape the function pointer they call gives us a conservatively correct call graph. Reviewers: reames, chandlerc, atrick, pgavlin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10526 llvm-svn: 240039
* Move the personality function from LandingPadInst to FunctionDavid Majnemer2015-06-171-3/+2
| | | | | | | | | | | | | | | | | | | The personality routine currently lives in the LandingPadInst. This isn't desirable because: - All LandingPadInsts in the same function must have the same personality routine. This means that each LandingPadInst beyond the first has an operand which produces no additional information. - There is ongoing work to introduce EH IR constructs other than LandingPadInst. Moving the personality routine off of any one particular Instruction and onto the parent function seems a lot better than have N different places a personality function can sneak onto an exceptional function. Differential Revision: http://reviews.llvm.org/D10429 llvm-svn: 239940
* Add documentation for new backedge mass propagation in irregular loops.Diego Novillo2015-06-171-3/+2
| | | | | | Tweak test cases and rename headerIndexFor -> getHeaderIndex. llvm-svn: 239915
* [PM/AA] Suffix lots of member variables that directly use enumerationChandler Carruth2015-06-171-48/+72
| | | | | | | | | | | | | | names for counts with the word 'Count' to make them less ambiguous. This will be an actual error if we use unscoped enums for any of these, and generally this seems much clearer to read. Also, use clang-format to normalize the formatting of this code which seems to have been needlessly odd. No functionality changed here. llvm-svn: 239887
* [PM/AA] Remove the UnknownSize static member from AliasAnalysis.Chandler Carruth2015-06-177-61/+66
| | | | | | | | This is now living in MemoryLocation, which is what it pertains to. It is also an enum there rather than a static data member which is left never defined. llvm-svn: 239886
* [PM/AA] Remove the Location typedef from the AliasAnalysis class nowChandler Carruth2015-06-1714-151/+145
| | | | | | | | | | | | that it is its own entity in the form of MemoryLocation, and update all the callers. This is an entirely mechanical change. References to "Location" within AA subclases become "MemoryLocation", and elsewhere "AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps out-of-tree folks update. llvm-svn: 239885
* [PM/AA] Split the location computation out of getArgLocation so theChandler Carruth2015-06-174-91/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | virtual interface on AliasAnalysis only deals with ModRef information. This interface was both computing memory locations by using TLI and other tricks to estimate the size of memory referenced by an operand, and computing ModRef information through similar investigations. This change narrows the scope of the virtual interface on AliasAnalysis slightly. Note that all of this code could live in BasicAA, and be done with a single investigation of the argument, if it weren't for the fact that the generic code in AliasAnalysis::getModRefBehavior for a callsite calls into the virtual aspect of (now) getArgModRefInfo. But this patch's arrangement seems a not terrible way to go for now. The other interesting wrinkle is how we could reasonably extend LLVM with support for custom memory location sizes and mod/ref behavior for library routines. After discussions with Hal on the review, the conclusion is that this would be best done by fleshing out the much desired support for extensions to TLI, and support these types of queries in that interface where we would likely be doing other library API recognition and analysis. Differential Revision: http://reviews.llvm.org/D10259 llvm-svn: 239884
* Fix PR 23525 - Separate header mass propagation in irregular loops.Diego Novillo2015-06-161-21/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When propagating mass through irregular loops, the mass flowing through each loop header may not be equal. This was causing wrong frequencies to be computed for irregular loop headers. Fixed by keeping track of masses flowing through each of the headers in an irregular loop. To do this, we now keep track of per-header backedge weights. After the loop mass is distributed through the loop, the backedge weights are used to re-distribute the loop mass to the loop headers. Since each backedge will have a mass proportional to the different branch weights, the loop headers will end up with a more approximate weight distribution (as opposed to the current distribution that assumes that every loop header is the same). Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10348 llvm-svn: 239843
* [InstSimplify] Allow folding of fdiv X, X with just NaNs ignoredBenjamin Kramer2015-06-161-3/+3
| | | | | | | | Any combination of +-inf/+-inf is NaN so it's already ignored with nnan and we can skip checking for ninf. Also rephrase logic in comments a bit. llvm-svn: 239821
* Move logic from JumpThreading into LazyValue info to simplify caller. Philip Reames2015-06-161-2/+34
| | | | | | | | This change is hopefully NFC. The only tricky part is that I changed the context instruction being used to the branch rather than the comparison. I believe both to be correct, but the branch is strictly more powerful. With the moved code, using the branch instruction is required for the basic block comparison test to return the same result. The previous code was able to directly access both the branch and the comparison where the revised code is not. Differential Revision: http://reviews.llvm.org/D9652 llvm-svn: 239797
* [ValueTracking] do not overwrite analysis results already computedJingyue Wu2015-06-151-146/+160
| | | | | | | | | | | | | | | | | | Summary: ValueTracking used to overwrite the analysis results computed from assumes and dominating conditions. This patch fixes this issue. Test Plan: test/Analysis/ValueTracking/assume.ll Reviewers: hfinkel, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10283 llvm-svn: 239718
* [InstSimplify] fsub nnan x, x -> 0.0 is valid without ninfBenjamin Kramer2015-06-141-2/+2
| | | | | | | Both inf - inf and (-inf) - (-inf) are NaN, so it's already covered by nnan. llvm-svn: 239702
* [InstSimplify] Add self-fdiv identities for -ffinite-math-only.Benjamin Kramer2015-06-141-0/+15
| | | | | | | | | When NaNs and Infs are ignored we can fold X / X -> 1.0 -X / X -> -1.0 X / -X -> -1.0 llvm-svn: 239701
* Don't create instructions from ConstantExpr's in CFLAliasAnalysis.Pete Cooper2015-06-121-12/+41
| | | | | | | | | | The CFLAA code currently calls ConstantExpr::getAsInstruction which creates an instruction from a constant expr. We then pass that instruction to the InstVisitor to analyze it. Its not necessary to create these instructions as we can just cast from Constant to Operator in the visitor. This is how other InstVisitor’s such as SelectionDAGBuilder handle ConstantExpr. llvm-svn: 239616
* Rangify for loops, NFC.Yaron Keren2015-06-121-8/+6
| | | | llvm-svn: 239596
* [GVN] Set proper debug locations for some instructions created by GVN.Alexey Samsonov2015-06-101-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Determining proper debug locations for instructions created in PHITransAddr is tricky. We use a simple approach here and simply copy debug locations from instructions computing load address to "corresponding" instructions re-creating the address computation in predecessor basic blocks. This may not always be correct, given all the rearrangement and simplification going on, and debug locations may jump around a lot, as the basic blocks we copy locations between may be very far from each other. Still, this would work good in most simple cases (e.g. when chain of address computing instruction is short, or our mapping turns out to be 1-to-1), and we desire to have *some* reasonable debug locations associated with newly inserted instructions. See http://reviews.llvm.org/D10351 review thread for more details. Test Plan: regression test suite Reviewers: spatel, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10351 llvm-svn: 239479
* Replace loop with std::equal. NFC intended.Benjamin Kramer2015-06-091-7/+1
| | | | llvm-svn: 239430
* Minor refactoring of GEP handling in isDereferenceablePointerArtur Pilipenko2015-06-081-28/+15
| | | | | | | | | | For GEP instructions isDereferenceablePointer checks that all indices are constant and within bounds. Replace this index calculation logic to a call to accumulateConstantOffset. Separated from the http://reviews.llvm.org/D9791 Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D9874 llvm-svn: 239299
* [LAA] Fix estimation of number of memchecksSilviu Baranga2015-06-081-38/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y are writes. Assuming we have w writes and r reads, currently this number is estimated as being w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate the number of checks required. This change adds a getNumberOfChecks method to RuntimePointerCheck, which will count the number of runtime checks needed (similar in implementation to needsAnyChecking) and uses it to produce the correct number of runtime checks. Test Plan: llvm test suite spec2k spec2k6 Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit). Reviewers: anemet Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D10217 llvm-svn: 239295
* [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.Hao Liu2015-06-082-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g. for (i = 0; i < N; i+=2) { a = A[i]; // load of even element b = A[i+1]; // load of odd element ... // operations on a, b, c, d A[i] = c; // store of even element A[i+1] = d; // store of odd element } The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like: %wide.vec = load <8 x i32>, <8 x i32>* %ptr %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6> %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7> The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like: %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> store <8 x i32> %interleaved.vec, <8 x i32>* %ptr This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'. llvm-svn: 239291
* [LoopAccessAnalysis] Teach LAA to check the memory dependence between ↵Hao Liu2015-06-081-12/+101
| | | | | | | | strided accesses. Differential Revision: http://reviews.llvm.org/D9368 llvm-svn: 239285
* Add isLegalAddressingMode address space argument to TTIMatt Arsenault2015-06-071-4/+6
| | | | | | | Update to match the TLI version, and remove the TLI version's default argument. llvm-svn: 239260
OpenPOWER on IntegriCloud