path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [DAGCombiner] Added CTTZ vector constant folding support. | Simon Pilgrim | 2015-06-08 | 2 | -2/+4
    llvm-svn: 239293
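    To illustrate what constant folding CTTZ over a vector means, here is a minimal sketch that folds each lane independently. It is not the DAGCombiner code: the function names are hypothetical, the lane width is fixed at 32 bits, and a zero lane is assumed to fold to the bit width.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Count trailing zero bits of one 32-bit lane. A zero input yields 32,
// the defined-at-zero convention assumed for this sketch.
unsigned countTrailingZeros32(std::uint32_t v) {
    if (v == 0) return 32;
    unsigned n = 0;
    while ((v & 1u) == 0) {
        v >>= 1;
        ++n;
    }
    return n;
}

// Fold CTTZ over a constant vector by folding every lane on its own.
std::vector<unsigned> foldCttzVector(const std::vector<std::uint32_t>& lanes) {
    std::vector<unsigned> folded;
    folded.reserve(lanes.size());
    for (std::uint32_t lane : lanes) folded.push_back(countTrailingZeros32(lane));
    return folded;
}
```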
* [LoopVectorize] Teach Loop Vectorizer about interleaved memory accesses. | Hao Liu | 2015-06-08 | 3 | -9/+708
    Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g.
        for (i = 0; i < N; i += 2) {
          a = A[i];      // load of even element
          b = A[i+1];    // load of odd element
          ...            // operations on a, b, c, d
          A[i] = c;      // store of even element
          A[i+1] = d;    // store of odd element
        }
    The loads of even and odd elements are identified as an interleave load group, which will be transformed into vectorized IR like:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr
        %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
        %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
    The stores of even and odd elements are identified as an interleave store group, which will be transformed into vectorized IR like:
        %interleaved.vec = shufflevector <4 x i32> %vec.even, <4 x i32> %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
        store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
    This optimization is currently disabled by default. To try it, add '-enable-interleaved-mem-accesses=true'.
    llvm-svn: 239291
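    The shuffle masks in this message follow a regular pattern, which the sketch below reproduces on plain integer vectors. It only models the lane selection of shufflevector; the helper names are hypothetical and nothing here is taken from the vectorizer itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models shufflevector on two equally sized inputs: mask index i selects
// a[i] when i < a.size(), otherwise b[i - a.size()].
std::vector<int> shuffle(const std::vector<int>& a, const std::vector<int>& b,
                         const std::vector<std::size_t>& mask) {
    assert(a.size() == b.size());
    std::vector<int> out;
    out.reserve(mask.size());
    for (std::size_t idx : mask)
        out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
    return out;
}

// De-interleave mask: every `factor`-th lane starting at `offset`,
// e.g. <0, 2, 4, 6> for offset 0, factor 2, lanes 4.
std::vector<std::size_t> stridedMask(std::size_t offset, std::size_t factor,
                                     std::size_t lanes) {
    std::vector<std::size_t> mask;
    for (std::size_t i = 0; i < lanes; ++i) mask.push_back(offset + i * factor);
    return mask;
}

// Interleave mask for two `lanes`-wide vectors: <0, lanes, 1, lanes+1, ...>.
std::vector<std::size_t> interleaveMask(std::size_t lanes) {
    std::vector<std::size_t> mask;
    for (std::size_t i = 0; i < lanes; ++i) {
        mask.push_back(i);
        mask.push_back(lanes + i);
    }
    return mask;
}
```

    Passing the wide vector as both operands stands in for the `undef` second operand, since the strided masks never select from it.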
* [LoopAccessAnalysis] Teach LAA to check the memory dependence between strided accesses. | Hao Liu | 2015-06-08 | 1 | -12/+101
    Differential Revision: http://reviews.llvm.org/D9368
    llvm-svn: 239285
* Remove SCEVCache and FindConstantPointers from complete loop unrolling heuristic. | Michael Zolotukhin | 2015-06-08 | 1 | -212/+89
    Summary: Using some SCEV functionality helped to entirely remove the SCEVCache class and the FindConstantPointers SCEV visitor. Also, this makes the code more universal - I'll take advantage of it in subsequent patches where I start handling additional types of instructions.
    Test Plan: Tests would be submitted in subsequent patches.
    Reviewers: atrick, chandlerc
    Reviewed By: atrick, chandlerc
    Subscribers: atrick, llvm-commits
    Differential Revision: http://reviews.llvm.org/D10205
    llvm-svn: 239282
* Fix Windows build. | Peter Collingbourne | 2015-06-08 | 1 | -0/+4
    llvm-svn: 239279
* llvm-ar: Move archive writer to Object. | Peter Collingbourne | 2015-06-08 | 2 | -0/+339
    No functional change intended, other than some minor changes to certain diagnostics.
    Differential Revision: http://reviews.llvm.org/D10296
    llvm-svn: 239278
* SeparateConstOffsetFromGEP: Pass address space to isLegalAddressingMode | Matt Arsenault | 2015-06-07 | 1 | -1/+3
    llvm-svn: 239262
* Make NaryReassociate pass the address space to isLegalAddressingMode | Matt Arsenault | 2015-06-07 | 1 | -1/+3
    No test since the kinds of transforms this prevents seem to not really be relevant for SI's different addressing modes.
    llvm-svn: 239261
* Add isLegalAddressingMode address space argument to TTI | Matt Arsenault | 2015-06-07 | 1 | -4/+6
    Update to match the TLI version, and remove the TLI version's default argument.
    llvm-svn: 239260
* [X86] Added BitScanForward/BitScanReverse memory folding + tests | Simon Pilgrim | 2015-06-07 | 1 | -0/+6
    llvm-svn: 239257
* Remove global std::string. NFC. | Benjamin Kramer | 2015-06-07 | 1 | -1/+1
    llvm-svn: 239254
* [DAGCombiner] Added CTPOP vector constant folding support. | Simon Pilgrim | 2015-06-07 | 2 | -2/+3
    Added tests to the existing SSE/AVX test files.
    llvm-svn: 239252
* [AsmWriter] Rewrite module asm printing using StringRef::split. | Benjamin Kramer | 2015-06-07 | 1 | -16/+9
    No change in output intended.
    llvm-svn: 239251
* Fix doxygen comments. NFC | Filipe Cabecinhas | 2015-06-07 | 1 | -10/+10
    llvm-svn: 239250
* [InstCombine, InstSimplify] Move xforms from Combine to Simplify | David Majnemer | 2015-06-06 | 2 | -126/+140
    There were several SelectInst combines that always returned an existing instruction instead of modifying an old one or creating a new one. These are prime candidates for moving to InstSimplify.
    llvm-svn: 239229
* Use early return idiom. NFC | Filipe Cabecinhas | 2015-06-06 | 1 | -6/+6
    llvm-svn: 239228
* [MC] Common symbols weren't being checked for redeclaration, which allowed an assembly file to trigger an assertion in setCommon(): !isCommon(). | Colin LeMahieu | 2015-06-06 | 1 | -1/+3
    This change allows redeclaration as long as the size and alignment match exactly; otherwise it reports a fatal error.
    llvm-svn: 239227
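    The rule described here (a redeclaration is accepted only when size and alignment match exactly) can be sketched as follows. `CommonSymbol` and `declareCommon` are hypothetical stand-ins, not the MC API; a caller would report the fatal error when the function returns false.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a symbol record; not the real MCSymbol class.
struct CommonSymbol {
    bool isCommon = false;
    std::uint64_t size = 0;
    unsigned align = 0;
};

// Returns true if the declaration is acceptable: either the first one, or a
// redeclaration with identical size and alignment. Returns false on a
// conflicting redeclaration and leaves the symbol unchanged.
bool declareCommon(CommonSymbol& sym, std::uint64_t size, unsigned align) {
    if (sym.isCommon) return sym.size == size && sym.align == align;
    sym.isCommon = true;
    sym.size = size;
    sym.align = align;
    return true;
}
```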
* [LoopUnroll] Fix truncation bug in canUnrollCompletely. | Sanjoy Das | 2015-06-06 | 1 | -3/+3
    Summary: canUnrollCompletely takes `unsigned` values for `UnrolledCost` and `RolledDynamicCost` but is passed in `uint64_t`s that are silently truncated. Because of this, when `UnrolledSize` is a large integer that has a small remainder with UINT32_MAX, LLVM tries to completely unroll loops with high trip counts.
    Reviewers: mzolotukhin, chandlerc
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D10293
    llvm-svn: 239218
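    A small sketch of the failure mode: converting a 64-bit cost to a 32-bit unsigned value reduces it modulo 2^32, so a huge cost can look tiny. The helper names and the threshold of 150 are illustrative assumptions, and `std::uint32_t` stands in for `unsigned` so the width is explicit.

```cpp
#include <cassert>
#include <cstdint>

// Makes explicit the narrowing that happens implicitly when a 64-bit value
// is passed to a 32-bit unsigned parameter: the result is v modulo 2^32.
std::uint32_t narrowTo32(std::uint64_t v) {
    return static_cast<std::uint32_t>(v);
}

// Comparison done at full width, as it should be after the fix.
bool fitsThreshold(std::uint64_t unrolledCost, std::uint64_t threshold) {
    return unrolledCost <= threshold;
}
```

    With a cost of 2^32 + 10, the narrowed value is 10 and passes a threshold of 150, while the full-width comparison correctly rejects it.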
* [CVP] Don't assume Constants of type i1 can be known to be true or false | David Majnemer | 2015-06-06 | 1 | -3/+4
    CVP wants to analyze the condition operand of a select along an edge. It succeeds in getting back a Constant but not a ConstantInt. Instead, it gets a ConstantExpr. It then assumes that the Constant must be equal to false because it isn't equal to true. Instead, perform an additional comparison.
    This fixes PR23752.
    llvm-svn: 239217
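    The pitfall is treating "not known to be true" as "known to be false". A minimal three-state sketch, with a hypothetical `I1Constant` enum standing in for the Constant that CVP gets back:

```cpp
#include <cassert>

// Hypothetical model of an i1 constant: a literal true/false, or an
// unresolved constant expression whose value is not known.
enum class I1Constant { True, False, Unresolved };

// Buggy shape: anything that is not literally true is taken to be false.
bool assumedFalse(I1Constant c) { return c != I1Constant::True; }

// Fixed shape: each outcome needs its own comparison.
bool isKnownTrue(I1Constant c) { return c == I1Constant::True; }
bool isKnownFalse(I1Constant c) { return c == I1Constant::False; }
```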
* [InstCombine] Don't miscompile select to poison | David Majnemer | 2015-06-06 | 1 | -0/+13
    If we have (select a, b, c), it is sometimes valid to simplify this to a single select operand. However, doing so is only valid if the computation doesn't inject poison into the computation.
    It might be helpful to consider the following example:
        (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)
    The select is equivalent to (add %i, 1) but not (add nsw %i, 1).
    Self hosting on x86_64 revealed that this occurs very, very rarely so bailing out is hopefully pretty reasonable.
    llvm-svn: 239215
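    The example can be checked with a toy model in which a value carries a poison flag. This is a sketch under simplifying assumptions (32-bit lanes, hypothetical names, and a select that evaluates only the chosen arm), not LLVM's semantics in full.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Toy value: 32 bits plus a flag saying whether the result is poison.
struct Value {
    bool poison;
    std::int32_t bits;
};

// Plain add: wraps around in two's complement and never produces poison.
Value addWrap(std::int32_t a, std::int32_t b) {
    const std::uint32_t sum =
        static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    return Value{false, static_cast<std::int32_t>(sum)};
}

// add nsw: same bits, but the result is poison on signed overflow.
Value addNsw(std::int32_t a, std::int32_t b) {
    const std::int64_t wide = static_cast<std::int64_t>(a) + b;
    Value result = addWrap(a, b);
    result.poison = wide > std::numeric_limits<std::int32_t>::max() ||
                    wide < std::numeric_limits<std::int32_t>::min();
    return result;
}

// (select (icmp ne i, INT_MAX), (add nsw i, 1), INT_MIN)
Value originalSelect(std::int32_t i) {
    if (i != std::numeric_limits<std::int32_t>::max()) return addNsw(i, 1);
    return Value{false, std::numeric_limits<std::int32_t>::min()};
}
```

    At `i == INT_MAX` the select and the wrapping add both give INT_MIN without poison, whereas `add nsw` alone gives poison, which is why only the flag-free add is a valid replacement.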
* Handle 16 bit PC relative relocations. | Rafael Espindola | 2015-06-06 | 1 | -0/+1
    Fixes pr23771.
    llvm-svn: 239214
* TargetParser: Fix comments in enum(s) introduced in r239150. [-Wdocumentation] | NAKAMURA Takumi | 2015-06-06 | 1 | -1/+1
    llvm-svn: 239211
* [TableGen] Change OpInit::getNumOperands and getOperand to use unsigned integers. NFC | Craig Topper | 2015-06-06 | 1 | -2/+2
    llvm-svn: 239210
* [TableGen] Remove trailing whitespace, add space between 'if' and paren, other formatting fixes. NFC | Craig Topper | 2015-06-06 | 1 | -16/+16
    llvm-svn: 239209
* [TableGen] Remove unnecessary temporary. NFC | Craig Topper | 2015-06-06 | 1 | -2/+1
    llvm-svn: 239208
* [TableGen] Fold variable declaration/initialization into if condition for a couple short lived variables. NFC | Craig Topper | 2015-06-06 | 1 | -8/+6
    llvm-svn: 239207
* [TableGen] Remove unnecessary outer 'if' and merge its conditions into the inner 'if's. NFC | Craig Topper | 2015-06-06 | 1 | -42/+41
    llvm-svn: 239206
* [TableGen] Fold variable declarations with their assignments. NFC | Craig Topper | 2015-06-06 | 1 | -4/+2
    llvm-svn: 239205
* Move the code in TargetPassConfig::addPass that inserts machine printer pass to the overloaded version of addPass which takes Pass*. | Akira Hatanaka | 2015-06-05 | 1 | -16/+18
    This change enables inserting the machine printer pass when the overloaded version of addPass that takes Pass* is called to add a pass, instead of the one which takes AnalysisID.
    I need this to prevent make-check tests from failing when I commit another patch later.
    llvm-svn: 239192
* Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced" | Renato Golin | 2015-06-05 | 1 | -22/+4
    This reverts commit r239141. This commit was an attempt to reintroduce a previous patch that broke many self-hosting bots with clang timeouts, but it still has slowdown issues, at least on ARM, increasing the compilation time (stage 2, clang's) by 5x.
    llvm-svn: 239175
* Refactor padding writing into a helper function. | Rafael Espindola | 2015-06-05 | 1 | -12/+12
    llvm-svn: 239174
* [InstCombine][NFC] Add a ``break;`` statement. | Sanjoy Das | 2015-06-05 | 1 | -0/+1
    This change is NFC because both the ``break;`` and the fall through end up returning immediately. However, this helps clarify intent and also ensures correctness in case more ``case`` blocks are added later.
    llvm-svn: 239172
* [InstCombine] Fix PR23751. | Sanjoy Das | 2015-06-05 | 1 | -0/+1
    PR23751 was caused by a missing ``break;`` in r234388.
    llvm-svn: 239171
* Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM." | Peter Collingbourne | 2015-06-05 | 6 | -129/+33
    as it caused miscompilations and assertion failures (PR23768, http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html).
    llvm-svn: 239169
* Save a map lookup. NFC. | Rafael Espindola | 2015-06-05 | 1 | -2/+4
    llvm-svn: 239168
* DAGCombiner: don't duplicate (fmul x, c) in visitFNEG if fneg is free | Fiona Glaser | 2015-06-05 | 1 | -1/+2
    For targets with a free fneg, this fold is always a net loss if it ends up duplicating the multiply, so definitely avoid it.
    This might be true for some targets without a free fneg too, but I'll leave that for future investigation.
    llvm-svn: 239167
* Rangify more for loops in LegacyPassManager.cpp. | Yaron Keren | 2015-06-05 | 1 | -39/+21
    llvm-svn: 239166
* [Unroll] Rework the naming and structure of the new unroll heuristics. | Chandler Carruth | 2015-06-05 | 1 | -95/+121
    The new naming is (to me) much easier to understand. Here is a summary of the new state of the world:
    - '*Threshold' is the threshold for full unrolling. It is measured against the estimated unrolled cost as computed by getUserCost in TTI (or CodeMetrics, etc). We will exceed this threshold when unrolling loops where unrolling exposes a significant degree of simplification of the logic within the loop.
    - '*PercentDynamicCostSavedThreshold' is the percentage of the loop's estimated dynamic execution cost which needs to be saved by unrolling to apply a discount to the estimated unrolled cost.
    - '*DynamicCostSavingsDiscount' is the discount applied to the estimated unrolling cost when the dynamic savings are expected to be high.
    When actually analyzing the loop, we now produce both an estimated unrolled cost, and an estimated rolled cost. The rolled cost is notably a dynamic estimate based on our analysis of the expected execution of each iteration.
    While we're still working to build up the infrastructure for making these estimates, to me it is much more clear *how* to make them better when they have reasonably descriptive names. For example, we may want to apply estimated (from heuristics or profiles) dynamic execution weights to the *dynamic* cost estimates. If we start doing that, we would also need to track the static unrolled cost and the dynamic unrolled cost, as only the latter could reasonably be weighted by profile information.
    This patch is sadly not without functionality change for the new unroll analysis logic. Buried in the heuristic management were several things that surprised me. For example, we never subtracted the optimized instruction count off when comparing against the unroll heuristics! I don't know if this just got lost somewhere along the way or what, but with the new accounting of things, this is much easier to keep track of and we use the post-simplification cost estimate to compare to the thresholds, and use the dynamic cost reduction ratio to select whether we can exceed the baseline threshold.
    The old values of these flags also don't necessarily make sense. My impression is that none of these thresholds or discounts have been tuned yet, and so they're just arbitrary placeholder numbers. As such, I've not bothered to adjust for the fact that this is now a discount and not a two-tier threshold model. We need to tune all these values once the logic is ready to be enabled.
    Differential Revision: http://reviews.llvm.org/D9966
    llvm-svn: 239164
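    One way to read the three knobs described above is sketched below. This is an interpretation of the message, not the code from the patch: `UnrollLimits`, `mayFullyUnroll`, and the numbers used with them are hypothetical, and the message itself notes that the real values are untuned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bundle named after the three flags in the message.
struct UnrollLimits {
    std::uint64_t threshold;                         // '*Threshold'
    std::uint64_t percentDynamicCostSavedThreshold;  // percentage, 0..100
    std::uint64_t dynamicCostSavingsDiscount;        // '*DynamicCostSavingsDiscount'
};

// Full unrolling is allowed when the unrolled cost fits the threshold
// outright, or when unrolling saves a large enough share of the rolled
// dynamic cost and the discounted unrolled cost fits the threshold.
// Costs are assumed small enough that multiplying by 100 cannot overflow.
bool mayFullyUnroll(const UnrollLimits& limits, std::uint64_t unrolledCost,
                    std::uint64_t rolledDynamicCost) {
    if (unrolledCost <= limits.threshold) return true;
    if (rolledDynamicCost == 0 || rolledDynamicCost < unrolledCost) return false;

    const std::uint64_t percentSaved =
        (rolledDynamicCost - unrolledCost) * 100 / rolledDynamicCost;
    const std::uint64_t discountedCost =
        unrolledCost > limits.dynamicCostSavingsDiscount
            ? unrolledCost - limits.dynamicCostSavingsDiscount
            : 0;
    return percentSaved >= limits.percentDynamicCostSavedThreshold &&
           discountedCost <= limits.threshold;
}
```

    Keeping all costs at 64 bits here also sidesteps the truncation issue fixed separately in canUnrollCompletely.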
* [bpf] rename triple names bpf_be -> bpfeb | Alexei Starovoitov | 2015-06-05 | 4 | -22/+22
    llvm-svn: 239162
* [Hexagon] Reapply r239097 with tests corrected for shuffling and duplexing. | Colin LeMahieu | 2015-06-05 | 15 | -58/+2824
    llvm-svn: 239161
* [TargetParser] Properly attach functions of ARMTargetParser to the class | Benjamin Kramer | 2015-06-05 | 1 | -6/+2
    llvm-svn: 239158
* [ARM] Make helper function static. | Benjamin Kramer | 2015-06-05 | 2 | -10/+2
    This one had a declaration but it differed from the definition so the declaration was actually dead.
    llvm-svn: 239157
* Rangify for loops in LegacyPassManager.cpp. | Yaron Keren | 2015-06-05 | 1 | -19/+10
    llvm-svn: 239155
* [ARM] Add support for -sp- FPUs and FPU none to TargetParser | John Brawn | 2015-06-05 | 3 | -5/+16
    These are added mainly for the benefit of clang, but this also means that they are now allowed in .fpu directives and we emit the correct .fpu directive when single-precision-only is used.
    Differential Revision: http://reviews.llvm.org/D10238
    llvm-svn: 239151
* [ARM] Add knowledge of FPU subtarget features to TargetParser | John Brawn | 2015-06-05 | 4 | -104/+153
    Add getFPUFeatures to TargetParser, which gets the list of subtarget features that are enabled/disabled for each FPU, and use it when handling the .fpu directive.
    No functional change in this commit, though clang will start behaving differently once it starts using this.
    Differential Revision: http://reviews.llvm.org/D10237
    llvm-svn: 239150
* [ARMTargetParser] Follow-up for r239099: one case was missed | Artyom Skrobov | 2015-06-05 | 1 | -1/+1
    llvm-svn: 239147
* Revert "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144). | Toma Tabacu | 2015-06-05 | 1 | -15/+11
    This is breaking the Windows buildbots.
    llvm-svn: 239145
* [mips] [IAS] Restore STI.FeatureBits in .set pop. | Toma Tabacu | 2015-06-05 | 1 | -11/+15
    Summary: Only restoring AvailableFeatures is not enough and will lead to buggy behaviour. For example, if we have a feature enabled and we ".set pop", the next time we try to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])" check will be false, because we didn't restore STI.FeatureBits.
    In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop". This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits, but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits).
    I also moved the updating of AssemblerOptions inside the "if" statement in setFeatureBits() and clearFeatureBits(), as there is no reason to update if nothing changes.
    Reviewers: dsanders, mkuper
    Reviewed By: dsanders
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D9156
    llvm-svn: 239144
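    The push/pop behaviour described here can be sketched with a small state machine that saves the source-of-truth feature bits and re-derives the available features on pop. Everything below is a hypothetical model: the bit layout, the `AssemblerState` type, and the derivation function only loosely mirror the names in the message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical one-way derivation: two feature bits collapse into a single
// available-feature flag, so it cannot be inverted.
std::uint32_t computeAvailableFeatures(std::uint32_t featureBits) {
    std::uint32_t available = 0;
    if (featureBits & 0x3u) available |= 0x1u;
    if (featureBits & 0x4u) available |= 0x2u;
    return available;
}

struct AssemblerState {
    std::uint32_t featureBits = 0;
    std::uint32_t availableFeatures = 0;
    std::vector<std::uint32_t> saved;  // stack of saved featureBits

    // ".set <feature>": returns true only if something changed.
    bool setFeature(std::uint32_t feature) {
        if (featureBits & feature) return false;
        featureBits |= feature;
        availableFeatures = computeAvailableFeatures(featureBits);
        return true;
    }

    // ".set push": remember the feature bits, not the derived mask.
    void push() { saved.push_back(featureBits); }

    // ".set pop": restore the feature bits and regenerate the derived mask.
    void pop() {
        assert(!saved.empty());
        featureBits = saved.back();
        saved.pop_back();
        availableFeatures = computeAvailableFeatures(featureBits);
    }
};
```

    Because the saved state is the feature bits themselves, enabling the same feature again after a pop takes effect, which is the case the message says was broken.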
* [LoopVectorize] Don't crash on zero-sized types in isInductionPHI | David Majnemer | 2015-06-05 | 1 | -0/+3
    isInductionPHI wants to calculate the stride based on the pointee size. However, this is not possible when the pointee is zero sized.
    This fixes PR23763.
    llvm-svn: 239143
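    Computing a stride from a pointee size involves a division, so a zero-sized pointee has to be rejected first. A minimal sketch with a hypothetical `computeStride` helper; the divisibility check is an extra assumption of this sketch and is not stated in the message.

```cpp
#include <cassert>
#include <cstdint>

// Tries to express a byte step as a whole number of elements. Returns false
// when no stride can be computed, including for zero-sized elements, where
// the division would otherwise be by zero.
bool computeStride(std::int64_t stepBytes, std::int64_t elementSizeBytes,
                   std::int64_t& strideOut) {
    if (elementSizeBytes <= 0) return false;              // zero-sized pointee
    if (stepBytes % elementSizeBytes != 0) return false;  // assumed extra check
    strideOut = stepBytes / elementSizeBytes;
    return true;
}
```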
* Simplify code; NFC. | Andrea Di Biagio | 2015-06-05 | 1 | -7/+7
    Also, moved test cases from CodeGen/X86/fold-buildvector-bug.ll into CodeGen/X86/buildvec-insertvec.ll and regenerated CHECK lines using update_llc_test_checks.py.
    llvm-svn: 239142