summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [NewGVN] Update phi-of-ops def block when updating existing ValuePHI.Florian Hahn2018-02-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | In case we update a ValuePHI node created earlier, we could update it based on a different OpPHI which could be in a different block. We need to update the TempToBlock mapping reflecting the new block, otherwise we would end up placing the new phi node in a wrong block. This problem is exposed by the test case in https://bugs.llvm.org/show_bug.cgi?id=36504. This patch fixes a slightly simpler problem than in the bug report. In the bug's re-producer, the additional problem is that we are re-using a ValuePHI node with to few incoming values for the new OpPHI. If this patch makes sense, I will follow it up with a patch that creates a new PHI node if the existing PHI node has a different number of incoming values. Reviewers: davide, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43770 llvm-svn: 326181
* [SystemZ] Make sure SelectCode() is not called on a target opcode.Jonas Paulsson2018-02-271-1/+2
| | | | | | | | | Since getNode() might not always return the requsted opcode, for instance if called with (ISD::AND, -1) arguments, there should be a check so that SelectCode() is only called when appropriate. Review: Ulrich Weigand llvm-svn: 326178
* [MemorySSA] Call the correct dtorsGeorge Burgess IV2018-02-271-3/+2
| | | | | | | | | | | | It appears that there were many cases where we were directly (through templates) calling the dtor of MemoryAccess, which is conceptually an abstract class. This hasn't been a problem, since the data members of all of the subclasses of MemoryAccess have been POD. I'm planning on changing that. :) llvm-svn: 326175
* [SCEV] Cleanup SCEVInitRewriter. NFC.Serguei Katkov2018-02-271-2/+2
| | | | | | | | | | Set default value for IgnoreOtherLoops of SCEVInitRewriter::rewrite to true to be consistent with SCEVPostIncRewriter which does not have this parameter but behaves as it would be true. This is follow up for rL326067. llvm-svn: 326174
* [X86] Simplify if condition. NFCCraig Topper2018-02-271-1/+1
| | | | | | SSE2 implies SSE1 and we already covered f32 in the SSE1 check so we don't need to check f32 in the SSE2 check. llvm-svn: 326170
* [X86] Replace an impossible if condition with an assert.Craig Topper2018-02-271-2/+1
| | | | llvm-svn: 326167
* Fix PR36032, PR35432Evgeny Stupachenko2018-02-271-0/+6
| | | | | | | | | | | | | | | Summary: The change fix an assert fail at ScalarEvolutionExpander.cpp: assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count"); Reviewers: sbaranga Differential Revision: http://reviews.llvm.org/D42604 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 326154
* [SelectionDAG] Remove code from PromoteIntRes_CONCAT_VECTORS that was added ↵Craig Topper2018-02-271-15/+0
| | | | | | | | | | in r320674 to help X86. AVX512 used to promote v32i1 to v32i8 during legalization when BWI was disabled. So this code was added to improve legalization of v32i1 concat_vectors of v16i1 by extending the v16i1 to v16i8 to avoid scalarization. X86 has since switched to legalizing v32i1 by splitting to v16i1 instead. This has rendered this code unnecessary and its no longer exercised. llvm-svn: 326153
* [GISel]: Don't assert when constraining RegisterOperands which are uses.Aditya Nandakumar2018-02-264-12/+12
| | | | | | | | | | | | | Currently we assert that only non target specific opcodes can have missing RegisterClass constraints in the MCDesc. The backend can have instructions with register operands but don't have RegisterClass constraints (say using unknown_class) in which case the instruction defining the register will constrain it. Change the assert to only fire if a def has no regclass. https://reviews.llvm.org/D43409 llvm-svn: 326142
* [ValueTracking] Teach cannotBeOrderedLessThanZeroImpl to handle vector ↵Craig Topper2018-02-261-0/+18
| | | | | | | | | | | | | | | | constants. Summary: This allows vector fabs to be removed in more cases. Reviewers: spatel, arsenm, RKSimon Reviewed By: spatel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43739 llvm-svn: 326138
* [X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280)Simon Pilgrim2018-02-261-0/+30
| | | | | | | | | | Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark. Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch. Differential Revision: https://reviews.llvm.org/D43733 llvm-svn: 326133
* [X86] Add constant folding to combineMOVMSK.Craig Topper2018-02-261-0/+13
| | | | | | There's still some shortcoming in our ability to combine binops of constants with different sizes separated by an extend. I'll try to look at that next. llvm-svn: 326128
* [X86] Add a custom legalization for (i16 (bitcast v16i1)) and (i32 (bitcast ↵Craig Topper2018-02-261-11/+35
| | | | | | | | | | | | | | | | | | | v32i1)) without AVX512 to prevent scalarization Summary: We have an early DAG combine to turn these patterns into MOVMSK, but that combine doesn't work if the vXi1 type has more elements than the widest legal vXi8 type. Type legalization will eventually split it down to v16i1 or v32i1 and then the bitcast gets legalized to a truncstore and a scalar load. The truncstore will get lowered to a series of extracts and bit math. This patch adds a custom legalization to use a sign extend and MOVMSK instead. This prevents the eventual scalarization. Reviewers: spatel, RKSimon, zvi Reviewed By: RKSimon Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D43593 llvm-svn: 326119
* [LTO] Support filtering by hotness thresholdAdam Nemet2018-02-264-7/+19
| | | | | | | | | | | This wires up -pass-remarks-hotness-threshold to LTO and ThinLTO. Next is to change the clang driver to pass this with -fdiagnostics-hotness-threshold. Differential Revision: https://reviews.llvm.org/D41465 llvm-svn: 326107
* [X86][AVX] createPSADBW - support 256-bit cases on AVX1 via ↵Simon Pilgrim2018-02-261-10/+16
| | | | | | SplitBinaryOpsAndApply llvm-svn: 326104
* AMDGPU/GlobalISel: Make f64 constants legalMatt Arsenault2018-02-261-0/+1
| | | | llvm-svn: 326101
* [InstCombine] allow fdiv folds with less than fully 'fast' opsSanjay Patel2018-02-261-13/+3
| | | | | | | | | | | | | | | | | | Note: gcc appears to allow this fold with -freciprocal-math alone, but clang/llvm require more than that with this patch. The wording in the definitions seems fuzzy enough that it could go either way, but we'll err on the conservative side of FMF interpretation. This patch also changes the newly created fmul to have FMF propagated by the last fdiv rather than intersecting the FMF of the fdivs. This matches the behavior of other folds near here. The new fmul is only used to produce an intermediate op for the final fdiv result, so it shouldn't be any stricter than that result. The previous behavior could result in dropping FMF via other folds in instcombine or CSE. Differential Revision: https://reviews.llvm.org/D43398 llvm-svn: 326098
* [CodeGen] Don't omit any redundant information in -debug outputFrancis Visoiu Mistrih2018-02-262-6/+5
| | | | | | | | | | | | | | | | | | | | | In r322867, we introduced IsStandalone when printing MIR in -debug output. The default behaviour for that was: 1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any redundant information. 2) When -debug-printing a MF entirely, don't print any redundant information. 3) When printing MIR, don't print any redundant information. I'd like to change 2) to: 2) When -debug-printing a MF entirely, don't omit any redundant information. Differential Revision: https://reviews.llvm.org/D43337 llvm-svn: 326094
* Re-land: "[Support] Replace HashString with djbHash."Jonas Devlieghere2018-02-261-19/+20
| | | | | | | | | | | | | | | | | | | | | This patch removes the HashString function from StringExtraces and replaces its uses with calls to djbHash from DJB.h. This change is *almost* NFC. While the algorithm is identical, the djbHash implementation in StringExtras used 0 as its default seed while the implementation in DJB uses 5381. The latter has been shown to result in less collisions and improved avalanching and is used by the DWARF accelerator tables. Because some test were implicitly relying on the hash order, I've reverted to using zero as a seed for the following two files: lld/include/lld/Core/SymbolTable.h llvm/lib/Support/StringMap.cpp Differential revision: https://reviews.llvm.org/D43615 llvm-svn: 326091
* [AMDGPU] Scratch setup fix on AMDPAL gfx9+ merge shaderTim Renouf2018-02-262-1/+20
| | | | | | | | | | | | | | | | Summary: With OS type AMDPAL, the scratch descriptor is hardwired to be loaded from offset 0 of the global information table, whose low pointer is passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as the hardware reserves s0-s7. Reviewers: kzhuravl Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D42203 llvm-svn: 326088
* [LiveIntervals] Handle moving up dead partial writeTim Renouf2018-02-261-0/+30
| | | | | | | | | | | | | | | | | | | | | | | Summary: In the test case, the machine scheduler moves a dead write to a subreg up into the middle of a segment of the overall reg's live range, where the segment had liveness only for other subregs in the reg. handleMoveUp created an invalid live range, causing an assert a bit later. This commit fixes it to handle that situation. The segment is split in two at the insertion point, and the part after the split, and any subsequent segments up to the old position, are changed to be defined by the moved def. V2: Better test. Subscribers: MatzeB, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D43478 Change-Id: Ibc42445ddca84e79ad1f616401015d22bc63832e llvm-svn: 326087
* Test commitDavid Zarzycki2018-02-261-1/+0
| | | | llvm-svn: 326085
* Revert "[Support] Replace HashString with djbHash."Jonas Devlieghere2018-02-261-20/+19
| | | | | | | | | | | | | It looks like some of our tests depend on the ordering of hashed values. I'm reverting my changes while I try to reproduce and fix this locally. Failing builds: lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/18388 lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/6743 lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/15607 llvm-svn: 326082
* [Support] Replace HashString with djbHash.Jonas Devlieghere2018-02-261-19/+20
| | | | | | | | | | | | | | | This removes the HashString function from StringExtraces and replaces its uses with calls to djbHash from DJB.h This is *almost* NFC. While the algorithm is identical, the djbHash implementation in StringExtras used 0 as its seed while the implementation in DJB uses 5381. The latter has been shown to result in less collisions and improved avalanching. https://reviews.llvm.org/D43615 (cherry picked from commit 77f7f965bc9499a9ae768a296ca5a1f7347d1d2c) llvm-svn: 326081
* [WebAssembly] Relax constexpr for old standard libraries.Benjamin Kramer2018-02-261-1/+1
| | | | | | | | This will still be constexpr when the standard library supports it, but doesn't force constexpr. Old libraries will get a global constructor, which is not too bad. llvm-svn: 326080
* [LV] Move isLegalMasked* functions from Legality to CostModelRenato Golin2018-02-261-101/+124
| | | | | | | | | | | | | | | | | | | | | | | All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability. This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately. Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions. Closes D43208. Patch by: Hideki Saito <hideki.saito@intel.com> llvm-svn: 326079
* [LoopInterchange] Loops with empty dependency matrix are safe.Florian Hahn2018-02-261-3/+0
| | | | | | | | | | | | | | | | | | | | The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077
* The final step to close D41278 [MachineCombiner] Improve debug output (NFC).Andrew V. Tischenko2018-02-262-4/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D41278 llvm-svn: 326074
* [SCEV] Factor out getUsedLoopsSerguei Katkov2018-02-261-4/+12
| | | | | | | | | | | | | | | The patch introduces the new function in ScalarEvolution to get all loops used in specified SCEV. This is a preparation for re-writing isKnownPredicate utility as described in https://reviews.llvm.org/D42417. Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43504 llvm-svn: 326072
* [SCEV] Introduce SCEVPostIncRewriterSerguei Katkov2018-02-261-0/+41
| | | | | | | | | | | | | | | | The patch introduces the SCEVPostIncRewriter rewriter which is similar to SCEVInitRewriter but rewrites AddRec with post increment value of this AddRec. This is a preparation for re-writing isKnownPredicate utility as described in https://reviews.llvm.org/D42417. Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43499 llvm-svn: 326071
* [XCore] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-261-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Robert Lytton llvm-svn: 326069
* [SCEV] Extends the SCEVInitRewriterSerguei Katkov2018-02-261-8/+20
| | | | | | | | | | | | | | | | | The patch introduces an additional parameter IgnoreOtherLoops to SCEVInitRewriter. if it is equal to true then rewriter will not invalidate result in case SCEV depends on other loops then specified during creation. The patch does not change the default behavior. This is a preparation for re-writing isKnownPredicate utility as described in https://reviews.llvm.org/D42417. Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43498 llvm-svn: 326067
* [X86] Don't use getZExtValue when we have no idea how large the input ↵Craig Topper2018-02-261-2/+2
| | | | | | elements are. llvm-svn: 326066
* [X86] Use SelectionDAG::SplitVectorOperand to simplify some code. NFCCraig Topper2018-02-261-6/+2
| | | | llvm-svn: 326065
* [X86] Simplify the ReplaceNodeResults code for X86ISD::AVG.Craig Topper2018-02-261-13/+7
| | | | | | This code seemed to try to widen to 128, 256, or 512 bit vectors, but we only create X86ISD::AVG with a power of 2 number of elements. This means the only nodes that need to be legalized are less than 128-bits and need to be widened up to 128 bits. llvm-svn: 326064
* [X86] Remove VT.isSimple() check from detectAVGPattern.Craig Topper2018-02-261-1/+1
| | | | | | Which types are considered 'simple' is a function of the requirements of all targets that LLVM supports. That shouldn't directly affect what types we are able to handle. The remainder of this code checks that the number of elements is a power of 2 and takes care of splitting down to a legal size. llvm-svn: 326063
* TableGen: Remove VarInit::getFieldTypeNicolai Haehnle2018-02-251-7/+0
| | | | | | | | | | | | | | It is redundant with the implementation in TypedInit. Change-Id: I8ab1fb5c77e4923f7eb3ffae5889f0f8af6093b4 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43678 llvm-svn: 326061
* TableGen: Get rid of Init::getFieldInitNicolai Haehnle2018-02-251-24/+6
| | | | | | | | | | | | | | | | | | | | Summary: FieldInit will just rely on the standardized resolving mechanism to give us DefInits for folding, thus simplifying the code. Unlike the removal of resolveListElementReference, this shouldn't have performance implications, because DefInits do not recurse inside their record. Change-Id: Id4544c774c9d9ee92f293615af6ecff706453f21 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43563 llvm-svn: 326060
* TableGen: Remove Init::resolveListElementReferenceNicolai Haehnle2018-02-251-88/+9
| | | | | | | | | | | | | | | | | | | | | | Summary: Resolving a VarListElementInit should just resolve the list and then take its element. This eliminates a lot of duplicated logic and simplifies the next steps of refactoring resolveReferences. This does potentially cause sub-elements of the entire list to be resolved resulting in more work, but I didn't notice a measurable change in performance, and a later patch adds a caching mechanism that covers at least the common case of `var[i]` in a more generic way. Change-Id: I7b59185b855c7368585c329c31e5be38c5749dac Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D43562 llvm-svn: 326059
* [DebugInfo] Stable sort symbols to remove non-deterministic orderingMandeep Singh Grang2018-02-251-1/+1
| | | | | | | | | | | | | | | | Summary: This fixes failure in DebugInfo/X86/multiple-aranges.ll uncovered by D39245. Reviewers: rafael, echristo, probinson Reviewed By: probinson Subscribers: probinson, llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D39950 llvm-svn: 326056
* [X86] Use SDNode instead of SDPatternOperator. NFCCraig Topper2018-02-251-7/+7
| | | | llvm-svn: 326048
* [TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through ↵Simon Pilgrim2018-02-241-0/+13
| | | | | | ADD/SUB ops llvm-svn: 326044
* [TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through ↵Simon Pilgrim2018-02-241-0/+5
| | | | | | TRUNCATE ops llvm-svn: 326043
* [X86] Allow int_x86_sse2_cvtps2dq and int_x86_avx_cvt_ps2dq_256 to select ↵Craig Topper2018-02-243-7/+11
| | | | | | EVEX encoded instructions. llvm-svn: 326041
* Revert "StructurizeCFG: Test for branch divergence correctly"Adam Nemet2018-02-241-12/+3
| | | | | | | | This reverts commit r325881. Breaks many bots llvm-svn: 326037
* [X86][SSE] combineSubToSubus - support v8i64 handling from SSSE3Simon Pilgrim2018-02-241-1/+1
| | | | | | Our UMIN/UMAX, vector truncation and shuffle combining is good enough to efficiently handle v8i64 with the number of leading zeros that are necessary for PSUBUS. llvm-svn: 326034
* [X86][SSE] combineSubToSubus - support v8i32 handling from SSSE3 (not SSE41)Simon Pilgrim2018-02-241-3/+3
| | | | | | Now that UMIN etc are Legal/Custom for SSE2+, we can efficiently match SUBUS v8i32 cases from SSSE3 which can perform efficient truncation with PSHUFB. llvm-svn: 326033
* [X86][SSE] combineSubToSubus - begun generalizing to work with any type ↵Simon Pilgrim2018-02-241-4/+11
| | | | | | sizes with SplitBinaryOpsAndApply llvm-svn: 326030
* Fix spelling in comment. NFCI.Simon Pilgrim2018-02-241-1/+1
| | | | llvm-svn: 326029
* [Sparc] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-241-0/+2
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: James Y Knight llvm-svn: 326028
OpenPOWER on IntegriCloud