path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [InstCombine] remove extract-of-select vector transform | Sanjay Patel | 2017-09-25 | 1 | -33/+0
  The transform to convert an extract-of-a-select-of-vectors was added at: rL194013
  And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539 ...but not answered AFAICT.
  Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch.
  The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations.
  The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see.
  Differential Revision: https://reviews.llvm.org/D38006
  llvm-svn: 314117
* Remove trailing whitespaces. | Michael Liao | 2017-09-25 | 1 | -41/+41
  llvm-svn: 314115
* [DebugInfo] Sort the SDDbgValue list before assuming it is in IR order | Reid Kleckner | 2017-09-25 | 1 | -9/+18
  Summary: This code iterates the 'Orders' vector in parallel with the DbgValue list, emitting all DBG_VALUEs that occurred between the last IR order insertion point and the next insertion point. This assumes the SDDbgValue list is sorted in IR order, which it usually is. However, it is not sorted when a node with a debug value is replaced with another one. When this happens, TransferDbgValues is called, and the new value is added to the end of the list.
  The problem can be solved by stably sorting the list by IR order.
  Reviewers: aprantl, Ka-Ka
  Reviewed By: aprantl
  Subscribers: MatzeB, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D38197
  llvm-svn: 314114
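  The stable sort described in this commit can be illustrated with a minimal Python sketch. It is not the LLVM code: the `DbgValue` class and its `order` and `name` fields are hypothetical stand-ins for SDDbgValue and its IR order, and Python's `sorted()` plays the role of a stable sort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbgValue:
    order: int  # hypothetical stand-in for the IR order of the value
    name: str   # label used only in this illustration


# The list starts out in IR order; a value transferred from a replaced
# node is appended at the end, which breaks the ordering assumption.
dbg_values = [
    DbgValue(1, "a"),
    DbgValue(4, "b"),
    DbgValue(7, "c"),
    DbgValue(4, "b.transferred"),  # appended out of order
]

# sorted() is stable: entries with the same order keep their relative
# position, so "b" still comes before "b.transferred".
sorted_values = sorted(dbg_values, key=lambda v: v.order)
names = [v.name for v in sorted_values]
print(names)
```

  Stability matters here because two debug values can share an IR order; an unstable sort could reorder them and change which one is emitted first.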
* Use {} instead of make_pair and an iterator for the insertion point, NFC | Reid Kleckner | 2017-09-25 | 1 | -5/+6
  llvm-svn: 314113
* [X86][LLVM] Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF8 stride 4) | Michael Zuckerman | 2017-09-25 | 1 | -1/+46
  This patch expands the support of lowerInterleavedStore to 8 x i8 stride 4. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=4 VF=8) and we plan to include more patterns in the future.
  The patch goal is to optimize the following sequence: at the end of the computation, we have xmm2, xmm0, xmm12 and xmm3 each holding 8 chars:
    c0, c1, ..., c7
    m0, m1, ..., m7
    y0, y1, ..., y7
    k0, k1, ..., k7
  And these need to be transposed/interleaved and stored like so:
    c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 ....
  Reviewers: DavidKreitzer, Farhana, zvi, igorb, guyblank, RKSimon, Ayal
  Differential Revision: https://reviews.llvm.org/D36058
  Change-Id: I3cc5c2ca5d6318901c192a4428493b99ef424c32
  llvm-svn: 314109
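  The store layout this commit targets (four vectors of eight elements, interleaved with stride 4) can be sketched in Python. This only models the resulting element order, not the AVX2 shuffle sequence the patch emits; the `interleave` helper is a hypothetical name.

```python
def interleave(*vectors):
    """Interleave equal-length vectors element by element."""
    lengths = {len(v) for v in vectors}
    if len(lengths) != 1:
        raise ValueError("all vectors must have the same length")
    # zip() groups the i-th element of every vector; flatten the groups.
    return [elem for group in zip(*vectors) for elem in group]


# Four source vectors of 8 elements each (VF=8), as in the commit message.
c = [f"c{i}" for i in range(8)]
m = [f"m{i}" for i in range(8)]
y = [f"y{i}" for i in range(8)]
k = [f"k{i}" for i in range(8)]

# Stride 4: every group of four stored elements holds one c, m, y and k.
stored = interleave(c, m, y, k)
print(" ".join(stored[:8]))
```

  The 32 stored elements begin with c0 m0 y0 k0 c1 m1 y1 k1, which matches the layout shown in the commit message.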
* [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLT/SETGT | Nemanja Ivanovic | 2017-09-25 | 1 | -2/+76
  As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review.
  llvm-svn: 314106
* [AArch64] Add basic support for Qualcomm's Saphira CPU. | Chad Rosier | 2017-09-25 | 4 | -0/+22
  llvm-svn: 314105
* Adding missing feature to goldmont. | Michael Zuckerman | 2017-09-25 | 1 | -1/+2
  Change-Id: I1ddc619169fae6a56308deef8dae5db3da702cf4
  llvm-svn: 314103
* [SLP] Support for horizontal min/max reduction. | Alexey Bataev | 2017-09-25 | 1 | -68/+382
  Summary: SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956.
  Reviewers: spatel, mkuper, hfinkel, RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D27846
  llvm-svn: 314101
* [CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion. | Clement Courbet | 2017-09-25 | 6 | -7/+7
  Summary: Right now there are two functions with the same name, one does the work and the other one returns true if expansion is needed. Rename TargetTransformInfo::expandMemCmp to make it more consistent with other members of TargetTransformInfo. Remove the unused Instruction* parameter.
  Differential Revision: https://reviews.llvm.org/D38165
  llvm-svn: 314096
* [X86] Make IFMA instructions during isel so we can fold broadcast loads. | Craig Topper | 2017-09-24 | 5 | -21/+46
  This required changing the ISD opcode for these instructions to have the commutable operands first and the addend last. This way tablegen can autogenerate the additional patterns for us.
  llvm-svn: 314083
* [X86] Add IFMA instructions to the load folding tables and make them commutable for the multiply operands. | Craig Topper | 2017-09-24 | 2 | -1/+54
  llvm-svn: 314080
* Fix signed/unsigned warning | Simon Pilgrim | 2017-09-24 | 1 | -1/+1
  llvm-svn: 314078
* [X86][SSE] Add support for extending bool vectors bitcasted from scalars | Simon Pilgrim | 2017-09-24 | 1 | -0/+113
  This patch acts as a reverse to combineBitcastvxi1 - bitcasting a scalar integer to a boolean vector and extending it 'in place' to the requested legal type.
  Currently this doesn't handle AVX512 at all - but the current mask register approach is lacking for some cases.
  Differential Revision: https://reviews.llvm.org/D35320
  llvm-svn: 314076
* [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLE/SETGE | Nemanja Ivanovic | 2017-09-24 | 1 | -0/+96
  As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review.
  llvm-svn: 314073
* [AVX-512] Add pattern for selecting masked version of v8i32/v8f32 compare instructions when VLX isn't available. | Craig Topper | 2017-09-24 | 1 | -0/+17
  We use a v16i32/v16f32 compare instead and truncate the result. We already did this for the unmasked version, but were missing the version with 'and'.
  llvm-svn: 314072
* [X86] Make sure we still mark the full register as implicitly defined when we shrink 256/512 bit zeroing xors to 128-bit. | Craig Topper | 2017-09-24 | 1 | -4/+10
  Not sure if anything really cares, but this seems like the right thing to do.
  llvm-svn: 314071
* [AVR] Implement getCmpLibcallReturnType(). | Dylan McKay | 2017-09-24 | 1 | -0/+5
  This fixes the avr-rust issue (#75) with floating-point comparisons generating broken code. By default, LLVM assumes these comparisons return 32-bit values, but ours are 8-bit.
  Patch By Thomas Backman.
  llvm-svn: 314070
* [Verifier] Stop accepting broken DIGlobalVariable(s). | Davide Italiano | 2017-09-24 | 1 | -1/+3
  The code wasn't yelling at the user when there's a reference from a DIGlobalVariableExpression. Thanks to Adrian for the reduced testcase.
  Fixes PR34672.
  llvm-svn: 314069
* [x86] reduce 64-bit mask constant to 32-bits by right shifting | Sanjay Patel | 2017-09-23 | 1 | -13/+14
  This is a follow-up from D38181 (r314023). We have to put 64-bit constants into a register using a separate instruction, so we should try harder to avoid that.
  From what I see, we're not likely to encounter this pattern in the DAG because the upstream setcc combines from this don't (usually?) produce this pattern. If we fix that, then this will become more relevant. Since the cost of handling this case is just loosening the predicate of the existing fold, we might as well do it now.
  llvm-svn: 314064
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULT/SETUGT | Nemanja Ivanovic | 2017-09-23 | 1 | -3/+34
  As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision.
  llvm-svn: 314062
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULE/SETUGE | Nemanja Ivanovic | 2017-09-23 | 2 | -1/+74
  As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision.
  llvm-svn: 314060
* [X86] Move the getInsertVINSERTImmediate and getExtractVEXTRACTImmediate helper functions over to X86ISelDAGToDAG.cpp | Craig Topper | 2017-09-23 | 4 | -70/+22
  Redefine them to call getI8Imm and return that directly.
  llvm-svn: 314059
* [X86] Remove the isVINSERT*Index/isVEXTRACT*Index predicates from isel. | Craig Topper | 2017-09-23 | 3 | -78/+8
  The only insert_subvector/extract_subvector nodes that make it to isel are guaranteed to match.
  llvm-svn: 314058
* [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLT/SETGT | Nemanja Ivanovic | 2017-09-23 | 1 | -0/+101
  As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision.
  llvm-svn: 314055
* [Support] Rename tool_output_file to ToolOutputFile, NFC | Reid Kleckner | 2017-09-23 | 6 | -15/+15
  This class isn't similar to anything from the STL, so it shouldn't use the STL naming conventions.
  llvm-svn: 314050
* [CodeGen] Fix build bots which use old Clang broken in r314046. (NFC) | Eugene Zelenko | 2017-09-22 | 1 | -1/+1
  llvm-svn: 314049
* [CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-09-22 | 10 | -182/+322
  llvm-svn: 314046
* [X86] [MC] fixed non optimal encoding of instruction memory operand (PR24038). | Konstantin Belochapka | 2017-09-22 | 1 | -2/+5
  Fixed suboptimal encoding of instruction memory operand when assembler is used to select 32 bit fixup rather than 8 bit immediate for encoding memory offset value.
  Differential Revision: https://reviews.llvm.org/D38117
  llvm-svn: 314044
* Fix unintended fallthrough detected by GCC warning | Reid Kleckner | 2017-09-22 | 1 | -0/+1
  llvm-svn: 314043
* [InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to equality comparisons when the min/max ranges intersect in a single value. | Craig Topper | 2017-09-22 | 1 | -0/+8
  This is the inverse of what we do for SGT/SLT/UGT/ULT.
  llvm-svn: 314032
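  The reasoning behind this fold can be checked by brute force. The sketch below assumes the SLE case is triggered when the minimum of the left operand equals the maximum of the right operand; `sle_folds_to_eq` and the example ranges are hypothetical, and the real check in foldICmpUsingKnownBits works on known-bits ranges rather than explicit intervals.

```python
from itertools import product


def sle_folds_to_eq(a_range, b_range):
    """True when `a <= b` can only hold if a == b: min(a) == max(b)."""
    a_min, _ = a_range
    _, b_max = b_range
    return a_min == b_max


# Inclusive value ranges that intersect in the single value 5.
a_range = (5, 9)  # a is known to lie in [5, 9]
b_range = (2, 5)  # b is known to lie in [2, 5]
can_fold = sle_folds_to_eq(a_range, b_range)

# Exhaustively compare `a <= b` with `a == b` over both ranges.
mismatches = [
    (a, b)
    for a, b in product(
        range(a_range[0], a_range[1] + 1),
        range(b_range[0], b_range[1] + 1),
    )
    if (a <= b) != (a == b)
]
print(can_fold, mismatches)
```

  Since a >= 5 and b <= 5, `a <= b` forces both to equal 5, so the two predicates agree on every pair and the mismatch list is empty.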
* [PowerPC] Mark P9 scheduling model complete | Stefan Pintilie | 2017-09-22 | 4 | -266/+503
  This patch just adds the missing information to the P9 scheduling model to allow the model to be marked as complete. The model has been verified against P9 documentation. The model was verified with utils/schedcover.py.
  Differential Revision: https://reviews.llvm.org/D35695
  llvm-svn: 314026
* [InstCombine] Add constant splat handling to one of the ICMP_SLT/SGT cases in foldICmpUsingKnownBits. | Craig Topper | 2017-09-22 | 1 | -6/+5
  llvm-svn: 314025
* [x86] shiftRightAlgebraic -> shiftRightArithmetic; NFC | Sanjay Patel | 2017-09-22 | 1 | -2/+2
  x86 re-education camp is in session. The LLVM LangRef agrees with x86 too. The DAG nodes are undocumented and ambiguous as always. :)
  llvm-svn: 314024
* [x86] swap order of srl (and X, C1), C2 when it saves size | Sanjay Patel | 2017-09-22 | 1 | -0/+38
  The (non-)obvious win comes from saving 3 bytes by using the 0x83 'and' opcode variant instead of 0x81. There are also better improvements based on known-bits that allow us to eliminate the mask entirely.
  As noted, this could be extended. There are potentially other wins from always shifting first, but doing that reveals a tangle of problems in other pattern matching. We do this transform generically in instcombine, but we often have icmp IR that doesn't match that pattern, so we must account for this in the backend.
  Differential Revision: https://reviews.llvm.org/D38181
  llvm-svn: 314023
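  The identity behind this swap, `(X & C1) >> C2 == (X >> C2) & (C1 >> C2)` for a logical right shift, can be verified with a short Python sketch. The helper names and the constants 0x7F00 and 8 are illustrative assumptions, and `fits_imm8` only models the sign-extended 8-bit immediate range used by the 0x83 opcode form; the actual predicate in the X86 backend may be more involved.

```python
def fits_imm8(value):
    """A sign-extended 8-bit immediate covers the range -128..127."""
    return -128 <= value <= 127


def and_then_shift(x, mask, shift):
    """Original order: srl (and X, C1), C2."""
    return (x & mask) >> shift


def shift_then_and(x, mask, shift):
    """Swapped order: and (srl X, C2), (C1 >> C2)."""
    return (x >> shift) & (mask >> shift)


mask, shift = 0x7F00, 8

# Both orders agree for every 16-bit input.
all_equal = all(
    and_then_shift(x, mask, shift) == shift_then_and(x, mask, shift)
    for x in range(1 << 16)
)

# The original mask needs a 32-bit immediate; the shifted mask fits in 8 bits.
print(all_equal, fits_imm8(mask), fits_imm8(mask >> shift))
```

  With these constants the mask shrinks from 0x7F00 to 0x7F, which is the kind of case where the shorter immediate encoding saves 3 bytes.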
* [InstCombine] Move the call to isSignBitCheck into getDemandedBitsLHSMask instead of calling it outside and passing its result through a flag. NFCI | Craig Topper | 2017-09-22 | 1 | -15/+8
  The result of the isSignBitCheck isn't used anywhere else and this allows us to share the m_APInt call in the likely case that it isn't a sign bit check.
  llvm-svn: 314018
* [InstCombine] Simplify check for RHS being a splat constant in foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt. | Craig Topper | 2017-09-22 | 1 | -8/+6
  llvm-svn: 314017
* [InstCombine] Make cases for ICMP_UGT/ICMP_ULT use similar formatting since they use similar code. NFC | Craig Topper | 2017-09-22 | 1 | -6/+3
  llvm-svn: 314016
* Move code to a helper function. NFC. | Rafael Espindola | 2017-09-22 | 1 | -7/+13
  Part of a patch by Jake Ehrlich!
  llvm-svn: 314012
* llvm-ar: align the first archive member consistently. | Rafael Espindola | 2017-09-22 | 1 | -3/+5
  Before we were aligning the member after the symbol table to 4 but other members to 8.
  llvm-svn: 314010
* [XRay] support conditional return on PPC. | Tim Shen | 2017-09-22 | 4 | -72/+164
  Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest.
  Reviewers: dberris, echristo
  Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits
  Differential Revision: https://reviews.llvm.org/D38102
  llvm-svn: 314005
* llvm-ar: Don't add an unnecessary alignment in gnu mode. | Rafael Espindola | 2017-09-22 | 1 | -1/+2
  This is mostly for getting stricter testing in preparation for future changes.
  llvm-svn: 314000
* [Falkor] Add falkor CPU to host detection | Balaram Makam | 2017-09-22 | 1 | -0/+1
  This returns "falkor" for Falkor CPU.
  llvm-svn: 313998
* Check vector elements for equivalence in the HexagonVectorLoopCarriedReuse pass | Pranav Bhandarkar | 2017-09-22 | 1 | -0/+15
  If the two instructions being compared for equivalence have corresponding operands that are integer constants, then check their values to determine equivalence.
  Patch by Suyog Sarda!
  llvm-svn: 313993
* [SCEV] Generalize folding of trunc(x)+n*trunc(y) into folding m*trunc(x)+n*trunc(y) | Daniel Neilson | 2017-09-22 | 1 | -6/+17
  Summary: A SCEV such as:
    {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
  can be folded into, simply, {%v2,+,0}. However, the current code in ::getAddExpr() will not try to apply the simplification m*trunc(x)+n*trunc(y) -> trunc(trunc(m)*x+trunc(n)*y) because it only keys off having a non-multiplied trunc as the first term in the simplification.
  This patch generalizes this code to try to do a more generic fold of these trunc expressions.
  Reviewers: sanjoy
  Reviewed By: sanjoy
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D37888
  llvm-svn: 313988
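  The fold in this commit rests on truncation commuting with wrapping multiply and add. The Python sketch below checks that on a few sample values and evaluates the step expression from the commit message; it models i64 and i32 with bit masks, and the function names and sample values are assumptions made for illustration, not SCEV code.

```python
WIDE_MASK = (1 << 64) - 1    # models i64 wraparound
NARROW_MASK = (1 << 32) - 1  # models i32 wraparound


def trunc(value):
    """Truncate an i64 value to i32 by keeping the low 32 bits."""
    return value & NARROW_MASK


def unfolded(m, x, n, y):
    """m*trunc(x) + n*trunc(y), computed with i32 wraparound."""
    return (m * trunc(x) + n * trunc(y)) & NARROW_MASK


def folded(m, x, n, y):
    """trunc(m*x + n*y), with the multiply-add done in i64 first."""
    return trunc((m * x + n * y) & WIDE_MASK)


samples = [0, 1, 0x7FFFFFFF, 0x1_0000_0001, 0xDEAD_BEEF_CAFE_F00D, WIDE_MASK]

# The two forms agree for every combination of sample operands.
folds_agree = all(
    unfolded(m, x, n, y) == folded(m, x, n, y)
    for m in (-1, 1, 3)
    for n in (-1, 2)
    for x in samples
    for y in samples
)

# Step from the commit message: (-1 * trunc(-1 * %v1)) + (-1 * trunc(%v1)).
steps = [unfolded(-1, (-v1) & WIDE_MASK, -1, v1) for v1 in samples]
print(folds_agree, steps)
```

  Every step evaluates to zero, which is why the add recurrence in the commit message can be folded to {%v2,+,0}.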
* [X86] Combining CMOVs with [ANY,SIGN,ZERO]_EXTEND for cases where CMOV has constant arguments | Alexander Ivchenko | 2017-09-22 | 1 | -0/+47
  Combine CMOV[i16]<-[SIGN,ZERO,ANY]_EXTEND to [i32,i64] into CMOV[i32,i64]. One example of where it is useful is:
  before (20 bytes)
    <foo>:
      test $0x1,%dil
      mov $0x307e,%ax
      mov $0xffff,%cx
      cmovne %ax,%cx
      movzwl %cx,%eax
      retq
  after (18 bytes)
    <foo>:
      test $0x1,%dil
      mov $0x307e,%ecx
      mov $0xffff,%eax
      cmovne %ecx,%eax
      retq
  Reviewers: craig.topper, aaboud, spatel, RKSimon, zvi
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D36711
  llvm-svn: 313982
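  The combine is sound because extending the result of a select gives the same value as selecting between the extended operands. A minimal Python sketch of that equivalence follows, using the two constants from the example above; the `zext`, `sext` and `cmov` helpers are hypothetical models of the operations, not LLVM APIs.

```python
def zext(value, from_bits):
    """Zero-extend: keep the low `from_bits` bits as an unsigned value."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits):
    """Sign-extend: interpret the low `from_bits` bits as two's complement."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    return (value ^ sign) - sign


def cmov(cond, if_true, if_false):
    """Model of a conditional move between two values."""
    return if_true if cond else if_false


a, b = 0x307E, 0xFFFF  # the i16 constants from the example

# Extending after the cmov matches a cmov of the extended constants.
equivalent = all(
    ext(cmov(cond, a, b), 16) == cmov(cond, ext(a, 16), ext(b, 16))
    for ext in (zext, sext)
    for cond in (False, True)
)
print(equivalent, sext(b, 16), zext(b, 16))
```

  Because both operands are constants, the extended values can be materialized directly in the wider registers, which removes the separate movzwl in the "after" sequence.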
* Rework loop predication pass | Artur Pilipenko | 2017-09-22 | 1 | -38/+220
  We've found a serious issue with the current implementation of loop predication. The current implementation relies on SCEV and this turned out to be problematic. To fix the problem we had to rework the pass substantially. We have had the reworked implementation in our downstream tree for a while. This is the initial patch of the series of changes to upstream the new implementation.
  For now the transformation is limited to the following case:
    * The loop has a single latch with either ult or slt icmp condition.
    * The step of the IV used in the latch condition is 1.
    * The IV of the latch condition is the same as the post increment IV of the guard condition.
    * The guard condition is ult.
  See the review or the LoopPredication.cpp header for the details about the problem and the new implementation.
  Reviewed By: sanjoy, mkazantsev
  Differential Revision: https://reviews.llvm.org/D37569
  llvm-svn: 313981
* Remove the default clause from a fully-covering switch | Nemanja Ivanovic | 2017-09-22 | 1 | -4/+10
  to appease bots that use a compiler that warns about this and use -Werror.
  llvm-svn: 313980
* [ARM] Fix assembly and disassembly for VMRS/VMSR | Andre Vieira | 2017-09-22 | 3 | -35/+71
  Reviewed by: t.p.northover
  Differential Revision: https://reviews.llvm.org/D36306
  llvm-svn: 313979
* Recommit r310809 with a fix for the spill problem | Nemanja Ivanovic | 2017-09-22 | 1 | -51/+150
  This patch re-commits the patch that was pulled out due to a problem it caused, but with a fix for the problem. The fix was reviewed separately by Eric Christopher and Hal Finkel.
  Differential Revision: https://reviews.llvm.org/D38054
  llvm-svn: 313978