path: root/llvm/lib/CodeGen
...
* PEI: Add default handling of spills to registers
  Matt Arsenault, 2019-06-26, 1 file changed (-9/+22)
  llvm-svn: 364472
* [CodeGen] Improve formatting of jump tables (NFC)
  Evandro Menezes, 2019-06-26, 1 file changed (-1/+3)
  Split jump tables into individual lines and fix spacing.
  llvm-svn: 364436
* [X86] X86DAGToDAGISel::matchBitExtract(): pattern b: truncation awareness
  Roman Lebedev, 2019-06-26, 1 file changed (-0/+6)
  Summary: (Not so) boringly identical to pattern a (D62786). Not yet sure how to deal with the last pattern c.
  Reviewers: RKSimon, craig.topper, spatel
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62793
  llvm-svn: 364418
* Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵Clement Courbet2019-06-264-0/+886
| | | | | | | | | | | | | | opt pipeline." Breaks sanitizers: libFuzzer :: cxxstring.test libFuzzer :: memcmp.test libFuzzer :: recommended-dictionary.test libFuzzer :: strcmp.test libFuzzer :: value-profile-mem.test libFuzzer :: value-profile-strcmp.test llvm-svn: 364416
* [HardwareLoops] NFC - move the check for loops with irreducible control flow into HardwareLoopInfo
  Chen Zheng, 2019-06-26, 1 file changed (-1/+4)
  llvm-svn: 364415
* [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
  Clement Courbet, 2019-06-26, 4 files changed (-886/+0)
  This allows later passes (in particular InstCombine) to optimize more cases. One that's important to us is `memcmp(p, q, constant) < 0` and `memcmp(p, q, constant) > 0`.
  llvm-svn: 364412
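  A minimal illustration of the pattern in question (the function and buffer size here are invented for this sketch):

      #include <cstring>

      // With ExpandMemCmp running in the opt pipeline, the expanded
      // constant-size compare feeds InstCombine, which can now fold the
      // sign test instead of leaving an opaque libcall.
      bool lessThan16(const char *p, const char *q) {
        return std::memcmp(p, q, 16) < 0;
      }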
* [DAGCombine] visitEXTRACT_SUBVECTOR - add TODO for extract_subvector(bitcast()) support
  Simon Pilgrim, 2019-06-26, 1 file changed (-0/+1)
  We support 'big to little' (e.g. extract_subvector(v16i8 bitcast(v2i64))) but not 'little to big' cases (e.g. extract_subvector(v2i64 bitcast(v16i8))).
  llvm-svn: 364405
* [HardwareLoops] NFC - move the check for loops with irreducible control flow into isHardwareLoopProfitable()
  Chen Zheng, 2019-06-26, 1 file changed (-9/+1)
  llvm-svn: 364397
* Teach the DAGCombine to fold this pattern (c1 and c2 are constants).
  QingShan Zhang, 2019-06-26, 1 file changed (-2/+28)
  // fold (sext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
  // fold (zext (select cond, c1, c2)) -> (select cond, zext c1, zext c2)
  // fold (aext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
  If it is an any_extend, sign-extend the operands to keep the signedness of the operands, so that the other combine rules still apply (the any_extend of a constant is otherwise handled as a zero extend). I.e.:
    t1: i8 = select t0, Constant:i8<-1>, Constant:i8<0>
    t2: i64 = any_extend t1
    -->
    t3: i64 = select t0, Constant:i64<-1>, Constant:i64<0>
    -->
    t4: i64 = sign_extend_inreg t3
  Differential Revision: https://reviews.llvm.org/D63318
  llvm-svn: 364382
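  A rough sketch of the shape of such a combine (hand-written here, not the patch's actual code; the helper name and structure are invented):

      #include "llvm/CodeGen/SelectionDAG.h"
      using namespace llvm;

      // Sketch: fold (ext (select cond, C1, C2)) -> (select cond, ext C1, ext C2).
      // For any_extend we extend with SIGN_EXTEND, per the reasoning above.
      static SDValue foldExtOfSelectOfConstants(SDNode *N, SelectionDAG &DAG) {
        unsigned Opc = N->getOpcode();
        if (Opc != ISD::SIGN_EXTEND && Opc != ISD::ZERO_EXTEND &&
            Opc != ISD::ANY_EXTEND)
          return SDValue();
        SDValue Sel = N->getOperand(0);
        if (Sel.getOpcode() != ISD::SELECT ||
            !isa<ConstantSDNode>(Sel.getOperand(1)) ||
            !isa<ConstantSDNode>(Sel.getOperand(2)))
          return SDValue();
        unsigned ExtOpc =
            Opc == ISD::ZERO_EXTEND ? ISD::ZERO_EXTEND : ISD::SIGN_EXTEND;
        EVT VT = N->getValueType(0);
        SDLoc DL(N);
        SDValue E1 = DAG.getNode(ExtOpc, DL, VT, Sel.getOperand(1));
        SDValue E2 = DAG.getNode(ExtOpc, DL, VT, Sel.getOperand(2));
        return DAG.getNode(ISD::SELECT, DL, VT, Sel.getOperand(0), E1, E2);
      }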
* [MachinePipeliner] Fix risky iterator usage R++, --R
  Jinsong Ji, 2019-06-25, 1 file changed (-7/+3)
  When we calculate the MII, we use two loops: one with iterator R++ to check whether we can reserve the resource, then --R to move the iterator back to do the reservation. This is risky, as R++ followed by --R may not point to the same element at all, which can cause a wrong MII.
  Differential Revision: https://reviews.llvm.org/D63536
  llvm-svn: 364353
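  An invented illustration of the safer idiom (not the pipeliner's actual code): rather than advancing with R++ in a probe loop and stepping back with --R to reserve, keep the iterator that passed the check so both phases agree on the element.

      #include <list>

      // Stand-in for a resource table; a positive count means reservable.
      using ResourceList = std::list<int>;

      ResourceList::iterator probeAndReserve(ResourceList &Resources) {
        for (auto R = Resources.begin(), E = Resources.end(); R != E; ++R) {
          if (*R > 0) { // probe: can we reserve this resource?
            --*R;       // reserve through the *same* iterator
            return R;
          }
        }
        return Resources.end();
      }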
* [Peephole] Allow folding loads into instructions w/multiple uses (such as test64rr)
  Philip Reames, 2019-06-25, 2 files changed (-3/+10)
  Peephole opt has a one-use limitation which appears to be accidental. The function being used was incorrectly documented as returning whether the def had one *user*, but instead returned true only when there was one *use*. Add a corresponding hasOneNonDbgUser helper, and adjust peephole-opt to use the appropriate one. All of the actual folding code handles multiple uses within a single instruction; that codepath is well exercised through instruction selection.
  Differential Revision: https://reviews.llvm.org/D63656
  llvm-svn: 364336
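  The use/user distinction, sketched under assumed MachineRegisterInfo APIs (the helper below is illustrative, not the patch's code): an instruction such as test64rr that reads the same vreg in two operands is two *uses* but only one *user*.

      #include "llvm/CodeGen/MachineRegisterInfo.h"

      // Returns true if exactly one (non-debug) instruction reads Reg,
      // even if that instruction reads it through several operands.
      static bool hasOneNonDbgUserSketch(const llvm::MachineRegisterInfo &MRI,
                                         llvm::Register Reg) {
        const llvm::MachineInstr *OnlyUser = nullptr;
        for (const llvm::MachineInstr &UseMI : MRI.use_nodbg_instructions(Reg)) {
          if (OnlyUser && OnlyUser != &UseMI)
            return false;
          OnlyUser = &UseMI;
        }
        return OnlyUser != nullptr;
      }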
* [DAGCombine] combineRepeatedFPDivisors - recognize -1.0 / X as a reciprocal
  Simon Pilgrim, 2019-06-25, 1 file changed (-2/+2)
  Fixes an issue identified by @nemanjai (Nemanja Ivanovic) in D62963 / rL363040: an infinite loop due to GetNegatedExpression fighting combineRepeatedFPDivisors, resulting in fneg(fdiv(x,splat)) -> fneg(fmul(x,1.0/splat)) -> fmul(x,-1.0/splat) -> fmul(x,(-1.0 * 1.0)/splat) -> ...
  llvm-svn: 364326
* [SDAG] expand ctpop != 1
  Sanjay Patel, 2019-06-25, 1 file changed (-11/+11)
  Change the generic ctpop expansion to more efficiently handle a check for a not-a-power-of-two value:
    (ctpop x) != 1 --> (x == 0) || ((x & x-1) != 0)
  This is the inverted-predicate sibling pattern that was added with D63004. This should have been done before I changed IR canonicalization to favor this form with rL364246, so if this requires reverting/changing, the earlier commit may also need to be modified.
  llvm-svn: 364319
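  The equivalence is easy to sanity-check on scalars (stand-alone snippet, not part of the patch):

      #include <bitset>
      #include <cassert>
      #include <cstdint>

      // (ctpop x) != 1  <=>  (x == 0) || ((x & (x - 1)) != 0)
      static bool notExactlyOneBitSet(uint32_t x) {
        return x == 0 || (x & (x - 1)) != 0;
      }

      int main() {
        for (uint32_t x = 0; x < (1u << 16); ++x) {
          bool popNotOne = std::bitset<32>(x).count() != 1;
          assert(notExactlyOneBitSet(x) == popNotOne);
        }
      }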
* [TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support
  Simon Pilgrim, 2019-06-25, 1 file changed (-2/+18)
  Add 'lowest' demanded elt -> bitcast fold to all *_EXTEND_VECTOR_INREG cases.
  Reapplies rL363856.
  llvm-svn: 364311
* [TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG
  Simon Pilgrim, 2019-06-25, 1 file changed (-6/+4)
  Simplify ZERO_EXTEND_VECTOR_INREG if the extended bits are not required. Matches what we already do for ZERO_EXTEND.
  Reapplies rL363850, but now with legality checks added at rL364290.
  llvm-svn: 364303
* [SDAG] improve expansion of ctpop+setcc
  Sanjay Patel, 2019-06-25, 1 file changed (-11/+14)
  This should not cause any visible change in output, but it's more efficient because we were producing non-canonical 'sub x, 1' and 'setcc ugt x, 0'. As mentioned in the TODO, we should also be handling the inverse predicate.
  llvm-svn: 364302
* [TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG
  Simon Pilgrim, 2019-06-25, 1 file changed (-6/+6)
  Simplify SIGN_EXTEND_VECTOR_INREG if the extended bits are not required/known zero. Matches what we already do for SIGN_EXTEND.
  Reapplies rL363802, but now with legality checks added at rL364290.
  llvm-svn: 364299
* [VectorLegalizer] ExpandANY_EXTEND_VECTOR_INREG/ExpandZERO_EXTEND_VECTOR_INREG - widen source vector
  Simon Pilgrim, 2019-06-25, 1 file changed (-0/+26)
  The *_EXTEND_VECTOR_INREG opcodes were relaxed back around rL346784 to support source vector widths that are smaller than the output; it looks like the legalizers were never updated to account for this. This patch inserts the smaller source vector into an undef vector of the same width as the result before performing the shuffle+bitcast, to correctly handle this.
  Part of the yak shaving to solve the crashes from rL364264 and rL364272.
  llvm-svn: 364295
* [TargetLowering] SimplifyDemandedBits - legal checks for SIGN/ZERO_EXTEND -> ZERO/ANY_EXTEND
  Simon Pilgrim, 2019-06-25, 1 file changed (-6/+15)
  As part of the fix for rL364264 + rL364272, limit the *_EXTEND conversion to !TLO.LegalOperations || isOperationLegal cases. We'll improve X86 legality in future commits.
  llvm-svn: 364290
* [Codegen] TargetLowering::SimplifySetCC(): omit urem when possible
  Roman Lebedev, 2019-06-25, 1 file changed (-0/+12)
  Summary: This addresses the regression that is being exposed by D50222 in test/CodeGen/X86/jump_sign.ll. The missing fold, at least partially, looks trivial: https://rise4fun.com/Alive/Zsln
  I.e. if we are comparing the urem-by-non-power-of-two with zero, and the urem operand may have at most a single bit set (or no bits set at all), then the urem is not needed.
  Reviewers: RKSimon, craig.topper, xbolva00, spatel
  Reviewed By: xbolva00, spatel
  Subscribers: xbolva00, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63390
  llvm-svn: 364286
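  The fold can be sanity-checked on scalars (stand-alone snippet; 6 is just one non-power-of-two divisor):

      #include <cassert>
      #include <cstdint>

      int main() {
        const uint32_t C = 6; // non-power-of-two divisor
        // If x has at most one bit set, (x % C == 0) iff (x == 0):
        assert((0u % C == 0) == (0u == 0)); // the no-bits-set case
        for (int k = 0; k < 32; ++k) {
          uint32_t x = 1u << k; // exactly one bit set
          assert((x % C == 0) == (x == 0));
        }
      }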
* [ExpandMemCmp] Move all options to TargetTransformInfo.
  Clement Courbet, 2019-06-25, 1 file changed (-29/+20)
  Split off from D60318.
  llvm-svn: 364281
* Revert r363802, r363850, and r363856 "[TargetLowering] SimplifyDemandedBits..."
  Craig Topper, 2019-06-25, 1 file changed (-26/+20)
  This reverts the following patches:
    "[TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG"
    "[TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG"
    "[TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support"
  We can end up with an any_extend_vector_inreg with a 256-bit result type and a 128-bit source type. This is allowed by the ISD opcode, but the generic operation legalizer is only able to expand cases where the total vector width is the same. The X86 backend creates these mismatched cases for zext_vec_inreg/sext_vec_inreg. The SimplifyDemandedBits changes are allowing those nodes to become aext_vec_inreg. For the zext/sext cases, the X86 backend has Custom handling and never lets them get to the generic legalizer. We need to do the same for aext_vec_inreg.
  llvm-svn: 364264
* [CodeGen] Add missing vector type legalization for ctlz_zero_undef
  Roland Froese, 2019-06-24, 1 file changed (-0/+2)
  Widen vector result type for ctlz_zero_undef and cttz_zero_undef the same as ctlz and cttz.
  Differential Revision: https://reviews.llvm.org/D63463
  llvm-svn: 364221
* GlobalISel: Remove unsigned variant of SrcOp
  Matt Arsenault, 2019-06-24, 5 files changed (-226/+226)
  Force using Register. One downside is the generated register enums require explicit conversion.
  llvm-svn: 364194
* CodeGen: Introduce a class for registers
  Matt Arsenault, 2019-06-24, 13 files changed (-123/+123)
  Avoids using a plain unsigned for registers throughout codegen. This doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg().
  llvm-svn: 364191
* [DAGCombine] visitMUL - allow shift by zero in MulByConstant.
  Simon Pilgrim, 2019-06-24, 1 file changed (-6/+6)
  This can occur under certain circumstances when undefs are created later on in the constant multipliers (e.g. in this case due to SimplifyDemandedVectorElts). It's better to let the shift by zero occur and perform any cleanup afterward.
  Fixes OSS-Fuzz #15429.
  llvm-svn: 364179
* SlotIndexes: delete unused functions
  Fangrui Song, 2019-06-23, 1 file changed (-15/+0)
  llvm-svn: 364154
* SlotIndexes: simplify IdxMBBPair operators
  Fangrui Song, 2019-06-23, 1 file changed (-1/+1)
  llvm-svn: 364152
* [SelectionDAG] Remove the code that attempts to calculate the alignment for the second half of a split masked load/store.
  Craig Topper, 2019-06-23, 2 files changed (-27/+4)
  The code divides the alignment by 2 if the original alignment is equal to the original VT size. But this wouldn't be correct if the alignment was larger than the VT size. The memory operand object already takes care of calling MinAlign on the base alignment and the memory pointer offset, so we don't need any special code at all.
  llvm-svn: 364151
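  For reference, MinAlign (llvm/Support/MathExtras.h) returns the largest power of two dividing both values, so the offset half of the access keeps a correct alignment automatically:

      #include <cstdint>

      // Same definition as llvm::MinAlign: the low set bit of (A | B).
      constexpr uint64_t MinAlign(uint64_t A, uint64_t B) {
        return (A | B) & (1 + ~(A | B));
      }

      // A 64-byte-aligned base accessed at a 16-byte offset is 16-byte
      // aligned, and that holds even when the alignment exceeds the VT size:
      static_assert(MinAlign(64, 16) == 16, "second half stays 16-aligned");
      static_assert(MinAlign(128, 32) == 32, "no halving needed");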
* Make GlobalISel depend on SelectionDAG after D63169
  Fangrui Song, 2019-06-22, 1 file changed (-1/+1)
  GlobalISel/IRTranslator.cpp now references SelectionDAG/FunctionLoweringInfo.cpp. This fixes a link error in -DBUILD_SHARED_LIBS=on builds:
    ld.lld: error: undefined symbol: llvm::FunctionLoweringInfo::clear()
    >>> referenced by IRTranslator.cpp:2198 (../lib/CodeGen/GlobalISel/IRTranslator.cpp:2198)
    >>> lib/CodeGen/GlobalISel/CMakeFiles/LLVMGlobalISel.dir/IRTranslator.cpp.o:(llvm::IRTranslator::finalizeFunction())
  llvm-svn: 364124
* [GlobalISel][IRTranslator] Change switch table translation to generate jump tables and range checks.
  Amara Emerson, 2019-06-21, 2 files changed (-54/+441)
  This change makes use of the newly refactored SwitchLoweringUtils code from SelectionDAG in order to generate jump tables and range checks where appropriate. Much of this code is ported from SDAG with some modifications. We generate G_JUMP_TABLE and G_BRJT instructions when JT opportunities are found. This means that targets which previously relied on the naive one-MBB-per-case-stmt translation will now start falling back until they add support for the new opcodes.
  For range checks, we don't generate any previously unused operations. This just recognizes contiguous ranges of case values and generates a single block per range. Single case value blocks are just a special case of ranges, so we get that support almost for free.
  There are still some optimizations missing that I haven't ported over, and bit-tests are also unimplemented. This patch series is already complex enough. Actual arm64 support for selection of jump tables is coming in a later patch.
  Differential Revision: https://reviews.llvm.org/D63169
  llvm-svn: 364085
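  For context, this is the classic source shape that benefits (illustrative example, not from the patch): a dense switch that previously produced one MBB per case can now lower to a single G_JUMP_TABLE/G_BRJT sequence.

      // Dense, contiguous cases: a natural jump-table candidate.
      int classify(int v) {
        switch (v) {
        case 0: return 10;
        case 1: return 12;
        case 2: return 17;
        case 3: return 3;
        case 4: return 8;
        case 5: return 1;
        default: return -1;
        }
      }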
* [DAGCombine] narrowExtractedVectorBinOp - pull out repeated getOpcode(). NFCI.
  Simon Pilgrim, 2019-06-21, 1 file changed (-2/+2)
  llvm-svn: 364076
* [DAGCombine] narrowInsertExtractVectorBinOp - reuse "extract from insert" detection code.
  Simon Pilgrim, 2019-06-21, 1 file changed (-11/+15)
  Move the "extract from insert" detection code into a lambda helper function.
  llvm-svn: 364059
* Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC
  Fangrui Song, 2019-06-21, 2 files changed (-13/+8)
  llvm-svn: 364006
* [GlobalISel][Localizer] Allow localization of G_INTTOPTR and chains of instructions.
  Amara Emerson, 2019-06-21, 1 file changed (-14/+15)
  G_INTTOPTR can prevent the localizer from moving G_CONSTANTs, but since it's essentially a side-effect-free cast instruction, we can remat both instructions. This patch changes the localizer to enable localization of such chains by iterating over the entry block instructions in reverse order. That way, uses will be localized first, and then the defs are free to be localized as well.
  This also changes the previous SmallPtrSet of localized instructions to use a SetVector instead: we're dealing with pointers and need deterministic iteration order.
  Overall, this change improves ARM64 -O0 CTMark code size by around 0.7% geomean.
  Differential Revision: https://reviews.llvm.org/D63630
  llvm-svn: 364001
* [DAGCombiner] Use getAPIntValue() instead of getZExtValue() where possible.
  Simon Pilgrim, 2019-06-20, 1 file changed (-21/+20)
  Better handling of out-of-i64-range values due to large integer types or from fuzz tests.
  llvm-svn: 363955
* [DAGCombiner][NFC] Remove unused var
  Jordan Rupprecht, 2019-06-20, 1 file changed (-1/+0)
  llvm-svn: 363954
* Store a pointer to the return value in a static alloca and let the debugger use that as the variable address for NRVO variables.
  Amy Huang, 2019-06-20, 1 file changed (-2/+12)
  Subscribers: hiraditya, cfe-commits, llvm-commits
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D63361
  llvm-svn: 363952
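  An illustrative NRVO case (invented example): 's' is constructed directly in the caller's return slot, so without this change the debugger has no local storage to point at for the variable.

      #include <string>

      std::string makeGreeting() {
        std::string s("hello"); // NRVO: built in the caller's return slot
        s += ", world";
        return s; // no copy/move; 's' never had its own local storage
      }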
* [CodeGen] Fix formatting and comments (NFC)
  Evandro Menezes, 2019-06-20, 1 file changed (-6/+8)
  llvm-svn: 363947
* [DAGCombiner] Support (shl (zext (srl x, C)), C) -> (zext (shl (srl x, C), C)) non-uniform folds.
  Simon Pilgrim, 2019-06-20, 1 file changed (-17/+19)
  Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases.
  llvm-svn: 363929
* [DAGCombine] Add TODOs for some combines that should support non-uniform vectors
  Simon Pilgrim, 2019-06-20, 1 file changed (-0/+15)
  We tend to only test for scalar/scalar consts when really we could support non-uniform vectors using ISD::matchUnaryPredicate/matchBinaryPredicate etc., as sketched below.
  llvm-svn: 363924
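  A sketch of the matchUnaryPredicate idiom the TODOs refer to (simplified; the predicate and helper name here are invented): it tests each element of a constant build_vector, so non-uniform vector constants qualify, not just splats.

      #include "llvm/CodeGen/SelectionDAG.h"
      using namespace llvm;

      // True if every constant shift amount in Amt is in range, even when
      // the per-element amounts differ.
      static bool allShiftAmountsInRange(SDValue Amt, unsigned BitWidth) {
        return ISD::matchUnaryPredicate(Amt, [BitWidth](ConstantSDNode *C) {
          return C && C->getAPIntValue().ult(BitWidth);
        });
      }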
* [DAGCombine] Reduce scope of ShAmtVal variable. NFCI.
  Simon Pilgrim, 2019-06-20, 1 file changed (-2/+1)
  Fixes a cppcheck warning. Use the more capable getAPIntValue() instead of getZExtValue() as well, since I'm here.
  llvm-svn: 363921
* [MIPS GlobalISel] Select integer to floating point conversions
  Petar Avramovic, 2019-06-20, 1 file changed (-2/+2)
  Select G_SITOFP and G_UITOFP for MIPS32.
  Differential Revision: https://reviews.llvm.org/D63542
  llvm-svn: 363912
* [MIPS GlobalISel] Select floating point to integer conversions
  Petar Avramovic, 2019-06-20, 1 file changed (-2/+3)
  Select G_FPTOSI and G_FPTOUI for MIPS32.
  Differential Revision: https://reviews.llvm.org/D63541
  llvm-svn: 363911
* [DAGCombine] Use ConstantSDNode::getAPIntValue() instead of getZExtValue().
  Simon Pilgrim, 2019-06-19, 1 file changed (-2/+2)
  Use getAPIntValue() in a few more places. Most of the time getZExtValue() is fine, but occasionally there's fuzzed code, or someone decides to create an i65536 or something...
  llvm-svn: 363887
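  A stand-alone demonstration of the hazard (the width and shift amount are arbitrary):

      #include "llvm/ADT/APInt.h"
      #include <cassert>

      int main() {
        llvm::APInt Big(65536, 1); // a 1 in a 65536-bit integer
        Big <<= 70;                // value no longer fits in 64 bits
        assert(Big.getActiveBits() > 64);
        // Big.getZExtValue() would assert here; APInt queries stay safe:
        assert(Big.countTrailingZeros() == 70);
      }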
* [TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support
  Simon Pilgrim, 2019-06-19, 1 file changed (-11/+12)
  Move the 'lowest' demanded elt -> bitcast fold out of the ZERO_EXTEND_VECTOR_INREG case into the ANY_EXTEND_VECTOR_INREG case.
  llvm-svn: 363856
* [TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG
  Simon Pilgrim, 2019-06-19, 1 file changed (-3/+4)
  Simplify ZERO_EXTEND_VECTOR_INREG if the extended bits are not required. Matches what we already do for ZERO_EXTEND.
  llvm-svn: 363850
* [TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG
  Simon Pilgrim, 2019-06-19, 1 file changed (-6/+10)
  Simplify SIGN_EXTEND_VECTOR_INREG if the extended bits are not required/known zero. Matches what we already do for SIGN_EXTEND.
  llvm-svn: 363802
* [DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, c2)) non-uniform folds.
  Simon Pilgrim, 2019-06-19, 1 file changed (-16/+15)
  Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases.
  llvm-svn: 363793
* [DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> 0 non-uniform folds.
  Simon Pilgrim, 2019-06-19, 2 files changed (-11/+27)
  Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. This requires us to tweak matchBinaryPredicate to allow it to (optionally) handle constants with different type widths.
  llvm-svn: 363792