path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Author, Age, Files, Lines)
...
* [SelectionDAG] Add initial implementation of TargetLowering::SimplifyDemandedVectorElts (Simon Pilgrim, 2018-02-15, 2 files, -91/+238)
| | | | | | | | | | | | This is mainly a move of simplifyShuffleOperands from DAGCombiner::visitVECTOR_SHUFFLE to create a more general purpose TargetLowering::SimplifyDemandedVectorElts implementation. Further features can be moved/added in future patches. Differential Revision: https://reviews.llvm.org/D42896 llvm-svn: 325232
* [SelectionDAG][X86] Fix incorrect offset generated for VMASKMOV (Alexander Ivchenko, 2018-02-14, 2 files, -18/+21)
| | | | | | | | When creating high MachineMemOperand for MSTORE/MLOAD we supply it with the original PointerInfo, while the pointer itself had been incremented. The patch adds the proper offset to the PointerInfo. llvm-svn: 325135
* Adding a width of the GEP index to the Data Layout. (Elena Demikhovsky, 2018-02-14, 2 files, -9/+8)
| | | | | | | | | | | | | | | | | | Make the width of the GEP index, which is used for address calculation, one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional; if not specified, it is equal to the pointer size. Until now, the InstCombiner normalized GEPs and extended the index operand to the pointer width. This works fine if you can convert a pointer to an integer for address calculation, and all registered targets do this. But some ISAs have a very restricted instruction set for pointer calculation. During discussions it was decided to retrieve the GEP index information from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and changed the InstCombiner and some other passes to take the index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with an explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102
* [SelectionDAG] Remove duplicate code from TargetLowering::SimplifySetCC. (Craig Topper, 2018-02-14, 1 file, -4/+0)
| | | | | | This exact code already exists a little further up. llvm-svn: 325101
* [DAGCombiner] Add one use check to fold (not (and x, y)) -> (or (not x), (not y)) (Craig Topper, 2018-02-13, 1 file, -2/+2)
| | | | | | | | | | | | | | | | | | | Summary: If the and has an additional use we shouldn't invert it. That creates an additional instruction. While there add a one use check to the transform above that looked similar. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43225 llvm-svn: 325019
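The fold in this commit is De Morgan's law. A minimal sketch, modelling 8-bit values in Python rather than SelectionDAG nodes (the function names are hypothetical and not part of LLVM), checks that both forms agree:

```python
MASK = 0xFF  # model 8-bit integer values


def not8(x):
    """Bitwise NOT restricted to 8 bits."""
    return ~x & MASK


def not_of_and(x, y):
    """Original form: (not (and x, y)), two operations."""
    return not8(x & y)


def or_of_nots(x, y):
    """Folded form: (or (not x), (not y)), three operations."""
    return not8(x) | not8(y)
```

If the `and` has another user it has to stay alive, so the folded form would add three new operations on top of it instead of one `not`; that is presumably why the one-use check matters.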
* [DAG] fix type of undef returned by getNode() (Sanjay Patel, 2018-02-13, 1 file, -2/+2)
| | | | | | | | The bug has been lying dormant, but apparently was never exposed, until after rL324941 because we didn't return the correct result for shifts with undef operands. llvm-svn: 325010
* [DAG] make binops with undef operands consistent with IR (Sanjay Patel, 2018-02-12, 1 file, -20/+7)
| | | | | | | | | | | | | | | | | | | | | This started by noticing that scalar and vector types were producing different results with div ops in PR36305: https://bugs.llvm.org/show_bug.cgi?id=36305 ...but the problem is bigger. I couldn't keep it straight without a table, so I'm attaching that as a PDF to the review. The x86 tests in undef-ops.ll correspond to that table. Green means that instsimplify and the DAG agree on the result for all types. Red means the DAG was returning undef when IR was not. Yellow means the DAG was returning a non-undef result when IR returned undef. This patch assumes that we're currently doing the right thing in IR. Note: I couldn't find any problems with lowering vector constants as the code comments were warning, but those comments were written long ago in rL36413 . Differential Revision: https://reviews.llvm.org/D43141 llvm-svn: 324941
* [AArch64] Improve v8.1-A code-gen for atomic load-and (Oliver Stannard, 2018-02-12, 4 files, -0/+6)
| | | | | | | | | | | | | | | | | | | | | | Armv8.1-A added an atomic load-clear instruction (which performs bitwise and with the complement of its operand), but not a load-and instruction. Our current code-generation for atomic load-and always inserts an MVN instruction to invert its argument, even if it could be folded into a constant or another instruction. This adds lowering early in selection DAG to convert a load-and operation into an xor with -1 and a load-clear, allowing the normal DAG optimisations to work on it. To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't see any easy way to do this with an AArch64-specific ISD node, because the code-generation for atomic operations assumes the SDNodes are of type AtomicSDNode. I've left the old tablegen patterns in because they are still needed for global isel. Differential revision: https://reviews.llvm.org/D42478 llvm-svn: 324908
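The rewrite described here relies on the identity x & m == x & ~(m ^ -1). A small Python model, assuming a 32-bit memory cell held in a dict (all names are illustrative, not LLVM or AArch64 APIs, and no real atomicity is modelled):

```python
MASK = 0xFFFFFFFF  # model 32-bit memory cells


def atomic_load_clr(memory, addr, operand):
    """Model of load-clear: store old & ~operand, return the old value."""
    old = memory[addr]
    memory[addr] = old & ~operand & MASK
    return old


def atomic_load_and(memory, addr, operand):
    """load-and expressed as load-clear of (operand xor -1)."""
    return atomic_load_clr(memory, addr, operand ^ MASK)
```

When the operand is a constant, the xor with -1 folds away at compile time, which is the case the commit message says previously still cost an MVN.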
* [TargetLowering] try to create -1 constant operand for math ops via demanded bits (Sanjay Patel, 2018-02-11, 1 file, -0/+21)
| | | | | | | | | | | | | | | | | | This reverses instcombine's demanded bits' transform which always tries to clear bits in constants. As noted in PR35792 and shown in the test diffs: https://bugs.llvm.org/show_bug.cgi?id=35792 ...we can do better in codegen by trying to form -1. The x86 sub test shows a missed opportunity. I did investigate changing instcombine's behavior, but it would be more work to change canonicalization in IR. Clearing bits / shrinking constants can allow killing instructions, so we'd have to figure out how to not regress those cases. Differential Revision: https://reviews.llvm.org/D42986 llvm-svn: 324839
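To see why forming -1 is safe, note that the low n bits of a sum depend only on the low n bits of its operands. The sketch below is a simplified Python model of that idea for 32-bit adds, assuming only the low `demanded_low_bits` bits of the result are used; `prefer_all_ones` is a hypothetical helper, not the LLVM implementation:

```python
WIDTH = 32
ALL_ONES = (1 << WIDTH) - 1


def add32(x, c):
    """32-bit wrapping add."""
    return (x + c) & ALL_ONES


def prefer_all_ones(constant, demanded_low_bits):
    """Return -1 (all ones) if every demanded low bit of the constant is set.

    The high bits are not demanded, so setting them cannot change any bit
    the user of the add actually reads.
    """
    low_mask = (1 << demanded_low_bits) - 1
    if (constant & low_mask) == low_mask:
        return ALL_ONES
    return constant
```

For example, if only the low 8 bits are demanded, adding 0xFF and adding -1 are interchangeable, and -1 is often cheaper to encode or matches a decrement pattern.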
* [X86][SSE] Enable SMIN/SMAX/UMIN/UMAX custom lowering for all legal types (Simon Pilgrim, 2018-02-11, 1 file, -0/+19)
| | | | | | | | This allows us to recognise more saturation patterns and also simplify some MINMAX codegen that was failing to combine CMPGE comparisons to a legal CMPGT. Differential Revision: https://reviews.llvm.org/D43014 llvm-svn: 324837
* [SelectionDAG] Remove TargetLowering::getConstTrueVal. Use SelectionDAG::getBoolConstant in the one place it was used. (Craig Topper, 2018-02-11, 2 files, -12/+3)
| | | | | | | | SelectionDAG::getBoolConstant was recently introduced. At the time I didn't know getConstTrueVal existed, but I think getBoolConstant is better as it will use the source VT to make sure it can properly detect floating point if it is configured differently. llvm-svn: 324832
* [DAG] Make early exit hasPredecessorHelper return true. NFCI. (Nirav Dave, 2018-02-10, 1 file, -3/+0)
| | | | | | | All uses conservatively assume in early exit case that it will be a predecessor. Changing default removes checking code in all uses. llvm-svn: 324797
* [SelectionDAG] Provide adequate register class for RegisterSDNode (Stefan Maksimovic, 2018-02-09, 1 file, -1/+16)
| | | | | | | | | | When adding operands to machine instructions in case of RegisterSDNodes, generate a COPY node in case the register class does not match the one in the instruction definition. Differential Revision: https://reviews.llvm.org/D35561 llvm-svn: 324733
* Revert "WIP: [DAGCombiner] Assert that debug info is preserved" (Vedant Kumar, 2018-02-08, 1 file, -31/+4)
| | | | | | This reverts commit r324648. It was committed accidentally. llvm-svn: 324650
* WIP: [DAGCombiner] Assert that debug info is preserved (Vedant Kumar, 2018-02-08, 1 file, -4/+31)
| | | | llvm-svn: 324648
* [SelectionDAG] Add a helper function for creating a boolean constant based on the target's boolean content (Craig Topper, 2018-02-08, 2 files, -82/+82)
| | | | | | | | | | | | | Many places in SimplifySetCC and FoldSetCC try to create true or false constants. Some of them query getBooleanContents to figure out whether to use all ones or just 1 for true. But many places do not check and just use 1 without ensuring the VT has an i1 scalar type. Not sure if those places only trigger before type legalization so they only see an i1 type? To clean up the inconsistency and reduce some duplicated code, this patch adds a getBoolConstant method to SelectionDAG that takes care of querying getBooleanContents and doing the right thing. Differential Revision: https://reviews.llvm.org/D43037 llvm-svn: 324634
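The choice the helper centralizes can be sketched as follows. This is a Python model only: the enum and function names are hypothetical stand-ins for the target's boolean-content setting, and treating the undefined case like zero-or-one is an assumption rather than something the commit message states:

```python
from enum import Enum


class BooleanContent(Enum):
    UNDEFINED = 0             # only bit 0 is meaningful
    ZERO_OR_ONE = 1           # true is 1
    ZERO_OR_NEGATIVE_ONE = 2  # true is all ones


def bool_constant(value, bit_width, content):
    """Pick the integer that represents `value` for a type of `bit_width` bits."""
    if not value:
        return 0
    if content is BooleanContent.ZERO_OR_NEGATIVE_ONE:
        return (1 << bit_width) - 1
    return 1  # assumption: the undefined case is treated like zero-or-one
```

Using 1 unconditionally, as the message says some call sites did, would be wrong for a non-i1 type on a target whose true value is all ones.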
* [DAGCombiner] Fix a couple mistakes from r324311 by really passing the original load to ExtendSetCCUses. (Craig Topper, 2018-02-08, 1 file, -2/+4)
| | | | | | | | | | We're passing the binary op that uses the load instead of the load. Noticed by inspection. Not sure how to test this because this just prevents the introduction of an extend that will later be truncated and will probably be combined out. llvm-svn: 324568
* [DAGCombiner] Don't create truncate nodes in (aext (zextload x)) -> (zextload x) and similar folds. NFCI (Craig Topper, 2018-02-08, 1 file, -15/+5)
| | | | | | | | The truncate is being used to replace other users of the load, but we checked that the load only has one use so there are no other uses to replace. llvm-svn: 324567
* [DAGCombiner] Avoid creating truncate nodes in (zext (and (load)))->(and (zextload)) fold until we know for sure we're going to need it. NFCI (Craig Topper, 2018-02-08, 1 file, -22/+25)
| | | | | | | | The truncate is only needed if the load has additional users. It used to get passed to extendSetCCUses so was created early, but that's no longer the case. llvm-svn: 324562
* [DAGCombiner] Rename variable to be slightly better. NFC (Craig Topper, 2018-02-08, 1 file, -21/+21)
| | | | | | We were calling a load LN0 but it came from N0.getOperand(0) so it's really more like LN00 if we follow the name used in other places. llvm-svn: 324561
* [SelectionDAG] More aggressively prune nodes in AddChains. NFCI. (Nirav Dave, 2018-02-07, 1 file, -1/+3)
| | | | | | | | Traveling all chain paths to the first non-TokenFactor node can be exponential work. Add a simple redundancy check to avoid this. Fixes PR36264. llvm-svn: 324491
* [LegalizeDAG] Truncate condition operand of ISD::SELECT (Eugene Leviant, 2018-02-07, 1 file, -1/+6)
| | | | | | Differential revision: https://reviews.llvm.org/D42737 llvm-svn: 324447
* [DAGCombiner][AMDGPU][X86] Turn cttz/ctlz into cttz_zero_undef/ctlz_zero_undef if we can prove the input is never zero (Craig Topper, 2018-02-06, 1 file, -0/+14)
| | | | | | | | | | | | X86 currently has a late DAG combine after cttz/ctlz are turned into BSR+BSF+CMOV to detect this and remove the CMOV. But we should be able to do this much earlier and avoid creating the cmov altogether. For the changed AMDGPU test case it appears that previously the i8 cttz was type legalized to i16 which introduced an OR with 256 in order to limit the result to 8 on the widened type. At this point the result is known to never be zero, but nothing checked that. Then operation legalization is told to promote all i16 cttz to i32. This introduces an extend and a truncate and another OR with 65536 to limit the result to 16. With the DAG combiner change we are able to prevent the creation of the second OR since the opcode will have been changed to cttz_zero_undef after the first OR. I think the lack of the OR caused the instruction to change to v_ffbl_b32_sdwa. Differential Revision: https://reviews.llvm.org/D42985 llvm-svn: 324427
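The i8-to-i16 widening step in the AMDGPU case can be modelled in a few lines. This is a Python sketch with hypothetical helper names; raising on zero stands in for the "undef" result of the zero_undef variant:

```python
def cttz_zero_undef(x):
    """Count trailing zeros; the result is undefined for 0 (modelled as an error)."""
    if x == 0:
        raise ValueError("cttz_zero_undef is undefined for zero")
    return (x & -x).bit_length() - 1


def cttz(x, width):
    """Plain cttz: defined for zero, where it returns the bit width."""
    return width if x == 0 else cttz_zero_undef(x)


def cttz_i8_via_i16(x):
    """Widen an i8 cttz to i16: OR in bit 8 so the result is capped at 8."""
    widened = (x & 0xFF) | 0x100  # never zero after the OR
    return cttz_zero_undef(widened)
```

Because the widened value can never be zero, the zero-handling of plain `cttz` is dead, which is the fact the combine now exploits.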
* Add SelectionDAGDumper support for strict FP nodes (Andrew Kaylor, 2018-02-06, 1 file, -0/+20)
| | | | | | Patch by Kevin P. Neal llvm-svn: 324416
* [TargetLowering] use local variable to reduce duplication; NFCI (Sanjay Patel, 2018-02-06, 1 file, -52/+32)
| | | | llvm-svn: 324401
* [TargetLowering] use local variables to reduce duplication; NFCI (Sanjay Patel, 2018-02-06, 1 file, -6/+6)
| | | | llvm-svn: 324397
* [DAG, X86] Improve Dependency analysis when doing multi-node Instruction Selection (Nirav Dave, 2018-02-06, 1 file, -215/+80)
| | | | | | | | | | | | | | | | | | | | Cleanup cycle/validity checks in ISel (IsLegalToFold, HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full search for cycles / dependencies, pruning the search when the topological property of NodeId allows. As part of this, propagate the NodeId-based cutoffs to narrow hasPredecessorHelper searches. Reviewers: craig.topper, bogner Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D41293 llvm-svn: 324359
* [DAGCombiner] Pass the original load to ExtendSetCCUses not the truncate. (Craig Topper, 2018-02-06, 1 file, -11/+12)
| | | | | | | | | | | | | | | | | | | | | | | Summary: This method is trying to use the truncate node to find which SETCC operand should be replaced directly with the extended load. This used to work correctly because all uses of the original load were replaced by the truncate before this function was called. So this was used to effectively bypass the truncate and find the load under it. All but one of the callers now call this before the truncate has replaced the load so the setcc doesn't yet use the truncate. To account for this we should pass the original load instead. I changed the order of that one caller to make this work there too. I don't have a test case because this is probably hidden by later DAG combines causing the extend and truncate to cancel out. I assume this way is a little more efficient and matches what was originally intended. Reviewers: RKSimon, spatel, niravd Reviewed By: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42878 llvm-svn: 324311
* [SDAG] Legalize all CondCodes by inverting them and/or swapping operands (Krzysztof Parzyszek, 2018-02-05, 1 file, -12/+19)
| | | | | | Differential Revision: https://reviews.llvm.org/D42788 llvm-svn: 324274
* [DAGCombiner] When folding (sext/zext (and/or/xor (sextload/zextload x), cst)) -> (and/or/xor (sextload/zextload x), (sext/zext cst)) make sure we check the legality of the full extended load. (Craig Topper, 2018-02-03, 1 file, -4/+6)
| | | | | | | | | | | | | | | | | | | Summary: If the load is already an extended load we should be using the memory VT for the legality check, not just the VT of the current extension. I don't have a test case, just noticed it while investigating some load extension improvements. Reviewers: RKSimon, spatel, niravd Reviewed By: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42783 llvm-svn: 324181
* [SelectionDAG] Don't use simple VT in generic shuffle code (Simon Pilgrim, 2018-02-03, 1 file, -1/+1)
| | | | | | | | Better to assume that any value type may be commuted, not just MVTs. No test case right now, but discovered while investigating possible shuffle combines. llvm-svn: 324179
* [SelectionDAG] Consider endianness in scalarizeVectorStore(). (Jonas Paulsson, 2018-02-02, 1 file, -2/+5)
| | | | | | | | | | | | When handling vectors with non byte-sized elements, reverse the order of the elements in the built integer if the target is Big-Endian. SystemZ tests updated. Review: Eli Friedman, Ulrich Weigand. https://reviews.llvm.org/D42786 llvm-svn: 324063
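The element ordering described here can be sketched as packing a list of small elements into one integer. This is a Python model; `pack_elements` is a hypothetical name, and placing element 0 in the most significant position on big-endian targets is how the message's "reverse the order" is interpreted here:

```python
def pack_elements(elements, element_bits, big_endian):
    """Pack vector elements into one integer, element 0 lowest on little-endian."""
    count = len(elements)
    element_mask = (1 << element_bits) - 1
    value = 0
    for index, element in enumerate(elements):
        # On big-endian targets the element order inside the integer is reversed.
        position = (count - 1 - index) if big_endian else index
        value |= (element & element_mask) << (position * element_bits)
    return value
```

For a v4i1 vector [1, 0, 0, 0] this gives 0b0001 on little-endian and 0b1000 on big-endian.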
* [SelectionDAG] Add an assert in getNode() for EXTRACT_VECTOR_ELT. (Jonas Paulsson, 2018-02-02, 1 file, -0/+4)
| | | | | | | | When getNode() is called to create an EXTRACT_VECTOR_ELT, assert that the result VT is at least as wide as the vector element type. Review: Eli Friedman llvm-svn: 324061
* [DAGCombiner] When folding (insert_subvector undef, (bitcast (extract_subvector N1, Idx)), Idx) -> (bitcast N1) make sure that N1 has the same total size as the original output (Craig Topper, 2018-02-01, 1 file, -1/+3)
| | | | | | | | | | | | We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits. Fixes PR36199 Differential Revision: https://reviews.llvm.org/D42809 llvm-svn: 324002
* [DAGCombiner] filter out denorm inputs when calculating sqrt estimate (PR34994) (Sanjay Patel, 2018-02-01, 1 file, -10/+25)
| | | | | | | | | | | | | | | | | | | | | | | As shown in the example in PR34994: https://bugs.llvm.org/show_bug.cgi?id=34994 ...we can return a very wrong answer (inf instead of 0.0) for square root when using a reciprocal square root estimate instruction. Here, I've conditionalized the filtering out of denorms based on the function having "denormal-fp-math"="ieee" in its attributes. The other options for this attribute are 'preserve-sign' and 'positive-zero'. So we don't generate this extra code by default with just '-ffast-math' (because then there's no denormal attribute string at all), but it works if you specify '-ffast-math -fdenormal-fp-math=ieee' from clang. As noted in the review, there may be other problems in clang that affect the results depending on platform (Linux x86 at least), but this should allow creating the desired codegen. Differential Revision: https://reviews.llvm.org/D42323 llvm-svn: 323981
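The failure mode can be reproduced with a toy model. The Python sketch below assumes a reciprocal square root estimate that treats denormal single-precision inputs as zero and so returns infinity for them; the function names, and the use of Python doubles to stand in for floats, are illustrative only:

```python
import math

FLT_MIN_NORMAL = 2.0 ** -126  # smallest normal single-precision value


def rsqrt_estimate(x):
    """Model of an estimate instruction that flushes denormal inputs to zero."""
    if abs(x) < FLT_MIN_NORMAL:
        return math.inf
    return 1.0 / math.sqrt(x)


def sqrt_with_zero_check(x):
    """sqrt(x) = x * rsqrt(x), special-casing only an exact zero input."""
    return 0.0 if x == 0.0 else x * rsqrt_estimate(x)


def sqrt_with_denormal_filter(x):
    """Same, but any input below the smallest normal value yields 0.0."""
    return 0.0 if abs(x) < FLT_MIN_NORMAL else x * rsqrt_estimate(x)
```

With x = 2**-130, the zero-check version computes x * inf and returns inf, while the filtered version returns 0.0, matching the "inf instead of 0.0" symptom in the message.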
* [SelectionDAG] Fix UpdateChains handling of TokenFactors (Nirav Dave, 2018-02-01, 1 file, -1/+2)
| | | | | | | | | | | | | | | | | | | Summary: In Instruction Selection UpdateChains replaces all matched Nodes' chain references including interior token factors and deletes them. This may allow nodes which depend on these interior nodes but are not part of the set of matched nodes to be left with a dangling dependence. Avoid this by doing the replacement for matched non-TokenFactor nodes. Fixes PR36164. Reviewers: jonpa, RKSimon, bogner Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D42754 llvm-svn: 323977
* [XRay][compiler-rt+llvm] Update XRay register stashing semantics (Dean Michael Berris, 2018-02-01, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | Summary: This change expands the amount of registers stashed by the entry and `__xray_CustomEvent` trampolines. We've found that since the `__xray_CustomEvent` trampoline calls can show up in situations where the scratch registers are being used, and since we don't typically want to affect the code-gen around the disabled `__xray_customevent(...)` intrinsic calls, that we need to save and restore the state of even the scratch registers in the handling of these custom events. Reviewers: pcc, pelikan, dblaikie, eizan, kpw, echristo, chandlerc Reviewed By: echristo Subscribers: chandlerc, echristo, hiraditya, davide, dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D40894 llvm-svn: 323940
* DAG: Fix not truncating when promoting bswap/bitreverse (Matt Arsenault, 2018-01-31, 1 file, -1/+2)
| | | | | | | These need to convert back to the original type, like any other promotion. llvm-svn: 323932
* [DAG] Prevent NodeId pruning of TokenFactors in Instruction Selection. (Nirav Dave, 2018-01-31, 1 file, -1/+3)
| | | | | | | | | | | | | | | | | | Summary: Instruction Selection preserves relative orders of all nodes save TokenFactors which we treat specially. As a result Node Ids for TokenFactors may violate the topological ordering and should not be considered as valid pruning candidates in predecessor search. Fixes PR35316. Reviewers: RKSimon, hfinkel Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42701 llvm-svn: 323880
* [ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ GPR. (Roger Ferrer Ibanez, 2018-01-31, 1 file, -4/+16)
| | | | | | | | | | | | | | | | | | | | | | In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do copies CPSR ↔ GPR but not all Thumb1 targets implement them. The scheduler can attempt, before attempting a copy, to clone the instructions but it does not currently do that for nodes with input glue. In this patch we introduce a target-hook to let the hook decide if a glued machinenode is still eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS. As a follow-up of this change we should actually implement the copies for the Thumb1 targets that do implement them and restrict the hook to the targets that can't really do such copy as these clones are not ideal. This change fixes PR35836. Differential Revision: https://reviews.llvm.org/D42051 llvm-svn: 323857
* [mips] Fix incorrect sign extension for fpowi libcall (Simon Dardis, 2018-01-30, 1 file, -5/+7)
| | | | | | | | | | | | | | | | | PR36061 showed that during the expansion of ISD::FPOWI, that there was an incorrect zero extension of the integer argument which for MIPS64 would then give incorrect results. Address this with the existing mechanism for correcting sign extensions. This resolves PR36061. Thanks to James Cowgill for reporting the issue! Reviewers: atanasyan, hfinkel Differential Revision: https://reviews.llvm.org/D42537 llvm-svn: 323781
* [SelectionDAG]: Ignore "returned" in the presence of an implicit sret. (Dan Gohman, 2018-01-30, 1 file, -2/+4)
| | | | | | | | | | | | | | | | When a function return value can't be directly lowered, such as returning an i128 on WebAssembly, as indicated by the CanLowerReturn target hook, SelectionDAGBuilder can translate it to return the value through a hidden sret-like argument. If such a function has an argument with the "returned" attribute, the attribute can't be automatically lowered, because the function no longer has a normal return value. For now, just discard the "returned" attribute. This fixes PR36128. llvm-svn: 323715
* [X86] Make foldLogicOfSetCCs work better for vectors pre legal types/operations (Craig Topper, 2018-01-29, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | Summary: There's a check in the code to only check getSetCCResultType after LegalOperations or if the type is MVT::i1. But the i1 check is only allowing scalar types through. I think it should check that the scalar type is MVT::i1 so that it will work for vectors. The changed test already does this combine with AVX512VL where getSetCCResultType returns vXi1. But with avx512f and no VLX getSetCCResultType returns a type matching the width of the input type. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42619 llvm-svn: 323631
* [NFC] fix trivial typos in comments and documents (Hiroshi Inoue, 2018-01-29, 1 file, -1/+1)
| | | | | | "to to" -> "to" llvm-svn: 323628
* [TargetLowering] Teach TargetLowering::SimplifySetCC to simplify setcc of vXi1 vectors into logic ops. (Craig Topper, 2018-01-27, 1 file, -14/+16)
| | | | | | | | This transform was already being done for setcc of scalar i1. This extends it to vectors. llvm-svn: 323585
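On one-bit values every comparison is a small boolean formula, and applying it lane by lane gives the vector case. A Python sketch covering equality and the unsigned orderings (the names and the selection of condition codes are illustrative; signed codes on i1 are left out):

```python
def setcc_i1(cc, x, y):
    """Compare two i1 values (0 or 1) using only logic operations."""
    not_x, not_y = x ^ 1, y ^ 1
    logic = {
        "eq": (x ^ y) ^ 1,  # x == y  ->  not (x xor y)
        "ne": x ^ y,        # x != y  ->  x xor y
        "ult": not_x & y,   # x <u y  ->  x == 0 and y == 1
        "ugt": x & not_y,   # x >u y  ->  x == 1 and y == 0
        "ule": not_x | y,
        "uge": x | not_y,
    }
    return logic[cc]


def setcc_vxi1(cc, xs, ys):
    """Vector form: the same logic applied to each lane."""
    return [setcc_i1(cc, x, y) for x, y in zip(xs, ys)]
```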
* [SelectionDAG] Make DAGTypeLegalizer::PromoteSetCCOperands handle SETEQ/SETNE correctly for vector types. (Craig Topper, 2018-01-27, 1 file, -4/+4)
| | | | | | | | The code was using getValueSizeInBits and combining with the result of a call to DAG.ComputeNumSignBits. But for vector types getValueSizeInBits returns the width of the full vector while ComputeNumSignBits is going to give a number no larger than the width of a single element. So we should be using getScalarValueSizeInBits to get the element width. llvm-svn: 323583
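A worked example of the mismatch, as a Python sketch. The helpers only mimic the two size queries named above, and the "effective bits" formula (width minus sign bits plus one) is an assumption about how the two numbers were combined:

```python
def value_size_in_bits(num_elements, element_bits):
    """Width of the whole vector, e.g. 128 for v4i32."""
    return num_elements * element_bits


def scalar_value_size_in_bits(num_elements, element_bits):
    """Width of a single element, e.g. 32 for v4i32."""
    return element_bits


def effective_bits(size_in_bits, num_sign_bits):
    """Bits needed once redundant sign-bit copies are dropped."""
    return size_in_bits - num_sign_bits + 1
```

For a v4i32 whose elements each have 25 known sign bits, the element width gives 32 - 25 + 1 = 8 effective bits, while the full vector width gives 128 - 25 + 1 = 104, a number that has no meaning per element.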
* [SelectionDAGISel] Add a debug print before call to Select. Adjust where blank lines are printed during isel process to make things more sensibly grouped. (Craig Topper, 2018-01-26, 1 file, -6/+5)
| | | | | | | | | | | | Previously some targets printed their own message at the start of Select to indicate what they were selecting. For the targets that didn't, it means there was no print of the root node before any custom handling in the target executed. So if the target did something custom and never called SelectNodeCommon, no print would be made. For the targets that did print a message in Select, if they didn't custom handle a node SelectNodeCommon would reprint the root node before walking the isel table. It seems better to just print the message before the call to Select so all targets behave the same. And then remove the root node printing from SelectNodeCommon and just leave a message that says we're starting the table search. There were also some oddities in blank line behavior. Usually due to a \n after a call to SelectionDAGNode::dump which already inserted a new line. llvm-svn: 323551
* [DAG] Teach findBaseOffset to interpret indexes of indexed memory operations (Nirav Dave, 2018-01-26, 1 file, -8/+35)
| | | | | | Indexed outputs are addition / subtractions and can be interpreted as such. llvm-svn: 323539
* [DAGCombine] reduceBuildVecToShuffle - ensure EXTRACT_VECTOR_ELT index is in range (Simon Pilgrim, 2018-01-26, 1 file, -1/+5)
| | | | | | | | From OSS Fuzz Test Case #5688 llvm-svn: 323535
* [NFC] fix trivial typos in comments and documents (Hiroshi Inoue, 2018-01-26, 1 file, -1/+1)
| | | | | | "in in" -> "in", "on on" -> "on" etc. llvm-svn: 323508