path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message | Author | Age | Files | Lines
...
* [DAGCombine] Fixup SETCC legality checking | Hal Finkel | 2015-08-31 | 1 | -11/+17
    SETCC is one of those special node types for which operation actions (legality, etc.) are keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType.
    The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway.
    The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify:
        (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3
    (which is possible for some combinations of (op1, op2)).
    If the operands of the and|or node are actual setcc nodes, then this is not an issue (because the and|or must share the same type), but the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. Thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type.
    To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however.
    For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization).
    Fixes PR24636.
    llvm-svn: 246507
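The and|or fold described in the message above can be illustrated outside of LLVM. The following is a minimal standalone sketch, not the DAGCombiner code: the `Cond` enum and the helper names are hypothetical stand-ins for ISD condition codes, and the value-type constraint that the commit actually fixes is not modelled here.

```cpp
#include <cassert>

// Hypothetical condition codes for illustration; not LLVM's ISD::CondCode.
enum class Cond { EQ, NE, LT, LE, GT, GE };

// Evaluate a signed integer comparison under the given condition code.
inline bool evalSetCC(int a, int b, Cond cc) {
  switch (cc) {
  case Cond::EQ: return a == b;
  case Cond::NE: return a != b;
  case Cond::LT: return a < b;
  case Cond::LE: return a <= b;
  case Cond::GT: return a > b;
  case Cond::GE: return a >= b;
  }
  return false;
}

// Check over a small range that (setcc a,b,op1) AND (setcc a,b,op2)
// agrees with the single comparison (setcc a,b,op3).
inline bool andFoldsTo(Cond op1, Cond op2, Cond op3) {
  for (int a = -4; a <= 4; ++a)
    for (int b = -4; b <= 4; ++b)
      if ((evalSetCC(a, b, op1) && evalSetCC(a, b, op2)) !=
          evalSetCC(a, b, op3))
        return false;
  return true;
}

// Same check for the OR form.
inline bool orFoldsTo(Cond op1, Cond op2, Cond op3) {
  for (int a = -4; a <= 4; ++a)
    for (int b = -4; b <= 4; ++b)
      if ((evalSetCC(a, b, op1) || evalSetCC(a, b, op2)) !=
          evalSetCC(a, b, op3))
        return false;
  return true;
}
```

For instance, `andFoldsTo(Cond::LE, Cond::GE, Cond::EQ)` holds, while a pair such as (LT, GT) under AND has no matching single comparison in this small enum, which is why the message says the fold is possible only for some combinations.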
* don't set a legal vector type if we know we can't use that type (NFCI) | Sanjay Patel | 2015-08-31 | 1 | -18/+12
    Added benefit: the 'if' logic now matches the text of the comment that describes it.
    llvm-svn: 246506
* generalize helper function of MergeConsecutiveStores to handle vector types (NFCI) | Sanjay Patel | 2015-08-31 | 1 | -14/+21
    This was part of D7208 (r227242), but that commit was reverted because it exposed a bug in AArch64 lowering. I should have that fixed and the rest of the commit reinstated soon.
    llvm-svn: 246493
* [DAGCombine] Use getSetCCResultType utility function | Hal Finkel | 2015-08-31 | 1 | -1/+1
    DAGCombine has a utility wrapper around TLI's getSetCCResultType; use it in the one place in DAGCombine still directly calling the TLI function.
    NFC.
    llvm-svn: 246482
* [DAGCombine] Remove some old dead code for forming SETCC nodes | Hal Finkel | 2015-08-31 | 1 | -45/+0
    This code was dead when it was committed in r23665 (Oct 7, 2005), and before it reaches its 10th anniversary, it really should go. We can always bring it back if we'd like, but it forms more SETCC nodes, and the way we do legality checking on SETCC nodes is wrong in a number of places, and removing this means fewer places to fix.
    NFC.
    llvm-svn: 246466
* Make MergeConsecutiveStores look at other stores on same chain | Matt Arsenault | 2015-08-28 | 1 | -24/+149
    When combiner AA is enabled, look at stores on the same chain. Non-aliasing stores are moved to the same chain, so the existing code fails because it expects to find an adjacent store on a consecutive chain.
    Because of how DAGCombiner tries these store combines, MergeConsecutiveStores doesn't see the correct set of stores on the chain when it visits the other stores. Each store individually has its chain fixed before trying to merge consecutive stores, and then tries to merge stores from that point before the other stores have been processed to have their chains fixed. To fix this, attempt to use FindBetterChain on any possibly neighboring stores in visitSTORE.
    Suppose you have 4 32-bit stores that should be merged into 1 vector store. One store would be visited first, fixing the chain. Because not all of the store chains have yet been fixed, only 2 of the stores are merged. The other 2 stores later have their chains fixed, but because the other stores were already merged, they have different memory types, and merging the two different-sized stores is not supported and would be more difficult to handle.
    llvm-svn: 246307
* [CodeGen] Check FoldConstantArithmetic result before using it. | Ahmed Bougacha | 2015-08-27 | 1 | -2/+3
    Fixes PR24602: r245689 introduced an unguarded use of SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails because of opaque (hoisted) constants.
    llvm-svn: 246217
* Pass function attributes instead of boolean in isIntDivCheap(). | Steve King | 2015-08-25 | 1 | -9/+6
    llvm-svn: 245921
* Add DAG optimisation for FP16_TO_FP | Oliver Stannard | 2015-08-24 | 1 | -0/+17
    The FP16_TO_FP node only uses the bottom 16 bits of its input, so the following pattern can be optimised by removing the AND:
        (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op)
    This is a common pattern for ARM targets when functions have __fp16 arguments, as they are passed as floats (so that they get passed in the correct registers), but then bitcast and truncated to ignore the top 16 bits.
    llvm-svn: 245832
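To see why the AND is redundant, here is a small standalone sketch of a half-to-float decoder. It is an illustration only, assuming IEEE 754 binary16 encoding; `fp16ToFp` is a hypothetical helper name and is not the LLVM node lowering.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Decode an IEEE 754 binary16 value held in the low 16 bits of `bits`.
// The upper 16 bits are discarded up front, so masking the input with
// 0xffff beforehand cannot change the result.
inline float fp16ToFp(uint32_t bits) {
  const uint16_t h = static_cast<uint16_t>(bits); // keep bottom 16 bits only
  const int sign = (h >> 15) & 1;
  const int exp = (h >> 10) & 0x1f;
  const int frac = h & 0x3ff;
  float value;
  if (exp == 0) {
    // Zero or subnormal: frac * 2^-24.
    value = std::ldexp(static_cast<float>(frac), -24);
  } else if (exp == 31) {
    // Infinity or NaN.
    value = frac ? std::numeric_limits<float>::quiet_NaN()
                 : std::numeric_limits<float>::infinity();
  } else {
    // Normal: (1024 + frac) * 2^(exp - 25).
    value = std::ldexp(static_cast<float>(frac + 1024), exp - 25);
  }
  return sign ? -value : value;
}
```

With this decoder, `fp16ToFp(0xABCD3C00u)` and `fp16ToFp(0x3C00u)` decode the same half-precision value, which mirrors the (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op) rewrite.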
* [DAGCombiner] Fold CONCAT_VECTORS of bitcasted EXTRACT_SUBVECTOR | Simon Pilgrim | 2015-08-23 | 1 | -2/+11
    Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from.
    llvm-svn: 245814
* Do not use dyn_cast<> after isa<> | Mehdi Amini | 2015-08-23 | 1 | -1/+1
    Reported by coverity.
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 245799
* [DAGCombiner] Fold together mul and shl when both are by a constant | John Brawn | 2015-08-21 | 1 | -0/+8
    This is intended to improve code generation for GEPs, as the index value is shifted by the element size and in GEPs of multi-dimensional arrays the index of higher dimensions is multiplied by the lower dimension size.
    Differential Revision: http://reviews.llvm.org/D12197
    llvm-svn: 245689
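One way to read this fold is that a shift by a constant followed by a multiply by a constant collapses into a single multiply. The sketch below assumes that reading, uses 32-bit wrapping arithmetic, and uses hypothetical function names; it is not taken from the patch itself.

```cpp
#include <cassert>
#include <cstdint>

// Unfolded form: shift left by c1, then multiply by c2 (c1 must be < 32).
inline uint32_t mulOfShl(uint32_t x, unsigned c1, uint32_t c2) {
  return (x << c1) * c2;
}

// Folded form: one multiply by the precomputed constant (c2 << c1).
// Both forms compute x * c2 * 2^c1 modulo 2^32.
inline uint32_t foldedMul(uint32_t x, unsigned c1, uint32_t c2) {
  return x * (c2 << c1);
}
```

Because both sides are taken modulo 2^32, the two functions agree for every input with c1 below the bit width, including values that overflow.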
* [DAGCombiner] Added SMAX/SMIN/UMAX/UMIN constant folding | Simon Pilgrim | 2015-08-19 | 1 | -0/+29
    We still need to add constant folding of vector comparisons to fold the tests for targets that don't support the respective min/max nodes.
    I needed to update 2011-12-06-AVXVectorExtractCombine to load a vector instead of using a constant vector to prevent it folding.
    Differential Revision: http://reviews.llvm.org/D12118
    llvm-svn: 245503
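As a rough illustration of what folding these four nodes on constants involves, the sketch below evaluates each operation on 32-bit bit patterns. The `MinMaxOp` enum and `foldMinMax` are hypothetical names, and the signed reinterpretation assumes a two's-complement target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical operation tags; not LLVM's ISD opcodes.
enum class MinMaxOp { SMAX, SMIN, UMAX, UMIN };

// Fold a min/max of two 32-bit constants given as raw bit patterns.
// The signed variants reinterpret the same bits as two's-complement values.
inline uint32_t foldMinMax(MinMaxOp op, uint32_t a, uint32_t b) {
  const int32_t sa = static_cast<int32_t>(a);
  const int32_t sb = static_cast<int32_t>(b);
  switch (op) {
  case MinMaxOp::SMAX: return static_cast<uint32_t>(std::max(sa, sb));
  case MinMaxOp::SMIN: return static_cast<uint32_t>(std::min(sa, sb));
  case MinMaxOp::UMAX: return std::max(a, b);
  case MinMaxOp::UMIN: return std::min(a, b);
  }
  return 0;
}
```

The same pair of bit patterns can fold differently depending on signedness: 0xFFFFFFFF is the unsigned maximum but the signed value -1.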
* [DAGCombiner] Fold CONCAT_VECTORS of EXTRACT_SUBVECTOR (or undef) to VECTOR_SHUFFLE. | Simon Pilgrim | 2015-08-19 | 1 | -5/+79
    Check to see if this is a CONCAT_VECTORS of a bunch of EXTRACT_SUBVECTOR operations. If so, and if the EXTRACT_SUBVECTOR vector inputs come from at most two distinct vectors the same size as the result, attempt to turn this into a legal shuffle.
    Differential Revision: http://reviews.llvm.org/D12125
    llvm-svn: 245490
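The mask construction behind this fold can be sketched in isolation. The following assumes the usual two-input shuffle-mask convention (indices below the vector width pick from the first input, indices at or above it pick from the second, and -1 means undef); `SubvecRef` and `buildShuffleMask` are hypothetical names, and legality checking is left out.

```cpp
#include <cassert>
#include <vector>

// One CONCAT_VECTORS operand: which source vector it extracts from
// (0 or 1, or -1 for undef) and the starting element of the extract.
struct SubvecRef {
  int source;
  int start;
};

// Build a two-input shuffle mask equivalent to concatenating the given
// sub-vectors, each `subLen` elements long. Assumes both sources have the
// same width as the result.
inline std::vector<int> buildShuffleMask(const std::vector<SubvecRef> &parts,
                                         int subLen) {
  const int width = static_cast<int>(parts.size()) * subLen;
  std::vector<int> mask;
  for (const SubvecRef &p : parts)
    for (int i = 0; i < subLen; ++i)
      mask.push_back(p.source < 0 ? -1 : p.source * width + p.start + i);
  return mask;
}
```

For example, concatenating the upper half of the first source with the lower half of the second (4-element vectors, 2-element halves) gives the mask {2, 3, 4, 5}.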
* [TLI] Refactor "is integer division cheap" queries. | Michael Kuperstein | 2015-08-19 | 1 | -5/+7
    This removes the isPow2SDivCheap() query, as it is not currently used in any meaningful way. isIntDivCheap() no longer relies on a state variable (as all in-tree targets set it to false), but the interface allows querying based on the type optimization level.
    NFC.
    Differential Revision: http://reviews.llvm.org/D12082
    llvm-svn: 245430
* Fix backward operands in call to isTruncateFree() and improve comments. | Steve King | 2015-08-18 | 1 | -2/+2
    llvm-svn: 245385
* DAGCombiner: Improve DAGCombiner select normalization | Matthias Braun | 2015-08-18 | 1 | -20/+30
    The current code normalizes select(C0, x, select(C1, x, y)) towards select(C0|C1, x, y) if the target prefers that form. This patch adds an additional rule that if the select(C1, x, y) part already exists in the function then we want to normalize into the other direction, because the effects of reusing the existing value are bigger than transforming into the target-preferred form.
    This addresses regressions following r238793, see also:
    http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150727/290272.html
    Differential Revision: http://reviews.llvm.org/D11616
    llvm-svn: 245350
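The two select forms named in the message are equivalent, which is what lets the combiner normalize in either direction. A minimal sketch with hypothetical function names:

```cpp
#include <cassert>

// Nested form: select(C0, x, select(C1, x, y)).
inline int nestedSelect(bool c0, bool c1, int x, int y) {
  return c0 ? x : (c1 ? x : y);
}

// Merged form: select(C0 | C1, x, y).
inline int mergedSelect(bool c0, bool c1, int x, int y) {
  return (c0 || c1) ? x : y;
}
```

The nested form keeps the inner select(C1, x, y) available as a separate value, which is the reuse opportunity the new rule is trying to preserve.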
* DAGCombiner: Optimize SELECTs first before turning them into SELECT_CC | Matthias Braun | 2015-08-18 | 1 | -32/+32
    This is part of http://reviews.llvm.org/D11616 - I just decided to split this up into a separate commit.
    llvm-svn: 245349
* use SDValue bool operator; NFCI | Sanjay Patel | 2015-08-16 | 1 | -4/+3
    llvm-svn: 245181
* [DAGCombiner] Attempt to mask vectors before zero extension instead of after. | Simon Pilgrim | 2015-08-15 | 1 | -14/+28
    For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after an ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors.
        (zext (truncate x)) -> (zext (and x, m))
    Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code.
    This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles.
    Differential Revision: http://reviews.llvm.org/D11764
    llvm-svn: 245160
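The scalar version of this identity is easy to check. The sketch below assumes a 32-bit source truncated to 8 bits and then zero-extended to 64 bits, so the mask m is 0xFF; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// (zext (truncate x)): truncate i32 to i8, then zero-extend to i64.
inline uint64_t zextOfTrunc(uint32_t x) {
  return static_cast<uint64_t>(static_cast<uint8_t>(x));
}

// (zext (and x, m)): mask in the narrower source type, then zero-extend.
inline uint64_t zextOfMasked(uint32_t x) {
  return static_cast<uint64_t>(x & 0xFFu);
}
```

Masking in the source type keeps the constant at the narrower width, which is the benefit the message describes for vectors.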
* PseudoSourceValue: Replace global manager with a manager in a machine function. | Alex Lorenz | 2015-08-11 | 1 | -3/+4
    This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class.
    This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function.
    This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods.
    This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses.
    Reviewers: Akira Hatanaka
    llvm-svn: 244693
* SelectionDAG: Prefer to combine multiplication with less uses for fma | Jingyue Wu | 2015-08-11 | 1 | -0/+8
    Summary: For example:
        s6 = s0*s5;
        s2 = s6*s6 + s6;
        ...
        s4 = s6*s3;
    We notice that it is possible for s2 to be folded to fma (s0, s5, fmul (s6, s6)). This only happens when Aggressive is true; otherwise the hasOneUse() check already prevents folding the multiplication with more uses.
    Test Plan: test/CodeGen/NVPTX/fma-assoc.ll
    Patch by Xuetian Weng
    Reviewers: hfinkel, apazos, jingyue, ohsallen, arsenm
    Subscribers: arsenm, jholewinski, llvm-commits
    Differential Revision: http://reviews.llvm.org/D11855
    llvm-svn: 244649
* wrap OptSize and MinSize attributes for easier and consistent access (NFCI) | Sanjay Patel | 2015-08-04 | 1 | -3/+1
    Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os).
    Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests.
    This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call.
    Differential Revision: http://reviews.llvm.org/D11734
    llvm-svn: 243994
* Remove trailing whitespace. NFCI. | Simon Pilgrim | 2015-08-01 | 1 | -7/+7
    llvm-svn: 243838
* Use SDValue bool check. NFCI. | Simon Pilgrim | 2015-08-01 | 1 | -20/+10
    llvm-svn: 243837
* [DAGCombiner] Convert constant AND masks to shuffle clear masks down to the byte level | Simon Pilgrim | 2015-08-01 | 1 | -21/+64
    The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks.
    This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading.
    There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.)
    Differential Revision: http://reviews.llvm.org/D11518
    llvm-svn: 243831
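The byte-level check can be sketched as follows. This is an illustration under stated assumptions, not the XformToShuffleWithZero code: lanes are 32 bits wide, bytes are numbered little-endian within each lane, index i keeps byte i of the source, index numBytes + i selects the corresponding byte of an all-zero vector, and an empty result signals that no clear mask exists. The name `byteClearMask` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert per-lane AND constants into a byte-level "clear" shuffle mask.
// Returns an empty vector if any byte is neither 0x00 nor 0xFF.
inline std::vector<int> byteClearMask(const std::vector<uint32_t> &lanes) {
  const int numBytes = static_cast<int>(lanes.size()) * 4;
  std::vector<int> mask;
  for (std::size_t lane = 0; lane < lanes.size(); ++lane) {
    for (int byte = 0; byte < 4; ++byte) {
      const uint32_t b = (lanes[lane] >> (8 * byte)) & 0xFFu;
      const int idx = static_cast<int>(lane) * 4 + byte;
      if (b == 0xFFu)
        mask.push_back(idx);            // keep the source byte
      else if (b == 0x00u)
        mask.push_back(numBytes + idx); // take the byte from the zero vector
      else
        return {};                      // partial byte: not a clear mask
    }
  }
  return mask;
}
```

A lane constant such as 0x0000FFFF is neither all-ones nor all-zeros at the lane level, yet it still yields a valid mask at byte granularity, which is the generalisation the patch describes.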
* move DAGCombiner's allowableAlignment() helper function into the TLI | Sanjay Patel | 2015-07-29 | 1 | -36/+33
    Making allowableAlignment() more accessible was suggested as a predecessor patch for D10662, so I've pulled it into TargetLowering. This lets us remove 4 instances of duplicate logic in LegalizeDAG.
    There's a subtle functional change in the implementation: the existing allowableAlignment() code was using getPrefTypeAlignment() when checking alignment with the DataLayout and assumed that was fast. In this implementation, we use getABITypeAlignment() and assume that is fast. See the TODO comment or the discussion in the Phab review for future improvements in this implementation (don't use the data layout at all).
    There are no regression test changes from this difference, and I'm not sure how to expose it via a test.
    I think we actually do want to provide the 'Fast' param when checking this from DAGCombiner::MergeConsecutiveStores(). I.e., we shouldn't merge stores if the new stores are not going to be fast. But that change will require fixing allowsMisalignedMemoryAccess() overrides as noted in D10662.
    Differential Revision: http://reviews.llvm.org/D10905
    llvm-svn: 243549
* ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141) | Sanjay Patel | 2015-07-28 | 1 | -4/+4
    PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list.
    After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list.
    The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor.
    Differential Revision: http://reviews.llvm.org/D11345
    llvm-svn: 243500
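The effect of de-duplicating the user list can be sketched without any LLVM types. In the illustration below, users are plain integer ids and `minUsers` is a hypothetical threshold parameter; the container choice is an assumption and not necessarily what the patch uses.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Decide whether enough distinct users divide by the same divisor to make
// a shared reciprocal worthwhile. A user that appears twice in the use list
// (e.g. x / x) is counted once.
inline bool enoughDistinctUsers(const std::vector<int> &userIds,
                                std::size_t minUsers) {
  const std::set<int> unique(userIds.begin(), userIds.end());
  return unique.size() >= minUsers;
}
```

With a threshold of 2, a use list of {7, 7} no longer qualifies, matching the observation in the message that the test case really has only one user of the divisor.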
* fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold | Sanjay Patel | 2015-07-28 | 1 | -4/+10
    This fix was suggested as part of D11345 and is part of fixing PR24141.
    With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place.
    No other functional change is intended.
    Differential Revision: http://reviews.llvm.org/D11531
    llvm-svn: 243498
* move combineRepeatedFPDivisors logic into a helper function; NFCI | Sanjay Patel | 2015-07-27 | 1 | -42/+57
    llvm-svn: 243293
* [DAGCombiner] Fixed minor typo that was missed in D9097. | Simon Pilgrim | 2015-07-19 | 1 | -2/+2
    We don't bitcast the UNDEFs - that is done in visitVECTOR_SHUFFLE, and the getValueType should come from the operand's SDValue not the SDNode.
    llvm-svn: 242640
* Use SDValue bool check. NFCI. | Simon Pilgrim | 2015-07-19 | 1 | -58/+39
    llvm-svn: 242636
* Only do fmul (fadd x, x), c combine if the fadd only has one use | Matt Arsenault | 2015-07-17 | 1 | -1/+3
    This was increasing the instruction count if the fadd has multiple uses.
    llvm-svn: 242498
* Use more foreach loops in SelectionDAG. NFC | Pete Cooper | 2015-07-14 | 1 | -3/+2
    llvm-svn: 242249
* DAGCombiner: Assume invariant load cannot alias a store | Matt Arsenault | 2015-07-10 | 1 | -0/+9
    The motivation is to allow GatherAllAliases / FindBetterChain to not give up on dependent loads of a pointer from constant memory. This is important for AMDGPU, because most loads are pointers derived from a load of a kernel argument from constant memory.
    llvm-svn: 241948
* fix an invisible bug when combining repeated FP divisors | Sanjay Patel | 2015-07-09 | 1 | -2/+9
    This patch fixes bugs that were exposed by the addition of fast-math-flags in the DAG: r237046 ( http://reviews.llvm.org/rL237046 ):
    1. When replacing a division node, it's not enough to RAUW. We should call CombineTo() to delete dead nodes and combine again.
    2. Because we are changing the DAG, we can't return an empty SDValue after the transform.
    As the code comments say:
        Visitation implementation - Implement dag node combining for different node types. The semantics are as follows:
        Return Value:
          SDValue.getNode() == 0 - No change was made
          SDValue.getNode() == N - N was replaced, is dead and has been handled.
          otherwise              - N should be replaced by the returned Operand.
    The new test case shows no difference with or without this patch, but it will crash if we re-apply r237046 or enable FMF via the current -enable-fmf-dag cl::opt.
    Differential Revision: http://reviews.llvm.org/D9893
    llvm-svn: 241826
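The three-way return convention quoted in the message can be modelled in a few lines. This is a simplified sketch, not DAGCombiner code: `Node`, `VisitOutcome`, and `classifyVisitResult` are hypothetical names, and a raw pointer stands in for SDValue.getNode().

```cpp
#include <cassert>

// Stand-in for an SDNode; only its identity matters here.
struct Node {};

enum class VisitOutcome {
  NoChange,         // returned node is null
  HandledInPlace,   // returned node is N itself: N is dead and handled
  ReplaceWithResult // anything else: N should be replaced by the result
};

// Interpret the node returned by a visit routine for node `n`.
inline VisitOutcome classifyVisitResult(const Node *result, const Node *n) {
  if (result == nullptr)
    return VisitOutcome::NoChange;
  if (result == n)
    return VisitOutcome::HandledInPlace;
  return VisitOutcome::ReplaceWithResult;
}
```

Point 2 of the message follows from this convention: after the DAG has been modified, returning an empty value would be read as "no change was made".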
* Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT user | Mehdi Amini | 2015-07-09 | 1 | -1/+2
    Documentation for this function would be nice, by the way.
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 241807
* Make isLegalAddressingMode() taking DataLayout as an argument | Mehdi Amini | 2015-07-09 | 1 | -1/+2
    Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module.
    Reviewers: echristo
    Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
    Differential Revision: http://reviews.llvm.org/D11040
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 241778
* Make TargetLowering::getShiftAmountTy() taking DataLayout as an argument | Mehdi Amini | 2015-07-09 | 1 | -2/+1
    Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module.
    Reviewers: echristo
    Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
    Differential Revision: http://reviews.llvm.org/D11037
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 241776
* Make TargetLowering::getPointerTy() taking DataLayout as an argument | Mehdi Amini | 2015-07-09 | 1 | -11/+16
    Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module.
    Reviewers: echristo
    Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits
    Differential Revision: http://reviews.llvm.org/D11028
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 241775
* early exits -> less indenting; NFCI | Sanjay Patel | 2015-07-08 | 1 | -23/+22
    llvm-svn: 241716
* Remove IsLittleEndian from TargetLowering and redirect to DataLayout | Mehdi Amini | 2015-07-08 | 1 | -2/+2
    Summary: This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module.
    Reviewers: echristo
    Subscribers: llvm-commits, rafael, yaron.keren
    Differential Revision: http://reviews.llvm.org/D11017
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 241655
* Redirect DataLayout from TargetMachine to Module in SelectionDAG | Mehdi Amini | 2015-07-07 | 1 | -24/+24
    Summary: SelectionDAG itself is not directly invoking the DataLayout in the TargetMachine, but the "TargetLowering" class is still using it. I'll address it in a following commit.
    This change is part of a series of commits dedicated to having a single DataLayout during compilation by always using the one owned by the module.
    Reviewers: echristo
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D11000
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 241618
* Reapply r240291: Fix shl folding in DAG combiner. | Pawel Bylica | 2015-07-02 | 1 | -1/+1
    The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared.
    It has been reverted previously because of some problems with comparing APInt with raw uint64_t. That has been fixed/changed with r241204.
    llvm-svn: 241254
* [DAGCombiner] Fix & simplify constant folding of sext/zext. | Pawel Bylica | 2015-06-29 | 1 | -13/+11
    Summary: This patch fixes the cases of sext/zext constant folding in DAG combiner where constants do not fit 64 bits. The fix simply removes un$
    Test Plan: New regression test included.
    Reviewers: RKSimon
    Reviewed By: RKSimon
    Subscribers: RKSimon, llvm-commits
    Differential Revision: http://reviews.llvm.org/D10607
    llvm-svn: 240991
* [SDAG] Now that we have a way to communicate the exact bit on sdiv use it to simplify sdiv by a constant. | Benjamin Kramer | 2015-06-27 | 1 | -0/+4
    We had a hack in SDAGBuilder in place to work around this but now we can avoid that. Call BuildExactSDIV from BuildSDIV so DAGCombiner can perform this trick automatically.
    The added check in DAGCombiner is necessary to prevent exact sdiv by pow2 from regressing as the target-specific pow2 lowering is not aware of exact bits yet.
    This is mostly covered by existing tests. One side effect is that we get the better lowering for exact vector sdivs now too :)
    llvm-svn: 240891
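An exact division by a constant can be replaced by a multiply, which is the kind of trick this commit makes available automatically. The sketch below is a standalone illustration under stated assumptions (32-bit two's-complement arithmetic, a non-zero divisor, and a dividend that is an exact multiple of the divisor); the function names are hypothetical and this is not the BuildExactSDIV implementation.

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd value modulo 2^32 via Newton iteration.
// For odd d, d * d == 1 (mod 8), so d is its own inverse to 3 bits; each
// step doubles the number of correct low bits (3 -> 6 -> 12 -> 24 -> 48).
inline uint32_t inverseMod2_32(uint32_t d) {
  uint32_t inv = d;
  for (int i = 0; i < 4; ++i)
    inv *= 2u - d * inv;
  return inv;
}

// Compute n / d assuming d != 0 and n is an exact multiple of d.
inline int32_t exactSDiv(int32_t n, int32_t d) {
  // Strip common factors of two; exactness guarantees n stays divisible.
  while (d % 2 == 0) {
    d /= 2;
    n /= 2;
  }
  // d is now odd, so dividing is the same as multiplying by its inverse
  // modulo 2^32.
  const uint32_t q =
      static_cast<uint32_t>(n) * inverseMod2_32(static_cast<uint32_t>(d));
  return static_cast<int32_t>(q);
}
```

The multiply only gives the right answer because the division is known to be exact; for a dividend with a remainder the result would be unrelated to the quotient, which is why the exact flag has to be communicated to the combiner.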
* Convert a bunch of loops to foreach. NFC. | Pete Cooper | 2015-06-26 | 1 | -14/+12
    This uses the new SDNode::op_values() iterator range committed in r240805.
    llvm-svn: 240809
* [DAGCombine] fold (X >>?,exact C1) << C2 --> X << (C2-C1) | Benjamin Kramer | 2015-06-26 | 1 | -0/+16
    Instcombine also does this but many opportunities only become visible after GEPs are lowered.
    llvm-svn: 240787
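The fold in the title relies on the right shift being exact, meaning the low C1 bits of X are zero. A minimal sketch for the logical-shift case with C1 <= C2 < 32, using hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Unfolded form: (X >> C1) << C2.
inline uint32_t shiftPair(uint32_t x, unsigned c1, unsigned c2) {
  return (x >> c1) << c2;
}

// Folded form: X << (C2 - C1). Only equivalent when the low c1 bits of x
// are zero (the shift is exact) and c1 <= c2.
inline uint32_t foldedShift(uint32_t x, unsigned c1, unsigned c2) {
  return x << (c2 - c1);
}
```

With x = 0x0F00, c1 = 8, c2 = 12 both forms give 0xF000. With x = 0x0F01 the right shift is no longer exact and the two forms differ, so the exact flag is what makes the rewrite safe.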
* DAGCombiner: Use pop_back_val() | Matt Arsenault | 2015-06-25 | 1 | -2/+1
    llvm-svn: 240709
* DAGCombiner: Remove redundant check | Matt Arsenault | 2015-06-25 | 1 | -1/+1
    MemIntrinsicSDNode is already a subclass of MemSDNode, so the MemSDNode check is sufficient.
    llvm-svn: 240672