summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [TargetLowering] fix SETCC SETLT folding with FP typesSanjay Patel2017-02-121-9/+13
| | | | | | | | | | | | The bug was introduced with: https://reviews.llvm.org/rL294863 ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924
* [TargetLowering] check for sign-bit comparisons in SimplifyDemandedBitsSanjay Patel2017-02-111-0/+19
| | | | | | | | | | | | | | | | I don't know if anything other than x86 vectors is affected by this change, but this may allow us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises from the fact that blendv* instructions only use the sign-bit when deciding which vector element to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already exists in x86 lowering; this demanded bits change just enables the transform to fire more often. The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, but I've explained the likely steps in this series here: https://llvm.org/bugs/show_bug.cgi?id=11210 Differential Revision: https://reviews.llvm.org/D29687 llvm-svn: 294863
* [TargetLowering] fix formatting and comments for ShrinkDemandedConstant; NFCSanjay Patel2017-02-071-19/+20
| | | | llvm-svn: 294325
* DAG: Fold fneg into compare with constant into the constantMatt Arsenault2017-01-301-0/+10
| | | | | | | | fcmp (fneg x), c, pred -> fcmp x, -c, (swap pred) InstCombine already does this. llvm-svn: 293512
* Add iterator_range<regclass_iterator> to {Target,MC}RegisterInfo, NFCKrzysztof Parzyszek2017-01-251-4/+1
| | | | llvm-svn: 293077
* DAG: Avoid OOB when legalizing vector indexingMatt Arsenault2017-01-101-1/+44
| | | | | | | | | If a vector index is out of bounds, the result is supposed to be undefined but is not undefined behavior. Change the legalization for indexing the vector on the stack so that an out of bounds index does not create an out of bounds memory access. llvm-svn: 291604
* [SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI.Simon Pilgrim2016-12-091-1/+1
| | | | | | Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218
* [SelectionDAG] Add expansion and promotion of [US]MUL_LOHINicolai Haehnle2016-12-081-21/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050
* [TargetLowering] add special-case for demanded bits analysis of 'not'Sanjay Patel2016-12-051-5/+19
| | | | | | | | | | | | | | | | We treat bitwise 'not' as a special operation and try not to reduce its all-ones mask. Presumably, this is because a 'not' may be cheaper than a generic 'xor' or it may get folded into another logic op if the target has those. However, if we can remove a logic instruction by changing the xor's constant mask value, that should always be a win. Note that the IR version of SimplifyDemandedBits() does not treat 'not' as a special-case currently (although that's marked with a FIXME). So if you run this IR through -instcombine, you should get the same end result. I'm hoping to add a different backend transform that will expose this problem though, so I need to solve this first. Differential Revision: https://reviews.llvm.org/D27356 llvm-svn: 288676
* [SelectionDAG] Refactor TargetLowering::expandMUL (NFC)Nicolai Haehnle2016-11-301-50/+32
| | | | | | | | | | | | Summary: Further preparation for the expansion of MUL_LOHI added in D24956. Reviewers: efriedma, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27064 llvm-svn: 288248
* [SelectionDAG] Early-out in TargetLowering::expandMUL (NFC)Nicolai Haehnle2016-11-231-77/+80
| | | | | | | | | | | | Summary: Reduce indentation level; preparation for D24956. Reviewers: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27063 llvm-svn: 287831
* Type legalization for compressstore and expandload intrinsics. Elena Demikhovsky2016-11-231-0/+32
| | | | | | | | Implemented widening (v2f32) and splitting (v16f64). On splitting, I use "popcnt" to calculate memory increment. More type legalization work will come in the next patches. llvm-svn: 287761
* Fix spelling in comment. NFC.Simon Pilgrim2016-11-171-1/+1
| | | | llvm-svn: 287222
* [TargetLowering] Fix undef vector element issue with true/false result handlingSimon Pilgrim2016-11-081-10/+10
| | | | | | | | | | | | | | Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching. The comment said we shouldn't handle constant splat vectors with undef elements. But the the actual code was returning false if the build vector contained no undef elements.... This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs). The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed. Differential Revision: https://reviews.llvm.org/D26031 llvm-svn: 286238
* [DAG] disable nsw/nuw for add/sub/mul when simplifying based on demanded ↵Sanjay Patel2016-10-311-7/+18
| | | | | | | | | | | | | | | | bits (PR30841) This bug was exposed by using nsw/nuw for more aggressive folds in: https://reviews.llvm.org/rL284844 The changes mimic the IR demanded bits logic in InstCombiner::SimplifyDemandedUseBits(), but we can't just flip flag bits in the DAG; we have to create a new node that has the bits cleared. This should fix: https://llvm.org/bugs/show_bug.cgi?id=30841 llvm-svn: 285656
* TargetLowering: Add SimplifyDemandedBits() helper to TargetLoweringOptTom Stellard2016-10-141-2/+55
| | | | | | | | | | | | | | | | | | | | | | | | Summary: The main purpose of this new helper is to enable simplifying operations that have multiple uses. SimplifyDemandedBits does not handle multiple uses currently, and this new function makes it possible to optimize: and v1, v0, 0xffffff mul24 v2, v1, v1 ; Multiply ignoring high 8-bits. To: mul24 v2, v0, v0 Where before this would not be optimized, because v1 has multiple uses. Reviewers: bogner, arsenm Subscribers: nhaehnle, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24964 llvm-svn: 284266
* getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCISanjay Patel2016-09-141-8/+8
| | | | llvm-svn: 281493
* getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCISanjay Patel2016-09-141-7/+4
| | | | llvm-svn: 281490
* getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCISanjay Patel2016-09-141-11/+11
| | | | llvm-svn: 281489
* [SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar typeSimon Pilgrim2016-09-091-2/+2
| | | | | | Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042
* [SelectionDAG] Add BUILD_VECTOR support to computeKnownBits and ↵Simon Pilgrim2016-09-081-0/+27
| | | | | | | | | | | | SimplifyDemandedBits Add the ability to computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element. This is an initial step towards determining the sign bits of a vector (PR29079). Differential Revision: https://reviews.llvm.org/D24253 llvm-svn: 280927
* Replace a few more "fall through" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-171-2/+2
| | | | | | Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970
* [x86] Refactor a PowerPC specific ctlz/srl transformation (NFC).Pierre Gousseau2016-08-161-0/+25
| | | | | | | | Following the discussion on D22038, this refactors a PowerPC specific setcc -> srl(ctlz) transformation so it can be used by other targets. Differential Revision: https://reviews.llvm.org/D23445 llvm-svn: 278799
* Fix typo in lowering for fp128 ueq.Eli Friedman2016-08-151-1/+1
| | | | | | | | Regression from r259791. Differential Revision: https://reviews.llvm.org/D23374 llvm-svn: 278750
* [DAGCombine] Make sext(setcc) combine respect getBooleanContentsMichael Kuperstein2016-08-011-0/+10
| | | | | | | | | | | We used to combine "sext(setcc x, y, cc) -> (select (setcc x, y, cc), -1, 0)" Instead, we should combine to (select (setcc x, y, cc), T, 0) where the value of T is 1 or -1, depending on the type of the setcc, and getBooleanContents() for the type if it is not i1. This fixes PR28504. llvm-svn: 277371
* MachineFunction: Return reference for getFrameInfo(); NFCMatthias Braun2016-07-281-3/+3
| | | | | | | getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017
* AVX-512: Fixed BT instruction selection.Elena Demikhovsky2016-07-191-0/+4
| | | | | | | | | | | The following condition expression ( a >> n) & 1 is converted to "bt a, n" instruction. It works on all intel targets. But on AVX-512 it was broken because the expression is modified to (truncate (a >>n) to i1). I added the new sequence (truncate (a >>n) to i1) to the BT pattern. Differential Revision: https://reviews.llvm.org/D22354 llvm-svn: 275950
* [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, ↵Justin Lebar2016-07-151-70/+48
| | | | | | | | | | | | | | | | | | | | | | | getStore, and friends. Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 llvm-svn: 275592
* Delete unused includes. NFC.Rafael Espindola2016-06-301-1/+0
| | | | llvm-svn: 274225
* Use isPositionIndependent in a few more places.Rafael Espindola2016-06-281-1/+1
| | | | | | | | | I think this converts all the simple cases that really just care about the generated code being position independent or not. The remaining uses are a bit more complicated and are checking things like "is this a library or executable" or "can this symbol be preempted". llvm-svn: 274055
* Move shouldAssumeDSOLocal to Target.Rafael Espindola2016-06-271-3/+1
| | | | | | Should fix the shared library build. llvm-svn: 273958
* Use isPositionIndependent predicate. NFC.Rafael Espindola2016-06-261-1/+1
| | | | llvm-svn: 273830
* Use isPositionIndependent predicate.Rafael Espindola2016-06-261-1/+1
| | | | llvm-svn: 273828
* Refactor a duplicated predicate. NFC.Rafael Espindola2016-06-261-0/+4
| | | | llvm-svn: 273826
* Use shouldAssumeDSOLocal in isOffsetFoldingLegal.Rafael Espindola2016-06-241-9/+14
| | | | | | This makes it slightly more powerful for dynamic-no-pic. llvm-svn: 273704
* [SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCalleeKrzysztof Parzyszek2016-06-221-2/+2
| | | | | | | | | | | The setCallee function will set the number of fixed arguments based on the size of the argument list. The FixedArgs parameter was often explicitly set to 0, leading to a lack of consistent value for non- vararg functions. Differential Revision: http://reviews.llvm.org/D20376 llvm-svn: 273403
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-17/+14
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* [SelectionDAG] rename/move isKnownToBeAPowerOfTwo() from TargetLowering (NFC)Sanjay Patel2016-05-191-32/+1
| | | | | | | | | There are at least 2 places (DAGCombiner, X86ISelLowering) where this could be used instead of ad-hoc and watered down code that is trying to match a power-of-2 pattern. Differential Revision: http://reviews.llvm.org/D20439 llvm-svn: 270073
* [TargetLowering] make helper function for SetCC + and optimizations (NFC)Sanjay Patel2016-05-091-52/+40
| | | | | | | | | | After looking at D19087 again, it occurred to me that we can do better. If we consolidate the valueHasExactlyOneBitSet() transforms, we won't incur extra overhead from calling it a 2nd time, and we can shrink SimplifySetCC() a bit. No functional change intended. Differential Revision: http://reviews.llvm.org/D20050 llvm-svn: 268932
* [x86, BMI] add TLI hook for 'andn' and use it to simplify comparisonsSanjay Patel2016-05-071-0/+49
| | | | | | | | | | | | | | | | | | | | | For the sake of minimalism, this patch is x86 only, but I think that at least PPC, ARM, AArch64, and Sparc probably want to do this too. We might want to generalize the hook and pattern recognition for a target like PPC that has a full assortment of negated logic ops (orc, nand). Note that http://reviews.llvm.org/D18842 will cause this transform to trigger more often. For reference, this relates to: https://llvm.org/bugs/show_bug.cgi?id=27105 https://llvm.org/bugs/show_bug.cgi?id=27202 https://llvm.org/bugs/show_bug.cgi?id=27203 https://llvm.org/bugs/show_bug.cgi?id=27328 Differential Revision: http://reviews.llvm.org/D19087 llvm-svn: 268858
* clean up; NFCISanjay Patel2016-05-041-15/+15
| | | | llvm-svn: 268564
* LegalizeDAG: Move unaligned load/store expansion to TLIMatt Arsenault2016-04-211-0/+290
| | | | | | | | When custom lowered, this is not called if the store is custom lowered. Move it to be a utility function so targets can easily expand unaligned accesses when custom lowering. llvm-svn: 267029
* [NFC] Header cleanupMehdi Amini2016-04-181-1/+0
| | | | | | | | | | | | | | Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595
* TargetLowering: Factor out common code for tail call eligibility checking; NFCMatthias Braun2016-04-141-0/+27
| | | | llvm-svn: 266270
* TargetRegisterInfo: Add getRegAsmName()Tom Stellard2016-04-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The motivation for this new function is to move an invalid assumption about the relationship between the names of register definitions in tablegen files and their assembly names into TargetRegisterInfo, so that we can begin working on fixing this assumption. The current problem is that if you have a register definition in TableGen like: def MYReg0 : Register<"r0", 0>; The function TargetLowering::getRegForInlineAsmConstraint() derives the assembly name from the tablegen name: "MyReg0" rather than the given assembly name "r0". This is working, because on most targets the tablegen name and the assembly names are case insensitive matches for each other (e.g. def EAX : X86Reg<"eax", ...> getRegAsmName() will allow targets to override this default assumption and return the correct assembly name. Reviewers: echristo, hfinkel Subscribers: SamWot, echristo, hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D15614 llvm-svn: 265955
* Swift Calling Convention: add swifterror attribute.Manman Ren2016-04-011-0/+1
| | | | | | | | | | | | A ``swifterror`` attribute can be applied to a function parameter or an AllocaInst. This commit does not include any target-specific change. The target-specific optimization will come as a follow-up patch. Differential Revision: http://reviews.llvm.org/D18092 llvm-svn: 265189
* LegalizeDAG: Don't replace vector store with integer if not legalMatt Arsenault2016-03-301-0/+54
| | | | | | | | | | | For the same reason as the corresponding load change. Note that ExpandStore is completely broken for non-byte sized element vector stores, but preserve the current broken behavior which has tests for it. The behavior should be the same, but now introduces a new typed store that is incorrectly split later rather than doing it directly. llvm-svn: 264928
* LegalizeDAG: Don't replace vector load with integer unless legalMatt Arsenault2016-03-301-0/+42
| | | | | | | | | | | | | On AMDGPU we want to be able to promote i64/f64 loads to v2i32. If the access is unaligned, this would conclude that since i64 is legal, it would convert it back to i64 and there is an endless legalization loop. Extract the logic for scalarizing the load into a new TargetLowering function, where this can also replace the custom function AMDGPU has for this. llvm-svn: 264927
* Swift Calling Convention: add swiftself attribute.Manman Ren2016-03-291-0/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754
* [DAG] use !isUndef() ; NFCISanjay Patel2016-03-141-1/+1
| | | | llvm-svn: 263453
OpenPOWER on IntegriCloud