summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
* [CodeGen] Document and use getConstant's splat-building feature. NFC.Ahmed Bougacha2016-02-152-10/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D17229 llvm-svn: 260901
* Don't combine fp_round (fp_round x) if f80 to f16 is generatedPirama Arumuga Nainar2016-02-131-0/+11
| | | | | | | | | | | | | | | | | | | | Summary: This patch skips DAG combine of fp_round (fp_round x) if it results in an fp_round from f80 to f16. fp_round from f80 to f16 always generates an expensive (and as yet, unimplemented) libcall to __truncxfhf2. This prevents selection of native f16 conversion instructions from f32 or f64. Moreover, the first (value-preserving) fp_round from f80 to either f32 or f64 may become a NOP in platforms like x86. Reviewers: ab Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D17221 llvm-svn: 260769
* [SelectionDAG] change getConstant() to use the input SDLoc when building ↵Sanjay Patel2016-02-111-5/+4
| | | | | | | | | | | | | | | | | | | | | | splat vectors The code change is simple enough: instead of attaching an anonymous SDLoc to splatted vector constants, use the scalar constant's existing SDLoc since that is what is passed into getConstant() as a param. But this changes instruction scheduling, so I'll explain why that happens. The motivation for this patch starts near: http://reviews.llvm.org/rL258833 ...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'. But when I made that change locally, several x86 codegen tests wiggled. It turns out that the lack of SDLoc consistency in getConstant() changes the way ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG scheduler algorithms use IROrder for tie-breaking. Differential Revision: http://reviews.llvm.org/D16972 llvm-svn: 260582
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.Ahmed Bougacha2016-02-094-50/+34
| | | | llvm-svn: 260316
* [SelectionDAG] make getMemBasePlusOffset() accessible; NFCISanjay Patel2016-02-091-12/+9
| | | | | | | | | I reinvented this functionality in http://reviews.llvm.org/D16828 because it was hidden away as a static function. The changes in x86 are not based on a complete audit. I suspect there are other possible uses there, and there are almost certainly more potential users in other targets. llvm-svn: 260295
* [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)Hans Wennborg2016-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | This matches GCC and MSVC's behaviour, and saves on code size. We were already not extending i1 return values on x86_64 after r127766. This takes that patch further by applying it to x86 target as well, and also for i8 and i16. The ABI docs have been unclear about the required behaviour here. The new i386 psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return vales do not need to be extended beyond 8 bits. The x86_64 ABI doc is being updated to say the same [2]. Differential Revision: http://reviews.llvm.org/D16907 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ llvm-svn: 260133
* SelectionDAG: Lower some range metadata to AssertZextMatt Arsenault2016-02-082-3/+45
| | | | | | | | | | If a range has a lower bound of 0, add an AssertZext from the nearest floor power of two. This allows operations with some workitem intrinsics with known maximum ranges to use fast 24-bit multiplies. llvm-svn: 260109
* [StatepointLower] Use None instead of Optional<int>()Sanjoy Das2016-02-051-5/+5
| | | | llvm-svn: 259956
* [Power PC] softening long double typePetar Jovanovic2016-02-041-17/+33
| | | | | | | | | | | This patch implements softening of long double type (ppcf128) on ppc32 architecture and enables operations for this type for soft float. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D15811 llvm-svn: 259791
* rangify; NFCISanjay Patel2016-02-031-159/+129
| | | | llvm-svn: 259722
* [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behaviorTim Shen2016-02-032-6/+9
| | | | | | | | | | | | | | | | | | | | This patch consists of two parts: a performance fix in DAGCombiner.cpp and a correctness fix in SelectionDAG.cpp. The test case tests the bug that's uncovered by the performance fix, and fixed by the correctness fix. The performance fix keeps the containers required by the hasPredecessorHelper (which is a lazy DFS) and reuse them. Since hasPredecessorHelper is called in a loop, the overall efficiency reduced from O(n^2) to O(n), where n is the number of SDNodes. The correctness fix keeps iterating the neighbor list even if it's time to early return. It will return after finishing adding all neighbors to Worklist, so that no neighbors are discarded due to the original early return. llvm-svn: 259691
* Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.Eugene Zelenko2016-02-021-14/+6
| | | | | | Differential revision: http://reviews.llvm.org/D16793 llvm-svn: 259539
* AArch64: Implement missed conditional compare sequences.Balaram Makam2016-02-011-2/+2
| | | | | | | | | | | | | | | | | | Summary: This is an extension to the existing implementation of r242436 which restricts to only select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leafs. This will additionaly handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees. Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16291 llvm-svn: 259387
* [SelectionDAG] Eliminate exponential behavior in WalkChainUsersTim Shen2016-01-311-5/+20
| | | | llvm-svn: 259315
* Avoid overly large SmallPtrSet/SmallSetMatthias Braun2016-01-305-5/+5
| | | | | | | These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. llvm-svn: 259283
* Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith ↵Yaron Keren2016-01-291-2/+2
| | | | | | | | r259192 post commit comment. clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240
* [X86] Don't transform X << 1 to X + X during type legalizationDavid Majnemer2016-01-281-9/+0
| | | | | | | | | | | | | | | | | | | | While legalizing a 64-bit shift left by 1, the following occurs: We split the shift operand in half: a high half and a low half. We then create an ADDC with the low half and a ADDE with the high half + the carry bit from the ADDC. This is problematic if X is any_ext'd because the high half computation is now undef + undef + carry bit and there is no way to ensure that the two undef values had the same bitwise representation. This results in the lowest bit in the high half turning into garbage. Instead, do not try to turn shifts into arithmetic during type legalization. This fixes PR26350. llvm-svn: 259065
* [DAGCombiner] Don't add volatile or indexed stores to ChainedStoresJunmo Park2016-01-281-0/+4
| | | | | | | | | | | | Summary: findBetterNeighborChains does not handle volatile or indexed stores. However, it did not check when adding stores to ChainedStores. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D16463 llvm-svn: 259024
* Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to ↵Benjamin Kramer2016-01-274-17/+15
| | | | | | | | CodeGen/ It's a SelectionDAG thing, not a Target thing. llvm-svn: 258939
* Remove autoconf supportChris Bieneman2016-01-261-13/+0
| | | | | | | | | | | | | | | | Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861
* tidy up; NFCSanjay Patel2016-01-261-9/+9
| | | | llvm-svn: 258838
* fix formatting; NFCSanjay Patel2016-01-261-2/+1
| | | | llvm-svn: 258825
* [SelectionDAG] Use the correct return type for memcpy, memmove, and memset.Dan Gohman2016-01-251-3/+3
| | | | | | | | | | | | | When generating calls to memcpy, memmove, and memset, use void* as the return type rather than void, to match the standard signatures for these functions. This has no practical effect for most targets, since the return values of these calls aren't being used anyway, and most calling conventions tolerate this kind of mismatch. However, this change will help support future optimizations to utilize the return value to avoid holding the argument value live across a call. llvm-svn: 258691
* [SelectionDAG] Generalised the CONCAT_VECTORS creation to support ↵Simon Pilgrim2016-01-231-10/+12
| | | | | | BUILD_VECTOR and UNDEF folding. llvm-svn: 258646
* Tidied up TRUNC combine code. NFC.Simon Pilgrim2016-01-231-9/+5
| | | | | | Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable). llvm-svn: 258644
* Don't check if a list is empty with ilist::size.Benjamin Kramer2016-01-231-1/+1
| | | | | | ilist::size() is O(n) while ilist::empty() is O(1) llvm-svn: 258636
* Remove extra whitespace. NFC.Junmo Park2016-01-231-4/+4
| | | | llvm-svn: 258617
* [SelectionDAG] Fold more offsets into GlobalAddressesDan Gohman2016-01-222-75/+123
| | | | | | | | This reapplies r258296 and r258366, and also fixes an existing bug in SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the offset in a GlobalAddressSDNode, which is uncovered by those patches. llvm-svn: 258482
* [opaque pointer types] [NFC] Add an explicit type argument to ↵Eduard Burtescu2016-01-221-1/+1
| | | | | | | | | | | | ConstantFoldLoadFromConstPtr. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16418 llvm-svn: 258472
* Revert "[SelectionDAG] Fold more offsets into GlobalAddresses"Reid Kleckner2016-01-222-120/+73
| | | | | | | | | | | | | | This reverts r258296 and the follow up r258366. With this change, we miscompiled the following program on Windows: #include <string> #include <iostream> static const char kData[] = "asdf jkl;"; int main() { std::string s(kData + 3, sizeof(kData) - 3); std::cout << s << '\n'; } llvm-svn: 258465
* [SelectionDAG] Fix constant offset folding to avoid commuting ↵Dan Gohman2016-01-201-2/+3
| | | | | | | | | non-commutative operators. This fixes a miscompile in MultiSource/Benchmarks/MiBench/consumer-lame introduced in r258296. llvm-svn: 258366
* [SelectionDAG] Fold more offsets into GlobalAddressesDan Gohman2016-01-202-73/+119
| | | | | | | | | | | | | | | | | | SelectionDAG previously missed opportunities to fold constants into GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to create `(add GA+c, y)`. This isn't often visible on targets such as X86 which effectively reassociate adds in their complex address-mode folding logic, however it is currently visible on WebAssembly since it currently has very simple address mode folding code that doesn't reassociate anything. This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses at the same times that it folds constants together, so that it doesn't miss any opportunities to perform such folding. Differential Revision: http://reviews.llvm.org/D16090 llvm-svn: 258296
* [NFC] Replace several manual GEP loops with gep_type_iterator.Eduard Burtescu2016-01-202-31/+13
| | | | | | | | | | Reviewers: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16335 llvm-svn: 258262
* [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with ↵Eduard Burtescu2016-01-192-2/+15
| | | | | | | | | | | | | | | | | | get{Source,Result}ElementType. Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has. GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16275 llvm-svn: 258145
* Fixed MSVC warning that not all control paths return a value.Simon Pilgrim2016-01-181-0/+1
| | | | llvm-svn: 258099
* TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCCTom Stellard2016-01-181-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When SimplifySetCC sees a setcc node that compares the result of a value extension operation with a constant, it tries to simplify the setcc node by eliminating the extension and shrinking the constant. If shrinking the inputs to setcc is deemed not desirable by the target (e.g. the target does not want a setcc comparing i1 values), then it is still possible to optimize this sequence in some cases. This patch adds the following combines to SimplifySetCC when shrinking setcc inputs is not desirable: (setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc)) (setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc)) There are no tests for this yet, but once AMDGPU correctly implements TargetLowering::isTypeDesirableForOp(), this new combine will be exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test. Reviewers: resistor, arsenm Subscribers: jroelofs, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15034 llvm-svn: 258067
* [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of ↵Manuel Jacob2016-01-172-9/+6
| | | | | | | | | | | | | | going through PointerType::getElementType. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: dsanders, llvm-commits, dblaikie Differential Revision: http://reviews.llvm.org/D16273 llvm-svn: 258023
* [SelectionDAG] CSE nodes with differing SDNodeFlagsDan Gohman2016-01-151-22/+22
| | | | | | | | | | | | In the optimizer (GVN etc.) when eliminating redundant nodes with different flags, the flags are ignored for the purposes of testing for congruence, and then intersected for the purposes of producing a result that supports the union of all the uses. This commit makes SelectionDAG's CSE do the same thing, allowing it to CSE nodes in more cases. This fixes PR26063. Differential Revision: http://reviews.llvm.org/D15957 llvm-svn: 257940
* [WinEH] Rename CatchReturnInst::getParentPad, NFCJoseph Tremoulet2016-01-151-1/+1
| | | | | | | | | | | | | | | | Summary: Rename to getCatchSwitchParentPad, to make it more clear which ancestor the "parent" in question is. Add a comment pointing out the key feature that the returned pad indicates which funclet contains the successor block. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16222 llvm-svn: 257933
* [CodeGen] Don't assume fp_to_fp16 produces i16 when legalizing it.Ahmed Bougacha2016-01-141-1/+1
| | | | | | | | | | | | | | Since r230276, we support an improved legalization for f64->f16, which goes through a temporary f32, improving codegen when f32->f16 is legal but not f64->f16. This requires unsafe-fp-math. However, that legalization assumed that the second step, producing a pseudo-softened f16, had type i16. That's not true on targets with illegal i16, such as ARM. Use the initial f64->f16 result type instead. llvm-svn: 257794
* [X86] Don't alter HasOpaqueSPAdjustment after we've relied on itDavid Majnemer2016-01-141-1/+1
| | | | | | | | | | | | | | We rely on HasOpaqueSPAdjustment not changing after we've calculated things based on it. Things like whether or not we can use 'rep;movs' to copy bytes around, that sort of thing. If it changes, invariants in the backend will quietly break. This situation arose when we had a call to memcpy *and* a COPY of the FLAGS register where we would attempt to reference local variables using %esi, a register that was clobbered by the 'rep;movs'. This fixes PR26124. llvm-svn: 257730
* Fix Release build warning.Philip Reames2016-01-141-0/+1
| | | | | | A value used only in an assert. Again. llvm-svn: 257728
* [GCRoot] Assert preconditions to clarify behaviorPhilip Reames2016-01-141-8/+12
| | | | | | This code isn't reachable if the GFI (GCFunctionInfo*) is null. Clarify this by adding an assert and removing an always taken if. llvm-svn: 257724
* [TLS] New lower emutls pass, fix linkage bugs.Chih-Hung Hsieh2016-01-131-3/+1
| | | | | | | | | | | | | | | | | | | Previous implementation in http://reviews.llvm.org/D10522 created external references to __emutls_v.* variables. Such references are inaccurate and cannot be handled by all linkers, e.g. Android dynamic and gold linkers for aarch64. Now a new LowerEmuTLS pass to go through all global variables, and add emutls_v.* and emutls_t.* variables. These __emutls* variables have the same linkage and visibility as the associated user defined TLS variable. Also removed old code that dump __emutls* variables in AsmPrinter.cpp, and updated TLS unit tests. Differential Revision: http://reviews.llvm.org/D15300 llvm-svn: 257718
* LegalizeDAG: Expand ctlz with ctlz_zero_undef if legalMatt Arsenault2016-01-111-2/+12
| | | | llvm-svn: 257345
* [DAGCombiner] don't dereference an operand that doesn't exist (PR26070)Sanjay Patel2016-01-081-12/+13
| | | | | | | | | | | | | The bug was introduced with changes for x86-64 fp128: http://reviews.llvm.org/rL254653 I don't know why an x86 change is here, so I'll follow up in: http://reviews.llvm.org/D15134 Should fix: https://llvm.org/bugs/show_bug.cgi?id=26070 llvm-svn: 257200
* Test commit access - add a blank line in comment.Tim Shen2016-01-081-0/+1
| | | | llvm-svn: 257192
* Do not ASSERTZEXT for i16 result of bitcast from f16 operandPirama Arumuga Nainar2016-01-081-6/+2
| | | | | | | | | | | | | | | | | | | | | | | Summary: During legalization if i16, do not ASSERTZEXT the result of FP_TO_FP16. Directly return an FP_TO_FP16 node with return type as the promote-to-type of i16. This patch also removes extraneous length check. This legalization should be valid even if integer and float types are of different lengths. This patch breaks a hard-float test for fp16 args. The test is changed to allow a vmov to zero-out the top bits, and also ensure that the return value is in an FP register. Reviewers: ab, jmolloy Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D15438 llvm-svn: 257184
* Undo spurious change made in r256965David Majnemer2016-01-071-2/+1
| | | | llvm-svn: 257028
* [Statepoints] Add test cases around vectors and stablize testPhilip Reames2016-01-071-1/+3
| | | | | | | | | | Unlike my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them. Add a couple of tests which show this working. Also, add a triple to the same test file to hopefully fix a failing bot. It turns out we do han llvm-svn: 257025
OpenPOWER on IntegriCloud