path: root/llvm/lib/CodeGen/SelectionDAG
Commit message | Author | Age | Files | Lines
* SelectionDAG: Use correct addrspace when lowering memcpy | Matt Arsenault | 2016-02-22 | 1 | -9/+16
  This was causing assertions later from using the wrong pointer size with LDS operations. getOptimalMemOpType should also have address space arguments later. This avoids assertions in existing tests exposed by a future commit.
  llvm-svn: 261580
* ADT: Remove == and != comparisons between ilist iterators and pointers | Duncan P. N. Exon Smith | 2016-02-21 | 1 | -3/+4
  I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here.
  llvm-svn: 261498
* [DAGCombiner] Use getBitcast helper when possible. NFCI. | Simon Pilgrim | 2016-02-20 | 1 | -7/+3
  llvm-svn: 261437
* [StatepointLowering] Minor non-semantic cleanups | Sanjoy Das | 2016-02-19 | 1 | -23/+18
  Use auto, bring file up to coding standards etc.
  llvm-svn: 261358
* [StatepointLowering] Update StatepointMaxSlotsRequired correctly | Sanjoy Das | 2016-02-19 | 1 | -3/+4
  Now that we don't always add an element to AllocatedStackSlots if we don't find a pre-existing unallocated stack slot, bumping StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead bump the statistic near the push_back, to Builder.FuncInfo.StatepointStackSlots.size().
  llvm-svn: 261348
* [StatepointLowering] Fix a mistake in rL261336 | Sanjoy Das | 2016-02-19 | 1 | -4/+5
  The check on MFI->getObjectSize() has to be on the FrameIndex, not on the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests I added in rL261336 didn't catch this.
  llvm-svn: 261347
* [StatepointLowering] Change AllocatedStackSlots to use SmallBitVector | Sanjoy Das | 2016-02-19 | 2 | -13/+15
  NFCI. The key motivation here is that I'd like to use SmallBitVector::all() in a later change. Also, using a bit vector here seemed better in general.
  The only interesting change here is that in the failure case of allocateStackSlot, we no longer (the equivalent of) push_back(true) to AllocatedStackSlots. As far as I can tell, this is fine, since we'd never re-use those slots in the same StatepointLoweringState instance.
  Technically there was no need to change the operator[] type accesses to set() and test(), but I thought it'd be nice to make it obvious that we're using something other than a std::vector like thing.
  llvm-svn: 261337
* [StatepointLowering] Fix bug in allocateStackSlot | Sanjoy Das | 2016-02-19 | 1 | -2/+19
  allocateStackSlot did not consider the size of the value to be spilled before deciding to re-use a spill slot. This was originally okay (since originally we'd only ever spill pointers), but it became not okay when we changed our scheme to directly spill vectors of pointers.
  While this change fixes the bug pointed out, it has two performance caveats:
  - It matches spill slot and spillee size exactly, while in theory we can spill, e.g., an 8 byte pointer into a 16 byte slot. This is slightly complicated to fix since in the stackmaps section, we report the size of the spill slot as the size of the "indirect value"; and if they're no longer equivalent, we'll have to keep track of the (indirect) value size separately from the stack slot size.
  - It will "spuriously run out" of reusable slots, since we now have a second check in the search loop in addition to the availability check (e.g. you had two free scalar slots, and you first ask for a vector slot followed by a scalar slot). I'll fix this in a later commit.
  llvm-svn: 261336
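The exact-size matching described in this commit can be sketched roughly as follows. This is a minimal, hypothetical model (the `SpillSlotPool` type and its members are invented for illustration and are not the LLVM code); it only shows why an 8-byte value will not land in a free 16-byte slot.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a spill-slot pool; not the LLVM implementation.
struct SpillSlotPool {
  std::vector<unsigned> SlotSize;  // size in bytes of each known slot
  std::vector<bool> InUse;         // whether the slot is taken right now

  // Reuse a free slot only if its size equals Size exactly; otherwise
  // create a new slot. Returns the index of the chosen slot.
  std::size_t allocate(unsigned Size) {
    for (std::size_t I = 0; I < SlotSize.size(); ++I) {
      if (!InUse[I] && SlotSize[I] == Size) {
        InUse[I] = true;
        return I;
      }
    }
    SlotSize.push_back(Size);
    InUse.push_back(true);
    return SlotSize.size() - 1;
  }

  // Mark every slot free again, e.g. before lowering the next statepoint.
  void releaseAll() { InUse.assign(InUse.size(), false); }
};
```

With two free 8-byte slots in the pool, a request for 16 bytes creates a third slot instead of reusing either, which is the first caveat the message mentions.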
* [StatepointLowering] Clean up allocateStackSlot | Sanjoy Das | 2016-02-19 | 1 | -35/+22
  This removes the unusual loop structure in allocateStackSlot in favor of something more straightforward. I've also removed the cautionary comment in the function, which I suspect is historical cruft now, and confuses more than it enlightens.
  llvm-svn: 261335
* LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs | Matthias Braun | 2016-02-19 | 1 | -5/+31
  llvm-svn: 261306
* Remove uses of builtin comma operator. | Richard Trieu | 2016-02-18 | 2 | -20/+30
  Cleanup for upcoming Clang warning -Wcomma. No functionality change intended.
  llvm-svn: 261270
* Revert r261070, it caused PR26652 / PR26653. | Nico Weber | 2016-02-17 | 1 | -126/+0
  llvm-svn: 261127
* Detect vector reduction operations just before instruction selection. | Cong Hou | 2016-02-17 | 1 | -0/+126
  This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stays unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine.
  To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories:
  1. Reduction with another vector.
  2. Reduction on all elements.
  Category 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions.
  Differential revision: http://reviews.llvm.org/D15250
  llvm-svn: 261070
* [CodeGen] Document and use getConstant's splat-building feature. NFC. | Ahmed Bougacha | 2016-02-15 | 2 | -10/+4
  Differential Revision: http://reviews.llvm.org/D17229
  llvm-svn: 260901
* Don't combine fp_round (fp_round x) if f80 to f16 is generated | Pirama Arumuga Nainar | 2016-02-13 | 1 | -0/+11
  Summary: This patch skips DAG combine of fp_round (fp_round x) if it results in an fp_round from f80 to f16. fp_round from f80 to f16 always generates an expensive (and as yet, unimplemented) libcall to __truncxfhf2. This prevents selection of native f16 conversion instructions from f32 or f64. Moreover, the first (value-preserving) fp_round from f80 to either f32 or f64 may become a NOP in platforms like x86.
  Reviewers: ab
  Subscribers: srhines, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17221
  llvm-svn: 260769
* [SelectionDAG] change getConstant() to use the input SDLoc when building splat vectors | Sanjay Patel | 2016-02-11 | 1 | -5/+4
  The code change is simple enough: instead of attaching an anonymous SDLoc to splatted vector constants, use the scalar constant's existing SDLoc since that is what is passed into getConstant() as a param. But this changes instruction scheduling, so I'll explain why that happens.
  The motivation for this patch starts near: http://reviews.llvm.org/rL258833 ...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'. But when I made that change locally, several x86 codegen tests wiggled. It turns out that the lack of SDLoc consistency in getConstant() changes the way ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG scheduler algorithms use IROrder for tie-breaking.
  Differential Revision: http://reviews.llvm.org/D16972
  llvm-svn: 260582
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. | Ahmed Bougacha | 2016-02-09 | 4 | -50/+34
  llvm-svn: 260316
* [SelectionDAG] make getMemBasePlusOffset() accessible; NFCI | Sanjay Patel | 2016-02-09 | 1 | -12/+9
  I reinvented this functionality in http://reviews.llvm.org/D16828 because it was hidden away as a static function. The changes in x86 are not based on a complete audit. I suspect there are other possible uses there, and there are almost certainly more potential users in other targets.
  llvm-svn: 260295
* [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532) | Hans Wennborg | 2016-02-08 | 1 | -1/+1
  This matches GCC and MSVC's behaviour, and saves on code size.
  We were already not extending i1 return values on x86_64 after r127766. This takes that patch further by applying it to x86 target as well, and also for i8 and i16.
  The ABI docs have been unclear about the required behaviour here. The new i386 psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return values do not need to be extended beyond 8 bits. The x86_64 ABI doc is being updated to say the same [2].
  Differential Revision: http://reviews.llvm.org/D16907
  [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf
  [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ
  llvm-svn: 260133
* SelectionDAG: Lower some range metadata to AssertZext | Matt Arsenault | 2016-02-08 | 2 | -3/+45
  If a range has a lower bound of 0, add an AssertZext from the nearest floor power of two. This allows operations with some workitem intrinsics with known maximum ranges to use fast 24-bit multiplies.
  llvm-svn: 260109
* [StatepointLower] Use None instead of Optional<int>() | Sanjoy Das | 2016-02-05 | 1 | -5/+5
  llvm-svn: 259956
* [Power PC] softening long double type | Petar Jovanovic | 2016-02-04 | 1 | -17/+33
  This patch implements softening of long double type (ppcf128) on ppc32 architecture and enables operations for this type for soft float.
  Patch by Strahinja Petrovic.
  Differential Revision: http://reviews.llvm.org/D15811
  llvm-svn: 259791
* rangify; NFCI | Sanjay Patel | 2016-02-03 | 1 | -159/+129
  llvm-svn: 259722
* [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior | Tim Shen | 2016-02-03 | 2 | -6/+9
  This patch consists of two parts: a performance fix in DAGCombiner.cpp and a correctness fix in SelectionDAG.cpp. The test case tests the bug that's uncovered by the performance fix, and fixed by the correctness fix.
  The performance fix keeps the containers required by the hasPredecessorHelper (which is a lazy DFS) and reuses them. Since hasPredecessorHelper is called in a loop, the overall complexity is reduced from O(n^2) to O(n), where n is the number of SDNodes.
  The correctness fix keeps iterating over the neighbor list even when it is time to return early. It returns only after adding all neighbors to Worklist, so that no neighbors are discarded due to the original early return.
  llvm-svn: 259691
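The two fixes can be illustrated with a small resumable search. This is a sketch under stated assumptions: `Node` and `hasPredecessor` are hypothetical stand-ins, not the SelectionDAG types or the real hasPredecessorHelper signature. The caller owns `Visited` and `Worklist` and passes them to every query, so work done by one query is reused by the next.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a DAG node; Operands are its predecessors.
struct Node {
  std::vector<const Node *> Operands;
};

// Returns true if Target is reachable through operand edges from the
// nodes originally seeded into Visited and Worklist. Both containers
// persist across calls, so the search resumes instead of restarting.
bool hasPredecessor(const Node *Target,
                    std::unordered_set<const Node *> &Visited,
                    std::vector<const Node *> &Worklist) {
  if (Visited.count(Target))
    return true;  // already discovered by an earlier query
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    bool Found = false;
    // Visit every operand before returning, so no neighbor is dropped
    // from the shared state by an early exit.
    for (const Node *Op : N->Operands) {
      if (Visited.insert(Op).second)
        Worklist.push_back(Op);
      if (Op == Target)
        Found = true;
    }
    if (Found)
      return true;
  }
  return false;
}
```

If the loop returned as soon as it saw `Target`, the remaining operands of `N` would never reach `Worklist`, and a later query that reuses the containers could wrongly report them as unreachable.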
* Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes. | Eugene Zelenko | 2016-02-02 | 1 | -14/+6
  Differential revision: http://reviews.llvm.org/D16793
  llvm-svn: 259539
* AArch64: Implement missed conditional compare sequences. | Balaram Makam | 2016-02-01 | 1 | -2/+2
  Summary: This is an extension to the existing implementation of r242436, which is restricted to select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leaves. This will additionally handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees.
  Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB
  Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry
  Differential Revision: http://reviews.llvm.org/D16291
  llvm-svn: 259387
* [SelectionDAG] Eliminate exponential behavior in WalkChainUsers | Tim Shen | 2016-01-31 | 1 | -5/+20
  llvm-svn: 259315
* Avoid overly large SmallPtrSet/SmallSet | Matthias Braun | 2016-01-30 | 5 | -5/+5
  These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32.
  llvm-svn: 259283
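To see why a large inline capacity hurts, here is a toy set with the same general shape: linear scan while small, hashed lookup once it grows. `TinySet` is a hypothetical illustration written for this note and does not reproduce how SmallPtrSet or SmallSet are implemented.

```cpp
#include <array>
#include <cstddef>
#include <unordered_set>

// Hypothetical small-size-optimized set. While it holds at most N
// elements, they live in an inline array and every lookup is a linear
// scan, so the cost of "small mode" grows with N.
template <typename T, std::size_t N>
class TinySet {
  std::array<T, N> Small{};
  std::size_t NumSmall = 0;
  std::unordered_set<T> Large;  // used only after outgrowing Small

public:
  bool isSmall() const { return Large.empty(); }

  bool contains(const T &V) const {
    if (!isSmall())
      return Large.count(V) != 0;
    for (std::size_t I = 0; I < NumSmall; ++I)  // O(N) scan
      if (Small[I] == V)
        return true;
    return false;
  }

  // Returns true if V was newly inserted.
  bool insert(const T &V) {
    if (contains(V))
      return false;
    if (isSmall() && NumSmall < N) {
      Small[NumSmall++] = V;
      return true;
    }
    if (isSmall())  // first overflow: migrate inline elements
      Large.insert(Small.begin(), Small.begin() + NumSmall);
    Large.insert(V);
    return true;
  }
};
```

With a small `N` the scan touches only a few elements; with `N` in the hundreds every `insert` and `contains` in small mode pays for a scan of up to `N` elements.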
* Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. | Yaron Keren | 2016-01-29 | 1 | -2/+2
  clang part in r259232, this is the LLVM part of the patch.
  llvm-svn: 259240
* [X86] Don't transform X << 1 to X + X during type legalization | David Majnemer | 2016-01-28 | 1 | -9/+0
  While legalizing a 64-bit shift left by 1, the following occurs: We split the shift operand in half: a high half and a low half. We then create an ADDC with the low half and an ADDE with the high half + the carry bit from the ADDC.
  This is problematic if X is any_ext'd because the high half computation is now undef + undef + carry bit and there is no way to ensure that the two undef values had the same bitwise representation. This results in the lowest bit in the high half turning into garbage.
  Instead, do not try to turn shifts into arithmetic during type legalization.
  This fixes PR26350.
  llvm-svn: 259065
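The failure mode can be reproduced in ordinary integer arithmetic. In this sketch the function name and the `HiA`/`HiB` parameters are hypothetical; passing two different values for the high half models two reads of an undef value that happen to disagree.

```cpp
#include <cstdint>

struct Halves {
  std::uint32_t Lo;
  std::uint32_t Hi;
};

// Computes a 64-bit "X << 1" as X + X on 32-bit halves, the way an
// ADDC/ADDE pair would. HiA and HiB are the two reads of the high half;
// for a well-defined X they are equal.
Halves shlByOneAsAdd(std::uint32_t Lo, std::uint32_t HiA, std::uint32_t HiB) {
  std::uint32_t ResLo = Lo + Lo;               // ADDC: low half
  std::uint32_t Carry = ResLo < Lo ? 1u : 0u;  // carry out of the low add
  std::uint32_t ResHi = HiA + HiB + Carry;     // ADDE: high half + carry
  return {ResLo, ResHi};
}
```

For a real shift, bit 0 of the high result always equals bit 31 of the low input, whatever the high half contains. With `Lo = 0x80000001` that bit should be 1, and `shlByOneAsAdd(0x80000001, 1, 1)` gives `Hi == 3` as expected, but `shlByOneAsAdd(0x80000001, 0, 1)` gives `Hi == 2`, so the bit carried over from the low half is lost.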
* [DAGCombiner] Don't add volatile or indexed stores to ChainedStores | Junmo Park | 2016-01-28 | 1 | -0/+4
  Summary: findBetterNeighborChains does not handle volatile or indexed stores. However, it did not check when adding stores to ChainedStores.
  Reviewers: arsenm
  Differential Revision: http://reviews.llvm.org/D16463
  llvm-svn: 259024
* Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/ | Benjamin Kramer | 2016-01-27 | 4 | -17/+15
  It's a SelectionDAG thing, not a Target thing.
  llvm-svn: 258939
* Remove autoconf support | Chris Bieneman | 2016-01-26 | 1 | -13/+0
  Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
  "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi
  Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark
  Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16471
  llvm-svn: 258861
* tidy up; NFC | Sanjay Patel | 2016-01-26 | 1 | -9/+9
  llvm-svn: 258838
* fix formatting; NFC | Sanjay Patel | 2016-01-26 | 1 | -2/+1
  llvm-svn: 258825
* [SelectionDAG] Use the correct return type for memcpy, memmove, and memset. | Dan Gohman | 2016-01-25 | 1 | -3/+3
  When generating calls to memcpy, memmove, and memset, use void* as the return type rather than void, to match the standard signatures for these functions. This has no practical effect for most targets, since the return values of these calls aren't being used anyway, and most calling conventions tolerate this kind of mismatch. However, this change will help support future optimizations to utilize the return value to avoid holding the argument value live across a call.
  llvm-svn: 258691
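At the source level, the property such an optimization would rely on is that the standard `memcpy` returns its destination pointer. The helper below is a hypothetical example written for this note, not code from the patch; it continues working through the returned pointer rather than through `Dst`.

```cpp
#include <cstddef>
#include <cstring>

// Copies N bytes from Src and then NUL-terminates the result.
// The caller must provide at least N + 1 bytes at Dst.
char *copyAndTerminate(char *Dst, const char *Src, std::size_t N) {
  // std::memcpy returns Dst, so later uses can go through the return
  // value and Dst itself does not need to stay live across the call.
  char *Out = static_cast<char *>(std::memcpy(Dst, Src, N));
  Out[N] = '\0';
  return Out;
}
```

Declaring the generated call as returning `void*` is what would let a backend treat the returned register as a copy of the first argument in this way.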
* [SelectionDAG] Generalised the CONCAT_VECTORS creation to support BUILD_VECTOR and UNDEF folding. | Simon Pilgrim | 2016-01-23 | 1 | -10/+12
  llvm-svn: 258646
* Tidied up TRUNC combine code. NFC. | Simon Pilgrim | 2016-01-23 | 1 | -9/+5
  Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable).
  llvm-svn: 258644
* Don't check if a list is empty with ilist::size. | Benjamin Kramer | 2016-01-23 | 1 | -1/+1
  ilist::size() is O(n) while ilist::empty() is O(1)
  llvm-svn: 258636
* Remove extra whitespace. NFC. | Junmo Park | 2016-01-23 | 1 | -4/+4
  llvm-svn: 258617
* [SelectionDAG] Fold more offsets into GlobalAddresses | Dan Gohman | 2016-01-22 | 2 | -75/+123
  This reapplies r258296 and r258366, and also fixes an existing bug in SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the offset in a GlobalAddressSDNode, which is uncovered by those patches.
  llvm-svn: 258482
* [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr. | Eduard Burtescu | 2016-01-22 | 1 | -1/+1
  Reviewers: mjacob, dblaikie
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D16418
  llvm-svn: 258472
* Revert "[SelectionDAG] Fold more offsets into GlobalAddresses" | Reid Kleckner | 2016-01-22 | 2 | -120/+73
  This reverts r258296 and the follow up r258366. With this change, we miscompiled the following program on Windows:

    #include <string>
    #include <iostream>
    static const char kData[] = "asdf jkl;";
    int main() {
      std::string s(kData + 3, sizeof(kData) - 3);
      std::cout << s << '\n';
    }

  llvm-svn: 258465
* [SelectionDAG] Fix constant offset folding to avoid commuting non-commutative operators. | Dan Gohman | 2016-01-20 | 1 | -2/+3
  This fixes a miscompile in MultiSource/Benchmarks/MiBench/consumer-lame introduced in r258296.
  llvm-svn: 258366
* [SelectionDAG] Fold more offsets into GlobalAddresses | Dan Gohman | 2016-01-20 | 2 | -73/+119
  SelectionDAG previously missed opportunities to fold constants into GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to create `(add GA+c, y)`. This isn't often visible on targets such as X86 which effectively reassociate adds in their complex address-mode folding logic, however it is currently visible on WebAssembly since it currently has very simple address mode folding code that doesn't reassociate anything.
  This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses at the same times that it folds constants together, so that it doesn't miss any opportunities to perform such folding.
  Differential Revision: http://reviews.llvm.org/D16090
  llvm-svn: 258296
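The idea of folding at node-creation time can be sketched on a toy expression tree. Everything here (`Expr`, `makeLeaf`, `getAdd`) is a hypothetical model for illustration and not the SelectionDAG API; it handles only addition, which is commutative, so swapping operands is safe in this sketch.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Hypothetical expression node: a global with a byte offset, an integer
// constant, an opaque value, or an addition of two sub-expressions.
struct Expr {
  enum Kind { Global, Constant, Value, Add };
  Kind K = Value;
  std::string Name;      // symbol name for Global and Value
  std::int64_t Imm = 0;  // offset for Global, value for Constant
  ExprPtr LHS, RHS;      // operands for Add
};

ExprPtr makeLeaf(Expr::Kind K, std::string Name, std::int64_t Imm) {
  auto E = std::make_shared<Expr>();
  E->K = K;
  E->Name = std::move(Name);
  E->Imm = Imm;
  return E;
}

// Builds L + R, folding a constant into a Global's offset (or into
// another constant) at creation time instead of leaving an Add node.
ExprPtr getAdd(ExprPtr L, ExprPtr R) {
  if (L->K == Expr::Constant)
    std::swap(L, R);  // canonicalize: constant on the right
  if (R->K == Expr::Constant) {
    if (L->K == Expr::Constant)
      return makeLeaf(Expr::Constant, "", L->Imm + R->Imm);
    if (L->K == Expr::Global)
      return makeLeaf(Expr::Global, L->Name, L->Imm + R->Imm);
  }
  auto E = std::make_shared<Expr>();
  E->K = Expr::Add;
  E->LHS = std::move(L);
  E->RHS = std::move(R);
  return E;
}
```

Building `getAdd(getAdd(GA, c1), y)` with this helper yields an Add whose left operand is already the global with offset `c1`, which corresponds to the `(add GA+c, y)` form from the message.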
* [NFC] Replace several manual GEP loops with gep_type_iterator. | Eduard Burtescu | 2016-01-20 | 2 | -31/+13
  Reviewers: dblaikie
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D16335
  llvm-svn: 258262
* [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. | Eduard Burtescu | 2016-01-19 | 2 | -2/+15
  Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.
  GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
  Reviewers: mjacob, dblaikie
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D16275
  llvm-svn: 258145
* Fixed MSVC warning that not all control paths return a value. | Simon Pilgrim | 2016-01-18 | 1 | -0/+1
  llvm-svn: 258099
* TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCC | Tom Stellard | 2016-01-18 | 1 | -0/+49
  Summary: When SimplifySetCC sees a setcc node that compares the result of a value extension operation with a constant, it tries to simplify the setcc node by eliminating the extension and shrinking the constant.
  If shrinking the inputs to setcc is deemed not desirable by the target (e.g. the target does not want a setcc comparing i1 values), then it is still possible to optimize this sequence in some cases.
  This patch adds the following combines to SimplifySetCC when shrinking setcc inputs is not desirable:
  (setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
  (setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, y, !cc))
  There are no tests for this yet, but once AMDGPU correctly implements TargetLowering::isTypeDesirableForOp(), this new combine will be exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.
  Reviewers: resistor, arsenm
  Subscribers: jroelofs, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15034
  llvm-svn: 258067
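The two combines rest on a simple identity that can be checked with plain integers. The names below (`CondCode`, `invert`, `setcc`, `extendedCompare`) are hypothetical helpers for this note, and only four condition codes are modeled; a true inner comparison extends to 1 under zext and to -1 under sext, and both are nonzero.

```cpp
enum class CondCode { EQ, NE, LT, GE };

// Logical inverse of a condition code (the "!cc" in the combine).
inline CondCode invert(CondCode CC) {
  switch (CC) {
  case CondCode::EQ: return CondCode::NE;
  case CondCode::NE: return CondCode::EQ;
  case CondCode::LT: return CondCode::GE;
  case CondCode::GE: return CondCode::LT;
  }
  return CC;  // not reached for valid input
}

// Evaluates (setcc x, y, cc) on signed integers.
inline bool setcc(int X, int Y, CondCode CC) {
  switch (CC) {
  case CondCode::EQ: return X == Y;
  case CondCode::NE: return X != Y;
  case CondCode::LT: return X < Y;
  case CondCode::GE: return X >= Y;
  }
  return false;  // not reached for valid input
}

// Models (setcc ([sz]ext (setcc x, y, cc)), 0, outer), where Outer is
// EQ or NE and SignExtend selects sext (true -> -1) or zext (true -> 1).
inline bool extendedCompare(int X, int Y, CondCode CC, bool SignExtend,
                            CondCode Outer) {
  int Ext = setcc(X, Y, CC) ? (SignExtend ? -1 : 1) : 0;
  return setcc(Ext, 0, Outer);
}
```

For any inputs, `extendedCompare(x, y, cc, ext, CondCode::NE)` equals `setcc(x, y, cc)` and `extendedCompare(x, y, cc, ext, CondCode::EQ)` equals `setcc(x, y, invert(cc))`, which is why the extension and the outer compare can be dropped.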
* [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType. | Manuel Jacob | 2016-01-17 | 2 | -9/+6
  Patch by Eduard Burtescu.
  Reviewers: dblaikie, mjacob
  Subscribers: dsanders, llvm-commits, dblaikie
  Differential Revision: http://reviews.llvm.org/D16273
  llvm-svn: 258023