path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREGSimon Pilgrim2016-03-031-1/+18
| | | | | | | | Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 262599
* Revert "[ARM] Merging 64-bit divmod lib calls into one"Renato Golin2016-03-031-2/+1
| | | | | | This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594
* [X86] Don't give catch objects a displacement of zeroDavid Majnemer2016-03-031-20/+40
| | | | | | | | | | | | | | | | | | Catch objects with a displacement of zero do not initialize a catch object. The displacement is relative to %rsp at the end of the function's prologue for x86_64 targets. If we place an object at the top-of-stack, we will end up with a displacement of zero resulting in our catch object remaining uninitialized. Address this by creating our catch objects as fixed objects. We will ensure that the UnwindHelp object is created after the catch objects so that no catch object will have a displacement of zero. Differential Revision: http://reviews.llvm.org/D17823 llvm-svn: 262546
* [ARM] Merging 64-bit divmod lib calls into oneRenato Golin2016-03-021-1/+2
| | | | | | | | | | | | | | | | | When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262507
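The benefit of merging can be illustrated outside the compiler. The following is a minimal sketch in plain C++, not the ARM back-end code: `divRem64` and `DivRem64` are hypothetical names, and `std::lldiv` stands in for a single combined library call in the way `__aeabi_ldivmod` does for 64-bit values.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical result pair: quotient and remainder from one call.
struct DivRem64 {
  long long Quot;
  long long Rem;
};

// One combined call yields both results, instead of a separate
// division call followed by a separate remainder call.
DivRem64 divRem64(long long N, long long D) {
  assert(D != 0 && "division by zero");
  std::lldiv_t R = std::lldiv(N, D);
  return {R.quot, R.rem};
}
```

Calling `divRem64(17, 5)` gives a quotient of 3 and a remainder of 2 from a single call, so neither intermediate result has to be kept alive across a second call.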
* SelectionDAG: Use correctly sized allocation functions for SDNodesJustin Bogner2016-03-021-116/+86
| | | | | | | | | | | | | | The placement new calls here were all calling the allocation function in RecyclingAllocator/Recycler for SDNode, instead of the function for the specific subclass we were constructing. Since this particular allocator always overallocates it more or less worked, but would hide what we're actually doing from any memory tools. Also, if you tried to change this allocator to something like a BumpPtrAllocator or MallocAllocator, the compiler would crash horribly all the time. Part of llvm.org/PR26808. llvm-svn: 262500
* DAGCombiner: Make sure an integer is being truncatedMatt Arsenault2016-03-021-1/+1
| | | | llvm-svn: 262446
* DAGCombiner: Turn truncate of a bitcasted vector to an extractMatt Arsenault2016-03-011-0/+16
| | | | | | | | | On AMDGPU, where i64 operations are often bitcasted to v2i32 and back, this pattern shows up regularly where it breaks some expected combines on i64, such as load width reducing. This fixes some test failures in a future commit when i64 loads are changed to promote. llvm-svn: 262397
* Revert "[mips] Promote the result of SETCC nodes to GPR width."Vasileios Kalintiris2016-03-014-16/+6
| | | | | | | | | This reverts commit r262316. It seems that my change breaks an out-of-tree chromium buildbot, so I'm reverting this in order to investigate the situation further. llvm-svn: 262387
* [NVPTX] Use different, convergent MIs for convergent calls.Justin Lebar2016-03-011-3/+5
| | | | | | | | | | | | | | | | | | | | | | | Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level. Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all. Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls. Reviewers: jholewinski Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17423 llvm-svn: 262373
* DAGCombiner: Turn extract of bitcasted integer into truncateMatt Arsenault2016-03-011-0/+8
| | | | | | | This reduces the number of bitcast nodes and generally cleans up the DAG when bitcasting between integers and vectors everywhere. llvm-svn: 262358
* [mips] Promote the result of SETCC nodes to GPR width.Vasileios Kalintiris2016-03-014-6/+16
| | | | | | | | | | | | | | | | | | | | Summary: This patch modifies the existing comparison, branch, conditional-move and select patterns, and adds new ones where needed. Also, the updated SLT{u,i,iu} set of instructions generate a GPR width result. The majority of the code changes in the Mips back-end fix the wrong assumption that the result of SETCC nodes always produce an i32 value. The changes in the common code path account for the fact that in 64-bit MIPS targets, i1 is promoted to i32 instead of i64. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D10970 llvm-svn: 262316
* LegalizeDAG: Use correct ptr type when expanding unaligned load/storeMatt Arsenault2016-03-011-14/+21
| | | | | | | This fixes regressions exposed in existing AMDGPU tests in a future commit when all loads are custom lowered. llvm-svn: 262299
* DAGCombiner: Don't unnecessarily swap operands in ReassociateOpsMatt Arsenault2016-02-271-2/+2
| | | | | | | | | | | | | | | | | | In the case where op = add, y = base_ptr, and x = offset, this transform: (op y, (op x, c1)) -> (op (op x, y), c1) breaks the canonical form of add by putting the base pointer in the second operand and the offset in the first. This fix is important for the R600 target, because for some address spaces the base pointer and the offset are stored in separate register classes. The old pattern caused the ISel code for matching addressing modes to put the base pointer and offset in the wrong register classes, which required non-trivial code transformations to fix. llvm-svn: 262148
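The operand-order issue can be shown on a toy expression tree. This is a sketch under assumptions, not the DAGCombiner code: `Expr`, `leaf`, `add`, `reassociate` and `print` are hypothetical, and the rewritten form `(add (add y, x), c1)` is assumed to be what "not swapping" means here.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression node: a named leaf, or an "add" with two operands.
struct Expr {
  std::string Name;
  bool IsConst;
  std::shared_ptr<Expr> LHS, RHS;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(std::string Name, bool IsConst = false) {
  return std::make_shared<Expr>(Expr{std::move(Name), IsConst, nullptr, nullptr});
}

ExprPtr add(ExprPtr L, ExprPtr R) {
  return std::make_shared<Expr>(Expr{"add", false, std::move(L), std::move(R)});
}

// (add y, (add x, c1)) -> (add (add y, x), c1).
// y stays in the first operand, so a base pointer is not moved behind its offset.
ExprPtr reassociate(const ExprPtr &N) {
  bool IsAdd = N->LHS != nullptr;
  if (!IsAdd || !N->RHS->LHS || !N->RHS->RHS->IsConst)
    return N; // pattern does not match; leave the node alone
  return add(add(N->LHS, N->RHS->LHS), N->RHS->RHS);
}

std::string print(const ExprPtr &N) {
  if (!N->LHS)
    return N->Name;
  return "(" + N->Name + " " + print(N->LHS) + ", " + print(N->RHS) + ")";
}
```

For `(add base, (add offset, c1))` the sketch produces `(add (add base, offset), c1)`, with the base pointer still first.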
* DAGCombiner: Relax sqrt NaN folding checkMatt Arsenault2016-02-271-7/+7
| | | | | | This is OK for +0 since compares to +/-0 give the same result. llvm-svn: 262125
* Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64.Cong Hou2016-02-261-4/+4
| | | | | | llvm-svn: 262091
* Detect vector reduction operations just before instruction selection.Cong Hou2016-02-241-0/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | (This is the second attempt to commit this patch, after fixing pr26652 & pr26653). This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stays unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261804
* NFC. Move isDereferenceable to Loads.h/cppArtur Pilipenko2016-02-241-0/+1
| | | | | | | | This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In a subsequent change I'm going to eliminate isDereferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpeculatively the only function to check whether a load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736
* SelectionDAG: Use correct addrspace when lowering memcpyMatt Arsenault2016-02-221-9/+16
| | | | | | | | | | | This was causing assertions later from using the wrong pointer size with LDS operations. getOptimalMemOpType should also have address space arguments later. This avoids assertions in existing tests exposed by a future commit. llvm-svn: 261580
* ADT: Remove == and != comparisons between ilist iterators and pointersDuncan P. N. Exon Smith2016-02-211-3/+4
| | | | | | | | | | | | | | I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498
* [DAGCombiner] Use getBitcast helper when possible. NFCI.Simon Pilgrim2016-02-201-7/+3
| | | | llvm-svn: 261437
* [StatepointLowering] Minor non-semantic cleanupsSanjoy Das2016-02-191-23/+18
| | | | | | Use auto, bring file up to coding standards etc. llvm-svn: 261358
* [StatepointLowering] Update StatepointMaxSlotsRequired correctlySanjoy Das2016-02-191-3/+4
| | | | | | | | | | Now that we don't always add an element to AllocatedStackSlots if we don't find a pre-existing unallocated stack slot, bumping StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead bump the statistic near the push_back, to Builder.FuncInfo.StatepointStackSlots.size(). llvm-svn: 261348
* [StatepointLowering] Fix a mistake in rL261336Sanjoy Das2016-02-191-4/+5
| | | | | | | | The check on MFI->getObjectSize() has to be on the FrameIndex, not on the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests I added in rL261336 didn't catch this. llvm-svn: 261347
* [StatepointLowering] Change AllocatedStackSlots to use SmallBitVectorSanjoy Das2016-02-192-13/+15
| | | | | | | | | | | | | | | NFCI. The key motivation here is that I'd like to use SmallBitVector::all() in a later change. Also, using a bit vector here seemed better in general. The only interesting change here is that in the failure case of allocateStackSlot, we no longer (the equivalent of) push_back(true) to AllocatedStackSlots. As far as I can tell, this is fine, since we'd never re-use those slots in the same StatepointLoweringState instance. Technically there was no need to change the operator[] type accesses to set() and test(), but I thought it'd be nice to make it obvious that we're using something other than a std::vector like thing. llvm-svn: 261337
* [StatepointLowering] Fix bug in allocateStackSlotSanjoy Das2016-02-191-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | allocateStackSlot did not consider the size of the value to be spilled before deciding to re-use a spill slot. This was originally okay (since originally we'd only ever spill pointers), but it became not okay when we changed our scheme to directly spill vectors of pointers. While this change fixes the bug pointed out, it has two performance caveats: - It matches spill slot and spillee size exactly, while in theory we can spill, e.g., an 8 byte pointer into a 16 byte slot. This is slightly complicated to fix since in the stackmaps section, we report the size of the spill slot as the size of the "indirect value"; and if they're no longer equivalent, we'll have to keep track of the (indirect) value size separately from the stack slot size. - It will "spuriously run out" of reusable slots, since we now have a second check in the search loop in addition to the availability check (e.g. you had two free scalar slots, and you first ask for a vector slot followed by a scalar slot). I'll fix this in a later commit. llvm-svn: 261336
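The exact-size rule described above can be sketched with a toy slot table. This is an illustration only: `Slot` and `allocateSlot` are hypothetical names, and the sketch scans every slot, so it does not reproduce the "spuriously run out" caveat of the real search loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy spill slot: its size in bytes and whether it is currently in use.
struct Slot {
  unsigned Size;
  bool Allocated;
};

// Reuse a free slot only when its size matches the spilled value exactly;
// otherwise create a new slot. Returns the index of the chosen slot.
std::size_t allocateSlot(std::vector<Slot> &Slots, unsigned SpillSize) {
  for (std::size_t I = 0; I != Slots.size(); ++I) {
    if (!Slots[I].Allocated && Slots[I].Size == SpillSize) {
      Slots[I].Allocated = true;
      return I;
    }
  }
  Slots.push_back({SpillSize, true});
  return Slots.size() - 1;
}
```

With two free 8-byte slots, a 16-byte request creates a third slot rather than reusing an 8-byte one, and an 8-byte value is never placed in a free 16-byte slot, which is the first caveat the message mentions.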
* [StatepointLowering] Clean up allocateStackSlotSanjoy Das2016-02-191-35/+22
| | | | | | | | | This removes the unusual loop structure in allocateStackSlot in favor of something more straightforward. I've also removed the cautionary comment in the function, which I suspect is historical cruft now, and confuses more than it enlightens. llvm-svn: 261335
* LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputsMatthias Braun2016-02-191-5/+31
| | | | llvm-svn: 261306
* Remove uses of builtin comma operator.Richard Trieu2016-02-182-20/+30
| | | | | | Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270
* Revert r261070, it caused PR26652 / PR26653.Nico Weber2016-02-171-126/+0
| | | | llvm-svn: 261127
* Detect vector reduction operations just before instruction selection.Cong Hou2016-02-171-0/+126
| | | | | | | | | | | | | | | | | | | | | | | | | This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stays unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261070
* [CodeGen] Document and use getConstant's splat-building feature. NFC.Ahmed Bougacha2016-02-152-10/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D17229 llvm-svn: 260901
* Don't combine fp_round (fp_round x) if f80 to f16 is generatedPirama Arumuga Nainar2016-02-131-0/+11
| | | | | | | | | | | | | | | | | | | | Summary: This patch skips DAG combine of fp_round (fp_round x) if it results in an fp_round from f80 to f16. fp_round from f80 to f16 always generates an expensive (and as yet, unimplemented) libcall to __truncxfhf2. This prevents selection of native f16 conversion instructions from f32 or f64. Moreover, the first (value-preserving) fp_round from f80 to either f32 or f64 may become a NOP in platforms like x86. Reviewers: ab Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D17221 llvm-svn: 260769
* [SelectionDAG] change getConstant() to use the input SDLoc when building splat vectorsSanjay Patel2016-02-111-5/+4
| | | | | | | | | | | | | | | | | | | | | | The code change is simple enough: instead of attaching an anonymous SDLoc to splatted vector constants, use the scalar constant's existing SDLoc since that is what is passed into getConstant() as a param. But this changes instruction scheduling, so I'll explain why that happens. The motivation for this patch starts near: http://reviews.llvm.org/rL258833 ...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'. But when I made that change locally, several x86 codegen tests wiggled. It turns out that the lack of SDLoc consistency in getConstant() changes the way ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG scheduler algorithms use IROrder for tie-breaking. Differential Revision: http://reviews.llvm.org/D16972 llvm-svn: 260582
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.Ahmed Bougacha2016-02-094-50/+34
| | | | llvm-svn: 260316
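The idiom named in the title above can be shown with a stand-in type. This is a sketch, not LLVM code: `ToyNode`, `ToyValue`, `tryCombine`, `visitOld` and `visitNew` are hypothetical, and `ToyValue` only mimics a value wrapper that converts to `bool` when it holds a node.

```cpp
#include <cassert>

struct ToyNode {
  int Opcode;
};

// Stand-in for a value handle that is "true" only when it refers to a node.
class ToyValue {
  const ToyNode *Node = nullptr;

public:
  ToyValue() = default;
  explicit ToyValue(const ToyNode *N) : Node(N) {}
  const ToyNode *getNode() const { return Node; }
  explicit operator bool() const { return Node != nullptr; }
};

// Hypothetical combine that succeeds only for opcode 0.
ToyValue tryCombine(const ToyNode &N) {
  return N.Opcode == 0 ? ToyValue(&N) : ToyValue();
}

// Older style: declare first, then test getNode().
int visitOld(const ToyNode &N) {
  ToyValue R = tryCombine(N);
  if (R.getNode())
    return R.getNode()->Opcode + 1;
  return 0;
}

// Preferred style: declaration and test in one condition, R scoped to the if.
int visitNew(const ToyNode &N) {
  if (ToyValue R = tryCombine(N))
    return R.getNode()->Opcode + 1;
  return 0;
}
```

Both functions return the same results; the second form keeps `R` from leaking into the rest of the function, which is presumably why it reads better in long combine routines.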
* [SelectionDAG] make getMemBasePlusOffset() accessible; NFCISanjay Patel2016-02-091-12/+9
| | | | | | | | | I reinvented this functionality in http://reviews.llvm.org/D16828 because it was hidden away as a static function. The changes in x86 are not based on a complete audit. I suspect there are other possible uses there, and there are almost certainly more potential users in other targets. llvm-svn: 260295
* [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)Hans Wennborg2016-02-081-1/+1
| | | | | | | | | | | | | | | | | | This matches GCC and MSVC's behaviour, and saves on code size. We were already not extending i1 return values on x86_64 after r127766. This takes that patch further by applying it to x86 target as well, and also for i8 and i16. The ABI docs have been unclear about the required behaviour here. The new i386 psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return values do not need to be extended beyond 8 bits. The x86_64 ABI doc is being updated to say the same [2]. Differential Revision: http://reviews.llvm.org/D16907 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ llvm-svn: 260133
* SelectionDAG: Lower some range metadata to AssertZextMatt Arsenault2016-02-082-3/+45
| | | | | | | | | | If a range has a lower bound of 0, add an AssertZext from the nearest floor power of two. This allows operations with some workitem intrinsics with known maximum ranges to use fast 24-bit multiplies. llvm-svn: 260109
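As a rough illustration of the arithmetic involved, the width that can be asserted for a range starting at 0 is the number of bits needed for its largest value. The helper below is a hypothetical sketch in plain C++, not the SelectionDAG code; clamping the result to at least one bit is an assumption made here for the degenerate range [0, 1).

```cpp
#include <cassert>
#include <cstdint>

// Number of low bits needed to hold every value in [0, UpperExclusive).
// All bits above that width are known to be zero.
unsigned assertZextBits(uint64_t UpperExclusive) {
  assert(UpperExclusive != 0 && "range must be non-empty");
  uint64_t Max = UpperExclusive - 1;
  unsigned Bits = 0;
  while (Max != 0) {
    ++Bits;
    Max >>= 1;
  }
  return Bits == 0 ? 1 : Bits; // assumed minimum width of one bit
}
```

A work-item ID with range [0, 1024) needs 10 bits under this sketch, well inside the 24 bits required for the fast multiplies mentioned above.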
* [StatepointLower] Use None instead of Optional<int>()Sanjoy Das2016-02-051-5/+5
| | | | llvm-svn: 259956
* [Power PC] softening long double typePetar Jovanovic2016-02-041-17/+33
| | | | | | | | | | | This patch implements softening of long double type (ppcf128) on ppc32 architecture and enables operations for this type for soft float. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D15811 llvm-svn: 259791
* rangify; NFCISanjay Patel2016-02-031-159/+129
| | | | llvm-svn: 259722
* [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behaviorTim Shen2016-02-032-6/+9
| | | | | | | | | | | | | | | | | | This patch consists of two parts: a performance fix in DAGCombiner.cpp and a correctness fix in SelectionDAG.cpp. The test case tests the bug that's uncovered by the performance fix, and fixed by the correctness fix. The performance fix keeps the containers required by the hasPredecessorHelper (which is a lazy DFS) and reuses them. Since hasPredecessorHelper is called in a loop, the overall complexity is reduced from O(n^2) to O(n), where n is the number of SDNodes. The correctness fix keeps iterating the neighbor list even if it's time to early return. It will return after finishing adding all neighbors to Worklist, so that no neighbors are discarded due to the original early return. llvm-svn: 259691
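Both parts of the fix can be sketched on a toy graph. The code below is an assumption-laden illustration, not the SelectionDAG implementation: `Node` is a hypothetical type, and the function only borrows the name `hasPredecessorHelper` to show a resumable search whose `Visited` set and `Worklist` persist across calls.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy node: edges point at operands (predecessors).
struct Node {
  std::vector<const Node *> Operands;
};

// Resumable lazy DFS. The caller seeds Visited and Worklist once with the
// start node and passes the same containers to every query, so each node
// is expanded at most once across all calls.
bool hasPredecessorHelper(const Node *Target,
                          std::unordered_set<const Node *> &Visited,
                          std::vector<const Node *> &Worklist) {
  if (Visited.count(Target))
    return true; // already reached by an earlier query
  while (!Worklist.empty()) {
    const Node *M = Worklist.back();
    Worklist.pop_back();
    bool Found = false;
    for (const Node *Op : M->Operands) {
      if (Visited.insert(Op).second)
        Worklist.push_back(Op);
      if (Op == Target)
        Found = true; // no early return: queue the remaining operands first
    }
    if (Found)
      return true;
  }
  return false;
}
```

If the loop returned as soon as it saw the target, the operands after it in the list would never be queued, and a later query sharing the same containers could wrongly report that they are unreachable.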
* Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.Eugene Zelenko2016-02-021-14/+6
| | | | | | Differential revision: http://reviews.llvm.org/D16793 llvm-svn: 259539
* AArch64: Implement missed conditional compare sequences.Balaram Makam2016-02-011-2/+2
| | | | | | | | | | | | | | | | | | Summary: This is an extension to the existing implementation of r242436 which restricts to only select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leafs. This will additionally handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees. Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16291 llvm-svn: 259387
* [SelectionDAG] Eliminate exponential behavior in WalkChainUsersTim Shen2016-01-311-5/+20
| | | | llvm-svn: 259315
* Avoid overly large SmallPtrSet/SmallSetMatthias Braun2016-01-305-5/+5
| | | | | | | These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. llvm-svn: 259283
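The cost model behind this advice can be seen in a toy container. The class below is a hypothetical sketch, not LLVM's SmallPtrSet or SmallSet: it keeps up to N elements inline and searches them linearly, then moves to a hash set once it overflows.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Toy small-size-optimized set (illustrative only).
template <typename T, std::size_t N> class ToySmallSet {
  std::array<T, N> Small{};
  std::size_t NumSmall = 0;
  std::unordered_set<T> Big; // used only after the inline storage overflows

public:
  bool isSmall() const { return Big.empty(); }

  bool contains(const T &V) const {
    if (!isSmall())
      return Big.count(V) != 0;
    // Small mode: linear scan, so each lookup costs up to N comparisons.
    for (std::size_t I = 0; I != NumSmall; ++I)
      if (Small[I] == V)
        return true;
    return false;
  }

  bool insert(const T &V) {
    if (contains(V))
      return false;
    if (isSmall() && NumSmall < N) {
      Small[NumSmall++] = V;
      return true;
    }
    if (isSmall()) {
      // Overflow: migrate the inline elements to the hash set.
      Big.insert(Small.begin(), Small.begin() + NumSmall);
      NumSmall = 0;
    }
    Big.insert(V);
    return true;
  }
};
```

Because every small-mode lookup walks the inline array, a very large N makes the "small" path slower than hashing would be, which is the reasoning the commit gives for keeping N at 32 or below.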
* Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.Yaron Keren2016-01-291-2/+2
| | | | | | clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240
* [X86] Don't transform X << 1 to X + X during type legalizationDavid Majnemer2016-01-281-9/+0
| | | | | | | | | | | | | | | | | | | | While legalizing a 64-bit shift left by 1, the following occurs: We split the shift operand in half: a high half and a low half. We then create an ADDC with the low half and a ADDE with the high half + the carry bit from the ADDC. This is problematic if X is any_ext'd because the high half computation is now undef + undef + carry bit and there is no way to ensure that the two undef values had the same bitwise representation. This results in the lowest bit in the high half turning into garbage. Instead, do not try to turn shifts into arithmetic during type legalization. This fixes PR26350. llvm-svn: 259065
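The failure mode can be reproduced with ordinary integers. This is a sketch of the arithmetic only, not the legalizer: `addSelfExpanded` and `shlOneExpanded` are hypothetical helpers, and the two parameters `HiA` and `HiB` model two independent reads of an undef high half that need not agree.

```cpp
#include <cassert>
#include <cstdint>

// X + X expanded over 32-bit halves, in the style of ADDC followed by ADDE.
uint64_t addSelfExpanded(uint32_t Lo, uint32_t HiA, uint32_t HiB) {
  uint32_t NewLo = Lo + Lo;               // ADDC: low half, produces a carry
  uint32_t Carry = NewLo < Lo ? 1u : 0u;  // wrapped iff the sum overflowed
  uint32_t NewHi = HiA + HiB + Carry;     // ADDE: high halves plus carry
  return (uint64_t(NewHi) << 32) | NewLo;
}

// X << 1 expanded over 32-bit halves as a real shift.
uint64_t shlOneExpanded(uint32_t Lo, uint32_t Hi) {
  uint32_t NewLo = Lo << 1;
  uint32_t NewHi = (Hi << 1) | (Lo >> 31); // bit 0 comes only from Lo
  return (uint64_t(NewHi) << 32) | NewLo;
}
```

When both reads of the high half agree, the two expansions match. When they differ, as with `addSelfExpanded(0x80000001u, 1, 2)`, bit 0 of the high half no longer equals the top bit of the low half, whereas the shift form gets that bit right for any value of `Hi`.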
* [DAGCombiner] Don't add volatile or indexed stores to ChainedStoresJunmo Park2016-01-281-0/+4
| | | | | | | | | | | | Summary: findBetterNeighborChains does not handle volatile or indexed stores. However, it did not check when adding stores to ChainedStores. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D16463 llvm-svn: 259024
* Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/Benjamin Kramer2016-01-274-17/+15
| | | | | | It's a SelectionDAG thing, not a Target thing. llvm-svn: 258939
* Remove autoconf supportChris Bieneman2016-01-261-13/+0
| | | | | | | | | | | | | | | | Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861