summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* LegalizeDAG: Don't replace vector load with integer unless legalMatt Arsenault2016-03-303-28/+71
| | | | | | | | | | | | | On AMDGPU we want to be able to promote i64/f64 loads to v2i32. If the access is unaligned, this would conclude that since i64 is legal, it would convert it back to i64 and there is an endless legalization loop. Extract the logic for scalarizing the load into a new TargetLowering function, where this can also replace the custom function AMDGPU has for this. llvm-svn: 264927
* Add support for no-jump-tablesNirav Dave2016-03-291-2/+7
| | | | | | | | | | | | | Add function soft attribute to the generation of Jump Tables in CodeGen as initial step towards clang support of gcc's no-jump-table support Reviewers: hans, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18321 llvm-svn: 264756
* Swift Calling Convention: add swiftself attribute.Manman Ren2016-03-293-0/+9
| | | | | | Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754
* [Codegen] Decrease minimum jump table density.Kyle Butt2016-03-292-8/+23
| | | | | | | | | | | Minimum density for both optsize and non optsize are now options -sparse-jump-table-density (default 10) for non optsize functions -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues llvm-svn: 264689
* Minor code cleanup. NFC.Junmo Park2016-03-261-1/+1
| | | | llvm-svn: 264505
* Prevent construction of cycle in DAG store mergeNirav Dave2016-03-253-40/+44
| | | | | | | | | | | | | | | | | | | | | When merging stores in DAGCombiner, add check to ensure that no dependenices exist that would cause the construction of a cycle in our DAG. This may happen if one store has a data dependence on another instruction (e.g. a load) which itself has a (chain) dependence on another store being merged. These stores cannot be merged safely and doing so results in a cycle that is discovered in LegalizeDAG. This test is only done in cases where Antialias analysis is used (UseAA) as non-AA store merge candidates will be merged logically after all loads which have been checked to not alias. Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight Subscribers: llvm-commits, tberghammer, danalbert, srhines Differential Revision: http://reviews.llvm.org/D18336 llvm-svn: 264461
* CXX TLS: collect return blocks after SelectAllBasicBlocks.Manman Ren2016-03-241-7/+15
| | | | | | | | | | It is incorrect to get the corresponding MBB for a ReturnInst before SelectAllBasicBlocks since SelectAllBasicBlocks can change the correspondence between a ReturnInst and the MBB it is in. PR27062 llvm-svn: 264358
* Reduce code duplication by extracting out a helper function; NFCSanjoy Das2016-03-242-30/+21
| | | | llvm-svn: 264355
* Lower varargs correctly in deopt bundle loweringSanjoy Das2016-03-241-0/+1
| | | | | | | Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg. llvm-svn: 264354
* Add lowering support for llvm.experimental.deoptimizeSanjoy Das2016-03-243-0/+40
| | | | | | | | | | | | | | | Summary: Only adds support for "naked" calls to llvm.experimental.deoptimize. Support for round-tripping through RewriteStatepointsForGC will come as a separate patch (should be simpler than this one). Reviewers: reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18429 llvm-svn: 264329
* [Statepoints] Fix yet another issue around gc pointer uniqueingSanjoy Das2016-03-242-19/+22
| | | | | | | | | | | | | Given that StatepointLowering now uniques derived pointers before putting them in the per-statepoint spill map, we may end up with missing entries for derived pointers when we visit a gc.relocate on a pointer that was de-duplicated away. Fix this by keeping two maps, one mapping gc pointers to their de-duplicated values, and one mapping a de-duplicated value to the slot it is spilled in. llvm-svn: 264320
* Minor cosmestic changes (NFC)Sanjoy Das2016-03-241-7/+7
| | | | | | | - Reflow comments - Rename function llvm-svn: 264319
* CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS.Tim Northover2016-03-241-3/+18
| | | | | | | | | | | | | If the operation's type has been promoted during type legalization, we need to account for the fact that the high bits of the comparison operand are likely unspecified. The LHS is usually zero-extended, but MIPS sign extends it, so we have to be slightly careful. Patch by Simon Dardis. llvm-svn: 264296
* Remove unsafe AssertZext after promoting result of FP_TO_FP16Pirama Arumuga Nainar2016-03-241-4/+1
| | | | | | | | | | | | | | | Summary: Some target lowerings of FP_TO_FP16, for instance ARM's vcvtb.f16.f32 instruction, do not guarantee that the top 16 bits are zeroed out. Remove the unsafe AssertZext and add tests to exercise this. Reviewers: jmolloy, sbaranga, kristof.beyls, aadg Subscribers: llvm-commits, srhines, aemerson Differential Revision: http://reviews.llvm.org/D18426 llvm-svn: 264285
* SelectionDAG: Remove a tautological dyn_cast. NFCJustin Bogner2016-03-231-3/+2
| | | | | | Index is already a StoreSDNode, so this dyn_cast doesn't do anything. llvm-svn: 264177
* Remove stale commentSanjoy Das2016-03-231-2/+1
| | | | llvm-svn: 264131
* [StatepointLowering] Don't do two DenseMap lookups; nfciSanjoy Das2016-03-231-2/+3
| | | | llvm-svn: 264130
* [StatepointLowering] Minor NFC cleanupsSanjoy Das2016-03-231-11/+12
| | | | | | | | | - Use auto - Name variables in LLVM style - Use llvm::find instead of std::find - Blank lines between declarations llvm-svn: 264129
* [StatepointLowering] Minor nfc refactoringSanjoy Das2016-03-231-29/+6
| | | | | | | | | | Now that StatepointLoweringInfo represents base pointers, derived pointers and gc relocates as SmallVectors and not ArrayRefs, we no longer need to allocate "backing storage" on stack in LowerStatepoint. So elide the backing storage, and inline the trivial body of getIncomingStatepointGCValues. llvm-svn: 264128
* [StatepointLowering] Schedule gc relocates before uniqueing themSanjoy Das2016-03-232-9/+13
| | | | | | Otherwise we can see an "unexpected" gc.relocate that we uniqued away. llvm-svn: 264127
* [SelectionDAG] Ensure constant folded legalized vector element types are ↵Simon Pilgrim2016-03-221-1/+1
| | | | | | | | compatible with the BUILD_VECTOR type Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected. llvm-svn: 264085
* CodeGen: check return types match when emitting tail call to builtin.Tim Northover2016-03-221-2/+5
| | | | | | | | | | | We were just completely ignoring the types when determining whether we could safely emit a libcall as a tail call. This is clearly wrong. Theoretically, we could dig deeper looking for incidental matches (much like the generic code in Analysis.cpp does), but it's probably not worth it for the few libcalls that exist. llvm-svn: 264084
* Allow lowering call sites with both funclets and deopt stateSanjoy Das2016-03-221-5/+1
| | | | | | | Lowering funclets is a no-op, so we can just go ahead and lower the deopt state. llvm-svn: 264078
* Add a hasOperandBundlesOtherThan helper, and use it; NFCSanjoy Das2016-03-221-12/+6
| | | | llvm-svn: 264072
* [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardwareSimon Pilgrim2016-03-222-0/+102
| | | | | | | | | | | | | | Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062
* Add "first class" lowering for deopt operand bundlesSanjoy Das2016-03-224-24/+99
| | | | | | | | | | | | | | | | | Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry0. This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015
* [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext ↵Silviu Baranga2016-03-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | while processing AND SDNodes Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935
* [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.Manman Ren2016-03-181-1/+1
| | | | | | | Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855
* [SelectionDAG] Remove visitStatepoint; NFCSanjoy Das2016-03-173-11/+2
| | | | | | | This way we have a single entry point into StatepointLowering. The method was a direct dispatch to LowerStatepoint anyway. llvm-svn: 263682
* Fix indentation; NFCSanjoy Das2016-03-161-3/+2
| | | | llvm-svn: 263672
* Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFCSanjoy Das2016-03-162-144/+197
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a step towards implementing "direct" lowering of calls and invokes with deopt operand bundles into STATEPOINT nodes (as opposed to having them mandatorily pass through RewriteStatepointsForGC, which is the case today). This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint` helper function that is able to lower a "statepoint like thing", and uses it to lower `gc.statepoint` calls. This is an NFC now, but in a later change we will use `LowerAsStatepoint` to directly lower calls and invokes with operand bundles without going through an intermediate `gc.statepoint` IR representation. FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add support for lowering non gc.statepoints, right now it is fairly tightly coupled with an IR level `gc.statepoint`. Reviewers: reames, pgavlin, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18106 llvm-svn: 263671
* Tweak some atomics functions in preparation for larger changes; NFC.James Y Knight2016-03-162-2/+2
| | | | | | | | | | | | | | | | - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665
* [SelectionDAG] Extract out populateCallLoweringInfo; NFCSanjoy Das2016-03-163-30/+31
| | | | | | | | | SelectionDAGBuilder::populateCallLoweringInfo is now used instead of SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo interface is more composable in face of design changes like http://reviews.llvm.org/D18106 llvm-svn: 263663
* Removed trailing whitespaceSimon Pilgrim2016-03-161-12/+12
| | | | llvm-svn: 263650
* [StatepointLowering] Move an assertion; NFCISanjoy Das2016-03-151-6/+4
| | | | | | | | Instead of running an explicit loop over `gc.relocate` calls hanging off of a `gc.statepoint`, assert the validity of the type of the value being relocated in `visitRelocate`. llvm-svn: 263516
* Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND onEric Christopher2016-03-142-40/+0
| | | | | | | | | pre-SSE41 hardware" as it seems to be causing crashes during code generation in halide. PR forthcoming. This reverts commit r263303. llvm-svn: 263512
* [DAG] use !isUndef() ; NFCISanjay Patel2016-03-144-13/+11
| | | | llvm-svn: 263453
* [DAG] use isUndef() ; NFCISanjay Patel2016-03-145-98/+94
| | | | llvm-svn: 263448
* Make gc relocates more strongly typed; NFCSanjoy Das2016-03-121-10/+13
| | | | | | | Don't use a `Value *` where we can use a stronger `GCRelocateInst *` type. llvm-svn: 263327
* [X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardwareSimon Pilgrim2016-03-112-0/+40
| | | | | | | | | | | | Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 263303
* [X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ↵Simon Pilgrim2016-03-102-2/+19
| | | | | | | | | | | | ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion). Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 263159
* SelectionDAG: Fix a crash on inline asm when output register supports ↵Tom Stellard2016-03-091-3/+7
| | | | | | | | | | | | | | | | multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022
* Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ↵Hans Wennborg2016-03-081-18/+1
| | | | | | | | ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935
* Re-apply "SelectionDAG: Store SDNode operands in an ArrayRecycler"Justin Bogner2016-03-081-143/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This re-applies r262886 with a fix for 32 bit platforms that have 8 byte pointer alignment, effectively reverting r262892. Original Message: Currently some SDNode operands are malloc'd, some are stored inline in subclasses of SDNode, and some are thrown into a BumpPtrAllocator. This scheme is complex, inconsistent, and makes refactoring SDNodes fairly difficult. Instead, we can allocate all of the operands using an ArrayRecycler that wraps a BumpPtrAllocator. This keeps the cache locality when iterating operands, improves locality when iterating SDNodes without looking at operands, and vastly simplifies the ownership semantics. It also means we stop overallocating SDNodes by 2-3x and will make it simpler to fix the rampant undefined behaviour we have in how we mutate SDNodes from one kind to another (See llvm.org/pr26808). This is NFC other than the changes in memory behaviour, and I ran some LNT tests to make sure this didn't hurt compile time. Not many tests changed: there were a couple of 1-2% regressions reported, but there were more improvements (of up to 4%) than regressions. llvm-svn: 262902
* Revert "SelectionDAG: Store SDNode operands in an ArrayRecycler"Justin Bogner2016-03-081-118/+143
| | | | | | | | | Looks like the largest SDNode is different between 32 and 64 bit now, so this is breaking 32 bit bots. Reverting while I figure out a fix. This reverts r262886. llvm-svn: 262892
* SelectionDAG: Store SDNode operands in an ArrayRecyclerJustin Bogner2016-03-081-143/+118
| | | | | | | | | | | | | | | | | | | | | | | Currently some SDNode operands are malloc'd, some are stored inline in subclasses of SDNode, and some are thrown into a BumpPtrAllocator. This scheme is complex, inconsistent, and makes refactoring SDNodes fairly difficult. Instead, we can allocate all of the operands using an ArrayRecycler that wraps a BumpPtrAllocator. This keeps the cache locality when iterating operands, improves locality when iterating SDNodes without looking at operands, and vastly simplifies the ownership semantics. It also means we stop overallocating SDNodes by 2-3x and will make it simpler to fix the rampant undefined behaviour we have in how we mutate SDNodes from one kind to another (See llvm.org/pr26808). This is NFC other than the changes in memory behaviour, and I ran some LNT tests to make sure this didn't hurt compile time. Not many tests changed: there were a couple of 1-2% regressions reported, but there were more improvements (of up to 4%) than regressions. llvm-svn: 262886
* DAGCombiner: Check legality before creating extract_vector_eltMatt Arsenault2016-03-071-1/+3
| | | | | | Problem not hit by any in tree target. llvm-svn: 262852
* [CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel ↵Craig Topper2016-03-071-2/+3
| | | | | | matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC llvm-svn: 262815
* [DAGCombine] Fix divrem combine not to assume div/rem type is simple.Michael Kuperstein2016-03-041-1/+4
| | | | | | | | | | | | | The divrem combine assumed the type of the div/rem is simple, which isn't necessarily true. This probably worked fine until r250825, since it only saw legal types, but now breaks when it runs as a pre-type-legalization combine. This fixes PR26835. Differential Revision: http://reviews.llvm.org/D17878 llvm-svn: 262746
* [ARM] Merging 64-bit divmod lib calls into oneRenato Golin2016-03-041-4/+8
| | | | | | | | | | | | | | | | | | | | | When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262738
OpenPOWER on IntegriCloud