bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86][AARCH64] Improve ISD::ABS support	Simon Pilgrim	2019-01-12	3	-0/+43
\| \| \| \| \| \| \| \|	This patch takes some of the code from D49837 to allow us to enable ISD::ABS support for all SSE vector types. Differential Revision: https://reviews.llvm.org/D56544 llvm-svn: 350998
*	[Legalizer] Use correct ValueType of SELECT_CC node during Float promotion	Pirama Arumuga Nainar	2019-01-11	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When legalizing the result of a SELECT_CC node by promoting the floating-point type, use the promoted-to type rather than the original type. Fix PR40273. Reviewers: efriedma, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56566 llvm-svn: 350951
*	Revert "[SelectionDAGBuilder] Refactor GetRegistersForValue. NFCI."	Martin Storsjo	2019-01-11	1	-42/+60
\| \| \| \| \| \| \|	This reverts commit r350841, as it actually had functional changes and broke compilation. See PR40290. llvm-svn: 350921
*	[DAGCombiner] simplify code; NFC	Sanjay Patel	2019-01-10	1	-11/+11
\| \| \| \|	llvm-svn: 350844
*	[SelectionDAGBuilder] Refactor GetRegistersForValue. NFCI.	Nirav Dave	2019-01-10	1	-60/+42
\| \| \| \|	llvm-svn: 350841
*	[SelectionDAGBuilder] Fix formatting. NFC.	Nirav Dave	2019-01-10	1	-1/+2
\| \| \| \|	llvm-svn: 350839
*	[SelectionDAGBuilder] Refactor visitInlineAsm. NFC.	Nirav Dave	2019-01-10	1	-45/+24
\| \| \| \|	llvm-svn: 350837
*	[opaque pointer types] Remove some calls to generic Type subtype accessors.	James Y Knight	2019-01-10	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \|	That is, remove many of the calls to Type::getNumContainedTypes(), Type::subtypes(), and Type::getContainedType(N). I'm not intending to remove these accessors -- they are useful/necessary in some cases. However, removing the pointee type from pointers would potentially break some uses, and reducing the number of calls makes it easier to audit. llvm-svn: 350835
*	Remove check for single use in ShrinkDemandedConstant	Stanislav Mekhanoshin	2019-01-09	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes the check for single use from the general ShrinkDemandedConstant to the BE because of the AArch64 regression after D56289/rL350475. After several hours of experiments I did not come up with a testcase failing on any other targets if the check is not performed. Moreover, a direct call to ShrinkDemandedConstant is not really needed and is superseded by SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D56406 llvm-svn: 350684
*	[TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes ↵	Craig Topper	2019-01-07	1	-50/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	a User and OpIdx. Stop using it in AMDGPU target for simplifyI24. As we saw in D56057 when we tried to use this function on X86, it's unsafe. It allows the operand node to have multiple users, but doesn't prevent recursing past the first node when it does have multiple users. This can cause other simplifications earlier in the graph without regard to what bits are needed by the other users of the first node. Ideally all we should do to the first node if it has multiple uses is bypass it when it's not needed by the user we started from. Doing any other transformation that SimplifyDemandedBits can do like turning ZEXT/SEXT into AEXT would result in an increase in instructions. Fortunately, we already have a function that can do just that, GetDemandedBits. It will only make transformations that involve bypassing a node. This patch changes AMDGPU's simplifyI24 to use GetDemandedBits to handle the multiple use simplifications, and then uses the regular SimplifyDemandedBits on each operand to handle simplifications allowed when the operand only has a single use. Unfortunately, GetDemandedBits simplifies constants more aggressively than SimplifyDemandedBits. This caused the -7 constant in the changed test to be simplified to remove the upper bits. I had to modify computeKnownBits to account for this by ignoring the upper 8 bits of the input. Differential Revision: https://reviews.llvm.org/D56087 llvm-svn: 350560
*	[LegalizeVectorOps] Add FSHL/FSHR to the list of vector operations that ↵	Craig Topper	2019-01-06	1	-0/+2
\| \| \| \| \| \| \| \|	should be handled. The FSHL/FSHR nodes are handled in the expand function, but they need to also be listed in the code that queries for the operation action too. llvm-svn: 350490
*	Added single use check to ShrinkDemandedConstant	Stanislav Mekhanoshin	2019-01-05	1	-0/+3
\| \| \| \| \| \| \| \| \|	Fixes cvt_f32_ubyte combine. performCvtF32UByteNCombine() could shrink source node to demanded bits only even if there are other uses. Differential Revision: https://reviews.llvm.org/D56289 llvm-svn: 350475
*	[X86] Add INSERT_SUBVECTOR to ComputeNumSignBits	Craig Topper	2019-01-04	1	-1/+35
\| \| \| \| \| \| \| \| \| \|	This adds support for calculating sign bits of insert_subvector. I based it on the computeKnownBits. My motivating case is propagating sign bits information across basic blocks on AVX targets where concatenating using insert_subvector is common. Differential Revision: https://reviews.llvm.org/D56283 llvm-svn: 350432
*	[DAGCombiner][x86] scalarize binop followed by extractelement	Sanjay Patel	2019-01-03	1	-5/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As noted in PR39973 and D55558: https://bugs.llvm.org/show_bug.cgi?id=39973 ...this is a partial implementation of a fold that we do as an IR canonicalization in instcombine: // extelt (binop X, Y), Index --> binop (extelt X, Index), (extelt Y, Index) We want to have this in the DAG too because as we can see in some of the test diffs (reductions), the pattern may not be visible in IR. Given that this is already an IR canonicalization, any backend that would prefer a vector op over a scalar op is expected to already have the reverse transform in DAG lowering (not sure if that's a realistic expectation though). The transform is limited with a TLI hook because there's an existing transform in CodeGenPrepare that tries to do the opposite transform. Differential Revision: https://reviews.llvm.org/D55722 llvm-svn: 350354
*	[DAGCombiner] After performing the division by constant optimization for a ↵	Craig Topper	2019-01-02	1	-2/+29
\| \| \| \| \| \| \| \| \| \| \| \|	DIV or REM node, replace the users of the corresponding REM or DIV node if it exists. Currently we expand the two nodes separately. This gives DAG combiner an opportunity to optimize the expanded sequence taking into account only one set of users. When we expand the other node we'll create the expansion again, but might not be able to optimize it the same way. So the nodes won't CSE and we'll have two similarish sequences in the same basic block. By expanding both nodes at the same time we'll avoid prematurely optimizing the expansion until both the division and remainder have been replaced. Improves the test case from PR38217. There may be additional opportunities after this. Differential Revision: https://reviews.llvm.org/D56145 llvm-svn: 350239
*	[LegalizeIntegerTypes] When promoting the result of an extract_vector_elt ↵	Craig Topper	2019-01-02	1	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	also promote the input type if necessary By also promoting the input type we get a better idea for what scalar type to use. This can provide better results if the result of the extract is sign extended. What was previously happening is that the extract result would be legalized, sometime later the input of the sign extend would be legalized using the result of the extract. Then later the extract input would be legalized forcing a truncate into the input of the sign extend using a replace all uses. This requires DAG combine to combine out the sext/truncate pair. But sometimes we visited the truncate first and messed things up before the sext could be combined. By creating the extract with the correct scalar type when we legalize the result type, the truncate will be added right away. Then when the sign_extend input is legalized it will create an any_extend of the truncate which can be optimized by getNode to maybe remove the truncate. And then a sign_extend_inreg. Now DAG combine doesn't have to worry about getting rid of the extend. This fixes the regression on X86 in D56156. Differential Revision: https://reviews.llvm.org/D56176 llvm-svn: 350236
*	[DAGCombiner][X86][PowerPC] Teach visitSIGN_EXTEND_INREG to fold ↵	Craig Topper	2019-01-02	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \|	(sext_in_reg (aext/sext x)) -> (sext x) when x has more than 1 sign bit and the sext_inreg is from one of them. If x has multiple sign bits then it doesn't matter which one we extend from so we can sext from x's msb instead. The X86 setcc-combine.ll changes are a little weird. It appears we ended up with a (sext_inreg (aext (trunc (extractelt)))) after type legalization. The sext_inreg+aext now gets optimized by this combine to leave (sext (trunc (extractelt))). Then we visit the trunc before we visit the sext. This ends up changing the truncate to an extractvectorelt from a bitcasted vector. I have a follow up patch to fix this. Differential Revision: https://reviews.llvm.org/D56156 llvm-svn: 350235
*	Reversing the commit in revision 350186. Revision causes regression in 4	Ayonam Ray	2019-01-01	2	-33/+53
\| \| \| \| \| \|	tests. llvm-svn: 350187
*	Omit range checks from jump tables when lowering switches with unreachable	Ayonam Ray	2019-01-01	2	-53/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Review Reference: D52002 llvm-svn: 350186
*	[SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG support to computeKnownBits.	Craig Topper	2018-12-31	1	-1/+9
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D56168 llvm-svn: 350179
*	[DAGCombiner] Add missing one use check on the shuffle in the ↵	Craig Topper	2018-12-31	1	-1/+1
\| \| \| \| \| \| \| \|	bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) transform. Found while trying out some other changes so I don't really have a test case. llvm-svn: 350172
*	[PowerPC] Fix ADDE, SUBE do not know how to promote operator	Kang Zhang	2018-12-30	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch fixes Bugzilla bug 39815: https://bugs.llvm.org/show_bug.cgi?id=39815 It adds support for integer result promotion for the ADDE and SUBE instructions. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D56119 llvm-svn: 350161
*	Add vtable anchor to classes.	Richard Trieu	2018-12-29	1	-0/+2
\| \| \| \|	llvm-svn: 350142
*	[NVPTX] Allow libcalls that are defined in the current module.	Justin Lebar	2018-12-26	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch adds the possibility of making library calls on NVPTX. An important thing about library functions: they must be defined within the current module. This should guarantee that we produce valid PTX assembly (without calls to undefined functions). Anyone who wants to use the libcalls will probably have to link against compiler-rt or any other implementation. Currently, it's completely impossible to make library calls because of the error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can lower ExternalSymbol to TargetExternalSymbol and verify that the function definition is available. Also, there was an issue with the DAG during legalisation. When we expand an instruction into a libcall, the inner call-chain isn't being "integrated" into the outer chain. Since the last "data-flow" (call retval load) node is located in the call-chain earlier than the CALLSEQ_END node, the latter becomes a leaf and therefore a dead node (and is removed quite fast). The solution proposed here relies on additional data-flow pseudo nodes (ProxyReg) whose only purpose is to keep CALLSEQ_END during the legalisation and instruction selection phases; we remove the pseudo instructions before the register scheduling phase. Patch by Denys Zariaiev! Differential Revision: https://reviews.llvm.org/D34708 llvm-svn: 350069
*	[X86] Use GetDemandedBits to simplify the operands of PMULDQ/PMULUDQ.	Craig Topper	2018-12-24	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an alternative to what I attempted in D56057. GetDemandedBits is a special version of SimplifyDemandedBits that allows simplifications even when the operand has other uses. GetDemandedBits will only do simplifications that allow a node to be bypassed. It won't create new nodes or alter any of the other users. I had to add support for bypassing SIGN_EXTEND_INREG to GetDemandedBits. Based on a patch that Simon Pilgrim sent me in email. Fixes PR40142. llvm-svn: 350059
*	[SelectionDAGBuilder] Use ::precise LocationSizes; NFC	George Burgess IV	2018-12-24	1	-11/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	More migration so we can disable the implicit int -> LocationSize conversion. All of these are either scatter/gather'ed vector instructions, or direct loads. Hence, they're all precise. Perhaps if we see way more getTypeStoreSize calls, we can make a getTypeStoreLocationSize (or similar) as a wrapper that applies this ::precise. Doesn't appear that it's a good idea to make getTypeStoreSize return a LocationSize itself, however. llvm-svn: 350042
*	[DAGCombiner] limit shuffle to extend transform (PR40146)	Sanjay Patel	2018-12-23	1	-4/+5
\| \| \| \| \| \| \| \| \| \|	It's dangerous to knowingly create an illegal vector type no matter what stage of combining we're in. This prevents the missed folding/scalarization seen in: https://bugs.llvm.org/show_bug.cgi?id=40146 llvm-svn: 350034
*	[DAGCombiner] allow hoisting vector bitwise logic ahead of extends	Sanjay Patel	2018-12-23	1	-6/+5
\| \| \| \|	llvm-svn: 350032
*	[DAGCombiner] allow narrowing of add followed by truncate	Sanjay Patel	2018-12-22	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	trunc (add X, C ) --> add (trunc X), C' If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type. This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine). This change used to show regressions for x86, but those are gone after D55494. This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) that does almost the same thing. Differential Revision: https://reviews.llvm.org/D55866 llvm-svn: 350006
*	[DAGCombiner] simplify code leading to scalarizeExtractedVectorLoad; NFC	Sanjay Patel	2018-12-21	1	-6/+5
\| \| \| \|	llvm-svn: 349958
*	[SelectionDAG] Always use the version of computeKnownBits that returns a ↵	Simon Pilgrim	2018-12-21	5	-27/+16
\| \| \| \| \| \| \| \|	value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349907
*	[ARM] Complete the Thumb1 shift+and->shift+shift transforms.	Eli Friedman	2018-12-20	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This saves materializing the immediate. The additional forms are less common (they don't usually show up for bitfield insert/extract), but they're still relevant. I had to add a new target hook to prevent DAGCombine from reversing the transform. That isn't the only possible way to solve the conflict, but it seems straightforward enough. Differential Revision: https://reviews.llvm.org/D55630 llvm-svn: 349857
*	[SelectionDAGBuilder] Enable funnel shift building to custom rotates	Simon Pilgrim	2018-12-20	1	-4/+2
\| \| \| \| \| \| \| \| \| \|	This patch enables funnel shift -> rotate building for all ROTL/ROTR custom/legal operations. AFAICT X86 was the last target that was missing modulo support (PR38243), but I've tried to CC stakeholders for every target that has ROTL/ROTR custom handling for their final OK. Differential Revision: https://reviews.llvm.org/D55747 llvm-svn: 349765
*	[DAGCombiner] Fix a place that was creating a SIGN_EXTEND with an extra operand.	Craig Topper	2018-12-20	1	-1/+1
\| \| \| \|	llvm-svn: 349726
*	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate ↵	Simon Pilgrim	2018-12-19	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	(part 2 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1\|c2) fold to demonstrate its use, which I believe is safe for undef cases. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349629
*	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate ↵	Simon Pilgrim	2018-12-19	1	-6/+13
\| \| \| \| \| \| \| \| \| \| \| \|	(part 1 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349628
*	[TargetLowering] Fix propagation of undefs in zero extension ops (PR40091)	Simon Pilgrim	2018-12-19	2	-4/+23
\| \| \| \| \| \| \| \| \| \| \| \|	As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect as the output should at least guarantee that the new upper bits are set to zero. SimplifyDemandedVectorElts is the worst offender (and it's the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue. Thanks to @dmgreen for catching this. Differential Revision: https://reviews.llvm.org/D55883 llvm-svn: 349625
*	[SelectionDAG] Optional handling of UNDEF elements in matchUnaryPredicate	Simon Pilgrim	2018-12-19	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \| \|	Now that SimplifyDemandedBits/SimplifyDemandedVectorElts are simplifying vector elements, we're seeing more constant BUILD_VECTOR containing UNDEFs. This patch provides opt-in handling of UNDEF elements in matchUnaryPredicate, passing NULL instead of the ConstantSDNode* argument. I've updated SelectionDAG::simplifyShift to demonstrate its use. Differential Revision: https://reviews.llvm.org/D55819 llvm-svn: 349616
*	Rewrite objc intrinsics to runtime methods in PreISelIntrinsicLowering ↵	Pete Cooper	2018-12-18	1	-50/+0
\| \| \| \| \| \| \| \| \| \|	instead of SDAG. SelectionDAG currently changes these intrinsics to function calls, but that won't work for other ISel's. Also we want to eventually support nonlazybind and weak linkage coming from the front-end which we can't do in SelectionDAG. llvm-svn: 349552
*	[SelectionDAG][X86] Fix [US](ADD\|SUB)SAT vector legalization, add tests	Nikita Popov	2018-12-18	2	-2/+6
\| \| \| \| \| \| \| \| \|	Integer result promotion needs to use the scalar size, and we need support for result widening. This is in preparation for D55787. llvm-svn: 349480
*	[TargetLowering] Fallback from SimplifyDemandedVectorElts to ↵	Simon Pilgrim	2018-12-18	1	-1/+8
\| \| \| \| \| \| \| \|	SimplifyDemandedBits For opcodes not covered by SimplifyDemandedVectorElts, SimplifyDemandedBits might be able to help now that it supports demanded elts as well. llvm-svn: 349466
*	[SDAG] Clarify the origin of chain in REG_SEQUENCE in comment, NFC	Krzysztof Parzyszek	2018-12-17	1	-1/+3
\| \| \| \|	llvm-svn: 349391
*	[SelectionDAG] Fix noop detection for vectors in AssertZext/AssertSext in ↵	Craig Topper	2018-12-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	getNode The assertion type is always supposed to be a scalar type. So if the result VT of the assertion is a vector, we need to get the scalar VT before we can compare them. Similarly for the assert above it. I don't have a test case because I don't know of any place we violate this today. A coworker found this while trying to use r347287 on the 6.0 branch without also having r336868 llvm-svn: 349390
*	NFC: remove unused variable	JF Bastien	2018-12-17	1	-1/+0
\| \| \| \| \| \|	D55768 removed its use. llvm-svn: 349377
*	[TargetLowering] Add DemandedElts mask to SimplifyDemandedBits (PR40000)	Simon Pilgrim	2018-12-17	1	-42/+120
\| \| \| \| \| \| \| \| \| \|	This is an initial patch to add the necessary support for a DemandedElts argument to SimplifyDemandedBits, more closely matching computeKnownBits and to help improve vector codegen. I've added only a small amount of the changes necessary to get at least one test to update - a lot more can be done but I'd like to add these methodically with proper test coverage, at the same time the hope is to slowly move some/all of SimplifyDemandedVectorElts into SimplifyDemandedBits as well. Differential Revision: https://reviews.llvm.org/D55768 llvm-svn: 349374
*	FastIsel: take care to update iterators when removing instructions.	Tim Northover	2018-12-17	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointing at dangling memory. llvm-svn: 349365
*	[DAGCombiner] allow hoisting vector bitwise logic ahead of truncates	Sanjay Patel	2018-12-16	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The transform performs a bitwise logic op in a wider type followed by truncate when both inputs are truncated from the same source type: logic_op (truncate x), (truncate y) --> truncate (logic_op x, y) There are a bunch of other checks that should prevent doing this when it might be harmful. We already do this transform for scalars in this spot. The vector limitation was shared with a check for the case when the operands are extended. I'm not sure if that limit is needed either, but that would be a separate patch. Differential Revision: https://reviews.llvm.org/D55448 llvm-svn: 349303
*	[SelectionDAG] Add FSHL/FSHR support to computeKnownBits	Simon Pilgrim	2018-12-16	2	-2/+37
\| \| \| \| \| \|	Also exposes an issue in DAGCombiner::visitFunnelShift where we were assuming the shift amount had the result type (after legalization it'll have the target's shift amount type). llvm-svn: 349298
*	[TargetLowering] Add ISD::OR + ISD::XOR handling to SimplifyDemandedVectorElts	Simon Pilgrim	2018-12-15	1	-0/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D55600 llvm-svn: 349264
*	[SDAG] Ignore chain operand in REG_SEQUENCE when emitting instructions	Krzysztof Parzyszek	2018-12-14	1	-0/+4
\| \| \| \|	llvm-svn: 349186