bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[DAGCombiner] convert logic-of-setcc into bit magic (PR40611)	Sanjay Patel	2019-02-12	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we're comparing some value for equality against 2 constants and those constants have an absolute difference of just 1 bit, then we can offset and mask off that 1 bit and reduce to a single compare against zero: and/or (setcc X, C0, ne), (setcc X, C1, ne/eq) --> setcc ((add X, -C1), ~(C0 - C1)), 0, ne/eq https://rise4fun.com/Alive/XslKj This transform is disabled by default using a TLI hook ("convertSetCCLogicToBitwiseLogic()"). That should be overridden for AArch64, MIPS, Sparc and possibly others based on the asm shown in: https://bugs.llvm.org/show_bug.cgi?id=40611 llvm-svn: 353859
*	Move some classes into anonymous namespaces. NFC.	Benjamin Kramer	2019-02-11	1	-0/+2
\| \| \| \|	llvm-svn: 353710
*	[DAG] Add optional AllowUndefs to isNullOrNullSplat	Simon Pilgrim	2019-02-10	1	-5/+1
\| \| \| \| \| \|	No change in default behaviour (AllowUndefs = false) llvm-svn: 353646
*	[DAGCombine] Simplify funnel shifts with undef/zero args to bitshifts	Simon Pilgrim	2019-02-10	1	-2/+41
\| \| \| \| \| \| \| \|	Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero. Differential Revision: https://reviews.llvm.org/D58009 llvm-svn: 353645
*	[DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X))	Nemanja Ivanovic	2019-02-08	1	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \|	The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros. Patch by Whitney Tsang (Whitney) Differential revision: https://reviews.llvm.org/D57434 llvm-svn: 353557
*	[TargetLowering] Add SimplifyDemandedBits funnel shift support	Simon Pilgrim	2019-02-08	1	-0/+4
\| \| \| \|	llvm-svn: 353539
*	Revert r353416 "[DAG] Cleanup unused nodes on failed store-to-load forward ↵	Nirav Dave	2019-02-08	1	-21/+9
\| \| \| \| \| \| \| \|	combine." This cleanup causes out-of-tree crashes. llvm-svn: 353527
*	[DAGCombiner] (add (umax X, C), -C) --> (usubsat X, C) (PR40111)	Simon Pilgrim	2019-02-07	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	Move the (add (umax X, C), -C) --> (usubsat X, C) X86 combine into generic DAGCombiner First of a number of saturated arithmetic folds that can be moved out of X86-specific code for PR40111. Differential Revision: https://reviews.llvm.org/D57754 llvm-svn: 353457
*	Revert "[DAG] Cleanup of unused node in SimplifySelectCC."	Nirav Dave	2019-02-07	1	-15/+8
\| \| \| \| \| \|	Causes ASAN use-after-poison errors. llvm-svn: 353442
*	[DAGCombiner] fold add/sub with bool operand based on target's boolean contents	Sanjay Patel	2019-02-07	1	-3/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I noticed that we are missing this canonicalization in IR: rL352515 ...and then realized that we don't get this right in SDAG either, so this has to be fixed first regardless of what we choose to do in IR. The existing fold was limited to scalars and using the wrong predicate to guard the transform. We have a boolean contents TLI query that can be used to decide which direction to fold. This may eventually lead back to the problems/question in: https://bugs.llvm.org/show_bug.cgi?id=40486 ...but it makes no difference to that yet. Differential Revision: https://reviews.llvm.org/D57401 llvm-svn: 353433
*	[DAG] Cleanup of unused node in SimplifySelectCC.	Nirav Dave	2019-02-07	1	-8/+15
\| \| \| \|	llvm-svn: 353428
*	[DAG] Cleanup unused node on failed SELECT Combine.	Nirav Dave	2019-02-07	1	-0/+6
\| \| \| \|	llvm-svn: 353426
*	[DAG] Cleanup unused nodes on failed store-to-load forward combine.	Nirav Dave	2019-02-07	1	-9/+21
\| \| \| \|	llvm-svn: 353416
*	[DAG] Immediately cleanup unused nodes from extend-based combines.	Nirav Dave	2019-02-06	1	-2/+7
\| \| \| \|	llvm-svn: 353338
*	[DAGCombine][NFC] GatherAllAliases should take a LSBaseSDNode.	Clement Courbet	2019-02-06	1	-8/+8
\| \| \| \| \| \| \|	GatherAllAliases only makes sense for LSBaseSDNode. Enforce it with static typing instead of runtime cast. llvm-svn: 353291
*	[DAGCombiner] Discard pointer info when combining extract_vector_elt of a ↵	Craig Topper	2019-02-05	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector load when the index isn't constant Summary: If the index isn't constant, this transform inserts a multiply and an add on the index to calculating the base pointer for a scalar load. But we still create a memory operand with an offset of 0 and the size of the scalar access. But the access is really to an unknown offset within the original access size. This can cause the machine scheduler to incorrectly calculate dependencies between this load and other accesses. In the case we saw, there was a 32 byte vector store that was split into two 16 byte stores, one with offset 0 and one with offset 16. The size of the memory operand for both was 16. The scheduler correctly detected the alias with the offset 0 store, but not the offset 16 store. This patch discards the pointer info so we don't incorrectly detect aliasing. I wasn't sure if we could keep using the original offset and size without risking some other transform on the load changing the size. I tried to reduce a test case, but there's still a lot of memory operations needed to get the scheduler to do the bad reordering. So it looked pretty fragile to maintain. Reviewers: efriedma Reviewed By: efriedma Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57616 llvm-svn: 353124
*	[DAGCombine] Add ADD(SUB,SUB) combines	Simon Pilgrim	2019-02-04	1	-0/+12
\| \| \| \| \| \| \| \|	Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case. We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage. llvm-svn: 353044
*	[SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.	Simon Pilgrim	2019-02-02	1	-1/+1
\| \| \| \| \| \| \| \|	We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts....... I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary. llvm-svn: 352961
*	[DAGCombine] Avoid CombineZExtLogicopShiftLoad if there is free ZEXT	Guozhi Wei	2019-01-31	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes pr39098. For the attached test case, CombineZExtLogicopShiftLoad can optimize it to t25: i64 = Constant<1099511627775> t35: i64 = Constant<0> t0: ch = EntryToken t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64 t58: i64 = srl t57, Constant:i8<1> t60: i64 = and t58, Constant:i64<524287> t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64 But later visitANDLike transforms it to t25: i64 = Constant<1099511627775> t35: i64 = Constant<0> t0: ch = EntryToken t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64 t61: i32 = truncate t57 t63: i32 = srl t61, Constant:i8<1> t64: i32 = and t63, Constant:i32<524287> t65: i64 = zero_extend t64 t58: i64 = srl t57, Constant:i8<1> t60: i64 = and t58, Constant:i64<524287> t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64 And it triggers CombineZExtLogicopShiftLoad again, causes a dead loop. Both forms should generate same instructions, CombineZExtLogicopShiftLoad generated IR looks cleaner. But it looks more difficult to prevent visitANDLike to do the transform, so I prevent CombineZExtLogicopShiftLoad to do the transform if the ZExt is free. Differential Revision: https://reviews.llvm.org/D57491 llvm-svn: 352792
*	[DAG] Aggressively cleanup dangling node in CombineZExtLogicopShiftLoad.	Nirav Dave	2019-01-31	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	While dangling nodes will eventually be pruned when they are considered, leaving them disables combines requiring single-use. Reviewers: Carrot, spatel, craig.topper, RKSimon, efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D57520 llvm-svn: 352784
*	[DAGCombiner] sub X, 0/1 --> add X, 0/-1	Sanjay Patel	2019-01-30	1	-10/+22
\| \| \| \| \| \| \| \| \| \|	This extends the existing transform for: add X, 0/1 --> sub X, 0/-1 ...to allow the sibling subtraction fold. This pattern could regress with the proposed change in D57401. llvm-svn: 352680
*	[DAGCombiner] fold extract_subvector of extract_subvector	Sanjay Patel	2019-01-29	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the sibling fold for insert-of-insert that was added with D56604. Now that we have x86 shuffle narrowing (D57156), this change shows improvements for lots of AVX512 reduction code (not sure that we would ever expect extract-of-extract otherwise). There's a small regression in some of the partial-permute tests (extracting followed by splat). That is tracked by PR40500: https://bugs.llvm.org/show_bug.cgi?id=40500 Differential Revision: https://reviews.llvm.org/D57336 llvm-svn: 352528
*	[NFC] TLI query with default(on) behavior wrt DAG combines for fmin/fmax ↵	Michael Berg	2019-01-28	1	-3/+7
\| \| \| \| \| \|	target control llvm-svn: 352396
*	[DAGCombine] Enable more pre-indexed stores	Sam Parker	2019-01-23	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \|	The current check in CombineToPreIndexedLoadStore is too conversative, preventing a pre-indexed store when the base pointer is a predecessor of the value being stored. Instead, we should check the pointer operand of the store. Differential Revision: https://reviews.llvm.org/D56719 llvm-svn: 351933
*	[DAGCombiner] narrow vector binop with 2 insert subvector operands	Sanjay Patel	2019-01-22	1	-1/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vecbo (insertsubv undef, X, Z), (insertsubv undef, Y, Z) --> insertsubv VecC, (vecbo X, Y), Z This is another step in generic vector narrowing. It's also a step towards more horizontal op formation specifically for x86 (although we still failed to match those in the affected tests). The scalarization cases are also not optimal (we should be scalarizing those), but it's still an improvement to use a narrower vector op when we know part of the result must be constant because both inputs are undef in some vector lanes. I think a similar match but checking for a constant operand might help some of the cases in D51553. Differential Revision: https://reviews.llvm.org/D56875 llvm-svn: 351825
*	[DAGCombiner] fix crash when converting build vector to shuffle	Sanjay Patel	2019-01-21	1	-5/+11
\| \| \| \| \| \| \| \| \| \|	The regression test is reduced from the example shown in D56281. This does raise a question as noted in the test file: do we want to handle this pattern? I don't have a motivating example for that on x86 yet, but it seems like we could have that pattern there too, so we could avoid the back-and-forth using a shuffle. llvm-svn: 351753
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[SelectionDAG] Split very large token factors for chained stores to 64k chunks.	Florian Hahn	2019-01-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to D55073. Without this change, the DAG combiner crashes on code with more than 64k of stores in a single basic block that form parallelizable chains. No test case, as it would be very IR file. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D56740 llvm-svn: 351571
*	[DAGCombine] Fix ReduceLoadWidth for shifted offsets	Sam Parker	2019-01-16	1	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \|	ReduceLoadWidth can trigger using a shifted mask is used and this requires that the function return a shl node to correct for the offset. However, the way that this was implemented meant that the returned result could be an existing node, which would be incorrect. This fixes the method of inserting the new node and replacing uses. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 351310
*	[DAGCombiner] reduce buildvec of zexted extracted element to shuffle	Sanjay Patel	2019-01-15	1	-0/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The motivating case for this is shown in the first regression test. We are transferring to scalar and back rather than just zero-extending with 'vpmovzxdq'. That's a special-case for a more general pattern as shown here. In all tests, we're avoiding the vector-scalar-vector moves in favor of vector ops. We aren't producing optimal shuffle code in some cases though, so the patch is limited to reduce regressions. Differential Revision: https://reviews.llvm.org/D56281 llvm-svn: 351198
*	[DAGCombiner] Add (sub_sat x, x) -> 0 combine	Simon Pilgrim	2019-01-14	1	-0/+4
\| \| \| \|	llvm-svn: 351073
*	[DAGCombiner] Enable sub saturation constant folding	Simon Pilgrim	2019-01-14	1	-1/+6
\| \| \| \|	llvm-svn: 351072
*	[DAGCombiner] Add add/sub saturation undef handling	Simon Pilgrim	2019-01-14	1	-0/+8
\| \| \| \| \| \| \| \|	Match ConstantFolding.cpp: (add_sat x, undef) -> -1 (sub_sat x, undef) -> 0 llvm-svn: 351070
*	[DAGCombiner] Enable add saturation constant folding	Simon Pilgrim	2019-01-14	1	-2/+3
\| \| \| \|	llvm-svn: 351060
*	[DAGCombiner] Add add saturation constant folding tests.	Simon Pilgrim	2019-01-14	1	-2/+3
\| \| \| \| \| \|	Exposes an issue with sadd_sat for computeOverflowKind, so I've disabled it for now. llvm-svn: 351057
*	[DAGCombiner] If add_sat(x,y) can't overflow -> add(x,y)	Simon Pilgrim	2019-01-13	1	-0/+4
\| \| \| \| \|	NOTE: We need more powerful signed overflow detection in computeOverflowKind llvm-svn: 351026
*	Fix unused variable warning. NFCI.	Simon Pilgrim	2019-01-13	1	-1/+0
\| \| \| \|	llvm-svn: 351025
*	[DAGCombiner] Some very basic add/sub saturation combines.	Simon Pilgrim	2019-01-13	1	-0/+64
\| \| \| \| \| \|	Handle combines with zero and constant canonicalization for adds. llvm-svn: 351024
*	[DAGCombiner] fold insert_subvector of insert_subvector	Sanjay Patel	2019-01-12	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pattern: t33: v8i32 = insert_subvector undef:v8i32, t35, Constant:i64<0> t21: v16i32 = insert_subvector undef:v16i32, t33, Constant:i64<0> ...shows up in PR33758: https://bugs.llvm.org/show_bug.cgi?id=33758 ...although this patch doesn't make any difference to the final result on that yet. In the affected tests here, it looks like it just makes RA wiggle. But we might as well squash this to prevent it interfering with other pattern-matching. Differential Revision: https://reviews.llvm.org/D56604 llvm-svn: 351008
*	[DAGCombiner] simplify code; NFC	Sanjay Patel	2019-01-10	1	-11/+11
\| \| \| \|	llvm-svn: 350844
*	[DAGCombiner][x86] scalarize binop followed by extractelement	Sanjay Patel	2019-01-03	1	-5/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As noted in PR39973 and D55558: https://bugs.llvm.org/show_bug.cgi?id=39973 ...this is a partial implementation of a fold that we do as an IR canonicalization in instcombine: // extelt (binop X, Y), Index --> binop (extelt X, Index), (extelt Y, Index) We want to have this in the DAG too because as we can see in some of the test diffs (reductions), the pattern may not be visible in IR. Given that this is already an IR canonicalization, any backend that would prefer a vector op over a scalar op is expected to already have the reverse transform in DAG lowering (not sure if that's a realistic expectation though). The transform is limited with a TLI hook because there's an existing transform in CodeGenPrepare that tries to do the opposite transform. Differential Revision: https://reviews.llvm.org/D55722 llvm-svn: 350354
*	[DAGCombiner] After performing the division by constant optimization for a ↵	Craig Topper	2019-01-02	1	-2/+29
\| \| \| \| \| \| \| \| \| \| \| \|	DIV or REM node, replace the users of the corresponding REM or DIV node if it exists. Currently we expand the two nodes separately. This gives DAG combiner an opportunity to optimize the expanded sequence taking into account only one set of users. When we expand the other node we'll create the expansion again, but might not be able to optimize it the same way. So the nodes won't CSE and we'll have two similarish sequences in the same basic block. By expanding both nodes at the same time we'll avoid prematurely optimizing the expansion until both the division and remainder have been replaced. Improves the test case from PR38217. There may be additional opportunities after this. Differential Revision: https://reviews.llvm.org/D56145 llvm-svn: 350239
*	[DAGCombiner][X86][PowerPC] Teach visitSIGN_EXTEND_INREG to fold ↵	Craig Topper	2019-01-02	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \|	(sext_in_reg (aext/sext x)) -> (sext x) when x has more than 1 sign bit and the sext_inreg is from one of them. If x has multiple sign bits than it doesn't matter which one we extend from so we can sext from x's msb instead. The X86 setcc-combine.ll changes are a little weird. It appears we ended up with a (sext_inreg (aext (trunc (extractelt)))) after type legalization. The sext_inreg+aext now gets optimized by this combine to leave (sext (trunc (extractelt))). Then we visit the trunc before we visit the sext. This ends up changing the truncate to an extractvectorelt from a bitcasted vector. I have a follow up patch to fix this. Differential Revision: https://reviews.llvm.org/D56156 llvm-svn: 350235
*	[DAGCombiner] Add missing one use check on the shuffle in the ↵	Craig Topper	2018-12-31	1	-1/+1
\| \| \| \| \| \| \| \|	bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) transform. Found while trying out some other changes so I don't really have a test case. llvm-svn: 350172
*	[DAGCombiner] limit shuffle to extend transform (PR40146)	Sanjay Patel	2018-12-23	1	-4/+5
\| \| \| \| \| \| \| \| \| \|	It's dangerous to knowingly create an illegal vector type no matter what stage of combining we're in. This prevents the missed folding/scalarization seen in: https://bugs.llvm.org/show_bug.cgi?id=40146 llvm-svn: 350034
*	[DAGCombiner] allow hoisting vector bitwise logic ahead of extends	Sanjay Patel	2018-12-23	1	-6/+5
\| \| \| \|	llvm-svn: 350032
*	[DAGCombiner] allow narrowing of add followed by truncate	Sanjay Patel	2018-12-22	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	trunc (add X, C ) --> add (trunc X), C' If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type. This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine). This change used to show regressions for x86, but those are gone after D55494. This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) that does almost the same thing. Differential Revision: https://reviews.llvm.org/D55866 llvm-svn: 350006
*	[DAGCombiner] simplify code leading to scalarizeExtractedVectorLoad; NFC	Sanjay Patel	2018-12-21	1	-6/+5
\| \| \| \|	llvm-svn: 349958
*	[SelectionDAG] Always use the version of computeKnownBits that returns a ↵	Simon Pilgrim	2018-12-21	1	-10/+6
\| \| \| \| \| \| \| \|	value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349907
*	[ARM] Complete the Thumb1 shift+and->shift+shift transforms.	Eli Friedman	2018-12-20	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This saves materializing the immediate. The additional forms are less common (they don't usually show up for bitfield insert/extract), but they're still relevant. I had to add a new target hook to prevent DAGCombine from reversing the transform. That isn't the only possible way to solve the conflict, but it seems straightforward enough. Differential Revision: https://reviews.llvm.org/D55630 llvm-svn: 349857