bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[ARM] Sink add/mul(shufflevector(insertelement())) for MVE instruction selection	Sam Tebbs	2019-09-06	1	-0/+216
	This patch sinks add/mul(shufflevector(insertelement())) into the basic block in which they are used so that they can then be selected together. This is useful for various MVE instructions, such as vmla and others that take R registers. Loop tests have been added to the vmla test file to make sure vmlas are generated in loops.
	Differential revision: https://reviews.llvm.org/D66295
	llvm-svn: 371218
*	Cleanup test.	Alina Sbirlea	2019-09-06	1	-2/+1
	llvm-svn: 371158
*	InstCombine: Fix crash on icmp of gep with addrspacecasted null	Matt Arsenault	2019-09-05	1	-0/+28
	llvm-svn: 371146
*	[AliasSetTracker] Correct AAInfo check.	Alina Sbirlea	2019-09-05	1	-0/+71
	Properly check if NewAAInfo conflicts with AAInfo. Update the local variable and the alias set to record that a change occurred when a conflict is found. Resolves PR42969.
	llvm-svn: 371139
*	[SimplifyCFG] Don't SimplifyBranchOnICmpChain with ExtraCase	Vitaly Buka	2019-09-05	1	-0/+102
	Summary:
	Here we try to avoid issues with "explicit branch" with SimplifyBranchOnICmpChain, which can check on undef. MSan by design reports branches on uninitialized memory and undefs, so we get a false report here.
	In general, MSan does not like it when we convert
	```
	// If at least one of them is true, MSan is OK even if the other is undef
	if (a || b)
	  return;
	```
	into
	```
	// If 'a' is undef, MSan will complain even if 'b' is true
	if (a)
	  return;
	if (b)
	  return;
	```
	Example
	Before optimization we had something like this:
	```
	while (true) {
	  bool maybe_undef = doStuff();
	  while (true) {
	    char c = getChar();
	    if (c != 10 && c != 13)
	      continue;
	    break;
	  }
	  // we know that c == 10 || c == 13 if we get here,
	  // so MSan knows that the branch is not affected by maybe_undef
	  if (maybe_undef || c == 10 || c == 13)
	    continue;
	  return;
	}
	```
	SimplifyBranchOnICmpChain will convert that into
	```
	while (true) {
	  bool maybe_undef = doStuff();
	  while (true) {
	    char c = getChar();
	    if (c != 10 && c != 13)
	      continue;
	    break;
	  }
	  // however MSan will complain here:
	  if (maybe_undef)
	    continue;
	  // we know that c == 10 || c == 13, so either way we will get continue
	  switch (c) {
	    case 10: continue;
	    case 13: continue;
	  }
	  return;
	}
	```
	Reviewers: eugenis, efriedma
	Reviewed By: eugenis, efriedma
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67205
	llvm-svn: 371138
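The difference between the two forms can be sketched with a toy three-valued model. This is only an illustration of the behaviour the commit message describes, not MSan's implementation; `UNDEF`, `msan_or`, `branch`, `combined`, and `split` are hypothetical names introduced here.

```python
UNDEF = object()  # hypothetical stand-in for an uninitialized value


def msan_or(a, b):
    """Logical or under the rule from the message: a defined True wins."""
    if a is True or b is True:
        return True
    if a is UNDEF or b is UNDEF:
        return UNDEF
    return False


def branch(cond, reports):
    """Branch on cond; record a report if cond is uninitialized."""
    if cond is UNDEF:
        reports.append("branch on uninitialized value")
        return False
    return cond


def combined(a, b):
    """Models `if (a || b) return;` as one branch on the or-ed value."""
    reports = []
    taken = branch(msan_or(a, b), reports)
    return taken, reports


def split(a, b):
    """Models `if (a) return; if (b) return;` as two separate branches."""
    reports = []
    taken = branch(a, reports) or branch(b, reports)
    return taken, reports
```

In this model both forms take the early return for `(UNDEF, True)`, but only the split form records a report, which is the false positive the patch avoids.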
*	[NFC][InstCombine] Overhaul 'unsigned add overflow' tests, ensure that all 3 patterns have full test coverage	Roman Lebedev	2019-09-05	6	-8/+854
	llvm-svn: 371108
*	[InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned sub overflow' check	Roman Lebedev	2019-09-05	2	-10/+7
	A follow-up for r329011. This may be changed to produce @llvm.sub.with.overflow in a later patch, but for now just make things more consistent overall.
	A few observations stem from this:
	* There does not seem to be a similar one-instruction fold for uadd-overflow
	* I'm not sure we'll want to canonicalize `B u> A` as `usub.with.overflow`, so since the `icmp` here no longer refers to `sub`, reconstructing `usub.with.overflow` will be problematic, and will likely require a standalone pass (similar to DivRemPairs).
	https://rise4fun.com/Alive/Zqs
	Name: (A - B) u> A --> B u> A
	  %t0 = sub i8 %A, %B
	  %r = icmp ugt i8 %t0, %A
	=>
	  %r = icmp ugt i8 %B, %A
	Name: (A - B) u<= A --> B u<= A
	  %t0 = sub i8 %A, %B
	  %r = icmp ule i8 %t0, %A
	=>
	  %r = icmp ule i8 %B, %A
	Name: C u< (C - D) --> C u< D
	  %t0 = sub i8 %C, %D
	  %r = icmp ult i8 %C, %t0
	=>
	  %r = icmp ult i8 %C, %D
	Name: C u>= (C - D) --> C u>= D
	  %t0 = sub i8 %C, %D
	  %r = icmp uge i8 %C, %t0
	=>
	  %r = icmp uge i8 %C, %D
	llvm-svn: 371101
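The four 'unsigned sub overflow' folds above can be cross-checked by brute force over all i8 pairs. This is a minimal sketch that models i8 with an 8-bit mask; it is not how Alive proves them, and `sub_i8` and `check_usub_folds` are names made up for the illustration.

```python
MASK = 0xFF  # model i8 arithmetic by wrapping to 8 bits


def sub_i8(a, b):
    """Wrapping i8 subtraction, with the result read as unsigned."""
    return (a - b) & MASK


def check_usub_folds():
    """Exhaustively compare each icmp before and after the fold."""
    for a in range(256):
        for b in range(256):
            t = sub_i8(a, b)
            assert (t > a) == (b > a)    # (A - B) u> A   -->  B u> A
            assert (t <= a) == (b <= a)  # (A - B) u<= A  -->  B u<= A
            assert (a < t) == (a < b)    # C u< (C - D)   -->  C u< D
            assert (a >= t) == (a >= b)  # C u>= (C - D)  -->  C u>= D
    return True
```

Calling `check_usub_folds()` walks all 65536 operand pairs and raises `AssertionError` if any fold disagrees with the original comparison.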
*	[InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned add overflow' check	Roman Lebedev	2019-09-05	1	-12/+12
	A follow-up for r342004. This will be changed to produce @llvm.add.with.overflow in a later patch, but for now just make things more consistent overall.
	https://rise4fun.com/Alive/qxE
	Name: (Op1 + X) u< Op1 --> ~Op1 u< X
	  %t0 = add i8 %Op1, %X
	  %r = icmp ult i8 %t0, %Op1
	=>
	  %n = xor i8 %Op1, -1
	  %r = icmp ult i8 %n, %X
	Name: (Op1 + X) u>= Op1 --> ~Op1 u>= X
	  %t0 = add i8 %Op1, %X
	  %r = icmp uge i8 %t0, %Op1
	=>
	  %n = xor i8 %Op1, -1
	  %r = icmp uge i8 %n, %X
	;-------------------------------------------------------------------------------
	Name: Op0 u> (Op0 + X) --> X u> ~Op0
	  %t0 = add i8 %Op0, %X
	  %r = icmp ugt i8 %Op0, %t0
	=>
	  %n = xor i8 %Op0, -1
	  %r = icmp ugt i8 %X, %n
	Name: Op0 u<= (Op0 + X) --> X u<= ~Op0
	  %t0 = add i8 %Op0, %X
	  %r = icmp ule i8 %Op0, %t0
	=>
	  %n = xor i8 %Op0, -1
	  %r = icmp ule i8 %X, %n
	llvm-svn: 371100
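The 'unsigned add overflow' folds above can be checked the same way. A minimal sketch, assuming i8 is modelled with an 8-bit mask; `add_i8`, `not_i8`, and `check_uadd_folds` are hypothetical helper names, with `not_i8` playing the role of `xor i8 %v, -1`.

```python
MASK = 0xFF  # model i8 arithmetic by wrapping to 8 bits


def add_i8(a, b):
    """Wrapping i8 addition, with the result read as unsigned."""
    return (a + b) & MASK


def not_i8(a):
    """Bitwise not on i8, i.e. xor with -1."""
    return a ^ MASK


def check_uadd_folds():
    """Exhaustively compare each icmp before and after the fold."""
    for op in range(256):
        for x in range(256):
            t = add_i8(op, x)
            n = not_i8(op)
            assert (t < op) == (n < x)    # (Op1 + X) u< Op1   -->  ~Op1 u< X
            assert (t >= op) == (n >= x)  # (Op1 + X) u>= Op1  -->  ~Op1 u>= X
            assert (op > t) == (x > n)    # Op0 u> (Op0 + X)   -->  X u> ~Op0
            assert (op <= t) == (x <= n)  # Op0 u<= (Op0 + X)  -->  X u<= ~Op0
    return True
```

The folds hold because the addition wraps exactly when `X` exceeds `255 - Op`, and `255 - Op` is `~Op` on i8.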
*	[InstCombine][NFC] Tests for 'unsigned sub overflow' check	Roman Lebedev	2019-09-05	2	-0/+313
	----------------------------------------
	Name: unsigned sub, overflow, v0
	  %sub = sub i8 %x, %y
	  %ov = icmp ugt i8 %sub, %x
	=>
	  %agg = usub_overflow i8 %x, %y
	  %sub = extractvalue {i8, i1} %agg, 0
	  %ov = extractvalue {i8, i1} %agg, 1
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: unsigned sub, no overflow, v0
	  %sub = sub i8 %x, %y
	  %ov = icmp ule i8 %sub, %x
	=>
	  %agg = usub_overflow i8 %x, %y
	  %sub = extractvalue {i8, i1} %agg, 0
	  %not.ov = extractvalue {i8, i1} %agg, 1
	  %ov = xor %not.ov, -1
	Done: 1
	Optimization is correct!
	llvm-svn: 371099
*	[InstCombine][NFC] Tests for 'unsigned add overflow' check	Roman Lebedev	2019-09-05	2	-0/+398
	----------------------------------------
	Name: unsigned add, overflow, v0
	  %add = add i8 %x, %y
	  %ov = icmp ult i8 %add, %x
	=>
	  %agg = uadd_overflow i8 %x, %y
	  %add = extractvalue {i8, i1} %agg, 0
	  %ov = extractvalue {i8, i1} %agg, 1
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: unsigned add, overflow, v1
	  %add = add i8 %x, %y
	  %ov = icmp ult i8 %add, %y
	=>
	  %agg = uadd_overflow i8 %x, %y
	  %add = extractvalue {i8, i1} %agg, 0
	  %ov = extractvalue {i8, i1} %agg, 1
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: unsigned add, no overflow, v0
	  %add = add i8 %x, %y
	  %ov = icmp uge i8 %add, %x
	=>
	  %agg = uadd_overflow i8 %x, %y
	  %add = extractvalue {i8, i1} %agg, 0
	  %not.ov = extractvalue {i8, i1} %agg, 1
	  %ov = xor %not.ov, -1
	Done: 1
	Optimization is correct!
	----------------------------------------
	Name: unsigned add, no overflow, v1
	  %add = add i8 %x, %y
	  %ov = icmp uge i8 %add, %y
	=>
	  %agg = uadd_overflow i8 %x, %y
	  %add = extractvalue {i8, i1} %agg, 0
	  %not.ov = extractvalue {i8, i1} %agg, 1
	  %ov = xor %not.ov, -1
	Done: 1
	Optimization is correct!
	llvm-svn: 371098
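The relation between the compare-based checks and the overflow bit can be sketched as follows. The `uadd_overflow` function here is a hypothetical Python model of an unsigned add-with-overflow on i8 (returning the wrapped sum and the overflow flag), written only to mirror the four patterns listed above.

```python
MASK = 0xFF  # model i8 values


def uadd_overflow(x, y):
    """Model of an i8 unsigned add-with-overflow: (wrapped sum, overflow bit)."""
    total = x + y
    return total & MASK, total > MASK


def check_patterns():
    """Each icmp form must agree with the overflow bit for every i8 pair."""
    for x in range(256):
        for y in range(256):
            add, ov = uadd_overflow(x, y)
            assert (add < x) == ov          # unsigned add, overflow, v0
            assert (add < y) == ov          # unsigned add, overflow, v1
            assert (add >= x) == (not ov)   # unsigned add, no overflow, v0
            assert (add >= y) == (not ov)   # unsigned add, no overflow, v1
    return True
```

The v0 and v1 variants differ only in which operand the sum is compared against; since addition is commutative, both agree with the same overflow bit.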
*	[MergedLoadStoreMotion] Sink stores to BB with more than 2 predecessors	Denis Bakhvalov	2019-09-05	1	-0/+94
	If we have:
	bb5:
	  br i1 %arg3, label %bb6, label %bb7
	bb6:
	  %tmp = getelementptr inbounds i32, i32* %arg1, i64 2
	  store i32 3, i32* %tmp, align 4
	  br label %bb9
	bb7:
	  %tmp8 = getelementptr inbounds i32, i32* %arg1, i64 2
	  store i32 3, i32* %tmp8, align 4
	  br label %bb9
	bb9: ; preds = %bb4, %bb6, %bb7
	  ...
	We can't sink stores directly into bb9. This patch creates a new BB that is a successor of %bb6 and %bb7 and sinks stores into that block.
	SplitFooterBB is the parameter to the pass that controls that behavior.
	Change-Id: I7fdf50a772b84633e4b1b860e905bf7e3e29940f
	Differential: https://reviews.llvm.org/D66234
	llvm-svn: 371089
*	[PGO][CHR] Speed up following long, interlinked use-def chains.	Hiroshi Yamauchi	2019-09-05	1	-0/+151
	Summary: Avoid visiting an instruction more than once by using a map. This is similar to https://reviews.llvm.org/rL361416.
	Reviewers: davidxl
	Reviewed By: davidxl
	Subscribers: llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67198
	llvm-svn: 371086
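The idea of visiting each instruction once via a map can be sketched on a toy use-def graph. This is not the CHR pass's code; `collect_reachable`, `interlinked_chain`, and the `(name, layer)` node encoding are assumptions made for the illustration.

```python
def collect_reachable(root, operands):
    """Walk use-def chains from root, visiting each instruction at most once.

    operands maps an instruction to the instructions it uses. Returns the
    set of reachable instructions and the number of nodes expanded.
    """
    visited = {}  # the map that prevents re-visiting
    stack = [root]
    visits = 0
    while stack:
        inst = stack.pop()
        if inst in visited:
            continue
        visited[inst] = True
        visits += 1
        stack.extend(operands.get(inst, ()))
    return set(visited), visits


def interlinked_chain(depth):
    """Build a chain where both nodes of layer i use both nodes of layer i-1."""
    operands = {}
    for i in range(1, depth + 1):
        below = [("a", i - 1), ("b", i - 1)]
        operands[("a", i)] = below
        operands[("b", i)] = below
    return operands
```

On `interlinked_chain(30)` a naive recursive walk would follow on the order of 2^30 paths, while the map-based walk expands each of the 61 reachable nodes once.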
*	AMDGPU: Add intrinsics for address space identification	Matt Arsenault	2019-09-05	1	-0/+55
	The library currently uses ptrtoint and directly checks the queue ptr for this, which counts as a pointer capture.
	llvm-svn: 371009
*	[Attributor][Fix] Make sure we do not delete live code	Johannes Doerfert	2019-09-04	3	-8/+64
	Summary: Liveness needs to mark edges, not blocks, as dead.
	Reviewers: sstefan1, uenoku
	Subscribers: hiraditya, bollu, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67191
	llvm-svn: 370975
*	[InstCombine] Add more test cases (NFC)	Evandro Menezes	2019-09-04	1	-40/+99
	Add more test cases simplifying `log()`.
	llvm-svn: 370966
*	[InstCombine] sub(xor(x, y), or(x, y)) -> neg(and(x, y))	David Bolvansky	2019-09-04	1	-20/+15
	Summary:
	```
	Name: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
	  %or = or i32 %y, %x
	  %xor = xor i32 %x, %y
	  %sub = sub i32 %xor, %or
	=>
	  %sub1 = and i32 %x, %y
	  %sub = sub i32 0, %sub1

	Optimization: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
	Done: 1
	Optimization is correct!
	```
	https://rise4fun.com/Alive/8OI
	Reviewers: lebedev.ri
	Reviewed By: lebedev.ri
	Subscribers: llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67188
	llvm-svn: 370945
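As a quick cross-check of the fold above, the two sides can be compared exhaustively on 8-bit values. The proof in the message is for i32; the sketch below assumes the identity does not depend on the bit width and uses i8 only to keep the search small. `fold_holds` and `check_all_i8` are hypothetical names.

```python
MASK = 0xFF  # i8 stand-in for the i32 values in the proof


def fold_holds(x, y):
    """Compare sub(xor(x, y), or(x, y)) with neg(and(x, y)) modulo 2^8."""
    before = ((x ^ y) - (x | y)) & MASK
    after = (0 - (x & y)) & MASK
    return before == after


def check_all_i8():
    """True if the fold holds for every pair of i8 operands."""
    return all(fold_holds(x, y) for x in range(256) for y in range(256))
```

For example, with `x = 0b1100` and `y = 0b1010` both sides evaluate to `-8` modulo 256, that is `248`.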
*	[NFC] Added tests for new fold	David Bolvansky	2019-09-04	1	-0/+120
	llvm-svn: 370941
*	[NFC] Adjust test filename	David Bolvansky	2019-09-04	1	-0/+0
	llvm-svn: 370939
*	[InstCombine] Fold sub (and A, B) (or A, B)) to neg (xor A, B)	David Bolvansky	2019-09-04	1	-20/+15
	Summary:
	```
	Name: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
	  %or = or i32 %y, %x
	  %and = and i32 %x, %y
	  %sub = sub i32 %and, %or
	=>
	  %sub1 = xor i32 %x, %y
	  %sub = sub i32 0, %sub1

	Optimization: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
	Done: 1
	Optimization is correct!
	```
	https://rise4fun.com/Alive/VI6
	Found by @lebedev.ri. Also author of the proof.
	Reviewers: lebedev.ri, spatel
	Reviewed By: lebedev.ri
	Subscribers: llvm-commits, lebedev.ri
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67155
	llvm-svn: 370934
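One way to see why this fold holds, independent of the Alive proof, is the identity that `or` is the carry-free sum of `xor` and `and`: the two have no set bits in common, so adding them produces no carries. A short derivation under that observation, with arithmetic taken modulo 2^32:

```latex
\begin{aligned}
x \mathbin{|} y &= (x \oplus y) + (x \mathbin{\&} y) \\
(x \mathbin{\&} y) - (x \mathbin{|} y)
  &= (x \mathbin{\&} y) - (x \oplus y) - (x \mathbin{\&} y) \\
  &= -(x \oplus y) \pmod{2^{32}}
\end{aligned}
```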
*	[Attributor] Look at internal functions only on-demand	Johannes Doerfert	2019-09-04	3	-4/+250
	Summary: Instead of building attributes for internal functions which we do not update as long as we assume they are dead, we now do not create attributes until we assume the internal function to be live. This improves the number of required iterations, as well as the number of required updates, in real code. On our tests, the results are mixed.
	Reviewers: sstefan1, uenoku
	Subscribers: hiraditya, bollu, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66914
	llvm-svn: 370924
*	[Attributor] Use the white list for attributes consistently	Johannes Doerfert	2019-09-04	11	-14/+14
	Summary: We create attributes on-demand so we need to check the white list on-demand. This also unifies the location at which we create, initialize, and eventually invalidate new abstract attributes. The tests show mixed results: a few more call site attributes are determined, which can cause more iterations.
	Reviewers: uenoku, sstefan1
	Subscribers: hiraditya, bollu, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66913
	llvm-svn: 370922
*	[Attributor] Deal more explicit with non-exact definitions	Johannes Doerfert	2019-09-04	4	-2/+44
	Summary: Before, we tried to rule out non-exact definitions early, but that led to on-demand attributes being created for them anyway. As a consequence we needed to look at the definition in the initialize of each attribute again. This patch centralizes this lookup and tightens the condition under which we give up on non-exact definitions.
	Reviewers: uenoku, sstefan1
	Subscribers: hiraditya, bollu, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67115
	llvm-svn: 370917
*	[InstSimplify] guard against unreachable code (PR43218)	Sanjay Patel	2019-09-04	1	-0/+23
	This would crash: https://bugs.llvm.org/show_bug.cgi?id=43218
	llvm-svn: 370911
*	[Debuginfo][SROA] Need to handle dbg.value in SROA pass.	Alexey Lapshin	2019-09-04	2	-154/+86
	The SROA pass processes debug info incorrectly if applied twice. Specifically, after SROA runs the first time, instcombine converts dbg.declare intrinsics into dbg.value. Inlining creates new opportunities for SROA, so it is called again. This time it does not correctly handle the previously inserted dbg.value intrinsics.
	Differential Revision: https://reviews.llvm.org/D64595
	llvm-svn: 370906
*	[InstCombine] add tests for insert/extract with identity shuffles; NFC	Sanjay Patel	2019-09-04	1	-0/+92
	llvm-svn: 370901
*	[NFC] Added a negative test for new fold	David Bolvansky	2019-09-04	1	-0/+19
	llvm-svn: 370890
*	[NFC] Fixed test	David Bolvansky	2019-09-04	1	-2/+2
	llvm-svn: 370888
*	[NFC] Adjust tests for new fold	David Bolvansky	2019-09-04	1	-4/+4
	llvm-svn: 370886
*	[NFC] Added tests for new fold	David Bolvansky	2019-09-04	1	-0/+101
	llvm-svn: 370885
*	[InstCombine] Fold sub (or A, B) (and A, B) to (xor A, B)	David Bolvansky	2019-09-04	1	-20/+8
	Summary:
	```
	Name: sub or and to xor
	  %or = or i32 %y, %x
	  %and = and i32 %x, %y
	  %sub = sub i32 %or, %and
	=>
	  %sub = xor i32 %x, %y

	Optimization: sub or and to xor
	Done: 1
	Optimization is correct!
	```
	https://rise4fun.com/Alive/eJu
	Reviewers: spatel, lebedev.ri
	Reviewed By: lebedev.ri
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67153
	llvm-svn: 370883
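The fold above can be spot-checked on 32-bit operands with seeded random samples. This is a sanity check rather than a proof (the Alive link carries the proof); `sub_or_and` and `check_samples` are hypothetical names, and the seed and sample count are arbitrary choices.

```python
import random


def sub_or_and(x, y):
    """The pattern before the fold: sub(or(x, y), and(x, y))."""
    return (x | y) - (x & y)


def check_samples(count=10000, seed=0):
    """Compare the pattern against xor on seeded random 32-bit operands."""
    rng = random.Random(seed)
    for _ in range(count):
        x = rng.getrandbits(32)
        y = rng.getrandbits(32)
        if sub_or_and(x, y) != (x ^ y):
            return False
    return True
```

No wrapping mask is needed here: every bit set in `x & y` is also set in `x | y`, so the subtraction never borrows and simply clears those bits.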
*	[NFC] Added a new test for D67153	David Bolvansky	2019-09-04	1	-0/+13
	llvm-svn: 370881
*	[NFC] Added tests for 'SUB of OR and AND to XOR' fold	David Bolvansky	2019-09-04	1	-0/+103
	llvm-svn: 370878
*	[Attributor] Use the delete API for liveness	Johannes Doerfert	2019-09-03	2	-11/+8
	Reviewers: uenoku, sstefan1
	Subscribers: hiraditya, bollu, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66833
	llvm-svn: 370818
*	[Attributor] Deduce "no-capture" argument attribute	Johannes Doerfert	2019-09-03	11	-117/+155
	Add the no-capture argument attribute deduction to the Attributor fixpoint framework. The new string attribute "no-capture-maybe-returned" is introduced to allow deduction of no-capture through functions that "capture" an argument but only by "returning" it. It is only used by the Attributor for testing.
	Differential Revision: https://reviews.llvm.org/D59922
	llvm-svn: 370817
*	[GVN] Propagate simple equalities from assumes within the tail of the block	Philip Reames	2019-09-03	1	-10/+24
	This extends the existing logic for propagating constant expressions in a manner analogous to what we do across basic blocks. The core point is that we choose some order of operands, and canonicalize uses towards that one. The heuristic used is inspired by the one used across blocks; in a follow-up change, I plan to common them so that the cross-block version uses the slightly stronger ordering herein. As noted by the TODOs in the code, there's a good amount of room for improving the existing code and making it more powerful. Some follow-up work is planned.
	Differential Revision: https://reviews.llvm.org/D66977
	llvm-svn: 370791
*	Revert r370454 "[LoopIdiomRecognize] BCmp loop idiom recognition"	Roman Lebedev	2019-09-03	4	-398/+580
	https://bugs.llvm.org/show_bug.cgi?id=43206 was filed, claiming that there is a miscompilation. Reverting until I investigate.
	This reverts commit r370454
	llvm-svn: 370788
*	[Tests/GVN] Precommit requested test additions from D66977	Philip Reames	2019-09-03	1	-0/+88
	llvm-svn: 370784
*	[LV] Fix miscompiles by adding non-header PHI nodes to AllowedExit	Bjorn Pettersson	2019-09-03	1	-91/+21
	Summary: Fold-tail currently supports reduction last-vector-value live-outs, but has yet to support last-scalar-value live-outs, including non-header phis. As it relies on AllowedExit in order to detect them and bail out, we need to add the non-header PHI nodes to AllowedExit, otherwise we end up with miscompiles.
	Solves https://bugs.llvm.org/show_bug.cgi?id=43166
	Reviewers: fhahn, Ayal
	Reviewed By: fhahn, Ayal
	Subscribers: anna, hiraditya, rkruppe, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67074
	llvm-svn: 370721
*	[LV] Precommit test case showing miscompile from PR43166. NFC	Bjorn Pettersson	2019-09-03	1	-0/+235
	Summary: Precommit test case showing miscompile from PR43166.
	Reviewers: fhahn, Ayal
	Reviewed By: fhahn
	Subscribers: rkruppe, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D67072
	llvm-svn: 370720
*	[InstCombine] recognize bswap disguised as shufflevector	Sanjay Patel	2019-09-02	1	-8/+11
	bitcast <N x i8> (shuf X, undef, <N, N-1,...0>) to i{N*8} --> bswap (bitcast X to i{N*8})
	In PR43146: https://bugs.llvm.org/show_bug.cgi?id=43146 ...we have a more complicated case where SLP is making a mess of bswap. This patch won't do anything for that currently, but we need to improve bswap recognition in instcombine, SLP, and/or a standalone pass to avoid that problem.
	This is limited using the data-layout so we don't try to do this transform with actual vector types. The backend does not appear to have folds to convert in either direction, so we don't want to mess up something that is actually better lowered as a shuffle.
	On x86, we're trading something like this:
	  vmovd %edi, %xmm0
	  vpshufb LCPI0_0(%rip), %xmm0, %xmm0 ## xmm0 = xmm0[3,2,1,0,u,u,u,u,u,u,u,u,u,u,u,u]
	  vmovd %xmm0, %eax
	For:
	  movl %edi, %eax
	  bswapl %eax
	Differential Revision: https://reviews.llvm.org/D66965
	llvm-svn: 370659
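The equivalence behind this transform can be sketched by modelling the vector as a list of byte lanes. The sketch assumes a little-endian layout and a reversing mask that runs from lane N-1 down to lane 0; `bswap_via_shuffle` and `bswap32` are hypothetical helpers, not LLVM APIs.

```python
def bswap_via_shuffle(value, num_bytes):
    """Bitcast to byte lanes, reverse them with a shuffle, bitcast back."""
    lanes = list(value.to_bytes(num_bytes, "little"))  # iN -> <N x i8>
    reversed_lanes = lanes[::-1]                       # mask <N-1, ..., 0>
    return int.from_bytes(bytes(reversed_lanes), "little")


def bswap32(value):
    """Reference 32-bit byte swap written with shifts and masks."""
    return (
        ((value & 0x000000FF) << 24)
        | ((value & 0x0000FF00) << 8)
        | ((value >> 8) & 0x0000FF00)
        | ((value >> 24) & 0x000000FF)
    )
```

For a 4-byte value, `bswap_via_shuffle(0x11223344, 4)` and `bswap32(0x11223344)` both give `0x44332211`.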
*	[ConstantFolding] Fix 'undef' folding for @llvm.[us]{add,sub}.with.overflow ops (PR43188)	Roman Lebedev	2019-09-01	1	-5/+8
	As we have already established/fixed in
	  https://bugs.llvm.org/show_bug.cgi?id=42209
	  https://reviews.llvm.org/D63065
	  https://reviews.llvm.org/rL363522
	the InstSimplify handling for @llvm.with.overflow ops with undefs is correct. Therefore if ConstantFolding produces different results, then it is wrong.
	This duplication of code hints at the need for some refactoring, but for now address the brokenness of ConstantFolding by copying the known-good handling from rL363522.
	Fixes https://bugs.llvm.org/show_bug.cgi?id=43188
	llvm-svn: 370608
*	[InstCombine] mempcpy(d,s,n) to memcpy(d,s,n) + n	David Bolvansky	2019-08-31	1	-6/+31
	Summary: The back-end currently expands mempcpy, but the middle-end should work with memcpy instead of mempcpy to enable more memcpy optimizations. The GCC backend emits mempcpy, so the LLVM backend could form it too, if we know the mempcpy libcall is better than memcpy + n.
	https://godbolt.org/z/dOCG96
	Reviewers: efriedma, spatel, craig.topper, RKSimon, jdoerfert
	Reviewed By: efriedma
	Subscribers: hjl.tools, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D65737
	llvm-svn: 370593
*	[CVP] Add tests for simplified with.overflow + icmp; NFC	Nikita Popov	2019-08-31	1	-0/+278
	These tests are based on D19867.
	llvm-svn: 370574
*	[CVP] Generate simpler code for elided with.overflow intrinsics	Nikita Popov	2019-08-31	1	-112/+97
	Use a { iN undef, i1 false } struct as the base, and only insert the first operand, instead of using { iN undef, i1 undef } as the base and inserting both. This is the same as what we do in InstCombine.
	Differential Revision: https://reviews.llvm.org/D67034
	llvm-svn: 370573
*	[SampleFDO] Add profile symbol list section to discriminate function being cold versus function being newly added.	Wei Mi	2019-08-31	4	-0/+152
	This is the second half of https://reviews.llvm.org/D66374.
	The profile symbol list is the collection of function symbols showing up in the binary which generates the current profile. It is used to discriminate between a function being cold and a function being newly added. The profile symbol list is only added for profiles with the ExtBinary format.
	During profile use compilation, when profile-sample-accurate is enabled, a function without a profile will be regarded as cold only when it is contained in that list.
	Differential Revision: https://reviews.llvm.org/D66766
	llvm-svn: 370563
*	[GVN] Verify value equality before doing phi translation for call instruction	Wei Mi	2019-08-30	1	-0/+87
	This is an updated version of https://reviews.llvm.org/D66909 to fix PR42605.
	Basically, the current phi translation translates an old value number to a new value number for a call instruction based on the literal equality of the call expression, without verifying there is no clobber in between. This is incorrect.
	To get a fine-grained check, use MemoryDependence analysis to do the job. However, this is still not ideal. Given a call instruction, `MemoryDependenceResults::getCallDependencyFrom` returns identical call instructions without a clobber in between, using a MemDepResult whose DepType is `Def`. However, identical is too strict here and we want it to be relaxed a little to consider phi-translation: the callee is the same, but param operands can be different. That means changing the semantics of `MemDepResult::Def`, and I don't know the potential impact.
	So currently the patch is still conservative and only handles MemDepResult::NonFuncLocal, which means the current call has no function-local clobber. If there is a clobber, even if it doesn't stand in between the current call and the call with the new value, we won't do phi-translation.
	Differential Revision: https://reviews.llvm.org/D67013
	llvm-svn: 370547
*	[Attributor] Use existing function information for the call site	Johannes Doerfert	2019-08-30	14	-14/+14
	Summary: Instead of recomputing information for call sites we now use the function information directly. This is always valid and once we have call site specific information we can improve here.
	This patch also bootstraps attributes that are created on-demand through an initial update call. Information that is known will then directly be available in the new attribute without causing an iteration delay.
	The tests show how this improves the iteration count.
	Reviewers: sstefan1, uenoku
	Subscribers: hiraditya, bollu, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66781
	llvm-svn: 370480
*	[Attributor] Manifest load/store alignment generally	Johannes Doerfert	2019-08-30	1	-0/+2
	Summary: Any pointer could have load/store users, not only floating ones, so we move the manifest logic for alignment into the AAAlignImpl class.
	Reviewers: uenoku, sstefan1
	Subscribers: hiraditya, bollu, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66922
	llvm-svn: 370479
*	[InstCombine][AMDGPU] Simplify tbuffer loads	Piotr Sobczak	2019-08-30	1	-0/+658
	Summary: Add missing tbuffer loads intrinsics in SimplifyDemandedVectorElts.
	Reviewers: arsenm, nhaehnle
	Reviewed By: arsenm
	Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66926
	llvm-svn: 370475
*	[Attributor] Implement AANoAliasCallSiteArgument initialization	Hideto Ueno	2019-08-30	1	-5/+10
	Summary: This patch adds an appropriate `initialize` method for `AANoAliasCallSiteArgument`.
	Reviewers: jdoerfert, sstefan1
	Reviewed By: jdoerfert
	Subscribers: hiraditya, llvm-commits
	Tags: #llvm
	Differential Revision: https://reviews.llvm.org/D66927
	llvm-svn: 370456