bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[InstCombine] [NFC] Add tests for getting rid of select of bittest (PR36950 ↵	Roman Lebedev	2018-04-04	1	-0/+464
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	/ PR17564) Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 \| PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 \| PR17564 ]], D45065, D45108 Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45107 llvm-svn: 329198
*	[SLPVectorizer][X86] Regenerate some tests. NFCI	Simon Pilgrim	2018-04-04	5	-74/+187
\| \| \| \|	llvm-svn: 329196
*	StructurizeCFG: Test for branch divergence correctly	Nicolai Haehnle	2018-04-04	1	-0/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass. Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Differential Revision: https://reviews.llvm.org/D43743 llvm-svn: 329165
*	[SCEV] Prove implications for SCEVUnknown Phis	Max Kazantsev	2018-04-04	2	-0/+181
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis. If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values `L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient to prove the predicate for values that come from the same predecessor block. The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0` given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot. Differential Revision: https://reviews.llvm.org/D44001 Reviewed By: anna llvm-svn: 329150
*	[SimplifyCFG] Teach merge conditional stores to handle cases where the ↵	Craig Topper	2018-04-04	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PostBB has more than 2 predecessors by inserting a new block for the store. Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors. This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store. Reviewers: efriedma, jmolloy, ABataev Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39760 llvm-svn: 329142
*	[InstCombine] allow more fmul folds with 'reassoc'	Sanjay Patel	2018-04-03	1	-25/+28
\| \| \| \| \| \| \| \|	The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121
*	[coroutines] Respect alloca alignment requirements when building coroutine frame	Gor Nishanov	2018-04-03	1	-0/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If an alloca need to be stored in the coroutine frame and it has an alignment specified and the alignment does not match the natural alignment of the alloca type. Insert appropriate padding into the coroutine frame to make sure that it gets requested alignment. For example for a packet type (which natural alignment is 1), but alloca alignment is 8, we may need to insert a padding field with required number of bytes to make sure it is properly aligned. ``` %PackedStruct = type <{ i64 }> ... %data = alloca %PackedStruct, align 8 ``` If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame: ``` %f.Frame = type { ..., i16, [6 x i8], %PackedStruct } ``` Reviewers: rnk, lewissbaker, modocache Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D45221 llvm-svn: 329112
*	[LoopInterchange] Add remark for calls preventing interchanging.	Florian Hahn	2018-04-03	1	-44/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It also updates test/Transforms/LoopInterchange/call-instructions.ll to use accesses where we can prove dependence after D35430. Reviewers: sebpop, karthikthecool, blitz.opensource Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45206 llvm-svn: 329111
*	[InstCombine] Fold compare of int constant against a splatted vector of ints	Daniel Neilson	2018-04-03	1	-0/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Folding patterns like: %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %ext = extractelement <4 x i8> %insvec, i32 0 %cond = icmp eq i32 %ext, 0 Combined with existing rules, this allows us to fold patterns like: %insvec = insertelement <4 x i8> undef, i8 %val, i32 0 %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %cond = icmp eq i8 %val, 0 When we construct a splat vector via a shuffle, and bitcast the vector into an integer type for comparison against an integer constant. Then we can simplify the the comparison to compare the splatted value against the integer constant. Reviewers: spatel, anna, mkazantsev Reviewed By: spatel Subscribers: efriedma, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D44997 llvm-svn: 329087
*	[SLP] Fix PR36481: vectorize reassociated instructions.	Alexey Bataev	2018-04-03	9	-186/+126
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085
*	[LoopInterchange] Update tests so DA can handle access after D35430.	Florian Hahn	2018-04-03	8	-325/+365
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I have taken the opportunity to simplify some tests slightly and move parts around. It also brings back a few IR checks for interchangable loops. Reviewers: karthikthecool, sebpop, grosser Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45207 llvm-svn: 329081
*	[SLP] Added tests for checks of reordering of the repeated instructions,	Alexey Bataev	2018-04-03	1	-0/+129
\| \| \| \| \| \|	NFC. llvm-svn: 329080
*	Revert "[SLP] Fix PR36481: vectorize reassociated instructions."	Benjamin Kramer	2018-04-03	8	-122/+183
\| \| \| \| \| \|	This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071
*	peel loops with runtime small trip counts	Ikhlas Ajbar	2018-04-03	1	-0/+37
\| \| \| \| \| \| \| \| \| \|	For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042
*	[SLP] Distinguish "demanded and shrinkable" from "demanded and not ↵	Haicheng Wu	2018-04-03	2	-19/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 llvm-svn: 329035
*	[Coroutines] Avoid assert splitting hidden coros	Brian Gesiak	2018-04-02	1	-0/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When attempting to split a coroutine with 'hidden' visibility (for example, a C++ coroutine that is inlined when compiled with the option '-fvisibility-inlines-hidden'), LLVM would hit an assertion in include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility". The issue is that the visibility is copied from the source of the function split in the `CloneFunctionInto` function, but the linkage is not. To fix, create the new function first with external linkage, then copy the linkage from the original function after `CloneFunctionInto` is called. Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to `setDSOLocal` can be removed in CoroSplit.cpp. Test Plan: check-llvm Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk Reviewed By: rnk Subscribers: llvm-commits, eric_niebler Differential Revision: https://reviews.llvm.org/D44185 llvm-svn: 329033
*	[InstCombine] Don't strip function type casts from musttail calls	Reid Kleckner	2018-04-02	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them. Reviewers: efriedma, majnemer, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45186 llvm-svn: 329027
*	Treat inlining a notail call as a regular, non-tail call	Reid Kleckner	2018-04-02	1	-0/+21
\| \| \| \| \| \| \| \| \|	Otherwise, we end up inlining a musttail call into a non-tail position, which breaks verifier invariants. Fixes PR31014 llvm-svn: 329015
*	[InstCombine] add folds for icmp + sub (PR36969)	Sanjay Patel	2018-04-02	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(A - B) >u A --> A <u B C <u (C - D) --> C <u D https://rise4fun.com/Alive/e7j Name: ugt %sub = sub i8 %x, %y %cmp = icmp ugt i8 %sub, %x => %cmp = icmp ult i8 %x, %y Name: ult %sub = sub i8 %x, %y %cmp = icmp ult i8 %x, %sub => %cmp = icmp ult i8 %x, %y This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969 llvm-svn: 329011
*	[InstCombine] add tests for icmp (sub x, y), x (PR36969); NFC	Sanjay Patel	2018-04-02	1	-0/+30
\| \| \| \|	llvm-svn: 329010
*	[DeadArgumentElim] Clone function level metadatas	Rong Xu	2018-04-02	1	-0/+67
\| \| \| \| \| \| \| \| \| \| \| \|	Some Function level metadatas, such as function entry count, are not cloned in DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim after internalization. This patch clones the metadatas in the original function to the new function. Differential Revision: https://reviews.llvm.org/D44127 llvm-svn: 328991
*	[coroutines] Add support for llvm.coro.noop intrinsics	Gor Nishanov	2018-04-02	1	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing. To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed. Reviewers: EricWF, modocache, rnk, lewissbaker Reviewed By: modocache Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45114 llvm-svn: 328986
*	[SLP] Fix PR36481: vectorize reassociated instructions.	Alexey Bataev	2018-04-02	8	-183/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 328980
*	[ThinLTO] Add an import cutoff for debugging/triaging	Teresa Johnson	2018-04-01	2	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds -import-cutoff=N which will stop importing during the thin link after N imports. Default is -1 (no limit). Reviewers: wmi Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45127 llvm-svn: 328934
*	[LoopRotate] Rotate loops with loop exiting latches	David Green	2018-04-01	1	-0/+234
\| \| \| \| \| \| \| \| \| \| \|	If a loop has a loop exiting latch, it can be profitable to rotate the loop if it leads to the simplification of a phi node. Perform rotation in these cases even if loop rotate itself didnt simplify the loop to get there. Differential Revision: https://reviews.llvm.org/D44199 llvm-svn: 328933
*	[ThinLTO] Add an option to force summary call edges cold for debugging	Teresa Johnson	2018-03-31	3	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Useful to selectively disable importing into specific modules for debugging/triaging/workarounds. Reviewers: eraman Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45062 llvm-svn: 328909
*	Revert "peel loops with runtime small trip counts"	Krzysztof Parzyszek	2018-03-30	1	-37/+0
\| \| \| \| \| \|	This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875
*	[Hexagon] add missing lit config file	Ikhlas Ajbar	2018-03-30	1	-0/+3
\| \| \| \|	llvm-svn: 328855
*	peel loops with runtime small trip counts	Ikhlas Ajbar	2018-03-30	1	-0/+37
\| \| \| \| \| \| \| \| \| \|	For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854
*	[SLPVectorizer] Add tests related to PR30787, NFCI.	Dinar Temirbulatov	2018-03-29	4	-0/+390
\| \| \| \|	llvm-svn: 328813
*	[JumpThreading] Don't select an edge that we know we can't thread	Haicheng Wu	2018-03-29	1	-0/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above mentioned change, prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them at arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not thraded which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798
*	[PGO] Fix branch probability remarks assert	Rong Xu	2018-03-27	2	-0/+30
\| \| \| \| \| \| \| \| \|	Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653
*	[IRCE] Enable decreasing loops of non-const bound	Sam Parker	2018-03-27	1	-0/+180
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613
*	[SCEV] Make exact taken count calculation more optimistic	Max Kazantsev	2018-03-27	2	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, `getExact` fails if it sees two exit counts in different blocks. There is no solid reason to do so, given that we only calculate exact non-taken count for exiting blocks that dominate latch. Using this fact, we can simply take min out of all exits of all blocks to get the exact taken count. This patch makes the calculation more optimistic with enforcing our assumption with asserts. It allows us to calculate exact backedge taken count in trivial loops like for (int i = 0; i < 100; i++) { if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44676 Reviewed By: fhahn llvm-svn: 328611
*	[SCEV] Add one more case in computeConstantDifference	Max Kazantsev	2018-03-27	1	-0/+89
\| \| \| \| \| \| \| \| \| \|	This patch teaches `computeConstantDifference` handle calculation of constant difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`. Differential Revision: https://reviews.llvm.org/D43759 Reviewed By: anna llvm-svn: 328609
*	[MemorySSA] Fix exponential compile-time updating MemorySSA.	Eli Friedman	2018-03-26	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for each block, it computes the previous definition for each predecessor, then takes those definitions and combines them. But currently it doesn't remember results which it already computed; this means it can visit the same block multiple times, which adds up to exponential time overall. To fix this, this patch adds a cache. If we computed the result for a block already, we don't need to visit it again because we'll come up with the same result. Well, unless we RAUW a MemoryPHI; in that case, the TrackingVH will be updated automatically. This matches the original source paper for this algorithm. The testcase isn't really a test for the bug, but it adds coverage for the case where tryRemoveTrivialPhi erases an existing PHI node. (It's hard to write a good regression test for a performance issue.) Differential Revision: https://reviews.llvm.org/D44715 llvm-svn: 328577
*	[SLP] Add more checks to a test case. NFC.	Haicheng Wu	2018-03-26	1	-0/+14
\| \| \| \|	llvm-svn: 328572
*	[SLP] Add a test case. NFC.	Haicheng Wu	2018-03-26	1	-0/+38
\| \| \| \|	llvm-svn: 328546
*	[InstCombine] reassociate loop invariant GEP chains to enable LICM	Sebastian Pop	2018-03-26	2	-2/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib. do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1 In this example %idx.ext1 is a loop invariant. It will be moved above the use of loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sinked down in the GEP chain. This patch will produce the following output: do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1 %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext llvm-svn: 328539
*	[InstCombine] distribute fmul over fadd/fsub	Sanjay Patel	2018-03-26	2	-13/+11
\| \| \| \| \| \| \| \| \| \|	This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected. We still need to loosen the FMF check. llvm-svn: 328502
*	[InstCombine] check uses before creating instructions for fmul distribution	Sanjay Patel	2018-03-26	1	-12/+3
\| \| \| \| \| \|	As the tests show, we could create extra instructions without any obvious benefit. llvm-svn: 328498
*	[LoopUnroll] Fix dangling pointers in SCEV	Max Kazantsev	2018-03-26	1	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current logic of loop SCEV invalidation in Loop Unroller implicitly relies on fact that exit count of outer loops cannot rely on exiting blocks of inner loops, which is true in current implementation of backedge taken count calculation but is wrong in general. As result, when we only forget the loop that we have just unrolled, we may still have cached data for its outer loops (in particular, exit counts) which keeps references on blocks of inner loop that could have been changed or even deleted. The attached test demonstrates a situaton when after unrolling of innermost loop the outermost loop contains a dangling pointer on non-existant block. The problem shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV smarter about exit count calculation. I am not sure if the bug exists without this patch, it appears that now it is accidentally correct just because in practice exact backedge taken count for outer loops with complex control flow inside is never calculated. But when SCEV learns to do so, this problem shows up. This patch replaces existing logic of SCEV loop invalidation with a correct one, which happens to be invalidation of outermost loop (which also leads to invalidation of all loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers on removed blocks, or just outdated information that has changed after unrolling. Differential Revision: https://reviews.llvm.org/D44818 Reviewed By: samparker llvm-svn: 328483
*	[DeadArgElim] Strip allocsize attributes when deleting an argument.	Benjamin Kramer	2018-03-26	1	-0/+13
\| \| \| \| \| \| \|	Since allocsize refers to the argument number it gets invalidated when an argument is removed and the numbers shift. llvm-svn: 328481
*	[IRCE] Enable increasing loops of variable bounds	Sam Parker	2018-03-26	2	-6/+176
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CanBeMin is currently used which will report true for any unknown values, but often a check is performed outside the loop which covers this situation: for (int i = 0; i < N; ++i) ... if (N > 0) for (int i = 0; i < N; ++i) ... So I've add 'LoopGuardedAgainstMin' which reports whether N is greater than the minimum value which then allows loop with a variable loop count to be optimised. I've also moved the increasing bound checking into its own function and replaced SumCanReachMax is another isLoopEntryGuardedByCond function. llvm-svn: 328480
*	[PatternMatch] allow undef elements when matching vector FP +0.0	Sanjay Patel	2018-03-25	6	-14/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461
*	[InstSimplify, InstCombine] add/update tests with FP +0.0 vector with undef; NFC	Sanjay Patel	2018-03-25	8	-364/+423
\| \| \| \|	llvm-svn: 328455
*	[InstCombine] adjust test comments; NFC	Sanjay Patel	2018-03-25	1	-9/+6
\| \| \| \|	llvm-svn: 328450
*	[InstCombine] consolidate casted icmp vector tests	Sanjay Patel	2018-03-25	1	-660/+43
\| \| \| \| \| \| \| \|	We have thorough coverage of predicates and scalar types, so we just need a sampling of vector tests to show that things are working or not with vectors types. llvm-svn: 328449
*	[InstCombine] peek through more icmp of FP cast + bitcast	Sanjay Patel	2018-03-25	1	-135/+45
\| \| \| \| \| \|	This is an extension of rL328426 as noted in D44367. llvm-svn: 328448
*	[InstCombine] peek through FP casts for sign-bit compares (PR36682)	Sanjay Patel	2018-03-24	2	-107/+27
\| \| \| \| \| \| \| \| \| \| \| \|	This pattern came up in PR36682: https://bugs.llvm.org/show_bug.cgi?id=36682 https://godbolt.org/g/LhuD9A Equality checks are planned as a follow-up enhancement. Differential Revision: https://reviews.llvm.org/D44367 llvm-svn: 328426