bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[LV] Tail-Loop Folding	Sjoerd Meijer	2019-08-01	2	-54/+99
\| \| \| \| \| \| \| \| \| \| \|	This allows folding of the scalar epilogue loop (the tail) into the main vectorised loop body when the loop is annotated with a "vector predicate" metadata hint. To fold the tail, instructions need to be predicated (masked), enabling/disabling lanes for the remainder iterations. Differential Revision: https://reviews.llvm.org/D65197 llvm-svn: 367592
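As a rough illustration of what folding the tail means, the sketch below models a vector loop of width `vf` in plain Python. The function names and the list-based "vector" are hypothetical stand-ins, not the vectorizer's actual code; the point is only that a per-lane mask lets the main loop cover the remainder iterations without a separate scalar epilogue.

```python
def add_with_scalar_epilogue(a, b, vf=4):
    """Vector main loop followed by a scalar epilogue for the remainder."""
    n = len(a)
    out = [0] * n
    main_end = n - n % vf
    for i in range(0, main_end, vf):
        # "Vector" body: all vf lanes are active.
        out[i:i + vf] = [x + y for x, y in zip(a[i:i + vf], b[i:i + vf])]
    for i in range(main_end, n):
        # Scalar tail: one element per iteration.
        out[i] = a[i] + b[i]
    return out


def add_tail_folded(a, b, vf=4):
    """Single loop; a mask disables the lanes that run past the end."""
    n = len(a)
    out = [0] * n
    for i in range(0, n, vf):
        mask = [i + lane < n for lane in range(vf)]
        for lane in range(vf):
            if mask[lane]:  # predicated (masked) lane
                out[i + lane] = a[i + lane] + b[i + lane]
    return out
```

Both functions compute the same result; the second one simply never needs the epilogue loop.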
*	[Attributor][FIX] Indicate a missing update change	Johannes Doerfert	2019-08-01	1	-3/+7
\| \| \| \| \| \| \| \| \| \|	Users of AAReturnedValues need to know if HasOverdefinedReturnedCalls changed from false to true as it will impact the result of the return value traversal (calls are not ignored anymore). This will be tested with the tests in D59978. llvm-svn: 367581
*	[IR] Value: add replaceUsesWithIf() utility	Roman Lebedev	2019-08-01	3	-21/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: While there is always a `Value::replaceAllUsesWith()`, sometimes the replacement needs to be conditional. I have only cleaned a few cases where `replaceUsesWithIf()` could be used, to both add test coverage, and show that it is actually useful. Reviewers: jdoerfert, spatel, RKSimon, craig.topper Reviewed By: jdoerfert Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65528 llvm-svn: 367548
*	[IR] SelectInst: add swapValues() utility	Roman Lebedev	2019-08-01	3	-14/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Sometimes we need to swap true-val and false-val of a `SelectInst`. Having a function for that is nicer than hand-writing it each time. Reviewers: spatel, RKSimon, craig.topper, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65520 llvm-svn: 367547
*	Fix a release-only build warning triggered by rL367485	Philip Reames	2019-08-01	1	-0/+3
\| \| \| \|	llvm-svn: 367499
*	[IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work)	Philip Reames	2019-07-31	1	-9/+7
\| \| \| \| \| \| \| \| \| \|	This is a preparatory patch for future work on supporting exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes. The test differences are caused by cases where getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop. llvm-svn: 367485
*	[InstCombine] canonicalize fneg before fmul/fdiv	Sanjay Patel	2019-07-31	2	-20/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it easier to implement the transforms (and possibly other fneg transforms) in 1 place because we can always start the pattern match from fneg (either the legacy binop or the new unop). There's a secondary practical benefit seen in PR21914 and PR42681: https://bugs.llvm.org/show_bug.cgi?id=21914 https://bugs.llvm.org/show_bug.cgi?id=42681 ...hoisting fneg rather than sinking seems to play nicer with LICM in IR (although this change may expose analysis holes in the other direction). 1. The instcombine test changes show the expected neutral IR diffs from reversing the order. 2. The reassociation tests show that we were missing an optimization opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says that all of these transforms are allowed (regardless of binop/unop fneg version) because: "For all other operations [besides copy/abs/negate/copysign], this standard does not specify the sign bit of a NaN result." In all of these transforms, we always have some other binop (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a potential intermediate NaN operand. (If that interpretation is wrong, then we must already have a bug in the existing transforms?) 3. The clang tests shouldn't exist as-is, but that's effectively a revert of rL367149 (the test broke with an extension of the pre-existing fneg canonicalization in rL367146). Differential Revision: https://reviews.llvm.org/D65399 llvm-svn: 367447
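The claim that fneg can be moved across fmul/fdiv can be spot-checked for ordinary finite values with a small sketch. This uses Python floats (IEEE-754 doubles) and a hypothetical helper name; it says nothing about NaN sign bits, which is the part of the argument that rests on the reading of the standard quoted above.

```python
import random


def fneg_commutes(x, y):
    """Check (-x) op y == -(x op y) for op in {*, /} on finite doubles."""
    mul_ok = (-x) * y == -(x * y)
    div_ok = (-x) / y == -(x / y)
    return mul_ok and div_ok


random.seed(0)
# Divisors are kept away from zero so Python does not raise ZeroDivisionError.
samples = [
    (random.uniform(-1e6, 1e6), random.uniform(1e-3, 1e6))
    for _ in range(10_000)
]
all_match = all(fneg_commutes(x, y) for x, y in samples)
```

Because round-to-nearest is symmetric around zero, negating an operand only flips the sign of the rounded result, so `all_match` is expected to be true for these samples.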
*	[AMDGPU] Fix for vectorizer crash with pointers of different size	Stanislav Mekhanoshin	2019-07-31	1	-0/+5
\| \| \| \| \| \| \| \| \|	When vectorizer strips pointers it can eventually end up with pointers of two different sizes, then SCEV will crash. Differential Revision: https://reviews.llvm.org/D65480 llvm-svn: 367443
*	[IPSCCP] Move callsite check to the beginning of the loop.	Florian Hahn	2019-07-31	1	-14/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have some code that marks instructions with struct operands as overdefined, but if the instruction is a call to a function with tracked arguments, this breaks the assumption that the lattice values of all call sites are not overdefined and will be replaced by a constant. This also re-adds the assertion from D65222, with additionally skipping non-callsite uses. This patch should address the cases reported in which the assertion fired. Fixes PR42738. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65439 llvm-svn: 367430
*	[DivRemPairs] Fixup DNDEBUG build - variable is only used in assertion	Roman Lebedev	2019-07-31	1	-0/+1
\| \| \| \|	llvm-svn: 367423
*	[DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673)	Roman Lebedev	2019-07-31	1	-6/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. The original patch was committed in rL367288 but was reverted in rL367289 because it exposed pre-existing RAUW issues in internal data structures of the pass; those now have been addressed in a previous patch. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367419
*	[DivRemPairs] Avoid RAUW pitfalls (PR42823)	Roman Lebedev	2019-07-31	1	-26/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: `DivRemPairs` internally creates two maps: * {sign, dividend, divisor} -> div instruction * {sign, dividend, divisor} -> rem instruction Then it iterates over the rem map, and looks for an entry in the div map with the same key. Then, depending on some internal logic, it may RAUW the rem instruction with something else. But if that rem instruction is an input to another div/rem, then it was used as a key in these maps, so the old value (used in the key) is now dangling, because RAUW didn't update those maps. And we can't even RAUW map keys in general; there's `ValueMap`, but we don't have a single `Value` as key... The bug was discovered via D65298, and the test there exists. Now, I'm not sure how to expose this issue in trunk. The bug is clearly there if I change the map keys to be `AssertingVH`/`PoisoningVH`, but I guess this didn't miscompile anything thus far? I really don't think this is benign without that patch. The fix is actually rather straight-forward - instead of trying to somehow shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new `ValueMap` with key being a struct of `Value`s, we can just have an intermediate data structure - a vector, each entry containing a matching `Div, Rem` pair, and pre-fill it before doing any modifications. This way we won't need to query the map after doing RAUW, so no bug is possible. Reviewers: spatel, bogner, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, hans, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65451 llvm-svn: 367417
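The shape of that fix can be sketched with a toy model. Everything here is hypothetical: instructions are plain tuples, the sign component of the key is omitted, and `collect_pairs` is not a function of the pass. The idea it illustrates is the one described above: build both maps, match them into a list of pairs up front, and never consult the maps again once rewriting begins.

```python
def collect_pairs(insts):
    """Return matching (div, rem) name pairs.

    insts is a list of (name, opcode, dividend, divisor) tuples.
    The maps are only used here, before any instruction is rewritten.
    """
    div_map, rem_map = {}, {}
    for name, op, lhs, rhs in insts:
        if op == "div":
            div_map[(lhs, rhs)] = name
        elif op == "rem":
            rem_map[(lhs, rhs)] = name
    # Pre-fill the worklist; later rewrites cannot leave stale keys behind
    # because nothing looks at div_map or rem_map after this point.
    return [(div_map[key], rem) for key, rem in rem_map.items() if key in div_map]


insts = [
    ("d0", "div", "x", "y"),
    ("r0", "rem", "x", "y"),
    ("d1", "div", "r0", "z"),  # r0 is an operand here, so it appears in a map key
    ("r1", "rem", "r0", "z"),
]
pairs = collect_pairs(insts)
```

In this model, replacing `r0` while processing the first pair would invalidate the key `("r0", "z")`, but the second pair has already been recorded in `pairs`, so it is still processed.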
*	Recommit "[GVN] Preserve loop related analysis/canonical forms."	Florian Hahn	2019-07-31	1	-5/+20
\| \| \| \| \| \| \|	This fixes some pipeline tests. This reverts commit d0b6f42936bfb6d56d325c732ae79400c9c6016a. llvm-svn: 367401
*	Revert [GVN] Preserve loop related analysis/canonical forms.	Florian Hahn	2019-07-30	1	-20/+5
\| \| \| \| \| \|	This reverts r367332 (git commit 2d7227ec3ac91f36fc32b1c21e72e2f1f5d030ad) llvm-svn: 367335
*	[GVN] Preserve loop related analysis/canonical forms.	Florian Hahn	2019-07-30	1	-5/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LoopInfo can be easily preserved by passing it to the functions that modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor). SplitCriticalEdge also preserves LoopSimplify and LCSSA form when passing in LoopInfo. The test case shows that we preserve LoopSimplify and LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some reason. Also I am not sure if it is possible to preserve those in the new pass manager, as they aren't analysis passes. Reviewers: reames, hfinkel, davide, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D65137 llvm-svn: 367332
*	[LoopFusion] Extend use of OptimizationRemarkEmitter	Kit Barton	2019-07-30	1	-73/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch extends the use of the OptimizationRemarkEmitter to provide information about loops that are not fused, and loops that are not eligible for fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops that are not eligible for fusion and the OptimizationRemarkMissed to identify loops that cannot be fused. It also reuses the statistics to provide the messages used in the OptimizationRemarks. This provides common message strings between the optimization remarks and the statistics. I would like feedback on this approach, in general. If people are OK with this, I will flesh out additional remarks in subsequent commits. Subscribers: hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63844 llvm-svn: 367327
*	[InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two	Roman Lebedev	2019-07-30	3	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling. While we did happen to handle this fold in most of the cases, the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two), or first turn `x s% -y` to `x u% y`; that does handle most of the cases. But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`, and thus we end up being stuck with `(x s% INT_MIN) == 0`. There is no such restriction for the more general fold: https://rise4fun.com/Alive/IIeS To be noted, the fold does not enforce that `y` is a constant, so it may indeed increase instruction count. This is consistent with what `x u% y`->`x & (y-1)` already does. I think it makes sense, it's at most one (simple) extra instruction, while `rem`ainder is really much more un-simple (and likely very costly). Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65046 llvm-svn: 367322
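The signed case, including `INT_MIN`, can be checked exhaustively at a small bit width. The sketch below assumes 8-bit two's-complement integers and models `srem` as truncating toward zero; the helper names are made up for this illustration and the linked Alive proof remains the authoritative check.

```python
BITS = 8
INT_MIN = -(1 << (BITS - 1))
MASK = (1 << BITS) - 1


def srem(x, y):
    """Signed remainder that truncates toward zero (sign follows x)."""
    r = abs(x) % abs(y)
    return -r if x < 0 else r


def is_pow2_bits(y):
    """True if the BITS-wide bit pattern of y has exactly one bit set."""
    u = y & MASK
    return u != 0 and (u & (u - 1)) == 0


def fold_holds():
    """Compare (x s% y) == 0 with (x & (y-1)) == 0 for every power-of-two y."""
    for y in range(INT_MIN, -INT_MIN):
        if not is_pow2_bits(y):
            continue
        for x in range(INT_MIN, -INT_MIN):
            before = srem(x, y) == 0
            after = ((x & MASK) & ((y - 1) & MASK)) == 0
            if before != after:
                return False
    return True
```

Note that `INT_MIN` itself (bit pattern `0x80`) counts as a power of two here, which is exactly the case the indirect folds could not reach.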
*	Revert "[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)"	Roman Lebedev	2019-07-30	1	-97/+13
\| \| \| \| \| \| \| \| \| \| \|	test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke: Only PHI nodes may reference their own value! %sub33 = srem i32 %sub33, %ranks_in_i This reverts commit r367288. llvm-svn: 367289
*	[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)	Roman Lebedev	2019-07-30	1	-13/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367288
*	ThinLTOBitcodeWriter: Include globals associated with type metadata globals in the merged module.	Peter Collingbourne	2019-07-29	1	-3/+11
\| \| \| \| \| \| \| \| \| \| \|	Globals that are associated with globals with type metadata need to appear in the merged module because they will reference the global's section directly. Differential Revision: https://reviews.llvm.org/D65312 llvm-svn: 367242
*	[InstCombine] fold fadd+fneg with fdiv/fmul between	Sanjay Patel	2019-07-29	1	-0/+18
\| \| \| \| \| \| \| \| \|	The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367227
*	[InstCombine] reduce code for fadd with fneg operand; NFC	Sanjay Patel	2019-07-29	1	-7/+4
\| \| \| \|	llvm-svn: 367224
*	[InstCombine] fold fsub+fneg with fdiv/fmul between	Sanjay Patel	2019-07-28	1	-0/+15
\| \| \| \| \| \| \| \| \|	The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367194
*	[Attributor] Deduce "align" attribute	Hideto Ueno	2019-07-28	1	-0/+236
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Deduce "align" attribute in attributor. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64152 llvm-svn: 367187
*	[InstSimplify] remove quadratic time looping (PR42771)	Sanjay Patel	2019-07-27	1	-21/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test case from: https://bugs.llvm.org/show_bug.cgi?id=42771 ...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is seemingly done just to avoid invalidating the instruction iterator. We can instead delay instruction deletion until we reach the end of the block (or we could delay until we reach the end of all blocks). There's a test diff here for a degenerate case with llvm.assume that is not meaningful in itself, but serves to verify this change in logic. This change probably doesn't result in much overall compile-time improvement because we call '-instsimplify' as a standalone pass only once in the standard -O2 opt pipeline currently. Differential Revision: https://reviews.llvm.org/D65336 llvm-svn: 367173
*	Revert [IPSCCP] Add assertion to surface cases where we zap returns with overdefined users.	Florian Hahn	2019-07-26	1	-15/+0
\| \| \| \| \| \| \| \| \| \|	This reverts r366998 (git commit 5354c83ece00690b4dbfa47925f8f5a8f33f1d9e). This breaks a Linux kernel build, and we have a reproducer to investigate. llvm-svn: 367160
*	[JumpThreading] Stop searching predecessor when the current bb is in an unreachable loop.	Wei Mi	2019-07-26	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	updatePredecessorProfileMetadata in jumpthreading tries to find the first dominating predecessor block for a PHI value by searching up the predecessor block chain. But jumpthreading may see some temporary IR state which contains unreachable bbs that have not been cleaned up. If an unreachable loop happens to be on the predecessor block chain, continuing to chase the predecessor block will run into an infinite loop. The patch fixes it. Differential Revision: https://reviews.llvm.org/D65310 llvm-svn: 367154
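A minimal sketch of the failure mode and one possible guard follows. It assumes a visited set is used to detect the cycle; the commit message does not say how the actual patch detects it, and `find_dominating_pred`, `single_pred`, and `is_target` are hypothetical names for this illustration only.

```python
def find_dominating_pred(start, single_pred, is_target):
    """Walk up the single-predecessor chain from start.

    single_pred maps a block name to its only predecessor (or is missing).
    Returns the first block for which is_target is true, or None if the
    chain ends or revisits a block (i.e. runs into a loop).
    """
    visited = set()
    bb = start
    while bb is not None and bb not in visited:
        if is_target(bb):
            return bb
        visited.add(bb)
        bb = single_pred.get(bb)
    return None


# "b" and "c" form a loop that never reaches "entry"; without the visited
# set the walk from "a" would spin forever.
looping = {"a": "b", "b": "c", "c": "b"}
normal = {"a": "b", "b": "entry"}
```

With the guard, the walk over `looping` terminates and reports no dominating predecessor, while the walk over `normal` still finds `entry`.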
*	[InstCombine] canonicalize negated operand of fdiv	Sanjay Patel	2019-07-26	1	-0/+10
\| \| \| \| \| \| \|	This is a transform that we use with fmul, so use it for fdiv too for consistency. llvm-svn: 367146
*	[InstCombine] remove flop from lerp patterns	Sanjay Patel	2019-07-26	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(Y * (1.0 - Z)) + (X * Z) --> Y - (Y * Z) + (X * Z) --> Y + Z * (X - Y) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=42716 Factoring eliminates an instruction, so that should be a good canonicalization. The potential conversion to FMA would be handled by the backend based on target capabilities. Differential Revision: https://reviews.llvm.org/D65305 llvm-svn: 367101
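The algebra behind the factoring can be confirmed with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction` so that rounding plays no role; with real floating-point values the two forms can round differently, which this example deliberately does not model. The function names are chosen for this illustration only.

```python
from fractions import Fraction


def lerp_original(x, y, z):
    """(Y * (1.0 - Z)) + (X * Z): two multiplies, one subtract, one add."""
    return y * (1 - z) + x * z


def lerp_factored(x, y, z):
    """Y + Z * (X - Y): one multiply, one subtract, one add."""
    return y + z * (x - y)


grid = [Fraction(n, 4) for n in range(-8, 9)]
identical = all(
    lerp_original(x, y, z) == lerp_factored(x, y, z)
    for x in grid
    for y in grid
    for z in grid
)
```

The factored form has one fewer operation, which matches the commit's statement that factoring eliminates an instruction.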
*	[Loop Utils] Extend the scope of addStringMetadataToLoop.	Serguei Katkov	2019-07-26	1	-2/+18
\| \| \| \| \| \| \| \| \| \| \| \|	To avoid duplicates in loop metadata, if the string to add is already there, just update the value. Reviewers: reames, Ashutosh Reviewed By: reames Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D65265 llvm-svn: 367087
*	[Loop Utils] Move utility addStringMetadataToLoop to LoopUtils.cpp. NFC.	Serguei Katkov	2019-07-26	2	-31/+31
\| \| \| \| \| \| \| \| \| \| \|	Just move the utility function to LoopUtils.cpp to re-use it in loop peeling. Reviewers: reames, Ashutosh Reviewed By: reames Subscribers: hiraditya, asbirlea, llvm-commits Differential Revision: https://reviews.llvm.org/D65264 llvm-svn: 367085
*	Reland the "[NewPM] Port Sancov" patch from rL365838. No functional changes were made to the patch since then.	Leonard Chan	2019-07-25	2	-110/+258
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-------- [NewPM] Port Sancov This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty. Changes: - Split SanitizerCoverageModule into 2: SanitizerCoverage for passing over functions and ModuleSanitizerCoverage for passing over modules. - ModuleSanitizerCoverage exists for adding 2 module level calls to initialization functions but only if there's a function that was instrumented by sancov. - Added legacy and new PM wrapper classes that own instances of the 2 new classes. - Update llvm tests and add clang tests. llvm-svn: 367053
*	[PredicateInfo] Replace pointer comparisons with deterministic compares.	Florian Hahn	2019-07-25	1	-9/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently there are a few pointer comparisons in ValueDFS_Compare, which can cause non-deterministic ordering when materializing values. There are 2 cases this patch fixes: 1. Order defs before uses used to compare pointers, which guarantees defs before uses, but causes non-deterministic ordering between 2 uses or 2 defs, depending on the allocation order. By converting the pointers to booleans, we can circumvent that problem. 2. comparePHIRelated was comparing the basic block pointers of edges, which also results in a non-deterministic order and is also not really meaningful for ordering. By ordering by their destination DFS numbers we guarantee a deterministic order. For the example below, we can end up with 2 different uselist orderings, when running `opt -mem2reg -ipsccp` hundreds of times. Because the non-determinism is caused by allocation ordering, we cannot reproduce it with ipsccp alone. 
declare i32 @hoge() local_unnamed_addr #0

define dso_local i32 @ham(i8* %arg, i8* %arg1) #0 {
bb:
  %tmp = alloca i32
  %tmp2 = alloca i32, align 4
  br label %bb19

bb4:                                              ; preds = %bb20
  br label %bb6

bb6:                                              ; preds = %bb4
  %tmp7 = call i32 @hoge()
  store i32 %tmp7, i32* %tmp
  %tmp8 = load i32, i32* %tmp
  %tmp9 = icmp eq i32 %tmp8, 912730082
  %tmp10 = load i32, i32* %tmp
  br i1 %tmp9, label %bb11, label %bb16

bb11:                                             ; preds = %bb6
  unreachable

bb13:                                             ; preds = %bb20
  br label %bb14

bb14:                                             ; preds = %bb13
  %tmp15 = load i32, i32* %tmp
  br label %bb16

bb16:                                             ; preds = %bb14, %bb6
  %tmp17 = phi i32 [ %tmp10, %bb6 ], [ 0, %bb14 ]
  br label %bb19

bb18:                                             ; preds = %bb20
  unreachable

bb19:                                             ; preds = %bb16, %bb
  br label %bb20

bb20:                                             ; preds = %bb19
  indirectbr i8* null, [label %bb4, label %bb13, label %bb18]
}

Reviewers: davide, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64866 llvm-svn: 367049
*	[Loop Peeling] Fix idom detection algorithm.	Serguei Katkov	2019-07-25	1	-1/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We'd like to determine the idom of the exit block after peeling one iteration. Let Exit be the exit block. Let ExitingSet be the set of predecessors of the Exit block; they are exiting blocks. Let Latch' and ExitingSet' be the copies after peeling. We'd like to find idom'(Exit), the idom of Exit after peeling. It is evident that idom'(Exit) will be the nearest common dominator of ExitingSet and ExitingSet'. idom(Exit) is the nearest common dominator of ExitingSet. idom(Exit)' is the nearest common dominator of ExitingSet'. Taking into account that we have a single Latch, Latch' will dominate Header and idom(Exit). So idom'(Exit) is the nearest common dominator of idom(Exit)' and Latch'. All these basic blocks are in the same loop, so what we find is (nearest common dominator of idom(Exit) and Latch)'. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D65292 llvm-svn: 367044
*	[SimplifyCFG] avoid crashing after simplifying a switch (PR42737)	Sanjay Patel	2019-07-25	1	-8/+17
\| \| \| \| \| \| \| \|	Later code in TryToSimplifyUncondBranchFromEmptyBlock() assumes that we have cleaned up unreachable blocks, but that was not happening with this switch transform. llvm-svn: 367037
*	Allow prefetching from non-zero address spaces	JF Bastien	2019-07-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is useful for targets which have prefetch instructions for non-default address spaces. <rdar://problem/42662136> Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65254 llvm-svn: 367032
*	Revert "[InstCombine] try to narrow a truncated load"	Vlad Tsyrklevich	2019-07-25	1	-39/+0
\| \| \| \| \| \| \| \| \|	This reverts commit bc4a63fd3c29c1a8ce22891bf34ee4dccfef578c, this is a speculative revert to fix a number of sanitizer bots (like sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2 compiler crashes, presumably due to a miscompile. llvm-svn: 367029
*	[PredicateInfo] Use SmallVector instead of SmallPtrSet.	Florian Hahn	2019-07-25	1	-13/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We do not need the SmallPtrSet to avoid adding duplicates to OpsToRename, because we already keep a ValueInfo mapping. If we see an op for the first time, Infos will be empty and we can also add it to OpsToRename. We process operands by visiting BBs depth-first and then iterate over all instructions & users, so the order should be deterministic. Therefore we can skip one round of sorting, which we purely needed for guaranteeing a deterministic order when iterating over the SmallPtrSet. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64816 llvm-svn: 367028
*	[Utils] remove duplicated documentation comments; NFC	Sanjay Patel	2019-07-25	1	-29/+4
\| \| \| \| \| \|	http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments llvm-svn: 367015
*	[InstCombine] try to narrow a truncated load	Sanjay Patel	2019-07-25	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	trunc (load X) --> load (bitcast X to narrow type) We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated load pattern can interfere with other instcombine transforms, so I'd like to allow the fold sooner. Example: https://bugs.llvm.org/show_bug.cgi?id=16739 ...in that report, we have bitcasts bracketing these ops, so those could get eliminated too. We've generally ruled out widening of loads early in IR ( LoadCombine - http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but that reasoning may not apply to narrowing if we can preserve information such as the dereferenceable range. Differential Revision: https://reviews.llvm.org/D64432 llvm-svn: 367011
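The equivalence `trunc (load X) --> load (bitcast X to narrow type)` can be illustrated on a byte buffer. This sketch assumes a little-endian layout, where the low byte of a wide value sits at the lowest address; on a big-endian target the narrow load would need an adjusted offset. The function names are hypothetical and the buffer stands in for memory.

```python
import struct


def trunc_of_wide_load(buf, offset=0):
    """Load a 32-bit little-endian value, then truncate it to 8 bits."""
    (wide,) = struct.unpack_from("<I", buf, offset)
    return wide & 0xFF


def narrow_load(buf, offset=0):
    """Load only the 8 bits that the truncation would have kept."""
    (narrow,) = struct.unpack_from("<B", buf, offset)
    return narrow


memory = bytes([0x78, 0x56, 0x34, 0x12, 0xEF, 0xBE, 0xAD, 0xDE])
```

Both functions return the same byte for any offset that leaves four readable bytes, but the narrow version touches less memory, which is why information such as the dereferenceable range matters for this transform.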
*	[IPSCCP] Add assertion to surface cases where we zap returns with overdefined users.	Florian Hahn	2019-07-25	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We should only zap returns in functions where all live users have a replaceable value (are not overdefined). Unused return values should be undefined. This should make it easier to detect bugs like in PR42738. Alternatively we could bail out of zapping the function returns, but I think it would be better to address those divergences between function and call-site values where they are actually caused. Reviewers: davide, efriedma Reviewed By: davide, efriedma Differential Revision: https://reviews.llvm.org/D65222 llvm-svn: 366998
*	[LV] Scalar Epilogue Lowering. NFC.	Sjoerd Meijer	2019-07-25	2	-57/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This refactors boolean 'OptForSize' that was passed around in a lot of places. It controlled folding of the tail loop, the scalar epilogue, into the main loop but code-size reasons may not be the only reason to do this. Thus, this is a first step to generalise the concept of tail-loop folding, and hence OptForSize has been renamed and is using an enum ScalarEpilogueStatus that holds the status how the epilogue should be lowered. This will be followed up by D65197, that picks up the predicate loop hint and performs the tail-loop folding. Differential Revision: https://reviews.llvm.org/D64916 llvm-svn: 366993
*	[PowerPC] exclude more icmps in LSR which is converted in later hardware loop pass	Chen Zheng	2019-07-25	1	-5/+6
\| \| \| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D64795 llvm-svn: 366976
*	[InstCombine] Swap order of checks to improve compile time (NFC)	Evandro Menezes	2019-07-24	1	-3/+3
\| \| \| \|	llvm-svn: 366962
*	[Transforms] move copying of load metadata to helper function; NFC	Sanjay Patel	2019-07-24	2	-45/+52
\| \| \| \| \| \| \|	There's another proposed load combine that can make use of this code in D64432. llvm-svn: 366949
*	[InstCombine] Teach foldOrOfICmps to allow icmp eq MIN_INT/MAX to be part of a range comparison. Similar for foldAndOfICmps	Craig Topper	2019-07-24	1	-8/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow it to merge with icmp ugt X, C. Similar for the other constants. We can do similar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there. Fixes PR42691. Differential Revision: https://reviews.llvm.org/D65017 llvm-svn: 366945
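To see why treating `icmp eq X, 0` as `icmp ule X, 0` helps, note that `X <= 0 or X > C` (unsigned) describes a single wrapped range, which one offset-and-compare can test. The sketch below checks that equivalence exhaustively for 8-bit unsigned values; the exact IR that InstCombine emits is not stated in the commit message, so `one_range_check` is only one plausible form of the merged comparison.

```python
BITS = 8
MASK = (1 << BITS) - 1


def two_compares(x, c):
    """icmp eq X, 0  OR  icmp ugt X, C (unsigned)."""
    return x == 0 or x > c


def one_range_check(x, c):
    """A single unsigned compare after a wrapping subtract of 1."""
    return ((x - 1) & MASK) >= c


equivalent = all(
    two_compares(x, c) == one_range_check(x, c)
    for x in range(MASK + 1)
    for c in range(MASK + 1)
)
```

Subtracting 1 wraps `x == 0` around to the maximum value, so it lands in the same "large" half of the range as every `x > c`.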
*	Let CorrelatedValuePropagation preserve LazyValueInfo	David Bolvansky	2019-07-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch makes CorrelatedValuePropagation preserve LazyValueInfo by adding LazyValueInfo::eraseValue & calling it whenever an instruction is erased. Passes `make check` , test-suite, and SPECrate 2017. Patch by aqjune (Juneyoung Lee) Reviewers: reames, mzolotukhin Reviewed By: reames Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59349 llvm-svn: 366942
*	[SafeStack] Insert the deref before remaining elements	Petr Hosek	2019-07-24	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \|	This is a follow up to D64971. While we need to insert the deref after the offset, it needs to come before the remaining elements in the original expression since the deref needs to happen before the LLVM fragment if present. Differential Revision: https://reviews.llvm.org/D65172 llvm-svn: 366865
*	[IndVars] Fix a subtle bug in optimizeLoopExits	Philip Reames	2019-07-23	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts. This could cause two separate bugs: 1) We might exit the loop early, and leave optimizations undone. This is what triggered the assertion failure in the reported test case. 2) We might optimize one exit, then exit without indicating a change. This could result in an analysis invalidation bug if no other transform is done by the rest of indvars. Note that the pointer exit counts are a really fragile concept. They show up only when we have a pointer IV w/o a datalayout to provide their size. It's really questionable to me whether the complexity implied is worth it. llvm-svn: 366829
*	[SLPVectorizer] Revert local change that accidentally got committed in rL366799	Simon Pilgrim	2019-07-23	1	-1/+0
\| \| \| \| \| \| \| \|	This wasn't part of D63281 llvm-svn: 366807