| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
This fixes some pipeline tests.
This reverts commit d0b6f42936bfb6d56d325c732ae79400c9c6016a.
llvm-svn: 367401
|
|
|
|
|
|
| |
This reverts r367332 (git commit 2d7227ec3ac91f36fc32b1c21e72e2f1f5d030ad)
llvm-svn: 367335
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LoopInfo can be easily preserved by passing it to the functions that
modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor.
SplitCriticalEdge also preserves LoopSimplify and LCSSA form when when passing in
LoopInfo. The test case shows that we preserve LoopSimplify and
LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some
reason.
Also I am not sure if it is possible to preserve those in the new pass
manager, as they aren't analysis passes.
Reviewers: reames, hfinkel, davide, jdoerfert
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D65137
llvm-svn: 367332
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch extends the use of the OptimizationRemarkEmitter to provide
information about loops that are not fused, and loops that are not eligible for
fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops
that are not eligible for fusion and the OptimizationRemarkMissed to identify
loops that cannot be fused.
It also reuses the statistics to provide the messages used in the
OptimizationRemarks. This provides common message strings between the
optimization remarks and the statistics.
I would like feedback on this approach, in general. If people are OK with this,
I will flesh out additional remarks in subsequent commits.
Subscribers: hiraditya, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63844
llvm-svn: 367327
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
power-of-two
Summary:
I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling.
While we did happen to handle this fold in most of the cases,
the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two),
or first turn `x s% -y` to `x u% y`; that does handle most of the cases.
But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`,
and thus we end up being stuck with `(x s% INT_MIN) == 0`.
There is no such restriction for the more general fold:
https://rise4fun.com/Alive/IIeS
To be noted, the fold does not enforce that `y` is a constant,
so it may indeed increase instruction count.
This is consistent with what `x u% y`->`x & (y-1)` already does.
I think it makes sense, it's at most one (simple) extra instruction,
while `rem`ainder is really much more un-simple (and likely **very** costly).
Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65046
llvm-svn: 367322
|
|
|
|
|
|
|
|
|
|
|
| |
test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke:
Only PHI nodes may reference their own value!
%sub33 = srem i32 %sub33, %ranks_in_i
This reverts commit r367288.
llvm-svn: 367289
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.
If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.
https://bugs.llvm.org/show_bug.cgi?id=42673
Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner
Reviewed By: bogner
Subscribers: bogner, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65298
llvm-svn: 367288
|
|
|
|
|
|
|
|
|
|
|
| |
in the merged module.
Globals that are associated with globals with type metadata need to appear
in the merged module because they will reference the global's section directly.
Differential Revision: https://reviews.llvm.org/D65312
llvm-svn: 367242
|
|
|
|
|
|
|
|
|
| |
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.
llvm-svn: 367227
|
|
|
|
| |
llvm-svn: 367224
|
|
|
|
|
|
|
|
|
| |
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.
llvm-svn: 367194
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Deduce "align" attribute in attributor.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64152
llvm-svn: 367187
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test case from:
https://bugs.llvm.org/show_bug.cgi?id=42771
...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is
seemingly done just to avoid invalidating the instruction iterator. We can instead
delay instruction deletion until we reach the end of the block (or we could delay
until we reach the end of all blocks).
There's a test diff here for a degenerate case with llvm.assume that is not
meaningful in itself, but serves to verify this change in logic.
This change probably doesn't result in much overall compile-time improvement
because we call '-instsimplify' as a standalone pass only once in the standard
-O2 opt pipeline currently.
Differential Revision: https://reviews.llvm.org/D65336
llvm-svn: 367173
|
|
|
|
|
|
|
|
|
|
| |
overdefined users.
This reverts r366998 (git commit 5354c83ece00690b4dbfa47925f8f5a8f33f1d9e)
This breaks a linux kernel build and we have reproducer to investigate.
llvm-svn: 367160
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
unreachable loop.
updatePredecessorProfileMetadata in jumpthreading tries to find the
first dominating predecessor block for a PHI value by searching upwards
the predecessor block chain.
But jumpthreading may see some temporary IR state which contains
unreachable bb not being cleaned up. If an unreachable loop happens to
be on the predecessor block chain, keeping chasing the predecessor
block will run into an infinite loop.
The patch fixes it.
Differential Revision: https://reviews.llvm.org/D65310
llvm-svn: 367154
|
|
|
|
|
|
|
| |
This is a transform that we use with fmul, so use
it for fdiv too for consistency.
llvm-svn: 367146
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Y * (1.0 - Z)) + (X * Z) -->
Y - (Y * Z) + (X * Z) -->
Y + Z * (X - Y)
This is part of solving:
https://bugs.llvm.org/show_bug.cgi?id=42716
Factoring eliminates an instruction, so that should be a good canonicalization.
The potential conversion to FMA would be handled by the backend based on target
capabilities.
Differential Revision: https://reviews.llvm.org/D65305
llvm-svn: 367101
|
|
|
|
|
|
|
|
|
|
|
|
| |
To avoid duplicates in loop metadata, if the string to add is
already there, just update the value.
Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D65265
llvm-svn: 367087
|
|
|
|
|
|
|
|
|
|
|
| |
Just move the utility function to LoopUtils.cpp to re-use it in loop peeling.
Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, asbirlea, llvm-commits
Differential Revision: https://reviews.llvm.org/D65264
llvm-svn: 367085
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
changes were made to the patch since then.
--------
[NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.
Changes:
- Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over
functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.
llvm-svn: 367053
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there are a few pointer comparisons in ValueDFS_Compare, which
can cause non-deterministic ordering when materializing values. There
are 2 cases this patch fixes:
1. Order defs before uses used to compare pointers, which guarantees
defs before uses, but causes non-deterministic ordering between 2
uses or 2 defs, depending on the allocation order. By converting the
pointers to booleans, we can circumvent that problem.
2. comparePHIRelated was comparing the basic block pointers of edges,
which also results in a non-deterministic order and is also not
really meaningful for ordering. By ordering by their destination DFS
numbers we guarantee a deterministic order.
For the example below, we can end up with 2 different uselist orderings,
when running `opt -mem2reg -ipsccp` hundreds of times. Because the
non-determinism is caused by allocation ordering, we cannot reproduce it
with ipsccp alone.
declare i32 @hoge() local_unnamed_addr #0
define dso_local i32 @ham(i8* %arg, i8* %arg1) #0 {
bb:
%tmp = alloca i32
%tmp2 = alloca i32, align 4
br label %bb19
bb4: ; preds = %bb20
br label %bb6
bb6: ; preds = %bb4
%tmp7 = call i32 @hoge()
store i32 %tmp7, i32* %tmp
%tmp8 = load i32, i32* %tmp
%tmp9 = icmp eq i32 %tmp8, 912730082
%tmp10 = load i32, i32* %tmp
br i1 %tmp9, label %bb11, label %bb16
bb11: ; preds = %bb6
unreachable
bb13: ; preds = %bb20
br label %bb14
bb14: ; preds = %bb13
%tmp15 = load i32, i32* %tmp
br label %bb16
bb16: ; preds = %bb14, %bb6
%tmp17 = phi i32 [ %tmp10, %bb6 ], [ 0, %bb14 ]
br label %bb19
bb18: ; preds = %bb20
unreachable
bb19: ; preds = %bb16, %bb
br label %bb20
bb20: ; preds = %bb19
indirectbr i8* null, [label %bb4, label %bb13, label %bb18]
}
Reviewers: davide, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D64866
llvm-svn: 367049
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'd like to determine the idom of exit block after peeling one iteration.
Let Exit is exit block.
Let ExitingSet - is a set of predecessors of Exit block. They are exiting blocks.
Let Latch' and ExitingSet' are copies after a peeling.
We'd like to find an idom'(Exit) - idom of Exit after peeling.
It is an evident that idom'(Exit) will be the nearest common dominator of ExitingSet and ExitingSet'.
idom(Exit) is a nearest common dominator of ExitingSet.
idom(Exit)' is a nearest common dominator of ExitingSet'.
Taking into account that we have a single Latch, Latch' will dominate Header and idom(Exit).
So the idom'(Exit) is nearest common dominator of idom(Exit)' and Latch'.
All these basic blocks are in the same loop, so what we find is
(nearest common dominator of idom(Exit) and Latch)'.
Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D65292
llvm-svn: 367044
|
|
|
|
|
|
|
|
| |
Later code in TryToSimplifyUncondBranchFromEmptyBlock() assumes that
we have cleaned up unreachable blocks, but that was not happening
with this switch transform.
llvm-svn: 367037
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is useful for targets which have prefetch instructions for non-default address spaces.
<rdar://problem/42662136>
Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65254
llvm-svn: 367032
|
|
|
|
|
|
|
|
|
| |
This reverts commit bc4a63fd3c29c1a8ce22891bf34ee4dccfef578c, this is a
speculative revert to fix a number of sanitizer bots (like
sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2
compiler crashes, presumably due to a miscompile.
llvm-svn: 367029
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We do not need the SmallPtrSet to avoid adding duplicates to
OpsToRename, because we already keep a ValueInfo mapping. If we see an
op for the first time, Infos will be empty and we can also add it to
OpsToRename.
We process operands by visiting BBs depth-first and then iterate over
all instructions & users, so the order should be deterministic.
Therefore we can skip one round of sorting, which we purely needed for
guaranteeing a deterministic order when iterating over the SmallPtrSet.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D64816
llvm-svn: 367028
|
|
|
|
|
|
| |
http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments
llvm-svn: 367015
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
trunc (load X) --> load (bitcast X to narrow type)
We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated
load pattern can interfere with other instcombine transforms, so I'd like to
allow the fold sooner.
Example:
https://bugs.llvm.org/show_bug.cgi?id=16739
...in that report, we have bitcasts bracketing these ops, so those could get
eliminated too.
We've generally ruled out widening of loads early in IR ( LoadCombine -
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but
that reasoning may not apply to narrowing if we can preserve information
such as the dereferenceable range.
Differential Revision: https://reviews.llvm.org/D64432
llvm-svn: 367011
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
overdefined users.
We should only zap returns in functions, where all live users have a
replace-able value (are not overdefined). Unused return values should be
undefined.
This should make it easier to detect bugs like in PR42738.
Alternatively we could bail out of zapping the function returns, but I
think it would be better to address those divergences between function
and call-site values where they are actually caused.
Reviewers: davide, efriedma
Reviewed By: davide, efriedma
Differential Revision: https://reviews.llvm.org/D65222
llvm-svn: 366998
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This refactors boolean 'OptForSize' that was passed around in a lot of places.
It controlled folding of the tail loop, the scalar epilogue, into the main loop
but code-size reasons may not be the only reason to do this. Thus, this is a
first step to generalise the concept of tail-loop folding, and hence OptForSize
has been renamed and is using an enum ScalarEpilogueStatus that holds the
status how the epilogue should be lowered.
This will be followed up by D65197, that picks up the predicate loop hint and
performs the tail-loop folding.
Differential Revision: https://reviews.llvm.org/D64916
llvm-svn: 366993
|
|
|
|
|
|
|
|
| |
loop pass
Differential Revision: https://reviews.llvm.org/D64795
llvm-svn: 366976
|
|
|
|
| |
llvm-svn: 366962
|
|
|
|
|
|
|
| |
There's another proposed load combine that can make use of this code
in D64432.
llvm-svn: 366949
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a range comparision. Similar for foldAndOfICmps
We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow
it to merge with icmp ugt X, C. Similar for the other constants.
We can do simliar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there.
Fixes PR42691.
Differential Revision: https://reviews.llvm.org/D65017
llvm-svn: 366945
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch makes CorrelatedValuePropagation preserve LazyValueInfo by adding LazyValueInfo::eraseValue & calling it whenever an instruction is erased.
Passes `make check` , test-suite, and SPECrate 2017.
Patch by aqjune (Juneyoung Lee)
Reviewers: reames, mzolotukhin
Reviewed By: reames
Subscribers: xbolva00, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59349
llvm-svn: 366942
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up to D64971. While we need to insert the deref after
the offset, it needs to come before the remaining elements in the
original expression since the deref needs to happen before the LLVM
fragment if present.
Differential Revision: https://reviews.llvm.org/D65172
llvm-svn: 366865
|
|
|
|
|
|
|
|
|
|
| |
The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts. This could cause two separate bugs:
1) We might exit the loop early, and leave optimizations undone. This is what triggered the assertion failure in the reported test case.
2) We might optimize one exit, then exit without indicating a change. This could result in an analysis invalidaton bug if no other transform is done by the rest of indvars.
Note that the pointer exit counts are a really fragile concept. They show up only when we have a pointer IV w/o a datalayout to provide their size. It's really questionable to me whether the complexity implied is worth it.
llvm-svn: 366829
|
|
|
|
|
|
|
|
| |
rL366799
This wasn't part of D63281
llvm-svn: 366807
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demanded all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured.
The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use.
We do see a couple of regressions that need to be addressed:
AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better).
X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG.
The code owners have confirmed its ok for these cases to fixed up in future patches.
Differential Revision: https://reviews.llvm.org/D63281
llvm-svn: 366799
|
|
|
|
|
|
| |
cast<CallInst> shouldn't return null and we dereference the pointer in a lot of other places, causing both MSVC + cppcheck to warn about dereferenced null pointers
llvm-svn: 366793
|
|
|
|
| |
llvm-svn: 366789
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Deduce dereferenceable attribute in Attributor.
These will be added in a later patch.
* dereferenceable(_or_null)_globally (D61652)
* Deduction based on load instruction (similar to D64258)
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64876
llvm-svn: 366788
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Adds a binding to the internalize pass that allows the caller to pass a function pointer that acts as the visibility-preservation predicate. Previously, one could only pass an unsigned value (not LLVMBool?) that directed the pass to consider "main" or not.
Reviewers: whitequark, deadalnix, harlanhaskins
Reviewed By: whitequark, harlanhaskins
Subscribers: kren1, hiraditya, llvm-commits, harlanhaskins
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62456
llvm-svn: 366777
|
|
|
|
| |
llvm-svn: 366774
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[Attributor] Liveness analysis.
Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64162
llvm-svn: 366769
|
|
|
|
|
|
| |
This reverts commit 95cbc3da8871f43c1ce2b2926afaedcd826202b1.
llvm-svn: 366759
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[Attributor] Liveness analysis.
Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential revision: https://reviews.llvm.org/D64162
llvm-svn: 366753
|
|
|
|
|
|
|
|
|
| |
constant"
Having it as a proper matcher is better for reusability elsewhere
(in a follow-up patch.)
llvm-svn: 366752
|
|
|
|
|
|
| |
This reverts commit 9285295f75a231dc446fa7cbc10a0a391b3434a5.
llvm-svn: 366737
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential revision: https://reviews.llvm.org/D64162
llvm-svn: 366736
|