| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the callsite is inside landing pad, do not perform callsite splitting.
Callsite splitting uses utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with an assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumtion is not true because SplitCriticalEdge returns nullptr if the destination block is a landing pad. This invalid assumption results assertion failure.
Fundamental solution might be fixing llvm::SplitEdge to not to rely on the invalid assumption. However, it'll involve a lot of work because current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting to not to attempt splitting if the callsite is in a landing pad.
Attached test case will crash with assertion failure without the fix.
Reviewers: fhahn, junbuml, dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45130
llvm-svn: 329250
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sometimes instead of storing addresses as is, the kernel stores the address of
a page and an offset within that page, and then computes the actual address
when it needs to make an access. Because of this the pointer tag gets lost
(gets set to 0xff). The solution is to ignore all accesses tagged with 0xff.
This patch adds a -hwasan-match-all-tag flag to hwasan, which allows to ignore
accesses through pointers with a particular pointer tag value for validity.
Patch by Andrey Konovalov.
Differential Revision: https://reviews.llvm.org/D44827
llvm-svn: 329228
|
| |
|
|
| |
llvm-svn: 329170
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform, so the branch is
non-uniform.
This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.
As discovered after committing an earlier version of this change, this
exposes a subtle interaction between this pass and DivergenceAnalysis:
since we remove and re-create branch instructions, we can no longer rely
on DivergenceAnalysis for branches in subregions that were already
processed by the pass.
Explicitly remove branch instructions from DivergenceAnalysis to
avoid dangling pointers as a matter of defensive programming, and
change how we detect non-uniform subregions.
Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4
Differential Revision: https://reviews.llvm.org/D43743
llvm-svn: 329165
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PostBB has more than 2 predecessors by inserting a new block for the store.
Summary:
Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors.
This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store.
Reviewers: efriedma, jmolloy, ABataev
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39760
llvm-svn: 329142
|
| |
|
|
|
|
|
|
| |
Move the check canPeel() to Hexagon Target before setting PeelCount.
Differential Revision: https://reviews.llvm.org/D44880
llvm-svn: 329129
|
| |
|
|
|
|
|
|
| |
The tests marked with 'FIXME' require loosening the check
in SimplifyAssociativeOrCommutative() to optimize completely;
that's still checking isFast() in Instruction::isAssociative().
llvm-svn: 329121
|
| |
|
|
| |
llvm-svn: 329118
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If an alloca need to be stored in the coroutine frame and it has an alignment specified and the alignment does not match the natural alignment of the alloca type. Insert appropriate padding into the coroutine frame to make sure that it gets requested alignment.
For example for a packet type (which natural alignment is 1), but alloca alignment is 8, we may need to insert a padding field with required number of bytes to make sure it is properly aligned.
```
%PackedStruct = type <{ i64 }>
...
%data = alloca %PackedStruct, align 8
```
If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame:
```
%f.Frame = type { ..., i16, [6 x i8], %PackedStruct }
```
Reviewers: rnk, lewissbaker, modocache
Reviewed By: modocache
Subscribers: EricWF, llvm-commits
Differential Revision: https://reviews.llvm.org/D45221
llvm-svn: 329112
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
It also updates test/Transforms/LoopInterchange/call-instructions.ll
to use accesses where we can prove dependence after D35430.
Reviewers: sebpop, karthikthecool, blitz.opensource
Reviewed By: sebpop
Differential Revision: https://reviews.llvm.org/D45206
llvm-svn: 329111
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Introduce the ShadowCallStack function attribute. It's added to
functions compiled with -fsanitize=shadow-call-stack in order to mark
functions to be instrumented by a ShadowCallStack pass to be submitted
in a separate change.
Reviewers: pcc, kcc, kubamracek
Reviewed By: pcc, kcc
Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D44800
llvm-svn: 329108
|
| |
|
|
| |
llvm-svn: 329091
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Folding patterns like:
%vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer
%cast = bitcast <4 x i8> %vec to i32
%cond = icmp eq i32 %cast, 0
into:
%ext = extractelement <4 x i8> %insvec, i32 0
%cond = icmp eq i32 %ext, 0
Combined with existing rules, this allows us to fold patterns like:
%insvec = insertelement <4 x i8> undef, i8 %val, i32 0
%vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer
%cast = bitcast <4 x i8> %vec to i32
%cond = icmp eq i32 %cast, 0
into:
%cond = icmp eq i8 %val, 0
When we construct a splat vector via a shuffle, and bitcast the vector into an integer type for comparison against an integer constant. Then we can simplify the the comparison to compare the splatted value against the integer constant.
Reviewers: spatel, anna, mkazantsev
Reviewed By: spatel
Subscribers: efriedma, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D44997
llvm-svn: 329087
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the load/extractelement/extractvalue instructions are not originally
consecutive, the SLP vectorizer is unable to vectorize them. Patch
allows reordering of such instructions.
Patch does not support reordering of the repeated instruction, this must
be handled in the separate patch.
Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43776
llvm-svn: 329085
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The primary issue here is that using NDEBUG alone isn't enough to guard
debug printing -- instead the DEBUG() macro needs to be used so that the
specific pass debug logging check is employed. Without this, every
asserts-enabled build was printing out information when it hit this.
I also fixed another place where we had multiple statements in a DEBUG
macro to use {}s to be a bit cleaner. And I fixed a place that used
errs() rather than dbgs().
llvm-svn: 329082
|
| |
|
|
|
|
| |
This reverts commit r328980 and r329046. Makes the vectorizer crash.
llvm-svn: 329071
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The default assembly handling mode may introduce false positives in the
cases when MSan doesn't understand that the assembly call initializes
the memory pointed to by one of its arguments.
We introduce the conservative mode, which initializes the first
|sizeof(type)| bytes for every |type*| pointer passed into the
assembly statement.
llvm-svn: 329054
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The primary issue here is that using NDEBUG alone isn't enough to guard
debug printing -- instead the DEBUG() macro needs to be used so that the
specific pass debug logging check is employed. Without this, every
asserts-enabled build was printing out information when it hit this.
I also fixed another place where we had multiple statements in a DEBUG
macro to use {}s to be a bit cleaner. And I fixed a place that used
`errs()` rather than `dbgs()`.
llvm-svn: 329046
|
| |
|
|
|
|
|
|
|
|
| |
For Hexagon, peeling loops with small runtime trip count is beneficial for our
benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set
by the target for computing the desired peel count.
Differential Revision: https://reviews.llvm.org/D44880
llvm-svn: 329042
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shrinkable" values when determining the minimum bitwidth
We use two approaches for determining the minimum bitwidth.
* Demanded bits
* Value tracking
If demanded bits doesn't result in a narrower type, we then try value tracking.
We need this if we want to root SLP trees with the indices of getelementptr
instructions since all the bits of the indices are demanded.
But there is a missing piece though. We need to be able to distinguish "demanded
and shrinkable" from "demanded and not shrinkable". For example, the bits of %i
in
%i = sext i32 %e1 to i64
%gep = getelementptr inbounds i64, i64* %p, i64 %i
are demanded, but we can shrink %i's type to i32 because it won't change the
result of the getelementptr. On the other hand, in
%tmp15 = sext i32 %tmp14 to i64
%tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0
it doesn't make sense to shrink %tmp15 and we can skip the value tracking.
Ideas are from Matthew Simpson!
Differential Revision: https://reviews.llvm.org/D44868
llvm-svn: 329035
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When attempting to split a coroutine with 'hidden' visibility (for
example, a C++ coroutine that is inlined when compiled with the option
'-fvisibility-inlines-hidden'), LLVM would hit an assertion in
include/llvm/IR/GlobalValue.h:240: "local linkage requires default
visibility". The issue is that the visibility is copied from the source
of the function split in the `CloneFunctionInto` function, but the linkage
is not. To fix, create the new function first with external linkage,
then copy the linkage from the original function *after* `CloneFunctionInto`
is called.
Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the
explicit call to `setDSOLocal` can be removed in CoroSplit.cpp.
Test Plan: check-llvm
Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk
Reviewed By: rnk
Subscribers: llvm-commits, eric_niebler
Differential Revision: https://reviews.llvm.org/D44185
llvm-svn: 329033
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The cast simplifications that instcombine does here do not make any
attempt to obey the verifier rules for musttail calls. Therefore we have
to disable them.
Reviewers: efriedma, majnemer, pcc
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45186
llvm-svn: 329027
|
| |
|
|
|
|
|
|
|
| |
Otherwise, we end up inlining a musttail call into a non-tail position,
which breaks verifier invariants.
Fixes PR31014
llvm-svn: 329015
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(A - B) >u A --> A <u B
C <u (C - D) --> C <u D
https://rise4fun.com/Alive/e7j
Name: ugt
%sub = sub i8 %x, %y
%cmp = icmp ugt i8 %sub, %x
=>
%cmp = icmp ult i8 %x, %y
Name: ult
%sub = sub i8 %x, %y
%cmp = icmp ult i8 %x, %sub
=>
%cmp = icmp ult i8 %x, %y
This should fix:
https://bugs.llvm.org/show_bug.cgi?id=36969
llvm-svn: 329011
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Some Function level metadatas, such as function entry count, are not cloned in
DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim
after internalization.
This patch clones the metadatas in the original function to the new function.
Differential Revision: https://reviews.llvm.org/D44127
llvm-svn: 328991
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing.
To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed.
Reviewers: EricWF, modocache, rnk, lewissbaker
Reviewed By: modocache
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45114
llvm-svn: 328986
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the load/extractelement/extractvalue instructions are not originally
consecutive, the SLP vectorizer is unable to vectorize them. Patch
allows reordering of such instructions.
Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43776
llvm-svn: 328980
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds -import-cutoff=N which will stop importing during the thin link
after N imports. Default is -1 (no limit).
Reviewers: wmi
Subscribers: inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D45127
llvm-svn: 328934
|
| |
|
|
|
|
|
|
|
|
|
| |
If a loop has a loop exiting latch, it can be profitable
to rotate the loop if it leads to the simplification of
a phi node. Perform rotation in these cases even if loop
rotate itself didnt simplify the loop to get there.
Differential Revision: https://reviews.llvm.org/D44199
llvm-svn: 328933
|
| |
|
|
| |
llvm-svn: 328907
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
same linkage as the function being wrapped
This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan.
This change resolves bug 36314.
Patch by Sam Kerner!
Differential Revision: https://reviews.llvm.org/D44784
llvm-svn: 328890
|
| |
|
|
|
|
| |
This reverts commit r328854, it breaks some Hexagon tests.
llvm-svn: 328875
|
| |
|
|
|
|
|
|
|
|
| |
For Hexagon, peeling loops with small runtime trip count is beneficial for our
benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set
by the target for computing the desired peel count.
Differential Revision: https://reviews.llvm.org/D44880
llvm-svn: 328854
|
| |
|
|
|
|
| |
IPO.h to Utils.h to match its implementation
llvm-svn: 328844
|
| |
|
|
| |
llvm-svn: 328842
|
| |
|
|
| |
llvm-svn: 328840
|
| |
|
|
|
|
|
| |
To fix layering (so that Scalar.h, a libScalarOpts header, isn't
included from Utils - which libScalarOpts depends on).
llvm-svn: 328839
|
| |
|
|
|
|
|
|
|
|
|
| |
Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan.
As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html
Patch by vit9696(at)avp.su.
Differential Revision: https://reviews.llvm.org/D44926
llvm-svn: 328830
|
| |
|
|
| |
llvm-svn: 328822
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r312664 (D36404), JumpThreading stopped threading edges into
loop headers. Unfortunately, I observed a significant performance
regression as a result of this change. Upon further investigation,
the problematic pattern looked something like this (after
many high level optimizations):
while (true) {
bool cond = ...;
if (!cond) {
<body>
}
if (cond)
break;
}
Now, naturally we want jump threading to essentially eliminate the
second if check and hook up the edges appropriately. However, the
above mentioned change, prevented it from doing this because it would
have to thread an edge into the loop header.
Upon further investigation, what is happening is that since both branches
are threadable, JumpThreading picks one of them at arbitrarily. In my
case, because of the way that the IR ended up, it tended to pick
the one to the loop header, bailing out immediately after. However,
if it had picked the one to the exit block, everything would have
worked out fine (because the only remaining branch would then be folded,
not thraded which is acceptable).
Thus, to fix this problem, we can simply eliminate loop headers from
consideration as possible threading targets earlier, to make sure that
if there are multiple eligible branches, we can still thread one of
the ones that don't target a loop header.
Patch by Keno Fischer!
Differential Revision: https://reviews.llvm.org/D42260
llvm-svn: 328798
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
with Loop Rotation Utility Interface
The existing LoopRotation.cpp is implemented as one of loop passes instead of
being a utility. The user cannot easily perform the loop rotation selectively
(or on demand) under different optimization level. For example, the loop
rotation is needed as part of the logic to convert a loop into a loop with
bottom test for a transformation. If the loop rotation is simply added as a
loop pass before the transformation, the pass is skipped if it is compiled at
–O0 or if it is explicitly disabled by the user, causing the compiler to
generate incorrect code. Furthermore, as a loop pass it will rotate all loops
instead of just the relevant loops.
We provide a utility interface for the loop rotation so that the loop rotation
can be called on demand. The changeset is as follows:
- Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main
implementation of class LoopRotate into this file.
- Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the
interface LoopRotation(...).
- Original LoopRotation.cpp is changed to use the utility function LoopRotation
in LoopRotationUtils.cpp. This is done in the same way community did for
mem-to-reg implementation.
Patch by Jin Lin!
Differential Revision: https://reviews.llvm.org/D44595
llvm-svn: 328766
|
| |
|
|
|
|
|
|
|
|
| |
binding functions
Otherwise the definitions can't see the extern C declarations and get
name mangled, making it impossible for users to call them. This breaks
the Go bindings.
llvm-svn: 328765
|
| |
|
|
|
|
|
|
| |
dependency
Thanks to echristo for the pointers on direction.
llvm-svn: 328737
|
| |
|
|
|
|
| |
LoopSimplifyCFG things back
llvm-svn: 328720
|
| |
|
|
|
|
|
|
|
| |
declarations amongst Scalar.h and IPO.h
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.
llvm-svn: 328717
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is a step towards the upcoming KMSAN implementation patch.
KMSAN is going to prepend a special basic block containing
tool-specific calls to each function. Because we still want to
instrument the original entry block, we'll need to store it in
ActualFnStart.
For MSan this will still be F.getEntryBlock(), whereas for KMSAN
it'll contain the second BB.
llvm-svn: 328697
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a step towards the upcoming KMSAN implementation patch.
The isStore argument is to be used by getShadowOriginPtrKernel(),
it is ignored by getShadowOriginPtrUserspace().
Depending on whether a memory access is a load or a store, KMSAN
instruments it with different functions, __msan_metadata_ptr_for_load_X()
and __msan_metadata_ptr_for_store_X().
Those functions may return different values for a single address,
which is necessary in the case the runtime library decides to ignore
particular accesses.
llvm-svn: 328692
|
| |
|
|
| |
llvm-svn: 328660
|
| |
|
|
|
|
|
|
|
| |
Fixed counter/weight overflow that leads to an assertion. Also fixed the help
string for pgo-emit-branch-prob option.
Differential Revision: https://reviews.llvm.org/D44809
llvm-svn: 328653
|
| |
|
|
|
|
|
|
| |
The default implementation returns false and keeps the current behavior.
Differential Revision: https://reviews.llvm.org/D44735
llvm-svn: 328632
|