| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 348913
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Enable suspend point simplification for cases where:
* coro.save and coro.suspend are in different basic blocks
* where there are intervening intrinsics
Reviewers: modocache, tks2103, lewissbaker
Reviewed By: modocache
Subscribers: EricWF, llvm-commits
Differential Revision: https://reviews.llvm.org/D55160
llvm-svn: 348897
|
|
|
|
|
|
|
|
|
|
|
| |
IR-printing AfterPass instrumentation might be called on a loop
that has just been invalidated. We should skip printing it to
avoid spurious asserts.
Reviewed By: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D54740
llvm-svn: 348887
|
|
|
|
|
|
|
|
|
|
| |
It's currently not safe to outline landingpad instructions (see
llvm.org/PR39917). Like @llvm.eh.typeid.for, the order and content of
previous landingpad instructions in a function alters the lowering of
subsequent landingpads by renumbering type info ID's. Outlining a
landingpad therefore breaks exception handling & unwinding.
llvm-svn: 348870
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
call iM movmsk(sext <N x i1> X) --> zext (bitcast <N x i1> X to iN) to iM
This has the potential to create less-than-8-bit scalar types as shown in
some of the test diffs, but it looks like the backend knows how to deal
with that in these patterns. This is the simple part of the fix suggested in:
https://bugs.llvm.org/show_bug.cgi?id=39927
Differential Revision: https://reviews.llvm.org/D55529
llvm-svn: 348862
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When eliminating a dead argument or return value in a function with
local linkage, all uses, including in dbg.value intrinsics, would be
replaced with null constants. This would mean that, for example for an
integer argument, the debug info would incorrectly express that the
value is 0. Instead, replace all uses with undef to indicate that the
argument/return value is optimized out.
Also, make sure that metadata uses of return values are rewritten even
if there are no non-metadata uses of the value.
As a bit of historical curiosity, the code that emitted null constants
was introduced in the initial check-in of the pass in 2003, before
'undef' values even existed in LLVM.
This fixes PR23260.
Reviewers: dblaikie, aprantl, vsk, djtodoro
Reviewed By: aprantl
Subscribers: llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D55513
llvm-svn: 348837
|
|
|
|
| |
llvm-svn: 348804
|
|
|
|
| |
llvm-svn: 348801
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently memcpyopt optimizes cases like
memset(a, byte, N);
memcpy(b, a, M);
to
memset(a, byte, N);
memset(b, byte, M);
if M <= N. Often this allows further simplifications down the line,
which drop the first memset entirely.
This patch extends this optimization for the case where M > N, but we
know that the bytes a[N..M] are undef due to alloca/lifetime.start.
This situation arises relatively often for Rust code, because Rust does
not initialize trailing structure padding and loves to insert redundant
memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844.
For the implementation, I'm reusing a bit of code for a similar existing
optimization (direct memcpy of undef). I've also added memset support to
MemDepAnalysis GetLocation -- Instead, getPointerDependencyFrom could be
used, but it seems to make more sense to add this to GetLocation and thus
make the computation cachable.
Differential Revision: https://reviews.llvm.org/D55120
llvm-svn: 348645
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The splitting pass uses its 'unlikelyExecuted' predicate to statically
decide which blocks are cold.
- Do not treat noreturn calls as if they are cold unless they are actually
marked cold. This is motivated by functions like exit() and longjmp(), which
are not beneficial to outline.
- Do not treat inline asm as an outlining barrier. In practice asm("") is
frequently used to inhibit basic block merging; enabling outlining in this case
results in substantial memory savings.
- Treat invokes of cold functions as cold.
As a drive-by, remove the 'exceptionHandlingFunctions' predicate, because it's
no longer needed. The pass can identify & outline blocks dominated by EH pads,
so there's no need to special-case __cxa_begin_catch etc.
Differential Revision: https://reviews.llvm.org/D54244
llvm-svn: 348640
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Algorithm: Identify maximal cold regions and put them in a worklist. If
a candidate region overlaps with another, discard it. While the worklist
is full, remove a single-entry sub-region from the worklist and attempt
to outline it. By the non-overlap property, this should not invalidate
parts of the domtree pertaining to other outlining regions.
Testing: LNT results on X86 are clean. With test-suite + externals, llvm
outlines 134KB pre-patch, and 352KB post-patch (+ ~2.6x). The file
483.xalancbmk/src/Constants.cpp stands out as an extreme case where llvm
outlines over 100 times in some functions (mostly EH paths). There was
not a significant performance impact pre vs. post-patch.
Differential Revision: https://reviews.llvm.org/D53887
llvm-svn: 348639
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DemandedBits and BDCE currently only support scalar integers. This
patch extends them to also handle vector integer operations. In this
case bits are not tracked for individual vector elements, instead a
bit is demanded if it is demanded for any of the elements. This matches
the behavior of computeKnownBits in ValueTracking and
SimplifyDemandedBits in InstCombine.
Unlike the previous iteration of this patch, getDemandedBits() can now
again be called on arbirary (sized) instructions, even if they don't
have integer or vector of integer type. (For vector types the size of the
returned mask will now be the scalar size in bits though.)
The added LoopVectorize test case shows a case which triggered an
assertion failure with the previous attempt, because getDemandedBits()
was called on a pointer-typed instruction.
Differential Revision: https://reviews.llvm.org/D55297
llvm-svn: 348602
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a new instinsic `@llvm.experimental.widenable_condition`
that allows explicit representation for guards. It is an alternative to using
`@llvm.experimental.guard` intrinsic that does not contain implicit control flow.
We keep finding places where `@llvm.experimental.guard` is not supported or
treated too conservatively, and there are 2 reasons to that:
- `@llvm.experimental.guard` has memory write side effect to model implicit control flow,
and this sometimes confuses passes and analyzes that work with memory;
- Not all passes and analysis are aware of the semantics of guards. These passes treat them
as regular throwing call and have no idea that the condition of guard may be used to prove
something. One well-known place which had caused us troubles in the past is explicit loop
iteration count calculation in SCEV. Another example is new loop unswitching which is not
aware of guards. Whenever a new pass appears, we potentially have this problem there.
Rather than go and fix all these places (and commit to keep track of them and add support
in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible.
The only significant difference between guards and regular explicit branches is that guard's condition
can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor,
and it is always legal to go there no matter what the guard condition is. The other successor is
a guarded block, and it is only legal to go there if the condition is true.
This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard`
intrinsic. Now a widenable guard can be represented in the CFG explicitly like this:
%widenable_condition = call i1 @llvm.experimental.widenable.condition()
%new_condition = and i1 %cond, %widenable_condition
br i1 %new_condition, label %guarded, label %deopt
guarded:
; Guarded instructions
deopt:
call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
The new intrinsic `@llvm.experimental.widenable.condition` has semantics of an
`undef`, but the intrinsic prevents the optimizer from folding it early. This form
should exploit all optimization boons provided to `br` instuction, and it still can be
widened by replacing the result of `@llvm.experimental.widenable.condition()`
with `and` with any arbitrary boolean value (as long as the branch that is taken when
it is `false` has a deopt and has no side-effects).
For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using
implicit control flow in guards".
This patch introduces this new intrinsic with respective LangRef changes and a pass
that converts old-style guards (expressed as intrinsics) into the new form.
The naming discussion is still ungoing. Merging this to unblock further items. We can
later change the name of this intrinsic.
Reviewed By: reames, fedor.sergeev, sanjoy
Differential Revision: https://reviews.llvm.org/D51207
llvm-svn: 348593
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D54848
llvm-svn: 348570
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current algorithm that collects live/dead/inloop blocks relies on some invariants
related to RPO and PO traversals. In particular, the important fact it requires is that
the only loop's latch is the first block in PO traversal. It also relies on fact that during
RPO we visit all prececessors of a block before we visit this block (backedges ignored).
If a loop has irreducible non-loop cycle inside, both these assumptions may break.
This patch adds detection for this situation and prohibits the terminator folding
for loops with irreducible CFG.
We can in theory support this later, for this some algorithmic changes are needed.
Besides, irreducible CFG is not a frequent situation and we can just don't bother.
Thanks @uabelho for finding this!
Differential Revision: https://reviews.llvm.org/D55357
Reviewed By: skatkov
llvm-svn: 348567
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When CodeExtractor outlines values which are used by the original
function, it must store those values in some in-out parameter. This
store instruction must not be inserted in between a PHI and an EH pad
instruction, as that results in invalid IR.
This fixes the following verifier failure seen while outlining within
ObjC methods with live exit values:
The unwind destination does not have an exception handling instruction!
%call35 = invoke i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %exn.adjusted, i8* %1)
to label %invoke.cont34 unwind label %lpad33, !dbg !4183
The unwind destination does not have an exception handling instruction!
invoke void @objc_exception_throw(i8* %call35) #12
to label %invoke.cont36 unwind label %lpad33, !dbg !4184
LandingPadInst not the first non-PHI instruction in the block.
%3 = landingpad { i8*, i32 }
catch i8* null, !dbg !1411
rdar://46540815
llvm-svn: 348562
|
|
|
|
|
|
|
| |
This reverts commit r348549. Causing assertion failures during
clang build.
llvm-svn: 348558
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DemandedBits and BDCE currently only support scalar integers. This
patch extends them to also handle vector integer operations. In this
case bits are not tracked for individual vector elements, instead a
bit is demanded if it is demanded for any of the elements. This matches
the behavior of computeKnownBits in ValueTracking and
SimplifyDemandedBits in InstCombine.
The getDemandedBits() method can now only be called on instructions that
have integer or vector of integer type. Previously it could be called on
any sized instruction (even if it was not particularly useful). The size
of the return value is now always the scalar size in bits (while
previously it was the type size in bits).
Differential Revision: https://reviews.llvm.org/D55297
llvm-svn: 348549
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r348203 and reapplies D55085 with an additional
GCOV bugfix to make the change NFC for relative file paths in .gcno files.
Thanks to Ilya Biryukov for additional testing!
Original commit message:
Update Diagnostic handling for changes in CFE.
The clang frontend no longer emits the current working directory for
DIFiles containing an absolute path in the filename: and will move the
common prefix between current working directory and the file into the
directory: component.
https://reviews.llvm.org/D55085
llvm-svn: 348512
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Partial Redundancy Elimination of GEPs prevents CodeGenPrepare from
sinking the addressing mode computation of memory instructions back
to its uses. The problem comes from the insertion of PHIs, which
confuse CGP and make it bail.
I've autogenerated the check lines of an existing test and added a
store instruction to demonstrate the motivation behind this change.
The store is now using the gep instead of a phi.
Differential Revision: https://reviews.llvm.org/D55009
llvm-svn: 348496
|
|
|
|
|
|
|
|
| |
This reverts commit r348457.
The original commit causes clang to crash when doing an instrumented
build with a new pass manager. Reverting to unbreak our integrate.
llvm-svn: 348484
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was finally able to quantify what i thought was missing in the fix,
it was vector constants. If we have a scalar (and %x, -1),
it will be instsimplified before we reach this code,
but if it is a vector, we may still have a -1 element.
Thus, we want to avoid the fold if *at least one* element is -1.
Or in other words, ignoring the undef elements, no sign bits
should be set. Thus, m_NonNegative().
A follow-up for rL348181
https://bugs.llvm.org/show_bug.cgi?id=39861
llvm-svn: 348462
|
|
|
|
|
|
|
|
|
|
| |
This patch teaches LoopSimplifyCFG to delete loop blocks that have
become unreachable after terminator folding has been done.
Differential Revision: https://reviews.llvm.org/D54023
Reviewed By: anna
llvm-svn: 348457
|
|
|
|
|
|
|
|
| |
Extracting from a splat constant is always handled by InstSimplify.
Move the test for this from InstCombine to InstSimplify to make
sure that stays true.
llvm-svn: 348423
|
|
|
|
| |
llvm-svn: 348418
|
|
|
|
|
|
|
|
|
|
| |
Treat terminators which resume exception propagation as returning instructions
(at least, for the purposes of marking outlined functions `noreturn`). This is
to avoid inserting traps after calls to outlined functions which unwind.
rdar://46129950
llvm-svn: 348404
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: debug intrinsics might be marked norecurse to enable the caller function to be norecurse and optimized if needed. This avoids code gen optimisation differences when -g is used, as in globalOpt.cpp:processInternalGlobal checks.
Reviewers: chandlerc, jmolloy, aprantl
Reviewed By: aprantl
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D55187
llvm-svn: 348381
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tests here are based on the motivating cases from D54827.
More background:
1. We don't get these cases in general with SimplifyCFG because the root
of the pattern match is an icmp, not a branch. I'm not sure how often
we encounter this pattern vs. the seemingly more likely case with
branches, but I don't see evidence to leave the minimal pattern
unoptimized.
2. This has a chance of increasing compile-time because we're using a
ValueTracking call to handle the match. The motivating cases could be
handled with a simpler pair of calls to isImpliedTrueByMatchingCmp/
isImpliedFalseByMatchingCmp, but I saw that we have a more
comprehensive wrapper around those, so we might as well use it here
unless there's evidence that it's significantly slower.
3. Ideally, we'd handle the fold to constants in InstSimplify, but as
with the existing code here, we could extend this to handle cases
where the result is not a constant, but a new combined predicate.
That would mean splitting the logic across the 2 passes and possibly
duplicating the pattern-matching cost.
4. As mentioned in D54827, this seems like the kind of thing that should
be handled in Correlated Value Propagation, but that pass is currently
limited to dealing with instructions with constant operands, so extending
this bit of InstCombine is the smallest/easiest way to get these patterns
optimized.
llvm-svn: 348367
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The remaining code paths that ControlFlowHoisting introduced that were
not disabled, increased compile time by 3x for some benchmarks.
The time is spent in DominatorTree updates.
Reviewers: john.brawn, mkazantsev
Subscribers: sanjoy, jlebar, llvm-commits
Differential Revision: https://reviews.llvm.org/D55313
llvm-svn: 348345
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: eugenis, m.ostapenko, ygribov
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D55157
llvm-svn: 348327
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: eugenis, m.ostapenko, ygribov
Subscribers: mehdi_amini, kubamracek, hiraditya, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D55156
llvm-svn: 348316
|
|
|
|
|
|
|
| |
The old function underspecified the return type, took an unused parameter,
and had a misleading name.
llvm-svn: 348292
|
|
|
|
|
|
|
|
| |
Move it out from under the constant check, reorder
predicates, add comments. This makes it easier to
extend to handle the non-constant case.
llvm-svn: 348284
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r348203.
Reason: this produces absolute paths in .gcno files, breaking us
internally as we rely on them being consistent with the filenames passed
in the command line.
Also reverts r348157 and r348155 to account for revert of r348154 in
clang repository.
llvm-svn: 348279
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's a potential small enhancement to this code that could
solve the cases currently under proposal in D54827 via SimplifyCFG.
Whether instcombine should be doing this kind of semi-non-local
analysis in the first place is an open question, but separating
the logic out can only help if/when we decide to move it to a
different pass.
AFAICT, any proposal to do this in SimplifyCFG could also be seen
as an overreach + it would be incomplete to start the fold from a
branch rather than an icmp.
There's another question here about the code for processUGT_ADDCST_ADD().
That part may be completely dead after rL234638 ?
llvm-svn: 348273
|
|
|
|
| |
llvm-svn: 348267
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Teach SimpleLoopUnswitch to preserve MemorySSA.
Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D47022
llvm-svn: 348263
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
--asan-use-private-alias increases binary sizes by 10% or more.
Most of this space was long names of aliases and new symbols.
These symbols are not needed for the ODC check at all.
Reviewers: eugenis
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D55146
llvm-svn: 348221
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR39433)
If a PHI node out of extracted region has multiple incoming values from it,
split this PHI on two parts. First PHI has incomings only from region and
extracts with it (they are placed to the separate basic block that added to the
list of outlined), and incoming values in original PHI are replaced by first
PHI. Similar solution is already used in CodeExtractor for PHIs in entry block
(severSplitPHINodes method). It covers PR39433 bug.
Patch by Sergei Kachkov!
Differential Revision: https://reviews.llvm.org/D55018
llvm-svn: 348205
|
|
|
|
|
|
|
|
|
|
|
|
| |
The clang frontend no longer emits the current working directory for
DIFiles containing an absolute path in the filename: and will move the
common prefix between current working directory and the file into the
directory: component.
This fixes the GCOV tests in compiler-rt that were broken by the Clang
change.
llvm-svn: 348203
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we have a shuffle that extends a source vector with undefs
and then do some binop on that, we must make sure that the extra
elements remain undef with that binop if we reverse the order of
the binop and shuffle.
'or' is probably the easiest example to show the bug because
'or C, undef --> -1' (not undef). But there are other
opcode/constant combinations where this is true as shown by
the 'shl' test.
llvm-svn: 348191
|
|
|
|
|
|
|
|
|
|
|
| |
These two folds are invalid for this non-constant pattern
when the mask ends up being all-ones:
https://rise4fun.com/Alive/9au
https://rise4fun.com/Alive/UcQM
Fixes https://bugs.llvm.org/show_bug.cgi?id=39861
llvm-svn: 348181
|
|
|
|
|
|
|
|
| |
This code has a bug dealing with undefs, so we need
to add another escape hatch, so doing some cleanup
ahead of that.
llvm-svn: 348175
|
|
|
|
|
|
|
|
| |
There are potential improvements to the structure of this API
raised by D54994, but remove some cosmetic blemishes before
making any functional changes.
llvm-svn: 348149
|
|
|
|
|
|
|
|
|
|
| |
This change enables conservative assembly instrumentation in KMSAN builds
by default.
It's still possible to disable it with -msan-handle-asm-conservative=0
if something breaks. It's now impossible to enable conservative
instrumentation for userspace builds, but it's not used anyway.
llvm-svn: 348112
|
|
|
|
|
|
|
|
| |
We were duplicating code around the existing isImpliedCondition() that
checks for a predecessor block/dominating condition, so make that a
wrapper call.
llvm-svn: 348088
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend ssub.sat(X, C) -> sadd.sat(X, -C) canonicalization to also
support non-splat vector constants. This is done by generalizing
the implementation of the isNotMinSignedValue() helper to return
true for constants that are non-splat, but don't contain any
signed min elements.
Differential Revision: https://reviews.llvm.org/D55011
llvm-svn: 348072
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When mem2reg inserts phi nodes in blocks with unreachable predecessors,
it adds undef operands for those incoming edges. When there are
multiple such predecessors, the order is currently based on the address
of the BasicBlocks. This change fixes that by using the BBNumbers in
the sort/search predicates, as is done elsewhere in mem2reg to ensure
determinism.
Also adds a testcase with a bunch of unreachable preds, which
(nodeterministically) fails without the fix.
Reviewers: majnemer
Reviewed By: majnemer
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D55077
llvm-svn: 348024
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
An additional fix for PR39774. Need to update the references for the
RedcutionRoot instruction when it is replaced during the vectorization
phase to avoid compiler crash on reduction vectorization.
Reviewers: RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D55017
llvm-svn: 347997
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Terminator folding transform lacks MemorySSA update for memory Phis,
while they exist within MemorySSA analysis. They need exactly the same
type of updates as regular Phis. Failing to update them properly ends up
with inconsistent MemorySSA and manifests in various assertion failures.
This patch adds Memory Phi updates to this transform.
Thanks to @jonpa for finding this!
Differential Revision: https://reviews.llvm.org/D55050
Reviewed By: asbirlea
llvm-svn: 347979
|