| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Unrolled Loop Size calculations moved to a function.
Constant representing number of optimized instructions
when "back edge" becomes "fall through" replaced with
variable.
Some comments added.
Reviewers: mzolotukhin
Differential Revision: http://reviews.llvm.org/D21719
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 286389
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26439
llvm-svn: 286382
|
| |
|
|
| |
llvm-svn: 286361
|
| |
|
|
|
|
|
|
|
|
|
| |
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.
Differential Revision: https://reviews.llvm.org/D26185
llvm-svn: 286347
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code.
Reviewers: davidxl, hfinkel
Subscribers: mehdi_amini, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D26155
llvm-svn: 286325
|
| |
|
|
|
|
|
|
|
|
| |
As the test change shows, we can increase the critical path by adding
a 'not' instruction, so make sure that we're actually removing an
instruction if we do this transform.
This transform could also cause us to miss folds of min/max pairs.
llvm-svn: 286315
|
| |
|
|
| |
llvm-svn: 286314
|
| |
|
|
|
|
|
|
| |
This addresses PR30746, <https://llvm.org/bugs/show_bug.cgi?id=30746>. The ASan pass iterates over entry-block instructions and checks each alloca whether it's in NonInstrumentedStaticAllocaVec, which is apparently slow. This patch gathers the instructions to move during visitAllocaInst.
Differential Revision: https://reviews.llvm.org/D26380
llvm-svn: 286296
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26408
llvm-svn: 286280
|
| |
|
|
|
|
|
|
|
|
| |
For example, it invalidates the domtree, causing assertions
in later passes which need dominator infos. Make it preserve
GlobalsAA, as suggested by Eli.
Differential Revision: https://reviews.llvm.org/D26381
llvm-svn: 286271
|
| |
|
|
| |
llvm-svn: 286270
|
| |
|
|
| |
llvm-svn: 286250
|
| |
|
|
|
|
|
|
| |
ULEB128 (NFC).
From experiments, discriminator is rarely greater than 127. Here we enforce it to be no greater than 127 so that it will always fit in 1 byte.
llvm-svn: 286245
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.
Patch by James Molloy and Pablo Barrio.
Reviewers: rengolin, haicheng, sebpop
Subscribers: jojo, jmolloy, llvm-commits
Differential Revision: https://reviews.llvm.org/D26391
llvm-svn: 286236
|
| |
|
|
|
|
| |
Address review by Eli Friedman on rL286147.
llvm-svn: 286165
|
| |
|
|
|
|
|
|
| |
InnerLoopVectorizer::createInductionVariable. (NFC)
This is to prevent SetInsertionPoint from setting debug loc to Latch->getTerminator().
llvm-svn: 286159
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In some specific scenarios with well understood operand bundle types
(like `"deopt"`) it may be possible to go ahead and convert recursion to
iteration, but TailRecursionElimination does not have that logic today
so avoid doing the right thing for now.
I need some input on whether `"funclet"` operand bundles should also
block tail recursion elimination. If not, I'll allow TRE across calls
with `"funclet"` operand bundles and add a test case.
Reviewers: rnk, majnemer, nlewycky, ahatanak
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D26270
llvm-svn: 286147
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Use -fsanitize-recover instead of -mllvm -msan-keep-going.
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26352
llvm-svn: 286145
|
| |
|
|
|
|
|
|
|
|
| |
accesses, LLVM part
Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this.
Differential Revision: https://reviews.llvm.org/D26266
llvm-svn: 286135
|
| |
|
|
|
|
|
|
|
| |
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.
llvm-svn: 286126
|
| |
|
|
| |
llvm-svn: 286117
|
| |
|
|
|
|
|
|
| |
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.
llvm-svn: 286113
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.
It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range. But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.
Fixes PR30914.
Reviewers: jeremyhu
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D26323
llvm-svn: 286038
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instruction.
This avoids dereferencing null in the debug logging if the instruction
was not in fact a return instruction. This potential bug was found by
PVS-Studio.
This actually fixes the last of the "dereferenced a pointer before
checking it for null" reports in the recent PVS-Studio run. However,
there are quite a few reports of this nature that I did not do anything
to fix because they are pretty glaring false positives. They usually
took the form of quite clear correlated checks or a check made in
a separate function. I've even added asserts anywhere this correlation
wasn't pretty obvious and fundamental to the code.
llvm-svn: 285988
|
| |
|
|
| |
llvm-svn: 285978
|
| |
|
|
|
|
|
|
|
|
|
| |
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.
We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101
llvm-svn: 285930
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The recent change I made to consult the summary when deciding whether to
rename (to handle inline asm) in r285513 broke the distributed build
case. In a distributed backend we will only have a portion of the
combined index, specifically for imported modules we only have the
summaries for any imported definitions. When renaming on import we were
asserting because no summary entry was found for a local reference being
linked in (def wasn't imported).
We only need to consult the summary for a renaming decision for the
exporting module. For imports, we would have prevented importing any
references to NoRename values already.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26250
llvm-svn: 285871
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r285732.
This change introduced a new assertion failure in the following
testcase at -O2:
typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
__v8hi Result = V1;
if (mask & 0x80)
Result[0] = V2[0];
return Result;
}
llvm-svn: 285866
|
| |
|
|
|
|
|
|
|
|
|
| |
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.
Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .
Differential Revision: https://reviews.llvm.org/D25970
llvm-svn: 285857
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It was detected that the reassociate pass could enter an inifite
loop when analysing dead code. Simply skipping to analyse basic
blocks that are dead avoids such problems (and as a side effect
we avoid spending time on optimising dead code).
The solution is using the same Reverse Post Order ordering of the
basic blocks when doing the optimisations, as when building the
precalculated rank map. A nice side-effect of this solution is
that we now know that we only try to do optimisations for blocks
with ranked instructions.
Fixes https://llvm.org/bugs/show_bug.cgi?id=30818
Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini
Subscribers: dberlin
Differential Revision: https://reviews.llvm.org/D26154
llvm-svn: 285793
|
| |
|
|
|
|
|
|
| |
Patch by bryant.
Differential Revision: https://reviews.llvm.org/D26126
llvm-svn: 285750
|
| |
|
|
| |
llvm-svn: 285732
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This transforms
%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0
into
%b = icmp {ugt|ule} %x, (c0 >> c1)
z3:
(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))
(push)
(assert (= x (bvlshr (bvshl x c1) c1))) ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
(bvugt x
(bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)
(push)
(assert (= x (bvlshr (bvshl x c1) c1))) ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
(bvule x
(bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)
Patch by bryant!
Differential Revision: https://reviews.llvm.org/D25913
llvm-svn: 285729
|
| |
|
|
|
|
|
|
|
| |
1. Change param names for readability
2. Change pointer param to ref
3. Early exit to reduce indent
4. Change switch to if/else
llvm-svn: 285718
|
| |
|
|
|
|
| |
This is just a cut and paste; clean-up and enhancements to follow.
llvm-svn: 285715
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces the combine:
(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive
If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.
Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).
Differential Revision: https://reviews.llvm.org/D26000
llvm-svn: 285696
|
| |
|
|
|
|
| |
Found with PVS-Studio here: http://www.viva64.com/en/b/0446/
llvm-svn: 285652
|
| |
|
|
|
|
|
|
| |
On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings.
Differential Revision: https://reviews.llvm.org/D25026
llvm-svn: 285619
|
| |
|
|
| |
llvm-svn: 285568
|
| |
|
|
| |
llvm-svn: 285518
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
possible pointer-wrap-around concerns, in some cases.
Before this patch, collectConstStridedAccesses (part of interleaved-accesses
analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when
examining all candidate pointers. This is too conservative. Instead, this
patch makes collectConstStridedAccesses use an optimistic approach, calling
getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the
candidate interleave groups have been formed, revisits the pointer-wrapping
analysis but only where it matters: namely, in groups that have gaps, and where
the gaps are not at the very end of the group (in which case the loop is
peeled). This second time getPtrStride is called with [Assume=false,
ShouldCheckWrap=true], but this could further be improved to using Assume=true,
once we also add the logic to track that we are not going to meet the scev
runtime checks threshold.
Differential Revision: https://reviews.llvm.org/D25276
llvm-svn: 285517
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Instead of using the workaround of suppressing the entire index for
modules that call inline asm that may reference locals, use the
NoRename flag on the summary for any locals in the llvm.used set, and
add a reference edge from any functions containing inline asm.
This avoids issues from having no summaries despite the module defining
global values, which was preventing more aggressive index-based
optimization. It will be followed by a subsequent patch to make a
similar fix for local references in module level asm (to fix PR30610).
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26121
llvm-svn: 285513
|
| |
|
|
|
|
| |
Rename as suggested in code review for D26063.
llvm-svn: 285508
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Replace the check of whether a GV has a section with the flag check
in the summary. This is in preparation for using the NoPromote flag
to convey other situations when we can't promote (e.g. locals used in
inline asm).
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26063
llvm-svn: 285507
|
| |
|
|
|
|
|
|
|
|
|
| |
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.
The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6
Differential Revision: https://reviews.llvm.org/D25943
llvm-svn: 285495
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This "pass" eagerly creates div and rem instructions even when only one
is needed -- it relies on a later pass (machine DCE?) to clean them up.
This is problematic not just from a cleanliness perspective (this pass
is running during CodeGenPrepare, so should leave the IR in a better
state), but it also creates a problem for instruction selection. If we
always have a div+rem, isel will always select a divrem instruction (if
possible), even when a single div or rem would do.
Specifically, in NVPTX, we want to compute rem from the output of div,
if available. But if a div is not available, we want to leave the rem
alone. This transformation is overeager if div is always available.
Because this code runs as part of CodeGenPrepare, it's nontrivial to
write a test for this change. But this will effectively be tested by
a later patch which adds the aforementioned change to NVPTX isel.
Reviewers: tra
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26088
llvm-svn: 285460
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In BypassSlowDivision's short-dividend path, we would create e.g.
udiv exact i32 %a, %b
"exact" here means that we are asserting that %a is a multiple of %b.
But we have no reason to believe this must be true -- this is just a
bug, as far as I can tell.
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D26097
llvm-svn: 285459
|
| |
|
|
|
|
|
| |
Partial step towards removing the whitelist and only
using TTI's cost.
llvm-svn: 285438
|
| |
|
|
|
|
|
|
| |
Thanks to bryant for the patch!
Differential Revision: https://reviews.llvm.org/D26086
llvm-svn: 285432
|
| |
|
|
|
|
|
|
|
| |
Now LPPassManager will run LCSSA verification only for the top-level loop
which was processed on the current iteration.
Differential Revision: https://reviews.llvm.org/D25873
llvm-svn: 285394
|