| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
Whether it is legal or not needs to check for the instruction
it will be replaced with.
llvm-svn: 291711
|
| |
|
|
| |
llvm-svn: 291708
|
| |
|
|
|
|
|
|
|
| |
This means that we can use a shorter instruction sequence in the case where
the size is a power of two and on the boundary between two representations.
Differential Revision: https://reviews.llvm.org/D28421
llvm-svn: 291706
|
| |
|
|
|
|
|
|
|
| |
Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.
Differential Revision: https://reviews.llvm.org/D28536
llvm-svn: 291704
|
| |
|
|
|
|
|
|
|
| |
This is both easier to understand, and produces a tighter bound in certain
cases.
Differential Revision: https://reviews.llvm.org/D28393
llvm-svn: 291701
|
| |
|
|
|
|
| |
and a lowering phase.", with a fix for an off-by-one error.
llvm-svn: 291699
|
| |
|
|
|
|
|
|
|
|
|
| |
classes, and updating checking to allow for equivalence through
reachability.
(Sadly, the checking here is not perfect, and can't be made perfect,
so we'll have to disable it after we are satisfied with correctness.
Right now it is just "very unlikely" to happen.)
llvm-svn: 291698
|
| |
|
|
|
|
| |
This patch resubmits the changes in r291588.
llvm-svn: 291696
|
| |
|
|
|
|
|
|
|
| |
This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded.
This needs a simple probability check because there are some cases where it is
not profitable.
llvm-svn: 291695
|
| |
|
|
|
|
|
|
|
| |
The new matchers work after legalization to make them simpler, and to avoid
blocking other optimizations.
Differential Revision: https://reviews.llvm.org/D27779
llvm-svn: 291693
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The removed assert seems bogus - it's perfectly legal for the roots of the
vectorized subtrees to be equal even if the original scalar values aren't,
if the original scalars happen to be equivalent.
This fixes PR31599.
Differential Revision: https://reviews.llvm.org/D28539
llvm-svn: 291692
|
| |
|
|
| |
llvm-svn: 291690
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.
This change separates how type identifiers are resolved from how intrinsic
calls are lowered. All information required to lower an intrinsic call
is stored in a new TypeIdLowering data structure. The idea is that this
data structure can either be initialized using the module itself during
regular LTO, or using the module summary in ThinLTO backends.
Original URL: https://reviews.llvm.org/D28341
Reviewers: pcc
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D28532
llvm-svn: 291684
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Commit rL290616 (https://reviews.llvm.org/rL290616) changed a checking command
for the triple arm-apple-darwin in LLVM::CodeGen/ARM/fpcmp_ueq.ll. As a result
of the changes the test could fail for the valid generated code.
These changes fixes the test to check only instructions we would expect.
Differential Revision: https://reviews.llvm.org/D28159
llvm-svn: 291678
|
| |
|
|
|
|
|
|
|
| |
DAG patterns optimization: truncate + unsigned saturation supported by VPMOVUS* instructions in AVX-512.
And VPACKUS* instructions on SEE* targets.
Differential Revision: https://reviews.llvm.org/D28216
llvm-svn: 291670
|
| |
|
|
|
|
|
|
|
|
|
|
| |
immediate operands
Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28157
llvm-svn: 291668
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28447
llvm-svn: 291665
|
| |
|
|
| |
llvm-svn: 291663
|
| |
|
|
|
|
|
|
|
|
|
| |
The code emiited by Clang's intrinsics for (v)cvtsi2ss, (v)cvtsi2sd,
(v)cvtsd2ss and (v)cvtss2sd is lowered to a code sequence that includes
redundant (v)movss/(v)movsd instructions. This patch adds patterns for
optimizing these sequences.
Differential revision: https://reviews.llvm.org/D28455
llvm-svn: 291660
|
| |
|
|
|
|
| |
Missing Requires asserts.
llvm-svn: 291659
|
| |
|
|
|
|
|
|
|
|
|
|
| |
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq.
In case if the real operands bitwidth <= 16.
Differential Revision: https://reviews.llvm.org/D28104
llvm-svn: 291657
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In this change we move the definition of the log reading routines from
the tools directory in LLVM to {include/llvm,lib}/XRay. We improve the
documentation a little bit for the publicly accessible headers, and
adjust the top-matter. This also leads to some refactoring and cleanup
in the tooling code.
In particular, we do the following:
- Rename the class from LogReader to Trace, as it better represents
the logical set of records as opposed to a log.
- Use file type detection instead of asking the user to say what
format the input file is. This allows us to keep the interface
simple and encapsulate the logic of loading the data appropriately.
In future changes we increase the API surface and write dedicated unit
tests for the XRay library.
Depends on D24376.
Reviewers: dblaikie, echristo
Subscribers: mehdi_amini, mgorny, llvm-commits, varno
Differential Revision: https://reviews.llvm.org/D28345
llvm-svn: 291652
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
arguments much like the CGSCC pass manager.
This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.
An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.
This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.
While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.
I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.
Differential Revision: https://reviews.llvm.org/D28292
llvm-svn: 291651
|
| |
|
|
|
|
|
|
| |
AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent."
Some test appears to be hanging on the build bots.
llvm-svn: 291650
|
| |
|
|
|
|
|
|
|
|
|
|
| |
These are interesting again because the user may not be aware that this
is a common reason preventing LICM.
A const is removed from an instruction pointer declaration in order to
pass it to ORE.
Differential Revision: https://reviews.llvm.org/D27940
llvm-svn: 291649
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are interesting because lack of precision in alias information
could be standing in the way of this optimization.
An example is the case in the test suite that I showed in the DevMeeting
talk:
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/MultiSource/Benchmarks/FreeBench/distray/CMakeFiles/distray.dir/html/_org_test-suite_MultiSource_Benchmarks_FreeBench_distray_distray.c.html#L236
canSinkOrHoistInst is also used from LoopSink, which does not use
opt-remarks so we need to take ORE as an optional argument.
Differential Revision: https://reviews.llvm.org/D27939
llvm-svn: 291648
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27938
llvm-svn: 291646
|
| |
|
|
|
|
| |
-> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent.
llvm-svn: 291645
|
| |
|
|
|
|
|
|
| |
Even with aggressive fusion enabled, this requires duplicating
the fmul, or increases an fadd to another fma which is not an
improvement.
llvm-svn: 291642
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
to (setcc (LADD x, -C), COND) (PR31367)
This was reverted because it would miscompile code where the cmp had
multiple uses. That was due to a deficiency in the existing code, which
was fixed in r291630 (see the PR for details).
This re-commit includes an extra test for the kind of code that got
miscompiled: @test_sub_1_setcc_jcc.
llvm-svn: 291640
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We would miscompile the following:
void g(int);
int f(volatile long long *p) {
bool b = __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST) < 0;
g(b ? 12 : 34);
return b ? 56 : 78;
}
into
pushq %rax
lock incq (%rdi)
movl $12, %eax
movl $34, %edi
cmovlel %eax, %edi
callq g(int)
testq %rax, %rax <---- Bad.
movl $56, %ecx
movl $78, %eax
cmovsl %ecx, %eax
popq %rcx
retq
because the code failed to take into account that the cmp has multiple
uses, replaced one of them, and left the other one comparing garbage.
llvm-svn: 291630
|
| |
|
|
| |
llvm-svn: 291624
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28164
llvm-svn: 291622
|
| |
|
|
|
|
|
| |
This patch reverts r291588: [PGO] Turn off comdat renaming in IR PGO by default,
as we are seeing some hash mismatches in our internal tests.
llvm-svn: 291621
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously if you had
* a function with the fast-math-enabled attr, followed by
* a function without the fast-math attr,
the second function would inherit the first function's fast-math-ness.
This means that mixing fast-math and non-fast-math functions in a module
was completely broken unless you explicitly annotated every
non-fast-math function with "unsafe-fp-math"="false". This appears to
have been broken since r176986 (March 2013), when the resetTargetOptions
function was introduced.
This patch tests the correct behavior as best we can. I don't think I
can test FPDenormalMode and NoTrappingFPMath, because they aren't used
in any backends during function lowering. Surprisingly, I also can't
find any uses at all of LessPreciseFPMAD affecting generated code.
The NVPTX/fast-math.ll test changes are an expected result of fixing
this bug. When FMA is disabled, we emit add as "add.rn.f32", which
prevents fma combining. Before this patch, fast-math was enabled in all
functions following the one which explicitly enabled it on itself, so we
were emitting plain "add.f32" where we should have generated
"add.rn.f32".
Reviewers: mkuper
Subscribers: hfinkel, majnemer, jholewinski, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28507
llvm-svn: 291618
|
| |
|
|
|
|
|
|
| |
Also fix up whitespace.
Test-only change.
llvm-svn: 291617
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The original code considered only v2i64 as slow for this feature. This patch
consider all 128-bit long vector types as slow candidates.
In internal tests, extending this feature to all 128-bit vector types
resulted in an overall improvement of 1% on Exynos M1.
Differential revision: https://reviews.llvm.org/D27998
llvm-svn: 291616
|
| |
|
|
|
|
| |
In future commits these patterns will appear after moveToVALU changes.
llvm-svn: 291615
|
| |
|
|
| |
llvm-svn: 291611
|
| |
|
|
|
|
|
|
|
|
|
| |
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.
Differential revision: https://reviews.llvm.org/D27742
llvm-svn: 291609
|
| |
|
|
|
|
|
|
| |
about the value.
Differential Revision: https://reviews.llvm.org/D28487
llvm-svn: 291605
|
| |
|
|
|
|
|
|
|
| |
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.
llvm-svn: 291604
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When we collect 2 uses of a function in FindUses and then RAUW when we
visit the first, we end up visiting the wrapper (because the second was
RAUW'd). We still want to use RAUW instead of just Use->set() because
it has special handling for Constants, so this patch just ensures that
only one use of each constant is added to the work list.
Differential Revision: https://reviews.llvm.org/D28504
llvm-svn: 291603
|
| |
|
|
| |
llvm-svn: 291601
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Support for DW_FORM_implicit_const DWARFv5 feature.
When this form is used attribute value goes to .debug_abbrev section (as SLEB).
As this form would break any debug tool which doesn't support DWARFv5
it is guarded by dwarf version check. Attempt to use this form with
dwarf version <= 4 is considered a fatal error.
Differential Revision: https://reviews.llvm.org/D28456
llvm-svn: 291599
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following the similar change to lit configuration, ensure that all CMake
booleans are canonicalized to 0/1 when being passed to llvm-config. This
fixes the incorrect interpretation of values when user passes another
value than the ON/OFF, and simplifies the code by removing unnecessary
string matching.
Furthermore, the code for --has-rtti and --has-global-isel has been
modified to print consistent values indepdently of the boolean used by
passed by the user to CMake. Sadly, the code already implicitly used
different values for the two (YES/NO for --has-rtti, ON/OFF for
--has-global-isel).
Include tests for all booleans and multi-value options in llvm-config.
Differential Revision: https://reviews.llvm.org/D28366
llvm-svn: 291593
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bail out instead of asserting when we encounter this situation,
which can actually happen.
The reason the test uses the new PM is that the "bad" phi, incidentally, gets
cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve
loop simplify form, so the cleanup has no chance to run.
This fixes PR31190.
We may want to solve this in a less conservative manner, since this phi is
actually uniform within the inner loop (or we may want LICM to output a cleaner
promotion to begin with).
Differential Revision: https://reviews.llvm.org/D28490
llvm-svn: 291589
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In IR PGO we append the function hash to comdat functions to avoid the
potential hash mismatch. This turns out not legal in some cases: if the comdat
function is address-taken and used in comparison. Renaming changes the semantic.
This patch turns off comdat renaming by default.
To alleviate the hash mismatch issue, we now rename the profile variable
for comdat functions. Profile allows co-existing multiple versions of profiles
with different hash value. The inlined copy will always has the correct profile
counter. The out-of-line copy might not have the correct count. But we will
not have the bogus mismatch warning.
Reviewers: davidxl
Subscribers: llvm-commits, xur
Differential Revision: https://reviews.llvm.org/D28416
llvm-svn: 291588
|
| |
|
|
|
|
| |
This was enabled without many specific tests or the comment.
llvm-svn: 291586
|
| |
|
|
| |
llvm-svn: 291585
|