| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
This was causing bot failures on greendragon
llvm-svn: 326169
|
| |
|
|
| |
llvm-svn: 326167
|
| |
|
|
|
|
|
|
|
|
| |
Summary:
Add specific mtriples to tests added in r326154.
From: Evgeny Stupachenko <evstupac@gmail.com>
<evgeny.v.stupachenko@intel.com>
llvm-svn: 326158
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The change fix an assert fail at ScalarEvolutionExpander.cpp:
assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");
Reviewers: sbaranga
Differential Revision: http://reviews.llvm.org/D42604
From: Evgeny Stupachenko <evstupac@gmail.com>
<evgeny.v.stupachenko@intel.com>
llvm-svn: 326154
|
| |
|
|
|
|
|
|
|
|
| |
in r320674 to help X86.
AVX512 used to promote v32i1 to v32i8 during legalization when BWI was disabled. So this code was added to improve legalization of v32i1 concat_vectors of v16i1 by extending the v16i1 to v16i8 to avoid scalarization.
X86 has since switched to legalizing v32i1 by splitting to v16i1 instead. This has rendered this code unnecessary and its no longer exercised.
llvm-svn: 326153
|
| |
|
|
|
|
| |
vectors; NFC
llvm-svn: 326148
|
| |
|
|
|
|
| |
NFC
llvm-svn: 326147
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we assert that only non target specific opcodes can have
missing RegisterClass constraints in the MCDesc. The backend can have
instructions with register operands but don't have RegisterClass
constraints (say using unknown_class) in which case the instruction
defining the register will constrain it.
Change the assert to only fire if a def has no regclass.
https://reviews.llvm.org/D43409
llvm-svn: 326142
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constants.
Summary: This allows vector fabs to be removed in more cases.
Reviewers: spatel, arsenm, RKSimon
Reviewed By: spatel
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43739
llvm-svn: 326138
|
| |
|
|
|
|
|
|
|
|
| |
Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.
Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.
Differential Revision: https://reviews.llvm.org/D43733
llvm-svn: 326133
|
| |
|
|
|
|
|
| |
This AsmParser test is target-agnostic, but contained some target-specific
instructions, which broke on SystemZ.
llvm-svn: 326129
|
| |
|
|
|
|
| |
There's still some shortcoming in our ability to combine binops of constants with different sizes separated by an extend. I'll try to look at that next.
llvm-svn: 326128
|
| |
|
|
|
|
| |
The main benefit is that they release the memory they were holding onto.
llvm-svn: 326127
|
| |
|
|
|
|
|
| |
When reading the resulting files back with opt-viewer, they will be parsed in
parallel.
llvm-svn: 326126
|
| |
|
|
| |
llvm-svn: 326125
|
| |
|
|
| |
llvm-svn: 326124
|
| |
|
|
|
|
| |
One can start looking at the index while the pages are still generating
llvm-svn: 326123
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v32i1)) without AVX512 to prevent scalarization
Summary:
We have an early DAG combine to turn these patterns into MOVMSK, but that combine doesn't work if the vXi1 type has more elements than the widest legal vXi8 type. Type legalization will eventually split it down to v16i1 or v32i1 and then the bitcast gets legalized to a truncstore and a scalar load. The truncstore will get lowered to a series of extracts and bit math.
This patch adds a custom legalization to use a sign extend and MOVMSK instead. This prevents the eventual scalarization.
Reviewers: spatel, RKSimon, zvi
Reviewed By: RKSimon
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D43593
llvm-svn: 326119
|
| |
|
|
| |
llvm-svn: 326117
|
| |
|
|
| |
llvm-svn: 326115
|
| |
|
|
|
|
| |
checks. NFC
llvm-svn: 326114
|
| |
|
|
|
|
|
|
| |
This change improves incremental rebuild performance on dual Xeon 8168
machines by 54%. This change also improves run time code gen by not
forcing the case values to be lvalues.
llvm-svn: 326109
|
| |
|
|
|
|
|
|
|
|
|
| |
This wires up -pass-remarks-hotness-threshold to LTO and ThinLTO.
Next is to change the clang driver to pass this
with -fdiagnostics-hotness-threshold.
Differential Revision: https://reviews.llvm.org/D41465
llvm-svn: 326107
|
| |
|
|
|
|
| |
SplitBinaryOpsAndApply
llvm-svn: 326104
|
| |
|
|
| |
llvm-svn: 326101
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: gcc appears to allow this fold with -freciprocal-math alone,
but clang/llvm require more than that with this patch. The wording
in the definitions seems fuzzy enough that it could go either way,
but we'll err on the conservative side of FMF interpretation.
This patch also changes the newly created fmul to have FMF propagated
by the last fdiv rather than intersecting the FMF of the fdivs. This
matches the behavior of other folds near here. The new fmul is only
used to produce an intermediate op for the final fdiv result, so it
shouldn't be any stricter than that result. The previous behavior
could result in dropping FMF via other folds in instcombine or CSE.
Differential Revision: https://reviews.llvm.org/D43398
llvm-svn: 326098
|
| |
|
|
|
|
| |
Cleanup check-prefixes to share more AVX/AVX512 codegen checks
llvm-svn: 326097
|
| |
|
|
|
|
|
|
|
| |
It seems to break the following buildbot:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/24729
Will resubmit after investigating and fixing it.
llvm-svn: 326096
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r322867, we introduced IsStandalone when printing MIR in -debug
output. The default behaviour for that was:
1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any
redundant information.
2) When -debug-printing a MF entirely, don't print any redundant
information.
3) When printing MIR, don't print any redundant information.
I'd like to change 2) to:
2) When -debug-printing a MF entirely, don't omit any redundant information.
Differential Revision: https://reviews.llvm.org/D43337
llvm-svn: 326094
|
| |
|
|
|
|
| |
Fixes scary typo in a check that lost the end digit off a reg#...
llvm-svn: 326093
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It was printed using code for generic containers before, resulting in
unreadable output.
Reviewers: sammccall, labath
Reviewed By: sammccall, labath
Subscribers: labath, zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D43330
llvm-svn: 326092
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the HashString function from StringExtraces and
replaces its uses with calls to djbHash from DJB.h.
This change is *almost* NFC. While the algorithm is identical, the
djbHash implementation in StringExtras used 0 as its default seed while
the implementation in DJB uses 5381. The latter has been shown to result
in less collisions and improved avalanching and is used by the DWARF
accelerator tables.
Because some test were implicitly relying on the hash order, I've
reverted to using zero as a seed for the following two files:
lld/include/lld/Core/SymbolTable.h
llvm/lib/Support/StringMap.cpp
Differential revision: https://reviews.llvm.org/D43615
llvm-svn: 326091
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With OS type AMDPAL, the scratch descriptor is hardwired to be loaded
from offset 0 of the global information table, whose low pointer is
passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as
the hardware reserves s0-s7.
Reviewers: kzhuravl
Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl
Differential Revision: https://reviews.llvm.org/D42203
llvm-svn: 326088
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In the test case, the machine scheduler moves a dead write to a subreg
up into the middle of a segment of the overall reg's live range, where
the segment had liveness only for other subregs in the reg.
handleMoveUp created an invalid live range, causing an assert a bit
later.
This commit fixes it to handle that situation. The segment is split in
two at the insertion point, and the part after the split, and any
subsequent segments up to the old position, are changed to be defined by
the moved def.
V2: Better test.
Subscribers: MatzeB, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D43478
Change-Id: Ibc42445ddca84e79ad1f616401015d22bc63832e
llvm-svn: 326087
|
| |
|
|
| |
llvm-svn: 326085
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
It looks like some of our tests depend on the ordering of hashed values.
I'm reverting my changes while I try to reproduce and fix this locally.
Failing builds:
lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/18388
lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/6743
lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/15607
llvm-svn: 326082
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes the HashString function from StringExtraces and replaces
its uses with calls to djbHash from DJB.h
This is *almost* NFC. While the algorithm is identical, the djbHash
implementation in StringExtras used 0 as its seed while the
implementation in DJB uses 5381. The latter has been shown to result in
less collisions and improved avalanching.
https://reviews.llvm.org/D43615
(cherry picked from commit 77f7f965bc9499a9ae768a296ca5a1f7347d1d2c)
llvm-svn: 326081
|
| |
|
|
|
|
|
|
| |
This will still be constexpr when the standard library supports it, but
doesn't force constexpr. Old libraries will get a global constructor,
which is not too bad.
llvm-svn: 326080
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All SIMD architectures can emulate masked load/store/gather/scatter
through element-wise condition check, scalar load/store, and
insert/extract. Therefore, bailing out of vectorization as legality
failure, when they return false, is incorrect. We should proceed to cost
model and determine profitability.
This patch is to address the vectorizer's architectural limitation
described above. As such, I tried to keep the cost model and
vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning
should be done separately.
Please see
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for
RFC and the discussions.
Closes D43208.
Patch by: Hideki Saito <hideki.saito@intel.com>
llvm-svn: 326079
|
| |
|
|
| |
llvm-svn: 326078
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The dependency matrix is only empty if no conflicting load/store
instructions have been found. In that case, it is safe to interchange.
For the LLVM test-suite, after this change around 1900 loops are
interchanged, whereas it is 15 before this change. On cortex-a57,
this gives an improvement of -0.57% on the geomean execution
time of SPEC2006, SPEC2000 and the test-suite. There are a
few small perf regressions, but I think we can improve on those
by making the cost model better.
Reviewers: karthikthecool, mcrosier
Reviewed by: karthikthecool
Differential Revision: https://reviews.llvm.org/D43236
llvm-svn: 326077
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D41278
llvm-svn: 326074
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch introduces the new function in ScalarEvolution to get
all loops used in specified SCEV.
This is a preparation for re-writing isKnownPredicate utility as
described in https://reviews.llvm.org/D42417.
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43504
llvm-svn: 326072
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch introduces the SCEVPostIncRewriter rewriter which
is similar to SCEVInitRewriter but rewrites AddRec with post increment
value of this AddRec.
This is a preparation for re-writing isKnownPredicate utility as
described in https://reviews.llvm.org/D42417.
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43499
llvm-svn: 326071
|
| |
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Robert Lytton
llvm-svn: 326069
|
| |
|
|
|
|
| |
256-bit operations.
llvm-svn: 326068
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch introduces an additional parameter IgnoreOtherLoops to SCEVInitRewriter.
if it is equal to true then rewriter will not invalidate result in case
SCEV depends on other loops then specified during creation.
The patch does not change the default behavior.
This is a preparation for re-writing isKnownPredicate utility as
described in https://reviews.llvm.org/D42417.
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43498
llvm-svn: 326067
|
| |
|
|
|
|
| |
elements are.
llvm-svn: 326066
|
| |
|
|
| |
llvm-svn: 326065
|
| |
|
|
|
|
| |
This code seemed to try to widen to 128, 256, or 512 bit vectors, but we only create X86ISD::AVG with a power of 2 number of elements. This means the only nodes that need to be legalized are less than 128-bits and need to be widened up to 128 bits.
llvm-svn: 326064
|