| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
This was causing bot failures on greendragon
llvm-svn: 326169
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add specific -mtriple values to the tests added in r326154.
From: Evgeny Stupachenko <evstupac@gmail.com>
<evgeny.v.stupachenko@intel.com>
llvm-svn: 326158
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change fixes an assertion failure in ScalarEvolutionExpander.cpp:
assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");
Reviewers: sbaranga
Differential Revision: http://reviews.llvm.org/D42604
From: Evgeny Stupachenko <evstupac@gmail.com>
<evgeny.v.stupachenko@intel.com>
llvm-svn: 326154
|
|
|
|
|
|
| |
vectors; NFC
llvm-svn: 326148
|
|
|
|
|
|
| |
NFC
llvm-svn: 326147
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constants.
Summary: This allows vector fabs to be removed in more cases.
Reviewers: spatel, arsenm, RKSimon
Reviewed By: spatel
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43739
llvm-svn: 326138
|
|
|
|
|
|
|
|
|
|
| |
Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.
Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.
Differential Revision: https://reviews.llvm.org/D43733
llvm-svn: 326133
|
|
|
|
|
|
|
| |
This AsmParser test is target-agnostic, but contained some target-specific
instructions, which broke on SystemZ.
llvm-svn: 326129
|
|
|
|
|
|
| |
There are still some shortcomings in our ability to combine binops of constants with different sizes separated by an extend. I'll try to look at that next.
llvm-svn: 326128
|
|
|
|
| |
llvm-svn: 326125
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v32i1)) without AVX512 to prevent scalarization
Summary:
We have an early DAG combine to turn these patterns into MOVMSK, but that combine doesn't work if the vXi1 type has more elements than the widest legal vXi8 type. Type legalization will eventually split it down to v16i1 or v32i1 and then the bitcast gets legalized to a truncstore and a scalar load. The truncstore will get lowered to a series of extracts and bit math.
This patch adds a custom legalization to use a sign extend and MOVMSK instead. This prevents the eventual scalarization.
Reviewers: spatel, RKSimon, zvi
Reviewed By: RKSimon
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D43593
llvm-svn: 326119
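The sign-extend-plus-MOVMSK idea described in this message can be modeled in a few lines. This is only an illustrative sketch, not the LLVM implementation: the function names are hypothetical, and the 16-lane chunk width is an assumption standing in for "the widest legal vXi8 type".

```python
def sign_extend_i1_to_i8(bits):
    """Sign-extend each 1-bit lane to a byte: True -> 0xFF, False -> 0x00."""
    return [0xFF if b else 0x00 for b in bits]


def movmsk(byte_lanes):
    """Model a MOVMSK-style operation: gather the top bit of every byte
    lane into an integer, with lane 0 landing in bit 0."""
    mask = 0
    for i, lane in enumerate(byte_lanes):
        mask |= ((lane >> 7) & 1) << i
    return mask


def bitcast_vxi1_to_int(bits, legal_lanes=16):
    """Turn a wide vector of i1 lanes into an integer bitmask.

    The vector is split into chunks of `legal_lanes` (an assumed legal
    width), each chunk is sign-extended and reduced with one movmsk, and
    the per-chunk masks are combined. No lane is extracted individually.
    """
    result = 0
    for start in range(0, len(bits), legal_lanes):
        chunk = bits[start:start + legal_lanes]
        result |= movmsk(sign_extend_i1_to_i8(chunk)) << start
    return result
```

For example, a 32-lane input needs two chunk-wide mask operations under the assumed width, rather than 32 per-bit extracts.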
|
|
|
|
| |
llvm-svn: 326117
|
|
|
|
| |
llvm-svn: 326115
|
|
|
|
|
|
| |
checks. NFC
llvm-svn: 326114
|
|
|
|
|
|
|
|
|
|
|
| |
This wires up -pass-remarks-hotness-threshold to LTO and ThinLTO.
The next step is to change the clang driver to pass this
via -fdiagnostics-hotness-threshold.
Differential Revision: https://reviews.llvm.org/D41465
llvm-svn: 326107
|
|
|
|
|
|
| |
SplitBinaryOpsAndApply
llvm-svn: 326104
|
|
|
|
| |
llvm-svn: 326101
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: gcc appears to allow this fold with -freciprocal-math alone,
but clang/llvm require more than that with this patch. The wording
in the definitions seems fuzzy enough that it could go either way,
but we'll err on the conservative side of FMF interpretation.
This patch also changes the newly created fmul to have FMF propagated
by the last fdiv rather than intersecting the FMF of the fdivs. This
matches the behavior of other folds near here. The new fmul is only
used to produce an intermediate op for the final fdiv result, so it
shouldn't be any stricter than that result. The previous behavior
could result in dropping FMF via other folds in instcombine or CSE.
Differential Revision: https://reviews.llvm.org/D43398
llvm-svn: 326098
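The difference between intersecting fast-math flags and propagating them from the last fdiv can be illustrated with plain sets. This is a hedged sketch only: the helper names are hypothetical, the flag sets on the two fdivs are made-up inputs, and nothing here reproduces the actual InstCombine code.

```python
def intersect_fmf(*flag_sets):
    """Previous behavior as described: keep only flags shared by all ops."""
    result = set(flag_sets[0])
    for flags in flag_sets[1:]:
        result &= set(flags)
    return frozenset(result)


def propagate_from(final_op_flags):
    """Behavior described for this patch: copy the flags of the last fdiv."""
    return frozenset(final_op_flags)


# Hypothetical inputs: the inner fdiv carries fewer flags than the outer one.
inner_fdiv = {"arcp"}
outer_fdiv = {"reassoc", "arcp"}

old_fmul_flags = intersect_fmf(inner_fdiv, outer_fdiv)  # only "arcp" survives
new_fmul_flags = propagate_from(outer_fdiv)             # matches the final fdiv
```

With intersection, the intermediate fmul ends up stricter than the final fdiv it feeds, which is the situation the message says could cause FMF to be dropped by later folds.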
|
|
|
|
|
|
| |
Cleanup check-prefixes to share more AVX/AVX512 codegen checks
llvm-svn: 326097
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r322867, we introduced IsStandalone when printing MIR in -debug
output. The default behaviour for that was:
1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any
redundant information.
2) When -debug-printing a MF entirely, don't print any redundant
information.
3) When printing MIR, don't print any redundant information.
I'd like to change 2) to:
2) When -debug-printing a MF entirely, don't omit any redundant information.
Differential Revision: https://reviews.llvm.org/D43337
llvm-svn: 326094
|
|
|
|
|
|
| |
Fixes scary typo in a check that lost the end digit off a reg#...
llvm-svn: 326093
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With OS type AMDPAL, the scratch descriptor is hardwired to be loaded
from offset 0 of the global information table, whose low pointer is
passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as
the hardware reserves s0-s7.
Reviewers: kzhuravl
Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl
Differential Revision: https://reviews.llvm.org/D42203
llvm-svn: 326088
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All SIMD architectures can emulate masked load/store/gather/scatter
through an element-wise condition check, scalar load/store, and
insert/extract. Therefore, bailing out of vectorization with a legality
failure when the target legality queries return false is incorrect. We
should proceed to the cost model and determine profitability.
This patch addresses the vectorizer's architectural limitation
described above. As such, I tried to keep the cost model and
vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning
should be done separately.
Please see
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for
the RFC and the discussion.
Closes D43208.
Patch by: Hideki Saito <hideki.saito@intel.com>
llvm-svn: 326079
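The emulation this message refers to (element-wise condition check, scalar load/store, insert/extract) can be sketched as follows. The function names and the list-based memory model are assumptions made for illustration; this is not the vectorizer's code.

```python
def emulated_masked_load(memory, base, mask, passthru):
    """Emulate a masked vector load one lane at a time.

    For each lane: check the mask bit, do a scalar load if it is set, and
    insert the value into the result. Disabled lanes keep the passthru
    value and never touch memory.
    """
    result = list(passthru)
    for lane, enabled in enumerate(mask):
        if enabled:
            result[lane] = memory[base + lane]  # scalar load + insert
    return result


def emulated_masked_store(memory, base, mask, values):
    """Emulate a masked vector store: extract each enabled lane and store
    it with a scalar store; disabled lanes leave memory untouched."""
    for lane, enabled in enumerate(mask):
        if enabled:
            memory[base + lane] = values[lane]  # extract + scalar store
```

Because disabled lanes are skipped entirely, the emulation stays safe even when those lanes would address memory past the end of the buffer.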
|
|
|
|
| |
llvm-svn: 326078
|
|
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Robert Lytton
llvm-svn: 326069
|
|
|
|
|
|
| |
256-bit operations.
llvm-svn: 326068
|
|
|
|
|
|
| |
elements are.
llvm-svn: 326066
|
|
|
|
|
|
| |
Which types are considered 'simple' is a function of the requirements of all targets that LLVM supports. That shouldn't directly affect what types we are able to handle. The remainder of this code checks that the number of elements is a power of 2 and takes care of splitting down to a legal size.
llvm-svn: 326063
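The two steps the message mentions, checking for a power-of-2 element count and splitting down to a legal size, can be sketched like this. The names and the `max_legal_elts` parameter are hypothetical, and repeated halving is an assumed splitting strategy rather than a description of the actual code.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and the rest."""
    return n > 0 and (n & (n - 1)) == 0


def split_to_legal(num_elts, max_legal_elts):
    """Halve a vector repeatedly until every part fits the legal size.

    Returns the list of part sizes. Both arguments must be powers of two,
    so halving always lands exactly on the legal size.
    """
    if not is_power_of_two(num_elts):
        raise ValueError("element count must be a power of 2")
    if not is_power_of_two(max_legal_elts):
        raise ValueError("legal size must be a power of 2")
    parts = [num_elts]
    while parts[0] > max_legal_elts:
        # Replace every part with its two halves.
        parts = [p // 2 for p in parts for _ in (0, 1)]
    return parts
```

Nothing in the check depends on whether the original type is one of the predefined "simple" types, which is the point the message makes.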
|
|
|
|
| |
llvm-svn: 326050
|
|
|
|
| |
llvm-svn: 326049
|
|
|
|
|
|
| |
ADD/SUB ops
llvm-svn: 326044
|
|
|
|
|
|
| |
TRUNCATE ops
llvm-svn: 326043
|
|
|
|
| |
llvm-svn: 326042
|
|
|
|
|
|
| |
EVEX encoded instructions.
llvm-svn: 326041
|
|
|
|
|
|
|
|
| |
This reverts commit r325881.
Breaks many bots
llvm-svn: 326037
|
|
|
|
| |
llvm-svn: 326035
|
|
|
|
|
|
| |
Our UMIN/UMAX, vector truncation and shuffle combining are good enough to efficiently handle v8i64 with the number of leading zeros that are necessary for PSUBUS.
llvm-svn: 326034
|
|
|
|
|
|
| |
Now that UMIN etc are Legal/Custom for SSE2+, we can efficiently match SUBUS v8i32 cases from SSSE3 which can perform efficient truncation with PSHUFB.
llvm-svn: 326033
|
|
|
|
|
|
| |
sizes with SplitBinaryOpsAndApply
llvm-svn: 326030
|
|
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: James Y Knight
llvm-svn: 326028
|
|
|
|
|
|
|
|
|
|
| |
V_SUBBREV_U32 is the commuted opcode for V_SUBB_U32. However, when
we commute V_SUBB_U32 in order to shrink it, we do not then
process V_SUBBREV_U32, and it stays VOP3. This patch fixes that.
Differential Revision: https://reviews.llvm.org/D43699
llvm-svn: 326011
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch implements the name lookup functionality of the .debug_names
accelerator table and hooks it up to "llvm-dwarfdump -find". To make the
interface of the two kinds of accelerator tables more consistent, I've
created an abstract "DWARFAcceleratorTable::Entry" class, which provides
a consistent interface to access the common functionality of the table
entries (such as getting the die offset, die tag, etc.). I've also
modified the apple table to vend entries conforming to this interface.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: vleschuk, clayborg, echristo, llvm-commits
Differential Revision: https://reviews.llvm.org/D43067
llvm-svn: 326003
|
|
|
|
|
|
|
|
| |
masked move patterns.
This portion can be matched by other patterns. We don't need it to make the larger pattern valid. It's sufficient to have a v1i1 mask input without caring where it came from.
llvm-svn: 325999
|
|
|
|
| |
llvm-svn: 325995
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch tests disassembler output for load/store instructions when
-mattr=+alu32 is specified, in which case we want to use the "w" register format.
It also extends the existing insn-unit.s and insn-unit-32.s to
make sure the disassembly of all other instructions is not affected.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325993
|
|
|
|
|
|
|
|
|
|
| |
This patch adds some unit tests for 32-bit subregister support.
We want to make sure ALU32, subregister load/store and the new peephole
optimization are truly enabled once -mattr=+alu32 is specified.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325992
|
|
|
|
|
|
| |
Add files which I missed in the original check-in
llvm-svn: 325973
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The instruction sequence used to get the address of the PC into a GPR requires
that we clobber the link register. Doing so without having first saved it in
the prologue leaves the function unable to return. Currently, this sequence is
emitted into the entry block. To ensure the prologue is inserted before this
sequence, disable shrink-wrapping.
This fixes PR33547.
Differential Revision: https://reviews.llvm.org/D43677
llvm-svn: 325972
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.
Add a new optional attribute named source to the !DIFile IR metadata; it
contains the source text. Use this to output the source to the DWARF line table
of code objects. Analogously, extend METADATA_FILE in Bitcode and the .file
directive in ASM to support the optional source.
Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.
Differential Revision: https://reviews.llvm.org/D42765
llvm-svn: 325970
|
|
|
|
|
|
| |
This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.
llvm-svn: 325965
|