| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Packing the flags into one bitcode word will save effort in
adding new flags in the future.
Differential Revision: https://reviews.llvm.org/D54755
llvm-svn: 347806
|
|
|
|
|
|
|
|
|
|
|
|
| |
AVX512 is enabled, but 512-bit vectors aren't legal.
Unlike most cost model functions this code makes a lot of table lookups without using the results from getTypeLegalizationCost. This means 512-bit vectors can be looked up even when the type isn't legal.
This patch adds a check around the two tables that contain 512-bit types to make sure that neither of the types would be split by type legalization. Meaning 512 bit types are illegal. I wanted to write this in a somewhat generic way that uses type legalization query hooks. But if prefered, I can switch to just using is512BitVector and the subtarget feature.
Differential Revision: https://reviews.llvm.org/D54984
llvm-svn: 347786
|
|
|
|
|
|
|
|
|
|
| |
This fixes some of scalarization costs reported for sext/zext using avx512bw. This does not fix all scalarization costs being reported. Just the worst.
I've restricted this only to combinations of types that are legal with avx512bw like v32i1/v64i1/v32i16/v64i8 and conversions between vXi1 and vXi8/vXi16 with legal vXi8/vXi16 result types.
Differential Revision: https://reviews.llvm.org/D54979
llvm-svn: 347785
|
|
|
|
|
|
|
|
| |
Expansion of SIGN_EXTEND_INREG can create a VSRAI instruction. If there is already a VSRAI after it, we should combine them into a larger VSRAI
Differential Revision: https://reviews.llvm.org/D54959
llvm-svn: 347784
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In PR39807 we incorrectly handle circumstances where calls are common'd
from conditional blocks into the parent BB. Calls that can be inlined
must always have DebugLocs, however we strip them during commoning, which
the IR verifier asserts on.
Fix this by using applyMergedLocation: it will perform the same DebugLoc
stripping of conditional Locs, but will also generate an unknown location
DebugLoc that satisfies the requirement for inlinable calls to always have
locations.
Some of the prior logic for selecting a DebugLoc is now likely redundant;
I'll generate a follow-up to remove it (involves editing more regression
tests).
Differential Revision: https://reviews.llvm.org/D54997
llvm-svn: 347782
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D54949
llvm-svn: 347778
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit caused failures because it failed to correctly handle cases where
we hoist a phi, then hoist a use of that phi, then have to rehoist that use. We
need to make sure that we rehoist the use to _after_ the hoisted phi, which we
do by always rehoisting to the immediate dominator instead of just rehoisting
everything to the original preheader.
An option is also added to control whether control flow is hoisted, which is
off in this commit but will be turned on in a subsequent commit.
Differential Revision: https://reviews.llvm.org/D52827
llvm-svn: 347776
|
|
|
|
|
|
|
|
|
| |
This adds support in the RISCVAsmParser the storing of Subtarget feature bits to a stack so that they can be pushed/popped to enable/disable multiple features at once.
Differential Revision: https://reviews.llvm.org/D46424
Patch by Lewis Revill.
llvm-svn: 347774
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Combine
sat(sat(X + C1) + C2) -> sat(X + (C1+C2))
and
sat(sat(X - C1) - C2) -> sat(X - (C1+C2))
if the sign of C1 and C2 matches.
In the unsigned case we can compute C1+C2 with saturating arithmetic,
and InstSimplify will reduce this just to the saturation value. For
the signed case, we cannot perform the simplification if the result
of the addition overflows.
This change is part of https://reviews.llvm.org/D54534.
llvm-svn: 347773
|
|
|
|
|
|
|
|
|
| |
Canonicalize ssub.sat(X, C) to ssub.sat(X, -C) if C is constant and
not signed minimum. This will help further optimizations to apply.
This change is part of https://reviews.llvm.org/D54534.
llvm-svn: 347772
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always-overflow was already determined for unsigned addition, but
not subtraction. This patch establishes parity.
This allows us to perform some additional simplifications for
signed saturating subtractions.
This change is part of https://reviews.llvm.org/D54534.
llvm-svn: 347771
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If ValueTracking can determine that the add/sub can newer overflow,
replace it with the corresponding nuw/nsw add/sub.
Additionally, for the unsigned case, if ValueTracking determines
that the add/sub always overflows, replace the result with the
saturation value.
This change is part of https://reviews.llvm.org/D54534.
llvm-svn: 347770
|
|
|
|
|
|
|
|
|
| |
If a saturating add intrinsic has one constant argument, make sure
it is on the RHS. This will simplify further transformations.
This change is part of https://reviews.llvm.org/D54534.
llvm-svn: 347769
|
|
|
|
| |
llvm-svn: 347768
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a NFC as we do not import non-odr vague linkage when computing
for import list for a module.
Reviewers: tejohnson, pcc
Subscribers: inglorion, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D54928
llvm-svn: 347763
|
|
|
|
|
|
| |
in ARMTargetParser.cpp.
llvm-svn: 347762
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the original reduction root instruction was vectorized, it might be
removed from the tree. It means that the insertion point may become
invalidated and the whole vectorization of the reduction leads to the
incorrect output result.
The ReductionRoot instruction must be marked as externally used so it
could not be removed. Otherwise it might cause inconsistency with the
cost model and we may end up with too optimistic optimization.
Reviewers: RKSimon, spatel, hfinkel, mkuper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54955
llvm-svn: 347759
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this patch, the following stores in `merge_fail` would fail to be
merged, while they would get merged in `merge_ok`:
```
void use(unsigned long long *);
void merge_fail(unsigned key, unsigned index)
{
unsigned long long args[8];
args[0] = key;
args[1] = index;
use(args);
}
void merge_ok(unsigned long long *dst, unsigned a, unsigned b)
{
dst[0] = a;
dst[1] = b;
}
```
The reason is that `getMemOpBaseImmOfs` would return false for FI base
operands.
This adds support for this.
Differential Revision: https://reviews.llvm.org/D54847
llvm-svn: 347747
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, instructions doing memory accesses through a base operand that is
not a register can not be analyzed using `TII::getMemOpBaseRegImmOfs`.
This means that functions such as `TII::shouldClusterMemOps` will bail
out on instructions using an FI as a base instead of a register.
The goal of this patch is to refactor all this to return a base
operand instead of a base register.
Then in a separate patch, I will add FI support to the mem op clustering
in the MachineScheduler.
Differential Revision: https://reviews.llvm.org/D54846
llvm-svn: 347746
|
|
|
|
|
|
|
|
|
| |
This reverts r294500. DwarfCompileUnit::addAddressExpr uses DIEExpr
for PCOffset. In that case the expression is unrelated to thread locals
and so emitting a value of the DIEExpr does not have to always mean
emit-debug-thread-local.
llvm-svn: 347744
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
separate files to enable future changes.
This moves ARM and AArch64 target parsing into their
own files. They are still accessible through
TargetParser.h as before.
Several functions in AArch64 which were just forwarders to ARM
have been removed. All except AArch64::getFPUName were unused,
and that was only used in a test. Which itself was overlapping
one in ARM, so it has also been removed.
Differential revision: https://reviews.llvm.org/D53980
llvm-svn: 347741
|
|
|
|
|
|
|
|
|
|
|
|
| |
CGF/CLGF compares an i64 register with a sign/zero extended loaded i32 value
in memory.
This patch makes such a load considered foldable and so gets a 0 cost.
Review: Ulrich Weigand
https://reviews.llvm.org/D54944
llvm-svn: 347735
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AH, SH and MH costs are already covered in the cases where LHS is 32 bits and
RHS is 16 bits of memory sign-extended to i32.
As these instructions are also used when LHS is i16, this patch recognizes
that the loads will get folded then as well.
Review: Ulrich Weigand
https://reviews.llvm.org/D54940
llvm-svn: 347734
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Single instructions exist for i8 and i16 comparisons of memory against a
small immediate.
This patch makes sure that if the load in these cases has a single user (the
ICmp), it gets a 0 cost (folded), and also that the ICmp gets a cost of 1.
Review: Ulrich Weigand
https://reviews.llvm.org/D54897
llvm-svn: 347733
|
|
|
|
|
|
|
|
|
|
| |
Since byte-swapping loads and stores are supported, a 'load -> bswap' or
'bswap -> store' sequence should have the cost of one.
Review: Ulrich Weigand
https://reviews.llvm.org/D54870
llvm-svn: 347732
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Ignore advices where the memory operand of the 'anchor' instruction
uses unsupported register types.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54983
llvm-svn: 347724
|
|
|
|
|
|
| |
Make the names for the macros for `TargetInstrInfo` uniform.
llvm-svn: 347706
|
|
|
|
| |
llvm-svn: 347690
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.
Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.
With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).
Some stats on symbol sizes for the curious:
PDB size: 507M
sym bytes: 316,508,016
sym count: 18,954,971
sym byte avg: 16.7
As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.
Reviewers: zturner, aganea, ruiu
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D54554
llvm-svn: 347687
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D54926
llvm-svn: 347686
|
|
|
|
|
|
|
|
| |
We're already mixing this APInt with other 'unsigned' variables. This allows us to use regular comparison operators instead of needing to use APInt::ult or APInt::uge. And it removes a later conversion from APInt to unsigned.
I might be adding another combine to this function and this will probably simplify the logic required for that.
llvm-svn: 347684
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InlineCost also treats them as free and the current implementation
can cause assertion failures if PHI nodes are moved outside the region
from entry BBs to the region.
It also updates the code to use the instructionsWithoutDebug iterator.
Reviewers: davidxl, davide, vsk, graham-yiu-huawei
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D54748
llvm-svn: 347683
|
|
|
|
|
|
|
|
|
|
| |
This is skylake-avx512 with the addition of avx512vnni ISA.
Patch by Jianping Chen
Differential Revision: https://reviews.llvm.org/D54785
llvm-svn: 347681
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This (very specialized) function was added to enable an LLDB use case.
Now that a more generic interface (overriding of parser functions -
D52992) is available, and LLDB has been converted to use that (D54074),
the function is unused and can be removed.
Reviewers: erik.pilkington, sgraenitz, rsmith
Subscribers: mgorny, hiraditya, christof, libcxx-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D54893
llvm-svn: 347670
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D54358
llvm-svn: 347659
|
|
|
|
|
|
|
|
| |
I tried to change this, not quite realising the logic behind what we
were doing. Hopefully this comment will help the next person to come
along.
llvm-svn: 347653
|
|
|
|
|
|
| |
nodes.
llvm-svn: 347642
|
|
|
|
| |
llvm-svn: 347641
|
|
|
|
|
|
|
|
|
|
|
|
| |
It fixes a bug that doesn't update Phi inputs of the only live successor that
is in the list of block's successors more than once.
Thanks @uabelho for finding this.
Differential Revision: https://reviews.llvm.org/D54849
Reviewed By: anna
llvm-svn: 347640
|
|
|
|
|
|
|
|
|
|
| |
store on pre-AVX512 targets.
If we fold the bitcast into the store we'll end up creating a truncating store to vXi1 that will get scalarized. Instead allow the bitcast to be turned into a movmsk.
We probably need to do something if the store itself is a vXi1 type, but I'll leave that til a testcase appears.
llvm-svn: 347632
|
|
|
|
| |
llvm-svn: 347626
|
|
|
|
| |
llvm-svn: 347625
|
|
|
|
| |
llvm-svn: 347624
|
|
|
|
|
|
| |
Comment out an assertion from D54543 which failed with error: no member named 'Range' in '(anonymous namespace)::PassAsArgInfo'.
llvm-svn: 347616
|
|
|
|
|
|
|
|
| |
a prologue.
More context here: https://go-review.googlesource.com/c/go/+/148819/
llvm-svn: 347614
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
IPA is implemented as module pass which produce map from Function or Alias to
StackSafetyInfo for a single function.
From prototype by Evgenii Stepanov and Vlad Tsyrklevich.
Reviewers: eugenis, vlad.tsyrklevich, pcc, glider
Subscribers: hiraditya, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D54543
llvm-svn: 347611
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: eugenis, vlad.tsyrklevich
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D54541
llvm-svn: 347610
|
|
|
|
|
|
| |
Patch by Arthur O'Dwyer!
llvm-svn: 347609
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fixes test failures introduced by rL347596.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54916
llvm-svn: 347607
|
|
|
|
| |
llvm-svn: 347606
|