| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 318875
|
|
|
|
|
|
| |
Had to tweak the setcc's used by the code to use a vXi1 result type with a sign extend back to vector size.
llvm-svn: 318871
|
|
|
|
|
|
| |
match what is done for VPOPCNTDQ in the AVX512F block. NFC
llvm-svn: 318870
|
|
|
|
|
|
| |
most of the rest of the AVX2 legalization lives.
llvm-svn: 318869
|
|
|
|
|
|
|
|
|
|
| |
We already allowed keep+discard. It is important to be able to discard
a temporary if a rename fail. It is also convenient as it allows the
use of RAII for discarding.
Allow discarding twice for similar reasons.
llvm-svn: 318867
|
|
|
|
|
|
|
|
|
|
|
| |
The default limit is 1000000 but it can be configured with a cache
policy. The motivation is that some filesystems (notably ext4) have
a limit on the number of files that can be contained in a directory
(separate from the inode limit).
Differential Revision: https://reviews.llvm.org/D40327
llvm-svn: 318857
|
|
|
|
| |
llvm-svn: 318856
|
|
|
|
| |
llvm-svn: 318855
|
|
|
|
|
|
|
|
|
|
|
| |
SITargetLowering::LowerCall uses dummy pointer info for byval argument, which causes
flat load instead of buffer load.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D40040
llvm-svn: 318844
|
|
|
|
|
|
|
|
| |
without any compile units.
Differential Revision: https://reviews.llvm.org/D40114
llvm-svn: 318842
|
|
|
|
|
|
|
|
|
|
|
| |
As a side effect, the .debug_line section will be dumped in physical
order, rather than in the order that compile units refer to their
associated portions of the .debug_line section. These are probably
always the same order anyway, and no tests noticed the difference.
Differential Revision: https://reviews.llvm.org/D39854
llvm-svn: 318839
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D40200
llvm-svn: 318838
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SILoadStoreOptimizer
Summary:
This bug seems to have gone unnoticed because critical cases with LDS
instructions are eliminated by the peephole optimizer.
However, equivalent situations arise with buffer loads and stores
as well, so this fixes regressions since r317751 ("AMDGPU: Merge
S_BUFFER_LOAD_DWORD_IMM into x2, x4").
Fixes at least:
KHR-GL45.shader_storage_buffer_object.basic-operations-case1-cs
KHR-GL45.cull_distance.functional
piglit tes-input-gl_ClipDistance.shader_test
... and probably more
Change-Id: I0e371536288eb8e6afeaa241a185266fd45d129d
Reviewers: arsenm, mareko, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D40303
llvm-svn: 318829
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since i1 is a legal type, this:
NumBytes = Op1->getMemoryVT().getSizeInBits() >> 3;
is wrong and should be instead
NumBytes = Op0->getMemoryVT().getStoreSize();
There seems to be more places where this should be fixed outside DAGCombiner.
Review: Hal Finkel
https://bugs.llvm.org/show_bug.cgi?id=35366
llvm-svn: 318824
|
|
|
|
|
|
|
|
| |
This makes the fact that X86 needs an explicit mask output not part of the type constraint for the ISD::MSCATTER.
This also gives the X86ISD::MGATHER/MSCATTER nodes a common base class simplifying the address selection code in X86ISelDAGToDAG.cpp
llvm-svn: 318823
|
|
|
|
|
|
|
|
| |
Now we consistently represent the mask result without relying on isel ignoring it.
We now have a more general SDNode and type constraints to represent these nodes in isel patterns. This allows us to present both both vXi1 and XMM/YMM mask types with a single set of constraints.
llvm-svn: 318821
|
|
|
|
|
|
|
|
| |
than result 0.
I plan to use this to check the type of the mask result of masked gathers in the X86 backend.
llvm-svn: 318820
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively.
When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if
`L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are
consecutive sibling loops within the parent loop. In this case, `AR2` is also varying
w.r.t. `L1`, but we don't correctly identify it.
It can lead, for exaple, to attempt of incorrect folding. Consider:
AR1 = {a,+,b}<L1>
AR2 = {c,+,d}<L2>
EXAR2 = sext(AR1)
MUL = mul AR1, EXAR2
If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to
construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect
because `AR2` is not available on entrance of `L1`.
Both situations "`L1` contains `L2`" and "`L1` preceeds sibling loop `L2`" can be handled
with one check: "header of `L1` dominates header of `L2`". This patch replaces the old
insufficient check with this one.
Differential Revision: https://reviews.llvm.org/D39453
llvm-svn: 318819
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After the dataflow algorithm proves that an argument is constant,
it replaces it value with the integer constant and drops the lattice
value associated to the DEF.
e.g. in the example we have @f() that's called twice:
call @f(undef, ...)
call @f(2, ...)
`undef` MEET 2 = 2 so we replace the argument and all its uses with
the constant 2.
Shortly after, tryToReplaceWithConstantRange() tries to get the lattice
value for the argument we just replaced, causing an assertion.
This function is a little peculiar as it runs when we're doing replacement
and not as part of the solver but still queries the solver.
The fix is that of checking whether we replaced the value already and
get a temporary lattice value for the constant.
Thanks to Zhendong Su for the report!
Fixes PR35357.
llvm-svn: 318817
|
|
|
|
|
|
|
|
| |
shared by Host.cpp to a .def file and TargetParser.h so clang can make use of it.
Since we keep Host.cpp and compiler-rt relatively in sync, clang can use this information as a proxy.
llvm-svn: 318814
|
|
|
|
| |
llvm-svn: 318807
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the representation of COFF comdats so that a COFF linker
is able to accurately resolve comdats between IR and native object
files. Specifically, apply name mangling to comdat names consistently
with native object files, and do not export comdats with an internal
leader because they do not affect symbol resolution.
Differential Revision: https://reviews.llvm.org/D40278
llvm-svn: 318805
|
|
|
|
|
|
|
| |
Since EH_LABELs (and other labels) no longer have "side-effects", they
should be checked for separately.
llvm-svn: 318801
|
|
|
|
|
|
|
|
| |
folding.
The commuting patterns for the AVX version actually still had priority over the new patterns.
llvm-svn: 318800
|
|
|
|
|
|
|
|
| |
to icelake CPU.
This is based on table 1-1 of the October 2017 revision of Intel® Architecture Instruction Set Extensions and Future Features Programming Reference
llvm-svn: 318799
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Segment moves to memory are always 16-bit. Remove invalid 32 and 64
bit variants.
Recommiting with missing clang inline assembly test change.
Fixes PR34478.
Reviewers: rnk, craig.topper
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D39847
llvm-svn: 318797
|
|
|
|
| |
llvm-svn: 318792
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This partially reverts r298851. The the underlying issue is that we don't
currently model the dependency between mrs (read system register) and
msr (write system register) instructions.
Something like the below should never be reordered:
msr TPIDR_EL0, x0 ;; set thread pointer
mrs x8, TPIDR_EL0 ;; read thread pointer
but was being reordered after r298851. The functional part of the patch
that wasn't reverted needed to remain in place in order to not break
r299462.
PR35317
llvm-svn: 318788
|
|
|
|
| |
llvm-svn: 318787
|
|
|
|
| |
llvm-svn: 318786
|
|
|
|
|
|
| |
It works just like __cyg_profile_func_enter but takes no arguments.
llvm-svn: 318783
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are pre-UAL syntax, and we don't support any other pre-UAL instructions,
with the exception of FLDMX/FSTMX, which don't have a UAL equivalent. Therefore
there's no reason to keep them or their AsmParser hacks around.
With the AsmParser hacks removed, the FLDMX and FSTMX instructions get the same
operand diagnostics as the UAL instructions.
Differential revision: https://reviews.llvm.org/D39196
llvm-svn: 318777
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
First step in adding MemorySSA as dependency for loop pass manager.
Adding the dependency under a flag.
New pass manager: MSSA pointer in LoopStandardAnalysisResults can be null.
Legacy and new pass manager: Use cl::opt EnableMSSALoopDependency. Disabled by default.
Reviewers: sanjoy, davide, gberry
Subscribers: mehdi_amini, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D40274
llvm-svn: 318772
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was causing the (invalid) predicated versions of the NEON VRINTX and
VRINTZ instructions to be accepted, with the condition code being ignored.
Also, there is no NEON VRINTR instruction, so that part of the check was not
necessary.
Differential revision: https://reviews.llvm.org/D39193
llvm-svn: 318771
|
|
|
|
|
|
|
|
|
|
| |
- We can still emit this error if the actual instruction has two or more
operands missing compared to the expected one.
- We should only emit this error once per instruction.
Differential revision: https://reviews.llvm.org/D36746
llvm-svn: 318770
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D39195
llvm-svn: 318766
|
|
|
|
|
|
|
| |
Almost too trivial to worry about, but it seems worth having consistency with
upcoming commits.
llvm-svn: 318760
|
|
|
|
|
|
| |
All match equivalent basic classes (WritePHAdd, WriteFAdd etc.) according to both the AMD 15h SOG and Agner's tables.
llvm-svn: 318758
|
|
|
|
|
|
|
|
|
|
| |
As pointed out in post-commit review of r318738, `return ReplaceNode(..)` when
both ReplaceNode and the current function return void is confusing. This patch
moves to using a more obvious early return, and moves to just using an if to
catch the one case we currently care about. A future patch that adds further
custom instruction selection can introduce a switch.
llvm-svn: 318757
|
|
|
|
|
|
|
|
| |
Reviewers: mstorsjo
Differential Revision: https://reviews.llvm.org/D40286
llvm-svn: 318756
|
|
|
|
|
|
| |
It's on all other LWP instruction but I missed it from lwpins, despite similar scheduling behaviour.
llvm-svn: 318751
|
|
|
|
|
|
|
|
| |
This patch fixes instregex for interger vector add/sub instructions
Differential revision: https://reviews.llvm.org/D40254
llvm-svn: 318749
|
|
|
|
|
|
|
| |
vpopcnt{b,w}
Differential Revision: https://reviews.llvm.org/D40213
llvm-svn: 318748
|
|
|
|
|
|
|
|
|
| |
Introducing Vector Neural Network Instructions, consisting of:
vpdpbusd{s}
vpdpwssd{s}
Differential Revision: https://reviews.llvm.org/D40208
llvm-svn: 318746
|
|
|
|
|
|
|
|
|
|
|
| |
introducing vbmi2, consisting of
vpcompress{b,w}
vpexpand{b,w}
vpsh{l,r}d{w,d,q}
vpsh{l,r}dv{w,d,q}
Differential Revision: https://reviews.llvm.org/D40206
llvm-svn: 318745
|
|
|
|
|
|
|
| |
properlyDominates() shouldn't be used as sort key. It causes different output between stdlibc++ and libc++.
Instead, I introduced RPOT. In most cases, it works for CSE.
llvm-svn: 318743
|
|
|
|
|
|
|
| |
an icelake promotion of pclmulqdq
Differential Revision: https://reviews.llvm.org/D40101
llvm-svn: 318741
|
|
|
|
|
|
|
| |
an icelake promotion of AES
Differential Revision: https://reviews.llvm.org/D40078
llvm-svn: 318740
|
|
|
|
|
|
|
|
|
|
|
|
| |
The obvious approach of defining a pattern like the one below actually doesn't
work:
`def : Pat<(i32 0), (i32 X0)>;`
As was noted when Lanai made this change (https://reviews.llvm.org/rL288215),
attempting to handle the constant 0 in tablegen leads to assertions due to a
physical register being used where a virtual register is expected.
llvm-svn: 318738
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous patches primarily ensured that codegen was possible for the standard
RISC-V instructions. However, there are a number of IR inputs that wouldn't be
appropriately lowered. This patch both adds test cases and supports lowering
for a number of these cases:
* Improved sext/zext/trunc support
* Support for setcc variants that don't map directly to RISC-V instructions
* Lowering mul, and hence support for external symbols
* addc, adde, subc, sube
* mulhs, srem, mulhu, urem, udiv, sdiv
* {srl,sra,shl}_parts
* brind
* br_jt
* bswap, ctlz, cttz, ctpop
* rotl, rotr
* BlockAddress operands
Differential Revision: https://reviews.llvm.org/D29938
llvm-svn: 318737
|