| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
- Emit NT_AMD_AMDGPU_ISA
- Add assembler parsing for isa version directive
- If isa version directive does not match command line arguments, then return error
Differential Revision: https://reviews.llvm.org/D38748
llvm-svn: 315808
|
|
|
|
|
|
|
|
| |
If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into the shuffle.
Fixes the last issue with PR22415
llvm-svn: 315807
|
|
|
|
| |
llvm-svn: 315802
|
|
|
|
| |
llvm-svn: 315801
|
|
|
|
| |
llvm-svn: 315800
|
|
|
|
|
|
| |
These select the same instruction as the non-bitcasted pattern. So this provides no additional value.
llvm-svn: 315799
|
|
|
|
|
|
|
|
| |
extended VCVTPD2UDQZ128rr and VCVTTPD2UDQZ128rr.
We don't need a bitconvert as a root pattern in these cases. The types in the other parts of the pattern are sufficient to express the behavior of these instructions.
llvm-svn: 315798
|
|
|
|
|
|
|
|
|
|
| |
VCVTUDQ2PD.
This matches the patterns we have for the SSE/AVX version.
This is a prerequisite for D38714.
llvm-svn: 315797
|
|
|
|
|
|
| |
tables.
llvm-svn: 315796
|
|
|
|
|
|
|
|
| |
folding tables.
I believe these were added incorrectly under the belief that the load size was smaller than the input register size, but that's not true.
llvm-svn: 315795
|
|
|
|
|
|
|
|
| |
load folding without the peephole pass.
This pattern is already used in the AVX512VL version of these instructions, though the AVX512VL version is missing other patterns.
llvm-svn: 315794
|
|
|
|
|
|
|
| |
This reverts r315697 and my ill-fated attempts to fix it on Windows.
I'll try again when I get access to a Windows machine.
llvm-svn: 315793
|
|
|
|
|
|
|
|
|
|
| |
"No such file or directory: C:\\...\\tests\\Output\\shared-output.py.tmp/Output/Shared/SHARED.tmp"
And yet other forward-slashes don't seem to be causing the same
problem. I'll see if I can get ahold of a Windows machine to poke at
this directly later.
llvm-svn: 315792
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the LLVM assembler emits a parsing error for valid IR assembly such as
  alloca i32, i32 9, addrspace(5)
when the alloca address space is 5.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D38713
llvm-svn: 315791
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch removes the `verifyNCD` check.
The reason for this is that the other checks are sufficient to prove or disprove correctness of any DominatorTree, and that `verifyNCD` doesn't provide (in my opinion) better error messages than the other ones.
Additionally, this should give a (small) improvement to the total verification time, as the check is O(n), and checking the sibling property takes O(n^3).
Reviewers: dberlin, grosser, davide, brzycki
Reviewed By: brzycki
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38802
llvm-svn: 315790
|
|
|
|
|
|
|
|
|
|
| |
There were two copies of the logic needed to construct a line stats
object for each line in a range: this patch brings it down to one. In
the future, this will make it easier for IDE clients to display coverage
in-line in source editors. To do that, we just need to move the new
LineCoverageIterator class to libCoverage.
llvm-svn: 315789
|
|
|
|
|
|
|
|
| |
(corrected OtherInsnID->OtherOpIdx).
The tests were passing by luck since the instruction ID and operand index happened to be the same.
llvm-svn: 315788
|
|
|
|
|
|
| |
Two debugging statements snuck into the commit.
llvm-svn: 315783
|
|
|
|
|
|
|
| |
I don't have access to a Windows machine at the moment, so if this
doesn't fix it I'll just revert for now.
llvm-svn: 315782
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to rely on the generic implementation to get the mappings for
COPYs. The generic implementation relies on table lookups and
dynamically allocated objects to get the valid mappings.
Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.
Improve the compile time of RegBankSelect by up to 20%.
Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.
NFC.
llvm-svn: 315781
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
wrong-code bug this revealed.
Summary:
Operand variable lookups are now performed by the RuleMatcher rather than
searching the whole matcher hierarchy for a match. This revealed a wrong-code
bug that currently affects ARM and X86 where patterns that use a variable more
than once in the match pattern will be imported but won't check that the
operands are identical. This can cause the tablegen-erated matcher to
accept matches that should be rejected.
Depends on D36569
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: aemerson, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36618
llvm-svn: 315780
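The wrong-code bug described above can be illustrated with a minimal sketch. This is not the TableGen-generated matcher; `Inst`, `OpAdd`, and both matcher functions are hypothetical names invented for illustration. It assumes a pattern of the form (add $x, $x), where the variable $x appears twice, and contrasts a matcher that only checks the opcode with one that also checks that both operands are identical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction: an opcode plus virtual register operand ids.
struct Inst {
  int Opcode;
  std::vector<int> Ops;
};

constexpr int OpAdd = 1; // hypothetical opcode value

// Buggy import of (add $x, $x): the repeated variable is never compared,
// so any add is accepted.
bool matchAddSameBuggy(const Inst &I) {
  if (I.Opcode != OpAdd)
    return false;
  assert(I.Ops.size() == 2 && "add takes two operands");
  return true;
}

// Fixed import: both uses of $x must refer to the same operand.
bool matchAddSameFixed(const Inst &I) {
  if (I.Opcode != OpAdd)
    return false;
  assert(I.Ops.size() == 2 && "add takes two operands");
  return I.Ops[0] == I.Ops[1];
}
```

With the buggy matcher, an add of two different registers would be selected as if it were an add of a register to itself, which is the kind of match that should be rejected.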
|
|
|
|
|
|
|
| |
I didn't think about '%{inputs}' having the same problem. This one
should be a full Windows path name.
llvm-svn: 315779
|
|
|
|
| |
llvm-svn: 315773
|
|
|
|
|
|
| |
Broke some builds (using libstdc++).
llvm-svn: 315769
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
available
This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations.
For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP.
We may be able to use this for AVX1 as well which would allow us to remove more isel patterns.
I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similarly to X86ISD::MOVDDUP.
Differential Revision: https://reviews.llvm.org/D38836
llvm-svn: 315768
|
|
|
|
|
|
| |
from folding movddup as a broadcast load.
llvm-svn: 315767
|
|
|
|
|
|
| |
machines.
llvm-svn: 315765
|
|
|
|
| |
llvm-svn: 315763
|
|
|
|
| |
llvm-svn: 315762
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
|
|
|
|
|
|
| |
warnings; other minor fixes (NFC).
llvm-svn: 315760
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TargetRegisterInfo::getMinimalPhysRegClass is actually pretty expensive
because it has to iterate over all the register classes.
Cache this information as we compute it so that we limit how often it is called.
Right now, we heavily rely on it, because this is how we get the mapping
for vregs defined by copies from physregs (i.e., the ones that are
ABI-related).
Improve compile time by up to 10% for that pass.
NFC
llvm-svn: 315759
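The caching idea in this commit can be sketched as plain memoization. This is a minimal sketch, not the code from the patch: `RegClassCache`, `slowLookup`, and the integer class ids are hypothetical stand-ins, and the modulo in `slowLookup` merely simulates an expensive scan over all register classes. Treating register 0 as a "no register" sentinel is also an assumption of the sketch.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical cache in front of an expensive per-register query.
class RegClassCache {
public:
  // Returns the (hypothetical) class id for PhysReg, computing it at most once.
  int get(unsigned PhysReg) {
    assert(PhysReg != 0 && "0 is used here as the 'no register' sentinel");
    auto It = Cache.find(PhysReg);
    if (It != Cache.end())
      return It->second; // cache hit: no scan needed
    int RC = slowLookup(PhysReg);
    Cache.emplace(PhysReg, RC);
    return RC;
  }

  // Number of times the expensive path actually ran.
  unsigned slowLookups() const { return NumSlowLookups; }

private:
  // Stand-in for iterating over every register class.
  int slowLookup(unsigned PhysReg) {
    ++NumSlowLookups;
    return static_cast<int>(PhysReg % 4);
  }

  std::unordered_map<unsigned, int> Cache;
  unsigned NumSlowLookups = 0;
};
```

Repeated queries for the same physical register then cost one hash lookup instead of a full scan, which is where a saving of the kind the commit reports would come from.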
|
|
|
|
|
|
| |
NFC
llvm-svn: 315758
|
|
|
|
|
|
|
|
| |
Match the LLVM coding standard for loop conditions.
NFC.
llvm-svn: 315757
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this patch we used to create SetVectors in temporaries that
were created and destroyed for each instruction. Now we instead create
and destroy them only once, but clear their contents for each
instruction.
This speeds up the pass by ~25%.
NFC.
llvm-svn: 315756
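The change described above follows a common pattern: hoist a scratch container out of the loop and clear it on each iteration. The sketch below illustrates that pattern with `std::vector` rather than LLVM's SetVector; `Instr` and both function names are hypothetical. It assumes, as is the case for `std::vector`, that `clear()` keeps the allocated capacity so later iterations can avoid reallocating.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for an instruction: a list of register ids.
using Instr = std::vector<int>;

// Before: a fresh scratch container is built and destroyed per instruction.
std::size_t countUniqueFresh(const std::vector<Instr> &Instrs) {
  std::size_t Total = 0;
  for (const Instr &I : Instrs) {
    std::vector<int> Scratch(I.begin(), I.end());
    std::sort(Scratch.begin(), Scratch.end());
    auto End = std::unique(Scratch.begin(), Scratch.end());
    Total += static_cast<std::size_t>(std::distance(Scratch.begin(), End));
  }
  return Total;
}

// After: one scratch container lives across the loop and is cleared each time.
std::size_t countUniqueReused(const std::vector<Instr> &Instrs) {
  std::size_t Total = 0;
  std::vector<int> Scratch;
  for (const Instr &I : Instrs) {
    Scratch.clear(); // drops the elements, keeps the capacity
    Scratch.insert(Scratch.end(), I.begin(), I.end());
    assert(Scratch.size() == I.size() && "stale entries left in scratch");
    std::sort(Scratch.begin(), Scratch.end());
    auto End = std::unique(Scratch.begin(), Scratch.end());
    Total += static_cast<std::size_t>(std::distance(Scratch.begin(), End));
  }
  return Total;
}
```

Both functions compute the same result; the second only changes where the container's lifetime begins and ends, which matches the NFC note in the commit.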
|
|
|
|
| |
llvm-svn: 315754
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
not assumed not to alias.
It is possible for both a base and a derived class to be satisfied
with a unique vtable. If a program contains casts of the same pointer
to both of those types, the CFI checks will be lowered to this
(with ThinLTO):
  if (p != &__typeid_base_global_addr)
    trap();
  if (p != &__typeid_derived_global_addr)
    trap();
The optimizer may then use the first condition combined
with the assumption that __typeid_base_global_addr and
__typeid_derived_global_addr may not alias to optimize away the second
comparison, resulting in an unconditional trap.
This patch fixes the bug by giving imported globals the type [0 x i8]*,
which prevents the optimizer from assuming that they do not alias.
Differential Revision: https://reviews.llvm.org/D38873
llvm-svn: 315753
|
|
|
|
|
|
| |
These would fail if the created variable names changed.
llvm-svn: 315752
|
|
|
|
|
|
| |
No functionality change intended.
llvm-svn: 315749
|
|
|
|
|
|
|
| |
When selecting modifiers for mad_mix instructions,
look at fneg/fabs that occur before the conversion.
llvm-svn: 315748
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
use them.
Summary:
The purpose of this patch is to expose more information about ImmLeaf-like
PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf
could only be used to test int64_t's produced by sign-extending an APInt.
Other tests on immediates had to use the generic PatLeaf and extract the
constant using C++.
With this patch, tablegen will know how to generate predicates for APInt
and APFloat. This will allow it to 'do the right thing' for both SelectionDAG
and GlobalISel which require different methods of extracting the immediate
from the IR.
This is NFC for SelectionDAG since the new code is equivalent to the
previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1
for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new
subclasses will require a significant re-factor of FastISel.
For GlobalISel, it's currently NFC because the relevant code to import the
affected rules is not yet present. This will be added in a later patch.
Depends on D36086
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36534
llvm-svn: 315747
|
|
|
|
|
|
| |
This will detect invalid iterators when ABI breaking checks are enabled.
llvm-svn: 315746
|
|
|
|
|
|
| |
Also, consolidate tests for this fold in one place.
llvm-svn: 315745
|
|
|
|
|
|
| |
This helps match v_mad_mix* in some cases.
llvm-svn: 315744
|
|
|
|
|
|
| |
Also, clean up unnecessary matcher capture variable initializations.
llvm-svn: 315743
|
|
|
|
|
|
|
|
| |
This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.
llvm-svn: 315740
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each constant extender requires an extra instruction, which adds to the
code size and also reduces the number of available slots in an instruction
packet. In most cases, the value of a repeated constant extender could be
loaded into a register, and the instructions using the extender could be
replaced with their counterparts that use that register instead.
This patch adds a pass that tries to reduce the number of constant
extenders, including extenders which differ only in an immediate offset
known at compile time, e.g. @global and @global+12.
llvm-svn: 315735
|
|
|
|
|
|
|
|
| |
I'm about to commit a patch that makes them necessary for getPredCode() and
it would be strange for getPredCode() and getImmCode() to require different
usage.
llvm-svn: 315733
|
|
|
|
| |
llvm-svn: 315728
|
|
|
|
| |
llvm-svn: 315726
|