| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
| |
Complete fp16 support by ensuring that load extension / truncate store
operations are properly expanded.
Reviewers: asb, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D69246
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
selectImpl is able to select G_FSQRT when we set bank for vector
operands to fprb. Add detailed tests.
Note: G_FSQRT is generated from llvm-ir intrinsics llvm.sqrt.*,
and at the moment MIPS is not able to generate this intrinsic for
vector type (some targets generate vector llvm.sqrt.* from calls
to a builtin function).
__builtin_msa_fsqrt_<format> will be transformed into G_FSQRT
in legalizeIntrinsic and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69376
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SHT_NOTE is the section that consists of
namesz, descsz, type, name + padding, desc + padding data.
This patch teaches yaml2obj, obj2yaml to dump and parse them.
This patch implements the section how it is described here:
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-18048.html
Which says: "For 64–bit objects and 32–bit objects, each entry is an array of 4-byte words in
the format of the target processor"
The official specification is different
http://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section
And says: "n 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array
of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS]
equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor"
Since LLVM uses the first, 32-bit way, this patch follows it.
Differential revision: https://reviews.llvm.org/D68983
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
renamable $x6 = ADDI8 $x1, -80 ;;; 0 is replaced with -80
renamable $x6 = ADD8 killed renamable $x6, renamable $x5
STW killed renamable $r3, 4, killed renamable $x6 :: (store 4 into %ir.14, !tbaa !2)
After PEI there is a peephole opt opportunity to combine above -80 in ADDI8 with 4 in the STW to eliminate unnecessary ADD8.
Expected result:
renamable $x6 = ADDI8 $x1, -76
STWX killed renamable $r3, renamable $x5, killed renamable $x6 :: (store 4 into %ir.6, !tbaa !2)
Reviewed by: stefanp
Differential Revision: https://reviews.llvm.org/D66329
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
| |
We were already going to all of the trouble of computing maximum constant exit counts for each loop exit, we might as well expose them through the API. The change in IndVars is mostly to demonstrate that the wired up code works, but it als very slightly strengthens the transform. The strengthened case is rather narrow though: it requires one exactly analyzeable exit, one imprecisely analyzeable exit (with the upper bound less than the precise one), and one unanalyzeable exit. I coudn't construct a reasonably stable test case.
This does increase the memory usage of the BackedgeTakenCount by a factor of 2 in the worst case.
I also noticed the loop in IndVars is O(#Exits ^ 2). This doesn't change with this patch. A future patch will cache this result inside of SCEV to avoid requering.
|
| |
|
|
| |
default to after the switch
|
| |
|
|
| |
This is a first step in figuring out a proper API for maximum (non constant) exit counts. This may evolve a bit as we get experience with the API needs; suggestions very welcome. This patch just tried to provide a framework that we can later add maximum too in a clean and obvious way.
|
| |
|
|
| |
This has become visible with the --fatal-warnings support.
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
(This time verified locally.)
It was failing with:
llvm/lib/MC/XCOFFObjectWriter.cpp:168:56: error: array must be initialized with a brace-enclosed initializer
std::array<Section *const, 2> Sections = {&Text, &BSS};
^
|
| | |
|
| |
|
|
|
| |
Clang emit warning for skipping field initialization. Add {} to fix it.
This is a patch that fixes issue introduced in https://reviews.llvm.org/D69112
|
| |
|
|
| |
NFCI.
|
| |
|
|
|
|
|
| |
This reverts commit 32ce14e55e5a99dd99c3b4fd4bd0ccaaf2948c30.
In post-commit review, Pavel pointed out that there's a simpler way to
ignore SIGPIPE in lldb that doesn't rely on llvm's handlers.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
new functions are compatible before upgrading a function call to an
intrinsic call.
Sometimes users insert calls to ARC runtime functions that are not
compatible with the corresponding intrinsic functions (for example,
'i8* @objc_storeStrong' instead of 'void @objc_storeStrong'). Don't
upgrade those calls.
rdar://problem/56447127
|
| |
|
|
|
|
|
| |
An SUnit can be neither intruction not SDNode. It is all
null if represents a nop. Fixed a crash on using SU->getInstr().
Differential Revision: https://reviews.llvm.org/D69395
|
| |
|
|
|
|
|
|
| |
It was failing with
llvm/lib/MC/XCOFFObjectWriter.cpp:168:53: error: array must be initialized with a brace-enclosed initializer
std::array<Section *const, 2> Sections{&Text, &BSS};
^
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: rampitec
Reviewed By: rampitec
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69375
|
| |
|
|
| |
Avoids warnings in Release builds. NFC.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Right now we handle each CsectGroup(ProgramCodeCsects, BSSCsects)
individually when assigning indices, writing symbol table, and
writing section raw data. However, there is already a pattern there,
and we could common up those actions for every CsectGroup. This will
make adding new CsectGroup(Read Write data, Read only data, TC/TOC,
mergeable string) easier, and less error prone.
Reviewed by: sfertile, daltenty, DiggerLin
Approved by: daltenty
Differential Revision: https://reviews.llvm.org/D69112
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MVE VADC instruction reads and writes the carry bit at bit 29 of
the FPSCR register. The corresponding ACLE intrinsic is specified to
work with an integer in which the carry bit is stored at bit 0. So if
a user writes a code sequence in C that passes the carry from one VADC
to the next, like this,
s0 = vadcq_u32(a0, b0, &carry);
s1 = vadcq_u32(a1, b1, &carry);
then clang will generate IR for each of those operations that shifts
the carry bit up into bit 29 before the VADC, and after it, shifts it
back down and masks off all but the low bit. But in this situation
what you really wanted was two consecutive VADC instructions, so that
the second one directly reads the value left in FPSCR by the first,
without wasting several instructions on pointlessly clearing the other
flag bits in between.
This commit explains to InstCombine that the other bits of the flags
operand don't matter, and adds a test that demonstrates that all the
code between the two VADC instructions can be optimized away as a
result.
Reviewers: dmgreen, miyuki, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67162
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VST2 and VST4 instructions take two or four vector registers as
input, and store part of each register to memory in an interleaved
pattern. They come in variants indicating which part of each register
they store (VST20 and VST21; VST40 to VST43 inclusive); the intention
is that issuing each of those variants in turn has the combined effect
of loading or storing the whole set of registers to a memory block of
equal size. The corresponding VLD2 and VLD4 instructions load from
memory in the same interleaved format: each one overwrites only part
of its output register set, and again, the idea is that if you use
VLD4{0,1,2,3} or VLD2{0,1} together, you end up having written to the
whole of each register.
I've implemented the stores and loads quite differently. The loads
were easiest to implement as a single intrinsic that expands to all
four VLD4x instructions or both VLD2x, delivering four complete output
registers. (Implementing each individual load as a separate
instruction taking four input registers to partially overwrite is
possible in theory, but pointless, and when I tried it, I found it
would need extra work to get the register allocation not to be
horrible.) Since that intrinsic delivers multiple outputs, it has to
be instruction-selected in custom C++.
But the store instructions are easier to model individually, because
they don't overwrite any register at all and you can write a DAG Isel
pattern in Tablegen for each one.
Hence, my new intrinsic `int_arm_mve_vld4q` expands to four load
instructions, delivers four full output vectors, and is handled by C++
code, whereas `int_arm_mve_vst4q` expands to just one store
instruction, takes four input vectors and a constant indicating which
lanes to store, and is handled entirely in Tablegen. (And similarly
for vld2q/vst2q.) This is asymmetric, but it was the easiest way to do
each one.
Reviewers: dmgreen, miyuki, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68700
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds some initial example IR intrinsics for MVE instructions that
deliver multiple output values, and hence, have to be instruction-
selected by custom C++ code instead of Tablegen patterns.
I've added the writeback gather load instructions (taking a vector of
base addresses and a single common offset, returning a vector of
loaded values and an updated vector of base addresses); one example
from the long shift family (taking and returning a 64-bit value in two
GPRs); and the VADC instruction (which propagates a carry bit from
each vector-lane addition to the next, taking an input carry flag in
FPSCR and outputting the final one in FPSCR as well).
To support the VPT-predicated forms of these instructions, I've
written some helper functions to add the cluster of MVE predicate
operands to the end of a MachineInstr. `AddMVEPredicateToOps` is used
when the instruction actually is predicated (so it takes a predicate
mask argument), and `AddEmptyMVEPredicateToOps` is for when the
instruction is unpredicated (so it fills in $noreg for the mask). Each
one comes in a form suitable for `vpred_n`, and one for `vpred_r`
which takes the extra 'inactive' parameter.
For VADC, the representation of the carry flag in the IR intrinsic is
a word intended to be moved directly to and from `FPSCR_nzcvqc`, i.e.
with the carry flag in bit 29 of the word. (The user-facing ACLE
intrinsic will want it to be in bit 0, but I'll do that on the clang
side.)
Reviewers: dmgreen, miyuki, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68699
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit, together with the next few, will add a representative
sample of the kind of IR intrinsics that we'll need in order to
implement the user-facing ACLE intrinsics for MVE. Supporting all of
them will take more work; the intention of this initial series of
commits is to implement an intrinsic or two from lots of different
categories, as examples and proofs of concept.
This initial commit introduces a small number of IR intrinsics for
instructions simple enough that they can use Tablegen ISel patterns:
the predicated versions of the VADD and VSUB instructions (both
integer and FP), VMIN and VMAX, and the float->half VCVT instruction
(predicated and unpredicated).
When using VPT-predicated instructions in automatic code generation,
it will be convenient to specify the predicate value as a vector of
the appropriate number of i1. To make it easy to specify all sizes of
an instruction in one go and give each one the matching predicate
vector type, I've added a system of Tablegen informational records
describing MVE's vector types: each one gives the underlying LLVM IR
ValueType (which may not be the same if the MVE vector is of
explicitly signed or unsigned integers) and an appropriate vNi1 to use
as the predicate vector.
(Also, those info records include the usual encoding for the types, so
that as we add associations between each instruction encoding and one
of the new `MVEVectorVTInfo` records, we can remove some of the
existing template parameters and replace them with references to the
vector type info's fields.)
The user-facing ACLE intrinsics will receive a predicate mask as a
16-bit integer, so I've also provided a pair of intrinsics i2v and
v2i, to convert between an integer and a vector of i1 by just changing
the register class.
Reviewers: dmgreen, miyuki, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67158
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: rampitec, arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69355
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
selectImpl is able to select G_FABS when we set bank for vector
operands to fprb. Add detailed tests.
Note: G_FABS is generated from llvm-ir intrinsics llvm.fabs.*,
and at the moment MIPS is not able to generate this intrinsic for
vector type (some targets generate vector llvm.fabs.* from calls
to a builtin function).
We can handle fabs using __builtin_msa_fmax_a_<format> and passing
same vector as both arguments. __builtin_msa_fmax_a_<format> will
be directly selected into FMAX_A_<format> in legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D69346
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
Select vector G_FADD, G_FSUB, G_FMUL and G_FDIV for MIPS32 with MSA. We
have to set bank for vector operands to fprb and selectImpl will do the
rest. __builtin_msa_fadd_<format>, __builtin_msa_fsub_<format>,
__builtin_msa_fmul_<format> and __builtin_msa_fdiv_<format> will be
transformed into G_FADD, G_FSUB, G_FMUL and G_FDIV in legalizeIntrinsic
respectively and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69340
|
| |
|
|
|
|
|
|
|
|
|
| |
Select vector G_SDIV, G_SREM, G_UDIV and G_UREM for MIPS32 with MSA. We
have to set bank for vector operands to fprb and selectImpl will do the
rest. __builtin_msa_div_s_<format>, __builtin_msa_mod_s_<format>,
__builtin_msa_div_u_<format> and __builtin_msa_mod_u_<format> will be
transformed into G_SDIV, G_SREM, G_UDIV and G_UREM in legalizeIntrinsic
respectively and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69333
|
| |
|
|
|
|
|
|
| |
Potentially sgpr to sgpr copy should also be possible.
That is however trickier because we may end up with a
wrong register class at use because of xm0/xexec permutations.
Differential Revision: https://reviews.llvm.org/D69280
|
| |
|
|
| |
Testing git push access.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'-c' can be folded into the x node."
This broke various Windows builds, see comments on the Phabricator
review.
This also reverts the follow-up 20bf0cf.
> Summary:
> This fold, helps recover from the rest of the D62266 ARM regressions.
> https://rise4fun.com/Alive/TvpC
>
> Note that while the fold is quite flexible, i've restricted it
> to the single interesting pattern at the moment.
>
> Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix
>
> Reviewed By: deadalnix
>
> Subscribers: javed.absar, kristof.beyls, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D62450
|
| |
|
|
|
|
| |
solveBlockValueIntrinsic()
Now that there's SaturatingInst class, this is cleaner.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
from `add` op
Summary:
This was suggested in https://reviews.llvm.org/D69277#1717210
In this form (this is what was suggested, right?), the results aren't staggering
(especially since given LVI cross-block focus)
this does catch some things (as per test-suite), but not too much:
| statistic | old | new | delta | % change |
| correlated-value-propagation.NumAddNSW | 4981 | 4982 | 1 | 0.0201% |
| correlated-value-propagation.NumAddNW | 12125 | 12126 | 1 | 0.0082% |
| correlated-value-propagation.NumCmps | 1199 | 1202 | 3 | 0.2502% |
| correlated-value-propagation.NumDeadCases | 112 | 111 | -1 | -0.8929% |
| correlated-value-propagation.NumMulNSW | 275 | 278 | 3 | 1.0909% |
| correlated-value-propagation.NumMulNUW | 1323 | 1326 | 3 | 0.2268% |
| correlated-value-propagation.NumMulNW | 1598 | 1604 | 6 | 0.3755% |
| correlated-value-propagation.NumNSW | 7158 | 7167 | 9 | 0.1257% |
| correlated-value-propagation.NumNUW | 13304 | 13310 | 6 | 0.0451% |
| correlated-value-propagation.NumNW | 20462 | 20477 | 15 | 0.0733% |
| correlated-value-propagation.NumOverflows | 4 | 7 | 3 | 75.0000% |
| correlated-value-propagation.NumPhis | 15366 | 15381 | 15 | 0.0976% |
| correlated-value-propagation.NumSExt | 6273 | 6277 | 4 | 0.0638% |
| correlated-value-propagation.NumShlNSW | 1172 | 1171 | -1 | -0.0853% |
| correlated-value-propagation.NumShlNUW | 2793 | 2794 | 1 | 0.0358% |
| correlated-value-propagation.NumSubNSW | 730 | 736 | 6 | 0.8219% |
| correlated-value-propagation.NumSubNUW | 2044 | 2046 | 2 | 0.0978% |
| correlated-value-propagation.NumSubNW | 2774 | 2782 | 8 | 0.2884% |
| instcount.NumAddInst | 277586 | 277569 | -17 | -0.0061% |
| instcount.NumAndInst | 66056 | 66054 | -2 | -0.0030% |
| instcount.NumBrInst | 709147 | 709146 | -1 | -0.0001% |
| instcount.NumCallInst | 528579 | 528576 | -3 | -0.0006% |
| instcount.NumExtractValueInst | 18307 | 18301 | -6 | -0.0328% |
| instcount.NumOrInst | 102660 | 102665 | 5 | 0.0049% |
| instcount.NumPHIInst | 318008 | 318007 | -1 | -0.0003% |
| instcount.NumSelectInst | 46373 | 46370 | -3 | -0.0065% |
| instcount.NumSExtInst | 79496 | 79488 | -8 | -0.0101% |
| instcount.NumShlInst | 40654 | 40657 | 3 | 0.0074% |
| instcount.NumTruncInst | 62251 | 62249 | -2 | -0.0032% |
| instcount.NumZExtInst | 68211 | 68221 | 10 | 0.0147% |
| instcount.TotalBlocks | 843910 | 843909 | -1 | -0.0001% |
| instcount.TotalInsts | 7387448 | 7387423 | -25 | -0.0003% |
Reviewers: nikic, reames
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69321
|
| |
|
|
|
| |
- Reduce code duplication
- Get partial support of JAL expansion for XGOT.
|
| | |
|
| | |
|
| |
|
|
|
| |
This reverts commit 7bc7fe6b789d25d48d6dc71d533a411e9e981237.
The immediate callers have been fixed to pass nullopt where appropriate.
|
| |
|
|
|
|
| |
This reverts commit 40668abca4d307e02b33345cfdb7271549ff48d0.
This causes clang tests to fail, as stacksize=0 is being explicitly passed and
is no longer a no-op.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This roughly mimics `std::thread(...).detach()` except it allows to
customize the stack size. Required for https://reviews.llvm.org/D50993.
I've decided against reusing the existing `llvm_execute_on_thread` because
it's not obvious what to do with the ownership of the passed
function/arguments:
1. If we pass possibly owning functions data to `llvm_execute_on_thread`,
we'll lose the ability to pass small non-owning non-allocating functions
for the joining case (as it's used now). Is it important enough?
2. If we use the non-owning interface in the new use case, we'll force
clients to transfer ownership to the spawned thread manually, but
similar code would still have to exist inside
`llvm_execute_on_thread(_async)` anyway (as we can't just pass the same
non-owning pointer to pthreads and Windows implementations, and would be
forced to wrap it in some structure, and deal with its ownership.
Patch by Dmitry Kozhevnikov!
Differential Revision: https://reviews.llvm.org/D51103
|
| |
|
|
|
|
|
|
|
|
| |
MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64
regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out Mips ABI and pick appropriate prefix.
Tags: #llvm, #clang, #lldb
Differential Revision: https://reviews.llvm.org/D66795
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The default implementation of the describeLoadedValue() hook uses the
MoveImm property to determine if an instruction moves an immediate. If
an instruction has that property the function returns the second
operand, assuming that that is the immediate value the instruction
moves. As far as I can tell, the MoveImm property does not imply that
the second operand is the immediate value, nor that any other operand
necessarily holds the immediate value; it just means that the
instruction moves some immediate value.
One example where the second operand is not the immediate is SystemZ's
LZER instruction, which moves a zero immediate implicitly: $f0S = LZER.
That case triggered an out-of-bound assertion when getting the operand.
I have added a test case for that instruction.
Another example is ARM's MVN instruction, which holds the logical
bitwise NOT'd value of the immediate that is moved. For the following
reproducer:
extern void foo(int);
int main() { foo(-11); }
an incorrect call site value would be emitted:
$ clang --target=arm foo.c -O1 -g -Xclang -femit-debug-entry-values \
-c -o - | ./build/bin/llvm-dwarfdump - | \
grep -A2 call_site_parameter
0x00000058: DW_TAG_GNU_call_site_parameter
DW_AT_location (DW_OP_reg0 R0)
DW_AT_GNU_call_site_value (DW_OP_lit10)
Another example is the A2_combineii instruction on Hexagon which moves
two immediates to a super-register: $d0 = A2_combineii 20, 10.
Perhaps these are rare exceptions, and most MoveImm instructions hold
the immediate in the second operand, but in my opinion the default
implementation of the hook should only describe values that it can, by
some contract, guarantee are safe to describe, rather than leaving it up
to the targets to override the exceptions, as that can silently result
in incorrect call site values.
This patch adds X86's relevant move immediate instructions to the
target's hook implementation, so this commit should be a NFC for that
target. We need to do the same for ARM and AArch64.
Reviewers: djtodoro, NikolaPrica, aprantl, vsk
Reviewed By: vsk
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D69109
|
| |
|
|
| |
A test commit.
|
| |
|
|
|
|
|
|
|
|
| |
Select vector G_MUL for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
Manual selection of G_MUL is now done for gprb only.
__builtin_msa_mulv_<format> will be transformed into G_MUL
in legalizeIntrinsic and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69310
|
| |
|
|
|
|
|
|
|
|
|
| |
Select vector G_SUB for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
__builtin_msa_subv_<format> will be transformed into G_SUB
in legalizeIntrinsic and selected in the same way.
__builtin_msa_subvi_<format> will be directly selected into
SUBVI_<format> in legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D69306
|
| |
|
|
|
|
|
|
|
|
|
|
| |
checks (PR43769)
We should do the fold only if both constants are plain,
non-opaque constants, at least that is the DAG.FoldConstantArithmetic()
requirement.
And if the constant we are comparing with is zero - we shouldn't be
trying to do this fold in the first place.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43769
|