| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang genrates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.
Differential Revision: https://reviews.llvm.org/D71363
|
| |
|
|
|
|
|
|
|
| |
G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang genrates
these intrinsics from __builtin_bswap32 and __builtin_bswap64.
Add lower and narrowscalar for G_BSWAP.
Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later.
Differential Revision: https://reviews.llvm.org/D71362
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch makes `AAValueSimplify` use `changeUsesAfterManifest` in `manifest`. This will invoke simple folding after the manifest.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71972
|
| |
|
|
|
|
| |
Indirect C_Immediate or C_Other constraints have been excluded.
Also simplify an unneeded change to indirect 'X' by D60942.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now, PowerPC will using several instructions to get the constant and "and" it with the following case:
define i32 @test1(i32 %a) {
%and = and i32 %a, -2
ret i32 %and
}
However, we could exploit it with the rotate mask instructions.
MB ME
+----------------------+
|xxxxxxxxxxx00011111000|
+----------------------+
0 32 64
Notice that, we can only do it if the MB is larger than 32 and MB <= ME as
RLWINM will replace the content of [0 - 32) with [32 - 64) even we didn't rotate it.
Differential Revision: https://reviews.llvm.org/D71829
|
| |
|
|
|
| |
These are implemented slightly more efficiently than comparing
to 1 in the case that the value is more than 64 bits.
|
| | |
|
| |
|
|
|
|
|
|
| |
X86ISD::VSRLI/VSRAI/VSRLI. Use getConstantOperandVal and APInt operations.
These nodes should only ever be formed with an i8 TargetConstant
so we don't need to check for it to be a constant. It's also
always 8-bits so we don't need to use APInt compare functions.
|
| |
|
|
|
|
|
|
|
| |
This allows us to delete InlineAsm::Constraint_i workarounds in
SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and
TargetLowering::getInlineAsmMemConstraint overrides.
They were introduced to X86 in r237517 to prevent crashes for
constraints like "=*imr". They were later copied to other targets.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A branch is considered UB if it depends on an undefined / uninitialized value.
At this point this handles simple UB branches in the form: `br i1 undef, ...`
We query `AAValueSimplify` to get a value for the branch condition, so the branch
can be more complicated than just: `br i1 undef, ...`.
Patch By: Stefanos Baziotis (@baziotis)
Reviewers: jdoerfert, sstefan1, uenoku
Reviewed By: uenoku
Differential Revision: https://reviews.llvm.org/D71799
|
| |
|
|
|
|
|
|
|
| |
targets.
We had a Custom operation action for v4i32 on SSE1. But since
v4i32 isn't legal until SSE2 this was not what was intended. The
code that get executed was intended for op legalization and
creates a bunch of v4i32 nodes that all end up scalarized.
|
| |
|
|
|
|
| |
Use dedicated API for getting the mask instead of duplicating it.
Differential Revision: https://reviews.llvm.org/D71964
|
| |
|
|
| |
LowerUINT_TO_FP_i32. NFCI
|
| |
|
|
| |
D48683 accidentally disabled -enable-machine-outliner for x86-32.
|
| |
|
|
|
|
|
| |
This reverts commit 7ca86ee6494d4307333b300bae80e42df4a5140f.
Apparently this change causes MS link.exe to error out with
"LNK1235: corrupt or invalid COFF symbol table".
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have references to the same extern_weak in multiple objects,
all of them would generate external symbols with the same name. Make
them static to avoid duplicate definitions; nothing should need to
refer to this symbol outside of the current object.
GCC/binutils seems to handle the same by not using a fixed string
for the ".default" suffix, but instead using the name of some other
defined external symbol from the same object (which is supposed to
be unique among objects unless there's other duplicate definitions).
Differential Revision: https://reviews.llvm.org/D71711
|
| |
|
|
|
| |
In the last commit, I neglected to initialize the new subtarget feature
I added which caused failures on a few bots. This should fix that.
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554
Some CPU's trap to the kernel on unaligned floating point access and there are
kernels that do not handle the interrupt. The program then fails with a SIGBUS
according to the PR. This just switches the default for unaligned access to only
allow it on recent server CPUs that are known to allow this.
Differential revision: https://reviews.llvm.org/D71954
|
| |
|
|
| |
Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If we didn't set the value for hasSideEffects bit in our td file, `llvm-tblgen`
will set it as true for those instructions which has no match pattern.
Below 6 instructions don't set the hasSideEffects flag and don't have match
pattern, so their hasSideEffects flag will be set true by llvm-tblgen.
But in fact below instructions don't modify any special register and don't have
other SideEffects, they shouldn't have SideEffects.
This patch is to modify the hasSideEffects of below instructions from 1 to 0.
```
VEXTUHLX
VEXTUHRX
VEXTUWLX
VEXTUWRX
VSPLTBs
VSPLTHs
```
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D71391
|
| |
|
|
| |
function the code is based on. NFC
|
| | |
|
| |
|
|
|
|
| |
serialization/deserialization
Follow-up to r346788 review feedback from Adrian Prantl.
|
| |
|
|
|
| |
This matches the DAG behavior where we don't use SReg_32_XM0
everywhere anymore, and fixes not coalescing the copies into m0.
|
| | |
|
| | |
|
| | |
|
| |
|
|
|
|
| |
The early tail duplicator pass introduces new ones, so a MIR test that
infers no phis since there were none on the input would fail the
verifier after running.
|
| |
|
|
|
|
|
|
|
|
|
| |
We returning a full set, we should use ResultBitWidth. Otherwise we might
it assertions when the resulting constant ranges are used later on.
Reviewers: nikic, spatel, reames
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D71937
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch extends the current shape propagation and shape aware
lowering to also support binary operators. Those operators are uniform
with respect to their shape (shape of the input operands is the same as
the shape of their result).
Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D70898
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
[DA] Move common code in checkSrcSubscript and checkDstSubscript to a
new function checkSubscript. This avoids duplicate code and possible
out of sync in the future.
Reviewers: sebpop, jmolloy, reames
Reviewed By: sebpop
Subscribers: bmahjour, hiraditya, llvm-commits, amehsan
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71087
Patch by zhongduo.
|
| |
|
|
|
|
| |
There ended up being two result registers, which would fail on
select. It was really defing a new temp register in the correct def
position, instead of the correct result register.
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
| |
v4i32->v4f32 under avx512.
With avx512vl we get v4i32->v4f32 uint_to_fp instructions. With
avx512f we get v16i32->v16f32 instructions which we can use to
emulate v4i32->v4f32.
|
| | |
|
| |
|
|
|
|
|
| |
Intrinsic has incorrect argument type!
i32 (i32*)* @llvm.setjmp
*wipes tear*
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
avx512vl. Similar for vXi64 on avx512dq without avx512vl.
Summary:
Previously we did this with isel patterns that used garbage in
the widened part of the source. But that's not valid for strictfp.
So now we custom widen and use zeroes for the widened elemens for
strictfp.
This replaces D71864.
Reviewers: RKSimon, spatel, andrew.w.kaylor, pengfei, LiuChen3
Reviewed By: pengfei
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71879
|
| |
|
|
|
| |
For non-strict, generic type legalization will take care of this,
but that doesn't happen currently for strict nodes.
|
| | |
|
| |
|
|
|
|
|
|
| |
i686-pc-windows-msvc to match what we do for non-strict.
The float libcalls are inlined in MSVC's math header where they
just cast to double and use the double libcall. Do the same when
we emit libcalls.
|
| |
|
|
|
|
|
|
| |
I believe the algorithm we use for non-strict is exception safe
for strict. The fsub won't generate any exceptions. After it we
will have an exact version of the i32 integer in a double. Then
we just round it to f32. That rounding will generate a precision
exception if it can't be represented exactly.
|
| |
|
|
| |
Differential Revision: https://reviews.llvm.org/D71892
|
| | |
|
| |
|
|
|
|
|
|
| |
avx512vl. Similar for vXi64 sint_to_fp/uint_to_fp on avx512dq without avx512vl.
Previously we widened these through isel patterns, but that
didn't work for STRICT_ nodes. Those need to be padded with
zeroes in the upper bits which is harder to do in isel patterns.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
but not AVX512VL.
Previously we were widening with isel patterns, but that wasn't
exception safe for strict FP. So now we widen to v4i32->v4f64
during type legalization. And then let op legalization further
widen to v8i32->v8f64.
The vec_int_to_fp.ll changes are caused by us no longer narrowing
extracts of strict_uint_to_fp to the v4i32->v2f64 instruction
without AVX512VL only to have isel rewiden it. Now we just keep
it wide throughout. So we don't have an opportunity to narrow
the load.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
but not avx512vl.
AVX512F added instruction for vector fp_to_uint conversions. With
AVX512VL we can use a specific instruction that does v2f64->v4i32 with
zeroes in the 2 extra elements. For non-strict nodes without AVX512VL
we relied on type legalization to turn it to v4f64->v4i32 which would
later be widened by op legalization to v8f64->v8i32. But type legalization
doesn't currently widen strict nodes since it doesn't know how to
safely and efficiently pad the extra elements. But for X86 we know
padding with zeroes is safe and efficient so do that ourselves.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
another node
SelectionDAG::transferDbgValues() can 'reattach' SDDbgValue from one to
another node, but doesn't change its source order. If the destination node has
the order greater than the SDDbgValue, there are two possible issues
revealed later:
* If debug info is attached to an instruction that is the first definition
of a register, this ends up with a def-after-use and the debug info
gets 'undef' later.
* If MIR has another definition of a register above the debug info,
the debug info may represent a source variable incorrectly because
it appears (significantly) before an instruction corresponded
to this debug info.
So, the patch changes the order of an SDDbgValue when it is moved
to a node with greater order.
Reviewers: dblaikie, jmorse, aprantl
Reviewed By: aprantl
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71175
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Calling `changeToUnreachable` in `manifest` from different places might cause really unpredictable problems. As other deleting functions are doing, we need to change these instructions after all `manifest`.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71910
|