Commit message
Temporarily remove switch statement without any case labels
in function legalizeCustom in order to fix r349346 for MSVC.
llvm-svn: 349356
These features (fairly) recently got split out into their own feature, so we
should make CodeGen use them when available. The main change here is that the
check used to be based on the triple, but now it's based on CPU features.
llvm-svn: 349355
Lower G_UADDE and legalize G_ADD using narrowScalar on MIPS32.
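For context, a minimal sketch of how such a rule is typically declared with the
GlobalISel LegalizerInfo builder API (illustrative only, not the exact MIPS code):
  // Inside the MipsLegalizerInfo constructor (sketch): 32-bit adds are legal,
  // wider scalars are clamped so they get narrowed into 32-bit pieces built
  // from G_UADDO/G_UADDE.
  using namespace TargetOpcode;
  const LLT s32 = LLT::scalar(32);
  getActionDefinitionsBuilder(G_ADD)
      .legalFor({s32})
      .clampScalar(0, s32, s32);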
Differential Revision: https://reviews.llvm.org/D54580
llvm-svn: 349346
The Load/Store Optimizer runs before Machine Block Placement. At O3 the
Tail Duplication Threshold is set to 4 instructions and this can create
new opportunities for the Load/Store Optimizer. It seems worthwhile to
run it once again.
llvm-svn: 349338
The CC is operand 2 not operand 3.
llvm-svn: 349330
We had 3 different approaches - consistently use getTargetConstantBitsFromNode and allow undef elts.
llvm-svn: 349319
I'd like to try to move a lot of the flag matching out of EmitTest and push it to isel or isel preprocessing. This is a step towards that.
The test-shrink-bug.ll change is an improvement because we are no longer interfering with test shrink handling in isel.
The pr34137.ll change is a regression, but the IR came from -O0 and was not reduced by InstCombine. So it contains a lot of redundancies like duplicate loads that made it combine poorly.
llvm-svn: 349315
(PR39859)
This is part of fixing PR39859:
https://bugs.llvm.org/show_bug.cgi?id=39859
We have a crippled vector ISA, so we have to invert a typical fold and create min/max here.
As discussed in the bug report, we can probably do better by using saturating subtract when
it's available, but we should have this improvement for the min/max patterns regardless.
Alive proofs:
https://rise4fun.com/Alive/zsf
https://rise4fun.com/Alive/Qrl
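As a scalar illustration of the inversion (the actual patch works on vector DAG
nodes; these helpers are only for exposition), both functions below compute the
same clamped subtraction:
  #include <algorithm>
  #include <cstdint>
  // Typical select form of a clamped ("saturating") subtract.
  uint32_t clampedSubSelect(uint32_t x, uint32_t y) { return x > y ? x - y : 0; }
  // Equivalent max-based form, which maps better onto vector min/max ops.
  uint32_t clampedSubViaMax(uint32_t x, uint32_t y) { return std::max(x, y) - y; }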
Differential Revision: https://reviews.llvm.org/D55515
llvm-svn: 349304
In preparation for converting to funnel shifts.
llvm-svn: 349286
Use consistent rules for when to lower to SHLD/SHRD for slow machines. This fixes a weird issue where the funnel shift gets expanded, but then X86ISelLowering's combineOr sees the optsize attribute and combines back to SHLD/SHRD, now with the modulo amount guard.
llvm-svn: 349285
llvm-svn: 349265
Using regular abs() causes the following warning
error: absolute value function 'abs' given an argument of type 'int64_t' (aka 'long') but has parameter of type 'int' which may cause truncation of value [-Werror,-Wabsolute-value]
(uint32_t)abs(Dist) > MaxDist) {
^
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:1369:19: note: use function 'std::abs' instead
which causes a bot to fail:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/18284/steps/bootstrap%20clang/logs/stdio
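The fix follows the note in the diagnostic; a simplified illustration (not the
actual SILoadStoreOptimizer code):
  #include <cstdint>
  #include <cstdlib>
  // std::abs provides long/long long overloads, so the int64_t argument is no
  // longer truncated to int before the absolute value is taken.
  bool exceedsMaxDist(int64_t Dist, uint32_t MaxDist) {
    return static_cast<uint32_t>(std::abs(Dist)) > MaxDist;
  }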
llvm-svn: 349224
instruction that only modify the O flag to the waiver list.
The only caller of this turns CMP with 0 into TEST. CMP with 0 and TEST both set OF to 0 so we should have no issues with instructions that only use OF.
Though I don't think there's any reason we would read just OF after a compare with 0 anyway. So this probably isn't an observable change.
llvm-svn: 349223
indicates which result is the flag result. NFCI
hasNoCarryFlagUses hardcoded that the flag result is 1 and used that to filter which uses were of interest. hasNoSignedComparisonUses just assumes the only result is flags and checks whether any user of the node is a CopyToReg instruction.
After this patch we now do a result number check in both and rely on the caller to provide the result number.
This shouldn't change behavior; it was just an odd difference between the two functions that I noticed.
llvm-svn: 349222
The change is an effort to split and refactor abandoned
D34708 into smaller parts.
Here the behaviour of unsupported instructions is changed
to match the behaviour of explicit intrinsic calls.
Currently LLVM crashes with:
> Assertion getInstruction() && "Not a call or invoke instruction!" failed.
With this patch LLVM produces a more sensible error message:
> Cannot select: ... i32 = ExternalSymbol'__foobar'
Author: Denys Zariaiev <denys.zariaiev@gmail.com>
Differential Revision: https://reviews.llvm.org/D55145
llvm-svn: 349213
This fixes https://llvm.org/PR39983.
llvm-svn: 349202
llvm-svn: 349199
13bit constant offset from the nearby instructions.
Summary: Promote constant offset to immediate by recomputing the relative 13bit offset from nearby instructions.
E.g.
s_movk_i32 s0, 0x1800
v_add_co_u32_e32 v0, vcc, s0, v2
v_addc_co_u32_e32 v1, vcc, 0, v6, vcc
s_movk_i32 s0, 0x1000
v_add_co_u32_e32 v5, vcc, s0, v2
v_addc_co_u32_e32 v6, vcc, 0, v6, vcc
global_load_dwordx2 v[5:6], v[5:6], off
global_load_dwordx2 v[0:1], v[0:1], off
=>
s_movk_i32 s0, 0x1000
v_add_co_u32_e32 v5, vcc, s0, v2
v_addc_co_u32_e32 v6, vcc, 0, v6, vcc
global_load_dwordx2 v[5:6], v[5:6], off
global_load_dwordx2 v[0:1], v[5:6], off offset:2048
Author: FarhanaAleen
Reviewed By: arsenm, rampitec
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D55539
llvm-svn: 349196
The instruction encodings make it unnecessary to distinguish extended W-form
from X-form instructions.
llvm-svn: 349185
llvm-svn: 349158
Mark as legal and add tests. Nothing special to do.
llvm-svn: 349147
Refactor the ARMInstructionSelector to cache some opcodes in the
constructor instead of checking all the time if we're in ARM or Thumb
mode.
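A hedged sketch of the caching idea (the struct and member names are
illustrative; the real cache covers many more opcodes):
  // Decide ARM vs. Thumb2 opcodes once, at selector construction time,
  // instead of re-checking the subtarget for every instruction selected.
  struct OpcodeCache {
    unsigned AND, ORR, EOR;
    explicit OpcodeCache(const ARMSubtarget &STI) {
      const bool IsThumb = STI.isThumb();
      AND = IsThumb ? ARM::t2ANDrr : ARM::ANDrr;
      ORR = IsThumb ? ARM::t2ORRrr : ARM::ORRrr;
      EOR = IsThumb ? ARM::t2EORrr : ARM::EORrr;
    }
  };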
llvm-svn: 349143
Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM
and Thumb2.
Extract the legalizer tests for these opcodes into another file.
Add tests for the instruction selector.
llvm-svn: 349142
(setcc)) already has the target desired type for the setcc
Summary:
If the setcc already has the target's desired type, we can reach the getSetCC/getSExtOrTrunc after the MatchingVecType check with the exact same types as the nodes we started with. This causes VsetCC to be CSEd to N0, and the getSExtOrTrunc will CSE to N. When we return N, the caller will think that we called CombineTo and did our own worklist management. But that's not what happened. This prevents target hooks from being called for the node.
To fix this, I've now returned SDValue if the setcc is already the desired type. But to avoid some regressions in X86 I've had to disable one of the target combines that wasn't being reached before in the case of a (sext (setcc)). If we get vector widening legalization enabled that entire function will be deleted anyway so hopefully this is only for the short term.
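In code terms, the early-out is essentially of this shape (variable names are
illustrative, not the exact DAGCombiner ones):
  // If the setcc already produces the type the target wants, rebuilding it
  // would only CSE back to the nodes we started from and confuse the caller's
  // worklist handling, so give up and let other combines (and target hooks) run.
  if (SetCCVT == MatchingVecType)
    return SDValue();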
Reviewers: RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D55459
llvm-svn: 349137
except EmitCmp through EmitCmp.
This requires the two callers to manifest a 0 to make EmitCmp call EmitTest.
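Roughly what "manifest a 0" looks like at those call sites (a sketch; the real
EmitCmp signature and the condition code used may differ):
  // Build an explicit zero so the comparison goes through EmitCmp, which
  // recognizes compare-with-zero and forwards it to EmitTest itself.
  SDValue Zero = DAG.getConstant(0, dl, Op.getValueType());
  SDValue EFLAGS = EmitCmp(Op, Zero, X86::COND_NE, dl, DAG);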
I'm looking into changing how we combine TEST and flag-setting instructions so that it happens in DAG combine or isel rather than in lowering. That will mean EmitTest will probably become gutted and maybe disappear entirely.
llvm-svn: 349094
Fix the logic in the definition of the `ExynosShiftExPred` as a more
specific version of `ExynosShiftPred`. But, since `ExynosShiftExPred` is
not used yet, this change is NFC.
llvm-svn: 349091
This patch breaks RADV (and probably RadeonSI as well)
llvm-svn: 349084
llvm-svn: 349081
Summary:
Macros are expanded on a single line. In case of large expansions,
with sufficiently many instructions with memory operands (and when
-fdebug-info-for-profiling is requested), we may be unable to generate
new base discriminator values - new values overflow (base
discriminators may not be larger than 2^12).
This CL warns instead of asserting in such a case. A subsequent CL
will add APIs to check for overflow before creating new debug info.
See https://bugs.llvm.org/show_bug.cgi?id=39890
Reviewers: davidxl, wmi, gbedwell
Reviewed By: davidxl
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D55643
llvm-svn: 349075
SimplifyDemandedVectorEltsForTargetNode
llvm-svn: 349057
There are still a couple of minor SimplifyDemandedElts regressions in some of the shift amount splats; these will be fixed in future patches.
llvm-svn: 349052
Summary: The Sparc V9 membar instruction can enforce different types of
memory orderings depending on the value in its immediate field. In the
architectural manual the type is selected by combining different assembler
tags into a mask. This patch adds support for these tags.
Reviewers: jyknight, venkatra, brad
Reviewed By: jyknight
Subscribers: fedor.sergeev, jrtc27, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D53491
llvm-svn: 349048
llvm-svn: 349047
Summary:
Constraining an integer value to a floating point register using "f"
causes an llvm_unreachable to trigger. This patch allows i32 integers
to be placed in a single precision float register and i64 integers to
be placed in a double precision float register. This matches the behavior
of GCC.
For other types the llvm_unreachable is removed to instead trigger an
error message that points out the offending line.
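For illustration only (this snippet is not from the patch or its tests), the
kind of GCC-style inline asm this makes work, with an i32 operand constrained
by "f" into a single-precision register:
  float intToFloat(int X) {
    float Res;
    // "f" places the integer input in an FP register; fitos converts it to
    // single-precision in place.
    asm("fitos %1, %0" : "=f"(Res) : "f"(X));
    return Res;
  }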
Reviewers: jyknight, venkatra
Reviewed By: jyknight
Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D51614
llvm-svn: 349045
There are several kinds of pseudo instructions in the PowerPC backend, e.g.:
* ISel pseudo-instructions, which have 'let usesCustomInserter = 1' in the td
files; ExpandISelPseudos -> EmitInstrWithCustomInserter deals with them.
* Post-RA pseudo instructions, which have 'let isPseudo = 1' in the td files,
or standard pseudos (SUBREG_TO_REG, COPY, etc.);
ExpandPostRAPseudos -> expandPostRAPseudo expands them.
* Multi-instruction pseudo operations, which PPCAsmPrinter::EmitInstruction expands.
* Pseudo instructions in the CodeEmitter, which have an encoding of 0.
Currently, in the td files, especially PPCInstrVSX.td, we do not distinguish
post-RA pseudo instructions from CodeEmitter pseudo instructions very clearly.
This patch:
* Renames the Pseudo<> class to PPCEmitTimePseudo, which means an encoding of 0
in the CodeEmitter.
* Introduces a new class, PPCPostRAExpPseudo<>, for the previous post-RA pseudos.
* Introduces a new class, PPCCustomInserterPseudo<>, for the previous ISel pseudos.
Differential Revision: https://reviews.llvm.org/D55143
llvm-svn: 349044
Merged the repeated code into a single if().
llvm-svn: 349040
When computing register allocation hints for a GRX32Bit register, make sure
that any of the hinted registers that are also copy hints are returned first
in the list.
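Sketched in rough LLVM-style C++ (container and variable names are illustrative,
not the actual SystemZ implementation):
  // Emit hinted registers that are also copy hints first, then the rest,
  // keeping the original relative order within each group.
  SmallVector<MCPhysReg, 8> Ordered;
  for (MCPhysReg Reg : HintedRegs)
    if (is_contained(CopyHints, Reg))
      Ordered.push_back(Reg);
  for (MCPhysReg Reg : HintedRegs)
    if (!is_contained(CopyHints, Reg))
      Ordered.push_back(Reg);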
Review: Ulrich Weigand.
llvm-svn: 349037
We always expand to shifts anyhow; the test changes are just scheduling differences.
llvm-svn: 349034
Differential Revision: https://reviews.llvm.org/D55499
llvm-svn: 349029
Remove common code from custom lowering (code is still safe if somehow a zero value gets used).
llvm-svn: 349028
Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for
them in the instruction selector. This uses handwritten code again
because the patterns that are generated with TableGen are tuned for what
the DAG combiner would produce and not for simple sext/zext nodes.
Luckily, we only need to update the opcodes to use the Thumb2 variants,
everything else can be reused from ARM.
llvm-svn: 349026
Move existing rotation expansion code into TargetLowering and set it up for vectors as well.
Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment.
Begun removing x86 vector rotate custom lowering to use the expansion.
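For reference, the standard power-of-two rotate expansion being shared (a
sketch; X, Amt, BW, VT, ShVT, DL and DAG are assumed context, and the real
TargetLowering code also handles the remaining corner cases):
  // rotl(X, Amt) == (X << (Amt & (BW-1))) | (X >> (-Amt & (BW-1)))
  SDValue Mask = DAG.getConstant(BW - 1, DL, ShVT);
  SDValue ShlAmt = DAG.getNode(ISD::AND, DL, ShVT, Amt, Mask);
  SDValue NegAmt = DAG.getNode(ISD::SUB, DL, ShVT,
                               DAG.getConstant(0, DL, ShVT), Amt);
  SDValue SrlAmt = DAG.getNode(ISD::AND, DL, ShVT, NegAmt, Mask);
  SDValue Rot = DAG.getNode(ISD::OR, DL, VT,
                            DAG.getNode(ISD::SHL, DL, VT, X, ShlAmt),
                            DAG.getNode(ISD::SRL, DL, VT, X, SrlAmt));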
llvm-svn: 349025
Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd).
The criteria for choosing whether a fused add or subtract is used, as well as
whether the product is negated or not, is whether some of the arguments to the
llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd
instructions were added to avoid the negation being performed using a xor
trick, which prevented the proper FMA forms from being selected and thus
tested.
The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 -
rs3), but they should be correct. The misleading names were inherited from
MIPS, where the negation happens after computing the sum.
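To make that concrete, the value each form computes per the RISC-V spec
(written as plain C++ for exposition, ignoring the single-rounding effect of
the fused hardware operation):
  double fmadd (double a, double b, double c) { return  (a * b) + c; }
  double fmsub (double a, double b, double c) { return  (a * b) - c; }
  double fnmsub(double a, double b, double c) { return -(a * b) + c; }
  double fnmadd(double a, double b, double c) { return -(a * b) - c; }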
The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions,
as that depends on TargetLowering::isFMAFasterThanFMulAndFAdd.
Some comments in the test files about what types of instructions are tested
there were updated to better reflect the current content of those test files.
Differential Revision: https://reviews.llvm.org/D54205
Patch by Luís Marques.
llvm-svn: 349023
Fixes https://bugs.llvm.org/show_bug.cgi?id=33486
llvm-svn: 349022
llvm-svn: 349014
llvm-svn: 349012
accurate assert. NFC
llvm-svn: 349007
| |
Some compilers complain that the variable is captured and some
complain when it is not. Switch to [&].
llvm-svn: 349006
Fixed error 'lambda capture 'CondReg' is not required to be captured
for this use'.
llvm-svn: 349005
Optimize sequence:
%sel = V_CNDMASK_B32_e64 0, 1, %cc
%cmp = V_CMP_NE_U32 1, %1
$vcc = S_AND_B64 $exec, %cmp
S_CBRANCH_VCC[N]Z
=>
$vcc = S_ANDN2_B64 $exec, %cc
S_CBRANCH_VCC[N]Z
It is the negation pattern inserted by DAGCombiner::visitBRCOND() in
rebuildSetCC().
Differential Revision: https://reviews.llvm.org/D55402
llvm-svn: 349003