| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
We are missing costs for a lot of truncation cases, I'm hoping to address all the 'zero cost' cases in trunc.ll
I thought this was a vector widening side effect, but even before this we had some interesting LV decisions (notably over indvars) being made due to these zero costs.
llvm-svn: 372498
|
|
|
|
|
|
|
|
|
|
|
|
| |
vgetmantss/sd.
Previously we only matched scalar_to_vector and scalar load, but
we should be able to narrow a vector load or match vzload.
Also need to match TargetConstant instead of Constant. The register
patterns were previously updated, but not the memory patterns.
llvm-svn: 372458
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This fixes a crasher introduced by r372338.
Reviewers: echristo, arsenm
Subscribers: wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67850
llvm-svn: 372434
|
|
|
|
| |
llvm-svn: 372395
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 52621307bcab2013e8833f3317cebd63a6db3885.
Tests have been failing all night with
[0/2] ACTION //llvm/test:check-llvm(//llvm/utils/gn/build/toolchain:unix)
-- Testing: 33647 tests, 64 threads --
Testing: 0 .. 10..
UNRESOLVED: LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll (6943 of 33647)
******************** TEST 'LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll' FAILED ********************
Test has no run line!
********************
Since there were other concerns on https://reviews.llvm.org/D67785,
I'm just reverting for now.
llvm-svn: 372383
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
argument to a regular Constant during lowering.
We reuse an ISD opcode here that can be reached from BMI that
doesn't require it to be an immediate. Our isel patterns to match
the TBM immediate form require a Constant and not a TargetConstant.
We were accidentally getting the Constant due to a quirk of
combineBEXTR calling SimplifyDemandedBits. The call to
SimplifyDemandedBits ended up constant folding the TargetConstant
to a regular Constant. But we should probably instead be asserting
if SimplifyDemandedBits on a TargetConstant so we shouldn't rely
on this behavior.
llvm-svn: 372373
|
|
|
|
|
|
| |
This fixes an isel failure after r372338.
llvm-svn: 372371
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This fixes a crasher introduced by r372338.
Reviewers: echristo, arsenm
Subscribers: jvesely, wdng, nhaehnle, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67785
Tighten up the test case.
llvm-svn: 372366
|
|
|
|
|
|
|
|
|
|
| |
LowerBUILD_VECTORvXi1.
The later code that generates a constant when there are
some non-const elements works basically the same and doesn't
require there to be any non-const elements.
llvm-svn: 372365
|
|
|
|
|
|
|
|
|
| |
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)
This was missing one switch to getTargetConstant in an untested case.
llvm-svn: 372338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR42863)
This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation.
The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants).
Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.).
I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negatation propagation). We can build on this in future patches.
Differential Revision: https://reviews.llvm.org/D67557
llvm-svn: 372333
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This broke the Chromium build, causing it to fail with e.g.
fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
See llvm-commits thread of r372285 for details.
This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.
> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372314
|
|
|
|
|
|
|
|
|
| |
targets when the vector is a mix of constants and non-constant.
We need to materialize the constants as two 32-bit values that
are casted to v32i1 and then concatenated.
llvm-svn: 372304
|
|
|
|
|
|
| |
Avoids repeating the size.
llvm-svn: 372302
|
|
|
|
| |
llvm-svn: 372301
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Encode them directly as an imm argument to G_INTRINSIC*.
Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.
This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.
SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.
Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.
The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.
This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372285
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
existing datalayouts.
Summary:
Add function to AutoUpgrade to change the datalayout of old X86 datalayout strings.
This adds "-p270:32:32-p271:32:32-p272:64:64" to X86 datalayouts that are otherwise valid
and don't already contain it.
This also removes the compatibility changes in https://reviews.llvm.org/D66843.
Datalayout change in https://reviews.llvm.org/D64931.
Reviewers: rnk, echristo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67631
llvm-svn: 372267
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67620
llvm-svn: 372231
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67704
llvm-svn: 372230
|
|
|
|
|
|
|
|
|
|
| |
with avx512.
This generates worse code, but matches what is done for avx2 and
prevents crashes when more arguments are passed than we have
registers for.
llvm-svn: 372200
|
|
|
|
|
|
|
|
| |
-mno-sse2 on x86-64.
As seen in the most recent updates to PR10498
llvm-svn: 372197
|
|
|
|
| |
llvm-svn: 372159
|
|
|
|
| |
llvm-svn: 372155
|
|
|
|
|
|
| |
combine.
llvm-svn: 372154
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The case were Immediate is 0 and HasConstElts is true should never
happen since that would mean the constant elts were all zero. But
we check for all zero build vector earlier. So just use HasConstElts
and blindly take Immediate without checking if its 0.
Move the code that bitcasts and extract the immediate into the
the HasConstElts case since the other code just creates an undef
with the right type. No casting needed.
llvm-svn: 372153
|
|
|
|
|
|
|
|
| |
non-null. NFCI.
Silences a static analyzer warning.
llvm-svn: 372118
|
|
|
|
| |
llvm-svn: 372113
|
|
|
|
|
|
| |
Also avoids a static analyzer warning about out of range shifts.
llvm-svn: 372103
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Reordered MVT simple types to group scalable vector types
together.
* New range functions in MachineValueType.h to only iterate over
the fixed-length int/fp vector types.
* Stopped backends which don't support scalable vector types from
iterating over scalable types.
Reviewers: sdesmalen, greened
Reviewed By: greened
Differential Revision: https://reviews.llvm.org/D66339
llvm-svn: 372099
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
on avx512 targets.
Previously we tried to split them into narrower v64i1 or v16i1
pieces that each got promoted to vXi8 and then passed in a zmm
or xmm register. But this crashes when you need to pass more
pieces than available registers reserved for argument passing.
The scalarizing done here generates much longer and slower code,
but is consistent with the behavior of avx2 and earlier targets
for these types.
Fixes PR43323.
llvm-svn: 372069
|
|
|
|
|
|
|
|
|
|
| |
broadcast load to avoid a copy.
The BLENDM instructions allow an 2 sources and an independent
destination while masked VBROADCAST has the destination tied
to the source.
llvm-svn: 372068
|
|
|
|
|
|
| |
Previously we limited to the EQ/NE/TRUE/FALSE/ORD/UNORD immediates.
llvm-svn: 372067
|
|
|
|
| |
llvm-svn: 372065
|
|
|
|
|
|
| |
Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector.
llvm-svn: 372013
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows enabling useaa on the command-line and will allow enabling the
feature on a per-CPU basis where benchmarking shows improvements.
This is modelled after the ARM/AArch64 target.
Reviewers: RKSimon, andreadb, craig.topper
Subscribers: javed.absar, kristof.beyls, hiraditya, ychen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67266
llvm-svn: 371989
|
|
|
|
|
|
|
| |
incDecVectorConstant is used for a similar reason in LowerVSETCCWithSUBUS
so we might as well share the code.
llvm-svn: 371861
|
|
|
|
|
|
|
|
|
|
| |
negation. NFCI.
Some prep work for PR42863, this change allows us to move all the FMA opcode mappings into the negateFMAOpcode helper.
For the FMADDSUB/FMSUBADD cases, we can only negate the accumulator - any other negations will result in an error.
llvm-svn: 371840
|
|
|
|
|
|
|
|
| |
The result integer does not need to be the same width as the input.
AMDGPU, NVPTX, and Hexagon all have patterns working around the types
matching. GlobalISel defines these as being different type indexes.
llvm-svn: 371797
|
|
|
|
|
|
| |
Implement the TODO from D66318.
llvm-svn: 371789
|
|
|
|
|
|
|
|
|
|
|
| |
can exclude fp128 compares.
The X86 decision assumes the compare will produce a result in an XMM
register, but that can't happen for an fp128 compare since those
go to a libcall the returns an i32. Pass the VT so X86 can check
the type.
llvm-svn: 371775
|
|
|
|
| |
llvm-svn: 371770
|
|
|
|
|
|
|
|
| |
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.
llvm-svn: 371722
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
later Intel CPUs.
AVX512 instructions can cause a frequency drop on these CPUs. This
can negate the performance gains from using wider vectors. Enabling
prefer-vector-width=256 will prevent generation of zmm registers
unless explicit 512 bit operations are used in the original source
code.
I believe gcc and icc both do something similar to this by default.
Differential Revision: https://reviews.llvm.org/D67259
llvm-svn: 371694
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I found three issues:
1. the loop over E[ABCD]X copies run over BB start
2. the direct address of cmpxchg8b could be a frame index
3. the displacement of cmpxchg8b could be a global instead of an
immediate
These were all introduced together in r287875, and should be fixed with
this change.
Issue reported by Zachary Turner.
llvm-svn: 371678
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
legalization
fp128 is considered a legal type for a register, but has almost no legal operations so everything needs to be converted to a libcall. Previously this was implemented by tricking type legalization into softening the operations with various checks for "is legal in hardware register" to change the behavior to still use f128 as the resulting type instead of converting to i128.
This patch abandons this approach and instead moves the libcall conversions to LegalizeDAG. This is the approach taken by AArch64 where they also have a legal fp128 type, but no legal operations. I think this is more in spirit with how SelectionDAG's phases are supposed to work.
I had to make some hacks for STRICT_FP_ROUND because some of the strict FP handling checks if ISD::FP_ROUND is Legal for a given result type, but I had to make ISD::FP_ROUND Custom to allow making a libcall when the input is f128. For all other types the Custom handler just returns the original node. These hacks are incomplete and don't work for a strict truncate from f128, but I don't think it worked before either since LegalizeFloatTypes doesn't know about strict ops yet. I've also raised PR43209 against AArch64 which currently crashes on a strict ftrunc from f64->f32 because of FP_ROUND being marked Custom for the same reason there.
Differential Revision: https://reviews.llvm.org/D67128
llvm-svn: 371672
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: dschuff, sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67443
llvm-svn: 371616
|
|
|
|
|
|
|
|
|
|
| |
for 32 bit signed, 32 bit unsigned, and 64 bit pointers."
This reverts 57076d3199fc2b0af4a3736b7749dd5462cacda5.
Original review at https://reviews.llvm.org/D64931.
Review for added fix at https://reviews.llvm.org/D66843.
llvm-svn: 371568
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isAtomic in addition to isVolatile
See D66309 for context.
This is the first sweep of x86 target specific code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time.
Sorry for the lack of tests. As discussed in the review, most of these are vector tests (for which atomicity is not well defined) and I couldn't figure out to exercise the anyextend cases which aren't vector specific.
Differential Revision: https://reviews.llvm.org/D66322
llvm-svn: 371547
|
|
|
|
| |
llvm-svn: 371487
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an alternative to D66980, which was reverted. Instead of
inserting a pseudo instruction that optionally expands to nothing, add a
pass that inserts int3 when appropriate after basic block layout.
Reviewers: hans
Differential Revision: https://reviews.llvm.org/D67201
llvm-svn: 371466
|