| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-x86-experimental-vector-widening-legalization by default."
The assert that caused this to be reverted should be fixed now.
Original commit message:
This patch changes our defualt legalization behavior for 16, 32, and
64 bit vectors with i8/i16/i32/i64 scalar types from promotion to
widening. For example, v8i8 will now be widened to v16i8 instead of
promoted to v8i16. This keeps the elements widths the same and pads
with undef elements. We believe this is a better legalization strategy.
But it carries some issues due to the fragmented vector ISA. For
example, i8 shifts and multiplies get widened and then later have
to be promoted/split into vXi16 vectors.
This has the potential to cause regressions so we wanted to get
it in early in the 10.0 cycle so we have plenty of time to
address them.
Next steps will be to merge tests that explicitly test the command
line option. And then we can remove the option and its associated
code.
llvm-svn: 368183
|
|
|
|
|
|
| |
Don't attempt to merge loads for types that aren't modulo 8-bits.
llvm-svn: 368165
|
|
|
|
|
|
|
|
| |
We have aliases that disambiguate memory forms of bt/btc/btr/bts
without suffixes to the 32-bit form. These aliases should have
been updated when the instructions were updated in r356413.
llvm-svn: 368127
|
|
|
|
| |
llvm-svn: 368126
|
|
|
|
|
|
|
|
|
|
| |
The upper 4 bits of the immediate byte are used to encode a
register. We need to limit the explicit immediate to fit in the
remaining 4 bits.
Fixes PR42899.
llvm-svn: 368123
|
|
|
|
|
|
|
|
|
| |
This reverts commit 3de33245d2c992c9e0af60372043540b60f3a810.
This commit broke the MSan buildbots. See
https://reviews.llvm.org/rL367901 for more information.
llvm-svn: 368107
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
legalization.
If we're after type legalization we should only be trying to turn
v2i64 into v2i32. So bitcast to v4i32, shuffle the even elements
together. Then use X86ISD::CVTSI2P. The alternative is to leave
the v2i64 type alone and let it scalarized. Hopefully keeping
it packed is better.
Fixes PR42905.
llvm-svn: 368091
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Cleans X86.td's Barcelona entry to be more like the others,
by moving the features out of the `Proc<>`, thus potentially
making it possible to inherit from them.
Split off from D63628
Reviewers: craig.topper, RKSimon
Reviewed By: craig.topper
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65791
llvm-svn: 368061
|
|
|
|
|
|
| |
This mainly helps to replace unused arguments with UNDEF in the case where they have multiple users.
llvm-svn: 368026
|
|
|
|
|
|
|
|
| |
If we don't demand any non-undef shuffle elements then the assert will fail as all shuffle inputs would still be flagged as 'identity' safe.
Exposed by an incoming patch.
llvm-svn: 368022
|
|
|
|
|
|
| |
As mentioned on D65047 / rL366933 the plan is to enable partial reduction handling wherever possible.
llvm-svn: 368016
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before this patch MGATHER/MSCATTER is capable of representing all
common addressing modes, but only when illegal types are used.
This patch adds an IndexType property so more representations
are available when using legal types only.
Original modes:
vector of bases
base + vector of signed scaled offsets
New modes:
base + vector of signed unscaled offsets
base + vector of unsigned scaled offsets
base + vector of unsigned unscaled offsets
The current behaviour of addressing modes for gather/scatter remains
unchanged.
Patch by Paul Walker.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D65636
llvm-svn: 368008
|
|
|
|
|
|
|
|
|
| |
isIncomingArgumentHandler()
Previous name and comment incorrectly implied it was just for formal arg handlers,
which is not true.
llvm-svn: 367945
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes our defualt legalization behavior for 16, 32, and
64 bit vectors with i8/i16/i32/i64 scalar types from promotion to
widening. For example, v8i8 will now be widened to v16i8 instead of
promoted to v8i16. This keeps the elements widths the same and pads
with undef elements. We believe this is a better legalization strategy.
But it carries some issues due to the fragmented vector ISA. For
example, i8 shifts and multiplies get widened and then later have
to be promoted/split into vXi16 vectors.
This has the potential to cause regressions so we wanted to get
it in early in the 10.0 cycle so we have plenty of time to
address them.
Next steps will be to merge tests that explicitly test the command
line option. And then we can remove the option and its associated
code.
llvm-svn: 367901
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test case is based on the example from the post-commit thread for:
https://reviews.llvm.org/rGc9171bd0a955
This replaces the x86-specific simple-type check from:
rL367766
with a check in the DAGCombiner. Adding the check isn't
strictly necessary after the fix from:
rL367768
...but it seems likely that we're heading for trouble if
we are creating weird types in this transform.
I combined the earlier legality check into the initial
clause to simplify the code.
So we should only try the trunc/sext transform at the
earliest combine stage, but we limit the transform to
simple types anyway because the TLI hook is probably
too lax about what it considers a free truncate.
llvm-svn: 367834
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, jfb, jakehehrlich
Reviewed By: jfb
Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65514
llvm-svn: 367828
|
|
|
|
|
|
| |
shuffle combining from running with -x86-experimental-vector-widening-legalization.
llvm-svn: 367798
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vector of INSERT_SUBVECTOR with undef if none of the elements are demanded even if the node has other users.
Summary:
The SimplifyDemandedVectorElts function can replace with undef
when no elements are demanded, but due to how it interacts with
TargetLoweringOpts, it can only do this when the node has
no other users.
Remove a now unneeded DAG combine from the X86 backend.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65713
llvm-svn: 367788
|
|
|
|
|
|
| |
for ANY_EXTEND shuffles
llvm-svn: 367784
|
|
|
|
| |
llvm-svn: 367783
|
|
|
|
| |
llvm-svn: 367782
|
|
|
|
|
|
|
|
|
| |
INSERTPS nodes.
This is the type listed in the type constraint for isel. But since
we list a type there, it doesn't get checked during isel matching.
llvm-svn: 367775
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids the crash from:
https://bugs.llvm.org/show_bug.cgi?id=42880
...and I think it's a proper constraint for the TLI hook.
But that example raises questions about what happens to get us
into this situation (created i29 types) and what happens later
(why does legalization die on those types), so I'm not sure if
we will resolve the bug based on this change.
llvm-svn: 367766
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
An inline asm call can result in an immediate after inlining. Therefore emit a
diagnostic here if constraint requires an immediate but one isn't supplied.
Reviewers: joerg, mgorny, efriedma, rsmith
Reviewed By: joerg
Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, MaskRay, jyknight, dylanmckay, javed.absar, fedor.sergeev, jrtc27, Jim, krytarowski, eraman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60942
llvm-svn: 367750
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsics.
This is consistent with the target independent intrinsic handling.
Not sure this really matters since we just pull the constant out
using getZExtValue later.
llvm-svn: 367736
|
|
|
|
| |
llvm-svn: 367683
|
|
|
|
|
|
| |
llvm::Register as started by r367614. NFC
llvm-svn: 367633
|
|
|
|
|
|
|
|
|
|
|
|
| |
multiply is legal
If a type is larger than a legal type and needs to be split, we would previously allow the multiply to be decomposed even if the split multiply is legal. Since the shift + add/sub code would also need to be split, its not any better to decompose it.
This patch figures out what type the mul will eventually be legalized to and then uses that type for the query. I tried just returning false illegal types and letting them get handled after type legalization, but then we can't recognize and i64 constant splat on 32-bit targets since will be destroyed by type legalization. We could special case vectors of i64 to avoid that...
Differential Revision: https://reviews.llvm.org/D65533
llvm-svn: 367601
|
|
|
|
|
|
| |
We should probably extend this to cover bitcasts as well to help other cases in promote-vec3.ll.
llvm-svn: 367582
|
|
|
|
|
|
| |
This adds SimplifyMultipleUseDemandedBitsForTargetNode X86 support and uses it to allow us to peek through vector insertions to avoid dependencies on entire insertion chains.
llvm-svn: 367570
|
|
|
|
| |
llvm-svn: 367556
|
|
|
|
|
|
|
|
|
|
| |
and partial fix.
Causes windows buildbot errors.
This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and
53da7ca94343166ac68aef81db0398932fc258bb.
llvm-svn: 367496
|
|
|
|
|
|
|
|
|
|
|
|
| |
extractelement+store
We have custom code that ignores the normal promoting type legalization on less than 128-bit vector types like v4i8 to emit pavgb, paddusb, psubusb since we don't have the equivalent instruction on a larger element type like v4i32. If this operation appears before a store, we can be left with an any_extend_vector_inreg followed by a truncstore after type legalization. When truncstore isn't legal, this will normally be decomposed into shuffles and a non-truncating store. This will then combine away the any_extend_vector_inreg and shuffle leaving just the store. On avx512, truncstore is legal so we don't decompose it and we had no combines to fix it.
This patch adds a new DAG combine to detect this case and emit either an extract_store for 64-bit stoers or a extractelement+store for 32 and 16 bit stores. This makes the avx512 codegen match the avx2 codegen for these situations. I'm restricting to only when -x86-experimental-vector-widening-legalization is false. When we're widening we're not likely to create this any_extend_inreg+truncstore combination. This means we should be able to remove this code when we flip the default. I would like to flip the default soon, but I need to investigate some performance regressions its causing in our branch that I wasn't seeing on trunk.
Differential Revision: https://reviews.llvm.org/D65538
llvm-svn: 367488
|
|
|
|
|
|
|
| |
Preserve the nullptr default for KnownCallees that appears in
the base class.
llvm-svn: 367477
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This will make it possible to improve IPRA by taking into account
register usage in indirect calls.
NFC yet; this is just laying the groundwork to start building
up patches to take advantage of the information for improved register
allocation.
Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette
Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65488
llvm-svn: 367476
|
|
|
|
|
|
|
|
|
| |
The build failure found after the rL365467 has been
resolved.
Differential Revision: https://reviews.llvm.org/D60716
llvm-svn: 367446
|
|
|
|
|
|
| |
Makes it available for more combines to use without adding declarations.
llvm-svn: 367436
|
|
|
|
|
|
| |
Before combining insert_subvector(insert_subvector(vec, sub0, c0), sub1, c1) patterns, ensure that the subvectors are all the same type. On AVX512 targets especially we might have a mixture of 128/256 subvector insertions.
llvm-svn: 367429
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This emits labels around heapallocsite calls in SelectionDAG.
Reviewers: rnk
Subscribers: MatzeB, aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61105
llvm-svn: 367374
|
|
|
|
|
|
| |
The code is matching sext not zext.
llvm-svn: 367357
|
|
|
|
|
|
|
|
| |
resolveTargetShuffleInputs not getTargetShuffleMask
Add TODO comment.
llvm-svn: 367318
|
|
|
|
|
|
|
|
|
|
| |
X86ISD::SUBV_BROADCAST source (PR42819)
PR42819 showed an issue that we couldn't handle the case where we demanded a 'sub-sub-vector' of the SUBV_BROADCAST 'sub-vector' source.
This patch recognizes these cases and extracts the sub-sub-vector instead of trying to broadcast to a type smaller than the 'sub-vector' source.
llvm-svn: 367306
|
|
|
|
| |
llvm-svn: 367251
|
|
|
|
|
|
| |
Avoids slow downs from calls to ComputeNumSignBits/computeKnownBits going too deep.
llvm-svn: 367240
|
|
|
|
|
|
|
|
| |
As discussed on rL367171, we have a problem where the depth recursion used in combineX86ShufflesRecursively was subtly different to computeKnownBits etc. - it starts at Depth=1 instead of Depth=0 like the others and has a different maximum recursion depth.
This NFC patch fixes the recursion depth to start at 0, so we can more easily reuse depth values in calls from combineX86ShufflesRecursively and its helper functions in computeKnownBits etc.
llvm-svn: 367232
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inputs have an additional user.
The pmaddwd inserts a truncate, if that truncate would end up
creating additional instructions instead of making a zext
narrower, then we shouldn't do it.
I've restricted this to only sse4.1 targets since on prior
targets the zext will be done in stages. So the truncate will
probably not create additional instructions. Might need some
more investigation of mul shrinking and the other pmaddwd
transform to be sure this is the right decision.
There might be a slight regression on AVX1 targets due to add
splitting. Hard to say for sure. Maybe we need to look into
using the vector reduction flag to use 2 narrow loads and a
blend instead of extracting and inserting.
llvm-svn: 367198
|
|
|
|
|
|
|
|
|
|
| |
vector reduction flag on the final add. Handle unrolled loops by letting DAG combine revisit.
This reverts r340478 and r340631 and replaces them with a simpler
method of just letting DAG combine revisit the nodes to handle
the other operand.
llvm-svn: 367195
|
|
|
|
|
|
|
|
| |
SimplifyMultipleUseDemandedBits handler (Reapplied)
Recommit rL367100 which was reverted at rL367141. Until PR42777 is fixed, we no longer get the benefits of peeking through bitcasts but it does still remove a GetDemandedBits user and gives us the equivalent combines.
llvm-svn: 367172
|
|
|
|
|
|
|
|
|
| |
SimplifyMultipleUseDemandedBits handler."
This reverts r367100, it appears to be causing test failures after
Nico's revert of r367091.
llvm-svn: 367141
|
|
|
|
|
|
|
|
| |
SimplifyMultipleUseDemandedBits handler.
This removes a GetDemandedBits user and allows us to benefit from the DemandedElts propagated through SimplifyDemandedBits.
llvm-svn: 367100
|