| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
All three 256->128 bit cases were already handled above.
Noticed while looking at the coverage report.
llvm-svn: 368609
| |
truncating stores.
If we're after type legalization, we should make sure we won't create
a store with an illegal type when we separate the AVG pattern
from the truncating store.
I don't know of a way to fail for this today. Just noticed while
I was in the vicinity.
llvm-svn: 368608
| |
We just need to check if the truncating store is legal
instead of going through isSATValidOnAVX512Subtarget.
llvm-svn: 368607
| |
We have no custom trunc stores on X86.
llvm-svn: 368606
| |
prefer-vector-width=256 and min-legal-vector-width=256.
Under this config, the v16f32 type we try to use isn't assigned to a register
class, so the getRegClassFor call will fail.
llvm-svn: 368594
| |
llvm-svn: 368558
| |
If we have SSE2 we can handle any i8/i16 type and let
type legalization deal with it.
llvm-svn: 368538
| |
Target independent type legalization and custom lowering
should be able to handle it.
llvm-svn: 368537
| |
with widening legalization.
llvm-svn: 368523
| |
with widening legalization.
The test case that changed is probably better served through
allowing combineTruncatedArithmetic to create narrow vectors. It
also appears InstCombine would have simplified this test case
to remove the zext and trunc anyway.
llvm-svn: 368522
| |
On SSE41+ targets we always lower vector shuffles to ZERO_EXTEND_VECTOR_INREG, even if we don't need the extended bits.
This patch relaxes this so that we lower to ANY_EXTEND_VECTOR_INREG if we can, meaning that shuffle combines have a better idea of what elements need to be kept zero. This helps the multiple reduction code as we can now combine away a lot more of the pack+extend codes.
Differential Revision: https://reviews.llvm.org/D65741
llvm-svn: 368515
| |
isn't legal
Summary:
This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. This only does the recognition for a few cases where it's obvious the input won't be scalarized, which would result in building a vector just to do the movmsk. I've made it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that.
This fixes the case from PR42870. But it's probably still broken in the presence of logic ops feeding the movmsk pattern, which would further hide the v4f32 type.
Reviewers: spatel, RKSimon, xbolva00
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65689
llvm-svn: 368506
| |
Summary:
On Windows, if the frame size exceeds 4096 bytes, the compiler needs to
generate a call to _alloca_probe. The X86CallFrameOptimization pass
changes the reserved stack size, which causes the stack probe function
not to be inserted. This patch fixes the issue by detecting the call
frame size; if the size exceeds 4096 bytes, X86CallFrameOptimization is skipped.
Reviewers: craig.topper, wxiao3, annita.zhang, rnk, RKSimon
Reviewed By: rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65923
llvm-svn: 368503
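The rule described in the commit above can be sketched as a pair of predicates. This is a hypothetical model rather than code from the pass: the names `kProbeThresholdBytes`, `needsStackProbe` and `mayOptimizeCallFrame` are invented for illustration, and the strict `>` comparison simply follows the wording "exceeds 4096 bytes".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: the 4096-byte threshold quoted in the commit message.
constexpr std::uint64_t kProbeThresholdBytes = 4096;

// True when a frame of this size would need a stack-probe call on Windows.
// Assumption: "exceeds" is modeled as a strict greater-than comparison.
constexpr bool needsStackProbe(std::uint64_t frameBytes) {
  return frameBytes > kProbeThresholdBytes;
}

// The call-frame optimization is allowed only when no probe is needed, so
// that changing the reserved stack size cannot suppress the probe call.
constexpr bool mayOptimizeCallFrame(std::uint64_t frameBytes) {
  return !needsStackProbe(frameBytes);
}
```

Under these assumptions, a 64-byte frame may still be optimized, while an 8192-byte frame keeps its probe and skips the optimization.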
| |
llvm-svn: 368486
| |
We don't appear to need this with widening legalization.
llvm-svn: 368479
| |
As discussed on PR42825, if we are inverting the selection mask we can just swap the inputs and avoid the inversion.
Differential Revision: https://reviews.llvm.org/D65522
llvm-svn: 368438
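The identity behind the commit above is that selecting with an inverted mask gives the same result as selecting with the original mask and swapped inputs. A minimal scalar sketch, assuming a plain bitwise select rather than the actual X86 blend nodes (all function names here are illustrative):

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select: take bits from a where mask is 1, from b where mask is 0.
constexpr std::uint32_t bitSelect(std::uint32_t mask, std::uint32_t a,
                                  std::uint32_t b) {
  return (mask & a) | (~mask & b);
}

// Form before the change: invert the mask, then select.
constexpr std::uint32_t selectWithInvertedMask(std::uint32_t mask,
                                               std::uint32_t a,
                                               std::uint32_t b) {
  return bitSelect(~mask, a, b);
}

// Form after the change: keep the mask and swap the two inputs.
constexpr std::uint32_t selectWithSwappedInputs(std::uint32_t mask,
                                                std::uint32_t a,
                                                std::uint32_t b) {
  return bitSelect(mask, b, a);
}
```

Both forms expand to `(~mask & a) | (mask & b)`, so the inversion can be dropped at no cost.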
| |
We shouldn't form trunc stores that need to be expanded now that
we are using widening legalization.
llvm-svn: 368400
| |
I believe PR34584 was tracking that FIXME, but it's since been
closed and a test case was added.
llvm-svn: 368397
| |
masked store.
The only way to generate these was through promoting legalization
of narrow vectors, but we widen those types now. So we shouldn't
produce these nodes.
llvm-svn: 368396
| |
TypeWidenVector check from code that handles X86ISD::VPMADDWD and X86ISD::AVG.
More unneeded code since we now legalize narrow vectors by widening.
llvm-svn: 368395
| |
This is no longer needed since we widen v2i32 instead of promoting.
llvm-svn: 368394
| |
handling in LowerStore now that v2i32 is widened to v4i32.
llvm-svn: 368390
| |
ReplaceNodeResults/LowerMSCATTER now that v2i32 is also widened like v2f32.
llvm-svn: 368389
| |
ReplaceNodeResults.
We rely on the generic type legalizer for this now.
llvm-svn: 368388
| |
to only handle widening.
llvm-svn: 368387
| |
SIGN_EXTEND/ZERO_EXTEND/TRUNCATE for vectors to only handle widening.
llvm-svn: 368386
| |
vectors to only handle widening.
llvm-svn: 368385
| |
handling code.
We now widen illegal vector types so we don't need this anymore.
llvm-svn: 368384
| |
avx512vl, avx512bw, min-legal-vector-width<=256 and prefer-vector-width=256
Under this configuration we'll want to split the v8i64 or v16i32 into two vectors. The default legalization will try to truncate each of those 256-bit pieces one step to 128-bit, concatenate those, then truncate one more time from the new 256 to 128 bits.
With this patch we now truncate the two splits to 64 bits, then concatenate those. We have to do this two different ways depending on whether we have widening legalization enabled. Without widening legalization we have to manually construct X86ISD::VTRUNC to prevent the ISD::TRUNCATE with a narrow result being promoted to 128 bits with a larger element type than what we want, followed by something like a pshufb to grab the lower half of each element to finish the job. With widening legalization we just get the right thing. When we switch to widening by default we can just delete the other code path.
Differential Revision: https://reviews.llvm.org/D65626
llvm-svn: 368349
| |
If the target shuffle mask is from a wider type, attempt to scale the mask so that the extraction can attempt to peek through.
Fixes the regression mentioned in rL368307
llvm-svn: 368308
| |
DemandedElts mask
If we don't demand all elements, then attempt to combine to a simpler shuffle.
At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts.
The insertps-combine.ll regression is because XFormVExtractWithShuffleIntoLoad can't see through shuffles of different widths - this will be fixed in a follow-up commit.
llvm-svn: 368307
| |
We need to prefer INSERTPS with zeros over SHUFPS, but fall back to INSERTPS if that fails.
llvm-svn: 368292
| |
option and all its uses.
This option is now defaulted to true and we don't want to support
turning it off so remove the option.
llvm-svn: 368258
| |
llvm-svn: 368250
| |
with a fix to clear the SDNode map when SelectionDAG is cleared.
llvm-svn: 368230
| |
-mprefer-vector-width=256 is causing 512-bit vectors to be split
If we're splitting the 512-bit vector anyway and we have zero/sign bits, then we might as well use pack instructions to concat and truncate at once.
Differential Revision: https://reviews.llvm.org/D65904
llvm-svn: 368210
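The reasoning in the commit above is that a pack instruction concatenates two halves and narrows them in one step, and that this matches a plain truncation whenever the upper bits are known. A minimal sketch of the unsigned case, assuming two 8 x i16 inputs and a PACKUSWB-style unsigned saturation (the sign-bit case would use a signed-saturating pack, which is not modeled here; all type and function names are illustrative):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V8i16 = std::array<std::int16_t, 8>;
using V16i8 = std::array<std::uint8_t, 16>;

// Saturate a signed 16-bit value to the unsigned 8-bit range [0, 255].
inline std::uint8_t saturateToU8(std::int16_t v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return static_cast<std::uint8_t>(v);
}

// Model of an unsigned-saturating pack: lo fills lanes 0..7, hi fills 8..15.
inline V16i8 packUnsignedSaturate(const V8i16 &lo, const V8i16 &hi) {
  V16i8 out{};
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = saturateToU8(lo[i]);
    out[i + 8] = saturateToU8(hi[i]);
  }
  return out;
}

// Reference: concatenate lo and hi, then keep the low 8 bits of each lane.
inline V16i8 concatThenTruncate(const V8i16 &lo, const V8i16 &hi) {
  V16i8 out{};
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = static_cast<std::uint8_t>(lo[i]);
    out[i + 8] = static_cast<std::uint8_t>(hi[i]);
  }
  return out;
}
```

The two functions agree whenever every input lane lies in [0, 255], that is, when the upper 8 bits are zero. They differ for a lane such as 300, where the pack saturates and the truncation wraps, which is why the known-bits condition matters.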
| |
-x86-experimental-vector-widening-legalization by default."
The assert that caused this to be reverted should be fixed now.
Original commit message:
This patch changes our default legalization behavior for 16, 32, and
64 bit vectors with i8/i16/i32/i64 scalar types from promotion to
widening. For example, v8i8 will now be widened to v16i8 instead of
promoted to v8i16. This keeps the element widths the same and pads
with undef elements. We believe this is a better legalization strategy.
But it carries some issues due to the fragmented vector ISA. For
example, i8 shifts and multiplies get widened and then later have
to be promoted/split into vXi16 vectors.
This has the potential to cause regressions so we wanted to get
it in early in the 10.0 cycle so we have plenty of time to
address them.
Next steps will be to merge tests that explicitly test the command
line option. And then we can remove the option and its associated
code.
llvm-svn: 368183
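The difference between the two legalization strategies in the commit above comes down to which dimension of the vector type grows. A small sketch of that type arithmetic, assuming a 128-bit target width and a hypothetical `VecType` struct (this is not LLVM's EVT/MVT API):

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type such as v8i8 (8 elements, 8 bits).
struct VecType {
  unsigned numElts;
  unsigned eltBits;
};

constexpr bool operator==(VecType a, VecType b) {
  return a.numElts == b.numElts && a.eltBits == b.eltBits;
}

constexpr unsigned totalBits(VecType t) { return t.numElts * t.eltBits; }

// Widening: keep the element width and add (undef) elements up to 128 bits.
constexpr VecType widenTo128(VecType t) { return {128 / t.eltBits, t.eltBits}; }

// Promotion: keep the element count and grow each element up to 128 bits.
constexpr VecType promoteTo128(VecType t) {
  return {t.numElts, 128 / t.numElts};
}
```

With these definitions `widenTo128({8, 8})` yields v16i8 and `promoteTo128({8, 8})` yields v8i16, matching the example in the commit message. Likewise v2i32 widens to v4i32 but would promote to v2i64.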
| |
Don't attempt to merge loads for types whose size isn't a multiple of 8 bits.
llvm-svn: 368165
| |
This reverts commit 3de33245d2c992c9e0af60372043540b60f3a810.
This commit broke the MSan buildbots. See
https://reviews.llvm.org/rL367901 for more information.
llvm-svn: 368107
| |
legalization.
If we're after type legalization we should only be trying to turn
v2i64 into v2i32. So bitcast to v4i32, shuffle the even elements
together. Then use X86ISD::CVTSI2P. The alternative is to leave
the v2i64 type alone and let it be scalarized. Hopefully keeping
it packed is better.
Fixes PR42905.
llvm-svn: 368091
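The bitcast-and-shuffle step described in the commit above can be sketched on plain arrays. This models only the v2i64 to v2i32 truncation, not the subsequent integer-to-FP conversion, and it assumes little-endian lane order as on x86. The type aliases and function names are illustrative.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V2i64 = std::array<std::uint64_t, 2>;
using V4i32 = std::array<std::uint32_t, 4>;
using V2i32 = std::array<std::uint32_t, 2>;

// Little-endian bitcast: each i64 becomes (low half, high half) in lane order.
inline V4i32 bitcastToV4i32(const V2i64 &v) {
  return V4i32{{static_cast<std::uint32_t>(v[0]),
                static_cast<std::uint32_t>(v[0] >> 32),
                static_cast<std::uint32_t>(v[1]),
                static_cast<std::uint32_t>(v[1] >> 32)}};
}

// Shuffle the even lanes (0 and 2) together; these hold the low halves.
inline V2i32 takeEvenLanes(const V4i32 &v) { return V2i32{{v[0], v[2]}}; }

// Truncation of v2i64 to v2i32 expressed as bitcast + shuffle.
inline V2i32 truncViaShuffle(const V2i64 &v) {
  return takeEvenLanes(bitcastToV4i32(v));
}
```

The result equals truncating each 64-bit element to its low 32 bits, while the data stays in a packed vector throughout.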
| |
This mainly helps to replace unused arguments with UNDEF in the case where they have multiple users.
llvm-svn: 368026
| |
If we don't demand any non-undef shuffle elements then the assert will fail as all shuffle inputs would still be flagged as 'identity' safe.
Exposed by an incoming patch.
llvm-svn: 368022
| |
As mentioned on D65047 / rL366933 the plan is to enable partial reduction handling wherever possible.
llvm-svn: 368016
| |
Summary:
Before this patch MGATHER/MSCATTER is capable of representing all
common addressing modes, but only when illegal types are used.
This patch adds an IndexType property so more representations
are available when using legal types only.
Original modes:
vector of bases
base + vector of signed scaled offsets
New modes:
base + vector of signed unscaled offsets
base + vector of unsigned scaled offsets
base + vector of unsigned unscaled offsets
The current behaviour of addressing modes for gather/scatter remains
unchanged.
Patch by Paul Walker.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D65636
llvm-svn: 368008
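The four "base + vector of offsets" modes listed in the commit above differ only in how each index lane is extended and whether it is multiplied by a scale. A per-lane sketch, assuming 32-bit indices and 64-bit addresses; the `IndexKind` enum and `laneAddress` function are hypothetical names for illustration, and the "vector of bases" mode is not modeled:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for the four base-plus-offset modes in the commit message.
enum class IndexKind {
  SignedScaled,
  SignedUnscaled,
  UnsignedScaled,
  UnsignedUnscaled
};

// Address of one lane: base + extend(index) * (scale if scaled, else 1).
// Arithmetic is done modulo 2^64, so negative offsets wrap as expected.
constexpr std::uint64_t laneAddress(std::uint64_t base, std::uint32_t rawIndex,
                                    IndexKind kind, std::uint64_t scale) {
  const bool isSigned = kind == IndexKind::SignedScaled ||
                        kind == IndexKind::SignedUnscaled;
  const bool isScaled = kind == IndexKind::SignedScaled ||
                        kind == IndexKind::UnsignedScaled;
  // Sign- or zero-extend the 32-bit index to 64 bits.
  const std::uint64_t extended =
      isSigned ? static_cast<std::uint64_t>(static_cast<std::int64_t>(
                     static_cast<std::int32_t>(rawIndex)))
               : static_cast<std::uint64_t>(rawIndex);
  return base + extended * (isScaled ? scale : 1);
}
```

For the raw index 0xFFFFFFFF with base 0x1000 and scale 4, the signed modes treat the index as -1 (addresses 0xFFC scaled, 0xFFF unscaled), while the unsigned modes treat it as 4294967295. This shows why the distinction has to be recorded once the index type can no longer be widened freely.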
| |
This patch changes our default legalization behavior for 16, 32, and
64 bit vectors with i8/i16/i32/i64 scalar types from promotion to
widening. For example, v8i8 will now be widened to v16i8 instead of
promoted to v8i16. This keeps the element widths the same and pads
with undef elements. We believe this is a better legalization strategy.
But it carries some issues due to the fragmented vector ISA. For
example, i8 shifts and multiplies get widened and then later have
to be promoted/split into vXi16 vectors.
This has the potential to cause regressions so we wanted to get
it in early in the 10.0 cycle so we have plenty of time to
address them.
Next steps will be to merge tests that explicitly test the command
line option. And then we can remove the option and its associated
code.
llvm-svn: 367901
| |
The test case is based on the example from the post-commit thread for:
https://reviews.llvm.org/rGc9171bd0a955
This replaces the x86-specific simple-type check from:
rL367766
with a check in the DAGCombiner. Adding the check isn't
strictly necessary after the fix from:
rL367768
...but it seems likely that we're heading for trouble if
we are creating weird types in this transform.
I combined the earlier legality check into the initial
clause to simplify the code.
So we should only try the trunc/sext transform at the
earliest combine stage, but we limit the transform to
simple types anyway because the TLI hook is probably
too lax about what it considers a free truncate.
llvm-svn: 367834
| |
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, jfb, jakehehrlich
Reviewed By: jfb
Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65514
llvm-svn: 367828
| |
shuffle combining from running with -x86-experimental-vector-widening-legalization.
llvm-svn: 367798
| |
vector of INSERT_SUBVECTOR with undef if none of the elements are demanded even if the node has other users.
Summary:
The SimplifyDemandedVectorElts function can replace with undef
when no elements are demanded, but due to how it interacts with
TargetLoweringOpts, it can only do this when the node has
no other users.
Remove a now unneeded DAG combine from the X86 backend.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65713
llvm-svn: 367788
| |
for ANY_EXTEND shuffles
llvm-svn: 367784
|