| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
CVT2MASK is just checking the sign bit which can be represented with a comparison with zero.
llvm-svn: 321985
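The equivalence stated above can be checked with a small model. This is only an illustrative sketch, not the backend code: the function names are hypothetical, and each lane is assumed to be a signed integer that fits in `bits` bits.

```python
def sign_bit_mask(lanes, bits=32):
    """Build a mask whose bit i is the sign (top) bit of lane i."""
    mask = 0
    for i, lane in enumerate(lanes):
        raw = lane & ((1 << bits) - 1)          # two's-complement bit pattern
        mask |= ((raw >> (bits - 1)) & 1) << i  # copy the top bit into mask bit i
    return mask


def less_than_zero_mask(lanes):
    """Build a mask whose bit i is set when lane i compares less than zero."""
    mask = 0
    for i, lane in enumerate(lanes):
        mask |= int(lane < 0) << i
    return mask


lanes = [5, -1, 0, -2147483648]
assert sign_bit_mask(lanes) == less_than_zero_mask(lanes)
```

Both helpers produce the same mask for any in-range input, which is why the sign-bit extraction can be expressed as a signed compare against zero.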
|
| |
|
|
|
|
| |
128/256-bit compares when VLX is not available.
llvm-svn: 321984
|
| |
|
|
|
|
| |
ability to constant fold vector SIGN_EXTEND.
llvm-svn: 321979
|
| |
|
|
| |
llvm-svn: 321978
|
| |
|
|
|
|
| |
I had removed the qualifiers around the autogenerated folding table so I could compare with the manual table, but didn't intend to commit the change.
llvm-svn: 321971
|
| |
|
|
|
|
|
|
| |
SIGN_EXTEND_INREG nodes created during legalization of v2i1/v4i1 masks on KNL.
v2i1/v4i1 are now legal on KNL so no sign_extend_inreg is generated.
llvm-svn: 321968
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There are a few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX, particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since we need to fill the upper bits of the mask, we have to fill with 0s at the promoted type.
It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway; everything is done on a larger register.
This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly.
We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added.
I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try to create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then we extract that and don't sign extend it at all.
There's definitely room for improvement with some follow up patches.
Reviewers: RKSimon, zvi, guyblank
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41560
llvm-svn: 321967
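Why the widened mask must be filled with 0s can be illustrated with a toy model of a masked load. This is a sketch under stated assumptions, not the legalizer's logic: masks are modelled as Python integers (bit i is lane i), and all function names are hypothetical.

```python
def widen_mask(mask, narrow_lanes, wide_lanes):
    """Widen a narrow lane mask, forcing every new upper lane to 0."""
    assert narrow_lanes <= wide_lanes
    return mask & ((1 << narrow_lanes) - 1)


def masked_load(memory, mask, lanes, passthru):
    """Load lane i from memory when mask bit i is set, else keep passthru[i]."""
    return [memory[i] if (mask >> i) & 1 else passthru[i] for i in range(lanes)]


def widened_v4_load(memory, mask4, passthru4):
    """Perform a 4-lane masked load through a 16-lane operation."""
    mask16 = widen_mask(mask4, 4, 16)
    passthru16 = passthru4 + [0] * 12
    # The upper 12 lanes are masked off, so memory past lane 3 is never read.
    return masked_load(memory, mask16, 16, passthru16)[:4]


memory = [10, 20, 30, 40]
assert widened_v4_load(memory, 0b0101, [1, 2, 3, 4]) == [10, 2, 30, 4]
```

In this model a set bit in an upper lane would index past the end of `memory`, which is the analogue of the wide operation touching data the narrow operation never asked for.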
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ppc_is_decremented_ctr_nonzero
Summary:
I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work.
There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed.
With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: nemanjai, llvm-commits, kbarton
Differential Revision: https://reviews.llvm.org/D41654
llvm-svn: 321959
|
| |
|
|
| |
llvm-svn: 321958
|
| |
|
|
|
|
| |
The instructions that load 64-bits or an xmm register should be TB_NO_REVERSE to avoid the load being widened during unfold. The instructions that load 128-bits need to ensure 128-bit alignment.
llvm-svn: 321956
|
| |
|
|
|
|
| |
folding table.
llvm-svn: 321955
|
| |
|
|
|
|
|
|
| |
folding tables.
EVEX_B means different things for memory and register forms. The instructions should not be considered equivalent.
llvm-svn: 321954
|
| |
|
|
| |
llvm-svn: 321953
|
| |
|
|
| |
llvm-svn: 321952
|
| |
|
|
| |
llvm-svn: 321951
|
| |
|
|
|
|
|
|
| |
versions of cvtps2ph to the store folding tables.
The memory form of the xmm->xmm version only writes 64-bits. If we use it in the folding tables and it gets used for a stack spill, only half the slot will be written. Then a reload may read all 128-bits, which will pull in garbage. But without the spill the upper bits of the register would have been zero. By not folding we preserve the zeros.
llvm-svn: 321950
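The spill hazard described above can be modelled with byte buffers. This is a simplified sketch, not the register allocator's behaviour: the stack slot is a 16-byte `bytearray` holding stale data, and the helper names are hypothetical.

```python
def spill_folded(slot, half_result):
    """Folded store: the memory form writes only the low 8 bytes of the slot."""
    slot[:8] = half_result
    return bytes(slot)  # a 128-bit reload reads all 16 bytes


def spill_unfolded(slot, half_result):
    """Unfolded: the register form zeroes the upper half, then all 16 bytes are spilled."""
    register = half_result + bytes(8)  # upper 64 bits are zero
    slot[:] = register
    return bytes(slot)


result = bytes(range(1, 9))    # stand-in for the 64-bit conversion result
stale = b"\xaa" * 16           # whatever was in the stack slot before

folded = spill_folded(bytearray(stale), result)
unfolded = spill_unfolded(bytearray(stale), result)

assert folded[8:] == b"\xaa" * 8   # reload sees stale bytes in the upper half
assert unfolded[8:] == bytes(8)    # reload sees the expected zeros
```

The low 8 bytes agree in both cases; only the upper half differs, which is exactly the part a 128-bit reload would expose.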
|
| |
|
|
| |
llvm-svn: 321949
|
| |
|
|
|
|
|
|
| |
We don't do fine grained feature control like this on features prior to AVX512.
We do still have checks in place in the assembly parser itself that prevent %zmm references or %xmm16-31 from being parsed without at least -mattr=avx512f. The same applies to rounding control and mask operands. That will prevent the table matcher from matching any instructions that need those features, and that's probably good enough.
llvm-svn: 321947
|
| |
|
|
|
|
|
|
| |
matcher table.
This is also needed to fix PR35837.
llvm-svn: 321946
|
| |
|
|
| |
llvm-svn: 321945
|
| |
|
|
|
|
|
|
| |
isel pattern that only existed for the assembler. Use VCVTTSD2SIrrb_Int instead.
For consistency, use the _Int versions, VCVTTSD2SIrr_Int and VCVTTSD2SIrm_Int, for the assembler as well.
llvm-svn: 321944
|
| |
|
|
|
|
|
|
|
|
| |
assembler matcher table
We should always prefer the VEX encoded version of these instructions. There is no advantage to the EVEX version.
Fixes PR35837.
llvm-svn: 321939
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the last step needed to fix PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325
We're trading branches and compares for loads and logic ops.
This makes the code smaller and hopefully faster in most cases.
The 24-byte test shows an interesting construct: we load the trailing scalar
elements into vector registers and generate the same pcmpeq+movmsk code that
we expected for a pair of full vector elements (see the 32- and 64-byte tests).
Differential Revision: https://reviews.llvm.org/D41714
llvm-svn: 321934
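The idea of replacing per-chunk branches with loads and logic ops can be sketched for an equality-only comparison. This is an illustration of the general technique, not the generated pcmpeq+movmsk sequence: it uses 8-byte scalar chunks, assumes `size` is a multiple of 8, and the function name is hypothetical. The Python loop stands in for a fully unrolled sequence.

```python
import struct


def equal_without_chunk_branches(a, b, size):
    """Compare `size` bytes for equality with a single final test."""
    diff = 0
    for off in range(0, size, 8):
        (x,) = struct.unpack_from("<Q", a, off)  # load 8 bytes from each buffer
        (y,) = struct.unpack_from("<Q", b, off)
        diff |= x ^ y                            # accumulate any differing bits
    return diff == 0                             # one compare at the very end


a = bytes(range(24))
b = bytearray(a)
assert equal_without_chunk_branches(a, b, 24)
b[23] ^= 0x01
assert not equal_without_chunk_branches(a, b, 24)
```

Every chunk is always loaded, so the work no longer depends on where the first mismatch occurs; that is the trade of branches for loads described in the message.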
|
| |
|
|
|
|
| |
This makes the names consistent with the mnemonics like every other instruction.
llvm-svn: 321931
|
| |
|
|
|
|
| |
we don't crash when trying to print an error message using it.
llvm-svn: 321930
|
| |
|
|
|
|
| |
lowerV4I64VectorShuffle.
llvm-svn: 321929
|
| |
|
|
|
|
|
|
| |
instructions.
This matches their VEX equivalents.
llvm-svn: 321912
|
| |
|
|
|
|
| |
Recommit r321897 with updated testcases.
llvm-svn: 321908
|
| |
|
|
|
|
|
| |
Commit message:
[Hexagon] Add patterns for sext_inreg of HVX vector types
llvm-svn: 321904
|
| |
|
|
|
|
|
|
|
|
| |
instructions as well.
Without this we allow "vmovd %rax, %xmm0", but not "vmovd %rax, %xmm16".
This exists to stay compatible with a silly bug where really old versions of the GNU assembler required movd instead of movq on these instructions. This compatibility hack then crept forward to the AVX versions too, but we didn't propagate it to AVX512.
llvm-svn: 321903
|
| |
|
|
|
|
|
| |
Commit message:
[Hexagon] Even simpler patterns for sign- and zero-extending HVX vectors
llvm-svn: 321902
|
| |
|
|
|
|
|
|
| |
'movq' instead.
This behavior existed to work with an old version of the GNU assembler on MacOS that only accepted this form. Newer versions of the GNU assembler and the current LLVM-derived version of the assembler on MacOS support movq as well.
llvm-svn: 321898
|
| |
|
|
| |
llvm-svn: 321897
|
| |
|
|
|
|
| |
Only non-bool vectors.
llvm-svn: 321895
|
| |
|
|
| |
llvm-svn: 321894
|
| |
|
|
| |
llvm-svn: 321893
|
| |
|
|
| |
llvm-svn: 321892
|
| |
|
|
| |
llvm-svn: 321891
|
| |
|
|
|
|
| |
These arise because enums are 'int' by default.
llvm-svn: 321887
|
| |
|
|
|
|
|
|
|
|
|
|
| |
operands
Currently the assembler would accept, e.g. `ldr r0, [s0, #12]` and similar.
This patch adds checks that only general-purpose registers are used in address
operands, shifted registers, and shift amounts.
Differential revision: https://reviews.llvm.org/D39910
llvm-svn: 321866
|
| |
|
|
|
|
|
|
|
| |
Instead of using, for example, `dup v0.4s, wzr`, which transfers between
register files, use the more efficient `movi v0.4s, #0` instead.
Differential revision: https://reviews.llvm.org/D41515
llvm-svn: 321824
|
| |
|
|
|
|
| |
of integer.
llvm-svn: 321821
|
| |
|
|
|
|
|
|
|
| |
Wide Thumb2 instructions should be emitted into the object file as pairs of
16-bit words of the appropriate endianness, not one 32-bit word.
Differential revision: https://reviews.llvm.org/D41185
llvm-svn: 321799
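The difference between the two byte layouts can be shown with a short sketch. It assumes the 32-bit instruction value holds the first halfword in its upper 16 bits; the function name is hypothetical and `0x11223344` is just a recognisable bit pattern, not a real instruction.

```python
import struct


def emit_thumb2_wide(insn, little_endian=True):
    """Emit a 32-bit Thumb2 instruction as two 16-bit words, first halfword first."""
    first = (insn >> 16) & 0xFFFF
    second = insn & 0xFFFF
    fmt = "<HH" if little_endian else ">HH"
    return struct.pack(fmt, first, second)


insn = 0x11223344
as_halfwords = emit_thumb2_wide(insn, little_endian=True)
as_one_word = struct.pack("<I", insn)

assert as_halfwords == bytes([0x22, 0x11, 0x44, 0x33])
assert as_one_word == bytes([0x44, 0x33, 0x22, 0x11])
assert as_halfwords != as_one_word
```

On a big-endian target the two layouts happen to coincide, so the distinction only changes the emitted bytes when the target is little-endian.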
|
| |
|
|
| |
llvm-svn: 321798
|
| |
|
|
|
|
|
|
| |
Select G_PHI to PHI and manually constrain the result register. This is
very similar to how COPY is handled, so extract and reuse some of that
code.
llvm-svn: 321797
|
| |
|
|
|
|
|
| |
Mark G_PHI as Legal for s32 and p0, and also for s64 if we have hard
float. Widen any smaller types.
llvm-svn: 321795
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to handle G_CONSTANT with pointer type by forcing the type of
the result register to s32 and then letting TableGen handle it.
Unfortunately, setting the type only works for generic virtual
registers, that haven't yet been constrained to a register class (e.g.
those used only by a COPY later on). If the result register has already
been constrained as a use of a previously selected instruction, then
setting the type will assert.
It would be nice to be able to teach TableGen to select pointer
constants the same as integer constants, but since it's such an edge
case (at the moment the only pointer constant that we're generally
interested in is 0, and that is mostly used for comparisons and selects,
which are also not supported by TableGen) it's probably not worth the
effort right now. Instead, handle pointer constants with some trivial
handwritten code.
llvm-svn: 321793
|
| |
|
|
| |
llvm-svn: 321755
|
| |
|
|
| |
llvm-svn: 321752
|
| |
|
|
|
|
|
|
|
|
| |
This custom inserter was added in r124272, at which time it added a bunch of Defs for Win64. In r150708, those defs were removed, leaving only the "return BB". So I think this means the custom inserter is a NOP these days.
This patch removes the remaining code and stops tagging the instructions for custom insertion.
Differential Revision: https://reviews.llvm.org/D41671
llvm-svn: 321747
|