| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
| |
elements for scalar add,div,mul,sub,max,min intrinsics with masking and rounding.
These intrinsics don't read the upper bits of their second input. The third input is the passthru for masking, and it only uses the lower element as well.
llvm-svn: 289370
|
| |
|
|
| |
llvm-svn: 289368
|
| |
|
|
|
|
| |
to match 128 and 256-bit.
llvm-svn: 289354
|
| |
|
|
| |
llvm-svn: 289352
|
| |
|
|
|
|
| |
being able to constant fold them in InstCombineCalls like we do for 128/256-bit.
llvm-svn: 289350
|
| |
|
|
| |
llvm-svn: 289349
|
| |
|
|
|
|
| |
shufflevector if the indices are constant.
llvm-svn: 289348
|
| |
|
|
|
|
| |
This should've been removed in r289323.
llvm-svn: 289346
|
| |
|
|
|
|
| |
able to constant fold it in InstCombineCalls like we do for 128/256-bit.
llvm-svn: 289344
|
| |
|
|
|
|
|
| |
These are currently limited to integer types, but we should
be able to extend to splat vectors and possibly general vectors.
llvm-svn: 289343
|
| |
|
|
|
|
| |
LowerHorizontalByteSum
llvm-svn: 289341
|
| |
|
|
|
|
| |
select around the unmasked avx1 intrinsics.
llvm-svn: 289340
|
| |
|
|
|
|
|
|
| |
if/else chain. This should buy a little more time against the MSVC limit mentioned in PR31034.
The handlers for stores all return at the end of their block so they can be picked off early.
llvm-svn: 289339
|
| |
|
|
|
|
|
| |
This was failing when trying to fold immediates into operand 1 of a
phi, which only has one statically known operand.
llvm-svn: 289337
|
| |
|
|
|
|
|
|
| |
actually used.
Also fix the ZeroVector's type - I've no idea how this hasn't caused problems...
llvm-svn: 289336
|
| |
|
|
|
|
| |
vcvttps2uqq when AVX512DQ and AVX512VL are available.
llvm-svn: 289335
|
| |
|
|
| |
llvm-svn: 289334
|
| |
|
|
|
|
| |
single boolean flag passed to a helper function. Just check the opcode and create the flag.
llvm-svn: 289333
|
| |
|
|
| |
llvm-svn: 289332
|
| |
|
|
| |
llvm-svn: 289331
|
| |
|
|
|
|
|
|
| |
from 'large element' scalar/vector to 'small element' vector.
Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types.
llvm-svn: 289329
|
| |
|
|
| |
llvm-svn: 289326
|
| |
|
|
|
|
|
|
|
|
| |
There was a bug where we would hit an assertion if 'Q' was used as a
constraint.
I also removed hardcoded register names to prefer regexes so the tests
don't break when the register allocator changes.
llvm-svn: 289325
|
| |
|
|
|
|
|
| |
It looks like some time in the past, constraint codes were changed from
chars being passed around to enums.
llvm-svn: 289323
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This gets rid of the hardcoded 'r0' that was used previously.
Reviewers: asl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27567
llvm-svn: 289322
|
| |
|
|
|
|
| |
This would previously trigger an assertion error in AVRISelDAGToDAG.
llvm-svn: 289321
|
| |
|
| |
outer IR unit.
Summary:
This never really got implemented, and was very hard to test before
a lot of the refactoring changes to make things more robust. But now we
can test it thoroughly and cleanly, especially at the CGSCC level.
The core idea is that when an inner analysis manager proxy receives the
invalidation event for the outer IR unit, it needs to walk the inner IR
units and propagate it to the inner analysis manager for each of those
units. For example, each function in the SCC needs to get an
invalidation event when the SCC gets one.
The function / module interaction is somewhat boring here. This really
becomes interesting in the face of analysis-backed IR units. This patch
effectively handles all of the CGSCC layer's needs -- both invalidating
SCC analysis and invalidating function analysis when an SCC gets
invalidated.
However, this second aspect doesn't really handle the
LoopAnalysisManager well at this point. That one will need some change
of design in order to fully integrate, because unlike the call graph,
the entire function behind a LoopAnalysis's results can vanish out from
under us, and we won't even have a cached API to access. I'd like to try
to separate solving the loop problems into a subsequent patch though in
order to keep this more focused, so I've adapted them to the API and
updated the tests that immediately fail, but I've not added the level of
testing and validation at that layer that I have at the CGSCC layer.
An important aspect of this change is that the proxy for the
FunctionAnalysisManager at the SCC pass layer doesn't work like the
other proxies for an inner IR unit as it doesn't directly manage the
FunctionAnalysisManager and invalidation or clearing of it. This would
create an ever worsening problem of dual ownership of this
responsibility, split between the module-level FAM proxy and this
SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy
to work in terms of the module-level proxy and defer to it to handle
much of the updates. It only does SCC-specific invalidation. This will
become more important in subsequent patches that support more complex
invalidation scenarios.
Reviewers: jlebar
Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D27197
llvm-svn: 289317
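The core idea described in r289317 above (an inner analysis manager proxy walking the inner IR units when the outer unit is invalidated) can be sketched with a toy model. This is not LLVM's API: ToyInnerManager, ToyOuterUnit and ToyProxy are hypothetical names, and units are plain strings rather than real functions or SCCs.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an inner analysis manager: it only remembers which
// inner units (here, functions named by string) have cached results.
struct ToyInnerManager {
  std::set<std::string> Cached;
  void invalidate(const std::string &Unit) { Cached.erase(Unit); }
};

// Toy outer IR unit, e.g. an SCC, listing the inner units it contains.
struct ToyOuterUnit {
  std::vector<std::string> Members;
};

// Toy proxy: when the outer unit is invalidated, walk its inner units and
// forward the invalidation to the inner manager for each of them.
struct ToyProxy {
  ToyInnerManager *Inner;
  void invalidateOuter(const ToyOuterUnit &Outer) {
    assert(Inner && "proxy must wrap an inner manager");
    for (const std::string &Unit : Outer.Members)
      Inner->invalidate(Unit);
  }
};
```

In this model, invalidating an outer unit that contains "f" and "g" drops their cached entries and leaves unrelated functions alone, which is the per-function propagation the message describes for SCCs.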
|
| |
|
|
|
|
|
|
|
|
| |
cvttps2qq and cvttps2uqq intrinsics since there is a mismatch between number of input and output elements.
Ideally ISD::FP_TO_SINT and ISD::FP_TO_UINT would only be used for cases with the same number of input and output elements.
Similar things have already been done for other convert intrinsics.
llvm-svn: 289316
|
| |
|
|
|
|
|
|
|
|
| |
These should've been checking whether the immediate is a 6-bit unsigned
integer.
If the immediate was '63', this would cause an assertion error which
shouldn't have occurred.
llvm-svn: 289315
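For reference, a 6-bit unsigned immediate covers 0 through 63 inclusive, so '63' must pass the check in r289315 above. The sketch below is a minimal illustration, not the code from the commit: fitsUnsigned and encodeImm6 are hypothetical names, and the encoder only shows how a downstream assertion depends on the predicate being right.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical range predicate: true if Value fits in N unsigned bits.
template <unsigned N> constexpr bool fitsUnsigned(uint64_t Value) {
  static_assert(N > 0 && N < 64, "width must be between 1 and 63");
  return Value < (UINT64_C(1) << N);
}

// Hypothetical encoder for a 6-bit immediate field. The assertion mirrors
// the kind of check that fires when a predicate lets a bad value through
// or, conversely, when the predicate and the encoder disagree on the range.
inline uint8_t encodeImm6(uint64_t Imm) {
  assert(fitsUnsigned<6>(Imm) && "immediate does not fit in 6 bits");
  return static_cast<uint8_t>(Imm & 0x3F);
}
```

With this predicate, 63 is accepted and 64 is rejected, so the boundary value no longer reaches an assertion.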
|
| |
|
|
| |
llvm-svn: 289314
|
| |
|
|
| |
llvm-svn: 289313
|
| |
|
|
| |
llvm-svn: 289312
|
| |
|
|
|
|
| |
-fsanitize-coverage=trace-pc-guard. Support for the previously used instrumentation will be removed in the following changes.
llvm-svn: 289311
|
| |
|
|
|
|
| |
name while printing the coverage
llvm-svn: 289310
|
| |
|
|
|
|
|
| |
The users of the addrspacecast were having their types incorrectly
changed, producing invalid bitcasts between address spaces.
llvm-svn: 289307
|
| |
|
| |
Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.
Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.
The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.
llvm-svn: 289306
|
| |
|
|
| |
llvm-svn: 289292
|
| |
|
|
|
|
|
| |
Some of the immediates need to be printed differently
eventually.
llvm-svn: 289291
|
| |
|
|
|
|
| |
You Use warnings; other minor fixes (NFC).
llvm-svn: 289282
|
| |
|
| |
LLVM's use of DW_OP_bit_piece is incorrect and based on a
misunderstanding of the wording in the DWARF specification. The offset
argument of DW_OP_bit_piece refers to the offset into the location
that is on the top of the DWARF expression stack, and not an offset
into the source variable. This has since also been clarified in the
DWARF specification.
This patch fixes all uses of DW_OP_bit_piece to emit the correct
offset and simplifies the DwarfExpression class to semi-automatically
emit empty DW_OP_pieces to adjust the offset of the source variable,
thus simplifying the code using DwarfExpression.
While this is an incompatible bugfix, in practice I don't expect this
to be much of a problem since LLVM's old interpretation and the
correct interpretation of DW_OP_bit_piece differ only when there are
gaps in the fragmented locations of the described variables or if
individual fragments are smaller than a byte. LLDB at least won't
interpret locations with gaps in them because it has no way to present
undefined bits in a variable, and there is a high probability that an
old-form expression will be malformed when interpreted correctly,
because the DW_OP_bit_piece offset will be outside of the location at
the top of the stack.
As a nice side-effect, this patch enables us to use a more efficient
encoding for subregisters: In order to express a sub-register at a
non-zero offset we now use a DW_OP_bit_piece instead of shifting the
value into place manually.
This patch also adds missing test coverage for code paths that weren't
exercised before.
<rdar://problem/29335809>
Differential Revision: https://reviews.llvm.org/D27550
llvm-svn: 289266
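The idea in r289266 above of emitting empty pieces to advance through the source variable can be illustrated with a toy layout function. This is a sketch under assumptions, not the DwarfExpression implementation: Fragment, Piece and layoutPieces are hypothetical names, fragments are assumed to be sorted and non-overlapping, and no real DWARF opcodes are encoded.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one located fragment of a source variable.
struct Fragment {
  unsigned OffsetInBits; // offset of the fragment within the variable
  unsigned SizeInBits;   // size of the fragment
};

// Hypothetical output piece: either an empty (undefined) gap or a located
// piece, each covering SizeInBits of the variable in order.
struct Piece {
  bool Empty;
  unsigned SizeInBits;
};

// Walk the fragments in order and insert an empty piece for every gap, so
// that the position within the variable is implied by the piece sequence
// rather than by a per-piece offset into the variable.
inline std::vector<Piece> layoutPieces(const std::vector<Fragment> &Frags) {
  std::vector<Piece> Out;
  unsigned Cursor = 0;
  for (const Fragment &F : Frags) {
    assert(F.OffsetInBits >= Cursor && "fragments must be sorted, disjoint");
    if (F.OffsetInBits > Cursor)
      Out.push_back({true, F.OffsetInBits - Cursor});
    Out.push_back({false, F.SizeInBits});
    Cursor = F.OffsetInBits + F.SizeInBits;
  }
  return Out;
}
```

For a variable with located fragments at bits 0..31 and 64..95, this yields a located piece, a 32-bit empty piece for the gap, and another located piece.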
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: CI doesn't have XNACK.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27175
llvm-svn: 289263
|
| |
|
| |
Summary:
This frees 2 additional scalar registers.
These are results from all of my 3 patches combined:
Polaris:
Spilled SGPRs: 2231 -> 1517 (-32.00 %)
Tonga:
Spilled SGPRs: 3829 -> 2608 (-31.89 %)
Spilled VGPRs: 100 -> 84 (-16.00 %)
Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader
limited to 64 VGPRs.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27151
llvm-svn: 289262
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This frees 2 scalar registers.
Reviewers: tstellarAMD
Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27150
llvm-svn: 289261
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There is no point in setting SGPRS=104, because VI allocates SGPRs
in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs
for general purposes.
Reviewers: tstellarAMD
Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27149
llvm-svn: 289260
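The rounding in r289260 above (104 -> 112 with an allocation granule of 16) is plain round-up-to-multiple arithmetic. A minimal sketch follows; roundUpToGranule is a hypothetical helper name, and the granule of 16 is taken from the commit message rather than from any hardware documentation.

```cpp
#include <cassert>
#include <cstdint>

// Round Value up to the next multiple of Granule (hypothetical helper).
inline uint32_t roundUpToGranule(uint32_t Value, uint32_t Granule) {
  assert(Granule != 0 && "granule must be non-zero");
  return (Value + Granule - 1) / Granule * Granule;
}
```

Both 102 and 104 round up to 112 with a granule of 16, which is why requesting 104 gains nothing over using all 102 general-purpose SGPRs.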
|
| |
|
|
|
|
|
|
|
| |
Like DBG_VALUE, these emit nothing to the .text section, and sometimes
have no source location specified. Just ignore them.
Differential Revision: http://reviews.llvm.org/D27492
llvm-svn: 289256
|
| |
|
|
|
|
| |
This should do nothing for targets without i16.
llvm-svn: 289235
|
| |
|
|
|
|
| |
Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type.
llvm-svn: 289232
|
| |
|
|
| |
llvm-svn: 289231
|
| |
|
|
|
|
| |
Fixes assembler regressions.
llvm-svn: 289230
|
| |
|
|
|
|
|
|
|
| |
Sort the instruction bits by type and make sure there is one
for each format.
Also clean up namespaces.
llvm-svn: 289229
|