| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
I don't think this matters because ConstantFP is legal.
llvm-svn: 290299
|
|
|
|
|
|
|
|
| |
-256 is a legal indexed address part.
Differential Revision: https://reviews.llvm.org/D27537
llvm-svn: 290296
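For illustration, the boundary in question can be sketched as a range check. This is a minimal sketch, assuming the indexed offset is a signed 9-bit immediate (range -256 to 255 inclusive); the helper name `isSigned9BitOffset` is hypothetical and is not the function touched by the patch.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when Offset fits a signed 9-bit
// immediate. The lower bound is inclusive, so -256 itself is accepted.
constexpr bool isSigned9BitOffset(int64_t Offset) {
  return Offset >= -256 && Offset <= 255;
}
```

An off-by-one version of this check (using `>` instead of `>=` on the lower bound) would wrongly reject -256.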
|
|
|
|
|
|
|
| |
The range metadata inserted by NVVMIntrRange is pessimistic; range
metadata already present could be more precise.
llvm-svn: 290294
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a basic tablegen backend that analyzes the SelectionDAG
patterns to find simple ones that are eligible for GlobalISel-emission.
That's similar to FastISel, with one notable difference: we're not fed
ISD opcodes, so we need to map the SDNode operators to generic opcodes.
That's done using GINodeEquiv in TargetGlobalISel.td.
Otherwise, this is mostly boilerplate, and lots of filtering of any kind
of "complicated" pattern. On AArch64, this is sufficient to match G_ADD
up to s64 (to ADDWrr/ADDXrr) and G_BR (to B).
Differential Revision: https://reviews.llvm.org/D26878
llvm-svn: 290284
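The mapping step described above can be pictured as a simple lookup from an SDNode operator name to a generic opcode name. This is only a sketch in C++ rather than TableGen: the function `genericOpcodeFor` is hypothetical, and the table lists just the two opcodes the message mentions (G_ADD and G_BR), with the SDNode names `add` and `br` assumed as their counterparts.

```cpp
#include <map>
#include <string>

// Hypothetical lookup: maps an SDNode operator name to a generic opcode
// name. Returns an empty string when no equivalent is known, in which
// case a pattern using that operator would be filtered out.
std::string genericOpcodeFor(const std::string &SDNodeName) {
  static const std::map<std::string, std::string> Equiv = {
      {"add", "G_ADD"},
      {"br", "G_BR"},
  };
  auto It = Equiv.find(SDNodeName);
  return It == Equiv.end() ? std::string() : It->second;
}
```

A pattern whose operator has no entry is treated as too "complicated" in this sketch and skipped, which mirrors the filtering the message describes.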
|
|
|
|
| |
llvm-svn: 290281
|
|
|
|
|
|
|
|
| |
The case AM.Scale == 0 is already handled by the code right above.
Differential Revision: https://reviews.llvm.org/D28003
llvm-svn: 290275
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As mentioned on PR30845, we were performing our vXi64 multiplication as:
AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi, 32) + psllqi(AhiBlo, 32);
when we could avoid one of the upper shifts with:
AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi + AhiBlo, 32);
This matches the lowering on gcc/icc.
Differential Revision: https://reviews.llvm.org/D27756
llvm-svn: 290267
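The equivalence of the two sequences can be checked on a single 64-bit lane with plain integer arithmetic. This is a scalar sketch, not the vector lowering itself: `pmuludq_lane` is a hypothetical stand-in for one lane of pmuludq (a 32x32 to 64-bit unsigned multiply of the low halves), and the shifts stand in for psrlqi/psllqi.

```cpp
#include <cstdint>

// Scalar stand-in for one 64-bit lane of pmuludq: multiplies the low
// 32 bits of each operand and yields the full 64-bit product.
static inline uint64_t pmuludq_lane(uint64_t A, uint64_t B) {
  return (A & 0xFFFFFFFFull) * (B & 0xFFFFFFFFull);
}

// Earlier sequence: each cross term is shifted separately.
uint64_t mul64TwoShifts(uint64_t A, uint64_t B) {
  uint64_t AloBlo = pmuludq_lane(A, B);
  uint64_t AloBhi = pmuludq_lane(A, B >> 32);
  uint64_t AhiBlo = pmuludq_lane(A >> 32, B);
  return AloBlo + (AloBhi << 32) + (AhiBlo << 32);
}

// Revised sequence: the cross terms are added first, then shifted once.
uint64_t mul64OneShift(uint64_t A, uint64_t B) {
  uint64_t AloBlo = pmuludq_lane(A, B);
  uint64_t AloBhi = pmuludq_lane(A, B >> 32);
  uint64_t AhiBlo = pmuludq_lane(A >> 32, B);
  return AloBlo + ((AloBhi + AhiBlo) << 32);
}
```

Both functions return the low 64 bits of the product: the AhiBhi term would be shifted by 64 and so drops out, and any carry lost when adding the cross terms lands above bit 63 after the shift anyway.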
|
|
|
|
| |
llvm-svn: 290265
|
|
|
|
| |
llvm-svn: 290255
|
|
|
|
| |
llvm-svn: 290254
|
|
|
|
|
|
|
|
|
|
| |
I added an API for creating a target-specific memory node in the DAG. Today, all memory nodes are common to all targets, and their constructors are located in SelectionDAG.cpp.
There are some cases in X86 where we need to create a special node: truncation-with-saturation store and float-to-half store.
In the current patch I added truncation-with-saturation nodes and I'm using them for intrinsics. In the future I plan to implement DAG lowering for the truncation-with-saturation pattern.
Differential Revision: https://reviews.llvm.org/D27899
llvm-svn: 290250
|
|
|
|
| |
llvm-svn: 290249
|
|
|
|
|
|
| |
Fixing a warning.
llvm-svn: 290248
|
|
|
|
|
|
| |
Fixing build issues.
llvm-svn: 290244
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible.
vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use.
The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above.
The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it.
This submit also includes additional lit tests to better cover HVA corner cases.
Differential Revision: https://reviews.llvm.org/D27392
llvm-svn: 290240
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See https://reviews.llvm.org/D6678 for the history of
isExtractSubvectorCheap. Essentially the same considerations apply
to ARM.
This temporarily breaks the formation of vpadd/vpaddl in certain cases;
AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles.
See https://reviews.llvm.org/D27779 for followup fix.
Differential Revision: https://reviews.llvm.org/D27774
llvm-svn: 290198
|
|
|
|
| |
llvm-svn: 290193
|
|
|
|
|
|
|
|
|
| |
When the instruction is processed the first time, it may be
deleted, resulting in crashes. While the new test adds the same
user to the worklist twice, this particular case doesn't crash,
but I'm not sure why.
llvm-svn: 290191
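One common way to keep an item from being queued twice is to pair the worklist with a set of pending entries. The following is a generic sketch of that idea and is not taken from the patch: the class `UniqueWorklist` and its methods are hypothetical, and the message does not say which mechanism the fix actually uses.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical worklist that refuses to queue an item that is already
// pending, so a deleted item can never be visited a second time.
template <typename T> class UniqueWorklist {
  std::vector<T> Items;
  std::unordered_set<T> Pending;

public:
  // Returns false (and does nothing) if V is already queued.
  bool push(const T &V) {
    if (!Pending.insert(V).second)
      return false;
    Items.push_back(V);
    return true;
  }

  // Removes and returns the most recently queued item.
  T pop() {
    T V = Items.back();
    Items.pop_back();
    Pending.erase(V);
    return V;
  }

  bool empty() const { return Items.empty(); }
  std::size_t size() const { return Items.size(); }
};
```

With pointer-typed items, a second `push` of the same instruction pointer is ignored, so the pointer is handed out only once even if the first visit deletes the instruction.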
|
|
|
|
| |
llvm-svn: 290185
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle, mareko
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D27834
llvm-svn: 290184
|
|
|
|
|
|
| |
I haven't managed to get this to fail yet, but it's technically possible for the AND -> shuffle decomposition to result in illegal types.
llvm-svn: 290183
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without a MachineMemOperand, the scheduler was assuming MIMG instructions
were ordered memory references, so no loads or stores could be reordered
across them.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27536
llvm-svn: 290179
|
|
|
|
|
|
|
|
|
| |
The include of llvm/IR/Verifier.h was removed from HexagonCommonGEP.cpp in r289604
as unused. In fact it is required when expensive checks are enabled, because
it declares the function `verifyFunction`, which is called in a conditionally
compiled part of the file.
llvm-svn: 290170
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The expression for computing the return value of getMemOpBaseRegImmOfs has only
one possible value. The other value would result in a return earlier in the
function. This patch replaces the expression with its only possible value.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27437
llvm-svn: 290133
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27725
llvm-svn: 290114
|
|
|
|
| |
llvm-svn: 290109
|
|
|
|
|
|
|
|
|
|
|
| |
This allows lowering i8 and i16 arguments if they can fit in the registers. Note
that the lowering is incomplete - ABI extensions are handled in a subsequent
patch.
(Last part of)
Differential Revision: https://reviews.llvm.org/D27704
llvm-svn: 290106
|
|
|
|
|
|
|
|
|
| |
Teach the instruction selector and legalizer that it's ok to have adds with 8 or
16-bit integers.
This is the second part of https://reviews.llvm.org/D27704
llvm-svn: 290105
|
|
|
|
|
|
|
|
|
| |
Teach the instruction selector that it's ok to copy small values from physical
registers.
First part of https://reviews.llvm.org/D27704
llvm-svn: 290104
|
|
|
|
|
|
|
|
| |
PWR9 processor model for instruction scheduling. A subsequent patch will migrate
PWR9 to Post RA MIScheduler.
https://reviews.llvm.org/D24525
llvm-svn: 290102
|
|
|
|
| |
llvm-svn: 290100
|
|
|
|
|
|
|
|
|
|
| |
This adds support for lowering more than 4 arguments (although still i32 only).
It uses the handleAssignments / ValueHandler infrastructure extracted from
the AArch64 backend in r288658.
Differential Revision: https://reviews.llvm.org/D27195
llvm-svn: 290098
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
functime metadata V2.0
Summary:
Added pair of directives .hsa_code_object_metadata/.end_hsa_code_object_metadata.
Between them user can put YAML string that would be directly put to the generated note. E.g.:
'''
.hsa_code_object_metadata
{
amd.MDVersion: [ 2, 0 ]
}
.end_hsa_code_object_metadata
'''
Based on D25046
Reviewers: vpykhtin, nhaustov, yaxunl, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, mgorny, tony-tye
Differential Revision: https://reviews.llvm.org/D27619
llvm-svn: 290097
|
|
|
|
|
|
|
|
|
|
| |
Add support for selecting simple G_LOAD and G_FRAME_INDEX instructions (32-bit
scalars only). This will be useful for functions that need to pass arguments on
the stack.
First part of https://reviews.llvm.org/D27195.
llvm-svn: 290096
|
|
|
|
|
|
| |
make sure we pass the load's user rather than load itself to the second operand of IsLegalToFold.
llvm-svn: 290089
|
|
|
|
|
|
|
|
| |
for the ones needed for SSE1. Anything SSE2 or above uses the integer ISD opcode.
This removes 11721 bytes from the DAG isel table or 2.2%
llvm-svn: 290073
|
|
|
|
|
|
|
|
| |
Not sure whether it causes an ASAN false positive or whether it
actually leads to incorrect code or whether it even exposes bad code.
Hans, I'll get you instructions to reproduce this.
llvm-svn: 290066
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit on behalf of Gadi Haber
Removed EVEX_V512 prefix from scalar EVEX instructions since HW ignores L'L bits anyway (LIG). 4 instructions are modified.
The changed encodings are validated with XED.
Reviewers: delena, igorb
Differential revision: https://reviews.llvm.org/D27802
llvm-svn: 290065
|
|
|
|
|
|
| |
As discussed on D27692, the next step will be to allow cross-domain shuffles once the combined shuffle depth passes a certain point.
llvm-svn: 290064
|
|
|
|
|
|
|
|
|
|
|
|
| |
if they are available. This will allow a bunch of patterns to be removed.
These nodes are only emitted for lowering FABS/FNEG/FNABS/FCOPYSIGN. Ideally we just wouldn't create these nodes if SSE2 or higher is available, but it was simple to just convert them in DAG combine.
For SSE2, AVX, and AVX512 with DQI this is no functional change as the execution domain fixing pass ensures the right domain is selected regardless of the ISD opcode.
For AVX-512 without DQI we end up using integer instructions since the floating point versions aren't available. But we were already doing that for any logical operations in code that didn't come from FABS/FNEG/FNABS/FCOPYSIGN so this seems no worse. And we get the benefit of being able to fold broadcasts now.
llvm-svn: 290060
|
|
|
|
|
|
|
|
| |
DQI and VLX instructions are available.
This can give the register allocator more registers to use.
llvm-svn: 290057
|
|
|
|
|
|
| |
for scalars. I missed this in r290049.
llvm-svn: 290055
|
|
|
|
|
|
| |
available. This gives the register allocator more registers to work with.
llvm-svn: 290049
|
|
|
|
|
|
|
|
| |
It is still breaking Chrome. http://llvm.org/PR31361
This reverts commit r290026.
llvm-svn: 290047
|
|
|
|
|
|
| |
(NFC).
llvm-svn: 290028
|
|
|
|
| |
llvm-svn: 290027
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-apply r288561: Liveness tracking should be correct now after r290014.
Previously this pass was using up to 5% compile time in some cases which
is a bit much for what it is doing. The pass featured a full blown
data-flow analysis which in the default configuration was restricted to a
single block.
This rewrites the pass under the assumption that we only ever work on a
single block. This is done in a single pass maintaining a state machine
per general purpose register to catch LOH patterns.
Differential Revision: https://reviews.llvm.org/D27329
llvm-svn: 290026
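The idea of a single walk over one block with a small state machine per register can be illustrated with a toy model. Everything here is an assumption made for illustration: the `Inst` record, the two-state machine, the forward walk, and the single ADRP followed by ADD pattern are simplifications and do not reproduce the actual pass or its set of LOH patterns.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Toy instruction model (hypothetical): one optional def and one optional use.
enum class Op { Adrp, Add, Other };
struct Inst {
  Op Kind;
  int Def; // register written, or -1 for none
  int Use; // register read, or -1 for none
};

enum class State { Idle, SawAdrp };
constexpr int NumRegs = 32;

// Walks the block once and returns (adrp index, add index) pairs where an
// ADD reads a register whose most recent def was an ADRP with no other
// use or def of that register in between.
std::vector<std::pair<std::size_t, std::size_t>>
findAdrpAddPairs(const std::vector<Inst> &Block) {
  std::array<State, NumRegs> States;
  States.fill(State::Idle);
  std::array<std::size_t, NumRegs> AdrpIdx{};
  std::vector<std::pair<std::size_t, std::size_t>> Pairs;

  for (std::size_t I = 0; I < Block.size(); ++I) {
    const Inst &MI = Block[I];
    // Handle the use first: it sees the state left by earlier instructions.
    if (MI.Use >= 0 && MI.Use < NumRegs) {
      if (MI.Kind == Op::Add && States[MI.Use] == State::SawAdrp)
        Pairs.emplace_back(AdrpIdx[MI.Use], I);
      States[MI.Use] = State::Idle; // any use ends the candidate
    }
    // Then the def: an ADRP starts a candidate, anything else clears it.
    if (MI.Def >= 0 && MI.Def < NumRegs) {
      if (MI.Kind == Op::Adrp) {
        States[MI.Def] = State::SawAdrp;
        AdrpIdx[MI.Def] = I;
      } else {
        States[MI.Def] = State::Idle;
      }
    }
  }
  return Pairs;
}
```

Because each register carries only a constant amount of state and the block is visited once, the cost is linear in the block size, which is the property the rewrite relies on in place of a full data-flow analysis.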
|
|
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 290024
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27559
llvm-svn: 290014
|
|
|
|
| |
llvm-svn: 289974
|