| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 339813
|
|
|
|
|
|
|
| |
For some reason this wasn't done for floats as it was for
integers.
llvm-svn: 339811
|
|
|
|
|
|
| |
Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support.
llvm-svn: 339763
|
|
|
|
|
|
|
|
| |
scalar/vector magic value collection. NFCI.
Use the same ISD::matchUnaryPredicate pattern that was used in D50392.
llvm-svn: 339758
|
|
|
|
|
|
|
|
| |
failed. NFCI.
Matches the code in BuildSDIV/BuildUDIV
llvm-svn: 339757
|
|
|
|
|
|
|
|
| |
This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators.
Differential Revision: https://reviews.llvm.org/D50392
llvm-svn: 339756
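
The per-lane arithmetic behind an exact signed division by a constant can be sketched in standalone C++. This is an illustration of the general technique only (arithmetic shift by the divisor's trailing zero count, then multiply by the modular inverse of the remaining odd factor), not the code from D50392. The names `inverseMod2_32`, `ExactSDivLane`, `makeExactSDivLane` and `exactSDiv` are hypothetical, and 32-bit lanes are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd value modulo 2^32 via Newton's iteration.
uint32_t inverseMod2_32(uint32_t d) {
  assert((d & 1u) && "only odd values are invertible modulo 2^32");
  uint32_t inv = d;            // d * d == 1 (mod 8), so 3 bits are already correct
  for (int i = 0; i < 4; ++i)  // each step doubles the number of correct bits
    inv *= 2u - d * inv;
  return inv;
}

// Hypothetical per-lane parameters: one shift amount and one factor per divisor.
struct ExactSDivLane {
  unsigned shift;   // number of trailing zero bits in the divisor
  uint32_t factor;  // inverse of the odd part of the divisor, modulo 2^32
};

ExactSDivLane makeExactSDivLane(int32_t divisor) {
  assert(divisor != 0 && "division by zero");
  unsigned shift = 0;
  while (((static_cast<uint32_t>(divisor) >> shift) & 1u) == 0)
    ++shift;
  // Exact signed division by 2^shift, equivalent to an arithmetic shift right.
  int64_t odd = static_cast<int64_t>(divisor) / (int64_t{1} << shift);
  return {shift, inverseMod2_32(static_cast<uint32_t>(odd))};
}

// Only valid when x is an exact multiple of the divisor the lane was built from.
int32_t exactSDiv(int32_t x, ExactSDivLane lane) {
  int64_t shifted = static_cast<int64_t>(x) / (int64_t{1} << lane.shift);
  uint32_t product = static_cast<uint32_t>(shifted) * lane.factor;
  return static_cast<int32_t>(product);  // wraps modulo 2^32
}
```

With a non-uniform constant vector, each lane simply gets its own `ExactSDivLane`, for example one built from 7 and another built from -6, so the shift amounts and factors become constant vectors rather than splats.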
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`MachineMemOperand` pointers attached to `MachineSDNodes` and instead
have the `SelectionDAG` fully manage the memory for this array.
Prior to this change, the memory management was deeply confusing here --
The way the MI was built relied on the `SelectionDAG` allocating memory
for these arrays of pointers using the `MachineFunction`'s allocator so
that the raw pointer to the array could be blindly copied into an
eventual `MachineInstr`. This creates a hard coupling between how
`MachineInstr`s allocate their array of `MachineMemOperand` pointers and
how the `MachineSDNode` does.
This change is motivated in large part by a change I am making to how
`MachineFunction` allocates these pointers, but it seems like a layering
improvement as well.
This would run the risk of increasing allocations overall, but I've
implemented an optimization that should avoid that by storing a single
`MachineMemOperand` pointer directly instead of allocating anything.
This is expected to be a net win because the vast majority of uses of
these only need a single pointer.
As a side-effect, this makes the API for updating a `MachineSDNode` and
a `MachineInstr` reasonably different which seems nice to avoid
unexpected coupling of these two layers. We can map between them, but we
shouldn't be *surprised* at where that occurs. =]
Differential Revision: https://reviews.llvm.org/D50680
llvm-svn: 339740
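
The single-pointer optimization described above can be sketched outside of LLVM as follows. This is a simplified illustration under stated assumptions, not the implementation from D50680: `MemOperand` and `MemRefList` are hypothetical stand-ins, and `std::unique_ptr` replaces whatever allocator the `SelectionDAG` actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>

struct MemOperand { int id; };  // hypothetical stand-in for MachineMemOperand

// Holds a list of operand pointers. The common single-operand case is stored
// inline, so nothing is allocated for it.
class MemRefList {
  MemOperand *Single = nullptr;          // used when Count == 1
  std::unique_ptr<MemOperand *[]> Many;  // used when Count > 1
  std::size_t Count = 0;

public:
  void assign(MemOperand *const *Refs, std::size_t N) {
    Many.reset();
    Single = nullptr;
    Count = N;
    if (N == 1) {
      Single = Refs[0];  // no allocation
      return;
    }
    if (N > 1) {
      Many = std::make_unique<MemOperand *[]>(N);
      std::copy(Refs, Refs + N, Many.get());
    }
  }

  std::size_t size() const { return Count; }
  bool usesHeap() const { return Many != nullptr; }
  MemOperand *const *begin() const { return Count == 1 ? &Single : Many.get(); }
  MemOperand *const *end() const { return begin() + Count; }
};
```

Because the list owns its storage, a consumer has to copy the pointers out through `begin()`/`end()` instead of aliasing the array, which is the kind of explicit mapping between layers the message argues for.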
|
|
|
|
|
|
|
|
| |
Add a helper function to scalarize constrained FP operations as needed.
Differential Revision: https://reviews.llvm.org/D50720
llvm-svn: 339735
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Intentionally excluding nodes from the DAGCombine worklist is likely to
lead to weird optimizations and infinite loops, so it's generally a bad
idea.
To avoid the infinite loops, fix DAGCombine to use the
isDesirableToCommuteWithShift target hook before performing the
transforms in question, and implement the target hook in the ARM backend
to disable the transforms in question.
Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a
reduced testcase for that bug. But we should have sufficient test
coverage for PerformSHLSimplify given that we're not playing weird
tricks with the worklist. I can try to bugpoint it if necessary,
though.)
Differential Revision: https://reviews.llvm.org/D50667
llvm-svn: 339734
|
|
|
|
|
|
| |
Patch by Henric Karlsson.
llvm-svn: 339688
|
|
|
|
|
|
|
|
|
|
| |
Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR
when zero extending the demanded elements mask if it is already as long as the
source vector.
Differential Revision: https://reviews.llvm.org/D49574
llvm-svn: 339600
|
|
|
|
|
|
| |
combine. NFC.
llvm-svn: 339561
|
|
|
|
|
|
|
|
| |
fp_to_fp16 in case the result type isn't a scalar integer.
This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16 bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary.
llvm-svn: 339536
|
|
|
|
|
|
|
|
|
|
| |
make sure the output type is scalar. For vectors, use a store and load of a temporary.
Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid.
This is basically the opposite case of the root cause of PR38533.
llvm-svn: 339535
|
|
|
|
|
|
|
|
|
|
| |
fp16_to_fp in case the input type isn't an i16.
The bitcast can be further legalized as needed.
Fixes PR38533.
llvm-svn: 339533
|
|
|
|
|
|
|
| |
Addresses fixme, although this should still be checking individual
operand flags.
llvm-svn: 339525
|
|
|
|
|
|
|
|
| |
for XOR. NFCI
We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead.
llvm-svn: 339509
|
|
|
|
| |
llvm-svn: 339508
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to rL337966 - if the DAGCombiner's rotate matching was
working as expected, I don't think we'd see any test diffs here.
AArch only goes right, and PPC only goes left.
x86 has both, so no diffs there.
Differential Revision: https://reviews.llvm.org/D50091
llvm-svn: 339359
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This change provides a common optimization path for both Unsafe and FMF-driven optimization for this fsub fold, adding reassociation, as it is the flag that most closely represents the translation.
Reviewers: spatel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D50195
llvm-svn: 339357
|
|
|
|
|
|
| |
As requested in D50392, pull the magic constant calculations out into a helper function.
llvm-svn: 339346
|
|
|
|
|
|
|
|
|
| |
isNegatibleForFree() should not matter here (as the test diffs show)
because it's always a win to replace an fsub+fadd with fneg. The
problem in D50195 persists because either (1) we are doing these
folds in the wrong order or (2) we're missing another fold for fadd.
llvm-svn: 339299
|
|
|
|
|
|
|
|
|
| |
I don't know if it's possible to expose this diff in a test,
but we should always try simplifications (no new nodes created)
before more complicated transforms for efficiency (similar to
what we do in IR).
llvm-svn: 339298
|
|
|
|
|
|
|
|
| |
call. NFCI.
The isConstOrConstSplat result is only used in a ISD::matchUnaryPredicate call which can perform the equivalent iteration just as quickly.
llvm-svn: 339262
|
|
|
|
|
|
|
|
| |
Provide a pass-through of the numerator for divide by one cases - this is the same approach we take in DAGCombiner::visitSDIVLike.
I investigated whether we could achieve this with magic MULHU/SRL values, but nothing appeared to work, as we don't have a way to make MULHU(x,c) -> x.
llvm-svn: 339254
|
|
|
|
|
|
|
|
| |
As requested in D50392, this is a minor refactor to BuildExactSDIV to stop taking the uniform constant APInt divisor and instead extract it locally.
I also clean up the operands and value types to better match BuildUDIV (and BuildSDIV in the near future).
llvm-svn: 339246
|
|
|
|
|
|
|
|
|
|
| |
Summary: changing a few typos
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50445
llvm-svn: 339245
|
|
|
|
|
|
| |
We're not handling the UDIV by one special case properly - for now just early out.
llvm-svn: 339229
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Extend the fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class they are constrained to (e.g. a double in a 32-bit GPR, as in the PR).
Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma
Reviewed By: efriedma
Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D45437
llvm-svn: 339225
|
|
|
|
|
|
|
|
|
|
| |
serial chain dependency.
Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code.
Differential Revision: https://reviews.llvm.org/D50374
llvm-svn: 339157
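
Why the two halves of a split scatter need a serial dependency can be shown with a small scalar model. This is a sketch of scatter semantics only, assuming that lanes are stored in lane order so the highest enabled lane wins on a duplicate index. `scatterRange` and `splitScatter` are hypothetical names and do not correspond to functions in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stores the enabled lanes in [begin, end) to memory in lane order.
void scatterRange(std::vector<int> &mem, const std::vector<std::size_t> &idx,
                  const std::vector<int> &val, const std::vector<bool> &mask,
                  std::size_t begin, std::size_t end) {
  for (std::size_t lane = begin; lane < end; ++lane)
    if (mask[lane])
      mem[idx[lane]] = val[lane];
}

// Models a scatter split into low and high halves. The high half runs strictly
// after the low half, which is what chaining the two nodes serially guarantees.
void splitScatter(std::vector<int> &mem, const std::vector<std::size_t> &idx,
                  const std::vector<int> &val, const std::vector<bool> &mask) {
  assert(idx.size() == val.size() && idx.size() == mask.size());
  std::size_t half = idx.size() / 2;
  scatterRange(mem, idx, val, mask, 0, half);           // low half first
  scatterRange(mem, idx, val, mask, half, idx.size());  // then the high half
}
```

If the two halves were allowed to execute in either order, a lane in the low half could overwrite a later lane in the high half that targets the same index, and the result would differ from the unsplit scatter.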
|
|
|
|
|
|
|
|
| |
This was missed in D50185.
NFC until we add actual non-uniform support to BuildSDIV (similar to the BuildUDIV support in D49248) - for now it just early outs.
llvm-svn: 339147
|
|
|
|
|
|
| |
This was missed in D49248
llvm-svn: 339146
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators.
It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV.
Differential Revision: https://reviews.llvm.org/D49248
llvm-svn: 339121
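
The arithmetic behind both points can be illustrated with scalar C++. This is a sketch assuming 32-bit lanes and divisors whose magic constant fits in 32 bits without an extra fixup step; it is not the code from D49248. `mulhu`, `UDivLane`, `udivByMagic` and `mulhuPow2AsSrl` are hypothetical names, and the constants 0xAAAAAAAB (with post-shift 1) and 0xCCCCCCCD (with post-shift 2) are the well-known magic values for dividing by 3 and by 5.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of the 64-bit unsigned product, modelling MULHU on i32.
uint32_t mulhu(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// Hypothetical per-lane parameters: quotient = mulhu(x, magic) >> postShift.
struct UDivLane {
  uint32_t magic;
  unsigned postShift;
};

uint32_t udivByMagic(uint32_t x, UDivLane lane) {
  return mulhu(x, lane.magic) >> lane.postShift;
}

// MULHU by 2^k is a logical shift right by (32 - k), valid for 1 <= k <= 31.
uint32_t mulhuPow2AsSrl(uint32_t x, unsigned k) {
  assert(k >= 1 && k <= 31);
  return x >> (32 - k);
}
```

With non-uniform denominators each lane carries its own `UDivLane`, and a lane whose multiplier happens to be a power of two is where the MULHU to SRL fold applies.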
|
|
|
|
|
|
| |
Src0 doesn't really convey any meaning as to what the operand is. Passthru matches what's used in the documentation for the intrinsic this comes from.
llvm-svn: 339101
|
|
|
|
|
|
| |
getValue is a more meaningful name for scatter than it is for gather. Split them and use getPassThru for gather.
llvm-svn: 339096
|
|
|
|
|
|
|
|
|
|
|
| |
This assert fires when attempting to extract a subregister from the
global PIC base register. This virtual register SD node is not in the
VRBaseMap, so we shouldn't call getVR to look it up there. If this is a
RegisterSDNode, we should be able to use the virtual register directly.
Fixes PR38385
llvm-svn: 339056
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, DbgInfoIntrinsic strongly assumed that these
intrinsics all have variables and expressions attached to them.
That assumption is too strong for a class that other debug entities
derive from, and it now causes problems for debug labels.
To make DbgInfoIntrinsic a base class for 'debug info', I
created a class for 'variable debug info', DbgVariableIntrinsic.
DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst now derive from it.
Differential Revision: https://reviews.llvm.org/D50220
llvm-svn: 338984
|
|
|
|
|
|
|
|
|
|
| |
store.
The mask operand is visited before the data operand so we need to be able to widen it.
Fixes PR38436.
llvm-svn: 338915
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.
Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.
llvm-svn: 338910
|
|
|
|
|
|
|
|
| |
First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where it's still used.
Differential Revision: https://reviews.llvm.org/D50185
llvm-svn: 338838
|
|
|
|
| |
llvm-svn: 338715
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the expansion of FCOPYSIGN, the shift node is missing when the two
operands of FCOPYSIGN are of the same size. We should always generate
a shift node (if the required shift amount is not zero) to put the sign
bit into the right position, regardless of the size of the underlying
types.
Differential Revision: https://reviews.llvm.org/D49973
llvm-svn: 338665
|
|
|
|
| |
llvm-svn: 338604
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug is visible in the constant-folded x86 tests. We can't use the
negated shift amount when the type is not power-of-2:
https://rise4fun.com/Alive/US1r
...so in that case, use the regular lowering that includes a select
to guard against a shift-by-bitwidth. This path is improved by only
calculating the modulo shift amount once now.
Also, improve the rotate (with power-of-2 size) lowering to use
a negate rather than subtract from bitwidth. This improves the
codegen whether we have a rotate instruction or not (although
we can still see that we're not matching to a legal rotate in
all cases).
llvm-svn: 338592
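
The difference between the two lowerings can be sketched with plain rotates in C++. This is an illustration of the arithmetic only, not the DAG code from this commit. `rotl32` and `rotlGeneric` are hypothetical names, and the non-power-of-2 case models a narrow type held in the low bits of a `uint32_t`.

```cpp
#include <cassert>
#include <cstdint>

// Power-of-2 width: the complementary shift amount is the negated amount
// masked to the bitwidth, so no subtract from the bitwidth is needed.
uint32_t rotl32(uint32_t x, uint32_t amt) {
  return (x << (amt & 31u)) | (x >> ((0u - amt) & 31u));
}

// Arbitrary width (1..32): masking a negated amount is wrong when the width
// is not a power of 2, so compute the modulo once and guard the zero case,
// which would otherwise shift by the full bitwidth.
uint32_t rotlGeneric(uint32_t x, uint32_t amt, unsigned width) {
  assert(width >= 1 && width <= 32);
  uint32_t mask = width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
  uint32_t s = amt % width;  // modulo shift amount, calculated once
  x &= mask;
  if (s == 0)
    return x;                // plays the role of the select in the lowering
  return ((x << s) | (x >> (width - s))) & mask;
}
```

For a 24-bit value, `(0u - amt) & 23u` does not equal `(24 - amt) % 24`, which is why the masked negate is only usable when the width is a power of 2.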
|
|
|
|
|
|
|
|
| |
There is nothing x86-specific about this code, so it'd be nice to make this available for other targets to use in the future (and get it out of X86ISelLowering!).
Differential Revision: https://reviews.llvm.org/D50083
llvm-svn: 338586
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D49806
llvm-svn: 338562
|
|
|
|
|
|
|
|
|
|
| |
Correct the address space for the inserted argument
stack slot.
AMDGPU seems to not do anything with this information,
so I don't think this was breaking anything.
llvm-svn: 338428
|
|
|
|
| |
llvm-svn: 338382
|
|
|
|
| |
llvm-svn: 338352
|
|
|
|
|
|
|
|
| |
BuildSDIV/BuildUDIV/etc.
The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation.
llvm-svn: 338329
|