Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes, and all but one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off, meaning that the 'and' isn't removed, but the
load(s) are still narrowed.
Differential Revision: https://reviews.llvm.org/D41177
llvm-svn: 320962
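The equivalence this combine relies on, sketched standalone in C++ (illustrative code, not the DAGCombiner implementation): masking a wide load with a low-byte mask gives the same value as a narrow load, with the byte offset depending on endianness.

    #include <cassert>
    #include <cstdint>
    #include <cstring>

    // The original wide load followed by the AND we want to remove.
    uint32_t wide_then_mask(const unsigned char *p) {
      uint32_t v;
      std::memcpy(&v, p, sizeof v);
      return v & 0xFFu;
    }

    // The narrowed i8 load; the least significant byte sits at offset 0
    // on little-endian and at the end of the value on big-endian.
    uint32_t narrow_load(const unsigned char *p, bool bigEndian) {
      return p[bigEndian ? 3 : 0];
    }

    int main() {
      unsigned char buf[4] = {0x11, 0x22, 0x33, 0x44};
      uint32_t probe;
      std::memcpy(&probe, buf, sizeof probe);
      bool bigEndian = (probe == 0x11223344u);
      assert(wide_then_mask(buf) == narrow_load(buf, bigEndian));
      return 0;
    }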
llvm-svn: 320885
The Function can never be nullptr so we can return a reference.
llvm-svn: 320884
non-constant index
Summary:
Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1.
We also can't type legalize v64i1 insert_vector_elt correctly on KNL because the type is not byte addressable, as the legalize-through-memory path requires.
For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that.
For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them.
Reviewers: RKSimon, delena, spatel, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40942
llvm-svn: 320849
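A hedged, standalone illustration of the extend/store/reload/truncate strategy described above, staging a 64-bit mask through byte-addressable storage (plain C++ in place of the legalizer's types):

    #include <cstdint>
    #include <cstdio>

    // Insert one bit into a 64-bit mask at a runtime index by staging the
    // mask as bytes, mirroring the extend/store/load/truncate sequence.
    uint64_t insert_bit(uint64_t mask, unsigned idx, bool bit) {
      uint8_t bytes[64];                        // "extend": one byte per i1
      for (unsigned i = 0; i < 64; ++i)
        bytes[i] = (mask >> i) & 1;
      bytes[idx] = bit;                         // the indexed store
      uint64_t out = 0;                         // "truncate" back to 64 bits
      for (unsigned i = 0; i < 64; ++i)
        out |= (uint64_t)(bytes[i] & 1) << i;
      return out;
    }

    int main() {
      printf("%016llx\n", (unsigned long long)insert_bit(0, 63, true));
      return 0;
    }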
operands call NewSDValueDbgMsg.
This makes it work better with some build_vector and concat_vectors creations.
Adjust the NewSDValueDbgMsg in getConstant to avoid a duplicate print when it calls getSplatBuildVector, since getSplatBuildVector didn't previously trigger a print.
llvm-svn: 320783
While investigating LLVM PR22316 (http://llvm.org/bugs/show_bug.cgi?id=22316)
I started wondering if it were not always preferable to emit the
initial DBG_VALUEs for stack arguments as FI locations instead of
describing the first register they get copied into. The advantage of
doing this is that the arguments will be available as soon as the
stack is set up. As illustrated by the testcase in the PR, the first
copy of the FI into a register may be sunk by MachineSink.cpp into a
later basic block. By describing the argument on the stack, we nicely
circumvent this problem.
<rdar://problem/19583723>
Differential Revision: https://reviews.llvm.org/D41135
llvm-svn: 320758
llvm-svn: 320756
Most of the -Wsign-compare warnings are due to the fact that
enums are signed by default in the MS ABI, while the
tautological comparison warnings trigger on x86 builds where
sizeof(size_t) is 4 bytes, so N > numeric_limits<unsigned>::max()
is always false.
Differential Revision: https://reviews.llvm.org/D41256
llvm-svn: 320750
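Both warning patterns can be reproduced with a small self-contained snippet (values are hypothetical):

    #include <cstddef>
    #include <limits>

    enum E { A = 1 }; // enums are signed by default in the MS ABI

    bool f(unsigned u, E e, std::size_t n) {
      bool sign = u > e; // -Wsign-compare: signed enum vs. unsigned int
      // Tautological when sizeof(size_t) == 4 (32-bit x86): n can never
      // exceed numeric_limits<unsigned>::max(), so this is always false.
      bool taut = n > std::numeric_limits<unsigned>::max();
      return sign || taut;
    }

    int main() { (void)f(1, A, 2); return 0; }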
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. This also gets rid of a bunch of bool arguments to
getMemIntrinsicNode.
On AMDGPU, buffer and image intrinsics should always
have MODereferenceable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.
llvm-svn: 320746
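A sketch of the interface shape this moves towards, with a stand-in enum rather than LLVM's real MachineMemOperand::Flags (only the flag names mirror LLVM's):

    #include <cstdint>
    #include <cstdio>

    // Illustrative flag set in the spirit of MachineMemOperand::Flags;
    // this enum is a stand-in for the sketch.
    enum MemFlags : uint16_t {
      MOLoad            = 1 << 0,
      MOStore           = 1 << 1,
      MOVolatile        = 1 << 2,
      MODereferenceable = 1 << 3,
    };

    // One extensible bitmask instead of one bool parameter per property:
    // a target can OR in MODereferenceable during initial intrinsic
    // lowering without the interface growing another argument.
    void getMemNode(uint16_t flags) {
      if (flags & MODereferenceable)
        std::puts("dereferenceable memory access");
    }

    int main() {
      getMemNode(MOLoad | MODereferenceable);
      return 0;
    }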
This reverts commit r320679. Causes miscompiles.
llvm-svn: 320698
Recommitting rL319773, which was reverted due to a recursive issue
causing timeouts. This happened because I failed to check whether
the discovered loads could be narrowed further. In the case of a tree
with one or more narrow loads that could not be further narrowed, as
well as a node that would need masking, an AND could be introduced,
which could then be visited and recombined again with the same load.
This would again create the masking load, which would be combined
again, and so on. We now check that the load can be narrowed so that
this process stops.
Original commit message:
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes, and all but one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off, meaning that the 'and' isn't removed, but the
load(s) are still narrowed.
Differential Revision: https://reviews.llvm.org/D41177
llvm-svn: 320679
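The termination bug in miniature, as an illustrative sketch: a rewrite that can recreate its own input must be gated on making progress, which is what the added check guarantees:

    #include <cassert>

    // A rewrite that narrows a load but reintroduces an AND mask: if it
    // may fire even when no narrowing is possible, the combiner can
    // recreate its own input forever.
    struct Node { unsigned loadBits; bool masked; };

    bool tryCombine(Node &n) {
      unsigned narrowed = n.masked ? 8u : n.loadBits;
      if (narrowed >= n.loadBits) // the added check: no progress, no rewrite
        return false;
      n.loadBits = narrowed;      // narrow the load...
      n.masked = true;            // ...but reintroduce an AND mask
      return true;
    }

    int main() {
      Node n{32, true};
      unsigned iterations = 0;
      while (tryCombine(n) && ++iterations < 100) {}
      assert(iterations < 100);   // a fixed point is reached
      return 0;
    }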
for AVX512F.
A v32i1 CONCAT_VECTORS of v16i1 uses promotion to v32i8 to legalize the v32i1. This results in a bunch of extract_vector_elts and a build_vector that ultimately gets scalarized.
This patch checks to see if v16i8 is legal and, if so, inserts an any_extend to it so that we can concat v16i8 to v32i8 and avoid creating the extracts.
llvm-svn: 320674
account whether the input type also needs to be promoted.
If so, go ahead and get the promoted input vector to extract from. Previously, we would create a bunch of any_extends of extract_vector_elts with an illegal input type that needs to be promoted. The legalization of those extract_vector_elts would then potentially introduce a truncate. So we would end up with a bunch of any_extends of truncates. By legalizing both parts together we avoid creating these extra nodes.
The test changes seem to be because we were previously combining the build_vector with the any_extend before the any_extend got combined with the truncate.
llvm-svn: 320669
llvm-svn: 320619
Add a missing case that was not yet implemented.
Differential Revision: https://reviews.llvm.org/D38942
llvm-svn: 320567
At first, I tried to thread the x86 needle and use a target hook (isVectorShiftByScalarCheap())
to disable the transform only for non-splat pow-of-2 constants, but not AVX2, but only some
element types, but...it's difficult.
Here we just avoid the loop between the x86 vector transform and the general DAG
combine, and otherwise preserve all of the existing behavior AFAICT.
Some tests that will probably fail if someone does try to restrict this in a more targeted way
for x86-only may be found in:
test/CodeGen/X86/combine-mul.ll
test/CodeGen/X86/vector-mul.ll
test/CodeGen/X86/widen_arith-5.ll
This should prevent the infinite looping seen with:
https://bugs.llvm.org/show_bug.cgi?id=35579
Differential Revision: https://reviews.llvm.org/D41040
llvm-svn: 320374
This commit is the first part of https://reviews.llvm.org/D40348.
In order to allow target combines to be performed on newly combined
indexed loads, add them back to the worklist. The remainder of the
above patch will be committed in subsequent revisions and will use
this. Test cases will be included with those follow-up commits.
llvm-svn: 320365
This is a preparatory step for D34515.
This change:
- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
using (_, C) ← (ARMISD::ADDC R, -1) and converting it back to an integer
value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the
two operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
borrow, we need to invert the value of the second operand and result before
and after using ARMISD::SUBE. We need to invert the carry result of
ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
ISD::SUBCARRY into ISD::UADDO and ISD::USUBO, we need to update their
lowering as well; otherwise i64 operations would now require branches. This implies
updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
- fixes PR34045
- fixes PR34564
- fixes PR35103
Differential Revision: https://reviews.llvm.org/D35192
llvm-svn: 320355
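A standalone model of the borrow inversion described above (illustrative functions, not the actual ARM lowering):

    #include <cassert>
    #include <cstdint>

    struct SubResult { uint32_t value; bool borrow; };

    // Reference model of ISD::SUBCARRY: a - b - borrowIn, plus borrow-out.
    SubResult subcarry(uint32_t a, uint32_t b, bool borrowIn) {
      uint64_t wide = (uint64_t)a - b - (borrowIn ? 1 : 0);
      return { (uint32_t)wide, (wide >> 32) != 0 };
    }

    // ARM's SBC consumes a carry flag where carry == !borrow, so lowering
    // through SUBE inverts the flag on the way in and on the way out.
    SubResult subcarry_via_sube(uint32_t a, uint32_t b, bool borrowIn) {
      bool carryIn = !borrowIn;                            // invert going in
      uint64_t wide = (uint64_t)a - b - (carryIn ? 0 : 1); // SBC semantics
      bool carryOut = (wide >> 32) == 0;                   // carry == no borrow
      return { (uint32_t)wide, !carryOut };                // invert coming out
    }

    int main() {
      for (int bi = 0; bi < 2; ++bi) {
        SubResult x = subcarry(5, 7, bi != 0);
        SubResult y = subcarry_via_sube(5, 7, bi != 0);
        assert(x.value == y.value && x.borrow == y.borrow);
      }
      return 0;
    }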
Introduces the AddrFI "addressing mode", which is necessary simply because
it's not possible to write a pattern that directly matches a frameindex.
Ensure callee-saved registers are accessed relative to the stack pointer. This
is necessary as callee-saved register spills are performed before the frame
pointer is set.
Move HexagonDAGToDAGISel::isOrEquivalentToAdd to SelectionDAGISel, so we can
make use of it in the RISC-V backend.
Differential Revision: https://reviews.llvm.org/D39848
llvm-svn: 320353
We should probably also fold (mulhs/u X, 1) for vectors, but that's harder.
llvm-svn: 320344
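The arithmetic identity such a fold relies on, checked in plain C++, where mulh returns the high half of the full product:

    #include <cassert>
    #include <cstdint>

    // mulhu/mulhs return the high 32 bits of the 64-bit product.
    uint32_t mulhu(uint32_t a, uint32_t b) {
      return (uint32_t)(((uint64_t)a * b) >> 32);
    }
    int32_t mulhs(int32_t a, int32_t b) {
      return (int32_t)(((int64_t)a * b) >> 32);
    }

    int main() {
      int32_t samples[] = {5, -5, 0, INT32_MIN, INT32_MAX};
      for (int32_t x : samples) {
        assert(mulhu((uint32_t)x, 1) == 0);      // high half of zext(x)*1
        assert(mulhs(x, 1) == (x < 0 ? -1 : 0)); // just the sign of x
      }
      return 0;
    }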
llvm-svn: 320343
Summary:
This relaxes an assertion inside SelectionDAGBuilder which is overly
restrictive on targets which have no concept of alignment (such as AVR).
On these architectures, all types are aligned to 8 bits.
After this, LLVM will only assert that accesses are aligned on targets
which actually require alignment.
This patch follows from a discussion on llvm-dev a few months ago
http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html
Reviewers: bogner, nemanjai, joerg, efriedma
Reviewed By: efriedma
Subscribers: efriedma, cactus, llvm-commits
Differential Revision: https://reviews.llvm.org/D39946
llvm-svn: 320243
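A minimal sketch of the relaxed check, assuming a hypothetical strictAlignment target query:

    #include <cassert>
    #include <cstdint>

    // 'strictAlignment' is an illustrative stand-in for a real target
    // query; the point is that the assertion is gated on the target.
    struct TargetInfo { bool strictAlignment; };

    void checkAtomicAccess(const TargetInfo &TI, uintptr_t addr,
                           unsigned sizeBytes) {
      assert((!TI.strictAlignment || addr % sizeBytes == 0) &&
             "misaligned atomic on a strict-alignment target");
      (void)TI; (void)addr; (void)sizeBytes; // unused in NDEBUG builds
    }

    int main() {
      TargetInfo avr{false}; // AVR: everything is 8-bit aligned, no check
      checkAtomicAccess(avr, 0x1001, 4);
      return 0;
    }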
is mentioned in the documentation (inserting a deref before the plus_uconst).
llvm-svn: 320203
I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either
repeating a scalar insertion at the same position in a vector or moving it
to a different element index.
Like the earlier patch, this could be an instcombine too, but since we opted to make this
a DAG transform earlier, I've made this one a DAG patch too.
We do not need any legality checking because the new insert is identical to the existing
insert except that it may have a different constant insertion operand.
The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the
motivation for D38756.
Differential Revision: https://reviews.llvm.org/D40209
llvm-svn: 320050
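A small model of the rewrite, with arrays standing in for DAG nodes and an illustrative two-input shuffle mask convention:

    #include <array>
    #include <cassert>

    using V4 = std::array<int, 4>;

    V4 insert(V4 v, int s, int idx) { v[idx] = s; return v; }

    // Two-input shuffle: lane i takes x[m] for m < 4, else y[m - 4].
    V4 shuffle(V4 x, V4 y, std::array<int, 4> mask) {
      V4 r{};
      for (int i = 0; i < 4; ++i)
        r[i] = mask[i] < 4 ? x[mask[i]] : y[mask[i] - 4];
      return r;
    }

    int main() {
      V4 base{10, 11, 12, 13};
      V4 ins = insert(V4{}, 99, 0);  // scalar inserted at index 0
      // The shuffle relocates the inserted scalar to lane 3 and keeps the
      // other lanes of 'base', so it is just an insert at a different
      // constant index -- the rewrite the commit above performs.
      V4 shuffled = shuffle(base, ins, {0, 1, 2, 4});
      assert(shuffled == insert(base, 99, 3));
      return 0;
    }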
makes the type byte addressable.
We can just extend the original vector to vXi1 and trust that the legalization process will revisit it.
llvm-svn: 320013
EXTRACT_VECTOR_ELT index instead of hardcoding MVT::i8.
llvm-svn: 320012
Reenable post-legalize stores with constant merging computation and
corresponding test case.
* Properly truncate store merge constants
* Disable merging of truncated floating-point stores
* Ensure merges of constant stores into a single vector are
constructed from legal elements.
Reviewers: eastig, efriedma
Reviewed By: eastig
Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D40701
llvm-svn: 319899
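A standalone model of the constant-truncation point on a little-endian layout (names are illustrative):

    #include <cassert>
    #include <cstdint>
    #include <cstring>

    // Two adjacent i16 constant stores...
    void store_separate(unsigned char *p, uint16_t a, uint16_t b) {
      std::memcpy(p, &a, 2);
      std::memcpy(p + 2, &b, 2);
    }

    // ...merged into one i32 store (little-endian layout). Each constant
    // must first be truncated to its own width; merging over-wide
    // constants without truncation is the kind of bug addressed above.
    void store_merged_le(unsigned char *p, uint32_t a, uint32_t b) {
      uint32_t merged = (uint32_t)(uint16_t)a | ((uint32_t)(uint16_t)b << 16);
      std::memcpy(p, &merged, 4);
    }

    int main() {
      unsigned char x[4], y[4];
      store_separate(x, (uint16_t)0x1ABCDu, (uint16_t)0x2EF01u);
      store_merged_le(y, 0x1ABCDu, 0x2EF01u); // over-wide constants
      uint32_t probe = 1;
      if (*(unsigned char *)&probe == 1)      // little-endian host
        assert(std::memcmp(x, y, 4) == 0);
      return 0;
    }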
This reverts commit r319773. It was causing some buildbots to hang, e.g.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/5589
llvm-svn: 319867
the result.
The condition operand should be promoted during operand promotion.
llvm-svn: 319853
If the mask needs to be promoted, that should occur because the legalizer detects that the mask operand needs promotion, not as a side effect of another action.
llvm-svn: 319852
If the mask needs to be promoted, it should be handled by operand promotion after the result is legalized.
llvm-svn: 319851
result type while widening the result. Just widen the mask.
The mask will be promoted if necessary when operands are promoted. It's possible the mask type is legal but the setcc result type is different. We shouldn't promote to the setcc result type unless the mask needs to be promoted.
llvm-svn: 319850
GetWidenedVector doesn't guarantee the widened elements are zero, which would break the intended behavior of the operation.
llvm-svn: 319849
protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.
> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
llvm-svn: 319824
from vector widening.
There's no such thing as a setcc with vector operands and scalar result. And if we're trying to widen the result we would have to already be looking at a vector result type.
So this patch renames the VSETCC function to SETCC and deletes the original SETCC function.
llvm-svn: 319799
The method implementation was removed in r318982.
llvm-svn: 319798
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes, and all but one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off, meaning that the 'and' isn't removed, but the
load(s) are still narrowed.
Differential Revision: https://reviews.llvm.org/D39604
llvm-svn: 319773
Summary:
Found out, by code inspection, that there was a fault in
DAGCombiner::CombineConsecutiveLoads for big-endian targets.
A BUILD_PAIR always has the least significant bits of
the composite value in element 0. So when we are doing the checks
for consecutive loads, for big-endian targets, we should check
that the load to elt 1 is at the lower address and the load
to elt 0 is at the higher address.
Normally this bug only resulted in missed opportunities for
doing the load combine. I guess that in some rare situation it
could lead to faulty combines, but I've not seen that happen.
Note that this patch actually will trigger load combine for
some big endian regression tests.
One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get
t76: i64,ch = load<LD8[FixedStack-9]>
instead of
t37: i32,ch = load<LD4[FixedStack-10]>
t35: i32,ch = load<LD4[FixedStack-9]>
t41: i64 = build_pair t37, t35
before legalization. Then the legalization will split the LD8
into two loads, so the end result is the same. That should
verify that the transformation is correct now.
Reviewers: niravd, hfinkel
Reviewed By: niravd
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D40444
llvm-svn: 319771
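The corrected consecutive-load check, modeled standalone:

    #include <cstddef>

    // BUILD_PAIR(elt0, elt1) always puts the least significant half in
    // element 0. For the pair to be one wide consecutive load, the half
    // expected at the lower address depends on endianness -- the check
    // the commit above corrects for big-endian.
    struct Load { size_t addr; };

    bool isConsecutivePair(Load elt0, Load elt1, size_t halfBytes,
                           bool bigEndian) {
      if (bigEndian)
        // High half (elt1) lives at the lower address.
        return elt0.addr == elt1.addr + halfBytes;
      // Little-endian: low half (elt0) lives at the lower address.
      return elt1.addr == elt0.addr + halfBytes;
    }

    int main() {
      Load lo{0x1004}, hi{0x1000}; // 4-byte halves of a big-endian i64
      return isConsecutivePair(lo, hi, 4, /*bigEndian=*/true) ? 0 : 1;
    }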
Pull the checks on the load out of ReduceLoadWidth into their own
function.
Differential Revision: https://reviews.llvm.org/D40833
llvm-svn: 319766
WidenVecOp_MSTORE instead of implementing it manually and incorrectly.
The CONCAT_VECTORS operand gets its type from getSetCCResultType, but if the mask type and the setcc result have different scalar sizes this creates an illegal CONCAT_VECTORS operation. The concat type should be 2x the mask type, and then an extend should be added if needed.
llvm-svn: 319744
the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.
> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8
llvm-svn: 319706
MatchRotate assumes the types of LHS and RHS are equal,
which is always the case when they come from an OR node, but here
we're getting them from two different TRUNC nodes, so we have to check
the types.
llvm-svn: 319695
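Why the equal-type requirement matters, shown via the scalar rotate identity MatchRotate is built on:

    #include <cassert>
    #include <cstdint>

    // The rotate identity (x << c) | (x >> (w - c)) only holds when both
    // shifts see the same width w. If the two OR operands came from
    // truncates of different widths, combining them into a single rotate
    // would be wrong -- hence the added type check.
    uint32_t rotl32(uint32_t x, unsigned c) {
      c &= 31;
      return c ? (x << c) | (x >> (32 - c)) : x;
    }

    int main() {
      assert(rotl32(0x80000001u, 1) == 0x00000003u);
      return 0;
    }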
If the truncation has been pushed past the or-node, look through it and
truncate afterwards.
Differential revision: https://reviews.llvm.org/D40792
llvm-svn: 319692
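The identity the look-through relies on, checked standalone: truncation distributes over OR:

    #include <cassert>
    #include <cstdint>

    // trunc(a | b) == trunc(a) | trunc(b), so a truncate that was pushed
    // past the OR can be looked through and re-applied afterwards.
    int main() {
      uint64_t a = 0x123456789ABCDEF0ull, b = 0x0FEDCBA987654321ull;
      uint32_t lhs = (uint32_t)(a | b);
      uint32_t rhs = (uint32_t)a | (uint32_t)b;
      assert(lhs == rhs);
      return 0;
    }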
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.
The MIR printer prints the IR name of a MBB only for block definitions.
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
Differential Revision: https://reviews.llvm.org/D40422
llvm-svn: 319665
Both LoadedVT and NarrowLoad are passed as references, and neither
of them is used by any of the callers.
Differential Revision: https://reviews.llvm.org/D40713
llvm-svn: 319645
non-splat constant shift amount.
If we have a non-splat constant shift amount, the minimum shift amount can be used to infer the number of zero upper bits of the result. There's probably a lot more that we can do here, but this
fixes a case where I wanted to infer the sign bit as zero when all the shift amounts are non-zero.
llvm-svn: 319639
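The known-bits reasoning, checked in plain C++: shifting right by at least k clears at least the top k bits of every lane:

    #include <cassert>
    #include <cstdint>

    // For a logical right shift by a non-splat constant vector amount,
    // the *minimum* lane amount still bounds the result.
    int main() {
      uint32_t lanes[4]  = {0xFFFFFFFFu, 0x80000001u, 0x12345678u, 0xDEADBEEFu};
      unsigned shifts[4] = {3, 5, 9, 3}; // non-splat; minimum is 3
      for (int i = 0; i < 4; ++i) {
        uint32_t r = lanes[i] >> shifts[i];
        assert((r >> (32 - 3)) == 0);    // top 3 bits known zero
      }
      return 0;
    }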
SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is
not true for the amdgcn---amdgiz target.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D40255
llvm-svn: 319630
bounds checked the shift.
The version that takes APInt is out of line. The 'unsigned' version optimizes for the common case of single word APInts.
llvm-svn: 319628
SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT
Two issues were found when doing codegen for splitting a vector with a non-zero alloca addr space:
DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating
the SDStore. Since one pointer operand contains a multiply and an add, InferPointerInfo is unable to
infer the correct pointer info, which leaves the target with a dummy pointer info when lowering the
store and results in an isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to
represent MachinePointerInfo which is known in alloca address space but without other information.
TargetLowering::getVectorElementPointer uses the value type of a pointer in addr space 0 for the
multiplication of the index and then adds it to the pointer. However, the pointer may be in an addr
space which has a different size than addr space 0. The fix is to use the pointer value type for
index multiplication.
Differential Revision: https://reviews.llvm.org/D39758
llvm-svn: 319622
This reverts r319543, due to ASan bot breakage.
llvm-svn: 319591