| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 321585
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I have been getting rather difficult to reproduce SIGBUS crashes when
compiling certain FreeBSD sources, and their stack traces pointed
squarely at `SelectionDAG::salvageDebugInfo()`:
```
Core was generated by `/usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/usr/bin/cc -cc1 -'.
Program terminated with signal SIGBUS, Bus error.
#0 isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
115 bool isInvalidated() const { return Invalid; }
(gdb) bt
#0 isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
#1 salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
#2 0x00000000033b2516 in operator() () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3595
#3 __invoke<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/type_traits:4323
#4 __call<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/__functional_base:349
#5 operator() () at /usr/include/c++/v1/functional:1562
#6 0x00000000033b0817 in operator() () at /usr/include/c++/v1/functional:1916
#7 NodeDeleted () at /share/dim/src/freebsd/clang600-import/contrib/llvm/include/llvm/CodeGen/SelectionDAG.h:293
#8 0x0000000003529dde in RemoveDeadNodes () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:610
#9 0x00000000035556df in MorphNodeTo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:6794
#10 0x00000000033a9acc in MorphNode () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:2594
#11 0x00000000033ac80b in SelectCodeCommon () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3601
#12 0x00000000023d464b in SelectCode () at /usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/X86GenDAGISel.inc:282902
#13 Select () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:3072
#14 0x00000000033a5afa in DoInstructionSelection () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:988
#15 0x00000000033a4e1a in CodeGenAndEmitDAG () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:868
#16 0x00000000033a2643 in SelectAllBasicBlocks () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1624
#17 0x000000000339f158 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#18 0x00000000023d03c4 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:175
#19 0x00000000035cc8c2 in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
#20 0x00000000030dca9a in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1520
#21 0x00000000030dccf3 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1541
#22 0x00000000030dd228 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1597
#23 run () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1700
#24 0x00000000014db578 in EmitAssembly () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:815
#25 EmitBackendOutput () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:1181
#26 0x00000000014d5b26 in HandleTranslationUnit () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:292
#27 0x0000000001c4c332 in ParseAST () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Parse/ParseAST.cpp:159
#28 0x00000000015d546c in Execute () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/FrontendAction.cpp:897
#29 0x0000000001cec311 in ExecuteAction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/CompilerInstance.cpp:991
#30 0x00000000014b4f81 in ExecuteCompilerInvocation () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:252
#31 0x00000000014aa73f in cc1_main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/cc1_main.cpp:221
#32 0x00000000014b2928 in ExecuteCC1Tool () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:309
#33 main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:388
(gdb) frame 1
#1 salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
7116 if (DV->isInvalidated())
(gdb) disassemble
Dump of assembler code for function salvageDebugInfo():
[...]
0x0000000003557348 <+744>: nopl 0x0(%rax,%rax,1)
0x0000000003557350 <+752>: mov (%r12),%r13
=> 0x0000000003557354 <+756>: cmpb $0x0,0x31(%r13)
0x0000000003557359 <+761>: jne 0x35573b0 <salvageDebugInfo()+848>
(gdb) info registers
[...]
r13 0x5a5a5a5a5a5a5a5a 6510615555426900570
```
The `0x5a5a5a5a5a5a5a5a` value in `r13` indicates the memory was either
uninitialized, or already freed.
Unfortunately I do not have a simple self-contained test case for this.
However, it seems pretty clear that the call to `AddDbgValue()` in
`salvageDebugInfo()` causes the problems, since it modifies
`SelectionDag::DbgInfo` while looping through one of its DenseMaps:
```
void SelectionDAG::salvageDebugInfo(SDNode &N) {
[...]
for (auto DV : GetDbgValues(&N)) {
if (DV->isInvalidated())
continue;
[...]
AddDbgValue(Clone, N0.getNode(), false);
[...]
}
}
```
At least, if I comment out the `AddDbgValue()` call, the crashes go
away. I propose to change this function slightly, similar to the
`SelectionDAG::transferDbgValues()` function just above it, to save the
cloned SDDbgValues in a separate SmallVector, and only call
AddDbgValue() on them after the for loop is done.
Reviewers: aprantl, bogner, bkramer, davide
Reviewed By: davide
Subscribers: davide, krytarowski, JDevlieghere, emaste, llvm-commits
Differential Revision: https://reviews.llvm.org/D41589
llvm-svn: 321545
|
| |
|
|
|
|
| |
and scatter.
llvm-svn: 321540
|
| |
|
|
| |
llvm-svn: 321535
|
| |
|
|
|
|
|
|
| |
For example, float operations may fail to constant fold under certain circumstances (inf/nan/denormal creation etc.)
Reduced from oss-fuzz #4802 test case
llvm-svn: 321488
|
| |
|
|
|
|
|
|
| |
getSExtValue/getZExtValue
Reduced from oss-fuzz #4782 test case
llvm-svn: 321464
|
| |
|
|
|
|
|
|
| |
(add X, 1), 2) for i1
Reduced from oss-fuzz #4773 test case
llvm-svn: 321455
|
| |
|
|
|
|
|
|
| |
other combines a chance to run.
This moves the combine for turning ANDs into shuffle with zero out of SimplifyVBinOps and places it only in visitAND below the reassociate handling. This fixes the specific case I noticed where we failed to combine two ands with constants.
llvm-svn: 321417
|
| |
|
|
|
|
| |
of constant build vectors.
llvm-svn: 321414
|
| |
|
|
|
|
|
|
|
|
| |
to get the type of the operand.
getOperand returns an SDValue that contains the node and the result number. There is no guarantee that the result number if 0. By using the -> operator we are calling SDNode::getValueType rather than SDValue::getValueType. This requires supplying a result number and we shouldn't assume it was 0.
I don't have a test case. Just noticed while cleaning up some other code and saw that it occurred in other places.
llvm-svn: 321397
|
| |
|
|
| |
llvm-svn: 321391
|
| |
|
|
|
|
|
|
|
| |
BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.
Relanding after correcting base address matching check.
llvm-svn: 321389
|
| |
|
|
|
|
|
|
|
|
| |
NFCI."
which was causing miscompilations in for some test-suite components.
This reverts commit 3e9de9ff0f3162920a2a3cba51c7dc14b54b4d16.
llvm-svn: 321380
|
| |
|
|
|
|
|
|
|
|
| |
TargetLowering::getVectorElementPointer so that the FrameIndex is on the left.
This seems to improve X86's ability to match this into an address computation. Otherwise the other operand gets assigned to the base register and the stack pointer + frame index ends up in the index register. But index registers can't encode ESP/RSP so we end up having to move it into another register to meet the constraint.
I could try to improve the address matcher in X86, but swapping the producer seemed easier. Several other places already have the operands in this order so this is at least consistent.
llvm-svn: 321370
|
| |
|
|
|
|
|
| |
BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.
llvm-svn: 321364
|
| |
|
|
|
|
|
| |
Improve ReduceLoadWidth for SRL Patch is causing an issue on the
PPC64 BE santizer.
llvm-svn: 321349
|
| |
|
|
|
|
| |
More general cases are already handled by constant canonicalization and then the ReassociateOps call at line 5327
llvm-svn: 321280
|
| |
|
|
|
|
|
|
| |
combine to work on non-splat vectors
The knownbits_mask_or_shuffle_uitofp change is interesting - shuffle combines manage to kick in, removing the AND constant mask load. For targets with fast-variable-shuffle this should reduce further to VPOR+VPSHUFB+VCVTDQ2PS.
llvm-svn: 321279
|
| |
|
|
|
|
| |
work on non-splat vectors
llvm-svn: 321275
|
| |
|
|
|
|
|
|
|
|
|
| |
If the SRL node is only used by an AND, we may be able to set the
ExtVT to the width of the mask, making the AND redundant. To support
this, another check has been added in isLegalNarrowLoad which queries
whether the load is valid.
Differential Revision: https://reviews.llvm.org/D41350
llvm-svn: 321259
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When intrinsics are allowed to have mem operands, there
are two ways this can happen. First is an intrinsic
that is marked has having a mem operand, but is not handled
by getTgtMemIntrinsic.
The second way can occur even for intrinsics which do not
have a mem operand. It seems the selector table does
some kind of sorting based on the opcode, and the
mem ref recording can happen in the same scope for
intrinsics that both do and do not have mem refs.
I haven't been able to figure out exactly why this happens
(although it happens even with the matcher optimizations disabled).
I'm not sure if it's worth trying to avoid hitting this for
these nodes since I think it's still reasonable to handle
this in case getTgtMemIntrinic is not implemented.
llvm-svn: 321208
|
| |
|
|
|
|
|
| |
Prevent overlapping store elision when overlapping store is
pre-inc/dec as analysis is wrong in these cases.
llvm-svn: 321204
|
| |
|
|
|
|
|
|
|
| |
These functions simply call their counterparts in the associated SDNode,
which do take an optional SelectionDAG. This change makes the legalization
debug trace a little easier to read, since target-specific nodes will
now have their names shown instead of "Unknown node #123".
llvm-svn: 321180
|
| |
|
|
| |
llvm-svn: 321114
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Extend overlapping store elision to handle overwrites of stores by
larger stores.
Nontemporal tests have been modified to add memory dependencies to
prevent store elision.
Reviewers: craig.topper, rnk, t.p.northover
Subscribers: javed.absar, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D40969
llvm-svn: 321089
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.
Differential Revision: https://reviews.llvm.org/D41177
llvm-svn: 320962
|
| |
|
|
| |
llvm-svn: 320885
|
| |
|
|
|
|
| |
The Function can never be nullptr so we can return a reference.
llvm-svn: 320884
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
non-constant index
Summary:
Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1.
We also can't type legalize v64i1 insert_vector_elt correctly on KNL due to the type not being byte addressable as required by the legalizing through memory accesses path requires.
For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that.
For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them.
Reviewers: RKSimon, delena, spatel, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40942
llvm-svn: 320849
|
| |
|
|
|
|
|
|
|
|
| |
operands call NewSDValueDbgMsg.
This makes it work better with some build_vector and concat_vectors creations.
Adjust the NewSDValueDbgMsg in getConstant to avoid duplicating the print when it calls getSplatBuildVector since getSplatBuildVector didn't trigger a print before.
llvm-svn: 320783
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While investigating LLVM PR22316 (http://llvm.org/bugs/show_bug.cgi?id=22316)
I started wondering if it were not always preferable to emit the
initial DBG_VALUEs for stack arguments as FI locations instead of
describing the first register they get copied into. The advantage of
doing this is that the arguments will be available as soon as the
stack is setup. As illustrated by the testcase in the PR, the first
copy of the FI into a register may be sunk by MachineSink.cpp into a
later basic block. By describing the argument on the stack, we nicely
circumvent this problem.
<rdar://problem/19583723>
Differential Revision: https://reviews.llvm.org/D41135
llvm-svn: 320758
|
| |
|
|
| |
llvm-svn: 320756
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Most of the -Wsign-compare warnings are due to the fact that
enums are signed by default in the MS ABI, while the
tautological comparison warnings trigger on x86 builds where
sizeof(size_t) is 4 bytes, so N > numeric_limits<unsigned>::max()
is always false.
Differential Revision: https://reviews.llvm.org/D41256
llvm-svn: 320750
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.
On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.
llvm-svn: 320746
|
| |
|
|
|
|
| |
This reverts commit r320679. Causes miscompiles.
llvm-svn: 320698
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recommitting rL319773, which was reverted due to a recursive issue
causing timeouts. This happened because I failed to check whether
the discovered loads could be narrowed further. In the case of a tree
with one or more narrow loads, that could not be further narrowed, as
well as a node that would need masking, an AND could be introduced
which could then be visited and recombined again with the same load.
This could again create the masking load, with would be combined
again... We now check that the load can be narrowed so that this
process stops.
Original commit message:
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.
Differential Revision: https://reviews.llvm.org/D41177
llvm-svn: 320679
|
| |
|
|
|
|
|
|
|
|
| |
for AVX512F.
A v32i1 CONCAT_VECTORS of v16i1 uses promotion to v32i8 to legalize the v32i1. This results in a bunch of extract_vector_elts and a build_vector that ultimately gets scalarized.
This patch checks to see if v16i8 is legal and inserts a any_extend to that so that we can concat v16i8 to v32i8 and avoid creating the extracts.
llvm-svn: 320674
|
| |
|
|
|
|
|
|
|
|
| |
account whether the input type also needs to be promoted.
If so go ahead and get the promoted input vector to extract from. Previously, we would create a bunch of any_extends of extract_vector_elts with illegal input type that needs to be promoted. The legalization of those extract_vector_elts would then potentially introduce a truncate. So now we have a bunch of any_extends of truncates. By legalizing both parts together we avoid creating these extra nodes.
The test changes seem to be because we were previously combining the build_vector with the any_extend before the any_extend got combined with the truncate.
llvm-svn: 320669
|
| |
|
|
| |
llvm-svn: 320619
|
| |
|
|
|
|
|
|
| |
Add missing case that was not implemented yet.
Differential Revision: https://reviews.llvm.org/D38942
llvm-svn: 320567
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At first, I tried to thread the x86 needle and use a target hook (isVectorShiftByScalarCheap())
to disable the transform only for non-splat pow-of-2 constants, but not AVX2, but only some
element types, but...it's difficult.
Here we just avoid the loop with the x86 vector transform that conflicts with the general DAG
combine and preserve all of the existing behavior AFAICT otherwise.
Some tests that will probably fail if someone does try to restrict this in a more targeted way
for x86-only may be found in:
test/CodeGen/X86/combine-mul.ll
test/CodeGen/X86/vector-mul.ll
test/CodeGen/X86/widen_arith-5.ll
This should prevent the infinite looping seen with:
https://bugs.llvm.org/show_bug.cgi?id=35579
Differential Revision: https://reviews.llvm.org/D41040
llvm-svn: 320374
|
| |
|
|
|
|
|
|
|
|
| |
This commit is the first part of https://reviews.llvm.org/D40348.
In order to allow target combines to be performed on newly combined
indexed loads, add them back to the worklist. The remainder of the
above patch will be committed in subsequent revisions and will use
this. Test cases will be included with those follow-up commits.
llvm-svn: 320365
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a preparatory step for D34515.
This change:
- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
borrow, we need to invert the value of the second operand and result before
and after using ARMISD::SUBE. We need to invert the carry result of
ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
as well otherwise i64 operations now would require branches. This implies
updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
- fixes PR34045
- fixes PR34564
- fixes PR35103
Differential Revision: https://reviews.llvm.org/D35192
llvm-svn: 320355
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduces the AddrFI "addressing mode", which is necessary simply because
it's not possible to write a pattern that directly matches a frameindex.
Ensure callee-saved registers are accessed relative to the stackpointer. This
is necessary as callee-saved register spills are performed before the frame
pointer is set.
Move HexagonDAGToDAGISel::isOrEquivalentToAdd to SelectionDAGISel, so we can
make use of it in the RISC-V backend.
Differential Revision: https://reviews.llvm.org/D39848
llvm-svn: 320353
|
| |
|
|
|
|
| |
We should probably also fold (mulhs/u X, 1) for vectors, but that's harder.
llvm-svn: 320344
|
| |
|
|
| |
llvm-svn: 320343
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This relaxes an assertion inside SelectionDAGBuilder which is overly
restrictive on targets which have no concept of alignment (such as AVR).
In these architectures, all types are aligned to 8-bits.
After this, LLVM will only assert that accesses are aligned on targets
which actually require alignment.
This patch follows from a discussion on llvm-dev a few months ago
http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html
Reviewers: bogner, nemanjai, joerg, efriedma
Reviewed By: efriedma
Subscribers: efriedma, cactus, llvm-commits
Differential Revision: https://reviews.llvm.org/D39946
llvm-svn: 320243
|
| |
|
|
|
|
| |
is mentioned in the documentation (inserting a deref before the plus_uconst).
llvm-svn: 320203
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either
repeating a scalar insertion at the same position in a vector or translated to a different
element index.
Like the earlier patch, this could be an instcombine too, but since we opted to make this
a DAG transform earlier, I've made this one a DAG patch too.
We do not need any legality checking because the new insert is identical to the existing
insert except that it may have a different constant insertion operand.
The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the
motivation for D38756.
Differential Revision: https://reviews.llvm.org/D40209
llvm-svn: 320050
|
| |
|
|
|
|
|
|
| |
makes the type byte addressable.
We can just extend the original vector to vXi1 and trust that the legalization process will revisit it.
llvm-svn: 320013
|