| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
These tests should have been included in r310697 / D34099 but apparently
I missed them.
llvm-svn: 313737
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Also starts selecting global loads for constant address
in some cases. Some end up selecting to mubuf still, which
requires investigation.
We still get sub-optimal regalloc and extra waitcnts inserted
due to not really tracking the liveness of the separate register
halves.
llvm-svn: 313716
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D36849
llvm-svn: 313714
|
| |
|
|
| |
llvm-svn: 313712
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This re-applies commit r313685, this time with the proper updates to
the test cases.
Original commit message:
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats a non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and, given that the block is empty, just assumes it falls
through to the next block (if any).
For instance, the following test case used to fail the verifier.
The MIR printer would print
entry
/ \
true (def) false (no list of successors)
|
split.true (use)
The MIR parser would understand this:
entry
/ \
true (def) false
| / <-- invalid edge
split.true (use)
Because of the invalid edge, we get the "def does not
dominate all uses" error.
The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.
rdar://problem/34022159
llvm-svn: 313696
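The round-trip problem described in this message can be illustrated with a small toy model. This is only a sketch: the textual block format, `print_block`, and `parse_successors` are hypothetical stand-ins, not the real MIR syntax or any LLVM API.

```python
# Toy model of the printer/parser mismatch; none of these names are LLVM APIs.

def print_block(name, successors, print_empty_lists):
    """Return the textual form of one block as a list of lines."""
    lines = [f"bb.{name}:"]
    if successors or print_empty_lists:
        lines.append("  successors: " + ", ".join(successors))
    return lines

def parse_successors(lines, next_block):
    """Return the successor list the parser ends up with for one block."""
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("successors:"):
            listed = stripped[len("successors:"):].strip()
            return [s.strip() for s in listed.split(",")] if listed else []
    # No list was printed, so guess: an empty block is assumed to fall
    # through to the next block in layout order (if any).
    return [next_block] if next_block is not None else []

# Layout order: entry, true, false, split.true.
# 'false' is the unreachable-style block: empty, with no successors.
old = print_block("false", [], print_empty_lists=False)
new = print_block("false", [], print_empty_lists=True)

guessed = parse_successors(old, next_block="split.true")
explicit = parse_successors(new, next_block="split.true")
print(guessed)   # ['split.true'], i.e. the invalid edge
print(explicit)  # []
```

With the empty list printed explicitly, the parser has nothing to guess, so the extra edge into split.true never appears.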
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r313685.
I thought I had run ninja check, but apparently I didn't...
Need to update a bunch of mir tests.
llvm-svn: 313686
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats a non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and, given that the block is empty, just assumes it falls
through to the next block (if any).
For instance, the following test case used to fail the verifier.
The MIR printer would print
entry
/ \
true (def) false (no list of successors)
|
split.true (use)
The MIR parser would understand this:
entry
/ \
true (def) false
| / <-- invalid edge
split.true (use)
Because of the invalid edge, we get the "def does not
dominate all uses" error.
The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.
rdar://problem/34022159
llvm-svn: 313685
|
| |
|
|
|
|
|
|
|
| |
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.
Differential Revision: https://reviews.llvm.org/D38014
llvm-svn: 313670
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.
This transformation is correct for regular stores, but not for
truncating stores. The routine neglected to check for that case.
Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.
llvm-svn: 313669
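Why the transformation is only valid for non-truncating stores can be checked with a few lines of byte arithmetic. The sketch below assumes big-endian memory (as on SystemZ) and models a byte-reversed store as "write the low bytes of the register in reversed order"; the function names are illustrative and not taken from the backend.

```python
def bswap32(x):
    """Reverse the four bytes of a 32-bit value."""
    return int.from_bytes(x.to_bytes(4, "big"), "little")

def store32_of_bswap32(x):
    """Bytes written by a regular 32-bit store of bswap(x)."""
    return bswap32(x).to_bytes(4, "big")

def byte_reversed_store32(x):
    """Bytes written by a 32-bit byte-reversed store of x."""
    return x.to_bytes(4, "little")

def truncstore16_of_bswap32(x):
    """Bytes written by a truncating 16-bit store of bswap(x)."""
    return (bswap32(x) & 0xFFFF).to_bytes(2, "big")

def byte_reversed_store16(x):
    """Bytes written by a 16-bit byte-reversed store of x."""
    return (x & 0xFFFF).to_bytes(2, "little")

x = 0x11223344
print(store32_of_bswap32(x).hex(), byte_reversed_store32(x).hex())       # 44332211 44332211
print(truncstore16_of_bswap32(x).hex(), byte_reversed_store16(x).hex())  # 2211 4433
```

The full-width case agrees, but the truncating store keeps the low half of the swapped value, which comes from the high half of the original, so a narrower byte-reversed store writes different bytes.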
|
| |
|
|
|
|
|
|
|
|
| |
Two blocks prior to the join each perform an li and the join block has an
add using the initialized register. Optimize each predecessor block to instead
use addi and delete the li's and add.
Differential Revision: https://reviews.llvm.org/D36734
llvm-svn: 313639
|
| |
|
|
|
|
| |
Include instances of FP register pairs.
llvm-svn: 313638
|
| |
|
|
| |
llvm-svn: 313633
|
| |
|
|
| |
llvm-svn: 313632
|
| |
|
|
| |
llvm-svn: 313631
|
| |
|
|
| |
llvm-svn: 313629
|
| |
|
|
|
|
| |
This maps directly to G_INTRINSIC_W_SIDE_EFFECTS.
llvm-svn: 313627
|
| |
|
|
|
|
|
|
| |
This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D37669
llvm-svn: 313625
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a preparatory step for D34515.
This change:
- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
borrow, we need to invert the value of the second operand and result before
and after using ARMISD::SUBE. We need to invert the carry result of
ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
ISD::SUBCARRY into ISD::UADDO and ISD::USUBO, we need to update their lowering
as well; otherwise i64 operations would now require branches. This implies
updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
- fixes PR34045
- fixes PR34564
Differential Revision: https://reviews.llvm.org/D35192
llvm-svn: 313618
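The boolean/carry-flag conversions listed above can be modelled with plain 32-bit arithmetic. This is a sketch of the arithmetic only, not of the DAG lowering code: `addc`, `adde`, and `sube` are hypothetical Python helpers standing in for ARMISD::ADDC, ARMISD::ADDE, and ARMISD::SUBE, and the subtraction model assumes the usual ARM convention that the carry flag is the inverse of a borrow.

```python
MASK = 0xFFFFFFFF

def addc(a, b):
    """32-bit add; returns (result, carry_out)."""
    total = (a & MASK) + (b & MASK)
    return total & MASK, total >> 32

def adde(a, b, carry):
    """32-bit add with carry-in; returns (result, carry_out)."""
    total = (a & MASK) + (b & MASK) + carry
    return total & MASK, total >> 32

def sube(a, b, carry):
    """32-bit subtract with carry-in (carry = NOT borrow): a + ~b + carry."""
    total = (a & MASK) + (~b & MASK) + carry
    return total & MASK, total >> 32

def bool_to_carry(r):
    # (_, C) <- ADDC R, -1 : carries out exactly when R is non-zero.
    _, c = addc(r, -1)
    return c

def carry_to_bool(c):
    # (R, _) <- ADDE 0, 0, C
    r, _ = adde(0, 0, c)
    return r

def addcarry(a, b, carry_in):
    """Model of ISD::ADDCARRY: returns (sum, carry_out as 0/1)."""
    result, c_out = adde(a, b, bool_to_carry(carry_in))
    return result, carry_to_bool(c_out)

def subcarry(a, b, borrow_in):
    """Model of ISD::SUBCARRY: the second result is a borrow, so invert
    it on the way into and out of the carry-based subtract."""
    result, c_out = sube(a, b, bool_to_carry(1 - borrow_in))
    return result, 1 - carry_to_bool(c_out)

print(addcarry(0xFFFFFFFF, 0, 1))  # (0, 1)
print(subcarry(0, 1, 0))           # (4294967295, 1)
print(subcarry(5, 3, 1))           # (1, 0)
```

The new combine corresponds to the observation that `bool_to_carry(carry_to_bool(c))` returns `c` unchanged, so a back-to-back conversion pair is redundant.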
|
| |
|
|
| |
llvm-svn: 313617
|
| |
|
|
|
|
|
|
| |
The relocations used for externally visible functions
aren't supported, so the direct call emitted ends
up hitting a linker error.
llvm-svn: 313616
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes the latency, the number of micro-ops, and the ports used by each SKL instruction.
The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879.
Please expect some performance fluctuations due to code alignment effects.
Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena
Differential Revision: https://reviews.llvm.org/D37294
llvm-svn: 313613
|
| |
|
|
|
|
| |
table.
llvm-svn: 313610
|
| |
|
|
|
|
| |
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 313593
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If we have an AssertZext of a truncated value that has already been AssertZext'ed,
we can assert on the wider source op to improve the zext-y knowledge:
assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN
This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.
Differential Revision: https://reviews.llvm.org/D37017
llvm-svn: 313577
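The fold can be sanity-checked by brute force on small bit widths. The sketch below treats an AssertZext to iK as the fact "the value is below 2^K" and assumes the truncated width N is at least as wide as the inner asserted type; that precondition is an assumption made for this illustration. All names are hypothetical.

```python
def satisfies_zext_assert(value, bits):
    """True if every bit at position >= bits is zero."""
    return value < (1 << bits)

def trunc(value, bits):
    """Keep only the low 'bits' bits."""
    return value & ((1 << bits) - 1)

def fold_is_sound(src_bits, n, inner, outer):
    """Check: whenever (assert X, inner) and
    (assert (trunc X to n), outer) both hold, the stronger
    (assert X, min(inner, outer)) on the wide source holds too."""
    for x in range(1 << src_bits):
        if not satisfies_zext_assert(x, inner):
            continue
        if not satisfies_zext_assert(trunc(x, n), outer):
            continue
        if not satisfies_zext_assert(x, min(inner, outer)):
            return False
    return True

print(fold_is_sound(src_bits=12, n=8, inner=8, outer=1))  # True
print(fold_is_sound(src_bits=12, n=4, inner=8, outer=1))  # False
```

The second call shows why the precondition matters in this model: when the truncation is narrower than the inner assertion, bits that the truncate discards are not covered by either fact.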
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37981
llvm-svn: 313565
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217
We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.
I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.
Differential Revision: https://reviews.llvm.org/D37987
llvm-svn: 313564
|
| |
|
|
|
|
|
|
|
|
|
|
| |
zext is AssertZext
The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext.
This fixes PR28540.
Differential Revision: https://reviews.llvm.org/D37729
llvm-svn: 313563
|
| |
|
|
| |
llvm-svn: 313548
|
| |
|
|
|
|
|
|
| |
results.
As commented on D37849, AVX1 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts.
llvm-svn: 313547
|
| |
|
|
| |
llvm-svn: 313545
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add test cases when float <-> pointer types conversion is triggered
in presence of load instructions.
Reviewers: Ayal, srhines, mkuper, rengolin
Reviewed By: rengolin
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D37967
llvm-svn: 313544
|
| |
|
|
|
|
|
|
|
|
|
|
| |
bits.
For cases where we are BITCASTing to vectors of smaller elements, if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth), we can say that the dst's NumSignBits == DstBitWidth, as we're just splitting those sign bits across multiple elements.
We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results.
Differential Revision: https://reviews.llvm.org/D37849
llvm-svn: 313543
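A small model shows why the rule is restricted to the "all sign bits" case. This is a sketch on plain integers; `num_sign_bits` and `bitcast_split` are illustrative helpers, not the SelectionDAG functions.

```python
def num_sign_bits(value, width):
    """Number of leading bits equal to the sign bit (sign bit included)."""
    sign = (value >> (width - 1)) & 1
    count = 0
    for bit in range(width - 1, -1, -1):
        if (value >> bit) & 1 != sign:
            break
        count += 1
    return count

def bitcast_split(value, src_width, dst_width):
    """Split one src_width-bit element into dst_width-bit elements."""
    mask = (1 << dst_width) - 1
    return [(value >> shift) & mask for shift in range(0, src_width, dst_width)]

# A splatted sign (all ones or all zeros) stays all-sign-bits after the split.
for src in (0xFFFFFFFF, 0x00000000):
    parts = bitcast_split(src, 32, 8)
    print(num_sign_bits(src, 32), [num_sign_bits(p, 8) for p in parts])
    # 32 [8, 8, 8, 8]

# With fewer than SrcBitWidth sign bits, some element may have only one.
src = 0xFFFFFF80
print(num_sign_bits(src, 32), [num_sign_bits(p, 8) for p in bitcast_split(src, 32, 8)])
# 25 [1, 8, 8, 8]
```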
|
| |
|
|
|
|
|
|
|
|
|
|
| |
enabled
The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2X128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs, meaning that when we use it for a unary shuffle, one of those inputs is left undefined, creating a false dependency on whatever register gets allocated there.
If we have VPERMQ/PD we should prefer those since they only have a single input.
Differential Revision: https://reviews.llvm.org/D37947
llvm-svn: 313542
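The equivalence for a unary lane swap can be seen in a small simulation. This is a sketch that models 256-bit registers as lists of four 64-bit elements; `vperm2x128` and `vpermq` are Python stand-ins written from the documented immediate encodings, and the immediates 0x01 and 0x4E are just one example.

```python
def vperm2x128(src1, src2, imm):
    """Two-input 128-bit lane permute.
    imm[1:0] picks the low result half, imm[5:4] the high result half
    (0/1 = src1 low/high, 2/3 = src2 low/high); imm[3] and imm[7] zero a half."""
    halves = [src1[0:2], src1[2:4], src2[0:2], src2[2:4]]
    low = [0, 0] if imm & 0x08 else halves[imm & 0x3]
    high = [0, 0] if imm & 0x80 else halves[(imm >> 4) & 0x3]
    return low + high

def vpermq(src, imm):
    """Single-input permute of four 64-bit elements, 2 selector bits each."""
    return [src[(imm >> (2 * i)) & 0x3] for i in range(4)]

a = [10, 11, 12, 13]
undef = [None] * 4  # the second input is never read for this shuffle

print(vperm2x128(a, undef, 0x01))  # [12, 13, 10, 11]
print(vpermq(a, 0x4E))             # [12, 13, 10, 11]
```

Both produce the same result, but only the two-input form names a second register, which is where the false dependency comes from.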
|
| |
|
|
|
|
|
|
| |
As discussed on PR28925 and D37849.
Differential Revision: https://reviews.llvm.org/D37975
llvm-svn: 313532
|
| |
|
|
| |
llvm-svn: 313529
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Subregister liveness tracking is not implemented for X86 backend, so
sometimes the whole super register is said to be live, when only a
subregister is really live. That might happen if the def and the use
are located in different MBBs, see added fixup-bw-inst.mir test.
However, using knowledge of the specific instructions handled by the
bw-fixup-pass we can get more precise liveness information which this
change does.
Reviewers: MatzeB, DavidKreitzer, ab, andrew.w.kaylor, craig.topper
Reviewed By: craig.topper
Subscribers: n.bozhenov, myatsina, llvm-commits, hiraditya
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D37559
llvm-svn: 313524
|
| |
|
|
|
|
|
|
|
| |
Related to patch: https://reviews.llvm.org/D35772
Adds the LLVM gather tests ahead of gather codegen support.
Differential Revision: https://reviews.llvm.org/D37800
llvm-svn: 313516
|
| |
|
|
|
|
|
|
| |
unpcklpd for the packed single domain.
MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter.
llvm-svn: 313509
|
| |
|
|
|
|
| |
instructions.
llvm-svn: 313508
|
| |
|
|
| |
llvm-svn: 313507
|
| |
|
|
|
|
| |
shuffles with SSE1 only.
llvm-svn: 313504
|
| |
|
|
|
|
| |
These can be implemented with movlhps and movhlps.
llvm-svn: 313503
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37962
llvm-svn: 313490
|
| |
|
|
| |
llvm-svn: 313483
|
| |
|
|
| |
llvm-svn: 313479
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: G_FCONSTANT support, port the implementation from X86FastIsel.
Reviewers: zvi, delena, guyblank
Reviewed By: delena
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D37734
llvm-svn: 313478
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows vector-sized store merging of constants in DAGCombiner using the existing code in MergeConsecutiveStores().
All of the twisted logic that decides exactly what vector operations are legal and fast for each particular CPU is
handled separately in there using the appropriate hooks.
For the motivating tests in merge-store-constants.ll, we already produce the same vector code in IR via the SLP vectorizer.
So this is just providing a backend backstop for code that doesn't go through that pass (-O1). More details in PR24449:
https://bugs.llvm.org/show_bug.cgi?id=24449 (this change should be the last step to resolve that bug)
Differential Revision: https://reviews.llvm.org/D37451
llvm-svn: 313458
|
| |
|
|
|
|
|
|
|
|
| |
the load is on the first input to the SDNode.
We just need to toggle bits 1 and 5 of the immediate and swap the sources. The peephole pass could trigger commuting/folding for this later, but it's easy enough to fix in isel.
Disable the peephole pass on the main vperm2x128 test so we know we're doing this through isel.
llvm-svn: 313455
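The commuting rule (swap the sources, toggle bits 1 and 5 of the immediate) can be checked exhaustively over all 256 immediates. This is a sketch with 256-bit registers modelled as lists of four 64-bit elements; `vperm2x128` and `commute_imm` are illustrative helpers, not isel code.

```python
def vperm2x128(src1, src2, imm):
    """Two-input 128-bit lane permute.
    In each 2-bit selector, bit 1 picks the source and bit 0 picks the half;
    imm[3] and imm[7] zero the low and high result halves."""
    halves = [src1[0:2], src1[2:4], src2[0:2], src2[2:4]]
    low = [0, 0] if imm & 0x08 else halves[imm & 0x3]
    high = [0, 0] if imm & 0x80 else halves[(imm >> 4) & 0x3]
    return low + high

def commute_imm(imm):
    """Flip the source-select bit of both selectors (bits 1 and 5)."""
    return imm ^ 0x22

a = [10, 11, 12, 13]
b = [20, 21, 22, 23]

all_match = all(
    vperm2x128(a, b, imm) == vperm2x128(b, a, commute_imm(imm))
    for imm in range(256)
)
print(all_match)  # True
```

Because only the source-select bits change, the half-select and zeroing bits carry over unchanged, which is what lets the load end up on whichever operand can be folded.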
|
| |
|
|
|
|
| |
the intrinsic upgrade file and regenerated.
llvm-svn: 313454
|
| |
|
|
|
|
| |
I missed that we already had a pretty thorough test file for these instructions.
llvm-svn: 313451
|