| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid.
Reviewers: joerg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34088
llvm-svn: 305162
|
| |
|
|
| |
llvm-svn: 305154
|
| |
|
|
| |
llvm-svn: 305153
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D28531
llvm-svn: 305137
|
| |
|
|
| |
llvm-svn: 305131
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Fix assertion failures on F16 to/from int types in FastISel by falling
back to regular ISel
- Add a testcase of various conversion cases with FastISel (-O0)
Reviewers: kristof.beyls, jmolloy, SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33734
llvm-svn: 305127
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D34046
llvm-svn: 305098
|
| |
|
|
|
|
| |
If the inputs won't saturate during packing then we can treat the PACKSS as a truncation shuffle
llvm-svn: 305091
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and returns"
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.
The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.
Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.
By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.
Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".
This patch enables the MIPS backend to take either form for vector types.
The previous version of this patch had a "conditional move or jump depends on
uninitialized value".
Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur
Differential Revision: https://reviews.llvm.org/D27845
llvm-svn: 305083
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Alloca promotion pass not dealing with non-canonical input
Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical)
Also added some test cases in non-canonical form to check that it no longer crashes
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D31710
llvm-svn: 305079
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.
Reviewers: bkramer, spatel, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33731
llvm-svn: 305070
|
| |
|
|
|
|
|
|
|
|
| |
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.
Differential Revision: https://reviews.llvm.org/D34040
llvm-svn: 305064
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.
(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.
(2) MachineVerifier: updated the test per MatzeB's review.
(3) +testcase
Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!
Differential Revision: https://reviews.llvm.org/D33947
llvm-svn: 305016
|
| |
|
|
| |
llvm-svn: 305014
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This matches the behavior used in the SDAG when expanding memcmp.
For reference, we're intentionally treating the earlier fortified call transforms differently after:
https://bugs.llvm.org/show_bug.cgi?id=23093
https://reviews.llvm.org/rL233776
One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers:
https://reviews.llvm.org/D19781
https://reviews.llvm.org/D19801
Differential Revision: https://reviews.llvm.org/D34043
llvm-svn: 305007
|
| |
|
|
|
|
| |
Fixes using physical registers in inline asm from clang.
llvm-svn: 305004
|
| |
|
|
|
|
|
|
|
|
| |
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.
This patch fixed PR32442.
Differential Revision: https://reviews.llvm.org/D31407
llvm-svn: 305001
|
| |
|
|
|
|
|
|
| |
The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes.
Differential Revision: https://reviews.llvm.org/D33783
llvm-svn: 304998
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword
Differential Revision: https://reviews.llvm.org/D33510
llvm-svn: 304992
|
| |
|
|
|
|
|
|
|
|
|
| |
In SDAG, we don't expand libcalls with a nobuiltin attribute.
It's not clear if that's correct from the existing code comment:
"Don't do the check if marked as nobuiltin for some reason."
...adding a test here either way to show that there is currently
a different behavior implemented in the CGP-based expansion.
llvm-svn: 304991
|
| |
|
|
| |
llvm-svn: 304989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test diff for PowerPC shows we can better optimize if this case is one block.
For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed
cheap and SDAG can't optimize across blocks.
Instead of this:
_cmp_eq8:
movq (%rdi), %rax
cmpq (%rsi), %rax
je LBB23_1
## BB#2: ## %res_block
movl $1, %ecx
jmp LBB23_3
LBB23_1:
xorl %ecx, %ecx
LBB23_3: ## %endblock
xorl %eax, %eax
testl %ecx, %ecx
sete %al
retq
We get this:
cmp_eq8:
movq (%rdi), %rcx
xorl %eax, %eax
cmpq (%rsi), %rcx
sete %al
retq
And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall().
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we
can lower the IR produced here optimally.
Differential Revision: https://reviews.llvm.org/D34005
llvm-svn: 304987
|
| |
|
|
|
|
|
| |
This patch will close PR32801.
Differential Revision: https://reviews.llvm.org/D33203
llvm-svn: 304986
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We already had a test to demonstrate PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325
I'm adding tests for general memcmp expansion (see D34005 / D33963) and:
https://bugs.llvm.org/show_bug.cgi?id=33329
...plus non-power-of-2 sizes, so we can see what that looks like currently or if expanded.
llvm-svn: 304979
|
| |
|
|
|
|
|
|
| |
constants.
The initial patch was rejected: I fixed the issue and re-apply it.
llvm-svn: 304972
|
| |
|
|
|
|
|
|
| |
Add a couple of tests to increase coverage for the TableGen'erated code,
in particular for rules where 2 generic instructions may be combined
into a single machine instruction.
llvm-svn: 304971
|
| |
|
|
| |
llvm-svn: 304937
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This could be viewed as another shortcoming of the DAGCombiner:
when both operands of a compare are zexted from the same source
type, we should be able to compare the original types.
The effect on PowerPC perf is likely unnoticeable, but there's a
visible regression for x86 if we feed the suboptimal IR for memcmp
expansion to the DAG:
_cmp_eq4_zexted_to_i64:
movl (%rdi), %ecx
movl (%rsi), %edx
xorl %eax, %eax
cmpq %rdx, %rcx
sete %al
_cmp_eq4_better:
movl (%rdi), %ecx
xorl %eax, %eax
cmpl (%rsi), %ecx
sete %al
llvm-svn: 304923
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Changed immediate type for repl.ph from uimm10 to simm10 as per the specs.
Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of
GNU as.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D33594
llvm-svn: 304918
|
| |
|
|
| |
llvm-svn: 304915
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D33949
llvm-svn: 304910
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we know that both operands of an unsigned integer vector comparison are non-negative,
then it's safe to directly use a signed-compare-greater-than instruction (the only non-equality
integer vector compare predicate provided by SSE/AVX).
We're intentionally not changing the condition code to signed in order to preserve the
existing transforms that use min/max/psubus below here.
This should solve PR33276:
https://bugs.llvm.org/show_bug.cgi?id=33276
Differential Revision: https://reviews.llvm.org/D33862
llvm-svn: 304909
|
| |
|
|
|
|
|
|
| |
Adds handling for i64 SETNE comparison (both sign and zero extended).
Differential Revision: https://reviews.llvm.org/D33720
llvm-svn: 304907
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The clang compiler by default uses FastISel when invoked with -O0, which
is also the default. In that case, passing of -mxgot does not get honored,
i.e. the code path that is to deal with large got is not taken.
Clang produces same output regardless of -mxgot being present or not.
This change checks whether -mxgot is passed as an option, and turns off
FastISel if it is.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D33593
llvm-svn: 304906
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the commit message from r296921, G_MERGE_VALUES and
G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating
G_SEQUENCE in the ARM backend and remove the code dealing with it.
This boils down to the code breaking up double values for the soft float
calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of
G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR +
VMOVRRD and simplifies the code in the instruction selector.
There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but
that is part of the target-independent code for translating constant
structs. Therefore, it is beyond the scope of this commit.
llvm-svn: 304902
|
| |
|
|
|
|
|
|
| |
Adds handling for i32 SETNE comparison (both sign and zero extended).
Differential Revision: https://reviews.llvm.org/D33718
llvm-svn: 304901
|
| |
|
|
|
|
|
|
|
| |
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select to EORrr via TableGen'erated code
llvm-svn: 304898
|
| |
|
|
|
|
|
| |
This reverts commit r301394. It broke some internal buildbots, reverting
while the issue is being investigated.
llvm-svn: 304896
|
| |
|
|
|
|
| |
We were checking that the index was in range of the destination vector type, not the (larger) source vector type
llvm-svn: 304894
|
| |
|
|
|
|
|
|
|
| |
Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select ORRrr thanks to TableGen'erated code
llvm-svn: 304890
|
| |
|
|
|
|
|
|
|
| |
This is identical to the support for the other binary operators:
- widen to s32
- map into GPR
- select ANDrr (via TableGen'erated code)
llvm-svn: 304885
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
memcmp() expansion
I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the
special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle.
I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time
limitation in the DAG. We might have to just avoid those special-cases here in CGP. But
regardless of that, I think this is a win for the more general cases.
http://rise4fun.com/Alive/cbQ
Differential Revision: https://reviews.llvm.org/D33963
llvm-svn: 304849
|
| |
|
|
|
|
|
| |
3 of the tests were testing exactly the same thing: memcmp(x, y, 16) != 0.
I changed that to test 4, 7, and 16 bytes, so we can see how those differ.
llvm-svn: 304838
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: RKSimon, DavidKreitzer
Differential Revision: https://reviews.llvm.org/D33684
Patch by: Aleen Farhana <Farhana.aleen@gmail.com>
llvm-svn: 304834
|
| |
|
|
|
|
|
|
| |
- Add -x <language> option to switch between IR and MIR inputs.
- Change MIR parser to read from stdin when filename is '-'.
- Add a simple mir roundtrip test.
llvm-svn: 304825
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The patch makes instruction count the highest priority for
LSR solution for X86 (previously registers had highest priority).
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D30562
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 304824
|
| |
|
|
|
|
|
|
|
|
|
| |
CodeGen uses MO_ExternalSymbol to represent the inline assembly strings.
Empty strings for symbol names appear to be invalid. For now just
special case the output code to avoid hitting an `assert()` in
`printLLVMNameWithoutPrefix()`.
This fixes https://llvm.org/PR33317
llvm-svn: 304815
|
| |
|
|
|
|
|
|
|
|
|
| |
Addition of a feature and a predicate used to control generation of madd.fmt
and similar instructions.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D33400
llvm-svn: 304801
|
| |
|
|
|
|
|
|
| |
non-temporal (PR32744)
Extension to D33728
llvm-svn: 304798
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D33890
llvm-svn: 304797
|