| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
| |
TableGen had been nicely generating code to print a number of instructions using
shorter aliases (and PowerPC has plenty of short mnemonics), but we were not
calling it. For some of the aliases we support in the parser, TableGen can't
infer the "inverse" alias relationship, so there is still more to do.
Thus, after some hours of updating test cases...
llvm-svn: 235616
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants. This patch adds nvcast rules with v4f16 and v8f16 values.
AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
patterns generate the nvcast nodes.
Reviewers: jmolloy, srhines, ab
Subscribers: rengolin, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D9201
llvm-svn: 235610
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32,
v8i8, v8i16 inputs to allow promotion of v4f16 results.
Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32,
and i64 vectors. Only missing tests are for v16i8 and v16i16 as the
shift operations are too complicated to write a proper check sequence.
The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
is split and the conversion gets handled while lowering v2i64. I am
adding a test here for completeness.
Reviewers: aemerson, rengolin, ab, jmolloy, srhines
Subscribers: rengolin, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D9166
llvm-svn: 235609
|
| |
|
|
|
|
|
|
|
|
|
| |
building binary tree (PR22262)
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.
llvm-svn: 235608
|
| |
|
|
| |
llvm-svn: 235603
|
| |
|
|
|
|
| |
As suggested in the review for http://reviews.llvm.org/D8537.
llvm-svn: 235601
|
| |
|
|
|
|
| |
builds using MSVC's STL. The iterator is being used outside of its valid range.
llvm-svn: 235597
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tree (PR22262)
This is a re-commit of r235101, which also fixes the problems with the previous patch:
- Switches with only a default case and non-fallthrough were handled incorrectly
- The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here.
> This is a major rewrite of the SelectionDAG switch lowering. The previous code
> would lower switches as a binary tre, discovering clusters of cases
> suitable for lowering by jump tables or bit tests as it went along. To increase
> the likelihood of finding jump tables, the binary tree pivot was selected to
> maximize case density on both sides of the pivot.
>
> By not selecting the pivot in the middle, the binary trees would not always
> be balanced, leading to performance problems in the generated code.
>
> This patch rewrites the lowering to search for clusters of cases
> suitable for jump tables or bit tests first, and then builds the binary
> tree around those clusters. This way, the binary tree will always be balanced.
>
> This has the added benefit of decoupling the different aspects of the lowering:
> tree building and jump table or bit tests finding are now easier to tweak
> separately.
>
> For example, this will enable us to balance the tree based on profile info
> in the future.
>
> The algorithm for finding jump tables is quadratic, whereas the previous algorithm
> was O(n log n) for common cases, and quadratic only in the worst-case. This
> doesn't seem to be major problem in practice, e.g. compiling a file consisting
> of a 10k-case switch was only 30% slower, and such large switches should be rare
> in practice. Compiling e.g. gcc.c showed no compile-time difference. If this
> does turn out to be a problem, we could limit the search space of the algorithm.
>
> This commit also disables all optimizations during switch lowering in -O0.
>
> Differential Revision: http://reviews.llvm.org/D8649
llvm-svn: 235560
|
| |
|
|
| |
llvm-svn: 235552
|
| |
|
|
| |
llvm-svn: 235535
|
| |
|
|
|
|
|
|
|
|
|
|
| |
liveness. NFC.
The CondOpt pass currently uses LiveIntervals to set the dead flag on a def. This patch uses MachineRegisterInfo::use_empty instead as that is equivalent to the def being dead.
This removes an instance of LiveIntervals in the pass manager pipeline and saves 3.8% of compile time on llc conpiled for AArch64.
Reviewed by Chad Rosier and Zhaoshi.
llvm-svn: 235532
|
| |
|
|
| |
llvm-svn: 235529
|
| |
|
|
| |
llvm-svn: 235525
|
| |
|
|
|
|
| |
No test since calls are not actually supported yet.
llvm-svn: 235524
|
| |
|
|
|
|
|
|
| |
- Use static allocation for aligned stack objects.
- Simplify dynamic stack object allocation.
- Simplify elimination of frame-indices.
llvm-svn: 235521
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D7296
llvm-svn: 235517
|
| |
|
|
| |
llvm-svn: 235516
|
| |
|
|
| |
llvm-svn: 235514
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nodes (PR23259).
This fixes a regression introduced at revision 218263.
On AVX, if we optimize for size, a splat build_vector of a load
is lowered into a VBROADCAST node. This is done even if the value type of the
splat build_vector node is v2i64.
Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two
extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast
where the scalar element comes from a loadi64/loadf64.
However, revision 218263 forgot to add an extra fallback pattern for the case
where we have a X86VBroadcast of a loadi64 with multiple uses.
This patch adds the missing tablegen pattern in X86InstrSSE.td.
This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel
doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern
to select v2i64 X86ISD::BROADCAST nodes.
llvm-svn: 235509
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D8661
llvm-svn: 235505
|
| |
|
|
|
|
|
| |
This reverts commit r235194. It was causing a failure in FastISel buildbots
due to sign-extension issues.
llvm-svn: 235495
|
| |
|
|
|
|
|
|
| |
Enough concerns were raised that this optimization is pessimising some code patterns.
The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher.
llvm-svn: 235491
|
| |
|
|
|
|
|
|
|
|
|
|
| |
X86 backend.
The code generated for symbolic targets is identical to the code generated for
constant targets, except that a relocation is emitted to fix up the actual
target address at link-time. This allows IR and object files containing
patchpoints to be cached across JIT-invocations where the target address may
change.
llvm-svn: 235483
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system
even though 64-bit integers are not legal types.
So instead of producing this:
pshufd $229, %xmm0, %xmm1 ## xmm1 = xmm0[1,1,2,3]
movd %xmm0, (%eax)
movd %xmm1, 4(%eax)
We can do:
movq %xmm0, (%eax)
This is a fix for the problem noted in D7296.
Differential Revision: http://reviews.llvm.org/D9134
llvm-svn: 235460
|
| |
|
|
| |
llvm-svn: 235418
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With D9096 and D9101, there's no need to run DCE after SLSR and
SeparateConstOffsetFromGEP.
Test Plan: no regression
Reviewers: jholewinski, meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D9172
llvm-svn: 235415
|
| |
|
|
|
|
|
|
|
| |
There doesn't seem to be a reason to perform this target ISD node matching
in an DAGCombine, moving it to lowering fixes PR23296.
Differential Revision: http://reviews.llvm.org/D9137
llvm-svn: 235394
|
| |
|
|
|
|
| |
fixed encoding of VPMOVM2x.
llvm-svn: 235385
|
| |
|
|
| |
llvm-svn: 235383
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This directive is exactly the same as .asciz, except it's only used by MIPS.
It is used to store null terminated strings in object files.
Reviewers: rafael, dsanders, echristo
Reviewed By: dsanders, echristo
Subscribers: echristo, llvm-commits
Differential Revision: http://reviews.llvm.org/D7530
llvm-svn: 235382
|
| |
|
|
|
|
|
|
| |
Implement CACHE and PREF instructions using mapping.
Differential Revision: http://reviews.llvm.org/D8893
llvm-svn: 235379
|
| |
|
|
|
|
|
|
| |
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D7947
llvm-svn: 235377
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The 64-bit version of the variable shift instructions uses the
shift_rotate_reg class which uses a GPR32Opnd to specify the variable
shift amount. With this patch we avoid the generation of a redundant
SLL instruction for the variable shift instructions in 64-bit targets.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7413
llvm-svn: 235376
|
| |
|
|
|
|
| |
by Asaf Badouh (asaf.badouh@intel.com)
llvm-svn: 235375
|
| |
|
|
|
|
|
|
|
|
| |
This is an updated version of Chandler's patch D7402 that got accepted but never committed, and has bit-rotted a bit since.
I've updated the execution domain declarations to match the approach of the packed templates and also added some extra scalar unary tests.
Differential Revision: http://reviews.llvm.org/D9095
llvm-svn: 235372
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
X86ISD::ADDSUB, X86ISD::(F)HADD, X86ISD::(F)HSUB should not be selected
if the operand types do not match the result type because vector type
legalization cannot deal with this for custom nodes.
Testcase X86ISD::ADDSUB is attached. I could not create a testcase for
the FHADD/FHSUB cases because of: https://llvm.org/bugs/show_bug.cgi?id=23296
Differential Revision: http://reviews.llvm.org/D9120
llvm-svn: 235367
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Set operation action for FP16 conversion opcodes, so the Op legalizer
can choose the gnu_* libcalls for Mips.
Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to
prevent (fpext (load )) and (store (fptrunc)) from getting combined into
unsupported operations.
Added test cases to test that these operations are handled correctly
for f16 scalars and vectors. This patch depends on
http://reviews.llvm.org/D8755.
Reviewers: srhines
Subscribers: llvm-commits, ab
Differential Revision: http://reviews.llvm.org/D8804
llvm-svn: 235341
|
| |
|
|
|
|
|
|
| |
Implement BITSWAP instruction using mapping.
Differential Revision: http://reviews.llvm.org/D8857
llvm-svn: 235321
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Patch by: John Brawn
Reviewers: jmolloy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9105
llvm-svn: 235314
|
| |
|
|
| |
llvm-svn: 235310
|
| |
|
|
| |
llvm-svn: 235309
|
| |
|
|
|
|
|
|
| |
Implement disassembler support for microMIPS32r6.
Differential Revision: http://reviews.llvm.org/D8490
llvm-svn: 235307
|
| |
|
|
|
|
|
|
| |
This patch implements BALC and BC instructions using mapping.
Differential Revision: http://reviews.llvm.org/D8388
llvm-svn: 235302
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D8387
llvm-svn: 235298
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D8386
llvm-svn: 235296
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
conversion (PR23273).
This fixes a regression introduced at revision 231243.
The target-independent selection algorithm in FastISel knows how to select
a SINT_TO_FP if the target is SSE but not AVX. That is because on X86, the
tablegen'd 'fastEmit' functions know how to select CVTSI2SSrr and CVTSI2SDrr.
Method X86FastISel::X86SelectSIToFP was therefore working under the
wrong assumption that the target was AVX. That assumption was incorrect since
we can have a target that is neither AVX nor SSE.
So, rather than asserting for the presence of AVX, we should have had an
early exit from 'X86SelectSIToFP' if the target was not AVX.
This patch fixes the issue replacing the invalid assertion with an early exit.
Thanks to Dimitry Andric for reporting this problem and for providing a small
reproducible testcase. Added test pr23273.ll.
llvm-svn: 235295
|
| |
|
|
|
|
|
|
|
|
| |
requiring truncation.
The fix ensures that scalar sources inserted into a vector are the correct bit size.
Integer scalar sources from BUILD_VECTOR and SCALAR_TO_VECTOR nodes may require truncation that this function doesn't currently support.
llvm-svn: 235281
|
| |
|
|
| |
llvm-svn: 235262
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The result is either an Untyped reg sequence, on ldN with N > 1, or
just the type of the input vector, on ld1. Don't force Untyped.
Instead, just use the type of the reg sequence.
This mirrors the behavior of createTuple, which feeds the LD1*_POST.
The narrow code path wasn't actually covered by tests, because V64
insert_vector_elt are widened to V128 before the LD1LANEpost combine
has the chance to run, usually.
The only case where it does run on V64 vectors is if the vector ops
legalizer ran. So, tickle the code with a ctpop.
Fixes PR23265.
llvm-svn: 235243
|
| |
|
|
|
|
|
|
| |
They would break the SelectionDAG.
Note that the opposite load->vector dependency is already obvious in:
(LD1*post vec, ..)
llvm-svn: 235224
|