| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
If the loop condition was an i1 phi with a constantexpr input, this
would add a loop intrinsic fed by a phi dependent on a call to
if.break in the same block. Insert the call in the loop header.
llvm-svn: 298121
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move backend internal intrinsics along with the rest of the
normal intrinsics, and use the Intrinsic::getDeclaration
API instead of manually constructing the type list.
It's surprising this was working before. fdiv.fast had
the wrong number of parameters. The control flow intrinsic
declaration attributes were not being applied, and
their types were inconsistent. The actual IR use types
did not match the declaration, and were closer to the
types used for the patterns. The brcond lowering
was changing the types, so introduce new nodes for those.
llvm-svn: 298119
|
|
|
|
| |
llvm-svn: 298118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Use this code pattern when RAX is live, instead of emitting up to 2
billion adjustments:
pushq %rax
movabsq +-$Offset+-8, %rax
addq %rsp, %rax
xchg %rax, (%rsp)
movq (%rsp), %rsp
Try to clean this code up a bit while I'm here. In particular, hoist the
logic that handles the entire adjustment with `movabsq $imm, %rax` out
of the loop.
This negates the offset in the prologue and uses ADD because X86 only
has a two operand subtract which always subtracts from the destination
register, which can no longer be RSP.
Fixes PR31962
Reviewers: majnemer, sdardis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30052
llvm-svn: 298116
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted in the comment, we might want to account for this case,
but I didn't look at what that would mean for the asm.
I'm also not sure why this only reproduces with avx512, but I'm
putting a conservative fix in for now to avoid the crash.
Also, if both sides of an add are zexted, shouldn't we shrink that add?
https://bugs.llvm.org/show_bug.cgi?id=32316
llvm-svn: 298107
|
|
|
|
| |
llvm-svn: 298106
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loop unswitching can be extremely harmful for a SIMT target. In case
if hoisted condition is not uniform a SIMT machine will execute both
clones of a loop sequentially. Therefor LoopUnswitch checks if the
condition is non-divergent.
Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
not needed for non-SIMT targets a new option is added to avoid unneded
analysis initialization. The method getAnalysisUsage is called when
TargetTransformInfo is not yet available and we cannot use it here.
For that reason a new field DivergentTarget is added to PassManagerBuilder
to control the behavior and set this field from a target.
Differential Revision: https://reviews.llvm.org/D30796
llvm-svn: 298104
|
|
|
|
|
|
|
|
| |
This allows the optimization to rearrange loads and stores more aggressively.
Differential Revision: http://reviews.llvm.org/D30903
llvm-svn: 298092
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixing triple format in the tests added for the branch label fix for Thumb
Targets. Also recommitting previously approved patch, see
https://reviews.llvm.org/D30943.
Reviewed by: samparker
Differential Revision: https://reviews.llvm.org/D30987
llvm-svn: 298056
|
|
|
|
|
|
|
|
|
|
|
|
| |
regardless of whether +fma was added on the command line.
We weren't able to handle isel of the 128/256-bit FMA instructions when AVX512F was enabled but VLX and FMA weren't.
I didn't mask FeatureAVX512 imply FeatureFMA as I wasn't sure I wanted disabling FMA to also disable AVX512. Instead we just can't prevent FMA instructions if AVX512 is enabled.
Another option would be to promote 128/256-bit to 512-bit, do the operation and extract it. But that requires a lot of extra isel patterns. Since no CPUs exist that support AVX512, but not FMA just using the VEX instructions seems better.
llvm-svn: 298051
|
|
|
|
| |
llvm-svn: 298050
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If one of the subregs of the 128 bit reg is undefined when splitMove() splits
a store into two instructions, a use of an undefined physical register
results.
To remedy this, an implicit use of the super register is added onto both new
instructions, along with propagated kill and undef flags.
This was discovered with llvm-stress, and that test case is attached as
test/CodeGen/SystemZ/splitMove_undefReg_mverifier.ll
Thanks to Matthias Braun for helping with a nice explanation.
Review: Ulrich Weigand
llvm-svn: 298047
|
|
|
|
|
|
|
|
| |
FMA, AVX512 and no VLX.
We were giving priority if VLX was enabled.
llvm-svn: 298046
|
|
|
|
|
|
| |
This makes the values a little more consistent between similar instruction and reduces the values some. This results in better grouping in the isel table saving a few bytes.
llvm-svn: 298043
|
|
|
|
|
|
|
| |
associated command line options and functions - it's currently unused
in all of llvm and clang other than being set and reset.
llvm-svn: 298023
|
|
|
|
|
|
|
|
|
|
| |
This allows the optimization to rearrange loads and stores more
aggressively. This doesn't really affect performance, but it helps
codesize.
Differential Revision: https://reviews.llvm.org/D30839
llvm-svn: 298021
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch cleans the namespace of the Lanai target.
Reviewers: jpienaar
Reviewed By: jpienaar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30955
llvm-svn: 298015
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.
In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).
Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D31052
llvm-svn: 298010
|
|
|
|
|
|
|
|
|
|
|
|
| |
A recent change switch the in-memory wasm value types
to be signed integers, but I missing a few cases where
these were being writing to the binary.
Differential Revision: https://reviews.llvm.org/D31014
Patch by Sam Clegg
llvm-svn: 297991
|
|
|
|
|
|
|
|
|
|
| |
In fact this default implementation should be the only implementation,
keep it virtual for now to accomodate targets that don't model flags
correctly.
Differential Revision: https://reviews.llvm.org/D30747
llvm-svn: 297980
|
|
|
|
|
|
|
|
|
| |
Earlier stages of GlobalISel always use ConstantInt in G_CONSTANT so that's
what we should check for.
This fixes a crash introduced in r297782.
llvm-svn: 297968
|
|
|
|
| |
llvm-svn: 297959
|
|
|
|
|
|
|
|
|
|
| |
We can mark functions to always inline early in the opt. Since we do not have
call support this early inlining creates opportunities for inter-procedural
optimizations which would not occur otherwise.
Differential Revision: https://reviews.llvm.org/D31016
llvm-svn: 297958
|
|
|
|
| |
llvm-svn: 297920
|
|
|
|
| |
llvm-svn: 297915
|
|
|
|
| |
llvm-svn: 297913
|
|
|
|
|
|
| |
Prep work for PR31810
llvm-svn: 297876
|
|
|
|
|
|
|
| |
We're now able to select ADDWri thanks to the new complex pattern
support. Extend that to ADDXri.
llvm-svn: 297874
|
|
|
|
|
|
|
|
|
| |
computeKnownBits didn't handle fp_to_fp16 to report
the high bits as 0. ARM maps the generic node to an instruction
that does not modify the high bits of the register, so introduce
a target node where the high bits are known 0.
llvm-svn: 297873
|
|
|
|
|
|
|
|
|
|
|
| |
If we got unlucky with register allocation and actual constpool placement, we
could end up producing a tTBB_JT with an index that's already been clobbered.
Technically, we might be able to fix this situation up with a MOV, but I think
the constant islands pass is complex enough without having to deal with more
weird edge-cases.
llvm-svn: 297871
|
|
|
|
|
|
| |
Newline fixes, early return, range loops.
llvm-svn: 297865
|
|
|
|
|
|
|
|
|
|
|
| |
mfvrd and mffprd are both alias to mfvrsd.
This patch enables correct parsing of the aliases, but we still emit a mfvrsd.
Committing on behalf of brunoalr (Bruno Rosa).
Differential Revision: https://reviews.llvm.org/D29177
llvm-svn: 297849
|
|
|
|
| |
llvm-svn: 297846
|
|
|
|
|
|
| |
This reverts r297820 which apparently fails on A15 hosts.
llvm-svn: 297842
|
|
|
|
| |
llvm-svn: 297841
|
|
|
|
| |
llvm-svn: 297840
|
|
|
|
| |
llvm-svn: 297838
|
|
|
|
|
|
|
|
| |
Turns out it can happen, so the assertion was too harsh
Found during fuzz testing
llvm-svn: 297833
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for recognizing more patterns to match to DEXT and
CINS instructions.
It finds cases where multiple instructions could be replaced with a single
DEXT or CINS instruction.
For example, for the following:
define i64 @dext_and32(i64 zeroext %a) {
entry:
%and = and i64 %a, 4294967295
ret i64 %and
}
instead of generating:
0000000000000088 <dext_and32>:
88: 64010001 daddiu at,zero,1
8c: 0001083c dsll32 at,at,0x0
90: 6421ffff daddiu at,at,-1
94: 03e00008 jr ra
98: 00811024 and v0,a0,at
9c: 00000000 nop
the following gets generated:
0000000000000068 <dext_and32>:
68: 03e00008 jr ra
6c: 7c82f803 dext v0,a0,0x0,0x20
Cases that are covered:
DEXT:
1. and $src, mask where mask > 0xffff
2. zext $src zero extend from i32 to i64
CINS:
1. and (shl $src, pos), mask
2. shl (and $src, mask), pos
3. zext (shl $src, pos) zero extend from i32 to i64
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D30464
llvm-svn: 297832
|
|
|
|
| |
llvm-svn: 297824
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Different MCInstrAnalysis classes for arm and thumb mode, each with
their own evaluateBranch implementation. I added a test case and
fixed the coff-relocations test to use '<label>:' rather than
'<label>' in the CHECK-LABEL entries, since the ones without the
colon would match branch targets. Might be worth noticing that
llvm-objdump does not lookup the relocation and thus assigns it a
target depending on the encoded immediate which #0, so it thinks it
branches to the next instruction.
Committed on behalf of Andre Vieira (avieira).
Differential Revision: https://reviews.llvm.org/D30943
llvm-svn: 297821
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D30829
llvm-svn: 297820
|
|
|
|
|
|
|
|
|
|
|
| |
Enable the selection of the 64-bit signed multiply accumulate
instructions which operate on 16-bit operands. These are enabled for
ARMv5TE onwards for ARM and for V6T2 and other DSP enabled Thumb
architectures.
Differential Revision: https://reviews.llvm.org/D30044
llvm-svn: 297809
|
|
|
|
|
|
| |
instruction selector from being declared.
llvm-svn: 297786
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds a new kind of MachineOperand: MO_Placeholder.
This operand must not appear in the MIR and only exists as a way of
creating an 'uninitialized' operand until a matcher function overwrites it.
Depends on D30046, D29712
Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet
Reviewed By: qcolombet
Subscribers: dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D30089
llvm-svn: 297782
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering.
ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns.
At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom.
Differential Revision: https://reviews.llvm.org/D29639
llvm-svn: 297780
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were using the encoded LEB hex values
for the value types. This change uses the decoded
negative value and the LEB encoder to write them out.
Differential Revision: https://reviews.llvm.org/D30847
Patch by Sam Clegg
llvm-svn: 297777
|
|
|
|
|
|
|
| |
Make MCSectionELF::AssociatedSection be a link to a symbol, because
that's how it works in the assembly, and use it in the asm printer.
llvm-svn: 297769
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D30794
llvm-svn: 297768
|
|
|
|
|
|
| |
This fixes llvm.org/PR32265.
llvm-svn: 297745
|