| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is
broken due to 2 separate bugs:
1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing
rules to expand them to non pseudo MIR. These are selected for
ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD.
2) Selection of the right VLD/VST instruction is broken for load and
store of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called
with MIR opcode for fixed writeback (ie increment is access size)
and call getVLDSTRegisterUpdateOpcode() to select an opcode with
register writeback if base register update is of a different size.
Since getVLDSTRegisterUpdateOpcode() only knows about
VLD1/VLD2/VST1/VST2 the call is currently conditional on the number
of element in the vector.
However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for
load and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not
updated which later lead to a fixed writeback instruction being
constructed with an extra operand for the register writeback.
This patch addresses the two issues as follows:
- it adds the necessary mapping from VLD1d64TPseudoWB_register and
VLD1d64QPseudoWB_register to VLD1d64Twb_register and
VLD1d64Qwb_register respectively. Like for the existing _fixed
variants, the cost of these is bumped for unaligned access.
- it changes the logic in SelectVLD and SelectVSD to call isVLDfixed
and isVSTfixed respectively to decide whether the opcode should be
updated. It also reworks the logic and comments for pushing the
writeback offset operand and r0 operand to clarify the logic:
writeback offset needs to be pushed if it's a register writeback,
r0 needs to be pushed if not and the instruction is a
VLD1/VLD2/VST1/VST2.
Reviewers: rengolin, t.p.northover, samparker
Reviewed By: samparker
Patch by Thomas Preud'homme <thomas.preudhomme@arm.com>
Differential Revision: https://reviews.llvm.org/D42970
llvm-svn: 326570
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.
Differential Revision: https://reviews.llvm.org/D42999
llvm-svn: 326341
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves
In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.
Patch by Martin Svanfeldt
Reviewers: fhahn, pbarrio, rogfer01
Reviewed By: pbarrio, rogfer01
Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42574
llvm-svn: 326333
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We were always setting the block alignment to 2 bytes in Thumb mode
and 4-bytes in ARM mode (r325754, and r325012), but this could cause
reducing the block alignment when it already had been aligned (e.g.
in Thumb mode when the block is a CPE that was already 4-byte aligned).
Patch by Momchil Velikov, I've only added a test.
Differential Revision: https://reviews.llvm.org/D43777
llvm-svn: 326232
|
| |
|
|
|
|
|
|
| |
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.
llvm-svn: 326208
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r322867, we introduced IsStandalone when printing MIR in -debug
output. The default behaviour for that was:
1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any
redundant information.
2) When -debug-printing a MF entirely, don't print any redundant
information.
3) When printing MIR, don't print any redundant information.
I'd like to change 2) to:
2) When -debug-printing a MF entirely, don't omit any redundant information.
Differential Revision: https://reviews.llvm.org/D43337
llvm-svn: 326094
|
| |
|
|
|
|
|
| |
This recommits r325754; the modified and failing test case
actually didn't need any modifications.
llvm-svn: 325765
|
| |
|
|
| |
llvm-svn: 325756
|
| |
|
|
| |
llvm-svn: 325755
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a follow up of r325012, that allowed half types in constant pools.
Proper alignment was enforced when a big basic block was split up, but not when
a CPE was placed before/after a block; the successor block had the wrong
alignment.
Differential Revision: https://reviews.llvm.org/D43580
llvm-svn: 325754
|
| |
|
|
|
|
|
|
| |
This case wasn't handled yet.
Differential Revision: https://reviews.llvm.org/D43508
llvm-svn: 325616
|
| |
|
|
| |
llvm-svn: 325512
|
| |
|
|
|
|
|
|
|
| |
Follow up of Clang commit r325351; this adds the LLVM tests, which
were also missing.
Differential Revision: https://reviews.llvm.org/D43395
llvm-svn: 325443
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r323991.
This commit breaks target that don't model all the register constraints
in TableGen. So far the workaround was to set the
hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the
cases.
For instance, when mutating an instruction (like in the lowering of
COPYs) the isRenamable flag is not properly updated. The same problem
will happen when attaching machine operand from one instruction to
another.
Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.
llvm-svn: 325421
|
| |
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Eli Friedman
llvm-svn: 325327
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently when expanding a SETCC node into a SELECT_CC, LLVM uses
an incorrect type for determining BooleanContent of the result. This
patch fixes the issue.
Fixes PR36079.
Reviewers: rogfer01, javed.absar, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43282
llvm-svn: 325325
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form
a != b ? x : 0
a == b ? 0 : x
and that currently (e.g. in Thumb1) are emitted as branches.
Differential Revision: https://reviews.llvm.org/D34515
llvm-svn: 325323
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fix silly mistake in a test
Reviewers: gkistanova, apilipenko
Subscribers: javed.absar, eraman, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D43342
llvm-svn: 325283
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In LLVM, 't' selects a floating-point/SIMD register and only supports
32-bit values. This is appropriately documented in the LLVM Language
Reference Manual. However, this behaviour diverges from that of GCC, where
't' selects the s0-s31 registers and its qX and dX variants depending on
additional operand modifiers (q/P).
For example, the following C code:
#include <arm_neon.h>
float32x4_t a, b, x;
asm("vadd.f32 %0, %1, %2" : "=t" (x) : "t" (a), "t" (b))
results in the following assembly if compiled with GCC:
vadd.f32 s0, s0, s1
whereas LLVM will show "error: couldn't allocate output register for
constraint 't'", since a, b, x are 128-bit variables, not 32-bit.
This patch extends the use of 't' to mean that of GCC, thus allowing
selection of the lower Q vector regs and their D/S variants. For example,
the earlier code will now compile as:
vadd.f32 q0, q0, q1
This behaviour still differs from that of GCC but I think it is actually
more correct, since LLVM picks up the right register type based on the
datatype of x, while GCC would need an extra operand modifier to achieve
the same result, as follows:
asm("vadd.f32 %q0, %q1, %q2" : "=t" (x) : "t" (a), "t" (b))
Since this is only an extension of functionality, existing code should not
be affected by this change. Note that operand modifiers q/P are already
supported by LLVM, so this patch should suffice to support inline
assembly with constraint 't' originally built for GCC.
Reviewers: grosbach, rengolin
Reviewed By: rengolin
Subscribers: rogfer01, efriedma, olista01, aemerson, javed.absar, eraman, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42962
llvm-svn: 325244
|
| |
|
|
|
|
|
|
| |
This adds f16 VCMP match rules and fixes the test cases.
Differential Revision: https://reviews.llvm.org/D43291
llvm-svn: 325228
|
| |
|
|
|
|
|
|
| |
This adds support for handling f16 stack spills/reloads.
Differential Revision: https://reviews.llvm.org/D43280
llvm-svn: 325130
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Old syntax:
BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2
* %r0 = SOME_OP %r2
* %r1 = ANOTHER_OP internal %r0
New syntax:
BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 {
%r0 = SOME_OP %r2
%r1 = ANOTHER_OP internal %r0
}
llvm-svn: 325032
|
| |
|
|
|
|
|
|
|
|
|
| |
Change ARMConstantIslandPass to:
- accept f16 literals as litpool entries,
- if the litpool needs to be inserted in the middle of a big block, then we
need to 4-byte align the next instruction in ARM mode.
Differential Revision: https://reviews.llvm.org/D42784
llvm-svn: 325012
|
| |
|
|
|
|
|
|
| |
This addressing mode wasn't checked, so we were running in an assert.
Differential Revision: https://reviews.llvm.org/D43179
llvm-svn: 324996
|
| |
|
|
|
|
|
|
|
| |
If merging them, the dllexport attribute needs to be brought along
to the new GlobalAlias.
Differential Revision: https://reviews.llvm.org/D43192
llvm-svn: 324937
|
| |
|
|
|
|
|
|
|
|
|
| |
Add a common -trap-unreachable option, similar to the target
specific hexagon equivalent, which has been replaced. This
turns unreachable instructions into traps, which is useful for
debugging.
Differential Revision: https://reviews.llvm.org/D42965
llvm-svn: 324880
|
| |
|
|
|
|
|
| |
D43141 proposes to correct undef folding in the DAG,
and this test would not survive that change.
llvm-svn: 324814
|
| |
|
|
|
|
|
|
|
|
|
| |
* Use uleb128 for code offsets in the LSDA call site table.
* Omit the TTBase offset if the type table is empty.
This change can reduce the size of the DWARF/Itanium LSDA by about half.
Patch by Ryan Prichard!
llvm-svn: 324750
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rely on the assembler to finalize the layout of the DWARF/Itanium
exception-handling LSDA. Rather than calculate the exact size of each
thing in the LSDA, use assembler directives:
To emit the offset to the TTBase label:
.uleb128 .Lttbase0-.Lttbaseref0
.Lttbaseref0:
To emit the size of the call site table:
.uleb128 .Lcst_end0-.Lcst_begin0
.Lcst_begin0:
... call site table entries ...
.Lcst_end0:
To align the type info table:
... action table ...
.balign 4
.long _ZTIi
.long _ZTIl
.Lttbase0:
Using assembler directives simplifies the compiler and allows switching
the encoding of offsets in the call site table from udata4 to uleb128 for
a large code size savings. (This commit does not change the encoding.)
The combination of the uleb128 followed by a balign creates an unfortunate
dependency cycle that the assembler must sometimes resolve either by
padding an LEB or by inserting zero padding before the type table. See
PR35809 or GNU as bug 4029.
Patch by Ryan Prichard!
llvm-svn: 324749
|
| |
|
|
|
|
|
|
|
|
|
| |
Instead of:
Successors according to CFG: %bb.6(0x12492492 / 0x80000000 = 14.29%)
print:
successors: %bb.6(0x12492492); %bb.6(14.29%)
llvm-svn: 324685
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of:
%bb.1: derived from LLVM BB %for.body
print:
bb.1.for.body:
Also use MIR syntax for MBB attributes like "align", "landing-pad", etc.
llvm-svn: 324563
|
| |
|
|
|
|
|
|
|
| |
This is a follow up of r324321, adding a match pattern for mov with a FP16
immediate (also fixing operand vfp_f16imm that wasn't even compiling).
Differential Revision: https://reviews.llvm.org/D42973
llvm-svn: 324456
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following up on the discussion from
http://lists.llvm.org/pipermail/llvm-dev/2017-April/112305.html, undef
values are now placed in the .bss as well as null values. This prevents
undef global values taking up potentially huge amounts of space in the
.data section.
The following two lines now both generate equivalent .bss data:
@vals1 = internal unnamed_addr global [20000000 x i32] zeroinitializer, align 4
@vals2 = internal unnamed_addr global [20000000 x i32] undef, align 4 ; previously unaccounted for
This is primarily motivated by the corresponding issue in the Rust
compiler (https://github.com/rust-lang/rust/issues/41315).
Differential Revision: https://reviews.llvm.org/D41705
Patch by varkor!
llvm-svn: 324424
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See D42509 for the original version of this.
Basically, there are two significant changes to behavior here:
- addLiveOuts always adds all pristine registers (even if a block has
no successors).
- addLiveOuts and addLiveOutsNoPristines always add all callee-saved
registers for return blocks (including conditional return blocks).
I cleaned up the functions a bit to make it clear these properties hold.
Differential Revision: https://reviews.llvm.org/D42655
llvm-svn: 324422
|
| |
|
|
|
|
|
|
|
| |
This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion
match patterns.
Differential Revision: https://reviews.llvm.org/D42954
llvm-svn: 324360
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds most of the FP16 codegen support, but these areas need further work:
- FP16 literals and immediates are not properly supported yet (e.g. literal
pool needs work),
- Instructions that are generated from intrinsics (e.g. vabs) haven't been
added.
This will be addressed in follow-up patches.
Differential Revision: https://reviews.llvm.org/D42849
llvm-svn: 324321
|
| |
|
|
| |
llvm-svn: 324074
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Example situation:
```
BB0:
%0 = ...
use %0
; ...
condjump BB1
jmp BB2
BB1:
%0 = ... ; rematerialized def from above (from earlier split step)
jmp BB2
BB2:
; ...
use %0
```
%0 will have a live interval with 3 value numbers (for the BB0, BB1 and
BB2 parts). Now SplitKit tries and succeeds in rematerializing the value
number in BB2 (This only works because it is a secondary split so
SplitKit is can trace this back to a single original def).
We need to recompute all live ranges affected by a value number that we
rematerialize. The case that we missed before is that when the value
that is rematerialized is at a join (Phi VNI) then we also have to
recompute liveness for the predecessor VNIs.
rdar://35699130
Differential Revision: https://reviews.llvm.org/D42667
llvm-svn: 324039
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.
This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers is such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).
This change is a continuation of the work started in D30751.
Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar
Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits
Differential Revision: https://reviews.llvm.org/D41835
llvm-svn: 323991
|
| |
|
|
|
|
|
|
|
|
| |
Commit r323512 introduced an optimisation in LowerReturn for half-precision
return values. A missing check caused a crash when the return value is "undef"
(i.e. a node that has no operands).
Differential Revision: https://reviews.llvm.org/D42743
llvm-svn: 323968
|
| |
|
|
|
|
|
|
|
|
| |
bit-operations"
Miscompiles code. Testcase pending.
This reverts commit r323869.
llvm-svn: 323929
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html
In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.
llvm-svn: 323922
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves
In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.
Patch by Marten Svanfeldt.
Reviewers: fhahn, pbarrio
Reviewed By: pbarrio
Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42574
llvm-svn: 323869
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Half-precision arguments and return values are passed as if it were an int or
float for ARM. This results in truncates and bitcasts to/from i16 and f16
values, which are legalized very early to stack stores/loads. When FullFP16 is
enabled, we want to avoid codegen for these bitcasts as it is unnecessary and
inefficient.
Differential Revision: https://reviews.llvm.org/D42580
llvm-svn: 323861
|
| |
|
|
|
|
| |
These are handled by the TableGen'erated code.
llvm-svn: 323732
|
| |
|
|
|
|
|
| |
Straightforward mapping (integer operand to GPR, floating point operand
to FPR).
llvm-svn: 323731
|
| |
|
|
|
|
|
|
| |
Legal if we have hardware support, libcall otherwise.
Also add supporting code to the legalizer helper for libcalls.
llvm-svn: 323730
|
| |
|
|
|
|
| |
The work is done by the TableGen'erated code.
llvm-svn: 323728
|
| |
|
|
|
|
|
| |
Straightforward mapping (integer operand goes to GPR, floating point
operand goes to FPR).
llvm-svn: 323727
|
| |
|
|
|
|
|
|
|
| |
Legal if we have hardware support for floating point, libcalls
otherwise.
Also add the necessary support for libcalls in the legalizer helper.
llvm-svn: 323726
|