| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
insert_subvectors and extract_subvector sequences to remove extra zeroing.wq
llvm-svn: 324791
|
| |
|
|
|
|
|
|
| |
filling upper elements with zero. Replace with insert_subvector.
There's still some extra kshifts in one of the modified test cases here, but hopefully that's only a DAG combine away.
llvm-svn: 324782
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a wasm-import-module function attribute and a .import_module
assembler directive, for specifying module import names for WebAssembly.
Currently these may only be used for function symbols; global variables
may be considered in the future.
WebAssembly has a two-level namespace scheme for symbols, and it's
normally the linker's job to assign the module name, which is the
first-level name. The attributes here allow users to specify their
own module names explicitly, which is useful for tools generating
bindings to modules defined in other languages.
This feature is not fully usable yet. It will evolve along with the
ongoing symbol table and lld changes.
Differential Revision: https://reviews.llvm.org/D42520
llvm-svn: 324778
|
| |
|
|
|
|
| |
Fixes http://llvm.org/PR36320.
llvm-svn: 324763
|
| |
|
|
|
|
|
| |
These are incomplete and were made redundant with the consolidation in:
https://reviews.llvm.org/rL324678
llvm-svn: 324754
|
| |
|
|
|
|
|
|
|
|
|
| |
* Use uleb128 for code offsets in the LSDA call site table.
* Omit the TTBase offset if the type table is empty.
This change can reduce the size of the DWARF/Itanium LSDA by about half.
Patch by Ryan Prichard!
llvm-svn: 324750
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rely on the assembler to finalize the layout of the DWARF/Itanium
exception-handling LSDA. Rather than calculate the exact size of each
thing in the LSDA, use assembler directives:
To emit the offset to the TTBase label:
.uleb128 .Lttbase0-.Lttbaseref0
.Lttbaseref0:
To emit the size of the call site table:
.uleb128 .Lcst_end0-.Lcst_begin0
.Lcst_begin0:
... call site table entries ...
.Lcst_end0:
To align the type info table:
... action table ...
.balign 4
.long _ZTIi
.long _ZTIl
.Lttbase0:
Using assembler directives simplifies the compiler and allows switching
the encoding of offsets in the call site table from udata4 to uleb128 for
a large code size savings. (This commit does not change the encoding.)
The combination of the uleb128 followed by a balign creates an unfortunate
dependency cycle that the assembler must sometimes resolve either by
padding an LEB or by inserting zero padding before the type table. See
PR35809 or GNU as bug 4029.
Patch by Ryan Prichard!
llvm-svn: 324749
|
| |
|
|
|
|
| |
This reverts r324494 and reapplies r324487.
llvm-svn: 324747
|
| |
|
|
|
|
| |
Additionally, simplify the rest of the argument/parameter lowering code.
llvm-svn: 324737
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
r314974 introduced insertion of DEBUG_VALUEs after
each redefinition of debug value register in the slot index range.
In case the instruction redefining the debug value register
was a terminator, machine verifier would complain since it
enforces the rule of no non-terminator instructions
following the first terminator.
Differential Revision: https://reviews.llvm.org/D42801
llvm-svn: 324734
|
| |
|
|
|
|
|
|
|
|
| |
When adding operands to machine instructions in case of
RegisterSDNodes, generate a COPY node in case the register class
does not match the one in the instruction definition.
Differental Revision: https://reviews.llvm.org/D35561
llvm-svn: 324733
|
| |
|
|
|
|
|
| |
Repurpose this previously XFAIL'd test to check that jalr uses $25
as per ABI requirements for PIC code.
llvm-svn: 324729
|
| |
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Martin Storsjö
llvm-svn: 324720
|
| |
|
|
|
|
|
|
| |
lowerV2X128VectorShuffle
Previously we extracted two subvectors and concatenate. But the concatenate will be lowered to two insert subvectors. Then DAG combine will merge once of the inserts and one of the extracts back into the original vector. We might as well just directly use one extract and one insert.
llvm-svn: 324710
|
| |
|
|
|
|
|
|
| |
vector.
This regresses a couple cases in the shuffle combining test. But those cases use intrinsics that InstCombine knows how to turn into a generic shuffle earlier. This should give opportunities to fold this earlier in InstCombine or DAG combine.
llvm-svn: 324709
|
| |
|
|
|
|
|
|
| |
zeros in the upper portion.
We should recognize this and just use a mov that will zero the upper bits.
llvm-svn: 324708
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add verification for copies involving generic registers if they are
compatible - ie if it is a generic copy, then the types are the
same, and if a COPY b/w generic and target virtual register, then
the sizes should be the same. Only checks if there are no sub registers
involved for now.
https://reviews.llvm.org/D37775
llvm-svn: 324696
|
| |
|
|
|
|
|
|
|
|
|
| |
Instead of:
Live Ins: %r0 %r1
print:
liveins: %r0, %r1
llvm-svn: 324694
|
| |
|
|
|
|
| |
sequences.
llvm-svn: 324693
|
| |
|
|
|
|
|
|
|
|
|
| |
Instead of:
Successors according to CFG: %bb.6(0x12492492 / 0x80000000 = 14.29%)
print:
successors: %bb.6(0x12492492); %bb.6(14.29%)
llvm-svn: 324685
|
| |
|
|
|
|
|
| |
As was already shown in the div/rem tests and noted in PR36305,
the behavior is inconsistent, but it's not limited to div/rem only.
llvm-svn: 324678
|
| |
|
|
|
|
|
|
| |
Emitting the correct (root of compilation) file at index 0 will be
posted for review later; I wanted to get this minor change out of the
way first.
llvm-svn: 324669
|
| |
|
|
|
|
| |
Column limit, typo, unnecessary reference
llvm-svn: 324666
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch essentially makes sure that X86CallLowering adds proper
G_COPY/G_TRUNC and G_ANYEXT/G_COPY when we are doing lowering of
arguments/returns for floating point values passed on registers.
Tests are updated accordingly
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D42287
llvm-svn: 324665
|
| |
|
|
|
|
|
|
|
|
|
| |
The patch is a split from D42287 and is related to
fixing failures after https://reviews.llvm.org/D37775
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D42287
llvm-svn: 324664
|
| |
|
|
|
|
|
|
| |
X, C)))->(vXi1 (and/or/xor (bitcast X), (bitcast C)) where C is a constant build_vector.
Most vxi1 constant build vectors have to be implemented in the scalar domain anyway so we'll probably end up with a cast there later. But by then its too late to do the combine to get rid of it.
llvm-svn: 324662
|
| |
|
|
|
|
| |
build_vector into a scalar integer.
llvm-svn: 324661
|
| |
|
|
| |
llvm-svn: 324646
|
| |
|
|
|
|
|
|
| |
This allows the register name to be printed without the leading '%'.
This can be used for emitting calls to the retpoline thunks from inline
asm.
llvm-svn: 324645
|
| |
|
|
| |
llvm-svn: 324638
|
| |
|
|
| |
llvm-svn: 324637
|
| |
|
|
|
|
|
|
| |
One of them shows a missed opportunity to use SimplifyDemandedBits on the condition when its used by multiple vselects.
The other is a case we shouldn't optimize because the condition has a non-vselect use.
llvm-svn: 324630
|
| |
|
|
|
|
| |
Needed for checking current code generation.
llvm-svn: 324601
|
| |
|
|
| |
llvm-svn: 324585
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instructions affected:
mthc1, mfhc1, add.d, sub.d, mul.d, div.d,
mov.d, neg.d, cvt.w.d, cvt.d.s, cvt.d.w, cvt.s.d
These instructions are now defined for
microMIPS32r3 + microMIPS32r6 in MicroMipsInstrFPU.td
since they shared their encoding with those already defined
in microMIPS32r6InstrInfo.td and have been therefore
removed from the latter file.
Some instructions present in MicroMipsInstrFPU.td which
did not have both AFGR64 and FGR64 variants defined have
been altered to do so.
Differential revision: https://reviews.llvm.org/D42738
llvm-svn: 324584
|
| |
|
|
| |
llvm-svn: 324583
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were generating "fmov h0, wzr" instructions when FullFP16 is not enabled.
I've not added any tests, because the problem was visible in:
test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll,
which I had to change: I don't think Cyclone has FullFP16 enabled
by default, so it shouldn't be using this v8.2a instruction.
I've also removed these rdar tags, please shout if there are any objections.
Differential Revision: https://reviews.llvm.org/D43020
llvm-svn: 324581
|
| |
|
|
|
|
|
|
| |
compare of a bitcast from vXi1.
This should allow us to remove the kortest intrinsic from IR and use compare+bitcast+or in IR instead.
llvm-svn: 324580
|
| |
|
|
|
|
|
|
| |
The KTEST instruction sets the C flag if the result of anding both operands together is all 1s. We can use this to lower (icmp eq/ne (bitcast (vXi1 X), -1)
Differential Revision: https://reviews.llvm.org/D42772
llvm-svn: 324577
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
KTEST has weird flag behavior. The Z flag is set for all bits in the AND of the k-registers being 0, and the C flag is set for all bits being 1. All other flags are cleared.
We currently emit this instruction in EmitTEST and don't check the condition code. This can lead to strange things like using the S flag after a KTEST for a signed compare.
The domain reassignment pass can also transform TEST instructions into KTEST and is not protected against the flag usage either. For now I've disabled this part of the domain reassignment pass. I tried to comment out the checks in the mir test so that we could recover them later, but I couldn't figure out how to get that to work.
This patch moves the KTEST handling into LowerSETCC and now creates a ktest+x86setcc. I've chosen this approach because I'd like to add support for the C flag for all ones in a followup patch. To do that requires that I can rewrite the condition code going in the x86setcc to be different than the original SETCC condition code.
This fixes PR36182. I'll file a PR to fix domain reassignment once this goes in. Should this be merged to 6.0?
Reviewers: spatel, guyblank, RKSimon, zvi
Reviewed By: guyblank
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42770
llvm-svn: 324576
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of:
%bb.1: derived from LLVM BB %for.body
print:
bb.1.for.body:
Also use MIR syntax for MBB attributes like "align", "landing-pad", etc.
llvm-svn: 324563
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LowerSELECT_CC is not generating optimal Select_Ri pattern at the moment. It
is not guaranteed to place ConstantNode at RHS which would miss matching
Select_Ri.
A new testcase added into the existing select_ri.ll, also there is an
existing case in cmp.ll which would be improved to use Select_Ri after this
patch, it is adjusted accordingly.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 324560
|
| |
|
|
|
|
|
| |
Defs of operands outside of the instruction's explicit defs need
to be checked.
llvm-svn: 324554
|
| |
|
|
| |
llvm-svn: 324550
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hit from IR but creates a minefield for MI passes.
The x86 backend has fairly powerful logic to try and fold loads that
feed register operands to instructions into a memory operand on the
instruction. This is almost always a good thing, but there are specific
relocated loads that are only allowed to appear in specific
instructions. Notably, R_X86_64_GOTTPOFF is only allowed in `movq` and
`addq`. This patch blocks folding of memory operands using this
relocation unless the target is in fact `addq`.
The particular relocation indicates why we simply don't hit this under
normal circumstances. This relocation is only used for TLS, and it gets
used in very specific ways in conjunction with %fs-relative addressing.
The result is that loads using this relocation are essentially never
eligible for folding into an instruction's memory operands. Unless, of
course, you have an MI pass that inserts usage of such a load. I have
exactly such an MI pass and was greeted by truly mysterious miscompiles
where the linker replaced my instruction with a completely garbage byte
sequence. Go team.
This is the only such relocation I'm aware of in x86, but there may be
others that need to be similarly restricted.
Fixes PR36165.
Differential Revision: https://reviews.llvm.org/D42732
llvm-svn: 324546
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
KMOVQ on non-BWI targets
If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing the either v16i1 or v64i1 into getMinimalRegisterClass.
Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug.
Fixes PR36256
Differential Revision: https://reviews.llvm.org/D42989
llvm-svn: 324533
|
| |
|
|
| |
llvm-svn: 324530
|
| |
|
|
| |
llvm-svn: 324497
|
| |
|
|
|
|
|
|
| |
This reverts commit r324487.
It broke clang tests.
llvm-svn: 324494
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: This is a candidate for LLVM 6.0, because it was planned to be
in that release but was delayed due to a long review period.
Merge conflict in release_60 - resolution:
Add "-p6:32:32" into the second (non-amdgiz) string.
Only scalar loads support 32-bit pointers. An address in a VGPR will
fail to compile. That's OK because the results of loads will only be used
in places where VGPRs are forbidden.
Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC.
The tests cover all uses cases we need for Mesa.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D41651
llvm-svn: 324487
|