| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
This is just bdver3 + AVX2 + BMI2.
llvm-svn: 207847
|
| |
|
|
|
|
|
| |
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
llvm-svn: 207846
|
| |
|
|
|
|
|
|
| |
v2: move code to AMDGPUISelLowering.cpp
squash with tests (both EG and SI)
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 207845
|
| |
|
|
| |
llvm-svn: 207844
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.
v2:
- Fix calculation of lane index
- Extend VGPR liveness to end of program.
v3:
- Use SIMM16 field of S_NOP to specify multiple NOPs.
https://bugs.freedesktop.org/show_bug.cgi?id=75005
llvm-svn: 207843
|
| |
|
|
| |
llvm-svn: 207840
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, LLVM had no knowledge that these instructions actually
modified their address register: fine if they never end up in CodeGen,
but when I'd rather like to write some patterns for them it becomes a
disaster.
The change is mostly straightforward; I think the most significant
design decision was to *always* put the address write-back first. This
allows loads and stores to be accessed more uniformly, for example
permitting the continued sharing of the InstAlias definitions.
I also discovered that the custom Decode logic is no longer needed, so
I removed it.
No tests, because there should be no functionality change.
llvm-svn: 207839
|
| |
|
|
|
|
|
|
| |
While post-indexed LD1/ST1 instructions do exist for vector loads,
this patch makes use of the more flexible addressing-modes in LDR/STR
instructions.
llvm-svn: 207838
|
| |
|
|
|
|
| |
vector VT but scalar values.
llvm-svn: 207835
|
| |
|
|
|
|
| |
times in a bootstrap of clang.
llvm-svn: 207828
|
| |
|
|
| |
llvm-svn: 207807
|
| |
|
|
| |
llvm-svn: 207805
|
| |
|
|
| |
llvm-svn: 207804
|
| |
|
|
| |
llvm-svn: 207803
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given the following C code llvm currently generates suboptimal code for
x86-64:
__m128 bss4( const __m128 *ptr, size_t i, size_t j )
{
  float f = ptr[i][j];
  return (__m128) { f, f, f, f };
}
=================================================
define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 {
  %a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i
  %a2 = load <4 x float>* %a1, align 16, !tbaa !1
  %a3 = trunc i64 %j to i32
  %a4 = extractelement <4 x float> %a2, i32 %a3
  %a5 = insertelement <4 x float> undef, float %a4, i32 0
  %a6 = insertelement <4 x float> %a5, float %a4, i32 1
  %a7 = insertelement <4 x float> %a6, float %a4, i32 2
  %a8 = insertelement <4 x float> %a7, float %a4, i32 3
  ret <4 x float> %a8
}
=================================================
shlq $4, %rsi
addq %rdi, %rsi
movslq %edx, %rax
vbroadcastss (%rsi,%rax,4), %xmm0
retq
=================================================
The movslq is unneeded, but is present because of the trunc to i32 and then
sext back to i64 that the backend adds for vbroadcastss.
We can't remove it because doing so would change the meaning. The IR that clang
generates is already suboptimal. What clang really should emit is:
%a4 = extractelement <4 x float> %a2, i64 %j
This patch makes that legal. A separate patch will teach clang to do it.
Differential Revision: http://reviews.llvm.org/D3519
llvm-svn: 207801
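To see why the trunc/sext pair cannot simply be dropped, here is a minimal Python sketch that models the two operations on 64-bit patterns. The helper names are hypothetical and only illustrate the arithmetic; they are not part of LLVM.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_i64_to_i32(x):
    """Keep only the low 32 bits, like `trunc i64 %j to i32`."""
    return x & MASK32


def sext_i32_to_i64(x):
    """Sign-extend a 32-bit pattern and return it as an unsigned 64-bit pattern."""
    x &= MASK32
    if x & (1 << 31):
        x -= 1 << 32
    return x & MASK64


def roundtrip(j):
    """Model of what movslq computes from the original 64-bit index."""
    return sext_i32_to_i64(trunc_i64_to_i32(j))


# Small indices survive the round trip; anything that does not fit in a
# signed 32-bit value comes back changed, so the pair is not a no-op.
for j in (3, 1 << 31, (1 << 32) + 3):
    print(hex(j), "->", hex(roundtrip(j)))
```

With an i64 index on extractelement there is no truncation in the IR to begin with, so the backend has nothing to undo.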
|
| |
|
|
|
|
| |
more.
llvm-svn: 207800
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This creates much of the core infrastructure needed to add, with little
effort, quite a bit more to Mips fast-isel.
Test Plan: simplestore.ll
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3527
llvm-svn: 207790
|
| |
|
|
|
|
|
|
|
|
| |
This was initialized by llvm-mc (calling setDwarfVersion) but other
clients (such as clang, llc, etc) aren't necessarily initializing this
so we were getting garbage DWARF version values in the output.
Initialize it to a reasonable default (the same default used in llvm-mc,
though this is higher than it was (2) previously).
llvm-svn: 207788
|
| |
|
|
|
|
|
|
|
| |
This matches gas' behaviour on COFF.
I think that this yak is now sufficiently shaved for aliases with offset
to work.
llvm-svn: 207786
|
| |
|
|
| |
llvm-svn: 207785
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.
The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.
Review: http://reviews.llvm.org/D3462
Patch by Jingyue Wu.
llvm-svn: 207783
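The idea of sharing a common part between related GEPs can be sketched in a few lines of Python. This is only an illustration of the address arithmetic, assuming every pointer has the form base + (var + const) * element size; the tuple layout, ELEM_SIZE, and the function names are hypothetical and do not mirror the actual pass.

```python
from collections import defaultdict

ELEM_SIZE = 4  # assumption: every GEP indexes an array of 4-byte elements


def split_common(geps):
    """Group GEPs by (base, variable index) and keep only a byte offset per GEP.

    Each input is (name, base, var, const), meaning base + (var + const) * ELEM_SIZE.
    """
    groups = defaultdict(list)
    for name, base, var, const in geps:
        groups[(base, var)].append((name, const * ELEM_SIZE))
    return dict(groups)


def evaluate(groups, env):
    """Compute every address, evaluating the common part once per group."""
    addrs = {}
    for (base, var), members in groups.items():
        common = env[base] + env[var] * ELEM_SIZE  # shared computation
        for name, offset in members:
            addrs[name] = common + offset  # one simple add per pointer
    return addrs


geps = [("p0", "a", "i", 0), ("p1", "a", "i", 1), ("p2", "a", "i", 2)]
groups = split_common(geps)
addrs = evaluate(groups, {"a": 1000, "i": 5})
print(addrs)
```

The three pointers share one multiply and one add for the common part, and each then costs a single constant add.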
|
| |
|
|
|
|
|
|
| |
This is the LLVM portion that will allow Clang and other frontends to emit
typedefs of void by providing a null type for the typedef's underlying
type.
llvm-svn: 207777
|
| |
|
|
|
|
|
|
| |
a const char *, so casting to non-const was triggering a warning (even though the assignment and usage were always const anyway).
No functional changes intended.
llvm-svn: 207774
|
| |
|
|
|
|
|
|
| |
Use i32 instead of specifying SReg_32. When this is
the pseudo INDIRECT_BASE_ADDR, this would give a bogus
verifier error.
llvm-svn: 207770
|
| |
|
|
|
|
|
|
|
| |
This fixes pr19147.
There are a few more related issues to fix, but the testcase in the bug now
passes.
llvm-svn: 207763
|
| |
|
|
|
|
| |
I will use it there in a second.
llvm-svn: 207761
|
| |
|
|
| |
llvm-svn: 207760
|
| |
|
|
| |
llvm-svn: 207759
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently force symbols to be globals in .thumb_set. The intent
seems to be that given
.thumb_set foo, bar
we emit an undefined symbol to bar if it is never defined. The side
effect is that we mark bar as global, even if it is defined, which gas
does not.
Producing an undefined reference to bar is a general difference between MC and gas.
For example, given
a = b
gas will produce an undefined reference to b, MC will not. I would be surprised
if any code depends on this, but if it does, we should fix the general
difference, not special case .thumb_set.
llvm-svn: 207757
|
| |
|
|
|
|
|
| |
The canonical form of the BFM instruction is always one of the more explicit
extract or insert operations, which makes reading output much easier.
llvm-svn: 207752
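As a rough illustration of how a raw BFM encoding maps onto the explicit forms, the sketch below follows the A64 alias rule as I understand it from the architecture manual: the insert form (BFI) is preferred when imms < immr, and the extract form (BFXIL) otherwise. The function name and tuple result are hypothetical, and later alias additions such as BFC are not handled.

```python
def bfm_alias(immr, imms, regsize=32):
    """Return (mnemonic, lsb, width) for a BFM with the given immr/imms fields."""
    if not (0 <= immr < regsize and 0 <= imms < regsize):
        raise ValueError("immr and imms must lie in [0, regsize)")
    if imms < immr:
        # Insert: the low (imms + 1) bits of the source land at bit (regsize - immr).
        return ("bfi", regsize - immr, imms + 1)
    # Extract-and-insert-low: bits immr..imms of the source land at bit 0.
    return ("bfxil", immr, imms - immr + 1)


print(bfm_alias(24, 3))   # bfm w0, w1, #24, #3
print(bfm_alias(8, 11))   # bfm w0, w1, #8, #11
```

Printing the lsb/width pair directly is what makes the output easier to read than the rotate/field-end pair stored in the encoding.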
|
| |
|
|
|
|
|
|
|
| |
just connects an SCC to one of its descendants directly. Not much of an
impact. The last one is the hard one -- connecting an SCC to one of its
ancestors, and thereby forming a cycle such that we have to merge all
the SCCs participating in the cycle.
llvm-svn: 207751
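The hard case described above can be illustrated with a small Python sketch over a DAG of SCCs. It assumes the DAG is a plain dict from node to successor list and that the SCCs to merge are exactly those lying on some path from the ancestor down to the source; the function names are hypothetical and this is not the data structure the commit uses.

```python
def reachable(graph, start):
    """Return every node reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def sccs_to_merge(dag, src, dst):
    """SCCs that collapse into one when the edge src -> dst is added.

    If dst is an ancestor of src, every node on a path dst ~> src joins the
    new cycle. If dst is not an ancestor, nothing needs merging.
    """
    forward = reachable(dag, dst)
    if src not in forward:
        return set()
    reverse = {}
    for node, succs in dag.items():
        for succ in succs:
            reverse.setdefault(succ, []).append(node)
    backward = reachable(reverse, src)
    return forward & backward


dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(sorted(sccs_to_merge(dag, "D", "A")))  # edge to an ancestor
print(sorted(sccs_to_merge(dag, "A", "E")))  # edge to a descendant
```

An edge to a descendant leaves the DAG shape intact, which is why that case has little impact, while an edge to an ancestor pulls in every SCC between the two endpoints.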
|
| |
|
|
|
|
| |
no functionality changed.
llvm-svn: 207750
|
| |
|
|
|
|
|
|
| |
of SCCs in the SCC DAG. Exercise them in the big graph test case. These
will be especially useful for establishing invariants in insertion
logic.
llvm-svn: 207749
|
| |
|
|
|
|
|
|
| |
top bit set.
This fixes an ARM assembler crash - regression test added.
llvm-svn: 207747
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
=[
Turns out that this was the root cause of PR19621. We found a crasher
only recently (likely due to improvements elsewhere in the SLP
vectorizer) but the reduced test case failed all the way back to here.
I've confirmed that reverting this patch both fixes the reduced test
case in PR19621 and the actual source file that led to it, so it seems
to really be rooted here. I've replied to the commit thread with
discussion of my (feeble) attempts to debug this. Didn't make it very
far, so reverting now that we have a good test case so that things can
get back to healthy while the debugging carries on.
llvm-svn: 207746
|
| |
|
|
| |
llvm-svn: 207742
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There are two functional changes:
1) The directive is not expanded for the ASM->ASM code path.
2) If PIC is not set, there's no expansion for the ASM->OBJ code path (same behaviour as GAS).
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3482
llvm-svn: 207741
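The two rules in the summary amount to a small decision table, sketched below in Python. The function name and the "asm"/"obj" strings are hypothetical labels for the two code paths, not identifiers from the patch.

```python
def should_expand(output_kind, pic):
    """Decide whether the directive is expanded.

    output_kind: "asm" for the ASM->ASM path, "obj" for the ASM->OBJ path.
    pic: whether PIC is set.
    """
    if output_kind not in ("asm", "obj"):
        raise ValueError("unknown output kind: %r" % (output_kind,))
    # Rule 1: never expand when emitting assembly.
    # Rule 2: when emitting an object, expand only if PIC is set.
    return output_kind == "obj" and pic


for kind in ("asm", "obj"):
    for pic in (False, True):
        print(kind, pic, should_expand(kind, pic))
```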
|
| |
|
|
|
|
|
|
|
|
| |
and drotrv
GAS doesn't actually accept these particular cases.
The mnemonic without the trailing 'v' still supports two-operand aliases.
llvm-svn: 207740
|
| |
|
|
|
|
|
| |
Record the DWARF version in MCContext, and use it when
emitting the dwarf version into the debug info.
llvm-svn: 207739
|
| |
|
|
|
|
|
|
|
| |
This fixes the memory leak introduced with the initial addition of support for
WoA stack probing. Now that the pseudo-instruction expansion can handle an
external symbol, use that to generate the load, which simplifies the logic and
avoids the memory leak.
llvm-svn: 207737
|
| |
|
|
|
|
|
|
| |
This enhances the expansion of the mov32imm pseudo-instruction to support an
external symbol reference. This is motivated by a simplification of the stack
probe emission for Windows on ARM (and fixing a leak).
llvm-svn: 207736
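For readers unfamiliar with the pseudo-instruction, here is a minimal Python sketch of the shape of the expansion: a movw/movt pair that loads the low and high 16-bit halves, using :lower16:/:upper16: operators when the operand is a symbol rather than a constant. The function names and the symbol name ext_sym are hypothetical, and the sketch only formats assembly text; it does not reflect how the backend builds the instructions.

```python
def split_imm32(value):
    """Split a 32-bit value into (low 16 bits, high 16 bits)."""
    value &= 0xFFFFFFFF
    return value & 0xFFFF, value >> 16


def expand_mov32imm(rd, operand):
    """Return the two-instruction expansion as assembly strings."""
    if isinstance(operand, int):
        lo, hi = split_imm32(operand)
        return ["movw %s, #%#x" % (rd, lo), "movt %s, #%#x" % (rd, hi)]
    # Symbolic operand: let the assembler/linker supply each half.
    return [
        "movw %s, :lower16:%s" % (rd, operand),
        "movt %s, :upper16:%s" % (rd, operand),
    ]


print(expand_mov32imm("r0", 0x12345678))
print(expand_mov32imm("r0", "ext_sym"))  # ext_sym is a made-up symbol name
```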
|
| |
|
|
|
|
| |
appear to be breaking a bootstrapped build of compiler-rt.
llvm-svn: 207732
|
| |
|
|
| |
llvm-svn: 207730
|
| |
|
|
|
|
|
| |
This makes the coff writer compute the correct symbol value for the test in
pr19147. The section is still incorrect; that will be fixed in a followup patch.
llvm-svn: 207728
|
| |
|
|
| |
llvm-svn: 207726
|
| |
|
|
| |
llvm-svn: 207725
|
| |
|
|
|
|
| |
Ownership of abstract scopes coming soon.
llvm-svn: 207724
|
| |
|
|
| |
llvm-svn: 207723
|
| |
|
|
| |
llvm-svn: 207721
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Get rid of UserVariables set, and turn DbgValues into MapVector
to get a fixed ordering, as suggested in review for http://reviews.llvm.org/D3573.
Test Plan: llvm regression tests
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3579
llvm-svn: 207720
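The reason a MapVector gives a fixed ordering is that it pairs a lookup map with a vector that records insertion order. The Python sketch below mimics that layout; it is an illustration only (Python dicts already preserve insertion order), and the class is not LLVM's implementation.

```python
class MapVector:
    """Map with deterministic, insertion-ordered iteration."""

    def __init__(self):
        self._index = {}  # key -> position in self._items
        self._items = []  # (key, value) pairs in insertion order

    def __setitem__(self, key, value):
        pos = self._index.get(key)
        if pos is None:
            self._index[key] = len(self._items)
            self._items.append((key, value))
        else:
            # Updating an existing key keeps its original position.
            self._items[pos] = (key, value)

    def __getitem__(self, key):
        return self._items[self._index[key]][1]

    def __contains__(self, key):
        return key in self._index

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


mv = MapVector()
mv["b"] = 1
mv["a"] = 2
mv["b"] = 3
print(list(mv))
```

Iteration visits entries in the order they were first inserted, independent of hashing or pointer values, which is what makes the emitted output reproducible.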
|