| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.
This then requires that EvaluateAsRelocatable stop when it finds a
non-trivial reference kind. That in turn requires the ELF writer to look
harder for weak references.
Last but not least, this found a case where we were being bug-for-bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.
llvm-svn: 207920
|
| |
|
|
|
|
| |
http://reviews.llvm.org/D3598
llvm-svn: 207917
|
| |
|
|
|
|
|
|
| |
There is no point in creating it if we're not going to vectorize
anything. Creating the map is expensive as it creates large values.
No functionality change.
llvm-svn: 207916
|
| |
|
|
| |
llvm-svn: 207915
|
| |
|
|
|
|
|
|
| |
which are corresponding to the current target read from the ELF file.
This fix cannot be tested until obj2yaml supports the ELF format.
llvm-svn: 207905
|
| |
|
|
| |
llvm-svn: 207904
|
| |
|
|
|
|
|
| |
This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer.
Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559
llvm-svn: 207901
|
| |
|
|
| |
Committed initially in r207724-r207726 and reverted due to compiler-rt
crashes in r207732.
Instead, fix this harder with unordered_map and store the LexicalScopes
by value in the map. This did necessitate moving the definition of
LexicalScope above the definition of LexicalScopes.
Let's see how the buildbots/compilers tolerate unordered_map::emplace +
std::piecewise_construct + std::forward_as_tuple...
llvm-svn: 207876
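A minimal sketch of the unordered_map::emplace + std::piecewise_construct +
std::forward_as_tuple combination mentioned above. The Scope and ScopeTable
types are hypothetical stand-ins, not the actual LexicalScope/LexicalScopes
classes. The point is that the mapped value is constructed in place inside
the map, so it needs neither a copy nor a move constructor, and that the
value type must be fully defined before the map member that holds it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a scope-like type. It is non-copyable (and so
// also non-movable), which means it can only live in the map by value if it
// is constructed in place.
struct Scope {
  Scope(const Scope *Parent, std::string Name)
      : Parent(Parent), Name(std::move(Name)) {}
  Scope(const Scope &) = delete;
  Scope &operator=(const Scope &) = delete;

  const Scope *Parent;
  std::string Name;
};

// Illustrative owner that stores Scope objects by value, keyed by an id.
// Scope has to be a complete type here, hence its definition comes first.
class ScopeTable {
public:
  // Returns the existing scope for Id, or constructs a new one in place.
  Scope &getOrCreate(int Id, const Scope *Parent, const std::string &Name) {
    auto Result = Scopes.emplace(std::piecewise_construct,
                                 std::forward_as_tuple(Id),            // key args
                                 std::forward_as_tuple(Parent, Name)); // value args
    return Result.first->second;
  }

  std::size_t size() const { return Scopes.size(); }

private:
  std::unordered_map<int, Scope> Scopes;
};
```

References to elements of an unordered_map stay valid across rehashing, so a
child scope can safely hold a pointer to a parent stored in the same map.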
|
| |
|
|
| |
llvm-svn: 207871
|
| |
|
|
| |
llvm-svn: 207869
|
| |
|
|
|
|
|
|
|
| |
There are public functions that mutate various members as well as
another private member already, so make all the members private to
avoid the discontinuity and add accessors for the values. Should
be no functional change.
llvm-svn: 207868
|
| |
|
|
| |
Reading line tables in llvm-cov was pretty broken, but would happen to
work as long as no line in the table was 0. It's not clear to me
whether a line of zero *should* show up in these tables, but deciding
to read a string in the middle of the line table is certainly the
wrong thing to do if it does.
I've also added some comments, as trying to figure out what this block
of code was doing was fairly unpleasant.
llvm-svn: 207866
|
| |
|
|
| |
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.
GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.
Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.
llvm-svn: 207856
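The shape of the interface described above (one entry point, with the caller
supplying a predicate that says which constructors to remove) can be sketched
as follows. This is not the signature of OptimizeGlobalCtorsList(); Ctor,
removeCtorsIf and isRetOnly are hypothetical names over a toy representation,
used only to illustrate the design.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy model of a global constructor: a name plus its instructions as text.
// (Hypothetical; real constructors are llvm::Function objects.)
struct Ctor {
  std::string Name;
  std::vector<std::string> Body;
};

// Removes every constructor for which ShouldRemove returns true.
// Returns true if the list changed.
bool removeCtorsIf(std::vector<Ctor> &List,
                   const std::function<bool(const Ctor &)> &ShouldRemove) {
  auto NewEnd = std::remove_if(List.begin(), List.end(), ShouldRemove);
  bool Changed = NewEnd != List.end();
  List.erase(NewEnd, List.end());
  return Changed;
}

// Filter in the spirit of the GlobalDCE case: a constructor whose body is a
// single "ret" and nothing else does no work and can be dropped.
bool isRetOnly(const Ctor &C) {
  return C.Body.size() == 1 && C.Body.front() == "ret";
}
```

Keeping the policy in the predicate lets one caller evaluate constructors
statically while another only drops trivially empty ones, without either
duplicating the list-rewriting logic.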
|
| |
|
|
|
|
|
|
|
| |
address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where
PRE is applied to a load that is not partially redundant.
<rdar://problem/16638765>.
llvm-svn: 207853
|
| |
|
|
|
|
|
|
|
| |
.file records are supposed to have a section identifier of 65534
(IMAGE_SCN_DEBUG) rather than 0. This is spelt out clearly within the PE/COFF
specification. Fix this minor oversight in the implementation of support for
.file records.
llvm-svn: 207851
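As a small illustration of where 65534 comes from: assuming the section
number is stored in a 16-bit field, 65534 is the unsigned view of the signed
value -2. The constant and helper names below are hypothetical and are not
taken from LLVM's headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name for the special "debug" section number, written as a
// signed 16-bit value.
constexpr std::int16_t DebugSectionNumber = -2;

// Reinterprets a signed 16-bit section number as the unsigned value that
// ends up in a 2-byte field.
constexpr std::uint16_t asStoredField(std::int16_t SectionNumber) {
  return static_cast<std::uint16_t>(SectionNumber);
}

// -2 and 65534 are the same 16-bit pattern (0xFFFE).
static_assert(asStoredField(DebugSectionNumber) == 65534,
              "debug section number should be stored as 0xFFFE");
```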
|
| |
|
|
| |
llvm-svn: 207850
|
| |
|
|
|
|
| |
This is just bdver3 + AVX2 + BMI2.
llvm-svn: 207847
|
| |
|
|
|
|
|
| |
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
llvm-svn: 207846
|
| |
|
|
|
|
|
|
| |
v2: move code to AMDGPUISelLowering.cpp
squash with tests (both EG and SI)
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 207845
|
| |
|
|
| |
llvm-svn: 207844
|
| |
|
|
| |
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.
v2:
- Fix calculation of lane index
- Extend VGPR liveness to end of program.
v3:
- Use SIMM16 field of S_NOP to specify multiple NOPs.
https://bugs.freedesktop.org/show_bug.cgi?id=75005
llvm-svn: 207843
|
| |
|
|
| |
llvm-svn: 207840
|
| |
|
|
| |
Previously, LLVM had no knowledge that these instructions actually
modified their address register: fine if they never end up in CodeGen,
but when I'd rather like to write some patterns for them it becomes a
disaster.
The change is mostly straightforward; I think the most significant
design decision was to *always* put the address write-back first. This
allows loads and stores to be accessed more uniformly, for example
permitting the continued sharing of the InstAlias definitions.
I also discovered that the custom Decode logic is no longer needed, so
I removed it.
No tests, because there should be no functionality change.
llvm-svn: 207839
|
| |
|
|
|
|
|
|
| |
While post-indexed LD1/ST1 instructions do exist for vector loads,
this patch makes use of the more flexible addressing-modes in LDR/STR
instructions.
llvm-svn: 207838
|
| |
|
|
|
|
| |
vector VT but scalar values.
llvm-svn: 207835
|
| |
|
|
|
|
| |
times in a bootstrap of clang.
llvm-svn: 207828
|
| |
|
|
| |
llvm-svn: 207807
|
| |
|
|
| |
llvm-svn: 207805
|
| |
|
|
| |
llvm-svn: 207804
|
| |
|
|
| |
llvm-svn: 207803
|
| |
|
|
| |
Given the following C code llvm currently generates suboptimal code for
x86-64:
__m128 bss4( const __m128 *ptr, size_t i, size_t j )
{
float f = ptr[i][j];
return (__m128) { f, f, f, f };
}
=================================================
define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 {
%a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i
%a2 = load <4 x float>* %a1, align 16, !tbaa !1
%a3 = trunc i64 %j to i32
%a4 = extractelement <4 x float> %a2, i32 %a3
%a5 = insertelement <4 x float> undef, float %a4, i32 0
%a6 = insertelement <4 x float> %a5, float %a4, i32 1
%a7 = insertelement <4 x float> %a6, float %a4, i32 2
%a8 = insertelement <4 x float> %a7, float %a4, i32 3
ret <4 x float> %a8
}
=================================================
shlq $4, %rsi
addq %rdi, %rsi
movslq %edx, %rax
vbroadcastss (%rsi,%rax,4), %xmm0
retq
=================================================
The movslq is unneeded, but is present because of the trunc to i32 and then
sext back to i64 that the backend adds for vbroadcastss.
We can't remove it because it changes the meaning. The IR that clang
generates is already suboptimal. What clang really should emit is:
%a4 = extractelement <4 x float> %a2, i64 %j
This patch makes that legal. A separate patch will teach clang to do it.
Differential Revision: http://reviews.llvm.org/D3519
llvm-svn: 207801
|
| |
|
|
|
|
| |
more.
llvm-svn: 207800
|
| |
|
|
| |
This creates a lot of core infrastructure, which makes it possible to add
quite a bit more to mips fast-isel with little effort.
Test Plan: simplestore.ll
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3527
llvm-svn: 207790
|
| |
|
|
|
|
|
|
|
|
| |
This was initialized by llvm-mc (calling setDwarfVersion) but other
clients (such as clang, llc, etc) aren't necessarily initializing this
so we were getting garbage DWARF version values in the output.
Initialize it to a reasonable default (the same default used in llvm-mc,
though this is higher than it was (2) previously).
llvm-svn: 207788
|
| |
|
|
|
|
|
|
|
| |
This matches gas' behaviour on COFF.
I think that this yak is now sufficiently shaved for aliases with offset
to work.
llvm-svn: 207786
|
| |
|
|
| |
llvm-svn: 207785
|
| |
|
|
| |
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.
The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.
Review: http://reviews.llvm.org/D3462
Patch by Jingyue Wu.
llvm-svn: 207783
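The arithmetic behind merging the common part of a group of GEPs can be
sketched as below. This is not the pass's implementation; it only shows, for
a hypothetical row-major array of 4-byte elements, that the address of
a[i + di][j + dj] splits into a part shared by the whole group and a
constant offset per access. All names here are illustrative.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t ElemSize = 4; // assumed element size in bytes

// Byte offset of a[i + di][j + dj], computed from scratch for each access.
constexpr std::int64_t fullOffset(std::int64_t I, std::int64_t J,
                                  std::int64_t Cols, std::int64_t DI,
                                  std::int64_t DJ) {
  return ((I + DI) * Cols + (J + DJ)) * ElemSize;
}

// The same address split into a variable part shared by every access in the
// group and a constant offset that differs per access.
struct SplitOffset {
  std::int64_t CommonPart;  // computed once for the group
  std::int64_t ConstOffset; // folded into the addressing per access
};

constexpr SplitOffset splitOffset(std::int64_t I, std::int64_t J,
                                  std::int64_t Cols, std::int64_t DI,
                                  std::int64_t DJ) {
  return {(I * Cols + J) * ElemSize, (DI * Cols + DJ) * ElemSize};
}
```

For a group such as a[i][j], a[i][j + 1] and a[i + 1][j], CommonPart is
identical, so only one multiply-add chain is needed and each access adds its
own constant.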
|
| |
|
|
|
|
|
|
| |
This is the LLVM portion that will allow Clang and other frontends to emit
typedefs of void by providing a null type for the typedef's underlying
type.
llvm-svn: 207777
|
| |
|
|
|
|
|
|
| |
a const char *, so casting to non-const was triggering a warning (even though the assignment and usage were always const anyway).
No functional changes intended.
llvm-svn: 207774
|
| |
|
|
|
|
|
|
| |
Use i32 instead of specifying SReg_32. When this is
the pseudo INDIRECT_BASE_ADDR, this would give a bogus
verifier error.
llvm-svn: 207770
|
| |
|
|
|
|
|
|
|
| |
This fixes pr19147.
There are a few more related issues to fix, but the testcase in the bug now
passes.
llvm-svn: 207763
|
| |
|
|
|
|
| |
I will use it there in a second.
llvm-svn: 207761
|
| |
|
|
| |
llvm-svn: 207760
|
| |
|
|
| |
llvm-svn: 207759
|
| |
|
|
| |
We currently force symbols to be globals in .thumb_set. The intent
seems to be that given
.thumb_set foo, bar
we emit an undefined symbol to bar if it is never defined. The side
effect is that we mark bar as global, even if it is defined, which gas
does not.
Producing an undefined reference to bar is a general difference from MC and gas.
For example, given
a = b
gas will produce an undefined reference to b; MC will not. I would be surprised
if any code depends on this, but if it does, we should fix the general
difference, not special case .thumb_set.
llvm-svn: 207757
|
| |
|
|
|
|
|
| |
The canonical form of the BFM instruction is always one of the more explicit
extract or insert operations, which makes reading output much easier.
llvm-svn: 207752
|
| |
|
|
|
|
|
|
|
| |
just connects an SCC to one of its descendants directly. Not much of an
impact. The last one is the hard one -- connecting an SCC to one of its
ancestors, and thereby forming a cycle such that we have to merge all
the SCCs participating in the cycle.
llvm-svn: 207751
|
| |
|
|
|
|
| |
no functionality changed.
llvm-svn: 207750
|
| |
|
|
|
|
|
|
| |
of SCCs in the SCC DAG. Exercise them in the big graph test case. These
will be especially useful for establishing invariants in insertion
logic.
llvm-svn: 207749
|
| |
|
|
|
|
|
|
| |
top bit set.
This fixes an ARM assembler crash - regression test added.
llvm-svn: 207747
|