| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 206195
|
| |
|
|
|
|
|
|
|
|
|
|
| |
basic blocks inside loops correctly.
Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights.
This fixes PR18705 and <rdar://problem/15991090>.
Differential Revision: http://reviews.llvm.org/D3363
llvm-svn: 206194
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This allows correct relocations to be generated for a symbolic
address that is being adjusted by a negative constant. Since r204294,
such expressions have triggered undefined behavior when LLVM was built
without assertions.
Credit goes to Rafael for this patch; I'm submitting it on his behalf
as he is on vacation this week.
llvm-svn: 206192
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This was another incorrect use of hasMips64() vs isGP64bit().
Depends on D3344
Reviewers: matheusalmeida, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3347
llvm-svn: 206187
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64
- ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
I've added additional testcases to cover as much of the codegen changes
affecting MIPS-IV as I can. Where I've been unable to find an existing
MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering
ISD::GlobalAddress and similar), I at least agree that MIPS-IV should
behave like MIPS64. Further testcases that are fixed by this patch will follow
in my next commit. The testcases from that commit that fail for MIPS-IV without
this patch are:
LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
LLVM :: CodeGen/Mips/cmov.ll
LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll
LLVM :: CodeGen/Mips/largeimmprinting.ll
LLVM :: CodeGen/Mips/longbranch.ll
LLVM :: CodeGen/Mips/mips64-f128.ll
LLVM :: CodeGen/Mips/mips64directive.ll
LLVM :: CodeGen/Mips/mips64ext.ll
LLVM :: CodeGen/Mips/mips64fpldst.ll
LLVM :: CodeGen/Mips/mips64intldst.ll
LLVM :: CodeGen/Mips/mips64load-store-left-right.ll
LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll
Reviewers: dsanders
Reviewed By: dsanders
CC: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3343
llvm-svn: 206183
|
| |
|
|
|
|
| |
Patch by Nick Tomlinson!
llvm-svn: 206177
|
| |
|
|
|
|
| |
The 32-bit pattern is still valid: 0123 -> 3210 -> 1032.
llvm-svn: 206172
|
| |
|
|
|
|
|
| |
Code change is because optimizeCompareInstr didn't know how to pull the
condition code out of FCSEL instructions.
llvm-svn: 206171
|
| |
|
|
|
|
|
| |
AArch64 tests for this, and it's obviously a good idea. Have to invert the
condition code, of course.
llvm-svn: 206170
|
| |
|
|
|
|
|
|
|
| |
There was one definite issue in ARM64 (the off-by-1 check for whether
a shift could be folded in) and one difference that is probably
correct: ARM64 didn't fold nodes with multiple uses into the
arithmetic operations unless optimising for code size.
llvm-svn: 206168
|
| |
|
|
|
|
|
| |
This transformation is only valid when being used for an EQ or NE
comparison since the flags change otherwise.
llvm-svn: 206167
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously loadImmediate() would produce MKMSK instructions with invalid
immediate values such as mkmsk r0, 9. Fix this by checking the mask size
is valid.
Reviewers: robertlytton
Reviewed By: robertlytton
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3289
llvm-svn: 206163
|
| |
|
|
| |
llvm-svn: 206154
|
| |
|
|
|
|
| |
It broke some builders, at least, i686.
llvm-svn: 206153
|
| |
|
|
|
|
|
| |
declaration. GCC 4.7 appears to get hopelessly confused by declaring
this function within a member function of a class template. Go figure.
llvm-svn: 206152
|
| |
|
|
|
|
|
|
| |
BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting
AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for
example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).
llvm-svn: 206150
|
| |
|
|
|
|
|
|
|
|
|
|
| |
abstract interface. The only user of this functionality is the JIT
memory manager and it is quite happy to have a custom type here. This
removes a virtual function call and a lot of unnecessary abstraction
from the common case where this is just a *very* thin vaneer around
a call to malloc.
Hopefully still no functionality changed here. =]
llvm-svn: 206149
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
slabs rather than embedding a singly linked list in the slabs
themselves. This has a few advantages:
- Better utilization of the slab's memory by not wasting 16-bytes at the
front.
- Simpler allocation strategy by not having a struct packed at the
front.
- Avoids paging every allocated slab in just to traverse them for
deallocating or dumping stats.
The latter is the really nice part. Folks have complained from time to
time bitterly that tearing down a BumpPtrAllocator, even if it doesn't
run any destructors, pages in all of the memory allocated. Now it won't.
=]
Also resolves a FIXME with the scaling of the slab sizes. The scaling
now disregards specially sized slabs for allocations larger than the
threshold.
llvm-svn: 206147
|
| |
|
|
| |
llvm-svn: 206144
|
| |
|
|
|
|
| |
instead of comparing to nullptr.
llvm-svn: 206142
|
| |
|
|
|
|
|
|
|
|
| |
Implements the various TTI functions to enable constant hoisting on PPC. The
only significant test-suite change is this:
MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup
(which essentially reverses the slowdown from r206120).
llvm-svn: 206141
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The values for the relocation type can (and do) overlap across various
architectures. When performing an adjustment of the emitted relocation in the
final object file, check that the file magic matches the target for which the
relocation type is valid (e.g. a I386 relocation is only applied to an X86
object file, and an AMD64 relocation is only applied to an X86_64 object file).
This was noticed while adding support for ARM WinCOFF object file emission.
A test case for this is not really possible as the values for REL32 do not
overlap on I386 and AMD64, which is why this was never noticed in practice. The
ARM WinCOFF emission is not yet ready to merge into the tree.
llvm-svn: 206138
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:
%mul32 = trunc i64 %mul64 to i32
%zext = zext i32 %mul32 to i64
%overflow = icmp ne i64 %mul64, %zext
or
%overflow = icmp ugt i64 %mul64 , 0xffffffff
then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.
Differential Revision: http://llvm-reviews.chandlerc.com/D2814
llvm-svn: 206137
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had been using the known-zero values of the operand of the or to construct
the mask for an rlwimi; this is not quite correct, but fine when the mask is
constant. When the mask is constant, then the known zeros of the operand must
be a superset of the zeros in the mask. However, when the mask is not a
constant, then there might be bits in the operand that are not known to be zero
that, at runtime, might be zero in the mask. Therefore, we check that any bits
not known to be zero *are* known to be one in the mask. Otherwise, we can't
fold the mask with the or and shift.
This was revealed as a miscompile of
MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with
constant hoisting.
llvm-svn: 206136
|
| |
|
|
|
|
|
|
|
|
|
|
| |
I found this from a particular GDB test suite case of inlining
(something similar is provided as a test case) but came across a few
other related cases (other callers of the same functions, and one other
instance of the same coding mistake in a separate function).
I'm not sure what the best way to test this is (let alone to cover the
other cases I discovered), so hopefully this sufficies - open to ideas.
llvm-svn: 206130
|
| |
|
|
|
|
| |
check instead of comparing to nullptr.
llvm-svn: 206129
|
| |
|
|
| |
llvm-svn: 206127
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add implementations of:
bool isLegalICmpImmediate(int64_t Imm) const
bool isLegalAddImmediate(int64_t Imm) const
bool isTruncateFree(Type *Ty1, Type *Ty2) const
bool isTruncateFree(EVT VT1, EVT VT2) const
bool shouldConvertConstantLoadToIntImm(const APInt &Imm, Type *Ty) const
Unfortunately, this regresses counter-register-based loop formation because
some of the loops now end up in forms were SE cannot compute loop counts.
However, nevertheless, the test-suite results favor committing:
SingleSource/Benchmarks/BenchmarkGame/puzzle: 26% speedup
MultiSource/Benchmarks/FreeBench/analyzer/analyzer: 21% speedup
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan: 20% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv: 19% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv: 15% speedup
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2: 2% speedup
MultiSource/Benchmarks/VersaBench/bmm/bmm: 26% slowdown
llvm-svn: 206120
|
| |
|
|
|
|
| |
Not sure why clang didn't diagnose this (GCC does).
llvm-svn: 206117
|
| |
|
|
| |
llvm-svn: 206116
|
| |
|
|
|
|
| |
While there make array_lengthof constexpr if we have support for it.
llvm-svn: 206112
|
| |
|
|
|
|
|
| |
Making them inline was a historical accident, they're neither hot nor
templated.
llvm-svn: 206109
|
| |
|
|
|
|
|
| |
DWARF3 introduced DW_TAG_restrict_type, so avoid using it in prior
versions.
llvm-svn: 206105
|
| |
|
|
|
|
|
| |
This reverts commit 206096 while I investigate why this broke the gdb
buildbot.
llvm-svn: 206103
|
| |
|
|
|
|
|
| |
There is no need to check if we want to hoist the immediate value of an
shift instruction. Simply return TCC_Free right away.
llvm-svn: 206101
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally the cost model would give up for large constants and just return the
maximum cost. This is not what we want for constant hoisting, because some of
these constants are large in bitwidth, but are still cheap to materialize.
This commit fixes the cost model to either return TCC_Free if the cost cannot be
determined, or accurately calculate the cost even for large constants
(bitwidth > 128).
This fixes <rdar://problem/16591573>.
llvm-svn: 206100
|
| |
|
|
|
|
|
|
|
|
|
| |
Nice to be able to just print out the Tag and have the debugger print
dwarf::DW_TAG_subprogram or whatever, rather than an int.
It's a bit finicky (for example DIDescriptor::getTag still returns
unsigned) because some places still handle real dwarf tags + our fake
tags (one day we'll remove the fake tags, hopefully).
llvm-svn: 206098
|
| |
|
|
|
|
|
|
|
|
|
|
| |
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.
This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.
rdar://problem/16362674.
llvm-svn: 206096
|
| |
|
|
|
|
|
|
| |
This logic is properly in the realm of whatever is creating the
TargetMachine. This makes plain 'llc foo.ll' consistent across
heterogenous machines.
llvm-svn: 206094
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had disabled use of TBAA during CodeGen (even when otherwise using AA)
because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to
miss basic type punning that it should catch (and, thus, we'd fail to override
TBAA when we should).
However, when AA is in use during CodeGen, CGP now uses normal GEPs and
bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a
result, BasicAA should be able to make us do the right thing in the face of
type-punning, and it seems safe to enable use of TBAA again. self-hosting seems
fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle.
Note: We still don't update TBAA when merging stack slots, although because
BasicAA should now catch all such cases, this is no longer a blocking issue.
Nevertheless, I plan to commit code to deal with this properly in the near
future.
llvm-svn: 206093
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current memory-instruction optimization logic in CGP, which sinks parts of
the address computation that can be adsorbed by the addressing mode, does this
by explicitly converting the relevant part of the address computation into
IR-level integer operations (making use of ptrtoint and inttoptr). For most
targets this is currently not a problem, but for targets wishing to make use of
IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a
problem for two reasons:
1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr
2. In cases where type-punning was used, and BasicAA was used
to override TBAA, BasicAA may no longer do so. (this had forced us to disable
all use of TBAA in CodeGen; something which we can now enable again)
This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by
default (except for those targets that use AA during CodeGen), and so aside
from some PowerPC subtargets and SystemZ, there should be no change in
behavior. We may be able to switch completely away from the ptrtoint/inttoptr
sinking on all targets, but further testing is required.
I've doubled-up on a number of existing tests that are sensitive to the
address sinking behavior (including some store-merging tests that are
sensitive to the order of the resulting ADD operations at the SDAG level).
llvm-svn: 206092
|
| |
|
|
| |
llvm-svn: 206089
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is a shared implementation class for BlockFrequencyInfo and
MachineBlockFrequencyInfo, not for BlockFrequency, a related (but
distinct) class.
No functionality change.
<rdar://problem/14292693>
llvm-svn: 206083
|
| |
|
|
|
|
|
|
| |
No functionality change.
<rdar://problem/14292693>
llvm-svn: 206082
|
| |
|
|
|
|
| |
Based on a code review suggestion from Eric Christopher in r205990
llvm-svn: 206080
|
| |
|
|
|
|
|
|
|
| |
This patch adds patterns to generate the cls instruction ARM64. Includes tests
for 64 bit and 32 bit operands.
rdar://15611957
llvm-svn: 206079
|
| |
|
|
| |
llvm-svn: 206078
|
| |
|
|
|
|
|
|
|
|
|
| |
search option.
fexhaustive-register-search => exhaustive-register-search
'f' is a Clang thing!
This is related to PR18747.
llvm-svn: 206075
|
| |
|
|
|
|
|
|
|
|
|
| |
-fexhaustive-register-search option to allow an exhaustive search during last
chance recoloring.
This is related to PR18747
Patch by MAYUR PANDEY <mayur.p@samsung.com>.
llvm-svn: 206072
|
| |
|
|
|
|
|
|
|
|
| |
Through some oddity where truncate (sextload x) isn't folded into
an anyextload for vectors, the sextload remains if the
vector isn't immediately scalarized. This keeps the expected
zextload instructions in the kernel-args test when small type
vectors aren't scalarized.
llvm-svn: 206070
|