| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
lack sse2.
llvm-svn: 112175
|
| |
|
|
| |
llvm-svn: 112171
|
| |
|
|
| |
llvm-svn: 112155
|
| |
|
|
| |
llvm-svn: 112104
|
| |
|
|
| |
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This
affects two places in the code: handling cross block values and handling
function return values and arguments. Since vectors are already widened by
LegalizeTypes, this gives us much better code and unblocks x86-64 ABI
and SPU ABI work.
For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
  %B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
  %C = fadd <2 x float> %B, %B
  br label %BB
BB:
  %D = fadd <2 x float> %C, %C
  %E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
  ret <4 x float> %E
}
Now compiles into:
_test2:                                 ## @test2
## BB#0:
        addps   %xmm0, %xmm0
        addps   %xmm0, %xmm0
        ret
Previously it compiled into:
_test2:                                 ## @test2
## BB#0:
        addps   %xmm0, %xmm0
        pshufd  $1, %xmm0, %xmm1
        ## kill: XMM0<def> XMM0<kill> XMM0<def>
        insertps        $0, %xmm0, %xmm0
        insertps        $16, %xmm1, %xmm0
        addps   %xmm0, %xmm0
        ret
This implements rdar://8230384
llvm-svn: 112101
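The effect of widening rather than expanding can be modelled outside LLVM. The following is a minimal sketch in plain C++, not LLVM code: the types `V2` and `V4` and the helpers `widen`, `lowHalf`, `add2`, `add4`, `test2Narrow`, and `test2Widened` are hypothetical names chosen for illustration, and the upper lanes are zero-filled only to keep the sketch deterministic (in the IR above they are undef). It mirrors the two fadd operations in @test2 and suggests why two full-width adds are enough: the low two lanes are unaffected by whatever sits in the upper lanes.

```cpp
#include <array>

using V2 = std::array<float, 2>;  // models <2 x float>
using V4 = std::array<float, 4>;  // models <4 x float>

// Widen: keep the two defined lanes in the low half.
// The upper lanes are "don't care"; zero is used here for determinism.
V4 widen(const V2& v) { return V4{v[0], v[1], 0.0f, 0.0f}; }

// Recover the two lanes the original narrow value actually defines.
V2 lowHalf(const V4& v) { return V2{v[0], v[1]}; }

V2 add2(const V2& a, const V2& b) { return V2{a[0] + b[0], a[1] + b[1]}; }

V4 add4(const V4& a, const V4& b) {
  return V4{a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]};
}

// The computation from @test2 on the narrow type: C = B + B, D = C + C.
V2 test2Narrow(const V2& b) {
  V2 c = add2(b, b);
  return add2(c, c);
}

// The same computation carried out on the widened type (two full-width adds).
V2 test2Widened(const V2& b) {
  V4 c = add4(widen(b), widen(b));
  return lowHalf(add4(c, c));
}
```

Both functions compute the same two lanes for any input, which is the property the widened lowering relies on.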
|
| |
|
|
| |
llvm-svn: 112086
|
| |
|
|
| |
llvm-svn: 112085
|
| |
|
|
|
|
| |
no functionality change.
llvm-svn: 111994
|
| |
|
|
|
|
| |
functionality change.
llvm-svn: 111990
|
| |
|
|
| |
llvm-svn: 111982
|
| |
|
|
|
|
|
| |
hierarchy with virtual methods and using llvm_unreachable to properly indicate
unreachable states which would otherwise leave variables uninitialized.
llvm-svn: 111803
|
| |
|
|
|
|
|
|
|
|
|
| |
it involves specific floating-point types, legalize should expand an
extending load to a non-extending load followed by a separate extend operation.
For example, we currently expand SEXTLOAD to EXTLOAD+SIGN_EXTEND_INREG (and
assert that EXTLOAD should always be supported). Now we can expand that to
LOAD+SIGN_EXTEND. This is needed to allow vector SIGN_EXTEND and ZERO_EXTEND
to be used for NEON.
llvm-svn: 111586
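The two expansions of SEXTLOAD described above compute the same value. The following is a small sketch in plain C++ that models both on an i8 to i32 case; it is not SelectionDAG code, and the function names `sextViaExtLoad` and `sextViaPlainLoad` as well as the junk constant used for the unspecified upper bits are assumptions made for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Models the older expansion: EXTLOAD followed by SIGN_EXTEND_INREG.
std::int32_t sextViaExtLoad(const std::uint8_t* p) {
  // EXTLOAD leaves the upper 24 bits unspecified; junk bits stand in for that.
  std::uint32_t wide = 0xABCDEF00u | *p;
  // SIGN_EXTEND_INREG from i8: keep the low 8 bits and replicate bit 7 upward.
  std::uint32_t low = wide & 0xFFu;
  return static_cast<std::int32_t>(low ^ 0x80u) - 0x80;
}

// Models the newer expansion: a plain i8 LOAD followed by SIGN_EXTEND.
std::int32_t sextViaPlainLoad(const std::uint8_t* p) {
  std::int8_t narrow;
  std::memcpy(&narrow, p, sizeof narrow);    // non-extending load
  return static_cast<std::int32_t>(narrow);  // separate sign extension
}
```

The second form needs no extending load at all, which is what matters for a target that provides a standalone extend operation but no matching EXTLOAD.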
|
| |
|
|
|
|
| |
PR 7882. Follows suggestion by Amaury Pouly, thanks.
llvm-svn: 111306
|
| |
|
|
| |
llvm-svn: 111223
|
| |
|
|
| |
llvm-svn: 110649
|
| |
|
|
| |
llvm-svn: 110460
|
| |
|
|
| |
llvm-svn: 110410
|
| |
|
|
|
|
|
|
| |
address of the static ID member as the sole unique type identifier. Clean up
APIs related to this change.
llvm-svn: 110396
|
| |
|
|
| |
llvm-svn: 110183
|
| |
|
|
|
|
|
|
| |
Fixes potential ambiguity problems on VS 2010.
Patch by nobled!
llvm-svn: 110029
|
| |
|
|
|
|
| |
ISD::AND case of TargetLowering::SimplifyDemandedBits.
llvm-svn: 110019
|
| |
|
|
|
|
|
| |
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.
llvm-svn: 109854
|
| |
|
|
|
|
|
|
| |
ConstantFoldBIT_CONVERTofBUILD_VECTOR calling itself recursively and
returning a SCALAR_TO_VECTOR node, but assuming the input was always a
BUILD_VECTOR.
llvm-svn: 109519
|
| |
|
|
|
|
|
|
|
| |
protectors, to be near the stack protectors on the stack. Accomplish this by
tagging the stack object with a predicate that indicates that it would trigger
this. In the prolog-epilog inserter, assign these objects to the stack after the
stack protector but before the other objects.
llvm-svn: 109481
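The ordering described above (stack protector first, then the tagged objects, then everything else) can be sketched as a stable partition. This is a toy model in plain C++, not the prolog-epilog inserter itself: `StackObject`, its `nearProtector` flag, and `layoutOrder` are hypothetical names standing in for the predicate and the assignment step the commit message mentions.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct StackObject {
  std::string name;
  bool nearProtector;  // hypothetical stand-in for the predicate on the object
};

// Returns the order in which objects would be assigned to the stack:
// the protector slot, then tagged objects, then the remaining objects.
std::vector<std::string> layoutOrder(std::vector<StackObject> objects) {
  // stable_partition keeps the original relative order within each group.
  std::stable_partition(objects.begin(), objects.end(),
                        [](const StackObject& o) { return o.nearProtector; });
  std::vector<std::string> order{"<stack protector>"};
  for (const StackObject& o : objects) order.push_back(o.name);
  return order;
}
```

A stable partition is used so that the pass does not reorder objects within a group, which keeps the change to the layout as small as possible.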
|
| |
|
|
|
|
| |
enough to factor into scheduling priority. Eliminate it and add early exits to speed up scheduling.
llvm-svn: 109449
|
| |
|
|
| |
llvm-svn: 109415
|
| |
|
|
|
|
|
|
| |
parameter) may be used uninitialized in the callers of HighRegPressure.
llvm-svn: 109393
|
| |
|
|
| |
llvm-svn: 109383
|
| |
|
|
|
|
| |
those. Radar 8231572.
llvm-svn: 109367
|
| |
|
|
|
|
|
|
|
|
|
|
| |
appropriate for targets without detailed instruction itineraries.
The scheduler schedules for increased instruction-level parallelism in
low register pressure situations; it schedules to reduce register pressure
when the register pressure becomes high.
On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.
llvm-svn: 109300
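The policy in this entry (favor parallelism while register pressure is low, favor pressure reduction once it is high) can be sketched as a two-mode comparison. This is only an illustration in plain C++ and not the scheduler's actual priority function: `Candidate`, its fields, `scheduleBefore`, and the `regLimit` threshold are all hypothetical.

```cpp
struct Candidate {
  int latencyHeight;  // longer remaining critical path means more urgent
  int pressureDelta;  // net change in live registers if scheduled now
};

// Returns true when `a` should be scheduled before `b`.
bool scheduleBefore(const Candidate& a, const Candidate& b,
                    int liveRegs, int regLimit) {
  bool highPressure = liveRegs >= regLimit;
  // Under high pressure, prefer the node that frees (or uses fewer) registers.
  if (highPressure && a.pressureDelta != b.pressureDelta)
    return a.pressureDelta < b.pressureDelta;
  // Otherwise schedule for latency, i.e. for instruction-level parallelism.
  return a.latencyHeight > b.latencyHeight;
}
```

With a limit of 16 live registers, a long-latency node that increases pressure wins at 4 live registers but loses to a pressure-reducing node at 16.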
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
it's too late to start backing off aggressive latency scheduling when most
of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live-outs, extract_subreg, etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
For ARM, this is almost always a win in instruction count. It is runtime
neutral for most of the tests, but for some kernels with high register
pressure it can be a huge win, e.g. 464.h264ref reduced the number of spills
by 54 and sped up by 20%.
llvm-svn: 109279
|
| |
|
|
| |
llvm-svn: 109265
|
| |
|
|
|
|
| |
are not demanded. This often allows the anyext to be folded away.
llvm-svn: 109242
|
| |
|
|
| |
llvm-svn: 109234
|
| |
|
|
| |
llvm-svn: 109205
|
| |
|
|
| |
llvm-svn: 109122
|
| |
|
|
| |
llvm-svn: 109103
|
| |
|
|
| |
llvm-svn: 109083
|
| |
|
|
| |
llvm-svn: 109082
|
| |
|
|
| |
llvm-svn: 109079
|
| |
|
|
| |
llvm-svn: 109064
|
| |
|
|
| |
llvm-svn: 108991
|
| |
|
|
|
|
|
| |
update the current basic block in addition to the current insert
position, so that they remain consistent. This fixes rdar://8204072.
llvm-svn: 108765
|
| |
|
|
|
|
| |
its scalar floating point registers alias its vector registers.
llvm-svn: 108761
|
| |
|
|
|
|
|
|
| |
for legal value types. A "representative" register class is the largest
legal super-reg register class for a value type. For example, on i386, GR32
is the representative register class for i8 / i16 / i32; on x86_64 it would
be GR64.
This property will be used by the register pressure tracking instruction scheduler.
llvm-svn: 108735
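The definition above can be illustrated with a toy model that walks up a chain of super-register classes and keeps the largest legal one. This is a sketch in plain C++ under simplifying assumptions, not the TargetLowering code: `RegClass`, its `superClass` index, the `legal` flag, and `representativeClass` are hypothetical, and a real target's register class hierarchy is not a single linear chain.

```cpp
#include <string>
#include <vector>

struct RegClass {
  std::string name;
  int superClass;  // index of the next larger super-register class, or -1
  bool legal;      // whether the target can hold values in this class
};

// Walk up the super-register chain from `start` and return the name of the
// largest legal class encountered (or the starting class if none is larger).
std::string representativeClass(const std::vector<RegClass>& classes,
                                int start) {
  int rep = start;
  for (int i = classes[start].superClass; i != -1; i = classes[i].superClass)
    if (classes[i].legal) rep = i;
  return classes[rep].name;
}
```

In this model, marking GR64 as not legal reproduces the i386 answer (GR32) and marking it legal reproduces the x86_64 answer (GR64).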
|
| |
|
|
| |
llvm-svn: 108688
|
| |
|
|
|
|
|
|
|
|
| |
conversions around sqrt instructions.
I am assured by people more knowledgeable than me that there are no rounding issues in eliminating this.
This fixed <rdar://problem/8197504>.
llvm-svn: 108639
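The rounding claim can be spot-checked. The sketch below assumes that the conversions in question are a float-to-double extension before the square root and a double-to-float truncation after it; the first line of this entry is not shown here, so that reading is an assumption. It is plain C++, the names `sqrtViaDouble`, `sqrtDirect`, and `agreeOnSamples` are hypothetical, and a sampled comparison is only evidence, not a proof.

```cpp
#include <cmath>

// Extend to double, take the square root, truncate back to float.
float sqrtViaDouble(float x) {
  return static_cast<float>(std::sqrt(static_cast<double>(x)));
}

// Single-precision square root with no conversions (float overload).
float sqrtDirect(float x) { return std::sqrt(x); }

// Compare the two forms over a range of non-negative sample inputs.
bool agreeOnSamples() {
  for (int i = 0; i <= 100000; ++i) {
    float x = static_cast<float>(i) * 0.37f;
    if (sqrtViaDouble(x) != sqrtDirect(x)) return false;
  }
  return true;
}
```

On a platform with correctly rounded IEEE 754 square roots, the two forms are expected to agree, because double carries more than twice the significand bits of float.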
|
| |
|
|
|
|
|
|
| |
information.
No functional change yet.
llvm-svn: 108583
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
since it doesn't work for front-ends which don't emit column information
(which includes llvm-gcc in its present configuration), and doesn't
work for clang for K&R style variables where the variables are declared
in a different order from the parameter list.
Instead, make a separate pass through the instructions to collect the
llvm.dbg.declare instructions in order. This ensures that the debug
information for variables is emitted in this order.
llvm-svn: 108538
|
| |
|
|
|
|
|
| |
because it's more likely to keep debug line information in its original
order.
llvm-svn: 108496
|