the old value.
llvm-svn: 109567
This is about 4x faster and smaller than the existing scalarization.
llvm-svn: 109566
This is still not perfect, but better than it was before.
llvm-svn: 109563
add instead a CallSite(Value* V) constructor that is consistent with ImmutableCallSite
and use that one in client code
llvm-svn: 109553
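The pattern this describes can be illustrated without LLVM itself. In the sketch below every type is an invented stand-in (the real Value, CallInst, CallSite, and ImmutableCallSite classes are far richer): a constructor that accepts any Value* but leaves the wrapper null unless the value is actually a call-like instruction.

```cpp
#include <cassert>

// Stand-in types for illustration only; not the real LLVM classes.
struct Value { virtual ~Value() = default; };
struct CallInst : Value {};
struct StoreInst : Value {};

class CallSiteLike {
  Value *I = nullptr;

public:
  // Accepts any Value*, like ImmutableCallSite(const Value*): the wrapper
  // becomes non-null only when V really is a call instruction.
  explicit CallSiteLike(Value *V) {
    if (dynamic_cast<CallInst *>(V))
      I = V;
  }
  explicit operator bool() const { return I != nullptr; }
};
```

Client code can then test the wrapper directly instead of checking the instruction kind up front.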
types coming in future patches.
For:
define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
%shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1]
%tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1]
ret <2 x i64> %tmp2
}
We get:
_shl: ## @shl
pslld $23, %xmm1
paddd LCPI0_0, %xmm1
cvttps2dq %xmm1, %xmm1
pmulld %xmm1, %xmm0
ret
Instead of:
_shl: ## @shl
pshufd $3, %xmm0, %xmm2
movd %xmm2, %eax
pshufd $3, %xmm1, %xmm2
movd %xmm2, %ecx
shll %cl, %eax
movd %eax, %xmm2
pshufd $1, %xmm0, %xmm3
movd %xmm3, %eax
pshufd $1, %xmm1, %xmm3
movd %xmm3, %ecx
shll %cl, %eax
movd %eax, %xmm3
punpckldq %xmm2, %xmm3
movd %xmm0, %eax
movd %xmm1, %ecx
shll %cl, %eax
movd %eax, %xmm2
movhlps %xmm0, %xmm0
movd %xmm0, %eax
movhlps %xmm1, %xmm1
movd %xmm1, %ecx
shll %cl, %eax
movd %eax, %xmm0
punpckldq %xmm0, %xmm2
movdqa %xmm2, %xmm0
punpckldq %xmm3, %xmm0
ret
llvm-svn: 109549
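What makes the short sequence work: x << a equals x * 2^a, and 2^a can be built per 32-bit lane with float tricks. pslld $23 moves each shift amount into the float exponent field, paddd adds LCPI0_0 (presumably 1.0f, i.e. 0x3f800000, in every lane) so the exponent becomes 127 + a, and cvttps2dq truncates the resulting float 2^a back to an integer for pmulld. A scalar C++ model of one lane (my sketch, not LLVM's code):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// One lane of the vector trick: build 2^amount by constructing the IEEE-754
// single-precision bit pattern with exponent 127 + amount, then truncating.
uint32_t pow2_via_float(uint32_t amount) { // amount in [0, 31]
  uint32_t bits = (amount << 23) + 0x3f800000u; // pslld $23; paddd (bias for 1.0f)
  float f;
  std::memcpy(&f, &bits, sizeof f);             // reinterpret the lane as float
  return static_cast<uint32_t>(f);              // cvttps2dq: truncate to integer
}
```

The final pmulld then computes x * pow2_via_float(a) in each lane, which equals x << a.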
problem in CallSiteBase is fixed
llvm-svn: 109547
llvm-svn: 109538
llvm-svn: 109525
ConstantFoldBIT_CONVERTofBUILD_VECTOR calling itself
recursively and returning a SCALAR_TO_VECTOR node, but assuming the input was always a BUILD_VECTOR.
llvm-svn: 109519
llvm-svn: 109513
llvm-svn: 109511
llvm-svn: 109510
llvm-svn: 109509
llvm-svn: 109508
llvm-svn: 109506
llvm-svn: 109504
llvm-svn: 109503
llvm-svn: 109502
llvm-svn: 109500
Also fix some comments.
llvm-svn: 109499
getMaxRegionExit returns the exit of the maximal refined region starting
at a specific basic block.
llvm-svn: 109496
are still on the list. This might happen if a CallbackVH created some new value
handles for the old value when doing RAUW. Barf if it occurs, since it is almost
certainly a mistake.
llvm-svn: 109495
llvm-svn: 109494
* contains(Loop)
* getOutermostLoop()
* Improve getNameStr() to return a sensible name if basic blocks are not named.
llvm-svn: 109490
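A guess at what helpers with these names might look like, using an invented minimal Loop struct (the real LLVM classes carry much more state): contains walks the parent chain of the candidate loop, and getOutermostLoop() follows parent links to the top level.

```cpp
#include <cassert>

// Invented stand-in for illustration; not LLVM's Loop class.
struct Loop {
  Loop *parent = nullptr;

  // True if this loop is Other or an ancestor of Other.
  bool contains(const Loop *Other) const {
    for (const Loop *L = Other; L; L = L->parent)
      if (L == this)
        return true;
    return false;
  }

  // Walk parent links until the top-level loop is reached.
  Loop *getOutermostLoop() {
    Loop *L = this;
    while (L->parent)
      L = L->parent;
    return L;
  }
};
```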
subregister operands like this:
%reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8)
Make them return false when subreg operands are present. VirtRegRewriter is
making bad assumptions otherwise.
This fixes PR7713.
llvm-svn: 109489
with a too-big register class.
llvm-svn: 109488
protectors, to be near the stack protectors on the stack. Accomplish this by
tagging the stack object with a predicate that indicates that it would trigger
this. In the prolog-epilog inserter, assign these objects to the stack after the
stack protector but before the other objects.
llvm-svn: 109481
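The placement rule amounts to a three-way ranking: the protector slot itself, then objects tagged by the new predicate, then everything else. A hedged sketch with invented names (the real code assigns frame offsets in the prolog-epilog inserter, not by sorting a vector):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Invented stand-in for a frame object; not MachineFrameInfo.
struct StackObj {
  std::string name;
  bool isProtector;       // the stack protector slot itself
  bool mayNeedProtection; // the predicate tag described in the patch
};

// Order objects so tagged ones sit right after the protector slot.
void layoutNearProtector(std::vector<StackObj> &objs) {
  auto rank = [](const StackObj &o) {
    if (o.isProtector) return 0;
    if (o.mayNeedProtection) return 1;
    return 2;
  };
  std::stable_sort(objs.begin(), objs.end(),
                   [&](const StackObj &a, const StackObj &b) {
                     return rank(a) < rank(b);
                   });
}
```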
rewrite instructions for live range splitting.
Still work in progress.
llvm-svn: 109469
llvm-svn: 109468
llvm-svn: 109462
llvm-svn: 109458
llvm-svn: 109456
exception handling. Also fix an extra underscore typo in one instance of
"__ARM_EABI__". Radar 8236264.
llvm-svn: 109451
llvm-svn: 109450
enough to factor into scheduling priority. Eliminate it and add early exits to speed up scheduling.
llvm-svn: 109449
llvm-svn: 109448
we are using AVX and no AVX version of the desired instruction is present;
this is better for incremental dev (without fallbacks it's easier to spot
what's missing). Not sure this is the best hack, though (we can also disable
all HasSSE* predicates by dynamically marking them 'false' if AVX is present).
llvm-svn: 109434
Disabled for now.
llvm-svn: 109424
This assumption is not satisfied due to global merging.
Work around the issue by temporarily disabling merging of const globals.
Also, ignore LLVM "special" globals. This fixes PR7716.
llvm-svn: 109423
llvm-svn: 109421
it inserted rather than using LoopInfo::getCanonicalInductionVariable to
rediscover it, since that doesn't work on non-canonical loops. This fixes
infinite recursion on such loops; PR7562.
llvm-svn: 109419
llvm-svn: 109415
dependence on DominanceFrontier. Instead, add an explicit DominanceFrontier
pass in StandardPasses.h to ensure that it gets scheduled at the right
time.
Declare that loop unrolling preserves ScalarEvolution, and shuffle some
getAnalysisUsages.
This eliminates one LoopSimplify and one LCSSA run in the standard
compile opts sequence.
llvm-svn: 109413
llvm-svn: 109412
don't visit all blocks in the function, and don't iterate over the split blocks'
predecessor lists for each block visited.
Also, remove the special-case test for the entry block. Splitting the entry
block isn't common enough to make this worthwhile.
This fixes a major compile-time bottleneck which is exposed now that
LoopSimplify isn't being redundantly run both before and after
DominanceFrontier.
llvm-svn: 109408
llvm-svn: 109405
llvm-svn: 109404
llvm-svn: 109403
llvm-svn: 109402
explicit inequality check.
llvm-svn: 109401