llvm-svn: 93776
llvm-svn: 93775
llvm-svn: 93774
fsub.ll and FileCheckify it.
llvm-svn: 93669
added to the FSub version. However, the original version of this xform guarded
against doing this for floating point (!Op0->getType()->isFPOrFPVector()).
This is causing LLVM to perform incorrect xforms for code like:
void func(double *rhi, double *rlo, double xh, double xl, double yh, double yl){
double mh, ml;
double c = 134217729.0;
double up, u1, u2, vp, v1, v2;
up = xh*c;
u1 = (xh - up) + up;
u2 = xh - u1;
vp = yh*c;
v1 = (yh - vp) + vp;
v2 = yh - v1;
mh = xh*yh;
ml = (((u1*v1 - mh) + (u1*v2)) + (u2*v1)) + (u2*v2);
ml += xh*yl + xl*yh;
*rhi = mh + ml;
*rlo = (mh - (*rhi)) + ml;
}
The last line was optimized away, but *rlo is intended to be the difference
between the infinitely precise result of mh + ml and that result after it has
been rounded to double precision.
llvm-svn: 93369
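A minimal standalone C sketch of why the floating-point guard matters (values invented, not from the commit): once mh + ml has been rounded into rhi, the expression (mh - rhi) + ml recovers the rounding error, so folding it algebraically to zero changes the result.
#include <stdio.h>

int main(void) {
    double mh = 1.0;                 /* high part of a product */
    double ml = 1e-20;               /* low part, far below 1 ulp of mh */
    double rhi = mh + ml;            /* rounds to exactly 1.0, ml is lost */
    double rlo = (mh - rhi) + ml;    /* recovers the lost ml; not 0.0 */
    printf("rhi = %.17g, rlo = %.17g\n", rhi, rlo);
    return 0;
}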
in JT.
2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go. This allows us to
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.
llvm-svn: 93253
llvm-svn: 93230
condition is a xor with a phi node. This eliminates nonsense
like this from 176.gcc in several places:
LBB166_84:
testl %eax, %eax
- setne %al
- xorb %cl, %al
- notb %al
- testb $1, %al
- je LBB166_85
+ je LBB166_69
+ jmp LBB166_85
This is rdar://7391699
llvm-svn: 93221
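A hypothetical C shape (not the actual 176.gcc code) that yields a branch on an xor of a phi; after this change, jump threading clones the small block so each predecessor jumps straight to its known destination instead of materializing the setne/xorb/notb sequence shown above.
int thread_me(int a, int flag) {
    int inverted;
    if (a)                        /* 'inverted' becomes a phi of constants */
        inverted = 1;
    else
        inverted = 0;
    if (inverted ^ (flag != 0))   /* branch condition is xor(phi, cond) */
        return 10;
    return 20;
}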
llvm-svn: 93206
good code on PR4216:
_test_bitfield: ## @test_bitfield
orl $32962, %edi
movl $4294941946, %eax
andq %rdi, %rax
ret
instead of:
_test_bitfield:
movl $4294941696, %ecx
movl %edi, %eax
orl $194, %edi
orl $32768, %eax
andq $250, %rdi
andq %rax, %rcx
movq %rdi, %rax
orq %rcx, %rax
ret
Evan is looking into the remaining andq+imm -> andl optimization.
llvm-svn: 93147
BitsToClear case. This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216)
and test3 (a random example extracted from a SPEC benchmark).
clang now compiles the code in PR4216 into:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
orl $32768, %edi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
instead of:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
shrl $8, %edi
orl $128, %edi
shlq $8, %rdi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
which is still not great, but is progress.
llvm-svn: 93145
new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant. This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.
llvm-svn: 93144
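A hedged C illustration of the BitsToClear idea (names and widths invented): an expression ending in an lshr-by-constant can be evaluated in the wider type as long as the bits the narrow shift would have discarded are cleared afterwards, since the promoted add may set bits above the narrow width.
#include <stdint.h>
#include <assert.h>

/* zext of a 16-bit computation that ends in an lshr by 8 */
uint32_t narrow_then_zext(uint16_t a, uint16_t b) {
    uint16_t sum = (uint16_t)(a + b);           /* 16-bit add, carry is lost */
    return (uint32_t)(sum >> 8);
}

/* the same value computed directly in 32 bits */
uint32_t promoted(uint16_t a, uint16_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;   /* bit 16 may now be set */
    return (sum >> 8) & 0xFFu;                  /* lshr, then clear the extra bits */
}

int main(void) {
    assert(narrow_then_zext(0xFFFFu, 1u) == promoted(0xFFFFu, 1u));
    assert(narrow_then_zext(0x1234u, 0x5678u) == promoted(0x1234u, 0x5678u));
    return 0;
}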
the dest of the sext.
llvm-svn: 93128
the zext dest type. This allows us to handle test52/53 in cast.ll,
and allows llvm-gcc to generate much better code for PR4216 in -m64
mode:
_test_bitfield: ## @test_bitfield
orl $32962, %edi
movl %edi, %eax
andl $-25350, %eax
ret
This also fixes a bug handling vector extends, ensuring that the
mask produced is a vector constant, not an integer constant.
llvm-svn: 93127
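A hedged C sketch of the underlying identity (mask value invented, not the PR4216 constants): because the mask has no bits above the narrow width, masking before or after the widening produces the same value, and masking after lets the whole or/and chain stay in the wide type.
#include <stdint.h>

uint64_t mask_then_widen(uint32_t x) { return (uint64_t)(x & 0x9CFAu); }
uint64_t widen_then_mask(uint32_t x) { return (uint64_t)x & 0x9CFAu; }
/* both functions return the same value for every x */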
elimination of a sign extend to be a win, which simplifies
the client of CanEvaluateSExtd, and allows us to eliminate
more casts (examples taken from real code).
llvm-svn: 93109
lshr+ashr instead of trunc+sext. We want to avoid type
conversions whenever possible, since it is easier to codegen
expressions without truncates and extensions.
llvm-svn: 93107
1) don't try to optimize a sext or zext that is only used by a trunc; let
the trunc get optimized first. This avoids some pointless effort in
some common cases since instcombine scans down a block in the first pass.
2) Change the cost model for zext elimination to consider an 'and' cheaper
than a zext. This allows us to do it more aggressively, and for the next
patch to simplify the code quite a bit.
llvm-svn: 93097
more expressions to be promoted and casts eliminated.
llvm-svn: 93096
so that unnamed blocks are handled.
llvm-svn: 93059
base is the right expression type. This fixes PR5981.
llvm-svn: 93045
the input is already sign extended.
llvm-svn: 93019
result int by 8 for the first byte. While normally harmless,
if the result is smaller than a byte, this shift is invalid.
llvm-svn: 93018
llvm-svn: 92964
llvm-svn: 92963
that feeds into a zext, similar to the patch I did yesterday for sext.
There is a lot of room for extension beyond this patch.
llvm-svn: 92962
to an element of a vector in a static ctor) which occurs with an
unrelated patch I'm testing. Annoyingly, EvaluateStoreInto basically
does exactly the same stuff as InsertElement constant folding, but it
now handles vectors, and you can't insertelement into a vector. It
would be 'really nice' if GEP into a vector were not legal.
llvm-svn: 92889
phi nodes when deciding which pointers point to local memory.
I actually checked long ago how useful this is, and it isn't
very: it hardly ever fires in the testsuite, but since Chris
wants it here it is!
llvm-svn: 92836
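A hypothetical C function of the kind this covers (not from the testsuite): after optimization the stored-through pointer is a phi (or select) of two allocas, so it still only points at local memory and the function's attributes need not be pessimized.
int only_local(int c) {
    int a = 1, b = 2;
    int *p;
    if (c)
        p = &a;      /* p becomes a phi of two allocas */
    else
        p = &b;
    *p = 5;          /* store through the phi still hits local memory */
    return a + b;
}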
memcpy, memset and other intrinsics that only access their arguments
to be readnone if the intrinsic's arguments all point to local memory.
This improves the testcase in the README to readonly, but it could in
theory be made readnone, however this would involve more sophisticated
analysis that looks through the memcpy.
llvm-svn: 92829
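A hypothetical function in the spirit of the description (not the actual README testcase): the memcpy's pointer arguments are both allocas, so under the new rule the call only touches local memory; the function still reads through its argument directly, which is what keeps it at readonly rather than readnone.
#include <string.h>

int copy_locally(const int *p) {
    int a[2] = { p[0], p[1] };   /* the only access to non-local memory */
    int b[2];
    memcpy(b, a, sizeof b);      /* both memcpy arguments point to locals */
    return b[0] - b[1];
}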
Previously, instcombine would only promote an expression tree to
the larger type if doing so eliminated two casts. This is because
it otherwise had to manually redo the sign extend after the promoted
expression tree with two shifts. Now, we keep track of whether the result of
the computation is going to be properly sign extended already. If
so, we can unconditionally promote the expression, which allows us
to zap more sext's.
This implements rdar://6598839 (aka gcc pr38751)
llvm-svn: 92815
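A hedged C illustration of the "already properly sign extended" case (widths invented): when the narrow operands are themselves sign extensions, the result computed directly in the wide type already equals the sext of the narrow result, so the final sext can be dropped without inserting the two compensating shifts.
#include <stdint.h>
#include <assert.h>

int32_t narrow_then_sext(int8_t a, int8_t b) {
    int16_t sum = (int16_t)((int16_t)a + (int16_t)b);  /* i16 add of sext'd i8s */
    return (int32_t)sum;                               /* sext i16 -> i32 */
}

int32_t promoted(int8_t a, int8_t b) {
    return (int32_t)a + (int32_t)b;                    /* same value, no casts of the sum */
}

int main(void) {
    assert(narrow_then_sext(-128, -128) == promoted(-128, -128));
    assert(narrow_then_sext(127, 127) == promoted(127, 127));
    return 0;
}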
test/CodeGen/X86, as it doesn't use -indvars, and it does use
llc -march=x86-64.
llvm-svn: 92799
llvm-svn: 92792
The only difference is that EvaluateInDifferentType checks to ensure
they are profitable before doing them :)
llvm-svn: 92788
llvm-svn: 92786
llvm-svn: 92784
llvm-svn: 92782
llvm-svn: 92781
llvm-svn: 92777
llvm-svn: 92776
llvm-svn: 92775
llvm-svn: 92745
llvm-svn: 92740
| |
leading/trailing bits. Patch by Alastair Lynn!
llvm-svn: 92706
llvm-svn: 92679
| |
Intrinsic::dbg_stoppoint
Intrinsic::dbg_region_start
Intrinsic::dbg_region_end
Intrinsic::dbg_func_start
AutoUpgrade simply ignores these intrinsics now.
llvm-svn: 92557
| |
when doing this transform if the GEP is not inbounds. No testcase because
it is very difficult to trigger this: instcombine already canonicalizes
GEP indices to pointer size, so it relies on specific permutations of the
instcombine worklist.
Thanks to Duncan for pointing this possible problem out.
llvm-svn: 92495
| |
on the example in PR4216. This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.
llvm-svn: 92458
| |
arrays of structs and other arrays, so long as all the subsequent
indexes are constants. This triggers frequently for stuff like:
@divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]*> [#uses=50]
%623 = getelementptr inbounds [29 x [2 x i32]]* @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1]
%684 = icmp eq i32 %683, 999
also for the "my_defs" table in 'gs', etc.
llvm-svn: 92444
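A hypothetical C shape of the pattern described (table invented, not the benchmark source): the variable index is on a middle dimension and the trailing index is a constant, so the compare against the loaded element can still be folded into a computation on the index.
static const int divisions[4][2] = { {0, 0}, {0, 1}, {1, 0}, {1, 2} };

int first_entry_is_one(unsigned i) {
    return divisions[i][0] == 1;   /* load from a constant global, constant last index */
}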
| |
occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
is copied in multiple apps) in _sch_istable, etc.
llvm-svn: 92427
| |
when a consecutive sequence of elements all satisfies the
predicate. Like the double compare case, this generates better
code than the magic constant case and generalizes to more than
32/64 element array lookups.
Here are some examples where it triggers. From 403.gcc, most
accesses to the rtx_class array are handled, e.g.:
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
%142 = icmp eq i8 %141, 105
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
%165 = icmp eq i8 %164, 60
Also, most of the 59-element arrays (mode_class/rid_to_yy, etc)
optimized before are actually range compares. This lets 32-bit
machines optimize them.
400.perlbmk has stuff like this:
400.perlbmk: PL_regkind, even for 32-bit:
@PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
%811 = icmp ne i8 %810, 33
@PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
%12 = icmp ult i8 %10, 2
etc.
llvm-svn: 92426
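A hypothetical example of the consecutive-run case (table invented): every element in positions 2 through 5 satisfies the predicate and nothing else does, so the load-and-compare can become a single range check on the index, roughly (i - 2) < 4, with no table access at all.
static const char kind[8] = { 'x', 'x', 'a', 'a', 'a', 'a', 'x', 'x' };

int is_a(unsigned i) {
    return kind[i] == 'a';   /* foldable to a range compare on i */
}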
| |
and.
llvm-svn: 92419