llvm-svn: 24491

llvm-svn: 24488

llvm-svn: 24487

half the problem.
llvm-svn: 24414

compiling mysql reported by Ted Kremenek.
llvm-svn: 24402

llvm-svn: 24288

llvm-svn: 24270

Reg2Mem
for fun, you can run: opt -reg2mem -mem2reg
llvm-svn: 24267

Add support for specifying alignment and size of setjmp jmpbufs.
No targets currently do anything with this information, nor is it preserved
in the bytecode representation. That's coming up next.
llvm-svn: 24196
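
For context, a minimal C sketch of what a jmpbuf is for (illustrative only,
nothing here is from the commit): setjmp fills the buffer so a later longjmp
can restore control, which is why its size and alignment are inherently
target-specific.

#include <setjmp.h>
#include <stdio.h>

static jmp_buf env;          /* the jmpbuf whose size/alignment is target-specific */

static void fail(void) {
    longjmp(env, 1);         /* unwind back to the matching setjmp */
}

int main(void) {
    if (setjmp(env) == 0) {  /* returns 0 on the initial call */
        fail();
    } else {                 /* returns nonzero when longjmp comes back */
        puts("recovered via longjmp");
    }
    return 0;
}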

that has been sitting in my inbox since May 18. :)
llvm-svn: 24194

a few times in crafty:
OLD: %tmp.36 = div int %tmp.35, 8 ; <int> [#uses=1]
NEW: %tmp.36 = div uint %tmp.35, 8 ; <uint> [#uses=0]
OLD: %tmp.19 = div int %tmp.18, 8 ; <int> [#uses=1]
NEW: %tmp.19 = div uint %tmp.18, 8 ; <uint> [#uses=0]
OLD: %tmp.117 = div int %tmp.116, 8 ; <int> [#uses=1]
NEW: %tmp.117 = div uint %tmp.116, 8 ; <uint> [#uses=0]
OLD: %tmp.92 = div int %tmp.91, 8 ; <int> [#uses=1]
NEW: %tmp.92 = div uint %tmp.91, 8 ; <uint> [#uses=0]
Which all turn into shrs.
llvm-svn: 24190
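
As a sketch of why this pays off (illustrative C, not from the commit):
signed division truncates toward zero while an arithmetic shift rounds toward
minus infinity, so div int X, 8 needs fixup code around the shift; once X is
provably non-negative the divide can be done unsigned, and an unsigned divide
by 8 is exactly a logical shift right by 3.

#include <assert.h>

/* Signed division truncates toward zero, so for negative x,
   x / 8 differs from x >> 3 (the shift rounds toward -infinity
   on the usual two's complement targets). */
int div_signed(int x) { return x / 8; }

/* Once x is known non-negative the divide can be unsigned, and an
   unsigned divide by 8 is exactly a logical shift right by 3. */
unsigned div_unsigned(unsigned x) { return x / 8; }

int main(void) {
    assert(div_signed(-9) == -1);           /* truncates toward zero */
    assert(div_unsigned(72) == (72u >> 3)); /* identical for non-negative x */
    return 0;
}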

8 times in vortex, allowing the srems to be turned into shrs:
OLD: %tmp.104 = rem int %tmp.5.i37, 16 ; <int> [#uses=1]
NEW: %tmp.104 = rem uint %tmp.5.i37, 16 ; <uint> [#uses=0]
OLD: %tmp.98 = rem int %tmp.5.i24, 16 ; <int> [#uses=1]
NEW: %tmp.98 = rem uint %tmp.5.i24, 16 ; <uint> [#uses=0]
OLD: %tmp.91 = rem int %tmp.5.i19, 8 ; <int> [#uses=1]
NEW: %tmp.91 = rem uint %tmp.5.i19, 8 ; <uint> [#uses=0]
OLD: %tmp.88 = rem int %tmp.5.i14, 8 ; <int> [#uses=1]
NEW: %tmp.88 = rem uint %tmp.5.i14, 8 ; <uint> [#uses=0]
OLD: %tmp.85 = rem int %tmp.5.i9, 1024 ; <int> [#uses=2]
NEW: %tmp.85 = rem uint %tmp.5.i9, 1024 ; <uint> [#uses=0]
OLD: %tmp.82 = rem int %tmp.5.i, 512 ; <int> [#uses=2]
NEW: %tmp.82 = rem uint %tmp.5.i1, 512 ; <uint> [#uses=0]
OLD: %tmp.48.i = rem int %tmp.5.i.i161, 4 ; <int> [#uses=1]
NEW: %tmp.48.i = rem uint %tmp.5.i.i161, 4 ; <uint> [#uses=0]
OLD: %tmp.20.i2 = rem int %tmp.5.i.i, 4 ; <int> [#uses=1]
NEW: %tmp.20.i2 = rem uint %tmp.5.i.i, 4 ; <uint> [#uses=0]
It also occurs 9 times in gcc, but with odd constant divisors (1009 and 61),
so the payoff isn't as great.
llvm-svn: 24189
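
Same idea in sketch form (illustrative C, not from the commit): an unsigned
remainder by a power of two is a single AND mask, while the signed form can
yield negative results and needs extra correction code.

#include <assert.h>

/* For unsigned values, x % 16 keeps just the low four bits. */
unsigned rem_unsigned(unsigned x) { return x % 16; }

/* For signed values the result can be negative (-18 % 16 == -2 in C99),
   so a plain mask would be wrong and fixup code is required. */
int rem_signed(int x) { return x % 16; }

int main(void) {
    assert(rem_unsigned(50) == (50u & 15u)); /* 2 either way */
    assert(rem_signed(-18) == -2);           /* not (-18 & 15), which is 14 */
    return 0;
}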

/Regression/Transforms/InstCombine/add.ll
llvm-svn: 24158

bad cases. This fixes Markus's second testcase in PR639, and should
seal it for good.
llvm-svn: 24123

infrastructure and the simple isels have been removed.
llvm-svn: 24090

This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1)
llvm-svn: 24081
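
A hedged C analogue of the arithmetic (the function names are hypothetical,
and a 4-byte int is assumed): the byte count 4*x+4 factors as
sizeof(int)*(x+1), which is what justifies rewriting the raw byte allocation
as a typed one.

#include <stdlib.h>

/* Byte-counted form, as a C front end might emit it:
   4*x + 4 bytes, later used as ints. */
int *make_table_bytes(unsigned x) {
    return (int *)malloc(4 * x + 4);
}

/* Typed form, assuming sizeof(int) == 4:
   4*x + 4 == sizeof(int) * (x + 1), i.e. "malloc int, (x+1)". */
int *make_table_typed(unsigned x) {
    return (int *)malloc(sizeof(int) * (x + 1));
}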

change.
llvm-svn: 24076

llvm-svn: 24056

PR640
llvm-svn: 24046

llvm-svn: 24033

into: malloc int, (2*X)
llvm-svn: 24032

(malloc [25 x int]) directly without having to convert to
(malloc [100 x sbyte]) first.
llvm-svn: 24031

llvm-svn: 24030

fixes a very slow compile in PR639.
llvm-svn: 24011

llvm-svn: 23998

one use (but one is a cast). This handles the very common case of:
X = alloc [n x byte]
Y = cast X to somethingbetter
seteq X, null
In order to avoid infinite looping when there are multiple casts, we only
allow this if the xform is strictly increasing the alignment of the
allocation.
llvm-svn: 23961

where the second requires less alignment. If we had explicit alignment
support in the IR, we could handle this case, but we can't until we do.
llvm-svn: 23960

more effective at promoting these allocations, catching them earlier in the
compile process.
llvm-svn: 23959

llvm-svn: 23958

llvm-svn: 23940

This should speed up build times.
llvm-svn: 23933

code
llvm-svn: 23931

pointer marking the end of the list, the zero *must* be cast to the pointer
type. An un-cast zero is a 32-bit int, and at least on x86_64, gcc will
not extend the zero to 64 bits, thus allowing the upper 32 bits to be
random junk.
The new END_WITH_NULL macro may be used to annotate such a function
so that GCC (version 4 or newer) will detect the use of an un-cast zero
at compile time.
llvm-svn: 23888
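
A minimal C sketch of the bug class (print_all is hypothetical, and the
compile-time check is presumably GCC's sentinel attribute under the hood):

#include <stdarg.h>
#include <stdio.h>

/* A varargs function that expects a null-pointer sentinel. */
static void print_all(const char *first, ...) {
    va_list ap;
    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        puts(s);
    va_end(ap);
}

int main(void) {
    /* Correct: the cast makes the sentinel a full pointer argument. */
    print_all("a", "b", (const char *)NULL);

    /* Broken on x86_64: a bare 0 is a 32-bit int, so the callee reads
       a "pointer" whose upper 32 bits may be junk:
       print_all("a", "b", 0);
       Annotating print_all with END_WITH_NULL lets GCC 4 flag this. */
    return 0;
}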

allow pointer types.
llvm-svn: 23859

inner loop like this:
LBB_RateConvertMono8AltiVec_2: ; no_exit
lis r2, ha16(.CPI_RateConvertMono8AltiVec_0)
lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2)
fmr f3, f3
fadd f0, f2, f0
fadd f3, f0, f3
fcmpu cr0, f3, f1
bge cr0, LBB_RateConvertMono8AltiVec_2 ; no_exit
to an inner loop like this:
LBB_RateConvertMono8AltiVec_1: ; no_exit
fsub f2, f2, f1
fcmpu cr0, f2, f1
fmr f0, f2
bge cr0, LBB_RateConvertMono8AltiVec_1 ; no_exit
Doh! good catch!
llvm-svn: 23838

llvm-svn: 23773

llvm-svn: 23772

llvm-svn: 23771

out CSE's of base expressions, it could build a result whose order was
nondeterministic.
llvm-svn: 23698

from the end of a vector instead of the beginning
llvm-svn: 23697

llvm-svn: 23695

llvm-svn: 23677

llvm-svn: 23674

llvm-svn: 23673

IV strides depended on the pointer order of the strides in memory.
Non-determinism is bad.
llvm-svn: 23672

llvm-svn: 23656

particular, it should realize that phi's use their values in the pred block,
not the phi block itself. This change turns our em3d loop from this:
_test:
cmpwi cr0, r4, 0
bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
LBB_test_1: ; entry.loopexit_crit_edge
li r2, 0
b LBB_test_6 ; loopexit
LBB_test_2: ; entry.no_exit_crit_edge
li r6, 0
LBB_test_3: ; no_exit
or r2, r6, r6
lwz r6, 0(r3)
cmpw cr0, r6, r5
beq cr0, LBB_test_6 ; loopexit
LBB_test_4: ; endif
addi r3, r3, 4
addi r6, r2, 1
cmpw cr0, r6, r4
blt cr0, LBB_test_3 ; no_exit
LBB_test_5: ; endif.loopexit.loopexit_crit_edge
addi r3, r2, 1
blr
LBB_test_6: ; loopexit
or r3, r2, r2
blr
into:
_test:
cmpwi cr0, r4, 0
bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
LBB_test_1: ; entry.loopexit_crit_edge
li r2, 0
b LBB_test_5 ; loopexit
LBB_test_2: ; entry.no_exit_crit_edge
li r6, 0
LBB_test_3: ; no_exit
lwz r2, 0(r3)
cmpw cr0, r2, r5
or r2, r6, r6
beq cr0, LBB_test_5 ; loopexit
LBB_test_4: ; endif
addi r3, r3, 4
addi r6, r6, 1
cmpw cr0, r6, r4
or r2, r6, r6
blt cr0, LBB_test_3 ; no_exit
LBB_test_5: ; loopexit
or r3, r2, r2
blr
Unfortunately, this is actually worse code, because the register coalescer
is getting confused somehow. If it were doing its job right, it could turn the
code into this:
_test:
cmpwi cr0, r4, 0
bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge
LBB_test_1: ; entry.loopexit_crit_edge
li r6, 0
b LBB_test_5 ; loopexit
LBB_test_2: ; entry.no_exit_crit_edge
li r6, 0
LBB_test_3: ; no_exit
lwz r2, 0(r3)
cmpw cr0, r2, r5
beq cr0, LBB_test_5 ; loopexit
LBB_test_4: ; endif
addi r3, r3, 4
addi r6, r6, 1
cmpw cr0, r6, r4
blt cr0, LBB_test_3 ; no_exit
LBB_test_5: ; loopexit
or r3, r6, r6
blr
... which I'll work on next. :)
llvm-svn: 23604

llvm-svn: 23603

memoizing code when IV's are used by phinodes outside of loops. In a simple
example, we were getting this code before (note that r6 and r7 are isomorphic
IV's):
li r6, 0
or r7, r6, r6
LBB_test_3: ; no_exit
lwz r2, 0(r3)
cmpw cr0, r2, r5
or r2, r7, r7
beq cr0, LBB_test_5 ; loopexit
LBB_test_4: ; endif
addi r2, r7, 1
addi r7, r7, 1
addi r3, r3, 4
addi r6, r6, 1
cmpw cr0, r6, r4
blt cr0, LBB_test_3 ; no_exit
Now we get:
li r6, 0
LBB_test_3: ; no_exit
or r2, r6, r6
lwz r6, 0(r3)
cmpw cr0, r6, r5
beq cr0, LBB_test_6 ; loopexit
LBB_test_4: ; endif
addi r3, r3, 4
addi r6, r2, 1
cmpw cr0, r6, r4
blt cr0, LBB_test_3 ; no_exit
This was noticed in em3d.
llvm-svn: 23602

check the presplit pred, not the post-split pred. This was causing us
to make the wrong decision in some cases, leaving the critical edge block
in the loop.
llvm-svn: 23601