| Commit message | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 40908
|
| |
|
|
|
|
|
|
|
|
|
| |
SSE mode (all but conversions <-> other FP types, I think):
>>Do not mark all-80-bit operations as "Requires[FPStack]"
(which really means "not SSE").
>>Refactor load-and-extend to facilitate this.
>>Update comments.
>>Handle long double in SSE when computing FP_REG_KILL.
llvm-svn: 40906
|
| |
|
|
| |
llvm-svn: 40903
|
| |
|
|
| |
llvm-svn: 40898
|
| |
|
|
| |
llvm-svn: 40897
|
| |
|
|
| |
llvm-svn: 40896
|
| |
|
|
| |
llvm-svn: 40887
|
| |
|
|
|
|
|
|
| |
Last x87 bits for full functionality (not
thoroughly tested, and long doubles do not work
in SSE modes at all - use -mcpu=i486 for now)
llvm-svn: 40886
|
| |
|
|
|
|
| |
on 403.gcc from ~15s to ~10s.
llvm-svn: 40884
|
| |
|
|
| |
llvm-svn: 40883
|
| |
|
|
|
|
| |
This brings GVN to parity with GCSE+LoadVN.
llvm-svn: 40882
|
| |
|
|
| |
llvm-svn: 40881
|
| |
|
|
| |
llvm-svn: 40878
|
| |
|
|
|
|
|
| |
(on Darwin, anyway). Fix some table omissions for
LD arithmetic.
llvm-svn: 40877
|
| |
|
|
| |
llvm-svn: 40875
|
| |
|
|
| |
llvm-svn: 40874
|
| |
|
|
|
|
| |
information for overloaded intrinsics (PR1600). This resolves that issue, and improves the matching scheme to use a BitVector rather than a binary search.
llvm-svn: 40872
|
| |
|
|
| |
llvm-svn: 40870
|
| |
|
|
| |
llvm-svn: 40868
|
| |
|
|
|
|
| |
condition. Fixes 1597.
llvm-svn: 40867
|
| |
|
|
|
|
| |
comparison. Fixes bug 1598.
llvm-svn: 40866
|
| |
|
|
|
|
| |
introduced by Chandler's patch.
llvm-svn: 40864
|
| |
|
|
| |
llvm-svn: 40863
|
| |
|
|
| |
llvm-svn: 40861
|
| |
|
|
| |
llvm-svn: 40859
|
| |
|
|
|
|
|
|
|
|
|
|
| |
2. Make domtree printing print dfin/dfout #'s
3. Fix the Transforms/LoopSimplify/2004-04-13-LoopSimplifyUpdateDomFrontier.ll failure from last night (in DominanceFrontier::splitBlock).
w.r.t. #3, my patches last night happened to expose the bug, but this
has been broken since Owen's r35839 patch to LoopSimplify. The code
was subsequently moved over from LoopSimplify into Dominators, carrying
the latent bug. Fun stuff.
llvm-svn: 40858
|
| |
|
|
| |
llvm-svn: 40854
|
| |
|
|
|
|
| |
actual argument name of the documented function.
llvm-svn: 40851
|
| |
|
|
| |
llvm-svn: 40850
|
| |
|
|
| |
llvm-svn: 40849
|
| |
|
|
|
|
| |
Lots of problems yet but some simple things work.
llvm-svn: 40847
|
| |
|
|
| |
llvm-svn: 40843
|
| |
|
|
|
|
|
|
| |
This shrinks it down to something small. On the testcase
from PR1432, this speeds up instcombine from 0.7959s to 0.5000s
(59%).
llvm-svn: 40840
|
| |
|
|
|
|
|
| |
which dynamically allocates the string result. This speeds up dse on the
testcase from PR1432 from 0.3781s to 0.1804s (2.1x).
llvm-svn: 40838
|
| |
|
|
|
|
|
|
|
|
| |
contents of the set were small, deallocate and shrink the set. This
avoids having to memset as much data, significantly speeding up
some pathological cases. For example, this speeds up the verifier
from 0.3899s to 0.0763s (5.1x) on the testcase from PR1432 in a
release build.
llvm-svn: 40837
|
| |
|
|
| |
llvm-svn: 40830
|
| |
|
|
|
|
| |
domtree by 10% and postdomtree by 17%
llvm-svn: 40829
|
| |
|
|
|
|
| |
a SmallPtrSet. This speeds up domtree by about 15% and postdomtree by 20%.
llvm-svn: 40828
|
| |
|
|
|
|
|
|
| |
speeds up idom by about 45% and postidom by about 33%.
Some extra precautions must be taken not to invalidate DenseMap iterators.
llvm-svn: 40827
|
| |
|
|
|
|
|
|
| |
DenseMap instead of an std::map. This speeds up postdomtree
by about 25% and domtree by about 23%. It also speeds up clients,
for example, domfrontier by 11%, mem2reg by 4% and ADCE by 6%.
llvm-svn: 40826
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the old way, we computed and inserted phi nodes for the whole IDF of
the definitions of the alloca, then computed which ones were dead and
removed them.
In the new method, we first compute the region where the value is live,
and use that information to only insert phi nodes that are live. This
eliminates the need to compute liveness later, and stops the algorithm
from inserting a bunch of phis which it then later removes.
This speeds up the testcase in PR1432 from 2.00s to 0.15s (14x) in a
release build and 6.84s->0.50s (14x) in a debug build.
llvm-svn: 40825
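The live-region idea described in the message above can be sketched roughly as follows. This is only an illustration in Python, not the mem2reg code from this commit: the CFG encoding (a dict of successor lists) and the helper names `immediate_dominators`, `dominance_frontiers`, `live_in_blocks` and `pruned_phi_blocks` are all hypothetical, and every block is assumed to be reachable from the entry block. It first computes the blocks where the value is live on entry, then walks the iterated dominance frontier of the definitions and keeps a phi only where the value is live.

```python
from collections import deque

def immediate_dominators(succs, entry):
    """Iterative dominator sets; adequate for small illustrative CFGs."""
    nodes = list(succs)
    preds = {n: [] for n in nodes}
    for n, ss in succs.items():
        for s in ss:
            preds[s].append(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            ps = [dom[p] for p in preds[n]]
            new = ({n} | set.intersection(*ps)) if ps else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    # The immediate dominator is the strict dominator with the largest
    # dominator set, i.e. the one closest to n in the dominator tree.
    idom = {}
    for n in nodes:
        strict = dom[n] - {n}
        idom[n] = max(strict, key=lambda d: len(dom[d])) if strict else None
    return idom, preds

def dominance_frontiers(succs, entry):
    """Walk up from each predecessor until reaching the block's idom."""
    idom, preds = immediate_dominators(succs, entry)
    df = {n: set() for n in succs}
    for n in succs:
        for p in preds[n]:
            runner = p
            while runner is not None and runner != idom[n]:
                df[runner].add(n)
                runner = idom[runner]
    return df, preds

def live_in_blocks(preds, def_blocks, use_blocks):
    """Blocks where the value is live on entry.

    use_blocks: blocks that read the value before any redefinition.
    """
    live, work = set(), deque(use_blocks)
    while work:
        b = work.popleft()
        if b in live:
            continue
        live.add(b)
        for p in preds[b]:
            if p not in def_blocks:  # a definition in p ends the live range
                work.append(p)
    return live

def pruned_phi_blocks(succs, entry, def_blocks, use_blocks):
    """Iterated dominance frontier of the defs, restricted to live-in blocks."""
    df, preds = dominance_frontiers(succs, entry)
    live = live_in_blocks(preds, def_blocks, use_blocks)
    phis, work = set(), deque(def_blocks)
    while work:
        b = work.popleft()
        for f in df[b]:
            if f in live and f not in phis:
                phis.add(f)
                work.append(f)  # the phi is itself a new definition
    return phis

# Diamond CFG: the value is stored in both arms and read after the join.
cfg = {"entry": ["a", "b"], "a": ["join"], "b": ["join"],
       "join": ["exit"], "exit": []}
print(pruned_phi_blocks(cfg, "entry", {"a", "b"}, {"exit"}))
print(pruned_phi_blocks(cfg, "entry", {"a", "b"}, set()))
```

With a read in `exit`, the join block gets a phi; with no reads at all, nothing is inserted, whereas the unpruned scheme would place a phi at the join and then have to delete it.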
|
| |
|
|
| |
llvm-svn: 40824
|
| |
|
|
|
|
| |
measurable speedup.
llvm-svn: 40823
|
| |
|
|
|
|
|
| |
to the worklist, and handling the last one with a 'tail call'. This speeds
up PR1432 from 2.0578s to 2.0012s (2.8%)
llvm-svn: 40822
|
| |
|
|
|
|
| |
mem2reg from 2.0742s->2.0522s on PR1432.
llvm-svn: 40821
|
| |
|
|
| |
llvm-svn: 40820
|
| |
|
|
| |
llvm-svn: 40819
|
| |
|
|
|
|
|
| |
faster than with the 'local to a block' fastpath. This speeds
up PR1432 from 2.1232s to 2.0686s (2.6%)
llvm-svn: 40818
|
| |
|
|
|
|
|
| |
to increment NumLocalPromoted, and didn't actually delete the
dead alloca, leading to an extra iteration of mem2reg.
llvm-svn: 40817
|
| |
|
|
| |
llvm-svn: 40816
|