llvm-svn: 191172

SROA wants to convert between any types of equivalent width, but it's not
possible to convert a vector of pointers to an integer scalar with a single
cast. As a workaround, we first add a bitcast to the corresponding int-ptr
type. This kind of cast used to be an edge case but has become common with SLP
vectorization.
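A hypothetical C sketch of the kind of source that can lead here (illustration
only, not taken from PR17271): SLP vectorization may merge the two adjacent
pointer stores into a single store of a vector of pointers, which SROA then has
to convert while rewriting the local aggregate.

  struct PtrPair {
    int *first;
    int *second;
  };

  void make_pair(struct PtrPair *out, int *a, int *b) {
    struct PtrPair tmp; /* local aggregate SROA wants to take apart          */
    tmp.first = a;      /* two adjacent pointer stores: SLP may turn these   */
    tmp.second = b;     /* into one vector-of-pointers store                 */
    *out = tmp;
  }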
Fixes PR17271.
llvm-svn: 191143

llvm-svn: 191135

Reapply r191108 with a fix for a memory corruption error I introduced. Of
course, we can't reference the scalars that we replace by vectorizing and then
call their eraseFromParent method. I only 'needed' the scalars to get the
DebugLoc, so just store the DebugLoc before actually vectorizing instead. As a
nice side effect, this also simplifies the interface between BoUpSLP and the
HorizontalReduction class so that it returns a value pointer (the vectorized
tree root).
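The general shape of the bug and of the fix, as a plain C sketch (an
illustration of the pattern only, not the LLVM code): copy the piece of
metadata you still need before the object carrying it is destroyed.

  #include <stdlib.h>

  struct node { int debug_line; /* metadata still wanted afterwards */ };

  int retire(struct node *n) {
    int line = n->debug_line; /* copy the metadata first...             */
    free(n);                  /* ...then it is safe to destroy the node */
    return line;              /* reading n->debug_line here would be UB */
  }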
radar://14607682
llvm-svn: 191123

sure that the functions 'abs' or 'round' are the functions from libm.
rdar://15012650
llvm-svn: 191122

This reverts commit r191108.
The horizontal.ll test case fails under libgmalloc. Thanks Shuxin for pointing
this out to me.
llvm-svn: 191121

PR17307 & 17308.
The problem with r191017 is that when GVN fabricates a value number for a dead
instruction (in order to keep the subsequent expr-PRE happy), it forgets to
fabricate a leader-table entry for it as well.
llvm-svn: 191118

llvm-svn: 191112

Match reductions starting at a binary operation feeding into a phi. The code
handles trees like
  r += v1 + v2 + v3 ...
and
  r += v1
  r += v2
  ...
and
  r *= v1 + v2 + ...
We currently only handle associative operations (add, and fadd with the 'fast'
flag).
The code can now also handle reductions feeding into stores:
  a[i] = v1 + v2 + v3 + ...
The code is currently disabled behind the flag "-slp-vectorize-hor". The cost
model for most architectures is not there yet.
I found one opportunity for a horizontal reduction feeding a phi in TSVC
(LoopRerolling-flt) and there are several opportunities where reductions feed
into stores.
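For illustration, two C-level shapes matching the patterns above (a sketch;
assumes -slp-vectorize-hor is enabled and fast-math so the float additions may
be reassociated):

  /* reduction tree feeding the accumulator phi of the loop */
  float into_phi(const float *v, int n) {
    float r = 0.f;
    for (int i = 0; i < n; i += 4)
      r += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    return r;
  }

  /* reduction feeding directly into a store */
  void into_store(float *a, int i, const float *v) {
    a[i] = v[0] + v[1] + v[2] + v[3];
  }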
radar://14607682
llvm-svn: 191108

llvm-svn: 191104

(ptrtoint Y))
The GEP pattern is what the SCEV expander emits for "ugly geps". The latter is
what you get for pointer subtraction in C code. The rest of instcombine already
knows how to deal with that, so just canonicalize on that form.
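A small C example of where this pattern originates (illustration only): pointer
subtraction lowers to a subtraction of ptrtoint values, and re-expanding such
expressions is what yields the i8 "ugly geps" mentioned above.

  #include <stddef.h>

  /* 'end - begin' becomes (ptrtoint end) - (ptrtoint begin) divided by the
   * element size; for char the division vanishes, leaving the raw integer
   * difference that can later be re-attached to a base pointer as a gep. */
  ptrdiff_t length(const char *begin, const char *end) {
    return end - begin;
  }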
llvm-svn: 191090

If "C1/X" has multiple uses, the only benefit of this transformation is to
potentially shorten the critical path, and it comes at the cost of introducing
an additional div.
The additional div may or may not incur a cost, depending on how div is
implemented. If it is implemented using Newton–Raphson iteration, it doesn't
seem to incur any cost (FIXME). However, if the div blocks the entire
pipeline, that sounds pretty expensive. Let CodeGen take care of this
transformation.
This patch sees a 6% improvement on a benchmark.
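For intuition, a hedged C sketch of the multiple-uses point; the rewrite shown
(y / (c / x) ==> (y * x) / c) is only one plausible example of such a fast-math
transformation, not necessarily the exact fold in question.

  /* t = c/x has a second use, so rewriting y/t as (y * x) / c would leave
   * two divisions in the code instead of one. */
  double f(double x, double y, double c) {
    double t = c / x;
    double r = y / t;   /* candidate for the rewrite                       */
    return r + t;       /* second use of t keeps the original div alive    */
  }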
rdar://15032743
llvm-svn: 191037

The code below can't handle any pointers. PR17293.
llvm-svn: 191036

This is how it ignores the dead code:
1) When a dead branch target, say block B, is identified, all the
   blocks dominated by B are dead as well.
2) The PHIs of the blocks in dominance-frontier(B) are updated such
   that the operands corresponding to dead predecessors are replaced
   by "UndefVal".
   In lattice jargon, the "UndefVal" is essentially the "Top". A PHI node
   like "phi(v1 bb1, undef xx)" will be optimized into "v1" if v1 is a
   constant, or if v1 is an instruction which dominates this PHI node.
3) When analyzing the availability of a load L, all dead mem-ops which L
   depends on are treated as loads which evaluate to exactly the same
   value as L.
4) The dead mem-ops will be materialized as "UndefVal" during code motion.
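A tiny C illustration of the idea behind 1) and 2) (hypothetical source shape,
not from the patch): once a predecessor is known to be dead, its contribution
at the join point is treated as undef, so the merged value simplifies to the
live predecessor's value.

  int pick(int *p, int cond_known_true /* assume proven non-zero */) {
    int x;
    if (cond_known_true)
      x = *p;   /* live predecessor                                        */
    else
      x = 42;   /* dead predecessor: its contribution becomes "UndefVal"   */
    return x;   /* the merge of x simplifies to *p                         */
  }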
llvm-svn: 191017

Adds a flag to the MemorySanitizer pass that enables runtime rewriting of
indirect calls. This is part of the MSanDR implementation and is needed to
return control to the DynamoRIO-based helper tool on the transition between
instrumented and non-instrumented modules. Disabled by default.
llvm-svn: 191006

enabled with the run-time option
llvm-svn: 190939

registers.
XCore target: Add XCoreTargetTransformInfo
This is where getNumberOfRegisters() resides, which in turn returns the
number of vector registers (=0).
llvm-svn: 190936

still work correctly.
llvm-svn: 190917

VINSERTF128/VEXTRACTF128. Fixes PR17268.
llvm-svn: 190916

llvm-svn: 190905

To avoid regressions with bitfield optimizations, this slicing should take
place later, e.g. at ISel time.
llvm-svn: 190891

Some of this code is no longer necessary since int<->ptr casts no longer
occur as of r187444.
This also fixes the handling of vectors of pointers, and adds a bunch of new
testcases for vectors and address spaces.
llvm-svn: 190885

We can't insert an insertelement after an invoke; we would have to split a
critical edge. So when we see a phi node that uses an invoke, we just give up.
llvm-svn: 190871

other in memory.
The motivation was to get rid of truncate and shift-right instructions that get
in the way of paired loads or floating-point loads.
E.g., consider the following example:
  struct Complex {
    float real;
    float imm;
  };
When accessing a complex, llvm was generating a 64-bit load, and the imm field
was obtained by a trunc(lshr) sequence, resulting in poor code generation, at
least for x86.
The idea is to declare that two load instructions are the canonical form for
loading two arithmetic types which are next to each other in memory.
Two scalar loads at a constant offset from each other are pretty easy to
detect for the sorts of passes that like to mess with loads.
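A small C illustration of the access pattern in question (a sketch; the real
motivating case is the paired/floating-point load formation described above):

  struct Complex {
    float real;
    float imm;
  };

  /* Before the change, reading both fields could be lowered as one 64-bit
   * load with 'imm' extracted via trunc(lshr); the canonical form is now
   * simply two adjacent 32-bit float loads. */
  float magnitude2(const struct Complex *c) {
    return c->real * c->real + c->imm * c->imm;
  }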
<rdar://problem/14477220>
llvm-svn: 190870

10%-20% speedup for use-after-return
llvm-svn: 190863

Wrong cast operation.
MergeFunctions emits a Bitcast instead of a pointer-to-integer operation. The
patch fixes the MergeFunctions::writeThunk function: it replaces unconditional
Bitcast creation with a "Value* createCast(...)" method that checks the
operand types and selects the proper instruction.
See the unit test as an example.
llvm-svn: 190859

If there are no legal integers, assume 1 byte.
This makes more sense than using the pointer size as
a guess for the maximum GPR width.
It is conceivable to want to use some 64-bit pointers
on a target where 64-bit integers aren't legal.
llvm-svn: 190817

We would have to compute the pre-increment value, either by computing it on
every loop iteration or by splitting the edge out of the loop and inserting a
computation for it there.
For now, just give up vectorizing such loops.
Fixes PR17179.
llvm-svn: 190790

llvm-svn: 190782

Previous discussion:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html
Differential Revision: http://llvm-reviews.chandlerc.com/D1191
llvm-svn: 190773

llvm-svn: 190770

llvm-svn: 190750

This pass was based on the previous (essentially unused) profiling
infrastructure and the assumption that by ordering the basic blocks at
the IR level in a particular way, the correct layout would happen in the
end. This sometimes worked, and mostly didn't. It also was a really
naive implementation of the classical paper that dates from when branch
predictors were primarily directional and when loop structure wasn't
commonly available. It also didn't factor into the equation
non-fallthrough branches and other machine level details.
Anyways, for all of these reasons and more, I wrote
MachineBlockPlacement, which completely supersedes this pass. It both
uses modern profile information infrastructure, and actually works. =]
llvm-svn: 190748

Compiler part.
llvm-svn: 190689

disabled.
llvm-svn: 190668

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), where using more aggressive
defaults is important.
llvm-svn: 190542

It works with clang, but GCC has different rules so we can't make all of those
hidden. This reverts commit r190534.
llvm-svn: 190536

Worth 100k on a linux/x86_64 Release+Asserts clang.
llvm-svn: 190534

This doesn't change anything since malloc always returns
address space 0.
llvm-svn: 190498

llvm-svn: 190491

llvm-svn: 190490

llvm-svn: 190461

llvm-svn: 190450

llvm-svn: 190446

llvm-svn: 190442

llvm-svn: 190425

scan the whole block.
llvm-svn: 190422

at compile time instead of at run-time. llvm part
llvm-svn: 190407

llvm-svn: 190375

LLVM IR doesn't currently allow atomic bool load/store operations, and the
transformation is dubious anyway because it isn't profitable on all platforms.
PR17163.
llvm-svn: 190357