Commit message
(ptrtoint Y))
The GEP pattern is what the SCEV expander emits for "ugly geps". The latter form is
what you get for pointer subtraction in C code. The rest of InstCombine already
knows how to deal with that form, so just canonicalize on it.
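For reference, a minimal C++ sketch of the source-level construct that lowers to
the subtraction-of-ptrtoints form (the function name is illustrative):

#include <cstddef>

// Plain pointer subtraction: the front end lowers this to
// sub (ptrtoint X), (ptrtoint Y) in LLVM IR.
std::ptrdiff_t byteDistance(const char *X, const char *Y) {
  return X - Y;
}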
llvm-svn: 191090
If "C1/X" were having multiple uses, the only benefit of this
transformation is to potentially shorten critical path. But it is at the
cost of instroducing additional div.
  The additional div may or may not incur cost depending on how div is
implemented. If it is implemented using Newton–Raphson iteration, it dosen't
seem to incur any cost (FIXME). However, if the div blocks the entire
pipeline, that sounds to be pretty expensive. Let CodeGen to take care 
this transformation.
  This patch sees 6% on a benchmark.
rdar://15032743
llvm-svn: 191037
The code below can't handle any pointers. PR17293.
llvm-svn: 191036
This is how it ignores the dead code:
1) When a dead branch target, say block B, is identified, all the
   blocks dominated by B are dead as well.
2) The PHIs of the blocks in dominance-frontier(B) are updated such
   that the operands corresponding to dead predecessors are replaced
   by "UndefVal" (see the sketch after this list).
   In lattice jargon, "UndefVal" is essentially the "Top" element.
   A PHI node like "phi(v1 bb1, undef xx)" will be optimized into
   "v1" if v1 is a constant, or if v1 is an instruction that dominates this
   PHI node.
3) When analyzing the availability of a load L, all dead mem-ops which
   L depends on are treated as loads that evaluate to exactly the same value as L.
4) The dead mem-ops will be materialized as "UndefVal" during code motion.
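For illustration only, a minimal C++ sketch of step 2 against present-day LLVM
APIs; the helper name and the DeadBlocks set are assumptions, not the patch's
actual code:

#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Replace PHI operands flowing in from known-dead predecessors with
// UndefValue, the lattice "Top".
static void replaceDeadPHIOperands(BasicBlock *FrontierBB,
                                   const SmallPtrSetImpl<BasicBlock *> &DeadBlocks) {
  for (PHINode &Phi : FrontierBB->phis())
    for (unsigned I = 0, E = Phi.getNumIncomingValues(); I != E; ++I)
      if (DeadBlocks.count(Phi.getIncomingBlock(I)))
        Phi.setIncomingValue(I, UndefValue::get(Phi.getType()));
}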
llvm-svn: 191017
Adds a flag to the MemorySanitizer pass that enables runtime rewriting of
indirect calls. This is part of the MSanDR implementation and is needed to return
control to the DynamoRIO-based helper tool on transition between instrumented
and non-instrumented modules. Disabled by default.
llvm-svn: 191006
enabled with the run-time option
llvm-svn: 190939
registers.
XCore target: Add XCoreTargetTransformInfo
This is where getNumberOfRegisters() resides, which in turn returns the
number of vector registers (=0).
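A rough sketch, not the actual XCore code, of the hook described above; the
scalar register count below is an illustrative assumption, and the
TargetTransformInfo interface has changed since this commit:

// Reporting zero vector registers tells the vectorizers there is nothing to
// vectorize with on this target.
unsigned getNumberOfRegisters(bool Vector) const {
  if (Vector)
    return 0;  // no vector registers on XCore
  return 12;   // placeholder scalar GPR count, an assumption
}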
llvm-svn: 190936
still work correctly.
llvm-svn: 190917
VINSERTF128/VEXTRACTF128. Fixes PR17268.
llvm-svn: 190916
llvm-svn: 190905
To avoid regressions with bitfield optimizations, this slicing should take place
later, e.g., at ISel time.
llvm-svn: 190891
Some of this code is no longer necessary since int<->ptr casts no
longer occur as of r187444.
This also fixes the handling of vectors of pointers, and adds a bunch of new
testcases for vectors and address spaces.
llvm-svn: 190885
We can't insert an insertelement after an invoke. We would have to split a
critical edge. So when we see a phi node that uses an invoke we just give up.
radar://14990770
llvm-svn: 190871
other in memory.
The motivation was to get rid of the truncate and shift-right instructions that get
in the way of paired loads or floating-point loads.
E.g., consider the following example:
struct Complex {
  float real;
  float imm;
};
When accessing a Complex, LLVM was generating a 64-bit load, and the imm field
was obtained by a trunc(lshr) sequence, resulting in poor code generation, at
least for x86.
The idea is to declare that two load instructions are the canonical form for
loading two arithmetic types that are next to each other in memory.
Two scalar loads at a constant offset from each other are pretty
easy to detect for the sorts of passes that like to mess with loads.
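For illustration, a small C++ sketch of the kind of access involved, using the
Complex struct above (the function name is made up for this example):

// Reading only the second field: before this change the IR could be a single
// 64-bit load of the whole struct followed by lshr+trunc to extract 'imm';
// the canonical form described above is simply a scalar float load at a
// 4-byte offset.
float getImm(const Complex *C) {
  return C->imm;
}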
<rdar://problem/14477220>
llvm-svn: 190870
10%-20% speedup for use-after-return
llvm-svn: 190863
Wrong cast operation.
MergeFunctions emits a bitcast instead of a pointer-to-integer operation.
The patch fixes the MergeFunctions::writeThunk function. It replaces
unconditional bitcast creation with a "Value* createCast(...)" method that
checks the operand types and selects the proper cast instruction.
See the unit test for an example.
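A rough sketch, under assumptions, of the kind of helper described above (the
actual createCast in MergeFunctions may differ):

#include "llvm/IR/IRBuilder.h"

using namespace llvm;

// Pick the cast instruction from the operand types instead of always
// emitting a bitcast.
static Value *createCast(IRBuilder<> &Builder, Value *V, Type *DestTy) {
  Type *SrcTy = V->getType();
  if (SrcTy->isIntegerTy() && DestTy->isPointerTy())
    return Builder.CreateIntToPtr(V, DestTy);
  if (SrcTy->isPointerTy() && DestTy->isIntegerTy())
    return Builder.CreatePtrToInt(V, DestTy);
  return Builder.CreateBitCast(V, DestTy);
}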
llvm-svn: 190859
If there are no legal integers, assume 1 byte.
This makes more sense than using the pointer size as
a guess for the maximum GPR width.
It is conceivable to want to use some 64-bit pointers
on a target where 64-bit integers aren't legal.
llvm-svn: 190817
We would have to compute the pre-increment value, either by computing it on
every loop iteration or by splitting the edge out of the loop and inserting a
computation for it there.
For now, just give up on vectorizing such loops.
Fixes PR17179.
llvm-svn: 190790
llvm-svn: 190782
Previous discussion:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html
Differential Revision: http://llvm-reviews.chandlerc.com/D1191
llvm-svn: 190773
llvm-svn: 190770
llvm-svn: 190750
This pass was based on the previous (essentially unused) profiling
infrastructure and the assumption that by ordering the basic blocks at
the IR level in a particular way, the correct layout would happen in the
end. This sometimes worked, and mostly didn't. It was also a really
naive implementation of the classical paper that dates from when branch
predictors were primarily directional and when loop structure wasn't
commonly available. It also didn't factor non-fallthrough branches and
other machine-level details into the equation.
Anyway, for all of these reasons and more, I wrote
MachineBlockPlacement, which completely supersedes this pass. It both
uses the modern profile information infrastructure, and actually works. =]
llvm-svn: 190748
Compiler part.
llvm-svn: 190689
disabled.
llvm-svn: 190668
Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), where using more aggressive
defaults is important.
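A hedged sketch of how a backend can override these defaults; the hook and
field names follow the present-day TargetTransformInfo interface and are
illustrative, since the interface at the time of this commit differed:

void PPCTTIImpl::getUnrollingPreferences(Loop *L, ScalarEvolution &SE,
                                         TTI::UnrollingPreferences &UP,
                                         OptimizationRemarkEmitter *ORE) {
  // On an in-order core with a deep pipeline, unroll more aggressively.
  UP.Partial = true;  // allow partial unrolling
  UP.Runtime = true;  // allow runtime (unknown trip count) unrolling
}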
llvm-svn: 190542
It works with clang, but GCC has different rules so we can't make all of those
hidden. This reverts commit r190534.
llvm-svn: 190536
Worth 100k on a linux/x86_64 Release+Asserts clang.
llvm-svn: 190534
This doesn't change anything since malloc always returns
address space 0.
llvm-svn: 190498
llvm-svn: 190491
llvm-svn: 190490
llvm-svn: 190461
llvm-svn: 190450
llvm-svn: 190446
llvm-svn: 190442
llvm-svn: 190425
scan the whole block.
llvm-svn: 190422
at compile time instead of at run-time. llvm part
llvm-svn: 190407
llvm-svn: 190375
LLVM IR doesn't currently allow atomic bool load/store operations, and the
transformation is dubious anyway because it isn't profitable on all platforms.
PR17163.
llvm-svn: 190357
Several architectures use the same instruction to perform both a comparison and
a subtract. The instruction selection framework does not allow considering
different basic blocks to expose such fusion opportunities.
Therefore, these instructions are “merged” by CSE at the MI IR level.
To increase the likelihood that CSE applies in such situations, we reorder the
operands of the comparison, when they have the same complexity, so that they
match the order of the most frequent subtract.
E.g.,
icmp A, B
...
sub B, A
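Illustrative only, with hypothetical names: a C++-level shape of the pattern
above. Reordering the comparison operands (and swapping the predicate
accordingly) lets MI-level CSE share a single compare-and-subtract instruction
on targets that have one.

bool lessAndDiff(int A, int B, int &Diff) {
  bool Lt = A < B;  // icmp A, B
  Diff = B - A;     // sub  B, A
  return Lt;
}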
<rdar://problem/14514580>
llvm-svn: 190352
The work on this project was left in an unfinished and inconsistent state.
Hopefully someone will eventually get a chance to implement this feature, but
in the meantime, it is better to put things back the way they were.  I have
left support in the bitcode reader to handle the case-range bitcode format,
so that we do not lose bitcode compatibility with the LLVM 3.3 release.
This reverts the following commits: 155464, 156374, 156377, 156613, 156704,
156757, 156804 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575,
157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884,
157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100,
159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659,
159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736
llvm-svn: 190328
instead of having its own implementation.
The implementation of isTBAAVtableAccess is in TypeBasedAliasAnalysis.cpp
since it is related to the format of TBAA metadata.
The path for struct-path tbaa will be exercised by
test/Instrumentation/ThreadSanitizer/read_from_global.ll, vptr_read.ll, and
vptr_update.ll when struct-path tbaa is on by default.
llvm-svn: 190216
llvm-svn: 190113
llvm-svn: 190112
llvm-svn: 190093
llvm-svn: 190090
llvm-svn: 190035
I am about to patch this code, and this makes the diff far more readable.
llvm-svn: 189982
llvm-svn: 189971