| Commit message (Collapse) | Author | Age | Files | Lines | 
| ... |  | 
| | 
|  | 
When SROA was evaluating a mixture of i1 and i8 loads and stores, in
just a particular case, it would tickle a latent bug where we compared
bits to bytes rather than bits to bits. As a consequence of the latent
bug, we would allow integers through which were not byte-size multiples,
a situation the later rewriting code was never intended to handle.
In release builds this could trigger all manner of oddities, but the
reported issue in PR14548 was forming invalid bitcast instructions.
The only downside of this fix is that it makes it more clear that SROA
in its current form is not capable of handling mixed i1 and i8 loads and
stores. Sometimes with the previous code this would work by luck, but
usually it would crash, so I'm not terribly worried. I'll watch the LNT
numbers just to be sure.
llvm-svn: 169719
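A minimal sketch of the unit mismatch described in this message, not the actual SROA code. The helper names (`storeSizeInBytes`, `hasPaddingBitsBuggy`, `hasPaddingBits`) are hypothetical and the exact comparison in SROA may differ; the sketch only shows how comparing a width in bits against a size in bytes lets an i1 slip through.

```cpp
#include <cstdint>

// Hypothetical helper: bytes needed to store an integer of the given bit width.
constexpr std::uint64_t storeSizeInBytes(std::uint64_t bitWidth) {
  return (bitWidth + 7) / 8;
}

// Buggy form: a width in bits is compared against a size in bytes, so the
// check never fires and non-byte-multiple integers are let through.
constexpr bool hasPaddingBitsBuggy(std::uint64_t bitWidth) {
  return bitWidth < storeSizeInBytes(bitWidth);
}

// Fixed form: both sides of the comparison are expressed in bits.
constexpr bool hasPaddingBits(std::uint64_t bitWidth) {
  return bitWidth < storeSizeInBytes(bitWidth) * 8;
}
```

With these definitions, `hasPaddingBitsBuggy(1)` is false while `hasPaddingBits(1)` is true, and both return false for an i8.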
 | 
| | 
|  | 
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.
Reviewed by: Nadav
llvm-svn: 169711
 | 
| | 
|  | 
llvm-svn: 169709
 | 
| | 
|  | 
No functionality change.
llvm-svn: 169703
 | 
| | 
|  | 
llvm-svn: 169701
 | 
| | 
|  | 
This will more closely match the behavior of the new PtrUseVisitor that
I am adding. Hopefully this will not change the actual behavior in any
way, but making the processing order more similar should help in debugging.
llvm-svn: 169697
 | 
| | 
|  | 
- fix a bug which causes a segfault.
- add two test cases which were causing crashes
llvm-svn: 169687
 | 
| | 
|  | 
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.
This reverts the commits:
  r169671: Fix a logic error.
  r169604: Move the popcnt tests to an X86 subdirectory.
  r168931: Initial commit adding the pass.
llvm-svn: 169683
 | 
| | 
|  | 
llvm-svn: 169671
 | 
| | 
|  | 
in the near future.
llvm-svn: 169651
 | 
| | 
|  | 
MSan uses a TLS slot to pass shadow for function arguments and return values.
This makes all instrumented functions not readonly, and at the same time
requires that all callees of an instrumented function that may be
MSan-instrumented do not have readonly attribute (otherwise some of the
instrumentation may be optimized out).
llvm-svn: 169591
 | 
| | 
|  | 
llvm-svn: 169551
 | 
| | 
|  | 
llvm-svn: 169550
 | 
| | 
|  | 
llvm-svn: 169504
 | 
| | 
|  | 
llvm-svn: 169491
 | 
| | 
|  | 
Instead of unconditionally storing origin with every application store,
only do this when the shadow of the stored value is != 0.
This change also delays instrumentation of stores until after the walk over
function's instructions, because adding new basic blocks confuses InstVisitor.
We only keep 1 origin value per 4 bytes of application memory. This change
fixes the bug where a store of a single clean byte wiped the origin for the
whole 4-byte area.
Since stores of uninitialized values are relatively uncommon, this change
improves performance of track-origins mode by 5% median and by up to 47% on
specs.
llvm-svn: 169490
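The conditional origin store can be pictured with a toy model. This is a sketch under stated assumptions, not the MSan instrumentation or runtime: `ToyShadowMemory` and `storeByte` are hypothetical names, shadow is kept as one byte per application byte, and one origin id is kept per aligned 4-byte granule, as the message describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (hypothetical): nonzero shadow marks an uninitialized byte.
struct ToyShadowMemory {
  std::vector<std::uint8_t> shadow;   // one entry per application byte
  std::vector<std::uint32_t> origin;  // one entry per 4 application bytes

  explicit ToyShadowMemory(std::size_t bytes)
      : shadow(bytes, 0), origin((bytes + 3) / 4, 0) {}

  // Always update the shadow, but write the origin only when the stored
  // value is poisoned, so a clean store cannot wipe a neighbour's origin.
  void storeByte(std::size_t addr, std::uint8_t valueShadow,
                 std::uint32_t valueOrigin) {
    assert(addr < shadow.size());
    shadow[addr] = valueShadow;
    if (valueShadow != 0)
      origin[addr / 4] = valueOrigin;
  }
};
```

In this model, storing a poisoned byte with origin 42 at address 0 and then a clean byte at address 1 leaves the origin slot for the granule at 42.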
 | 
| | 
|  | 
llvm-svn: 169455
 | 
| | 
|  | 
llvm-svn: 169383
 | 
| | 
|  | 
This mirrors the change in ASan & TSan done in r168864.
llvm-svn: 169378
 | 
| | 
|  | 
LinkOnceODRLinkage globals may be removed in GlobalOpt if not used in the
current module.
llvm-svn: 169377
 | 
| | 
|  | 
runtime. If we can't prove statically that the pointers are disjoint then we add the runtime check.
llvm-svn: 169334
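A rough sketch of what such a runtime disjointness check looks like at the source level. This is an illustration only, not the code the vectorizer emits; `rangesDisjoint` and `addOne` are hypothetical names, and both loop bodies are written as plain scalar loops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when the byte ranges [a, a+n) and [b, b+n) do not overlap.
bool rangesDisjoint(const int *a, const int *b, std::size_t n) {
  const auto aBegin = reinterpret_cast<std::uintptr_t>(a);
  const auto bBegin = reinterpret_cast<std::uintptr_t>(b);
  const std::uintptr_t bytes = n * sizeof(int);
  return aBegin + bytes <= bBegin || bBegin + bytes <= aBegin;
}

// dst[i] = src[i] + 1, guarded by the runtime check.
void addOne(int *dst, const int *src, std::size_t n) {
  assert(dst != nullptr && src != nullptr);
  if (rangesDisjoint(dst, src, n)) {
    // Iterations are independent here, so a vectorizer could process
    // several elements per step. Kept as a plain loop in this sketch.
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = src[i] + 1;
  } else {
    // Possible overlap: keep the original element-by-element order.
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = src[i] + 1;
  }
}
```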
 | 
| | 
|  | 
llvm-svn: 169331
 | 
| | 
|  | 
reduction variable is not used outside the loop then we ran into an
endless loop. This change checks if we found the original PHI.
llvm-svn: 169324
 | 
| | 
|  | 
This change attempts to simplify (X^Y) -> X or Y in the user's context if we know that
only bits from X or Y are demanded.
  A minimized case is provided below. This change will simplify "t >> 16" into "val1 >> 16".
  =============================================================
  unsigned foo (unsigned val1, unsigned val2) {
    unsigned t = val1 ^ 1234;
    return (t >> 16) | t; // NOTE: t is used more than once.
  }
  =============================================================
  Note that if the "t" were used only once, the expression would be finally optimized as well.
However, with this change, the optimization will take place earlier.
  Reviewed by Nadav, Thanks a lot!
llvm-svn: 169317
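The minimized case can be checked by hand. The sketch below restates it with `std::uint32_t` (assuming a 32-bit `unsigned`) and drops the unused second parameter; `fooOriginal` and `fooSimplified` are hypothetical names. Because 1234 is smaller than 2^16, the XOR only touches bits 0..15, and `t >> 16` demands only bits 16..31, so it equals `val1 >> 16`.

```cpp
#include <cstdint>

// As written in the commit message: t = val1 ^ 1234, result = (t >> 16) | t.
constexpr std::uint32_t fooOriginal(std::uint32_t val1) {
  return ((val1 ^ 1234u) >> 16) | (val1 ^ 1234u);
}

// After the simplification: the shifted use reads val1 directly, while the
// other use of t still needs the XOR.
constexpr std::uint32_t fooSimplified(std::uint32_t val1) {
  return (val1 >> 16) | (val1 ^ 1234u);
}
```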
 | 
| | 
|  | 
llvm-svn: 169288
 | 
| | 
|  | 
missed in the first pass because the script didn't yet handle include
guards.
Note that the script is now able to handle all of these headers without
manual edits. =]
llvm-svn: 169224
 | 
| | 
|  | 
executed due to CF.
llvm-svn: 169223
 | 
| | 
|  | 
Added the code that actually performs the if-conversion during vectorization.
We can now vectorize this code:
for (int i=0; i<n; ++i) {
  unsigned k = 0;
  if (a[i] > b[i])   // <-- IF inside the loop.
    k = k * 5 + 3;
  a[i] = k;          // <-- K is a phi node that becomes vector-select.
}
llvm-svn: 169217
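The effect of if-conversion on this loop can be sketched in scalar code. This is an illustration, not the vectorizer's output: `branchy` and `ifConverted` are hypothetical names, the arrays are modelled as `std::vector<unsigned>`, and the conditional expression stands in for the vector select.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branchy form, following the loop in the commit message.
void branchy(std::vector<unsigned> &a, const std::vector<unsigned> &b) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    unsigned k = 0;
    if (a[i] > b[i])
      k = k * 5 + 3;
    a[i] = k;
  }
}

// If-converted form: both candidate values are computed and a select
// picks one, leaving a loop body with no control flow.
void ifConverted(std::vector<unsigned> &a, const std::vector<unsigned> &b) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    const unsigned kElse = 0;
    const unsigned kThen = kElse * 5 + 3;
    const bool mask = a[i] > b[i];
    a[i] = mask ? kThen : kElse;  // scalar stand-in for a vector select
  }
}
```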
 | 
| | 
|  | 
does not change the current behavior)
llvm-svn: 169216
 | 
| | 
|  | 
llvm-svn: 169214
 | 
| | 
|  | 
The type of shift-right (logical or arithmetic) should remain unchanged
when transforming "X << C1 >> C2" into "X << (C1-C2)"
llvm-svn: 169209
 | 
| | 
|  | 
emit calls into runtime library that poison memory for local variables when their lifetime is over and unpoison memory when their lifetime begins.
llvm-svn: 169200
 | 
| | 
|  | 
llvm-svn: 169195
 | 
| | 
|  | 
llvm-svn: 169194
 | 
| | 
|  | 
This change tries to simplify E1 = "X >> C1 << C2" into:
  - E2 = "X << (C2 - C1)" if C2 > C1, or
  - E2 = "X >> (C1 - C2)" if C1 > C2, or
  - E2 = X if C1 == C2.
 Reviewed by Nadav. Thanks!
llvm-svn: 169182
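E1 and E2 are not equal bit for bit: they agree on bit C2 and above and can differ below it. The sketch below assumes logical shifts on 32-bit values with C1, C2 < 32, and assumes the rewrite is only applied when the low C2 bits are not demanded by the user, a condition this message does not spell out. All function names are hypothetical.

```cpp
#include <cstdint>

// E1 = (X >> C1) << C2, with logical shifts.
constexpr std::uint32_t shiftPairOriginal(std::uint32_t x, unsigned c1,
                                          unsigned c2) {
  return (x >> c1) << c2;
}

// E2, following the three cases in the commit message.
constexpr std::uint32_t shiftPairSimplified(std::uint32_t x, unsigned c1,
                                            unsigned c2) {
  return c2 > c1 ? x << (c2 - c1) : c1 > c2 ? x >> (c1 - c2) : x;
}

// Mask selecting bit c2 and every bit above it.
constexpr std::uint32_t bitsFrom(unsigned c2) {
  return ~std::uint32_t{0} << c2;
}

// True when E1 and E2 agree on every bit at position c2 or higher.
constexpr bool agreeOnDemandedBits(std::uint32_t x, unsigned c1, unsigned c2) {
  return ((shiftPairOriginal(x, c1, c2) ^ shiftPairSimplified(x, c1, c2)) &
          bitsFrom(c2)) == 0;
}
```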
 | 
| | 
|  | 
llvm-svn: 169175
 | 
| | 
|  | 
llvm-svn: 169171
 | 
| | 
|  | 
"single basic block loop vectorizer" to "innermost loop vectorizer".
llvm-svn: 169158
 | 
| | 
|  | 
which is the legality of the if-conversion transformation. The next step is to
implement the cost-model for the if-converted code as well as the
vectorization itself.
llvm-svn: 169152
 | 
| | 
|  | 
llvm-svn: 169143
 | 
| | 
|  | 
calculating the cost after passing the threshold.
llvm-svn: 169135
 | 
| | 
|  | 
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131
 | 
| | 
|  | 
The partitioning logic attempted to handle uses of an alloca with an
offset starting before the alloca so long as the use had some overlap
with the alloca itself. However, there was a bug where we tested
'(uint64_t)Offset >= AllocSize' without first checking whether 'Offset'
was positive. As a consequence, essentially every negative offset (that
is, starting *before* the alloca does) would be thrown out, even if it
was overlapping. The subsequent code to throw out negative offsets which
were actually non-overlapping was essentially dead. The code to *handle*
overlapping negative offsets was actually dead!
I've just removed all of this, and taught SROA to discard any uses which
start prior to the alloca from the beginning. It has the lovely property
of simplifying the code. =] All the tests still pass, and in fact no new
tests are needed as this is already covered by our testsuite. Fixing the
code so that negative offsets work the way the comments indicate they
were supposed to work causes regressions. That's how I found this.
Anyways, this is all progress in the correct direction -- tightening up
SROA to be maximally aggressive. Some day, I really hope to turn
out-of-bounds accesses to an alloca into 'unreachable'.
llvm-svn: 169120
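The cast in the quoted test is what made the negative-offset handling dead. A small sketch of the effect, using hypothetical helper names rather than the SROA code: a negative `int64_t` converted to `uint64_t` wraps to a value of at least 2^63, so for any realistic allocation size the test treats it as past the end.

```cpp
#include <cstdint>

// The test quoted in the message: with no sign check, a negative offset
// wraps to a huge unsigned value and is rejected as out of range.
constexpr bool startsPastEndBuggy(std::int64_t offset,
                                  std::uint64_t allocSize) {
  return static_cast<std::uint64_t>(offset) >= allocSize;
}

// What the old comments intended: only non-negative offsets can be past
// the end; negative ones were meant to be checked for overlap separately.
constexpr bool startsPastEnd(std::int64_t offset, std::uint64_t allocSize) {
  return offset >= 0 && static_cast<std::uint64_t>(offset) >= allocSize;
}

// The rule this commit settles on: discard any use that starts before the
// alloca, as well as any use that starts at or after its end.
constexpr bool discardUse(std::int64_t offset, std::uint64_t allocSize) {
  return offset < 0 || static_cast<std::uint64_t>(offset) >= allocSize;
}
```

For a 16-byte alloca, `startsPastEndBuggy(-4, 16)` is true even though an 8-byte use at offset -4 overlaps the alloca, and `discardUse` gives the same answers as the buggy test while stating the rule explicitly.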
 | 
| | 
|  | 
llvm-svn: 169119
 | 
| | 
|  | 
integer type.
Fixes PR14465.
Differential Revision: http://llvm-reviews.chandlerc.com/D148
llvm-svn: 169084
 | 
| | 
|  | 
status.
llvm-svn: 169083
 | 
| | 
|  | 
Also check in a test case that reproduces the issue, on which 'opt -globalopt' consumes 1.6GB of memory.
The cause of the large memory footprint is that GlobalOpt currently hoists and stores the leaf element constants into the global array one by one. In each iteration it recreates the global array initializer constant and leaves the old initializer alone, which may leave many obsolete constants behind.
For example, given the global array @rom = global [16 x i32] zeroinitializer:
After the first element value is hoisted and installed:   @rom = global [16 x i32] [ 1, 0, 0, ... ]
After the second element value is installed:  @rom = global [16 x i32] [ 1, 2, 0, 0, ... ]        // here the previous initializer is obsolete
...
When the transform is done, 15 obsolete initializers are left behind.
llvm-svn: 169079
 | 
| | 
|  | 
nested ifs
llvm-svn: 169049
 | 
| | 
|  | 
llvm-svn: 169048
 | 
| | 
|  | 
The original patch removed a bunch of code that the SjLjEHPrepare pass placed
into the entry block if all of the landing pads were removed during the
CodeGenPrepare pass. The more natural way of doing things is to run the CGP
*before* we run the SjLjEHPrepare pass.
Make it so!
llvm-svn: 169044
 |