| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 272243
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Enable existing summary-based importing support in the gold-plugin.
Reviewers: mehdi_amini
Subscribers: llvm-commits, mehdi_amini
Differential Revision: http://reviews.llvm.org/D21080
llvm-svn: 272239
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify,
that looks correct to me and allows to write a test for the issue.
Reviewers: chandlerc, bogner, sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21112
llvm-svn: 272224
|
|
|
|
| |
llvm-svn: 272221
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes PR27617.
Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64.
The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert.
Reviewers: nadav, mzolotukhin
Subscribers: JesperAntonsson, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D20685
Patch by Jesper Antonsson!
llvm-svn: 272206
|
|
|
|
| |
llvm-svn: 272204
|
|
|
|
| |
llvm-svn: 272199
|
|
|
|
| |
llvm-svn: 272196
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
with user specified count has been applied.
Summary:
Previously SetLoopAlreadyUnrolled() set the disable pragma only if
there was some loop metadata.
Now it set the pragma in all cases. This helps to prevent multiple
unroll when -unroll-count=N is given.
Reviewers: mzolotukhin
Differential Revision: http://reviews.llvm.org/D20765
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 272195
|
|
|
|
|
|
|
|
| |
This is the preparation patch to port the analysis to new PM
Differential Revision: http://reviews.llvm.org/D20560
llvm-svn: 272194
|
|
|
|
| |
llvm-svn: 272193
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: iteratee
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21087
llvm-svn: 272192
|
|
|
|
| |
llvm-svn: 272191
|
|
|
|
|
|
|
| |
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.
llvm-svn: 272190
|
|
|
|
| |
llvm-svn: 272140
|
|
|
|
| |
llvm-svn: 272139
|
|
|
|
|
|
|
|
| |
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.
llvm-svn: 272126
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21040
llvm-svn: 272009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).
If the shift amount is constant we can sometimes convert these instructions to native shifts:
1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.
In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.
Differential Revision: http://reviews.llvm.org/D19675
llvm-svn: 271996
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions.
The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs.
Differential Revision: http://reviews.llvm.org/D20998
llvm-svn: 271990
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
scalarizePHI only looked for phis that have exactly two uses - the "latch"
use, and an extract. Unfortunately, we can not assume all equivalent extracts
are CSE'd, since InstCombine itself may create an extract which is a duplicate
of an existing one. This extends it to handle several distinct extracts from
the same index.
This should fix at least some of the performance regressions from PR27988.
Differential Revision: http://reviews.llvm.org/D20983
llvm-svn: 271961
|
|
|
|
| |
llvm-svn: 271934
|
|
|
|
| |
llvm-svn: 271932
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs. This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.
Originally reviewed in http://reviews.llvm.org/D18001
Reviewers: atrick
Subscribers: llvm-commits, mzolotukhin, mcrosier
Differential Revision: http://reviews.llvm.org/D18480
llvm-svn: 271929
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check
above this to work for any Constant rather than ConstantInt. AFAICT,
that part makes sense if we can determine that the shrunken/extended
constant remained equal. But it doesn't make sense for this later
transform where we assume that the constant DID change.
This could assert for a ConstantExpr:
https://llvm.org/bugs/show_bug.cgi?id=28011
And it could be wrong for a vector as shown in the added regression test.
llvm-svn: 271908
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This hasn't been caught before because it requires noalias or similarly
strong alias analysis to actually reproduce.
Fixes http://llvm.org/PR27952 .
Reviewers: hfinkel, sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20944
llvm-svn: 271858
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since FoldOpIntoPhi speculates the binary operation to potentially each
of the predecessors of the PHI node (pulling it out of arbitrary control
dependence in the process), we can FoldOpIntoPhi only if we know the
operation doesn't have UB.
This also brings up an interesting profitability question -- the way it
is written today, commonIRemTransforms will hoist out work from
dynamically dead code into code that will execute at runtime. Perhaps
that isn't the best canonicalization?
Fixes PR27968.
llvm-svn: 271857
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There are some rough corners, since the new pass manager doesn't have
(as far as I can tell) LoopSimplify and LCSSA, so I've updated the
tests to run them separately in the old pass manager in the lit tests.
We also don't have an equivalent for AU.setPreservesCFG() in the new
pass manager, so I've left a FIXME.
Reviewers: bogner, chandlerc, davide
Subscribers: sanjoy, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20783
llvm-svn: 271846
|
|
|
|
|
|
|
|
|
|
| |
It is an off-by-default option that no one seems to use[0], and given
that SCEV directly understands the overflow instrinsics there is no real
need for it anymore.
[0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html
llvm-svn: 271845
|
|
|
|
| |
llvm-svn: 271843
|
|
|
|
| |
llvm-svn: 271839
|
|
|
|
| |
llvm-svn: 271823
|
|
|
|
| |
llvm-svn: 271822
|
|
|
|
|
|
|
|
|
| |
Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C',
so that it's hopefully clearer that 'CI' refers to CastInst in this context.
While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'.
llvm-svn: 271817
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A basic block could contain:
%cp = cleanuppad []
cleanupret from %cp unwind to caller
This basic block is empty and is thus a candidate for removal. However,
there can be other uses of %cp outside of this basic block. This is
only possible in unreachable blocks.
Make our transform more correct by checking that the pad has a single
user before removing the BB.
This fixes PR28005.
llvm-svn: 271816
|
|
|
|
|
|
|
| |
This is step 1 of unknown towards fixing PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001
llvm-svn: 271810
|
|
|
|
| |
llvm-svn: 271807
|
|
|
|
| |
llvm-svn: 271804
|
|
|
|
|
|
|
|
| |
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614
Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type.
llvm-svn: 271789
|
|
|
|
| |
llvm-svn: 271746
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds an option -esan-assume-intra-cache-line which causes esan to assume
that a single memory access touches just one cache line, even if it is not
aligned, for better performance at a potential accuracy cost. Experiments
show that the performance difference can be 2x or more, and accuracy loss
is typically negligible, so we turn this on by default. This currently
applies just to the working set tool.
Reviewers: aizatsky
Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits
Differential Revision: http://reviews.llvm.org/D20978
llvm-svn: 271743
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds a global variable to specify the tool, to support handling early
interceptors that invoke instrumented code and require shadow memory to be
initialized prior to __esan_init() being invoked.
Reviewers: aizatsky
Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits
Differential Revision: http://reviews.llvm.org/D20973
llvm-svn: 271715
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was concern that creating bitcasts for the simpler potential select pattern:
define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
%a2 = add <2 x i64> %a, %a
%sext = sext <4 x i1> %cmp to <4 x i32>
%bc = bitcast <4 x i32> %sext to <2 x i64>
%and = and <2 x i64> %a2, %bc
ret <2 x i64> %and
}
might lead to worse code for some targets, so this patch is matching the larger
patterns seen in the test cases.
The motivating example for this patch is this IR produced via SSE intrinsics in C:
define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
%t0 = bitcast <2 x i64> %a to <4 x i32>
%t1 = bitcast <2 x i64> %b to <4 x i32>
%cmp = icmp sgt <4 x i32> %t0, %t1
%sext = sext <4 x i1> %cmp to <4 x i32>
%t2 = bitcast <4 x i32> %sext to <2 x i64>
%and = and <2 x i64> %t2, %a
%neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
%neg2 = bitcast <4 x i32> %neg to <2 x i64>
%and2 = and <2 x i64> %neg2, %b
%or = or <2 x i64> %and, %and2
ret <2 x i64> %or
}
For an AVX target, this is currently:
vpcmpgtd %xmm1, %xmm0, %xmm2
vpand %xmm0, %xmm2, %xmm0
vpandn %xmm1, %xmm2, %xmm1
vpor %xmm1, %xmm0, %xmm0
retq
With this patch, it becomes:
vpmaxsd %xmm1, %xmm0, %xmm0
Differential Revision: http://reviews.llvm.org/D20774
llvm-svn: 271676
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Instrument GEP instruction for counting the number of struct field
address calculation to approximate the number of struct field accesses.
Adds test struct_field_count_basic.ll to test the struct field
instrumentation.
Reviewers: bruening, aizatsky
Subscribers: junbuml, zhaoqin, llvm-commits, eugenis, vitalybuka, kcc, bruening
Differential Revision: http://reviews.llvm.org/D20892
llvm-svn: 271619
|
|
|
|
|
|
|
|
|
|
|
| |
heuristic.
In r270478, where I enabled the new heuristic I posted testing results,
which I got when explicitly passed the thresholds values via CL options.
However, setting the CL options init-values is not enough to change the
default values of thresholds, so I'm changing them in another place now.
llvm-svn: 271615
|
|
|
|
|
|
|
|
|
| |
In preparation for porting to the new PM.
Patch by Jake VanAdrighem! (review mainly by me/Justin)
Differential Revision: http://reviews.llvm.org/D20610
llvm-svn: 271607
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The module pass pipeline includes a late LICM run after loop
unrolling. LCSSA is implicitly run as a pass dependency of LICM. However no
cleanup pass was run after this, so the LCSSA nodes ended in the optimized output.
Reviewers: hfinkel, mehdi_amini
Subscribers: majnemer, bruno, mzolotukhin, mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D20606
llvm-svn: 271602
|
|
|
|
| |
llvm-svn: 271601
|
|
|
|
| |
llvm-svn: 271600
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is effectively a revert of:
http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
and:
http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.
This is intended to resolve the objections raised on the dev list:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html
and:
https://llvm.org/bugs/show_bug.cgi?id=24886#c4
In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later.
Differential Revision: http://reviews.llvm.org/D19391
llvm-svn: 271573
|