| Commit message | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 191154
|
| |
|
|
|
|
|
|
|
|
| |
SROA wants to convert any types of equivalent widths but it's not possible to
convert vectors of pointers to an integer scalar with a single cast. As a
workaround we add a bitcast to the corresponding int ptr type first. This type
of cast used to be an edge case but has become common with SLP vectorization.
Fixes PR17271.
llvm-svn: 191143
|
| |
|
|
|
|
|
|
| |
splitting too."
This reverts commit r191130.
llvm-svn: 191138
|
| |
|
|
| |
llvm-svn: 191136
|
| |
|
|
| |
llvm-svn: 191135
|
| |
|
|
|
|
|
|
|
|
| |
Allow binutils .type and .section directives to take the following
forms:
- @<type>
- %<type>
- "<type>"
llvm-svn: 191134
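A minimal sketch of accepting the three operand spellings listed above. The helper name `parseTypeOperand` is hypothetical and this is not the assembler parser's actual code; it only shows that `@<type>`, `%<type>` and `"<type>"` can all be reduced to the same bare type name.

```cpp
#include <string>

// Hypothetical helper, not LLVM's parser: extracts <type> from one of the
// three accepted spellings: @<type>, %<type>, or "<type>".
// Returns true and fills Type on success, false for any other spelling.
bool parseTypeOperand(const std::string &Operand, std::string &Type) {
  if (Operand.size() < 2)
    return false;
  // Sigil forms: a leading '@' or '%' followed by the type name.
  if (Operand[0] == '@' || Operand[0] == '%') {
    Type = Operand.substr(1);
    return true;
  }
  // Quoted form: the type name is wrapped in double quotes.
  if (Operand.size() >= 3 && Operand[0] == '"' &&
      Operand[Operand.size() - 1] == '"') {
    Type = Operand.substr(1, Operand.size() - 2);
    return true;
  }
  return false;
}
```

Under this sketch, `@function`, `%function` and `"function"` all yield `function`, while a bare `function` is rejected.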
|
| |
|
|
| |
llvm-svn: 191133
|
| |
|
|
|
|
|
|
|
|
|
| |
In AVX, 256-bit vectors are valid vectors and therefore the Type Legalizer
doesn't split the VSELECT and SETCC nodes. AVX only supports MIN/MAX on 128-bit
vectors, and this fix enables vector splitting for this special case in the X86
DAG Combiner.
This fix is related to PR16695, PR17002, and <rdar://problem/14594431>.
llvm-svn: 191131
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is too wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask for the given target. This mask usually has
the same size as the VSELECT return type (except for Intel KNL). Now the type
legalizer will split both VSELECT and SETCC.
This allows the following X86 DAG Combine code to successfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
llvm-svn: 191130
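For illustration, this is the kind of source loop that leads to a compare feeding a select. The function name is hypothetical, and whether a given compiler version vectorizes it into a VSELECT-SETCC pair and then a MIN instruction depends on the target and flags; treat it as a sketch of the pattern, not a reproducer for the PRs above.

```cpp
#include <cstddef>

// Element-wise minimum. Each iteration is a comparison (SETCC in the DAG)
// whose result picks one of two values (a select, VSELECT once vectorized).
void elementwiseMin(const float *A, const float *B, float *Out,
                    std::size_t N) {
  for (std::size_t I = 0; I != N; ++I)
    Out[I] = A[I] < B[I] ? A[I] : B[I];
}
```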
|
| |
|
|
|
|
| |
This can revert r191087.
llvm-svn: 191128
|
| |
|
|
| |
llvm-svn: 191125
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reapply r191108 with a fix for a memory corruption error I introduced. Of
course, we can't reference the scalars that we replace by vectorizing and then
call their eraseFromParent method. I only 'needed' the scalars to get the
DebugLoc. Just store the DebugLoc before actually vectorizing instead. As a nice
side effect, this also simplifies the interface between BoUpSLP and the
HorizontalReduction class to returning a value pointer (the vectorized tree
root).
radar://14607682
llvm-svn: 191123
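The ordering fix described above (copy what you need, then erase) can be sketched in isolation. The `Scalar` type and the function name are hypothetical stand-ins, not the SLP vectorizer's classes; `clear()` plays the role of calling eraseFromParent on the replaced scalars.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction that carries a debug location.
struct Scalar {
  std::string DebugLoc;
};

// Precondition: Scalars is non-empty.
// Copies the debug location BEFORE the scalars are destroyed, so nothing
// reads from a destroyed object afterwards.
std::string vectorizeAndReturnLoc(
    std::vector<std::unique_ptr<Scalar>> &Scalars) {
  std::string Loc = Scalars.front()->DebugLoc; // saved copy, owns its data
  Scalars.clear(); // stands in for erasing the replaced scalars
  return Loc;      // safe: does not touch the erased objects
}
```

Reading `Scalars.front()->DebugLoc` after the `clear()` call would be the analogue of the bug that was fixed.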
|
| |
|
|
|
|
|
|
| |
sure that the functions 'abs' or 'round' are the functions from libm.
rdar://15012650
llvm-svn: 191122
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r191108.
The horizontal.ll test case fails under libgmalloc. Thanks Shuxin for pointing
this out to me.
llvm-svn: 191121
|
| |
|
|
|
|
|
| |
info finalization to greatly reduce the number of fixups that the
assembler has to handle in order to improve compile time.
llvm-svn: 191119
|
| |
|
|
|
|
|
|
|
| |
PR17307 & 17308.
The problem with r191017 is that when GVN fabricates a value number for a dead instruction (in order
to keep the following expr-PRE happy), it forgets to fabricate a leader-table entry for it as well.
llvm-svn: 191118
|
| |
|
|
|
|
|
|
| |
Clean up some simple code quality issues. Bring internal naming
conventions up to current standard, fix inconsistent formatting, and
tidy up a couple of odd constructs.
llvm-svn: 191117
|
| |
|
|
|
|
| |
to further work.
llvm-svn: 191113
|
| |
|
|
| |
llvm-svn: 191112
|
| |
|
|
|
|
| |
I cannot think of a test case that reliably triggers this bug.
llvm-svn: 191109
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Match reductions starting at binary operation feeding into a phi. The code
handles trees like
r += v1 + v2 + v3 ...
and
r += v1
r += v2
...
and
r *= v1 + v2 + ...
We currently only handle associative operations (add, fadd fast).
The code can now also handle reductions feeding into stores.
a[i] = v1 + v2 + v3 + ...
The code is currently disabled behind the flag "-slp-vectorize-hor". The cost
model for most architectures is not there yet.
I found one opportunity of a horizontal reduction feeding a phi in TSVC
(LoopRerolling-flt) and there are several opportunities where reductions feed
into stores.
radar://14607682
llvm-svn: 191108
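The source shapes listed above can be written out as small loops. The function names are hypothetical and unsigned integer addition is used as the associative operation; whether the SLP vectorizer actually turns any of these into a horizontal reduction depends on the flag and cost model mentioned above.

```cpp
#include <cstddef>

// Form "r += v1 + v2 + v3 ...": one reduction tree per iteration feeding
// the loop-carried value (a phi in the IR).
unsigned sumTree(const unsigned *V, std::size_t N) {
  unsigned R = 0;
  for (std::size_t I = 0; I + 4 <= N; I += 4)
    R += V[I] + V[I + 1] + V[I + 2] + V[I + 3];
  return R;
}

// Form "r += v1; r += v2; ...": a chain of updates to the same accumulator.
unsigned sumChain(const unsigned *V, std::size_t N) {
  unsigned R = 0;
  for (std::size_t I = 0; I + 4 <= N; I += 4) {
    R += V[I];
    R += V[I + 1];
    R += V[I + 2];
    R += V[I + 3];
  }
  return R;
}

// Form "a[i] = v1 + v2 + v3 + ...": the reduction feeds a store instead.
void sumIntoStore(const unsigned *V, unsigned *A, std::size_t N) {
  for (std::size_t I = 0; I + 4 <= N; I += 4)
    A[I / 4] = V[I] + V[I + 1] + V[I + 2] + V[I + 3];
}
```

All three process the input in groups of four and ignore a trailing partial group, which keeps the sketch short.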
|
| |
|
|
| |
llvm-svn: 191104
|
| |
|
|
|
|
|
|
|
|
| |
(ptrtoint Y))
The GEP pattern is what SCEV expander emits for "ugly geps". The latter is what
you get for pointer subtraction in C code. The rest of instcombine already
knows how to deal with that so just canonicalize on that.
llvm-svn: 191090
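A small sketch of the two equivalent spellings at source level, for byte pointers into the same array. The function names are hypothetical, and the integer-cast version relies on the usual flat address model, where converting a pointer to an integer yields its address; that is implementation-defined rather than guaranteed by the language.

```cpp
#include <cstddef>
#include <cstdint>

// Pointer subtraction as written in C or C++ code.
std::ptrdiff_t distanceBySubtraction(const char *X, const char *Y) {
  return X - Y;
}

// The same distance spelled as a subtraction of the two addresses, which is
// the (sub (ptrtoint X), (ptrtoint Y)) shape at the IR level.
std::ptrdiff_t distanceByPtrToInt(const char *X, const char *Y) {
  return static_cast<std::ptrdiff_t>(reinterpret_cast<std::uintptr_t>(X) -
                                     reinterpret_cast<std::uintptr_t>(Y));
}
```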
|
| |
|
|
|
|
| |
This reverts r191030
llvm-svn: 191075
|
| |
|
|
| |
llvm-svn: 191073
|
| |
|
|
|
|
|
|
|
|
|
| |
interface.
The global registry is used to allow command line override of the
scheduler selection, but does not work well as the normal selection
API. For example, the same LLVM process should be able to target
multiple targets or subtargets.
llvm-svn: 191071
|
| |
|
|
|
|
| |
way in r191060.
llvm-svn: 191065
|
| |
|
|
| |
llvm-svn: 191062
|
| |
|
|
|
|
|
|
|
| |
for easy formatting with llvm::format
This was previously invoking UB by passing a user-defined type to
format. Thanks to Jordan Rose for pointing this out.
llvm-svn: 191060
|
| |
|
|
|
|
| |
These violations were introduced in r191049
llvm-svn: 191059
|
| |
|
|
|
|
|
|
|
|
| |
Ensures that the pubnames entries actually refer to the intended
entities. This test could be more flexible if there was a way to do
multiline FileCheck matches with captures (in that way the test wouldn't
need to have hardcoded offset values and would thus be resilient to
changes in the layout of the DIEs in this CU).
llvm-svn: 191055
|
| |
|
|
| |
llvm-svn: 191052
|
| |
|
|
|
|
|
|
|
|
| |
This was an experimental scheduler a year ago. It's now used by
several subtargets, both in-order and out-of-order, and it
is about to be enabled by default for x86 and armv7. It will be the
new GenericScheduler for subtargets that don't provide their own
SchedulingStrategy.
llvm-svn: 191051
|
| |
|
|
| |
llvm-svn: 191050
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.
This commit extends the DAGCombiner so that the pattern
(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))
is folded into
([az]ext (rotl x, y))
The matching is restricted to aext and zext because in these cases the upper
bits are either undefined or known. Test case is included.
This fixes PR16726.
llvm-svn: 191049
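The source-level situation described above can be sketched with a 16-bit rotate. The function name is hypothetical. Because of integer promotion, both shifts operate on `int` rather than on `uint16_t`, which is why the rotate matcher sees extended operands; whether a particular compiler folds this exact function into a rotate instruction is not claimed here.

```cpp
#include <cstdint>

// Rotate a 16-bit value left by Y bits. X is promoted to int before each
// shift, so the shifts and the OR are performed in the wider type and the
// result is truncated back to 16 bits by the final cast.
std::uint16_t rotl16(std::uint16_t X, unsigned Y) {
  Y &= 15; // keep the shift amount within the 16-bit width
  return static_cast<std::uint16_t>((X << Y) | (X >> ((16 - Y) & 15)));
}
```

For example, `rotl16(0x8001, 1)` moves the top bit around to the bottom and gives `0x0003`.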
|
| |
|
|
|
|
| |
There is a buildbot failure. Need to investigate this.
llvm-svn: 191048
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.
This commit extends the DAGCombiner so that the pattern
(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))
is folded into
([az]ext (rotl x, y))
The matching is restricted to aext and zext because in these cases the upper
bits are either undefined or known. Test case is included.
This fixes PR16726.
llvm-svn: 191045
|
| |
|
|
| |
llvm-svn: 191043
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If "C1/X" has multiple uses, the only benefit of this transformation is
to potentially shorten the critical path. But it comes at the cost of
introducing an additional div.
The additional div may or may not incur cost depending on how div is
implemented. If it is implemented using Newton–Raphson iteration, it doesn't
seem to incur any cost (FIXME). However, if the div blocks the entire
pipeline, that sounds pretty expensive. Let CodeGen take care of
this transformation.
This patch sees 6% on a benchmark.
rdar://15032743
llvm-svn: 191037
|
| |
|
|
|
|
| |
The code below can't handle any pointers. PR17293.
llvm-svn: 191036
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Based on code review feedback from Eric Christopher, unshifting these
constants as they can appear in the gdb_index itself, shifted a further
24 bits. This means that keeping them preshifted is a bit inflexible, so
let's not do that.
Given the motivation, wrap up some nicer enums, more type safety, and
some utility functions.
llvm-svn: 191035
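A sketch of the "unshifted constants plus utility functions" idea. Every name, enumerator value and bit position below is an assumption made for illustration and is not taken from the commit or from the gdb_index format documentation. The point is only that the descriptor is kept as plain enums and shifted at the place where it is packed, including the further 24-bit shift when it sits in the top byte of a 32-bit word.

```cpp
#include <cstdint>

// Illustrative enumerators only; the real values live in the GDB/LLVM headers.
enum class EntryKind : std::uint8_t { None = 0, Type = 1, Variable = 2,
                                      Function = 3, Other = 4 };
enum class EntryLinkage : std::uint8_t { External = 0, Static = 1 };

// Assumed layout of the one-byte descriptor: kind in bits 4-6, linkage in bit 7.
constexpr unsigned KindShift = 4;
constexpr unsigned LinkageShift = 7;

// Packs the unshifted enums into a single descriptor byte.
constexpr std::uint8_t toDescriptorByte(EntryKind K, EntryLinkage L) {
  return static_cast<std::uint8_t>(
      (static_cast<unsigned>(K) << KindShift) |
      (static_cast<unsigned>(L) << LinkageShift));
}

// Places the same byte in the top 8 bits of a 32-bit word, i.e. shifted a
// further 24 bits, with a 24-bit index in the low bits.
constexpr std::uint32_t toIndexWord(std::uint8_t Descriptor,
                                    std::uint32_t Index) {
  return (static_cast<std::uint32_t>(Descriptor) << 24) |
         (Index & 0x00FFFFFFu);
}
```

Keeping the enums unshifted means the same `EntryKind` value can be reused in both encodings, and the scoped enums prevent a kind from being passed where a linkage is expected.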
|
| |
|
|
|
|
| |
Differential Revision: http://llvm-reviews.chandlerc.com/D1715
llvm-svn: 191030
|
| |
|
|
|
|
| |
in normally.
llvm-svn: 191026
|
| |
|
|
|
|
|
| |
Names open to bikeshedding. Could switch back to the constants being
unshifted, but this way seems a bit easier to work with.
llvm-svn: 191025
|
| |
|
|
| |
llvm-svn: 191021
|
| |
|
|
| |
llvm-svn: 191020
|
| |
|
|
| |
llvm-svn: 191018
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is how it ignores the dead code:
1) When a dead branch target, say block B, is identified, all the
blocks dominated by B are dead as well.
2) The PHIs of those blocks in dominance-frontier(B) are updated such
that the operands corresponding to dead predecessors are replaced
by "UndefVal".
In lattice jargon, "UndefVal" is in essence the "Top".
A PHI node like "phi(v1 bb1, undef xx)" will be optimized into
"v1" if v1 is a constant, or v1 is an instruction which dominates this
PHI node.
3) When analyzing the availability of a load L, all dead mem-ops which
L depends on are disguised as loads which evaluate to exactly the same value as L.
4) The dead mem-ops will be materialized as "UndefVal" during code motion.
llvm-svn: 191017
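Steps 1 and 2 above can be sketched on a toy representation. This is not GVN's code: blocks and values are plain integers, the dominator tree is an adjacency list, and all names are hypothetical. The PHI helper also omits the condition from step 2 that v1 must be a constant or dominate the PHI.

```cpp
#include <cstddef>
#include <vector>

// Step 1: every block dominated by DeadRoot is dead. DomChildren[B] lists
// the immediate dominator-tree children of block B.
std::vector<bool>
collectDeadBlocks(const std::vector<std::vector<int>> &DomChildren,
                  int DeadRoot) {
  std::vector<bool> Dead(DomChildren.size(), false);
  std::vector<int> Worklist(1, DeadRoot);
  while (!Worklist.empty()) {
    int B = Worklist.back();
    Worklist.pop_back();
    Dead[B] = true;
    for (std::size_t I = 0; I != DomChildren[B].size(); ++I)
      Worklist.push_back(DomChildren[B][I]);
  }
  return Dead;
}

// One PHI operand: the incoming value and the predecessor block it comes from.
struct Incoming {
  int Value;
  int Pred;
};

// Step 2: operands from dead predecessors act as undef ("Top"), so they are
// ignored. If all remaining operands agree, the PHI simplifies to that value.
bool simplifyPhi(const std::vector<Incoming> &Ops,
                 const std::vector<bool> &Dead, int &Result) {
  bool Found = false;
  for (std::size_t I = 0; I != Ops.size(); ++I) {
    if (Dead[Ops[I].Pred])
      continue; // treated as undef
    if (Found && Ops[I].Value != Result)
      return false; // two different live values: keep the PHI
    Result = Ops[I].Value;
    Found = true;
  }
  return Found;
}
```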
|
| |
|
|
|
|
|
|
|
| |
Adds a flag to the MemorySanitizer pass that enables runtime rewriting of
indirect calls. This is part of MSanDR implementation and is needed to return
control to the DynamoRIO-based helper tool on transition between instrumented
and non-instrumented modules. Disabled by default.
llvm-svn: 191006
|
| |
|
|
|
|
|
|
| |
a power of 2 but differ in bit width.
PR17283.
llvm-svn: 191000
|