| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 196634
|
| |
|
|
| |
llvm-svn: 196633
|
| |
|
|
|
|
|
|
| |
The segfault occurs due to an infinite loop when the verifier tries to
determine the size of a type of the form "%rt = type { %rt }" while
checking an alloca of the type.
llvm-svn: 196626
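As an illustration of why a self-containing type makes a naive size computation loop forever, here is a minimal sketch of a recursive size walk that tracks the types it is currently measuring. `ToyType` and `sizeOf` are hypothetical names invented for this example; this is not the verifier's code and not necessarily how the commit fixed the problem.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>
#include <vector>

// Hypothetical miniature type model: a struct is a list of element types,
// and a type with no elements stands in for a 1-byte scalar.
struct ToyType {
  std::vector<const ToyType *> Elements;
};

// Returns the size in bytes, or std::nullopt if the type contains itself.
std::optional<std::uint64_t> sizeOf(const ToyType *T,
                                    std::set<const ToyType *> &InProgress) {
  assert(T && "type must not be null");
  if (T->Elements.empty())
    return 1;
  // Meeting a type that is still being measured means it is recursive.
  // Without this check the walk below would never terminate.
  if (!InProgress.insert(T).second)
    return std::nullopt;
  std::uint64_t Total = 0;
  bool Recursive = false;
  for (const ToyType *E : T->Elements) {
    std::optional<std::uint64_t> Size = sizeOf(E, InProgress);
    if (!Size) {
      Recursive = true;
      break;
    }
    Total += *Size;
  }
  InProgress.erase(T);
  if (Recursive)
    return std::nullopt;
  return Total;
}
```

A type modelling "%rt = type { %rt }" is built by pushing a pointer to the type into its own `Elements` list; `sizeOf` then reports `std::nullopt` instead of recursing without bound.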
|
| |
|
|
|
|
|
|
| |
This commit caches the value of the AllowAtInIdentifier variable as
a class variable in AsmLexer. We do this to avoid repeated MAI
queries and string comparisons each time we lex an identifier.
llvm-svn: 196622
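The caching idea can be sketched as follows. `ToyAsmInfo` and `ToyLexer` are hypothetical stand-ins written for this example, not the real MAI or AsmLexer interfaces; the query counter exists only to make the effect of the cache observable.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the target assembly info object.
struct ToyAsmInfo {
  std::string CommentString;
  mutable int Queries = 0; // counts how often the lexer asks
  const std::string &getCommentString() const {
    ++Queries;
    return CommentString;
  }
};

class ToyLexer {
  // Computed once in the constructor, then reused for every identifier.
  bool AllowAtInIdentifier;

public:
  // '@' is allowed in identifiers unless the comment string starts with '@'.
  explicit ToyLexer(const ToyAsmInfo &Info)
      : AllowAtInIdentifier(Info.getCommentString().rfind("@", 0) != 0) {}

  // Called per character while lexing; no query and no string comparison.
  bool isIdentifierChar(char C) const {
    return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') ||
           (C >= '0' && C <= '9') || C == '_' || C == '$' || C == '.' ||
           (AllowAtInIdentifier && C == '@');
  }
};
```

However many characters are classified, the info object in this sketch is queried exactly once.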
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Command line arguments that begin with @ but aren't a path to an
existing file currently cause later @file arguments to be ignored.
Correctly skip over these arguments instead of trying to read a
non-existent file 20 times and giving up.
Since the problem manifests in the clang driver, the test is in that
repository.
Fixes rdar://problem/15590906
llvm-svn: 196620
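The skipping behaviour described above can be sketched roughly like this. `expandResponseFiles` and `FileReader` are hypothetical names for this example and do not mirror the LLVM command-line API; the file lookup is injected so the sketch does not touch the file system, and nested response files are deliberately not expanded.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader: returns the file contents, or std::nullopt if the
// path does not name a readable file.
using FileReader =
    std::function<std::optional<std::string>(const std::string &)>;

std::vector<std::string> expandResponseFiles(std::vector<std::string> Args,
                                             const FileReader &ReadFile) {
  for (std::size_t I = 0; I < Args.size();) {
    if (Args[I].empty() || Args[I][0] != '@') {
      ++I;
      continue;
    }
    std::optional<std::string> Contents = ReadFile(Args[I].substr(1));
    if (!Contents) {
      // Not an existing file: keep the argument as-is and step past it,
      // so that later @file arguments are still expanded.
      ++I;
      continue;
    }
    // Replace the @file argument with its whitespace-separated tokens.
    std::istringstream Stream(*Contents);
    std::vector<std::string> Tokens;
    for (std::string Tok; Stream >> Tok;)
      Tokens.push_back(Tok);
    Args.erase(Args.begin() + I);
    Args.insert(Args.begin() + I, Tokens.begin(), Tokens.end());
    I += Tokens.size(); // simplification: inserted tokens are not re-expanded
  }
  return Args;
}
```

With a reader that only knows a file holding "-O2 -g", an argument list of `@missing`, `@opts.rsp`, `x.c` becomes `@missing`, `-O2`, `-g`, `x.c`: the unknown `@missing` is left alone and the later `@opts.rsp` is still expanded.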
|
| |
|
|
|
|
|
|
|
|
|
|
| |
- The Krait processor is currently modeled with the same features as A9.
- Krait additionally has the VFP4 (fused multiply add/sub)
and hardware division features enabled.
- Krait currently has the same schedule model as A9.
- The krait CPU flag is not recognized by the GNU assembler yet;
it is replaced with march=armv7-a to avoid a lower march
from being used.
llvm-svn: 196619
|
| |
|
|
|
|
|
|
| |
This removes another case of spooky action at a distance (building the
same label names in multiple places creating an implicit dependency
between those places) and helps pave the way for type units.
llvm-svn: 196617
|
| |
|
|
|
|
|
| |
This is a precursor to moving type units into the correct (debug_types)
section with comdat groups and full type unit headers.
llvm-svn: 196615
|
| |
|
|
|
|
|
|
| |
ConstantExpr can evaluate to false even when isNullValue gives false.
Fixes PR18143.
llvm-svn: 196611
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The integrated assembler fails to properly lex arm comments when
they are adjacent to an identifier in the input stream. The reason
is that the arm comment symbol '@' is also used as symbol variant in
other assembly languages so when lexing an identifier it allows the
'@' symbol as part of the identifier.
Example:
$ cat comment.s
foo:
add r0, r0@got to parse this as a comment
$ llvm-mc -triple armv7 comment.s
comment.s:4:18: error: unexpected token in argument list
add r0, r0@got to parse this as a comment
^
This should be parsed correctly as `add r0, r0`.
This commit modifies the assembly lexer to not include the '@' symbol
in identifiers when lexing for targets that use '@' for comments.
llvm-svn: 196607
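The lexing rule can be illustrated with a small stand-alone function. `lexIdentifier` and its `AllowAt` parameter are hypothetical and written only for this sketch; the real lexer accepts a different character set and is structured differently.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Lexes one identifier from the start of Text. AllowAt says whether '@' may
// appear inside identifiers; it would be false for targets that use '@' to
// start a comment.
std::string lexIdentifier(const std::string &Text, bool AllowAt) {
  std::size_t End = 0;
  while (End < Text.size()) {
    unsigned char C = static_cast<unsigned char>(Text[End]);
    bool IsIdentChar = std::isalnum(C) || C == '_' || C == '$' || C == '.' ||
                       (AllowAt && C == '@');
    if (!IsIdentChar)
      break;
    ++End;
  }
  return Text.substr(0, End);
}
```

For the input `r0@got to parse this as a comment`, the sketch yields `r0` when `AllowAt` is false and `r0@got` when it is true, which is the difference between the desired and the failing behaviour in the example above.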
|
| |
|
|
|
|
| |
and structs.
llvm-svn: 196604
|
| |
|
|
|
|
|
|
|
|
|
| |
This more accurately represents the actual walk - pubnames/pubtypes are
emitted into the .o, not the .dwo, and reference the skeletons not the
full units.
Use the newly established ID->index invariant to lookup the underlying
full unit to retrieve its public names and types.
llvm-svn: 196601
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
list
This simplifies reasoning about the code and enables simple navigation
from a skeleton to its full unit. (currently there are no type unit
skeletons, so the skeleton list doesn't have the same ID == index
property)
Eventually we should get rid of this ID and just store the labels we
need as the IDs are allowing this code to create difficult to
manage/understand associations (loops over non-skeletal units are
implicitly referencing their skeletal units during pub* emission, for
example). It may be necessary to have some kind of skeleton->full unit
association and a more direct pointer or similar device would be
preferable to an index.
llvm-svn: 196600
|
| |
|
|
|
|
|
|
|
| |
The current peephole optimization for compare instructions assumes that an
instruction that uses CPSR has an MO for the ARM Cond code. However, VSEL
instructions (vseleq, vselge, vselgt, vselvs) have no such operand, nor do
they support the modification of the Cond code.
llvm-svn: 196588
|
| |
|
|
| |
llvm-svn: 196585
|
| |
|
|
| |
llvm-svn: 196581
|
| |
|
|
|
|
| |
...since it is equivalent to comparison with +0.
llvm-svn: 196580
|
| |
|
|
|
|
|
| |
instcombine prefers to put extended operands first, so this patch
handles that case for C(L)GFR.
llvm-svn: 196579
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Since z has no setcc instruction as such, the choice of setBooleanContents
is a bit arbitrary. Currently it's set to ZeroOrOneBooleanContent,
so we produced a branch-free form when selecting between 0 and 1,
but not when selecting between 0 and -1. This patch handles the latter
case too.
At some point I'd like to measure whether it's better to use conditional
moves for constant selects on z196, but that's future work.
llvm-svn: 196578
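The arithmetic behind the two branch-free forms can be sketched as below. This shows only the identity involved, under the assumption that the condition is available as a 0/1 value; it says nothing about the instruction sequence the backend actually emits, and the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// select(Cond, 1, 0): the 0/1 condition value is already the result.
std::int32_t selectZeroOrOne(bool Cond) {
  return static_cast<std::int32_t>(Cond);
}

// select(Cond, -1, 0): negating the 0/1 value gives all-ones or zero,
// so no branch is needed here either.
std::int32_t selectZeroOrMinusOne(bool Cond) {
  return -static_cast<std::int32_t>(Cond);
}
```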
|
| |
|
|
| |
llvm-svn: 196574
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Rewrite asan's stack frame layout.
First, most of the stack layout logic is moved into a separate file
to make it more testable and (potentially) useful for other projects.
Second, make the frames more compact by using adaptive redzones
(smaller for small objects, larger for large objects).
Third, try to minimize gaps due to large alignments (this is hypothetical since
today we don't see many stack vars aligned by more than 32).
The frames indeed become more compact. I'll still need to run more benchmarks
before committing, but I am asking for review now to get early feedback.
This change will be accompanied by a trivial change in compiler-rt tests
to match the new frame sizes.
Reviewers: samsonov, dvyukov
Reviewed By: samsonov
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2324
llvm-svn: 196568
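A minimal sketch of what "adaptive redzones" could look like is given below. The thresholds and sizes are hypothetical values picked for illustration and are not taken from the patch; alignment is ignored entirely, and `redzoneSizeFor` and `frameSize` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical thresholds: small objects get a small redzone, large
// objects get a larger one.
std::size_t redzoneSizeFor(std::size_t ObjectSize) {
  if (ObjectSize <= 4)
    return 16;
  if (ObjectSize <= 64)
    return 32;
  if (ObjectSize <= 512)
    return 64;
  return 128;
}

// Bytes used by a frame in which every object is followed by its own
// redzone (alignment padding is not modelled).
std::size_t frameSize(const std::vector<std::size_t> &ObjectSizes) {
  std::size_t Total = 0;
  for (std::size_t Size : ObjectSizes)
    Total += Size + redzoneSizeFor(Size);
  return Total;
}
```

Compared with a fixed redzone sized for the largest objects, a frame made mostly of small variables shrinks considerably under such a scheme, which is the compaction the summary refers to.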
|
| |
|
|
|
|
|
|
|
|
| |
Not only does it trigger -Wparentheses, I think the assert actually
relies on incorrect operator precedence.
Also, the grammar is questionable, but I might not know enough about the
problem at hand.
llvm-svn: 196567
|
| |
|
|
|
|
| |
Patch by Marius Wachtler.
llvm-svn: 196561
|
| |
|
|
|
|
| |
Patch by Marius Wachtler.
llvm-svn: 196560
|
| |
|
|
| |
llvm-svn: 196551
|
| |
|
|
| |
llvm-svn: 196544
|
| |
|
|
| |
llvm-svn: 196542
|
| |
|
|
| |
llvm-svn: 196541
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intended behaviour is to force vectorization on the presence
of the flag (either turn on or off), and to continue the behaviour
as expected in its absence. Tests were added to make sure all the
cases are covered in opt. No tests were added in other tools with
the assumption that they should use the PassManagerBuilder in the
same way.
This patch also removes the outdated -late-vectorize flag, which was
on by default and not helping much.
The pragma metadata is being attached to the same place as other loop
metadata, but nothing forbids one from attaching it to a function
(to enable #pragma optimize) or basic blocks (to hint the basic-block
vectorizers), etc. The logic should be the same all around.
Patches to Clang to produce the metadata will be produced after the
initial implementation is agreed upon and committed. Patches to other
vectorizers (such as SLP and BB) will be added once we're happy with
the pass manager changes.
llvm-svn: 196537
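The "force when present, default when absent" behaviour amounts to a tri-state setting, which can be sketched as follows. `shouldVectorize` and its parameters are hypothetical names for this example and are not part of the PassManagerBuilder interface.

```cpp
#include <cassert>
#include <optional>

// Flag is std::nullopt when the flag is absent, and true or false when it
// is present and forces vectorization on or off respectively.
bool shouldVectorize(std::optional<bool> Flag, bool DefaultDecision) {
  // A present flag always wins; otherwise behave as before.
  return Flag.value_or(DefaultDecision);
}
```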
|
| |
|
|
| |
llvm-svn: 196536
|
| |
|
|
| |
llvm-svn: 196533
|
| |
|
|
| |
llvm-svn: 196530
|
| |
|
|
|
|
|
| |
The typedef is used inside the DEBUG(), and apparently can't be moved
inside of it.
llvm-svn: 196528
|
| |
|
|
|
|
| |
Unused typedefs and unused variables.
llvm-svn: 196526
|
| |
|
|
|
|
|
|
| |
There is no reason to use std::deque here over std::vector. Given the
performance differences between the two, it makes sense to change deque to
vector.
llvm-svn: 196524
|
| |
|
|
|
|
|
|
|
|
| |
We use CSEBlocks to initialize a worklist:
SmallVector<BasicBlock *, 8> CSEWorkList(CSEBlocks.begin(), CSEBlocks.end());
so it must have a deterministic order.
llvm-svn: 196520
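To show why the container's iteration order matters, here is a minimal insertion-ordered set, loosely in the spirit of llvm::SetVector. `OrderedSet` is a hypothetical class written for this sketch; the commit message does not say which container was actually chosen.

```cpp
#include <cassert>
#include <set>
#include <vector>

// A set that remembers insertion order, so iterating it gives the same
// sequence on every run (unlike a set ordered by pointer value or hash).
template <typename T> class OrderedSet {
  std::set<T> Seen;     // membership test
  std::vector<T> Order; // elements in first-insertion order

public:
  // Returns true if V was newly inserted.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }
  typename std::vector<T>::const_iterator begin() const {
    return Order.begin();
  }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
};
```

A worklist initialized from `begin()` and `end()` of such a set is visited in insertion order, independent of where the elements happen to live in memory.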
|
| |
|
|
| |
llvm-svn: 196519
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows a target to use MI-Sched as an in-order scheduler that
will model strict resource conflicts without defining a processor
itinerary. Instead, the target can now use the new per-operand machine
model and define in-order resources with BufferSize=0. For example,
this would allow restricting the type of operations that can be formed
into a dispatch group. (Normally NumMicroOps is sufficient to enforce
dispatch groups).
If the intent is to model latency in an in-order pipeline, as opposed to
resource conflicts, then a resource with BufferSize=1 should be
defined instead.
This feature is only casually tested as there are no in-tree targets
using it yet. However, Hal will be experimenting with POWER7.
llvm-svn: 196517
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The per-operand machine model allows the target to define "unbuffered"
processor resources. This change is a quick, cheap way to model stalls
caused by the latency of operations that use such resources. This only
applies when the processor's micro-op buffer size is non-zero
(Out-of-Order). We can't precisely model in-order stalls during
out-of-order execution, but this is an easy and effective
heuristic. It benefits cortex-a9 scheduling when using the new
machine model, which is not yet on by default.
MI-Sched for armv7 was evaluated on Swift (and only not enabled because
of a performance bug related to predication). However, we never
evaluated Cortex-A9 performance on MI-Sched in its current form. This
change adds MI-Sched functionality to reach performance goals on
A9. The only remaining change is to allow MI-Sched to run as a PostRA
pass.
I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7:
-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false
For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
(min run time over 2 runs, filtering tiny changes)
Speedups:
| Benchmarks/BenchmarkGame/recursive | 52.39% |
| Benchmarks/VersaBench/beamformer | 20.80% |
| Benchmarks/Misc/pi | 19.97% |
| Benchmarks/Misc/mandel-2 | 19.95% |
| SPEC/CFP2000/188.ammp | 18.72% |
| Benchmarks/McCat/08-main/main | 18.58% |
| Benchmarks/Misc-C++/Large/sphereflake | 18.46% |
| Benchmarks/Olden/power | 17.11% |
| Benchmarks/Misc-C++/mandel-text | 16.47% |
| Benchmarks/Misc/oourafft | 15.94% |
| Benchmarks/Misc/flops-7 | 14.99% |
| Benchmarks/FreeBench/distray | 14.26% |
| SPEC/CFP2006/470.lbm | 14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode | 12.28% |
| Benchmarks/SmallPT/smallpt | 10.36% |
| Benchmarks/Misc-C++/Large/ray | 8.97% |
| Benchmarks/Misc/fp-convert | 8.75% |
| Benchmarks/Olden/perimeter | 7.10% |
| Benchmarks/Bullet/bullet | 7.03% |
| Benchmarks/Misc/mandel | 6.75% |
| Benchmarks/Olden/voronoi | 6.26% |
| Benchmarks/Misc/flops-8 | 5.77% |
| Benchmarks/Misc/matmul_f64_4x4 | 5.19% |
| Benchmarks/MiBench/security-rijndael | 5.15% |
| Benchmarks/Misc/flops-6 | 5.10% |
| Benchmarks/Olden/tsp | 4.46% |
| Benchmarks/MiBench/consumer-lame | 4.28% |
| Benchmarks/Misc/flops-5 | 4.27% |
| Benchmarks/mafft/pairlocalalign | 4.19% |
| Benchmarks/Misc/himenobmtxpa | 4.07% |
| Benchmarks/Misc/lowercase | 4.06% |
| SPEC/CFP2006/433.milc | 3.99% |
| Benchmarks/tramp3d-v4 | 3.79% |
| Benchmarks/FreeBench/pifft | 3.66% |
| Benchmarks/Ptrdist/ks | 3.21% |
| Benchmarks/Adobe-C++/loop_unroll | 3.12% |
| SPEC/CINT2000/175.vpr | 3.12% |
| Benchmarks/nbench | 2.98% |
| SPEC/CFP2000/183.equake | 2.91% |
| Benchmarks/Misc/perlin | 2.85% |
| Benchmarks/Misc/flops-1 | 2.82% |
| Benchmarks/Misc-C++-EH/spirit | 2.80% |
| Benchmarks/Misc/flops-2 | 2.77% |
| Benchmarks/NPB-serial/is | 2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk | 2.33% |
| Benchmarks/BenchmarkGame/n-body | 2.28% |
| Benchmarks/SciMark2-C/scimark2 | 2.27% |
| Benchmarks/Olden/bh | 2.03% |
| skidmarks10/skidmarks | 1.81% |
| Benchmarks/Misc/flops | 1.72% |
Slowdowns:
| Benchmarks/llubenchmark/llu | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d | -5.67% |
| Benchmarks/Adobe-C++/functionobjects | -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8 | -5.00% |
| Benchmarks/Shootout/hash | -2.35% |
| Benchmarks/Prolangs-C++/ocean | -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall | -1.98% |
| Polybench/linear-algebra/kernels/3mm | -1.95% |
| Benchmarks/McCat/09-vor/vor | -1.68% |
llvm-svn: 196516
|
| |
|
|
| |
llvm-svn: 196514
|
| |
|
|
| |
llvm-svn: 196513
|
| |
|
|
|
|
| |
Should fix the msan and valgrind bots.
llvm-svn: 196509
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were creating external uses for scalar values in MustGather entries that also
had a ScalarToTreeEntry (they are also present in a vectorized tuple). This
meant we would keep a value 'alive' both as a scalar and as a vector, causing havoc.
This is not necessary because when we create a MustGather vector we explicitly
create external uses entries for the insertelement instructions of the
MustGather vector elements.
Fixes PR18129.
radar://15582184
llvm-svn: 196508
|
| |
|
|
|
|
| |
integer type (after other optimizations)
llvm-svn: 196507
|
| |
|
|
| |
llvm-svn: 196503
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
in case the operands are constants and their difference is |1|.
It should be possible in those cases to rematerialize the result using
MIPS's slt and similar instructions.
The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was needed;
otherwise the optimization implemented in this patch would have been triggered
(difference between the operands was 1) and that would have changed the semantics
of the tests.
llvm-svn: 196498
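The idea can be sketched with plain integer arithmetic: when the two select operands differ by one, the 0/1 result of a set-on-less-than style comparison can simply be added to or subtracted from the smaller-magnitude step. The function names are hypothetical, and the exact patterns matched by the patch are not listed in the message.

```cpp
#include <cassert>
#include <cstdint>

// select(A < B, X + 1, X) without a branch or conditional move.
std::int32_t selectPlusOne(std::int32_t A, std::int32_t B, std::int32_t X) {
  std::int32_t Flag = A < B; // 0 or 1, as a set-on-less-than would produce
  return X + Flag;
}

// select(A < B, X - 1, X), the mirrored case.
std::int32_t selectMinusOne(std::int32_t A, std::int32_t B, std::int32_t X) {
  std::int32_t Flag = A < B;
  return X - Flag;
}
```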
|
| |
|
|
|
|
|
|
|
|
| |
performSELECTCombine.
The structure of the code was slightly modified so that the next patch is easier to read/review.
No functional changes.
llvm-svn: 196496
|
| |
|
|
|
|
|
|
|
| |
not being correctly encoded/decoded.
In more detail, immediate fields of LD/ST instructions should be
divided/multiplied by the size of the data format before encoding and
after decoding, respectively.
llvm-svn: 196494
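The scaling rule can be sketched as a pair of helper functions. `encodeOffset` and `decodeOffset` are hypothetical names for this example; field widths and range checks of the real instruction encodings are not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Before encoding: divide the byte offset by the element size. Offsets that
// are not a multiple of the element size cannot be represented.
std::optional<std::int32_t> encodeOffset(std::int32_t ByteOffset,
                                         std::int32_t ElemSize) {
  if (ElemSize <= 0 || ByteOffset % ElemSize != 0)
    return std::nullopt;
  return ByteOffset / ElemSize;
}

// After decoding: multiply the field value by the element size to recover
// the byte offset.
std::int32_t decodeOffset(std::int32_t Field, std::int32_t ElemSize) {
  return Field * ElemSize;
}
```

Encoding followed by decoding with the same element size gives back the original byte offset, which is the round-trip property the fix restores.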
|
| |
|
|
|
|
|
|
|
|
|
| |
We were trying to fold the stack adjustment into the wrong instruction in the
situation where the entire basic-block was epilogue code. Really, it can only
ever be valid to do the folding precisely where the "add sp, ..." would be
placed so there's no need for a separate iterator to track that.
Should fix PR18136.
llvm-svn: 196493
|
| |
|
|
|
|
| |
reduce duplication
llvm-svn: 196479
|