| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When we predicate an instruction (div, rem, store) we place the instruction in
its own basic block within the vectorized loop. If a predicated instruction has
scalar operands, it's possible to recursively sink these scalar expressions
into the predicated block so that they might avoid execution. This patch sinks
as much scalar computation as possible into predicated blocks. We previously
were able to sink such operands only if they were extractelement instructions.
Differential Revision: https://reviews.llvm.org/D25632
llvm-svn: 285097
|
| |
|
|
|
|
|
|
| |
Patch by bryant!
Differential Revision: https://reviews.llvm.org/D25952
llvm-svn: 285095
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.
The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches. For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.
The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.
Thanks to Adrian Prantl for stewarding this patch!
llvm-svn: 285094
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The branch folding pass tail merges blocks into a common-tail. However, the
tail retains the debug information from one of the original inputs to the
merge (chosen randomly). This is a problem for sampled-based PGO, as hits
on the common-tail will be attributed to whichever block was chosen,
irrespective of which path was actually taken to the common-tail.
This patch fixes the issue by nulling the debug location for the common-tail.
Differential Revision: https://reviews.llvm.org/D25742
llvm-svn: 285093
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D25086
llvm-svn: 285088
|
| |
|
|
|
|
|
| |
call_indirect, grow_memory, and current_memory now have immediate
operands in the 0xd binary encoding.
llvm-svn: 285085
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
debug loc.
When indvars widened an induction variable, the debug location for the loop
increment computation was incorrectly set equal to the debug loc of the loop
latch terminator.
This patch fixes the issue by propagating the correct location from the
original loop increment instruction to the new widened increment.
Differential Revision: https://reviews.llvm.org/D25872
llvm-svn: 285083
|
| |
|
|
|
|
|
|
|
|
| |
Now that MemorySSA keeps track of whether MemoryUses are optimized, use
getClobberingMemoryAccess() to check MemoryUse memory dependencies since
it should no longer be so expensive.
This is a follow-up change to https://reviews.llvm.org/D25881
llvm-svn: 285080
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is not safe to use LOAD ON CONDITION to implement access to a memory
location marked "volatile", since the architecture leaves it unspecified
whether or not an access happens if the condition is false.
The current code already appears to care about that:
def LOC : CondUnaryRSY<"loc", 0xEBF2, nonvolatile_load, GR32, 4>;
Unfortunately, that "nonvolatile_load" operator is simply ignored
by the CondUnaryRSY class, and there was no test to catch it.
llvm-svn: 285077
|
| |
|
|
|
|
| |
trunc transform
llvm-svn: 285075
|
| |
|
|
|
|
|
|
|
|
| |
We already have (V)PMOVZX* combining support, this is the beginning of handling (V)PMOVSX* similarly - other combines in combineVSZext can be generalized in future patches.
This unearthed an interesting bug in that we were generating illegal build vectors on 32-bit targets - it was proving difficult to create a test for it from PMOVZX, but it fired immediately with PMOVSX. I've created a more general form of the existing getConstVector to handle these cases - ideally this should be handled in non-target-specific code but I couldn't find an equivalent.
Differential Revision: https://reviews.llvm.org/D25874
llvm-svn: 285072
|
| |
|
|
|
|
| |
Accidentally put in the hoped-for checks ahead of the transform!
llvm-svn: 285070
|
| |
|
|
| |
llvm-svn: 285069
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Do *not* perform combines such as:
vector_shuffle<4,1,2,3>(build_vector(Ud, C0, C1 C2), scalar_to_vector(X))
->
build_vector(X, C0, C1, C2)
Keeping the shuffle allows lowering the constant build_vector to a materialized
constant vector (such as a vector-load from the constant-pool or some other idiom).
Reviewers: delena, igorb, spatel, mkuper, andreadb, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25524
llvm-svn: 285063
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
ZERO_EXTEND_VECTOR_INREG for 512-bit vectors to support vpmovzxbq and vpmovsxbq.
Summary: The one tricky thing about this is that the sign/zero_extend_inreg uses v64i8 as an input type which isn't legal without BWI support. Though the vpmovsxbq and vpmovzxbq instructions themselves don't require BWI. To support this we need to add custom lowering for ZERO_EXTEND_VECTOR_INREG with v64i8 input. This can mostly reuse the existing sign extend code with a couple checks for sign extend vs zero extend added.
Reviewers: delena, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25594
llvm-svn: 285053
|
| |
|
|
| |
llvm-svn: 285046
|
| |
|
|
| |
llvm-svn: 285045
|
| |
|
|
|
|
|
|
|
| |
When we load coverage data from multiple objects, we don't have a way to
attribute a source object to a function record. Printing out the object
filename next to the source filename is already not very useful: soon,
it'll actually become misleading. Stop printing out the filename now.
llvm-svn: 285043
|
| |
|
|
| |
llvm-svn: 285036
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There are two fixes here: one, AnalyzeUsesOfPointer can't return
false until it has checked all the uses of the pointer. Two, if a
global uses another global, we have to assume the address of the
first global escapes.
Fixes https://llvm.org/bugs/show_bug.cgi?id=30707 .
Differential Revision: https://reviews.llvm.org/D25798
llvm-svn: 285034
|
| |
|
|
|
|
|
|
|
|
| |
or vector splats
Use isConstOrConstSplat helper.
Also use APInt instead of getZExtValue directly to avoid out of range issues.
llvm-svn: 285033
|
| |
|
|
|
|
|
| |
when contained in a Mach-O universal file and the
cputypes in both headers don’t match.
llvm-svn: 285026
|
| |
|
|
| |
llvm-svn: 285005
|
| |
|
|
|
|
|
|
| |
Reviewers: MatzeB, mcrosier, rengolin
Differential Revision: https://reviews.llvm.org/D25894
llvm-svn: 285003
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug in the handling of lexical scopes, when more than one
scope is defined on the same line or functions are inlined into call
sites that are on the same line as the function definition. This
situation can easily happen in macro expansions.
The problem is solved by introducing a SmallDenseMap<DIScope *,
DILexicalBlockFile *, 1> that keeps track of all the different lexical
scopes that share a line/file location.
Fixes PR30681.
llvm-svn: 284998
|
| |
|
|
|
|
|
|
|
|
|
| |
constant area
https://reviews.llvm.org/D23614
Currently we load +0.0 from constant area. That can change to be generated using
XOR instruction.
llvm-svn: 284995
|
| |
|
|
|
|
|
| |
The optimization has correctness issues, so reverting for now to fix tests
on thumb1 targets.
llvm-svn: 284993
|
| |
|
|
|
|
|
|
|
| |
Add support for estimating the square root or its reciprocal and division or
reciprocal using the combiner generic Newton series.
Differential revision: https://reviews.llvm.org/D25291
llvm-svn: 284986
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When using MemorySSA, re-optimize MemoryPhis when removing a store since
this may create MemoryPhis with all identical arguments.
Also, when using MemorySSA to check if two MemoryUses are reading from
the same version of the heap, use the defining access instead of calling
getClobberingAccess, since the latter can currently result in many more
AA calls. Once the MemorySSA use optimization tracking changes are
done, we can remove this limitation, which should result in more loads
being CSE'd.
Reviewers: dberlin
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D25881
llvm-svn: 284984
|
| |
|
|
|
|
|
|
| |
https://reviews.llvm.org/D24924
This improves the code generated for a sequence of AND, ANY_EXT, SRL instructions. This is a targetted fix for this special pattern. The pattern is generated by target independet dag combiner and so a more general fix may not be necessary. If we come across other similar cases, some ideas for handling it are discussed on the code review.
llvm-svn: 284983
|
| |
|
|
| |
llvm-svn: 284982
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The v_movreld machine instruction is used with three operands that are
in a sense tied to each other (the explicit VGPR_32 def and the implicit
VGPR_NN def and use). There is no way to express that using the currently
available operand bits, and indeed there are cases where the Two Address
instructions pass does the wrong thing.
This patch introduces a new set of pseudo instructions that are identical
in intended semantics as v_movreld, but they only have two tied operands.
Having to add a new set of pseudo instructions is admittedly annoying, but
it's a fairly straightforward and solid approach. The only alternative I
see is to try to teach the Two Address instructions pass about Three Address
instructions, and I'm afraid that's trickier and is going to end up more
fragile.
Note that v_movrels does not suffer from this problem, and so this patch
does not touch it.
This fixes several GL45-CTS.shaders.indexing.* tests.
Reviewers: tstellarAMD, arsenm
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D25633
llvm-svn: 284980
|
| |
|
|
|
|
|
|
|
| |
It seems to break selfhost on some bots, see e.g.
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/21
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/20
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/22
llvm-svn: 284979
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Fix AsmParser lines to correctly handle end-of-line pre-processor
comments parsing when '#' is not the assembly line comment prefix.
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25567
llvm-svn: 284978
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add relocations for AArch64 ILP32. Includes:
- Addition of definitions for R_AARCH32_*
- Definition of new -target-abi: ilp32
- Definition of data layout string
- Tests for added relocations. Not comprehensive, but matches
existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64".
- Tests for llvm-readobj
Reviewers: zatrazz, peter.smith, echristo, t.p.northover
Subscribers: aemerson, rengolin, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25159
llvm-svn: 284973
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.
Patch by James Molloy and Pablo Barrio.
Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop
Subscribers: jojo, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D25477
llvm-svn: 284971
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Add synci to the microMIPS instruction definitions, mark the MIPS sync & synci
as not being part of microMIPS. This does not cover the sync instruction alias,
as that will be handled with a different patch. Add sync to the valid tests for
microMIPS.
Reviewers: vkalintiris
Differential Revision: https://reviews.llvm.org/D25795
llvm-svn: 284962
|
| |
|
|
|
|
|
| |
Fix the implementation of OptReportLocationInfo's operator < so that contexts
with different unroll counts are reported separately.
llvm-svn: 284957
|
| |
|
|
|
|
| |
Clang patch to replace 512-bit vector and 64-bit element versions with native IR will follow.
llvm-svn: 284955
|
| |
|
|
| |
llvm-svn: 284953
|
| |
|
|
| |
llvm-svn: 284940
|
| |
|
|
|
|
| |
We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results.
llvm-svn: 284939
|
| |
|
|
| |
llvm-svn: 284938
|
| |
|
|
|
|
| |
AVX512VL VPERMV3
llvm-svn: 284922
|
| |
|
|
| |
llvm-svn: 284921
|
| |
|
|
|
|
| |
tPCRelJT may not be the first instruction in a block. Check that instead of dereferencing a broken iterator.
llvm-svn: 284917
|
| |
|
|
| |
llvm-svn: 284916
|
| |
|
|
| |
llvm-svn: 284915
|
| |
|
|
|
|
|
|
|
|
|
|
| |
In BasicAA GEP operand values get adjusted ("wrap-around") based on the
pointersize. Otherwise, in non-64b modes, AA could report false negatives.
However, a wrap-around is valid only for a fully evaluated expression.
It had been introduced to fix an alias problem in
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html.
This commit restricts the wrap-around to constant gep operands only where the
value is known at compile-time.
llvm-svn: 284908
|
| |
|
|
| |
llvm-svn: 284896
|