| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds basic .debug_names verification capabilities to the
DWARF verifier. Right now, it checks that the headers and abbreviation
tables of the individual name indexes can be parsed correctly, it
verifies the buckets table and the cross-checks the CU lists for
consistency. I intend to add further checks in follow-up patches.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: vleschuk, echristo, clayborg, llvm-commits
Differential Revision: https://reviews.llvm.org/D44211
llvm-svn: 327011
|
| |
|
|
|
|
|
| |
In future, both the summary information and the 'instruction info' table should
be moved into a separate "Summary" view.
llvm-svn: 327010
|
| |
|
|
|
|
|
|
| |
I'm trying to preserve the intent of these tests by using
non-undef operands; if we fix FP undef folding these tests
will not pass.
llvm-svn: 327004
|
| |
|
|
|
|
| |
This makes 'ninja clean ; ninja check-all' work again.
llvm-svn: 327002
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm-mca is an LLVM based performance analysis tool that can be used to
statically measure the performance of code, and to help triage potential
problems with target scheduling models.
llvm-mca uses information which is already available in LLVM (e.g. scheduling
models) to statically measure the performance of machine code in a specific cpu.
Performance is measured in terms of throughput as well as processor resource
consumption. The tool currently works for processors with an out-of-order
backend, for which there is a scheduling model available in LLVM.
The main goal of this tool is not just to predict the performance of the code
when run on the target, but also help with diagnosing potential performance
issues.
Given an assembly code sequence, llvm-mca estimates the IPC (instructions per
cycle), as well as hardware resources pressure. The analysis and reporting style
were mostly inspired by the IACA tool from Intel.
This patch is related to the RFC on llvm-dev visible at this link:
http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html
Differential Revision: https://reviews.llvm.org/D43951
llvm-svn: 326998
|
| |
|
|
|
|
|
|
|
| |
Allow us to embed the (Xcode) toolchain in the dSYM bundle's property
list.
Differential revision: https://reviews.llvm.org/D44151
llvm-svn: 326994
|
| |
|
|
|
|
|
|
|
|
|
|
| |
of vXi32.
This instruction can be thought of as reading either the even elements of a vXi32 input or the lower half of each element of a vXi64 input. We currently use the vXi32 interpretation, but vXi64 matches better with its broadcast behavior in EVEX.
I'm looking at moving MULDQ/MULUDQ creation to a DAG combine so we can do it when AVX512DQ is enabled without having to go through Custom lowering. But in some of the test cases we failed to use a broadcast load due to the size difference. This should help with that.
I'm also wondering if we can model these instructions in native IR and remove the intrinsics and I think using a vXi64 type will work better with that.
llvm-svn: 326991
|
| |
|
|
|
|
|
|
| |
This reverts commit 1f3bd185c53beb6aa68446974b7e80837abd6ef0 (r326107)
because it fails
ThinLTO/X86/diagnostic-handler-remarks-with-hotness.ll.
llvm-svn: 326975
|
| |
|
|
|
|
|
|
|
|
| |
Summary:
Original change was D43313 (r326932) and reverted by r326953 because it
broke an LLD test and a windows build. The LLD test was already fixed in
lld commit r326944 (thanks maskray). This is the original change with
the windows build fixed.
llvm-svn: 326970
|
| |
|
|
|
|
|
|
|
|
| |
unaligned predicates.
These patterns weren't checking the alignment of the load, but were using the aligned instructions. This will cause a GP fault if the data isn't aligned.
I believe these were introduced in r312450.
llvm-svn: 326967
|
| |
|
|
|
|
|
|
|
|
| |
Since there is no instruction for integer vector division, factor in the
cost of singling out each element to be used with the scalar division
instruction.
Differential revision: https://reviews.llvm.org/D43974
llvm-svn: 326955
|
| |
|
|
|
|
| |
This reverts commit rr326932 because it broke lld/test/ELF/eh-frame-hdr-augmentation.s.
llvm-svn: 326953
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The attached testcase started failing after the patch to define
isExtractSubvectorCheap with the following pattern mismatch:
ISEL: Starting pattern match
Initial Opcode index to 85068
Match failed at index 85076
LLVM ERROR: Cannot select: t47: v8i16 = insert_subvector undef:v8i16, t43, Constant:i64<0>
The code generated from llvm/lib/Target/AArch64/AArch64InstrInfo.td
def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i32 0)),
(INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
is in ninja/lib/Target/AArch64/AArch64GenDAGISel.inc
At the location of the error it is:
/* 85076*/ OPC_CheckChild2Type, MVT::i32,
And it failed to match the type of operand 2.
Adding another def-pat for i64 fixes the failed def-pat error:
def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i64 0)),
(INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
llvm-svn: 326949
|
| |
|
|
|
|
|
|
| |
Not all build bots have unzip which I used in a test.
This reverts commit 0b1f26d39ea42dd3716b525fbc8c78d8c7bb4479.
llvm-svn: 326941
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Because of -ffunction-sections (and maybe other use cases I'm not aware of?) it
can occur that we need more than 0xfeff sections but ELF dosn't support that
many sections. To solve this problem SHN_XINDEX exists and with it come a whole
host of changes for section indexes everywhere. This change adds support for
those cases which should allow llvm-objcopy to copy binaries that have an
arbitrary number of sections.
Differential Revision: https://reviews.llvm.org/D42516
llvm-svn: 326940
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enhances DWARFDebugFrame with the capability of parsing and
printing DWARF expressions in CFI instructions. It also makes FDEs and
CIEs accessible to lib users, so they can process them in client tools
that rely on LLVM. To make it self-contained with a test case, it
teaches llvm-readobj to be able to dump EH frames and checks they are
correct in a unit test. The llvm-readobj code is Maksim Panchenko's work
(maksfb).
Reviewers: JDevlieghere, espindola
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D43313
llvm-svn: 326932
|
| |
|
|
| |
llvm-svn: 326930
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A desired property of the node order in Swing Modulo Scheduling is
that for nodes outside circuits the following holds: none of them is
scheduled after both a successor and a predecessor. We call
node orders that meet this property valid.
Although invalid node orders do not lead to the generation of incorrect
code, they can cause the pipeliner not being able to find a pipelined schedule
for arbitrary II. The reason is that after scheduling the successor and the
predecessor of a node, no room may be left to schedule the node itself.
For data flow graphs with 0-latency edges, the node ordering algorithm
of Swing Modulo Scheduling can generate such undesired invalid node orders.
This patch fixes that.
In the remainder of this commit message, I will give an example
demonstrating the issue, explain the fix, and explain how the the fix is tested.
Consider, as an example, the following data flow graph with all
edge latencies 0 and all edges pointing downward.
```
n0
/ \
n1 n3
\ /
n2
|
n4
```
Consider the implemented node order algorithm in top-down mode. In that mode,
the algorithm orders the nodes based on greatest Height and in case of equal
Height on lowest Movability. Finally, in case of equal Height and
Movability, given two nodes with an edge between them, the algorithm prefers
the source-node.
In the graph, for every node, the Height and Movability are equal to 0.
As will be explained below, the algorithm can generate the order n0, n1, n2, n3, n4.
So, node n3 is scheduled after its predecessor n0 and after its successor n2.
The reason that the algorithm can put node n2 in the order before node n3,
even though they have an edge between them in which node n3 is the source,
is the following: Suppose the algorithm has constructed the partial node
order n0, n1. Then, the nodes left to be ordered are nodes n2, n3, and n4. Suppose
that the while-loop in the implemented algorithm considers the nodes in
the order n4, n3, n2. The algorithm will start with node n4, and look for
more preferable nodes. First, node n4 will be compared with node n3. As the nodes
have equal Height and Movability and have no edge between them, the algorithm
will stick with node n4. Then node n4 is compared with node n2. Again the
Height and Movability are equal. But, this time, there is an edge between
the two nodes, and the algorithm will prefer the source node n2.
As there are no nodes left to compare, the algorithm will add node n2 to
the node order, yielding the partial node order n0, n1, n2. In this way node n2
arrives in the node-order before node n3.
To solve this, this patch introduces the ZeroLatencyHeight (ZLH) property
for nodes. It is defined as the maximum unweighted length of a path from the
given node to an arbitrary node in which each edge has latency 0.
So, ZLH(n0)=3, ZLH(n1)=ZLH(n3)=2, ZLH(n2)=1, and ZLH(n4)=0
In this patch, the preference for a greater ZeroLatencyHeight
is added in the top-down mode of the node ordering algorithm, after the
preference for a greater Height, and before the preference for a
lower Movability.
Therefore, the two allowed node-orders are n0, n1, n3, n2, n4 and n0, n3, n1, n2, n4.
Both of them are valid node orders.
In the same way, the bottom-up mode of the node ordering algorithm is adapted
by introducing the ZeroLatencyDepth property for nodes.
The patch is tested by adding extra checks to the following existing
lit-tests:
test/CodeGen/Hexagon/SUnit-boundary-prob.ll
test/CodeGen/Hexagon/frame-offset-overflow.ll
test/CodeGen/Hexagon/vect/vect-shuffle.ll
Before this patch, the pipeliner failed to pipeline the loops in these tests
due to invalid node-orders. After the patch, the pipeliner successfully
pipelines all these loops.
Reviewers: bcahoon
Reviewed By: bcahoon
Subscribers: Ayal, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D43620
llvm-svn: 326925
|
| |
|
|
|
|
|
| |
Test was added in r326906 to an incorrect location.
Moving the test to PPC CodeGen directory as the test is PPC specific.
llvm-svn: 326923
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Simplify feature checks by using isTypeLegal.
The v8i32 conversion on AVX1 targets was only working after LowerMUL splits 256-bit vectors.
While I was there I've also made it so we don't have to check for AVX2 and BWI directly and instead just ask if the type is legal.
Differential Revision: https://reviews.llvm.org/D44190
llvm-svn: 326917
|
| |
|
|
|
|
|
|
|
| |
This is a follow-up to r325169, this time for all types, not just HVX
vector types.
Disable this by default, since it's not always safe.
llvm-svn: 326915
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache;
loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords.
Author: FarhanaAleen
Reviewed By: rampitec
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44179
llvm-svn: 326910
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions.
Summary:
If the operands of a udiv/urem can be proved to fit within a smaller
power-of-two-sized type, reduce the width of the udiv/urem.
Backed out for failing an assert in clang bootstrap builds. Re-landing
with a fix for handling non-power-of-two inputs (e.g. udiv i24).
Original Differential Revision: https://reviews.llvm.org/D44102
llvm-svn: 326908
|
| |
|
|
|
|
| |
This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03.
llvm-svn: 326907
|
| |
|
|
|
|
|
|
|
| |
The purpose of this patch is to have LSR generate better code on Power.
This is done by overriding isLSRCostLess.
Differential Revision: https://reviews.llvm.org/D40855
llvm-svn: 326906
|
| |
|
|
| |
llvm-svn: 326904
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Instead of only printing the CU-relative offset in non-verbose mode, it
makes more sense to only printed the resolved address. In verbose mode
we still print both.
Differential revision: https://reviews.llvm.org/D44148
rdar://33525475
llvm-svn: 326903
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r326839.
r326839 breaks assembly file parsing:
$ cat q.c
void g() {}
$ clang -S q.c -g
$ clang -g -c q.s
q.s:9:2: error: file number already allocated
.file 1 "/tmp/test" "q.c"
^
llvm-svn: 326902
|
| |
|
|
|
|
|
|
|
|
| |
udiv/urem instructions."
Breaks bootstrap builds: clang built with this patch asserts while
building MCDwarf.cpp: Assertion `castIsValid(op, S, Ty) && "Invalid
cast!"' failed.
llvm-svn: 326900
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If the operands of a udiv/urem can be proved to fit within a smaller
power-of-two-sized type, reduce the width of the udiv/urem.
Reviewers: spatel, sanjoy
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D44102
llvm-svn: 326898
|
| |
|
|
| |
llvm-svn: 326897
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These instructions are defined as taking a GPR register and a
coprocessor register for ISAs up to MIPS32. MIPS32 extended the
definition to allow a selector--a value from 0 to 32--to access
another register.
These instructions are now internally defined as being MIPS-I
instructions, but are rejected for pre-MIPS32 ISA's if they have
an explicit selector which is non-zero. This deviates slightly from
GAS's behaviour which rejects assembly instructions with an
explicit selector for pre-MIPS32 ISAs.
E.g:
mfc0 $4, $5, 0
is rejected by GAS for MIPS-I to MIPS-V but will be accepted
with this patch for MIPS-I to MIPS-V.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D41662
llvm-svn: 326890
|
| |
|
|
|
|
|
|
|
|
|
| |
The LoadStoreVectorizer thought that <1 x T> and T were the same types
when merging stores, leading to a crash later.
Patch by Erik Hogeman.
Differential Revision: https://reviews.llvm.org/D44014
llvm-svn: 326884
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Don't PerformSHLSimplify if the given node is used by a node that also uses a
constant because we may get stuck in an infinite combine loop.
bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36577
Patch by Sam Parker.
Differential Revision: https://reviews.llvm.org/D44097
llvm-svn: 326882
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before the patch a try to reassociate ((v * 16) * 0) * 1 fall into infinite loop
Reviewers: pankajchawla
Differential Revision: http://reviews.llvm.org/D41467
From: Evgeny Stupachenko <evstupac@gmail.com>
<evgeny.v.stupachenko@intel.com>
llvm-svn: 326861
|
| |
|
|
|
|
|
|
| |
Fixes the bug found by asan. Also XFAIL the new test for Darwin,
which is stuck on DWARF v2, and fix up other tests so they stop
failing on Windows.
llvm-svn: 326839
|
| |
|
|
|
|
|
|
| |
Notably helps cleanup after legalization of vector types
Differential Revision: https://reviews.llvm.org/D43674
llvm-svn: 326838
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's been quite some time the Dependence Analysis (DA) is broken,
as it uses the GEP representation to "identify" multi-dimensional arrays.
It even wrongly detects multi-dimensional arrays in single nested loops:
from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6
;; for (long int i = 0; i < 50; i++) {
;; A[i][3*i - 6] = i;
;; *B++ = A[i][i];
DA used to detect two subscripts, which makes no sense in the LLVM IR
or in C/C++ semantics, as there are no guarantees as in Fortran of
subscripts not overlapping into a next array dimension:
maximum nesting levels = 1
SrcPtrSCEV = %A
DstPtrSCEV = %A
using GEPs
subscript 0
src = {0,+,1}<nuw><nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
subscript 1
src = {-6,+,3}<nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
Separable = {}
Coupled = {1}
With the current patch, DA will correctly work on only one dimension:
maximum nesting levels = 1
SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body>
DstSCEV = {%A,+,404}<%for.body>
subscript 0
src = {(-2424 + %A)<nsw>,+,1212}<%for.body>
dst = {%A,+,404}<%for.body>
class = 1
loops = {1}
Separable = {0}
Coupled = {}
This change removes all uses of GEP from DA, and we now only rely
on the SCEV representation.
The patch does not turn on -da-delinearize by default, and so the DA analysis
will be more conservative in the case of multi-dimensional memory accesses in
nested loops.
I disabled some interchange tests, as the DA is not able to disambiguate
the dependence anymore. To make DA stronger, we may need to
compute a bound on the number of iterations based on the access functions
and array dimensions.
The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to
avoid checking for snippets of LLVM IR: this form of checking is very hard to
maintain. Instead, we now check for output of the pass that are more meaningful
than dozens of lines of LLVM IR. Some tests now require -debug messages and thus
only enabled with asserts.
Patch written by Sebastian Pop and Aditya Kumar.
Differential Revision: https://reviews.llvm.org/D35430
llvm-svn: 326837
|
| |
|
|
| |
llvm-svn: 326831
|
| |
|
|
| |
llvm-svn: 326830
|
| |
|
|
|
|
| |
The spaces in the instructions are now consistent.
llvm-svn: 326829
|
| |
|
|
|
|
|
|
|
|
| |
in 32-bit mode
We don't currently reject r8-r15 or xmm8-32 or bpl/spl/sil/dil in 32-bit mode.
Differential Revision: https://reviews.llvm.org/D44031
llvm-svn: 326826
|
| |
|
|
|
|
|
|
|
|
|
| |
In case if -mattr used to modify feature set bits in llvm-mc call
getIsaVersion can fail to identify specific ISA due to test mismatch.
Adding default fallback tests which will always correctly report at
least major version.
Differential Revision: https://reviews.llvm.org/D44163
llvm-svn: 326825
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Emit UdtSourceLine information for enums to match MSVC
- Add a method to add UDTSrcLine and call it for all Class/Struct/Union/Enum
- Update test cases to verify the changes
Reviewers: zturner, llvm-commits, rnk
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D44116
llvm-svn: 326824
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Using cst_pred_ty in the definition allows us to match vectors with undef elements.
This is a continuation of an effort to make all pattern matchers allow undef elements in vectors:
rL325437
rL325466
D43792
Differential Revision: https://reviews.llvm.org/D44076
llvm-svn: 326823
|
| |
|
|
|
|
| |
This avoids duplicated code and now also rejects dllimport aliases.
llvm-svn: 326814
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following the ARM-neon backend, define isExtractSubvectorCheap to return true
when extracting low and high part of a neon register.
The patch disables a test in llvm/test/CodeGen/AArch64/arm64-ext.ll This
testcase is fragile in the sense that it requires a BUILD_VECTOR to "survive"
all DAG transforms until ISelLowering. The testcase is supposed to check that
AArch64TargetLowering::ReconstructShuffle() works, and for that we need a
BUILD_VECTOR in ISelLowering. As we now transform the BUILD_VECTOR earlier into
an VEXT + vector_shuffle, we don't have the BUILD_VECTOR pattern when we get to
ISelLowering. As there is no way to disable the combiner to only exercise the
code in ISelLowering, the patch disables the testcase.
Differential revision: https://reviews.llvm.org/D43973
llvm-svn: 326811
|
| |
|
|
|
|
|
|
|
|
| |
One addrspacecast disappeared in clang emitted IR for
block invoke function due to adoption of the new
addr space mapping.
Differential Revision: https://reviews.llvm.org/D43785
llvm-svn: 326806
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch handling:
Enable parsing of raw encodings of system registers .
Allows UNPREDICTABLE sysregs to be decoded to a raw number in the same way that disasslib does, rather than llvm crashing.
Disassemble msr/mrs with unpredictable sysregs as SoftFail.
Fix regression due to SoftFailing some encodings.
Patch by Chris Ryder
Differential revision:https://reviews.llvm.org/D43374
llvm-svn: 326803
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change doCallSiteSplitting to iterate until we reach the terminator instruction.
tryToSplitCallSite can replace BB's terminator in case BB is a successor of
itself. Then IE will be invalidated and we also have to check the current
terminator.
Reviewers: junbuml, davidxl, davide, fhahn
Reviewed By: fhahn, junbuml
Differential Revision: https://reviews.llvm.org/D43824
llvm-svn: 326793
|