| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adding a new restructuredText file to document the trace format produced with
an FDR mode handler and read by llvm-xray toolset.
Fixed two problems in the documentation from differential review. One bad table
and a missing link in the toc.
Original commit was e97c5836a77db803fe53319c53f3bf8e8b26d2b7.
Reviewers: dberris, pelikan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36041
llvm-svn: 309891
|
|
|
|
| |
llvm-svn: 309890
|
|
|
|
|
|
|
|
|
| |
LIT launches executables with absolute, and not relative, path.
strncmp would try to do exact comparison and fail.
Differential Revision: https://reviews.llvm.org/D36242
llvm-svn: 309889
|
|
|
|
| |
llvm-svn: 309887
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I am only doing the
splitting for the New PM, but I can do the same for the legacy PM as
a follow-on if this looks good.
*Not NFC since for partial unrolling we lose the updates done to the
loop traversal (adding new sibling and child loops) - according to
Chandler this is not very useful for partial unrolling, but it also
means that the debugging flag -unroll-revisit-child-loops no longer
works for partial unrolling.
Reviewers: chandlerc
Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36157
llvm-svn: 309886
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was surprised to see the code model being passed to MC. After all,
it assembles code, it doesn't create it.
The one place it is used is in the expansion of .cfi directives to
handle .eh_frame being more that 2gb away from the code.
As far as I can tell, gnu assembler doesn't even have an option to
enable this. Compiling a c file with gcc -mcmodel=large produces a
regular looking .eh_frame. This is probably because in practice linker
parse and recreate .eh_frames.
In llvm this is used because the JIT can place the code and .eh_frame
very far apart. Ideally we would fix the jit and delete this
option. This is hard.
Apart from confusion another problem with the current interface is
that most callers pass CodeModel::Default, which is bad since MC has
no way to map it to the target default if it actually needed to.
This patch then replaces the argument with a boolean with a default
value. The vast majority of users don't ever need to look at it. In
fact, only CodeGen and llvm-mc use it and llvm-mc just to enable more
testing.
llvm-svn: 309884
|
|
|
|
|
|
|
|
|
|
| |
(xor(sext(cmp)), -1) to ext(!cmp).
As far as I can tell this should be handled by foldCastedBitwiseLogic which is called later in visitXor.
Differential Revision: https://reviews.llvm.org/D36214
llvm-svn: 309882
|
|
|
|
|
|
|
|
| |
This adds support for sext in foldLogicCastConstant. This is a prerequisite for D36214.
Differential Revision: https://reviews.llvm.org/D36234
llvm-svn: 309880
|
|
|
|
|
|
|
|
|
| |
Followup to r309570, fixing it slightly differently (ranges_base and
addr_base should never be read from a DWO file - so there shouldn't be
any issue with 'overriding' the values - conditionalize the code and
assert that the values aren't being overriden).
llvm-svn: 309879
|
|
|
|
|
|
|
|
|
| |
Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW)
for byte, halfword and word. We should take advantage of these.
Differential Revision: https://reviews.llvm.org/D34684
llvm-svn: 309876
|
|
|
|
| |
llvm-svn: 309872
|
|
|
|
|
|
| |
Test fusion of AES operations.
llvm-svn: 309855
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch makes LoopDeletion use the incremental DominatorTree API.
We modify LoopDeletion to perform the deletion in 5 steps:
1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch.
2. Inform the DomTree about the new edge.
3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable.
4. Inform the DomTree about the deleted edge.
5. Remove the unreachable block from the function.
Creating the dummy conditional branch is necessary to perform incremental DomTree update.
We should consider using the batch updater when it's ready.
Reviewers: dberlin, davide, grosser, sanjoy
Reviewed By: dberlin, grosser
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35391
llvm-svn: 309850
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is update after the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments.
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp
// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.
But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.
This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.
This patch fixes PR33928.
llvm-svn: 309849
|
|
|
|
|
|
|
|
| |
This reverts commit 3462b8ad41a840fd54dbbd0d3f2a514c5ad6f656.
The docs-llvm-html target failed.
llvm-svn: 309842
|
|
|
|
|
|
|
|
|
| |
GAS ignores the aforementioned issue
this patch aligns LLVM + throws in an appropriate warning
Differential Revision: https://reviews.llvm.org/D36060
llvm-svn: 309841
|
|
|
|
| |
llvm-svn: 309839
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adding a new restructuredText file to document the trace format produced with
an FDR mode handler and read by llvm-xray toolset.
Reviewers: dberris, pelikan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36041
llvm-svn: 309836
|
|
|
|
| |
llvm-svn: 309834
|
|
|
|
|
|
|
| |
This reverts r309602, check-lit still leaves Output directories in the
source directory.
llvm-svn: 309833
|
|
|
|
|
|
|
|
|
|
| |
If there are no calls, this is a faster path than
searching the entire program for calls.
This was supposed to be left in r309781.
Fixes unused variable warning.
llvm-svn: 309832
|
|
|
|
|
|
| |
rdar://problem/33580047
llvm-svn: 309831
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During store merge we construct a sorted list of consecutive store
candidates and consider subsequences for merging into a single
store. For each subsequence we check if the stored value type is legal
the merged store would have valid and fast and if the constructed
value to be stored is valid. The only properties that affect this
check between subsequences is the size of the subsequence, the
alignment of the first store, the alignment of the stored load value
(when merging stores-of-loads), and whether the merged value is a
constant zero.
If we do not find a viable mergeable subsequence starting from the
first store of length N, we know that a subsequence starting at a
later store of length N will also fail unless the new store's
alignment, the new load's alignment (if we're merging store-of-loads),
or we've dropped stores of nonzero value and could construct a merged
stores of zero (for merging constants).
As a result if we fail to find a valid subsequence starting from the
first store we can safely skip considering subsequences that start
with subsequent stores unless one of the above properties is
true. This significantly (2x) improves compile time in some
pathological cases.
Reviewers: RKSimon, efriedma, zvi, spatel, waltl
Subscribers: grandinj, llvm-commits
Differential Revision: https://reviews.llvm.org/D35901
llvm-svn: 309830
|
|
|
|
|
|
| |
Separate the checking of the fused pairings with B.cc and CBcc.
llvm-svn: 309825
|
|
|
|
| |
llvm-svn: 309824
|
|
|
|
|
|
|
|
| |
MachineLocation::getOffset() always returns 0.
rdar://problem/33580047
llvm-svn: 309823
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Suggested by @t.p.northover in https://bugs.llvm.org/show_bug.cgi?id=34015.
Reviewers: javed.absar, t.p.northover, rengolin
Reviewed By: t.p.northover
Subscribers: aemerson, kristof.beyls, llvm-commits, t.p.northover
Differential Revision: https://reviews.llvm.org/D36223
llvm-svn: 309821
|
|
|
|
| |
llvm-svn: 309819
|
|
|
|
| |
llvm-svn: 309818
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D35856
llvm-svn: 309817
|
|
|
|
| |
llvm-svn: 309816
|
|
|
|
| |
llvm-svn: 309814
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
extracted elements
Summary:
Currently most of the time vectors of extractelement instructions are
treated as scalars that must be gathered into vectors. But in some
cases, like when we have extractelement instructions from single vector
with different constant indeces or from 2 vectors of the same size, we
can treat this operations as shuffle of a single vector or blending of 2
vectors.
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
%x0 = extractelement <2 x i8> %x, i32 0
%y1 = extractelement <2 x i8> %y, i32 1
%x0x0 = mul i8 %x0, %x0
%y1y1 = mul i8 %y1, %y1
%ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0
%ins2 = insertelement <2 x i8> %ins1, i8 %y1y1, i32 1
ret <2 x i8> %ins2
}
```
can be converted to something like
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
%1 = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i32> <i32 0, i32 3>
%2 = mul <2 x i8> %1, %1
ret <2 x i8> %2
}
```
Currently this type of conversion is considered as high cost
transformation.
Reviewers: mzolotukhin, delena, mkuper, hfinkel, RKSimon
Subscribers: ashahid, RKSimon, spatel, llvm-commits
Differential Revision: https://reviews.llvm.org/D30200
llvm-svn: 309812
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should enable us to test the generation of target-specific constant
pools, e.g. for ARM:
constants:
- id: 0
value: 'g(GOT_PREL)-(LPC0+8-.)'
alignment: 4
isTargetSpecific: true
I intend to use this to test PIC support in GlobalISel for ARM.
This is difficult to test outside of that context, since the existing
MIR tests usually rely on parser support as well, and that seems a bit
trickier to add. We could try to add a unit test, but the setup for that
seems rather convoluted and overkill.
We do test however that the parser reports a nice error when
encountering a target-specific constant pool.
Differential Revision: https://reviews.llvm.org/D36092
llvm-svn: 309806
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
weren't in the match.
Summary:
Fix a bug discovered in an out-of-tree target where memoperands from
pseudo-instructions that weren't part of the match were being merged into the
result instructions as part of GIR_MergeMemOperands.
This bug was caused by a change to the handling of State.MIs between rules when
the state machine tables were fused into a single table. Previously, each rule
would reset State.MIs using State.MIs.resize(1) but this is no longer done, as a
result stale data is occasionally left in some elements of State.MIs. Most
opcodes aren't affected by this but GIR_MergeMemOperands merges all memoperands
from the intructions recorded in State.MIs into the result instruction.
Suppose for example, we processed but rejected the following pattern:
(signextend (load x))
at this point, State.MIs contains the signextend and the load. Now suppose we
process and accept this pattern:
(add x, y)
at this point, State.MIs contains the add as well as the (now irrelevant) load.
When GIR_MergeMemOperands is processed, the memoperands from that irrelevant
load will be merged into the result instruction even though it was not part of
the match.
Bringing back the State.MIs.resize(1) would fix the problem but it would limit
our ability to optimize the table in the future. Instead, this patch fixes the
problem by explicitly stating which instructions should be merged into the result.
There's no direct test case in this commit because a test case would be very brittle.
However, at the time of writing this should fix the failures in
http://green.lab.llvm.org/green/job/Compiler_Verifiers_GlobalISEL/ as well as a
failure in test/CodeGen/ARM/GlobalISel/arm-isel.ll when expensive checks are enabled.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: fhahn, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36094
llvm-svn: 309804
|
|
|
|
|
|
|
|
|
|
| |
'sext' of 'cmp' test.
When the 'and' test was originally added it was intended to make sure we didn't change it to a sext of and of cmp. But since then the test was changed to expect it to be turned into 'select cmp1, sext cmp2, 0'. Then another optimization was added to turn the select into 'sext (and cmp1, cmp2)' which is exactly the transformation that was being blocked when the test case started.
Looks like 'or' gets optimized in a similar way, but not 'xor'.
llvm-svn: 309793
|
|
|
|
| |
llvm-svn: 309790
|
|
|
|
| |
llvm-svn: 309789
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
compiler.
Summary: The logic is guarded by "assert".
Reviewers: davidxl, davide, chandlerc
Reviewed By: davide, chandlerc
Subscribers: sanjoy, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D36195
llvm-svn: 309787
|
|
|
|
| |
llvm-svn: 309785
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
infinite-inlining across multiple runs of the inliner by keeping a tiny
history of internal-to-SCC inlining decisions.
This is still a bit gross, but I don't yet have any fundamentally better
ideas and numerous people are blocked on this to use new PM and ThinLTO
together.
The core of the idea is to detect when we are about to do an inline that
has a chance of re-splitting an SCC which we have split before with
a similar inlining step. That is a critical component in the inlining
forming a cycle and so far detects all of the various cyclic patterns
I can come up with as well as the original real-world test case (which
comes from a ThinLTO build of libunwind).
I've added some tests that I think really demonstrate what is going on
here. They are essentially state machines that march the inliner through
various steps of a cycle and check that we stop when the cycle is closed
and that we actually did do inlining to form that cycle.
A lot of thanks go to Eric Christopher and Sanjoy Das for the help
understanding this issue and improving the test cases.
The biggest "yuck" here is the layering issue -- the CGSCC pass manager
is providing somewhat magical state to the inliner for it to use to make
itself converge. This isn't great, but I don't honestly have a lot of
better ideas yet and at least seems nicely isolated.
I have tested this patch, and it doesn't block *any* inlining on the
entire LLVM test suite and SPEC, so it seems sufficiently narrowly
targeted to the issue at hand.
We have come up with hypothetical issues that this patch doesn't cover,
but so far none of them are practical and we don't have a viable
solution yet that covers the hypothetical stuff, so proceeding here in
the interim. Definitely an area that we will be back and revisiting in
the future.
Differential Revision: https://reviews.llvm.org/D36188
llvm-svn: 309784
|
|
|
|
| |
llvm-svn: 309783
|
|
|
|
|
|
|
|
|
|
| |
This was failing on out of bounds access to the extra operands
on the s_swappc_b64 beyond those in the instruction definition.
This was working, but somehow regressed within the past few weeks,
although I don't see any obvious commit.
llvm-svn: 309782
|
|
|
|
| |
llvm-svn: 309781
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: In ThinLTO backend compile, OPTOptions are not set so that the ICP in ThinLTO backend does not know if it is a SamplePGO build, in which profile count needs to be annotated directly on call instructions. This patch cleaned up the PGOOptions handling logic and passes down PGOOptions to ThinLTO backend.
Reviewers: chandlerc, tejohnson, davidxl
Reviewed By: chandlerc
Subscribers: sanjoy, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D36052
llvm-svn: 309780
|
|
|
|
|
|
|
|
|
|
| |
Previous change "Turn s_and_saveexec_b64 into s_and_b64 if
result is unused" introduced asan use-after-poison error.
Instruction was analyzed after eraseFromParent() calls.
Move analysys higher than erase.
llvm-svn: 309779
|
|
|
|
|
|
| |
Distribute various expressions across ifs.
llvm-svn: 309777
|
|
|
|
|
|
|
|
| |
When finding the fixed offsets for function arguments,
this needs to skip over the 4 bytes reserved for the
emergency stack slot.
llvm-svn: 309776
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pattern shows up when lowering byval copies on AMDGPU.
The byval object access is split into 4-byte chunks, adding a
constant offset to the FixedStack base. When some of the offsets
turn into ors, this prevents combining the constant offsets.
This makes it not apparent that the object is there when matching
addressing modes, so it ends up using a scratch wave offset
relative access and the lengthy frame index expansion for that.
llvm-svn: 309775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`llc -march` is problematic because it only switches the target
architecture, but leaves the operating system unchanged. This
occasionally leads to indeterministic tests because the OS from
LLVM_DEFAULT_TARGET_TRIPLE is used.
However we can simply always use `llc -mtriple` instead. This changes
all the tests to do this to avoid people using -march when they copy and
paste parts of tests.
See also the discussion in https://reviews.llvm.org/D35287
llvm-svn: 309774
|