| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change reworks the handling of atomic memcpy within the instcombine pass.
Previously, a constant length atomic memcpy would be lowered into loads & stores
as long as no more than 16 load/store pairs are created. This is quite different
from the lowering done for a non-atomic memcpy; which only ever lowers into a single
load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are
expanded to load/stores in later passes, such as SelectionDAG lowering.
In this change the behaviour for atomic memcpy is unified with non-atomic memcpy;
atomic memcpy is now treated in the same was as non-atomic memcpy has always been.
We leave it to later passes to lower longer-length atomic memcpy calls.
Due to the structure of the pass's handling of memtransfer intrinsics, this change
also gives us handling of atomic memmove that we did not previously have.
Reviewers: apilipenko, skatkov, mkazantsev, anna, reames
Reviewed By: reames
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D46658
llvm-svn: 332093
|
|
|
|
|
|
| |
Nothing uses this yet but this will allow us to specialize MMX/XMM/YMM/ZMM vector moves.
llvm-svn: 332090
|
|
|
|
|
|
|
|
|
|
|
|
| |
losesInfo would be left unset when no conversion needs to be done. A
caller such as InstCombine's fitsInFPType would then branch on an
uninitialized value.
Caught using valgrind on an out-of-tree target.
Differential Revision: https://reviews.llvm.org/D46645
llvm-svn: 332087
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45883
llvm-svn: 332082
|
|
|
|
| |
llvm-svn: 332079
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
https://bugs.llvm.org/show_bug.cgi?id=34897 demonstrates an incorrect
coroutine frame allocation elision in the coro-elide pass. The elision
is performed on the basis that the SSA variables from all llvm.coro.begin
are directly referenced in subsequent llvm.coro.destroy instructions.
However, this ignores the fact that the function may exit through paths
that do not run these destroy instructions. In the sample program from
PR34897, for example, the llvm.coro.destroy instruction is only
executed in exception handling code. When the coroutine function exits
normally, llvm.coro.destroy is not called. Eliding the allocation in
this case causes a subsequent reference to the coroutine handle from
outside of the function to access freed memory.
To fix the issue, when finding an llvm.coro.destroy for each llvm.coro.begin,
only consider llvm.coro.destroy that are executed along non-exceptional paths.
Test Plan:
1. Download the sample program from
https://bugs.llvm.org/show_bug.cgi?id=34897, compile it with
`clang++ -fcoroutines-ts -stdlib=libc++ -std=c++1z -O2`, and run it.
It should print `"run1\ncheck1\nrun2\ncheck2"` and then exit
successfully.
2. Compile https://godbolt.org/g/mCKfnr and confirm it is still
optimized to a single instruction, 'return 1190'.
3. `check-llvm`
Reviewers: rsmith, GorNishanov, eric_niebler
Reviewed By: GorNishanov
Subscribers: andrewrk, lewissbaker, EricWF, llvm-commits
Differential Revision: https://reviews.llvm.org/D43242
llvm-svn: 332077
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add documentation for the LLVM Support functions `openFileForWrite` and
`openFileForRead`. The `openFileForRead` parameter `RealPath`, in
particular, I think warranted some explanation.
In addition, make the behavior of the functions more consistent across
platforms. Prior to this patch, Windows would set or not set the result
file descriptor based on the nature of the error, whereas Unix would
consistently set it to `-1` if the open failed. Make Windows
consistently set it to `-1` as well.
Test Plan:
1. `ninja check-llvm`
2. `ninja docs-llvm-html`
Reviewers: zturner, rnk, danielmartin, scanon
Reviewed By: danielmartin, scanon
Subscribers: scanon, danielmartin, llvm-commits
Differential Revision: https://reviews.llvm.org/D46499
llvm-svn: 332075
|
|
|
|
|
|
| |
with 'unreachable'
llvm-svn: 332072
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Ship kNetBSD_ShadowOffset32 set to 1ULL << 30.
This is prepared for the amd64 kernel runtime.
Sponsored by <The NetBSD Foundation>
Reviewers: vitalybuka, joerg, kcc
Reviewed By: vitalybuka
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46724
llvm-svn: 332069
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cold
We found current sampleFDO had a performance issue when triaging a regression.
For a callsite with inline instance in the profile, even if hot callsite inliner
cannot inline it, it may still execute enough times and should not be treated as
cold in regular inliner later. However, currently if such callsite is not inlined
by hot callsite inliner, and the BB where the callsite locates doesn't get
samples from other instructions inside of it, the callsite will have no profile
metadata annotated. In regular inliner cost analysis, if the callsite has no
profile annotated and its caller has profile information, it will be treated as
cold.
The fix changes the isCallsiteHot check and chooses to compare
CallsiteTotalSamples with hot cutoff value computed by ProfileSummaryInfo.
Differential Revision: https://reviews.llvm.org/D45377
llvm-svn: 332058
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.
Differential Revision: https://reviews.llvm.org/D46668
llvm-svn: 332057
|
|
|
|
|
|
| |
The bitwidth of the operation should always be wider than the result width of the truncate since we don't recurse through any width changing operations.
llvm-svn: 332055
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements a new table-gen emitter to create tables for
a wasm disassembler, and a dissassembler to use them.
Comes with 2 tests, that tests a few instructions manually. Is also able to
disassemble large .wasm files with objdump reasonably.
Not working so well, to be addressed in followups:
- objdump appears to be passing an incorrect starting point.
- since the disassembler works an instruction at a time, and it is
disassembling stack instruction, it has no idea of pseudo register assignments.
These registers are required for the instruction printing code that follows.
For now, all such registers appear in the output as $0.
Patch by Wouter van Oortmerssen
Differential Revision: https://reviews.llvm.org/D45848
llvm-svn: 332052
|
|
|
|
|
|
|
|
|
|
|
|
| |
from r331958.
Clang's codegen now uses 128-bit masked load/store intrinsics in IR. The backend will widen to 512-bits on AVX512F targets.
So this patch adds patterns to detect codegen's widening and patterns for AVX512VL that don't get widened.
We may be able to drop some of the old patterns, but I leave that for a future patch.
llvm-svn: 332049
|
|
|
|
|
|
|
|
|
| |
This reverts commit SVN r331889, which could trigger failed
assertions for cases where the snprintf function is declared
with a vaguely differing signature (e.g. being defined as
static inline), see PR37408.
llvm-svn: 332043
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45881
llvm-svn: 332042
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Move and correct LLVMDIBuilderCreateTypedef. This is the last API in DIBuilderBindings.h, so it is being removed and the C API will now be re-exported from IRBindings.h.
Reviewers: whitequark, harlanhaskins, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46725
llvm-svn: 332041
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, mgorny, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45994
llvm-svn: 332039
|
|
|
|
|
|
|
|
|
|
|
| |
This is similar to what we do for integer min/max with 'not'
ops (rL321882).
This should fix:
https://bugs.llvm.org/show_bug.cgi?id=37404
https://bugs.llvm.org/show_bug.cgi?id=37405
llvm-svn: 332031
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
encoded the contribution
length excluding the table header. Instead it must encode the contribution length minus the length
field itself.
Reviewer: JDevliegehere
Differential Revision: https://reviews.llvm.org/D45922
llvm-svn: 332030
|
|
|
|
|
|
|
|
|
| |
ValueTracking; NFC
Differential Revision: https://reviews.llvm.org/D46704
Change-Id: Ifabcbe431a2169743b3cc310f2a34fd706f13f02
llvm-svn: 332026
|
|
|
|
|
|
|
| |
This was missing from r331961.
Caught by sanitizer bots.
llvm-svn: 332024
|
|
|
|
|
|
| |
Use instrs lists or merge multiple instregex patterns.
llvm-svn: 332022
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accessing the members of a large data structures needs a lot of GEPs which
usually have large offsets due to the size of the underlying data structure. If
the offsets are too large to fit into the r+i addressing mode, these GEPs cannot
be sunk to their users' blocks and many extra registers are needed then to carry
the values of these GEPs.
This patch tries to split a large data struct starting from %base like the
following.
Before:
BB0:
%base =
BB1:
%gep0 = gep %base, off0
%gep1 = gep %base, off1
%gep2 = gep %base, off2
BB2:
%load1 = load %gep0
%load2 = load %gep1
%load3 = load %gep2
After:
BB0:
%base =
%new_base = gep %base, off0
BB1:
%new_gep0 = %new_base
%new_gep1 = gep %new_base, off1 - off0
%new_gep2 = gep %new_base, off2 - off0
BB2:
%load1 = load i32, i32* %new_gep0
%load2 = load i32, i32* %new_gep1
%load3 = load i32, i32* %new_gep2
In the above example, the struct is split into two parts. The first part still
starts from %base and the second part starts from %new_base. After the
splitting, %new_gep1 and %new_gep2 have smaller offsets and then can be sunk to
BB2 and folded into their users.
The algorithm to split data structure is simple and very similar to the work of
merging SExts. First, it collects GEPs that have large offsets when iterating
the blocks. Second, it splits the underlying data structures and updates the
collected GEPs to use smaller offsets.
Differential Revision: https://reviews.llvm.org/D42759
llvm-svn: 332015
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Adds getters for the line, column, and scope of a DILocation
- Adds getters for the name, size in bits, offset in bits, alignment in bits, line, and flags of a DIType
Reviewers: whitequark, harlanhaskins, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46627
llvm-svn: 332014
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Move LLVMTemporaryMDNode and LLVMMetadataReplaceAllUsesWith to the C bindings and add LLVMDeleteTemporaryMDNode for deleting non-RAUW'ed temporary nodes.
Reviewers: whitequark, harlanhaskins, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46632
llvm-svn: 332010
|
|
|
|
| |
llvm-svn: 332006
|
|
|
|
|
|
|
|
|
|
|
| |
These symbols only get included in the output symbols table if
they are used in a relocation.
This behaviour matches more closely the ELF object writer.
Differential Revision: https://reviews.llvm.org/D46561
llvm-svn: 332005
|
|
|
|
|
|
|
| |
This code can just test whether blocks are *in* the loop, which we
already have a dedicated set tracking in the loop itself.
llvm-svn: 332004
|
|
|
|
| |
llvm-svn: 332002
|
|
|
|
|
|
|
|
| |
WriteVecALU/WriteVecLogic/WriteShuffle/WriteVarShuffle/WritePSADBW/WritePHAdd scheduler classes
Split off XMM classes from the default (MMX) classes.
llvm-svn: 331999
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up to the rL330983. The patch teaches ld, sd, and lld
commands accept 32-bit memory offsets by replacing `mem_simm16` operand
to `mem_simmptr`. In fact, these commands should accept 64-bit offsets,
but so large offsets require another command expanding and will be
supported by a separate patch.
Differential Revision: https://reviews.llvm.org/D46629
llvm-svn: 331997
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up to the rL330983. The patch teaches lh and lhu
commands accepts 32-bit memory offsets by replacing `mem_simm16` operand
to `mem_simmptr`.
Differential Revision: https://reviews.llvm.org/D46513
llvm-svn: 331996
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With nnan, there's no need for the masked merge / blend
sequence (that probably costs much more than the min/max
instruction).
Somewhere between clang 5.0 and 6.0, we started producing
these intrinsics for fmax()/fmin() in C source instead of
libcalls or fcmp/select. The backend wasn't prepared for
that, so we regressed perf in those cases.
Note: it's possible that other targets have similar problems
as seen here.
Noticed while investigating PR37403 and related bugs:
https://bugs.llvm.org/show_bug.cgi?id=37403
The IR FMF propagation cases still don't work. There's
a proposal that might fix those cases in D46563.
llvm-svn: 331992
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change teaches DSE that the atomic memory intrinsics can be overwriten
partially in the same way as the non-atomic forms. Specifically, that the
atomic memcpy & memset can be shortened at the end and that the atomic memset
can be shortened at the beginning, if they partially overwritten
by later stores.
Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith, spatel, filcab, sanjoy
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45584
llvm-svn: 331991
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes bug https://bugs.llvm.org/show_bug.cgi?id=37339.
InlineAsm is only uniqued if the FunctionTypes are exactly the
same, while cmpTypes() for example considers all pointer types
in the default address space to be the same. For this reason
the end of cmpInlineAsm() can be reached.
This patch replaces the unreachable assertion with a check that
the function types are not identical.
Differential Revision: https://reviews.llvm.org/D46495
Reviewers: jfb
llvm-svn: 331990
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The combine in rebuildSetCC may be combined to another
node leaving our references stale. Keep a handle on
it to avoid stale references.
Fixes PR36602.
Reviewers: dbabokin, RKSimon, eli.friedman, davide
Subscribers: hiraditya, uabelho, JesperAntonsson, qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D46404
llvm-svn: 331985
|
|
|
|
|
|
|
|
|
|
|
|
| |
The print format was causing at least 2 unit-test failures from r331971.
The signed/unsigned comparison warnings only appeared to affect two lines but
it was unclear whether it might just pop up on other lines, so I have been
explicit in all the literals in the tests.
There were other bot unit-test failures that I am still investigating.
llvm-svn: 331978
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Operand 0 is the condition, not the true value.
Use op 1 and op 2 as the correct values.
Reviewers: george.burgess.iv, nlopes, efriedma
Reviewed By: george.burgess.iv
Subscribers: craig.topper, rjmccall, lebedev.ri, llvm-commits
Differential Revision: https://reviews.llvm.org/D46343
llvm-svn: 331976
|
|
|
|
|
|
|
|
|
| |
Put in a conservatively correct estimate for now. Avoids miscompiling
clang in FDO mode. This is really tricky to trigger in reality as
basically all interesting cases will be folded away by computeKnownBits
earlier, I was unable to find a reasonably small test case.
llvm-svn: 331975
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: dblaikie, JDevlieghere, espindola
Differential Revision: https://reviews.llvm.org/D44560
Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.
There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).
I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.
Known behaviour changes:
- The dump function in DWARFContext now does not attempt to read subsequent
tables when searching for a specific offset, if the unit length field of a
table before the specified offset is a reserved value.
- getOrParseLineTable now returns a useful Error if an invalid offset is
encountered, rather than simply a nullptr.
- The parse functions no longer use `WithColor::warning` directly to report
errors, allowing LLD to call its own warning function.
- The existing parse error messages have been updated to not specifically
include "warning" in their message, allowing consumers to determine what
severity the problem is.
- If the line table version field appears to have a value less than 2, an
informative error is returned, instead of just false.
- If the line table unit length field uses a reserved value, an informative
error is returned, instead of just false.
- Dumping of .debug_line.dwo sections is now implemented the same as regular
.debug_line sections.
- Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
there is a prologue error, just like non-verbose dumping.
As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.
This change also requires a change to LLD, which will be committed separately.
llvm-svn: 331971
|
|
|
|
|
|
|
|
| |
Reviewers: atanasyan, smaksimovic, abeserminji
Differential Revision: https://reviews.llvm.org/D46390
llvm-svn: 331969
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: craig.topper, RKSimon
Reviewed By: craig.topper, RKSimon
Differential Revision: https://reviews.llvm.org/D46539
llvm-svn: 331961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During simplification umax we trigger isKnownPredicate twice. As a first attempt it
tries the induction. To do that it tries to get post increment of SCEV.
Re-writing the SCEV may result in simplification of umax. If the SCEV contains a lot
of umax operations this recursion becomes very slow.
The added test demonstrates the slow behavior.
To resolve this we use only simple ways to check whether the predicate is known.
Reviewers: sanjoy, mkazantsev
Reviewed By: sanjoy
Subscribers: lebedev.ri, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D46046
llvm-svn: 331949
|
|
|
|
|
|
| |
computeKnownBits call. NFC
llvm-svn: 331948
|
|
|
|
|
|
|
|
| |
columns violation. NFC
Fix a similar line in the same function.
llvm-svn: 331947
|
|
|
|
| |
llvm-svn: 331944
|
|
|
|
|
|
|
|
|
|
|
|
| |
Const/local/shared address spaces are all < 4GB and we can always use
32-bit pointers to access them. This has substantial performance impact
on kernels that uses shared memory for intermediary results.
The feature is disabled by default.
Differential Revision: https://reviews.llvm.org/D46147
llvm-svn: 331941
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up to D45986. As suggested there, we should match the "all-bits-set"
pattern in addition to "any-bits-set".
This was a little more complicated than I thought it would be initially because the
"and 1" instruction can be anywhere in the chain. Hopefully, the code comments make
that logic understandable, but if you see a way to simplify or improve that, it's
most appreciated.
This transforms patterns that emerge from bitfield tests as seen in PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098
I think it would also help reduce the large test from:
D46336
D46595
but we need something to reassociate that case to the forms we're expecting here first.
Differential Revision: https://reviews.llvm.org/D46649
llvm-svn: 331937
|
|
|
|
|
|
|
|
| |
The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed.
Differential Revision: https://reviews.llvm.org/D46203
llvm-svn: 331935
|