| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
because it breaks compiler-rt tests.
This reverts commit 6d03890384517919a3ba7fe4c35535425f278f89.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.
See https://bugs.llvm.org/show_bug.cgi?id=42344
Reviewers: rnk
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D67723
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
in the following C code the branch is not removed by clang in O3.
```
int f1(char* p) {
int i1 = __builtin_strlen(p);
if (!p)
return -1;
return i1;
}
```
The issue is that the call to strlen is sunk to the following block by instcombine. In its new place the call to strlen doesn't dominate the use in the icmp anymore so value tracking can't see that p cannot be null.
This patch resolves the issue by inserting an assumption at the place of the call before sinking a call when that call can be used to prove an argument to be nonnull.
This resolves this issue at O3.
Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69477
|
| |
|
|
|
|
| |
off_t apparently is just "long" on Win64, which is 32-bits, and
therefore not long enough to compare with UINT32_MAX. Use auto to follow
the surrounding code. uint64_t would also be fine.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds MXCSR as a reserved physical register and models its use
by X86 SSE instructions. It also adds flag "mayRaiseFPException" for the
instructions that possibly can raise FP exception according to the
architecture definition.
Following what SystemZ and other targets does, only the current rounding
modes and the IEEE exception masks are modeled. *Changes* of the MXCSR
due to exceptions are not modeled.
Patch by Pengfei Wang
Differential Revision: https://reviews.llvm.org/D68121
|
| |
|
|
|
|
|
|
|
|
| |
readlane and writelane instructions are not allowed to use m0 as the
data operand, so spilling them is tricky and would require an
intermediate SGPR to spill it. Constrain the virtual register class in
this caes to disallow the inline spiller from folding the m0 operand
directly into the spill instruction.
I copied this hack from AArch64 which has the same problem for $sp.
|
| | |
|
| | |
|
| |
|
|
|
|
| |
hardcode number of operands or position of the EFLAGS operand.
This makes the code immune to the MXCSR addition in D68121.
|
| | |
|
| |
|
|
|
|
|
|
| |
LinkGraph::splitBlock will split a block at a given index, returning a new
block covering the range [ 0, index ) and modifying the original block to
cover the range [ index, original-block-size ). Block addresses, content,
edges and symbols will be updated as necessary. This utility will be used
in upcoming improvements to JITLink's eh-frame support.
|
| |
|
|
|
|
|
|
|
|
| |
The scheduling definitions for ASIMD transpose and zipping overlapped with
others a few lines below. Somehow, they didn't raise errors before.
There seem to be other overlapping definitions. Somehow, they still don't
raise errors.
Differential revision: https://reviews.llvm.org/D68353
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds support for generating the XCOFF data section in object files for global variables with initialization.
Merged aix-xcoff-common.ll into aix-xcoff-data.ll.
Changed variable name charr to chrarray in the test case to test if readobj works with 8-character names.
Authored by: xingxue
Reviewers: hubert.reinterptrtcast, sfertile, jasonliu, daltenty, Xiangling_L.
Reviewed by: hubert.reinterpretcast, sfertile, daltenty.
Subscribers: DiggerLin, Wuzish, nemanjai, hiraditya, MaskRay, jsji, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67125
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From SelectionDAGs point of view, debug variable locations specified with
dbg.declare and dbg.addr are indirect -- they specify the address of
something. But calling conventions might mean that a Value is placed on
the stack somewhere, and this too is indirection. Previously this was
mixed up in the "IsIndirect" field of DBG_VALUE insts; this patch
separates them by encoding the indirection in a DIExpression.
If we have a dbg.declare or dbg.addr, then the expression produces an
address that then becomes a DWARF memory location. We can represent
this by putting a DW_OP_deref on the _end_ of the expression. If a Value
has been placed on the stack, then we need to put a DW_OP_deref on the
_start_ of the expression, to load the Value from the stack and have
the rest of the expression operate on it.
Differential Revision: https://reviews.llvm.org/D69028
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Delete the BasicBlockPass and BasicBlockManager, all its dependencies and update documentation.
The BasicBlockManager was improperly tested and found to be potentially broken, and was deprecated as of rL373254.
In light of the switch to the new pass manager coming before the next release, this patch is a first cleanup of the LegacyPassManager.
Reviewers: chandlerc, echristo
Subscribers: mehdi_amini, sanjoy.google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69121
|
| |
|
|
| |
getTargetShuffleAndZeroables
|
| |
|
|
| |
helper. NFCI.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for vectorization of the jumbled stores. The value
operands are vectorized and then shuffled in the right order before
store.
Reviewers: RKSimon, spatel, hfinkel, mkuper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43339
|
| |
|
|
| |
Requested by Cameron McInally in D69275.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
VCCZBugHandledSet was used to make sure we don't apply the same
workaround more than once to a single cbranch instruction, but it's not
necessary because the workaround involves inserting an s_waitcnt
instruction, which is enough for subsequent iterations to detect that no
further workaround is necessary.
Also beef up the test case to check that the workaround was only applied
once. I have also manually verified that the test still passes even if I
hack the big do-while loop in runOnMachineFunction to run a minimum of
five iterations.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69621
|
| | |
|
| |
|
|
|
| |
It used to generate S2_tstbit_i with constant -33 which resulted in an assert.
The reason is log2_32 was called with 64bit value 0.
|
| |
|
|
| |
Technically integers can assert on getZExtValue() if beyond i64 range, and a fuzzer usually find this.....
|
| |
|
|
|
|
| |
Enable lowering of constant pool index, jump table index, and bloack address to MIR on AIX.
Differential Revision: https://reviews.llvm.org/D69264
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
During AArch64 frame lowering instructions to enable return address
signing are inserted into function if needed. Functions generated during
machine outlining don't run through target frame lowering and hence are
missing such instructions.
This patch introduces the following changes:
1. If not all functions that potentially participate in function outlining
agree on their return address signing scope and their return address
signing key, outlining is disabled for these functions.
2. If not all functions that potentially participate in function outlining
agree on their support for v8.3A features, outlining is disabled for
these functions.
2. If all candidate functions agree on the signing scope, signing key and
and their support for v8.3 features, the outlined function behaves as
if it had the same scope and key attributes and as if it would provide
the same v8.3A support as the original functions.
Reviewers: olista01, paquette, t.p.northover, ostannard
Reviewed By: ostannard
Subscribers: ostannard, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69097
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is used on AMDGPU for rounding from v3f64 (which is illegal) to
v3f32 (which is legal).
Subscribers: jvesely, nhaehnle, tpr, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69339
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This extends the rules for when a call instruction is deemed to be an
FPMathOperator, which is based on the type of the call (i.e. the return
type of the function being called). Previously we only allowed
floating-point and vector-of-floating-point types. Now we also allow
arrays (nested to any depth) of floating-point and
vector-of-floating-point types.
This was motivated by llpc, the pipeline compiler for AMD GPUs
(https://github.com/GPUOpen-Drivers/llpc). llpc has many math library
functions that operate on vectors, typically represented as <4 x float>,
and some that operate on matrices, typically represented as
[4 x <4 x float>], and it's useful to be able to decorate calls to all
of them with fast math flags.
Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy
Subscribers: wdng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69161
|
| |
|
|
|
|
|
|
|
|
| |
This is a follow-up to D67448.
Split live intervals with multiple dead defs during the initial
execution of the live interval analysis, but do it outside of the
function createAndComputeVirtRegInterval.
Differential Revision: https://reviews.llvm.org/D68666
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The architecture enum contains two kinds of contstants: the "official" ones
defined by Microsoft, and unofficial constants added by breakpad to cover the
architectures not described by the first ones.
Up until now, there was no big need to differentiate between the two. However,
now that Microsoft has defined
https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/ns-sysinfoapi-system_info
a constant for ARM64, we have a name clash.
This patch renames all breakpad-defined constants with to include the prefix
"BP_". This frees up the name "ARM64", which I'll re-introduce with the new
"official" value in a follow-up patch.
Reviewers: amccarth, clayborg
Subscribers: lldb-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D69285
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the describeLoadedValue() with support for target specific ARM and
AArch64 instructions interpretation. The patch provides specialization for
ADD and SUB operations that include a register and an immediate/offset
operand. Some of the instructions can operate with global string addresses
or constant pool indexes but such cases are omitted since we currently lack
flexible support for processing such operands at DWARF production stage.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D67556
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds support for codegen of masked stores, with non-truncating
and truncating variants.
Reviewers: huntergr, greened, dmgreen, rovka, sdesmalen
Reviewed By: dmgreen, sdesmalen
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69378
|
| |
|
|
|
|
| |
FSHL/FSHR support
Remove hard wired legality check.
|
| |
|
|
| |
hardwiring to MVT::i8
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add intrinsics for the following:
- sxt[b|h|w] & uxt[b|h|w]
- cls & clz
- not & cnot
Reviewers: huntergr, sdesmalen, dancgr
Reviewed By: sdesmalen
Subscribers: cameron.mcinally, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69567
|
| |
|
|
|
| |
This one should have been done in r363902 when hasReadVCCZBug was
introduced.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, craig.topper, rnk
Reviewed By: rnk
Subscribers: rnk, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69034
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The address sanitizer ignore memory accesses from different address
spaces, however when instrumenting globals the check for different
address spaces is missing. This result in assertion failure. The fault
was found in an out of tree target.
The patch skip all globals of non default address space.
Reviewed By: leonardchan, vitalybuka
Differential Revision: https://reviews.llvm.org/D68790
|
| |
|
|
|
|
|
|
|
|
|
|
| |
the match pattern
If the instruction have match pattern, llvm-tblgen will infer the sideeffect bit from the match pattern and it works well.
If not, the tblgen will set it as true that hurt the scheduling.
PowerPC has some instructions that didn't specify the match pattern(i.e. LXSD etc), which is manually selected post-ra according
to the register pressure. We need to clear the sideeffect flag for these instructions.
Differential Revision: https://reviews.llvm.org/D69232
|
| |
|
|
|
|
|
|
|
|
| |
Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp
expansions but do not change any default policy for now.
This also fixes a bug in the memcmp expansion itself when large
displacements are needed.
https://reviews.llvm.org/D69507
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When createing an ORC remote JIT target the current library split forces the target process to link large portions of LLVM (Core, Execution Engine, JITLink, Object, MC, Passes, RuntimeDyld, Support, Target, and TransformUtils). This occurs because the ORC RPC interfaces rely on the static globals the ORC Error types require, which starts a cycle of pulling in more and more.
This patch breaks the ORC RPC Error implementations out into an "OrcError" library which only depends on LLVM Support. It also pulls the ORC RPC headers into their own subdirectory.
With this patch code can include the Orc/RPC/*.h headers and will only incur link dependencies on LLVMOrcError and LLVMSupport.
Reviewers: lhames
Reviewed By: lhames
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68732
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69581
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add a flag `F_no_mmap` to `FileOutputBuffer` to support
`--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores
this flag for compatibility with GNU ld and gold.
We need this flag to speed up link time for large binaries in certain
scenarios. When we link some of our larger binaries we find that LLD
takes 50+ GB of memory, which causes memory pressure. The memory
pressure causes the VM to flush dirty pages of the output file to disk.
This is normally okay, since we should be flushing cold pages. However,
when using BtrFS with compression we need to write 128KB at a time when
we flush a page. If any page in that 128KB block is written again, then
it must be flushed a second time, and so on. Since LLD doesn't write
sequentially this causes write amplification. The same 128KB block will
end up being flushed multiple times, causing the linker to many times
more IO than necessary. We've observed 3-5x faster builds with
-no-mmap-output-file when we hit this scenario.
The bad scenario only applies to compressed filesystems, which group
together multiple pages into a single compressed block. I've tested
BtrFS, but the problem will be present for any compressed filesystem
on Linux, since it is caused by the VM.
Silently ignoring --no-mmap-output-file caused a silent regression when
we switched from gold to lld. We pass --no-mmap-output-file to fix this
edge case, but since lld silently ignored the flag we didn't realize it
wasn't being respected.
Benchmark building a 9 GB binary that exposes this edge case. I linked 3
times with --mmap-output-file and 3 times with --no-mmap-output-file and
took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM,
BtrFS mounted with -compress-force=zstd, and an 80% full disk.
| Mode | Time |
|---------|-------|
| mmap | 894 s |
| no mmap | 126 s |
When compression is disabled, BtrFS performs just as well with and
without mmap on this benchmark.
I was unable to reproduce the regression with any binaries in
lld-speed-test.
Reviewed By: ruiu, MaskRay
Differential Revision: https://reviews.llvm.org/D69294
|
| |
|
|
|
|
|
|
|
|
| |
This patch adds support for deleted C++ special member functions in
clang and llvm. Also added Defaulted member encodings for future
support for defaulted member functions.
Patch by Sourabh Singh Tomar!
Differential Revision: https://reviews.llvm.org/D69215
|
| |
|
|
|
|
|
|
|
|
| |
stores w/StoreSDNode) by default
Enable the new SelectionDAG representation for unordered loads and stores introduced in r371441 by default. As a reminder, the new lowering changes the representation of an unordered atomic load from an AtomicSDNode - which is essentially a black box which gets passed through without combines messing with it - to a LoadSDNode w/a atomic marker on the MMO. The later parallels the way we handle volatiles, and I've audited the code to ensure that every location which checks one checks the other.
This has been fairly heavily fuzzed, and I examined diffs in a reasonable large corpus of assembly by hand, so I'm reasonable sure this is correct for the common case. Late in the review for this, it was discovered that I hadn't correctly handled cases which could be legalized into CAS operations. This points out that there's a strong bias in the IR of the frontend I'm working with towards only legal atomics. If there are problems with this patch, the most likely area will be legalization.
Differential Revision: https://reviews.llvm.org/D69219
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
known zero.
This catches some cases. There are probably ways to improve this.
I tried doing it as a combine on the setcc, but that broke
some cases involving flag reuse in place of test.
I renamed the isX86CCUnsigned to isX86CCSigned and flipped its
polarity to make it consistent with the similar functions for
ISD::SETCC. This avoids calling EQ/NE as being signed or unsigned.
Fixes PR43823.
Differential Revision: https://reviews.llvm.org/D69499
|
| |
|
|
|
|
|
|
|
| |
Adding patten matching for two SVE intrinsics: frecps and frsqrts.
Also added patterns for fsub and fmul - these SDNodes directly correspond
to machine instructions.
Review: https://reviews.llvm.org/D68476
Patch authored by mgudim (Mikhail Gudim).
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir failed with
EXPENSIVE_CHECKS enabled, causing the patch to be reverted in
rG2c496bb5309c972d59b11f05aee4782ddc087e71.
This patch relands the patch with a proper fix to the
live-debug-values-reg-copy.mir tests, by ensuring the MIR encodes the
callee-saves correctly so that the CalleeSaved info is taken from MIR
directly, rather than letting it be recalculated by the PEI pass. I've
done this by running `llc -stop-before=prologepilog` on the LLVM
IR as captured in the test files, adding the extra MOV instructions
that were manually added in the original test file, then running `llc
-run-pass=prologepilog` and finally re-added the comments for the MOV
instructions.
|
| |
|
|
|
|
|
|
|
| |
Stores are vectorized with maximum vectorization factor of 16. Patch
tries to improve the situation and use maximal vectorization factor.
Reviewers: spatel, RKSimon, mkuper, hfinkel
Differential Revision: https://reviews.llvm.org/D43582
|
| | |
|