| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reverts part of r362750 / D62650, which stopped
LiveDebugVariables from trimming leading variable location ranges down
to only covering those instructions that are in scope. I've observed some
circumstances where the number of DBG_VALUEs in a function can be
amplified in an un-necessary way, to cover more instructions that are
out of scope, leading to very slow compile times. Trimming the range
of instructions that the variables cover solves the slow compile times.
The specific problem that r362750 tries to fix is addressed by the
assignment to RStart that I've added. Any variable location that begins
at the first instruction of a block will now be considered to begin at the
start of the block. While these sound the same, the have different
SlotIndexes, and the register allocator may shoehorn additional
instructions in between the two. The test added in the past
(wrong_debug_loc_after_regalloc.ll) still works with this modification.
live-debug-variables.ll has a range trimmed to not cover the prologue of
the function, while dbg-addr-dse.ll has a DBG_VALUE sink past one
instruction with no DebugLoc, which is expected behaviour.
Differential Revision: https://reviews.llvm.org/D73691
(cherry picked from commit 41206b61e30c3d84188cb17b91c2c0c800982dd1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mark the CrashRecoveryContextImpl constructor noexcept, so that MSVC
won't emit an unwind helper to clean up the allocation from `new` if the
constructor throws an exception.
Otherwise, MSVC complains:
llvm\lib\Support\CrashRecoveryContext.cpp(220): error C2712: \
Cannot use __try in functions that require object unwinding
The other simple fix would be to wrap `new` in a static helper or
lambda.
Users have reported that Tensorflow builds LLVM with /EHsc.
(cherry picked from commit a349c09162a8260bdf691c4f7ab72a16c33975f6)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When more than one SelectPseudo instruction is handled a new MBB is
returned. This must not be done if that would result in leaving an undhandled
isel pseudo behind in the original MBB.
Fixes https://bugs.llvm.org/show_bug.cgi?id=44849.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D74352
(cherry picked from commit 0311e28e9cc01a244faa774b8cab337b45404fa9)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm::report_fatal_error() generate preprocessed source + reproducer.sh again.
Added a test for #pragma clang __debug llvm_fatal_error to test for the original issue.
Added llvm::sys::Process::Exit() and replaced ::exit() in places where it was appropriate. This new function would call the current CrashRecoveryContext if one is running on the same thread; or call ::exit() otherwise.
Fixes PR44705.
Differential Revision: https://reviews.llvm.org/D73742
(cherry picked from commit faace365088a2a3a4cb1050a9facfc34a7a56577)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Copy it instead. Otherwise, key registers (such as RBP) may get zeroed
out by the stack unwinder.
Fixes CrashRecoveryTest.DumpStackCleanup with MSVC in release builds.
Reviewed By: stella.stamenova
Differential Revision: https://reviews.llvm.org/D73809
(cherry picked from commit b074acb82f7e75a189fa7933b09627241b166121)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Debug Info Version was changed to use "Max" instead of "Warning" per the
original design intent - but this maxes old/new IR unlinkable, since
mismatched merge styles are a linking failure.
It seems possible/maybe reasonable to actually support the combination
of these two flags: Warn, but then use the maximum value rather than the
first value/earlier module's value.
Reviewers: tejohnson
Differential Revision: https://reviews.llvm.org/D74257
(cherry picked from commit ba9cae58bbdd41451ee67773c9d0f90a01756f12)
|
|
|
|
|
|
|
|
|
|
| |
Previously, the SEH codepath in CrashRecoveryContext didn't create a CrashRecoveryContextImpl. The other codepaths (VEH and Unix) were creating it.
When running with -fintegrated-cc1, this is needed to handle exit() as a jump to CrashRecoveryContext's exception filter, through a call to RaiseException. In that situation, we need a user-defined exception code, which is later interpreted as an exit() by the exception filter. This in turn needs to set RetCode accordingly, *inside* the exception filter, and *before* calling HandleCrash().
Differential Revision: https://reviews.llvm.org/D74078
(cherry picked from commit 2a3fa0fc5cd7d3398c0293915b0e569eaa0be24b)
|
|
|
|
|
|
|
|
|
| |
As discussed in D70568, remove this because it isn't used anywhere, and I think it's better to go through real crashes for testing (#pragma clang __debug crash).
Also remove the support function llvm::CrashRecoveryContext::HandleCrash() which was added at the same time by @ddunbar.
Differential Revision: https://reviews.llvm.org/D74063
(cherry picked from commit 8ecde3ac34bbb5a8d53d8ec5cd32867658646df1)
|
|
|
|
|
|
|
|
| |
This patch adds a new option to enable/disable register renaming in the
load-store optimizer. Defaults to disabled, as there is a potential
mis-compile caused by this.
(cherry picked from commit 8e3f59b45ae185cc9b4e3a817d7ac958f1d55976)
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old version might be faster on EG (RECIP_IEEE is Trans only),
but it'd need extra corner case checks.
This gives correct corner case behaviour and saves a register.
Fixes OCL CTS sqrt test (1-thread, scalar) on Turks.
Reviewer: arsenm
Differential Revision: https://reviews.llvm.org/D74017
(cherry picked from commit e6686adf8a743564f0c455c34f04752ab08cf642)
|
|
|
|
|
|
|
|
|
|
|
|
| |
X86 uses i8 for shift amounts. This code can fail on a 32-bit target
if it runs after type legalization.
This code was copied from AArch64 and modified for X86, but the
shift amount wasn't changed to the correct type for X86.
Fixes PR44812
(cherry picked from commit ec9a94af4d5fb3270f2451fcbec5a3a99f4ac03a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compiler may transform the following code
ctx = ctx + reloc_offset
... (*(u32 *)ctx) & 0x8000 ...
to
ctx = ctx + reloc_offset
... (*(u8 *)(ctx + 1)) & 0x80 ...
where reloc_offset will be replaced with a constant during
AsmPrinter phase.
The above transformed code will be rejected the kernel verifier
as it does not allow
*(type *)((ctx + non_zero_offset1) + non_zero_offset2)
style access pattern.
It is hard at SelectionDag phase to identify whether a load
is related to context or not. Sometime, interprocedure analysis
may be needed. So let us simply prevent such optimization
from happening.
Differential Revision: https://reviews.llvm.org/D73997
(cherry picked from commit d96c1bbaa03574daf759e5e9a6c75047c5e3af64)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
ARM Type Promotion pass does not clear
the container that defines if one variable
was visited or not, missing optimization
opportunities by luck when two llvm:Values
from different functions are allocated at
the same memory address.
Also fixes a comment and uses existing
method to pop and obtain last element
of the worklist.
Reviewers: samparker
Reviewed By: samparker
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73970
(cherry picked from commit 8ba2b6281075c65c1a47abed57810e1201942533)
|
|
|
|
|
|
|
|
|
|
|
|
| |
While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541,
it does so in a more roundabout manner and there might be other
loopholes to trigger the same issue. This is a more direct fix,
that prevents the transform if the min/max is based on a
non-canonical sub X, 0 instruction.
Differential Revision: https://reviews.llvm.org/D73849
(cherry picked from commit a148b9e9909db6a592609eb35b4de38c9e67cb8b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there is no way to disable ExpensiveCombines when doing
a standalone opt -instcombine run, as that's the default, and the
opt option can currently only be used to force enable, not to force
disable. The only way to disable expensive combines is via -O1 or -O2,
but that of course also runs the rest of the kitchen sink...
This patch allows using opt -instcombine -expensive-combines=0 to
run InstCombine without ExpensiveCombines.
Differential Revision: https://reviews.llvm.org/D72861
(cherry picked from commit 2ca092f3209579fde7a38ade511c1bbcef213c36)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform
if it wouldn't actually do anything (apart from removing and reinserting
the same instructions).
Note that the test case doesn't loop on current master anymore, only
on the LLVM 10 release branch. The issue is already mitigated on master
due to worklist order fixes, but we should fix the root cause there as well.
As a side note, we should probably assert in combineLoadToNewType()
that it does not combine to the same type. Not doing this here, because
this assertion would also be triggered in another place right now.
Differential Revision: https://reviews.llvm.org/D74278
(cherry picked from commit 23db9724d0e5490fa5a2a726acf015f84e2c87cf)
|
|
|
|
| |
This reverts commit 60e0120c913dd1d4bfe33769e1f000a076249a42.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently due to the edge caching, we create wrong predicates for
branches with matching true and false successors. We will cache the
condition for the edge from the true successor, and then lookup the same
edge (src and dst are the same) for the edge to the false successor.
If both successors match, the condition should always be true. At the
moment, we cannot really create constant VPValues, but we can just
create a true condition as X | !X. Later passes will clean that up.
Fixes PR44488.
Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D73079
(cherry picked from commit f14f2a856802e086662d919e2ead641718b27555)
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem was noticed by the Chrome OS toolchain folks
(crbug.com/1048445) because llvm-objcopy --add-gnu-debuglink would
insert the wrong checksum when processing a binary larger than 4 GB.
That use case regressed in 1e1e3ba2526 when we started using
llvm::crc32() in more places.
Differential revision: https://reviews.llvm.org/D74039
(cherry picked from commit 6c4a8bc0a9f6a466d90d542bef66d69550c1b041)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to D73680 (AArch64 BTI).
A local linkage function whose address is not taken does not need ENDBR32/ENDBR64. Placing the patch label after ENDBR32/ENDBR64 has the advantage that code does not need to differentiate whether the function has an initial ENDBR.
Also, add 32-bit tests and test that .cfi_startproc is at the function
entry. The line information has a general implementation and is tested
by AArch64/patchable-function-entry-empty.mir
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D73760
(cherry picked from commit 8ff86fcf4c038c7156ed4f01e7ed35cae49489e2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under MVE, we do not have any lowering for fminimum, which a
vector_reduce_fmin without NoNan will be expanded into. As with the
other recent patches, force this to expand in the pre-isel pass. Note
that Neon lowering would be OK because the scalar fminimum uses the
vector VMIN instruction, but is probably better to just rely on the
scalar operations, which is what is done here.
Also fixes what appears to be the reversal of INF vs -INF in the
vector_reduce_fmin widening code.
(cherry picked from commit 362d00e0510ee75750499e2993a782428e377215)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Followup to D73135. If the target doesn't have hard float (default
for ARM), then we assert when trying to soften the result of vector
reduction intrinsics. This patch marks these for expansion as well.
(A bit odd to use vectors on a target without hard float ... but
that's where you end up if you expose target-independent vector types.)
Differential Revision: https://reviews.llvm.org/D73854
(cherry picked from commit 1cc4f8d17247cd9be88addd75d060f9321b6f8b0)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fadd/fmul reductions without reassoc are lowered to
VECREDUCE_STRICT_FADD/FMUL nodes, which don't have legalization
support. Until that is in place, expand these intrinsics on
ARM and AArch64. Other targets always expand the vector reduction
intrinsics.
Additionally expand fmax/fmin reductions without nonan flag on
AArch64, as the backend asserts that the flag is present when
lowering VECREDUCE_FMIN/FMAX.
This fixes https://bugs.llvm.org/show_bug.cgi?id=44600.
Differential Revision: https://reviews.llvm.org/D73135
(cherry picked from commit 70d345e687caba4ac1f95655c6924dfa91e0083f)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Due to the fact that kill is just a normal intrinsic, even though it's
supposed to terminate the thread, we can end up with provably infinite
loops that are actually supposed to end successfully. The
AMDGPUUnifyDivergentExitNodes pass breaks up these loops, but because
there's no obvious place to make the loop branch to, it just makes it
return immediately, which skips the exports that are supposed to happen
at the end and hangs the GPU if all the threads end up being killed.
While it would be nice if the fact that kill terminates the thread were
modeled in the IR, I think that the structurizer as-is would make a mess if we
did that when the kill is inside control flow. For now, we just add a null
export at the end to make sure that it always exports something, which fixes
the immediate problem without penalizing the more common case. This means that
we sometimes do two "done" exports when only some of the threads enter the
discard loop, but from tests the hardware seems ok with that.
This fixes dEQP-VK.graphicsfuzz.while-inside-switch with radv.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70781
(cherry picked from commit 87d98c149504f9b0751189744472d7cc94883960)
|
|
|
|
|
|
|
|
|
|
| |
R600 relies on this behaviour.
Fixes: 6e18266aa4dd78953557b8614cb9ff260bad7c65 ('Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0"')
Fixes ~100 piglit regressions since 6e18266
Differential Revision: https://reviews.llvm.org/D72991
(cherry picked from commit 1b8eab179db46f25a267bb73c657009c0bb542cc)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The recommended optimization level for BPF programs
is O2 since (1). BPF is running inside the kernel and
linux kernel won't work at -O0 level, and (2). Verifier
is not able to handle O0 code properly, e.g., potential
large stack size and a lot of spills.
But we should keep -O0 at least compiling.
This patch fixed a bug in BPFMISimplifyPatchable phase
where with -O0, a segmentation fault will happen for a
simple program like:
int test(int a, int b) { return a + b; }
A test case is added to capture such a case.
Differential Revision: https://reviews.llvm.org/D73681
(cherry picked from commit 795bbb366266e83d2bea8dc04c19919b52ab3a2a)
|
|
|
|
|
|
|
|
|
| |
This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd.
The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for
Mesa.
(cherry picked from commit a80291ce10ba9667352adcc895f9668144f5f616)
|
|
|
|
|
|
|
|
|
|
| |
Pipeline scheduler model for the RISC-V Rocket micro-architecture using the
MIScheduler interface. Support for both 32 and 64-bit Rocket cores is
implemented.
Differential revision: https://reviews.llvm.org/D68685
(cherry picked from commit 838a28e234e098bfc073a45f37a4dd3bb5b45eab)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after
9a24488cb67a90f889529987275c5e411ce01dda, we place the NOP sled after
the initial BTI.
```
.Lfunc_begin0:
bti c
nop
nop
.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lfunc_begin0
```
This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label:
```
.Lfunc_begin0:
bti c
.Lpatch0:
nop
nop
.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lpatch0
```
This placement is compatible with the resolution in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 .
A local linkage function whose address is not taken does not need a BTI.
Placing the patch label after BTI has the advantage that code does not
need to differentiate whether the function has an initial BTI.
Reviewers: mrutland, nickdesaulniers, nsz, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73680
(cherry picked from commit 06b8e32d4fd3f634f793e3bc0bc4fdb885e7a3ac)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... as well as:
Revert "[DWARF] Defer creating declaration DIEs until we prepare call site info"
This reverts commit fa4701e1979553c2df61698ac1ac212627630442.
This reverts commit 79daafc90308787b52a5d3a7586e82acd5e374b3.
There have been reports of this assert getting hit:
CalleeDIE && "Could not find DIE for call site entry origin
(cherry picked from commit 802bec896171997a7b73dde3857712e0eedeabc1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Dead instructions do not need to be sunk. Currently we try and record
the recipies for them, but there are no recipes emitted for them and
there's nothing to sink. They can be removed from SinkAfter while
marking them for recording.
Fixes PR44634.
Reviewers: rengolin, hsaito, fhahn, Ayal, gilr
Reviewed By: gilr
Differential Revision: https://reviews.llvm.org/D73423
(cherry picked from commit a911fef3dd79e0a04b241be7b476dde7e99744c4)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
D72308 incorrectly assumed `resume` cannot exist without a `landingpad`,
which is not true. This sets `Changed` to true whenever we make changes
to a function, including creating a call to `__resumeException` within a
function without a landing pad.
Reviewers: tlively
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73308
(cherry picked from commit 580d7838dd08e13dac6caf4ab3142c9381bc7ad0)
|
|
|
|
|
|
|
|
| |
This should fix the build failure at
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/32524
and others.
(cherry picked from commit e0a6093a744d16c90eafa62d7143ce41806b2466)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds a ManglingOptions struct to IRMaterializationUnit, and replaces
IRCompileLayer::CompileFunction with a new IRCompileLayer::IRCompiler class. The
ManglingOptions struct defines the emulated-TLS state (via a bool member,
EmulatedTLS, which is true if emulated-TLS is enabled and false otherwise). The
IRCompileLayer::IRCompiler class wraps an IRCompiler (the same way that the
CompileFunction typedef used to), but adds a method to return the
IRCompileLayer::ManglingOptions that the compiler will use.
These changes allow us to correctly determine the symbols that will be produced
when a thread local global variable defined at the IR level is compiled with or
without emulated TLS. This is required for ORCv2, where MaterializationUnits
must declare their interface up-front.
Most ORCv2 clients should not require any changes. Clients writing custom IR
compilers will need to wrap their compiler in an IRCompileLayer::IRCompiler,
rather than an IRCompileLayer::CompileFunction, however this should be a
straightforward change (see modifications to CompileUtils.* in this patch for an
example).
(cherry picked from commit ce2207abaf9a925b35f15ef92aaff6b301ba6d22)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MaterializationResponsibility::defineMaterializing method allows clients to
add new definitions that are in the process of being materialized to the JIT.
This patch adds support to defineMaterializing for symbols with weak linkage
where the new definitions may be rejected if another materializer concurrently
defines the same symbol. If a weak symbol is rejected it will not be added to
the MaterializationResponsibility's responsibility set. Clients can check for
membership in the responsibility set via the
MaterializationResponsibility::getSymbols() method before resolving any
such weak symbols.
This patch also adds code to RTDyldObjectLinkingLayer to tag COFF comdat symbols
introduced during codegen as weak, on the assumption that these are COFF comdat
constants. This fixes http://llvm.org/PR40074.
(cherry picked from commit 84217ad66115cc31b184374a03c8333e4578996f)
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes PR39321.
GlobalExtensions is not guaranteed to be destroyed when optimizer plugins are unloaded. If it is indeed destroyed after a plugin is dlclose-d, the destructor of the corresponding ExtensionFn is not mapped anymore, causing a call to unmapped memory during destruction.
This commit guarantees that extensions coming from external plugins are removed from GlobalExtensions when the plugin is unloaded if GlobalExtensions has not been destroyed yet.
Differential Revision: https://reviews.llvm.org/D71959
(cherry picked from commit ab2300bc154f7bed43f85f74fd3fe31be71d90e0)
|
|
|
|
|
|
|
|
|
|
|
| |
Symbols created for merged external global variables have default
visibility. This can break programs when compiling with -Oz
-fvisibility=hidden as symbols that should be hidden will be exported at
link time.
Differential Revision: https://reviews.llvm.org/D73235
(cherry picked from commit a2fb2c0ddca14c133f24d08af4a78b6a3d612ec6)
|
|
|
|
| |
(cherry picked from commit 31e07692d7f2b383bd64c63cd2b5c35b6503cf3a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MachineFunction::getPSVManager()""
Reland 7a8b0b1595e7dc878b48cf9bbaa652087a6895db, with a fix that checks
`!E.value().empty()` to avoid inserting a zero to SlotRemap.
Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D73510
(cherry picked from commit 68051c122440b556e88a946bce12bae58fcfccb4)
(cherry picked from commit c7c5da6df30141c563e1f5b8ddeabeecdd29e55e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This behavior appears to have changed unintentionally in
b0e979724f2679e4e6f5b824144ea89289bd6d56.
Instead of printing the leading newline in printFunction, print it when
printing a module. This ensures that `OS << *Func` starts printing
immediately on the current line, but whole modules are printed nicely.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D73505
(cherry picked from commit 9521c18438a9f09663f3dc68aa7581371c0653c9)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. if users don't specific -mattr, the default target-feature come
from IR attribute.
2. fixed bug and re-land this patch
Reviewers: lenary, asb
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70837
(cherry picked from commit 0cb274de397a193fb37c60653b336d48a3a4f1bd)
|
|
|
|
|
|
|
| |
This reverts commit 7bc58a779aaa1de56fad8b1bc8e46932d2f2f1e4.
It breaks EXPENSIVE_CHECKS on Windows
(cherry picked from commit cef838e65f9a2aeecf5e19431077bc16b01a79fb)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: lenary, asb
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72768
(cherry picked from commit 1256d68093ac1696034e385bbb4cb6e516b66bea)
|
|
|
|
|
|
|
|
|
| |
Long `cl::value_desc()` is added right after the flag name,
before `cl::desc()` column. And thus the `cl::desc()` column,
for all flags, is padded to the right,
which makes the output unreadable.
(cherry picked from commit 70cbf8c71c510077baadcad305fea6f62e830b06)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These instructions ignore parts of the input vectors which makes the
default MSan handling too strict and causes false positive reports.
Reviewers: vitalybuka, RKSimon, thakis
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73374
(cherry picked from commit 1df8549b26892198ddf77dfd627eb9f979d45b7e)
|
|
|
|
|
|
|
|
| |
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D73301
(cherry picked from commit 50a3ff30e1587235d1830fec9694c1239302ab9f)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-fpatchable-function-entry=N,M where M>0
Similar to the function attribute `prefix` (prefix data),
"patchable-function-prefix" inserts data (M NOPs) before the function
entry label.
-fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry)
will look like:
```
.type foo,@function
.Ltmp0: # @foo
nop
foo:
.Lfunc_begin0:
# optional `bti c` (AArch64 Branch Target Identification) or
# `endbr64` (Intel Indirect Branch Tracking)
nop
.section __patchable_function_entries,"awo",@progbits,get,unique,0
.p2align 3
.quad .Ltmp0
```
-fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable
placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html):
```
(a) (b)
func: func:
.Ltmp0: bti c
bti c .Ltmp0:
nop nop
```
(a) needs no additional code. If the consensus is to go for (b), we will
need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp .
Differential Revision: https://reviews.llvm.org/D73070
(cherry picked from commit 22467e259507f5ead2a87d989251b4c951a587e4)
|
|
|
|
|
|
|
|
| |
"patchable-function-entry"="0"
Add improve tests
(cherry picked from commit d232c215669cb57f5eb4ead40a4a336220dbc429)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
before addPreEmitPass()
This intention is to move patchable-function before aarch64-branch-targets
(configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424).
This also allows addPreEmitPass() passes to know the precise instruction sizes if they want.
Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1.
No output difference with this commit and the previous commit.
(cherry picked from commit 9a24488cb67a90f889529987275c5e411ce01dda)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously, we would erroneously turn %pcrel_lo(label), where label has
a %pcrel_hi against a weak symbol, into %pcrel_lo(label + offset), as
evaluatePCRelLo would believe the target independent logic was going to
fold it. Moreover, even if that were fixed, shouldForceRelocation lacks
an MCAsmLayout and thus cannot evaluate the %pcrel_hi fixup to a value
and check the symbol, so we would then erroneously constant-fold the
%pcrel_lo whilst leaving the %pcrel_hi intact. After D72197, this same
sequence also occurs for symbols with global binding, which is triggered
in real-world code.
Instead, as discussed in D71978, we introduce a new FKF_IsTarget flag to
avoid these kinds of issues. All the resolution logic happens in one
place, with no coordination required between RISCAsmBackend and
RISCVMCExpr to ensure they implement the same logic twice. Although the
implementation of %pcrel_hi can be left as target independent, we make
it target dependent to ensure that they are handled identically to
%pcrel_lo, otherwise we risk one of them being constant folded but the
other being preserved. This also allows us to properly support fixup
pairs where the instructions are in different fragments.
Reviewers: asb, lenary, efriedma
Reviewed By: efriedma
Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73211
(cherry picked from commit 3f5976c97dbfefb4669abcf968bd79a9a64c18e0)
|