bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[DebugInfo] Re-instate LiveDebugVariables scope trimming	Jeremy Morse	2020-02-12	1	-1/+24
	This patch reverts part of r362750 / D62650, which stopped LiveDebugVariables from trimming leading variable location ranges down to only covering those instructions that are in scope. I've observed some circumstances where the number of DBG_VALUEs in a function can be amplified unnecessarily, to cover more instructions that are out of scope, leading to very slow compile times. Trimming the range of instructions that the variables cover solves the slow compile times. The specific problem that r362750 tries to fix is addressed by the assignment to RStart that I've added. Any variable location that begins at the first instruction of a block will now be considered to begin at the start of the block. While these sound the same, they have different SlotIndexes, and the register allocator may shoehorn additional instructions in between the two. The test added in the past (wrong_debug_loc_after_regalloc.ll) still works with this modification. live-debug-variables.ll has a range trimmed to not cover the prologue of the function, while dbg-addr-dse.ll has a DBG_VALUE sink past one instruction with no DebugLoc, which is expected behaviour. Differential Revision: https://reviews.llvm.org/D73691 (cherry picked from commit 41206b61e30c3d84188cb17b91c2c0c800982dd1)
*	Fix MSVC build with C++ EH enabled	Reid Kleckner	2020-02-12	1	-1/+1
	Mark the CrashRecoveryContextImpl constructor noexcept, so that MSVC won't emit an unwind helper to clean up the allocation from `new` if the constructor throws an exception. Otherwise, MSVC complains:
	  llvm\lib\Support\CrashRecoveryContext.cpp(220): error C2712: Cannot use __try in functions that require object unwinding
	The other simple fix would be to wrap `new` in a static helper or lambda. Users have reported that Tensorflow builds LLVM with /EHsc. (cherry picked from commit a349c09162a8260bdf691c4f7ab72a16c33975f6)
*	[SystemZ] Bugfix in emitSelect()	Jonas Paulsson	2020-02-12	1	-2/+3
	When more than one SelectPseudo instruction is handled, a new MBB is returned. This must not be done if that would result in leaving an unhandled isel pseudo behind in the original MBB. Fixes https://bugs.llvm.org/show_bug.cgi?id=44849. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D74352 (cherry picked from commit 0311e28e9cc01a244faa774b8cab337b45404fa9)
*	[Clang][Driver] After default -fintegrated-cc1, make ↵	Alexandre Ganea	2020-02-12	3	-7/+36
\| \| \| \| \| \| \| \| \| \| \| \| \|	llvm::report_fatal_error() generate preprocessed source + reproducer.sh again. Added a test for #pragma clang __debug llvm_fatal_error to test for the original issue. Added llvm::sys::Process::Exit() and replaced ::exit() in places where it was appropriate. This new function would call the current CrashRecoveryContext if one is running on the same thread; or call ::exit() otherwise. Fixes PR44705. Differential Revision: https://reviews.llvm.org/D73742 (cherry picked from commit faace365088a2a3a4cb1050a9facfc34a7a56577)
*	[Support] Don't modify the current EH context during stack unwinding	Reid Kleckner	2020-02-11	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	Copy it instead. Otherwise, key registers (such as RBP) may get zeroed out by the stack unwinder. Fixes CrashRecoveryTest.DumpStackCleanup with MSVC in release builds. Reviewed By: stella.stamenova Differential Revision: https://reviews.llvm.org/D73809 (cherry picked from commit b074acb82f7e75a189fa7933b09627241b166121)
*	IR Linking: Support merging Warning+Max module metadata flags	David Blaikie	2020-02-11	1	-23/+47
	Summary: Debug Info Version was changed to use "Max" instead of "Warning" per the original design intent - but this makes old/new IR unlinkable, since mismatched merge styles are a linking failure. It seems possible/maybe reasonable to actually support the combination of these two flags: Warn, but then use the maximum value rather than the first value/earlier module's value. Reviewers: tejohnson Differential Revision: https://reviews.llvm.org/D74257 (cherry picked from commit ba9cae58bbdd41451ee67773c9d0f90a01756f12)
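	The merge rule described above can be sketched in isolation. This is a minimal model, not the IR linker's code: the `Behavior`, `Flag`, `MergeResult` and `mergeFlags` names are hypothetical, the pure Warning/Warning and Max/Max cases follow the usual reading of those behaviours (warn and keep the first value; silently take the maximum), and treating the merged flag as "Max" in the mixed case is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of two module-flag merge behaviours.
enum class Behavior { Warning, Max };

struct Flag {
  Behavior behavior;
  uint32_t value;
};

struct MergeResult {
  Flag flag;   // merged flag
  bool warned; // whether a mismatch warning would be emitted
};

// Merge the flag from the earlier module (dst) with the later one (src).
MergeResult mergeFlags(const Flag &dst, const Flag &src) {
  const bool differ = dst.value != src.value;

  // Warning + Warning: warn on mismatch, keep the earlier module's value.
  if (dst.behavior == Behavior::Warning && src.behavior == Behavior::Warning)
    return {{Behavior::Warning, dst.value}, differ};

  // Max + Max: take the maximum silently.
  // Warning + Max (either order): warn on mismatch, but still take the
  // maximum instead of failing the link (assumed result behaviour: Max).
  const bool mixed = dst.behavior != src.behavior;
  return {{Behavior::Max, std::max(dst.value, src.value)}, mixed && differ};
}
```

	With this model, merging an older module carrying `{Warning, 2}` with a newer one carrying `{Max, 3}` yields the value 3 plus a warning, rather than a link failure.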
*	[Support] When using SEH, create a impl instance for CrashRecoveryContext. NFCI.	Alexandre Ganea	2020-02-10	1	-25/+35
\| \| \| \| \| \| \| \| \| \|	Previously, the SEH codepath in CrashRecoveryContext didn't create a CrashRecoveryContextImpl. The other codepaths (VEH and Unix) were creating it. When running with -fintegrated-cc1, this is needed to handle exit() as a jump to CrashRecoveryContext's exception filter, through a call to RaiseException. In that situation, we need a user-defined exception code, which is later interpreted as an exit() by the exception filter. This in turn needs to set RetCode accordingly, inside the exception filter, and before calling HandleCrash(). Differential Revision: https://reviews.llvm.org/D74078 (cherry picked from commit 2a3fa0fc5cd7d3398c0293915b0e569eaa0be24b)
*	[Clang] Remove unused #pragma clang __debug handle_crash	Alexandre Ganea	2020-02-10	1	-8/+0
\| \| \| \| \| \| \| \| \|	As discussed in D70568, remove this because it isn't used anywhere, and I think it's better to go through real crashes for testing (#pragma clang __debug crash). Also remove the support function llvm::CrashRecoveryContext::HandleCrash() which was added at the same time by @ddunbar. Differential Revision: https://reviews.llvm.org/D74063 (cherry picked from commit 8ecde3ac34bbb5a8d53d8ec5cd32867658646df1)
*	[AArch64] Add option to enable/disable load-store renaming.	Florian Hahn	2020-02-10	1	-0/+7
\| \| \| \| \| \| \| \|	This patch adds a new option to enable/disable register renaming in the load-store optimizer. Defaults to disabled, as there is a potential mis-compile caused by this. (cherry picked from commit 8e3f59b45ae185cc9b4e3a817d7ac958f1d55976)
*	AMDGPU/EG,CM: Implement fsqrt using recip(rsqrt(x)) instead of x * rsqrt(x)	Jan Vesely	2020-02-10	3	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \|	The old version might be faster on EG (RECIP_IEEE is Trans only), but it'd need extra corner case checks. This gives correct corner case behaviour and saves a register. Fixes OCL CTS sqrt test (1-thread, scalar) on Turks. Reviewer: arsenm Differential Revision: https://reviews.llvm.org/D74017 (cherry picked from commit e6686adf8a743564f0c455c34f04752ab08cf642)
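	The corner cases in question can be illustrated with plain IEEE floats. This is a sketch under assumptions: `rsqrtModel` and `recipModel` are hypothetical exact stand-ins for the hardware RSQ/RECIP operations (which are approximations in practice), and 0 and +infinity are taken as the inputs the message has in mind.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical exact models of the hardware operations (assumption: the
// real instructions follow the same IEEE special-value rules).
float rsqrtModel(float x) { return 1.0f / std::sqrt(x); }
float recipModel(float x) { return 1.0f / x; }

// Old lowering: sqrt(x) = x * rsqrt(x).
// For x = 0 this is 0 * inf = NaN; for x = inf it is inf * 0 = NaN.
float sqrtViaMul(float x) { return x * rsqrtModel(x); }

// New lowering: sqrt(x) = recip(rsqrt(x)).
// For x = 0 this is 1 / inf = 0; for x = inf it is 1 / 0 = inf.
float sqrtViaRecip(float x) { return recipModel(rsqrtModel(x)); }
```

	The multiply form would need explicit checks for 0 and infinity to avoid NaN, which is the extra cost the message refers to; the reciprocal form gets both cases right without them.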
*	[X86] Use MVT::i8 instead of MVT::i64 for shift amount in BuildSDIVPow2	Craig Topper	2020-02-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	X86 uses i8 for shift amounts. This code can fail on a 32-bit target if it runs after type legalization. This code was copied from AArch64 and modified for X86, but the shift amount wasn't changed to the correct type for X86. Fixes PR44812 (cherry picked from commit ec9a94af4d5fb3270f2451fcbec5a3a99f4ac03a)
*	[BPF] disable ReduceLoadWidth during SelectionDag phase	Yonghong Song	2020-02-10	1	-0/+13
	The compiler may transform the following code
	  ctx = ctx + reloc_offset
	  ... (*(u32 *)ctx) & 0x8000 ...
	to
	  ctx = ctx + reloc_offset
	  ... (*(u8 *)(ctx + 1)) & 0x80 ...
	where reloc_offset will be replaced with a constant during AsmPrinter phase. The above transformed code will be rejected by the kernel verifier as it does not allow *(type *)((ctx + non_zero_offset1) + non_zero_offset2) style access pattern. It is hard at SelectionDag phase to identify whether a load is related to context or not. Sometimes, interprocedural analysis may be needed. So let us simply prevent such optimization from happening. Differential Revision: https://reviews.llvm.org/D73997 (cherry picked from commit d96c1bbaa03574daf759e5e9a6c75047c5e3af64)
*	[ARM] Fix non-determenistic behaviour	Diogo Sampaio	2020-02-10	1	-3/+9
	Summary: ARM Type Promotion pass does not clear the container that defines if one variable was visited or not, missing optimization opportunities by luck when two llvm::Values from different functions are allocated at the same memory address. Also fixes a comment and uses existing method to pop and obtain last element of the worklist. Reviewers: samparker Reviewed By: samparker Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73970 (cherry picked from commit 8ba2b6281075c65c1a47abed57810e1201942533)
*	[InstCombine] Fix infinite min/max canonicalization loop (PR44541)	Nikita Popov	2020-02-10	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \|	While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541, it does so in a more roundabout manner and there might be other loopholes to trigger the same issue. This is a more direct fix, that prevents the transform if the min/max is based on a non-canonical sub X, 0 instruction. Differential Revision: https://reviews.llvm.org/D73849 (cherry picked from commit a148b9e9909db6a592609eb35b4de38c9e67cb8b)
*	[InstCombine] Support disabling expensive combines in opt	Nikita Popov	2020-02-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, there is no way to disable ExpensiveCombines when doing a standalone opt -instcombine run, as that's the default, and the opt option can currently only be used to force enable, not to force disable. The only way to disable expensive combines is via -O1 or -O2, but that of course also runs the rest of the kitchen sink... This patch allows using opt -instcombine -expensive-combines=0 to run InstCombine without ExpensiveCombines. Differential Revision: https://reviews.llvm.org/D72861 (cherry picked from commit 2ca092f3209579fde7a38ade511c1bbcef213c36)
*	[InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)	Nikita Popov	2020-02-10	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform if it wouldn't actually do anything (apart from removing and reinserting the same instructions). Note that the test case doesn't loop on current master anymore, only on the LLVM 10 release branch. The issue is already mitigated on master due to worklist order fixes, but we should fix the root cause there as well. As a side note, we should probably assert in combineLoadToNewType() that it does not combine to the same type. Not doing this here, because this assertion would also be triggered in another place right now. Differential Revision: https://reviews.llvm.org/D74278 (cherry picked from commit 23db9724d0e5490fa5a2a726acf015f84e2c87cf)
*	Revert "[ARM] Improve codegen of volatile load/store of i64"	Victor Campos	2020-02-08	6	-162/+6
\| \| \| \|	This reverts commit 60e0120c913dd1d4bfe33769e1f000a076249a42.
*	[LV] Fix predication for branches with matching true and false succs.	Florian Hahn	2020-02-06	1	-1/+1
	Currently due to the edge caching, we create wrong predicates for branches with matching true and false successors. We will cache the condition for the edge from the true successor, and then lookup the same edge (src and dst are the same) for the edge to the false successor. If both successors match, the condition should always be true. At the moment, we cannot really create constant VPValues, but we can just create a true condition as X | !X. Later passes will clean that up. Fixes PR44488. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D73079 (cherry picked from commit f14f2a856802e086662d919e2ead641718b27555)
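	The caching problem can be shown with a toy model. This is only a sketch: the vectorizer builds symbolic masks, whereas the hypothetical `MaskCache` below evaluates them for one concrete condition value, and all type and function names are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <utility>

using Block = int;
using Edge = std::pair<Block, Block>; // (source block, destination block)

// A conditional branch in block `src` with two successors.
struct Branch {
  Block src;
  Block trueSucc;
  Block falseSucc;
};

// Hypothetical edge-mask cache keyed only by (src, dst).
struct MaskCache {
  std::map<Edge, bool> cache;

  // Mask of the edge br.src -> dst for a concrete condition value.
  bool edgeMask(const Branch &br, Block dst, bool cond) {
    const Edge edge{br.src, dst};
    auto it = cache.find(edge);
    if (it != cache.end())
      return it->second;

    bool mask;
    if (br.trueSucc == br.falseSucc)
      mask = cond || !cond; // both successors match: X | !X, always true
    else
      mask = (dst == br.trueSucc) ? cond : !cond;

    cache[edge] = mask;
    return mask;
  }
};
```

	Without the matching-successor case, the first query would cache `cond` for the edge, and the query for the "false" edge would hit the same key and return `cond` again, even though the block is reached regardless of the condition.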
*	Make llvm::crc32() work also for input sizes larger than 32 bits.	Hans Wennborg	2020-02-06	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \|	The problem was noticed by the Chrome OS toolchain folks (crbug.com/1048445) because llvm-objcopy --add-gnu-debuglink would insert the wrong checksum when processing a binary larger than 4 GB. That use case regressed in 1e1e3ba2526 when we started using llvm::crc32() in more places. Differential revision: https://reviews.llvm.org/D74039 (cherry picked from commit 6c4a8bc0a9f6a466d90d542bef66d69550c1b041)
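	One plausible way to handle such inputs is to feed the data to a 32-bit-length CRC routine in slices while carrying the running checksum. The sketch below is not the patch itself: `crc32Narrow` is a hypothetical bitwise stand-in for a routine whose length parameter is only 32 bits wide, and `crc32Chunked` with its `maxChunk` parameter (assumed non-zero) is an illustrative wrapper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a CRC-32 routine with a 32-bit length parameter.
// Uses the reflected polynomial 0xEDB88320 and takes/returns a running CRC.
uint32_t crc32Narrow(uint32_t crc, const unsigned char *data, uint32_t len) {
  crc = ~crc;
  for (uint32_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return ~crc;
}

// Process an arbitrarily large buffer in slices of at most maxChunk bytes
// (maxChunk must be non-zero), passing the running CRC from slice to slice.
uint32_t crc32Chunked(uint32_t crc, const unsigned char *data, size_t len,
                      size_t maxChunk = std::numeric_limits<uint32_t>::max()) {
  while (len != 0) {
    const size_t n = std::min(len, maxChunk);
    crc = crc32Narrow(crc, data, static_cast<uint32_t>(n));
    data += n;
    len -= n;
  }
  return crc;
}
```

	Because the running checksum is threaded through each call, the chunked result matches a single pass over the whole buffer; the small `maxChunk` values are only useful for exercising the loop in tests.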
*	[X86] -fpatchable-function-entry=N,0: place patch label after ENDBR{32,64}	Fangrui Song	2020-02-05	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to D73680 (AArch64 BTI). A local linkage function whose address is not taken does not need ENDBR32/ENDBR64. Placing the patch label after ENDBR32/ENDBR64 has the advantage that code does not need to differentiate whether the function has an initial ENDBR. Also, add 32-bit tests and test that .cfi_startproc is at the function entry. The line information has a general implementation and is tested by AArch64/patchable-function-entry-empty.mir Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73760 (cherry picked from commit 8ff86fcf4c038c7156ed4f01e7ed35cae49489e2)
*	[ARM][VecReduce] Force expand vector_reduce_fmin	David Green	2020-02-05	2	-5/+8
	Under MVE, we do not have any lowering for fminimum, which a vector_reduce_fmin without NoNan will be expanded into. As with the other recent patches, force this to expand in the pre-isel pass. Note that Neon lowering would be OK because the scalar fminimum uses the vector VMIN instruction, but it is probably better to just rely on the scalar operations, which is what is done here. Also fixes what appears to be the reversal of INF vs -INF in the vector_reduce_fmin widening code. (cherry picked from commit 362d00e0510ee75750499e2993a782428e377215)
*	[ARM] Expand vector reduction intrinsics on soft float	Nikita Popov	2020-02-05	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Followup to D73135. If the target doesn't have hard float (default for ARM), then we assert when trying to soften the result of vector reduction intrinsics. This patch marks these for expansion as well. (A bit odd to use vectors on a target without hard float ... but that's where you end up if you expose target-independent vector types.) Differential Revision: https://reviews.llvm.org/D73854 (cherry picked from commit 1cc4f8d17247cd9be88addd75d060f9321b6f8b0)
*	[AArch64][ARM] Always expand ordered vector reductions (PR44600)	Nikita Popov	2020-02-05	2	-2/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fadd/fmul reductions without reassoc are lowered to VECREDUCE_STRICT_FADD/FMUL nodes, which don't have legalization support. Until that is in place, expand these intrinsics on ARM and AArch64. Other targets always expand the vector reduction intrinsics. Additionally expand fmax/fmin reductions without nonan flag on AArch64, as the backend asserts that the flag is present when lowering VECREDUCE_FMIN/FMAX. This fixes https://bugs.llvm.org/show_bug.cgi?id=44600. Differential Revision: https://reviews.llvm.org/D73135 (cherry picked from commit 70d345e687caba4ac1f95655c6924dfa91e0083f)
*	AMDGPU: Fix handling of infinite loops in fragment shaders	Connor Abbott	2020-02-04	1	-6/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Due to the fact that kill is just a normal intrinsic, even though it's supposed to terminate the thread, we can end up with provably infinite loops that are actually supposed to end successfully. The AMDGPUUnifyDivergentExitNodes pass breaks up these loops, but because there's no obvious place to make the loop branch to, it just makes it return immediately, which skips the exports that are supposed to happen at the end and hangs the GPU if all the threads end up being killed. While it would be nice if the fact that kill terminates the thread were modeled in the IR, I think that the structurizer as-is would make a mess if we did that when the kill is inside control flow. For now, we just add a null export at the end to make sure that it always exports something, which fixes the immediate problem without penalizing the more common case. This means that we sometimes do two "done" exports when only some of the threads enter the discard loop, but from tests the hardware seems ok with that. This fixes dEQP-VK.graphicsfuzz.while-inside-switch with radv. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70781 (cherry picked from commit 87d98c149504f9b0751189744472d7cc94883960)
*	AMDGPU/R600: Emit rodata in text segment	Jan Vesely	2020-02-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	R600 relies on this behaviour. Fixes: 6e18266aa4dd78953557b8614cb9ff260bad7c65 ('Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0"') Fixes ~100 piglit regressions since 6e18266 Differential Revision: https://reviews.llvm.org/D72991 (cherry picked from commit 1b8eab179db46f25a267bb73c657009c0bb542cc)
*	[BPF] fix a bug in BPFMISimplifyPatchable pass with -O0	Yonghong Song	2020-02-03	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The recommended optimization level for BPF programs is O2 since (1). BPF is running inside the kernel and linux kernel won't work at -O0 level, and (2). Verifier is not able to handle O0 code properly, e.g., potential large stack size and a lot of spills. But we should keep -O0 at least compiling. This patch fixed a bug in BPFMISimplifyPatchable phase where with -O0, a segmentation fault will happen for a simple program like: int test(int a, int b) { return a + b; } A test case is added to capture such a case. Differential Revision: https://reviews.llvm.org/D73681 (cherry picked from commit 795bbb366266e83d2bea8dc04c19919b52ab3a2a)
*	Revert "[AMDGPU] Invert the handling of skip insertion."	Nicolai Hähnle	2020-02-03	6	-173/+6
\| \| \| \| \| \| \| \| \|	This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd. The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for Mesa. (cherry picked from commit a80291ce10ba9667352adcc895f9668144f5f616)
*	[RISCV] Scheduler description for the Rocket core	Kai Wang	2020-02-03	11	-186/+900
\| \| \| \| \| \| \| \| \| \|	Pipeline scheduler model for the RISC-V Rocket micro-architecture using the MIScheduler interface. Support for both 32 and 64-bit Rocket cores is implemented. Differential revision: https://reviews.llvm.org/D68685 (cherry picked from commit 838a28e234e098bfc073a45f37a4dd3bb5b45eab)
*	[AArch64] -fpatchable-function-entry=N,0: place patch label after BTI	Fangrui Song	2020-02-03	2	-1/+27
	Summary: For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after 9a24488cb67a90f889529987275c5e411ce01dda, we place the NOP sled after the initial BTI.

```
.Lfunc_begin0:
  bti c
  nop
  nop

.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lfunc_begin0
```

	This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label:

```
.Lfunc_begin0:
  bti c
.Lpatch0:
  nop
  nop

.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lpatch0
```

	This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 . A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI. Reviewers: mrutland, nickdesaulniers, nsz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73680 (cherry picked from commit 06b8e32d4fd3f634f793e3bc0bc4fdb885e7a3ac)
*	Revert "Reland: [DWARF] Allow cross-CU references of subprogram definitions"	Vedant Kumar	2020-01-29	4	-28/+12
	... as well as: Revert "[DWARF] Defer creating declaration DIEs until we prepare call site info" This reverts commit fa4701e1979553c2df61698ac1ac212627630442. This reverts commit 79daafc90308787b52a5d3a7586e82acd5e374b3. There have been reports of this assert getting hit: CalleeDIE && "Could not find DIE for call site entry origin" (cherry picked from commit 802bec896171997a7b73dde3857712e0eedeabc1)
*	[LV] Do not try to sink dead instructions.	Florian Hahn	2020-01-29	2	-8/+13
	Dead instructions do not need to be sunk. Currently we try and record the recipes for them, but there are no recipes emitted for them and there's nothing to sink. They can be removed from SinkAfter while marking them for recording. Fixes PR44634. Reviewers: rengolin, hsaito, fhahn, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D73423 (cherry picked from commit a911fef3dd79e0a04b241be7b476dde7e99744c4)
*	[WebAssembly] Fix resume-only case in Emscripten EH	Heejin Ahn	2020-01-29	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: D72308 incorrectly assumed `resume` cannot exist without a `landingpad`, which is not true. This sets `Changed` to true whenever we make changes to a function, including creating a call to `__resumeException` within a function without a landing pad. Reviewers: tlively Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73308 (cherry picked from commit 580d7838dd08e13dac6caf4ab3142c9381bc7ad0)
*	[ORC] Fix a missing move in ce2207abaf9.	Lang Hames	2020-01-29	1	-1/+1
\| \| \| \| \| \| \| \|	This should fix the build failure at http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/32524 and others. (cherry picked from commit e0a6093a744d16c90eafa62d7143ce41806b2466)
*	[ORC] Add support for emulated TLS to ORCv2.	Lang Hames	2020-01-29	6	-50/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit adds a ManglingOptions struct to IRMaterializationUnit, and replaces IRCompileLayer::CompileFunction with a new IRCompileLayer::IRCompiler class. The ManglingOptions struct defines the emulated-TLS state (via a bool member, EmulatedTLS, which is true if emulated-TLS is enabled and false otherwise). The IRCompileLayer::IRCompiler class wraps an IRCompiler (the same way that the CompileFunction typedef used to), but adds a method to return the IRCompileLayer::ManglingOptions that the compiler will use. These changes allow us to correctly determine the symbols that will be produced when a thread local global variable defined at the IR level is compiled with or without emulated TLS. This is required for ORCv2, where MaterializationUnits must declare their interface up-front. Most ORCv2 clients should not require any changes. Clients writing custom IR compilers will need to wrap their compiler in an IRCompileLayer::IRCompiler, rather than an IRCompileLayer::CompileFunction, however this should be a straightforward change (see modifications to CompileUtils.* in this patch for an example). (cherry picked from commit ce2207abaf9a925b35f15ef92aaff6b301ba6d22)
*	[ORC] Add weak symbol support to defineMaterializing, fix for PR40074.	Lang Hames	2020-01-29	3	-28/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MaterializationResponsibility::defineMaterializing method allows clients to add new definitions that are in the process of being materialized to the JIT. This patch adds support to defineMaterializing for symbols with weak linkage where the new definitions may be rejected if another materializer concurrently defines the same symbol. If a weak symbol is rejected it will not be added to the MaterializationResponsibility's responsibility set. Clients can check for membership in the responsibility set via the MaterializationResponsibility::getSymbols() method before resolving any such weak symbols. This patch also adds code to RTDyldObjectLinkingLayer to tag COFF comdat symbols introduced during codegen as weak, on the assumption that these are COFF comdat constants. This fixes http://llvm.org/PR40074. (cherry picked from commit 84217ad66115cc31b184374a03c8333e4578996f)
*	[PassManagerBuilder] Remove global extension when a plugin is unloaded	Elia Geretto	2020-01-29	1	-8/+33
\| \| \| \| \| \| \| \| \| \| \| \|	This commit fixes PR39321. GlobalExtensions is not guaranteed to be destroyed when optimizer plugins are unloaded. If it is indeed destroyed after a plugin is dlclose-d, the destructor of the corresponding ExtensionFn is not mapped anymore, causing a call to unmapped memory during destruction. This commit guarantees that extensions coming from external plugins are removed from GlobalExtensions when the plugin is unloaded if GlobalExtensions has not been destroyed yet. Differential Revision: https://reviews.llvm.org/D71959 (cherry picked from commit ab2300bc154f7bed43f85f74fd3fe31be71d90e0)
*	[GlobalMerge] Preserve symbol visibility when merging globals	Michael Spang	2020-01-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Symbols created for merged external global variables have default visibility. This can break programs when compiling with -Oz -fvisibility=hidden as symbols that should be hidden will be exported at link time. Differential Revision: https://reviews.llvm.org/D73235 (cherry picked from commit a2fb2c0ddca14c133f24d08af4a78b6a3d612ec6)
*	Work around PR44697 in CrashRecoveryContext	Hans Wennborg	2020-01-29	1	-0/+7
\| \| \| \|	(cherry picked from commit 31e07692d7f2b383bd64c63cd2b5c35b6503cf3a)
*	Reland "[StackColoring] Remap PseudoSourceValue frame indices via ↵	Fangrui Song	2020-01-27	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MachineFunction::getPSVManager()"" Reland 7a8b0b1595e7dc878b48cf9bbaa652087a6895db, with a fix that checks `!E.value().empty()` to avoid inserting a zero to SlotRemap. Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D73510 (cherry picked from commit 68051c122440b556e88a946bce12bae58fcfccb4) (cherry picked from commit c7c5da6df30141c563e1f5b8ddeabeecdd29e55e)
*	[IR] Keep a double break between functions when printing a module	Reid Kleckner	2020-01-27	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This behavior appears to have changed unintentionally in b0e979724f2679e4e6f5b824144ea89289bd6d56. Instead of printing the leading newline in printFunction, print it when printing a module. This ensures that `OS << *Func` starts printing immediately on the current line, but whole modules are printed nicely. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D73505 (cherry picked from commit 9521c18438a9f09663f3dc68aa7581371c0653c9)
*	[RISCV] Support ABI checking with per function target-features	Zakk Chen	2020-01-27	3	-10/+28
	1. If users don't specify -mattr, the default target-features come from the IR attribute. 2. Fixed a bug and re-landed this patch. Reviewers: lenary, asb Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D70837 (cherry picked from commit 0cb274de397a193fb37c60653b336d48a3a4f1bd)
*	Revert "[RISCV] Support ABI checking with per function target-features"	Zakk Chen	2020-01-27	3	-27/+10
\| \| \| \| \| \| \|	This reverts commit 7bc58a779aaa1de56fad8b1bc8e46932d2f2f1e4. It breaks EXPENSIVE_CHECKS on Windows (cherry picked from commit cef838e65f9a2aeecf5e19431077bc16b01a79fb)
*	[RISCV] Check the target-abi module flag matches the option	Zakk Chen	2020-01-27	3	-12/+28
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: lenary, asb Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D72768 (cherry picked from commit 1256d68093ac1696034e385bbb4cb6e516b66bea)
*	[X86] Make `llc --help` output readable again	Roman Lebedev	2020-01-27	1	-7/+7
\| \| \| \| \| \| \| \| \|	Long `cl::value_desc()` is added right after the flag name, before `cl::desc()` column. And thus the `cl::desc()` column, for all flags, is padded to the right, which makes the output unreadable. (cherry picked from commit 70cbf8c71c510077baadcad305fea6f62e830b06)
*	[msan] Instrument x86.pclmulqdq* intrinsics.	Evgenii Stepanov	2020-01-27	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These instructions ignore parts of the input vectors which makes the default MSan handling too strict and causes false positive reports. Reviewers: vitalybuka, RKSimon, thakis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73374 (cherry picked from commit 1df8549b26892198ddf77dfd627eb9f979d45b7e)
*	[PatchableFunction] Allow empty entry MachineBasicBlock	Fangrui Song	2020-01-24	1	-3/+8
\| \| \| \| \| \| \| \|	Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73301 (cherry picked from commit 50a3ff30e1587235d1830fec9694c1239302ab9f)
*	Add function attribute "patchable-function-prefix" to support ↵	Fangrui Song	2020-01-24	4	-16/+46
	-fpatchable-function-entry=N,M where M>0

	Similar to the function attribute `prefix` (prefix data), "patchable-function-prefix" inserts data (M NOPs) before the function entry label. -fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry) will look like:

```
.type foo,@function
.Ltmp0: # @foo
  nop
foo:
.Lfunc_begin0:
  # optional `bti c` (AArch64 Branch Target Identification) or
  # `endbr64` (Intel Indirect Branch Tracking)
  nop

.section __patchable_function_entries,"awo",@progbits,get,unique,0
.p2align 3
.quad .Ltmp0
```

	-fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html):

```
(a)         (b)

func:       func:
.Ltmp0:     bti c
  bti c     .Ltmp0:
  nop       nop
```

	(a) needs no additional code. If the consensus is to go for (b), we will need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp . Differential Revision: https://reviews.llvm.org/D73070 (cherry picked from commit 22467e259507f5ead2a87d989251b4c951a587e4)
*	[AsmPrinter] Don't emit __patchable_function_entries entry if ↵	Fangrui Song	2020-01-24	1	-1/+5
\| \| \| \| \| \| \| \|	"patchable-function-entry"="0" Add improve tests (cherry picked from commit d232c215669cb57f5eb4ead40a4a336220dbc429)
*	[CodeGen] Move fentry-insert, xray-instrumentation and patchable-function ↵	Fangrui Song	2020-01-24	1	-6/+6
	before addPreEmitPass() The intention is to move patchable-function before aarch64-branch-targets (configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424). This also allows addPreEmitPass() passes to know the precise instruction sizes if they want. Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1. No output difference with this commit and the previous commit. (cherry picked from commit 9a24488cb67a90f889529987275c5e411ce01dda)
*	[RISCV] Fix evaluating %pcrel_lo against global and weak symbols	James Clarke	2020-01-23	5	-108/+86
	Summary: Previously, we would erroneously turn %pcrel_lo(label), where label has a %pcrel_hi against a weak symbol, into %pcrel_lo(label + offset), as evaluatePCRelLo would believe the target independent logic was going to fold it. Moreover, even if that were fixed, shouldForceRelocation lacks an MCAsmLayout and thus cannot evaluate the %pcrel_hi fixup to a value and check the symbol, so we would then erroneously constant-fold the %pcrel_lo whilst leaving the %pcrel_hi intact. After D72197, this same sequence also occurs for symbols with global binding, which is triggered in real-world code. Instead, as discussed in D71978, we introduce a new FKF_IsTarget flag to avoid these kinds of issues. All the resolution logic happens in one place, with no coordination required between RISCVAsmBackend and RISCVMCExpr to ensure they implement the same logic twice. Although the implementation of %pcrel_hi can be left as target independent, we make it target dependent to ensure that it is handled identically to %pcrel_lo, otherwise we risk one of them being constant folded but the other being preserved. This also allows us to properly support fixup pairs where the instructions are in different fragments. Reviewers: asb, lenary, efriedma Reviewed By: efriedma Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73211 (cherry picked from commit 3f5976c97dbfefb4669abcf968bd79a9a64c18e0)