bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[FPEnv][X86] More strict int <-> FP conversion fixes	Ulrich Weigand	2019-12-23	4	-92/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix several additional problems with the int <-> FP conversion logic both in common code and in the X86 target. In particular: - The STRICT_FP_TO_UINT expansion emits a floating-point compare. This compare can raise exceptions and therefore needs to be a strict compare. I've made it signaling (even though quiet would also be correct) as signaling is the more usual default for an LT. This code exists both in common code and in the X86 target. - The STRICT_UINT_TO_FP expansion algorithm was incorrect for strict mode: it emitted two STRICT_SINT_TO_FP nodes and then used a select to choose one of the results. This can cause spurious exceptions by the STRICT_SINT_TO_FP that ends up not chosen. I've fixed the algorithm to use only a single STRICT_SINT_TO_FP instead. - The !isStrictFPEnabled logic in DoInstructionSelection would sometimes do the wrong thing because it calls getOperationAction using the result VT. But for some opcodes, including [SU]INT_TO_FP, getOperationAction needs to be called using the operand VT. - Remove some (obsolete) code in X86DAGToDAGISel::Select that would mutate STRICT_FP_TO_[SU]INT to non-strict versions unnecessarily. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71840
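The single-conversion idea behind the STRICT_UINT_TO_FP fix can be illustrated outside the compiler. The following is only a sketch in Python, not the LLVM code: it assumes a 64-bit unsigned input and a double result, and the names `sint_to_fp` and `uint_to_fp` are hypothetical. The select is applied to the input, so only one signed conversion ever runs and no discarded conversion can raise an exception.

```python
MASK64 = (1 << 64) - 1


def sint_to_fp(bits):
    """Model a signed 64-bit -> double conversion on a raw 64-bit pattern."""
    signed = bits - (1 << 64) if bits >> 63 else bits
    return float(signed)


def uint_to_fp(x):
    """Unsigned 64-bit -> double using exactly one signed conversion."""
    x &= MASK64
    sign_set = bool(x >> 63)
    # Halve the value, folding the dropped bit back in as a sticky bit so
    # that rounding matches a direct unsigned conversion.
    halved = (x >> 1) | (x & 1)
    # Select the *input*, not the result: only one conversion is performed.
    chosen = halved if sign_set else x
    f = sint_to_fp(chosen)
    # Undo the halving; doubling a double in this range is exact.
    return f + f if sign_set else f
```

For inputs below 2^63 the value is converted directly; for larger inputs the halved value is converted and then doubled, which should give the same result as a correctly rounded unsigned conversion.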
*	[AMDGPU] Don't create MachinePointerInfos with an UndefValue pointer	Jay Foad	2019-12-23	5	-34/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The only useful information the UndefValue conveys is the address space, which MachinePointerInfo can represent directly without referring to an IR value. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71838
*	[DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' ↵	Sanjay Patel	2019-12-23	2	-97/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815
*	[AArch64] [Windows] Use COFF stubs for calls to extern_weak functions	Martin Storsjö	2019-12-23	3	-7/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Improve the classifyGlobalFunctionReference method to set MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in AArch64TargetLowering::LowerCall to use the return value from classifyGlobalFunctionReference for these cases. Add code in both AArch64FastISel and GlobalISel/IRTranslator to bail out for function calls to extern weak functions on windows, to let SelectionDAG handle them. This matches what was done for X86 in 6bf108d77a3c. Differential Revision: https://reviews.llvm.org/D71721
*	[ARM] [Windows] Use COFF stubs for calls to extern_weak functions	Martin Storsjö	2019-12-23	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Instead check the shouldAssumeDSOLocal method and load the address from a COFF stub. This matches what was done for X86 in 6bf108d77a3c. Differential Revision: https://reviews.llvm.org/D71720
*	[NFC] Style cleanups	Shengchen Kan	2019-12-23	1	-22/+23
\| \| \| \| \| \|	1. Remove duplicate function or class name at the beginning of the comment. 2. Use auto where the type is already obvious from the context.
*	[Power9] Remove the PPCISD::XXREVERSE as it has completely the same ↵	QingShan Zhang	2019-12-23	4	-23/+5
\| \| \| \| \| \| \| \| \|	semantics of ISD::BSWAP The custom node PPCISD::XXREVERSE has exactly the same semantics as the generic node ISD::BSWAP. We need to clean it up, as we have combine rules for bswap in the base class but none for xxreverse. Differential Revision: https://reviews.llvm.org/D70657
*	[AVR] Fix codegen for rotate instructions	Jim Lin	2019-12-23	3	-4/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch introduces the ROLBRd and RORBRd pseudo-instructions, which implement the "traditional" rotate operations, instead of the AVR rotate instructions that use the carry bit. The code is not optimized at all. Especially when dealing with loops of rotate instructions, this codegen should be improved some day. Related bug: 41358 <https://bugs.llvm.org/show_bug.cgi?id=41358> //Note//: This is my first submitted patch. Reviewers: dylanmckay, Jim Reviewed By: dylanmckay Subscribers: hiraditya, llvm-commits, dylanmckay, dsprenkels Tags: #llvm Patched by dsprenkels (Daan Sprenkels) Differential Revision: https://reviews.llvm.org/D60365
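The difference between a "traditional" 8-bit rotate and a rotate through the carry bit can be sketched as follows. This is an illustrative Python model only; the function names are hypothetical and it says nothing about how the ROLBRd/RORBRd pseudo-instructions are actually expanded.

```python
def rol8(value):
    """Traditional 8-bit rotate left: bit 7 wraps around into bit 0."""
    value &= 0xFF
    return ((value << 1) | (value >> 7)) & 0xFF


def ror8(value):
    """Traditional 8-bit rotate right: bit 0 wraps around into bit 7."""
    value &= 0xFF
    return ((value >> 1) | (value << 7)) & 0xFF


def rol8_through_carry(value, carry):
    """Rotate left through carry: the old carry enters bit 0 and the old
    bit 7 becomes the new carry. Returns (result, new_carry)."""
    value &= 0xFF
    result = ((value << 1) | (carry & 1)) & 0xFF
    return result, value >> 7
```

With a clear incoming carry, `rol8(0x81)` gives `0x03` while `rol8_through_carry(0x81, 0)` gives `(0x02, 1)`: the bit that left the register ends up in the carry instead of bit 0, which is why the carry-based instruction alone does not implement a plain rotate.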
*	[PowerPC] Exploit `vrl(b\|h\|w\|d)` to perform vector rotation	Kai Luo	2019-12-23	2	-1/+21
\| \| \| \| \| \| \| \| \|	Summary: Currently, we set legalization action of `ISD::ROTL` vectors as `Expand` in `PPCISelLowering`. However, we can exploit `vrl(b\|h\|w\|d)` to lower `ISD::ROTL` directly. Differential Revision: https://reviews.llvm.org/D71324
*	[AMDGPU] Fixes -Wrange-loop-analysis warnings	Mark de Wever	2019-12-22	2	-4/+4
\| \| \| \| \| \|	This avoids new warnings due to D68912, which adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71815
*	[Hexagon] Fixes -Wrange-loop-analysis warnings	Mark de Wever	2019-12-22	5	-10/+10
\| \| \| \| \| \|	This avoids new warnings due to D68912, which adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71814
*	[NVPTX] Fixes -Wrange-loop-analysis warnings	Mark de Wever	2019-12-22	1	-1/+1
\| \| \| \| \| \| \| \| \|	This avoids new warnings due to D68912, which adds -Wrange-loop-analysis to -Wall. Also removed the top-level const as requested by Aaron Ballman in similar patches. Differential Revision: https://reviews.llvm.org/D71812
*	[PowerPC] Fixes -Wrange-loop-analysis warnings	Mark de Wever	2019-12-22	1	-3/+3
\| \| \| \| \| \|	This avoids new warnings due to D68912, which adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71811
*	[ms] [X86] Use "P" modifier on operands to call instructions in inline X86 ↵	Eric Astor	2019-12-22	4	-13/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	assembly. Summary: This is documented as the appropriate template modifier for call operands. Fixes PR44272, and adds a regression test. Also adds support for operand modifiers in Intel-style inline assembly. Reviewers: rnk Reviewed By: rnk Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71677
*	[AArch64] match splat of bitcasted extract subvector to DUPLANE	Sanjay Patel	2019-12-22	1	-7/+43
\| \| \| \| \| \| \| \| \| \|	This is another potential regression exposed by D63815. Here we peek through a bitcast to find an extract subvector and scale the splat offset based on that: splat (bitcast (extract X, C)), LaneC --> duplane (bitcast X), LaneC' Differential Revision: https://reviews.llvm.org/D71672
*	Fix "result of 32-bit shift implicitly converted to 64 bits" warning. NFC.	Simon Pilgrim	2019-12-21	1	-1/+1
\|
*	[AArch64] Respect reserved registers while renaming in LdSt opt.	Florian Hahn	2019-12-21	1	-1/+4
\| \| \| \| \| \|	We cannot pick reserved registers as rename registers. Fixes https://bugs.llvm.org/show_bug.cgi?id=44358
*	AMDGPU/GlobalISel: Fix misuse of div_scale intrinsics	Matt Arsenault	2019-12-21	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	Confusingly, the intrinsic operands do not match the instruction/custom node. The order is shuffled, and the 3rd operand is an immediate to select operands. I'm not 100% sure I did this right, but fdiv still doesn't select end to end and it will be easier to tell when it does. This at least avoids an assertion in RegBankSelect and allows hitting the fallback on selection.
*	AMDGPU/GlobalISel: Fix missing scc imp-def on scalar and/or/xor	Matt Arsenault	2019-12-21	1	-0/+5
\|
*	AMDGPU/GlobalISel: Simplify code	Matt Arsenault	2019-12-21	1	-5/+5
\| \| \| \| \|	This can directly access the register bank, and doesn't need to get it through the ID.
*	[WebAssembly] Use TargetIndex operands in DbgValue to track WebAssembly ↵	Yury Delendik	2019-12-20	6	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	operands locations Extends DWARF expression language to express locals/globals locations. (via target-index operands atm) (possible variants are: non-virtual registers or address spaces) The WebAssemblyExplicitLocals can replace virtual registers with target-index operands at the time when the WebAssembly backend introduces {get,set,tee}_local instead of corresponding virtual registers. Reviewed By: aprantl, dschuff Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D52634
*	Add parentheses to silence warning	Bill Wendling	2019-12-20	1	-2/+2
\|
*	More style cleanups following rG14fc20ca6282 [NFC]	Philip Reames	2019-12-20	1	-34/+28
\| \| \| \| \| \| \|	Demote member functions to static functions where possible. Use early continue/early return to reduce nesting. Clarify comments slightly. Reuse a previously defined expression in one case.
*	Fix a memory leak introduced w/the instruction padding support in rG14fc20ca6282	Philip Reames	2019-12-20	1	-6/+6
\| \| \| \|	Should have caught this in review, but only noticed when addressing post commit style items. We were creating a new instance of the X86MCInstrInfo class, and then never reclaiming the memory. This wasn't even conditional on the new off by default flags, so it was an unconditional leak.
*	Align branches within 32-Byte boundary (NOP padding)	Philip Reames	2019-12-20	1	-1/+286
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	WARNING: If you're looking at this patch because you're looking for a full performance mitigation of the Intel JCC Erratum, this is not it! This is a preliminary patch on the path towards mitigating the performance regressions caused by Intel's microcode update for Jump Conditional Code Erratum. For context, see: https://www.intel.com/content/www/us/en/support/articles/000055650.html The patch adds the required assembler infrastructure and command line options needed to exercise the logic for INTERNAL TESTING. These are NOT public flags, and should not be used for anything other than LLVM's own testing/debugging purposes. They are likely to change both in spelling and meaning. WARNING: This patch is knowingly incorrect in some corner cases. We need, and do not yet provide, a mechanism to selectively enable/disable the padding. Conversation on this will continue in parallel with work on extending this infrastructure to support prefix padding. The goal here is to have the assembler align specific instructions such that they neither cross nor end at a 32 byte boundary. The impacted instructions are: a. Conditional jump. b. Fused conditional jump. c. Unconditional jump. d. Indirect jump. e. Ret. f. Call. The new options for llvm-mc are: -x86-align-branch-boundary=NUM aligns branches within NUM byte boundary. -x86-align-branch=TYPE[+TYPE...] specifies types of branches to align. A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit NOP to align the fused/unfused branch. alignBranchesBegin inserts MCBoundaryAlignFragment before instructions, alignBranchesEnd marks the end of the branch to be aligned, relaxBoundaryAlign grows or shrinks sizes of NOP to align the target branch. Nop padding is disabled when the instruction may be rewritten by the linker, such as TLS Call. 
Process Note: I am landing a patch by skan as it has been LGTMed, and continuing to iterate on the review is simply slowing us down at this point. We can and will continue to iterate in tree. Patch By: skan Differential Revision: https://reviews.llvm.org/D70157
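The alignment goal stated in this commit (a branch must neither cross nor end at a 32-byte boundary) comes down to a small piece of offset arithmetic. The sketch below is a Python illustration under the assumption that the branch is shorter than the boundary; the function names are hypothetical, and the real patch does this through MCBoundaryAlignFragment relaxation rather than a standalone helper.

```python
def crosses_or_ends_at_boundary(offset, size, boundary=32):
    """True if [offset, offset + size) crosses or ends at a boundary."""
    end = offset + size
    crosses = offset // boundary != (end - 1) // boundary
    ends_at = end % boundary == 0
    return crosses or ends_at


def nop_padding(offset, size, boundary=32):
    """Bytes of NOP padding to place before the branch so that it neither
    crosses nor ends at a boundary. Assumes 0 < size < boundary."""
    if not 0 < size < boundary:
        raise ValueError("size must be positive and smaller than the boundary")
    if not crosses_or_ends_at_boundary(offset, size, boundary):
        return 0
    # Push the branch to the start of the next boundary-sized window.
    return -offset % boundary
```

For example, a 4-byte branch at offset 30 would cross the boundary at 32, so `nop_padding(30, 4)` returns 2; the same branch at offset 28 would end exactly at 32, so `nop_padding(28, 4)` returns 4.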
*	[PPC32] Emit R_PPC_PLTREL24 for calls to dso_local ifunc	Fangrui Song	2019-12-20	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	static void *ifunc(void) __attribute__((ifunc("resolver"))); void foo() { ifunc(); } The relocation produced by the ifunc() call: 1. gcc -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000 2. gcc -msecure-plt -PIE => R_PPC_PLTREL24 r_addend=0x8000 3. clang -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000 4. clang -msecure-plt -fPIE => R_PPC_REL24 4 is incorrect. The R_PPC_REL24 needs a call stub due to ifunc. If this relocation is mixed with other R_PPC_PLTREL24(r_addend=0x8000) in a function, both GNU ld and lld (after D71621 fix) may produce a wrong result. This patch fixes 4 to use R_PPC_PLTREL24, which matches GCC. Both GNU ld and lld (after D71621) will be happy. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D71649
*	[X86] Fix a KNL miscompile caused by combineSetCC swapping LHS/RHS variables ↵	Craig Topper	2019-12-20	1	-19/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	before a later use. The setcc operands are copied into LHS and RHS variables at the top of the function. We also capture the condition code. A later piece of code swaps the operands and changes the CC variable as part of a canonicalization to make some other checks simpler. But we might not make the transform we canonicalized for. So we continue on through the function where we can use the swapped LHS/RHS variables and access the original condition code operand instead of the modified CC variable. This leads to a setcc being created with the original condition code, but with swapped operands. To mitigate this, this patch does a couple of things. The LHS/RHS/CC variables are made const to keep them from being modified like this again. The transform that needs the swap now uses temporary copies of the variables. And the transform that used the original condition code operand has been altered to use the CC variable we cached originally. Either of these changes is enough to fix the issue, but I am doing both to make this code very safe. I also considered rewriting the swap code in some way to check both permutations without explicitly swapping or needing temporary variables, but held off on that. Differential Revision: https://reviews.llvm.org/D71736
*	[AArch64][SVE] Replace integer immediate intrinsics with splat vector variant	Danilo Carvalho Grael	2019-12-20	2	-22/+39
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Replace the integer immediate intrinsics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71614
*	[SystemZ] Add a mapping from "select register" to "load on condition" (2-addr).	Jonas Paulsson	2019-12-20	4	-81/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SELR(Mux) instructions can be converted to two-address form as LOCR(Mux) instructions whenever one of the sources is the same reg as dest. By adding this mapping in getTwoOperandOpcode(), we get: - Two-address hints in getRegAllocationHints() for select register instructions. - No need anymore for special handling in SystemZShortenInst.cpp - shortenSelect() removed. The two-address hints are now added before the GRX32 hints, which should be preferred. Review: Ulrich Weigand https://reviews.llvm.org/D68870
*	[SystemZ] Bugfix and improve the handling of CC values.	Jonas Paulsson	2019-12-20	6	-33/+141
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It was recently discovered that the handling of CC values was actually broken since overflow was not properly handled ('nsw' flag not checked for). Add and sub instructions now have a new target specific instruction flag named SystemZII::CCIfNoSignedWrap. It means that the CC result can be used instead of a compare with 0, but only if the instruction has the 'nsw' flag set. This patch also adds the improvements of conversion to logical instructions and the analyzing of add with immediates, to be able to eliminate more compares. Review: Ulrich Weigand https://reviews.llvm.org/D66868
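Why the CC result of an add can only replace a compare with 0 when 'nsw' is set can be seen from a small model. This is a Python sketch under simplifying assumptions: 32-bit two's-complement arithmetic, symbolic outcome labels rather than real condition-code encodings, and hypothetical function names.

```python
def wrap32(value):
    """Wrap an integer to a signed 32-bit two's-complement value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def compare_with_zero(value):
    """Outcome of comparing an already-computed result against 0."""
    if value == 0:
        return "zero"
    return "negative" if value < 0 else "positive"


def add_cc(a, b):
    """Outcome reported by the add itself: signed overflow is its own case."""
    exact = a + b
    if exact != wrap32(exact):
        return "overflow"
    return compare_with_zero(exact)
```

Without overflow the two agree, for example `add_cc(1, 2)` and `compare_with_zero(wrap32(3))` are both `"positive"`. With overflow they diverge: `add_cc(2**31 - 1, 1)` reports `"overflow"`, whereas comparing the wrapped result with 0 reports `"negative"`, so the compare can only be dropped when the add is known not to wrap.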
*	Revert "[ARM] Improve codegen of volatile load/store of i64"	Victor Campos	2019-12-20	6	-158/+6
\| \| \| \|	This reverts commit bbcf1c3496ce2bd1ed87e8fb15ad896e279633ce.
*	[SystemZ][FPEnv] Enable strict vector FP extends/truncations	Ulrich Weigand	2019-12-20	4	-13/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The back-end currently has special DAGCombine code to detect cases where two floating-point extend or truncate operations can be combined into a single vector operation. This patch extends that support to also handle strict FP operations. Note that currently only the case where both operations have the same input chain is supported. This already suffices to cover the common case where the operations result from scalarizing a non-legal vector type. More general cases can be supported in the future.
*	[AArch64][SVE] Correct intrinsics and patterns for logical predicate ↵	Paul Walker	2019-12-20	1	-17/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instructions In general SVE intrinsics are considered predicated and merging with everything else having suitable decoration. For predicated zeroing operations (like the predicate logical instructions) we use the "_z" suffix. After this change all intrinsics use their expected names (i.e. orr instead of or and eor instead of xor). I've removed intrinsics and patterns for condition code setting instructions as that data is not returned as part of the intrinsic. The expectation is to ask for a cc flag explicitly. For example: a = and_z(pg, p1, p2) cc = ptest_<flag>(pg, a) With the code generator expected to use "s" variants of instructions when available. Differential Revision: https://reviews.llvm.org/D71715
*	[AArch64][SVE] Fold constant multiply of element count	Cullen Rhodes	2019-12-20	3	-1/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: E.g. %0 = tail call i64 @llvm.aarch64.sve.cntw(i32 31) %mul = mul i64 %0, <const> Should emit: cntw x0, all, mul #<const> For <const> in the range 1-16. Patch by Kerry McLaughlin Reviewers: sdesmalen, huntergr, dancgr, rengolin, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71014
*	[AArch64][SVE] Add intrnisics for saturating scalar arithmetic	Andrzej Warzynski	2019-12-20	2	-69/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The following intrinsics are added: * @llvm.aarch64.sve.sqdec{b\|h\|w\|d\|p} * @llvm.aarch64.sve.sqinc{b\|h\|w\|d\|p} * @llvm.aarch64.sve.uqdec{b\|h\|w\|d\|p} * @llvm.aarch64.sve.uqinc{b\|h\|w\|d\|p} For every intrinsic there are scalar variants (with n32 or n64 suffix) and vector variants (no suffix). Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen, efriedma Subscribers: eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71252
*	Recommit "[AArch64][SVE] Add permutation and selection intrinsics"	Cullen Rhodes	2019-12-20	5	-42/+264
\| \| \| \| \| \| \| \|	Recommit 23c28c40436143006be740533375c036d11c92cd (reverted in dcb48f50bdfa0fa47b62d089b6ed999d857fc9f8) with a fix for an assert "Request for a fixed size on a scalable object" being triggered in `LowerSVEIntrinsicEXT`. The fix is to call `getKnownMinSize` on the TypeSize object.
*	[AArch64][SVE] Add intrinsics for binary narrowing operations	Andrzej Warzynski	2019-12-20	3	-22/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The following intrinsics for binary narrowing shift right operations are added: * @llvm.aarch64.sve.shrnb * @llvm.aarch64.sve.uqshrnb * @llvm.aarch64.sve.sqshrnb * @llvm.aarch64.sve.sqshrunb * @llvm.aarch64.sve.uqrshrnb * @llvm.aarch64.sve.sqrshrnb * @llvm.aarch64.sve.sqrshrunb * @llvm.aarch64.sve.shrnt * @llvm.aarch64.sve.uqshrnt * @llvm.aarch64.sve.sqshrnt * @llvm.aarch64.sve.sqshrunt * @llvm.aarch64.sve.uqrshrnt * @llvm.aarch64.sve.sqrshrnt * @llvm.aarch64.sve.sqrshrunt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71552
*	[ARM][MVE] Fixes for tail predication.	Sam Parker	2019-12-20	3	-12/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1) Fix an issue with the incorrect value being used for the number of elements being passed to [d\|w]lstp. We were trying to check that the value was available at LoopStart, but this doesn't consider that the last instruction in the block could also define the register. Two helpers have been added to RDA for this. 2) Insert some code to now try to move the element count def or the insertion point so that we can perform more tail predication. 3) Related to (1), the same off-by-one could prevent us from generating a low-overhead loop when a mov lr could have been the last instruction in the block. 4) Fix up some instruction attributes so that not all the low-overhead loop instructions are labelled as branches and terminators - as this is not true for dls/dlstp. Differential Revision: https://reviews.llvm.org/D71609
*	[ARM][MVE] Tail predicate in the presence of vcmp	Sam Parker	2019-12-20	3	-76/+270
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Record the discovered VPT blocks while checking for validity and, for now, only handle blocks that begin with VPST and not VPT. We're now allowing more than one instruction to define vpr, but each block must somehow be predicated using the vctp. This leaves us with several scenarios which need fixing up: 1) A VPT block which is only predicated by the vctp and has no internal vpr defs. 2) A VPT block which is only predicated by the vctp but has an internal vpr def. 3) A VPT block which is predicated upon the vctp as well as another vpr def. 4) A VPT block which is not predicated upon a vctp, but contains it and all instructions within the block are predicated upon it. The changes needed are, for: 1) The easy one, just remove the vpst and unpredicate the instructions in the block. 2) Remove the vpst and unpredicate the instructions up to the internal vpr def. Need to insert a new vpst to predicate the remaining instructions. 3) Do nothing. 4) The vctp will be inside a vpt and the instruction will be removed, so adjust the size of the mask on the vpst. Differential Revision: https://reviews.llvm.org/D71107
*	[ARM][MVE] Tail predicate bottom/top muls.	Sam Parker	2019-12-20	1	-0/+3
\| \| \| \| \| \|	Add VMULL and VQDMULL variants to our tail predication white list. Differential Revision: https://reviews.llvm.org/D71465
*	[X86] Make EmitCmp into a static function and explicitly return chain result ↵	Craig Topper	2019-12-19	2	-18/+19
\| \| \| \| \| \| \| \| \| \| \| \|	for STRICT_FCMP. NFCI The only thing it's getting from the X86TargetLowering class is the subtarget which we can easily pass. This function only has one call site now, so this might help the compiler inline it. Explicitly return both the flag result and the chain result for STRICT_FCMP nodes. This removes an assumption in the caller that getValue(1) is the right way to get the chain.
*	[X86] Directly call EmitTest in two places instead of creating a null ↵	Craig Topper	2019-12-19	1	-4/+2
\| \| \| \| \| \| \| \| \|	constant and calling EmitCmp. NFCI EmitCmp will just immediately call EmitTest and discard the null constant only to have EmitTest create it again if it doesn't fold. So just skip all that and go directly to EmitTest.
*	[StackMaps] Be explicit about label formation [NFC] (try 2)	Philip Reames	2019-12-19	4	-9/+43
\| \| \| \| \| \|	Recommit after making the same API change in non-x86 targets. This has been built for all targets, and tested for affected ones. Why the difference? Because my disk filled up when I tried make check for all. For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.
*	Temporarily Revert "[StackMaps] Be explicit about label formation [NFC]"	Eric Christopher	2019-12-19	1	-14/+3
\| \| \| \| \| \|	as it broke the aarch64 build. This reverts commit bc7595d934b958ab481288d7b8e768fe5310be8f.
*	[StackMaps] Be explicit about label formation [NFC]	Philip Reames	2019-12-19	1	-3/+14
\| \| \| \|	For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.
*	[FaultMaps] Make label formation a bit more explicit [NFC]	Philip Reames	2019-12-19	1	-1/+5
\| \| \| \|	This is in advance of assembler padding directives support where we'll need to bundle the label w/the corresponding faulting instruction to avoid padding being inserted between.
*	[RISCV] Don't crash on unsupported relocations	Luís Marques	2019-12-19	1	-2/+11
\| \| \| \| \| \| \| \| \| \|	Summary: Instead of crashing due to the `llvm_unreachable`, provide a proper error when invalid fixups/relocations are encountered. Reviewers: asb, lenary Reviewed By: asb Tags: #llvm Differential Revision: https://reviews.llvm.org/D71536
*	[SystemZ] Recognize mrecord-mcount in backend	Jonas Paulsson	2019-12-19	2	-3/+18
\| \| \| \| \| \| \|	Emit the __mcount_loc section for all fentry calls. Review: Ulrich Weigand https://reviews.llvm.org/D71629
*	[RISCV] Enable the machine outliner for RISC-V	lewis-revill	2019-12-19	3	-0/+190
\| \| \| \| \| \| \| \| \| \|	This patch enables the machine outliner for RISC-V and adds the necessary logic for checking whether sequences can be safely outlined, and describing how they should be outlined. Outlined functions are called using the register t0 (x5) as the return address register, which must be available for an occurrence of a sequence to be safely outlined. Differential Revision: https://reviews.llvm.org/D66210
*	[PowerPC] Only use PLT annotations if using PIC relocation model	Justin Hibbits	2019-12-19	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The default static (non-PIC, non-PIE) model for 32-bit powerpc does not use @PLT annotations and relocations in GCC. LLVM shouldn't use @PLT annotations either, because it breaks secure-PLT linking with (some versions of?) GNU LD. Update the available-externally.ll test to reflect that default mode should be the same as the static relocation, by using the same check prefix. Reviewed by: sfertile Differential Revision: https://reviews.llvm.org/D70570