bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Include what you use in AArch64AsmBackend.cpp	Dmitri Gribenko	2019-05-27	1	-1/+4
	AArch64AsmBackend.cpp was not using any APIs from AArch64.h, and was only including it for transitive dependencies. Doing so is problematic from an include-what-you-use perspective, but it is also a layering issue (it creates a dependency cycle between the primary AArch64 target library and the MCTargetDesc library). llvm-svn: 361774
*	[AMDGPU] Fix for the address sanitizer failure caused by the following ↵	Alexander Timofeev	2019-05-27	1	-1/+3
	commit: 1a8b2ea611cf4ca7cb09562e0238cfefa27c05b5 Divergence driven ISel. Assign register class for cross block values according to the divergence. llvm-svn: 361770
*	[AMDGPU][MC] Enabled constant expressions as operands of s_waitcnt	Dmitry Preobrazhensky	2019-05-27	1	-36/+28
	See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D61017 llvm-svn: 361763
*	[ARM GlobalISel] Cleanup CallLowering a bit	Diana Picus	2019-05-27	2	-22/+13
	We never actually use the Offsets produced by ComputeValueVTs, so remove them until we need them. llvm-svn: 361755
*	[BPF] generate R_BPF_NONE relocation for BTF DataSec variables	Yonghong Song	2019-05-26	1	-10/+22
	The variables in the BTF DataSec type encode the in-section offset. R_BPF_NONE should be generated instead of R_BPF_64_32. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D62460 llvm-svn: 361742
*	[AMDGPU] Divergence driven ISel. Assign register class for cross block ↵	Alexander Timofeev	2019-05-26	7	-111/+176
	values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. On divergent targets, the same value type requires different register classes depending on the value's divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 This commit was reverted because of a build failure. The cause was a malformed patch. Build failure fixed. llvm-svn: 361741
*	[SimplifyCFG] back out all SwitchInst commits	Shawn Landden	2019-05-26	1	-1/+1
	They caused the sanitizer builds to fail. My suspicion is the change to countLeadingZeros(). llvm-svn: 361736
*	[X86][SSE] Add shuffle combining support for ISD::ANY_EXTEND_VECTOR_INREG	Simon Pilgrim	2019-05-26	4	-13/+23
	Reuses what we already have in place for ISD::ZERO_EXTEND_VECTOR_INREG, just with a different sentinel. llvm-svn: 361734
*	[Support] make countLeadingZeros() and countTrailingZeros() return unsigned	Shawn Landden	2019-05-26	1	-1/+1
	This matches countLeadingOnes() and countTrailingOnes(), and APInt's countLeadingZeros() and countTrailingZeros(). (as well as __builtin_clzll()) llvm-svn: 361724
*	[ARM] Select fp16 fma	David Green	2019-05-26	1	-0/+3
	This adds a pattern for fma, similar to the float and double patterns. Differential Revision: https://reviews.llvm.org/D62330 llvm-svn: 361719
*	[ARM] Select a number of fp16 rounding functions	David Green	2019-05-26	2	-4/+6
	This adds patterns for fp16 round, ceil, etc., the same as the float and double patterns. Differential Revision: https://reviews.llvm.org/D62326 llvm-svn: 361718
*	[ARM] Promote various fp16 math intrinsics	David Green	2019-05-26	1	-0/+11
	Promote a number of fp16 math intrinsics to float, so that the relevant float math routines can be used. Copysign is expanded so as to be handled in-place. Differential Revision: https://reviews.llvm.org/D62325 llvm-svn: 361717
*	[X86][AVX] combineBitcastvxi1 - peek through bitops to determine size of ↵	Simon Pilgrim	2019-05-26	1	-3/+17
	original vector We were only testing for direct SETCC results - this allows us to peek through AND/OR/XOR combinations of the comparison results as well. There's a missing SEXT(PACKSS) fold that I need to investigate for v8i1 cases before I can enable it there as well. llvm-svn: 361716
*	[ARM] Select fp16 fabs	David Green	2019-05-26	1	-2/+2
	This adds a pattern for the fabs intrinsic, the same as float and double. Differential Revision: https://reviews.llvm.org/D62324 llvm-svn: 361715
*	[ARM] Select fp16 fsqrt	David Green	2019-05-26	1	-2/+2
	This adds a pattern for the sqrt intrinsic, the same as float and double. Differential Revision: https://reviews.llvm.org/D62322 llvm-svn: 361714
*	[ARM] Promote fp16 frem	David Green	2019-05-26	1	-0/+5
	Promote fp16 frem operations on ARM to floats so they call fmodf. Differential Revision: https://reviews.llvm.org/D62321 llvm-svn: 361713
*	[X86] lowerBuildVectorToBitOp - support build_vector(shift()) -> ↵	Simon Pilgrim	2019-05-25	1	-0/+20
	shift(build_vector(),C) Commonly occurs in sign-extension cases llvm-svn: 361706
*	[X86] Combine fminnum/fmaxnum with non-nan operand to fmin/fmax	Nikita Popov	2019-05-25	1	-3/+7
	If we have a known non-nan operand, place it in the second operand of fmin/fmax that is returned if either operand is nan. Differential Revision: https://reviews.llvm.org/D62448 llvm-svn: 361704
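The operand placement described above can be illustrated with a small model. This is a sketch under stated assumptions, not the LLVM implementation: `x86_min` models the SSE-style rule the message relies on (the second operand is returned whenever either operand is NaN), and `fminnum` models the rule that a single NaN operand is ignored. All three function names are hypothetical.

```python
import math

def x86_min(a, b):
    # Modelled on the SSE min rule: the comparison is false when either
    # operand is NaN, so the second operand is returned in that case.
    return a if a < b else b

def fminnum(a, b):
    # minNum-style semantics: a single NaN operand is ignored.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)

def lower_fminnum(maybe_nan, known_non_nan):
    # Put the operand known not to be NaN in the second slot, so the
    # "either operand is NaN" case yields the non-NaN value.
    return x86_min(maybe_nan, known_non_nan)
```

With the operands swapped, `x86_min(2.0, float("nan"))` would return NaN, which is why the placement matters.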
*	[X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. ↵	Craig Topper	2019-05-25	1	-48/+106
	Support LEA64_32r properly. INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags. This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg. One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think it's important. Not sure which operand it's supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input. Differential Revision: https://reviews.llvm.org/D61472 llvm-svn: 361691
*	[X86] Add zero idioms to the haswell, broadwell, and skylake schedule ↵	Craig Topper	2019-05-25	5	-18/+395
	models. Add 256-bit fp xor to sandybridge zero idioms This copies the Sandy Bridge zero idiom support to later CPUs. Adding the AVX2 and AVX512F/VL instructions as appropriate. Differential Revision: https://reviews.llvm.org/D62360 llvm-svn: 361690
*	Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for ↵	Peter Collingbourne	2019-05-25	7	-176/+87
	cross block values according to the divergence." Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio llvm-svn: 361688
*	[GlobalISel][AArch64] Make FP constraint checks consider possible use/def banks	Jessica Paquette	2019-05-24	2	-7/+43
	In a few places in getInstrMapping, we check if use/def instructions for the instruction we're mapping have floating point constraints. We can improve this check and reduce the number of copies in GISel-compiled code if we make a couple of observations: - For a def instruction, it only matters if the def instruction must always output a value stored on a FPR - For a use instruction, it only matters if the use instruction must always only take in values stored in FPRs This adds two new functions: - onlyUsesFP - onlyDefinesFP Then we can use those when we're checking the uses/defs instead. Without this patch, the load, unmerge, store, and select in the added test would have unnecessary copies. Differential Revision: https://reviews.llvm.org/D62426 llvm-svn: 361679
*	[GlobalISel][AArch64] NFC: Factor out HasFPConstraints into a proper function	Jessica Paquette	2019-05-24	2	-41/+32
	Factor it out into a function, and replace places where we had the same check with the new function. Differential Revision: https://reviews.llvm.org/D62421 llvm-svn: 361677
*	Implement call lowering without parameters on AIX	Jason Liu	2019-05-24	9	-19/+128
	Summary: This patch implements call lowering for calls without parameters on AIX as initial support. Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma Differential Revision: https://reviews.llvm.org/D61948 llvm-svn: 361669
*	[GlobalISel][AArch64] Improve register bank mappings for G_SELECT	Jessica Paquette	2019-05-24	1	-6/+49
	The fcsel and csel instructions differ in only the register banks they work on. So, they're entirely interchangeable otherwise. With this in mind, this does two things: - Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as the outputs. - Teach it to choose the best register bank mapping based off the constraints of the inputs and outputs. The "best" in this case means the one that requires the smallest number of copies to properly emit a fcsel/csel. For example, if the inputs are all already going to be on FPRs, we should emit a fcsel, even if the output is a GPR. This costs one copy to produce the result, but saves us from copying the inputs into GPRs. Also update the regbank-select.mir to check that we end up with the right select instruction. Differential Revision: https://reviews.llvm.org/D62267 llvm-svn: 361665
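The "smallest number of copies" rule can be sketched as a simple cost comparison. This is a hypothetical model, not the code in AArch64RegisterBankInfo: the function names and the string bank labels are made up for illustration, and the condition operand of G_SELECT is left out on the assumption that it does not influence the choice.

```python
def copies_needed(chosen_bank, operand_banks):
    # One cross-bank copy for every operand that does not already live
    # on the bank the select would be emitted for.
    return sum(1 for bank in operand_banks if bank != chosen_bank)

def pick_select_bank(dst_bank, true_bank, false_bank):
    # Consider the output and both selected inputs, then take the bank
    # that needs the fewest copies (csel for "GPR", fcsel for "FPR").
    operands = [dst_bank, true_bank, false_bank]
    return min(["GPR", "FPR"], key=lambda bank: copies_needed(bank, operands))
```

For the example in the message (both inputs on FPRs, output on a GPR), `pick_select_bank("GPR", "FPR", "FPR")` chooses `"FPR"`: one copy for the result instead of two copies for the inputs.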
*	[AArch64] check for INLINEASM_BR along w/ INLINEASM	Nick Desaulniers	2019-05-24	1	-2/+5
	Summary: It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Reviewers: t.p.northover, peter.smith Reviewed By: peter.smith Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62402 llvm-svn: 361661
*	[ARM] additionally check for ARM::INLINEASM_BR w/ ARM::INLINEASM	Nick Desaulniers	2019-05-24	2	-11/+11
	Summary: We were observing failures for arm32 allyesconfigs of the Linux kernel with the asm goto Clang patch, where ldr's were being generated to offsets too far away to encode in imm12. It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Link: https://github.com/ClangBuiltLinux/linux/issues/490 Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover Reviewed By: peter.smith Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62400 llvm-svn: 361659
*	AMDGPU: Activate all lanes when spilling CSR VGPR for SGPR spills	Matt Arsenault	2019-05-24	1	-26/+66
	If some lanes weren't active on entry to the function, this could clobber their VGPR values. llvm-svn: 361655
*	AMDGPU: Boost inline threshold with addrspacecasted alloca arguments	Matt Arsenault	2019-05-24	1	-3/+4
	This was skipping GetUnderlyingObject for nonprivate addresses, but an alloca could also be found through an addrspacecast if it's flat. llvm-svn: 361649
*	[AMDGPU] Divergence driven ISel. Assign register class for cross block ↵	Alexander Timofeev	2019-05-24	7	-87/+176
	values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. On divergent targets, the same value type requires different register classes depending on the value's divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 llvm-svn: 361644
*	[PowerPC] Remove CRBits Copy Of Unset/set CBit	Stefan Pintilie	2019-05-24	2	-0/+4
	In the situation where we generate the following code: crxor 8, 8, 8 < Some instructions> .LBB0_1: < Some instructions> cror 1, 8, 8 the cror (COPY of CRbit) depends on the result of the crxor instruction. CR8 is known to be zero as crxor is equivalent to CRUNSET. We can simply use crxor 1, 1, 1 instead to zero out CR1, which does not have any dependency on any previous instruction. This patch will optimize it to: < Some instructions> .LBB0_1: < Some instructions> crxor 1, 1, 1 Patch By: Victor Huang (NeHuang) Differential Revision: https://reviews.llvm.org/D62044 llvm-svn: 361632
*	[AArch64][SVE2] Asm: support SVE2 String Processing Group	Cullen Rhodes	2019-05-24	2	-0/+35
	Summary: Patch adds support for the SVE2 character match instructions MATCH and NMATCH. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62206 llvm-svn: 361627
*	[AArch64][SVE2] Asm: support SVE2 Narrowing Group	Cullen Rhodes	2019-05-24	2	-0/+118
	Summary: Patch adds support for the following instructions: SVE2 bitwise shift right narrow: * SQSHRUNB, SQSHRUNT, SQRSHRUNB, SQRSHRUNT, SHRNB, SHRNT, RSHRNB, RSHRNT, SQSHRNB, SQSHRNT, SQRSHRNB, SQRSHRNT, UQSHRNB, UQSHRNT, UQRSHRNB, UQRSHRNT SVE2 integer add/subtract narrow high part: * ADDHNB, ADDHNT, RADDHNB, RADDHNT, SUBHNB, SUBHNT, RSUBHNB, RSUBHNT SVE2 saturating extract narrow: * SQXTNB, SQXTNT, UQXTNB, UQXTNT, SQXTUNB, SQXTUNT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62205 llvm-svn: 361624
*	[AArch64][SVE2] Asm: support SVE2 Accumulate Group	Cullen Rhodes	2019-05-24	2	-0/+186
	Summary: Patch adds support for the following instructions: SVE2 bitwise shift and insert: * SRI, SLI SVE2 bitwise shift right and accumulate: * SSRA, USRA, SRSRA, URSRA SVE2 complex integer add: * CADD, SQCADD SVE2 integer absolute difference and accumulate: * SABA, UABA SVE2 integer absolute difference and accumulate long: * SABALB, SABALT, UABALB, UABALT SVE2 integer add/subtract long with carry: * ADCLB, ADCLT, SBCLB, SBCLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62204 llvm-svn: 361622
*	[SelectionDAG] computeKnownBits - support constant pool values from target	Simon Pilgrim	2019-05-24	2	-4/+14
	This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets. computeKnownBits then uses this function to improve codegen, notably vector code after legalization. A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit. This required a couple of fixes: * SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR). * Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets. Differential Revision: https://reviews.llvm.org/D61887 llvm-svn: 361620
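To see why a constant recovered from a load helps known-bits analysis, the sketch below derives the bits that are the same in every element of a constant vector. It is only an illustration of the general idea; `known_bits_of_constants` is a hypothetical helper, not the LLVM API, and it assumes a non-empty list of unsigned element values.

```python
def known_bits_of_constants(elements, bit_width):
    # Returns (known_zero, known_one) masks for a non-empty list of
    # constant elements, each treated as a bit_width-bit unsigned value.
    mask = (1 << bit_width) - 1
    known_zero = mask
    known_one = mask
    for value in elements:
        value &= mask
        known_one &= value            # bit is 1 in every element so far
        known_zero &= ~value & mask   # bit is 0 in every element so far
    return known_zero, known_one
```

For the elements `0x0F`, `0x07`, `0x03` at 8 bits, the top four bits are known zero and the low two bits are known one; the remaining bits differ between elements and stay unknown.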
*	[AArch64][SVE2] Asm: add PMULLB/PMULLT instructions	Cullen Rhodes	2019-05-24	2	-0/+17
	Summary: This patch adds support for the polynomial multiplication instructions PMULLB/PMULLT. The 64-bit source and 128-bit destination element variants are enabled with crypto extensions (+sve2-aes), similar to the NEON PMULL2 instruction. All other variants are enabled with +sve2. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62145 llvm-svn: 361619
*	[AArch64][SVE2] Asm: add integer add/sub long/wide instructions	Cullen Rhodes	2019-05-24	2	-0/+30
	Summary: Patch adds support for the following instructions: SVE2 integer add/subtract long: * SADDLB, SADDLT, UADDLB, UADDLT, SSUBLB, SSUBLT, USUBLB, USUBLT, SABDLB, SABDLT, UABDLB, UABDLT SVE2 integer add/subtract wide: * SADDWB, SADDWT, UADDWB, UADDWT, SSUBWB, SSUBWT, USUBWB, USUBWT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62142 llvm-svn: 361615
*	[AArch64][SVE2] Asm: add various bitwise shift instructions	Cullen Rhodes	2019-05-24	2	-11/+32
	Summary: This patch adds support for the SVE2 saturating/rounding bitwise shift left (predicated) group of instructions: * SRSHL, URSHL, SRSHLR, URSHLR, SQSHL, UQSHL, SQRSHL, UQRSHL, SQSHLR, UQSHLR, SQRSHLR, UQRSHLR Immediate forms of the SQSHL and UQSHL instructions are also added to the existing SVE bitwise shift by immediate (predicated) group, as well as three new instructions SRSHR/URSHR/SQSHLU. The new instructions in this group are encoded similarly and are implemented using the same TableGen class with a minimal change (1 bit in encoding). The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62140 llvm-svn: 361612
*	[AArch64][SVE2] Asm: add saturating add/sub instructions	Cullen Rhodes	2019-05-24	1	-0/+10
	Summary: Patch adds support for the following instructions: * SQADD, UQADD, SUQADD, USQADD * SQSUB, UQSUB, SQSUBR, UQSUBR The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62130 llvm-svn: 361611
*	[AArch64][SVE2] Asm: fix overlapping bit	Cullen Rhodes	2019-05-24	1	-1/+1
	Summary: Bit 20 in sve2_int_arith_pred TableGen class was overlapping. The encodings are not affected as bit 20 is defined by the opc bits and this was overwriting the earlier error of setting bit 20 to 0. Raised by Momchil: https://reviews.llvm.org/D62130 Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62292 llvm-svn: 361609
*	GlobalISel: support swifterror attribute on AArch64.	Tim Northover	2019-05-24	2	-4/+26
	swifterror marks an argument as a register pretending to be a pointer, so we need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the infrastructure can be reused from the DAG world. llvm-svn: 361608
*	[mips] Always check that `shift and add` optimization is efficient.	Simon Atanasyan	2019-05-24	1	-26/+31
	D45316 introduced the `shouldTransformMulToShiftsAddsSubs` function to check that breaking down constant multiplications into a series of shifts, adds, and subs is efficient. Unfortunately, this function does not check the maximum number of steps on all paths of the algorithm. This patch fixes this bug. Fix for PR41929. Differential Revision: https://reviews.llvm.org/D62166 llvm-svn: 361606
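The idea of breaking a constant multiplication into shifts, adds, and subs, and of bounding the number of steps, can be sketched as follows. This is an illustrative model only, not the MIPS backend code: the function names, the rule of counting every shift and every add/sub as one step, and the `max_steps` threshold are all assumptions made for the example.

```python
def decompose(c):
    # Split a positive constant into signed powers of two such that
    # sum(sign << shift for sign, shift in terms) == c.
    terms, sign = [], 1
    while c != 0:
        floor = 1 << (c.bit_length() - 1)           # largest power of two <= c
        ceil = floor if c == floor else floor << 1  # smallest power of two >= c
        if c - floor <= ceil - c:
            terms.append((sign, floor.bit_length() - 1))
            c -= floor
        else:
            terms.append((sign, ceil.bit_length() - 1))
            c = ceil - c
            sign = -sign
    return terms

def step_count(c):
    # One step per non-trivial shift plus one per add/sub joining the terms.
    terms = decompose(c)
    shifts = sum(1 for _, shift in terms if shift > 0)
    return shifts + len(terms) - 1

def should_transform(c, max_steps):
    # The bound is applied to the whole decomposition, whichever branch
    # (add or sub) each iteration took.
    return step_count(c) <= max_steps

def multiply(x, c):
    # Evaluate x * c using only shifts, adds, and subs.
    return sum(sign * (x << shift) for sign, shift in decompose(c))
```

For instance, 7 decomposes as `(x << 3) - x`, which is two steps, while 11 becomes `(x << 3) + (x << 1) + x`, which is four steps under this counting.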
*	[ARM] ARMExpandPseudoInsts: add debug messages	Sjoerd Meijer	2019-05-24	1	-2/+16
	This pass wasn't printing any messages at all, which I find really inconvenient while debugging/tracing things. It now dumps the before and after of expanded instructions. It doesn't do this yet for all instructions, but this is a good start I guess. Differential Revision: https://reviews.llvm.org/D62297 llvm-svn: 361604
*	[Power9] Add a specific heuristic to schedule the addi before the load	QingShan Zhang	2019-05-24	2	-0/+58
	When we are scheduling the load and addi, if no other heuristic took effect, we will try to schedule the addi before the load, to hide the latency and avoid the true dependency added by RA. This only takes effect on Power9. Differential Revision: https://reviews.llvm.org/D61930 llvm-svn: 361600
*	[AArch64] Preserve X8 for thunks ending in variadic musttail calls	Reid Kleckner	2019-05-24	1	-0/+6
	Summary: On Windows, X8 may be used to pass in the address of an aggregate that is returned indirectly. Therefore, it should be forwarded to variadic musttail calls and preserved in thunks. Fixes PR41997 Reviewers: mgrang, efriedma Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62344 llvm-svn: 361585
*	[AArch64] Add nvcast patterns for v2f32 -> v1f64	Serge Pavlov	2019-05-24	1	-0/+1
	Summary: Constant stores of f32 values can create such NvCast nodes. Reviewers: t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62285 llvm-svn: 361584
*	[WebAssembly] Expand more SIMD float ops	Thomas Lively	2019-05-24	1	-1/+2
	Summary: These were previously causing ISel failures. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62354 llvm-svn: 361577
*	AMDGPU: Correct maximum possible private allocation size	Matt Arsenault	2019-05-23	4	-28/+14
	We were assuming a much larger per-wave visible stack allocation than is actually possible: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/faa3ae51388517353afcdaf9c16621f879ef0a59/src/core/runtime/amd_gpu_agent.cpp#L70 Based on this, we can assume the high 15 bits of a frame index or sret are 0. The frame index value is the per-lane offset, so the maximum frame index value is MAX_WAVE_SCRATCH / wavesize. Remove the corresponding subtarget feature and option that made this configurable. llvm-svn: 361541
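The "high 15 bits are 0" claim follows from the division described above. The arithmetic can be checked with a short sketch; note that the concrete numbers are assumptions not stated in the commit message: 8387584 bytes is taken as the value of the linked `MAX_WAVE_SCRATCH` constant, the wave size is taken to be 64 lanes, and the frame index is assumed to live in a 32-bit register. The function name is hypothetical.

```python
MAX_WAVE_SCRATCH = 8387584  # bytes; assumed value of the linked ROCR constant
WAVE_SIZE = 64              # lanes per wave; assumed

def known_high_zero_bits(max_wave_scratch, wave_size, register_bits=32):
    # The frame index is a per-lane offset, so its largest value is the
    # per-wave limit divided by the number of lanes.
    max_frame_index = max_wave_scratch // wave_size
    # Every bit above the highest set bit of that maximum is always zero.
    return register_bits - max_frame_index.bit_length()
```

Under these assumptions the maximum frame index is 131056, which fits in 17 bits, leaving 15 high bits that are always zero.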
*	Resubmit r360436 "[X86] Avoid SFB - Fix inconsistent codegen with/without ↵	Robert Lougher	2019-05-23	1	-4/+10
	debug info" Fixes https://bugs.llvm.org/show_bug.cgi?id=40969 The functions findPotentiallyBlockedCopies and buildCopy are currently not accounting for the presence of debug instructions. In the former this results in the optimization not being triggered, and in the latter it results in inconsistent codegen. This patch enables the optimization to be performed in a debug build and ensures the codegen is consistent with non-debug builds. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61680 llvm-svn: 361527
*	[WebAssembly] Implement ReplaceNodeResults to fix a SIMD crash	Thomas Lively	2019-05-23	2	-0/+18
	Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61037 llvm-svn: 361526