bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Rename TTI::getIntImmCost for instructions and intrinsics	Reid Kleckner	2019-12-11	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381
*	Revert "[SDAG] remove use restriction in isNegatibleForFree() when called ↵	Sanjay Patel	2019-12-11	2	-7/+4
\| \| \| \| \| \| \|	from getNegatedExpression()" This reverts commit d1f0bdf2d2df9bdf11ee2ddfff3df50e53f2f042. The patch can cause infinite loops in DAGCombiner.
*	[SDAG] remove use restriction in isNegatibleForFree() when called from ↵	Sanjay Patel	2019-12-11	2	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch skips the use check during the rewrite phase. So we determine that some expression isNegatibleForFree (identically to without this patch), but during the rewrite, don't rely on use counts to decide how to create the optimal expression. Differential Revision: https://reviews.llvm.org/D70975
*	[X86] Erase dead LEA instruction after converting it to MOV in ↵	Craig Topper	2019-12-11	1	-0/+3
\| \| \| \|	FixupLEAPass::processInstrForSlow3OpLEA.
*	[X86] Split v64i1 arguments into 2 v32i1s that will be promoted to v32i8 ↵	Craig Topper	2019-12-10	1	-3/+17
\| \| \| \| \| \|	under min-legal-vector-width=256 This is an improvement to 88dacbd43625cf7aad8a01c0c3b92142c4dc0970
*	[FPEnv][X86] Constrained FCmp intrinsics enabling on X86	Wang, Pengfei	2019-12-11	9	-94/+252
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582
*	[X86] Go back to considering v64i1 as a legal type under ↵	Craig Topper	2019-12-10	1	-47/+32
\| \| \| \| \| \| \| \| \| \|	min-legal-vector-width=256. Scalarize v64i1 arguments and shuffles under min-legal-vector-width=256. This reverts 3e1aee2ba717529b651a79ed4fc7e7147358043f in favor of a different approach. Scalarizing isn't great codegen, but making the type illegal was interfering with k constraint in inline assembly.
*	add support for strict operation fpextend/fpround/fsqrt on X86 backend	Liu, Chen3	2019-12-10	4	-94/+86
\| \| \| \|	Differential Revision: https://reviews.llvm.org/D71184
*	[PGO][PGSO] Instrument the code gen / target passes.	Hiroshi Yamauchi	2019-12-09	3	-2/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149
*	[ARM] Teach the Arm cost model that a Shift can be folded into other ↵	David Green	2019-12-09	2	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
*	[DebugInfo] Make describeLoadedValue() reg aware	David Stenberg	2019-12-09	2	-5/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431
*	Revert "[DebugInfo] Make describeLoadedValue() reg aware"	David Stenberg	2019-12-09	2	-84/+5
\| \| \| \| \|	This reverts commit 3cd93a4efcdeabeb20cb7bec9fbddcb540d337a1. I'll recommit with a well-formatted arcanist commit message.
*	[DebugInfo] Make describeLoadedValue() reg aware	David Stenberg	2019-12-09	2	-5/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.
*	[X86] Fix prolog/epilog mismatch for stack protectors on win32-macho.	Amara Emerson	2019-12-06	1	-1/+1
\| \| \| \| \| \| \| \|	The xor'ing behaviour is only used for msvc/crt environments, when we're targeting macho the guard load code doesn't know about the xor in the epilog. Disable xor'ing when targeting win32-macho to be consistent. Differential Revision: https://reviews.llvm.org/D71095
*	[TargetLowering] Fix another potential FPE in expandFP_TO_UINT	Craig Topper	2019-12-06	1	-13/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	D53794 introduced code to perform the FP_TO_UINT expansion via FP_TO_SINT in a way that would never expose floating-point exceptions in the intermediate steps. Unfortunately, I just noticed there is still a way this can happen. As discussed in D53794, the compiler now generates this sequence: // Sel = Src < 0x8000000000000000 // Val = select Sel, Src, Src - 0x8000000000000000 // Ofs = select Sel, 0, 0x8000000000000000 // Result = fp_to_sint(Val) ^ Ofs The problem is with the Src - 0x8000000000000000 expression. As I mentioned in the original review, that expression can never overflow or underflow if the original value is in range for FP_TO_UINT. But I missed that we can get an Inexact exception in the case where Src is a very small positive value. (In this case the result of the sub is ignored, but that doesn't help.) Instead, I'd suggest to use the following sequence: // Sel = Src < 0x8000000000000000 // FltOfs = select Sel, 0, 0x8000000000000000 // IntOfs = select Sel, 0, 0x8000000000000000 // Result = fp_to_sint(Val - FltOfs) ^ IntOfs In the case where the value is already in range of FP_TO_SINT, we now simply compute Val - 0, which now definitely cannot trap (unless Val is a NaN in which case we'd want to trap anyway). In the case where the value is not in range of FP_TO_SINT, but still in range of FP_TO_UINT, the sub can never be inexact, as Val is between 2^(n-1) and (2^n)-1, i.e. always has the 2^(n-1) bit set, and the sub is always simply clearing that bit. There is a slight complication in the case where Val is a constant, so we know at compile time whether Sel is true or false. In that scenario, the old code would automatically optimize the sub away, while this no longer happens with the new code. Instead, I've added extra code to check for this case and then just fall back to FP_TO_SINT directly. (This seems to catch even slightly more cases.) Original version of the patch by Ulrich Weigand. X86 changes added by Craig Topper Differential Revision: https://reviews.llvm.org/D67105
*	[X86] Don't setup and teardown memory for a musttail call	Reid Kleckner	2019-12-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: musttail calls should not require allocating extra stack for arguments. Updates to arguments passed in memory should happen in place before the epilogue. This bug was mostly a missed optimization, unless inalloca was used and store to push conversion fired. If a reserved call frame was used for an inalloca musttail call, the call setup and teardown instructions would be deleted, and SP adjustments would be inserted in the prologue and epilogue. You can see these are removed from several test cases in this change. In the case where the stack frame was not reserved, i.e. call frame optimization fires and turns argument stores into pushes, then the imbalanced call frame setup instructions created for inalloca calls become a problem. They remain in the instruction stream, resulting in a call setup that allocates zero bytes (expected for inalloca), and a call teardown that deallocates the inalloca pack. This deallocation was unbalanced, leading to subsequent crashes. Reviewers: hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71097
*	Revert "[PGO][PGSO] Instrument the code gen / target passes."	Hiroshi Yamauchi	2019-12-06	3	-52/+2
\| \| \| \| \| \|	This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac. This seems to break buildbots.
*	[x86] add cost model special-case for insert/extract from element 0	Sanjay Patel	2019-12-06	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-up to D70607 where we made any extract element on SLM more costly than default. But that is pessimistic for extract from element 0 because that corresponds to x86 movd/movq instructions. These generally have >1 cycle latency, but they are probably implemented as single uop instructions. Note that no vectorization tests are affected by this change. Also, no targets besides SLM are affected because those are falling through to the default cost of 1 anyway. But this will become visible/important if we add more specializations via cost tables. Differential Revision: https://reviews.llvm.org/D71023
*	[PGO][PGSO] Instrument the code gen / target passes.	Hiroshi Yamauchi	2019-12-06	3	-2/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072
*	[X86] Make X86TargetLowering::BuildFILD return a std::pair of SDValues so we ↵	Craig Topper	2019-12-05	2	-11/+12
\| \| \| \| \| \| \| \| \| \|	explicitly return the chain instead of calling getValue on the single SDValue. We shouldn't assume that the returned result can be used to get the other result. This is prep-work for strict FP where we will also need to pass the chain result along in more cases.
*	Add strict fp support for instructions fadd/fsub/fmul/fdiv	Liu, Chen3	2019-12-06	4	-32/+42
\| \| \| \|	Differential Revision: https://reviews.llvm.org/D68757
*	[X86] Remove ProcIntelGLM/ProcIntelGLP/ProcIntelTRM and replace them with a ↵	Craig Topper	2019-12-05	4	-23/+15
\| \| \| \| \| \|	single feature flag covers the two places they were used. Differential Revision: https://reviews.llvm.org/D71048
*	[MCRegInfo] Add forward sub and super register iterators. (NFC)	Florian Hahn	2019-12-05	1	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds forward iterators mc_difflist_iterator, mc_subreg_iterator and mc_superreg_iterator, based on the existing DiffListIterator. Those are used to provide iterator ranges over sub- and super-register from TRI, which are slightly more convenient than the existing MCSubRegIterator/MCSuperRegIterator. Unfortunately, it duplicates a bit of functionality, but the new iterators are a bit more convenient (and can be used with various existing iterator utilities) and should probably replace the old iterators in the future. This patch updates some existing users. Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D70565
*	Fix the macro fusion table for X86 according to Intel optimization	Shengchen Kan	2019-12-05	2	-171/+254
\| \| \| \| \| \|	manual and add function isMacroFused Differential Revision: https://reviews.llvm.org/D70999
*	[X86] Remove override of shouldUseStrictFP_TO_INT for fp80. NFC	Craig Topper	2019-12-04	2	-9/+0
\| \| \| \| \| \| \| \| \| \|	I suspect this became unnecessary after r354161. Prior to that we may have been going through the default expansion of FP_TO_UINT on 64-bit targets and then ending up back in Custom X86 handling to handle the FP_TO_SINT for it. Now we just Custom handle the FP_TO_UINT directly. We already need to handle it for 32-bit mode during type legalization so we wouldn't save any code by using the default expansion on 64-bit.
*	[X86] Add missing break to the end of the last case in a switch. NFC	Craig Topper	2019-12-04	1	-0/+1
\|
*	Add support for lowering 32-bit/64-bit pointers	Amy Huang	2019-12-04	3	-4/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This follows a previous patch that changes the X86 datalayout to represent mixed size pointers (32-bit sext, 32-bit zext, and 64-bit) with address spaces (https://reviews.llvm.org/D64931) This patch implements the address space cast lowering to the corresponding sign extension, zero extension, or truncate instructions. Related to https://bugs.llvm.org/show_bug.cgi?id=42359 Reviewers: rnk, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69639
*	[X86] Model DAZ and FTZ	Wang, Pengfei	2019-12-04	2	-21/+43
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a follow-up of D70881. It models DAZ and FTZ for releated instructions. Reviewers: craig.topper, RKSimon, andrew.w.kaylor Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70938
*	[X86] Model MXCSR for all AVX512 instructions	Wang, Pengfei	2019-12-04	2	-57/+79
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Model MXCSR for all AVX512 instructions Reviewers: craig.topper, RKSimon, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70881
*	[X86] Model MXCSR for AVX instructions other than AVX512	Wang, Pengfei	2019-12-03	2	-12/+17
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Model MXCSR for AVX instructions other than AVX512 Reviewers: craig.topper, RKSimon Subscribers: hiraditya, llvm-commits, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70875
*	[X86] Add floating point execution domain to ↵	Craig Topper	2019-11-30	2	-49/+74
\| \| \| \|	comi/ucomi/cvtss2si/cvtsd2si/cvttss2si/cvttsd2si/cvtsi2ss/cvtsi2sd instructions.
*	[X86] Add support for STRICT_FP_TO_UINT/SINT from fp128.	Craig Topper	2019-11-27	1	-4/+9
\|
*	[X86] Add SSEPackedSingle/Double execution domain to COMI/UCOMI SSE/AVX ↵	Craig Topper	2019-11-27	2	-32/+34
\| \| \| \|	instructions.
*	[x86] make SLM extract vector element more expensive than default	Sanjay Patel	2019-11-27	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'm not sure what the effect of this change will be on all of the affected tests or a larger benchmark, but it fixes the horizontal add/sub problems noted here: https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc The costs are based on reciprocal throughput numbers in Agner's tables for PEXTR*; these appear to be very slow ops on Silvermont. This is a small step towards the larger motivation discussed in PR43605: https://bugs.llvm.org/show_bug.cgi?id=43605 Also, it seems likely that insert/extract is the source of perf regressions on other CPUs (up to 30%) that were cited as part of the reason to revert D59710, so maybe we'll extend the table-based approach to other subtargets. Differential Revision: https://reviews.llvm.org/D70607
*	[X86] [Win64] Avoid truncating large (> 32 bit) stack allocations	Martin Storsjö	2019-11-27	1	-1/+1
\| \| \| \| \| \| \|	This fixes PR44129, which was broken in a7adc3185b (in 7.0.0 and newer). Differential Revision: https://reviews.llvm.org/D70741
*	[X86] Add strict fp support for operations of X87 instructions	Craig Topper	2019-11-26	3	-19/+47
\| \| \| \| \| \| \| \| \| \|	This is the following patch of D68854. This patch adds basic operations of X87 instructions, including +, -, *, / , fp extensions and fp truncations. Patch by Chen Liu(LiuChen3) Differential Revision: https://reviews.llvm.org/D68857
*	[Codegen][ARM] Add addressing modes from masked loads and stores	David Green	2019-11-26	2	-33/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MVE has a basic symmetry between it's normal loads/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do. To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores addressing modes in the same way as normal loads/stores. This involves: - Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase. - Extending the IndexedModeActions from 8bits to 16bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads. - The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way. - The ARM backend is then adjusted to make use of these indexed masked loads/stores. - The X86 backend is adjusted to hopefully be no functional changes. Differential Revision: https://reviews.llvm.org/D70176
*	[X86][MC] no error diagnostic for out-of-range jrcxz/jecxz/jcxz	Alexey Lapshin	2019-11-26	1	-6/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix for PR24072: X86 instructions jrcxz/jecxz/jcxz performs short jumps if rcx/ecx/cx register is 0 The maximum relative offset for a forward short jump is 127 Bytes (0x7F). The maximum relative offset for a backward short jump is 128 Bytes (0x80). Gnu assembler warns when the distance of the jump exceeds the maximum but llvm-as does not. Patch by Konstantin Belochapka and Alexey Lapshin Differential Revision: https://reviews.llvm.org/D70652
*	[X86] Return Op instead of SDValue() for lowering flags_read/write intrinsics	Craig Topper	2019-11-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Returning SDValue() means we didn't handle it and the common code should try to expand it. But its a target intrinsic so expanding won't do anything and just leave the node alone. But it will print confusing debug messages. By returning Op we tell the common code that the node is legal and shouldn't receive any further processing.
*	[X86] Add support for STRICT_FP_ROUND/STRICT_FP_EXTEND from/to fp128 to/from ↵	Craig Topper	2019-11-25	1	-17/+37
\| \| \| \| \| \| \| \| \| \|	f32/f64/f80 in 64-bit mode. These need to emit a libcall like we do for the non-strict version. 32-bit mode needs to SoftenFloat support to be implemented for strict FP nodes. Differential Revision: https://reviews.llvm.org/D70504
*	[X86] Add proper execution domain information to the avx512vnni instructions.	Craig Topper	2019-11-25	1	-0/+2
\|
*	[X86][SSE] Split off generic isLaneCrossingShuffleMask helper. NFC.	Simon Pilgrim	2019-11-23	1	-3/+14
\| \| \| \|	Avoid MVT dependency which will be needed in a future patch.
*	[X86] Add test cases for most of the constrained fp libcalls with fp128.	Craig Topper	2019-11-21	1	-4/+8
\| \| \| \| \| \| \| \| \| \|	Add explicit setOperation actions for some to match their none strict counterparts. This isn't required, but makes the code self documenting that we didn't forget about strict fp. I've used LibCall instead of Expand since that's more explicitly what we want. Only lrint/llrint/lround/llround are missing now.
*	[X86] Mark fp128 FMA as LibCall instead of Expand. Add STRICT_FMA as well.	Craig Topper	2019-11-21	1	-1/+2
\| \| \| \| \|	The Expand code would fall back to LibCall, but this makes it more explicit.
*	[LegalizeDAG][X86] Add support for turning STRICT_FADD/SUB/MUL/DIV into ↵	Craig Topper	2019-11-21	1	-5/+16
\| \| \| \| \| \| \|	libcalls. Use it for fp128 on x86-64. This requires a minor hack for f32/f64 strict fadd/fsub to avoid turning those into libcalls.
*	[X86] Mark vector STRICT_FADD/STRICT_FSUB as Legal and add mutation to ↵	Craig Topper	2019-11-21	2	-3/+17
\| \| \| \| \| \| \|	X86ISelDAGToDAG The prevents LegalizeVectorOps from scalarizing them. We'll need to remove the X86 mutation code when we add isel patterns.
*	[PGO][PGSO] DAG.shouldOptForSize part.	Hiroshi Yamauchi	2019-11-21	3	-12/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: (Split of off D67120) SelectionDAG::shouldOptForSize changes for profile guided size optimization. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70095
*	[X86] Change legalization action for f128 fadd/fsub/fmul/fdiv from Custom to ↵	Craig Topper	2019-11-21	1	-12/+4
\| \| \| \| \| \| \| \| \| \| \|	LibCall. The custom code just emits a libcall, but we can do the same with generic code. The only difference is that the generic code can form tail calls where the custom code couldn't. This is responsible for the test changes. This avoids needing to modify the Custom handling for strict fp.
*	[cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"	Tom Stellard	2019-11-21	5	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraires: 1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded. With this change, llvm_map_components_to_libnames(LIB_NAMES, "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON. 2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined lib lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future. I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options: - "" (No options) - -DLLVM_BUILD_LLVM_DYLIB=ON - -DLLVM_LINK_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON Reviewers: beanz, smeenai, compnerd, phosek Reviewed By: beanz Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70179
*	[X86] Fix i16->f128 sitofp to promote the i16 to i32 before trying to form a ↵	Craig Topper	2019-11-20	1	-8/+9
\| \| \| \| \| \|	libcall. Previously one of the test cases added here gave an error.