bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Revert "[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to"	Pengfei Wang	2019-05-29	2	-6/+4
\| \| \| \| \| \|	This reverts commit c1b3716614bc0a107e6f41a7d3d503baefad8a5b. llvm-svn: 361918
*	[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to	Pengfei Wang	2019-05-29	2	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	avoid static check fail RegClassOrBank is an object of RegClassOrRegBank, which is defined as using llvm::RegClassOrRegBank = typedef PointerUnion<const TargetRegisterClass , const RegisterBank > so control flow can not get here. Use ""llvm_unreachable" here to avoid "null pointer" confusion. Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62006 Signed-off-by: pengfei <pengfei.wang@intel.com> llvm-svn: 361912
*	[X86] Fix x86-64 call *foo@tlsdesc(%rax) and support R_386_TLSGOTDESC ↵	Fangrui Song	2019-05-29	3	-3/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	R_386_TLS_DESC_CALL D18885 emitted 5 bytes for call foo@tlsdesc(%rax). It should use the 2-byte form instead and let R_X86_64_TLSDESC_CALL apply to the beginning of the call instruction. The 2-byte form was deliberately chosen to make ->LE and ->IE relaxation work: 0: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 7 <.text+0x7> 3: R_X86_64_GOTPC32_TLSDESC a-0x4 7: ff 10 callq (%rax) 7: R_X86_64_TLSDESC_CALL a => 0: 48 c7 c0 fc ff ff ff mov $0xfffffffffffffffc,%rax 7: 66 90 xchg %ax,%ax Also change the symbol type to STT_TLS when VK_TLSCALL or VK_TLSDESC is seen. Reviewed By: compnerd Differential Revision: https://reviews.llvm.org/D62512 llvm-svn: 361910
*	[WebAssembly] Add signatures for RINT builtins	Thomas Lively	2019-05-29	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: azakai, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62564 llvm-svn: 361904
*	[RegUsageInfoCollector] Don't mark as saved registers that don't have ↵	Quentin Colombet	2019-05-28	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	subregister lanes To determine the list of clobbered registers, the RegUsageInfoCollector pass uses the list of callee saved registers provided by the target and then augments it with the list of registers which have all their subregisters saved. It then basically does the difference between all the registers and the saved registers to come up with what is clobbered (plus it checks that the register is defined within that functions). The patch fixes a bug where when register does not have any subregister lane, hence when checking if any of its subregister are not saved, we would find none and think the register is saved as well. That's obviously wrong. The code was actually kind of checking for something like that with the CoveredBySubRegs bit. What this bit says is that a register is completely covered by its subregisters. We required that this bit was set, to check that a register was saved by its subregister lanes, since without this bit, we potentially would miss to check some part of the register. However, this bit is used de facto on registers that don't have any subregisters (e.g., on ARM) and the code was not prepared for that. This patch fixes this by checking that a register has subregisters before declaring it saved when none of its lanes are modified. llvm-svn: 361901
*	[ORC] Track JIT symbol states more explicitly.	Lang Hames	2019-05-28	2	-132/+105
\| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to this patch, JITDylibs inferred symbol states (whether a symbol was newly added, materializing, resolved, or ready to run) via a combination of (1) bits in the JITSymbolFlags member, and (2) the state of some internal JITDylib data structures. This patch explicitly tracks symbol states by adding a new SymbolState member to the symbol table entries, and removing the 'Lazy' and 'Materializing' bits from JITSymbolFlags. This is a first step towards adding additional states representing initialization phases (e.g. eh-frame registration, registration with the language runtime, and static initialization). llvm-svn: 361899
*	[AArch64][GlobalISel] Select FCMPSri/FCMPDri when comparing against 0.0	Jessica Paquette	2019-05-28	1	-13/+27
\| \| \| \| \| \| \| \| \| \| \|	Add support for selecting FCMPSri and FCMPDri when comparing against 0.0, and factor out opcode selection for G_FCMP into its own function. Add a test to show that we don't do this with other immediates. Differential Revision: https://reviews.llvm.org/D62539 llvm-svn: 361888
*	[WebAssembly] Support for atomic fences	Heejin Ahn	2019-05-28	3	-4/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds support for translation of LLVM IR fence instruction. We convert a singlethread fence to a pseudo compiler barrier which becomes 0 instructions in final binary, and a thread fence to an idempotent atomicrmw instruction to a memory address. Reviewers: dschuff, jfb, sunfish, tlively Subscribers: sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D50277 llvm-svn: 361884
*	[PGO] Handle cases of failing to split critical edges	Rong Xu	2019-05-28	1	-44/+56
\| \| \| \| \| \| \| \| \| \| \|	Fix PR41279 where critical edges to EHPad are not split. The fix is to not instrument those critical edges. We used to be able to know the size of counters right after MST is computed. With this, we have to pre-collect the instrument BBs to know the size, and then instrument them. Differential Revision: https://reviews.llvm.org/D62439 llvm-svn: 361882
*	Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata ↵	Nikita Popov	2019-05-28	1	-61/+56
\| \| \| \| \| \| \| \| \| \| \| \|	handling for SwitchInst" This reverts commit 53f2f3286572cb879b3861d7c15480e4d830dd3b. As reported on D62126, this causes assertion failures if the switch has incorrect branch_weights metadata, which may happen as a result of other transforms not handling it correctly yet. llvm-svn: 361881
*	AMDGPU: Temporary drop s_mul_hi_i/u32 patterns	Konstantin Zhuravlyov	2019-05-28	1	-6/+2
\| \| \| \| \| \| \| \|	It introduces performance regressions in several applications. This has already been submitted downstream. llvm-svn: 361879
*	[AArch64] Handle ISD::LRINT and ISD::LLRINT	Adhemerval Zanella	2019-05-28	2	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	This patch optimizes ISD::LRINT and ISD::LLRINT to frintx plus fcvtzs. It currently only handles the scalar version. Reviewed By: SjoerdMeijer, mstorsjo Differential Revision: https://reviews.llvm.org/D62018 llvm-svn: 361877
*	[CodeGen] Add lrint/llrint builtins	Adhemerval Zanella	2019-05-28	9	-2/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch add the ISD::LRINT and ISD::LLRINT along with new intrinsics. The changes are straightforward as for other floating-point rounding functions, with just some adjustments required to handle the return value being an interger. The idea is to optimize lrint/llrint generation for AArch64 in a subsequent patch. Current semantic is just route it to libm symbol. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D62017 llvm-svn: 361875
*	[DAGCombine] (x - C) - y -> (x - y) - C fold. Try 2	Roman Lebedev	2019-05-28	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Again only vectors affected. Frustrating. Let me take a look into that.. https://rise4fun.com/Alive/AAq This is a recommit, originally committed in rL361856, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62294 llvm-svn: 361874
*	[DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x ↵	Roman Lebedev	2019-05-28	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fold. Try 2 Summary: This prevents regressions in next patch, and somewhat recovers from the regression to AMDGPU test in D62223. It is indeed not great that we leave vector decrement, don't transform it into vector add all-ones.. https://rise4fun.com/Alive/ZRl This is a recommit, originally committed in rL361855, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel, arsenm Reviewed By: RKSimon, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62263 llvm-svn: 361873
*	[DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C ↵	Roman Lebedev	2019-05-28	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fold. Try 2 Summary: Direct sibling of D62223 patch. While i don't have a direct motivational pattern for this, it would seem to make sense to handle both patterns (or none), for symmetry? The aarch64 changes look neutral; sparc and systemz look like improvement (one less instruction each); x86 changes - 32bit case improves, 64bit case shows that LEA no longer gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea` https://rise4fun.com/Alive/ffh This is a recommit, originally committed in rL361853, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62252 llvm-svn: 361872
*	[DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 2	Roman Lebedev	2019-05-28	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The main motivation is shown by all these `neg` instructions that are now created. In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test. AArch64 test changes all look good (`neg` created), or neutral. X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created). I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill is now hoisted into preheader (which should still be good?), 2 4-byte reloads become 1 8-byte reload, and are elsewhere, but i'm not sure how that affects that loop. I'm unable to interpret AMDGPU change, looks neutral-ish? This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 \| PR41952 ]]. https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later) This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs. Reviewers: craig.topper, RKSimon, spatel, arsenm Reviewed By: RKSimon Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62223 llvm-svn: 361871
*	[AMDGPU] Correct the handling of inlineasm output registers.	Michael Liao	2019-05-28	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - There's a regression due to the cross-block RC assignment. Use the proper way to derive the output register RC in inline asm. Reviewers: rampitec, alex-t Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, eraman, hiraditya, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D62537 llvm-svn: 361868
*	Revert DAGCombine "hoist binop with const" folds	Roman Lebedev	2019-05-28	1	-42/+0
\| \| \| \| \| \| \| \| \| \|	Appear to introduce test-suite compile-time hang. http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/22825 This reverts r361852,r361853,r361854,r361855,r361856 llvm-svn: 361865
*	[InstCombine] Clean up saturing math overflow optimizations; NFC	Nikita Popov	2019-05-28	1	-29/+20
\| \| \| \| \| \| \|	Reduce duplication and make it easier to handle signed always-overflows conditions in the future. llvm-svn: 361863
*	[ValueTracking][ConstantRange] Distinguish low/high always overflow	Nikita Popov	2019-05-28	4	-14/+17
\| \| \| \| \| \| \| \| \| \| \| \|	In order to fold an always overflowing signed saturating add/sub, we need to know in which direction the always overflow occurs. This patch splits up AlwaysOverflows into AlwaysOverflowsLow and AlwaysOverflowsHigh to pass through this information (but it is not used yet). Differential Revision: https://reviews.llvm.org/D62463 llvm-svn: 361858
*	[IR] Add SaturatingInst and BinaryOpIntrinsic classes	Nikita Popov	2019-05-28	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	Based on the suggestion in D62447, this adds a SaturatingInst class that represents the saturating add/sub family of intrinsics. It exposes the same interface as WithOverflowInst, for this reason I have also added a common base class BinaryOpIntrinsic that holds the actual implementation code and will be useful in some places handling both overflowing and saturating math. Differential Revision: https://reviews.llvm.org/D62466 llvm-svn: 361857
*	[DAGCombine] (x - C) - y -> (x - y) - C fold	Roman Lebedev	2019-05-28	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Again only vectors affected. Frustrating. Let me take a look into that.. https://rise4fun.com/Alive/AAq Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62294 llvm-svn: 361856
*	[DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold	Roman Lebedev	2019-05-28	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This prevents regressions in next patch, and somewhat recovers from the regression to AMDGPU test in D62223. It is indeed not great that we leave vector decrement, don't transform it into vector add all-ones.. https://rise4fun.com/Alive/ZRl Reviewers: RKSimon, craig.topper, spatel, arsenm Reviewed By: RKSimon, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62263 llvm-svn: 361855
*	[DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold	Roman Lebedev	2019-05-28	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Only vector tests are being affected here, since subtraction by scalar constant is rewritten as addition by negated constant. No surprising test changes. https://rise4fun.com/Alive/pbT Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62257 llvm-svn: 361854
*	[DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold	Roman Lebedev	2019-05-28	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Direct sibling of D62223 patch. While i don't have a direct motivational pattern for this, it would seem to make sense to handle both patterns (or none), for symmetry? The aarch64 changes look neutral; sparc and systemz look like improvement (one less instruction each); x86 changes - 32bit case improves, 64bit case shows that LEA no longer gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea` https://rise4fun.com/Alive/ffh Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62252 llvm-svn: 361853
*	[DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold	Roman Lebedev	2019-05-28	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The main motivation is shown by all these `neg` instructions that are now created. In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test. AArch64 test changes all look good (`neg` created), or neutral. X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created). I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill is now hoisted into preheader (which should still be good?), 2 4-byte reloads become 1 8-byte reload, and are elsewhere, but i'm not sure how that affects that loop. I'm unable to interpret AMDGPU change, looks neutral-ish? This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 \| PR41952 ]]. https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later) Reviewers: craig.topper, RKSimon, spatel, arsenm Reviewed By: RKSimon Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62223 llvm-svn: 361852
*	Revert "[x86] split 256-bit store of concatenated vectors"	Sanjay Patel	2019-05-28	1	-11/+0
\| \| \| \| \| \| \| \| \|	This reverts commit d5a8637072f4c556b88156bd2f6237a2ead47d31. Most likely suspect for this bot failure: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/9684 llvm-svn: 361850
*	AMDGPU: Don't enable all lanes with non-CSR VGPR spills	Matt Arsenault	2019-05-28	1	-39/+49
\| \| \| \| \| \| \| \|	If the only VGPRs used for SGPR spilling were not CSRs, this was enabling all laness and immediately restoring exec. This is the usual situation in leaf functions. llvm-svn: 361848
*	[AMDGPU] Fix the mis-handling of `vreg_1` copied from scalar register.	Michael Liao	2019-05-28	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Don't treat the use of a scalar register as `vreg_1` an VGPR usage. Otherwise, that promotes that scalar register into vector one, which breaks the assumption that scalar register holds the lane mask. - The issue is triggered in a complicated case, where if the uses of that (lane mask) scalar register is legalized firstly before its definition, e.g., due to the mismatch block placement and its topological order or loop. In that cases, the legalization of PHI introduces the use of that scalar register as `vreg_1`. Reviewers: rampitec, nhaehnle, arsenm, alex-t Subscribers: kzhuravl, jvesely, wdng, dstuttard, tpr, t-tye, hiraditya, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D62492 llvm-svn: 361847
*	[ARM] Replace fp-only-sp and d16 with fp64 and d32.	Simon Tatham	2019-05-28	19	-195/+245
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845
*	[AArch64] Delete unused VariantKind in AArch64MCExpr	Fangrui Song	2019-05-28	2	-4/+1
\| \| \| \|	llvm-svn: 361844
*	[X86-64] Fix 256-bit SET0 lowering for non-VLX targets	David Greene	2019-05-28	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	If we don't have VLX then 256-bit SET0 should be lowered to VPXOR with ZMM registers. This restores functionality accidentally removed by r309926. Differential Revision: https://reviews.llvm.org/D62415 llvm-svn: 361843
*	llvm-undname: Support demangling char8_t	Nico Weber	2019-05-28	2	-0/+3
\| \| \| \| \| \|	Ports clang's mangling support added in r354633 to llvm-undname. llvm-svn: 361839
*	llvm-undname: Add support for local static thread guards	Nico Weber	2019-05-28	2	-3/+9
\| \| \| \|	llvm-svn: 361835
*	[XCOFF] Implement parsing symbol table for xcoffobjfile and output as yaml ↵	Jason Liu	2019-05-28	2	-39/+230
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	format Summary: This patch implement parsing symbol table for xcoffobjfile and output as yaml format. Parsing auxiliary entries of a symbol will be in a separate patch. The XCOFF object file (aix_xcoff.o) used in the test comes from -bash-4.2$ cat test.c extern int i; extern int TestforXcoff; int main() { i++; TestforXcoff--; } Patch by DiggerLin Reviewers: sfertile, hubert.reinterpretcast, MaskRay, daltenty Differential Revision: https://reviews.llvm.org/D61532 llvm-svn: 361832
*	[x86] split 256-bit store of concatenated vectors	Sanjay Patel	2019-05-28	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This shows up as a side issue to the main problem for the AVX target example from PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 - https://godbolt.org/z/7tpRa3 But as we can see in the pile of existing test diffs, it's actually a widespread problem that affects any AVX or later target. Apart from a couple of oddballs, I think these are all improvements for the reasons stated in the code comment: we do not want to enable YMM unnecessarily (avoid vzeroupper and frequency throttling) and some cores split 256-bit stores anyway. We could say that MergeConsecutiveStores() is going overboard on some of these examples, but that won't solve the problem completely. But that is the reason I'm proposing this as a lowering rather than a combine: we will infinite loop fighting the merge code if we try this earlier. Differential Revision: https://reviews.llvm.org/D62498 llvm-svn: 361822
*	[DAG] LegalizeVectorTypes - reduce scope of local variables. NFCI.	Simon Pilgrim	2019-05-28	1	-4/+2
\| \| \| \| \| \|	Move the element index/count variables into the block where they are actually used - appeases cppcheck and helps avoid shadow variable warnings. llvm-svn: 361821
*	Stop undef fragments from closing non-overlapping fragments	David Stenberg	2019-05-28	2	-14/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When DwarfDebug::buildLocationList() encountered an undef debug value, it would truncate all open values, regardless if they were overlapping or not. This patch fixes so that it only does that for overlapping fragments. This change unearthed a bug that I had introduced in D57511, which I have fixed in this patch. The code in DebugHandlerBase that changes labels for parameter debug values could break DwarfDebug's assumption that the labels for the entries in the debug value history are monotonically increasing. Before this patch, that bug could result in location list entries whose ending address was lower than the beginning address, and with the changes for undef debug values that this patch introduces it could trigger an assertion, due to attempting to emit location list entries with empty ranges. A reproducer for the bug is added in param-reg-const-mix.mir. Reviewers: aprantl, jmorse, probinson Reviewed By: aprantl Subscribers: javed.absar, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D62379 llvm-svn: 361820
*	MIR: Fix printer crashing on dead CSR frame indexes	Matt Arsenault	2019-05-28	1	-0/+3
\| \| \| \|	llvm-svn: 361819
*	[x86] fix 256-bit vector store splitting to honor 'volatile'	Sanjay Patel	2019-05-28	1	-14/+30
\| \| \| \| \| \| \| \| \| \| \|	Forking this out of the discussion in D62498 (and assuming that will be committed later, so adding the helper function here). The LangRef says: "the backend should never split or merge target-legal volatile load/store instructions." Differential Revision: https://reviews.llvm.org/D62506 llvm-svn: 361815
*	[X86] Custom lower CONCAT_VECTORS of v2i1	Benjamin Kramer	2019-05-28	2	-7/+3
\| \| \| \| \| \| \|	The generic legalizer cannot handle this. Add an assert instead of silently miscompiling vectors with elements smaller than 8 bits. llvm-svn: 361814
*	[NFC] Test commit, delete trailing whitespace	Graham Hunter	2019-05-28	1	-1/+1
\| \| \| \|	llvm-svn: 361813
*	Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: ↵	Hans Wennborg	2019-05-28	1	-14/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also sink function calls without used results (PR41259)" This was reverted in r360086 as it was supected of causing mysterious test failures internally. However, it was never concluded that this patch was the root cause. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 361811
*	[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for ↵	Yevgeny Rouban	2019-05-28	1	-56/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	SwitchInst This patch fixes the CorrelatedValuePropagation pass to keep prof branch_weights metadata of SwitchInst consistent. It makes use of SwitchInstProfUpdateWrapper. New tests are added. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D62126 llvm-svn: 361808
*	[X86] X86CmovConverterPass::collectCmovCandidates - fix uninitialized ↵	Simon Pilgrim	2019-05-28	1	-1/+2
\| \| \| \| \| \|	variable warnings. NFCI. llvm-svn: 361804
*	[AArch64][SVE2] Asm: support SVE2 Floating Point Convert Group	Cullen Rhodes	2019-05-28	2	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Patch adds support for the following intructions: SVE2 floating-point convert precision: * FCVTXNT, FCVTNT, FCVTLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62382 llvm-svn: 361801
*	[AArch64][SVE2] Asm: support SVE2 Crypto Extensions Group	Cullen Rhodes	2019-05-28	2	-0/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Patch adds support for the following instructions: SVE2 crypto constructive binary operations: * SM4EKEY, RAX1 SVE2 crypto destructive binary operations: * AESE, AESD, SM4E SVE2 crypto unary operations: * AESMC, AESIMC AESE, AESD, AESMC and AESIMC are enabled with +sve2-aes. SM4E and SM4EKEY are enabled with +sve2-sm4. RAX1 is enabled with +sve2-sha3. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62307 llvm-svn: 361797
*	[AArch64][SVE2] Asm: support SVE2 Histogram Computation Groups	Cullen Rhodes	2019-05-28	2	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Patch adds support for the following instructions: SVE2 histogram generation (segment): * HISTSEG SVE2 histogram generation (vector): * HISTCNT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62306 llvm-svn: 361796
*	[AArch64][SVE2] Asm: support SVE2 Misc Group	Cullen Rhodes	2019-05-28	2	-0/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Patch adds support for the following instructions: SVE2 bitwise exclusive-or interleaved: * EORBT, EORTB SVE2 bitwise permute: * BEXT, BDEP, BGRP SVE2 bitwise shift left long: * SSHLLB, SSHLLT, USHLLB, USHLLT SVE2 integer add/subtract interleaved long: * SADDLBT, SSUBLBT, SSUBLTB BDEP, BEXT and BGRP are enabled with SVE2 feature +bitperm, all other instructions in this group are enabled with +sve2. Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62304 llvm-svn: 361795