bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[AMDGPU] Fix for branch offset hardware workaround	Ryan Taylor	2019-06-26	7	-24/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes a hardware bug that makes a branch offset of 0x3f unsafe. This replaces the 32 bit branch with offset 0x3f to a 64 bit instruction that includes the same 32 bit branch and the encoding for a s_nop 0 to follow. The relaxer than modifies the offsets accordingly. Change-Id: I10b7aed99d651f8159401b01bb421f105fa6288e Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63494 llvm-svn: 364451
*	Allow matching extend-from-memory with strict FP nodes	Ulrich Weigand	2019-06-26	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This implements a small enhancement to https://reviews.llvm.org/D55506 Specifically, while we were able to match strict FP nodes for floating-point extend operations with a register as source, this did not work for operations with memory as source. That is because from regular operations, this is represented as a combined "extload" node (which is a variant of a load SD node); but there is no equivalent using a strict FP operation. However, it turns out that even in the absence of an extload node, we can still just match the operations explicitly, e.g. (strict_fpextend (f32 (load node:$ptr)) This patch implements that method to match the LDEB/LXEB/LXDB SystemZ instructions even when the extend uses a strict-FP node. llvm-svn: 364450
*	[WebAssembly] Omit wrap on i64x2.{shl,shr*} ISel when possible	Thomas Lively	2019-06-26	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Since the WebAssembly SIMD shift instructions take i32 operands, we truncate the i64 operand to <2 x i64> shifts during ISel. When the i64 operand is sign extended from i32, this CL makes it so the sign extension is dropped instead of a wrap instruction added. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63615 llvm-svn: 364446
*	[WebAssembly] Implement tail calls and unify tablegen call classes	Thomas Lively	2019-06-26	9	-151/+186
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implements direct and indirect tail calls enabled by the 'tail-call' feature in both DAG ISel and FastISel. Updates existing call tests and adds new tests including a binary encoding test. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62877 llvm-svn: 364445
*	[X86][SSE] X86TargetLowering::isCommutativeBinOp - add PMULDQ	Simon Pilgrim	2019-06-26	1	-3/+1
\| \| \| \| \| \|	Allows narrowInsertExtractVectorBinOp to reduce vector size instead of the more restricted SimplifyDemandedVectorEltsForTargetNode llvm-svn: 364434
*	[X86][SSE] X86TargetLowering::isCommutativeBinOp - add PCMPEQ	Simon Pilgrim	2019-06-26	1	-0/+1
\| \| \| \| \| \|	Allows narrowInsertExtractVectorBinOp to reduce vector size llvm-svn: 364432
*	[X86][SSE] X86TargetLowering::isBinOp - add PCMPGT	Simon Pilgrim	2019-06-26	1	-0/+1
\| \| \| \| \| \|	Allows narrowInsertExtractVectorBinOp to reduce vector size llvm-svn: 364431
*	[X86] shouldScalarizeBinop - never scalarize target opcodes.	Simon Pilgrim	2019-06-26	1	-2/+9
\| \| \| \| \| \|	We have (almost) no target opcodes that have scalar/vector equivalents - for now assume we can't scalarize them (we can add exceptions if we need to). llvm-svn: 364429
*	AMDGPU: Fix unused variable	Matt Arsenault	2019-06-26	1	-1/+0
\| \| \| \|	llvm-svn: 364426
*	AMDGPU: Check MRI for callee saved regs instead of TRI	Matt Arsenault	2019-06-26	4	-7/+5
\| \| \| \| \| \| \|	This should the same, but MRI does allow dynamically changing the CSR set, although currently not used. llvm-svn: 364425
*	[X86][Codegen] X86DAGToDAGISel::matchBitExtract(): consistently capture ↵	Roman Lebedev	2019-06-26	1	-7/+6
\| \| \| \| \| \|	lambdas by value llvm-svn: 364420
*	[X86] X86DAGToDAGISel::matchBitExtract(): pattern c: truncation awareness	Roman Lebedev	2019-06-26	1	-8/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The one thing of note here is that the 'bitwidth' constant (32/64) was previously pessimistic. Given `x & (-1 >> (C - z))`, we were taking `C` to be `bitwidth(x)`, but in reality we want `(-1 >> (C - z))` pattern to mean "low z bits must be all-ones". And for that, `C` should be `bitwidth(-1 >> (C - z))`, i.e. of the shift operation itself. Last pattern D does not seem to exhibit any of these truncation issues. Although it has the opposite problem - if we extract low bits (no shift) from i64, and then truncate to i32, then we fail to shrink this 64-bit extraction into 32-bit extraction. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62806 llvm-svn: 364419
*	[X86] X86DAGToDAGISel::matchBitExtract(): pattern b: truncation awareness	Roman Lebedev	2019-06-26	1	-5/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: (Not so) boringly identical to pattern a (D62786) Not yet sure how do deal with the last pattern c. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62793 llvm-svn: 364418
*	[X86] X86DAGToDAGISel::matchBitExtract(): pattern a: truncation awareness	Roman Lebedev	2019-06-26	1	-18/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Finally tying up loose ends here. The problem is quite simple: If we have pattern `(x >> start) & (1 << nbits) - 1`, and then truncate the result, that truncation will be propagated upwards, into the `and`. And that isn't currently handled. I'm only fixing pattern `a` here, the same fix will be needed for patterns `b`/`c` too. I think this isn't missing any extra legality checks, since we only look past truncations. Similary, i don't think we can get any other truncation there other than i64->i32. Reviewers: craig.topper, RKSimon, spatel Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62786 llvm-svn: 364417
*	Fix the build after r364401	Hans Wennborg	2019-06-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It was failing with: /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66: error: call of overloaded 'makeArrayRef(<brace-enclosed initializer list>)' is ambiguous scaleShuffleMask<int>(Scale, makeArrayRef<int>({ 0, 2, 1, 3 }), Mask); ^ /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66: note: candidates are: In file included from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/MachineFunction.h:20:0, from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/CallingConvLower.h:19, from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.h:17, from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:14: /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:480:15: note: llvm::ArrayRef<T> llvm::makeArrayRef(const std::vector<_RealType>&) [with T = int] ArrayRef<T> makeArrayRef(const std::vector<T> &Vec) { ^ /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:485:37: note: llvm::ArrayRef<T> llvm::makeArrayRef(const llvm::ArrayRef<T>&) [with T = int] template <typename T> ArrayRef<T> makeArrayRef(const ArrayRef<T> &Vec) { ^ llvm-svn: 364414
*	[X86][AVX] combineExtractSubvector - 'little to big' ↵	Simon Pilgrim	2019-06-26	1	-0/+28
\| \| \| \| \| \| \| \|	extract_subvector(bitcast()) support Ideally this needs to be a generic combine in DAGCombiner::visitEXTRACT_SUBVECTOR but there's some nasty regressions in aarch64 due to neon shuffles not handling bitcasts at all..... llvm-svn: 364407
*	[ARM] Handle fixup_arm_pcrel_9 correctly on big-endian targets	Mikhail Maltsev	2019-06-26	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The getFixupKindContainerSizeBytes function returns the size of the instruction containing a given fixup. Currently fixup_arm_pcrel_9 is not handled in this function, this causes an assertion failure in the debug build and incorrect codegen in the release build. This patch fixes the problem. Reviewers: ostannard, simon_tatham Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63778 llvm-svn: 364404
*	[RISCV] Add pseudo instruction for calls with explicit register	Lewis Revill	2019-06-26	4	-11/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the PseudoCALLReg instruction which allows using an explicit register operand as the destination for the return address. GCC can successfully parse this form of the call instruction, which would be used for calls to functions which do not use ra as the return address register, such as the __riscv_save libcalls. This patch forms the first part of an implementation of -msave-restore for RISC-V. Differential Revision: https://reviews.llvm.org/D62685 llvm-svn: 364403
*	[X86][AVX] truncateVectorWithPACK - avoid bitcasted shuffles	Simon Pilgrim	2019-06-26	1	-2/+5
\| \| \| \| \| \| \| \|	truncateVectorWithPACK is often used in conjunction with ComputeNumSignBits which struggles when peeking through bitcasts. This fix tries to avoid bitcast(shuffle(bitcast())) patterns in the 256-bit 64-bit sublane shuffles so we can still see through at least until lowering when the shuffles will need to be bitcasted to widen the shuffle type. llvm-svn: 364401
*	[ExpandMemCmp] Honor prefer-vector-width.	Clement Courbet	2019-06-26	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: gchatelet, echristo, spatel, atdt Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63769 llvm-svn: 364384
*	[PowerPC] Fixed missing change flag of emitRLDICWhenLoweringJumpTables	Kai Luo	2019-06-26	1	-9/+10
\| \| \| \| \| \| \| \| \|	PPCMIPeephole::emitRLDICWhenLoweringJumpTables should return a bool value to indicate optimization is conducted or not. Differential Revision: https://reviews.llvm.org/D63801 llvm-svn: 364383
*	[ARM] Fix -Wimplicit-fallthrough after D60709/r364331	Fangrui Song	2019-06-26	1	-4/+3
\| \| \| \|	llvm-svn: 364376
*	[PowerPC] Mark FCOPYSIGN legal for FP vectors	Nemanja Ivanovic	2019-06-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	This was just an omission in the back end. We have had the instructions for both single and double precision for a few HW generations, but never got around to legalizing these. Differential revision: https://reviews.llvm.org/D63634 llvm-svn: 364373
*	[PowerPC][NFC] Move peephole optimization of RLDICR into a method.	Kai Luo	2019-06-26	1	-47/+57
\| \| \| \|	llvm-svn: 364372
*	[WebAssembly] Remove catch_all from AsmParser	Heejin Ahn	2019-06-25	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: `catch_all` is from the first version of EH proposal and now has been removed. There were no tests covering this, and thus no tests to remove or fix. Reviewers: aardappel Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63737 llvm-svn: 364360
*	Don't look for the TargetFrameLowering in the implementation	Matt Arsenault	2019-06-25	4	-8/+4
\| \| \| \| \| \|	The same oddity was apparently copy-pasted between multiple targets. llvm-svn: 364349
*	Update phis in AMDGPUUnifyDivergentExitNodes	Diego Novillo	2019-06-25	1	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Original patch https://reviews.llvm.org/D63659 from Steven Perron <stevenperron@google.com> The pass AMDGPUUnifyDivergentExitNodes does not update the phi nodes in the successors of blocks that is splits. This is fixed by calling BasicBlock::splitBasicBlock to split the block instead of doing it manually. This does extra work because a new conditional branch is created in BB which is immediately replaced, but I think the simplicity is worth it. It also helps make the code more future proof in case other things need to be updated. llvm-svn: 364342
*	[AMDGPU] Removed dead SIMachineFunctionInfo::getWorkItemIDVGPR()	Stanislav Mekhanoshin	2019-06-25	2	-20/+0
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D63780 llvm-svn: 364339
*	[X86] Remove isel patterns that look for (vzext_movl (scalar_to_vector (load)))	Craig Topper	2019-06-25	3	-99/+0
\| \| \| \| \| \| \| \|	I believe these all get canonicalized to vzext_movl. The only case where that wasn't true was when the load was loadi32 and the load was an extload aligned to 32 bits. But that was fixed in r364207. Differential Revision: https://reviews.llvm.org/D63701 llvm-svn: 364337
*	[X86] Add a DAG combine to turn vzmovl+load into vzload if the load isn't ↵	Craig Topper	2019-06-25	3	-24/+20
\| \| \| \| \| \| \| \| \| \| \| \|	volatile. Remove isel patterns for vzmovl+load We currently have some isel patterns for treating vzmovl+load the same as vzload, but that shrinks the load which we shouldn't do if the load is volatile. Rather than adding isel checks for volatile. This patch removes the patterns and teachs DAG combine to merge them into vzload when its legal to do so. Differential Revision: https://reviews.llvm.org/D63665 llvm-svn: 364333
*	[ARM] Support inline assembler constraints for MVE.	Simon Tatham	2019-06-25	1	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"To" selects an odd-numbered GPR, and "Te" an even one. There are some 8.1-M instructions that have one too few bits in their register fields and require registers of particular parity, without necessarily using a consecutive even/odd pair. Also, the constraint letter "t" should select an MVE q-register, when MVE is present. This didn't need any source changes, but some extra tests have been added. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60709 llvm-svn: 364331
*	[AVR] Adjust to Register class change	Ayke van Laethem	2019-06-25	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	A refactor in r364191 changed register types from an unsigned int to the llvm:Register class. Adjust the AVR backend to this change. This fixes build errors when building with the experimental AVR backend enabled. Differential Revision: https://reviews.llvm.org/D63776 llvm-svn: 364330
*	[ARM] Code-generation infrastructure for MVE.	Simon Tatham	2019-06-25	7	-17/+306
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This provides the low-level support to start using MVE vector types in LLVM IR, loading and storing them, passing them to __asm__ statements containing hand-written MVE vector instructions, and if you have the hard-float ABI turned on, using them as function parameters. (In the soft-float ABI, vector types are passed in integer registers, and combining all those 32-bit integers into a q-reg requires support for selection DAG nodes like insert_vector_elt and build_vector which aren't implemented yet for MVE. In fact I've also had to add `arm_aapcs_vfpcc` to a couple of existing tests to avoid that problem.) Specifically, this commit adds support for: * spills, reloads and register moves for MVE vector registers * ditto for the VPT predication mask that lives in VPR.P0 * make all the MVE vector types legal in ISel, and provide selection DAG patterns for BITCAST, LOAD and STORE * make loads and stores of scalar FP types conditional on `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few existing tests needed their llc command lines updating to use `-mattr=-fpregs` as their method of turning off all hardware FP support. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60708 llvm-svn: 364329
*	[PPC32] Support PLT calls for -msecure-plt -fpic	Fangrui Song	2019-06-25	3	-34/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In Secure PLT ABI, -fpic is similar to -fPIC. The differences are that: * -fpic stores the address of _GLOBAL_OFFSET_TABLE_ in r30, while -fPIC stores .got2+0x8000. * -fpic uses an addend of 0 for R_PPC_PLTREL24, while -fPIC uses 0x8000. Reviewers: hfinkel, jhibbits, joerg, nemanjai, spetrovic Reviewed By: jhibbits Subscribers: adalava, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63563 llvm-svn: 364324
*	[ARM] Fix for DLS/LE CodeGen	Sam Parker	2019-06-25	1	-8/+9
\| \| \| \| \| \| \| \| \|	The expensive buildbots highlighted the mir tests were broken, which I've now updated and added --verify-machineinstrs to them. This also uncovered a couple of bugs in the backend pass, so these have also been fixed. llvm-svn: 364323
*	[AMDGPU] Null checking on TS to avoid crashing in clang tests.	Michael Liao	2019-06-25	1	-1/+2
\| \| \| \| \| \| \|	- `test/Misc/backend-resource-limit-diagnostics.cl` crashes as null streamer is used. llvm-svn: 364318
*	[X86] lowerShuffleAsSpecificZeroOrAnyExtend - add ANY_EXTEND TODO.	Simon Pilgrim	2019-06-25	1	-0/+1
\| \| \| \| \| \|	lowerShuffleAsSpecificZeroOrAnyExtend should be able to lower to ANY_EXTEND_VECTOR_INREG as well as ZER_EXTEND_VECTOR_INREG. llvm-svn: 364313
*	[ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D60692	Fangrui Song	2019-06-25	1	-0/+1
\| \| \| \|	llvm-svn: 364312
*	AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT	Matt Arsenault	2019-06-25	3	-5/+135
\| \| \| \|	llvm-svn: 364308
*	[ARM] Fix buildbot failure due to -Werror.	Simon Tatham	2019-06-25	1	-1/+0
\| \| \| \| \| \| \| \|	Including both 'case ARM_AM::uxtw' and 'default' in the getShiftOp switch caused a buildbot to fail with error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default] llvm-svn: 364300
*	[ARM] MVE VPT Blocks	Sjoerd Meijer	2019-06-25	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \|	A minor iteration on the MVE VPT Block pass to enable more efficient VPT Block code generation: consecutive VPT predicated statements, predicated on the same condition, will be placed within the same VPT Block. This essentially is also an exercise to write some more tests for the next step, which should be more generic also merging instructions when they are not consecutive. Differential Revision: https://reviews.llvm.org/D63711 llvm-svn: 364298
*	AMDGPU: Write LDS objects out as global symbols in code generation	Nicolai Haehnle	2019-06-25	9	-14/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then expected to resolve relocations, which are also emitted. Initially disabled for HSA and PAL environments until they have caught up in terms of linker and runtime loader. Some notes: - The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered to a constant at compile times, which means some tests can no longer be applied. The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now. - We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check. Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275 Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61494 llvm-svn: 364297
*	AMDGPU/MC: Add .amdgpu_lds directive	Nicolai Haehnle	2019-06-25	3	-0/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The directive defines a symbol as an group/local memory (LDS) symbol. LDS symbols behave similar to common symbols for the purposes of ELF, using the processor-specific SHN_AMDGPU_LDS as section index. It is the linker and/or runtime loader's job to "instantiate" LDS symbols and resolve relocations that reference them. It is not possible to initialize LDS memory (not even zero-initialize as for .bss). We want to be able to link together objects -- starting with relocatable objects, but possible expanding to shared objects in the future -- that access LDS memory in a flexible way. LDS memory is in an address space that is entirely separate from the address space that contains the program image (code and normal data), so having program segments for it doesn't really make sense. Furthermore, we want to be able to compile multiple kernels in a compilation unit which have disjoint use of LDS memory. In that case, we may want to place LDS symbols differently for different kernels to save memory (LDS memory is very limited and physically private to each kernel invocation), so we can't simply place LDS symbols in a .lds section. Hence this solution where LDS symbols always stay undefined. Change-Id: I08cbc37a7c0c32f53f7b6123aa0afc91dbc1748f Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61493 llvm-svn: 364296
*	[ARM] Explicit lowering of half <-> double conversions.	Simon Tatham	2019-06-25	2	-15/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an FP_EXTEND or FP_ROUND isel dag node converts directly between f16 and f32 when the target CPU has no instruction to do it in one go, it has to be done in two steps instead, going via f32. Previously, this was done implicitly, because all such CPUs had the storage-only implementation of f16 (i.e. the only thing you can do with one at all is to convert it to/from f32). So isel would legalize the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp node (or vice versa), and then the fp_extend would already be f32->f64 rather than f16->f64. But that technique can't support a target CPU which has full f16 support but _not_ f64, such as some variants of Arm v8.1-M. So now we provide custom lowering for FP_EXTEND and FP_ROUND, which checks support for f16 and f64 and decides on the best thing to do given the combination of flags it gets back. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60692 llvm-svn: 364294
*	[ARM] Add remaining miscellaneous MVE instructions.	Simon Tatham	2019-06-25	4	-22/+164
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This final batch includes the tail-predicated versions of the low-overhead loop instructions (LETP); the VPSEL instruction to select between two vector registers based on the predicate mask without having to open a VPT block; and VPNOT which complements the predicate mask in place. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62681 llvm-svn: 364292
*	[ARM] Add MVE vector load/store instructions.	Simon Tatham	2019-06-25	13	-60/+1049
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds the rest of the vector memory access instructions. It includes contiguous loads/stores, with an ordinary addressing mode such as [r0,#offset] (plus writeback variants); gather loads and scatter stores with a scalar base address register and a vector of offsets from it (written [r0,q1] or similar); and gather/scatters with a vector of base addresses (written [q0,#offset], again with writeback). Additionally, some of the loads can widen each loaded value into a larger vector lane, and the corresponding stores narrow them again. To implement these, we also have to add the addressing modes they need. Also, in AsmParser, the `isMem` query function now has subqueries `isGPRMem` and `isMVEMem`, according to which kind of base register is used by a given memory access operand. I've also had to add an extra check in `checkTargetMatchPredicate` in the AsmParser, without which our last-minute check of `rGPR` register operands against SP and PC was failing an assertion because Tablegen had inserted an immediate 0 in place of one of a pair of tied register operands. (This matches the way the corresponding check for `MCK_rGPR` in `validateTargetOperandClass` is guarded.) Apparently the MVE load instructions were the first to have ever triggered this assertion, but I think only because they were the first to have a combination of the usual Arm pre/post writeback system and the `rGPR` class in particular. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62680 llvm-svn: 364291
*	[PowerPC] Emit XXSEL for vec_sel and code that has the same pattern	Nemanja Ivanovic	2019-06-25	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	As pointed out in https://bugs.llvm.org/show_bug.cgi?id=41777 we do not emit a vector select even when the pretty much asks for one. This patch changes that. Differential revision: https://reviews.llvm.org/D61658 llvm-svn: 364289
*	[ARM] DLS/LE low-overhead loop code generation	Sam Parker	2019-06-25	6	-1/+349
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce three pseudo instructions to be used during DAG ISel to represent v8.1-m low-overhead loops. One maps to set_loop_iterations while loop_decrement_reg is lowered to two, so that we can separate the decrement and branching operations. The pseudo instructions are expanded pre-emission, where we can still decide whether we actually want to generate a low-overhead loop, in a new pass: ARMLowOverheadLoops. The pass currently bails, reverting to an sub, icmp and br, in the cases where a call or stack spill/restore happens between the decrement and branching instructions, or if the loop is too large. Differential Revision: https://reviews.llvm.org/D63476 llvm-svn: 364288
*	[ExpandMemCmp] Move all options to TargetTransformInfo.	Clement Courbet	2019-06-25	5	-45/+26
\| \| \| \| \| \|	Split off from D60318. llvm-svn: 364281
*	AMDGPU/GlobalISel: Fix regbankselect for amdgcn.class	Matt Arsenault	2019-06-25	1	-4/+8
\| \| \| \|	llvm-svn: 364262