bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[SystemZ][FPEnv] Back-end support for STRICT_[SU]INT_TO_FP	Ulrich Weigand	2019-12-17	3	-16/+31
\| \| \| \| \| \| \| \| \|	As of b1d8576 there is middle-end support for STRICT_[SU]INT_TO_FP, so this patch adds SystemZ back-end support as well. The patch is SystemZ target specific except for adding SD patterns strict_[su]int_to_fp and any_[su]int_to_fp to TargetSelectionDAG.td as usual.
*	Revert "Honor -fuse-init-array when os is not specified on x86"	Mitch Phillips	2019-12-17	10	-3/+63
\| \| \| \| \| \| \|	This reverts commit aa5ee8f244441a8ea103a7e0ed8b6f3e74454516. This change broke the sanitizer buildbots. See comments at the patchset (https://reviews.llvm.org/D71360) for more information.
*	This adds constrained intrinsics for the signed and unsigned conversions	Kevin P. Neal	2019-12-17	1	-37/+137
\| \| \| \| \| \| \| \| \|	of integers to floating point. This includes some of Craig Topper's changes for promotion support from D71130. Differential Revision: https://reviews.llvm.org/D69275
*	[RISCV][NFC] Trivial cleanup	Luís Marques	2019-12-17	1	-3/+0
\| \| \| \|	Fix a typo. Remove two seemingly out-of-date TODO comments.
*	Fix assertion failure in getMemOperandWithOffsetWidth	Kristof Beyls	2019-12-17	7	-30/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes an assertion failure that triggers inside getMemOperandWithOffset when Machine Sinking calls it on a MachineInstr that is not a memory operation. Different backends implement getMemOperandWithOffset differently: some return false on non-memory MachineInstrs, others assert. The Machine Sinking pass, in at least SinkingPreventsImplicitNullCheck, relies on getMemOperandWithOffset returning false on non-memory MachineInstrs instead of asserting. This patch updates the documentation on getMemOperandWithOffset to state that it should return false on any MachineInstr it cannot handle, instead of asserting. It also adapts the in-tree backends accordingly where necessary. Differential Revision: https://reviews.llvm.org/D71359
*	Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove"	Guillaume Chatelet	2019-12-17	2	-10/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a resubmit of D71473. This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out-of-tree clients. Functions will be deprecated one by one, as in-tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: aaron.ballman, courbet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71547
*	Honor -fuse-init-array when os is not specified on x86	Kamlesh Kumar	2019-12-16	10	-63/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the -fuse-init-array option is not effective when the target triple does not specify an OS on x86 and x86_64, i.e. // -fuse-init-array is not honored. $ clang -target i386 -fuse-init-array test.c -S // -fuse-init-array is honored. $ clang -target i386-linux -fuse-init-array test.c -S This patch fixes the first case and does some cleanup. Reviewers: rnk, craig.topper, fhahn, echristo Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D71360
*	[RISCV] Added isCompressibleInst() to estimate size in getInstSizeInBytes()	Ana Pazos	2019-12-16	1	-1/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Modified compression emitter tablegen backend to emit isCompressibleInst() check which in turn is used by getInstSizeInBytes() to better estimate instruction size. Note the generation of compressed instructions in RISC-V happens late in the assembler therefore instruction size estimate might be off if computed before. Reviewers: lenary, asb, luismarques, lewis-revill Reviewed By: asb Subscribers: sameer.abuasal, lewis-revill, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68290
*	[AArch64][SVE] Change pattern generation code to fix -Wimplicit-fallthrough ↵	Fangrui Song	2019-12-16	1	-4/+11
\| \| \| \|	after D71483
*	[AArch64][SVE] Add patterns for logical immediate operations.	Danilo Carvalho Grael	2019-12-16	3	-4/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add pattern matching for the following SVE logical vector and immediate instructions: - and/bic, orr/orn, eor/eon. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71483
*	[WebAssembly] Replace SIMD int min/max builtins with patterns	Thomas Lively	2019-12-16	2	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The instructions were originally implemented via builtins and intrinsics so users would have to explicitly opt in to using them. This was useful while we were validating whether these instructions should be merged into the spec proposal. Now that they have been, we can use normal codegen patterns, so the intrinsics and builtins are no longer useful. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71500
*	[SystemZ] Improve verification of MachineOperands.	Jonas Paulsson	2019-12-16	3	-8/+40
\| \| \| \| \| \| \| \| \| \| \|	Now that the machine verifier will check for cases of register/immediate MachineOperands and their correspondence to the MC instruction descriptor, this patch adds the operand types to the descriptors where they were previously missing. All MCOI::OPERAND_UNKNOWN operand types have been handled to get a known type, except for G_... (global isel) instructions. Review: Ulrich Weigand https://reviews.llvm.org/D71494
*	[mips] Add an assert in getTargetStreamer()	Miloš Stojanović	2019-12-16	1	-0/+2
\| \| \| \| \| \|	Check if the TargetStreamer can be accessed. Differential Revision: https://reviews.llvm.org/D71477
*	Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove"	Guillaume Chatelet	2019-12-16	2	-8/+10
\| \| \| \|	This reverts commit 181ab91efc9fb08dedda10a2fbc5fccb83ce8799.
*	Reland [AArch64][MachineOutliner] Return address signing for outlined functions	David Tellenbach	2019-12-16	1	-8/+304
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Reland after fixing a bug that allowed outlining of SP modifying instructions that invalidated return address signing. During AArch64 frame lowering instructions to enable return address signing are inserted into functions if needed. Functions generated during machine outlining don't run through target frame lowering and hence are missing such instructions. This patch introduces the following changes: 1. If not all functions that potentially participate in function outlining agree on their return address signing scope and their return address signing key, outlining is disabled for these functions. 2. If not all functions that potentially participate in function outlining agree on their support for v8.3A features, outlining is disabled for these functions. 3. If an outlining candidate would outline instructions that modify sp in a way that invalidates return address signing, outlining is disabled for that particular candidate. 4. If all candidate functions agree on the signing scope, signing key and their support for v8.3 features, the outlined function behaves as if it had the same scope and key attributes and as if it would provide the same v8.3A support as the original functions. Reviewers: ostannard, paquette Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70635
*	[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove	Guillaume Chatelet	2019-12-16	2	-10/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out-of-tree clients. Functions will be deprecated one by one, as in-tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71473
*	[AArch64][SVE2] Add intrinsics for binary narrowing operations	Andrzej Warzynski	2019-12-16	2	-10/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The following intrinsics for binary narrowing add and sub operations are added: * @llvm.aarch64.sve.addhnb * @llvm.aarch64.sve.addhnt * @llvm.aarch64.sve.raddhnb * @llvm.aarch64.sve.raddhnt * @llvm.aarch64.sve.subhnb * @llvm.aarch64.sve.subhnt * @llvm.aarch64.sve.rsubhnb * @llvm.aarch64.sve.rsubhnt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71424
*	[AArch64] Enable emission of stack maps for non-Mach-O binaries on AArch64.	Kristof Beyls	2019-12-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	The emission of stack maps in AArch64 binaries has been disabled for all binary formats except Mach-O since rL206610, probably mistakenly, as far as I can tell. This patch reverts this to its intended state. Differential Revision: https://reviews.llvm.org/D70069 Patch by Loic Ottet.
*	[AArch64][SVE] Add intrinsics for scatter stores	Andrzej Warzynski	2019-12-16	5	-61/+297
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds the following SVE intrinsics for scatter stores: * 64-bit offsets: * @llvm.aarch64.sve.st1.scatter (unscaled) * @llvm.aarch64.sve.st1.scatter.index (scaled) * 32-bit unscaled offsets: * @llvm.aarch64.sve.st1.scatter.uxtw (zero-extended offset) * @llvm.aarch64.sve.st1.scatter.sxtw (sign-extended offset) * 32-bit scaled offsets: * @llvm.aarch64.sve.st1.scatter.uxtw.index (zero-extended offset) * @llvm.aarch64.sve.st1.scatter.sxtw.index (sign-extended offset) * vector base + immediate: * @llvm.aarch64.sve.st1.scatter.imm Reviewers: rengolin, efriedma, sdesmalen Reviewed By: efriedma, sdesmalen Subscribers: kmclaughlin, eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71074
*	Fix whitespace.	Jay Foad	2019-12-16	1	-2/+2
\|
*	Fix for AMDGPU MUL_I24 known bits calculation	Jay Foad	2019-12-16	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: At present, the code calculating known bits of AMDGPU MUL_I24 confuses the concepts of "non-negative number" and "positive number". In some situations, it results in incorrect code. I have a case where the optimizer replaces the result of calculating MUL_I24(-5, 0) with -8. Reviewers: foad, arsenm Reviewed By: arsenm Subscribers: foad, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Patch by Eugene Kuznetsov. Differential Revision: https://reviews.llvm.org/D70367
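The distinction drawn in the message above can be illustrated with a small model. This is only a sketch, assuming MUL_I24 multiplies the sign-extended low 24 bits of its operands; `signExtend24` and `mulI24` are hypothetical helper names, not LLVM APIs, and the model returns the full product rather than a truncated result.

```cpp
#include <cstdint>

// Interpret the low 24 bits of x as a signed two's-complement value.
std::int32_t signExtend24(std::uint32_t x) {
  x &= 0xFFFFFFu;
  std::int32_t value = static_cast<std::int32_t>(x); // at most 2^24 - 1
  return (x & 0x800000u) ? value - 0x1000000 : value;
}

// Model of a signed 24-bit multiply (assumption: full product returned).
std::int64_t mulI24(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::int64_t>(signExtend24(a)) * signExtend24(b);
}

// A negative operand times a *non-negative* operand is not necessarily
// negative, because the non-negative operand may be zero. Only a strictly
// positive operand lets an analysis conclude that the sign bit is set.
```

With this model, `mulI24(static_cast<std::uint32_t>(-5), 0)` is zero, so an analysis that marks the sign bit as known-set for "negative times non-negative" would be unsound.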
*	[ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC.	Sjoerd Meijer	2019-12-16	3	-108/+118
\| \| \| \| \| \| \| \|	In ARMLowOverheadLoops.cpp, MVETailPredication.cpp, and MVEVPTBlock.cpp we have quite a few helper functions all looking at the opcodes of MVE instructions. This moves all these utility functions to ARMBaseInstrInfo. Differential Revision: https://reviews.llvm.org/D71426
*	[PowerPC] Fix %llvm.ppc.altivec.vc* lowering	Jim Lin	2019-12-16	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: r372285 changed LLVM to use a `TargetConstant` for parameters of intrinsics that are required to be immediates. Since that commit, use of `%llvm.ppc.altivec.vc{fsx,fux,tsxs,tuxs}` intrinsics has not worked, and resulted in a `LLVM ERROR: Cannot select: intrinsic %llvm.ppc.altivec.vc*` error. The intrinsics' TableGen definitions matched on `imm` instead of `timm`. This commit updates those definitions to use `timm`. Fixes: https://llvm.org/PR44239 Reviewers: hfinkel, nemanjai, #powerpc, Jim Reviewed By: Jim Subscribers: qiucf, wuzish, Jim, hiraditya, kbarton, jsji, shchenz, llvm-commits Tags: #llvm Patched by vddvss (Colin Samples). Differential Revision: https://reviews.llvm.org/D71138
*	Revert "AArch64: Fix frame record chain"	Logan Chien	2019-12-14	2	-32/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Breaks aosp-O3-polly-before-vectorizer-unprofitable with the following error message: void llvm::emitFrameOffset(llvm::MachineBasicBlock &, MachineBasicBlock::iterator, const llvm::DebugLoc &, unsigned int, unsigned int, llvm::StackOffset, const llvm::TargetInstrInfo *, MachineInstr::MIFlag, bool, bool, bool *): Assertion `(DestReg != AArch64::SP || Bytes % 16 == 0) && "SP increment/decrement not 16-byte aligned"' failed. This reverts commit d4e10e6adb1b629b3fc1b78f7e281fbcec392edb.
*	AArch64: Fix frame record chain	Logan Chien	2019-12-14	2	-16/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit r369122 may keep the LR and FP registers (aka. the frame record) in the middle of a frame, thus we must add the offsets to ensure the FP register always points to the innermost frame record on the stack. According to AAPCS64[1], conforming code shall construct a linked list of stack frames that can be traversed with frame records. This commit is also essential to frame-pointer-based stack unwinders (e.g. the stack unwinder in linux-perf-tools.) [1] https://github.com/ARM-software/software-standards/blob/master/abi/aapcs64/aapcs64.rst#the-frame-pointer Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64/framelayout-frame-record.ll Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64 Differential Revision: https://reviews.llvm.org/D70800
*	[AArch64] Save FP for leaf functions when disabling frame pointer elimination	Fangrui Song	2019-12-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	The change allows clang -mno-omit-leaf-frame-pointer to disable frame pointer elimination. This behavior matches X86 and Mips, and also GCC AArch64. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71168
*	[PowerPC] Add Support for indirect calls on AIX.	Sean Fertile	2019-12-13	7	-35/+132
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Extends the descriptor-based indirect call support for 32-bit codegen, and enables indirect calls for AIX. In-depth Description: In a function descriptor based ABI, a function pointer points at a descriptor structure as opposed to the function's entry point. The descriptor takes the form of 3 pointers: 1 for the function's entry point, 1 for the TOC anchor of the module containing the function definition, and 1 for the environment pointer: struct FunctionDescriptor { void *EntryPoint; void *TOCAnchor; void *EnvironmentPointer; }; An indirect call has several steps of loading the information from the descriptor into the proper registers for setting up the call. Namely it has to: 1) Save the caller's TOC pointer into the TOC save slot in the linkage area, and then load the callee's TOC pointer into the TOC register (GPR 2 on AIX). 2) Load the function descriptor's entry point into the count register. 3) Load the environment pointer into the environment pointer register (GPR 11 on AIX). 4) Perform the call by branching on count register. 5) Restore the caller's TOC pointer after returning from the indirect call. A couple of important caveats to the above: - There is no way to directly load a value from memory into the count register. Instead we populate the count register by loading the entry point address into a gpr and then moving the gpr to the count register. - The TOC restore has to come immediately after the branch on count register instruction (i.e., the 1st instruction executed after we return from the call). This is an implementation limitation.
We could, in theory, schedule the restore elsewhere as long as no uses of the TOC pointer fall in between the call and the restore; however, to keep it simple, we insert a pseudo instruction that represents both the indirect branch instruction and the load instruction that restores the caller's TOC from the linkage area. As they flow through the compiler as a single pseudo instruction, nothing can be inserted between them and the caller's TOC is then valid at any use. Differential Revision: https://reviews.llvm.org/D70724
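The five-step call sequence described in this message can be modelled in plain C++. This is a rough sketch only: the `Registers` struct, the typed `Callee` entry point, and `indirectCall` are hypothetical stand-ins chosen for illustration (the real descriptor holds three untyped pointers, and the real sequence is emitted as machine instructions, not C++).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical callee signature, used so the model can call the entry point.
using Callee = int (*)(int);

// Model of the three-pointer function descriptor.
struct FunctionDescriptor {
  Callee EntryPoint;
  std::uintptr_t TOCAnchor;
  std::uintptr_t EnvironmentPointer;
};

// Model of the registers involved (GPR 2, GPR 11 and the count register).
struct Registers {
  std::uintptr_t r2 = 0;  // TOC pointer
  std::uintptr_t r11 = 0; // environment pointer
  Callee ctr = nullptr;   // count register
};

int indirectCall(Registers &regs, const FunctionDescriptor &fd, int arg) {
  assert(fd.EntryPoint != nullptr);
  std::uintptr_t tocSaveSlot = regs.r2; // 1) save the caller's TOC...
  regs.r2 = fd.TOCAnchor;               //    ...and load the callee's TOC
  Callee gpr = fd.EntryPoint;           // 2) entry point goes through a GPR,
  regs.ctr = gpr;                       //    then is moved to the count register
  regs.r11 = fd.EnvironmentPointer;     // 3) load the environment pointer
  int result = regs.ctr(arg);           // 4) branch on the count register
  regs.r2 = tocSaveSlot;                // 5) restore the caller's TOC right after
  return result;
}
```

In the model, steps 4 and 5 sit on adjacent lines with nothing in between, which mirrors the role of the single pseudo instruction described above.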
*	[Mips] Fix gcc -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=Off ↵	Fangrui Song	2019-12-13	1	-0/+1
\| \| \| \|	builds after D71028
*	[Legalizer] Making artifact combining order-independent	Roman Tereshin	2019-12-13	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The legalization algorithm is complicated by two facts: 1) While regular instructions should be possible to legalize in an isolated, per-instruction, context-free manner, legalization artifacts can only be eliminated in pairs, which could be deeply, and ultimately arbitrarily, nested: { [ () ] }, where each parenthesis kind depicts an artifact kind, like extend, unmerge, etc. Such a structure can only be fully eliminated by simple local combines if they are attempted in a particular order (inside out), or alternatively by repeated scans each eliminating only one innermost pair, resulting in O(n^2) complexity. 2) Some artifacts might in fact be regular instructions that could (and sometimes should) be legalized by the target-specific rules. This means that failure to eliminate all artifacts on the first iteration is not a failure; they need to be tried as instructions, which may produce more artifacts, including the ones that are in fact regular instructions, resulting in a non-constant number of iterations required to finish the process. I trust the recently introduced termination condition (no new artifacts were created during as-a-regular-instruction-retrial of artifacts not eliminated on the previous iteration) to be efficient in providing termination, but to perform the legalization in full only if at each step such chains of artifacts are successfully eliminated in full as well. This is currently not guaranteed, as the artifact combines are applied only once and in an arbitrary order that has to do with the order of creation or insertion of artifacts into their worklist, which is no particular order.
In this patch I make a small change to the artifact combiner, making it re-insert into the worklist the immediate (modulo look-through copies) artifact users of each vreg that changes its definition due to an artifact combine. Here the first scan through the artifacts worklist, while not being done in any guaranteed order, only needs to find the innermost pair(s) of artifacts that could be immediately combined out. After that the process follows def-use chains, making them shorter at each step, thus combining everything that can be combined in O(n) time. Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders Reviewed By: aditya_nandakumar, paquette Tags: #llvm Differential Revision: https://reviews.llvm.org/D71448
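The worklist idea can be sketched on the bracket analogy used in the message. This is a toy model under stated assumptions, not the GlobalISel implementation: artifacts are characters, a "combine" removes an adjacent matching pair, and `combineArtifacts` is a hypothetical name. After each combine, only the element that gained a new neighbour is pushed back on the worklist, so the work stays linear in the chain length regardless of the initial visiting order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the artifacts left after combining out all matching adjacent pairs.
std::string combineArtifacts(const std::string &chain) {
  const int n = static_cast<int>(chain.size());
  std::vector<int> prev(n), next(n);
  std::vector<bool> alive(n, true);
  std::vector<int> worklist;
  for (int i = 0; i < n; ++i) {
    prev[i] = i - 1;
    next[i] = (i + 1 < n) ? i + 1 : -1;
    worklist.push_back(i); // initial order is arbitrary in the real problem
  }
  auto matches = [](char open, char close) {
    return (open == '(' && close == ')') || (open == '[' && close == ']') ||
           (open == '{' && close == '}');
  };
  while (!worklist.empty()) {
    int i = worklist.back();
    worklist.pop_back();
    if (!alive[i])
      continue;
    int j = next[i];
    if (j < 0 || !matches(chain[i], chain[j]))
      continue;
    assert(alive[j] && "links only ever point at live elements");
    // Combine the pair out and splice its neighbours together.
    alive[i] = alive[j] = false;
    int p = prev[i], q = next[j];
    if (p >= 0)
      next[p] = q;
    if (q >= 0)
      prev[q] = p;
    // Re-insert the element whose right-hand neighbour just changed.
    if (p >= 0)
      worklist.push_back(p);
  }
  std::string rest;
  for (int i = 0; i < n; ++i)
    if (alive[i])
      rest += chain[i];
  return rest;
}
```

Each element enters the worklist once initially and at most once more per combine, so the number of worklist visits is bounded by a constant multiple of the chain length.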
*	[RISCV] Move DebugLoc Copy into CompressInstEmitter	Sam Elliott	2019-12-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This copy ensures that debug location information is kept for compressed instructions. There are places where both compressInstruction and uncompressInstruction are called that were not doing this copy, discarding some debug info. This change merely moves the copy into the generated file, so you cannot forget to copy the location over when compressing or uncompressing. Reviewers: asb, luismarques Reviewed By: luismarques Subscribers: sameer.abuasal, aprantl, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67493
*	[ARM] Fix an ICE when retrieving the number of micro-ops for vlldm/vlstm	Momchil Velikov	2019-12-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The big switch in `ARMBaseInstrInfo::getNumMicroOps` is missing cases for `VLLDM` and `VLSTM`, which are currently defined with itineraries having a dynamic count of micro-ops. Assuming an optimistic case in which these instructions do not actually perform loads or stores, and with the idea that Armv8-m cores are supposed to use the new style scheduling models, this patch just sets the itinerary for those two instructions to `NoItinerary`. Differential Revision: https://reviews.llvm.org/D71266
*	[AArch64] Emit PAC/BTI .note.gnu.property flags	Momchil Velikov	2019-12-13	1	-0/+60
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes LLVM emit the processor specific program property types defined in the AArch64 ELF spec https://developer.arm.com/docs/ihi0056/f/elf-for-the-arm-64-bit-architecture-aarch64-abi-2019q2-documentation A file containing no functions gets both property flags. Otherwise, a property is set iff all the functions in the file have the corresponding attribute. Patch by Daniel Kiss and Momchil Velikov. Differential Revision: https://reviews.llvm.org/D71019
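The flag rule stated above (both flags for an empty file, otherwise a flag only if every function carries the attribute) amounts to an AND-reduction starting from "all set". A minimal sketch follows; the bit values are those of GNU_PROPERTY_AARCH64_FEATURE_1_BTI and GNU_PROPERTY_AARCH64_FEATURE_1_PAC, while `FunctionAttrs` and `moduleFeatureFlags` are hypothetical names and not the code added by the patch.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint32_t FeatureBTI = 1u << 0; // GNU_PROPERTY_AARCH64_FEATURE_1_BTI
constexpr std::uint32_t FeaturePAC = 1u << 1; // GNU_PROPERTY_AARCH64_FEATURE_1_PAC

// Hypothetical per-function summary of the relevant attributes.
struct FunctionAttrs {
  bool hasBTI;
  bool hasPAC;
};

// Start with both flags set (the "no functions" case) and clear a flag as
// soon as one function lacks the corresponding attribute.
std::uint32_t moduleFeatureFlags(const std::vector<FunctionAttrs> &functions) {
  std::uint32_t flags = FeatureBTI | FeaturePAC;
  for (const FunctionAttrs &f : functions) {
    if (!f.hasBTI)
      flags &= ~FeatureBTI;
    if (!f.hasPAC)
      flags &= ~FeaturePAC;
  }
  return flags;
}
```

Starting from the full mask makes the empty-file case fall out of the same loop instead of needing a special branch.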
*	[MC][PowerPC] Fix a crash when redefining a symbol after .set	Fangrui Song	2019-12-13	1	-1/+2
\| \| \| \| \| \| \| \|	Fix PR44284. This is probably not valid assembly but we should not crash. Reviewed By: luporl, #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D71443
*	[ARM][MVE] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=Off builds ↵	Fangrui Song	2019-12-13	1	-3/+4
\| \| \| \|	after D71062
*	[ARM][MVE] Make VPT invalid for tail predication	Sam Parker	2019-12-13	1	-3/+0
\| \| \| \| \| \| \| \| \|	We've been marking VPT incompatible instructions as invalid for tail predication too, though this may not strictly be true. VPT are incompatible and, unless it's the first predicate def in a loop, they shouldn't be compatible for tail predication either. Differential Revision: https://reviews.llvm.org/D71410
*	[ARM][MVE] Add vector reduction intrinsics with two vector operands	Mikhail Maltsev	2019-12-13	2	-40/+251
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds intrinsics for the following MVE instructions: * VABAV * VMLADAV, VMLSDAV * VMLALDAV, VMLSLDAV * VRMLALDAVH, VRMLSLDAVH Each of the above 4 groups has a corresponding new LLVM IR intrinsic, since the instructions cannot be easily represented using general-purpose IR operations. Reviewers: simon_tatham, ostannard, dmgreen, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71062
*	[ARM][MVE] Add intrinsics for more immediate shifts.	Simon Tatham	2019-12-13	3	-73/+182
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fills in the remaining shift operations that take a single vector input and an immediate shift count: the `vqshl`, `vqshlu`, `vrshr` and `vshll[bt]` families. `vshll[bt]` (which shifts each input lane left into a double-width output lane) is the most interesting one. There are separate MC instruction ids for shifting by exactly the input lane width and shifting by less than that, because the instruction encoding is so completely different for the lane-width special case. So I had to write two sets of patterns to match based on the immediate shift count, which involved adding a ComplexPattern matcher to avoid the general-case pattern accidentally matching the special case too. For that family I've made sure to add an llc codegen test for both versions of each instruction. I'm experimenting with a new strategy for parametrising the isel patterns for all these instructions: adding extra fields to the relevant `Instruction` subclass itself, which are ignored by the Tablegen backends that generate the MC data, but can be retrieved from each instance of that instruction subclass when it's passed as a template parameter to the multiclass that generates its isel patterns. A nice effect of that is that I can fill in those informational fields using `let` blocks, rather than having to type them out once per instruction at `defm` time. (As a result, quite a lot of existing instruction `def`s are reindented by this patch, so it's clearer to read with whitespace changes ignored.) Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71458
*	[ARM] Add custom strict fp conversion lowering when non-strict is custom	John Brawn	2019-12-13	1	-33/+64
\| \| \| \| \| \| \| \| \| \| \| \|	We have custom lowering for operations converting to/from floating-point types when we don't have hardware support for those types, and this doesn't interact well with the target-independent legalization of the strict versions of these operations. Fix this by adding similar custom lowering of the strict versions. This fixes the last of the assertion failures in the CodeGen/ARM/fp-intrinsics test, with the remaining failures due to poor instruction selection. Differential Revision: https://reviews.llvm.org/D71127
*	Revert "AMDGPU: Try to commute sub of boolean ext"	Tim Renouf	2019-12-13	1	-26/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 69fcfb7d3597e0cdb5554b4e672e9032b411b167. As shown in the test I attached to this commit, the change I reverted causes a problem with "zext(cc1) - zext(cc2)". It commuted the operands to the sub and used different logic to select the addc/subc instruction: sub zext (setcc), x => addcarry 0, x, setcc sub sext (setcc), x => subcarry 0, x, setcc ... but that is bogus. I believe it is not possible to fold those commuted patterns into any form of addcarry or subcarry. It may have worked as intended before "AMDGPU: Change boolean content type to 0 or 1" because the setcc was considered to be -1 rather than 1. Differential Revision: https://reviews.llvm.org/D70978 Change-Id: If2139421aa6c935cbd1d925af58fe4a4aa9e8f43
*	[NFC] Use EVT instead of bool for getSetCCInverse()	Alex Richardson	2019-12-13	8	-37/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void *)0x12033091e < (void *)0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917
*	Revert "[ARM][MVE] findVCMPToFoldIntoVPS. NFC."	Sjoerd Meijer	2019-12-13	1	-28/+30
\| \| \| \| \| \| \| \|	This reverts commit 9468e3334ba54fbb1b209aaec662d7375451fa1f. There's a test that doesn't like this change. The RDA analysis gets invalidated by changes in the block, which is not taken into account. Revert while I work on a fix for this.
*	[ARM][MVE][Intrinsics] Add _x() variants of my _m() intrinsics.	Mark Murray	2019-12-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Better use is made of multiclasses, and this helped find some existing bugs in the predicated VMULL* intrinsics, which are now fixed. The refactored VMULL[TB]Q_(INT|POLY)_M() intrinsics were discovered to have an argument ("inactive") with incorrect type, and this required a fix that is included in this whole patch. The argument "inactive" should have been the same width (per vector element) as the return type of the intrinsic, but was not in the case where the return type was double the element width of the input types. To assist in testing the multiclassing, and to thwart further gremlins, the unit tests are improved in scope. The .ll tests are all generated by a small bit of throw-away scripting from the corresponding .c tests, and as such the diffs are large and nasty. Look at the file rather than the diff. Reviewers: dmgreen, miyuki, ostannard, simon_tatham Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71421
*	Recommit "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores"	Kerry McLaughlin	2019-12-13	3	-2/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Updated pred_load patterns added to AArch64SVEInstrInfo.td by this patch to use reg + imm non-temporal loads to fix previous test failures. Original commit message: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above.
*	[NFC][AArch64] Fix typo.	Nate Voorhies	2019-12-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Coaleascer should be coalescer. Reviewers: qcolombet, Jim Reviewed By: Jim Subscribers: Jim, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70731
*	[AArch64][SVE] Add integer arithmetic with immediate instructions.	Danilo Carvalho Grael	2019-12-12	3	-9/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add pattern matching for the following instructions: - add, sub, subr, sqadd, sqsub, uqadd, uqsub This patch required complex patterns to match the immediate with optional left shift. I re-used the Select function from the other SVE repo to implement the complex pattern. I plan on doing another patch to also match constant vector of the same immediate. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71370
*	[SystemZ] Implement the packed stack layout	Jonas Paulsson	2019-12-12	4	-112/+239
\| \| \| \| \| \| \| \| \|	Any llvm function with the "packed-stack" attribute will be compiled to use the packed stack layout which reuses unused parts of the incoming register save area. This is needed for building the Linux kernel. Review: Ulrich Weigand https://reviews.llvm.org/D70821
*	[amdgpu] Fix `-Wenum-compare` warning. NFC.	Michael Liao	2019-12-12	1	-6/+6
\|
*	[ARM][MVE] findVCMPToFoldIntoVPS. NFC.	Sjoerd Meijer	2019-12-12	1	-30/+28
\| \| \| \| \| \| \|	This adds ReachingDefAnalysis (RDA) to the VPTBlock pass, so that we can reimplement findVCMPToFoldIntoVPS with just a few calls to RDA. Differential Revision: https://reviews.llvm.org/D71330
*	AMDGPU/SILoadStoreOptimizer: Simplify function	Tom Stellard	2019-12-12	1	-62/+50
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: merge_guards_bot, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71044
*	[ARM][MVE] Sink vector shift operand	Sam Parker	2019-12-12	1	-3/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recommit e0b966643fc2. sub instructions were being generated for the negated value, and for some reason they were the register only ones. I think the problem was because I was grabbing the 'zero' from vmovimm, which is a target constant. Now I'm just generating a new Constant zero and so rsb instructions are now generated. Original commit message: The shift amount operand can be provided in a general purpose register so sink it. Flip the vdup and negate so the existing patterns can be used for matching. Differential Revision: https://reviews.llvm.org/D70841