bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[ARM] Add support for the s,j,x,N,O inline asm constraints	David Candler	2019-09-05	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A number of inline assembly constraints are currently supported by LLVM, but rejected as invalid by Clang. Target independent constraints: s: An integer constant, but allowing only relocatable values. ARM specific constraints: j: An immediate integer between 0 and 65535 (valid for MOVW); x: A 32, 64, or 128-bit floating-point/SIMD register: s0-s15, d0-d7, or q0-q3; N: An immediate integer between 0 and 31 (Thumb1 only); O: An immediate integer which is a multiple of 4 between -508 and 508 (Thumb1 only). This patch adds support to Clang for the missing constraints along with some checks to ensure that the constraints are used with the correct target and Thumb mode, and that immediates are within valid ranges (at least where possible). The constraints are already implemented in LLVM, but this patch makes a couple of minor corrections to its checks (V8M Baseline includes MOVW, so it should work with 'j'; 'N' and 'O' shouldn't be valid in Thumb2) so that Clang and LLVM are in line with each other and the documentation. Differential Revision: https://reviews.llvm.org/D65863 Change-Id: I18076619e319bac35fbb60f590c069145c9d9a0a llvm-svn: 371079
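The immediate ranges listed in this message can be written down as a small range check. This is only a sketch: `immediateFitsConstraint` is a hypothetical helper, not Clang's or LLVM's code, and it ignores the target and Thumb-mode checks the patch also performs.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from Clang: checks an immediate against the
// ranges quoted in the commit message for the 'j', 'N' and 'O' constraints.
bool immediateFitsConstraint(char Constraint, int64_t Imm) {
  switch (Constraint) {
  case 'j': // immediate between 0 and 65535 (valid for MOVW)
    return Imm >= 0 && Imm <= 65535;
  case 'N': // immediate between 0 and 31 (Thumb1 only)
    return Imm >= 0 && Imm <= 31;
  case 'O': // multiple of 4 between -508 and 508 (Thumb1 only)
    return Imm >= -508 && Imm <= 508 && Imm % 4 == 0;
  default:
    return false; // other constraints are not modelled in this sketch
  }
}
```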
*	[ARM] Fixup the creation of VPT blocks	David Green	2019-09-05	1	-15/+20
\| \| \| \| \| \| \| \| \| \|	This attempts to just fix the creation of VPT blocks, fixing up the iterating, which instructions are considered in the bundle, and making sure that we do not overrun the end of the block. Differential Revision: https://reviews.llvm.org/D67219 llvm-svn: 371064
*	[LLVM][Alignment] Make functions using log of alignment explicit	Guillaume Chatelet	2019-09-05	6	-29/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch renames functions that take or return alignment as log2; it will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were dealing with log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments; internally these values are interpreted as log2(align). This patch updates the documentation. - `MachineFunction` exposes `align-all-functions`, also supposedly interpreted as a power of two alignment; internally this value is interpreted as log2(align). This patch updates the documentation. Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045
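The two alignment representations this message distinguishes differ only by a log2 / shift, which is why mixing them up is easy to miss. A minimal sketch of the conversion, with hypothetical helper names that are not LLVM's:

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; the names are not from LLVM.

// Converts a log2 alignment (e.g. 4) to a byte alignment (e.g. 16).
constexpr uint64_t alignFromLog2(unsigned LogAlign) {
  return uint64_t(1) << LogAlign;
}

// Converts a power-of-two byte alignment back to its log2 value.
// Assumes Align is a non-zero power of two.
unsigned log2FromAlign(uint64_t Align) {
  unsigned Log = 0;
  while (Align > 1) {
    Align >>= 1;
    ++Log;
  }
  return Log;
}
```

Passing 4 where a byte alignment is expected gives 4-byte alignment, while the same 4 read as log2 means 16-byte alignment.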
*	[ARM][ParallelDSP] SExt mul for accumulation	Sam Parker	2019-09-04	1	-5/+14
\| \| \| \| \| \| \| \| \| \|	For any unpaired muls, we accumulate them as an input to the reduction. Check the type of the mul and perform a sext if the existing accumulator input type is not the same. Differential Revision: https://reviews.llvm.org/D66993 llvm-svn: 370851
*	[GlobalISel][CallLowering] Add support for splitting types according to calling conventions.	Amara Emerson	2019-09-03	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	On AArch64, s128 types have to be split into s64 GPRs when passed as arguments. This change adds the generic support in call lowering for dealing with multiple registers, for incoming and outgoing args. Support for splitting return types is not yet implemented. Differential Revision: https://reviews.llvm.org/D66180 llvm-svn: 370822
*	[MC] Pass through .code16/32/64 and .syntax unified for COFF	Reid Kleckner	2019-09-03	1	-10/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	These flags should simply be passed through to the target, which will do the right thing. Add an MC/X86 test that uses these directives with the three primary object file formats and shows that they disassemble the same everywhere. There is a missing test for .code32 on Windows ARM, since I'm not sure exactly how to construct one. Fixes PR43203 llvm-svn: 370805
*	[ARM] Ignore Implicit CPSR regs when lowering from Machine to MC operands	David Green	2019-09-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code here seems to date back to r134705, when tablegen lowering was first being added. I don't believe that we need to include CPSR implicit operands on the MCInst. This now works more like other backends (like AArch64), where all implicit registers are skipped. This allows the AliasInst for CSEL's to match correctly, as can be seen in the test changes. Differential revision: https://reviews.llvm.org/D66703 llvm-svn: 370745
*	[ARM] Invert CSEL predicates if the opposite is a simpler constant to materialise	David Green	2019-09-03	4	-30/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This moves ConstantMaterializationCost into ARMBaseInstrInfo so that it can also be used in ISel Lowering, adding codesize values to the computed costs, to be able to compare either approximate instruction counts or codesize costs. It also adds a HasLowerConstantMaterializationCost, which compares the ConstantMaterializationCost of two values, returning true if the first is smaller either in instruction count/codesize, or falling back to the other in the case that they are equal. This is used in constant CSEL lowering to invert the predicate if the opposite is easier to materialise. Differential revision: https://reviews.llvm.org/D66701 llvm-svn: 370741
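The comparison described for HasLowerConstantMaterializationCost (primary metric first, the other metric as a tie-break) could look roughly like the following. The `MaterializationCost` struct and `hasLowerCost` function are hypothetical stand-ins; the real function computes the costs from the constant values and the subtarget, which this sketch does not attempt.

```cpp
// Hypothetical cost record; in LLVM the two costs are computed from the
// constant value and subtarget rather than stored like this.
struct MaterializationCost {
  unsigned Instructions;  // approximate instruction count
  unsigned CodeSizeBytes; // approximate code size
};

// Returns true if A is cheaper than B. The primary metric is code size when
// ForCodeSize is set, otherwise instruction count; on a tie the other metric
// decides, as the commit message describes.
bool hasLowerCost(const MaterializationCost &A, const MaterializationCost &B,
                  bool ForCodeSize) {
  unsigned PrimaryA = ForCodeSize ? A.CodeSizeBytes : A.Instructions;
  unsigned PrimaryB = ForCodeSize ? B.CodeSizeBytes : B.Instructions;
  if (PrimaryA != PrimaryB)
    return PrimaryA < PrimaryB;
  unsigned SecondaryA = ForCodeSize ? A.Instructions : A.CodeSizeBytes;
  unsigned SecondaryB = ForCodeSize ? B.Instructions : B.CodeSizeBytes;
  return SecondaryA < SecondaryB;
}
```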
*	[ARM] Generate 8.1-m CSINC, CSNEG and CSINV instructions.	David Green	2019-09-03	6	-1/+92
\| \| \| \| \| \| \| \| \| \| \| \|	Arm 8.1-M adds a number of related CSEL instructions, including CSINC, CSNEG and CSINV. These choose between two values given the content in CPSR and a condition, performing an increment, negation or inverse of the false value. This adds some selection for them, either from constant values or patterns. It does not include CSEL directly, which is currently not always making code better. It is still useful, but we will have to check more carefully where it should and shouldn't be used. Code by Ranjeet Singh and Simon Tatham, with some modifications from me. Differential revision: https://reviews.llvm.org/D66483 llvm-svn: 370739
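As described above, each of these instructions selects the first operand when the condition holds and otherwise a modified second ("false") operand. A scalar model of that behaviour, written as plain functions rather than anything from the ARM backend:

```cpp
#include <cstdint>

// Scalar models of the conditional-select variants named in the commit
// message. Cond stands for the condition evaluated against the flags.

// CSINC: true value, or the false value incremented.
uint32_t csinc(bool Cond, uint32_t TrueVal, uint32_t FalseVal) {
  return Cond ? TrueVal : FalseVal + 1;
}

// CSNEG: true value, or the false value negated (two's complement).
uint32_t csneg(bool Cond, uint32_t TrueVal, uint32_t FalseVal) {
  return Cond ? TrueVal : 0u - FalseVal;
}

// CSINV: true value, or the bitwise inverse of the false value.
uint32_t csinv(bool Cond, uint32_t TrueVal, uint32_t FalseVal) {
  return Cond ? TrueVal : ~FalseVal;
}
```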
*	[ARM] Fix MVE ldst offset ranges	David Green	2019-09-03	1	-19/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were using isShiftedInt<7, Shift>(RHSC) to detect the ranges of offsets to fold into MVE loads/stores. The instructions actually take a 7 bit unsigned integer which is either added or subtracted. So something more like isShiftedUInt<7, Shift>(abs(RHSC)). Instead I've changed this to use the isScaledConstantInRange method, same as in SelectT2AddrModeImm7Offset used by pre/post inc, which seemed to already be getting this correct. Differential revision: https://reviews.llvm.org/D66997 llvm-svn: 370731
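The difference between the two range checks mentioned here can be seen with simplified re-implementations. The helpers below are written for this illustration and only mimic the signed and unsigned shifted-range checks; they are not LLVM's functions, and the actual patch uses isScaledConstantInRange instead.

```cpp
#include <cstdint>
#include <cstdlib>

// Simplified stand-in: X fits in a signed N-bit field shifted left by S.
template <unsigned N, unsigned S> bool isShiftedIntSketch(int64_t X) {
  const int64_t Limit = int64_t(1) << (N + S - 1);
  return X >= -Limit && X < Limit && X % (int64_t(1) << S) == 0;
}

// Simplified stand-in: X fits in an unsigned N-bit field shifted left by S.
template <unsigned N, unsigned S> bool isShiftedUIntSketch(uint64_t X) {
  return X < (uint64_t(1) << (N + S)) && X % (uint64_t(1) << S) == 0;
}

// The check the commit message says was wrong: a signed 7-bit range.
template <unsigned Shift> bool oldCheck(int64_t Offset) {
  return isShiftedIntSketch<7, Shift>(Offset);
}

// The "something more like" check: 7-bit unsigned magnitude, added or
// subtracted.
template <unsigned Shift> bool magnitudeCheck(int64_t Offset) {
  return isShiftedUIntSketch<7, Shift>(uint64_t(std::llabs(Offset)));
}
```

With a shift of 1, the signed check rejects an offset of 254 even though a 7-bit magnitude scaled by 2 can represent it.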
*	[ARM][MVE] Decoding of VMSR doesn't diagnose some unpredictable encodings	Oliver Stannard	2019-09-03	1	-25/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Decoding of VMSR doesn't diagnose some unpredictable encodings, as the unpredictable bits are not correctly set. Diff-reduce this instruction's internals WRT VMRS so I can see the differences better. Mostly this is s/src/Rt/g. Fill in the "should-be-(0)" bits. Designate the Unpredictable{} bits for both VMRS and VMSR. Patch by Mark Murray! Differential revision: https://reviews.llvm.org/D66938 llvm-svn: 370729
*	Bug fix on function epilog optimization (ARM backend)	Oliver Stannard	2019-09-03	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To save an 'add sp,#val' instruction by adding registers to the final pop instruction, the first register transferred by this pop instruction needs to be found. If the function to be optimized has a non-void return value, the operand list contains r0 (implicit), which prevents the optimization from taking place. Therefore implicit register references should be skipped in the search loop, because these registers are never popped from the stack. Patch by Rainer Herbertz (rOptimizer)! Differential revision: https://reviews.llvm.org/D66730 llvm-svn: 370728
*	[ARM] Select vmla	Sam Tebbs	2019-09-03	1	-0/+15
\| \| \| \| \| \| \| \|	This patch adds vmla selection. Differential revision: https://reviews.llvm.org/D66297 llvm-svn: 370704
*	[ARM] Use MQPR not QPR for MVE registers	David Green	2019-09-02	3	-96/+98
\| \| \| \| \| \| \| \| \|	We should be using MQPR, and if we don't we can get COPYs and PHIs created for QPR. These get folded into instructions, failing verification checks. Differential revision: https://reviews.llvm.org/D66214 llvm-svn: 370676
*	[ARM] Remove MVE masked loads/stores	David Green	2019-09-01	3	-127/+0
\| \| \| \| \| \| \| \| \|	These were never enabled correctly and are causing other problems. Taking them out for the moment, whilst we work on the issues. This reverts r370329. llvm-svn: 370607
*	[ARM] MVE Masked loads and stores	David Green	2019-08-29	3	-0/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Masked loads and stores fit naturally with MVE, the instructions being easily predicated. This adds lowering for the simple cases of masked loads and stores. It does not yet deal with widening/narrowing or pre/post inc. The llvm masked load intrinsic will accept a "passthru" value, dictating the values used for the zero masked lanes. In MVE the instructions write 0 to the zero predicated lanes, so we need to match a passthru that isn't 0 (or undef) with a select instruction to pull in the correct data after the load. We also need to do something with unaligned loads/stores. Currently this uses a similar method to the one used in big endian, using a VLDRB.8 (and potentially a VREV in BE). This does mean that the predicate mask is converted from, for example, a v4i1 to a v16i1. The VLDR instructions are defined as using the first bit of the relevant mask lane, so this could potentially load different results if the predicate is a little odd. As the input is a v4i1 however, I believe this is OK and all the bits required should be set in the predicate, making the VLDRB.8 load the same data. Differential Revision: https://reviews.llvm.org/D66534 llvm-svn: 370329
*	[RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall	Shiva Chen	2019-08-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch fixes the issue that RV64 didn't clear the upper bits when returning a complex floating-point value with the lp64 ABI. float _Complex complex_add(float _Complex a, float _Complex b) { return a + b; } RealResult = zero_extend(RealA + RealB) ImageResult = ImageA + ImageB Return (RealResult \| (ImageResult << 32)) The patch introduces the shouldExtendTypeInLibCall target hook to suppress the AssertZext generation when lowering floating LibCall. Thanks to Eli's comments from the Bugzilla https://bugs.llvm.org/show_bug.cgi?id=42820 Differential Revision: https://reviews.llvm.org/D65497 llvm-svn: 370275
*	[TargetLowering] Add buildLegalVectorShuffle facility to help build legal shuffles	Amaury Sechet	2019-08-28	1	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There are at least 2 ways to express the same shuffle. Various pieces of code explicitly check for both options, but other places do not when they would benefit from doing it. This patch refactors the codebase to use buildLegalVectorShuffle in order to make that behavior more consistent. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66804 llvm-svn: 370190
*	[ARM] Move MVEVPTBlockPass to a separate file. NFC	David Green	2019-08-28	3	-143/+173
\| \| \| \| \| \| \| \| \|	This just pulls the MVEVPTBlockPass into a separate file, as opposed to being wrapped up in Thumb2ITBlockPass. Differential revision: https://reviews.llvm.org/D66579 llvm-svn: 370187
*	[MVE] VMOVX patterns	David Green	2019-08-28	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds fp16 VMOVX patterns, using the same patterns as rL362482 with some adjustments for MVE. It allows us to move fp16 registers without going into and out of gprs. VMOVX is able to move the top bits from a fp16 in a fp reg into the bottom bits of another register, zeroing the rest. This can be used for odd MVE register lanes. The top bits are not read by fp16 instructions, so no move is required there if we are dealing with even lanes. Differential revision: https://reviews.llvm.org/D66793 llvm-svn: 370184
*	[ARM][ParallelDSP] Change search for muls	Sam Parker	2019-08-28	1	-166/+185
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rL369567 reverted a couple of recent changes made to ARMParallelDSP because of a miscompilation error: PR43073. The issue stemmed from an underlying bug that was caused by adding muls into a reduction before it was proved that they could be executed in parallel with another mul. Most of the changes here are from the previously reverted commits. The additional changes that have been made are: 1) The Search function now doesn't insert any muls into the Reduction object. That now happens once the search has successfully finished. 2) For any muls added into the reduction but that weren't paired, we accumulate their values as an input into the smlad. Differential Revision: https://reviews.llvm.org/D66660 llvm-svn: 370171
*	[MC] Minor cleanup to MCFixup::Kind handling. NFC.	Sam Clegg	2019-08-23	3	-6/+6
\| \| \| \| \| \| \| \| \| \|	Prefer `MCFixupKind` where possible and add getTargetKind() to convert to `unsigned` when needed rather than scattering cast operators around the place. Differential Revision: https://reviews.llvm.org/D59890 llvm-svn: 369720
*	Reapply: [ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32	Sam Tebbs	2019-08-22	1	-6/+7
\| \| \| \| \| \| \| \| \|	The CodeGen/Thumb2/mve-vaddv.ll test needed to be amended to reflect the changes from the above patch. This reverts commit cd53ff6, reapplying 7c6b229. llvm-svn: 369638
*	Revert r369626 "[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32"	Hans Wennborg	2019-08-22	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It broke the bots, see e.g. http://lab.llvm.org:8011/builders/clang-cuda-build/builds/36275/ > This patch fixes shifts by a 128/256 bit shift amount. It also fixes > codegen for shifts of 32 by delegating to LLVM's default optimisation > instead of emitting a long shift. > > Tests that used to generate long shifts of 32 are updated to check for the > more optimised codegen. > > Differential revision: https://reviews.llvm.org/D66519 > > llvm-svn: 369626 llvm-svn: 369636
*	[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32	Sam Tebbs	2019-08-22	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes shifts by a 128/256 bit shift amount. It also fixes codegen for shifts of 32 by delegating to LLVM's default optimisation instead of emitting a long shift. Tests that used to generate long shifts of 32 are updated to check for the more optimised codegen. Differential revision: https://reviews.llvm.org/D66519 llvm-svn: 369626
*	[TargetLowering] Remove optional arguments passing to makeLibCall	Shiva Chen	2019-08-22	1	-5/+9
\| \| \| \| \| \| \| \| \| \|	The patch introduces the MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contains argument flags which will be passed to the makeLibCall function. The patch should not have any functional changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622
*	Revert r367389 (and follow-up r368404); it caused PR43073.	Nico Weber	2019-08-21	1	-50/+76
\| \| \| \|	llvm-svn: 369567
*	[ARM] Formatting for ARMInstrMVE.td. NFC	David Green	2019-08-21	1	-89/+98
\| \| \| \| \| \| \|	This is just some formatting cleanup, prior to the masked load and store patch in D66534. llvm-svn: 369545
*	[ARM] Select vaddva	Sam Tebbs	2019-08-20	1	-0/+7
\| \| \| \| \| \| \| \|	This patch adds vaddva selection. Differential revision: https://reviews.llvm.org/D66410 llvm-svn: 369404
*	[ARM] Add support for MVE vaddv	Sam Tebbs	2019-08-19	4	-0/+41
\| \| \| \| \| \| \| \| \|	This patch adds vecreduce_add and the relevant instruction selection for vaddv. Differential revision: https://reviews.llvm.org/D66085 llvm-svn: 369245
*	[ARM] MVE sext costs	David Green	2019-08-19	1	-0/+25
\| \| \| \| \| \| \| \| \|	This adds some sext costs for MVE, taken from the length of assembly sequences that we currently generate. Differential Revision: https://reviews.llvm.org/D66010 llvm-svn: 369244
*	Reland "[ARM] push LR before __gnu_mcount_nc"	Jian Cai	2019-08-16	5	-0/+90
\| \| \| \| \| \| \| \|	This relands r369147 with fixes to unit tests. https://reviews.llvm.org/D65019 llvm-svn: 369173
*	[ARM] Preserve liveness in ARMConstantIslands.	Eli Friedman	2019-08-16	1	-3/+18
\| \| \| \| \| \| \| \| \| \|	We currently don't use liveness information after this point, but it can be useful to catch bugs using -verify-machineinstrs, and optimizations could potentially use this information in the future. Differential Revision: https://reviews.llvm.org/D66319 llvm-svn: 369162
*	Revert "[ARM] push LR before __gnu_mcount_nc"	Jian Cai	2019-08-16	5	-90/+0
\| \| \| \| \| \|	This reverts commit f4cf3b959333f62b7a7b2d7771f7010c9d8da388. llvm-svn: 369149
*	[ARM] push LR before __gnu_mcount_nc	Jian Cai	2019-08-16	5	-0/+90
\| \| \| \| \| \| \| \| \|	Push LR register before calling __gnu_mcount_nc as it expects the value of LR register to be the top value of the stack on ARM32. Differential Revision: https://reviews.llvm.org/D65019 llvm-svn: 369147
*	[ARM] MVE sext of a load is free	David Green	2019-08-16	1	-0/+15
\| \| \| \| \| \| \| \| \|	MVE also has some sext of loads, which will be free just as scalar instructions are. Differential Revision: https://reviews.llvm.org/D66008 llvm-svn: 369118
*	[ARM] Correct register for narrowing and widening MVE loads and stores.	David Green	2019-08-16	3	-13/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The widening and narrowing MVE instructions like VLDRH.32 are only permitted to use low tGPR registers. This means that if they are used for a stack slot, where the register used is only decided during frame setup, we need to be able to correctly pick a thumb1 register over a normal GPR. This attempts to add the required logic into eliminateFrameIndex and rewriteT2FrameIndex, only picking the FrameReg if it is a valid register for the operands register class, and picking a valid scratch register for the register class. Differential Revision: https://reviews.llvm.org/D66285 llvm-svn: 369108
*	[ARM] Don't pretend we know how to generate MVE VLDn	David Green	2019-08-16	2	-1/+7
\| \| \| \| \| \| \| \| \| \|	We don't yet know how to generate these instructions for MVE. And in the case of VLD3, we don't even have the instruction. For the moment don't tell the vectoriser that we have VLD4, just to end up serialising the results. Differential Revision: https://reviews.llvm.org/D66009 llvm-svn: 369101
*	[ARM][LowOverheadLoops] Fix generated code for "revert".	Eli Friedman	2019-08-15	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two issues: 1. t2CMPri shouldn't use CPSR if it isn't predicated. This doesn't really have any visible effect at the moment, but it might matter in the future. 2. The t2CMPri generated for t2WhileLoopStart might need to use a register that isn't LR. My team found this because we have a patch to track register liveness late in the pass pipeline. I'll look into upstreaming it to help catch issues like this earlier. Differential Revision: https://reviews.llvm.org/D66243 llvm-svn: 369069
*	[SDAG] Minor code cleanup/standardization of atomic accessors [NFC]	Philip Reames	2019-08-15	1	-1/+1
\| \| \| \|	llvm-svn: 369057
*	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM	Daniel Sanders	2019-08-15	19	-257/+257
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. 
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
*	[llvm] Migrate llvm::make_unique to std::make_unique	Jonas Devlieghere	2019-08-15	6	-39/+39
\| \| \| \| \| \| \| \|	Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013
*	[ARM] Fix alignment checks for BE VLDRH	David Green	2019-08-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	We need to allow any alignment at least 2, not just exactly 2, so that the big endian loads and stores can be selected successfully. I've also added extra BE testing for the load and store tests. Thanks to Oliver for the report. Differential Revision: https://reviews.llvm.org/D66222 llvm-svn: 368996
*	[ARM] MVE predicate store patterns	David Green	2019-08-15	1	-0/+7
\| \| \| \| \| \| \| \| \|	Stack loads and stores were already working, but direct stores were not. This adds the patterns for them, same as predicate loads. Differential Revision: https://reviews.llvm.org/D66213 llvm-svn: 368988
*	[ARM] MVE trunc to i1 vectors	David Green	2019-08-15	1	-0/+7
\| \| \| \| \| \| \| \| \|	This adds patterns for selecting trunc instructions from full vectors to i1's vectors. Differential Revision: https://reviews.llvm.org/D66201 llvm-svn: 368981
*	[ARM] Add MVE beats vector cost model	David Green	2019-08-13	4	-21/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MVE architecture has the idea of "beats", where a vector instruction can be executed over several ticks of the architecture. This adds a similar system into the Arm backend cost model, multiplying the cost of all vector instructions by a factor. This factor essentially becomes the expected difference between scalar code and vector code, on average. MVE Vector instructions can also overlap so the true cost of them is often lower. But equally scalar instructions can in some situations be dual issued, or have other optimisations such as unrolling or make use of dsp instructions. The default is chosen as 2. This should not prevent vectorisation in most cases (as the vector instructions will still be doing at least 4 times the work), but it will help prevent over vectorising in cases where the benefits are less likely. This adds things so far to the obvious places in ARMTargetTransformInfo, and updates a few related costs like not treating float instructions as cost 2 just because they are floats. Differential Revision: https://reviews.llvm.org/D66005 llvm-svn: 368733
*	[ARM] Fix detection of duplicates when parsing reg list operands	Momchil Velikov	2019-08-13	1	-19/+43
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D65957 llvm-svn: 368712
*	[ARM] Fix encoding of APSR in CLRM instruction	Momchil Velikov	2019-08-13	2	-16/+7
\| \| \| \| \| \| \| \| \|	The APSR is encoded by setting bit 15 in the register list of the CLRM instruction (cf. https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf). Differential Revision: https://reviews.llvm.org/D65873 llvm-svn: 368711
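The encoding rule stated here (APSR is bit 15 of the register list) can be sketched as a mask builder. `clrmRegListMask` is a hypothetical helper, and the mapping of bit n to Rn for the general-purpose registers is an assumption of this sketch rather than something stated in the commit message.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical helper: builds a 16-bit CLRM-style register-list mask.
// Assumption: bit n stands for general-purpose register Rn. Bit 15 stands
// for APSR, as the commit message states.
uint16_t clrmRegListMask(std::initializer_list<unsigned> GPRs,
                         bool IncludeAPSR) {
  uint16_t Mask = 0;
  for (unsigned Reg : GPRs) {
    if (Reg < 15) // keep bit 15 reserved for APSR in this sketch
      Mask |= uint16_t(1u << Reg);
  }
  if (IncludeAPSR)
    Mask |= uint16_t(1u << 15);
  return Mask;
}
```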
*	GlobalISel: Change representation of shuffle masks	Matt Arsenault	2019-08-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704
*	[GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained.	Amara Emerson	2019-08-13	3	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652