bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[ARM][Thumb2] Fix ADD/SUB invalid writes to SP	Diogo Sampaio	2020-01-14	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80". The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable. To enforce that the instruction t2(ADD\|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP). Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations. When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant ) It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt). Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma, andreadb Reviewed By: efriedma Subscribers: gbedwell, john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70680
*	Reverting, broke some bots. Need further investigation.	Diogo Sampaio	2020-01-10	2	-5/+5
\| \| \| \| \| \| \| \|	Summary: This reverts commit 8c12769f3046029e2a9b4e48e1645b1a77d28650. Reviewers: Subscribers:
*	[ARM][Thumb2] Fix ADD/SUB invalid writes to SP	Diogo Sampaio	2020-01-10	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80". The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable. To enforce that the instruction t2(ADD\|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP). Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations. When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant ) It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt). Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma Reviewed By: efriedma Subscribers: john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70680
*	[GlobalISel]: Allow targets to override how to widen constants during ↵	Aditya Nandakumar	2019-12-03	2	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true which would end up becoming -1 which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now signextend for byte sized types and zero extend otherwise (by default). Targets can of course override this if necessary.
*	[globalisel] Rename G_GEP to G_PTR_ADD	Daniel Sanders	2019-11-05	8	-37/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734
*	[ARM] VFPv2 only supports 16 D registers.	Eli Friedman	2019-09-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	r361845 changed the way we handle "D16" vs. "D32" targets; there used to be a negative "d16" which removed instructions from the instruction set, and now there's a "d32" feature which adds instructions to the instruction set. This is good, but there was an oversight in the implementation: the behavior of VFPv2 was changed. In particular, the "vfp2" feature was changed to imply "d32". This is wrong: VFPv2 only supports 16 D registers. In practice, this means if you specify -mfpu=vfpv2, the compiler will generate illegal instructions. This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2" and "vfp2sp" so they don't imply "d32". Differential Revision: https://reviews.llvm.org/D67375 llvm-svn: 372186
*	[GlobalISel] CSEMIRBuilder: Add support for G_GEP	Volkan Keles	2019-08-15	2	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors `translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types. Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66316 llvm-svn: 369070
*	GlobalISel: Add more verifier checks for G_SHUFFLE_VECTOR	Matt Arsenault	2019-08-13	1	-0/+11
\| \| \| \|	llvm-svn: 368705
*	GlobalISel: Change representation of shuffle masks	Matt Arsenault	2019-08-13	1	-9/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704
*	[globalisel] Add G_SEXT_INREG	Daniel Sanders	2019-08-09	2	-29/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487
*	[GlobalISel] Accept multiple vregs for lowerCall's args	Diana Picus	2019-06-27	1	-43/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512
*	[GlobalISel] Accept multiple vregs for lowerCall's result	Diana Picus	2019-06-27	1	-33/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511
*	[GlobalISel] Accept multiple vregs in lowerFormalArgs	Diana Picus	2019-06-27	1	-54/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510
*	[ARM GlobalISel] Tests for s64 G_ADD and G_SUB.	Eli Friedman	2019-06-20	2	-0/+121
\| \| \| \| \| \|	Forgot to commit these in r363989 (https://reviews.llvm.org/D63585) llvm-svn: 363991
*	Rename ExpandISelPseudo->FinalizeISel, delay register reservation	Matt Arsenault	2019-06-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757
*	[ARM] Replace fp-only-sp and d16 with fp64 and d32.	Simon Tatham	2019-05-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845
*	[ARM GlobalISel] Un-XFAIL some tests. NFC	Diana Picus	2019-05-27	2	-2/+0
\| \| \| \| \| \| \|	It turns out we support big endian now (probably since r332449, but I haven't bisected to confirm). llvm-svn: 361756
*	[IRTranslator] Don't hardcode GEP index type	Diana Picus	2019-05-14	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When breaking up loads and stores of aggregates, the IRTranslator uses LLT::scalar(64) for the index type of the G_GEP instructions that compute the addresses. This is unnecessarily large for 32-bit targets. Use the int ptr type provided by the DataLayout instead. Note that we're already doing the right thing when translating getelementptr instructions from the IR. This is just an oversight when generating new ones while translating loads/stores. Both x86 and AArch64 already have tests confirming that the old behaviour is preserved for 64-bit targets. Differential Revision: https://reviews.llvm.org/D61852 llvm-svn: 360656
*	[ARM GlobalISel] Map DBG_VALUE for types != s32	Diana Picus	2019-05-09	1	-15/+31
\| \| \| \| \| \| \| \|	...and make sure we fail elegantly for unsupported values. s64 goes into DPR, anything <= 32 into GPR. llvm-svn: 360321
*	ARM: disallow SP as Rn for Thumb2 TST & TEQ instructions	Tim Northover	2019-05-08	2	-4/+3
\| \| \| \| \| \| \| \|	Using SP in this position is unpredictable in ARMv7. CMP and CMN are not affected, and of course v8 relaxes this requirement, but that's handled elsewhere. llvm-svn: 360242
*	[ARM GlobalISel] Widen G_SELECT operands	Diana Picus	2019-05-07	1	-0/+71
\| \| \| \| \| \|	...except for the condition operand. llvm-svn: 360135
*	[ARM GlobalISel] Widen G_INTTOPTR/G_PTRTOINT	Diana Picus	2019-05-07	1	-0/+60
\| \| \| \| \| \| \|	We actually have a couple of G_PTRTOINT to s8 when building clang, so we should do something about them. llvm-svn: 360130
*	[ARM GlobalISel] Widen G_GEP index operand	Diana Picus	2019-05-07	1	-3/+31
\| \| \| \|	llvm-svn: 360127
*	[ARM GlobalISel] Select extensions to < 32 bits	Diana Picus	2019-05-02	2	-0/+646
\| \| \| \| \| \| \| \| \|	Select G_SEXT and G_ZEXT with destination types smaller than 32 bits in the exact same way as 32 bits. This overwrites the higher bits, but that should be ok since all legal users of types smaller than 32 bits ignore those bits anyway. llvm-svn: 359768
*	[ARM GlobalISel] Rename some inst selector tests. NFC	Diana Picus	2019-05-02	2	-45/+45
\| \| \| \| \| \|	Prepare to add support for extensions to types smaller than 32 bits. llvm-svn: 359767
*	[ARM GlobalISel] Legalize extensions to < 32 bits	Diana Picus	2019-05-02	1	-9/+131
\| \| \| \| \| \|	Make it legal to extend from e.g. s1 to s8 or s16. llvm-svn: 359766
*	[ARM GlobalISel] Widen small shift operands	Diana Picus	2019-04-30	1	-0/+121
\| \| \| \| \| \| \|	The legalizer was already widening the shift amount. Add tests for that behaviour, and also support widening the shifted value. llvm-svn: 359542
*	[ARM GlobalISel] Be more careful about bailing out	Diana Picus	2019-04-30	1	-0/+14
\| \| \| \| \| \| \| \| \|	Bail out on function arguments/returns with types aggregating an unsupported type. This fixes cases where we would happily and incorrectly lower functions taking e.g. [1 x i64] parameters, when we don't even support plain i64 yet. llvm-svn: 359540
*	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with ↵	Amara Emerson	2019-04-15	4	-26/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369
*	[ARM GlobalISel] Select G_FCONSTANT for VFP3	Diana Picus	2019-04-10	1	-12/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make it possible to TableGen code for FCONSTS and FCONSTD. We need to make two changes to the TableGen descriptions of vfp_f32imm and vfp_f64imm respectively: * add GISelPredicateCode to check that the immediate fits in 8 bits; * extract the SDNodeXForms into separate definitions and create a GISDNodeXFormEquiv and a custom renderer function for each of them. There's a lot of boilerplate to get the actual value of the immediate, but it basically just boils down to calling ARM_AM::getFP32Imm or ARM_AM::getFP64Imm. llvm-svn: 358063
*	[ARM GlobalISel] Select G_FCONSTANT into pools	Diana Picus	2019-04-10	1	-0/+135
\| \| \| \| \| \| \|	Put all floating point constants into constant pools and load their values from there. llvm-svn: 358062
*	[ARM GlobalISel] Map G_FCONSTANT	Diana Picus	2019-04-10	1	-0/+39
\| \| \| \|	llvm-svn: 358061
*	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally	Amara Emerson	2019-04-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032
*	[ARM GlobalISel] Support DBG_VALUE	Diana Picus	2019-04-04	2	-0/+113
\| \| \| \| \| \|	Make sure we can map and select DBG_VALUE. llvm-svn: 357681
*	[ARM GlobalISel] Run regbankselect test for Thumb. NFCI	Diana Picus	2019-03-28	1	-1/+2
\| \| \| \| \| \| \|	This should just work, since ARM mode and Thumb2 mode are at the same level of support now and should map the same to GPR and FPR. llvm-svn: 357159
*	[ARM GlobalISel] Fix G_STORE with s1	Diana Picus	2019-03-28	2	-7/+44
\| \| \| \| \| \| \|	G_STORE for 1-bit values uses a STRBi12, which stores the whole byte. Zero out the undefined bits before writing. llvm-svn: 357154
*	[ARM GlobalISel] Fix selection of G_SELECT	Diana Picus	2019-03-28	3	-6/+6
\| \| \| \| \| \| \| \| \| \| \|	G_SELECT uses a 1-bit scalar for the condition, and is currently implemented with a plain CMPri against 0. This means that values such as 0x1110 are interpreted as true, when instead the higher bits should be treated as undefined and therefore ignored. Replace the CMPri with a TSTri against 0x1, which performs an implicit AND, yielding the expected result. llvm-svn: 357153
*	[ARM GlobalISel] 64-bit memops should be aligned	Diana Picus	2019-03-25	2	-31/+89
\| \| \| \| \| \| \| \| \| \|	We currently use only VLDR/VSTR for all 64-bit loads/stores, so the memory operands must be word-aligned. Mark aligned operations as legal and narrow non-aligned ones to 32 bits. While we're here, also mark non-power-of-2 loads/stores as unsupported. llvm-svn: 356872
*	[ARM GlobalISel] Support G_CTLZ for Thumb2	Diana Picus	2019-03-01	2	-0/+34
\| \| \| \| \| \|	Same as ARM mode but with different opcode. llvm-svn: 355191
*	[ARM GlobalISel] Check target flags in test. NFCI	Diana Picus	2019-03-01	2	-12/+12
\| \| \| \| \| \| \| \|	There was a time when we couldn't dump target-specific flags such as arm-sbrel etc, so the tests didn't check for them. We can now be more specific in our tests. llvm-svn: 355189
*	[ARM GlobalISel] Support global variables for Thumb2	Diana Picus	2019-02-28	8	-26/+660
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the same level of support as for ARM mode (i.e. still no TLS support). In most cases, it is sufficient to replace the opcodes with the t2-equivalent, but there are some idiosyncrasies that I decided to preserve because I don't understand the full implications: * For ARM we use LDRi12 to load from constant pools, but for Thumb we use t2LDRpci (I'm not sure if the ideal would be to use t2LDRi12 for Thumb as well, or to use LDRcp for ARM). * For Thumb we don't have an equivalent for MOV\|LDRLIT_ga_pcrel_ldr, so we have to generate MOV\|LDRLIT_ga_pcrel plus a load from GOT. The tests are in separate files because they're hard enough to read even without doubling the number of checks. llvm-svn: 355077
*	[ARM GlobalISel] Support floating point for Thumb2	Diana Picus	2019-02-22	3	-968/+996
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is exactly the same as arm mode, so for the instruction selector tests we just extract them to a new file and run with the same checks for both arm and thumb mode. For the legalizer we need to update the tests for soft float a bit, but only because BL and tBL are slightly different. We could be pedantic and check that we get a well-formed BL for arm mode and a tBL for thumb, but for the purposes of the legalizer test it's sufficient to just skip over the predicate operands in the checks. Also note that we have the pedantic checks in the divmod test, so we're covered. llvm-svn: 354665
*	[ARM GlobalISel] Support G_FRAME_INDEX for Thumb2	Diana Picus	2019-02-21	3	-31/+80
\| \| \| \| \| \|	Same as arm mode. llvm-svn: 354579
*	[ARM GlobalISel] Support G_PHI for Thumb2	Diana Picus	2019-02-19	3	-131/+181
\| \| \| \| \| \|	Same as arm mode. llvm-svn: 354310
*	[ARM GlobalISel] Support branches for Thumb2	Diana Picus	2019-02-15	3	-35/+83
\| \| \| \| \| \|	Just like arm mode, but with different opcodes. llvm-svn: 354113
*	[ARM GlobalISel] Support G_SELECT for Thumb2	Diana Picus	2019-02-13	3	-55/+137
\| \| \| \| \| \|	Same as arm mode, but slightly different opcodes. llvm-svn: 353938
*	[ARM GlobalISel] Support G_ICMP for Thumb2	Diana Picus	2019-02-07	3	-92/+436
\| \| \| \| \| \| \|	Mark as legal and use the t2* equivalents of the arm mode instructions, e.g. t2CMPrr instead of plain CMPrr. llvm-svn: 353392
*	[ARM GlobalISel] Support G_GEP for Thumb2	Diana Picus	2019-02-05	3	-27/+59
\| \| \| \| \| \|	Same as ARM, but use a different opcode in the instruction selection. llvm-svn: 353151
*	GlobalISel: Enforce operand types for constants	Matt Arsenault	2019-02-04	4	-10/+10
\| \| \| \| \| \| \| \|	A number of of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113
*	GlobalISel: Fix creating MMOs with align 0	Matt Arsenault	2019-01-31	1	-3/+3
\| \| \| \|	llvm-svn: 352712