summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/ARM
Commit message (Collapse)AuthorAgeFilesLines
* [ARM GlobalISel] Allow calls to varargs functionsDiana Picus2019-01-172-7/+86
| | | | | | | | | Allow varargs functions to be called, both in arm and thumb mode. This boils down to choosing the correct calling convention, which we can easily test by making sure arm_aapcscc is used instead of arm_aapcs_vfpcc when the callee is variadic. llvm-svn: 351424
* [DAGCombine] Fix ReduceLoadWidth for shifted offsetsSam Parker2019-01-161-0/+36
| | | | | | | | | | | | ReduceLoadWidth can trigger using a shifted mask is used and this requires that the function return a shl node to correct for the offset. However, the way that this was implemented meant that the returned result could be an existing node, which would be incorrect. This fixes the method of inserting the new node and replacing uses. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 351310
* Remove irrelevant references to legacy git repositories fromJames Y Knight2019-01-152-3/+3
| | | | | | | | | compiler identification lines in test-cases. (Doing so only because it's then easier to search for references which are actually important and need fixing.) llvm-svn: 351200
* [ARM GlobalISel] Import MOVi32imm into GlobalISelDiana Picus2019-01-141-0/+21
| | | | | | | | | | | | | | | | Make it possible for TableGen to produce code for selecting MOVi32imm. This allows reasonably recent ARM targets to select a lot more constants than before. We achieve this by adding GISelPredicateCode to arm_i32imm. It's impossible to use the exact same code for both DAGISel and GlobalISel, since one uses "Subtarget->" and the other "STI." to refer to the subtarget. Moreover, in GlobalISel we don't have ready access to the MachineFunction, so we need to add a bit of code for obtaining it from the instruction that we're selecting. This is also the reason why it needs to remain a PatLeaf instead of the more specific IntImmLeaf. llvm-svn: 351056
* Replace "no-frame-pointer-*" function attributes with "frame-pointer"Francis Visoiu Mistrih2019-01-1428-108/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part of the effort to refactoring frame pointer code generation. We used to use two function attributes "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" to represent three kinds of frame pointer usage: (all) frames use frame pointer, (non-leaf) frames use frame pointer, (none) frame use frame pointer. This CL makes the idea explicit by using only one enum function attribute "frame-pointer" Option "-frame-pointer=" replaces "-disable-fp-elim" for tools such as llc. "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" are still supported for easy migration to "frame-pointer". tests are mostly updated with // replace command line args ‘-disable-fp-elim=false’ with ‘-frame-pointer=none’ grep -iIrnl '\-disable-fp-elim=false' * | xargs sed -i '' -e "s/-disable-fp-elim=false/-frame-pointer=none/g" // replace command line args ‘-disable-fp-elim’ with ‘-frame-pointer=all’ grep -iIrnl '\-disable-fp-elim' * | xargs sed -i '' -e "s/-disable-fp-elim/-frame-pointer=all/g" Patch by Yuanfang Chen (tabloid.adroit)! Differential Revision: https://reviews.llvm.org/D56351 llvm-svn: 351049
* [AArch64] Create feature set for Exynos M4Evandro Menezes2019-01-111-1/+1
| | | | | | Complete the feature set for Exynos M4 and update test cases. llvm-svn: 350953
* [ARM] Add missing patterns for DSP mulsSam Parker2019-01-081-32/+149
| | | | | | | | | | | Using a PatLeaf for sext_16_node allowed matching smulbb and smlabb instructions once the operands had been sign extended. But we also need to use sext_inreg operands along with sext_16_node to catch a few more cases that enable use to remove the unnecessary sxth. Differential Revision: https://reviews.llvm.org/D55992 llvm-svn: 350613
* [ARM] ComputeKnownBits to handle extract vectorsDiogo N. Sampaio2019-01-071-23/+101
| | | | | | | | | This patch adds the sign/zero extension done by vgetlane to ARM computeKnownBitsForTargetNode. Differential revision: https://reviews.llvm.org/D56098 llvm-svn: 350553
* Regenerate test.Simon Pilgrim2019-01-071-13/+44
| | | | | | Prep work towards enabling SimplifyDemandedBits vector support for TRUNCATE as discussed on D56118. llvm-svn: 350514
* [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR.Florian Hahn2018-12-211-0/+61
| | | | | | | | | | | | Fixes PR35023. Reviewers: MatzeB, t.p.northover, sunfish, qcolombet, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55909 llvm-svn: 349935
* [ARM GlobalISel] Support G_CONSTANT for Thumb2Diana Picus2018-12-196-179/+603
| | | | | | | | | | | All we have to do is mark it as legal. This allows us to select a lot of new patterns handled by TableGen. This patch adds tests for them and splits up the existing test file for binary operators into 2 files, one for arithmetic ops and one for logical ones. llvm-svn: 349610
* ARM: use acquire/release instruction variants when available.Tim Northover2018-12-172-8/+153
| | | | | | | | These features (fairly) recently got split out into their own feature, so we should make CodeGen use them when available. The main change here is that the check used to be based on the triple, but now it's based on CPU features. llvm-svn: 349355
* [DAGCombiner] allow hoisting vector bitwise logic ahead of truncatesSanjay Patel2018-12-161-2/+1
| | | | | | | | | | | | | | | | | | The transform performs a bitwise logic op in a wider type followed by truncate when both inputs are truncated from the same source type: logic_op (truncate x), (truncate y) --> truncate (logic_op x, y) There are a bunch of other checks that should prevent doing this when it might be harmful. We already do this transform for scalars in this spot. The vector limitation was shared with a check for the case when the operands are extended. I'm not sure if that limit is needed either, but that would be a separate patch. Differential Revision: https://reviews.llvm.org/D55448 llvm-svn: 349303
* [ARM] make test immune to scalarization improvements; NFCSanjay Patel2018-12-141-2/+2
| | | | llvm-svn: 349177
* [ARM GlobalISel] Thumb2: casts between int and ptrDiana Picus2018-12-143-47/+101
| | | | | | Mark as legal and add tests. Nothing special to do. llvm-svn: 349147
* [ARM GlobalISel] Remove duplicate test. NFCIDiana Picus2018-12-141-51/+0
| | | | | | | Fixup for r349026. I forgot to delete these test functions from the original file when I moved them to arm-legalize-exts.mir. llvm-svn: 349146
* [ARM GlobalISel] Allow simple binary ops in Thumb2Diana Picus2018-12-143-558/+696
| | | | | | | | | | | Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM and Thumb2. Extract the legalizer tests for these opcodes into another file. Add tests for the instruction selector. llvm-svn: 349142
* [ARM GlobalISel] Support exts and truncs for Thumb2Diana Picus2018-12-132-0/+367
| | | | | | | | | | | Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for them in the instruction selector. This uses handwritten code again because the patterns that are generated with TableGen are tuned for what the DAG combiner would produce and not for simple sext/zext nodes. Luckily, we only need to update the opcodes to use the Thumb2 variants, everything else can be reused from ARM. llvm-svn: 349026
* [CodeGen] Allow mempcy/memset to generate small overlapping stores.Clement Courbet2018-12-132-10/+11
| | | | | | | | | | | | | Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 349016
* [ARM GlobalISel] Select load/store for Thumb2Diana Picus2018-12-123-24/+137
| | | | | | | | | | | | Unfortunately we can't use TableGen for this because it doesn't yet support predicates on the source pattern root. Therefore, add a bit of handwritten code to the instruction selector to handle the most basic cases. Also mark them as legal and extract their legalizer test cases to a new test file. llvm-svn: 348920
* [GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes.Amara Emerson2018-12-101-6/+6
| | | | | | | | | | | | This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788
* ARM: use correct offset from base pointer (r6) in call frame regions.Tim Northover2018-12-072-2/+55
| | | | | | | | | When we had dynamic call frames (i.e. sp adjustment around each call) we were including that adjustment into offsets calculated based on r6, even though it's only sp that changes. This led to incorrect stack slot accesses. llvm-svn: 348591
* [Targets] Add errors for tiny and kernel codemodel on targets that don't ↵David Green2018-12-071-0/+9
| | | | | | | | | | | support them Adds fatal errors for any target that does not support the Tiny or Kernel codemodels by rejigging the getEffectiveCodeModel calls. Differential Revision: https://reviews.llvm.org/D50141 llvm-svn: 348585
* [ARM][NFC] Adding another test for armcgpSam Parker2018-12-061-0/+21
| | | | llvm-svn: 348489
* [ARM][NFC] Added extra arm-cgp testSam Parker2018-12-061-0/+36
| | | | llvm-svn: 348482
* [ARM GlobalISel] Implement call lowering for Thumb2Diana Picus2018-12-052-32/+63
| | | | | | | | The only things that are different from arm are: * different opcodes for calls and returns * Thumb calls take predicate operands llvm-svn: 348347
* ARM: use target-specific SUBS node when combining cmp with cmov.Tim Northover2018-12-033-19/+20
| | | | | | | | | | | This has two positive effects. First, using a custom node prevents recombination leading to an infinite loop since the output DAG is notionally a little more complex than the input one. Using a flag-setting instruction also allows the subtraction to be folded with the related comparison more easily. https://reviews.llvm.org/D53190 llvm-svn: 348122
* [ARM] FP16: support vld1.16 for vector loads with post-incrementSjoerd Meijer2018-12-031-0/+48
| | | | | | Differential Revision: https://reviews.llvm.org/D55112 llvm-svn: 348110
* [ARM] Don't expand sdiv when optimising for minsizeSjoerd Meijer2018-11-302-0/+184
| | | | | | | | | | | | | | | Don't expand SDIV with an immediate that is a power of 2 if we optimise for minimum code size. For example: sdiv %1, i32 4 gets expanded to a sequence of 3 instructions, but this is suboptimal for minimum code size so instead we just generate a MOV and a SDIV if integer division is supported. Differential Revision: https://reviews.llvm.org/D54546 llvm-svn: 347965
* [LegalizeVectorTypes][X86][ARM][AArch64][PowerPC] Don't use ↵Craig Topper2018-11-261-5/+5
| | | | | | | | | | | | SplitVecOp_TruncateHelper for FP_TO_SINT/UINT. SplitVecOp_TruncateHelper tries to promote the result type while splitting FP_TO_SINT/UINT. It then concatenates the result and introduces a truncate to the original result type. But it does this without inserting the AssertZExt/AssertSExt that the regular result type promotion would insert. Nor does it turn FP_TO_UINT into FP_TO_SINT the way normal result type promotion for these operations does. This is bad on X86 which doesn't support FP_TO_SINT until AVX512. This patch disables the use of SplitVecOp_TruncateHelper for these operations and just lets normal promotion handle it. I've tweaked a couple things in X86ISelLowering to avoid a few obvious regressions there. I believe all the changes on X86 are improvements. The other targets look neutral. Differential Revision: https://reviews.llvm.org/D54906 llvm-svn: 347593
* [ARM GlobalISel] Support G_CTLZ and G_CTLZ_ZERO_UNDEFDiana Picus2018-11-262-0/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can now select CLZ via the TableGen'erated code, so support G_CTLZ and G_CTLZ_ZERO_UNDEF throughout the pipeline for types <= s32. Legalizer: If the CLZ instruction is available, use it for both G_CTLZ and G_CTLZ_ZERO_UNDEF. Otherwise, use a libcall for G_CTLZ_ZERO_UNDEF and lower G_CTLZ in terms of it. In order to achieve this we need to add support to the LegalizerHelper for the legalization of G_CTLZ_ZERO_UNDEF for s32 as a libcall (__clzsi2). We also need to allow lowering of G_CTLZ in terms of G_CTLZ_ZERO_UNDEF if that is supported as a libcall, as opposed to just if it is Legal or Custom. Due to a minor refactoring of the helper function in charge of this, we will also allow the same behaviour for G_CTTZ and G_CTPOP. This is not going to be a problem in practice since we don't yet have support for treating G_CTTZ and G_CTPOP as libcalls (not even in DAGISel). Reg bank select: Map G_CTLZ to GPR. G_CTLZ_ZERO_UNDEF should not make it to this point. Instruction select: Nothing to do. llvm-svn: 347545
* [ARM] Prevent parallel macs for unsigned valuesSam Parker2018-11-263-0/+207
| | | | | | | | | | | | Both zext and sext are currently allowed during the search for narrow sequences and sexts operands are later added to the mac candidates. But operands of muls are also added, without checking whether they're sext or zext, which means we can generate a signed smlad when we shouldn't. Differential Revision: https://reviews.llvm.org/D54790 llvm-svn: 347542
* [ARM][NFC] codegen tests cleanup: remove dangling check prefixesSjoerd Meijer2018-11-237-24/+24
| | | | | | | | | | | | | | | | | | I am working on making FileCheck stricter (in D54769 and D53710) so that it issues diagnostics when there's something wrong with tests. This is a cleanup for dangling prefixes in the ARM codegen tests, e.g.: --check-prefixes=A,B where A occurs in the check file, but B doesn't. This can be innocent if A does all the required checking, but can also be a bug in that test if it results in the test actually not checking anything (if A for example only checks a common label). Test CodeGen/ARM/smml.ll is such an example. Differential Revision: https://reviews.llvm.org/D54842 llvm-svn: 347487
* [LegalizeVectorTypes] Don't use SplitVecOp_TruncateHelper if we're heading ↵Craig Topper2018-11-231-10/+3
| | | | | | | | | | | | towards scalarizing the type. This code takes a truncate, fp_to_int, or int_to_fp with a legal result type and an input type that needs to be split and enlarges the elements in the result type before doing the split. Then inserts a follow up truncate or fp_round after concatenating the two halves back together. But if the input type of the original op is being split on its way to ultimately being scalarized we're just going to end up building a vector from scalars and then truncating or rounding it in the vector register. Seems kind of silly to enlarge the result element type of the operation only to end up with scalar code and then building a vector with large elements only to make the elements smaller again in the vector register. Seems better to just try to get away producing smaller result types in the scalarized code. The X86 test case that changes is a pretty contrived test case that exists because of a bug we used to have in our AVG matching code. I think the code is better now, but its not realistic anyway. llvm-svn: 347482
* [DAGCombiner] form 'not' ops ahead of shifts (PR39657)Sanjay Patel2018-11-221-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'), but that would not matter without this patch because DAGCombiner was reversing that transform. I think we need this transform in the backend regardless of what happens in IR to catch cases where the shift-xor is formed late from GEP or other ops. https://rise4fun.com/Alive/NC1 Name: shl Pre: (-1 << C2) == C1 %shl = shl i8 %x, C2 %r = xor i8 %shl, C1 => %not = xor i8 %x, -1 %r = shl i8 %not, C2 Name: shr Pre: (-1 u>> C2) == C1 %sh = lshr i8 %x, C2 %r = xor i8 %sh, C1 => %not = xor i8 %x, -1 %r = lshr i8 %not, C2 https://bugs.llvm.org/show_bug.cgi?id=39657 llvm-svn: 347478
* [ARM GlobalISel] Add test for BFC. NFCIDiana Picus2018-11-221-0/+57
| | | | | | | | | r334871 has made it possible for TableGen'erated code to select BFC, but it has not added a test for it on the ARM side. Add it now to make sure we don't introduce regressions if we ever change anything about that rule. llvm-svn: 347447
* [ARM] Attempt to fix arm selfhost bots after rL347191Sam Parker2018-11-191-2/+2
| | | | llvm-svn: 347238
* [ARM] Remove trunc sinks in ARM CGPSam Parker2018-11-194-107/+231
| | | | | | | | | | | | | | | | | | | | | | | | | | Truncs are treated as sources if their produce a value of the same type as the one we currently trying to promote. Truncs used to be considered as a sink if their operand was the same value type. We now allow smaller types in the search, so we should search through truncs that produce a smaller value. These truncs can then be converted to an AND mask. This leaves sinks as being: - points where the value in the register is being observed, such as an icmp, switch or store. - points where value types have to match, such as calls and returns. - zext are included to ease the transformation and are generally removed later on. During this change, it also became apart from truncating sinks was broken: if a sink used a source, its type information had already been lost by the time the truncation happens. So I've changed the method of caching the type information. Differential Revision: https://reviews.llvm.org/D54515 llvm-svn: 347191
* [ARM] make test immune to improvements in undef simplificationSanjay Patel2018-11-181-2/+2
| | | | llvm-svn: 347164
* DAG combiner: fold (select, C, X, undef) -> XStanislav Mekhanoshin2018-11-161-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D54646 llvm-svn: 347110
* [CodeGen] Fix forward scan in MachineBasicBlock::computeRegisterLiveness.Eli Friedman2018-11-141-0/+33
| | | | | | | | | | The scan was incorrectly skipping the first instruction, so a register could appear to be dead when it was actually live. This eventually leads to a machine verifier failure and miscompile in arm-ldst-opt. Differential Revision: https://reviews.llvm.org/D54491 llvm-svn: 346821
* [DAGCombiner] Fix load-store forwarding of indexed loads.Nirav Dave2018-11-121-0/+33
| | | | | | | | | | | | | | | | Summary: Handle extra output from index loads in cases where we wish to forward a load value directly from a preceeding store. Fixes PR39571. Reviewers: peter.smith, rengolin Subscribers: javed.absar, hiraditya, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54265 llvm-svn: 346654
* [ARM] Add MemOperand to LDRcp to enable DCE.Eli Friedman2018-11-091-0/+56
| | | | | | | | | | | | LDRcp should be deleted when the dest register is dead in register coalescing. Without MemOp, dead LDRcp will cause dead constant pool value which references to non-existing label. Patch by Yin Ma. Differential Revision: https://reviews.llvm.org/D54173 llvm-svn: 346563
* [ARM] Don't promote i1 types in ARM CGPSam Parker2018-11-092-0/+44
| | | | | | | | | Now that we have mixed type sizes, i1 values need to be explicitly handled as we want to avoid promoting these values. Differential Revision: https://reviews.llvm.org/D54308 llvm-svn: 346499
* [ARM] Enable mixed types in ARM CGPSam Parker2018-11-093-9/+225
| | | | | | | | | | | | | | | | | | | | | | Previously, during the search, all values had to have the same 'TypeSize', which is equal to number of bits of the integer type of the icmp operand. All values in the tree had to match this size; meaning that, if we searched from i16, we wouldn't accept i8s. A change in type size requires zext and truncs to perform the casts so, to allow mixed narrow types, the handling of these instructions is now slightly different: - we allow casts if their result or operand is <= TypeSize. - zexts are sinks if their result > TypeSize. - truncs are still sinks if their operand == TypeSize. - truncs are still sources if their result == TypeSize. The transformation bails on finding an icmp that operates on data smaller than the current TypeSize. Differential Revision: https://reviews.llvm.org/D54108 llvm-svn: 346480
* [DAGCombine] Improve alias analysis for chain of independent stores.Nirav Dave2018-11-082-96/+97
| | | | | | | | | | | | | | | | | | | FindBetterNeighborChains simulateanously improves the chain dependencies of a chain of related stores avoiding the generation of extra token factors. For chains longer than the GatherAllAliasDepths, stores further down in the chain will necessarily fail, a potentially significant waste and preventing otherwise trivial parallelization. This patch directly parallelize the chains of stores before improving each store. This generally improves DAG-level parallelism. Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53552 llvm-svn: 346432
* [ARM] Fix CPSR liveness in tMOVCCr_pseudo lowering.Eli Friedman2018-11-071-3/+223
| | | | | | | | | | | | | | | | | | | | | The lowering was missing live-ins in certain cases, like a sequence of multiple tMOVCCr_pseudo instructions. This would lead to a verifier failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered. For reasons I don't completely understand, it's hard to get a sequence of multiple tMOVCCr_pseudo instructions; the issue only seems to show up with 64-bit comparisons where the result is zero-extended. I added some extra testcases in case that changes in the future. Probably some optimization opportunities here if anyone is interested. (@test_slt_not is the case that was getting miscompiled.) The code to check the liveness of CPSR was stolen from X86ISelLowering.cpp; maybe it could be refactored into common helper, but I have no idea where to put it. Differential Revision: https://reviews.llvm.org/D54192 llvm-svn: 346355
* [NFC][ARM] Adding extra test for ARM CGPSam Parker2018-11-051-0/+14
| | | | | | Added a reproducer that I received a while ago. llvm-svn: 346132
* [ARM] Turn assert into condition in ARMCGPSam Parker2018-11-051-0/+20
| | | | | | | | | Turn the assert in PrepareConstants into a conditon so that we can handle mul instructions with negative immediates. Differential Revision: https://reviews.llvm.org/D54094 llvm-svn: 346126
* [ARM][ARMCGP] Remove unecessary zexts and truncsSam Parker2018-11-052-5/+29
| | | | | | | | | | | | | | | | | | | r345840 slightly changed the way promotion happens which could result in zext and truncs having the same source and destination types. This fixes that issue. We can now also remove the zext and trunc in the following case: (zext (trunc (promoted op)), i32) This means that we can no longer treat a value, that is only used by a sink, to be safe to promote. I've also added in some extra asserts and replaced a cast for a dyn_cast. Differential Revision: https://reviews.llvm.org/D54032 llvm-svn: 346125
OpenPOWER on IntegriCloud