summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/ARM
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"Reid Kleckner2016-08-151-1/+1
| | | | | | | | | This reverts commit r278660. It causes downstream assertion failure in InstCombine on shuffle instructions. Comes up in __mm_swizzle_epi32. llvm-svn: 278672
* [SimplifyCFG] Rewrite SinkThenElseCodeToEndJames Molloy2016-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return *b += 3; else return *b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 278660
* [LSR] Don't try and create post-inc expressions on non-rotated loopsJames Molloy2016-08-151-1/+2
| | | | | | | | | | | | | | | If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs! Motivating testcase: void f(float *a, float *b, float *c, int n) { while (n-- > 0) *c++ = *a++ + *b++; } It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage. llvm-svn: 278658
* Reapply [BranchFolding] Restrict tail merging loop blocks after MBPHaicheng Wu2016-08-121-1/+1
| | | | | | | | | Fixed a bug in the test case. To fix PR28104, this patch restricts tail merging to blocks that belong to the same loop after MBP. llvm-svn: 278575
* Revert "[BranchFolding] Restrict tail merging loop blocks after MBP"Haicheng Wu2016-08-121-1/+1
| | | | | | This reverts commit r278463 because it hits the bot. llvm-svn: 278484
* Recommit 'Remove the restriction that MachineSinking is now stopped byWei Mi2016-08-121-1/+3
| | | | | | | | | | | | | | | | "insert_subreg, subreg_to_reg, and reg_sequence" instructions' after adjusting some unittest checks. This is to solve PR28852. The restriction was added at 2010 to make better register coalescing. We assumed that it was not necessary any more. Testing results on x86 supported the assumption. We will look closely to any performance impact it will bring and will be prepared to help analyzing performance problem found on other architectures. Differential Revision: https://reviews.llvm.org/D23210 llvm-svn: 278466
* [BranchFolding] Restrict tail merging loop blocks after MBPHaicheng Wu2016-08-121-1/+1
| | | | | | | | | To fix PR28014, this patch restricts tail merging to blocks that belong to the same loop after MBP. Differential Revision: https://reviews.llvm.org/D23191 llvm-svn: 278463
* Codegen: Tail Merge: Be less aggressive with special cases.Kyle Butt2016-08-101-2/+2
| | | | | | | | | | | | This change makes it possible for tail-duplication and tail-merging to be disjoint. By being less aggressive when merging during layout, there are no overlapping cases between tail-duplication and tail-merging, provided the thresholds are disjoint. There is a remaining TODO to benchmark the succ_size() test for non-layout tail merging. llvm-svn: 278265
* [ARM] Improve sxta{b|h} and uxta{b|h} testsSam Parker2016-08-102-15/+239
| | | | | | | | | | | | | Created a Thumb2 predicated pattern matcher that uses Thumb2 and HasT2ExtractPack and used it to redefine the patterns for sxta{b|h} and uxta{b|h}. Also used the similar patterns to fill in isel pattern gaps for the corresponding instructions in the ARM backend. The patch is mainly changes to tests since most of this functionality appears not to have been tested. Differential Revision: https://reviews.llvm.org/D23273 llvm-svn: 278207
* [ARM] Add support for embedded position-independent codeOliver Stannard2016-08-083-0/+435
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for some new relocation models to the ARM backend: * Read-only position independence (ROPI): Code and read-only data is accessed PC-relative. The offsets between all code and RO data sections are known at static link time. This does not affect read-write data. * Read-write position independence (RWPI): Read-write data is accessed relative to the static base register (r9). The offsets between all writeable data sections are known at static link time. This does not affect read-only data. These two modes are independent (they specify how different objects should be addressed), so they can be used individually or together. They are otherwise the same as the "static" relocation model, and are not compatible with SysV-style PIC using a global offset table. These modes are normally used by bare-metal systems or systems with small real-time operating systems. They are designed to avoid the need for a dynamic linker, the only initialisation required is setting r9 to an appropriate value for RWPI code. I have only added support to SelectionDAG, not FastISel, because FastISel is currently disabled for bare-metal targets where these modes would be used. Differential Revision: https://reviews.llvm.org/D23195 llvm-svn: 278015
* [ARM] Constant Materialize: imms with specific value can be encoded into mov.wWeiming Zhao2016-08-051-3/+57
| | | | | | | | | | | | | | | | | | Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. I'm resubmitting this patch. The test case in the original commit r277610 does not specify triple, so builds with differnt default triple will have different output. This patch fixed trile as thumb-darwin-apple. Reviewers: john.brawn, jmolloy, bruno Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277865
* Revert "[ARM] Constant Materialize: imms with specific value can be encoded ↵Bruno Cardoso Lopes2016-08-031-17/+0
| | | | | | | | | | | into mov.w" This reverts commit r277610 / d619aa8878c3dafcc0d29a46517f63ff3209fdd4. This make subtarget-no-movt.ll fail in http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/26892, llvm-svn: 277654
* [ARM] Constant Materialize: imms with specific value can be encoded into mov.wWeiming Zhao2016-08-031-0/+17
| | | | | | | | | | | | Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. Reviewers: john.brawn, jmolloy Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277610
* ARM: only form SMMLS when SUBE flags unused.Tim Northover2016-08-021-0/+21
| | | | | | | | In this particular example we wouldn't want the smmls anyway (the value is actually unused), but in general smmls does not provide the required flags register so if that SUBE result is used we can't replace it. llvm-svn: 277541
* [ARM] Improve smul* and smla* isel for Thumb2Sam Parker2016-08-021-34/+157
| | | | | | | | | | | | | | Added (sra (shl x, 16), 16) to the sext_16_node PatLeaf for ARM to simplify some pattern matching. This has allowed several patterns for smul* and smla* to be removed as well as making it easier to add the matching for the corresponding instructions for Thumb2 targets. Also added two Pat classes that are predicated on Thumb2 with the hasDSP flag and UseMulOps flags. Updated the smul codegen test with the wider range of patterns plus the ThumbV6 and ThumbV6T2 targets. Differential Revision: https://reviews.llvm.org/D22908 llvm-svn: 277450
* [AArch64] Add support for Samsung Exynos M2 (NFC).Evandro Menezes2016-08-011-1/+29
| | | | llvm-svn: 277364
* DAG: avoid duplicated truncating for sign extended operandWeiming Zhao2016-07-292-3/+112
| | | | | | | | | | | | | | | Summary: When performing cmp for EQ/NE and the operand is sign extended, we can avoid the truncaton if the bits to be tested are no less than origianl bits. Reviewers: eli.friedman Subscribers: eli.friedman, aemerson, nemanjai, t.p.northover, llvm-commits Differential Revision: https://reviews.llvm.org/D22933 llvm-svn: 277252
* [Thumb] Emit Thumb move in both Thumb modes for struct_byval predicatesPrakhar Bahuguna2016-07-291-0/+29
| | | | | | | | | | | | | | | | | Summary: The MOV/MOVT instructions being chosen for struct_byval predicates was conditional only on Thumb2, resulting in an ARM MOV/MOVT instruction being incorrectly emitted in Thumb1 mode. This is especially apparent with v8-m.base targets. This patch ensures that Thumb instructions are emitted in both Thumb modes. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, aemerson, rengolin Differential Revision: https://reviews.llvm.org/D22865 llvm-svn: 277128
* MIRParser: Use shorter cfi identifiersMatthias Braun2016-07-261-3/+3
| | | | | | | | | | | | | | | | In an instruction like: CFI_INSTRUCTION .cfi_def_cfa ... we can drop the '.cfi_' prefix since that should be obvious by the context: CFI_INSTRUCTION def_cfa ... While being a terser and cleaner syntax this also prepares to dropping support for identifiers starting with a dot character so we can use it for expressions. Differential Revision: http://reviews.llvm.org/D22388 llvm-svn: 276785
* [ARM] Saturation instructions are DSP-onlyRenato Golin2016-07-253-10/+38
| | | | | | | | | | | The saturation instructions appeared in v6T2, with DSP extensions, but they were being accepted / generated on any, with the new introduction of the saturation detection in the back-end. This commit restricts the usage to DSP-enable only cores. Fixes PR28607. llvm-svn: 276701
* [ARM] Improve longMAC codegen testSam Parker2016-07-251-34/+85
| | | | | | | | Added thumb targets and dataflow checks to the longMAC test. Differential Revision: https://reviews.llvm.org/D22684 llvm-svn: 276629
* [ARM] Enable ISel of SMMLS for ARM and Thumb2Sam Parker2016-07-251-2/+34
| | | | | | | | Use ISelDAGToDAG to recognise the SMMLS instruction pattern. Differential Revision: https://reviews.llvm.org/D22562 llvm-svn: 276624
* [ARM] Skip inline asm memory operands in DAGToDAGISelDiana Picus2016-07-201-0/+11
| | | | | | | | | | | | | | | | | | | | | | Retry r275776 (no changes, we suspect the issue was with another commit). The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 276101
* Revert "Disable this-return argument forwarding on ARM/AArch64"David Majnemer2016-07-202-4/+4
| | | | | | | | | Inference of the 'returned' attribute was fixed in r276008, lets try turning the backend support back on. This reverts commit r275677. llvm-svn: 276081
* [ARM] Refactor Thumb2 Mul and Mla instr descsSam Parker2016-07-191-4/+26
| | | | | | | | | | | Recommitting after r274347 was reverted. This patch introduces some classes to refactor the 3 and 4 register Thumb2 multiplication instruction descriptions, plus improved tests for some of those instructions. Differential Revision: https://reviews.llvm.org/D21929 llvm-svn: 275979
* Revert "[ARM] Skip inline asm memory operands in DAGToDAGISel"Vitaly Buka2016-07-181-11/+0
| | | | | | | | Breaks asan, see https://reviews.llvm.org/D22103 This reverts commit r275776. llvm-svn: 275890
* Revert "[ARM] Update test to use CHECK-LABEL. NFCI."Vitaly Buka2016-07-181-8/+6
| | | | | | | | Breaks asan, see https://reviews.llvm.org/D22103 This reverts commit r275777. llvm-svn: 275889
* CodeGenPrep: use correct function to determine Global's alignment.Tim Northover2016-07-181-0/+9
| | | | | | | | | Elsewhere (particularly computeKnownBits) we assume that a global will be aligned to the value returned by Value::getPointerAlignment. This is used to boost the alignment on memcpy/memset, so any target-specific request can only increase that value. llvm-svn: 275866
* [ARM] Update test to use CHECK-LABEL. NFCI.Diana Picus2016-07-181-6/+8
| | | | llvm-svn: 275777
* [ARM] Skip inline asm memory operands in DAGToDAGISelDiana Picus2016-07-181-0/+11
| | | | | | | | | | | | | | | | | | | | The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 275776
* [ARM] Honour ABI for rem under -O0 for EABI, GNUEABI, Android and MuslDiana Picus2016-07-181-12/+105
| | | | | | | | | | | | | | | | | | | | At higher optimization levels, we generate the libcall for DIVREM_Ix, which is fine: aeabi_{u|i}divmod. At -O0 we generate the one for REM_Ix, which is the default {u}mod{q|h|s|d}i3. This commit makes sure that we don't generate REM_Ix calls for ABIs that don't support them (i.e. where we need to use DIVREM_Ix instead). This is achieved by bailing out of FastISel, which can't handle non-double multi-reg returns, and letting the legalization infrastructure expand the REM_Ix calls. It also updates the divmod-eabi.ll test to run under -O0 as well, and adds some Windows checks to it to make sure we don't break things for it. Fixes PR27068 Differential Revision: https://reviews.llvm.org/D21926 llvm-svn: 275773
* Disable this-return argument forwarding on ARM/AArch64Hal Finkel2016-07-162-4/+4
| | | | | | | | | | | r275042 reverted function-attribute inference for the 'returned' attribute because the feature triggered self-hosting failures on ARM and AArch64. James Molloy determined that the this-return argument forwarding feature, which directly ties the returned input argument to the returned value, was the cause. It seems likely that this forwarding code contains, or triggers, a subtle bug. Disabling for now until we can track that down. llvm-svn: 275677
* ARM/MIR: Move test from MIR to CodeGen/ARM directoryMatthias Braun2016-07-161-0/+164
| | | | | | | | | | test/CodeGen/MIR/ARM/ARMLoadStoreDBG.mir is an actual test for the ARM load store optimization pass and not a test of the mir parser/printer. It belongs to test/CodeGen/ARM; This also updates the test to use the new -run-pass llc syntax. llvm-svn: 275662
* ExpandPostRAPseudos should transfer implicit uses, not only implicit defsMichael Kuperstein2016-07-151-0/+1
| | | | | | | | | | | | | | | | | | | Previously, we would expand: %BL<def> = COPY %DL<kill>, %EBX<imp-use,kill>, %EBX<imp-def> Into: %BL<def> = MOV8rr %DL<kill>, %EBX<imp-def> Dropping the imp-use on the floor. That confused CriticalAntiDepBreaker, which (correctly) assumes that if an instruction defs but doesn't use a register, that register is dead immediately before the instruction - while in this case, the high lanes of EBX can be very much alive. This fixes PR28560. Differential Revision: https://reviews.llvm.org/D22425 llvm-svn: 275634
* CodeGen: avoid emitting unnecessary CFISaleem Abdulrasool2016-07-151-3/+5
| | | | | | | | | | | | Remove unnecessary clutter in assembly output. When using SjLj EH, the CFI is not actually used for anything. Do not emit the CFI needlessly. The minor test adjustments are interesting. The prologue test was just overzealous matcching. The interesting case is the LSDA change. It was originally added to ensure that various compilations did not mangle the name (it explicitly checked the name!). However, subsequent cleanups made it more reliant on the CFI to find the name. Parse the generated code flow to generically find the label still. llvm-svn: 275614
* [ARM] Prefer indirect calls in minsize modeJames Molloy2016-07-151-0/+28
| | | | | | | | ... When we emit several calls to the same function in the same basic block. An indirect call uses a "BLX r0" instruction which has a 16-bit encoding. If many calls are made to the same target, this can enable significant code size reductions. llvm-svn: 275537
* [MIR] Print on the given output instead of stderr.Quentin Colombet2016-07-131-1/+1
| | | | | | | | | | | | Currently the MIR framework prints all its outputs (errors and actual representation) on stderr. This patch fixes that by printing the regular output in the output specified with -o. Differential Revision: http://reviews.llvm.org/D22251 llvm-svn: 275314
* Do not expand SDIV when compiling for minimum code sizeSjoerd Meijer2016-07-081-4/+24
| | | | | | Differential Revision: http://reviews.llvm.org/D22139 llvm-svn: 274855
* Addressing post-commit comments regarding not expanding UDIV;Sjoerd Meijer2016-07-081-1/+1
| | | | | | we don't expand only when compiling for minimum code size. llvm-svn: 274847
* Code size optimisation: don't expand a div to a mul and and a shift sequence.Sjoerd Meijer2016-07-081-0/+25
| | | | | | | | | As a result, the urem instruction will not be expanded to a sequence of umull, lsrs, muls and sub instructions, but just a call to __aeabi_uidivmod. Differential Revision: http://reviews.llvm.org/D22131 llvm-svn: 274843
* ARM: support high registers in __builtin_longjmp on WoASaleem Abdulrasool2016-07-081-5/+3
| | | | | | | | | | Windows on ARM uses a pure thumb-2 environment. This means that it can select a high register when doing a __builtin_longjmp. We would use a tLDRi which would truncate the register to a low register. Use a t2LDRi12 to get the full register file access. Tweak the code to just load into PC, as that is an interworking branch on all supported cores anyways. llvm-svn: 274815
* [ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 2 flags.Diana Picus2016-07-061-1/+1
| | | | | | | | | | | | | | | This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also removes two command-line flags that weren't used in any of the tests: widen-vmovs and swift-partial-update-clearance. The former may be easily replaced with the mattr mechanism, but the latter may not (as it is a subtarget property, and not a proper feature). Differential Revision: http://reviews.llvm.org/D21797 llvm-svn: 274620
* ARM: fix `-mlong-calls` for WoASaleem Abdulrasool2016-07-051-1/+1
| | | | | | | | | Not all code-paths set the relocation model to static for Windows. This currently breaks on Windows ARM with `-mlong-calls` when built with clang. Loosen the assertion to what it was previously. We would ideally ensure that all the configuration sets Windows to static relocation model. llvm-svn: 274570
* NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'.Nick Lewycky2016-06-281-1/+1
| | | | | | Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test. llvm-svn: 273984
* Add support for musl-libc on ARM Linux.Rafael Espindola2016-06-246-0/+32
| | | | | | Patch by Lei Zhang! llvm-svn: 273726
* [ARM] Use aapcs_vfp for ___truncdfhf2 on v7k.Ahmed Bougacha2016-06-241-0/+9
| | | | | | | r215348 overrode the f16 libcalls to be soft-float, but v7k uses the default (hard-float) calling convention. llvm-svn: 273631
* [ARM] Lower (select_cc k k (select_cc ~k ~k x)) into (SSAT l_k x)Pablo Barrio2016-06-231-0/+215
| | | | | | | | | | | | | | | | | Summary: SSAT saturates an integer, making sure that its value lies within an interval [-k, k]. Since the constant is given to SSAT as the number of bytes set to one, k + 1 must be a power of 2, otherwise the optimization is not possible. Also, the select_cc must use < and > respectively so that they define an interval. Reviewers: mcrosier, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D21372 llvm-svn: 273581
* [arm+x86] Make GNU variants behave like GNU w.r.t combining sin+cos into sincos.Daniel Sanders2016-06-211-0/+18
| | | | | | | | | | | | | | | | Summary: canCombineSinCosLibcall() would previously combine sin+cos into sincos for GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not. However, GNU would only combine them for UnsafeFPMath because sincos does not set errno like sin and cos do. It seems likely that this is an oversight. Reviewers: t.p.northover Subscribers: t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D21431 llvm-svn: 273259
* Use shouldAssumeDSOLocal.Rafael Espindola2016-06-201-0/+19
| | | | | | With this ARM fast isel knows that PIE variable are not preemptable. llvm-svn: 273169
* [ARM] Enable isel of UMAALSam Parker2016-06-201-0/+29
| | | | | | | | | | TargetLowering and DAGToDAG are used to combine ADDC, ADDE and UMLAL dags into UMAAL. Selection is split into the two phases because it is easier to match the two patterns at those different times. Differential Revision: http://http://reviews.llvm.org/D21461 llvm-svn: 273165
OpenPOWER on IntegriCloud