summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* Convert assert(false) into llvm_unreachable where it makes sense.Benjamin Kramer2015-10-252-2/+2
| | | | llvm-svn: 251266
* [ARM] Renaming +t2dsp feature into +dsp, as discussed on llvm-devArtyom Skrobov2015-10-232-7/+7
| | | | llvm-svn: 251125
* [ARM CodeGen] @llvm.debugtrap call may be removed when restoring callee ↵Oleg Ranevskyy2015-10-231-1/+5
| | | | | | | | | | | | | | | | | saved registers Summary: When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR. This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap. Reviewers: asl, rengolin Subscribers: aemerson, jfb, rengolin, dschuff, llvm-commits Differential Revision: http://reviews.llvm.org/D13672 llvm-svn: 251123
* Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. ↵Craig Topper2015-10-221-6/+6
| | | | | | This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032
* Add missing load/store flags to thumb2 instructions.Pete Cooper2015-10-221-1/+4
| | | | | | | | | | | | | | | | | | These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here so i've updated the test to run the verifier on one of its runs instead of adding a new one. While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too. This reapplies r242300 which was reverted in r242428 due to bot failures. Ultimately those failures were spurious and completely unrelated to this commit. I reverted this at the time because it was thought to be at fault. llvm-svn: 250969
* Adding support for TargetLoweringBase::LibCallArtyom Skrobov2015-10-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826
* ARM: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-199-45/+40
| | | | llvm-svn: 250759
* Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructionsAsiri Rathnayake2015-10-191-4/+4
| | | | | | | | | | | | | The mapping of these two intrinsics in ARMInstrInfo.td had a small omission which lead to their operands not being validated/transformed before being lowered into usat and ssat instructions. This can cause incorrect instructions to be emitted. I've also added tests for the remaining two saturating arithmatic intrinsics @llvm.arm.qadd and @llvm.arm.qsub as they are missing codegen tests. llvm-svn: 250697
* Make a bunch of static arrays const.Craig Topper2015-10-181-3/+3
| | | | llvm-svn: 250642
* Remove unnecessary 'const' pointed out by David Blaikie.Craig Topper2015-10-171-2/+2
| | | | llvm-svn: 250619
* Use std::begin/end and std::is_sorted to simplify some code. NFCCraig Topper2015-10-171-8/+5
| | | | llvm-svn: 250614
* [ARM] Make sure we do not dereference the end iterator when accessing debugQuentin Colombet2015-10-151-2/+2
| | | | | | | | | | information. Although the problem was always here, it would only be exposed when shrink-wrapping is enable. rdar://problem/23110493 llvm-svn: 250352
* Test commitChristof Douma2015-10-131-1/+0
| | | | llvm-svn: 250154
* [ARM] Mark Swift MISched model as incompleteJames Molloy2015-10-121-0/+1
| | | | | | | | | | | The Swift Machine Scheduler Model is incomplete. There are instructions missing which can trigger the "incomplete machine model" abort. This was observed when a downstream SchedMachineModel was added to the ARM target. Patch by Christof Douma! llvm-svn: 250033
* ARM: tweak WoA frame loweringSaleem Abdulrasool2015-10-091-8/+8
| | | | | | | | | | Accept r11 when targeting Windows on ARM rather than just low registers. Because we are in a thumb-2 only mode, this may be slightly more expensive in code size, but results in better code for the environment since it spills the frame register, which is generally desired for fast stack walking as per the ABI. llvm-svn: 249804
* Add Triple::isAndroid().Evgeniy Stepanov2015-10-081-4/+2
| | | | | | | This is a simple refactoring that replaces Triple.getEnvironment() checks for Android with Triple.isAndroid(). llvm-svn: 249750
* [ARM] Promote helper function to SelectionDAG.Chad Rosier2015-10-071-34/+12
| | | | | | | | | I'll be using the function in a similar combine for AArch64. The helper was also improved to handle undef values. Part of http://reviews.llvm.org/D13442 llvm-svn: 249572
* [ARM] Use correct half-precision functions in EABI modeOliver Stannard2015-10-071-0/+8
| | | | | | | | | The ARM RTABI defines the half- to single-precision float conversion functions with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore we need to emit the __aeabi version when compiling with an eabi or eabihf triple, and the __gnu version with a gnueabi or gnueabihf triple. llvm-svn: 249565
* [ARM] Prevent PerformVDIVCombine from combining a vcvt/vdiv with 8 lanes.Chad Rosier2015-10-071-3/+4
| | | | | | This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 249560
* [ARM][AArch64] Only lower to interleaved load/store if the target has NEONJeroen Ketema2015-10-071-6/+7
| | | | | | | | | Without an additional check for NEON, the compiler crashes during legalization of NEON ldN/stN. Differential Revision: http://reviews.llvm.org/D13508 llvm-svn: 249550
* [ARM] Push more complex check down to reduce compile time. NFC.Chad Rosier2015-10-071-10/+10
| | | | llvm-svn: 249547
* [ARM] Minor refactoring. NFC.Chad Rosier2015-10-061-2/+4
| | | | llvm-svn: 249465
* [ARM] Minor refactoring. NFC.Chad Rosier2015-10-061-8/+10
| | | | llvm-svn: 249464
* [ARM] Minor refactoring. NFC.Chad Rosier2015-10-061-9/+8
| | | | llvm-svn: 249463
* [ARM] Minor refactoring to improve readability. NFC.Chad Rosier2015-10-061-13/+14
| | | | llvm-svn: 249454
* [ARM] Modify codegen for memcpy intrinsic to prefer LDM/STM.Scott Douglass2015-10-058-30/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | | We were previously codegen'ing memcpy as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create MEMCPY pseudo instructions, attach scratch virtual registers to them and then, post register allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers. The register allocator will have picked arbitrary registers which we sort when expanding the MEMCPY. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. Fixes PR9199 and PR23768. [This is based on Peter Collingbourne's r238473 which was reverted.] Differential Revision: http://reviews.llvm.org/D13239 Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458 Author: Peter Collingbourne <peter@pcc.me.uk> llvm-svn: 249322
* Fix pr24486.Rafael Espindola2015-10-051-2/+2
| | | | | | | | | | | | | | | | | | This extends the work done in r233995 so that now getFragment (in addition to getSection) also works for variable symbols. With that the existing logic to decide if a-b can be computed works even if a or b are variables. Given that, the expression evaluation can avoid expanding variables as aggressively and that in turn lets the relocation code see the original variable. In order for this to work with the asm streamer, there is now a dummy fragment per section. It is used to assign a section to a symbol when no other fragment exists. This patch is a joint work by Maxim Ostapenko andy myself. llvm-svn: 249303
* Actually switch the arch when we see .arch. PR21695Roman Divacky2015-10-021-0/+5
| | | | llvm-svn: 249165
* ARM: diagnose invalid local fixups on Thumb1Tim Northover2015-10-022-20/+62
| | | | | | | | | We previously stopped producing Thumb2 relaxations when they weren't supported, but only diagnosed the case where an actual relocation was produced. We should also tell people if local symbols aren't going to work rather than silently overflowing. llvm-svn: 249164
* ARM: correctly align constant pool value on Thumb1 targets.Tim Northover2015-10-021-1/+1
| | | | | | | Since we're using tLDRpci to access it, the constant pool's address must be 0 (mod 4). llvm-svn: 249163
* [ARM] More care with Thumb1 writeback in ARMLoadStoreOptimizerScott Douglass2015-10-011-3/+7
| | | | | | Differential Revision: http://reviews.llvm.org/D13240 llvm-svn: 249002
* [ARM] Support for ARMv6-Z / ARMv6-ZK missingArtyom Skrobov2015-09-301-2/+1
| | | | | | | | | | | | | | | As Richard Barton observed at http://reviews.llvm.org/D12937#inline-107121 TargetParser in LLVM has insufficient support for ARMv6Z and ARMv6ZK. In particular, there were no tests for TrustZone being supported in these architectures. The patch clears a FIXME: left by Saleem Abdulrasool in r201471, and fixes his test case which hadn't really been testing what it was claiming to test. Differential Revision: http://reviews.llvm.org/D13236 llvm-svn: 248921
* [ARM][NEON] Use address space in vld([1234]|[234]lane) and ↵Jeroen Ketema2015-09-301-6/+7
| | | | | | | | | | | | | | | | | | | | | vst([1234]|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887
* Improved the interface of methods commuting operands, improved X86-FMA3 ↵Andrew Kaylor2015-09-283-11/+23
| | | | | | | | | | mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735
* [ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall()Artyom Skrobov2015-09-282-25/+25
| | | | | | | | | | | | | | | supportsTailCall() has two callers. Both of them double-check isThumb1Only(), and refuse to proceed with tail-calling in that case. Therefore, it makes sense to move this check to ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized; and to eliminate the extra checks at the call sites. Following a review comment, added an "assert(supportsTailCall())" in IsEligibleForTailCall. NFC. llvm-svn: 248703
* [ARM] Don't generate clrex for pre-v7 targets.Ahmed Bougacha2015-09-261-0/+2
| | | | | | Since r248294, we emit clrex, but it doesn't exist on v6. llvm-svn: 248640
* ARM: make -Asserts,-Werror=unused-variable build happySaleem Abdulrasool2015-09-251-4/+4
| | | | | | | The value was only used in an assertion. Sink the variable usage into the assertion. llvm-svn: 248562
* ARM: address WoA division limitationSaleem Abdulrasool2015-09-253-8/+147
| | | | | | | | | | | | | We now emit the compiler generated divide by zero check that was needed for the MSVC routines. We construct a psuedo-instruction for the DBZ check as the operation requires splitting up the BB. For the 64-bit operations, we need to custom expand the node as we need to insert the DBZ check and then emit the libcall to the appropriate name. Because this is target specific, it seemed better to reproduce the expansion operation from the target-agnostic type legalization rather than sink this there to avoid the duplication. The division library calls now match MSVC semantically. llvm-svn: 248561
* [ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.defArtyom Skrobov2015-09-2412-98/+89
| | | | | | | | | | | | | | | | | | Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following a revert of r248152 and new review comments, this patch also includes renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc. The spelling of "t2dsp" is preserved, pending a further investigation of its possible external usage. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248519
* ARM: fix folding stack adjustment (again again again...)Tim Northover2015-09-231-1/+2
| | | | | | | | | | | | This time, the issue is that we weren't accounting for the possibility that aligned DPRs could have been stored after the final "push" in a prologue. When that happened we effectively moved a "sub sp, #N" from below the aligned stores to above them, and everything went to pot. To make it worse, I'd actually committed something testing that we produced wrong code, so the test update is tiny. llvm-svn: 248437
* [ARM] Add option to force fast-iselOliver Stannard2015-09-231-0/+10
| | | | | | | | | | | | The ARM backend has some logic that only allows the fast-isel to be enabled for subtargets where it is known to be stable. This adds a backend option to override this and force the fast-isel to be used for any target, to allow it to be tested. This is an ARM-specific option, because no other backend disables the fast-isel on a per-subtarget basis. llvm-svn: 248369
* [ARM] Emit clrex in the expanded cmpxchg fail block.Ahmed Bougacha2015-09-222-0/+8
| | | | | | | | | | | | | | | | | | | ARM counterpart to r248291: In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248294
* Untabify.NAKAMURA Takumi2015-09-222-3/+3
| | | | llvm-svn: 248264
* Reformat blank lines.NAKAMURA Takumi2015-09-221-1/+0
| | | | llvm-svn: 248263
* Reformat.NAKAMURA Takumi2015-09-221-7/+6
| | | | llvm-svn: 248261
* ARMInstrInfo.cpp: Reformat.NAKAMURA Takumi2015-09-221-66/+65
| | | | llvm-svn: 248260
* [ARM] Do not scale vext with a factorJeroen Ketema2015-09-211-9/+1
| | | | | | | | | | | | | The vext pseudo-instruction takes the number of elements that need to be extracted, not the number of bytes. Hence, use the number of elements directly instead of scaling them with a factor. Reviewers: Silviu Baranga, James Molloy (not reflected in the differential revision) Differential Revision: http://reviews.llvm.org/D12974 llvm-svn: 248208
* Revert "[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def"James Molloy2015-09-211-2/+2
| | | | | | | | This was committed without the code review (http://reviews.llvm.org/D12937) being approved. This reverts commit r248152. llvm-svn: 248174
* [ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.defArtyom Skrobov2015-09-211-2/+2
| | | | | | | | | | | | | | | | Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following review comments, also updating the description of FeatureDSPThumb2 in ARM.td. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248152
* Cleanup places that passed SMLoc by const reference to pass it by value ↵Craig Topper2015-09-201-2/+1
| | | | | | instead. NFC llvm-svn: 248135
OpenPOWER on IntegriCloud