path: root/llvm/lib/Target/ARM
Commit message | Author | Age | Files | Lines
* Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC. | Aaron Ballman | 2016-03-30 | 1 | -1/+1
  llvm-svn: 264929
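The warning quoted above fires when a shift is evaluated in 32 bits and only then widened to 64 bits. The sketch below shows the flagged pattern next to the usual fix; the function names are hypothetical and are not taken from the commit.

```cpp
#include <cstdint>

// Hypothetical helpers illustrating the pattern behind MSVC warning C4334.

// The shift is evaluated on a 32-bit operand and the result is implicitly
// widened on return. This is the shape the warning reports. Only valid for
// Bit < 32; a larger shift amount would be undefined behaviour.
inline uint64_t maskNarrowShift(unsigned Bit) {
  return 1u << Bit;
}

// The left operand is already 64 bits wide, so the shift itself is 64-bit.
// Valid for any Bit in [0, 63].
inline uint64_t maskWideShift(unsigned Bit) {
  return uint64_t{1} << Bit;
}
```

Both helpers agree for shift amounts below 32; only the second is meaningful for larger amounts.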
* Remove HasFnAttribute guards to getFnAttribute calls | Nirav Dave | 2016-03-30 | 1 | -1/+0
  These checks are redundant and can be removed.
  Reviewers: hans
  Subscribers: llvm-commits, mzolotukhin
  Differential Revision: http://reviews.llvm.org/D18564
  llvm-svn: 264872
* Swift Calling Convention: add swiftself attribute. | Manman Ren | 2016-03-29 | 2 | -0/+11
  Differential Revision: http://reviews.llvm.org/D17866
  llvm-svn: 264754
* ARM: maintain BB ordering when expanding WIN__DBZCHK | Saleem Abdulrasool | 2016-03-25 | 1 | -1/+1
  It is possible to have a fallthrough MBB prior to MBB placement. The original addition of the BB would result in reordering the BB as not preceding the successor. Because of the fallthrough nature of the BB, we could end up executing incorrect code or even a constant pool island! Insert the spliced BB into the same location to avoid that.
  Thanks to Tim Northover for invaluable hints and Fiora for the discussion on what may have been occurring!
  llvm-svn: 264454
* ARM: fix optimised division on WoA | Saleem Abdulrasool | 2016-03-25 | 1 | -0/+1
  We did not have an explicit branch to the continuation BB. When the check was hoisted, this could permit control flow to fall through into the division trap. Add the explicit branch to the continuation basic block to ensure that code execution is correct.
  llvm-svn: 264370
* Replace a string comparison in ARMSubtarget.h with a tablegen entry in ARM.td (NFC) | Artyom Skrobov | 2016-03-23 | 2 | -5/+8
  Reviewers: rengolin, t.p.northover
  Subscribers: aemerson, llvm-commits, rengolin
  Differential Revision: http://reviews.llvm.org/D18393
  llvm-svn: 264165
* ARM: Better codegen for 64-bit compares. | Peter Collingbourne | 2016-03-21 | 2 | -0/+86
  This introduces a custom lowering for ISD::SETCCE (introduced in r253572) that allows us to emit a short code sequence for 64-bit compares.
  Before:
          push {r7, lr}
          cmp r0, r2
          mov.w r0, #0
          mov.w r12, #0
          it hs
          movhs r0, #1
          cmp r1, r3
          it ge
          movge.w r12, #1
          it eq
          moveq r12, r0
          cmp.w r12, #0
          bne .LBB1_2
      @ BB#1: @ %bb1
          bl f
          pop {r7, pc}
      .LBB1_2: @ %bb2
          bl g
          pop {r7, pc}
  After:
          push {r7, lr}
          subs r0, r0, r2
          sbcs.w r0, r1, r3
          bge .LBB1_2
      @ BB#1: @ %bb1
          bl f
          pop {r7, pc}
      .LBB1_2: @ %bb2
          bl g
          pop {r7, pc}
  Saves around 80KB in Chromium's libchrome.so.
  Some notes on this patch:
  - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)).
  - No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1.
  Differential Revision: http://reviews.llvm.org/D15256
  llvm-svn: 263962
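The shorter sequence works because a subtract of the low words followed by a subtract-with-borrow of the high words leaves the N and V flags describing the full 64-bit signed difference, so a single `bge` suffices. The following is a rough software model of that idea, not the backend's code; the function name and parameter layout are assumptions made for illustration.

```cpp
#include <cstdint>

// Illustrative model of "subs lo; sbcs hi; bge". Hypothetical helper.
// Returns true when the signed 64-bit value AHi:ALo >= BHi:BLo.
inline bool signedGE64(uint32_t ALo, int32_t AHi, uint32_t BLo, int32_t BHi) {
  // subs: subtracting the low words borrows when ALo < BLo (unsigned).
  const int64_t Borrow = ALo < BLo ? 1 : 0;

  // sbcs: subtract the high words and the borrow. Computing in 64 bits gives
  // the exact difference, from which both flags can be derived.
  const int64_t Exact = int64_t{AHi} - int64_t{BHi} - Borrow;

  // N is bit 31 of the 32-bit result; V is set when the exact value does not
  // fit in a signed 32-bit register.
  const uint32_t Result = static_cast<uint32_t>(Exact);
  const bool N = (Result >> 31) != 0;
  const bool V = Exact < INT32_MIN || Exact > INT32_MAX;

  // bge is taken when N == V.
  return N == V;
}
```

The low-word result itself is never inspected; only its borrow feeds the high-word step, which matches the way the After sequence overwrites r0.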
* [ARM] Add Cortex-A32 support | Renato Golin | 2016-03-21 | 2 | -2/+10
  Adding Cortex-A32 as an available target in the ARM backend.
  Patch by Sam Parker.
  llvm-svn: 263956
* [CXX_FAST_TLS] Fix issues in ARM. | Manman Ren | 2016-03-18 | 1 | -2/+3
  We need to be careful about which registers can be explicitly handled via copies. Prologue, Epilogue use physical registers and if one belongs to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites it after the explicit copies.
  llvm-svn: 263857
* [CXX_FAST_TLS] Disable tail call when calling conventions are mismatched. | Manman Ren | 2016-03-18 | 1 | -0/+7
  Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller and callee have mismatched calling conventions.
  llvm-svn: 263856
* [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86. | Manman Ren | 2016-03-18 | 1 | -0/+1
  Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0.
  llvm-svn: 263855
* ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering. | Tim Northover | 2016-03-17 | 1 | -2/+3
  llvm-svn: 263741
* ARM: Revert SVN r253865, 254158, fix windows division | Saleem Abdulrasool | 2016-03-17 | 1 | -7/+18
  The two changes together weakened the test and caused a regression with division handling in MSVC mode. They were applied to avoid an assertion being triggered in the block frequency analysis. However, the underlying problem was simply being masked rather than solved properly. Address the actual underlying problem and revert the changes. Rather than analyze the cause of the assertion, the division failure was assumed to be an overflow.
  The underlying issue was a subtle bug in the BB construction in the emission of the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor information in the basic blocks, nor did we update the PHIs associated with the basic block when we split them. This would result in assertions being triggered in the block frequency analysis pass.
  Although the original tests are being removed, the tests themselves performed very little in terms of validation but merely tested that we did not assert when generating code. Update this with new tests that actually ensure that we do not regress on the code generation.
  llvm-svn: 263714
* Tweak some atomics functions in preparation for larger changes; NFC. | James Y Knight | 2016-03-16 | 2 | -7/+12
  - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones.
  - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon.
  - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves.
  llvm-svn: 263665
* [MC] Rename TLSDESC as it's not ARM specific. | Davide Italiano | 2016-03-15 | 1 | -1/+1
  Similarly to what was done for TLSCALL in r263515.
  llvm-svn: 263564
* [MC] Rename TLSCALL as it's not ARM specific. | Davide Italiano | 2016-03-15 | 2 | -5/+5
  `MCSymbolRefExpr` variant kind for TLSCALL is prefixed with _ARM_ since this is how it was originally implemented. The X86_64 version is exactly the same so there's no reason to create a new variant, we can just rename the existing one to be machine-independent. This generalization is the first step to implement support for GNU2 TLS dialect in MC.
  Differential Revision: http://reviews.llvm.org/D18160
  llvm-svn: 263515
* [DAG] use !isUndef() ; NFCI | Sanjay Patel | 2016-03-14 | 1 | -5/+4
  llvm-svn: 263453
* [DAG] use isUndef() ; NFCI | Sanjay Patel | 2016-03-14 | 1 | -11/+9
  llvm-svn: 263448
* ARM: Support relative references using the PREL31 symbol variant. | Peter Collingbourne | 2016-03-10 | 1 | -0/+3
  Differential Revision: http://reviews.llvm.org/D17937
  llvm-svn: 263156
* [ARM] Cortex-R8 support | Alexandros Lamprineas | 2016-03-10 | 1 | -0/+12
  This patch adds Cortex-R8 to Target Parser and TableGen. It also adds CodeGen tests for the build attributes.
  Patch by Pablo Barrio.
  Differential Revision: http://reviews.llvm.org/D17925
  llvm-svn: 263132
* ARM: follow up improvements for SVN r263118 | Saleem Abdulrasool | 2016-03-10 | 4 | -4/+14
  The initial change was insufficiently complete for always getting the semantics of __builtin_longjmp correct. The builtin is translated into a `tInt_eh_sjlj_longjmp` DAG node. This node set R7 as clobbered. However, the code would then follow up with a clobber of R11. I had failed to notice the imp-def,kill on R7 in the isel.
  Unfortunately, it seems that it is not possible to conditionalise the Defs list via an !if. Instead, construct a new parallel WIN node and prefer that when targeting windows. This ensures that we now both correctly model the __builtin_longjmp as well as construct the frame in a more ABI conformant manner.
  llvm-svn: 263123
* ARM: correct __builtin_longjmp on WoA | Saleem Abdulrasool | 2016-03-10 | 1 | -1/+3
  WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast to AAPCS which states r7. This adjusts __builtin_longjmp to not clobber r7 and to properly restore the frame pointer on execution.
  llvm-svn: 263118
* Add support for a preserve_most calling convention to the AArch64 backend. | Roman Levenstein | 2016-03-10 | 1 | -0/+4
  This change adds support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add tail-call support for this calling convention.
  Differential Revision: http://reviews.llvm.org/D18016
  llvm-svn: 263092
* [ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC) | Artyom Skrobov | 2016-03-08 | 3 | -107/+74
  Reviewers: t.p.northover, grosbach, resistor
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17636
  llvm-svn: 262936
* [ARM] Merging 64-bit divmod lib calls into one | Renato Golin | 2016-03-04 | 1 | -0/+9
  When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for values of up to 32 bits. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code.
  This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant.
  Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64.
  This patch fixes PR17193 (and a long-time FIXME in the tests).
  llvm-svn: 262738
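As a source-level analogue of the merge described above, quotient and remainder on the same operands can be obtained from one combined operation instead of two separate ones. The sketch uses the standard `std::lldiv`; the `QuotRem` struct and `divRemOnce` name are hypothetical, and whether a given toolchain lowers this to a single `__aeabi_ldivmod` call is not something the example demonstrates.

```cpp
#include <cstdlib>

// Hypothetical result pair for a combined divide/remainder.
struct QuotRem {
  long long Quot;
  long long Rem;
};

// Compute both results with one library-level operation rather than a
// separate '/' and '%' on the same operands. D must be non-zero.
inline QuotRem divRemOnce(long long N, long long D) {
  const std::lldiv_t R = std::lldiv(N, D);
  return {R.quot, R.rem};
}
```

Division truncates toward zero, so the remainder takes the sign of the dividend.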
* Revert "[ARM] Merging 64-bit divmod lib calls into one" | Renato Golin | 2016-03-03 | 1 | -9/+0
  This reverts commit r262507, which broke some ARM buildbots.
  llvm-svn: 262594
* [ARM] Merging 64-bit divmod lib calls into one | Renato Golin | 2016-03-02 | 1 | -0/+9
  When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for values of up to 32 bits. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code.
  This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant.
  This patch fixes PR17193 (and a long-time FIXME in the tests).
  llvm-svn: 262507
* ARM: Introduce conservative load/store optimization mode | Matthias Braun | 2016-03-02 | 1 | -0/+34
  Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which means that unaligned loads/stores do not trap and even extensive testing will not catch these bugs. However the multi/double variants are not affected by this bit and will still trap. In effect a more aggressive load/store optimization will break existing (bad) code.
  These bugs do not necessarily manifest in the broken code where the misaligned pointer is formed but often later in perfectly legal code where it is accessed. This means recompiling system libraries (which have no alignment bugs) with a newer compiler will break existing applications (with alignment bugs) that worked before.
  So (under protest) I implemented this safe mode which limits the formation of multi/double operations to cases that are not affected by user code (stack operations like spills/reloads) or cases where the normal operations trap anyway (floating point load/stores). It is disabled by default.
  Differential Revision: http://reviews.llvm.org/D17015
  llvm-svn: 262504
* TableGen: Check scheduling models for completeness | Matthias Braun | 2016-03-01 | 1 | -0/+1
  TableGen checks at compile time that for scheduling models with "CompleteModel = 1" one of the following holds:
  - Is marked with the hasNoSchedulingInfo flag
  - The instruction is a subclass of Sched
  - There are InstRW definitions in the scheduling model
  Typical steps necessary to complete a model:
  - Ensure all pseudo instructions that are expanded before machine scheduling (usually everything handled with EmitYYY() functions in XXXTargetLowering).
  - If a CPU does not support some instructions mark the corresponding resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
  - Add missing scheduling information.
  Differential Revision: http://reviews.llvm.org/D17747
  llvm-svn: 262384
* CodeGen: Change MachineInstr to use MachineInstr&, NFC | Duncan P. N. Exon Smith | 2016-02-27 | 4 | -6/+6
  Change MachineInstr API to prefer MachineInstr& over MachineInstr* whenever the parameter is expected to be non-null. Slowly inching toward being able to fix PR26753.
  llvm-svn: 262149
* ARM: disallow pc as a base register in Thumb2 memory ops. | Tim Northover | 2016-02-25 | 2 | -2/+2
  These should all be deferring to the "OP (literal)" variant according to the ARM ARM.
  llvm-svn: 261895
* ARM: fix handling of movw/movt relocations with addend. | Tim Northover | 2016-02-23 | 1 | -3/+6
  We were emitting only one half of the paired relocations needed for these instructions because we decided that an offset needed a scattered relocation. In fact, movw/movt relocations can be paired without being scattered.
  llvm-svn: 261679
* Fix PR25339: ARM Constant Island | Weiming Zhao | 2016-02-23 | 1 | -9/+39
  Summary: Currently, the ARM Constant Island pass may not converge (or may not converge quickly). This patch lets it move to the closest water after the user if it doesn't converge after 15 iterations. This addresses https://llvm.org/bugs/show_bug.cgi?id=25339
  Reviewers: t.p.northover, srhines, kristof.beyls, aadg, rengolin
  Subscribers: weimingz, aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16890
  llvm-svn: 261665
* [ARM] fix initialization of PredictableSelectIsExpensive | Junmo Park | 2016-02-23 | 1 | -1/+1
  Summary: To classify a core as out-of-order or not, using getSchedModel().isOutOfOrder() could be a more appropriate way than using Subtarget->isLikeA9().
  Reviewers: jmolloy, rengolin
  Differential Revision: http://reviews.llvm.org/D17433
  llvm-svn: 261623
* CodeGen: TII: Take MachineInstr& in predicate API, NFC | Duncan P. N. Exon Smith | 2016-02-23 | 10 | -88/+86
  Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions.
  No functionality change intended.
  llvm-svn: 261605
* CodeGen: Bring back MachineBasicBlock::iterator::getInstrIterator()... | Duncan P. N. Exon Smith | 2016-02-22 | 2 | -2/+3
  This is a little embarrassing.
  When I reverted r261504 (getIterator() => getInstrIterator()) in r261567, I did a `git grep` to see if there were new calls to `getInstrIterator()` that I needed to migrate. There were 10-20 hits, and I blindly did a `sed ...` before calling `ninja check`.
  However, these were `MachineInstrBundleIterator::getInstrIterator()`, which predated r261567. Perhaps coincidentally, these had an identical name and return type.
  This commit undoes my careless sed and restores `MachineBasicBlock::iterator::getInstrIterator()`.
  llvm-svn: 261577
* Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC" | Duncan P. N. Exon Smith | 2016-02-22 | 3 | -11/+10
  This reverts commit r261504, since it's not obvious the new name is better: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html
  I'll recommit if we get consensus that it's the right direction.
  llvm-svn: 261567
* CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC | Duncan P. N. Exon Smith | 2016-02-21 | 3 | -9/+9
  Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr.
  - Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available.
  - Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock.
  - Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to distinguish it clearly from ilist_node::getIterator().
  - Update all calls. Some of these I switched to `auto` to remove boiler-plate, since the new name is clear about the type.
  There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment.
  llvm-svn: 261504
* ADT: Remove == and != comparisons between ilist iterators and pointers | Duncan P. N. Exon Smith | 2016-02-21 | 1 | -1/+1
  I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.)
  There should be NFC here.
  llvm-svn: 261498
* Minor code cleanups. NFC. | Junmo Park | 2016-02-19 | 1 | -3/+3
  llvm-svn: 261294
* [CodeGen] Document and use getConstant's splat-building feature. NFC. | Ahmed Bougacha | 2016-02-15 | 1 | -6/+3
  Differential Revision: http://reviews.llvm.org/D17229
  llvm-svn: 260901
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. | Ahmed Bougacha | 2016-02-09 | 1 | -36/+20
  llvm-svn: 260316
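The idiom named in this commit title declares and tests a value in the `if` condition itself, which relies on the type being contextually convertible to `bool`. Below is a minimal sketch with a made-up `Handle` type standing in for a handle such as SDValue; none of these names are LLVM's.

```cpp
// Hypothetical stand-in for a nullable handle type; not LLVM's SDValue.
struct Handle {
  bool Valid;
  int Payload;
  // Allows "if (Handle R = ...)" without permitting accidental arithmetic.
  explicit operator bool() const { return Valid; }
};

// Hypothetical transform that succeeds only for even inputs.
inline Handle tryCombine(int X) {
  if (X % 2 == 0)
    return Handle{true, X};
  return Handle{false, 0};
}

inline int combineOrDefault(int X, int Default) {
  // Declaration and test in one step; R is scoped to this if statement.
  if (Handle R = tryCombine(X))
    return R.Payload;
  return Default;
}
```

Compared with declaring the result first and then checking a member, this keeps the variable out of the enclosing scope when the check fails.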
* ARM: support TLS for WoA | Saleem Abdulrasool | 2016-02-03 | 5 | -0/+62
  Add support for TLS access for Windows on ARM. This generates a similar access to MSVC for ARM.
  The changes to the tablegen data are needed to support loading an external symbol global that is not for a call. The adjustments to the DAG to DAG transforms are needed to preserve the 32-bit move.
  llvm-svn: 259676
* [ARM] Move GNUEABI divmod to __aeabi_divmod* | Renato Golin | 2016-02-03 | 1 | -2/+4
  The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores which happens to be a lot faster than __divsi3/__modsi3 when the core has hardware divide instructions. Do the same here.
  Fixes PR26450.
  llvm-svn: 259657
* Removed FeatureVFPOnlySP from the Cortex-R7 processor model | Sjoerd Meijer | 2016-02-02 | 1 | -1/+0
  description and changed the regression test accordingly.
  The default configuration of a Cortex-R7 is to implement the VFPv3-D16 architecture and the feature line as it was is too restrictive.
  llvm-svn: 259480
* Avoid overly large SmallPtrSet/SmallSet | Matthias Braun | 2016-01-30 | 1 | -1/+1
  These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32.
  llvm-svn: 259283
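To see why a large inline size hurts, consider a simplified model of a small-size-optimised set: while it holds at most N elements every lookup scans the inline buffer linearly, and only after spilling does it use a tree. `TinySet` below is a hypothetical sketch under those assumptions and does not reproduce the real SmallSet or SmallPtrSet implementation.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <set>

// Hypothetical, simplified small-set model. T must be default-constructible.
template <typename T, std::size_t N>
class TinySet {
  std::array<T, N> Inline{}; // storage used in small mode
  std::size_t Count = 0;     // number of valid entries in Inline
  std::set<T> Big;           // used once more than N elements are stored

public:
  bool isSmall() const { return Big.empty(); }

  bool contains(const T &V) const {
    if (!isSmall())
      return Big.count(V) != 0;
    // Small mode: O(Count) linear scan, so cost grows with N.
    const auto End = Inline.begin() + Count;
    return std::find(Inline.begin(), End, V) != End;
  }

  // Returns true if V was newly inserted.
  bool insert(const T &V) {
    if (contains(V))
      return false;
    if (isSmall() && Count < N) {
      Inline[Count++] = V;
      return true;
    }
    if (isSmall()) {
      // Spill the inline elements into the tree before adding V.
      Big.insert(Inline.begin(), Inline.begin() + Count);
      Count = 0;
    }
    Big.insert(V);
    return true;
  }
};
```

With a small N the scan touches only a handful of elements; with a large N every miss walks the whole buffer.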
* Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. | Yaron Keren | 2016-01-29 | 1 | -1/+1
  clang part in r259232, this is the LLVM part of the patch.
  llvm-svn: 259240
* ARM: don't mangle DAG constant if it has more than one use | Tim Northover | 2016-01-29 | 1 | -2/+2
  The basic optimisation was to convert (mul $LHS, $complex_constant) into roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected to be cheaper. The original logic checks that the mul only has one use (since we're mangling $complex_constant), but when used in even more complex addressing modes there may be an outer addition that can pick up the wrong value too.
  I *think* the ARM addressing-mode problem is actually unreachable at the moment, but that depends on complex assessments of the profitability of pre-increment addressing modes so I've put a real check in there instead of an assertion.
  llvm-svn: 259228
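The rewrite from a multiply by a complex constant into a multiply by a simpler constant followed by a shift can be checked arithmetically. The sketch below assumes the simple constant is the odd part of the original and the shift amount is its count of trailing zero bits; that choice, and the names used, are illustrative assumptions rather than the backend's actual selection logic.

```cpp
#include <cstdint>

// Hypothetical decomposition C == Odd << Shift.
struct MulShift {
  uint32_t Odd;
  unsigned Shift;
};

// Strip trailing zero bits from C. For C == 0 this returns {0, 0}.
inline MulShift splitConstant(uint32_t C) {
  unsigned Shift = 0;
  while (C != 0 && (C & 1u) == 0) {
    C >>= 1;
    ++Shift;
  }
  return {C, Shift};
}

// Computes X * C as (X * Odd) << Shift. Unsigned arithmetic wraps modulo
// 2^32, so the result matches a plain multiply for every input.
inline uint32_t mulViaShift(uint32_t X, uint32_t C) {
  const MulShift MS = splitConstant(C);
  return (X * MS.Odd) << MS.Shift;
}
```

For example, multiplying by 24 becomes a multiply by 3 followed by a left shift of 3.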
* [ARM] Emit trap instruction using .inst directive | Alexandros Lamprineas | 2016-01-29 | 1 | -6/+5
  The trap instruction is emitted as a data-in-text rather than an instruction. This patch uses the .inst directive for emitting trap.
  Differential Revision: http://reviews.llvm.org/D16684
  llvm-svn: 259182
* ARMv7k: base ABI decision on v7k Arch rather than watchos OS. | Tim Northover | 2016-01-27 | 5 | -6/+7
  Various bits we want to use the new ABI actually compile with "-arch armv7k -miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how slices work.
  llvm-svn: 258975