summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM
Commit message (Collapse)AuthorAgeFilesLines
* ARM, AArch64, X86: Check preserved registers for tail calls.Matthias Braun2016-04-041-8/+7
| | | | | | | | | | | | | | | | We can only perform a tail call to a callee that preserves all the registers that the caller needs to preserve. This situation happens with calling conventions like preserver_mostcc or cxx_fast_tls. It was explicitely handled for fast_tls and failing for preserve_most. This patch generalizes the check to any calling convention. Related to rdar://24207743 Differential Revision: http://reviews.llvm.org/D18680 llvm-svn: 265329
* Add MachineFunctionProperty checks for AllVRegsAllocated for target passesDerek Schuff2016-04-046-1/+30
| | | | | | | | | | | | | | Summary: This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation. Reviewers: qcolombet Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18525 llvm-svn: 265313
* Remove useless check for ThreadModel==Single in ARMISelLowering. NFC.James Y Knight2016-04-011-7/+3
| | | | | | | | | | | ThreadModel::Single is already handled already by ARMPassConfig adding LowerAtomicPass to the pass list, which lowers all atomics to non-atomic ops and deletes fences. So by the time we get to ISel, there's no atomic fences left, so they don't need special handling. llvm-svn: 265178
* Fix for pr24346: arm asm label calculation error in subJames Molloy2016-04-013-6/+16
| | | | | | | | | | | | | | | | | | | | | | Some ARM instructions encode 32-bit immediates as a 8-bit integer (0-255) and a 4-bit rotation (0-30, even) in its least significant 12 bits. The original fixup, FK_Data_4, patches the instruction by the value bit-to-bit, regardless of the encoding. For example, assuming the label L1 and L2 are 0x0 and 0x104 respectively, the following instruction: add r0, r0, #(L2 - L1) ; expects 0x104, i.e., 260 would be assembled to the following, which adds 1 to r0, instead of 260: e2800104 add r0, r0, #4, 2 ; equivalently 1 The new fixup kind fixup_arm_mod_imm takes care of the encoding: e2800f41 add r0, r0, #260 Patch by Ting-Yuan Huang! llvm-svn: 265122
* [ARM] Expand v1i64 and v2i64 ctpop.Benjamin Kramer2016-03-311-0/+2
| | | | | | | The default is legal, which results in 'Cannot select' errors. This is triggered during selfhost due to a recent cost model change. llvm-svn: 265040
* Change eliminateCallFramePseudoInstr() to return an iteratorHans Wennborg2016-03-314-9/+8
| | | | | | | | | | | | | | | | | | | | | This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction. It also greatly simplifies the calls to this function from Prolog- EpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator. Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment. Differential Revision: http://reviews.llvm.org/D18627 llvm-svn: 265036
* CodeGen: Factor out code for tail call result compatibility check; NFCMatthias Braun2016-03-301-36/+10
| | | | llvm-svn: 264959
* Silencing warnings from MSVC 2015 Update 2. All of these changes silence ↵Aaron Ballman2016-03-301-1/+1
| | | | | | "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC. llvm-svn: 264929
* Remove HasFnAttribute guards to getFnAttribute callsNirav Dave2016-03-301-1/+0
| | | | | | | | | | | | These checks are redundant and can be removed Reviewers: hans Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D18564 llvm-svn: 264872
* Swift Calling Convention: add swiftself attribute.Manman Ren2016-03-292-0/+11
| | | | | | Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754
* ARM: maintain BB ordering when expanding WIN__DBZCHKSaleem Abdulrasool2016-03-251-1/+1
| | | | | | | | | | | | | It is possible to have a fallthrough MBB prior to MBB placement. The original addition of the BB would result in reordering the BB as not preceding the successor. Because of the fallthrough nature of the BB, we could end up executing incorrect code or even a constant pool island! Insert the spliced BB into the same location to avoid that. Thanks to Tim Northover for invaluable hints and Fiora for the discussion on what may have been occurring! llvm-svn: 264454
* ARM: fix optimised division on WoASaleem Abdulrasool2016-03-251-0/+1
| | | | | | | | | We did not have an explicit branch to the continuation BB. When the check was hoisted, this could permit control follow to fall through into the division trap. Add the explicit branch to the continuation basic block to ensure that code execution is correct. llvm-svn: 264370
* Replace a string comparison in ARMSubtarget.h with a tablegen entry in ↵Artyom Skrobov2016-03-232-5/+8
| | | | | | | | | | | | ARM.td (NFC) Reviewers: rengolin, t.p.northover Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D18393 llvm-svn: 264165
* ARM: Better codegen for 64-bit compares.Peter Collingbourne2016-03-212-0/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a custom lowering for ISD::SETCCE (introduced in r253572) that allows us to emit a short code sequence for 64-bit compares. Before: push {r7, lr} cmp r0, r2 mov.w r0, #0 mov.w r12, #0 it hs movhs r0, #1 cmp r1, r3 it ge movge.w r12, #1 it eq moveq r12, r0 cmp.w r12, #0 bne .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} After: push {r7, lr} subs r0, r0, r2 sbcs.w r0, r1, r3 bge .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} Saves around 80KB in Chromium's libchrome.so. Some notes on this patch: - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)). - No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1. Differential Revision: http://reviews.llvm.org/D15256 llvm-svn: 263962
* [ARM] Add Cortex-A32 supportRenato Golin2016-03-212-2/+10
| | | | | | | | Adding Cortex-A32 as an available target in the ARM backend. Patch by Sam Parker. llvm-svn: 263956
* [CXX_FAST_TLS] Fix issues in ARM.Manman Ren2016-03-181-2/+3
| | | | | | | | | We need to be careful on which registers can be explicitly handled via copies. Prologue, Epilogue use physical registers and if one belongs to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites it after the explicit copies. llvm-svn: 263857
* [CXX_FAST_TLS] Disable tail call when calling conventions are mismatched.Manman Ren2016-03-181-0/+7
| | | | | | | Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller and callee have mismatched calling conventions. llvm-svn: 263856
* [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.Manman Ren2016-03-181-0/+1
| | | | | | | Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855
* ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering.Tim Northover2016-03-171-2/+3
| | | | llvm-svn: 263741
* ARM: Revert SVN r253865, 254158, fix windows divisionSaleem Abdulrasool2016-03-171-7/+18
| | | | | | | | | | | | | | | | | | | | | | The two changes together weakened the test and caused a regression with division handling in MSVC mode. They were applied to avoid an assertion being triggered in the block frequency analysis. However, the underlying problem was simply being masked rather than solved properly. Address the actual underlying problem and revert the changes. Rather than analyze the cause of the assertion, the division failure was assumed to be an overflow. The underlying issue was a subtle bug in the BB construction in the emission of the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor information in the basic blocks, nor did we update the PHIs associated with the basic block when we split them. This would result in assertions being triggered in the block frequency analysis pass. Although the original tests are being removed, the tests themselves performed very little in terms of validation but merely tested that we did not assert when generating code. Update this with new tests that actually ensure that we do not regress on the code generation. llvm-svn: 263714
* Tweak some atomics functions in preparation for larger changes; NFC.James Y Knight2016-03-162-7/+12
| | | | | | | | | | | | | | | | - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665
* [MC] Rename TLSDESC as it's not ARM specific.Davide Italiano2016-03-151-1/+1
| | | | | | Similarly to what was done for TLSCALL in r263515. llvm-svn: 263564
* [MC] Rename TLSCALL as it's not ARM specific.Davide Italiano2016-03-152-5/+5
| | | | | | | | | | | | | | `MCSymbolRefExpr` variant kind for TLSCALL is prefixed with _ARM_ since this is how it was originally implemented. The X86_64 version is exactly the same so there's no reason to create a new variant, we can just rename the existing one to be machine-independent. This generalization is the first step to implement support for GNU2 TLS dialect in MC. Differential Revision: http://reviews.llvm.org/D18160 llvm-svn: 263515
* [DAG] use !isUndef() ; NFCISanjay Patel2016-03-141-5/+4
| | | | llvm-svn: 263453
* [DAG] use isUndef() ; NFCISanjay Patel2016-03-141-11/+9
| | | | llvm-svn: 263448
* ARM: Support relative references using the PREL31 symbol variant.Peter Collingbourne2016-03-101-0/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D17937 llvm-svn: 263156
* [ARM] Cortex-R8 supportAlexandros Lamprineas2016-03-101-0/+12
| | | | | | | | | | | This patch adds Cortex-R8 to Target Parser and TableGen. It also adds CodeGen tests for the build attributes. Patch by Pablo Barrio. Differential Revision: http://reviews.llvm.org/D17925 llvm-svn: 263132
* ARM: follow up improvements for SVN r263118Saleem Abdulrasool2016-03-104-4/+14
| | | | | | | | | | | | | | The initial change was insufficiently complete for always getting the semantics of __builtin_longjmp correct. The builtin is translated into a `tInt_eh_sjlj_longjmp` DAG node. This node set R7 as clobbered. However, the code would then follow up with a clobber of R11. I had failed to notice the imp-def,kill on R7 in the isel. Unfortunately, it seems that it is not possible to conditionalise the Defs list via an !if. Instead, construct a new parallel WIN node and prefer that when targeting windows. This ensures that we now both correctly model the __builtin_longjmp as well as construct the frame in a more ABI conformant manner. llvm-svn: 263123
* ARM: correct __builtin_longjmp on WoASaleem Abdulrasool2016-03-101-1/+3
| | | | | | | | WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast to AAPCS which states r7. This adjusts __builtin_longjmp to not clobber r7 and to properly restore the frame pointer on execution. llvm-svn: 263118
* Add support for a preserve_most calling convention to the AArch64 backend.Roman Levenstein2016-03-101-0/+4
| | | | | | | | | | This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention. Differential Revision: http://reviews.llvm.org/D18016 llvm-svn: 263092
* [ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC)Artyom Skrobov2016-03-083-107/+74
| | | | | | | | | | Reviewers: t.p.northover, grosbach, resistor Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D17636 llvm-svn: 262936
* [ARM] Merging 64-bit divmod lib calls into oneRenato Golin2016-03-041-0/+9
| | | | | | | | | | | | | | | | | | | | | When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262738
* Revert "[ARM] Merging 64-bit divmod lib calls into one"Renato Golin2016-03-031-9/+0
| | | | | | This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594
* [ARM] Merging 64-bit divmod lib calls into oneRenato Golin2016-03-021-0/+9
| | | | | | | | | | | | | | | | | When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262507
* ARM: Introduce conservative load/store optimization modeMatthias Braun2016-03-021-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which means that unaligned loads/stores do not trap and even extensive testing will not catch these bugs. However the multi/double variants are not affected by this bit and will still trap. In effect a more aggressive load/store optimization will break existing (bad) code. These bugs do not necessarily manifest in the broken code where the misaligned pointer is formed but often later in perfectly legal code where it is accessed. This means recompiling system libraries (which have no alignment bugs) with a newer compiler will break existing applications (with alignment bugs) that worked before. So (under protest) I implemented this safe mode which limits the formation of multi/double operations to cases that are not affected by user code (stack operations like spills/reloads) or cases where the normal operations trap anyway (floating point load/stores). It is disabled by default. Differential Revision: http://reviews.llvm.org/D17015 llvm-svn: 262504
* TableGen: Check scheduling models for completenessMatthias Braun2016-03-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | TableGen checks at compiletime that for scheduling models with "CompleteModel = 1" one of the following holds: - Is marked with the hasNoSchedulingInfo flag - The instruction is a subclass of Sched - There are InstRW definitions in the scheduling model Typical steps necessary to complete a model: - Ensure all pseudo instructions that are expanded before machine scheduling (usually everything handled with EmitYYY() functions in XXXTargetLowering). - If a CPU does not support some instructions mark the corresponding resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }". - Add missing scheduling information. Differential Revision: http://reviews.llvm.org/D17747 llvm-svn: 262384
* CodeGen: Change MachineInstr to use MachineInstr&, NFCDuncan P. N. Exon Smith2016-02-274-6/+6
| | | | | | | | Change MachineInstr API to prefer MachineInstr& over MachineInstr* whenever the parameter is expected to be non-null. Slowly inching toward being able to fix PR26753. llvm-svn: 262149
* ARM: disallow pc as a base register in Thumb2 memory ops.Tim Northover2016-02-252-2/+2
| | | | | | | These should all be deferring to the "OP (literal)" variant according to the ARM ARM. llvm-svn: 261895
* ARM: fix handling of movw/movt relocations with addend.Tim Northover2016-02-231-3/+6
| | | | | | | | We were emitting only one half of a the paired relocations needed for these instructions because we decided that an offset needed a scattered relocation. In fact, movw/movt relocations can be paired without being scattered. llvm-svn: 261679
* Fix PR25339: ARM Constant IslandWeiming Zhao2016-02-231-9/+39
| | | | | | | | | | | | | | | | Summary: Currently, the ARM Constant Island may not converge (or not converge quickly). This patch let it move to the closest water after the user if it doesn't converge after 15 iterations. This address https://llvm.org/bugs/show_bug.cgi?id=25339 Reviewers: t.p.northover, srhines, kristof.beyls, aadg, rengolin Subscribers: weimingz, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D16890 llvm-svn: 261665
* [ARM] fix initialization of PredictableSelectIsExpensiveJunmo Park2016-02-231-1/+1
| | | | | | | | | | | | Summary: If we want classify OoO or not, using getSchedModel().isOutOfOrder() could be more proper way than using Subtarget->isLikeA9(). Reviewers: jmolloy, rengolin Differential Revision: http://reviews.llvm.org/D17433 llvm-svn: 261623
* CodeGen: TII: Take MachineInstr& in predicate API, NFCDuncan P. N. Exon Smith2016-02-2310-88/+86
| | | | | | | | | | | | | Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions. No functionality change intended. llvm-svn: 261605
* CodeGen: Bring back MachineBasicBlock::iterator::getInstrIterator()...Duncan P. N. Exon Smith2016-02-222-2/+3
| | | | | | | | | | | | | | | | | | This is a little embarrassing. When I reverted r261504 (getIterator() => getInstrIterator()) in r261567, I did a `git grep` to see if there were new calls to `getInstrIterator()` that I needed to migrate. There were 10-20 hits, and I blindly did a `sed ...` before calling `ninja check`. However, these were `MachineInstrBundleIterator::getInstrIterator()`, which predated r261567. Perhaps coincidentally, these had an identical name and return type. This commit undoes my careless sed and restores `MachineBasicBlock::iterator::getInstrIterator()`. llvm-svn: 261577
* Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"Duncan P. N. Exon Smith2016-02-223-11/+10
| | | | | | | | | | This reverts commit r261504, since it's not obvious the new name is better: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html I'll recommit if we get consensus that it's the right direction. llvm-svn: 261567
* CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFCDuncan P. N. Exon Smith2016-02-213-9/+9
| | | | | | | | | | | | | | | | | | | | | | | Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr. - Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available. - Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock. - Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to disintinguish it clearly from ilist_node::getIterator(). - Update all calls. Some of these I switched to `auto` to remove boiler-plate, since the new name is clear about the type. There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment. llvm-svn: 261504
* ADT: Remove == and != comparisons between ilist iterators and pointersDuncan P. N. Exon Smith2016-02-211-1/+1
| | | | | | | | | | | | | | I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498
* Minor code cleanups. NFC.Junmo Park2016-02-191-3/+3
| | | | llvm-svn: 261294
* [CodeGen] Document and use getConstant's splat-building feature. NFC.Ahmed Bougacha2016-02-151-6/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D17229 llvm-svn: 260901
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.Ahmed Bougacha2016-02-091-36/+20
| | | | llvm-svn: 260316
* ARM: support TLS for WoASaleem Abdulrasool2016-02-035-0/+62
| | | | | | | | | | | Add support for TLS access for Windows on ARM. This generates a similar access to MSVC for ARM. The changes to the tablegen data is needed to support loading an external symbol global that is not for a call. The adjustments to the DAG to DAG transforms are needed to preserve the 32-bit move. llvm-svn: 259676
OpenPOWER on IntegriCloud