summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [ARM] Remove redundant computeKnownBits helper.Eli Friedman2017-04-191-29/+14
| | | | | | | | | | | | Move the BFI logic to computeKnownBitsForTargetNode, and delete the redundant CMOV logic. This is intended as a cleanup, but it's probably possible to construct a case where moving the BFI logic allows more combines. Differential Revision: https://reviews.llvm.org/D31795 llvm-svn: 300752
* [ARM] Use TableGen patterns to select vtbl. NFC.Eli Friedman2017-04-191-0/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D32103 llvm-svn: 300749
* DAG: Make mayBeEmittedAsTailCall parameter constMatt Arsenault2017-04-181-1/+1
| | | | llvm-svn: 300603
* [ARM] Check for correct HW div when lowering divmodDiana Picus2017-04-181-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For subtargets that use the custom lowering for divmod, e.g. gnueabi, we used to check if the subtarget has hardware divide and then lower to a div-mul-sub sequence if true, or to a libcall if false. However, judging by the usage of hasDivide vs hasDivideInARMMode, it seems that hasDivide only refers to Thumb. For instance, in the ARMTargetLowering constructor, the code that specifies whether to use libcalls for (S|U)DIV looks like this: bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide() : Subtarget->hasDivideInARMMode(); In the case of divmod for arm-gnueabi, using only hasDivide() to determine what to do means that instead of lowering to __aeabi_idivmod to get the remainder, we lower to div-mul-sub and then further lower the div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but not in Thumb, we generate a libcall instead of using it (this is not an issue in practice since AFAICT none of the cores that we support have hardware divide in ARM but not Thumb). This patch fixes the code dealing with custom lowering to take into account the mode (Thumb or ARM) when deciding whether or not hardware division is available. Differential Revision: https://reviews.llvm.org/D32005 llvm-svn: 300536
* [ARM/AArch64] Ensure valid vector element types for interleaved accessesMatthew Simpson2017-04-101-24/+32
| | | | | | | | | | | This patch refactors and strengthens the type checks performed for interleaved accesses. The primary functional change is to ensure that the interleaved accesses have valid element types. The added test cases previously failed because the element type is f128. Differential Revision: https://reviews.llvm.org/D31817 llvm-svn: 299864
* [SelectionDAG] [ARM CodeGen] Fix chain information of LowerMULHuihui Zhang2017-04-061-2/+13
| | | | | | | | | | | | | | In LowerMUL, the chain information is not preserved for the new created Load SDNode. For example, if a Store alias with one of the operand of Mul. The Load for that operand need to be scheduled before the Store. The dependence is recorded in the chain of Store, in TokenFactor. However, when lowering MUL, the SDNodes for the new Loads for VMULL are not updated in the TokenFactor for the Store. Thus the chain is not preserved for the lowered VMULL. llvm-svn: 299701
* [DAGCombiner] Add vector demanded elements support to ↵Simon Pilgrim2017-03-311-0/+1
| | | | | | | | | | computeKnownBitsForTargetNode Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes. Differential Revision: https://reviews.llvm.org/D31249 llvm-svn: 299201
* [ARM] Fix mixup between Lo and Hi in SMLALBB formation.Eli Friedman2017-03-251-4/+4
| | | | llvm-svn: 298752
* [ARM] Fix computeKnownBits for ARMISD::CMOVPirama Arumuga Nainar2017-03-231-2/+2
| | | | | | | | | | | | | | | | | | | Summary: The true and false operands for the CMOV are operands 0 and 1. ARMISelLowering.cpp::computeKnownBits was looking at operands 1 and 2 instead. This can cause CMOV instructions to be incorrectly folded into BFI if value set by the CMOV is another CMOV, whose known bits are computed incorrectly. This patch fixes the issue and adds a test case. Reviewers: kristof.beyls, jmolloy Subscribers: llvm-commits, aemerson, srhines, rengolin Differential Revision: https://reviews.llvm.org/D31265 llvm-svn: 298624
* Reapply r298417 "[ARM] Recommit the glueless lowering of addc/adde in Thumb1"Artyom Skrobov2017-03-221-13/+83
| | | | | | | | The UB in t2_so_imm_neg conversion has been addressed under D31242 / r298512 This reverts commit r298482. llvm-svn: 298562
* Revert "[ARM] Recommit the glueless lowering of addc/adde in Thumb1, ↵Vitaly Buka2017-03-221-83/+13
| | | | | | | | | | including the amended (no UB anymore) fix for adding/subtracting -2147483648." Fails check-llvm with ubsan This reverts commit r298417. llvm-svn: 298482
* [ARM] Recommit the glueless lowering of addc/adde in Thumb1,Artyom Skrobov2017-03-211-13/+83
| | | | | | | | | including the amended (no UB anymore) fix for adding/subtracting -2147483648. This reverts r298328 "[ARM] Revert r297443 and r297820." and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"" llvm-svn: 298417
* [ARM] Revert r297443 and r297820.Eli Friedman2017-03-211-83/+13
| | | | | | | | | | | | The glueless lowering of addc/adde in Thumb1 has known serious miscompiles (see https://reviews.llvm.org/D31081), and r297820 causes an infinite loop for certain constructs. It's not clear when they will be fixed, so let's just take them out of the tree for now. (I resolved a small conflict with r297453.) llvm-svn: 298328
* [ARM] Fix PR32130: Handle promotion of zero sized constants.Vadzim Dambrouski2017-03-201-1/+2
| | | | | | | | | | | The special case of zero sized values was previously not handled correctly. This patch handles this by not promoting if the size is zero. Patch by Tim Neumann. Differential Revision: https://reviews.llvm.org/D31116 llvm-svn: 298320
* Make library calls sensitive to regparm module flag (Fixes PR3997).Nirav Dave2017-03-181-1/+1
| | | | | | | | | | Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179
* Capitalize ArgListEntry fields. NFC.Nirav Dave2017-03-181-12/+12
| | | | llvm-svn: 298178
* Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"Artyom Skrobov2017-03-151-4/+4
| | | | | | This reverts r297820 which apparently fails on A15 hosts. llvm-svn: 297842
* [Thumb1] Fix the bug when adding/subtracting -2147483648Artyom Skrobov2017-03-151-4/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D30829 llvm-svn: 297820
* [ARM] Enable SMLAL[B|T] iselSam Parker2017-03-151-5/+128
| | | | | | | | | | | Enable the selection of the 64-bit signed multiply accumulate instructions which operate on 16-bit operands. These are enabled for ARMv5TE onwards for ARM and for V6T2 and other DSP enabled Thumb architectures. Differential Revision: https://reviews.llvm.org/D30044 llvm-svn: 297809
* [ARM] Move SMULW[B|T] isel to DAG CombineSam Parker2017-03-141-0/+113
| | | | | | | | | | | | Create nodes for smulwb and smulwt and move their selection from DAGToDAG to DAG combine. smlawb and smlawt can then be selected using tablegen. Added some helper functions to detect shift patterns as well as a wrapper around SimplifyDemandBits. Added a couple of extra tests. Differential Revision: https://reviews.llvm.org/D30708 llvm-svn: 297716
* [Thumb1] combine ADDC/SUBC with a negative immediateArtyom Skrobov2017-03-131-0/+20
| | | | | | | | | | | | Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400 Reviewers: efriedma, jmolloy Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30829 llvm-svn: 297682
* Refactor the multiply-accumulate combines to act onArtyom Skrobov2017-03-101-76/+64
| | | | | | | | | | | | | | | | | ARMISD::ADD[CE] nodes, instead of the generic ISD::ADD[CE]. Summary: This allows for some simplification because the combines are no longer limited to just one go at the node before it gets legalized into an ARM target-specific one. Reviewers: jmolloy, rogfer01 Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30401 llvm-svn: 297453
* For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,Artyom Skrobov2017-03-101-12/+63
| | | | | | | | | | | | same as already done for ARM and Thumb2. Reviewers: jmolloy, rogfer01, efriedma Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30400 llvm-svn: 297443
* [ARM] Reapply r296865 "[ARM] fpscr read/write intrinsics not aware of each ↵Ranjeet Singh2017-03-071-3/+4
| | | | | | | | | | | | | | | | | | | | other"" The original patch r296865 was reverted as it broke the chromium builds for Android https://bugs.llvm.org/show_bug.cgi?id=32134, this patch reapplies r296865 with a fix to make sure it doesn't cause the build regression. The problem was that intrinsic selection on int_arm_get_fpscr was failing in ISel this was because the code to manually select this intrinsic still thought it was the version with no side-effects (INTRINSIC_WO_CHAIN) which is wrong as it doesn't semantically match the definition in the tablegen code which says it does have side-effects, I've fixed this by updating the intrinsic type to INTRINSIC_W_CHAIN (has side-effects). I've also added a test for this based on Hans original reproducer. Differential Revision: https://reviews.llvm.org/D30645 llvm-svn: 297137
* [ARM/AArch64] Support wide interleaved accessesMatthew Simpson2017-03-021-53/+140
| | | | | | | | | | | | | This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types to interleaved access intrinsics as long as the types are multiples of the vector register width. A "wide" access will now be mapped to multiple interleave intrinsics similar to the way in which non-interleaved accesses with illegal types are legalized into multiple accesses. I'll update the associated TTI costs (in getInterleavedMemoryOpCost) as a follow-on. Differential Revision: https://reviews.llvm.org/D29466 llvm-svn: 296750
* [ARM] don't transform an add(ext Cond), C to select unless there's a setcc ↵Sanjay Patel2017-02-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | of the condition The transform in question claims to be doing: // fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c)) ...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node for the sext/zext patterns. This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(), so I was seeing infinite loops with my draft of a patch applied. The changes in select_const.ll look positive (less instructions). The change in arm-and-tst-peephole.ll is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's not affected by this code change. Differential Revision: https://reviews.llvm.org/D30355 llvm-svn: 296389
* Fix PR31896.Evgeniy Stepanov2017-02-211-5/+8
| | | | | | Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset). llvm-svn: 295762
* [ARM] Replace HasT2ExtractPack with HasDSPSam Parker2017-02-171-2/+2
| | | | | | | | | | | Removed the HasT2ExtractPack feature and replaced its references with HasDSP. This then allows the Thumb2 extend instructions to be selected for ARMv8M +dsp. These instruction descriptions have also been refactored and more target tests have been added for their isel. Differential Revision: https://reviews.llvm.org/D29623 llvm-svn: 295452
* [ARM] Fix crash caused by r294945James Molloy2017-02-131-2/+4
| | | | | | | | I'd missed a creator of FCMP nodes - duplicateCmp(). Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite. llvm-svn: 294968
* [ARM] Use VCMP, not VCMPE, for floating point equality comparisonsJames Molloy2017-02-131-13/+32
| | | | | | | | | | | | | | | | | | | | | | | | | When generating a floating point comparison we currently unconditionally generate VCMPE. This has the sideeffect of setting the cumulative Invalid bit in FPSCR if any of the operands are QNaN. It is expected that use of a relational predicate on a QNaN value should raise Invalid. Quoting from the C standard: The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of relationships the less, greater, equal and is true. Relational operators may raise the floating-point exception when argument values are NaNs. The standard doesn't explicitly state the expectation for equality operators, but the implication and obvious expectation is that equality operators should not raise Invalid on a QNaN input, as those predicates are wholly defined on unordered inputs (to return not equal). Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if QNaN should raise Invalid, and pipe that through to TableGen. llvm-svn: 294945
* [ARM] Don't lower f16 interleaved accesses.Ahmed Bougacha2017-02-111-0/+14
| | | | | | | | | | | | There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure. llvm-svn: 294818
* [ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not ↵Arnold Schwaighofer2017-02-081-1/+2
| | | | | | | | | | | | | | | | | | | 'this returns' We mark X0 as preserved by a call that passes the returned parameter. x0 = ... fun(x0) // no implicit def of x0 This no longer is valid if we pass the parameter in a different register then the returned value as is the case with a swiftself parameter (passed in x20). x20 = ... fun(x20) // there should be an implict def of x8 rdar://30425845 llvm-svn: 294527
* [ARM] Make RWPI use movw/movt when availableChristof Douma2017-02-071-8/+15
| | | | | | | | | | | | | | | | | When constructing global address literals while targeting the RWPI relocation model. LLVM currently only uses literal pools. If MOVW/MOVT instructions are available we can use these instead. Beside being more efficient it allows -arm-execute-only to work with -relocation-model=RWPI as well. When we generate MOVW/MOVT for global addresses when targeting the RWPI relocation model, we need to use base relative relocations. This patch does the needed plumbing in MC to generate these for MOVW/MOVT. Differential Revision: https://reviews.llvm.org/D29487 Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d llvm-svn: 294298
* [CodeGen] Remove dead call-or-prologue enum from CCStateReid Kleckner2017-02-021-30/+9
| | | | | | | This enum has been dead since Olivier Stannard re-implemented ARM byval handling in r202985 (2014). llvm-svn: 293943
* [LV] Move interleaved access helper functions to VectorUtils (NFC)Matthew Simpson2017-02-011-14/+3
| | | | | | | | | | | | This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788
* Cleanup dump() functions.Matthias Braun2017-01-281-1/+1
| | | | | | | | | | | | | | | | | | We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html For reference: - Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP) - The definition of a dump method should look like this: #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP) LLVM_DUMP_METHOD void MyClass::dump() { // print stuff to dbgs()... } #endif llvm-svn: 293359
* [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other ↵Eugene Zelenko2017-01-271-46/+93
| | | | | | minor fixes (NFC). llvm-svn: 293348
* ARM: fix vectorized division on WoASaleem Abdulrasool2017-01-271-2/+2
| | | | | | | | | | | | | | The Windows on ARM target uses custom division for normal division as the backend needs to insert division-by-zero checks. However, it is designed to only handle non-vectorized division. ARM has custom lowering for vectorized division as that can avoid loading registers with the values and invoke a division routine for each one, preferring to lower using NEON instructions. Fall back to the custom lowering for the NEON instructions if we encounter a vectorized division. Resolves PR31778! llvm-svn: 293259
* [ARM] Use helpers for adding pred / CC operands. NFCDiana Picus2017-01-201-13/+17
| | | | | | | | | | | Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0) and replace with add(condCodeOp()) and add(predOps()). This should make it easier to understand what those operands represent (without having to look at the definition of the instruction that we're adding to). Differential Revision: https://reviews.llvm.org/D27984 llvm-svn: 292587
* ARM: match GCC's behaviour for builtinsSaleem Abdulrasool2017-01-131-166/+16
| | | | | | | | | | | | GCC changes the CC between the user-code and the builtins based on the value of `-target` rather than `-mfloat-abi`. When a HF target is used, the VFP variant of the AAPCS CC is used. Otherwise, the AAPCS variant is used. In all cases, the AEABI functions use the AAPCS CC. Adjust the calling convention based on the target. Resolves PR30543! llvm-svn: 291909
* Apply clang-tidy's performance-unnecessary-value-param to LLVM.Benjamin Kramer2017-01-131-1/+1
| | | | | | | With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904
* [ARM] CodeGen: Replace AddDefaultT1CC and AddNoT1CC. NFCDiana Picus2017-01-131-15/+15
| | | | | | | | | | | For AddDefaultT1CC, we add a new helper t1CondCodeOp, which creates the appropriate register operand. For AddNoT1CC, we use the existing condCodeOp helper - we only had two uses of AddNoT1CC, so at this point it's probably not worth having yet another helper just for them. Differential Revision: https://reviews.llvm.org/D28603 llvm-svn: 291894
* [ARM] CodeGen: Remove AddDefaultCC. NFC.Diana Picus2017-01-131-19/+26
| | | | | | | | | | Replace all uses of AddDefaultCC with add(condCodeOp()). The transformation has been done automatically with a custom tool based on Clang AST Matchers + RefactoringTool. Differential Revision: https://reviews.llvm.org/D28557 llvm-svn: 291893
* [CodeGen] Rename MachineInstrBuilder::addOperand. NFCDiana Picus2017-01-131-12/+12
| | | | | | | | | | | Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891
* [ARM] CodeGen: Remove AddDefaultPred. NFC.Diana Picus2017-01-131-179/+241
| | | | | | | | | | | | | | | | | | | | | | | | | Replace all uses of AddDefaultPred with MachineInstrBuilder::add(predOps()). This makes the code building MachineInstrs more readable, because it allows us to write code like: MIB.addSomeOperand(blah) .add(predOps()) .addAnotherOperand(blahblah) instead of AddDefaultPred(MIB.addSomeOperand(blah)) .addAnotherOperand(blahblah) This commit also adds the predOps helper in the ARM backend, as well as the add method taking a variable number of operands to the MachineInstrBuilder. The transformation has been done mostly automatically with a custom tool based on Clang AST Matchers + RefactoringTool. Differential Revision: https://reviews.llvm.org/D28555 llvm-svn: 291890
* ARM: slightly more table driven libcall setupSaleem Abdulrasool2017-01-121-26/+59
| | | | | | | | | Switch some additional library call setup to be table driven. This makes it more immediately obvious what the library call looks like. This is important for ARM since the calling conventions for the builtins change based on the target/libcall name. NFC llvm-svn: 291789
* [ARM] More aggressive matching for vpadd and vpaddl.Eli Friedman2017-01-111-4/+104
| | | | | | | | | The new matchers work after legalization to make them simpler, and to avoid blocking other optimizations. Differential Revision: https://reviews.llvm.org/D27779 llvm-svn: 291693
* [ARM] Remove rbit intrinsics and autoupgrade to generic bitreverse.Chad Rosier2017-01-101-5/+0
| | | | | | Testing already covered by CodeGen/ARM/rbit.ll llvm-svn: 291587
* Caught a simple typo. I do not know of a way to test this, but it seems like ↵Aaron Ballman2016-12-301-1/+1
| | | | | | an unlikely thing to regress in the future. llvm-svn: 290757
* [ARM] Implement isExtractSubvectorCheap.Eli Friedman2016-12-201-0/+8
| | | | | | | | | | | | | | See https://reviews.llvm.org/D6678 for the history of isExtractSubvectorCheap. Essentially the same considerations apply to ARM. This temporarily breaks the formation of vpadd/vpaddl in certain cases; AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles. See https://reviews.llvm.org/D27779 for followup fix. Differential Revision: https://reviews.llvm.org/D27774 llvm-svn: 290198
OpenPOWER on IntegriCloud