path: root/llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp
Commit message | Author | Age | Files | Lines
...
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficientlyJames Molloy2016-09-131-4/+133
| For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).
|
| 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
| 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
| 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
| 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.
|
| 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.
|
| llvm-svn: 281323
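The four cases in the message above can be modelled in plain C++. This is a minimal sketch, not the LLVM implementation: the names `MaskCase`, `classifyMask` and `maskedNonZeroViaShifts` are hypothetical, and the priority between overlapping cases (for example a single-bit mask that also touches the LSB) is assumed to follow the order in which the message lists them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical labels for the cases listed in the commit message.
enum class MaskCase { NotContiguous, TouchesLSB, TouchesMSB, SingleBit, Interior };

// Portable bit-counting helpers; clz32/ctz32 require a nonzero argument.
static unsigned clz32(uint32_t v) { unsigned n = 0; while (!(v & 0x80000000u)) { v <<= 1; ++n; } return n; }
static unsigned ctz32(uint32_t v) { unsigned n = 0; while (!(v & 1u)) { v >>= 1; ++n; } return n; }
static unsigned popcount32(uint32_t v) { unsigned n = 0; for (; v != 0; v &= v - 1) ++n; return n; }

// Classify a bitmask using the contiguity test from the message:
// 32 - clz(bm) - ctz(bm) == popcount(bm).
MaskCase classifyMask(uint32_t bm) {
  if (bm == 0) return MaskCase::NotContiguous;
  unsigned lz = clz32(bm), tz = ctz32(bm), pc = popcount32(bm);
  if (32 - lz - tz != pc) return MaskCase::NotContiguous;
  if (tz == 0) return MaskCase::TouchesLSB;  // case 1 (assumed to take priority)
  if (lz == 0) return MaskCase::TouchesMSB;  // case 2
  if (pc == 1) return MaskCase::SingleBit;   // case 3
  return MaskCase::Interior;                 // case 4
}

// Compute ((x & bm) != 0) using only the shifts each case would emit.
bool maskedNonZeroViaShifts(uint32_t x, uint32_t bm) {
  switch (classifyMask(bm)) {
  case MaskCase::TouchesLSB: return (x << clz32(bm)) != 0;                  // one LSLS, test NE
  case MaskCase::TouchesMSB: return (x >> ctz32(bm)) != 0;                  // one LSRS, test NE
  case MaskCase::SingleBit:  return ((x << clz32(bm)) & 0x80000000u) != 0;  // one LSLS, test MI
  case MaskCase::Interior:   return ((x << clz32(bm)) >> (clz32(bm) + ctz32(bm))) != 0; // LSLS+LSRS
  case MaskCase::NotContiguous: break;
  }
  return (x & bm) != 0;  // fall back to the plain AND + compare
}
```

In every contiguous case the shifted value is nonzero (or negative, for the single-bit case) exactly when `x & bm` is nonzero, which is why the separate compare can be dropped.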
* Revert r281215, it caused PR30358.Nico Weber2016-09-121-134/+4
| | | | llvm-svn: 281263
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficientlyJames Molloy2016-09-121-4/+134
| For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).
|
| 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
| 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
| 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
| 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.
|
| 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.
|
| llvm-svn: 281215
* [Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0)James Molloy2016-09-091-0/+42
| | | | | | The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2. llvm-svn: 281040
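The rewrite above relies on a simple identity in 32-bit wraparound arithmetic: x equals -C exactly when x + C equals 0. The following sketch checks that identity. It assumes that CMPZ only feeds an equality (zero-flag) test, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Equality test written as a compare against the negated constant,
// i.e. the shape of (CMPZ X, -C).
bool equalsNegatedViaCompare(uint32_t x, uint32_t c) {
  return x == (0u - c);
}

// The same test written as an add followed by a compare with zero,
// i.e. the shape of (CMPZ (ADDS X, C), 0). Unsigned addition wraps
// modulo 2^32, so the two forms agree for every x and c.
bool equalsNegatedViaAdd(uint32_t x, uint32_t c) {
  return (x + c) == 0u;
}
```

Because the flag-setting add already produces the zero flag, the trailing compare with 0 carries no extra information, which matches the message's remark that it disappears during peepholing.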
* Replace "fallthrough" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-171-1/+1
| | | | | | | This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902
* [ARM] Constant Materialize: imms with specific value can be encoded into mov.wWeiming Zhao2016-08-051-1/+3
| Summary:
| Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes.
|
| I'm resubmitting this patch. The test case in the original commit r277610 does not specify a triple, so builds with a different default triple will produce different output. This patch fixes the triple as thumb-darwin-apple.
|
| Reviewers: john.brawn, jmolloy, bruno
|
| Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits
|
| Differential Revision: https://reviews.llvm.org/D23090
|
| llvm-svn: 277865
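As a rough illustration of what "splatting the low 8 bits into other bytes" means, the sketch below recognises the three byte-replicated shapes of the Thumb-2 modified immediate encoding (0x00XY00XY, 0xXY00XY00 and 0xXYXYXYXY). The function name is hypothetical, and this is not the check the patch adds; whether the patch covers all three shapes is not stated in the message.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if v is a nonzero byte XY replicated as 0x00XY00XY,
// 0xXY00XY00 or 0xXYXYXYXY.
bool isByteSplat(uint32_t v) {
  uint32_t lo = v & 0xFFu;         // candidate byte for 0x00XY00XY and 0xXYXYXYXY
  uint32_t hi = (v >> 8) & 0xFFu;  // candidate byte for 0xXY00XY00
  if (lo != 0 && v == (lo | (lo << 16))) return true;
  if (lo != 0 && v == lo * 0x01010101u) return true;
  if (hi != 0 && v == ((hi << 8) | (hi << 24))) return true;
  return false;
}
```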
* Revert "[ARM] Constant Materialize: imms with specific value can be encoded into mov.w"Bruno Cardoso Lopes2016-08-031-3/+1
| This reverts commit r277610 / d619aa8878c3dafcc0d29a46517f63ff3209fdd4.
|
| This makes subtarget-no-movt.ll fail in http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/26892
|
| llvm-svn: 277654
* [ARM] Constant Materialize: imms with specific value can be encoded into mov.wWeiming Zhao2016-08-031-1/+3
| | | | | | | | | | | | Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. Reviewers: john.brawn, jmolloy Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277610
* ARM: only form SMMLS when SUBE flags unused.Tim Northover2016-08-021-1/+2
| | | | | | | | In this particular example we wouldn't want the smmls anyway (the value is actually unused), but in general smmls does not provide the required flags register so if that SUBE result is used we can't replace it. llvm-svn: 277541
* MachineFunction: Return reference for getFrameInfo(); NFCMatthias Braun2016-07-281-9/+9
| | | | | | | getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017
* [ARM] Improve longMAC codegen testSam Parker2016-07-251-0/+4
| | | | | | | | Added thumb targets and dataflow checks to the longMAC test. Differential Revision: https://reviews.llvm.org/D22684 llvm-svn: 276629
* [ARM] Enable ISel of SMMLS for ARM and Thumb2Sam Parker2016-07-251-0/+30
| | | | | | | | Use ISelDAGToDAG to recognise the SMMLS instruction pattern. Differential Revision: https://reviews.llvm.org/D22562 llvm-svn: 276624
* [ARM] Skip inline asm memory operands in DAGToDAGISelDiana Picus2016-07-201-0/+11
| | | | | | | | | | | | | | | | | | | | | | Retry r275776 (no changes, we suspect the issue was with another commit). The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 276101
* Revert "[ARM] Skip inline asm memory operands in DAGToDAGISel"Vitaly Buka2016-07-181-11/+0
| | | | | | | | Breaks asan, see https://reviews.llvm.org/D22103 This reverts commit r275776. llvm-svn: 275890
* [ARM] Skip inline asm memory operands in DAGToDAGISelDiana Picus2016-07-181-0/+11
| | | | | | | | | | | | | | | | | | | | The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 275776
* [Thumb-1] Select post-increment load and store where possibleJames Molloy2016-07-151-0/+29
| Thumb-1 doesn't have post-inc or pre-inc load or store instructions. However the LDM/STM instructions with writeback can function as post-inc load/store:
|
|     ldm r0!, {r1}   @ load from r0 into r1 and increment r0 by 4
|
| Obviously, this only works if the post increment is 4.
|
| llvm-svn: 275540
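For context, the kind of source that gives rise to a post-increment-by-4 load is a walk over 32-bit words with `*p++`. The function below is a hypothetical example of that shape; whether a given compiler version actually selects `ldm` with writeback for it depends on the optimizer and target options, so treat it only as an illustration of the access pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sum n 32-bit words. Each iteration loads one word and then advances
// the pointer by sizeof(uint32_t) == 4 bytes, which is the
// "load, then increment by 4" pattern the commit message describes.
uint32_t sumWords(const uint32_t *p, std::size_t n) {
  uint32_t sum = 0;
  while (n--) {
    sum += *p++;
  }
  return sum;
}
```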
* [ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 1 flagDiana Picus2016-07-071-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also removes a command line flag that isn't used in any of the tests: check-vmlx-hazards. It can be replaced easily with the mattr mechanism, since this is now a subtarget feature. There is still some work left regarding FeatureExpandMLx. In the past MLx expansion was enabled for subtargets with hasVFP2(), until r129775 [1] switched from that to isCortexA9, without too much justification. In spite of that, the code performing MLx expansion still contains calls to isSwift/isLikeA9, although the results of those are pretty clear given that we're only enabling it for the A9. We should try to enable it for all targets that have FeatureHasVMLxHazards, as it seems to be closely related to that behaviour, and if that is possible try to clean up the MLx expansion pass from all calls to isWhatever. This will require some performance testing, so it will be done in another patch. [1] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110418/119725.html Differential Revision: http://reviews.llvm.org/D21798 llvm-svn: 274742
* [Thumb] Reapply r272251 with a fix for PR28348 (mk 2)James Molloy2016-07-051-1/+43
| The important thing I was missing was ensuring newly added constants were kept in topological order. Repositioning the node is correct if the constant is newly added (so it has no topological ordering) but wrong if it already existed - positioning it next in the worklist would break the topological ordering.
|
| Original commit message:
|
| [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated
|
| If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead;
|
|     int i(int a) {
|       return a & 0xfffffeec;
|     }
|
| Used to produce:
|
|     ldr r1, [CONSTPOOL]
|     ands r0, r1
|   CONSTPOOL:
|     0xfffffeec
|
| And now produces:
|
|     movs r1, #255
|     adds r1, #20 ; Less costly immediate generation
|     bics r0, r1
|
| llvm-svn: 274543
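The arithmetic behind the example in this message can be checked directly: the bitwise complement of 0xfffffeec is 0x113 = 275 = 255 + 20, so AND with the original immediate equals bit-clear (AND NOT) with the small one. A minimal sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// The original form: AND with the hard-to-materialize immediate.
uint32_t andWithImmediate(uint32_t a) {
  return a & 0xfffffeecu;
}

// The rewritten form: build the complemented immediate the way the
// MOVS/ADDS pair in the message does, then clear those bits (BIC).
uint32_t bicWithComplement(uint32_t a) {
  uint32_t r1 = 255u;  // movs r1, #255
  r1 += 20u;           // adds r1, #20  -> 275 == ~0xfffffeec
  return a & ~r1;      // bics r0, r1
}
```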
* Revert "[Thumb] Reapply r272251 with a fix for PR28348"James Molloy2016-07-041-40/+1
| | | | | | This reverts commit r274510 - it made green dragon unhappy. llvm-svn: 274512
* [Thumb] Reapply r272251 with a fix for PR28348James Molloy2016-07-041-1/+40
| We were using DAG->getConstant instead of DAG->getTargetConstant. This meant that we could inadvertently increase the use count of a constant if stars aligned, which it did in this testcase. Increasing the use count of the constant could cause ISel to fall over (because DAGToDAG lowering assumed the constant had only one use!)
|
| Original commit message:
|
| [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated
|
| If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead;
|
|     int i(int a) {
|       return a & 0xfffffeec;
|     }
|
| Used to produce:
|
|     ldr r1, [CONSTPOOL]
|     ands r0, r1
|   CONSTPOOL:
|     0xfffffeec
|
| And now produces:
|
|     movs r1, #255
|     adds r1, #20 ; Less costly immediate generation
|     bics r0, r1
|
| llvm-svn: 274510
* Revert r272251, it caused PR28348.Nico Weber2016-06-291-40/+1
| | | | llvm-svn: 274141
* [ARM] Enable isel of UMAALSam Parker2016-06-201-0/+40
| | | | | | | | TargetLowering and DAGToDAG are used to combine ADDC, ADDE and UMLAL dags into UMAAL. Selection is split into the two phases because it is easier to match the two patterns at those different times. Differential Revision: http://reviews.llvm.org/D21461 llvm-svn: 273165
* [ARM] Strength reduce vectors to arrays.Benjamin Kramer2016-06-171-22/+10
| | | | | | No functionality change intended. llvm-svn: 273001
* [ARM] Add support for mrrc/mrrc2 intrinsics.Ranjeet Singh2016-06-171-0/+35
| | | | | | | | | | | | | | | | | | | Reapplying patch as it was reverted when it was first committed because of an assertion failure when the mrrc2 intrinsic was called in ARM mode. The failure was happening because the instruction was being built in ARMISelDAGToDAG.cpp and the tablegen description for mrrc2 instruction doesn't allow you to use a predicate. The ARM architecture manuals do say that mrrc2 in ARM mode can be predicated with AL in assembly but this has no effect on the encoding of the instruction as the top 4 bits will always be 1111 not 1110 which is the encoding for the condition AL. Differential Revision: http://reviews.llvm.org/D21408 llvm-svn: 272982
* Reverting r272778 because there's an assertionRanjeet Singh2016-06-151-28/+0
| | | | | | failure when running the test CodeGen/ARM/intrinsics-coprocessor.ll llvm-svn: 272791
* [ARM] Add support for mrrc/mrrc2 intrinsics.Ranjeet Singh2016-06-151-0/+28
| | | | | | Differential Revision: http://reviews.llvm.org/D21178 llvm-svn: 272778
* [Thumb] Fix off-by-one error in r272007James Molloy2016-06-141-1/+1
| | | | | | | | We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256. Found by Oliver Stannard and csmith! llvm-svn: 272665
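The bound in this fix follows from the 8-bit immediates involved: a move of at most 255 plus an add of at most 255 reaches 510, never 511. The sketch below splits a constant accordingly. The struct and function names are hypothetical, and the choice of 255 as the move value is an assumption made for illustration; the message does not say which split is used.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the two 8-bit immediates of a MOV+ADD pair.
struct MovAdd {
  uint32_t mov;  // immediate for the move, 0..255
  uint32_t add;  // immediate for the add, 0..255
};

// Try to express c as mov + add with both parts in 0..255.
// Only constants in [256, 510] need (and fit) the two-instruction form.
bool splitMovAdd(uint32_t c, MovAdd &out) {
  if (c < 256u || c > 510u) return false;  // 511 would need "add #256"
  out.mov = 255u;
  out.add = c - 255u;                      // 1..255 for c in [256, 510]
  return true;
}
```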
* [ARM] Reverting r272544 because clang patch needsRanjeet Singh2016-06-131-28/+0
| | | | | | | to go in as soon as llvm patch has gone in because tests will start breaking in Clang. llvm-svn: 272546
* [ARM] Add mrrc/mrrc2 co-processor intrinsicsRanjeet Singh2016-06-131-0/+28
| | | | | | | | | | | The MRRC/MRRC2 instruction writes to two registers. The intrinsic definition returns a single uint64_t to represent the write; this is a compact way of representing a write to two 32-bit registers. The alternative might have been to return a struct of two uint32_t's, but this isn't as nice. Differential Revision: llvm-svn: 272544
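To make the two representations concrete, the sketch below packs a pair of 32-bit values into one uint64_t and unpacks it again. The names are hypothetical, and the assignment of the first register to the low half is an assumption made for illustration; the message does not specify the ordering.

```cpp
#include <cassert>
#include <cstdint>

// The struct alternative mentioned in the message (hypothetical layout).
struct RegPair {
  uint32_t lo;  // assumed: value of the first destination register
  uint32_t hi;  // assumed: value of the second destination register
};

// Pack two 32-bit register values into the single 64-bit result.
uint64_t packPair(RegPair p) {
  return (static_cast<uint64_t>(p.hi) << 32) | p.lo;
}

// Recover the two 32-bit halves from the 64-bit result.
RegPair unpackPair(uint64_t v) {
  return RegPair{static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}
```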
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-6/+7
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negatedJames Molloy2016-06-091-1/+40
| If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead;
|
|     int i(int a) {
|       return a & 0xfffffeec;
|     }
|
| Used to produce:
|
|     ldr r1, [CONSTPOOL]
|     ands r0, r1
|   CONSTPOOL:
|     0xfffffeec
|
| And now produces:
|
|     movs r1, #255
|     adds r1, #20 ; Less costly immediate generation
|     bics r0, r1
|
| llvm-svn: 272251
* [Thumb-1] Add optimized constant materialization for integers [256..512)James Molloy2016-06-071-0/+1
| | | | | | We can materialize these integers using a MOV; ADDi8 pair. llvm-svn: 272007
* [ARM] Add additional matching for UBFX instructionsOliver Stannard2016-06-011-0/+21
| | | | | | | | | | | This adds an additional matcher to select UBFX(..) from SRL(AND(..)) in ARMISelDAGToDAG to help with code size. Patch by David Green. Differential Revision: http://reviews.llvm.org/D20667 llvm-svn: 271384
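The equivalence behind this matcher is that masking a field and then shifting it down gives the same value as an unsigned bit-field extract of that field. A minimal sketch with hypothetical function names, modelling UBFX as "take `width` bits starting at bit `lsb`":

```cpp
#include <cassert>
#include <cstdint>

// All-ones mask of the given width (1..32).
static uint32_t lowMask(unsigned width) {
  return width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

// Model of an unsigned bit-field extract: bits [lsb, lsb + width) of x.
uint32_t ubfx(uint32_t x, unsigned lsb, unsigned width) {
  assert(width >= 1 && lsb + width <= 32);
  return (x >> lsb) & lowMask(width);
}

// The SRL(AND(x, mask), lsb) shape, with mask covering the same field.
uint32_t srlOfAnd(uint32_t x, unsigned lsb, unsigned width) {
  assert(width >= 1 && lsb + width <= 32);
  return (x & (lowMask(width) << lsb)) >> lsb;
}
```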
* Apply clang-tidy's misc-static-assert where it makes sense.Benjamin Kramer2016-05-271-5/+7
| | | | | | | Also fold conditions into assert(0) where it makes sense. No functional change intended. llvm-svn: 270982
* SDAG: Implement Select instead of SelectImpl in ARMDAGToDAGISelJustin Bogner2016-05-121-258/+347
| This is a large change, but it's pretty mechanical:
| - Where we were returning a node before, call ReplaceNode instead.
| - Where we would return null to fall back to another selector, rename the method to try* and return a bool for success.
| - Where we were calling SelectNodeTo, just return afterwards.
|
| Part of llvm.org/pr26808.
|
| llvm-svn: 269258
* SDAG: Clean up dangling nodes in ARMISelDAGToDAG::SelectImplJustin Bogner2016-05-121-1/+7
| | | | | | | | | When we convert to the void Select interface, leaving unreferenced nodes around won't be allowed anymore. Part of llvm.org/pr26808. llvm-svn: 269256
* SDAG: Rename Select->SelectImpl and repurpose Select as returning voidJustin Bogner2016-05-051-3/+2
| | | | | | | | | | | | | | This is a step towards removing the rampant undefined behaviour in SelectionDAG, which is a part of llvm.org/PR26808. We rename SelectionDAGISel::Select to SelectImpl and update targets to match, and then change Select to return void and consolidate the sketchy behaviour we're trying to get away from there. Next, we'll update backends to implement `void Select(...)` instead of SelectImpl and eventually drop the base Select implementation. llvm-svn: 268693
* ARM: Use a Handle to track SDNodes in case they're CSE'd. NFCJustin Bogner2016-05-051-4/+2
| | | | | | | | | | | | The code here is recursively Select-ing a new Node to avoid issues where N is CSE'd during replaceDAGValue and stops being valid. We can accomplish the same goal in a more principled way by using a HandleSDNode. This is essentially a less dodgy fix for PR25733 than the original attempt back in r255120. llvm-svn: 268590
* ARM: use a pseudo-instruction for cmpxchg at -O0.Tim Northover2016-04-181-0/+33
| | | | | | | | | | | | | | | | | The fast register-allocator cannot cope with inter-block dependencies without spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions where any value produced within a block is dead by the end, but not for cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded after regalloc. Fortunately this is at -O0 so we don't have to care about performance. This simplifies the various axes of expansion considerably: we assume a strong seq_cst operation and ensure ordering via the always-present DMB instructions rather than v8 acquire/release instructions. Should fix the 32-bit part of PR25526. llvm-svn: 266679
* [NFC] Header cleanupMehdi Amini2016-04-181-1/+0
| Removed some unused headers, replaced some headers with forward class declarations.
|
| Found using simple scripts like this one:
|
|     clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
|
| Patch by Eugene Kosov <claprix@yandex.ru>
|
| Differential Revision: http://reviews.llvm.org/D19219
|
| From: Mehdi Amini <mehdi.amini@apple.com>
| llvm-svn: 266595
* [ARM] Enable SMLAW[B|T] and SMULW[B|T] instruction selectionSam Parker2016-04-081-0/+138
| | | | | | | | | | Added ISelDAGToDAG functions to enable selection of the smlawb, smlawt, smulwb and smulwt instructions for the ARM backend. Also updated the smul CodeGen test and removed the smulw one. Differential Revision: http://reviews.llvm.org/D18892 llvm-svn: 265793
* ARM: support TLS for WoASaleem Abdulrasool2016-02-031-0/+5
| | | | | | | | | Add support for TLS access for Windows on ARM. This generates a similar access to MSVC for ARM. The changes to the tablegen data are needed to support loading an external symbol global that is not for a call. The adjustments to the DAG to DAG transforms are needed to preserve the 32-bit move. llvm-svn: 259676
* ARM: don't mangle DAG constant if it has more than one useTim Northover2016-01-291-2/+2
| | | | | | | | | | | | | | | | The basic optimisation was to convert (mul $LHS, $complex_constant) into roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected to be cheaper. The original logic checks that the mul only has one use (since we're mangling $complex_constant), but when used in even more complex addressing modes there may be an outer addition that can pick up the wrong value too. I *think* the ARM addressing-mode problem is actually unreachable at the moment, but that depends on complex assessments of the profitability of pre-increment addressing modes so I've put a real check in there instead of an assertion. llvm-svn: 259228
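The basic rewrite described above, (mul x, C) into (shl (mul x, C'), amt), is exact in 32-bit wraparound arithmetic whenever C' is C with its trailing zero bits shifted out. The sketch below checks only that arithmetic identity; the function names are hypothetical, and how LLVM picks the simpler constant and decides profitability is not shown in the message.

```cpp
#include <cassert>
#include <cstdint>

// Direct multiply by the original constant.
uint32_t mulDirect(uint32_t x, uint32_t c) {
  return x * c;
}

// Multiply by the constant with its trailing zeros removed, then shift
// the product left by the number of zeros removed.
uint32_t mulThenShift(uint32_t x, uint32_t c) {
  if (c == 0) return 0;
  unsigned amt = 0;
  while ((c & 1u) == 0) {  // strip trailing zero bits from the constant
    c >>= 1;
    ++amt;
  }
  return (x * c) << amt;   // (x * C') << amt == x * C modulo 2^32
}
```

The caveat in the message is about sharing: if the constant node has other users, changing its value in place would silently change what those users compute, which is why the use count has to be checked before the rewrite.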
* [ARM] Add new system registers to ARMv8-M Baseline/MainlineBradley Smith2016-01-251-7/+32
| | | | | | | | This patch was originally committed as r257884, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258682
* # This is a combination of 2 commits.Reid Kleckner2016-01-151-32/+7
| | | | | | | | | | | | | | | | # The first commit's message is: Revert "[ARM] Add DSP build attribute and extension targeting" This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc. # This is the 2nd commit message: Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline" This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5. llvm-svn: 257916
* [ARM] Add new system registers to ARMv8-M Baseline/MainlineBradley Smith2016-01-151-7/+32
| | | | llvm-svn: 257884
* [ARM] Add ARMv8-A semaphore/atomic instructions to ARMv8-M Baseline/MainlineBradley Smith2016-01-151-1/+1
| | | | llvm-svn: 257882
* ARM: support TLS accesses on Darwin platformsTim Northover2016-01-071-5/+10
| | | | | | | | Darwin TLS accesses most closely resemble ELF's general-dynamic situation, since they have to be able to handle all possible situations. The descriptors and so on are obviously slightly different though. llvm-svn: 257039
* ARM: don't use a deleted node as the BaseReg in complex pattern.Tim Northover2015-12-091-1/+4
| | | | | | | | | | We mutated the DAG, which invalidated the node we were trying to use as a base register. Sometimes we got away with it, but other times the node really did get deleted before it was finished with. Should fix PR25733 llvm-svn: 255120
* [ARM] Handle the inline asm constraint type 'o'James Molloy2015-10-261-0/+1
| | | | | | This means "memory with offset" and requires very little plumbing to get working. This fixes PR25317. llvm-svn: 251280