summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMISelLowering.h
Commit message (Collapse)AuthorAgeFilesLines
* Patch to implement UMLAL/SMLAL instructions for the ARM architectureArnold Schwaighofer2012-09-041-0/+3
| | | | | | | | | | | This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136
* Not all targets have efficient ISel code generation for select instructions.Nadav Rotem2012-09-021-0/+5
| | | | | | | | | For example, the ARM target does not have efficient ISel handling for vector selects with scalar conditions. This patch adds a TLI hook which allows the different targets to report which selects are supported well and which selects should be converted to CF duting codegen prepare. llvm-svn: 163093
* Remove the CAND/COR/CXOR custom ISD nodes and their select code.Jakob Stoklund Olesen2012-08-181-3/+0
| | | | | | | These nodes are no longer needed because the peephole pass can fold CMOV+AND into ANDCC etc. llvm-svn: 162179
* Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARMArnold Schwaighofer2012-08-121-3/+0
| | | | | | | | | architecture It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7 thumb O3. llvm-svn: 161736
* Change addTypeForNeon to use MVT instead of EVT so all the calls to ↵Craig Topper2012-08-121-3/+3
| | | | | | getSimpleVT can be removed. llvm-svn: 161735
* Patch to implement UMLAL/SMLAL instructions for the ARM architectureArnold Schwaighofer2012-08-091-0/+3
| | | | | | | | | | | This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581
* Fall back to selection DAG isel for calls to builtin functions.Bob Wilson2012-08-031-2/+4
| | | | | | | | | | Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232
* Re-enable the CMN instruction.Bill Wendling2012-06-111-0/+1
| | | | | | | | | We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302
* ARM: properly handle alignment for struct byval.Manman Ren2012-06-011-0/+3
| | | | | | | | | Factor out the expansion code into a function. This change is to be enabled in clang. rdar://9877866 llvm-svn: 157830
* ARM: support struct byval in llvmManman Ren2012-06-011-0/+3
| | | | | | | | | | We handle struct byval by inserting a pseudo op, which will be expanded to a loop at ExpandISelPseudos. A separate patch for clang will be submitted to enable struct byval. rdar://9877866 llvm-svn: 157793
* Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCallJustin Holewinski2012-05-251-7/+1
| | | | | | | | | | to pass around a struct instead of a large set of individual values. This cleans up the interface and allows more information to be added to the struct for future targets without requiring changes to each and every target. NV_CONTRIB llvm-svn: 157479
* Make ARM and Mips use TargetMachine::getTLSModel()Hans Wennborg2012-05-041-1/+2
| | | | | | | | This moves the logic for selecting a TLS model to a single place, instead of the previous three (ARM, Mips, and X86 which already uses this function). llvm-svn: 156162
* Fix a long standing tail call optimization bug. When a libcall is emittedEvan Cheng2012-04-101-1/+1
| | | | | | | | | | | | | legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370
* Always compute all the bits in ComputeMaskedBits.Rafael Espindola2012-04-041-1/+0
| | | | | | | | This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011
* Reorder includes to match coding standards. Fix an issue or two exposed by that.Craig Topper2012-03-171-0/+1
| | | | llvm-svn: 152978
* Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints onLang Hames2012-03-151-0/+2
| | | | | | | register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824
* Re-commit r151623 with fix. Only issue special no-return calls if it's a ↵Evan Cheng2012-02-281-1/+1
| | | | | | direct call. llvm-svn: 151645
* Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack ↵Daniel Dunbar2012-02-281-1/+1
| | | | | | prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630
* Some ARM implementaions, e.g. A-series, does return stack prediction. That is,Evan Cheng2012-02-281-1/+1
| | | | | | | | | | | | | | | | | the processor keeps a return addresses stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt RAS and causes 100% return misprediction so LLVM should use a unconditional branch instead. i.e. mov lr, pc b _foo The "mov lr, pc" is issued in order to get proper backtrace. rdar://8979299 llvm-svn: 151623
* Optimize a couple of common patterns involving conditional moves where the falseEvan Cheng2012-02-231-0/+4
| | | | | | | | | | | | | | | | | | | | | value is zero. Instead of a cmov + op, issue an conditional op instead. e.g. cmp r9, r4 mov r4, #0 moveq r4, #1 orr lr, lr, r4 should be: cmp r9, r4 orreq lr, lr, #1 That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y). It's possible to extend this to ADD and SUB but I don't think they are common. rdar://8659097 llvm-svn: 151224
* Make all pointers to TargetRegisterClass const since they are all pointers ↵Craig Topper2012-02-221-1/+1
| | | | | | to static data that should not be modified. llvm-svn: 151134
* Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602>Bob Wilson2011-11-161-5/+0
| | | | | | | | | | | | | | | | | The EmitBasePointerRecalculation function has 2 problems, one minor and one fatal. The minor problem is that it inserts the code at the setjmp instead of in the dispatch block. The fatal problem is that at the point where this code runs, we don't know whether there will be a base pointer, so the entire function is a no-op. The base pointer recalculation needs to be handled as it was before, by inserting a pseudo instruction that gets expanded late. Most of the support for the old approach is still here, but it no longer has any connection to the eh_sjlj_dispatchsetup intrinsic. Clean up the parts related to the intrinsic and just generate the pseudo instruction directly. llvm-svn: 144781
* Add vmov.f32 to materialize f32 immediate splats which cannot be handled byEvan Cheng2011-11-151-0/+3
| | | | | | integer variants. rdar://10437054 llvm-svn: 144608
* Fixed parameter name.Lang Hames2011-11-021-1/+1
| | | | llvm-svn: 143594
* Try to lower memset/memcpy/memmove to vector instructions on ARM where the ↵Lang Hames2011-11-021-1/+6
| | | | | | alignment permits. llvm-svn: 143582
* Take the code that was emitted for the llvm.eh.dispatch.setup intrinsic and emitBill Wendling2011-10-071-0/+3
| | | | | | | it with the new SjLj emitter stuff. This way there's no need to emit that kind-of-hacky intrinsic. llvm-svn: 141419
* Refactor some of the code that sets up the entry block for SjLj EH. No ↵Bill Wendling2011-10-061-0/+4
| | | | | | functionality change. llvm-svn: 141323
* Check-pointing the new SjLj EH lowering.Bill Wendling2011-10-031-0/+3
| | | | | | | | | | | This code will replace the version in ARMAsmPrinter.cpp. It creates a new machine basic block, which is the dispatch for the return from a longjmp call. It then shoves the address of that machine basic block into the correct place in the function context so that the EH runtime will jump to it directly instead of having to go through a compare-and-jump-to-the-dispatch bit. This should be more efficient in the common case. llvm-svn: 141031
* ARM fix encoding of VMOV.f32 and VMOV.f64 immediates.Jim Grosbach2011-09-301-6/+0
| | | | | | | | | | | Encode the immediate into its 8-bit form as part of isel rather than later, which simplifies things for mapping the encoding bits, allows the removal of the custom disassembler decoding hook, makes the operand printer trivial, and prepares things more cleanly for handling these in the asm parser. rdar://10211428 llvm-svn: 140834
* Add codegen support for vector select (in the IR this means a selectDuncan Sands2011-09-061-0/+3
| | | | | | | | | | | | with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159
* 64-bit atomic cmpxchg for ARM.Eli Friedman2011-08-311-1/+2
| | | | llvm-svn: 138868
* Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next.Eli Friedman2011-08-311-1/+16
| | | | llvm-svn: 138845
* Follow up to r138791.Evan Cheng2011-08-301-0/+3
| | | | | | | | | | | | Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to call a target hook to adjust the instruction. For ARM, this is used to adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC instructions have implicit def of CPSR (required since it now uses CPSR physical register dependency rather than "glue"). If the carry flag is used, then the target hook will *fill in* the optional operand with CPSR. Otherwise, the hook will remove the CPSR implicit def from the MachineInstr. llvm-svn: 138810
* Change ARM / Thumb2 addc / adde and subc / sube modeling to use physicalEvan Cheng2011-08-301-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | register dependency (rather than glue them together). This is general goodness as it gives scheduler more freedom. However it is motivated by a nasty bug in isel. When a i64 sub is expanded to subc + sube. libcall #1 \ \ subc \ / \ \ / \ \ / libcall #2 sube If the libcalls are not serialized (i.e. both have chains which are dag entry), legalizer can serialize them in arbitrary orders. If it's unlucky, it can force libcall #2 before libcall #1 in the above case. subc | libcall #2 | libcall #1 | sube However since subc and sube are "glued" together, this ends up being a cycle when the scheduler combine subc and sube as a single scheduling unit. The right solution is to fix LegalizeType too chains the libcalls together. However, LegalizeType is not processing nodes in order so that's harder than it should be. For now, the move to physical register dependency will do. rdar://10019576 llvm-svn: 138791
* land David Blaikie's patch to de-constify Type, with a few tweaks.Chris Lattner2011-07-181-1/+1
| | | | llvm-svn: 135375
* Improve codegen for select's:Evan Cheng2011-07-131-0/+1
| | | | | | | | | | | | | | | | | | | | if (x != 0) x = 1 if (x == 1) x = 1 Previous codegen looks like this: mov r1, r0 cmp r1, #1 mov r0, #0 moveq r0, #1 The naive lowering select between two different values. It should recognize the test is equality test so it's more a conditional move rather than a select: cmp r0, #1 movne r0, #0 rdar://9758317 llvm-svn: 135017
* Remove getRegClassForInlineAsmConstraint from the ARM port.Eric Christopher2011-06-291-3/+0
| | | | | | Part of rdar://9643582 llvm-svn: 134095
* Have LowerOperandForConstraint handle multiple character constraints.Eric Christopher2011-06-021-1/+1
| | | | | | Part of rdar://9119939 llvm-svn: 132510
* Make the logic for determining function alignment more explicit. No ↵Eli Friedman2011-05-061-3/+0
| | | | | | functionality change. llvm-svn: 131012
* Add an unfolded offset field to LSR's Formula record. This is used toDan Gohman2011-05-031-0/+6
| | | | | | | | model constants which can be added to base registers via add-immediate instructions which don't require an additional register to materialize the immediate. llvm-svn: 130743
* ARM and Thumb2 support for atomic MIN/MAX/UMIN/UMAX loads.Jim Grosbach2011-04-261-0/+5
| | | | | | rdar://9326019 llvm-svn: 130234
* Thumb2 and ARM add/subtract with carry fixes.Andrew Trick2011-04-231-0/+1
| | | | | | | | | | | | | Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>. t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the assembly printer correctly prints the 's' suffix. Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags. Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS. Fixes ARM SBC lowering to check for live carry (potential bug). llvm-svn: 130048
* whitespaceAndrew Trick2011-04-231-6/+6
| | | | llvm-svn: 130046
* ARM byval support. Will be enabled by another patch to the FE. ↵Stuart Hastings2011-04-201-1/+8
| | | | | | <rdar://problem/7662569> llvm-svn: 129858
* Add a ARM-specific SD node for VBSL so that forms with a constant first operandCameron Zwarich2011-03-301-0/+3
| | | | | | can be recognized. This fixes <rdar://problem/9183078>. llvm-svn: 128584
* Re-apply r127953 with fixes: eliminate empty return block if it has no ↵Evan Cheng2011-03-211-0/+2
| | | | | | predecessors; update dominator tree if cfg is modified. llvm-svn: 127981
* Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessorsDaniel Dunbar2011-03-191-2/+0
| | | | | | to canonicalize IR", it broke a lot of things. llvm-svn: 127954
* SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IREvan Cheng2011-03-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953
* Some minor cleanups based on feedback.Bill Wendling2011-03-151-2/+0
| | | | llvm-svn: 127694
* Generate a VTBL instruction instead of a series of loads and stores when weBill Wendling2011-03-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | can. As Nate pointed out, VTBL isn't super performant, but it *has* to be better than this: _shuf: @ BB#0: @ %entry push {r4, r7, lr} add r7, sp, #4 sub sp, #12 mov r4, sp bic r4, r4, #7 mov sp, r4 mov r2, sp vmov d16, r0, r1 orr r0, r2, #6 orr r3, r2, #7 vst1.8 {d16[0]}, [r3] vst1.8 {d16[5]}, [r0] subs r4, r7, #4 orr r0, r2, #5 vst1.8 {d16[4]}, [r0] orr r0, r2, #4 vst1.8 {d16[4]}, [r0] orr r0, r2, #3 vst1.8 {d16[0]}, [r0] orr r0, r2, #2 vst1.8 {d16[2]}, [r0] orr r0, r2, #1 vst1.8 {d16[1]}, [r0] vst1.8 {d16[3]}, [r2] vldr.64 d16, [sp] vmov r0, r1, d16 mov sp, r4 pop {r4, r7, pc} The "illegal" testcase in vext.ll is no longer illegal. <rdar://problem/9078775> llvm-svn: 127630
OpenPOWER on IntegriCloud