path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message | Author | Age | Files | Lines
* On Darwin ARM, set the UNWIND_RESUME libcall to _Unwind_SjLj_Resume. | John McCall | 2011-05-29 | 1 | -0/+1
  This is important for the correct lowering of unwind instructions (which doesn't matter at all) and llvm.eh.resume calls (which does). Take 2, now with more basic competence.
  llvm-svn: 132295
* I didn't mean to commit these residues of a personal project. | John McCall | 2011-05-29 | 1 | -1/+0
  llvm-svn: 132293
* On Darwin ARM, set the UNWIND_RESUME libcall to _Unwind_SjLj_Resume. | John McCall | 2011-05-29 | 1 | -0/+1
  This is important for the correct lowering of unwind instructions (which doesn't matter at all) and llvm.eh.resume calls (which does).
  llvm-svn: 132291
* Add support for ARM ldrexd/strexd intrinsics. | Bruno Cardoso Lopes | 2011-05-28 | 1 | -0/+22
  They both use i32 register pairs to load/store i64 values. Since there's no current support to explicitly declare such restrictions, implement it by using specific hardcoded register pairs during isel.
  llvm-svn: 132248
* Fix the remaining atomic intrinsics to use the right register classes on Thumb2, and add some basic tests for them. | Cameron Zwarich | 2011-05-27 | 1 | -10/+23
  llvm-svn: 132235
* Don't use movw / movt for iOS static codegen for now, to work around some tools issues. | Evan Cheng | 2011-05-27 | 1 | -1/+2
  rdar://9514789
  llvm-svn: 132211
* RTABI chapter 4.3.4 specifies __aeabi_mem* calls. | Renato Golin | 2011-05-22 | 1 | -0/+6
  Specifically, __aeabi_memset accepts parameters (ptr, size, value) in a different order than GNU's memset (ptr, value, size), hence the special lowering in AAPCS mode. Implementation by Evzen Muller.
  llvm-svn: 131868
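  The argument-order difference described in this entry can be sketched in plain C++. This is only an illustration, not the lowering code from the commit: `eabi_style_memset` and `lower_memset_call` are hypothetical names, and the first one merely stands in for the runtime helper that takes (ptr, size, value).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the RTABI-style helper: (ptr, size, value).
// The real routine is supplied by the platform runtime, not by this sketch.
inline void eabi_style_memset(void *dest, std::size_t n, int c) {
    std::memset(dest, c, n);  // libc order is (ptr, value, size)
}

// Conceptual view of the special lowering: a libc-style request
// (ptr, value, size) becomes a call with the last two arguments swapped.
inline void lower_memset_call(void *dest, int c, std::size_t n) {
    eabi_style_memset(dest, n, c);
}
```

  Calling `lower_memset_call(buf, 0xAB, 4)` fills the first four bytes of `buf`, exactly as `memset(buf, 0xAB, 4)` would; only the order at the call boundary changes.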
* Revert accidental commit. | Evan Cheng | 2011-05-20 | 1 | -1/+1
  llvm-svn: 131739
* Revert r131664 and fix it in instcombine instead. rdar://9467055 | Evan Cheng | 2011-05-20 | 1 | -1/+1
  llvm-svn: 131708
* Fixed sdiv and udiv for <4 x i16>. The test from r125402 still applies for this change. | Mon P Wang | 2011-05-19 | 1 | -7/+7
  llvm-svn: 131630
* Handle perfect shuffle case that generates a vrev for vectors of floats. | Tanya Lattner | 2011-05-18 | 1 | -1/+2
  Add test case.
  llvm-svn: 131582
* Revise r131553. Just use the type of the input node and forgo the bitcast. rdar://9449159. | Evan Cheng | 2011-05-18 | 1 | -4/+3
  llvm-svn: 131555
* Fix an ARMTargetLowering::LowerSELECT bug: the legalized result must have the same type as the input. | Evan Cheng | 2011-05-18 | 1 | -1/+3
  Sorry, test cases only trigger when dag combine is disabled. rdar://9449178
  llvm-svn: 131553
* In r131488 I misunderstood how VREV works. | Tanya Lattner | 2011-05-18 | 1 | -1/+9
  It splits the vector in half and splits each half. Therefore, the real problem was that we were using a VREV64 for a 4xi16, when we should have been using a VREV32. Updated test case and reverted change to the PerfectShuffle Table.
  llvm-svn: 131529
* Fix typo. | Cameron Zwarich | 2011-05-18 | 1 | -4/+4
  llvm-svn: 131519
* Fix more of PR8825 by correctly using rGPR registers when lowering atomic compare-and-swap intrinsics. | Cameron Zwarich | 2011-05-18 | 1 | -2/+11
  llvm-svn: 131518
* Give the 'eh.sjlj.dispatchsetup' intrinsic call the value coming from the setjmp intrinsic call. | Bill Wendling | 2011-05-11 | 1 | -1/+1
  This prevents it from being reordered so that it appears *before* the setjmp intrinsic (thus making it completely useless). <rdar://problem/9409683>
  llvm-svn: 131174
* Make the logic for determining function alignment more explicit. No functionality change. | Eli Friedman | 2011-05-06 | 1 | -5/+2
  llvm-svn: 131012
* Temporarily disable use of divmod compiler-rt functions for iOS. | Bob Wilson | 2011-05-03 | 1 | -6/+0
  llvm-svn: 130766
* Add an unfolded offset field to LSR's Formula record. | Dan Gohman | 2011-05-03 | 1 | -0/+8
  This is used to model constants which can be added to base registers via add-immediate instructions which don't require an additional register to materialize the immediate.
  llvm-svn: 130743
* 80-col. | Eric Christopher | 2011-04-29 | 1 | -8/+9
  llvm-svn: 130558
* ARM and Thumb2 support for atomic MIN/MAX/UMIN/UMAX loads. | Jim Grosbach | 2011-04-26 | 1 | -0/+143
  rdar://9326019
  llvm-svn: 130234
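  As a rough picture of what an atomic MIN operation computes, here is a portable sketch using a compare-exchange retry loop. It is an assumption-laden illustration, not the code this commit adds: `atomic_fetch_min` is a hypothetical helper, and unlike a load-exclusive/store-exclusive sequence it skips the store when the stored value is already the smaller one.

```cpp
#include <atomic>
#include <cassert>

// Atomically replaces the stored value with min(stored, operand) and
// returns the value that was observed before the update.
inline int atomic_fetch_min(std::atomic<int> &obj, int operand) {
    int old = obj.load();
    // compare_exchange_weak refreshes `old` when it fails, so the loop
    // re-checks whether `operand` is still the smaller value.
    while (operand < old && !obj.compare_exchange_weak(old, operand)) {
    }
    return old;
}
```

  The MAX, UMIN and UMAX variants would differ only in the comparison and, for the unsigned forms, in the element type.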
* Thumb2 and ARM add/subtract with carry fixes. | Andrew Trick | 2011-04-23 | 1 | -66/+73
  Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>.
  t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the assembly printer correctly prints the 's' suffix.
  Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags.
  Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS.
  Fixes ARM SBC lowering to check for live carry (potential bug).
  llvm-svn: 130048
* Remove -use-divmod-libcall. Let targets opt in when they are available. | Evan Cheng | 2011-04-20 | 1 | -1/+2
  llvm-svn: 129884
* Excise unintended hunk in 129858. <rdar://problem/7662569> | Stuart Hastings | 2011-04-20 | 1 | -5/+0
  llvm-svn: 129862
* ARM byval support. Will be enabled by another patch to the FE. <rdar://problem/7662569> | Stuart Hastings | 2011-04-20 | 1 | -78/+164
  llvm-svn: 129858
* Remove some duplicate op action entries and reorganize. | Eric Christopher | 2011-04-19 | 1 | -8/+5
  llvm-svn: 129781
* Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! | Chris Lattner | 2011-04-15 | 1 | -2/+2
  llvm-svn: 129558
* Fix another fcopysign lowering bug. | Evan Cheng | 2011-04-15 | 1 | -1/+4
  If src is f64 and destination is f32, don't forget to right shift the source by 32 first. rdar://9287902
  llvm-svn: 129556
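  The reason for the 32-bit shift can be seen in a scalar bit-level sketch. It assumes IEEE-754 binary32 `float` and binary64 `double`; `copysign_f32_from_f64` is a hypothetical helper written for illustration, not the DAG lowering itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the sign of a double onto a float using integer bit operations.
inline float copysign_f32_from_f64(float magnitude, double sign_src) {
    std::uint64_t src_bits;
    std::memcpy(&src_bits, &sign_src, sizeof src_bits);
    // The double's sign is bit 63. Shifting right by 32 moves it to bit 31,
    // where the float's sign bit lives. Without the shift, the mask below
    // would pick up a mantissa bit instead.
    std::uint32_t sign =
        static_cast<std::uint32_t>(src_bits >> 32) & 0x80000000u;

    std::uint32_t mag_bits;
    std::memcpy(&mag_bits, &magnitude, sizeof mag_bits);
    mag_bits = (mag_bits & 0x7FFFFFFFu) | sign;

    float result;
    std::memcpy(&result, &mag_bits, sizeof result);
    return result;
}
```

  For instance, `copysign_f32_from_f64(1.5f, -2.0)` yields `-1.5f` under the stated representation assumptions.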
* Fix a typo in an ARM-specific DAG combine. This fixes <rdar://problem/9278274>. | Cameron Zwarich | 2011-04-13 | 1 | -1/+1
  llvm-svn: 129468
* Split a store of a VMOVDRR into two integer stores to avoid mixing NEON and ARM stores of arguments in the same cache line. | Cameron Zwarich | 2011-04-12 | 1 | -2/+22
  This fixes the second half of <rdar://problem/8674845>.
  llvm-svn: 129345
* Change -arm-trap-func= into a non-arm specific option. | Evan Cheng | 2011-04-08 | 1 | -23/+1
  Now Intrinsic::trap is lowered into a call to the specified trap function at sdisel time.
  llvm-svn: 129152
* Add option to emit @llvm.trap as a function call instead of a trap instruction. rdar://9249183. | Evan Cheng | 2011-04-07 | 1 | -1/+23
  llvm-svn: 129107
* Prevent ARM DAG Combiner from doing an AND or OR combine on an illegal vector type (vectors of size 3). | Tanya Lattner | 2011-04-07 | 1 | -0/+6
  Also included test cases.
  llvm-svn: 129074
* Change -arm-divmod-libcall to a target neutral option. | Evan Cheng | 2011-04-07 | 1 | -6/+1
  llvm-svn: 129045
* Reapply r128946 (pseudoization of various instructions), and fix the extra imp-def of CPSR it was adding. | Owen Anderson | 2011-04-05 | 1 | -1/+21
  llvm-svn: 128965
* Revert r128946 while I figure out why it broke the buildbots. | Owen Anderson | 2011-04-05 | 1 | -21/+1
  llvm-svn: 128951
* Give RSBS and RSCS the pseudo treatment. | Owen Anderson | 2011-04-05 | 1 | -1/+21
  llvm-svn: 128946
* Fix bugs in the pseudo-ization of ADCS/SBCS pointed out by Jim, as well as doing the expansion earlier (using a custom inserter) to allow for the chance of predicating these instructions. | Owen Anderson | 2011-04-05 | 1 | -27/+69
  llvm-svn: 128940
* Revamp the SjLj "dispatch setup" intrinsic. | Bill Wendling | 2011-04-05 | 1 | -1/+1
  It needed to be moved closer to the setjmp statement, because the code directly after the setjmp needs to know about values that are on the stack. Also, the 'bitcast' of the function context was causing a dead load. This wouldn't be too horrible, except that at -O0 it wasn't optimized out, and because it wasn't using the correct base pointer (if there is a VLA), it would try to access a value from a garbage address. <rdar://problem/9130540>
  llvm-svn: 128873
* Do some peephole optimizations to remove pointless VMOVs from Neon to integer registers that arise from argument shuffling with the soft float ABI. | Cameron Zwarich | 2011-04-02 | 1 | -0/+31
  These instructions are particularly slow on Cortex A8. This fixes one half of <rdar://problem/8674845>.
  llvm-svn: 128759
* Issue libcalls __udivmod*i4 / __divmod*i4 for div / rem pairs. | Evan Cheng | 2011-04-01 | 1 | -0/+10
  rdar://8911343
  llvm-svn: 128696
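  The idea of serving a div / rem pair from one combined call can be illustrated with the standard library. This sketch assumes nothing about the actual runtime routines named in the entry: `QuotRem` and `divmod` are hypothetical names, and `std::div` merely plays the role of a helper that returns both results at once.

```cpp
#include <cassert>
#include <cstdlib>

// Both results of one division, returned together.
struct QuotRem {
    int quot;
    int rem;
};

// Hypothetical combined helper: a single call replaces a separate
// `n / d` and `n % d`, which would otherwise be two divisions.
inline QuotRem divmod(int n, int d) {
    std::div_t r = std::div(n, d);
    return {r.quot, r.rem};
}
```

  A caller that needs both `n / d` and `n % d` reads `quot` and `rem` from the one result; `std::div` truncates toward zero, matching the built-in operators.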
* Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: | Evan Cheng | 2011-03-31 | 1 | -0/+38
    vadd d3, d0, d1
    vmul d3, d3, d2
  =>
    vmul d3, d0, d2
    vmla d3, d1, d2
  llvm-svn: 128665
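  The rewrite relies on multiplication distributing over addition. A scalar model with wrap-around unsigned lanes shows that both instruction sequences produce the same lanes; the two-lane `Vec2` type and the function names are assumptions for illustration, and floating-point lanes (where rounding can differ) are deliberately not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec2 = std::array<std::uint32_t, 2>;  // stand-in for a 2-lane d-register

// vadd followed by vmul: (a + b) * c per lane.
inline Vec2 add_then_mul(const Vec2 &a, const Vec2 &b, const Vec2 &c) {
    Vec2 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = (a[i] + b[i]) * c[i];
    return r;
}

// vmul followed by vmla: a * c, then accumulate b * c per lane.
inline Vec2 mul_then_mla(const Vec2 &a, const Vec2 &b, const Vec2 &c) {
    Vec2 r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = a[i] * c[i];   // vmul
        r[i] += b[i] * c[i];  // vmla
    }
    return r;
}
```

  Because unsigned arithmetic wraps modulo 2^32, the two functions agree even when the intermediate sum overflows a lane.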
* Don't try to create zero-sized stack objects. | Evan Cheng | 2011-03-30 | 1 | -2/+3
  llvm-svn: 128586
* Add an ARM-specific SD node for VBSL so that forms with a constant first operand can be recognized. | Cameron Zwarich | 2011-03-30 | 1 | -3/+32
  This fixes <rdar://problem/9183078>.
  llvm-svn: 128584
* Add intrinsics @llvm.arm.neon.vmulls and @llvm.arm.neon.vmullu.* back. | Evan Cheng | 2011-03-29 | 1 | -0/+7
  Frontends were lowering them to sext / uxt + mul instructions. Unfortunately the optimization passes may hoist the extensions out of the loop and separate them. When that happens, the long multiplication instructions can be broken into several scalar instructions, causing a significant performance issue.
  Note the vmla and vmls intrinsics are not added back. Frontend will codegen them as intrinsics vmull* + add / sub. Also note the isel optimizations for catching mul + sext / zext are not changed either.
  First part of rdar://8832507, rdar://9203134
  llvm-svn: 128502
* Add Neon SINT_TO_FP and UINT_TO_FP lowering from v4i16 to v4f32. | Cameron Zwarich | 2011-03-29 | 1 | -0/+35
  Fixes <rdar://problem/8875309> and <rdar://problem/9057191>.
  llvm-svn: 128492
* Optimize (zext A + zext B) * C to (VMULL A, C) + (VMULL B, C) during isel lowering to fold the zero-extends and take advantage of no-stall back-to-back vmul + vmla: | Evan Cheng | 2011-03-29 | 1 | -15/+81
    vmull q0, d4, d6
    vmlal q0, d5, d6
  is faster than
    vaddl q0, d4, d5
    vmovl q1, d6
    vmul  q0, q0, q1
  This allows us to use vmull + vmlal for:
    f = vmull_u8(   vget_high_u8(s), c);
    f = vmlal_u8(f, vget_low_u8(s),  c);
  rdar://9197392
  llvm-svn: 128444
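  A per-lane scalar model makes the equivalence of the two sequences concrete. It assumes 8-bit unsigned source lanes widened to 16-bit result lanes, as in the `vmull_u8` example; the function names are hypothetical and the model ignores everything about scheduling, which is the actual motivation for the change.

```cpp
#include <cassert>
#include <cstdint>

// vaddl + vmovl + vmul: widen-and-add, widen c, multiply in 16-bit lanes.
inline std::uint16_t widen_add_mul(std::uint8_t a, std::uint8_t b,
                                   std::uint8_t c) {
    std::uint32_t sum = std::uint32_t{a} + b;         // vaddl
    std::uint32_t wide_c = c;                         // vmovl
    return static_cast<std::uint16_t>(sum * wide_c);  // vmul, wraps mod 2^16
}

// vmull + vmlal: widening multiply, then widening multiply-accumulate.
inline std::uint16_t mull_mlal(std::uint8_t a, std::uint8_t b,
                               std::uint8_t c) {
    std::uint32_t acc = std::uint32_t{a} * c;  // vmull
    acc += std::uint32_t{b} * c;               // vmlal
    return static_cast<std::uint16_t>(acc);    // 16-bit lane wraps mod 2^16
}
```

  Both functions reduce the same value modulo 2^16, so they agree for every input, including cases where the product no longer fits in 16 bits.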
* Fix the bfi handling for or (and a mask) (and b mask). | Eric Christopher | 2011-03-26 | 1 | -9/+10
  We need the two masks to match inversely for the code as is to work. For the example given we actually want:
    bfi r0, r2, #1, #1
  not #0, however, given the way the pattern is written it's not possible at the moment.
  Fixes rdar://9177502
  llvm-svn: 128320
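  To make the "masks must match inversely" condition concrete, here is a scalar model of a bit-field insert next to the or-of-ands shape it can replace. Both functions are illustrative sketches with hypothetical names, not the DAG combine from the commit; `bfi` assumes `width >= 1` and `lsb + width <= 32`.

```cpp
#include <cassert>
#include <cstdint>

// Model of bit-field insert: copy the low `width` bits of `src` into `dst`
// starting at bit `lsb`, leaving all other bits of `dst` unchanged.
inline std::uint32_t bfi(std::uint32_t dst, std::uint32_t src,
                         unsigned lsb, unsigned width) {
    std::uint32_t field = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    std::uint32_t mask = field << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}

// The or-of-ands shape: it is only a bit-field insert when the two masks
// are exact complements, which this sketch enforces by construction.
inline std::uint32_t or_of_ands(std::uint32_t a, std::uint32_t b,
                                std::uint32_t mask) {
    return (a & ~mask) | (b & mask);
}
```

  With `mask = 0x2`, `or_of_ands(a, b << 1, mask)` equals `bfi(a, b, 1, 1)`. The model also shows that the inserted bits come from the low end of the source, which is why the position operand matters in the example above.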
* Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. | Evan Cheng | 2011-03-21 | 1 | -0/+10
  llvm-svn: 127981