path: root/llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit message | Author | Age | Files | Lines
* Use proper libcall names & condcodes while compiling for ARM EABI. | Anton Korobeynikov | 2010-09-28 | 1 | -6/+150
  Patch by Evzen Muller! llvm-svn: 114991
* Add a command line option "-arm-strict-align" to disallow unaligned memory | Bob Wilson | 2010-09-28 | 1 | -9/+1
  accesses for ARM targets that would otherwise allow it. Radar 8465431. llvm-svn: 114941
* Enable code placement optimization pass for ARM. | Evan Cheng | 2010-09-24 | 1 | -7/+1
  llvm-svn: 114746
* Add support for ELF PLT references for ARM MC asm printing. Adding a | Jim Grosbach | 2010-09-22 | 1 | -4/+16
  new VariantKind to the MCSymbolExpr seems like overkill, but I'm not sure there's a more straightforward way to get the printing difference captured. (i.e., x86 uses @PLT, ARM uses (PLT)). llvm-svn: 114613
* Change VDUPLANE DAG combiner to just return the result instead of calling | Bob Wilson | 2010-09-22 | 1 | -5/+3
  CombineTo to avoid putting the result on the worklist. I don't think it makes much difference for now, but it might help someday as we add more DAG combine optimizations. llvm-svn: 114595
* Combine both VMOVDRR(VMOVRRD) and VMOVRRD(VMOVDRR), instead of just doing one | Bob Wilson | 2010-09-22 | 1 | -28/+35
  of those. Refactor to share code for handling BUILD_VECTOR(VMOVRRD). I don't have a testcase that exercises this, but it seems like an obvious good thing to do. llvm-svn: 114589
* Enable target-specific mul-lowering on ARM, even at -Os. Remove a test that | Owen Anderson | 2010-09-21 | 1 | -4/+0
  this makes irrelevant, but add a new test for the new, improved functionality. llvm-svn: 114494
* convert a couple more places to use the new getStore() | Chris Lattner | 2010-09-21 | 1 | -7/+7
  llvm-svn: 114463
* Define the TargetLowering::getTgtMemIntrinsic hook for ARM so that NEON load | Bob Wilson | 2010-09-21 | 1 | -0/+61
  and store intrinsics are represented with MemIntrinsicSDNodes. llvm-svn: 114454
* convert the targets off the non-MachinePointerInfo of getLoad. | Chris Lattner | 2010-09-21 | 1 | -31/+28
  llvm-svn: 114410
* reimplement memcpy/memmove/memset lowering to use MachinePointerInfo | Chris Lattner | 2010-09-21 | 1 | -1/+1
  instead of srcvalue/offset pairs. This corrects SV info for mem operations whose size is > 32-bits. llvm-svn: 114401
* Add target-specific DAG combiner for BUILD_VECTOR and VMOVRRD. An i64 | Bob Wilson | 2010-09-17 | 1 | -0/+27
  value should be in GPRs when it's going to be used as a scalar, and we use VMOVRRD to make that happen, but if the value is converted back to a vector we need to fold to a simple bit_convert. Radar 8407927. llvm-svn: 114233
* Split out some of the calling convention bits so that they can be | Eric Christopher | 2010-09-10 | 1 | -147/+1
  used for fast-isel. llvm-svn: 113652
* Teach if-converter to be more careful with predicating instructions that would | Evan Cheng | 2010-09-10 | 1 | -2/+2
  take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should not treat them as non-micro-coded simple instructions. llvm-svn: 113570
* remove trailing whitespace | Jim Grosbach | 2010-09-08 | 1 | -12/+12
  llvm-svn: 113338
* Replace NEON vabdl, vaba, and vabal intrinsics with combinations of the | Bob Wilson | 2010-09-03 | 1 | -17/+0
  vabd intrinsic and add and/or zext operations. In the case of vaba, this also avoids the need for a DAG combine pattern to combine vabd with add. Update tests. Auto-upgrade the old intrinsics. llvm-svn: 112941
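The decomposition named in this entry can be illustrated on scalars. The sketch below is not LLVM code: the helper names `vabd_u8`, `vaba_u8`, and `vabal_u8` are hypothetical, and each one models a single unsigned 8-bit lane only.

```cpp
#include <cstdint>

// One unsigned 8-bit lane of vabd: absolute difference of the two inputs.
inline std::uint8_t vabd_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a > b ? a - b : b - a);
}

// vaba written as add(acc, vabd(a, b)); the add wraps at the lane width.
inline std::uint8_t vaba_u8(std::uint8_t acc, std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(acc + vabd_u8(a, b));
}

// vabal written as add(acc, zext(vabd(a, b))); the accumulator lane is
// twice as wide, so the absolute difference is zero-extended first.
inline std::uint16_t vabal_u8(std::uint16_t acc, std::uint8_t a, std::uint8_t b) {
    const std::uint16_t widened = vabd_u8(a, b);  // zext from 8 to 16 bits
    return static_cast<std::uint16_t>(acc + widened);
}
```

With inputs 3 and 250 the absolute difference is 247, so the narrow accumulate wraps while the widening accumulate does not.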
* Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply, | Bob Wilson | 2010-09-01 | 1 | -1/+52
  add, and subtract operations with zero-extended or sign-extended vectors. Update tests. Add auto-upgrade support for the old intrinsics. llvm-svn: 112773
* Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but | Bill Wendling | 2010-08-29 | 1 | -0/+1
  it sets the CPSR register. llvm-svn: 112393
* ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed | Daniel Dunbar | 2010-08-25 | 1 | -4/+4
  comparison that would overflow.
  - The other under/overflow cases can't actually happen because the immediates which would trigger them are legal (so we don't enter this code), but adjusted the style to make it clear the transform is always valid.
  llvm-svn: 112053
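To make the overflow hazard concrete, here is a minimal sketch of one such adjustment. It assumes the adjustment has the form "x < C becomes x <= C - 1", which the commit message does not spell out; the types `Cond` and `Cmp` and the function `adjustSignedLT` are hypothetical and are not taken from ARMISelLowering.cpp.

```cpp
#include <cstdint>
#include <limits>

enum class Cond { LT, LE };

struct Cmp {
    Cond cc;
    std::int32_t imm;
};

// Rewrites "x < imm" as "x <= imm - 1" so that a different immediate can be
// tried. Returns false and leaves `cmp` untouched when imm - 1 would
// overflow, which is the case such a transform has to reject.
inline bool adjustSignedLT(Cmp &cmp) {
    if (cmp.cc != Cond::LT)
        return false;
    if (cmp.imm == std::numeric_limits<std::int32_t>::min())
        return false;  // INT32_MIN - 1 is not representable
    cmp.cc = Cond::LE;
    cmp.imm -= 1;
    return true;
}

// Evaluates the comparison for a concrete x, to check equivalence.
inline bool eval(const Cmp &cmp, std::int32_t x) {
    return cmp.cc == Cond::LT ? x < cmp.imm : x <= cmp.imm;
}
```

Without the guard, "x < INT32_MIN" (always false) would turn into "x <= INT32_MAX" (always true) after wraparound.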
* Replace the arm.neon.vmovls and vmovlu intrinsics with vector sign-extend and | Bob Wilson | 2010-08-20 | 1 | -2/+6
  zero-extend operations. llvm-svn: 111614
* Expand ZERO_EXTEND operations for NEON vector types. | Bob Wilson | 2010-08-18 | 1 | -0/+1
  Testcase from Nick Lewycky. llvm-svn: 111341
* Allow more cases of undef shuffle indices and add tests for them. | Bob Wilson | 2010-08-17 | 1 | -12/+22
  llvm-svn: 111226
* Ignore undef shuffle indices when checking for a VTRN shuffle. Radar 8290937. | Bob Wilson | 2010-08-16 | 1 | -0/+1
  llvm-svn: 111208
* Temporarily disable tail calls on ARM to work around some linker problems. | Bob Wilson | 2010-08-13 | 1 | -0/+9
  llvm-svn: 111050
* cortex m4 has floating point support, but only single precision. | Jim Grosbach | 2010-08-11 | 1 | -1/+2
  llvm-svn: 110810
* Consider this code snippet: | Bill Wendling | 2010-08-11 | 1 | -3/+50

      float t1(int argc) {
        return (argc == 1123) ? 1.234f : 2.38213f;
      }

  We would generate truly awful code on ARM (those with a weak stomach should look away):

      _t1:
          movw r1, #1123
          movs r2, #1
          movs r3, #0
          cmp r0, r1
          mov.w r0, #0
          it eq
          moveq r0, r2
          movs r1, #4
          cmp r0, #0
          it ne
          movne r3, r1
          adr r0, #LCPI1_0
          ldr r0, [r0, r3]
          bx lr

  The problem was that legalization was creating a cascade of SELECT_CC nodes, for the comparison of "argc == 1123" which was fed into a SELECT node for the ?: statement which was itself converted to a SELECT_CC node. This is because the ARM back-end doesn't have custom lowering for SELECT nodes, so it used the default "Expand".

  I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this testcase, but can obviously be expanded to include more cases.

  Now we generate this, which looks optimal to me:

      _t1:
          movw r1, #1123
          movs r2, #0
          cmp r0, r1
          adr r0, #LCPI0_0
          it eq
          moveq r2, #4
          ldr r0, [r0, r2]
          bx lr
          .align 2
      LCPI0_0:
          .long 1075344593 @ float 2.382130e+00
          .long 1067316150 @ float 1.234000e+00

  llvm-svn: 110799
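The improved sequence in this entry amounts to picking a byte offset into a two-entry constant pool and doing a single load. The sketch below restates that in portable C++ as an illustration only; `kPool` and `bitsToFloat` are hypothetical names, and the two bit patterns are the `.long` values quoted in the commit message.

```cpp
#include <cstdint>
#include <cstring>

// Bit patterns copied from the LCPI0_0 constant pool in the commit message:
// entry 0 is 2.38213f, entry 1 is 1.234f.
static const std::uint32_t kPool[2] = {1075344593u, 1067316150u};

// Reinterprets a 32-bit pattern as an IEEE-754 single-precision value.
inline float bitsToFloat(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Same result as the original t1(): one compare, one conditional move of
// the byte offset, one load.
inline float t1(int argc) {
    unsigned offset = 0;                    // movs r2, #0
    if (argc == 1123)                       // cmp r0, r1 ; it eq
        offset = 4;                         // moveq r2, #4
    return bitsToFloat(kPool[offset / 4]);  // ldr r0, [r0, r2]
}
```

The pool is ordered so that the "false" value sits at offset 0, which is why the offset only has to be changed on the equal path.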
* - Add subtarget feature -mattr=+db which determines whether an ARM cpu has the | Evan Cheng | 2010-08-11 | 1 | -12/+12
  memory and synchronization barrier dmb and dsb instructions.
  - Change instruction names to something more sensible (matching name of actual instructions).
  - Added tests for memory barrier codegen.
  llvm-svn: 110785
* Delete some unused instructions. | Evan Cheng | 2010-08-10 | 1 | -72/+0
  llvm-svn: 110710
* Re-apply r110655 with fixes. Epilogue must restore sp from fp if the | Evan Cheng | 2010-08-10 | 1 | -4/+5
  function stack frame has a var-sized object. Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions. llvm-svn: 110707
* Revert r110655, "Fix ARM hasFP() semantics. It should return true whenever FP | Daniel Dunbar | 2010-08-10 | 1 | -5/+4
  register is", it breaks a couple test-suite tests. llvm-svn: 110701
* Fix ARM hasFP() semantics. It should return true whenever FP register is | Evan Cheng | 2010-08-10 | 1 | -4/+5
  reserved, not available for general allocation. This eliminates all the extra checks for Darwin. This change also fixes the use of FP to access frame indices in leaf functions and cleaned up some confusing code in epilogue emission. llvm-svn: 110655
* Remove switch for disabling ARM tail calls. They | Dale Johannesen | 2010-08-04 | 1 | -9/+0
  seem to be working correctly. No functional change. llvm-svn: 110226
* Combine NEON VABD (absolute difference) intrinsics with ADDs to make VABA | Bob Wilson | 2010-08-04 | 1 | -0/+16
  (absolute difference with accumulate) intrinsics. Radar 8228576. llvm-svn: 110170
* Add support for getting & setting the FPSCR application register on ARM when | Nate Begeman | 2010-08-03 | 1 | -1/+22
  VFP is enabled. Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding. Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed. llvm-svn: 110152
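As background for the FLT_ROUNDS_ part of this entry, the sketch below shows one way to derive a C `FLT_ROUNDS` value from an FPSCR word. It relies on two facts outside the commit message: the ARM rounding-mode field occupies FPSCR bits [23:22] (0 nearest, 1 toward +inf, 2 toward -inf, 3 toward zero), and `FLT_ROUNDS` uses 0 toward zero, 1 nearest, 2 toward +inf, 3 toward -inf. The function name is hypothetical, and the commit does not say that the lowering is written exactly this way.

```cpp
#include <cstdint>

// Maps the FPSCR rounding-mode field (bits [23:22]) to the C FLT_ROUNDS
// encoding. The two encodings differ by a rotation of one, so adding 1 to
// the field and keeping the low two bits performs the conversion.
inline int fltRoundsFromFPSCR(std::uint32_t fpscr) {
    return static_cast<int>(((fpscr + (1u << 22)) >> 22) & 3u);
}
```

Other FPSCR bits do not affect the result, because any carry out of the two-bit field is discarded by the final mask.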
* Refactor ARM-specific DAG combining in preparation for adding some more | Bob Wilson | 2010-07-29 | 1 | -12/+25
  transformations. llvm-svn: 109800
* Implement vector constants which are splat of | Dale Johannesen | 2010-07-29 | 1 | -8/+62
  integers with mov + vdup. 8003375. This is currently disabled by default because LICM will not hoist a VDUP, so it pessimizes the code if the construct occurs inside a loop (8248029). llvm-svn: 109799
* Hook in GlobalMerge pass | Anton Korobeynikov | 2010-07-24 | 1 | -0/+6
  llvm-svn: 109359
* Use the appropriate register class for an i32 when adding ARM::LR to the | Jim Grosbach | 2010-07-23 | 1 | -1/+1
  function live in set. This will give us tGPR for Thumb1 and GPR otherwise, so the copy will be spillable. rdar://8224931 llvm-svn: 109293
* - Allow target to specify when is register pressure "too high". In most cases, | Evan Cheng | 2010-07-23 | 1 | -0/+18
  it's too late to start backing off aggressive latency scheduling when most of the registers are in use so the threshold should be a bit tighter.
  - Correctly handle live out's and extract_subreg etc.
  - Enable register pressure aware scheduling by default for hybrid scheduler. For ARM, this is almost always a win on # of instructions. It's runtime neutral for most of the tests. But for some kernels with high register pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by 54 and sped up by 20%.
  llvm-svn: 109279
* Mark an assert-only variable as used. | Chandler Carruth | 2010-07-22 | 1 | -0/+1
  llvm-svn: 109091
* More register pressure aware scheduling work. | Evan Cheng | 2010-07-21 | 1 | -14/+11
  llvm-svn: 109064
* Baby steps towards ARM fast-isel. | Eric Christopher | 2010-07-21 | 1 | -0/+6
  llvm-svn: 109047
* Fix calling convention on ARM if vfp2+ is enabled. | Rafael Espindola | 2010-07-21 | 1 | -1/+5
  llvm-svn: 109009
* Teach bottom up pre-ra scheduler to track register pressure. Work in progress. | Evan Cheng | 2010-07-21 | 1 | -12/+30
  llvm-svn: 108991
* Removed unused code. | Jim Grosbach | 2010-07-20 | 1 | -49/+0
  llvm-svn: 108841
* ARM has to provide its own TargetLowering::findRepresentativeClass because | Evan Cheng | 2010-07-19 | 1 | -0/+16
  its scalar floating point registers alias its vector registers. llvm-svn: 108761
* Since ARM emits inline jump tables as part of the ConstantIsland pass, | Jim Grosbach | 2010-07-19 | 1 | -0/+4
  it should set the jump table encoding to EK_Inline. This prevents a second, unused, copy of the table from being emitted after the function body. PR6581. llvm-svn: 108730
* revert so I can get the right PR# in the log message. | Jim Grosbach | 2010-07-19 | 1 | -4/+0
  llvm-svn: 108727
* Since ARM emits inline jump tables as part of the ConstantIsland pass, | Jim Grosbach | 2010-07-19 | 1 | -0/+4
  it should set the jump table encoding to EK_Inline. This prevents a second, unused, copy of the table from being emitted after the function body. PR7499. llvm-svn: 108722
* Add combiner patterns to more effectively utilize the BFI (bitfield insert) | Jim Grosbach | 2010-07-17 | 1 | -16/+68
  instruction for non-constant operands. This includes the case referenced in the README.txt regarding a bitfield copy. llvm-svn: 108608
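For readers unfamiliar with BFI, the sketch below models what the instruction computes and shows the kind of mask-and-or idiom it can stand in for. This is a plain C++ illustration: `bfi` and `insertByMasking` are hypothetical helpers, the field position (bit 4, width 8) is an arbitrary example, and the actual patterns matched by the combiner are not listed in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Models BFI: replaces `width` bits of `dst`, starting at bit `lsb`, with
// the low `width` bits of `src`. All other bits of `dst` are preserved.
inline std::uint32_t bfi(std::uint32_t dst, std::uint32_t src,
                         unsigned lsb, unsigned width) {
    assert(width >= 1 && lsb + width <= 32);
    const std::uint32_t field =
        (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    const std::uint32_t mask = field << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}

// The same insertion written as and/shift/or with a non-constant source:
// clear bits [11:4] of `a`, then merge in the low 8 bits of `b`.
inline std::uint32_t insertByMasking(std::uint32_t a, std::uint32_t b) {
    return (a & ~0x00000FF0u) | ((b << 4) & 0x00000FF0u);
}
```

Here `insertByMasking(a, b)` and `bfi(a, b, 4, 8)` produce the same value for every input, which is the equivalence a bitfield-insert combine depends on.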