path: root/llvm/test/CodeGen/ARM
Commit message | Author | Age | Files | Lines
* Radar 8803471: Fix expansion of ARM BCCi64 pseudo instructions. | Bob Wilson | 2010-12-23 | 1 | -0/+1
  If the basic block containing the BCCi64 (or BCCZi64) instruction ends with an unconditional branch, that branch needs to be deleted before appending the expansion of the BCCi64 to the end of the block.
  llvm-svn: 122521
* Add ARM-specific DAG combining to cast i64 vector element load/stores to f64. | Bob Wilson | 2010-12-21 | 1 | -0/+30
  Type legalization splits up i64 values into pairs of i32 values, which leads to poor quality code when inserting or extracting i64 vector elements. If the vector element is loaded or stored, it can be treated as an f64 value and loaded or stored directly from a VPR register. Use the pre-legalization DAG combiner to cast those vector elements to f64 types so that the type legalizer won't mess them up. Radar 8755338.
  llvm-svn: 122319
* move this test into the ARM test so that it is only run when the arm backend | Chris Lattner | 2010-12-19 | 1 | -0/+23
  is enabled.
  llvm-svn: 122163
* Fix result type of Neon floating-point comparisons against zero. | Bob Wilson | 2010-12-18 | 1 | -0/+19
  The result vector elements are always integers. Radar 8782191.
  llvm-svn: 122112
* During local stack slot allocation, the materializeFrameBaseRegister function | Bill Wendling | 2010-12-17 | 1 | -0/+15
  may be called. If the entry block is empty, the insertion point iterator will be the "end()" value. Calling ->getParent() on it (among others) causes problems.
  Modify materializeFrameBaseRegister to take the machine basic block and insert the frame base register at the beginning of that block. (It's very similar to what the code does already. The only difference is that it will always insert at the beginning of the entry block instead of after a previous materialization of the frame base register. I doubt that that matters here.)
  <rdar://problem/8782198>
  llvm-svn: 122104
* Fix a DAGCombiner crash when folding binary vector operations with constant | Bob Wilson | 2010-12-17 | 1 | -0/+14
  BUILD_VECTOR operands where the element type is not legal. I had previously changed this code to insert TRUNCATE operations, but that was just wrong.
  llvm-svn: 122102
* Combine several vector-related DAGCombiner tests. | Bob Wilson | 2010-12-17 | 5 | -61/+63
  llvm-svn: 122101
* Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation. | Bob Wilson | 2010-12-17 | 1 | -0/+19
  Radar 8776599
  llvm-svn: 122018
* 1. ARM/MC/ELF: A few more ELF relocs for .o | Jason W Kim | 2010-12-16 | 1 | -0/+35
  2. Fixed EmitLocalCommonSymbol for ELF (Yes, they exist. :)
  Test added.
  llvm-svn: 121951
* Don't handle -arm-long-calls in fast isel for now. | Eric Christopher | 2010-12-15 | 1 | -0/+30
  llvm-svn: 121919
* Add Neon VCVT instructions for f32 <-> f16 conversions. | Bob Wilson | 2010-12-15 | 1 | -1/+19
  Clang is now providing intrinsics for these and so we need to support them in the backend. Radar 8068427.
  llvm-svn: 121902
* bfi A, (and B, C1), C2) -> bfi A, B, C2 iff C1 & C2 == C1. rdar://8458663 | Evan Cheng | 2010-12-14 | 1 | -0/+13
  llvm-svn: 121746
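The idea behind this fold can be checked with a small model. The log does not show how C1 and C2 are encoded in the DAG node, so the sketch below does not reproduce the commit's exact condition. It assumes a simplified, hypothetical `bfi(a, b, lsb, width)` with an explicit field position and checks the underlying point: an AND on the inserted value is redundant when it keeps every bit the insert actually reads.

```python
M32 = 0xFFFFFFFF


def bfi(a, b, lsb, width):
    """Hypothetical 32-bit model: copy the low `width` bits of b into a at bit `lsb`."""
    field = (1 << width) - 1
    return ((a & ~(field << lsb)) | ((b & field) << lsb)) & M32


def and_is_redundant(c1, width):
    """True if masking b with c1 keeps all of the low `width` bits that bfi reads."""
    field = (1 << width) - 1
    return (c1 & field) == field


a, b = 0xFFFF0000, 0x1234ABCD
# c1 = 0xFF keeps the whole 8-bit field, so the AND can be dropped.
print(and_is_redundant(0xFF, 8), hex(bfi(a, b & 0xFF, 8, 8)), hex(bfi(a, b, 8, 8)))
# c1 = 0x0F clears bits the insert reads, so the AND must stay.
print(and_is_redundant(0x0F, 8), hex(bfi(a, b & 0x0F, 8, 8)), hex(bfi(a, b, 8, 8)))
```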
* fix fixme case typo :-) | Jason W Kim | 2010-12-14 | 1 | -1/+1
  llvm-svn: 121743
* First cut of ARM/MC/ELF PIC relocations. | Jason W Kim | 2010-12-13 | 1 | -0/+100
  Test has fixme, to move to .s -> .o test when AsmParser works better.
  llvm-svn: 121732
* (or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056 | Evan Cheng | 2010-12-11 | 1 | -2/+13
  llvm-svn: 121606
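A bit-level sketch of why this rewrite holds. It uses a hypothetical `bfi(a, b, lsb, width)` model rather than the backend's node encoding, and it makes two assumptions the subject line does not state: the mask is one contiguous run of ones, and B is already zero in the masked bits.

```python
M32 = 0xFFFFFFFF


def bfi(a, b, lsb, width):
    """Hypothetical 32-bit model: copy the low `width` bits of b into a at bit `lsb`."""
    field = (1 << width) - 1
    return ((a & ~(field << lsb)) | ((b & field) << lsb)) & M32


def or_and_shl(a, b, shamt, mask):
    """The original pattern: (or (and (shl a, shamt), mask), b)."""
    return (((a << shamt) & mask) | b) & M32


def field_of(mask):
    """Return (lsb, width) if mask is a single contiguous run of ones, else None."""
    if mask == 0:
        return None
    lsb = (mask & -mask).bit_length() - 1
    run = mask >> lsb
    if run & (run + 1):  # a contiguous run plus one is a power of two
        return None
    return lsb, run.bit_length()


a, b, shamt, mask = 0xAB, 0x12000034, 8, 0x0000FF00
lsb, width = field_of(mask)
assert lsb == shamt and (b & mask) == 0  # preconditions assumed by this sketch
print(hex(or_and_shl(a, b, shamt, mask)), hex(bfi(b, a, lsb, width)))
```

Both expressions produce the same word here; if B had bits set inside the mask, the OR would keep them while the insert would overwrite them.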
* Add float patterns for Neon vld1-lane/dup and vst1-lane operations. | Bob Wilson | 2010-12-10 | 4 | -18/+54
  llvm-svn: 121583
* Fix some invalid alignments for Neon vld-dup and vld/st-lane instructions. | Bob Wilson | 2010-12-10 | 2 | -8/+28
  Alignments smaller than the total size of the memory being loaded or stored, unless the alignment is 8 bytes, are not allowed. Add tests for this, too.
  llvm-svn: 121506
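Read literally, the rule in this message is a one-line predicate. The sketch below is only that literal reading; the function name is hypothetical, and what the backend does with a rejected alignment is not described in the log.

```python
def alignment_allowed(align_bytes, total_bytes):
    """Literal reading of the rule: an alignment below the total access size
    is rejected unless it is exactly 8 bytes."""
    return align_bytes >= total_bytes or align_bytes == 8


print(alignment_allowed(4, 8))   # 4-byte alignment on an 8-byte access
print(alignment_allowed(8, 16))  # 8 bytes is the stated exception
print(alignment_allowed(2, 2))   # alignment equal to the access size
```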
* ARM stm/ldm instructions require more than one register in the register list. | Jim Grosbach | 2010-12-09 | 1 | -2/+2
  Otherwise, a plain str/ldr should be used instead. Make sure we account for that in prologue/epilogue code generation. rdar://8745460
  llvm-svn: 121391
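A minimal sketch of the selection this message describes, assuming a hypothetical `emit_save` helper that returns assembly text for a prologue spill. The exact instructions the backend emits are not shown in the log; the strings are illustrative.

```python
def emit_save(regs):
    """Pick a single-register store when only one register is saved,
    and a store-multiple otherwise (hypothetical helper)."""
    if not regs:
        return []
    if len(regs) == 1:
        # One register: a plain pre-indexed str instead of stm.
        return [f"str {regs[0]}, [sp, #-4]!"]
    return ["stmdb sp!, {" + ", ".join(regs) + "}"]


print(emit_save(["r4"]))
print(emit_save(["r4", "r5", "lr"]))
```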
* ARM/MC/ELF TPsoft is now a proper pseudo inst. | Jason W Kim | 2010-12-08 | 1 | -0/+52
  Added test to check bl __aeabi_read_tp gets emitted properly for ELF/ASM as well as ELF/OBJ (including fixup)
  Also added support for ELF::R_ARM_TLS_IE32
  llvm-svn: 121312
* Fix a bad prologue / epilogue codegen bug where the compiler would emit illegal | Evan Cheng | 2010-12-07 | 1 | -0/+40
  vpush instructions to save / restore VFP / NEON registers like this:
    vpush {d8,d10,d11}
    vpop {d8,d10,d11}
  vpush and vpop do not allow gaps in the register list.
  rdar://8728956
  llvm-svn: 121197
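One way to satisfy the no-gaps constraint is to split the callee-saved D registers into consecutive runs and emit one vpush per run. The sketch below illustrates that idea with hypothetical helpers. It is not taken from the patch, and it ignores push ordering and the matching vpop sequence, which would have to mirror the pushes in reverse.

```python
def split_contiguous(dregs):
    """Group D-register numbers into runs of consecutive registers."""
    runs = []
    for n in sorted(dregs):
        if runs and n == runs[-1][-1] + 1:
            runs[-1].append(n)
        else:
            runs.append([n])
    return runs


def vpush_lines(dregs):
    """Emit one vpush per gap-free run (illustrative assembly text)."""
    return [
        "vpush {" + ", ".join(f"d{n}" for n in run) + "}"
        for run in split_contiguous(dregs)
    ]


# The illegal list from the message, d8/d10/d11, becomes two legal pushes.
print(vpush_lines([8, 10, 11]))
```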
* If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0. | Devang Patel | 2010-12-06 | 1 | -24/+19
  llvm-svn: 121059
* Making use of VFP / NEON floating point multiply-accumulate / subtraction is | Evan Cheng | 2010-12-05 | 1 | -1/+2
  difficult on current ARM implementations for a few reasons.
  1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first few (4? on Cortex-A8) cycles can cause an additional pipeline stall. So it's frequently better to simply codegen vmul + vadd.
  2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
  3. A vmla followed by a vmla is a special case. Obviously, issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either:
       vmul
       vadd
       vmla
     Instead, we want to expand the second vmla:
       vmla
       vmul
       vadd
     Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.
  Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable.
  A. Add missing isel predicates which cause vmla to be codegen'ed.
  B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla.
  C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case).
  D. Add ARM hazard recognizer to model the vmla / vmls hazards.
  E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.
  Work in progress, only A+B are enabled.
  llvm-svn: 120960
* ARM/MC/ELF relocation "hello world" for movw/movt. | Jason W Kim | 2010-12-01 | 1 | -0/+42
  Lifted adjustFixupValue() from Darwin for sharing w ELF.
  Test added
  TODO: refactor ELFObjectWriter::RecordRelocation more. Possibly share more code with Darwin? Lots more relocations...
  llvm-svn: 120534
* Enable sibling call optimization of libcalls which are expanded during | Evan Cheng | 2010-11-30 | 1 | -29/+45
  legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777
  llvm-svn: 120501
* Add support for NEON VLD4-dup instructions. | Bob Wilson | 2010-11-30 | 1 | -0/+23
  The encoding for alignment in VLD4-dup instructions is still a work in progress.
  llvm-svn: 120356
* Mark Darwin call instructions as using "r7" to prevent the frame-register | Evan Cheng | 2010-11-29 | 1 | -0/+28
  assignment instructions from being moved below / above calls. rdar://8690640
  llvm-svn: 120339
* Add missing colon. | Benjamin Kramer | 2010-11-29 | 1 | -1/+1
  llvm-svn: 120336
* Add support for NEON VLD3-dup instructions. | Bob Wilson | 2010-11-29 | 1 | -0/+20
  llvm-svn: 120312
* Add support for NEON VLD2-dup instructions. | Bob Wilson | 2010-11-28 | 1 | -0/+32
  llvm-svn: 120236
* Add NEON VLD1-dup instructions (load 1 element to all lanes). | Bob Wilson | 2010-11-27 | 2 | -1/+43
  llvm-svn: 120194
* Recognize sign/zero-extended constant BUILD_VECTORs for VMULL operations. | Bob Wilson | 2010-11-23 | 1 | -0/+72
  We need to check if the individual vector elements are sign/zero-extended values. For now this only handles constant values. Radar 8687140.
  llvm-svn: 120034
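The per-element check can be sketched as a range test: a constant counts as sign-extended from a narrower width if it fits in that width as a signed value, and as zero-extended if it fits as an unsigned value. The function names and the "sext"/"zext" labels below are hypothetical, and elements are treated as plain mathematical integers rather than fixed-width bit patterns.

```python
def is_sign_extended(value, half_bits):
    """True if value fits in a signed integer of half_bits bits."""
    return -(1 << (half_bits - 1)) <= value < (1 << (half_bits - 1))


def is_zero_extended(value, half_bits):
    """True if value fits in an unsigned integer of half_bits bits."""
    return 0 <= value < (1 << half_bits)


def vector_ext_kind(elements, half_bits):
    """Classify a constant vector; every element must pass the same test."""
    if all(is_sign_extended(v, half_bits) for v in elements):
        return "sext"
    if all(is_zero_extended(v, half_bits) for v in elements):
        return "zext"
    return None


print(vector_ext_kind([100, -128, 127], 8))
print(vector_ext_kind([200, 255], 8))
print(vector_ext_kind([300, 1], 8))
```

Small non-negative vectors pass both tests; this sketch arbitrarily reports them as "sext".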
* Fix epilogue codegen to avoid leaving the stack pointer in an invalid | Evan Cheng | 2010-11-22 | 1 | -1/+1
  state. Previously Thumb2 would restore sp from fp like this:
    mov sp, r7
    sub sp, #4
  If an interrupt is taken after the 'mov' but before the 'sub', callee-saved registers might be clobbered by the interrupt handler. Instead, try restoring directly from sp:
    add sp, #4
  Or, if necessary (with VLA, etc.) use a scratch register to compute sp and then restore it:
    sub.w r4, r7, #8
    mov sp, r7
  rdar://8465407
  llvm-svn: 119977
* Fix bug in DAGCombiner for ARM that was trying to do a ShiftCombine on illegal types (vector should be split first). | Tanya Lattner | 2010-11-18 | 1 | -0/+8
  Added test case.
  llvm-svn: 119749
* Rewrite stack callee saved spills and restores to use push/pop instructions. | Eric Christopher | 2010-11-18 | 2 | -3/+3
  Remove movePastCSLoadStoreOps and associated code for simple pointer increments. Update routines that depended upon other opcodes for save/restore.
  Adjust all testcases accordingly.
  llvm-svn: 119725
* These tests are looking for library function names that | Dale Johannesen | 2010-11-17 | 4 | -4/+4
  appear to differ on Linux. Try to make them pass on Linux. Would be good for a Linux person to review this.
  llvm-svn: 119572
* Change ARMGlobalMerge to keep BSS globals in separate pools. | Bob Wilson | 2010-11-17 | 1 | -1/+7
  This completes the fixes for Radar 8673120.
  llvm-svn: 119566
* Fix ARMGlobalMerge pass to check if globals are entirely within range. | Bob Wilson | 2010-11-17 | 1 | -0/+6
  It is generally not sufficient to check if the starting offset is in range of the maximum offset that can be efficiently used for the target.
  llvm-svn: 119565
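The difference between checking the starting offset and checking the whole global can be shown with a toy packer. This is a sketch under stated assumptions: `pack_globals` is a hypothetical greedy packer, alignment padding is ignored, and whether the real limit is inclusive or exclusive is not stated in the log.

```python
def pack_globals(sizes, max_offset):
    """Greedily pack globals into pools so that each global ends at or
    before max_offset, not merely starts before it."""
    pools, current, offset = [], [], 0
    for size in sizes:
        # Compare the END of the global (offset + size) with the limit.
        if current and offset + size > max_offset:
            pools.append(current)
            current, offset = [], 0
        current.append((offset, size))
        offset += size
    if current:
        pools.append(current)
    return pools


# Three 40-byte globals with a 100-byte limit: the third would start at 80,
# which is in range, but it would end at 120, so it opens a new pool.
print(pack_globals([40, 40, 40], 100))
```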
* Change the symbol for merged globals from "merged" to "_MergedGlobals". | Bob Wilson | 2010-11-17 | 1 | -1/+1
  This makes it more clear that the symbol is an internal, compiler-generated name and gives a little more description about its contents.
  llvm-svn: 119564
* Fix the ARMGlobalMerge pass to look at variable sizes instead of pointer sizes. | Bob Wilson | 2010-11-17 | 1 | -0/+11
  It was mistakenly looking at the pointer type when checking for the size of global variables. This is a partial fix for Radar 8673120.
  llvm-svn: 119563
* Remove ARM isel hacks that fold large immediates into a pair of add, sub, and, | Evan Cheng | 2010-11-17 | 2 | -4/+4
  and xor. The 32-bit move immediates can be hoisted out of loops by machine LICM but the isel hacks were preventing them.
  Instead, let the peephole optimization pass recognize registers that are defined by immediates and the ARM target hook will fold the immediates in.
  Other changes include
  1) do not fold and / xor into cmp to isel TST / TEQ instructions if there are multiple uses. This happens when the 'and' is live out, machine sink would have sunk the computation and that ends up pessimizing code. The peephole pass would recognize situations where the 'and' can be toggled to define CPSR and eliminate the comparison anyway.
  2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking important optimizations.
  rdar://8663787, rdar://8241368
  llvm-svn: 119548
* Fix PR8612 in the standard spiller, take two. | Jakob Stoklund Olesen | 2010-11-16 | 1 | -0/+1
  The live range of a register defined by an early clobber starts at the use slot, not the def slot. Except when it is an early clobber tied to a use operand. Then it starts at the def slot like a standard def.
  llvm-svn: 119305
* Revert "Fix PR8612 in the standard spiller as well." | Jakob Stoklund Olesen | 2010-11-15 | 1 | -1/+0
  This reverts r119183 which broke the buildbots.
  llvm-svn: 119270
* Recommit this change and remove the failing part of the test - it didn't | Eric Christopher | 2010-11-15 | 1 | -26/+3
  pass in the first place and was masked by earlier failures not warning and aborting the block.
  llvm-svn: 119184
* Fix PR8612 in the standard spiller as well. | Jakob Stoklund Olesen | 2010-11-15 | 1 | -0/+1
  The live range of a register defined by an early clobber starts at the use slot, not the def slot.
  llvm-svn: 119183
* When spilling a register defined by an early clobber, make sure that the new | Jakob Stoklund Olesen | 2010-11-15 | 1 | -0/+84
  live ranges for the spill register are also defined at the use slot instead of the normal def slot.
  This fixes PR8612 for the inline spiller. A use was being allocated to the same register as a spilled early clobber def.
  This problem exists in all the spillers. A fix for the standard spiller is forthcoming.
  llvm-svn: 119182
* Add conditional move of large immediate. | Evan Cheng | 2010-11-13 | 1 | -15/+32
  llvm-svn: 118968
* Fix an obvious typo which inverted an immediate. | Evan Cheng | 2010-11-13 | 1 | -4/+17
  llvm-svn: 118951
* This should be still failing, but is. Disable it with the | Eric Christopher | 2010-11-13 | 1 | -2/+2
  forget-me-stick for now.
  llvm-svn: 118950
* Add conditional mvn instructions. | Evan Cheng | 2010-11-12 | 1 | -9/+53
  llvm-svn: 118935
* Add some missing isel predicates on def : pat patterns to avoid generating VFP vmla / vmls (they cause stalls). | Evan Cheng | 2010-11-12 | 5 | -53/+154
  Disabling them in isel is probably not the right solution, I'll look into a proper solution next.
  llvm-svn: 118922