path: root/llvm/lib/Target/ARM/ARMTargetMachine.cpp
Commit message | Author | Age | Files | Lines
...
* ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows() predicates. | Daniel Dunbar | 2011-04-19 | 1 | -12/+11
  llvm-svn: 129816
* This patch combines several changes from Evan Cheng for rdar://8659675. | Bob Wilson | 2011-04-19 | 1 | -4/+1
  Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons.
  1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can cause an additional pipeline stall. So it's frequently better to simply codegen vmul + vadd.
  2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
  3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW vmla + vmla is very bad. But this isn't ideal either:
       vmul
       vadd
       vmla
     Instead, we want to expand the second vmla:
       vmla
       vmul
       vadd
     Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.
  Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable.
  A. Add missing isel predicates which cause vmla to be codegen'ed.
  B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla.
  C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case).
  D. Add ARM hazard recognizer to model the vmla / vmls hazards.
  E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.
  Enable these fp vmlx codegen changes for Cortex-A9.
  llvm-svn: 129775
* Tidy up. | Jim Grosbach | 2011-04-06 | 1 | -2/+1
  llvm-svn: 129034
* Triple::MinGW64 is deprecated and removed. We can use Triple::MinGW32 generally. | NAKAMURA Takumi | 2011-02-17 | 1 | -1/+0
  No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. On the LLVM side, i686 and x64 can be treated in a similar way.
  llvm-svn: 125747
* Add support for the --noexecstack option. | Rafael Espindola | 2011-01-23 | 1 | -2/+3
  llvm-svn: 124077
* Rename TargetFrameInfo into TargetFrameLowering. Also, put a couple of FIXMEs and fixes here and there. | Anton Korobeynikov | 2011-01-10 | 1 | -5/+5
  llvm-svn: 123170
* Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. | Evan Cheng | 2010-12-05 | 1 | -0/+6
  1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can cause an additional pipeline stall. So it's frequently better to simply codegen vmul + vadd.
  2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
  3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW vmla + vmla is very bad. But this isn't ideal either:
       vmul
       vadd
       vmla
     Instead, we want to expand the second vmla:
       vmla
       vmul
       vadd
     Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.
  Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable.
  A. Add missing isel predicates which cause vmla to be codegen'ed.
  B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla.
  C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case).
  D. Add ARM hazard recognizer to model the vmla / vmls hazards.
  E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.
  Work in progress, only A+B are enabled.
  llvm-svn: 120960
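  Item B in the commit message above (only fold the multiply into a multiply-accumulate when the fmul has a single use) can be illustrated with a small standalone sketch. This is not LLVM's SelectionDAG code: the Node type, its numUses field, and canFoldToMultiplyAccumulate are hypothetical names invented for illustration, and the real check lives in the ARM isel predicates.

```cpp
// Hypothetical, simplified expression node. This is NOT LLVM's SDNode;
// it only carries what the single-use check needs.
enum class Op { Leaf, FMul, FAdd };

struct Node {
  Op op;
  const Node *lhs;
  const Node *rhs;
  int numUses;  // how many other nodes consume this node's value
};

// Returns true when (fadd (fmul a, b), c) may be folded into a single
// multiply-accumulate. The fmul must feed only this fadd; otherwise the
// product is still needed elsewhere and we would end up computing both
// an fmul and an fmla.
bool canFoldToMultiplyAccumulate(const Node &add) {
  if (add.op != Op::FAdd)
    return false;
  auto isSingleUseMul = [](const Node *n) {
    return n != nullptr && n->op == Op::FMul && n->numUses == 1;
  };
  // fadd is commutative, so the fmul may be either operand.
  return isSingleUseMul(add.lhs) || isSingleUseMul(add.rhs);
}
```

  With this sketch, an fadd whose fmul operand has numUses == 1 is a fold candidate, while one whose fmul is shared with another consumer is left as a separate fmul + fadd.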
* tidy up | Chris Lattner | 2010-11-17 | 1 | -18/+11
  llvm-svn: 119462
* First step of huge frame-related refactoring: move emit{Prologue,Epilogue} out of TargetRegisterInfo to TargetFrameInfo, which is definitely a much more suitable place | Anton Korobeynikov | 2010-11-15 | 1 | -3/+6
  llvm-svn: 119097
* Revert the accidental commit I made reverting the previous commit. | Eric Christopher | 2010-11-11 | 1 | -6/+7
  llvm-svn: 118835
* Revert this temporarily. | Eric Christopher | 2010-11-11 | 1 | -7/+6
  llvm-svn: 118827
* Jim asked us to move DataLayout on ARM back to the most specialized classes. | Rafael Espindola | 2010-10-03 | 1 | -5/+16
  Do so and also change X86 for consistency. Investigating if this can be improved a bit.
  llvm-svn: 115469
* I added a new file ARMAsmBackend which stubs out in similar ways to the equivalent X86 class. | Jason W Kim | 2010-09-30 | 1 | -0/+6
  For now, I split the ELFARMAsmBackend from the DarwinARMAsmBackend (also mimicking X86).
  Tested against -r115126
  llvm-svn: 115129
* Resolve this GCC warning: | Nick Lewycky | 2010-09-28 | 1 | -1/+2
  ARMTargetMachine.cpp:53: error: control reaches end of non-void function
  llvm-svn: 114992
* Add additional stub framework for the ARM MC ELF emission. | Rafael Espindola | 2010-09-27 | 1 | -14/+39
  llc now recognizes the "intent" to support MC/obj emission for ARM, but given that they are all stubs, it asserts on --filetype=obj --march=arm
  Patch by Jason Kim.
  llvm-svn: 114856
* Convert some VTBL and VTBX instructions to use pseudo instructions prior to register allocation. | Bob Wilson | 2010-09-13 | 1 | -3/+0
  Remove the NEONPreAllocPass, which is no longer needed. Yeah!!
  llvm-svn: 113818
* Report error if codegen tries to instantiate an ARM target when the CPU does not support it, e.g. cortex-m* processors. | Evan Cheng | 2010-08-11 | 1 | -0/+3
  llvm-svn: 110798
* Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations. | Evan Cheng | 2010-08-09 | 1 | -7/+1
  llvm-svn: 110584
* Add an option to disable 32 -> 16-bit Thumb2 size reduction pass for experimentation. | Evan Cheng | 2010-08-09 | 1 | -2/+7
  llvm-svn: 110579
* Hook in GlobalMerge pass | Anton Korobeynikov | 2010-07-24 | 1 | -1/+7
  llvm-svn: 109359
* Remove early IT block formation. It's not used. | Evan Cheng | 2010-07-02 | 1 | -8/+0
  llvm-svn: 107513
* Add missing ARM and Thumb data layout info for vector types. | Bob Wilson | 2010-06-25 | 1 | -4/+8
  Radar 8128745.
  llvm-svn: 106820
* Oops. IT block formation pass needs to be run at any optimization level. | Evan Cheng | 2010-06-24 | 1 | -4/+3
  llvm-svn: 106775
* Move ARM if-conversion before post-ra scheduling. | Evan Cheng | 2010-06-18 | 1 | -15/+2
  llvm-svn: 106355
* Allow ARM if-converter to be run after post allocation scheduling. | Evan Cheng | 2010-06-18 | 1 | -2/+5
  - This fixed a number of bugs in if-converter, tail merging, and post-allocation scheduler. If-converter now runs branch folding / tail merging first to maximize if-conversion opportunities.
  - Also changed the t2IT instruction slightly. It now defines the ITSTATE register which is read by instructions in the IT block.
  - Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't change the instruction ordering in the IT block (since IT mask has been finalized). It also ensures no other instructions can be scheduled between instructions in the IT block.
  This is not yet enabled.
  llvm-svn: 106344
* Make post-ra scheduling, anti-dep breaking, and register scavenger (conservatively) aware of predicated instructions. | Evan Cheng | 2010-06-16 | 1 | -2/+11
  This enables ARM to move if-conversion before post-ra scheduler.
  llvm-svn: 106091
* Typo. | Evan Cheng | 2010-06-09 | 1 | -1/+1
  llvm-svn: 105677
* Thumb2 IT blocks are fairly expensive. | Evan Cheng | 2010-06-09 | 1 | -0/+10
  When there are multiple selects using the same condition, it's important to make sure they are scheduled together to avoid forming multiple IT blocks. I'm adding a pre-regalloc pass that forms IT blocks early (by re-scheduling instructions and splitting basic blocks) to attempt to fix this. This is not turned on by default since I am not sure this is the right fix.
  Another issue is that llvm selects are modeled as two-address conditional moves. This can be very bad when the copies before the conditional moves are not coalesced away. Teach IT formation pass to move the copies above the IT block (when legal) to avoid breaking the IT block.
  llvm-svn: 105669
* Implement a bunch more TargetSelectionDAGInfo infrastructure. | Dan Gohman | 2010-05-11 | 1 | -2/+4
  Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and EmitTargetCodeForMemmove out of TargetLowering and into SelectionDAGInfo to exercise this.
  llvm-svn: 103481
* Remove late ARM codegen optimization pass committed by accident. | Anton Korobeynikov | 2010-04-07 | 1 | -7/+1
  It is not ready for public yet.
  llvm-svn: 100673
* Move NEON-VFP domain fixer up, so the post-RA scheduler benefits from it. | Anton Korobeynikov | 2010-04-07 | 1 | -4/+6
  llvm-svn: 100668
* Some initial version of global merger | Anton Korobeynikov | 2010-04-07 | 1 | -1/+7
  llvm-svn: 100641
* TargetRegistry: Fix create{AsmInfo,MCDisassembler} to return non-const objects. | Daniel Dunbar | 2010-03-20 | 1 | -1/+1
  llvm-svn: 99097
* remove dead code. | Chris Lattner | 2010-02-02 | 1 | -24/+0
  llvm-svn: 95134
* eliminate all the dead addSimpleCodeEmitter implementations. | Chris Lattner | 2010-02-02 | 1 | -25/+0
  eliminate random "code emitter" stuff in Alpha, except for the JIT path. Next up, remove the template cruft.
  llvm-svn: 95131
* For aligned load/store instructions, it's only required to know whether a function can support dynamic stack realignment. | Jim Grosbach | 2010-01-19 | 1 | -4/+0
  That's a much easier question to answer at the instruction selection stage than whether the function will actually have a dynamic alignment prologue. This allows the removal of the stack alignment heuristic pass, and improves code quality for cases where the heuristic would result in dynamic alignment code being generated when it was not strictly necessary.
  llvm-svn: 93885
* Factor the stack alignment calculations out into a target independent pass. | Jim Grosbach | 2009-12-02 | 1 | -1/+1
  No functionality change.
  llvm-svn: 90336
* Detect need for autoalignment of the stack earlier to catch spills more conservatively. | Jim Grosbach | 2009-11-15 | 1 | -0/+4
  eliminateFrameIndex() machinery adjusted to handle addr mode 6 (vld1/vst1) used for spills. Fix tests to expect aligned Q-reg spilling
  llvm-svn: 88874
* indicate what the native integer types for the target are. | Chris Lattner | 2009-11-07 | 1 | -4/+4
  Please verify.
  llvm-svn: 86397
* - Add pseudo instructions tLDRpci_pic and t2LDRpci_pic which do a pc-relative load of a GV from constantpool and then add pc. | Evan Cheng | 2009-11-06 | 1 | -0/+4
    It allows the code sequence to be rematerializable so it would be hoisted by machine licm.
  - Add a late pass to break these pseudo instructions into a number of real instructions. Also move the code in Thumb2 IT pass that breaks up t2MOVi32imm to this pass. This is done before post regalloc scheduling to allow the scheduler to properly schedule these instructions. It also allows them to be if-converted and shrunk by later passes.
  llvm-svn: 86304
* Pass StringRef by value. | Daniel Dunbar | 2009-11-06 | 1 | -2/+1
  llvm-svn: 86251
* Move subtarget check up for NEON reg-reg fixup pass. | Anton Korobeynikov | 2009-11-03 | 1 | -1/+2
  llvm-svn: 85914
* Turn NEON reg-reg moves fixup code into a separate pass. This should reduce the compile time. | Anton Korobeynikov | 2009-11-03 | 1 | -2/+5
  llvm-svn: 85850
* Revert r85346 change to control tail merging by CodeGenOpt::Level. | Bob Wilson | 2009-10-28 | 1 | -1/+1
  I'm going to redo this using the OptimizeForSize function attribute.
  llvm-svn: 85426
* Record CodeGen optimization level in the BranchFolding pass so that we can use it to control tail merging when there is a tradeoff between performance and code size. | Bob Wilson | 2009-10-27 | 1 | -1/+1
  When there is only 1 instruction in the common tail, we have been merging. That can be good for code size but is a definite loss for performance. Now we will avoid tail merging in that case when the optimization level is "Aggressive", i.e., "-O3". Radar 7338114.
  Since the IfConversion pass invokes BranchFolding, it too needs to know the optimization level. Note that I removed the RegisterPass instantiation for IfConversion because it required a default constructor. If someone wants to keep that for some reason, we can add a default constructor with a hard-wired optimization level.
  llvm-svn: 85346
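  The tail-merging policy described in the commit message above (r85346, reverted by r85426) can be sketched roughly as follows. This is a minimal illustration only: OptLevel and shouldTailMerge are hypothetical names, not the BranchFolding pass's actual interface, and the enumerators merely mirror the "Aggressive" level the message mentions.

```cpp
// Hypothetical stand-in for the codegen optimization level; the names
// are assumptions for this sketch, not LLVM's real declaration.
enum class OptLevel { None, Default, Aggressive };

// Decide whether two blocks sharing a common tail of
// `commonTailLength` instructions should be tail-merged.
// A one-instruction tail saves code size but costs performance, so it
// is skipped at the Aggressive ("-O3") level, per the commit message.
bool shouldTailMerge(unsigned commonTailLength, OptLevel level) {
  if (commonTailLength == 0)
    return false;  // nothing in common to merge
  if (commonTailLength == 1 && level == OptLevel::Aggressive)
    return false;  // favor performance over code size
  return true;
}
```

  Under this sketch a single shared instruction is still merged at lower optimization levels, and longer common tails are merged at every level.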
* Revert 84843. Evan, this was breaking some of the if-conversion tests. | Bob Wilson | 2009-10-22 | 1 | -3/+5
  llvm-svn: 84868
* Move if-conversion before post-regalloc scheduling so the predicated instructions get scheduled properly. | Evan Cheng | 2009-10-22 | 1 | -5/+3
  llvm-svn: 84843
* Trim include. | Evan Cheng | 2009-10-22 | 1 | -1/+0
  llvm-svn: 84831
* Move load / store multiple before post-alloc scheduling. | Evan Cheng | 2009-10-02 | 1 | -10/+2
  llvm-svn: 83236
* Add an option which would move ld/st multiple pass before post-alloc scheduling. | Evan Cheng | 2009-09-30 | 1 | -1/+16
  llvm-svn: 83145