path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Remove unused argument. | Rafael Espindola | 2011-04-21 | 1 | -6/+4
  llvm-svn: 129955
* Fix DWARF description of Q registers. | Devang Patel | 2011-04-21 | 1 | -0/+27
  llvm-svn: 129952
* Fix DWARF description of S registers. | Devang Patel | 2011-04-21 | 2 | -0/+44
  llvm-svn: 129947
* As per ARM docs, register Dx is described as DW_OP_regx(256+x) in DWARF. | Devang Patel | 2011-04-21 | 1 | -24/+32
  llvm-svn: 129922
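The mapping named in r129922 can be sketched in a few lines. This is a hypothetical illustration, not code from the patch: `dwarfRegForD` is an invented name, and the only fact taken from the log is that Dx maps to DWARF register 256+x. The range check assumes a core with D0..D31.

```cpp
#include <cassert>

// Hypothetical helper (not from the patch): map ARM register Dx to the
// DWARF register number used with DW_OP_regx, i.e. 256 + x.
unsigned dwarfRegForD(unsigned x) {
  assert(x < 32 && "sketch assumes D0..D31");
  return 256 + x;
}
```

Under these assumptions D0 maps to 256 and D31 to 287.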
* PTX: Expand usable register space | Justin Holewinski | 2011-04-21 | 1 | -6/+226
  llvm-svn: 129913
* ptx: fix parameter ordering | Che-Liang Chiou | 2011-04-21 | 1 | -4/+1
  This patch depends on the prior fix r129908, which switches to std::find,
  rather than std::binary_search, on an unordered array.
  Patch by Dan Bailey
  llvm-svn: 129909
* ptx: PTXMachineFunctionInfo no longer sorts registers and so should not use std::binary_search | Che-Liang Chiou | 2011-04-21 | 1 | -2/+3
  llvm-svn: 129908
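The reason behind r129908 and r129909 is a standard-library precondition: std::binary_search is only meaningful on a sorted (partitioned) range, while std::find scans linearly and works in any order. A minimal sketch follows; `containsReg` and the plain `unsigned` register type are assumptions for illustration, not names from the PTX backend.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a lookup in an unsorted register list.
// std::find performs a linear scan, so it is correct for any ordering,
// unlike std::binary_search, which requires a sorted range.
bool containsReg(const std::vector<unsigned> &regs, unsigned reg) {
  return std::find(regs.begin(), regs.end(), reg) != regs.end();
}
```

The trade-off is O(n) lookup instead of O(log n), which is usually acceptable for short register lists.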
* Remove -use-divmod-libcall. Let targets opt in when they are available. | Evan Cheng | 2011-04-20 | 3 | -6/+4
  llvm-svn: 129884
* Revert r129846; it's breaking a buildbot. See | Eli Friedman | 2011-04-20 | 1 | -0/+1
  http://google1.osuosl.org:8011/builders/llvm-x86_64-linux-checks/builds/825/steps/test.llvm.stage2/logs/st.ll
  llvm-svn: 129869
* Prefer cheap registers for busy live ranges. | Jakob Stoklund Olesen | 2011-04-20 | 2 | -7/+16
  On the x86-64 and thumb2 targets, some registers are more expensive to
  encode than others in the same register class.

  Add a CostPerUse field to the TableGen register description, and make it
  available from TRI->getCostPerUse. This represents the cost of a REX
  prefix or a 32-bit instruction encoding required by choosing a high
  register.

  Teach the greedy register allocator to prefer cheap registers for busy
  live ranges (as indicated by spill weight).
  llvm-svn: 129864
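The idea in r129864 can be illustrated with a toy selection routine. Everything here is an assumption made for the sketch: `PhysReg`, `pickRegister`, and the boolean `isBusy` flag are invented, and the real greedy allocator's heuristic is not shown in this log. Only the notion of a per-register cost comes from the commit message.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical register record. costPerUse mirrors the idea of the
// CostPerUse field (for example, 1 for a register that needs a REX prefix).
struct PhysReg {
  std::string name;
  unsigned costPerUse;
};

// For a busy live range, take the cheapest available register; otherwise
// keep plain allocation order. Assumes `available` is non-empty and is
// listed in allocation order.
const PhysReg &pickRegister(const std::vector<PhysReg> &available,
                            bool isBusy) {
  if (!isBusy)
    return available.front();
  return *std::min_element(
      available.begin(), available.end(),
      [](const PhysReg &a, const PhysReg &b) {
        return a.costPerUse < b.costPerUse;
      });
}
```

std::min_element returns the first of several equally cheap registers, so allocation order still breaks ties.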
* Excise unintended hunk in 129858. <rdar://problem/7662569> | Stuart Hastings | 2011-04-20 | 1 | -5/+0
  llvm-svn: 129862
* ARM byval support. Will be enabled by another patch to the FE. <rdar://problem/7662569> | Stuart Hastings | 2011-04-20 | 3 | -80/+173
  llvm-svn: 129858
* PTX: Add intrinsics to list of built-in intrinsics, which allows them to be | Justin Holewinski | 2011-04-20 | 9 | -24/+60
  used by Clang. To help Clang integration, the PTX target has been split
  into two targets: ptx32 and ptx64, depending on the desired pointer size.

  - Add GCCBuiltin class to all intrinsics
  - Split PTX target into ptx32 and ptx64
  llvm-svn: 129851
* ptx: add integer div and rem instruction | Che-Liang Chiou | 2011-04-20 | 1 | -0/+2
  Patched by Dan Bailey
  llvm-svn: 129848
* ptx: add floating-point comparison to setp | Che-Liang Chiou | 2011-04-20 | 1 | -14/+234
  Patched by Dan Bailey
  llvm-svn: 129847
* ptx: fix parameter ordering | Che-Liang Chiou | 2011-04-20 | 1 | -1/+0
  Patched by Dan Bailey
  llvm-svn: 129846
* This should always be signed chars, so use int8_t. This fixes a miscompile when | Nick Lewycky | 2011-04-20 | 1 | -3/+3
  llvm is built with unsigned chars where an immediate such as 0xff would
  be zero extended to 64-bits, turning "cmp $0xff,%eax" into
  "cmp $0xffffffffffffffff,%eax".
  llvm-svn: 129845
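The underlying hazard in r129845 is that the signedness of plain `char` is implementation-defined, so widening an 8-bit immediate through it gives different results on different hosts. The sketch below only illustrates that difference; the function names are invented and this is not the code touched by the patch.

```cpp
#include <cstdint>

// Widening through int8_t always sign-extends, regardless of how the
// host compiler treats plain `char`.
int64_t widenViaInt8(uint8_t raw) {
  return static_cast<int64_t>(static_cast<int8_t>(raw));
}

// Widening through unsigned char zero-extends, which is how plain `char`
// behaves when the compiler is built with unsigned chars.
int64_t widenViaUnsignedChar(uint8_t raw) {
  return static_cast<int64_t>(static_cast<unsigned char>(raw));
}
```

For the byte 0xff the first helper yields -1 and the second yields 255, which is why a fixed-signedness type such as int8_t is the safer choice for immediates.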
* Remove unused arguments. | Rafael Espindola | 2011-04-20 | 1 | -3/+2
  llvm-svn: 129844
* ADT/Triple: Rename isOSX... methods to isMacOSX for consistency with the OS | Daniel Dunbar | 2011-04-20 | 6 | -14/+15
  triple component.
  llvm-svn: 129838
* Fix typo in the comment. | Johnny Chen | 2011-04-19 | 1 | -1/+1
  llvm-svn: 129837
* ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows() | Daniel Dunbar | 2011-04-19 | 10 | -99/+76
  predicates.
  llvm-svn: 129816
* Target/X86: Eliminate uses of getDarwinVers(). | Daniel Dunbar | 2011-04-19 | 4 | -11/+7
  llvm-svn: 129813
* Target/X86: Add getTargetTriple() accessor. | Daniel Dunbar | 2011-04-19 | 1 | -0/+2
  llvm-svn: 129812
* Target/PPC: Kill off DarwinVers, which is now dead. | Daniel Dunbar | 2011-04-19 | 2 | -24/+1
  llvm-svn: 129811
* Target/PPC: Eliminate a use of getDarwinVers(). | Daniel Dunbar | 2011-04-19 | 1 | -2/+4
  llvm-svn: 129810
* Target/PPC: Add a TargetTriple field. | Daniel Dunbar | 2011-04-19 | 2 | -1/+9
  llvm-svn: 129809
* Target: Eliminate a use of getDarwinMajorNumber(). | Daniel Dunbar | 2011-04-19 | 1 | -1/+8
  llvm-svn: 129803
* Remove some duplicate op action entries and reorganize. | Eric Christopher | 2011-04-19 | 1 | -8/+5
  llvm-svn: 129781
* This patch combines several changes from Evan Cheng for rdar://8659675. | Bob Wilson | 2011-04-19 | 5 | -12/+27
  Making use of VFP / NEON floating point multiply-accumulate / subtraction
  is difficult on current ARM implementations for a few reasons.

  1. Even though a single vmla has latency that is one cycle shorter than a
     pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8)
     cycles can cause additional pipeline stall. So it's frequently better
     to simply codegen vmul + vadd.
  2. A vmla followed by a vmul, vmadd, or vsub causes the second fp
     instruction to stall for 4 cycles. We need to schedule them apart.
  3. A vmla followed by a vmla is a special case. Obviously, issuing
     back-to-back RAW vmla + vmla is very bad. But this isn't ideal either:
       vmul
       vadd
       vmla
     Instead, we want to expand the second vmla:
       vmla
       vmul
       vadd
     Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
     faster.

  Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well
  enough but it isn't the optimal solution. This patch attempts to make it
  possible to use vmla / vmls in cases where it is profitable.

  A. Add missing isel predicates which cause vmla to be codegen'ed.
  B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
     compute a fmul and a fmla.
  C. Add additional isel checks for vmla, avoid cases where vmla is feeding
     into fp instructions (except for the #3 exceptional case).
  D. Add ARM hazard recognizer to model the vmla / vmls hazards.
  E. Add a special pre-regalloc case to expand vmla / vmls when it's likely
     the vmla / vmls will trigger one of the special hazards.

  Enable these fp vmlx codegen changes for Cortex-A9.
  llvm-svn: 129775
* Add -mcpu=cortex-a9-mp. It's cortex-a9 with MP extension. rdar://8648637. | Bob Wilson | 2011-04-19 | 1 | -0/+2
  llvm-svn: 129774
* Avoid some 's' 16-bit instructions which partially update CPSR | Bob Wilson | 2011-04-19 | 4 | -86/+182
  (and add a false dependency) when they aren't dependent on the last
  CPSR-defining instruction. rdar://8928208
  llvm-svn: 129773
* Avoid write-after-write issue hazards for Cortex-A9. | Bob Wilson | 2011-04-19 | 2 | -0/+25
  Add an avoidWriteAfterWrite() target hook to identify register classes
  that suffer from write-after-write hazards. For those register classes,
  try to avoid writing the same register in two consecutive instructions.

  This is currently disabled by default. We should not spill to avoid
  hazards! The command line flag -avoid-waw-hazard can be used to enable
  waw avoidance.
  llvm-svn: 129772
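The policy described in r129772 (prefer a different destination register, but never give up an assignment just to dodge the hazard) can be sketched as below. `pickAvoidingWAW` and the integer register IDs are invented for illustration. How the real allocator consults avoidWriteAfterWrite() is not shown in this log.

```cpp
#include <vector>

// Hypothetical sketch: prefer a candidate that differs from the register
// written by the previous instruction. If every candidate collides, fall
// back to the first one instead of failing, since the hazard is not worth
// a spill. Assumes `candidates` is non-empty.
unsigned pickAvoidingWAW(const std::vector<unsigned> &candidates,
                         unsigned lastWritten) {
  for (unsigned reg : candidates)
    if (reg != lastWritten)
      return reg;
  return candidates.front();
}
```

The fallback branch is the part that corresponds to "we should not spill to avoid hazards".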
* Some single-precision VFP instructions can execute in either the VFP or Neon | Bob Wilson | 2011-04-19 | 1 | -0/+24
  pipelines, at least on Cortex-A9.
  llvm-svn: 129771
* Improvements for the Cortex-A9 scheduling itineraries. | Bob Wilson | 2011-04-19 | 1 | -12/+16
  llvm-svn: 129770
* Add support for FastISel'ing varargs calls. | Eli Friedman | 2011-04-19 | 1 | -4/+21
  llvm-svn: 129765
* Implement support for x86 fastisel of small fixed-sized memcpys, which are generated en masse for C++ PODs. | Chris Lattner | 2011-04-19 | 1 | -5/+50
  On my c++ test file, this cuts the fast isel rejects by 10x and shrinks
  the generated .s file by 5%
  llvm-svn: 129755
* tidy up | Chris Lattner | 2011-04-19 | 1 | -3/+5
  llvm-svn: 129753
* Implement support for fast isel of calls of i1 arguments, even though they are illegal, when they are a truncate from something else. | Chris Lattner | 2011-04-19 | 1 | -10/+23
  This eliminates fully half of all the fastisel rejections on a test c++
  file I'm working with, which should make a substantial improvement for
  -O0 compile of c++ code.

  This fixed rdar://9297003 - fast isel bails out on all functions taking
  bools
  llvm-svn: 129752
* Handle i1/i8/i16 constant integer arguments to calls by prepromoting them. | Chris Lattner | 2011-04-19 | 1 | -9/+22
  Before we would bail out on i1 arguments altogether, now we just bail on
  non-constant ones. Also, we used to emit extraneous code. e.g. test12 was:

    movb $0, %al
    movzbl %al, %edi
    callq _test12

  and test13 was:

    movb $0, %al
    xorl %edi, %edi
    movb %al, 7(%rsp)
    callq _test13f

  Now we get:

    movl $0, %edi
    callq _test12

  and:

    movl $0, %edi
    callq _test13f
  llvm-svn: 129751
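"Prepromoting" in r129751 means widening a narrow constant at compile time so that a single 32-bit move can materialize it. A rough sketch under stated assumptions: `prepromoteConstant` is an invented name, the 32-bit target width matches the `movl` in the example output, and the choice between sign and zero extension is passed in as a flag because the log does not say how the patch decides it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: widen a `bits`-wide constant (i1, i8 or i16) to
// 32 bits at compile time, using sign or zero extension as requested.
uint32_t prepromoteConstant(uint64_t value, unsigned bits, bool isSigned) {
  assert(bits >= 1 && bits < 32 && "sketch only handles narrow types");
  const uint32_t mask = (uint32_t{1} << bits) - 1;
  uint32_t widened = static_cast<uint32_t>(value) & mask;
  // For sign extension, copy the top bit of the narrow value upward.
  if (isSigned && (((widened >> (bits - 1)) & 1u) != 0))
    widened |= ~mask;
  return widened;
}
```

With the constant folded this way, the extension instruction seen in the old output (`movzbl`) has nothing left to do.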
* be layout aware, to produce: | Chris Lattner | 2011-04-19 | 1 | -1/+8
    testb $1, %al
    je LBB0_2
  ## BB#1: ## %if.then
    movb $0, %al

  instead of:

    testb $1, %al
    jne LBB0_1
    jmp LBB0_2
  LBB0_1: ## %if.then
    movb $0, %al

  how 'bout that.
  llvm-svn: 129749
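The transformation shown in r129749 amounts to checking which successor is next in block layout and inverting the condition so that block is reached by fall-through. A small sketch, assuming string block labels and a fixed `jne`/`je` pair as in the example above; `emitCondBranch` is a hypothetical name and not the fast-isel API.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: emit the branch instructions for "if nonzero goto
// trueBB else goto falseBB", given the block that follows in layout.
std::vector<std::string> emitCondBranch(const std::string &trueBB,
                                        const std::string &falseBB,
                                        const std::string &nextBB) {
  if (trueBB == nextBB)
    return {"je " + falseBB};   // invert; the true block falls through
  if (falseBB == nextBB)
    return {"jne " + trueBB};   // the false block falls through
  return {"jne " + trueBB, "jmp " + falseBB};  // neither block is next
}
```

When the true block is next in layout, the result is the single `je LBB0_2` from the improved output instead of the `jne` + `jmp` pair.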
* fix rdar://9297006 - fast isel bails out on trunc to i1 -> bools cry, | Chris Lattner | 2011-04-19 | 1 | -6/+29
  a common cause of fast isel rejects on c++ code.
  llvm-svn: 129748
* Change A9 scheduling itineraries VLD* / VST* entries to default to "aligned". That | Evan Cheng | 2011-04-19 | 2 | -172/+373
  is, it assumes addresses are 64-bit aligned (which should be the more
  common case). If the address is found not to be aligned, then
  getOperandLatency() would adjust the operand latency computation by one
  to compensate for it. rdar://9294833
  llvm-svn: 129742
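The adjustment described in r129742 can be pictured as a one-cycle correction applied on top of the "aligned" itinerary value. The sketch below assumes the correction is an added cycle and that "64-bit aligned" means an alignment of at least 8 bytes. `adjustOperandLatency` is an invented name and is not the signature of getOperandLatency().

```cpp
// Hypothetical sketch: the itinerary latency assumes a 64-bit (8-byte)
// aligned address. If the known alignment is smaller, add one cycle.
int adjustOperandLatency(int alignedLatency, unsigned alignmentBytes) {
  const unsigned kAssumedAlignment = 8;  // 64 bits
  return alignmentBytes >= kAssumedAlignment ? alignedLatency
                                             : alignedLatency + 1;
}
```

Defaulting to the aligned case keeps the common path free of any adjustment.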
* Do not lose mem_operands while lowering VLD / VST intrinsics. | Evan Cheng | 2011-04-19 | 2 | -4/+37
  llvm-svn: 129738
* Trim a few unneeded includes. | Jim Grosbach | 2011-04-18 | 3 | -31/+0
  llvm-svn: 129723
* Invert the meaning of printAliasInstr's return value. It now returns | Eric Christopher | 2011-04-18 | 2 | -1/+4
  true on success and false on failure. Update callers.
  llvm-svn: 129722
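After r129722 a caller reads naturally as "try the alias, otherwise print the generic form". The sketch below uses simplified stand-ins: the string-based signatures, the `orr_same_reg` mnemonic and the `printInst` wrapper are all hypothetical, and only the true-on-success convention comes from the commit message.

```cpp
#include <string>

// Hypothetical stand-in: returns true on success (an alias was printed)
// and false on failure (no alias applies), matching the new convention.
bool printAliasInstr(const std::string &mnemonic, std::string &out) {
  if (mnemonic == "orr_same_reg") {  // invented alias rule for the sketch
    out += "mov";
    return true;
  }
  return false;
}

// Caller pattern: fall back to the generic printer only on failure.
void printInst(const std::string &mnemonic, std::string &out) {
  if (!printAliasInstr(mnemonic, out))
    out += mnemonic;
}
```

Returning true on success matches the usual reading of a boolean "did it work" result, which is presumably why the callers were updated.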
* Small fix to the ARM AsmParser to ensure that a | Sean Callanan | 2011-04-18 | 1 | -0/+1
  superclass variable is instantiated properly.
  llvm-svn: 129713
* Add a new bit that ImmLeaf's can opt into, which allows them to duck out of | Chris Lattner | 2011-04-18 | 1 | -3/+6
  the generated FastISel. X86 doesn't need to generate code to match
  ADD16ri8 since ADD16ri will do just fine. This is a small codesize win in
  the generated instruction selector.
  llvm-svn: 129692
* switch the rest of the x86 immediate patterns over to ImmLeaf, | Chris Lattner | 2011-04-17 | 1 | -17/+9
  simplifying them and exposing more information to tblgen. It would be
  nice if other target authors adopted this as well, particularly arm since
  it has fastisel.
  llvm-svn: 129676
* now that predicates have a decent abstraction layer on them, introduce a new | Chris Lattner | 2011-04-17 | 1 | -1/+6
  kind of predicate: one that is specific to imm nodes. The predicate
  function specified here just checks an int64_t directly instead of
  messing around with SDNode's. The virtue of this is that it means that
  fastisel and other things can reason about these predicates.
  llvm-svn: 129675
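The point of r129675 is that an immediate predicate becomes a plain function of an int64_t, which any client can evaluate without building an SDNode. A minimal sketch of such a predicate follows. `isImmSExt8` is an invented name, and the "fits in a signed 8-bit immediate" condition is an illustrative choice rather than something quoted from the patch.

```cpp
#include <cstdint>

// Hypothetical immediate predicate: true when the value is representable
// as a sign-extended 8-bit immediate. It inspects only the int64_t.
bool isImmSExt8(int64_t imm) {
  return imm >= INT8_MIN && imm <= INT8_MAX;
}
```

Because the check depends on nothing but the integer, both the DAG matcher and fast isel could share it, which is the benefit the commit message describes.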
* Rework our internal representation of node predicates to expose more | Chris Lattner | 2011-04-17 | 1 | -1/+1
  structure and fix some fixmes. We now have a TreePredicateFn class that
  handles all of the decoding of these things. This is an internal cleanup
  that has no impact on the code generated by tblgen.
  llvm-svn: 129670