path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* Relax the requirement that the exception object must be an instruction. | Bill Wendling | 2012-05-17 | 1 | -6/+6
    During bugpoint-ing, it may turn into something else.
    llvm-svn: 156998
* SelectionDAGBuilder: CaseBlock, CaseRanges and CaseCmp changed representation of Low and High from signed to unsigned. | Stepan Dyatkovskiy | 2012-05-17 | 2 | -10/+11
    Unsigned ints are usually simpler and faster, and they remove some extra sign-bit checks needed before <, >, <=, >= comparisons.
    llvm-svn: 156985
* Set sub-register <undef> flags more accurately. | Jakob Stoklund Olesen | 2012-05-16 | 1 | -7/+11
    When widening an existing <def,reads-undef> operand to a super-register, it may be necessary to clear the <undef> flag because the wider register is now read-modify-write through the instruction.
    Conversely, it may be necessary to add an <undef> flag when the coalescer turns a full-register def into a sub-register def, but the larger register wasn't live before the instruction.
    This happens in test/CodeGen/ARM/coalesce-subregs.ll, but the test is too small for the <undef> flags to affect the generated code.
    llvm-svn: 156951
* Fix a thinko in DisintegrateMERGE_VALUES. Patch by Xiaoyi Guo. | Duncan Sands | 2012-05-16 | 2 | -2/+2
    llvm-svn: 156909
* Enable sub-sub-register copy coalescing. | Jakob Stoklund Olesen | 2012-05-15 | 1 | -9/+0
    It is now possible to coalesce weird skewed sub-register copies by picking a super-register class larger than both original registers. The included test case produces code like this:
        vld2.32 {d16, d17, d18, d19}, [r0]!
        vst2.32 {d18, d19, d20, d21}, [r0]
    We still perform interference checking as if it were a normal full copy join, so this is still quite conservative. In particular, the f1 and f2 functions in the included test case still have remaining copies because of false interference.
    llvm-svn: 156878
* Teach RegisterCoalescer to handle symmetric sub-register copies. | Jakob Stoklund Olesen | 2012-05-15 | 1 | -14/+28
    It is possible to coalesce two overlapping registers to a common super-register that is larger than both of the original registers. The important difference is that it may be necessary to rewrite DstReg operands as well as SrcReg operands because the sub-register index has changed.
    This behavior is still disabled by CoalescerPair.
    llvm-svn: 156869
* Handle NewReg==OldReg in renameRegister(). | Jakob Stoklund Olesen | 2012-05-15 | 1 | -1/+2
    This can happen when widening a virtual register to a super-register class.
    llvm-svn: 156867
* We never call adjustCopiesBackFrom() for partial copies. | Jakob Stoklund Olesen | 2012-05-15 | 1 | -1/+3
    There is no need to look at an always null SrcIdx.
    llvm-svn: 156866
* Extend the CoalescerPair interface to handle symmetric sub-register copies. | Jakob Stoklund Olesen | 2012-05-15 | 2 | -44/+51
    Now both SrcReg and DstReg can be sub-registers of the final coalesced register.
    CoalescerPair::setRegisters still rejects such copies because RegisterCoalescer doesn't yet handle them.
    llvm-svn: 156848
* Add -enable-aa-sched-mi, off by default, for AliasAnalysis inside MachineScheduler. | Andrew Trick | 2012-05-15 | 1 | -22/+243
    This feature avoids creating edges in the scheduler's dependence graph for non-aliasing memory operations according to whichever alias analysis is available. It has been fully tested in Hexagon.
    Before making this default, it needs to be extended to handle multiple MachineMemOperands, compile time needs more evaluation, and benchmarking on X86 and ARM is needed.
    Patch by Sergei Larin!
    llvm-svn: 156842
* Allow MCCodeEmitter access to the target MCRegisterInfo. | Jim Grosbach | 2012-05-15 | 1 | -5/+8
    Add the MCRegisterInfo to the factories and constructors.
    Patch by Tom Stellard <Tom.Stellard@amd.com>.
    llvm-svn: 156828
* Rejected r156804 due to buildbot failures. | Stepan Dyatkovskiy | 2012-05-15 | 1 | -35/+46
    llvm-svn: 156808
* SelectionDAGBuilder::Clusterify: main functionality was replaced with CRSBuilder::optimize, so a big part of Clusterify's code was removed. | Stepan Dyatkovskiy | 2012-05-15 | 1 | -46/+35
    llvm-svn: 156804
* Don't access MO reference after invalidating operand list. | Jakob Stoklund Olesen | 2012-05-14 | 1 | -2/+3
    This should unbreak llvm-x86_64-linux.
    llvm-svn: 156778
* Fix PR12821. | Jakob Stoklund Olesen | 2012-05-14 | 1 | -0/+6
    RAFast must add an <imp-def> operand when it is rewriting a sub-register def that isn't a read-modify-write.
    llvm-svn: 156777
* Rename @llvm.debugger to @llvm.debugtrap. | Dan Gohman | 2012-05-14 | 2 | -3/+3
    llvm-svn: 156774
* Don't look for empty live ranges in the unions. | Jakob Stoklund Olesen | 2012-05-12 | 1 | -1/+4
    Empty live ranges represent undef and still get allocated, but they won't appear in LiveIntervalUnions.
    Patch by Patrik Hägglund!
    llvm-svn: 156685
* Revert 156658. | Chad Rosier | 2012-05-11 | 1 | -2/+1
    llvm-svn: 156662
* [fast-isel] Fast-isel doesn't use the expect intrinsic. | Chad Rosier | 2012-05-11 | 1 | -1/+2
    llvm-svn: 156658
* ARM: peephole optimization to remove cmp instruction | Manman Ren | 2012-05-11 | 1 | -0/+9
    This patch will optimize the following cases:
        sub r1, r3 | sub r1, imm
        cmp r3, r1 or cmp r1, r3 | cmp r1, imm
        bge L1
    TO
        subs r1, r3
        bge L1 or ble L1
    If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction.
    rdar: 10734411
    llvm-svn: 156599
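Why the branch condition sometimes flips from bge to ble can be sketched with a small flag model. This is an illustrative C++ model of the N, Z and V flags of a 32-bit subtraction, not code from the patch; the names Flags, subFlags, condGE and condLE are hypothetical. It assumes the sub and the cmp operate on the same two values, so a cmp with swapped operands tests the mirrored condition on the flags that subs already produced.

```cpp
#include <cstdint>

// Condition flags set by a 32-bit flag-setting subtraction (as "subs" or "cmp" would).
struct Flags {
  bool N; // result is negative
  bool Z; // result is zero
  bool V; // signed overflow occurred
};

// Models the N, Z and V flags of "Lhs - Rhs" with two's-complement wraparound.
Flags subFlags(int32_t Lhs, int32_t Rhs) {
  uint32_t A = static_cast<uint32_t>(Lhs);
  uint32_t B = static_cast<uint32_t>(Rhs);
  uint32_t R = A - B; // unsigned arithmetic wraps, matching the hardware result
  bool N = (R >> 31) != 0;
  bool Z = R == 0;
  // Overflow: the operands differ in sign and the result's sign differs from Lhs.
  bool V = (((A ^ B) & (A ^ R)) >> 31) != 0;
  return {N, Z, V};
}

// "ge" condition: signed greater-than-or-equal.
bool condGE(Flags F) { return F.N == F.V; }

// "le" condition: signed less-than-or-equal.
bool condLE(Flags F) { return F.Z || F.N != F.V; }
```

For any a and b, condGE(subFlags(b, a)) equals condLE(subFlags(a, b)), which is the bge-to-ble rewrite for the swapped-operand case; when the operands are in the same order, the original bge can be kept.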
* Define a new intrinsic, @llvm.debugger. | Dan Gohman | 2012-05-11 | 2 | -0/+5
    It will be similar to __builtin_trap(), but it generates int3 on x86 instead of ud2.
    llvm-svn: 156593
* misched: Print machineinstrs with -debug-only=misched | Andrew Trick | 2012-05-10 | 1 | -0/+2
    llvm-svn: 156576
* misched: tracing register pressure heuristics. | Andrew Trick | 2012-05-10 | 1 | -6/+22
    llvm-svn: 156575
* misched: Add register pressure backoff to ConvergingScheduler. | Andrew Trick | 2012-05-10 | 1 | -38/+144
    Prioritize the instruction that comes closest to keeping pressure under the target's limit. Then prioritize instructions that avoid increasing the max pressure in the scheduled region.
    The max pressure heuristic is a tad aggressive. Later I'll fix it to consider the unscheduled pressure as well.
    WIP: This is mostly functional but untested and not likely to do much good yet.
    llvm-svn: 156574
* misched: Release only unscheduled nodes into ReadyQ. | Andrew Trick | 2012-05-10 | 1 | -2/+8
    llvm-svn: 156573
* misched: Added ReadyQ container wrapper for Top and Bottom Queues. | Andrew Trick | 2012-05-10 | 1 | -11/+44
    llvm-svn: 156572
* misched: Introducing Top and Bottom register pressure trackers during scheduling. | Andrew Trick | 2012-05-10 | 3 | -39/+112
    llvm-svn: 156571
* RegPressure: API for speculatively checking instruction pressure. | Andrew Trick | 2012-05-10 | 2 | -1/+229
    Added getMaxExcessUpward/DownwardPressure. They somewhat abuse the tracker by speculatively handling an instruction out of order. But it is convenient for now. In the future, we will cache each instruction's pressure contribution to make this efficient.
    llvm-svn: 156561
* RegPressure: fix array index iteration style. | Andrew Trick | 2012-05-10 | 1 | -8/+8
    llvm-svn: 156560
* Revert: 156550 "ARM: peephole optimization to remove cmp instruction" | Manman Ren | 2012-05-10 | 1 | -9/+0
    This commit broke an external linux bot and gave a compile-time warning.
    llvm-svn: 156556
* ARM: peephole optimization to remove cmp instruction | Manman Ren | 2012-05-10 | 1 | -0/+9
    This patch will optimize the following cases:
        sub r1, r3 | sub r1, imm
        cmp r3, r1 or cmp r1, r3 | cmp r1, imm
        bge L1
    TO
        subs r1, r3
        bge L1 or ble L1
    If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction.
    rdar: 10734411
    llvm-svn: 156550
* Fix thinko in conditional. | Eric Christopher | 2012-05-08 | 1 | -1/+1
    Part of rdar://11352000 and should bring the buildbots back.
    llvm-svn: 156421
* DAGCombiner should not change the type of an extract_vector index. | Jim Grosbach | 2012-05-08 | 1 | -3/+4
    When a combine twiddles an extract_vector, care should be taken to preserve the type of the index operand. No luck extracting a reasonable testcase, unfortunately.
    rdar://11391009
    llvm-svn: 156419
* Formatting fixes. | Akira Hatanaka | 2012-05-08 | 1 | -4/+4
    Patch by Jack Carter.
    llvm-svn: 156409
* Handle OpDeref in case it comes in as a register operand. | Eric Christopher | 2012-05-08 | 2 | -4/+7
    Part of rdar://11352000
    llvm-svn: 156405
* Extract methods for joining physregs. | Jakob Stoklund Olesen | 2012-05-08 | 1 | -77/+103
    No functional change.
    llvm-svn: 156345
* Naming convention and whitespace. No functional change. | Jakob Stoklund Olesen | 2012-05-07 | 1 | -68/+67
    llvm-svn: 156342
* Coalesce subreg-subreg copies. | Jakob Stoklund Olesen | 2012-05-07 | 1 | -14/+25
    At least some of them:
        %vreg1:sub_16bit = COPY %vreg2:sub_16bit; GR64:%vreg1, GR32: %vreg2
    Previously, we couldn't figure out that the above copy could be eliminated by coalescing %vreg2 with %vreg1:sub_32bit.
    The new getCommonSuperRegClass() hook makes it possible.
    This is not very useful yet since the unmodified part of the destination register usually interferes with the source register. The coalescer needs to understand sub-register interference checking first.
    llvm-svn: 156334
* Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass(). | Jakob Stoklund Olesen | 2012-05-07 | 10 | -22/+29
    The getPointerRegClass() hook can return register classes that depend on the calling convention of the current function (ptr_rc_tailcall).
    So far, we have been able to infer the calling convention from the subtarget alone, but as we add support for multiple calling conventions per target, that no longer works.
    Patch by Yiannis Tsiouris!
    llvm-svn: 156328
* Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. | Owen Anderson | 2012-05-07 | 1 | -0/+4
    llvm-svn: 156324
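The restriction to unsafe FP math follows from IEEE-754 arithmetic: x - x is NaN, not 0.0, when x is infinite or NaN. A minimal standalone C++ sketch of that reasoning follows; strictSubSelf and tryFoldSubSelf are hypothetical names for illustration and are not LLVM APIs, and the sketch assumes it is compiled without fast-math style flags.

```cpp
#include <cmath>
#include <limits>

// Evaluates x - x under strict IEEE-754 rules.
double strictSubSelf(double X) {
  volatile double V = X; // volatile keeps the compiler from simplifying V - V itself
  return V - V;
}

// Hypothetical stand-in for the combine: folds x - x to 0.0 only when the
// caller has opted in to unsafe FP math. Returns true and sets Result on success.
bool tryFoldSubSelf(bool UnsafeFPMath, double &Result) {
  if (!UnsafeFPMath)
    return false; // strict mode: the fold would be wrong for infinity and NaN
  Result = 0.0;
  return true;
}
```

With a finite input such as 1.5, strictSubSelf returns 0.0 and the fold would be harmless; with infinity or NaN it returns NaN, which is the case the unsafe-math guard exists for.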
* Add a new target hook "predictableSelectIsExpensive". | Benjamin Kramer | 2012-05-05 | 1 | -0/+1
    This will be used to determine whether it's profitable to turn a select into a branch when the branch is likely to be predicted.
    Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
    I'm not entirely happy with the name of this flag, suggestions welcome ;)
    llvm-svn: 156233
* Make sure findRepresentativeClass picks the widest super-register. | Jakob Stoklund Olesen | 2012-05-04 | 1 | -6/+10
    We want the representative register class to contain the largest super-registers available. This makes the function less sensitive to the register class numbering.
    llvm-svn: 156220
* Remove extra comma in debug output. | Jakob Stoklund Olesen | 2012-05-04 | 1 | -1/+1
    llvm-svn: 156219
* Use SuperRegClassIterator for findRepresentativeClass(). | Jakob Stoklund Olesen | 2012-05-04 | 1 | -26/+15
    The masks returned by SuperRegClassIterator are computed automatically by TableGen. This is better than depending on the manually specified SuperRegClasses.
    llvm-svn: 156147
* Fix two-address pass's aggressive instruction commuting heuristics. | Evan Cheng | 2012-05-03 | 1 | -15/+16
    It's meant to catch cases like:
        %reg1024<def> = MOV r1
        %reg1025<def> = MOV r0
        %reg1026<def> = ADD %reg1024, %reg1025
        r0 = MOV %reg1026
    By commuting the ADD, it lets the coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in:
        %reg1024<def> = MOV r0
        %reg1025<def> = MOV 0
        %reg1026<def> = ADD %reg1024, %reg1025
        r0 = MOV %reg1026
    That had no benefit; rather, it ensured the last MOV would not be coalesced.
    rdar://11355268
    llvm-svn: 156048
* Added TargetRegisterInfo::getAllocatableClass. | Andrew Trick | 2012-05-03 | 2 | -7/+12
    This ensures that virtual registers always belong to an allocatable class.
    If your target attempts to create a vreg for an operand that has no allocatable register subclass, you will crash quickly.
    This ensures that targets define register classes as intended.
    llvm-svn: 156046
* Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. | Owen Anderson | 2012-05-02 | 1 | -0/+18
    llvm-svn: 156029
* Teach DAG combine that multiplication by 1.0 can always be constant folded. | Owen Anderson | 2012-05-02 | 1 | -0/+3
    llvm-svn: 156023
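Multiplying by 1.0 can be folded without an unsafe-math guard because IEEE-754 multiplication by 1.0 returns its other operand unchanged for every non-NaN value, including signed zeros, infinities and subnormals, and a NaN input stays NaN. A small standalone C++ check of that property follows; bitsOf and mulByOneIsIdentity are hypothetical helper names, not LLVM APIs, and the check assumes the default floating-point environment (no flush-to-zero).

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Returns the raw IEEE-754 bit pattern of a double.
uint64_t bitsOf(double X) {
  uint64_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits);
  return Bits;
}

// True when X * 1.0 is bit-identical to X.
bool mulByOneIsIdentity(double X) {
  volatile double One = 1.0; // volatile forces a real multiply at run time
  return bitsOf(X * One) == bitsOf(X);
}
```

The bit-level comparison matters for -0.0, which compares equal to 0.0 under == but has a different sign bit; the multiply preserves it.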
* Tidy up. Naming conventions. | Jim Grosbach | 2012-05-01 | 1 | -16/+16
    llvm-svn: 155960
* Use dyn_cast instead of checking opcode and cast. | Jakub Staszak | 2012-05-01 | 1 | -2/+1
    llvm-svn: 155957