path: root/llvm/lib/CodeGen/SelectionDAG
Commit message | Author | Age | Files | Lines
* Generalize ExtendUsesToFormExtLoad to be usable for ANY_EXTEND, | Dan Gohman | 2009-04-09 | 2 | -45/+78
  in addition to ZERO_EXTEND and SIGN_EXTEND. Fix a bug in the way it checked for live-out values, and simplify the way it finds users by using SDNode::use_iterator's (relatively) new features. Also, make it slightly more permissive on targets with free truncates.
  In SelectionDAGBuild, avoid creating ANY_EXTEND nodes that are larger than necessary. If the target's SwitchAmountTy has enough bits, use it. This exposes the truncate to optimization early, enabling more optimizations.
  llvm-svn: 68670
* Don't copy the operand of a SwitchInst into virtual registers as | Dan Gohman | 2009-04-09 | 1 | -3/+13
  eagerly. This helps avoid CopyToReg nodes in some cases where they aren't needed, and also helps subsequent optimizer heuristics in cases where the extra nodes would cause the node to appear to have multiple results.
  This doesn't have a significant impact currently; it'll help an upcoming change.
  llvm-svn: 68667
* Soft float support for FREM. | Duncan Sands | 2009-04-08 | 2 | -0/+14
  llvm-svn: 68614
* Soft float support for undef. Reported by Xerxes Rånby. | Duncan Sands | 2009-04-08 | 2 | -0/+6
  llvm-svn: 68607
* Implement support for modeling implicit-zero-extension on x86-64 | Dan Gohman | 2009-04-08 | 2 | -9/+87
  with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG instructions), and teach the DAGCombiner to take advantage of this on targets which support it. This eliminates many redundant zero-extension operations on x86-64.
  This adds a new TargetLowering hook, isZExtFree. It's similar to isTruncateFree, except it only applies to actual definitions, and not no-op truncates which may not zero the high bits.
  Also, this adds a new optimization to SimplifyDemandedBits: transform operations like x+y into (zext (add (trunc x), (trunc y))) on targets where all the casts are no-ops. In contexts where the high part of the add is explicitly masked off, this allows the mask operation to be eliminated. Fix the DAGCombiner to avoid undoing these transformations to eliminate casts on targets where the casts are no-ops.
  Also, this adds a new two-address lowering heuristic. Since two-address lowering runs before coalescing, it helps to be able to look through copies when deciding whether commuting and/or three-address conversion are profitable.
  Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle the case that a clobber range extended both before and beyond an existing live range. In that case, multiple live ranges need to be added. This was exposed by the new subreg coalescing code.
  Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the spiller behavior it was looking for no longer occurs with the new instruction selection.
  llvm-svn: 68576
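The SimplifyDemandedBits narrowing described in the entry above can be illustrated outside of LLVM. This is a minimal sketch, assuming 64-bit values masked to their low 32 bits; the function names are hypothetical and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: a 64-bit add whose high part is explicitly masked off.
inline std::uint64_t addThenMask(std::uint64_t x, std::uint64_t y) {
  return (x + y) & 0xFFFFFFFFull;
}

// Narrow form: (zext (add (trunc x), (trunc y))).
// The 32-bit add wraps modulo 2^32, so no separate mask is needed.
inline std::uint64_t narrowAddThenZext(std::uint64_t x, std::uint64_t y) {
  std::uint32_t lo = static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(y);
  return static_cast<std::uint64_t>(lo);
}

// Hypothetical helper that checks both forms agree on one pair of inputs.
inline void checkNarrowedAdd(std::uint64_t x, std::uint64_t y) {
  assert(addThenMask(x, y) == narrowAddThenZext(x, y));
}
```

Both forms agree because the low 32 bits of a sum depend only on the low 32 bits of the operands, which is why the mask can be dropped when the casts cost nothing.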
* Revert prev. patch for now. | Devang Patel | 2009-04-07 | 1 | -4/+6
  llvm-svn: 68569
* Right now DBG_LABEL are required for llvm.dbg.region_start and llvm.dbg.region_end in non-fast mode also. | Devang Patel | 2009-04-07 | 1 | -6/+4
  llvm-svn: 68559
* Don't attempt to handle aggregate argument values in FastISel; let | Dan Gohman | 2009-04-07 | 1 | -1/+5
  SelectionDAG do those. This fixes PR3955.
  llvm-svn: 68546
* Fix a TargetLowering optimization so that it doesn't duplicate | Dan Gohman | 2009-04-03 | 1 | -0/+1
  loads when an input node has multiple uses.
  llvm-svn: 68398
* Delete ISD::INSERT_SUBREG and ISD::EXTRACT_SUBREG, which are unused. | Dan Gohman | 2009-04-03 | 2 | -20/+0
  Note that these are distinct from TargetInstrInfo::INSERT_SUBREG and TargetInstrInfo::EXTRACT_SUBREG, which are used.
  llvm-svn: 68355
* To convert the StopPoint insn into an assembler directive by ISel, we need to have access to the line number field. | Sanjiv Gupta | 2009-04-02 | 1 | -0/+4
  So we convert that info as an operand by custom handling DBG_STOPPOINT in legalize.
  llvm-svn: 68329
* Fully general expansion of integer shift of any size. | Evan Cheng | 2009-03-31 | 2 | -3/+81
  llvm-svn: 68134
* Minor top-level comment fix. | Dan Gohman | 2009-03-31 | 1 | -1/+1
  llvm-svn: 68113
* Fix live-out reg logic to not insert over-aggressive AssertZExt | Dan Gohman | 2009-03-31 | 1 | -3/+3
  instructions. This fixes lua.
  llvm-svn: 68083
* Fix PR3899: add support for extracting floats from vectors | Duncan Sands | 2009-03-29 | 3 | -0/+22
  when using -soft-float. Based on a patch by Jakob Stoklund Olesen.
  llvm-svn: 67996
* Make check in CheckTailCallReturnConstraints for ignorable instructions between | Arnold Schwaighofer | 2009-03-28 | 1 | -18/+32
  a CALL and a RET node more generic. Add a test for tail calls with a void return.
  llvm-svn: 67943
* Enable tail call optimization for functions that return a struct (bug 3664) and for functions that return types that need extending (e.g. i1). | Arnold Schwaighofer | 2009-03-28 | 1 | -0/+24
  llvm-svn: 67934
* Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). | Evan Cheng | 2009-03-28 | 1 | -8/+8
  e.g. x * 40 =>
        shlq $3, %rdi
        leaq (%rdi,%rdi,4), %rax
  This has the added benefit of allowing more multiply to be folded into addressing mode.
  e.g. a * 24 + b =>
        leaq (%rdi,%rdi,2), %rax
        leaq (%rsi,%rax,8), %rax
  llvm-svn: 67917
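The arithmetic behind the two instruction sequences above can be checked directly. This is a minimal sketch in plain C++ that mirrors each instruction with one expression; the function names are hypothetical and the code is not part of the commit.

```cpp
#include <cassert>
#include <cstdint>

// x * 40 as one shift plus one lea-style (base + index * scale) computation.
inline std::uint64_t mul40(std::uint64_t x) {
  std::uint64_t t = x << 3;  // shlq $3, %rdi           : t = x * 8
  return t + t * 4;          // leaq (%rdi,%rdi,4), %rax : t * 5 = x * 40
}

// a * 24 + b as two lea-style computations.
inline std::uint64_t mul24Add(std::uint64_t a, std::uint64_t b) {
  std::uint64_t t = a + a * 2;  // leaq (%rdi,%rdi,2), %rax : t = a * 3
  return b + t * 8;             // leaq (%rsi,%rax,8), %rax : b + a * 24
}
```

The decompositions work because 40 = 8 * 5 and 24 = 3 * 8, and an x86 lea can compute base + index * scale for scales 1, 2, 4, and 8.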
* Fix what surely must be a copy+pasto. | Dan Gohman | 2009-03-27 | 1 | -2/+2
  llvm-svn: 67881
* Initialize LiveOutInfo's APInt members to zero, as APInt's | Dan Gohman | 2009-03-27 | 1 | -1/+1
  default constructor produces an uninitialized APInt. This fixes PR3896.
  llvm-svn: 67879
* Pull transform from target-dependent code into target-independent code. | Bill Wendling | 2009-03-26 | 1 | -0/+49
  llvm-svn: 67742
* Revert 67132. This is breaking some objective-c apps. | Evan Cheng | 2009-03-25 | 1 | -3/+13
  Also fixes SDISel so it *does not* force promote return value if the function is not marked signext / zeroext.
  llvm-svn: 67701
* When optimizing with debug info, don't keep the | Dale Johannesen | 2009-03-25 | 1 | -4/+5
  stoppoint nodes around until Legalize; doing this imposed an ordering on a sequence of loads that came from different lines, interfering with scheduling.
  llvm-svn: 67692
* more tidying: name the components of PhysReg in the case when | Chris Lattner | 2009-03-24 | 1 | -8/+9
  the target constraint specifies a specific physreg.
  llvm-svn: 67618
* Tidy a bit more. | Chris Lattner | 2009-03-24 | 1 | -3/+3
  llvm-svn: 67617
* simplify this code a bit now that "allocation to a vreg class" can never | Chris Lattner | 2009-03-24 | 1 | -16/+13
  fail.
  llvm-svn: 67616
* Minor compile-time optimization; don't bother checking | Dan Gohman | 2009-03-24 | 1 | -1/+1
  canClobberPhysRegDefs if the successor node doesn't clobber any physical registers.
  llvm-svn: 67587
* Add a pre-pass to the burr-list scheduler which makes adjustments to | Dan Gohman | 2009-03-24 | 1 | -0/+120
  help out the register pressure reduction heuristics in the case of nodes with multiple uses. Currently this uses very conservative heuristics, so it doesn't have a broad impact, but in cases where it does help it can make a big difference.
  llvm-svn: 67586
* When unfolding a load during scheduling, the new operator node has | Dan Gohman | 2009-03-23 | 1 | -3/+6
  a data dependency on the load node, so it really needs a data-dependence edge to the load node, even if the load previously existed.
  And add a few comments.
  llvm-svn: 67554
* Don't set SUnit::hasPhysRegDefs to true unless the defs | Dan Gohman | 2009-03-23 | 1 | -1/+4
  actually have uses, which reflects the way it's used.
  llvm-svn: 67540
* Fix canClobberPhysRegDefs to check all SDNodes grouped together | Dan Gohman | 2009-03-23 | 1 | -14/+19
  in an SUnit, instead of just the first one. This fix is needed by some upcoming scheduler changes.
  llvm-svn: 67531
* Add a new bit to SUnit to record whether a node has implicit physreg | Dan Gohman | 2009-03-23 | 1 | -3/+6
  defs, regardless of whether they are actually used.
  llvm-svn: 67528
* Now that errs() is properly non-buffered, there's no need to | Dan Gohman | 2009-03-23 | 1 | -2/+0
  explicitly flush it.
  llvm-svn: 67526
* Model inline asm constraint which ties an input to an output register as machine operand TIED_TO constraint. | Evan Cheng | 2009-03-23 | 1 | -29/+21
  This eliminated the need to pre-allocate registers for these. This also allows the register allocator to eliminate the unneeded copies.
  llvm-svn: 67512
* Simplify this code; use a while instead of an if and a do-while. | Dan Gohman | 2009-03-20 | 1 | -9/+6
  llvm-svn: 67400
* For an inline asm output operand that matches an input, encode the input operand index in the high bits. | Evan Cheng | 2009-03-20 | 5 | -28/+43
  llvm-svn: 67387
* Fixed the comment. No functionality change. | Sanjiv Gupta | 2009-03-20 | 1 | -4/+4
  llvm-svn: 67370
* Added missing support for widening when splitting an unary op (PR3683) | Mon P Wang | 2009-03-18 | 4 | -4/+91
  and expanding a bit convert (PR3711). In both cases, we extract the valid part of the widen vector and then do the conversion.
  llvm-svn: 67175
* Don't force promotion of return arguments on the callee. | Rafael Espindola | 2009-03-17 | 1 | -9/+0
  Some architectures (like x86) don't require it. This fixes bug 3779.
  llvm-svn: 67132
* Fix codegen to compute the size of an allocation by multiplying the | Chris Lattner | 2009-03-17 | 1 | -8/+17
  size by the array amount as an i32 value instead of promoting from i32 to i64 then doing the multiply. Not doing this broke wrap-around assumptions that the optimizers (validly) made. The ultimate real fix for this is to introduce i64 version of alloca and remove mallocinst.
  This fixes PR3829
  llvm-svn: 67093
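The difference between the two multiplication orders in the entry above shows up only when the 32-bit product overflows. This is a minimal sketch, assuming unsigned values and zero-extension to 64 bits (the entry does not say which extension is used); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Multiply as i32 first, then widen: the product wraps modulo 2^32.
inline std::uint64_t allocSizeMulAsI32(std::uint32_t elemSize, std::uint32_t count) {
  std::uint32_t bytes = elemSize * count;  // 32-bit wrap-around multiply
  return static_cast<std::uint64_t>(bytes);
}

// Widen to i64 first, then multiply: the product does not wrap at 32 bits.
inline std::uint64_t allocSizePromoteFirst(std::uint32_t elemSize, std::uint32_t count) {
  return static_cast<std::uint64_t>(elemSize) * static_cast<std::uint64_t>(count);
}
```

For an element size of 4 and a count of 0x40000001, the first form yields 4 and the second yields 0x100000004. An optimizer that reasons about the size as an i32 expression expects the first result.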
* Fix a problem with DAGCombine where we were building an illegal build | Mon P Wang | 2009-03-17 | 1 | -6/+11
  vector shuffle mask. Forced the mask to be built using i32. Note: this will be irrelevant once vector_shuffle no longer takes a build vector for the shuffle mask.
  llvm-svn: 67076
* Avoid doing the transformation c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4 | Mon P Wang | 2009-03-14 | 1 | -1/+4
  if FPConstant is legal because if the FPConstant doesn't need to be stored in a constant pool, the transformation is unlikely to be profitable.
  llvm-svn: 66994
* Improve FastISel's handling of truncates to i1, and implement | Dan Gohman | 2009-03-13 | 1 | -10/+18
  ptrtoint and inttoptr in X86FastISel. These casts aren't always handled in the generic FastISel code because X86 sometimes needs custom code to do truncation and zero-extension.
  llvm-svn: 66988
* Fix FastISel's assumption that i1 values are always zero-extended | Dan Gohman | 2009-03-13 | 1 | -1/+14
  by inserting explicit zero extensions where necessary. Included is a testcase where SelectionDAG produces a virtual register holding an i1 value which FastISel previously mistakenly assumed to be zero-extended.
  llvm-svn: 66941
* Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues. | Evan Cheng | 2009-03-13 | 6 | -15/+10
  1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
  2. MachineConstantPool alignment field is also a log2 value.
  3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
  4. Constant pool entry offsets are computed when they are created. However, the asm printer groups them by sections. That means the offsets are no longer valid. However, the asm printer uses them to determine the size of padding between entries.
  5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
  6. Asm printer iterates over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
  Solutions:
  1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
  2. MachineConstantPool alignment field is also changed to keep non-log2 value.
  3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
  4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
  5. Asm printer uses cheaper data structure to group constant pool entries.
  6. Asm printer computes entry offsets after grouping is done.
  7. Change JIT code to compute entry offsets on the fly.
  llvm-svn: 66875
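Computing offsets on the fly from per-entry alignments, as the solutions above describe, might look roughly like the following. This is a minimal sketch under stated assumptions: `PoolEntry`, `alignUp`, and `computeOffsets` are hypothetical names, alignments are byte counts (non-log2) that are powers of two, and none of this is the actual asm printer or JIT code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a constant pool entry: it records a size and a
// byte alignment, but no precomputed offset.
struct PoolEntry {
  std::uint64_t size;       // size of the constant in bytes
  std::uint64_t alignment;  // alignment in bytes; assumed to be a power of two
};

// Round offset up to the next multiple of alignment (power of two assumed).
inline std::uint64_t alignUp(std::uint64_t offset, std::uint64_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Compute each entry's offset in emission order, after any grouping is done.
inline std::vector<std::uint64_t> computeOffsets(const std::vector<PoolEntry>& entries) {
  std::vector<std::uint64_t> offsets;
  std::uint64_t offset = 0;
  for (const PoolEntry& e : entries) {
    offset = alignUp(offset, e.alignment);  // pad only as far as this entry needs
    offsets.push_back(offset);
    offset += e.size;
  }
  return offsets;
}
```

Because offsets are derived from the final emission order, regrouping entries by section cannot leave stale offsets behind, and padding is determined by each entry's own alignment.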
* Oops...I committed too much. | Bill Wendling | 2009-03-13 | 5 | -21/+25
  llvm-svn: 66867
* Temporarily XFAIL this test. | Bill Wendling | 2009-03-13 | 5 | -25/+21
  llvm-svn: 66866
* Fix a typo in a comment. | Dan Gohman | 2009-03-12 | 1 | -1/+1
  llvm-svn: 66843
* Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" | Chris Lattner | 2009-03-12 | 1 | -76/+4
  related transformations out of target-specific dag combine into the ARM backend. These were added by Evan in r37685 with no testcases and only seem to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
  Add some simple X86-specific (for now) DAG combines that turn things like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently with the recently added cp constant select optimization, but is a very general xform. For example, we now compile the second example in const-select.ll to:
    _test:
        movsd LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta %al
        movzbl %al, %eax
        movl 4(%esp), %ecx
        movsbl (%ecx,%eax,4), %eax
        ret
  instead of:
    _test:
        movl 4(%esp), %eax
        leal 4(%eax), %ecx
        movsd LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe %eax, %ecx
        movsbl (%ecx), %eax
        ret
  This passes multisource and dejagnu.
  llvm-svn: 66779
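The select-to-shift rewrite mentioned above, cond ? 8 : 0 -> (zext(cond) << 3), can be checked in isolation. This is a minimal sketch in plain C++ with hypothetical function names; it illustrates the identity only, not the DAG combine itself.

```cpp
#include <cassert>
#include <cstdint>

// Select form: pick between a power-of-two constant and zero.
inline std::uint32_t selectConst(bool cond) {
  return cond ? 8u : 0u;
}

// Shift form: zero-extend the condition (0 or 1) and shift by log2(8) = 3.
inline std::uint32_t shiftForm(bool cond) {
  return static_cast<std::uint32_t>(cond) << 3;
}
```

The identity holds for any power-of-two constant, since a zero-extended condition is exactly 0 or 1 and shifting it by log2 of the constant reproduces the selected value without a conditional move or branch.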
* Enable Chris' value propagation change. | Evan Cheng | 2009-03-12 | 1 | -3/+1
  It makes available known sign, zero, and one bits information for values that are live out of basic blocks. The goal is to eliminate unnecessary sext, zext, truncate of values that are live-in to blocks. This does not handle PHI nodes yet.
  llvm-svn: 66777