summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_iMichael Liao2012-10-171-2/+17
| | | | | | | | - If the extracted vector has the same type of all vectored being concatenated together, it should be simplified directly into v_i, where i is the index of the element being extracted. llvm-svn: 166125
* Switch MRI::UsedPhysRegs to a register unit bit vector.Jakob Stoklund Olesen2012-10-171-2/+2
| | | | | | | This is a more compact, less redundant representation, and it avoids scanning long lists of aliases for ARM D-registers, for example. llvm-svn: 166124
* Add a really faster pre-RA scheduler (-pre-RA-sched=linearize). It doesn't useEvan Cheng2012-10-173-3/+160
| | | | | | | | | | | | | | any scheduling heuristics nor does it build up any scheduling data structure that other heuristics use. It essentially linearize by doing a DFA walk but it does handle glues correctly. IMPORTANT: it probably can't handle all the physical register dependencies so it's not suitable for x86. It also doesn't deal with dbg_value nodes right now so it's definitely is still WIP. rdar://12474515 llvm-svn: 166122
* Merge MRI::isPhysRegOrOverlapUsed() into isPhysRegUsed().Jakob Stoklund Olesen2012-10-172-2/+2
| | | | | | | | | | | All callers of these functions really want the isPhysRegOrOverlapUsed() functionality which also checks aliases. For historical reasons, targets without register aliases were calling isPhysRegUsed() instead. Change isPhysRegUsed() to also check aliases, and switch all isPhysRegOrOverlapUsed() callers to isPhysRegUsed(). llvm-svn: 166117
* misched: Better handling of invalid latencies in the machine modelAndrew Trick2012-10-171-2/+10
| | | | llvm-svn: 166107
* Use a SparseSet instead of a BitVector for UsedInInstr in RAFast.Jakob Stoklund Olesen2012-10-171-23/+30
| | | | | | | | This is just as fast, and it makes it possible to avoid leaking the UsedPhysRegs BitVector implementation through MachineRegisterInfo::addPhysRegsUsed(). llvm-svn: 166083
* Avoid rematerializing a redef immediately after the old def.Jakob Stoklund Olesen2012-10-161-0/+7
| | | | | | | | | | | | | | | | | PR14098 contains an example where we would rematerialize a MOV8ri immediately after the original instruction: %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7 %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7 Besides being pointless, it is also wrong since the original instruction only redefines part of the register, and the value read by the new instruction is wrong. The problem was the LiveRangeEdit::allUsesAvailableAt() didn't special-case OrigIdx == UseIdx and found the wrong SSA value. llvm-svn: 166068
* Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit"Jakob Stoklund Olesen2012-10-161-344/+1
| | | | | | A fix for PR14098, including the test case is in the next commit. llvm-svn: 166067
* Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)Michael Liao2012-10-161-0/+46
| | | | llvm-svn: 166049
* Switch back to the old coalescer for now to fix the 32 bit bitRafael Espindola2012-10-161-1/+344
| | | | | | llvm+clang+compiler-rt bootstrap. llvm-svn: 166046
* Issue:Stepan Dyatkovskiy2012-10-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stack is formed improperly for long structures passed as byval arguments for EABI mode. If we took AAPCS reference, we can found the next statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have structure with doubles (9 double fields) and 3 Core unused registers (r1, r2, r3): caller should use r2 and r3 registers only. Currently r1,r2,r3 set is used, but it is invalid. Callee VA routine should also use r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up SP address with ADD+BFC operations. Fix: Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and 8 byte alignment, we waste odd registers then. P.S.: I also improved LDRB_POST_IMM regression test. Since ldrb instruction will not generated by current regression test after this patch. llvm-svn: 166018
* misched: Added handleMove support for updating all kill flags, not just for ↵Andrew Trick2012-10-162-8/+21
| | | | | | | | | allocatable regs. This is a medium term workaround until we have a more robust solution in the form of a register liveness utility for postRA passes. llvm-svn: 166001
* Remove unused BitVectors from getAllocatableSet().Jakob Stoklund Olesen2012-10-163-9/+1
| | | | llvm-svn: 165999
* Remove RegisterClassInfo::isReserved() and isAllocatable().Jakob Stoklund Olesen2012-10-156-17/+18
| | | | | | Clients can use the equivalent functions in MRI. llvm-svn: 165990
* Remove LIS::isAllocatable() and isReserved() helpers.Jakob Stoklund Olesen2012-10-154-7/+5
| | | | | | All callers can simply use the corresponding MRI functions. llvm-svn: 165985
* Switch most getReservedRegs() clients to the MRI equivalent.Jakob Stoklund Olesen2012-10-1510-40/+25
| | | | | | | Using the cached bit vector in MRI avoids comstantly allocating and recomputing the reserved register bit vector. llvm-svn: 165983
* Freeze the reserved registers as soon as isel is complete.Jakob Stoklund Olesen2012-10-152-9/+10
| | | | | | | | | | | | | Also provide an MRI::getReservedRegs() function to access the frozen register set, and isReserved() and isAllocatable() methods to test individual registers. The various implementations of TRI::getReservedRegs() are quite complicated, and many passes need to look at the reserved register set. This patch makes it possible for these passes to use the cached copy in MRI, avoiding a lot of malloc traffic and repeated calculations. llvm-svn: 165982
* Move the Attributes::Builder outside of the Attributes class and into its ↵Bill Wendling2012-10-151-3/+3
| | | | | | own class named AttrBuilder. No functionality change. llvm-svn: 165960
* Make sure we iterate over newly created instructions. Fixes pr13625. Testcase toRafael Espindola2012-10-151-0/+5
| | | | | | follow in one sec. llvm-svn: 165951
* misched: ILP scheduler for experimental heuristics.Andrew Trick2012-10-152-20/+197
| | | | llvm-svn: 165950
* Resubmit the changes to llvm core to update the functions to support ↵Micah Villmow2012-10-159-24/+29
| | | | | | different pointer sizes on a per address space basis. llvm-svn: 165941
* Remove the bitwise XOR operator from the Attributes class. Replace it with ↵Bill Wendling2012-10-141-2/+2
| | | | | | the equivalent from the builder class. llvm-svn: 165893
* Drop <def,dead> flags when merging into an unused lane.Jakob Stoklund Olesen2012-10-131-4/+9
| | | | | | | | | | | | The new coalescer can merge a dead def into an unused lane of an otherwise live vector register. Clear the <dead> flag when that happens since the flag refers to the full virtual register which is still live after the partial dead def. This fixes PR14079. llvm-svn: 165877
* Allow for loops in LiveIntervals::pruneValue().Jakob Stoklund Olesen2012-10-131-29/+32
| | | | | | | | | | | | | | It is possible that the live range of the value being pruned loops back into the kill MBB where the search started. When that happens, make sure that the beginning of KillMBB is also pruned. Instead of starting a DFS at KillMBB and skipping the root of the search, start a DFS at each KillMBB successor, and allow the search to loop back to KillMBB. This fixes PR14078. llvm-svn: 165872
* Use a transposed algorithm for handleMove().Jakob Stoklund Olesen2012-10-121-427/+213
| | | | | | | | | | | | | Completely update one interval at a time instead of collecting live range fragments to be updated. This avoids building data structures, except for a single SmallPtrSet of updated intervals. Also share code between handleMove() and handleMoveIntoBundle(). Add support for moving dead defs across other live values in the interval. The MI scheduler can do that. llvm-svn: 165824
* Fix coalescing with IMPLICIT_DEF values.Jakob Stoklund Olesen2012-10-121-21/+54
| | | | | | | | | | | | | | | | PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all PHI predecessors have a live-out value. These IMPLICIT_DEF values are not considered to be real interference when coalescing virtual registers: %vreg1 = IMPLICIT_DEF %vreg2 = MOV32r0 When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its value number should simply be erased since the %vreg2 value number now provides a live-out value for the PHI predecesor block. llvm-svn: 165813
* Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCASTUlrich Weigand2012-10-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | On PowerPC, a bitcast of <16 x i8> to i128 may run through a code path in ExpandRes_BITCAST that attempts to do an intermediate bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts of the resulting i128 by pairing up two of those i32 vector elements each. The code already recognizes that on a big-endian system, the first two vector elements form the Hi part, and the final two vector elements form the Lo part (vice-versa from the little-endian situation). However, we also need to take endianness into account when forming each of those separate pairs: on a big-endian system, vector element 0 is the *high* part of the pair making up the Hi part of the result, and vector element 1 is the low part of the pair. The code currently always uses vector element 0 as the low part and vector element 1 as the high part, as is appropriate for little-endian platforms only. This patch fixes this by swapping the vector elements as they are paired up as appropriate. llvm-svn: 165802
* Legalizer optimize a pair of div / mod to a call to divrem libcall if they areEvan Cheng2012-10-121-0/+2
| | | | | | | | | | not legal. However, it should use a div instruction + mul + sub if divide is legal. The rem legalization code was missing a check and incorrectly uses a divrem libcall even when div is legal. rdar://12481395 llvm-svn: 165778
* Remove unnecessary classof()'sSean Silva2012-10-111-8/+0
| | | | | | | isa<> et al. automatically infer when the cast is an upcast (including a self-cast), so these are no longer necessary. llvm-svn: 165767
* Revert 165732 for further review.Micah Villmow2012-10-119-29/+24
| | | | llvm-svn: 165747
* Add in the first iteration of support for llvm/clang/lldb to allow variable ↵Micah Villmow2012-10-119-24/+29
| | | | | | per address space pointer sizes to be optimized correctly. llvm-svn: 165726
* Pass an explicit operand number to addLiveIns.Jakob Stoklund Olesen2012-10-112-8/+8
| | | | | | | | | Not all instructions define a virtual register in their first operand. Specifically, INLINEASM has a different format. <rdar://problem/12472811> llvm-svn: 165721
* Follow the same routine to add target float expansion hookMichael Liao2012-10-111-26/+24
| | | | llvm-svn: 165707
* misched: Handle "transient" non-instructions.Andrew Trick2012-10-112-17/+25
| | | | llvm-svn: 165701
* Add a new interface to allow IR-level passes to access codegen-specific ↵Nadav Rotem2012-10-101-2/+2
| | | | | | information. llvm-svn: 165665
* Add in support for expansion of all of the comparison operations to the ↵Micah Villmow2012-10-101-17/+62
| | | | | | | | | | absolute minimum required set. This allows a backend to expand any arbitrary set of comparisons as long as a minimum set is supported. The minimum set of required instructions is ISD::AND, ISD::OR, ISD::SETO(or ISD::SETOEQ) and ISD::SETUO(or ISD::SETUNE). Everything is expanded into one of two patterns: Pattern 1: (LHS CC1 RHS) Opc (LHS CC2 RHS) Pattern 2: (LHS CC1 LHS) Opc (RHS CC2 RHS) llvm-svn: 165655
* Add alternative support for FP_ROUND from v2f32 to v2f64Michael Liao2012-10-102-4/+8
| | | | | | | | | | | - Due to the current matching vector elements constraints in ISD::FP_EXTEND, rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening to convert it into a target-specific X86ISD::VFPEXT to work around this constraints. This patch also reverts a previous attempt to fix this issue by recovering the scalarized ISD::FP_EXTEND pattern and thus significantly reduces the overhead of supporting non-power-2 vector FP extend. llvm-svn: 165625
* Issue description:Stepan Dyatkovskiy2012-10-101-2/+3
| | | | | | | | | | | | | | | | | | | | SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack objects and byval parameters. So loading byval parameters from stack may be inserted *before* it will be stored, since these operations are treated as independent. Fix: Currently ARMTargetLowering::LowerFormalArguments saves byval registers with FixedStack MachinePointerInfo. To fix the problem we need to store byval registers with MachinePointerInfo referenced to first the "byval" parameter. Also commit adds two new fields to the InputArg structure: Function's argument index and InputArg's part offset in bytes relative to the start position of Function's argument. E.g.: If function's argument is 128 bit width and it was splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index, but different offset values. llvm-svn: 165616
* Remove the final bits of Attributes being declared in the AttributeBill Wendling2012-10-101-2/+4
| | | | | | | namespace. Use the attribute's enum value instead. No functionality change intended. llvm-svn: 165610
* My earlier "fix" for PBQP (see r165201) was incorrect. The real issue was thatLang Hames2012-10-101-2/+2
| | | | | | | | checkRegMaskInterference only initializes the bitmask on the first interference. This fixes PR14027 and (re)fixes PR13945. llvm-svn: 165608
* misched: fall-back to a target hook for instr bundles.Andrew Trick2012-10-101-3/+4
| | | | llvm-svn: 165606
* misched: Use the TargetSchedModel interface wherever possible.Andrew Trick2012-10-103-32/+75
| | | | | | | | Allows the new machine model to be used for NumMicroOps and OutputLatency. Allows the HazardRecognizer to be disabled along with itineraries. llvm-svn: 165603
* misched: Add computeInstrLatency to TargetSchedModel.Andrew Trick2012-10-091-0/+24
| | | | llvm-svn: 165566
* misched: Allow flags to disable hasInstrSchedModel/hasInstrItineraries for ↵Andrew Trick2012-10-091-6/+12
| | | | | | external users of TargetSchedule. llvm-svn: 165564
* misched: Remove LoopDependencies heuristic.Andrew Trick2012-10-091-40/+1
| | | | | | This wasn't contributing anything significant to postRA heuristics except compile time (by my measurements) and will be replaced by a more general heuristic for cross-region dependencies within the scheduler itself. llvm-svn: 165563
* Use the attribute enums to query if a parameter has an attribute.Bill Wendling2012-10-091-6/+6
| | | | llvm-svn: 165550
* Add in the first step of the multiple pointer support. This adds in support ↵Micah Villmow2012-10-091-6/+7
| | | | | | | | to the data layout for specifying a per address space pointer size. The next step is to update the optimizers to allow them to optimize the different address spaces with this information. llvm-svn: 165505
* Create enums for the different attributes.Bill Wendling2012-10-0911-33/+42
| | | | | | | We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488
* Fix up comment to be more clear.Eric Christopher2012-10-081-2/+2
| | | | llvm-svn: 165463
* Refactor the AddrMode class out of TLI to its own header file.Nadav Rotem2012-10-081-1/+1
| | | | | | | | This class is used by LSR and a number of places in the codegen. This is the first step in de-coupling LSR from TLI, and creating a new interface in between them. llvm-svn: 165455
OpenPOWER on IntegriCloud