path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* Remove unused ShouldFoldAtomicFences flag. | Tim Northover | 2013-04-20 | 1 | -1/+0
  I think it's almost impossible to fold atomic fences profitably under LLVM/C++11 semantics. As a result, this is now unused and just cluttering up the target interface.
  llvm-svn: 179940
* Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE. | Tim Northover | 2013-04-20 | 5 | -71/+1
  llvm-svn: 179939
* Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). | Stephen Lin | 2013-04-20 | 2 | -7/+37
  llvm-svn: 179925
* Allow tail call opportunity detection through nested and/or multiple iterations of extractelement/insertelement indirection | Stephen Lin | 2013-04-20 | 1 | -73/+126
  llvm-svn: 179924
* Simplify the code in FastISel::tryToFoldLoad, add an assertion and fix a comment. | Eli Bendersky | 2013-04-19 | 1 | -17/+10
  llvm-svn: 179908
* Move TryToFoldFastISelLoad to FastISel, where it belongs. | Eli Bendersky | 2013-04-19 | 2 | -79/+66
  In general, I'm trying to move as much FastISel logic as possible out of the main path in SelectionDAGISel - intermixing them just adds confusion.
  llvm-svn: 179902
* ArrayRefize getMachineNode(). No functionality change. | Michael Liao | 2013-04-19 | 2 | -22/+24
  llvm-svn: 179901
* Add an MRI::verifyUseLists() function. | Jakob Stoklund Olesen | 2013-04-19 | 2 | -3/+54
  This checks the sanity of the register use lists in the MI intermediate representation.
  llvm-svn: 179895
* Use dbgs() consistently for -debug printouts | Eli Bendersky | 2013-04-19 | 1 | -13/+13
  llvm-svn: 179894
* Revert "PR14606: debug info imported_module support" | Eric Christopher | 2013-04-19 | 2 | -25/+0
  This reverts commit r179836 as it seems to have caused test failures.
  llvm-svn: 179840
* PR14606: debug info imported_module support | David Blaikie | 2013-04-19 | 2 | -0/+25
  Adding another CU-wide list, in this case of imported_modules (since they should be relatively rare, it seemed better to add a list where each element had a "context" value, rather than add a (usually empty) list to every scope).
  This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll need to expand this to cover DW_TAG_imported_declaration too.
  llvm-svn: 179836
* Add some more stats for fast isel vs. SelectionDAG, w.r.t. lowering function arguments in entry BBs. | Eli Bendersky | 2013-04-19 | 1 | -1/+10
  llvm-svn: 179824
* Add support for subsections to the ELF assembler. Fixes PR8717. | Peter Collingbourne | 2013-04-17 | 1 | -1/+1
  Differential Revision: http://llvm-reviews.chandlerc.com/D598
  llvm-svn: 179725
* Replace uses of the deprecated std::auto_ptr with OwningPtr. | Andy Gibbs | 2013-04-15 | 1 | -23/+22
  This is a rework of the broken parts in r179373 which were subsequently reverted in r179374 due to incompatibility with C++98 compilers. This version should be ok under C++98.
  llvm-svn: 179520
* Document the decision to assume that the cost of floats is twice as much as integers. | Nadav Rotem | 2013-04-14 | 1 | -1/+3
  llvm-svn: 179478
* MI-Sched: DEBUG formatting. | Andrew Trick | 2013-04-13 | 1 | -14/+22
  llvm-svn: 179452
* MI-Sched cleanup. If an instruction has no valid sched class, do not attempt to check for a variant. | Andrew Trick | 2013-04-13 | 1 | -0/+2
  llvm-svn: 179451
* MI-Sched: schedule physreg copies. | Andrew Trick | 2013-04-13 | 2 | -1/+76
  The register allocator expects minimal physreg live ranges. Schedule physreg copies accordingly. This is slightly tricky when they occur in the middle of the scheduling region. For now, this is handled by rescheduling the copy when its associated instruction is scheduled. Eventually we may instead bundle them, but only if we can preserve the bundles as parallel copies during regalloc.
  llvm-svn: 179449
* CostModel: increase the default cost of supported floating point operations from 1 to 2. | Nadav Rotem | 2013-04-12 | 1 | -4/+7
  Fixed a few tests that changed because the cost of one insert plus a vector operation on two doubles is now lower than two scalar operations on doubles.
  llvm-svn: 179413
* Revert broken pieces of r179373. | Benjamin Kramer | 2013-04-12 | 1 | -16/+16
  You can't copy an OwningPtr, and move semantics aren't available in C++98.
  llvm-svn: 179374
* Replace uses of the deprecated std::auto_ptr with OwningPtr. | Andy Gibbs | 2013-04-12 | 3 | -20/+20
  llvm-svn: 179373
* Don't disable block layout when forcing block alignment. | Nadav Rotem | 2013-04-12 | 1 | -8/+6
  llvm-svn: 179355
* Add a flag to align all basic blocks in the function. | Nadav Rotem | 2013-04-12 | 1 | -0/+14
  When debugging performance regressions we often ask ourselves if the regression that we see is due to poor isel/sched/ra or due to some micro-architectural problem. When comparing two code sequences, one good way to rule out front-end bottlenecks (and other issues) is to force code alignment. This pass adds a flag that forces the alignment of all of the basic blocks in the program.
  llvm-svn: 179353
* Add braces around || in && to pacify GCC. | Benjamin Kramer | 2013-04-11 | 1 | -4/+4
  llvm-svn: 179275
* Manually remove successors in if conversion when CopyAndPredicateBlock is used | Hal Finkel | 2013-04-10 | 1 | -0/+8
  In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is used because the to-be-predicated block has other predecessors, we need to explicitly remove the old copied block from the successors list. Normally if conversion relies on TII->AnalyzeBranch combined with BB->CorrectExtraCFGEdges to clean up the successors list, but if the predicated block contained an un-analyzable branch (such as a now-predicated return), then this will fail.
  These extra successors were causing a problem on PPC because they caused later passes (such as PPCEarlyReturn) to leave dead return-only basic blocks in the code.
  llvm-svn: 179227
* Generalize the PassConfig API and remove addFinalizeRegAlloc(). | Andrew Trick | 2013-04-10 | 1 | -36/+50
  The target hooks are getting out of hand. What does it mean to run before or after regalloc anyway? Allowing either Pass* or AnalysisID pass identification should make it much easier for targets to use the substitutePass and insertPass APIs, and create less need for badly named target hooks.
  llvm-svn: 179140
* The .dwo section shouldn't contain the unrelocated values (and therefore not at all) of the pc or statement list. | Eric Christopher | 2013-04-09 | 1 | -13/+21
  We also don't need to emit the compilation dir, so save some space and time and don't bother. Fix up the testcase accordingly and verify that we don't emit the attributes or the items that they use.
  llvm-svn: 179114
* DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible. | Benjamin Kramer | 2013-04-09 | 1 | -0/+49
  This pattern occurs in SROA output due to the way vector arguments are lowered on ARM. The testcase from PR15525 now compiles into this, which is better than the code we got with the old scalarrepl:

    _Store:
      ldr.w r9, [sp]
      vmov d17, r3, r9
      vmov d16, r1, r2
      vst1.8 {d16, d17}, [r0]
      bx lr

  Differential Revision: http://llvm-reviews.chandlerc.com/D647
  llvm-svn: 179106
* DW_FORM_sec_offset should be a relocation on platforms that use a relocation across sections. | Eric Christopher | 2013-04-07 | 2 | -5/+14
  Do this for DW_AT_stmt_list in the skeleton CU and check the relocations in the debug_info section. Add a FIXME for multiple CUs.
  llvm-svn: 178969
* typo | Nadav Rotem | 2013-04-06 | 1 | -1/+1
  llvm-svn: 178949
* Dwarf: use utostr on CUID to append to SmallString. | Manman Ren | 2013-04-06 | 1 | -1/+1
  We used to do "SmallString += CUID", which is incorrect, since CUID will be truncated to a char.
  rdar://problem/13573833
  llvm-svn: 178941
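The truncation described in this commit can be reproduced with std::string, which also accepts a single char in operator+=. The following is a minimal sketch, assuming std::string and std::to_string as stand-ins for SmallString and utostr; both helper names are hypothetical and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mimicking the buggy "Name += CUID" pattern: the
// integer is converted to a single char, so at most its low 8 bits survive
// and exactly one character is appended.
std::string appendTruncated(std::string name, unsigned cuId) {
  name += static_cast<char>(cuId); // made explicit here; implicit in the bug
  return name;
}

// Hypothetical helper mimicking the fix: append the decimal text of the ID.
// std::to_string stands in for LLVM's utostr (an assumption of this sketch).
std::string appendAsText(std::string name, unsigned cuId) {
  name += std::to_string(cuId);
  return name;
}
```

With these helpers, appending the ID 65 through the truncating path yields a single 'A', while the text path appends the digits "65".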
* Reapply r178845 with fix - Fix bug in PEI's virtual-register scavenging | Hal Finkel | 2013-04-05 | 2 | -24/+70
  This fixes PEI as previously described, but correctly handles the case where the instruction defining the virtual register to be scavenged is the first in the block. Arnold provided me with a bugpoint-reduced test case, but even that seems too large to use as a regression test. If I'm successful in cleaning it up then I'll commit that as well.

  Original commit message:

  This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spill slots the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined.

  Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix.
  llvm-svn: 178919
* Use the target options specified on a function to reset the back-end. | Bill Wendling | 2013-04-05 | 2 | -39/+70
  During LTO, the target options on functions within the same Module may change. This would necessitate resetting some of the back-end. Do this for X86, because it's a Friday afternoon.
  llvm-svn: 178917
* Revert r178845 - Fix bug in PEI's virtual-register scavenging | Hal Finkel | 2013-04-05 | 2 | -67/+24
  Reverting because this breaks one of the LTO builders.

  Original commit message:

  This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spill slots the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined.

  Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix.
  llvm-svn: 178916
* Fix bug in PEI's virtual-register scavenging | Hal Finkel | 2013-04-05 | 2 | -24/+67
  This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spill slots the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined.

  Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix.
  llvm-svn: 178845
* RegisterPressure heuristics currently require signed comparisons. | Andrew Trick | 2013-04-05 | 1 | -2/+2
  llvm-svn: 178823
* Disable DFSResult for ConvergingScheduler. | Andrew Trick | 2013-04-05 | 1 | -2/+0
  For now, just save the compile time since the ConvergingScheduler heuristics don't use this analysis. We'll probably enable it later after compile-time investigation.
  llvm-svn: 178822
* MachineScheduler: format DEBUG output. | Andrew Trick | 2013-04-05 | 1 | -22/+17
  I'm getting more serious about tuning and enabling on x86/ARM. Start by making the trace readable.
  llvm-svn: 178821
* CostModel: Add parameter to instruction cost to further classify operand values | Arnold Schwaighofer | 2013-04-04 | 1 | -2/+6
  On certain architectures we can support efficient vectorized versions of instructions if the operand value is uniform (splat) or a constant scalar. An example of this is a vector shift on x86.

  We can efficiently support

    for (i = 0 ; i < ; i += 4)
      w[0:3] = v[0:3] << <2, 2, 2, 2>

  but not

    for (i = 0; i < ; i += 4)
      w[0:3] = v[0:3] << x[0:3]

  This patch adds a parameter to getArithmeticInstrCost to further qualify operand values as uniform or uniform constant. Targets can then choose to return a different cost for instructions with such operand values.

  A follow-up commit will test this feature on x86.
  radar://13576547
  llvm-svn: 178807
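The idea of qualifying an operand so that a target can return a different cost can be sketched as follows. This is an illustration only: the enum, the function name and the cost numbers are hypothetical and do not reproduce the signature of getArithmeticInstrCost or any real target's cost table.

```cpp
#include <cassert>

// Hypothetical classification of an operand's value, mirroring the cases
// named in the commit message: arbitrary, uniform (splat), uniform constant.
enum class OperandKind { Any, Uniform, UniformConstant };

// Hypothetical cost hook for a 4-wide vector shift. The numbers are
// illustrative assumptions, not measurements from any real target.
unsigned vectorShiftCost(OperandKind shiftAmount) {
  switch (shiftAmount) {
  case OperandKind::UniformConstant: // w[0:3] = v[0:3] << <2, 2, 2, 2>
  case OperandKind::Uniform:         // every lane shifted by the same amount
    return 1;                        // assumed: a single vector shift
  case OperandKind::Any:             // w[0:3] = v[0:3] << x[0:3]
    return 4 * 3;                    // assumed: extract, shift, insert per lane
  }
  return 0; // not reached; keeps compilers that warn on fall-off quiet
}
```

A vectorizer consulting such a hook would see the uniform shift as cheap and the per-lane shift as expensive, which is the distinction the new parameter makes available.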
* Debug Info: revert 178722 for now. | Manman Ren | 2013-04-04 | 3 | -15/+4
  There is a difference for FORM_ref_addr between DWARF 2 and DWARF 3+. Since Eric is against guarding DWARF 2 ref_addr with DarwinGDBCompat, we are still in discussion on how to handle this.
  The correct solution is to update our header to say version 4 instead of version 2 and update tool chains as well.
  rdar://problem/13559431
  llvm-svn: 178806
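The version difference mentioned above is that DWARF 2 encodes DW_FORM_ref_addr with the size of a target address, while DWARF 3 and later encode it with the offset size of the DWARF format (4 bytes in 32-bit DWARF, 8 bytes in 64-bit DWARF). A minimal sketch of that rule, using a hypothetical helper that is not part of LLVM:

```cpp
#include <cassert>

// Hypothetical helper: encoded size in bytes of a DW_FORM_ref_addr value.
// DWARF 2 uses the target address size; DWARF 3 and later use the offset
// size of the DWARF format (4 for 32-bit DWARF, 8 for 64-bit DWARF).
unsigned refAddrSize(unsigned dwarfVersion, unsigned addressSize,
                     bool isDwarf64) {
  assert(dwarfVersion >= 2 && "DWARF 1 is not modelled in this sketch");
  if (dwarfVersion == 2)
    return addressSize;
  return isDwarf64 ? 8 : 4;
}
```

On a 64-bit target this gives 8 bytes under DWARF 2 but 4 bytes under 32-bit DWARF 4, which is why a header that claims version 2 matters to consumers.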
* typo | Adrian Prantl | 2013-04-04 | 1 | -1/+1
  llvm-svn: 178804
* Formatting | Eli Bendersky | 2013-04-04 | 1 | -2/+2
  llvm-svn: 178771
* Debug Info: according to DWARF 2, FORM_ref_addr is the same size as an address on the target system. | Manman Ren | 2013-04-04 | 3 | -4/+15
  It was hard-coded to 4 bytes before.
  I can't get llvm to generate a ref_addr on a reasonably sized testing case.
  rdar://problem/13559431
  llvm-svn: 178722
* Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. | Bill Schmidt | 2013-04-03 | 2 | -0/+12
  For this we need to use a libcall. Previously LLVM didn't implement libcall support for frem, so I've added it in the usual straightforward manner. A test case from the bug report is included.
  llvm-svn: 178639
* Fix grammar. | Eric Christopher | 2013-04-03 | 1 | -1/+1
  llvm-svn: 178624
* Remove ZeroOrMore from the option description. We don't need it here. | Eric Christopher | 2013-04-03 | 1 | -1/+1
  llvm-svn: 178623
* Allow MachineTraceMetrics to be used when the model has no resources. | Jakob Stoklund Olesen | 2013-04-02 | 2 | -7/+11
  It is still possible to extract information from itineraries, for example.
  llvm-svn: 178582
* Don't attempt MTM heuristics without a scheduling model present. | Jakob Stoklund Olesen | 2013-04-02 | 1 | -0/+4
  This should fix the PPC buildbots.
  llvm-svn: 178558
* Count processor resources individually in MachineTraceMetrics. | Jakob Stoklund Olesen | 2013-04-02 | 1 | -9/+144
  The new instruction scheduling models provide information about the number of cycles consumed on each processor resource. This makes it possible to estimate ILP more accurately than simply counting instructions / issue width.
  The functions getResourceDepth() and getResourceLength() now identify the limiting processor resource, and return a cycle count based on that. This gives more precise resource information, particularly in traces that use one resource a lot more than others.
  llvm-svn: 178553
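The difference between the two estimates can be sketched as below. This is not the MachineTraceMetrics implementation: both function names, the flat per-resource cycle vector and the choice to combine the two bounds with a maximum are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Estimate based only on instruction count: instructions divided by issue
// width, rounded up.
unsigned cyclesByIssueWidth(unsigned numInstrs, unsigned issueWidth) {
  assert(issueWidth > 0 && "issue width must be positive");
  return (numInstrs + issueWidth - 1) / issueWidth;
}

// Hypothetical per-resource estimate: cyclesPerResource[i] is the total
// number of cycles the trace occupies processor resource i. The busiest
// resource limits throughput, so the result is the largest such count,
// never lower than the issue-width estimate.
unsigned cyclesByLimitingResource(const std::vector<unsigned> &cyclesPerResource,
                                  unsigned numInstrs, unsigned issueWidth) {
  unsigned bound = cyclesByIssueWidth(numInstrs, issueWidth);
  for (unsigned cycles : cyclesPerResource)
    bound = std::max(bound, cycles);
  return bound;
}
```

For a trace of 8 instructions on a 4-wide machine the count-based estimate is 2 cycles, but if 6 of those cycles land on a single resource, the per-resource estimate reports 6, matching the commit's point about traces that use one resource far more than the others.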
* DAGCombiner: Merge store/loads when we have extload/truncstores | Arnold Schwaighofer | 2013-04-02 | 1 | -0/+19
  This helps on architectures where i8 and i16 are not legal but we have byte and short loads/stores, allowing us to merge copies like the one below on ARM.

    copy(char *a, char *b, int n) {
      do {
        int t0 = a[0];
        int t1 = a[1];
        b[0] = t0;
        b[1] = t1;

  radar://13536387
  llvm-svn: 178546