summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC/PPCInstrInfo.h
Commit message (Collapse)AuthorAgeFilesLines
...
* CodeGen: TII: Take MachineInstr& in predicate API, NFCDuncan P. N. Exon Smith2016-02-231-5/+5
| | | | | | | | | | | | | Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions. No functionality change intended. llvm-svn: 261605
* replace MachineCombinerPattern namespace and enum with enum class; NFCISanjay Patel2015-11-051-1/+1
| | | | | | | | Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196
* Improved the interface of methods commuting operands, improved X86-FMA3 ↵Andrew Kaylor2015-09-281-4/+17
| | | | | | | | | | mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735
* [Machine Combiner] Refactor machine reassociation code to be target-independent.Chad Rosier2015-09-211-22/+3
| | | | | | | | | | No functional change intended. Patch by Haicheng Wu <haicheng@codeaurora.org>! http://reviews.llvm.org/D12887 PR24522 llvm-svn: 248164
* Pass BranchProbability/BlockMass by value instead of const& as they are ↵Cong Hou2015-09-101-6/+4
| | | | | | small. NFC. llvm-svn: 247357
* [PowerPC/MIR Serialization] Target flags serialization supportHal Finkel2015-08-301-0/+9
| | | | | | | | | | | | | Add support for MIR serialization of PowerPC-specific operand target flags (based on the generic infrastructure added in r244185 and r245383). I won't even pretend that this is good test coverage, but this includes the regression test associated with r246372. Adding an MIR test for that fix is far superior to adding an IR-level test because particular instruction-scheduling decisions are necessary in order to expose the bug, and using an MIR test we can start the pipeline post-scheduling. llvm-svn: 246373
* [PowerPC] Use the MachineCombiner to reassociate fadd/fmulHal Finkel2015-07-151-0/+32
| | | | | | | | | | | | | This is a direct port of the code from the X86 backend (r239486/r240361), which uses the MachineCombiner to reassociate (floating-point) adds/muls to increase ILP, to the PowerPC backend. The rationale is the same. There is a lot of copy-and-paste here between the X86 code and the PowerPC code, and we should extract at least some of this into CodeGen somewhere. However, I don't want to do that until this code is enhanced to handle FMAs as well. After that, we'll be in a better position to extract the common parts. llvm-svn: 242279
* [PowerPC] Fix the PPCInstrInfo::getInstrLatency implementationHal Finkel2015-07-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PowerPC uses itineraries to describe processor pipelines (and dispatch-group restrictions for P7/P8 cores). Unfortunately, the target-independent implementation of TII.getInstrLatency calls ItinData->getStageLatency, and that looks for the largest cycle count in the pipeline for any given instruction. This, however, yields the wrong answer for the PPC itineraries, because we don't encode the full pipeline. Because the functional units are fully pipelined, we only model the initial stages (there are no relevant hazards in the later stages to model), and so the technique employed by getStageLatency does not really work. Instead, we should take the maximum output operand latency, and that's what PPCInstrInfo::getInstrLatency now does. This caused some test-case churn, including two unfortunate side effects. First, the new arrangement of copies we get from function parameters now sometimes blocks VSX FMA mutation (a FIXME has been added to the code and the test cases), and we have one significant test-suite regression: SingleSource/Benchmarks/BenchmarkGame/spectral-norm 56.4185% +/- 18.9398% In this benchmark we have a loop with a vectorized FP divide, and it with the new scheduling both divides end up in the same dispatch group (which in this case seems to cause a problem, although why is not exactly clear). The grouping structure is hard to predict from the bottom of the loop, and there may not be much we can do to fix this. Very few other test-suite performance effects were really significant, but almost all weakly favor this change. However, in light of the issues highlighted above, I've left the old behavior available via a command-line flag. llvm-svn: 242188
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* MachineLICM: Use TargetSchedModel instead of just itinerariesMatthias Braun2015-06-131-1/+1
| | | | | | | | | This will use Itinieraries if available, but will also work if just a MCSchedModel is available. Differential Revision: http://reviews.llvm.org/D10428 llvm-svn: 239658
* [CodeGen] ArrayRef'ize cond/pred in various TII APIs. NFC.Ahmed Bougacha2015-06-111-12/+8
| | | | llvm-svn: 239553
* Remove some unnecessary forward declarations and put a couple moreEric Christopher2015-03-121-1/+1
| | | | | | where they're supposed to reside. llvm-svn: 232014
* [PowerPC] Mark all instructions as non-cheap for MachineLICMHal Finkel2015-01-081-0/+9
| | | | | | | | | | | | MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. llvm-svn: 225471
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-131-2/+2
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
* Provide an implementation of getNoopForMachoTarget for PPC, otherwiseJoerg Sonnenberger2014-08-081-0/+2
| | | | | | empty functions will assert in the MC object writer. llvm-svn: 215238
* The hazard recognizer only needs a subtarget, not a target machineEric Christopher2014-06-131-1/+1
| | | | | | so make it take one. Fix up all users accordingly. llvm-svn: 210948
* Remove TargetMachine from PPCInstrInfo and all dependencies andEric Christopher2014-06-121-2/+2
| | | | | | replace with the current subtarget. llvm-svn: 210836
* De-virtualize or remove some methods that have no overrides nor override ↵Craig Topper2014-04-301-1/+1
| | | | | | anything. In some cases remove all together if there are no callers either. llvm-svn: 207610
* [C++11] Add 'override' keywords and remove 'virtual'. Additionally add ↵Craig Topper2014-04-291-86/+83
| | | | | | 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition llvm-svn: 207504
* [PowerPC] Correct commutable indices for VSX FMA instructionsHal Finkel2014-03-251-0/+3
| | | | | | | | | | Although the first two operands are the ones that can be swapped, the tied input operand is listed before them, so we need to adjust for that. I have a test case for this, but it goes along with an upcoming commit (so it will come soon). llvm-svn: 204748
* Improve instruction scheduling for the PPC POWER7Hal Finkel2013-12-121-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aside from a few minor latency corrections, the major change here is a new hazard recognizer which focuses on better dispatch-group formation on the POWER7. As with the PPC970's hazard recognizer, the most important thing it does is avoid load-after-store hazards within the same dispatch group. It uses the POWER7's special dispatch-group-terminating nop instruction (instead of inserting multiple regular nop instructions). This new hazard recognizer makes use of the scheduling dependency graph itself, built using AA information, to robustly detect the possibility of load-after-store hazards. significant test-suite performance changes (the error bars are 99.5% confidence intervals based on 5 test-suite runs both with and without the change -- speedups are negative): speedups: MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 -0.55171% +/- 0.333168% MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -17.5576% +/- 14.598% MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl -29.5708% +/- 7.09058% MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt -34.9471% +/- 11.4391% SingleSource/Benchmarks/BenchmarkGame/puzzle -25.1347% +/- 11.0104% SingleSource/Benchmarks/Misc/flops-8 -17.7297% +/- 9.79061% SingleSource/Benchmarks/Shootout-C++/ary3 -35.5018% +/- 23.9458% SingleSource/Regression/C/uint64_to_float -56.3165% +/- 25.4234% SingleSource/UnitTests/Vectorizer/gcc-loops -18.5309% +/- 6.8496% regressions: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000 18.351% +/- 12.156% SingleSource/Benchmarks/Shootout-C++/methcall 27.3086% +/- 14.4733% llvm-svn: 197099
* [weak vtables] Remove a bunch of weak vtablesJuergen Ributzka2013-11-191-0/+1
| | | | | | | | | | | | This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064
* Revert r194865 and r194874.Alexey Samsonov2013-11-181-1/+0
| | | | | | | | | | | | This change is incorrect. If you delete virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not cause the destructor for members of Child class. As a result, I observe plently of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997
* [weak vtables] Remove a bunch of weak vtablesJuergen Ributzka2013-11-151-0/+1
| | | | | | | | | | | This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865
* DebugInfo: remove target-specific Frame Index handling for DBG_VALUE ↵David Blaikie2013-06-161-6/+0
| | | | | | | | | | MachineInstrs Frame index handling is now target-agnostic, so delete the target hooks for creation & asm printing of target-specific addressing in DBG_VALUEs and any related functions. llvm-svn: 184067
* Implement optimizeCompareInstr for PPCHal Finkel2013-04-181-0/+12
| | | | | | | | | | | | | | | Many PPC instructions have a so-called 'record form' which stores to a specific condition register the result of comparing the result of the instruction with zero (always as a signed comparison). For integer operations on PPC64, this is always a 64-bit comparison. This implementation is derived from the implementation in the ARM backend; there are some differences because PPC condition registers are allocatable virtual registers (although the record forms always use a specific one), and we look for a matching subtraction instruction after the compare (but before the first use) in addition to before it. llvm-svn: 179802
* PPC: Don't predicate a diamond with two counter decrementsHal Finkel2013-04-101-3/+1
| | | | | | | | | | I've not seen this happen in practice, and probably can't until we start allowing decrement-counter-based conditional branches to be double predicated, but just in case, don't allow predication of a diamond in which both sides have ctr-defining branches. Even though the branching behavior of these can be predicated, the counter-decrementing behavior cannot be. llvm-svn: 179199
* Allow PPC B and BLR to be if-converted into some predicated formsHal Finkel2013-04-091-0/+47
| | | | | | | | | | | | | | This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr which is a conditional return and loop-counter decrement all in one). At the moment, if conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. llvm-svn: 179134
* Implement PPCInstrInfo::FoldImmediateHal Finkel2013-04-061-0/+3
| | | | | | | | There are certain PPC instructions into which we can fold a zero immediate operand. We can detect such cases by looking at the register class required by the using operand (so long as it is not otherwise constrained). llvm-svn: 178961
* Enable early if conversion on PPCHal Finkel2013-04-051-0/+11
| | | | | | | | | | | | | On cores for which we know the misprediction penalty, and we have the isel instruction, we can profitably perform early if conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; isel code in PPCISelLowering was refactored to use these functions as well. llvm-svn: 178926
* Note in PPCFunctionInfo VRSAVE spillsHal Finkel2013-03-231-2/+2
| | | | | | | | | | | | In preparation for using the new register scavenger capability for providing more than one register simultaneously, specifically note functions that have spilled VRSAVE (currently, this can happen only in functions that use the setjmp intrinsic). As with CR spilling, such functions will need to provide two emergency spill slots to the scavenger. No functionality change intended. llvm-svn: 177832
* Improve PPC VR (Altivec) register spillingHal Finkel2013-03-171-2/+4
| | | | | | | | | | | | | | | | This change cleans up two issues with Altivec register spilling: 1. The spilling code was inefficient (using two instructions, and add and a load, when just one would do) 2. The code assumed that r0 would always be available (true for now, but this will change) The new code handles VR spilling just like GPR spills but forced into r+r mode. As a result, when any VR spills are present, we must now always allocate the register-scavenger spill slot. llvm-svn: 177231
* Implement PPCInstrInfo::isCoalescableExtInstr().Jakob Stoklund Olesen2012-06-191-0/+3
| | | | | | | | | | | | | The PPC::EXTSW instruction preserves the low 32 bits of its input, just like some of the x86 instructions. Use it to reduce register pressure when the low 32 bits have multiple uses. This requires a small change to PeepholeOptimizer since EXTSW takes a 64-bit input register. This is related to PR5997. llvm-svn: 158743
* Reorder includes in Target backends to following coding standards. Remove ↵Craig Topper2012-03-171-1/+1
| | | | | | some superfluous forward declarations. llvm-svn: 152997
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, ↵Jia Liu2012-02-181-3/+3
| | | | | | MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878
* add RESTORE_CR and support CR unspillsHal Finkel2011-12-061-1/+1
| | | | llvm-svn: 145961
* update PPC 940 hazard rec. to function in postRA modeHal Finkel2011-12-021-0/+3
| | | | llvm-svn: 145676
* Hide the call to InitMCInstrInfo into tblgen generated ctor.Evan Cheng2011-07-011-1/+4
| | | | llvm-svn: 134244
* Various bits of framework needed for precise machine-level selectionAndrew Trick2010-12-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about it's SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541
* whitespaceAndrew Trick2010-12-241-10/+10
| | | | llvm-svn: 122539
* implement support for the MO_DARWIN_STUB TargetOperand flag,Chris Lattner2010-11-141-1/+1
| | | | | | | | and have isel apply to to call operands as required. This allows us to get $stub suffixes on label references on ppc/tiger with the new instprinter, fixing two tests. Only 2 to go. llvm-svn: 119093
* Remove the isMoveInstr() hook.Jakob Stoklund Olesen2010-07-161-6/+0
| | | | llvm-svn: 108567
* RISC architectures get their memory operand folding for free.Jakob Stoklund Olesen2010-07-111-17/+0
| | | | | | | | The only folding these load/store architectures can do is converting COPY into a load or store, and the target independent part of foldMemoryOperand already knows how to do that. llvm-svn: 108099
* Replace copyRegToReg with copyPhysReg for PowerPC.Jakob Stoklund Olesen2010-07-111-6/+4
| | | | llvm-svn: 108083
* Add a DebugLoc parameter to TargetInstrInfo::InsertBranch(). ThisStuart Hastings2010-06-171-1/+2
| | | | | | | | | | | | addresses a longstanding deficiency noted in many FIXMEs scattered across all the targets. This effectively moves the problem up one level, replacing eleven FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path through FastISel where we actually supply a DebugLoc, fixing Radar 7421831. llvm-svn: 106243
* Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that itDan Gohman2010-05-061-1/+2
| | | | | | doesn't have to guess. llvm-svn: 103194
* Add argument TargetRegisterInfo to loadRegFromStackSlot and storeRegToStackSlot.Evan Cheng2010-05-061-2/+4
| | | | llvm-svn: 103193
* Frame index can be negative.Evan Cheng2010-04-291-1/+1
| | | | llvm-svn: 102577
* Add PPC specific emitFrameIndexDebugValue.Evan Cheng2010-04-261-0/+6
| | | | llvm-svn: 102325
OpenPOWER on IntegriCloud