summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/MachineCombiner.cpp
Commit message (Collapse)AuthorAgeFilesLines
* CodeGen: Rename DEBUG_TYPE to match passnamesMatthias Braun2017-05-251-2/+2
| | | | | | | | Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible. llvm-svn: 303921
* Fix up grammar in a comment.Eric Christopher2017-03-151-1/+1
| | | | llvm-svn: 297898
* Compile time decreasing in the case we're dealing with Machine Combiner. Andrew V. Tischenko2017-02-131-15/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch compile time was about 21s (see below). After this patch we have less than 2s (see bellow). Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz DAGCombiner - trunk time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.685s DAGCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.655s MachineCombiner w/o Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m21.614s MachineCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.593s The test spill_fdiv.ll is attached to D29627 D29627 should be closed. llvm-svn: 294936
* MachineInstr: Remove parameter from dump()Matthias Braun2017-01-291-1/+3
| | | | | | | | | | | | | The primary use of the dump() functions in LLVM is for use in a debugger. Unfortunately lldb does not seem to handle default arguments so using `p SomeMI.dump()` fails and you have to type the longer `p SomeMI.dump(nullptr)`. Remove the paramter to make the most common use easy. (You can always construct something like `p SomeMI.print(dbgs(),MyTII)` if you need more features). Differential Revision: https://reviews.llvm.org/D29241 llvm-svn: 293440
* machine combiner: fix pretty printerSebastian Pop2016-12-211-1/+1
| | | | | | | | | | | we used to print UNKNOWN instructions when the instruction to be printer was not yet inserted in any BB: in that case the pretty printer would not be able to compute a TII as the instruction does not belong to any BB or function yet. This patch explicitly passes the TII to the pretty-printer. Differential Revision: https://reviews.llvm.org/D27645 llvm-svn: 290228
* instr-combiner: sum up all latencies of the transformed instructionsSebastian Pop2016-12-111-2/+9
| | | | | | | | | | | | | | | | | | | | We have found that -- when the selected subarchitecture has a scheduling model and we are not optimizing for size -- the machine-instruction combiner uses a too-simple algorithm to compute the cost of one of the two alternatives [before and after running a combining pass on a section of code], and therefor it throws away the combination results too often. This fix has the potential to help any ISA with the potential to combine instructions and for which at least one subarchitecture has a scheduling model. As of now, this is only known to definitely affect AArch64 subarchitectures with a scheduling model. Regression tested on AMD64/GNU-Linux, new test case tested to fail on an unpatched compiler and pass on a patched compiler. Patch by Abe Skolnik and Sebastian Pop. llvm-svn: 289399
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-011-1/+1
| | | | llvm-svn: 283004
* [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)Gerolf Hoflehner2016-04-241-1/+11
| | | | | | | | | | | | | | | | | | | The original patch caused crashes because it could derefence a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The targets decides whether the machine combiner should optimize for throughput or latency. - Supports for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328
* Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64Daniel Sanders2016-04-221-11/+1
| | | | | | It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127
* [MachineCombiner] Support for floating-point FMA on ARM64Gerolf Hoflehner2016-04-221-1/+11
| | | | | | | | | | | | | | | | Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The targets decides whether the machine combiner should optimize for throughput or latency. - Supports for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098
* [NFC] Header cleanupMehdi Amini2016-04-181-2/+1
| | | | | | | | | | | | | | Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595
* Minor code cleanup. NFC.Junmo Park2016-02-271-1/+1
| | | | llvm-svn: 262096
* Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"Duncan P. N. Exon Smith2016-02-221-4/+4
| | | | | | | | | | | | | This reverts commit r261510, effectively reapplying r261509. The original commit missed a caller in AArch64ConditionalCompares. Original commit message: Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions. llvm-svn: 261511
* Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"Duncan P. N. Exon Smith2016-02-221-4/+4
| | | | | | | This reverts commit r261509. I'm not sure how this compiled locally, but something was out of whack. llvm-svn: 261510
* CodeGen: Use references in MachineTraceMetrics::Trace, NFCDuncan P. N. Exon Smith2016-02-221-4/+4
| | | | | | | | Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions. llvm-svn: 261509
* less indent; NFCISanjay Patel2015-11-101-46/+47
| | | | llvm-svn: 252643
* add 'MustReduceDepth' as an objective/cost-metric for the MachineCombinerSanjay Patel2015-11-101-29/+53
| | | | | | | | | | | | | | | | | | | | | | This is one of the problems noted in PR25016: https://llvm.org/bugs/show_bug.cgi?id=25016 and: http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html The spilling problem is independent and not addressed by this patch. The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. This is caused by inclusion of the "slack" factor when calculating the critical path of the original code sequence. If we don't add that, then we have a more conservative cost comparison of the old code sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64 MULADD patterns because benchmark regressions were observed without that. The two failing test cases now have identical asm that does what we want: a + b + c + d ---> (a + b) + (c + d) Differential Revision: http://reviews.llvm.org/D13417 llvm-svn: 252616
* replace MachineCombinerPattern namespace and enum with enum class; NFCISanjay Patel2015-11-051-1/+1
| | | | | | | | Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196
* Fix Clang-tidy modernize-use-nullptr warnings in source directories and ↵Hans Wennborg2015-10-061-4/+3
| | | | | | | | | | generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482
* include equal sign in debug equations; NFCSanjay Patel2015-10-031-2/+2
| | | | llvm-svn: 249248
* fix minsize detection: minsize attribute implies optimizing for sizeSanjay Patel2015-08-111-3/+1
| | | | llvm-svn: 244604
* [MachineCombiner] Don't use the opcode-only form of computeInstrLatencyHal Finkel2015-08-051-1/+1
| | | | | | | | | In r242277, I updated the MachineCombiner to work with itineraries, but I missed a call that is scheduling-model-only (the opcode-only form of computeInstrLatency). Using the form that takes an MI* allows this to work with itineraries (and should be NFC for subtargets with scheduling models). llvm-svn: 244020
* wrap OptSize and MinSize attributes for easier and consistent access (NFCI)Sanjay Patel2015-08-041-0/+1
| | | | | | | | | | | | | | | | | Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
* [MachineCombiner] Work with itinerariesHal Finkel2015-07-151-4/+9
| | | | | | | | | | | | MachineCombiner predicated its use of scheduling-based metrics on hasInstrSchedModel(), but useful conclusions can be drawn from pipeline itineraries as well. Almost all of the logic (except for resource tracking in preservesResourceLen) can be used if we have an itinerary, so enable it in that case as well. This will be used by the PowerPC backend in an upcoming commit. llvm-svn: 242277
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* [x86] generalize reassociation optimization in machine combiner to 2 ↵Sanjay Patel2015-06-231-18/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | instructions Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass to reassociate the following sequence to reduce the critical path: A = ? op ? B = A op X C = B op Y --> A = ? op ? B = X op Y C = A op B 'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could be any associative math/logic op (see TODO in code comment). This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth reassociating them. This generalization has a compile-time cost because we can now match more instruction sequences and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't improve the critical path. For example, in the new test case: A = M div N B = A add X C = B add Y We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path: A = M div N B = A add Y C = B add X We need the combiner to reject that pattern but select this: A = M div N B = X add Y C = B add A Differential Revision: http://reviews.llvm.org/D10460 llvm-svn: 240361
* name change: hasPattern() -> getMachineCombinerPatterns() ; NFCSanjay Patel2015-06-191-5/+5
| | | | | | | This was suggested as part of D10460, but it's independent of any functional change. llvm-svn: 240192
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* hoist loop-invariant; NFCISanjay Patel2015-06-131-3/+2
| | | | llvm-svn: 239681
* remove unnecessary casts; NFCISanjay Patel2015-06-131-3/+2
| | | | llvm-svn: 239678
* punctuation policing; NFCSanjay Patel2015-06-101-5/+5
| | | | llvm-svn: 239484
* fix typo in comment; NFCSanjay Patel2015-06-101-1/+1
| | | | llvm-svn: 239478
* fix typo in comment; NFCSanjay Patel2015-05-211-1/+1
| | | | llvm-svn: 237962
* use range-based for-loops; NFCISanjay Patel2015-05-211-4/+2
| | | | llvm-svn: 237918
* CodeGen: Canonicalize access to function attributes, NFCDuncan P. N. Exon Smith2015-02-141-2/+1
| | | | | | | | | | | | | | | | | Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) Also, add `Function::getFnStackAlignment()`, and canonicalize: getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment() llvm-svn: 229208
* remove function names from comments; NFCSanjay Patel2015-01-271-8/+6
| | | | llvm-svn: 227256
* fix typos; NFCSanjay Patel2015-01-271-4/+4
| | | | llvm-svn: 227253
* The subtarget is cached on the MachineFunction. Access it directly.Eric Christopher2015-01-271-2/+1
| | | | llvm-svn: 227173
* Change MCSchedModel to be a struct of statically initialized data.Pete Cooper2014-09-021-3/+3
| | | | | | | | This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour Reviewed by Andy Trick and Chandler C llvm-svn: 216919
* [MachineCombiner] Removal of dangling DBG_VALUES after combining [20598]Gerolf Hoflehner2014-08-131-1/+1
| | | | | | | | This is a cleaner solution to the problem described in r215431. When instructions are combined a dangling DBG_VALUE is removed. This resolves bug 20598. llvm-svn: 215587
* MachineCombiner Pass for selecting faster instruction sequence on AArch64Gerolf Hoflehner2014-08-071-1/+3
| | | | | | | | | | Re-commit of r214832,r21469 with a work-around that avoids the previous problem with gcc build compilers The work-around is to use SmallVector instead of ArrayRef of basic blocks in preservesResourceLen()/MachineCombiner.cpp llvm-svn: 215151
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-2/+2
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* CodeGen: silence a warningSaleem Abdulrasool2014-08-031-2/+1
| | | | | | | | GCC 4.8.2 objects to the tautological condition in the assert as the unsigned value is guaranteed to be >= 0. Simplify the assertion by dropping the tautological condition. llvm-svn: 214671
* MachineCombiner Pass for selecting faster instructionGerolf Hoflehner2014-08-031-0/+434
sequence - target independent framework When the DAGcombiner selects instruction sequences it could increase the critical path or resource len. For example, on arm64 there are multiply-accumulate instructions (madd, msub). If e.g. the equivalent multiply-add sequence is not on the crictial path it makes sense to select it instead of the combined, single accumulate instruction (madd/msub). The reason is that the conversion from add+mul to the madd could lengthen the critical path by the latency of the multiply. But the DAGCombiner would always combine and select the madd/msub instruction. This patch uses machine trace metrics to estimate critical path length and resource length of an original instruction sequence vs a combined instruction sequence and picks the faster code based on its estimates. This patch only commits the target independent framework that evaluates and selects code sequences. The machine instruction combiner is turned off for all targets and expected to evolve over time by gradually handling DAGCombiner pattern in the target specific code. This framework lays the groundwork for fixing rdar://16319955 llvm-svn: 214666
OpenPOWER on IntegriCloud