path: root/llvm/lib/CodeGen/MachineScheduler.cpp
Commit message | Author | Age | Files | Lines
* CodeGen: Refactor renameDisconnectedComponents() as a pass | Matthias Braun | 2016-05-31 | 1 | -10/+2
  Refactor LiveIntervals::renameDisconnectedComponents() to be a pass. Also change the name to "RenameIndependentSubregs":
  - renameDisconnectedComponents() worked on a MachineFunction at a time, so it is a natural candidate for a machine function pass.
  - The algorithm is testable with a .mir test now.
  - This also fixes a problem where the lazy renaming as part of the MachineScheduler introduced IMPLICIT_DEF instructions after the number of nodes in a region was counted, leading to a mismatch.
  Differential Revision: http://reviews.llvm.org/D20507
  llvm-svn: 271345
* MachineScheduler: Introduce ONLY1 reason to improve debug output | Matthias Braun | 2016-05-27 | 1 | -6/+13
  llvm-svn: 271058
* LiveIntervalAnalysis: Fix missing defs in renameDisconnectedComponents(). | Matthias Braun | 2016-05-20 | 1 | -7/+2
  Fix renameDisconnectedComponents() creating vreg uses that can be reached from function begin without having a definition (or explicit live-in). Fix this by inserting IMPLICIT_DEF instructions before control-flow joins as necessary.
  Removes an assert from MachineScheduler because we may now get additional IMPLICIT_DEF instructions when preparing the scheduling policy.
  This fixes the underlying problem of http://llvm.org/PR27705
  llvm-svn: 270259
* CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC | Matthias Braun | 2016-05-10 | 1 | -0/+1
  Many files include Passes.h but only a fraction need to know about the TargetPassConfig class. Move it into its own header. Also rename Passes.cpp to TargetPassConfig.cpp while we are at it.
  llvm-svn: 269011
* Reset the TopRPTracker's position in ScheduleDAGMILive::initQueues | Krzysztof Parzyszek | 2016-04-28 | 1 | -5/+11
  ScheduleDAGMI::initQueues changes the RegionBegin to the first non-debug instruction. Since it does not track register pressure, it does not affect any RP trackers.
  ScheduleDAGMILive inherits initQueues from ScheduleDAGMI, and it does reset the TopRPTracker in its schedule method. Any derived, target-specific scheduler will need to do it as well, but the TopRPTracker is only exposed as a "const" object to derived classes. Without the ability to modify the tracker directly, this leaves a derived scheduler with the potential of having the TopRPTracker out of sync with the CurrentTop.
  The symptom of the problem:
  void llvm::ScheduleDAGMILive::scheduleMI(llvm::SUnit *, bool): Assertion `TopRPTracker.getPos() == CurrentTop && "out of sync"' failed.
  Differential Revision: http://reviews.llvm.org/D19438
  llvm-svn: 267918
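The out-of-sync situation described in this commit can be pictured with a small model. This is only a sketch under stated assumptions: `PressureTracker`, `BaseScheduler` and `LiveScheduler` are hypothetical stand-ins, not LLVM's classes, and positions are plain integers rather than instruction iterators. It shows the general shape of the fix, where the class that owns the tracker resets it inside its own `initQueues`, so that derived schedulers with const-only access cannot drift.

```cpp
#include <cassert>

// Hypothetical stand-in for a register pressure tracker.
class PressureTracker {
  int Pos = 0;

public:
  void reset(int NewPos) { Pos = NewPos; }
  int getPos() const { return Pos; }
};

// Hypothetical base scheduler: knows nothing about register pressure.
class BaseScheduler {
protected:
  int CurrentTop = 0;

public:
  virtual ~BaseScheduler() = default;
  // Moves the region start to the first non-debug instruction.
  virtual void initQueues(int FirstNonDebug) { CurrentTop = FirstNonDebug; }
  int currentTop() const { return CurrentTop; }
};

// Hypothetical pressure-aware scheduler that owns the tracker.
class LiveScheduler : public BaseScheduler {
  PressureTracker TopRPTracker; // derived classes only get const access

public:
  void initQueues(int FirstNonDebug) override {
    BaseScheduler::initQueues(FirstNonDebug);
    TopRPTracker.reset(CurrentTop); // keep tracker and CurrentTop in step
  }
  const PressureTracker &getTopRPTracker() const { return TopRPTracker; }
};
```

With the reset placed in the owning class, a target-specific subclass that calls `initQueues` gets a consistent tracker without needing write access to it.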
* Re-commit optimization bisect support (r267022) without new pass manager support. | Andrew Kaylor | 2016-04-22 | 1 | -2/+2
  The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).
  Differential Revision: http://reviews.llvm.org/D19172
  llvm-svn: 267231
* MachineScheduler: Move code to initialize a Candidate out of tryCandidate(); NFC | Matthias Braun | 2016-04-22 | 1 | -33/+33
  llvm-svn: 267191
* MachineScheduler: Limit the size of the ready list. | Matthias Braun | 2016-04-22 | 1 | -1/+10
  Avoid quadratic complexity in unusually large basic blocks by limiting the size of the ready lists.
  Differential Revision: http://reviews.llvm.org/D19349
  llvm-svn: 267189
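Why a size limit removes the quadratic behaviour can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `Node`, `BoundedReadyQueue` and the "smallest Id wins" heuristic are hypothetical. Each pick scans the ready list linearly, so if the list can grow with the block size the total cost is quadratic; capping it and parking the overflow in a pending list bounds the cost of every pick.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a scheduling unit; not LLVM's SUnit.
struct Node {
  unsigned Id;
};

class BoundedReadyQueue {
  std::size_t Limit;
  std::vector<Node> Ready;  // scanned linearly on every pick
  std::deque<Node> Pending; // overflow, in release order

public:
  explicit BoundedReadyQueue(std::size_t Limit) : Limit(Limit) {}

  // A released node becomes ready only while the ready list has room.
  void release(Node N) {
    if (Ready.size() < Limit)
      Ready.push_back(N);
    else
      Pending.push_back(N);
  }

  // Picks the ready node with the smallest Id (placeholder heuristic),
  // then refills the freed slot from the pending list.
  bool pick(Node &Out) {
    if (Ready.empty())
      return false;
    std::size_t Best = 0;
    for (std::size_t I = 1; I < Ready.size(); ++I)
      if (Ready[I].Id < Ready[Best].Id)
        Best = I;
    Out = Ready[Best];
    Ready.erase(Ready.begin() + Best);
    if (!Pending.empty()) {
      Ready.push_back(Pending.front());
      Pending.pop_front();
    }
    return true;
  }

  std::size_t readySize() const { return Ready.size(); }
  std::size_t pendingSize() const { return Pending.size(); }
};
```

The trade-off is that the best candidate overall may be sitting in the pending list, so a cap exchanges some schedule quality in very large blocks for predictable compile time.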
* Revert "Initial implementation of optimization bisect support." | Vedant Kumar | 2016-04-22 | 1 | -2/+2
  This reverts commit r267022, due to an ASan failure:
  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549
  llvm-svn: 267115
* Initial implementation of optimization bisect support. | Andrew Kaylor | 2016-04-21 | 1 | -2/+2
  This patch implements an optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.
  The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an additional OptBisect method, but this is not yet used.
  The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way.
  Differential Revision: http://reviews.llvm.org/D19172
  llvm-svn: 267022
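The skip check described above can be modelled in a few lines. This is a minimal sketch, assuming a simple running counter: `BisectCounter` and `skipFunction` are hypothetical names and do not reproduce the OptBisect interface. Every skippable pass asks the counter whether it may run, passes numbered beyond the limit are skipped, and the optnone attribute is honoured in the same call.

```cpp
#include <cassert>
#include <string>

// Hypothetical counter modelled on the commit description.
class BisectCounter {
  int Limit;       // negative means bisection is disabled
  int LastNum = 0; // number of skippable passes seen so far

public:
  explicit BisectCounter(int Limit) : Limit(Limit) {}

  // Returns true if the pass with the next sequence number may run.
  bool shouldRun(const std::string &PassName) {
    (void)PassName; // a real tool would log the name next to the number
    if (Limit < 0)
      return true;
    ++LastNum;
    return LastNum <= Limit;
  }
};

// Combined check: skip if past the bisect limit or if the function is optnone.
bool skipFunction(BisectCounter &Bisect, bool HasOptNone,
                  const std::string &PassName) {
  if (!Bisect.shouldRun(PassName))
    return true;
  return HasOptNone;
}
```

Because the numbering is deterministic for a given input, a failing test can be bisected by repeating the compile with different limits until the first pass that introduces the failure is found.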
* MachineSched: Cleanup; NFC | Matthias Braun | 2016-04-21 | 1 | -32/+16
  llvm-svn: 266946
* [NFC] Header cleanup | Mehdi Amini | 2016-04-18 | 1 | -1/+0
  Removed some unused headers, replaced some headers with forward class declarations.
  Found using simple scripts like this one:
  clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
  Patch by Eugene Kosov <claprix@yandex.ru>
  Differential Revision: http://reviews.llvm.org/D19219
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 266595
* [MachineScheduler]Add support for store clustering | Jun Bum Lim | 2016-04-15 | 1 | -35/+60
  Perform store clustering just like load clustering. This change adds StoreClusterMutation to the machine scheduler. To control StoreClusterMutation, enableClusterStores() was added to TargetInstrInfo.h. This is enabled only on AArch64 for now.
  This change also adds support for unscaled stores, which were not handled in getMemOpBaseRegImmOfs().
  llvm-svn: 266437
* MachineScheduler: Ignore COPYs with undef/dead op in CopyConstrain mutation. | Matthias Braun | 2016-04-04 | 1 | -4/+6
  There is no problem with the code today, but the fix will avoid a crash in test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll once the DetectDeadLanes pass is added.
  llvm-svn: 265351
* [misched] Fix a truncation issue from r263021. | Chad Rosier | 2016-03-11 | 1 | -1/+1
  The truncation was causing the sorting algorithm to behave oddly when comparing positive and negative offsets. Fortunately, this doesn't currently happen in practice and was exposed by a WIP. Thus, I can't test this change now, but the follow-on patch will.
  llvm-svn: 263255
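The effect of such a truncation on sorting is easy to reproduce in isolation. The sketch below is illustrative only: the commit does not say which types were involved, so the narrowing to `uint32_t` and the function names `sortTruncated` and `sortFullWidth` are assumptions. It shows how a negative offset, once squeezed into a narrower unsigned type, compares as a very large value and ends up after the positive offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Buggy ordering: the signed 64-bit offset is narrowed to an unsigned
// 32-bit value before comparing, so -8 is seen as 4294967288.
std::vector<int64_t> sortTruncated(std::vector<int64_t> Offsets) {
  std::sort(Offsets.begin(), Offsets.end(), [](int64_t A, int64_t B) {
    return static_cast<uint32_t>(A) < static_cast<uint32_t>(B);
  });
  return Offsets;
}

// Fixed ordering: compare the offsets at full signed width.
std::vector<int64_t> sortFullWidth(std::vector<int64_t> Offsets) {
  std::sort(Offsets.begin(), Offsets.end());
  return Offsets;
}
```

For memory-operation clustering the order matters because neighbours in the sorted list are the candidates for being placed next to each other, so a misplaced negative offset would be paired with the wrong access.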
* [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC. | Chad Rosier | 2016-03-09 | 1 | -2/+2
  http://reviews.llvm.org/D17967
  llvm-svn: 263021
* Add DAG mutation interface to the post-RA scheduler | Krzysztof Parzyszek | 2016-03-05 | 1 | -6/+11
  Differential Revision: http://reviews.llvm.org/D17868
  llvm-svn: 262774
* CodeGen: Update LiveIntervalAnalysis API to use MachineInstr&, NFC | Duncan P. N. Exon Smith | 2016-02-27 | 1 | -1/+1
  These parameters aren't expected to be null, so take them by reference.
  llvm-svn: 262151
* CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC | Duncan P. N. Exon Smith | 2016-02-27 | 1 | -9/+8
  Take MachineInstr by reference instead of by pointer in SlotIndexes and the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are never null, so this cleans up the API a bit. It also incidentally removes a few implicit conversions from MachineInstrBundleIterator to MachineInstr* (see PR26753).
  At a couple of call sites it was convenient to convert to a range-based for loop over MachineBasicBlock::instr_begin/instr_end, so I added MachineBasicBlock::instrs.
  llvm-svn: 262115
* MachineScheduler: Add a command line option to disable post scheduler. | Chad Rosier | 2016-01-20 | 1 | -1/+9
  llvm-svn: 258364
* MachineScheduler: Honor optnone functions in the pre-ra scheduler. | Chad Rosier | 2016-01-20 | 1 | -0/+3
  llvm-svn: 258363
* MachineScheduler: Allow independent scheduling of sub register defs | Matthias Braun | 2016-01-20 | 1 | -43/+104
  Note that this is disabled by default and still requires a patch to handleMove() which is not upstreamed yet.
  If the TrackLaneMasks policy/strategy is enabled, the MachineScheduler will build a schedule graph where definitions of independent subregisters are no longer serialised.
  Implementation comments:
  - Without lane mask tracking a sub register def also counts as a use (except for the first one with the read-undef flag set); with lane mask tracking enabled this is no longer the case.
  - Pressure Diffs were previously maintained per definition of a vreg with the help of the SSA information contained in the LiveIntervals. With lanemask tracking enabled we cannot do this anymore and instead change the pressure diffs for all uses of the vreg as it becomes live/dead. For this changed style to work correctly we ignore uses of instructions that define the same register again: they won't affect register pressure.
  - With lanemask tracking we remove all read-undef flags from sub register defs when building the graph and re-add them later when all vreg lanes have become dead.
  Differential Revision: http://reviews.llvm.org/D14969
  llvm-svn: 258259
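The first implementation comment boils down to an overlap test on lane masks. The following sketch assumes a 32-bit mask with one bit per lane; `LaneMask` and `needsOrdering` are hypothetical names, not the scheduler's API. Without lane tracking every subregister def is treated as touching the whole virtual register, so two defs are always ordered; with tracking they are ordered only when their lanes overlap.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical lane mask: one bit per independently writable lane.
using LaneMask = uint32_t;

// Returns true if two defs of the same virtual register must stay ordered.
bool needsOrdering(LaneMask DefA, LaneMask DefB, bool TrackLaneMasks) {
  // Without tracking, a subregister def also counts as a use of the whole
  // register, so any two defs are serialised.
  if (!TrackLaneMasks)
    return true;
  // With tracking, only defs that share at least one lane conflict.
  return (DefA & DefB) != 0;
}
```

Under this model, defs of disjoint lanes such as `0x1` and `0x2` get no dependency edge between them, which is what allows the scheduler to reorder them freely.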
* RegisterPressure: Make liveness tracking subregister aware | Matthias Braun | 2016-01-20 | 1 | -12/+12
  Differential Revision: http://reviews.llvm.org/D14968
  llvm-svn: 258258
* MachineScheduler: Add a target hook for deciding which RegPressure sets to increase | Tom Stellard | 2015-12-16 | 1 | -7/+19
  Summary: This patch adds a function called getRegPressureSetScore() to TargetRegisterInfo. The MachineScheduler uses this when comparing instructions that increase the register pressure of different sets to determine which set is safer to increase.
  This hook is useful for GPU targets where the number of registers in the class is not the best metric for determining which pressure set is safer to increase.
  Future work may include adding more parameters to this function, for example the current pressure level of the set or the amount by which the pressure will be increased or decreased.
  Reviewers: qcolombet, escha, arsenm, atrick, MatzeB
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D14806
  llvm-svn: 255795
* MachineScheduler: Print initial pressure in debug dump | Matthias Braun | 2015-11-13 | 1 | -0/+7
  llvm-svn: 253097
* MachineScheduler: Improve debug output for "only one node in readyset" | Matthias Braun | 2015-11-13 | 1 | -2/+2
  When there is only 1 node left in the ready queue and it is picked, call the reason "ONLY1" instead of "NOCAND".
  llvm-svn: 253096
* MachineScheduler: Add regpressure information to debug dump | Matthias Braun | 2015-11-06 | 1 | -6/+30
  llvm-svn: 252340
* ScheduleDAGInstrs: Remove IsPostRA flag; NFC | Matthias Braun | 2015-11-03 | 1 | -16/+14
  ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function.
  The order of the LiveIntervals* and bool RemoveKillFlags parameters has been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default-initialized) RemoveKillFlags.
  Differential Revision: http://reviews.llvm.org/D14245
  llvm-svn: 251883
* Revert "ScheduleDAGInstrs: Remove IsPostRA flag" | Matthias Braun | 2015-10-29 | 1 | -14/+16
  It broke 3 arm testcases.
  This reverts commit r251608.
  llvm-svn: 251615
* MachineScheduler: Fix typo in debug message | Matthias Braun | 2015-10-29 | 1 | -1/+1
  Maybe I just missed the humor there ;-)
  llvm-svn: 251609
* ScheduleDAGInstrs: Remove IsPostRA flag | Matthias Braun | 2015-10-29 | 1 | -16/+14
  This was a layering violation in ScheduleDAGInstrs (and MachineSchedulerBase): neither should know directly whether it is used by the PostMachineScheduler or the MachineScheduler.
  llvm-svn: 251608
* MachineScheduler: Use ranged for and slightly simplify the code | Matthias Braun | 2015-10-29 | 1 | -11/+12
  llvm-svn: 251607
* Make the SelectionDAG graph printer use SDNode::PersistentId labels. | James Y Knight | 2015-10-27 | 1 | -5/+0
  r248010 changed the -debug output to use short ids, but did not similarly modify the graph printer.
  Change to be consistent, for ease of cross-reference.
  llvm-svn: 251465
* MachineScheduler: Add a way to disable the 'ReduceLatency' heuristic | Matthias Braun | 2015-10-22 | 1 | -2/+2
  llvm-svn: 251037
* CodeGen: Continue removing ilist iterator implicit conversions | Duncan P. N. Exon Smith | 2015-10-09 | 1 | -5/+5
  llvm-svn: 249884
* Make MachineScheduler debug output less confusing. | James Y Knight | 2015-09-18 | 1 | -5/+26
  At least...a little bit.
  llvm-svn: 248020
* Revert "(HEAD -> master, origin/master, origin/HEAD) RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker" | Matthias Braun | 2015-09-17 | 1 | -4/+4
  This reverts commit r247943.
  Accidental commit, code review was not finished yet.
  llvm-svn: 247945
* RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker | Matthias Braun | 2015-09-17 | 1 | -4/+4
  Differential Revision: http://reviews.llvm.org/D12814
  llvm-svn: 247943
* MachineScheduler: Provide an option for node hiding cutoff and disable it by default | Matthias Braun | 2015-09-17 | 1 | -1/+9
  llvm-svn: 247942
* [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible with the new pass manager, and no longer relying on analysis groups. | Chandler Carruth | 2015-09-09 | 1 | -3/+3
  This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows:
  - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function.
  - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure.
  - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager.
  - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function.

  All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly.

  The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation.

  This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exceptions to this rule are LoopPass instances, which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass.

  Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself.

  One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loopholes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state.

  Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller.

  Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first.

  Differential Revision: http://reviews.llvm.org/D12080
  llvm-svn: 247167
* Fix three typos in comments; "easilly" -> "easily". | Nick Lewycky | 2015-08-18 | 1 | -1/+1
  llvm-svn: 245379
* MachineScheduler: Restrict macroop fusion to data-dependent instructions. | Matthias Braun | 2015-07-20 | 1 | -9/+33
  Before creating a schedule edge to encourage MacroOpFusion check that:
  - The predecessor actually writes a register that the branch reads.
  - The predecessor has no successors in the ScheduleDAG so we can schedule it in front of the branch.
  This avoids skewing the scheduling heuristic in cases where macroop fusion cannot happen.
  Differential Revision: http://reviews.llvm.org/D10745
  llvm-svn: 242723
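The two conditions listed in this commit can be written as a single predicate. This is a rough sketch under assumptions, not the scheduler's code: `Instr` and `worthFusionEdge` are hypothetical, registers are plain integers, and `NumOtherSuccs` is assumed to count DAG successors other than the branch itself.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction record.
struct Instr {
  std::vector<unsigned> Defs; // registers written
  std::vector<unsigned> Uses; // registers read
  unsigned NumOtherSuccs;     // DAG successors other than the branch (assumed)
};

// Returns true if adding a fusion edge from Pred to Branch is worthwhile.
bool worthFusionEdge(const Instr &Pred, const Instr &Branch) {
  // Condition 2: Pred must be free to move directly in front of the branch.
  if (Pred.NumOtherSuccs != 0)
    return false;
  // Condition 1: Pred must write a register that the branch reads.
  for (unsigned Reg : Pred.Defs)
    if (std::find(Branch.Uses.begin(), Branch.Uses.end(), Reg) !=
        Branch.Uses.end())
      return true;
  return false;
}
```

Rejecting candidates early keeps the heuristic from pulling unrelated instructions towards the branch when no fusion could take place.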
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) | Alexander Kornienko | 2015-06-23 | 1 | -3/+3
  Apparently, the style needs to be agreed upon first.
  llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFC | Alexander Kornienko | 2015-06-19 | 1 | -3/+3
  The patch is generated using this command:
  tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
    -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
    llvm/lib/
  Thanks to Eugene Kosov for the original patch!
  llvm-svn: 240137
* Fix "the the" in comments. | Eric Christopher | 2015-06-19 | 1 | -1/+1
  llvm-svn: 240112
* [TargetInstrInfo] Rename getLdStBaseRegImmOfs and implement for x86. | Sanjoy Das | 2015-06-15 | 1 | -1/+1
  Summary: Rename TargetInstrInfo::getLdStBaseRegImmOfs to TargetInstrInfo::getMemOpBaseRegImmOfs and implement for x86. The implementation only handles a few easy cases now and will be made more sophisticated in the future.
  This is NFCI: the only user of `getLdStBaseRegImmOfs` (now `getMemOpBaseRegImmOfs`) is `LoadClusterMutation` and `LoadClusterMutation` is disabled for x86.
  Reviewers: reames, ab, MatzeB, atrick
  Reviewed By: MatzeB, atrick
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D10199
  llvm-svn: 239741
* Rename TargetSubtargetInfo::enablePostMachineScheduler() to enablePostRAScheduler() | Matthias Braun | 2015-06-13 | 1 | -1/+1
  r213101 changed the behaviour of this method to not only affect the PostMachineScheduler scheduler but also the PostRAScheduler scheduler, renaming should make this fact clear.
  Also document that the preferred way is to specify this in the scheduling model instead of overriding this method.
  Differential Revision: http://reviews.llvm.org/D10427
  llvm-svn: 239659
* MachineScheduler debug output clarity. | Andrew Trick | 2015-05-17 | 1 | -2/+3
  llvm-svn: 237545
* RegisterPressureTracker: reword stale comments. | Andrew Trick | 2015-05-17 | 1 | -2/+1
  llvm-svn: 237544
* Complete the MachineScheduler fix made way back in r210390. | Andrew Trick | 2015-03-27 | 1 | -2/+2
  "Fix the MachineScheduler's logic for updating ready times for in-order. Now the scheduler updates a node's ready time as soon as it is scheduled, before releasing dependent nodes."
  This fix was only made in one variant of the ScheduleDAGMI driver. Francois de Ferriere reported the issue in the other bit of code where it was also needed. I never got around to coming up with a test case, but it's an obvious fix that shouldn't be delayed any longer. I'll try to refactor this code a little better.
  I did verify performance on a wide variety of targets and saw no negative impact with this fix.
  llvm-svn: 233366