summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/ModuloSchedule.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [ModuloSchedule] Do not remap terminatorsJames Molloy2019-10-041-1/+1
| | | | | | | | | | This is a trivial point fix. Terminator instructions aren't scheduled, so we shouldn't expect to be able to remap them. This doesn't affect Hexagon and PPC because their terminators are always hardware loop backbranches that have no register operands. llvm-svn: 373762
* [ModuloSchedule] removeBranch() *before* creating the trip count conditionJames Molloy2019-10-031-2/+1
| | | | | | | | | | | | | | The Hexagon code assumes there's no existing terminator when inserting its trip count condition check. This causes swp-stages5.ll to break. The generated code looks good to me, it is likely a permutation. I have disabled the new codegen path to keep everything green and will investigate along with the other 3-4 tests that have different codegen. Fixes expensive-checks build. llvm-svn: 373629
* [ModuloSchedule] Peel out prologs and epilogs, generate actual codeJames Molloy2019-10-021-0/+262
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This extends the PeelingModuloScheduleExpander to generate prolog and epilog code, and correctly stitch uses through the prolog, kernel, epilog DAG. The key concept in this patch is to ensure that all transforms are *local*; only a function of a block and its immediate predecessor and successor. By defining the problem in this way we can inductively rewrite the entire DAG using only local knowledge that is easy to reason about. For example, we assume that all prologs and epilogs are near-perfect clones of the steady-state kernel. This means that if a block has an instruction that is predicated out, we can redirect all users of that instruction to that equivalent instruction in our immediate predecessor. As all blocks are clones, every instruction must have an equivalent in every other block. Similarly we can make the assumption by construction that if a value defined in a block is used outside that block, the only possible user is its immediate successors. We maintain this even for values that are used outside the loop by creating a limited form of LCSSA. This code isn't small, but it isn't complex. Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet; I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns that we don't produce. Those still need a bit more investigation. In the meantime we (Google) are happy with the code produced by this on our downstream SMS implementation, and believe it generates correct code. Subscribers: mgorny, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68205 llvm-svn: 373462
* [ModuloSchedule] KernelRewriter::rewrite - silence static analyzer ↵Simon Pilgrim2019-09-241-0/+1
| | | | | | | | dyn_cast<> null dereference warning. NFCI. Assert that we've found the start of the MI schedule list. llvm-svn: 372723
* [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCountJames Molloy2019-09-211-23/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recommit: fix asan errors. The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372463
* Revert "[MachinePipeliner] Improve the TargetInstrInfo API ↵Mitch Phillips2019-09-201-17/+23
| | | | | | | | | | | analyzeLoop/reduceLoopCount" This commit broke the ASan buildbot. See comments in rL372376 for more information. This reverts commit 15e27b0b6d9d51362fad85dbe95ac5b3fadf0a06. llvm-svn: 372425
* [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCountJames Molloy2019-09-201-23/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372376
* Hide implementation details in namespaces.Benjamin Kramer2019-09-171-0/+2
| | | | llvm-svn: 372113
* [ModuloSchedule] Fix no-asserts buildJames Molloy2019-09-041-5/+8
| | | | | | Apologies, due to a git SNAFU this fix (dump doesn't exist and silence unused variables) stayed in my index rather than applying to rL370893. llvm-svn: 370894
* [ModuloSchedule] Introduce PeelingModuloScheduleExpanderJames Molloy2019-09-041-4/+461
| | | | | | | | | | | | | | | This is the beginnings of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and then peeling out the prolog and epilogs. This patch implements kernel generation as well as a validator that will confirm the number of phis added is the same as the ModuloScheduleExpander. Prolog and epilog peeling will come in a different patch. Differential Revision: https://reviews.llvm.org/D67081 llvm-svn: 370893
* [MachinePipeliner] Add a way to unit-test the schedule emitterJames Molloy2019-09-031-0/+114
| | | | | | | | | | | | | | | | | | Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself. One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway. This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at. We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with: llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir And run the emission in isolation with: llc < test.mir -run-pass=modulo-schedule-test llvm-svn: 370705
* [MachinePipeliner] Separate schedule emission, NFCJames Molloy2019-08-301-0/+1190
This is the first stage in refactoring the pipeliner and making it more accessible for backends to override and control. This separates the logic and state required to *emit* a scheudule from the logic that *computes* and validates a schedule. This will enable (a) new schedule emitters and (b) new modulo scheduling implementations to coexist. NFC. Differential Revision: https://reviews.llvm.org/D67006 llvm-svn: 370500
OpenPOWER on IntegriCloud