summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/MachineSink.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [MachineSink] Improve runtime performance. NFC.Arnaud A. de Grandmaison2015-06-151-35/+59
| | | | | | | | | | | | This patch fixes a compilation time issue, when MachineSink faces PHIs with a huge number of operands. This can happen for example in goto table based interpreters, where some basic blocks can have several of those PHIs, each one with several hundreds operands. MachineSink was spending a significant time re-building and re-sorting the list of successors of the current MachineBasicBlock. The computing and sorting of the current MachineBasicBlock successors is now cached. llvm-svn: 239720
* Disable MachineSink on convergent operations, similar to how IR Sink isOwen Anderson2015-06-011-0/+4
| | | | | | | restricted. No test because no in-tree target currently has convergent MachineInstr's. llvm-svn: 238763
* MachineInstr: Remove unused parameter.Matthias Braun2015-05-191-2/+2
| | | | llvm-svn: 237726
* MachineSink: Collect registers before clearing their killflags.Matthias Braun2015-05-161-1/+10
| | | | | | | | | | | | | | | | Currently whenever we sink any instruction, we do clearKillFlags for every use of every use operand for that instruction, apparently there are a lot of duplication, therefore compile time penalties. This patch collect all the interested registers first, do clearKillFlags for it all together at once at the end, so we only need to do clearKillFlags once for one register, duplication is avoided. Patch by Lawrence Hu! Differential Revision: http://reviews.llvm.org/D9719 llvm-svn: 237510
* Clear kill flags on all used registers when sinking instructions.Pete Cooper2015-05-081-1/+7
| | | | | | | | | | | | | The test here was sinking the AND here to a lower BB: %vreg7<def> = ANDWri %vreg8, 0; GPR32common:%vreg7,%vreg8 TBNZW %vreg8<kill>, 0, <BB#1>; GPR32common:%vreg8 which meant that vreg8 was read after it was killed. This commit changes the code from clearing kill flags on the AND to clearing flags on all registers used by the AND. llvm-svn: 236886
* 80 cols fix since i'm looking at this function anyway. NFCPete Cooper2015-05-081-1/+2
| | | | llvm-svn: 236885
* Use DomTree in MachineSink to sink over diamonds.Patrik Hagglund2014-12-041-15/+19
| | | | | | | | | | | | | | | | | | | | | According to a previous FIXME comment we now not only look at MBB successors, but also handle code sinking past them: x = computation if () {} else {} use x The instruction could be sunk over the whole diamond for the if/then/else (or loop, etc), allowing it to be sunk into other blocks after that. Modified test added in r204522, due to one spill less present. Minor fixes in comments. Patch provided by Jonas Paulsson. Reviewed by Hal Finkel. llvm-svn: 223350
* Update SetVector to rely on the underlying set's insert to return a ↵David Blaikie2014-11-191-1/+1
| | | | | | | | | | | | | pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334
* [MachineSink] Use the real post dominator treeJingyue Wu2014-10-151-21/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source. Thanks Alexey Volkov for reporting PR21115! Test Plan: Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking. Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115. Updated an affected test in X86. Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread. Reviewers: Jiangning, resistor, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D5633 llvm-svn: 219773
* Access subtarget specific variables off of the MachineFunction'sEric Christopher2014-10-141-4/+2
| | | | | | cached subtarget and not the TargetMachine. llvm-svn: 219668
* Revert r216862 due to a performance regressionJingyue Wu2014-10-011-9/+21
| | | | | | Reported by Alexey Volkov in PR21115 llvm-svn: 218771
* [MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfoBruno Cardoso Lopes2014-09-251-6/+23
| | | | | | | | | | | | | | | Machine Sink uses loop depth information to select between successors BBs to sink machine instructions into, where BBs within smaller loop depths are preferable. This patch adds support for choosing between successors by using profile information from BlockFrequencyInfo instead, whenever the information is available. Tested it under SPEC2006 train (average of 30 runs for each program); ~1.5% execution speedup in average on x86-64 darwin. <rdar://problem/18021659> llvm-svn: 218472
* [MachineSinking] Conservatively clear kill flags after coalescing.Patrik Hagglund2014-09-091-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This solves the problem of having a kill flag inside a loop with a definition of the register prior to the loop: %vreg368<def> ... Inside loop: %vreg520<def> = COPY %vreg368 %vreg568<def,tied1> = add %vreg341<tied0>, %vreg520<kill> => was coalesced into => %vreg568<def,tied1> = add %vreg341<tied0>, %vreg368<kill> MachineVerifier then complained: *** Bad machine code: Virtual register killed in block, but needed live out. *** The kill flag for %vreg368 is incorrect, and is cleared by this patch. This is similar to the clearing done at the end of MachineSinking::SinkInstruction(). Patch provided by Jonas Paulsson. Reviewed by Quentin Colombet and Juergen Ributzka. llvm-svn: 217427
* Revert r216803 "[MachineSinking] Clear kill flag of all operands at all ↵Juergen Ributzka2014-09-041-13/+3
| | | | | | | | | their uses." This reverts commit r216803, because it might have broken the buildbot. The issue is tracked in PR20842. llvm-svn: 217120
* [MachineSink] Use the real post dominator treeJingyue Wu2014-09-011-21/+9
| | | | | | | | | | | | | | | | | | | | | | | Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. Test Plan: Added a NVPTX codegen test to verify that our change is in effect. It also shows the unnecessary register pressure caused by over-sinking. Updated affected tests in AArch64 and X86. Reviewers: eliben, meheff, Jiangning Reviewed By: Jiangning Subscribers: jholewinski, aemerson, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4814 llvm-svn: 216862
* [MachineSinking] Clear kill flag of all operands at all their uses.Juergen Ributzka2014-08-291-3/+13
| | | | | | | | | | | | | | | | | | | | | | | When sinking an instruction it might be moved past the original last use of one of its operands. This last use has the kill flag set and the verifier will obviously complain about this. Before Machine Sinking (AArch64): %vreg3<def> = ASRVXr %vreg1, %vreg2<kill> %XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def> ... After Machine Sinking: %XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def> ... %vreg3<def> = ASRVXr %vreg1, %vreg2<kill> This fix clears all the kill flags in all instruction that use the same operands as the instruction that is being sunk. This fixes rdar://problem/18180996. llvm-svn: 216803
* [MachineSink] Improve the compile time by preserving the dominance informationQuentin Colombet2014-08-111-39/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | as long as possible. ** Context ** Each time the dominance information is modified, the dominator tree analysis switches in a slow query mode. After a few queries without any modification on the dominator tree, it performs an expensive update of its internal structure to provide fast queries again. ** Problem ** Prior to this patch, the MachineSink pass was splitting the critical edges on demand while relying heavy on the dominator tree information. In some cases, this leads to pathological behavior where: - We end up in the slow query mode right after splitting an edge. - We update the dominance information. - We break the dominance information again, thus ending up in the slow query mode and so on. ** Proposed Solution ** To mitigate this effect, this patch postpones all the splitting of the edges at the end of each iteration of the main loop. The benefits are: - The dominance information is valid for the life time of an iteration. - This simplifies the code as we do not have to special treat instructions that are sunk on critical edges. Indeed, the related block will be available through the next iteration. The downside is that when edges splitting is required, this incurs an additional iteration of the main loop compared to the previous scheme. ** Performance ** Thanks to this patch, the motivating example compiles in 6+ minutes instead of 10+ minutes. No test case added as the motivating example as nothing special but being huge! I have measured only noise for both the compile time and the runtime on the llvm test-suite + SPECs with Os and O3. Note: The current implementation of MachineBasicBlock::SplitCriticalEdge also uses the dominance information and therefore, hits this problem. A subsequent patch will address that. <rdar://problem/17894619> llvm-svn: 215410
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-2/+3
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* Add TargetInstrInfo interface isAsCheapAsAMove.Jiangning Liu2014-07-291-1/+1
| | | | llvm-svn: 214158
* [Modules] Remove potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837
* [C++11] More 'nullptr' conversion. In some cases just using a boolean check ↵Craig Topper2014-04-141-17/+17
| | | | | | instead of comparing to nullptr. llvm-svn: 206142
* Disable each MachineFunctionPass for 'optnone' functions, unless thatPaul Robinson2014-03-311-0/+3
| | | | | | | pass normally runs at optimization level None, or is part of the register allocation pipeline. llvm-svn: 205228
* Switch a number of loops in lib/CodeGen over to range-based for-loops, now thatOwen Anderson2014-03-171-16/+11
| | | | | | the MachineRegisterInfo iterators are compatible with it. llvm-svn: 204075
* Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changingOwen Anderson2014-03-131-4/+4
| | | | | | | | | | operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865
* [C++11] Add 'override' keyword to virtual methods that override their base ↵Craig Topper2014-03-071-3/+3
| | | | | | class. llvm-svn: 203220
* Now that we have C++11, turn simple functors into lambdas and remove a ton ↵Benjamin Kramer2014-03-011-11/+6
| | | | | | | | of boilerplate. No intended functionality change. llvm-svn: 202588
* MachineSink: Fix and tweak critical-edge breaking heuristic.Will Dietz2013-10-141-7/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per original comment, the intention of this loop is to go ahead and break the critical edge (in order to sink this instruction) if there's reason to believe doing so might "unblock" the sinking of additional instructions that define registers used by this one. The idea is that if we have a few instructions to sink "together" breaking the edge might be worthwhile. This commit makes a few small changes to help better realize this goal: First, modify the loop to ignore registers defined by this instruction. We don't sink definitions of physical registers, and sinking an SSA definition isn't going to unblock an upstream instruction. Second, ignore uses of physical registers. Instructions that define physical registers are rejected for sinking, and so moving this one won't enable moving any defining instructions. As an added bonus, while virtual register use-def chains are generally small due to SSA goodness, iteration over the uses and definitions (used by hasOneNonDBGUse) for physical registers like EFLAGS can be rather expensive in practice. (This is the original reason for looking at this) Finally, to keep things simple continue to only consider this trick for registers that have a single use (via hasOneNonDBGUse), but to avoid spuriously breaking critical edges only do so if the definition resides in the same MBB and therefore this one directly blocks it from being sunk as well. If sinking them together is meant to be, let the iterative nature of this pass sink the definition into this block first. Update tests to accomodate this change, add new testcase where sinking avoids pipeline stalls. llvm-svn: 192608
* Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector ↵Craig Topper2013-07-141-1/+1
| | | | | | size. llvm-svn: 186274
* Use SmallVectorImpl instead of SmallVector for iterators and references to ↵Craig Topper2013-07-031-3/+3
| | | | | | avoid specifying the vector size unnecessarily. llvm-svn: 185512
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-7/+7
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* Remove unused BitVectors from getAllocatableSet().Jakob Stoklund Olesen2012-10-161-2/+0
| | | | llvm-svn: 165999
* MachineSink: Sort the successors before trying to find SuccToSinkTo.Manman Ren2012-07-311-1/+1
| | | | | | | | Use stable_sort instead of sort. Follow-up to r161062. rdar://11980766 llvm-svn: 161075
* MachineSink: Sort the successors before trying to find SuccToSinkTo.Manman Ren2012-07-311-2/+15
| | | | | | | | | | One motivating example is to sink an instruction from a basic block which has two successors: one outside the loop, the other inside the loop. We should try to sink the instruction outside the loop. rdar://11980766 llvm-svn: 161062
* Codegen pass definition cleanup. No functionality.Andrew Trick2012-02-081-2/+1
| | | | | | | | | | | | | Moving toward a uniform style of pass definition to allow easier target configuration. Globally declare Pass ID. Globally declare pass initializer. Use INITIALIZE_PASS consistently. Add a call to the initializer from CodeGen.cpp. Remove redundant "createPass" functions and "getPassName" methods. While cleaning up declarations, cleaned up comments (sorry for large diff). llvm-svn: 150100
* whitespaceAndrew Trick2012-02-081-5/+5
| | | | llvm-svn: 150094
* Extract method for detecting constant unallocatable physregs.Jakob Stoklund Olesen2012-01-161-14/+1
| | | | | | It is safe to move uses of such registers. llvm-svn: 148259
* Do not sink instruction, if it is not profitable.Devang Patel2011-12-141-13/+76
| | | | | | | | On ARM, peephole optimization for ABS creates a trivial cfg triangle which tempts machine sink to sink instructions in code which is really straight line code. Sometimes this sinking may alter register allocator input such that use and def of a reg is divided by a branch in between, which may result in extra spills. Now mahine sink avoids sinking if final sink destination is post dominator. Radar 10266272. llvm-svn: 146604
* Fix comment.Devang Patel2011-12-091-2/+1
| | | | llvm-svn: 146226
* Update stale comment.Devang Patel2011-12-091-4/+1
| | | | llvm-svn: 146220
* Revert r146184. I am seeing performance regression cause by this patch in ↵Devang Patel2011-12-081-10/+11
| | | | | | one test case. llvm-svn: 146205
* Refactor. No intentional functionality change.Devang Patel2011-12-081-29/+41
| | | | llvm-svn: 146187
* Filter "sink to" candidate blocks sooner. This avoids unnecessary ↵Devang Patel2011-12-081-11/+13
| | | | | | computation to determine whether the block dominates all uses or not. llvm-svn: 146184
* Add bundle aware API for querying instruction properties and switch the codeEvan Cheng2011-12-071-1/+1
| | | | | | | | | | | | | | generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if *all* of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026
* While sinking machine instructions, sink matching DBG_VALUEs also otherwise ↵Devang Patel2011-09-071-0/+31
| | | | | | live debug variable pass will drop DBG_VALUEs on the floor. llvm-svn: 139208
* Fix a couple of places where changes are made but not tracked.Evan Cheng2011-04-111-1/+4
| | | | llvm-svn: 129287
* Get rid of static constructors for pass registration. Instead, every pass ↵Owen Anderson2010-10-191-1/+3
| | | | | | | | | | | | | | | | | exposes an initializeMyPassFunction(), which must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize the pass's dependencies. Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h before parsing commandline arguments. I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass registration/creation, please send the testcase to me directly. llvm-svn: 116820
* Begin adding static dependence information to passes, which will allow us toOwen Anderson2010-10-121-1/+6
| | | | | | | | | perform initialization without static constructors AND without explicit initialization by the client. For the moment, passes are required to initialize both their (potential) dependencies and any passes they preserve. I hope to be able to relax the latter requirement in the future. llvm-svn: 116334
* Now with fewer extraneous semicolons!Owen Anderson2010-10-071-1/+1
| | | | llvm-svn: 115996
* Don't sink insert_subreg, subreg_to_reg, reg_sequence. They are meant to beEvan Cheng2010-09-231-1/+10
| | | | | | close to their sources to facilitate coalescing. llvm-svn: 114631
* Enable machine sinking critical edge splitting. e.g.Evan Cheng2010-09-201-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | define double @foo(double %x, double %y, i1 %c) nounwind { %a = fdiv double %x, 3.2 %z = select i1 %c, double %a, double %y ret double %z } Was: _foo: divsd LCPI0_0(%rip), %xmm0 testb $1, %dil jne LBB0_2 movaps %xmm1, %xmm0 LBB0_2: ret Now: _foo: testb $1, %dil je LBB0_2 divsd LCPI0_0(%rip), %xmm0 ret LBB0_2: movaps %xmm1, %xmm0 ret This avoids the divsd when early exit is taken. rdar://8454886 llvm-svn: 114372
OpenPOWER on IntegriCloud