summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/PowerPC/PPCCTRLoops.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [C++11] Add 'override' keywords and remove 'virtual'. Additionally add ↵Craig Topper2014-04-291-4/+4
| | | | | | 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition llvm-svn: 207504
* [C++] Use 'nullptr'. Target edition.Craig Topper2014-04-251-5/+5
| | | | llvm-svn: 207197
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-2/+2
| | | | | | | definition below all of the header #include lines, lib/Target/... edition. llvm-svn: 206842
* [Modules] Move ValueHandle into the IR library where Value itself lives.Chandler Carruth2014-03-041-1/+1
| | | | | | | | | | | Move the test for this class into the IR unittests as well. This uncovers that ValueMap too is in the IR library. Ironically, the unittest for ValueMap is useless in the Support library (honestly, so was the ValueHandle test) and so it already lives in the IR unittests. Mmmm, tasty layering. llvm-svn: 202821
* Silencing an MSVC signed comparison warning.Aaron Ballman2014-02-261-1/+1
| | | | llvm-svn: 202295
* Account for 128-bit integer operations in PPCCTRLoopsHal Finkel2014-02-251-6/+11
| | | | | | | We need to abort the formation of counter-register-based loops where there are 128-bit integer operations that might become function calls. llvm-svn: 202192
* Make DataLayout a plain object, not a pass.Rafael Espindola2014-02-251-1/+2
| | | | | | | Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM don't don't handle passes to also use DataLayout. llvm-svn: 202168
* Make some DataLayout pointers const.Rafael Espindola2014-02-241-1/+1
| | | | | | No functionality change. Just reduces the noise of an upcoming patch. llvm-svn: 202087
* Rename a few more DataLayout variables.Rafael Espindola2014-02-211-2/+2
| | | | llvm-svn: 201833
* [PM] Split DominatorTree into a concrete analysis result object whichChandler Carruth2014-01-131-4/+4
| | | | | | | | | | | | | | | | | | | | | | | can be used by both the new pass manager and the old. This removes it from any of the virtual mess of the pass interfaces and lets it derive cleanly from the DominatorTreeBase<> template. In turn, tons of boilerplate interface can be nuked and it turns into a very straightforward extension of the base DominatorTree interface. The old analysis pass is now a simple wrapper. The names and style of this split should match the split between CallGraph and CallGraphWrapperPass. All of the users of DominatorTree have been updated to match using many of the same tricks as with CallGraph. The goal is that the common type remains the resulting DominatorTree rather than the pass. This will make subsequent work toward the new pass manager significantly easier. Also in numerous places things became cleaner because I switched from re-running the pass (!!! mid way through some other passes run!!!) to directly recomputing the domtree. llvm-svn: 199104
* [cleanup] Move the Dominators.h and Verifier.h headers into the IRChandler Carruth2014-01-131-1/+1
| | | | | | | | | | | | | | | | | | directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that evn Clang's CFGBlock's can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082
* Re-sort all of the includes with ./utils/sort_includes.py so thatChandler Carruth2014-01-071-4/+4
| | | | | | | | | | subsequent changes are easier to review. About to fix some layering issues, and wanted to separate out the necessary churn. Also comment and sink the include of "Windows.h" in three .inc files to match the usage in Memory.inc. llvm-svn: 198685
* Add a llvm.copysign intrinsicHal Finkel2013-08-191-0/+6
| | | | | | | | | | | | | | | | | | | | | This adds a llvm.copysign intrinsic; We already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well. In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded. In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think is the right thing to do because, previously, there was no way to generate vector-values FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN). llvm-svn: 188728
* Don't form PPC CTR-based loops around a copysignl callHal Finkel2013-08-191-1/+2
| | | | | | | | | copysign/copysignf never become function calls (because the SDAG expansion code does not lower to the corresponding function call, but rather directly implements the associated logic), but copysignl almost always is lowered into a call to the requested libm functon (and, thus, might clobber CTR). llvm-svn: 188727
* Add ISD::FROUND for libm round()Hal Finkel2013-08-071-0/+5
| | | | | | | | | | | | | | | All libm floating-point rounding functions, except for round(), had their own ISD nodes. Recent PowerPC cores have an instruction for round(), and so here I'm adding ISD::FROUND so that round() can be custom lowered as well. For the most part, this is straightforward. I've added an intrinsic and a matching ISD node just like those for nearbyint() and friends. The SelectionDAG pattern I've named frnd (because ISD::FP_ROUND has already claimed fround). This will be used by the PowerPC backend in a follow-up commit. llvm-svn: 187926
* PPC: Add CTR-register clobber to builtin setjmpHal Finkel2013-07-171-0/+7
| | | | | | | | | | | | | Because the builtin longjmp implementation uses a CTR-based indirect jump, when the control flow arrives at the builtin setjmp call, the CTR register has necessarily been clobbered. Correspondingly, this adds CTR to the list of implicit definitions of the builtin setjmp pseudo instruction. We don't need to add CTR to the implicit definitions of builtin longjmp because, even though it does clobber the CTR register, the control flow cannot return to inside the loop unless there is also a builtin setjmp call. llvm-svn: 186488
* Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid ↵Craig Topper2013-07-041-1/+1
| | | | | | specifying the vector size. llvm-svn: 185606
* Don't form PPC CTR loops for over-sized exit countsHal Finkel2013-07-011-0/+3
| | | | | | | | | | Although you can't generate this from C on PPC64, if you have a loop using a 64-bit counter on PPC32 then you can't form a CTR-based loop for it. This had been cauing the PPCCTRLoops pass to assert. Thanks to Joerg Sonnenberger for providing a test case! llvm-svn: 185361
* Disallow i64 div/rem in PPC32 counter loopsHal Finkel2013-06-071-0/+7
| | | | | | | | On PPC32, [su]div,rem on i64 types are transformed into runtime library function calls. As a result, they are not allowed in counter-based loops (the counter-loops verification pass caught this error; this change fixes PR16169). llvm-svn: 183581
* Rename LoopSimplify.h to LoopUtils.hHal Finkel2013-05-201-1/+1
| | | | | | As discussed, LoopUtils.h is a better name. llvm-svn: 182314
* Remove copied preheader insertion logic from PPCCTRLoopsHal Finkel2013-05-201-85/+3
| | | | | | | | | Now that the preheader insertion logic in LoopSimplify is externally exposed, use it, and remove the copy-and-pasted version. No functionality change intended. llvm-svn: 182300
* Rename PPC MTCTRse to MTCTRloopHal Finkel2013-05-201-1/+1
| | | | | | | | | | As the pairing of this instruction form with the bdnz/bdz branches is now enforced by the verification pass, make it clear from the name that these are used only for counter-based loops. No functionality change intended. llvm-svn: 182296
* Add a PPCCTRLoops verification passHal Finkel2013-05-201-0/+155
| | | | | | | | | | | | | | | | | | When asserts are enabled, this adds a verification pass for PPC counter-loop formation. Unfortunately, without sacrificing code quality, there is no better way of forming counter-based loops except at the (late) IR level. This means that we need to recognize, at the IR level, anything which might turn into a function call (or indirect branch). Because this is currently a finite set of things, and because SelectionDAG lowering is basic-block local, this can be done. Nevertheless, it is fragile, and failure results in a miscompile. This verification pass checks that all (reachable) counter-based branches are dominated by a loop mtctr instruction, and that no instructions in between clobber the counter register. If these conditions are not satisfied, then an ICE will be triggered. In short, this is to help us sleep better at night. llvm-svn: 182295
* Check InlineAsm clobbers in PPCCTRLoopsHal Finkel2013-05-181-0/+15
| | | | | | | | We don't need to reject all inline asm as using the counter register (most does not). Only those that explicitly clobber the counter register need to prevent the transformation. llvm-svn: 182191
* Create an new preheader in PPCCTRLoops to avoid counter register clobbersHal Finkel2013-05-161-153/+165
| | | | | | | | | | | | | Some IR-level instructions (such as FP <-> i64 conversions) are not chained w.r.t. the mtctr intrinsic and yet may become function calls that clobber the counter register. At the selection-DAG level, these might be reordered with the mtctr intrinsic causing miscompiles. To avoid this situation, if an existing preheader has instructions that might use the counter register, create a new preheader for the mtctr intrinsic. This extra block will be remerged with the old preheader at the MI level, but will prevent unwanted reordering at the selection-DAG level. llvm-svn: 182045
* PPC32 cannot form counter loops around i64 FP conversionsHal Finkel2013-05-161-1/+5
| | | | | | | On PPC32, i64 FP conversions are implemented using runtime calls (which clobber the counter register). These must be excluded. llvm-svn: 182023
* undef setjmp in PPCCTRLoopsHal Finkel2013-05-151-0/+16
| | | | | | | Trying to unbreak the VS build by copying some undef code from Utils/LowerInvoke.cpp. llvm-svn: 181938
* Implement PPC counter loops as a late IR-level passHal Finkel2013-05-151-665/+390
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). llvm-svn: 181927
* Fix a register-class comparison bug in PPCCTRLoopsHal Finkel2013-03-211-1/+1
| | | | | | | | | Thanks to Jakob for isolating the underlying problem from the test case in r177423. The original commit had introduced asymmetric copy operations, but these turned out to be a work-around to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops). llvm-svn: 177679
* Fix a sign-extension bug in PPCCTRLoopsHal Finkel2013-03-181-1/+1
| | | | | | | Don't sign extend the immediate value from the OR instruction in an LIS/OR pair. llvm-svn: 177361
* Fix 80-col. violations in PPCCTRLoopsHal Finkel2013-03-181-6/+8
| | | | llvm-svn: 177296
* Fix large count and negative constant count handling in PPCCTRLoopsHal Finkel2013-03-181-11/+41
| | | | | | | | | | | | | | | | | | This commit fixes an assert that would occur on loops with large constant counts (like looping for ((uint32_t) -1) iterations on PPC64). The existing code did not handle counts that it computed to be negative (asserting instead), but these can be created with valid inputs. This bug was discovered by bugpoint while I was attempting to isolate a completely different problem. Also, in writing test cases for the negative-count problem, I discovered that the ori/lsi handling was broken (there was a typo which caused the logic that was supposed to detect these pairs and extract the iteration count to always fail). This has now also been corrected (and is covered by one of the new test cases). llvm-svn: 177295
* Cleanup initial-value constants in PPCCTRLoopsHal Finkel2013-03-181-2/+9
| | | | | | | | Because the initial-value constants had not been added to the list of instructions considered for DCE the resulting code had redundant constant-materialization instructions. llvm-svn: 177294
* Add registration for PPC-specific passes to allow the IR to be dumpedKrzysztof Parzyszek2013-02-131-1/+13
| | | | | | via -print-after-all. llvm-svn: 175058
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-1/+1
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-4/+4
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* Don't use getNextOperandForReg().Jakob Stoklund Olesen2012-08-081-1/+4
| | | | | | | | | This way of using getNextOperandForReg() was unlikely to work as intended. We don't give any guarantees about the order of operands in the use-def chains, so looking only at operands following a given operand in the chain doesn't make sense. llvm-svn: 161542
* Cleanup trip-count finding for PPC CTR loops (and some bug fixes).Hal Finkel2012-06-161-86/+127
| | | | | | | | | | | | | | This cleans up the method used to find trip counts in order to form CTR loops on PPC. This refactoring allows the pass to find loops which have a constant trip count but also happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different classes of loops that are currently ignored. In addition, we now search through all potential induction operations instead of just the first. Also, we check the predicate code on the conditional branch and abort the transformation if the code is not EQ or NE, and we then make sure that the branch to be transformed matches the condition register defined by the comparison (multiple possible comparisons will be considered). llvm-svn: 158607
* Fix a bug in the new PPC CTR-Loops pass.Hal Finkel2012-06-081-0/+1
| | | | | | | | | The code which tests for an induction operation cannot assume that any ADDI instruction will have a register operand because the operand could also be a frame index; for example: %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16 llvm-svn: 158205
* Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form ↵Hal Finkel2012-06-081-0/+679
CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204
OpenPOWER on IntegriCloud