summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/TargetTransformInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [Constant Hoisting] Make the constant materialization cost operand dependentJuergen Ributzka2014-03-211-8/+8
| | | | | | | | | Extend the target hook to take also the operand index into account when calculating the cost of the constant materialization. Related to <rdar://problem/16381500> llvm-svn: 204435
* Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass."Juergen Ributzka2014-03-201-8/+8
| | | | | | I will break this up into smaller pieces for review and recommit. llvm-svn: 204393
* [Constant Hoisting] Extend coverage of the constant hoisting pass.Juergen Ributzka2014-03-201-8/+8
| | | | | | | | | This commit extends the coverage of the constant hoisting pass, adds additonal debug output and updates the function names according to the style guide. Related to <rdar://problem/16381500> llvm-svn: 204389
* [TTI] There is actually no realistic way to pop TTI implementations offChandler Carruth2014-03-101-10/+0
| | | | | | | | | | | | | | the stack of the analysis group because they are all immutable passes. This is made clear by Craig's recent work to use override systematically -- we weren't overriding anything for 'finalizePass' because there is no such thing. This is kind of a lame restriction on the API -- we can no longer push and pop things, we just set up the stack and run. However, I'm not invested in building some better solution on top of the existing (terrifying) immutable pass and legacy pass manager. llvm-svn: 203437
* [Modules] Move CallSite into the IR library where it belogs. It isChandler Carruth2014-03-041-1/+1
| | | | | | | abstracting between a CallInst and an InvokeInst, both of which are IR concepts. llvm-svn: 202816
* Switch all uses of LLVM_OVERRIDE to just use 'override' directly.Craig Topper2014-03-021-49/+45
| | | | llvm-svn: 202621
* Switch all uses of LLVM_FINAL to just use 'final', and remove the macro.Craig Topper2014-03-021-1/+1
| | | | llvm-svn: 202618
* Make DataLayout a plain object, not a pass.Rafael Espindola2014-02-251-1/+2
| | | | | | | Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM don't don't handle passes to also use DataLayout. llvm-svn: 202168
* Make succ_iterator a real random access iterator and clean up a couple of users.Benjamin Kramer2014-02-101-6/+1
| | | | llvm-svn: 201088
* Revert "Revert "Add Constant Hoisting Pass" (r200034)"Juergen Ributzka2014-01-251-1/+21
| | | | | | | This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062
* Revert "Add Constant Hoisting Pass" (r200034)Hans Wennborg2014-01-251-21/+1
| | | | | | | | | | | | | | | This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemnted in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. llvm-svn: 200058
* Add Constant Hoisting PassJuergen Ributzka2014-01-241-1/+21
| | | | | | | | Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. llvm-svn: 200034
* Revert "Add Constant Hoisting Pass"Juergen Ributzka2014-01-241-21/+1
| | | | | | This reverts commit r200022 to unbreak the build bots. llvm-svn: 200024
* Add Constant Hoisting PassJuergen Ributzka2014-01-241-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This pass identifies expensive constants to hoist and coalesces them to better prepare it for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates its cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast experessions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric llvm-svn: 200022
* Add final and owerride keywords to TargetTransformInfo's subclasses.Juergen Ributzka2014-01-241-45/+53
| | | | llvm-svn: 200021
* Re-sort all of the includes with ./utils/sort_includes.py so thatChandler Carruth2014-01-071-2/+2
| | | | | | | | | | subsequent changes are easier to review. About to fix some layering issues, and wanted to separate out the necessary churn. Also comment and sink the include of "Windows.h" in three .inc files to match the usage in Memory.inc. llvm-svn: 198685
* Costmodel: Add support for horizontal vector reductionsArnold Schwaighofer2013-09-171-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upcoming SLP vectorization improvements will want to be able to estimate costs of horizontal reductions. Add infrastructure to support this. We model reductions as a series of (shufflevector,add) tuples ultimately followed by an extractelement. For example, for an add-reduction of <4 x float> we could generate the following sequence: (v0, v1, v2, v3) \ \ / / \ \ / + + (v0+v2, v1+v3, undef, undef) \ / ((v0+v2) + (v1+v3), undef, undef) %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef> %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef> %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7 %r = extractelement <4 x float> %bin.rdx8, i32 0 This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)" that will allow clients to ask for the cost of such a reduction (as backends might generate more efficient code than the cost of the individual instructions summed up). This interface is excercised by the CostModel analysis pass which looks for reduction patterns like the one above - starting at extractelements - and if it sees a matching sequence will call the cost model interface. We will also support a second form of pairwise reduction that is well supported on common architectures (haddps, vpadd, faddp). (v0, v1, v2, v3) \ / \ / (v0+v1, v2+v3, undef, undef) \ / ((v0+v1)+(v2+v3), undef, undef, undef) %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 0, i32 2 , i32 undef, i32 undef> %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 1, i32 3, i32 undef, i32 undef> %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1 %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef> %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef> %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1 %r = extractelement <4 x float> %bin.rdx.1, i32 0 llvm-svn: 190876
* Add getUnrollingPreferences to TTIHal Finkel2013-09-111-0/+7
| | | | | | | | | Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 190542
* Revert: r189565 - Add getUnrollingPreferences to TTIHal Finkel2013-08-291-9/+0
| | | | | | | | | | | | | | | Revert unintentional commit (of an unreviewed change). Original commit message: Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189566
* Add getUnrollingPreferences to TTIHal Finkel2013-08-291-0/+9
| | | | | | | | | Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189565
* Handle address spaces in TargetTransformInfoMatt Arsenault2013-08-281-7/+15
| | | | llvm-svn: 189527
* Turn MipsOptimizeMathLibCalls into a target-independent scalar transformRichard Sandiford2013-08-231-0/+8
| | | | | | | | | | ...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available. The pass is opt-in because at the moment it only handles sqrt. llvm-svn: 189097
* SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch ↵Tom Stellard2013-07-271-0/+6
| | | | | | | | | | | | | | conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
* TargetTransformInfo: address calculation parameter for gather/scatherArnold Schwaighofer2013-07-121-3/+4
| | | | | | | | | | | Address calculation for gather/scather in vectorized code can incur a significant cost making vectorization unbeneficial. Add infrastructure to add cost. Tests and cost model for targets will be in follow-up commits. radar://14351991 llvm-svn: 186187
* Loop Strength Reduce: Scaling factor cost.Quentin Colombet2013-05-311-0/+17
| | | | | | | | | | | | Account for the cost of scaling factor in Loop Strength Reduce when rating the formulae. This uses a target hook. The default implementation of the hook is: if the addressing mode is legal, the scaling factor is free. <rdar://problem/13806271> llvm-svn: 183045
* CostModel: Add parameter to instruction cost to further classify operand valuesArnold Schwaighofer2013-04-041-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | On certain architectures we can support efficient vectorized version of instructions if the operand value is uniform (splat) or a constant scalar. An example of this is a vector shift on x86. We can efficiently support for (i = 0 ; i < ; i += 4) w[0:3] = v[0:3] << <2, 2, 2, 2> but not for (i = 0; i < ; i += 4) w[0:3] = v[0:3] << x[0:3] This patch adds a parameter to getArithmeticInstrCost to further qualify operand values as uniform or uniform constant. Targets can then choose to return a different cost for instructions with such operand values. A follow-up commit will test this feature on x86. radar://13576547 llvm-svn: 178807
* Small fix for cost analysis of ptrtoint.Patrik Hagglund2013-03-121-2/+2
| | | | | | This seems to be a "copy-paste error" introducecd in r156140. llvm-svn: 176863
* ARM cost model: Address computation in vector mem ops not freeArnold Schwaighofer2013-02-081-0/+7
| | | | | | | | | | | | | | | Adds a function to target transform info to query for the cost of address computation. The cost model analysis pass now also queries this interface. The code in LoopVectorize adds the cost of address computation as part of the memory instruction cost calculation. Only there, we know whether the instruction will be scalarized or not. Increase the penality for inserting in to D registers on swift. This becomes necessary because we now always assume that address computation has a cost and three is a closer value to the architecture. radar://13097204 llvm-svn: 174713
* Begin fleshing out an interface in TTI for modelling the costs ofChandler Carruth2013-01-221-18/+154
| | | | | | | | | | | | | | | | | | | | | | | | | | | generic function calls and intrinsics. This is somewhat overlapping with an existing intrinsic cost method, but that one seems targetted at vector intrinsics. I'll merge them or separate their names and use cases in a separate commit. This sinks the test of 'callIsSmall' down into TTI where targets can control it. The whole thing feels very hack-ish to me though. I've left a FIXME comment about the fundamental design problem this presents. It isn't yet clear to me what the users of this function *really* care about. I'll have to do more analysis to figure that out. Putting this here at least provides it access to proper analysis pass tools and other such. It also allows us to more cleanly implement the baseline cost interfaces in TTI. With this commit, it is now theoretically possible to simplify much of the inline cost analysis's handling of calls by calling through to this interface. That conversion will have to happen in subsequent commits as it requires more extensive restructuring of the inline cost analysis. The CodeMetrics class is now really only in the business of running over a block of code and aggregating the metrics on that block of code, with the actual cost evaluation done entirely in terms of TTI. llvm-svn: 173148
* Switch CodeMetrics itself over to use TTI to determine if an instructionChandler Carruth2013-01-211-0/+3
| | | | | | | | | | | | is free. The whole CodeMetrics API should probably be reworked more, but this is enough to allow deleting the duplicate code there for computing whether an instruction is free. All of the passes using this have been updated to pull in TTI and hand it to the CodeMetrics stuff. Further, a dead CodeMetrics API (analyzeFunction) is nuked for lack of users. llvm-svn: 173036
* Introduce a generic interface for querying an operation's expectedChandler Carruth2013-01-211-1/+122
| | | | | | | | | | | | | | | lowered cost. Currently, this is a direct port of the logic implementing isInstructionFree in CodeMetrics. The hope is that the interface can be improved (f.ex. supporting un-formed instruction queries) and the implementation abstracted so that as we have test cases and target knowledge we can expose increasingly accurate heuristics to clients. I'll start switching existing consumers over and kill off the routine in CodeMetrics in subsequent commits. llvm-svn: 172998
* Revert CostTable algorithm, will re-writeRenato Golin2013-01-201-45/+0
| | | | llvm-svn: 172992
* Fix 80-col and early exit in cost modelRenato Golin2013-01-191-12/+16
| | | | llvm-svn: 172877
* Change CostTable model to be global to all targetsRenato Golin2013-01-161-0/+41
| | | | | | | | Moving the X86CostTable to a common place, so that other back-ends can share the code. Also simplifying it a bit and commoning up tables with one and two types on operations. llvm-svn: 172658
* ARM Cost model: Use the size of vector registers and widest vectorizable ↵Nadav Rotem2013-01-091-0/+8
| | | | | | instruction to determine the max vectorization factor. llvm-svn: 172010
* Cost Model: Move the 'max unroll factor' variable to the TTI and add initial ↵Nadav Rotem2013-01-091-0/+8
| | | | | | Cost Model support on ARM. llvm-svn: 171928
* Switch the SCEV expander and LoopStrengthReduce to useChandler Carruth2013-01-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TargetTransformInfo rather than TargetLowering, removing one of the primary instances of the layering violation of Transforms depending directly on Target. This is a really big deal because LSR used to be a "special" pass that could only be tested fully using llc and by looking at the full output of it. It also couldn't run with any other loop passes because it had to be created by the backend. No longer is this true. LSR is now just a normal pass and we should probably lift the creation of LSR out of lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done this, or updated all of the tests to use opt and a triple, because I suspect someone more familiar with LSR would do a better job. This change should be essentially without functional impact for normal compilations, and only change behvaior of targetless compilations. The conversion required changing all of the LSR code to refer to the TTI interfaces, which fortunately are very similar to TargetLowering's interfaces. However, it also allowed us to *always* expect to have some implementation around. I've pushed that simplification through the pass, and leveraged it to simplify code somewhat. It required some test updates for one of two things: either we used to skip some checks altogether but now we get the default "no" answer for them, or we used to have no information about the target and now we do have some. I've also started the process of removing AddrMode, as the TTI interface doesn't use it any longer. In some cases this simplifies code, and in others it adds some complexity, but I think it's not a bad tradeoff even there. Subsequent patches will try to clean this up even further and use other (more appropriate) abstractions. Yet again, almost all of the formatting changes brought to you by clang-format. =] llvm-svn: 171735
* Make the popcnt support enums and methods have more clear names andChandler Carruth2013-01-071-5/+5
| | | | | | | follow the conding conventions regarding enumerating a set of "kinds" of things. llvm-svn: 171687
* Move TargetTransformInfo to live under the Analysis library. This noChandler Carruth2013-01-071-0/+270
longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686
OpenPOWER on IntegriCloud