summaryrefslogtreecommitdiffstats
path: root/polly/lib/Support
Commit message (Collapse)AuthorAgeFilesLines
...
* Extract some constant factors from "SCEVAddExprs"Johannes Doerfert2016-04-251-0/+26
| | | | | | | | | Additive expressions can have constant factors too that we can extract and thereby simplify the internal representation. For now we do compute the gcd of all constant factors but only extract the same (possibly negated) factor if there is one. llvm-svn: 267445
* Model zext-extend instructionsJohannes Doerfert2016-04-252-21/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A zero-extended value can be interpreted as a piecewise defined signed value. If the value was non-negative it stays the same, otherwise it is the sum of the original value and 2^n where n is the bit-width of the original (or operand) type. Examples: zext i8 127 to i32 -> { [127] } zext i8 -1 to i32 -> { [256 + (-1)] } = { [255] } zext i8 %v to i32 -> [v] -> { [v] | v >= 0; [256 + v] | v < 0 } However, LLVM/Scalar Evolution uses zero-extend (potentially lead by a truncate) to represent some forms of modulo computation. The left-hand side of the condition in the code below would result in the SCEV "zext i1 <false, +, true>for.body" which is just another description of the C expression "i & 1 != 0" or, equivalently, "i % 2 != 0". for (i = 0; i < N; i++) if (i & 1 != 0 /* == i % 2 */) /* do something */ If we do not make the modulo explicit but only use the mechanism described above we will get the very restrictive assumption "N < 3", because for all values of N >= 3 the SCEVAddRecExpr operand of the zero-extend would wrap. Alternatively, we can make the modulo in the operand explicit in the resulting piecewise function and thereby avoid the assumption on N. For the example this would result in the following piecewise affine function: { [i0] -> [(1)] : 2*floor((-1 + i0)/2) = -1 + i0; [i0] -> [(0)] : 2*floor((i0)/2) = i0 } To this end we can first determine if the (immediate) operand of the zero-extend can wrap and, in case it might, we will use explicit modulo semantic to compute the result instead of emitting non-wrapping assumptions. Note that operands with large bit-widths are less likely to be negative because it would result in a very large access offset or loop bound after the zero-extend. To this end one can optimistically assume the operand to be positive and avoid the piecewise definition if the bit-width is bigger than some threshold (here MaxZextSmallBitWidth). We choose to go with a hybrid solution of all modeling techniques described above. For small bit-widths (up to MaxZextSmallBitWidth) we will model the wrapping explicitly and use a piecewise defined function. However, if the bit-width is bigger than MaxZextSmallBitWidth we will employ overflow assumptions and assume the "former negative" piece will not exist. llvm-svn: 267408
* Introduce a parameter set type [NFC]Johannes Doerfert2016-04-251-15/+11
| | | | llvm-svn: 267401
* Remove unnecessary argument of the SCEVValidator [NFC]Johannes Doerfert2016-04-251-15/+7
| | | | llvm-svn: 267400
* Translate SCEVs to isl_pw_aff and their invalid domainJohannes Doerfert2016-04-231-64/+89
| | | | | | | | | | | | | The SCEVAffinator will now produce not only the isl representaiton of a SCEV but also the domain under which it is invalid. This is used to record possible overflows that can happen in the statement domains in the statements invalid domain. The result is that invalid loads have an accurate execution contexts with regards to the validity of their statements domain. While the SCEVAffinator currently is only taking "no-wrapping" assumptions, we can add more withouth worrying about the execution context of loads that are optimistically hoisted. llvm-svn: 267288
* SCoPValidator: Use SCEVTraversal to simplify SCEVInRegionDependencesTobias Grosser2016-04-181-101/+33
| | | | llvm-svn: 266622
* Record wrapping assumptions earlyJohannes Doerfert2016-04-121-39/+22
| | | | | | | | Utilizing the record option for assumptions we can simplify the wrapping assumption generation a lot. Additionally, we can now report locations together with wrapping assumptions, though they might not be accurate yet. llvm-svn: 266069
* Simplify SCEVAffinator code [NFC]Johannes Doerfert2016-04-121-15/+8
| | | | llvm-svn: 266051
* Allow pointer expressions in SCEVs again.Johannes Doerfert2016-04-101-10/+2
| | | | | | | | | In r247147 we disabled pointer expressions because the IslExprBuilder did not fully support them. This patch reintroduces them by simply treating them as integers. The only special handling for pointers that is left detects the comparison of two address_of operands and uses an unsigned compare. llvm-svn: 265894
* [FIX] Do not recompute SCEVs but pass them to subfunctionsJohannes Doerfert2016-04-092-20/+8
| | | | | | | | | | | | This reverts commit 2879c53e80e05497f408f21ce470d122e9f90f94. Additionally, it adds SDiv and SRem instructions to the set of values discovered by the findValues function even if we add the operands to be able to recompute the SCEVs. In subfunctions we do not want to recompute SDiv and SRem instructions but pass them instead as they might have been created through the IslExprBuilder and are more complicated than simple SDiv/SRem instructions in the code. llvm-svn: 265873
* [FIX] Teach the ScopExpander about parallel subfunctionsJohannes Doerfert2016-04-081-5/+14
| | | | llvm-svn: 265824
* [FIX] Handle multiplications in the SCEVAffinator againJohannes Doerfert2016-04-081-1/+10
| | | | | | | | | | | If ScalarEvolution cannot look through some expression but we do, it might happen that a multiplication will arrive at the SCEVAffinator::visitMulExpr. While we could always try to improve the extractConstantFactor function we might still miss something, thus we reintroduce the code to generate multiplicative piecewise-affine functions as a fall-back. llvm-svn: 265777
* [FIX] Look through div & srem instructions in SCEVsJohannes Doerfert2016-04-081-5/+29
| | | | | | | | | The findValues() function did not look through div & srem instructions that were part of the argument SCEV. However, in different other places we already look through it. This mismatch caused us to preload values in the wrong order. llvm-svn: 265775
* Generalize the domain complexity restrictionsJohannes Doerfert2016-03-261-0/+31
| | | | | | | | | | | | | | | | | | This patch applies the restrictions on the number of domain conjuncts also to the domain parts of piecewise affine expressions we generate. To this end the wording is change slightly. It was needed to support complex additions featuring zext-instructions but it also fixes PR27045. lnt profitable runs reports only little changes that might be noise: Compile Time: Polybench/[...]/2mm +4.34% SingleSource/[...]/stepanov_container -2.43% Execution Time: External/[...]/186_crafty -2.32% External/[...]/188_ammp -1.89% External/[...]/473_astar -1.87% llvm-svn: 264514
* docs: Add doxygen mainpageTobias Grosser2016-03-071-0/+4
| | | | | | (and test if doxygen is updated on-commit) llvm-svn: 262855
* [SCEVValidator] Fix loop exit values considered affine.Michael Kruse2016-03-031-1/+8
| | | | | | | | | | | | | | | | | | | | Index calculations can use the last value that come out of a loop. Ideally, ScalarEvolution can compute that exit value directly without depending on the loop induction variable, but not in all cases. This changes isAffine to not consider such loop exit values as affine to avoid that SCEVExpander adds uses of the original loop induction variable. This fix is analogous to r262404 that applies to general uses of loop exit values instead of index expressions and loop bouds as in this patch. This reduces the number of LNT test-suite fails with -polly-position=before-vectorizer -polly-unprofitable from 10 to 8. llvm-svn: 262665
* Pass scope and LoopInfo to SCEVValidator. NFC.Michael Kruse2016-03-032-19/+26
| | | | | | | | The scope will be required in the following fix. This commit separates the large changes that do not change behaviour from the small, but functional change. llvm-svn: 262664
* Fix: Add pass manager barrier.Michael Kruse2016-03-021-0/+6
| | | | | | | | | | | The LNT test suite with -polly-process-unprofitable -polly-position=before-vectorizer currenty fails 59 tests. With this barrier added, only 16 keep failing. This is probably because Polly's code generation currently does not correctly preserve all analyses it promised to preserve. Temporarily add this barrier until further investigation. llvm-svn: 262488
* Fix non-synthesizable loop exit values.Michael Kruse2016-03-012-9/+25
| | | | | | | | | | | Polly recognizes affine loops that ScalarEvolution does not, in particular those with loop conditions that depend on hoisted invariant loads. Check for SCEVAddRec dependencies on such loops and do not consider their exit values as synthesizable because SCEVExpander would generate them as expressions that depend on the original induction variables. These are not available in generated code. llvm-svn: 262404
* [SCEVValidator] Remove redundant visit.Michael Kruse2016-03-011-3/+0
| | | | | | | SCEVAddRecExpr::getStart() is synonymous to SCEVAddRecExpr::getOperand(0) which will be visited in the following loop anyway. llvm-svn: 262375
* [FIX] Compare SCEVs not values during SCEV expansionJohannes Doerfert2016-02-211-4/+9
| | | | | | | This fixes a compile time bug in SPEC2006 403.gcc, namely an endless recursion in the ScopExpander::visitUnknown function. llvm-svn: 261474
* Add more isl object printing functionsHongbin Zheng2016-02-201-0/+5
| | | | llvm-svn: 261402
* Add more isl object printing functionHongbin Zheng2016-02-181-0/+6
| | | | llvm-svn: 261216
* Separate more constant factors of parametersJohannes Doerfert2016-02-142-22/+33
| | | | | | | | | | | | | | | | | So far we separated constant factors from multiplications, however, only when they are at the outermost level of a parameter SCEV. Now, we also separate constant factors from the parameter SCEV if the outermost expression is a SCEVAddRecExpr. With the changes to the SCEVAffinator we can now improve the extractConstantFactor(...) function at will without worrying about any other code part. Thus, if needed we can implement a more comprehensive extractConstantFactor(...) function that will traverse the SCEV instead of looking only at the outermost level. Four test cases were affected. One did not change much and the other three were simplified. llvm-svn: 260859
* Follow uses to create value MemoryAccessesMichael Kruse2016-02-061-0/+11
| | | | | | | | | | | | | | | | | | | | The previously implemented approach is to follow value definitions and create write accesses ("push defs") while searching for uses. This requires the same relatively validity- and requirement conditions to be replicated at multiple locations (PHI instructions, other instructions, uses by PHIs). We replace this by iterating over the uses in a SCoP ("pull in requirements"), and add writes only when at least one read has been added. It turns out to be simpler code because each use is only iterated over once and writes are added for the first access that reads it. We need another iteration to identify escaping values (uses not in the SCoP), which also makes the difference between such accesses more obvious. As a side-effect, the order of scalar MemoryAccess can change. Differential Revision: http://reviews.llvm.org/D15706 llvm-svn: 259987
* Introduce MemAccInst helper class; NFCMichael Kruse2016-01-271-11/+0
| | | | | | | | | | | | | | | | MemAccInst wraps the common members of LoadInst and StoreInst. Also use of this class in: - ScopInfo::buildMemoryAccess - BlockGenerator::generateLocationAccessed - ScopInfo::addArrayAccess - Scop::buildAliasGroups - Replace every use of polly::getPointerOperand Reviewers: jdoerfert, grosser Differential Revision: http://reviews.llvm.org/D16530 llvm-svn: 258947
* Print "null" for ISL objects that are nullptrMichael Kruse2015-12-131-0/+2
| | | | | | | | Use it to print "null" if a MemoryAccess's access relation is not available instead of printing nothing. Suggested-by: Johannes Doerfert llvm-svn: 255466
* IR cleanup after CodeGenerationMichael Kruse2015-11-261-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | Re-run canonicalization passes after Polly's code generation. The set of passes currently added here are nearly all the passes between --polly-position=early and --polly-position=before-vectorizer, i.e. all passes that would usually run after Polly. In order to run these only if Polly actually modified the code, we add a function attribute "polly-optimzed" to a function that contains generated code. The cleanup pass is skipped if the function does not have this attribute. There is no support by the (legacy) PassManager to run passes only under some conditions. One could have wrapped all transformation passes to run only when CodeGeneration changed the code, but the analyses would run anyway. This patch creates an independent pass manager. The disadvantages are that all analyses have to re-run even if preserved and it does not honor compiler switches like the PassManagerBuilder does. Differential Revision: http://reviews.llvm.org/D14333 llvm-svn: 254150
* Use parameter constraints provided via llvm.assumeJohannes Doerfert2015-11-121-0/+38
| | | | | | | | | If an llvm.assume dominates the SCoP entry block and the assumed condition can be expressed as an affine inequality we will now add it to the context. Differential Revision: http://reviews.llvm.org/D14413 llvm-svn: 252851
* ScopDetection: Tighten the check for always executed 'error blocks'Tobias Grosser2015-11-111-1/+8
| | | | | | | | | Basic blocks that are always executed can not be error blocks as their execution can not possibly be an unlikely event. In this commit we tighten the check if an error block to basic blcoks that do not dominate the exit condition, but that dominate all exiting blocks of the scop. llvm-svn: 252726
* ScopDetection: Do not allow blocks to reference operands in error blocksTobias Grosser2015-11-111-0/+3
| | | | | | | | | | | | | r252713 introduced a couple of regressions due to later basic blocks refering to instructions defined in error blocks which have not yet been modeled. This commit is currently just encoding limitations of our modeling and code generation backends to ensure correctness. In theory, we should be able to generate and optimize such regions, as everything that is dominated by an error region is assumed to not be executed anyhow. We currently just lack the code to make this happen in practice. llvm-svn: 252725
* stringFromIslObj: Do not crash when printing 'null' objectsTobias Grosser2015-11-101-1/+5
| | | | | | No test case, as this code path is currently only used for debugging. llvm-svn: 252609
* polly/ADT: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-11-061-1/+1
| | | | | | | | | | | | | Remove all the implicit ilist iterator conversions from polly, in preparation for making them illegal in ADT. There was one oddity I came across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a post-increment `Builder.GetInsertPoint()++`. Since it was a no-op, I removed it, but I admit I wonder if it might be a bug (both before and after this change)? Perhaps it should be a pre-increment? llvm-svn: 252357
* Remove independent blocks passJohannes Doerfert2015-10-181-1/+0
| | | | | | | | | | | Polly can now be used as a analysis only tool as long as the code generation is disabled. However, we do not have an alternative to the independent blocks pass in place yet, though in the relevant cases this does not seem to impact the performance much. Nevertheless, a virtual alternative that allows the same transformations without changing the input region will follow shortly. llvm-svn: 250652
* Use EP_ModuleOptimizerEarly to run early polly passes,Tobias Grosser2015-10-121-3/+3
| | | | | | | | instead of llvm::PassManagerBuilder::EP_EarlyAsPossible. This will allow us to run actual module passes in Polly's canonicalization sequence, but should otherwise have only little impact. llvm-svn: 250091
* [NFC] Move helper functions to ScopHelperJohannes Doerfert2015-10-091-0/+42
| | | | | | | | Helper functions in the BlockGenerators.h/cpp introduce dependences from the frontend to the backend of Polly. As they are used in ScopDetection, ScopInfo, etc. we move them to the ScopHelper file. llvm-svn: 249919
* Treat conditionally executed non-pure calls as errorsJohannes Doerfert2015-10-071-14/+23
| | | | | | | | | | | | | | This replaces the support for user defined error functions by a heuristic that tries to determine if a call to a non-pure function should be considered "an error". If so the block is assumed not to be executed at runtime. While treating all non-pure function calls as errors will allow a lot more regions to be analyzed, it will also cause us to dismiss a lot again due to an infeasible runtime context. This patch tries to limit that effect. A non-pure function call is considered an error if it is executed only in conditionally with regards to a cheap but simple heuristic. llvm-svn: 249611
* Allow invariant loads in the SCoP descriptionJohannes Doerfert2015-10-072-12/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows invariant loads to be used in the SCoP description, e.g., as loop bounds, conditions or in memory access functions. First we collect "required invariant loads" during SCoP detection that would otherwise make an expression we care about non-affine. To this end a new level of abstraction was introduced before SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide if we want a load inside the region to be optimistically assumed invariant or not. If we do, it will be marked as required and in the SCoP generation we bail if it is actually not invariant. If we don't it will be a non-affine expression as before. At the moment we optimistically assume all "hoistable" (namely non-loop-carried) loads to be invariant. This causes us to expand some SCoPs and dismiss them later but it also allows us to detect a lot we would dismiss directly if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also allow potential aliases between optimistically assumed invariant loads and other pointers as our runtime alias checks are sound in case the loads are actually invariant. Together with the invariant checks this combination allows to handle a lot more than LICM can. The code generation of the invariant loads had to be extended as we can now have dependences between parameters and invariant (hoisted) loads as well as the other way around, e.g., test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll First, it is important to note that we cannot have real cycles but only dependences from a hoisted load to a parameter and from another parameter to that hoisted load (and so on). To handle such cases we materialize llvm::Values for parameters that are referred by a hoisted load on demand and then materialize the remaining parameters. Second, there are new kinds of dependences between hoisted loads caused by the constraints on their execution. If a hoisted load is conditionally executed it might depend on the value of another hoisted load. To deal with such situations we sort them already in the ScopInfo such that they can be generated in the order they are listed in the Scop::InvariantAccesses list (see compareInvariantAccesses). The dependences between hoisted loads caused by indirect accesses are handled the same way as before. llvm-svn: 249607
* Consolidate the different ValueMapTypes we are usingTobias Grosser2015-10-041-3/+6
| | | | | | | | | | There have been various places where llvm::DenseMap<const llvm::Value *, llvm::Value *> types have been defined, but all types have been expected to be identical. We make this more clear by consolidating the different types and use BlockGenerator::ValueMapT wherever there is a need for types to match BlockGenerator::ValueMapT. llvm-svn: 249264
* Allow user defined error functionsJohannes Doerfert2015-10-011-6/+18
| | | | | | | | | | | | The user can provide function names with -polly-error-functions=name1,name2,name3 that will be treated as error functions. Any call to them is assumed not to be executed. This feature is mainly for developers to play around with the new "error block" feature. llvm-svn: 249098
* [FIX] Handle identity mappings in the ScopExpanderJohannes Doerfert2015-09-301-1/+2
| | | | | | | | If the VMap in the ScopExpander contains identity mappings we now ignore the mapping. Reported-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 248946
* Move remapping functionality in the ScopExpanderJohannes Doerfert2015-09-301-4/+12
| | | | | | | | | Because we handle more than SCEV does it is not possible to rewrite an expression on the top-level using the SCEVParameterRewriter only. With this patch we will do the rewriting on demand only and also recursively, thus not only on the top-level. llvm-svn: 248916
* Allow switch instructions in SCoPsJohannes Doerfert2015-09-281-0/+14
| | | | | | | | | | | | | | | This patch allows switch instructions with affine conditions in the SCoP. Also switch instructions in non-affine subregions are allowed. Both did not require much changes to the code, though there was some refactoring needed to integrate them without code duplication. In the llvm-test suite the number of profitable SCoPs increased from 135 to 139 but more importantly we can handle more benchmarks and user inputs without preprocessing. Differential Revision: http://reviews.llvm.org/D13200 llvm-svn: 248701
* Sort includes using Chandler's sort_includes.py scriptTobias Grosser2015-09-251-4/+2
| | | | llvm-svn: 248568
* Use <nsw> AddRecs in the affinator to avoid bounded assumptionsJohannes Doerfert2015-09-201-0/+14
| | | | | | | | If we encounter a <nsw> tagged AddRec for a loop we know the trip count of that loop has to be bounded or the semantics is undefined anyway. Hence, we only need to add unbounded assumptions if no such AddRec is known. llvm-svn: 248128
* Use modulo semantic to generate non-integer-overflow assumptionsJohannes Doerfert2015-09-151-4/+95
| | | | | | | | | | | | | | | | | | | | | | | | | This will allow to generate non-wrap assumptions for integer expressions that are part of the SCoP. We compare the common isl representation of the expression with one computed with modulo semantic. For all parameter combinations they are not equal we can have integer overflows. The nsw flags are respected when the modulo representation is computed, nuw and nw flags are ignored for now. In order to not increase compile time to much, the non-wrap assumptions are collected in a separate boundary context instead of the assumed context. This helps compile time as the boundary context can become complex and it is therefor not advised to use it in other operations except runtime check generation. However, the assumed context is e.g., used to tighten dependences. While the boundary context might help to tighten the assumed context it is doubtful that it will help in practice (it does not effect lnt much) as the boundary (or no-wrap assumptions) only restrict the very end of the possible value range of parameters. PET uses a different approach to compute the no-wrap context, though lnt runs have shown that this version performs slightly better for us. llvm-svn: 247732
* Use blocks instead of domains in SCEVAffinatorJohannes Doerfert2015-09-151-13/+11
| | | | | | | | Due to the new domain generation, the SCoP keeps track of the domain for all blocks, thus the SCEVAffinator can now work with blocks to avoid duplication of the domains. llvm-svn: 247731
* Revert r247278 "Disable support for modulo expressions"Johannes Doerfert2015-09-141-17/+11
| | | | | | | | This reverts commit 00c5b6ca8832439193036aadaaaee92a43236219. We can handle modulo expressions in the domain again. llvm-svn: 247542
* Runtime error check eliminationJohannes Doerfert2015-09-101-0/+14
| | | | | | | | | | | | | Hoist runtime checks in the loop nest if they guard an "error" like event. Such events are recognized as blocks with an unreachable terminator or a call to the ubsan function that deals with out of bound accesses. Other "error" events can be added easily. We will ignore these blocks when we detect/model/optmize and code generate SCoPs but we will make sure that they would not have been executed using the assumption framework. llvm-svn: 247310
* Allow general loops with one latchJohannes Doerfert2015-09-101-30/+0
| | | | | | | | | | | | | | | | | | | As we do not rely on ScalarEvolution any more we do not need to get the backedge taken count. Additionally, our domain generation handles everything that is affine and has one latch and our ScopDetection will over-approximate everything else. This change will therefor allow loops with: - one latch - exiting conditions that are affine Additionally, it will not check for structured control flow anymore. Hence, loops and conditionals are not necessarily single entry single exit regions any more. Differential Version: http://reviews.llvm.org/D12758 llvm-svn: 247289
OpenPOWER on IntegriCloud