summaryrefslogtreecommitdiffstats
path: root/polly/test/ScopInfo/NonAffine
Commit message (Collapse)AuthorAgeFilesLines
* [ScopDetect] Reject loop with multiple exit blocks.Michael Kruse2018-04-251-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current statement domain derivation algorithm does not (always) consider that different exit blocks of a loop can have different conditions to be reached. From the code for (int i = n; ; i-=2) { if (i <= 0) goto even; if (i <= 1) goto odd; A[i] = i; } even: A[0] = 42; return; odd: A[1] = 21; return; Polly currently derives the following domains: Stmt_even_critedge Domain := [n] -> { Stmt_even_critedge[] }; Stmt_odd Domain := [n] -> { Stmt_odd[] : (1 + n) mod 2 = 0 and n > 0 }; while the domain for the odd case is correct, Stmt_even is assumed to be executed unconditionally, which is obviously wrong. While projecting out the loop dimension in `adjustDomainDimensions`, it does not consider that there are other exit condition that have matched before. I don't know a how to fix this without changing a lot of code. Therefore This patch rejects loops with multiple exist blocks to fix the miscompile of test-suite's uuencode. The odd condition is transformed by LLVM to %cmp1 = icmp eq i64 %indvars.iv, 1 such that the project_out in adjustDomainDimensions() indeed only matches for odd n (using this condition only, we'd have an infinite loop otherwise). The even condition manifests as %cmp = icmp slt i64 %indvars.iv, 3 Because buildDomainsWithBranchConstraints() does not consider other exit conditions, it has to assume that the induction variable will eventually be lower than 3 and taking this exit. IMHO we need to reuse the algorithm that determines the number of iterations (addLoopBoundsToHeaderDomain) to determine which exit condition applies first. It has to happen in buildDomainsWithBranchConstraints() because the result will need to propagate to successor BBs. Currently addLoopBoundsToHeaderDomain() just look for union of all backedge conditions (which means leaving not the loop here). The patch in llvm.org/PR35465 changes it to look for exit conditions instead. This is required because there might be other exit conditions that do not alternatively go back to the loop header. Differential Revision: https://reviews.llvm.org/D45649 llvm-svn: 330858
* Update isl to isl-0.18-1047-g4a20ef8Tobias Grosser2018-02-209-14/+14
| | | | | | | | | | | | | | | | | | This update: - Removes several deprecated functions (e.g., isl_band). - Improves the pretty-printing of sets by detecting modulos and "false" equalities. - Minor improvements to coalescing and increased robustness of the isl scheduler. This update does not yet include isl commit isl-0.18-90-gd00cb45 (isl_pw_*_alloc: add missing check for compatible spaces, Wed Sep 6 12:18:04 2017 +0200), as this additional check is too tight and unfortunately causes two test case failures in Polly. A patch has been submitted to isl and will be included in the next isl update for Polly. llvm-svn: 325557
* Do not use wrapping ranges to bound non-affine accessesTobias Grosser2017-02-123-4/+4
| | | | | | | | | | | | | When deriving the range of valid values of a scalar evolution expression might be a range [12, 8), where the upper bound is smaller than the lower bound and where the range is expected to possibly wrap around. We theoretically could model such a range as a union of two non-wrapping ranges, but do not do this as of yet. Instead, we just do not derive any bounds. Before this change, we could have obtained bounds where the maximal possible value is strictly smaller than the minimal possible value, which is incorrect and also caused assertions during scop modeling. llvm-svn: 294891
* Update tests to more precise analysis results in LLVM coreTobias Grosser2017-01-111-2/+2
| | | | | | | LLVM's range analysis became a little tighter, which means Polly can derive tighter bounds as well. llvm-svn: 291718
* [test] Add missing colon.Michael Kruse2016-10-161-1/+1
| | | | llvm-svn: 284349
* [test] Add -polly-unprofitable-scalar-accs to test that needs it.Michael Kruse2016-10-161-0/+1
| | | | | | | | The test non_affine_loop_used_later.ll also tests the profability heuristic. Add the option -polly-unprofitable-scalar-accs explicitely to ensure that the test succeeds if the default value is changed. llvm-svn: 284338
* [tests] Force invariant load hoisting for test cases that need it -- IIITobias Grosser2016-08-152-1/+5
| | | | llvm-svn: 278673
* [tests] Force invariant load hoisting for test cases that need it IITobias Grosser2016-08-151-1/+2
| | | | llvm-svn: 278669
* [FIX] Model the rounding behaviour of SRem correctlyJohannes Doerfert2016-06-071-4/+4
| | | | llvm-svn: 272001
* Optimistic assume required invariant loads to be invariantJohannes Doerfert2016-05-232-6/+92
| | | | | | | | | Before this patch we bailed if a required invariant load was potentially overwritten. However, now we will optimistically assume it is actually invariant and, to this end, restrict the valid parameter space as well as the execution context with regards to potential overwrites of the location. llvm-svn: 270416
* Revert "Optimistic assume required invariant loads to be invariant"Johannes Doerfert2016-05-192-92/+6
| | | | | | | This reverts commit 787e642207ca978f2e800140529fc7049ea1f3de until the lnt failures are fixed. llvm-svn: 270061
* Optimistic assume required invariant loads to be invariantJohannes Doerfert2016-05-192-6/+92
| | | | | | | | So far we bailed if a required invariant load was potentially overwritten in the SCoP. From now on we will optimistically assume it is actually invariant and, to this end, restrict the valid parameter space. llvm-svn: 270060
* Invalidate unprofitable SCoPs after creationJohannes Doerfert2016-05-101-2/+2
| | | | | | | | | | If a profitable run is performed we will check if the SCoP seems to be profitable after creation but before e.g., dependence are computed. This is needed as SCoP detection only approximates the actual SCoP representation. In the end this should allow us to be less conservative during the SCoP detection while keeping the compile time in check. llvm-svn: 269074
* Weaken profitability constraints during ScopDetectionJohannes Doerfert2016-05-101-1/+3
| | | | | | | | Regions with one affine loop can be profitable if the loop is distributable. To this end we will allow them to be treated as profitable if they contain at least two non-trivial basic blocks. llvm-svn: 269064
* Simplify the internal representation according to the context [NFC]Johannes Doerfert2016-05-101-1/+1
| | | | | | | | We now use context information to simplify the domains and access functions of the SCoP instead of just aligning them with the parameter space. llvm-svn: 269048
* Record assumptions first and add them laterJohannes Doerfert2016-04-121-1/+1
| | | | | | | | | | | | | | | | | | There are three reasons why we want to record assumptions first before we add them to the assumed/invalid context: 1) If the SCoP is not profitable or otherwise invalid without the assumed/invalid context we do not have to compute it. 2) Information about the context are gathered rather late in the SCoP construction (basically after we know all parameters), thus the user might see overly complicated assumptions to be taken while they would have been simplified later on. 3) Currently we cannot take assumptions at any point but have to wait, e.g., for the domain generation to finish. This makes wrapping assumptions much more complicated as they need to be and it will have a similar effect on "signed-unsigned" assumptions later. llvm-svn: 266068
* Track assumptions and restrictions separatlyJohannes Doerfert2016-03-018-20/+20
| | | | | | | | | | | | | | | In order to speed up compile time and to avoid random timeouts we now separately track assumptions and restrictions. In this context assumptions describe parameter valuations we need and restrictions describe parameter valuations we do not allow. During AST generation we create a runtime check for both, whereas the one for the restrictions is negated before a conjunction is build. Except the In-Bounds assumptions we currently only track restrictions. Differential Revision: http://reviews.llvm.org/D17247 llvm-svn: 262328
* Separate more constant factors of parametersJohannes Doerfert2016-02-142-21/+20
| | | | | | | | | | | | | | | | | So far we separated constant factors from multiplications, however, only when they are at the outermost level of a parameter SCEV. Now, we also separate constant factors from the parameter SCEV if the outermost expression is a SCEVAddRecExpr. With the changes to the SCEVAffinator we can now improve the extractConstantFactor(...) function at will without worrying about any other code part. Thus, if needed we can implement a more comprehensive extractConstantFactor(...) function that will traverse the SCEV instead of looking only at the outermost level. Four test cases were affected. One did not change much and the other three were simplified. llvm-svn: 260859
* [FIX] Two "off-by-one" error in constant range usageJohannes Doerfert2016-02-078-14/+14
| | | | llvm-svn: 260031
* ScopInfo: Never add read accesses for synthesizable valuesMichael Kruse2016-01-271-4/+0
| | | | | | | | | | | | | Before adding a MK_Value READ MemoryAccess, check whether the read is necessary or synthesizable. Synthesizable values are later generated by the SCEVExpander and therefore do not need to be transferred explicitly. This can happen because the check for synthesizability has presumbly been forgotten in the case where a phi's incoming value has been defined in a different statement. Differential Revision: http://reviews.llvm.org/D15687 llvm-svn: 258998
* Unique phi write accessesMichael Kruse2016-01-261-3/+1
| | | | | | | | | | | | | | | | | | | Ensure that there is at most one phi write access per PHINode and ScopStmt. In particular, this would be possible for non-affine subregions with multiple exiting blocks. We replace multiple MAY_WRITE accesses by one MUST_WRITE access. The written value is constructed using a PHINode of all exiting blocks. The interpretation of the PHI WRITE's "accessed value" changed from the incoming value to the PHI like for PHI READs since there is no unique incoming value. Because region simplification shuffles around PHI nodes -- particularly with exit node PHIs -- the PHINodes at analysis time does not always exist anymore in the code generation pass. We instead remember the incoming block/value pair in the MemoryAccess. Differential Revision: http://reviews.llvm.org/D15681 llvm-svn: 258809
* Remove irreducible control flow from test caseTobias Grosser2016-01-221-7/+13
| | | | | | | | | | | | | | | The test case we look at does not necessarily require irreducible control flow, but a normal loop is sufficient to create a non-affine region containing more than one basic block that dominates the exit node. We replace this irreducible control flow with a normal loop for the following reasons: 1) This is easier to understand 2) We will subsequently commit a patch that ensures Polly does not process irreducible control flow. Within non-affine regions, we could possibly handle irreducible control flow. llvm-svn: 258496
* Update to ISL 0.16.1Michael Kruse2016-01-1516-47/+47
| | | | llvm-svn: 257898
* Prepare unit tests for update to ISL 0.16Michael Kruse2016-01-1513-365/+500
| | | | | | | | | | | | | | | | | | | | | ISL 0.16 will change how sets are printed which breaks 117 unit tests that text-compare printed sets. This patch re-formats most of these unit tests using a script and small manual editing on top of that. When actually updating ISL, most work is done by just re-running the script to adapt to the changed output. Some tests that compare IR and tests with single CHECK-lines that can be easily updated manually are not included here. The re-format script will also be committed afterwards. The per-test formatter invocation command lines options will not be added in the near future because it is ad hoc and would overwrite the manual edits. Ideally it also shouldn't be required anymore because ISL's set printing has become more stable in 0.16. Differential Revision: http://reviews.llvm.org/D16095 llvm-svn: 257851
* Revert "Always treat scalar writes as MUST_WRITEs"Tobias Grosser2015-12-141-1/+1
| | | | | | | | | | | | | | | | | | | This reverts commit r255471. Johannes raised in the post-commit review of r255471 the concern that PHI writes in non-affine regions with two exiting blocks are not really MUST_WRITE, but we just know that at least one out of the set of all possible PHI writes will be executed. Modeling all PHI nodes as MUST_WRITEs is probably save, but adding the needed documentation for such a special case is probably not worth the effort. Michael will be proposing a new patch that ensures only a single PHI_WRITE is created for non-affine regions, which - besides other benefits - should also allow us to use a single well-defined MUST_WRITE for such PHI writes. (This is not a full revert, but the condition and documentation have been slightly extended) llvm-svn: 255503
* Add unit test for r255473Michael Kruse2015-12-141-0/+51
| | | | | | | Check that memory accesses in non-affine regions that are always executed are MUST_WRITE. llvm-svn: 255500
* Always treat scalar writes as MUST_WRITEsMichael Kruse2015-12-131-1/+1
| | | | | | | | | LLVM's IR guarantees that a value definition occurs before any use, and also the value of a PHI must be one of the incoming values, "written" in one of the incoming blocks. Hence, such writes are never conditional in the context of a non-affine subregion. llvm-svn: 255471
* Revert "Avoid unnecessay .s2a write access when used only in PHIs"Tobias Grosser2015-10-172-2/+20
| | | | | | | This reverts commit r250606 due to some bugs it introduced. After these bugs have been resolved, we will add it back to tree. llvm-svn: 250607
* Avoid unnecessay .s2a write access when used only in PHIsMichael Kruse2015-10-162-20/+2
| | | | | | | PHI accesses will be handled separately by buildPHIAccesses, buildScalarDependences does not need to create additional accesses. llvm-svn: 250517
* Allow invariant loads in the SCoP descriptionJohannes Doerfert2015-10-074-60/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows invariant loads to be used in the SCoP description, e.g., as loop bounds, conditions or in memory access functions. First we collect "required invariant loads" during SCoP detection that would otherwise make an expression we care about non-affine. To this end a new level of abstraction was introduced before SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide if we want a load inside the region to be optimistically assumed invariant or not. If we do, it will be marked as required and in the SCoP generation we bail if it is actually not invariant. If we don't it will be a non-affine expression as before. At the moment we optimistically assume all "hoistable" (namely non-loop-carried) loads to be invariant. This causes us to expand some SCoPs and dismiss them later but it also allows us to detect a lot we would dismiss directly if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also allow potential aliases between optimistically assumed invariant loads and other pointers as our runtime alias checks are sound in case the loads are actually invariant. Together with the invariant checks this combination allows to handle a lot more than LICM can. The code generation of the invariant loads had to be extended as we can now have dependences between parameters and invariant (hoisted) loads as well as the other way around, e.g., test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll First, it is important to note that we cannot have real cycles but only dependences from a hoisted load to a parameter and from another parameter to that hoisted load (and so on). To handle such cases we materialize llvm::Values for parameters that are referred by a hoisted load on demand and then materialize the remaining parameters. Second, there are new kinds of dependences between hoisted loads caused by the constraints on their execution. If a hoisted load is conditionally executed it might depend on the value of another hoisted load. To deal with such situations we sort them already in the ScopInfo such that they can be generated in the order they are listed in the Scop::InvariantAccesses list (see compareInvariantAccesses). The dependences between hoisted loads caused by indirect accesses are handled the same way as before. llvm-svn: 249607
* Introduce -polly-process-unprofitableTobias Grosser2015-10-064-4/+4
| | | | | | | | | | This single option replaces -polly-detect-unprofitable and -polly-no-early-exit and is supposed to be the only option that disables compile-time heuristics that aim to bail out early on scops that are believed to not benefit from Polly optimizations. Suggested-by: Johannes Doerfert llvm-svn: 249426
* tests: Drop -polly-detect-unprofitable and -polly-no-early-exitTobias Grosser2015-10-0618-24/+24
| | | | | | | | These flags are now always passed to all tests and need to be disabled if not needed. Disabling these flags, rather than passing them to almost all tests, significantly simplfies our RUN: lines. llvm-svn: 249422
* tests: Explicitly state if profitability tests should be usedTobias Grosser2015-10-064-7/+25
| | | | | | | | | | | | Polly's profitability heuristic saves compile time by skipping trivial scops or scops were we know no good optimization can be applied. For almost all our tests this heuristic makes little sense as we aim for minimal test cases when testing functionality. Hence, in almost all cases this heuristic is better be disabled. In preparation of disabling Polly's compile time heuristic by default in the test suite we first explicitly enable it in the couple of test cases that really use it (or run with/without heuristic side-by-side). llvm-svn: 249418
* [FIX] Count affine loops correctlyJohannes Doerfert2015-10-044-4/+18
| | | | | | | The "unprofitable" heuristic was broken and counted boxed loops even though we do not represent and optimize them. llvm-svn: 249274
* [FIX] Repair broken commitJohannes Doerfert2015-10-021-57/+0
| | | | | | | The last invariant load fix was based on a later patch not polly/master, thus needs to be adjusted. llvm-svn: 249145
* [FIX] Do not hoist from inside a non-affine subregionJohannes Doerfert2015-10-021-0/+57
| | | | | | | | We have to skip accesses in non-affine subregions during hoisting as they might not be executed under the same condition as the entry of the non-affine subregion. llvm-svn: 249139
* Earlier creation of ScopStmt objectsMichael Kruse2015-10-021-4/+4
| | | | | | | | | | | | | | | | This moves the construction of ScopStmt to the beginning of the ScopInfo pass. The late creation was a result of the earlier separation of ScopInfo and TempScopInfo. This will avoid introducing more ScopStmt-like maps in future commits. The AccFuncMap will also be removed in some future commit. DomainMap might also be included into ScopStmt. The order in which ScopStmt are created changes and initially creates empty statements that are removed in a simplification. Differential Revision: http://reviews.llvm.org/D13341 llvm-svn: 249132
* [FIX] Use the surrounding loop for non-affine SCoP regionsJohannes Doerfert2015-09-261-0/+30
| | | | | | | | | | When the whole SCoP is a non-affine region we need to use the surrounding loop in the construction of the schedule as that is the one that will be looked up after the schedule generation. This fixes bug 24947 llvm-svn: 248667
* Use modulo semantic to generate non-integer-overflow assumptionsJohannes Doerfert2015-09-154-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | This will allow to generate non-wrap assumptions for integer expressions that are part of the SCoP. We compare the common isl representation of the expression with one computed with modulo semantic. For all parameter combinations they are not equal we can have integer overflows. The nsw flags are respected when the modulo representation is computed, nuw and nw flags are ignored for now. In order to not increase compile time to much, the non-wrap assumptions are collected in a separate boundary context instead of the assumed context. This helps compile time as the boundary context can become complex and it is therefor not advised to use it in other operations except runtime check generation. However, the assumed context is e.g., used to tighten dependences. While the boundary context might help to tighten the assumed context it is doubtful that it will help in practice (it does not effect lnt much) as the boundary (or no-wrap assumptions) only restrict the very end of the possible value range of parameters. PET uses a different approach to compute the no-wrap context, though lnt runs have shown that this version performs slightly better for us. llvm-svn: 247732
* Revert r247278 "Disable support for modulo expressions"Johannes Doerfert2015-09-143-11/+3
| | | | | | | | This reverts commit 00c5b6ca8832439193036aadaaaee92a43236219. We can handle modulo expressions in the domain again. llvm-svn: 247542
* Propagate exit conditions as described in the PET paperJohannes Doerfert2015-09-141-2/+2
| | | | | | | | | At some point we build loop trip counts using this method. It was replaced by a simpler trick that works only for affine (e.g., not modulo) constraints and relies on the removal of unbounded parts. In order to allow modulo constrains again we go back to the former, more accurate method. llvm-svn: 247540
* Replace ScalarEvolution based domain generationJohannes Doerfert2015-09-106-15/+210
| | | | | | | | | | | | | | | | | | | | | | This patch replaces the last legacy part of the domain generation, namely the ScalarEvolution part that was used to obtain loop bounds. We now iterate over the loops in the region and propagate the back edge condition to the header blocks. Afterwards we propagate the new information once through the whole region. In this process we simply ignore unbounded parts of the domain and thereby assume the absence of infinite loops. + This patch already identified a couple of broken unit tests we had for years. + We allow more loops already and the step to multiple exit and multiple back edges is minimal. + It allows to model the overflow checks properly as we actually visit every block in the SCoP and know where which condition is evaluated. - It is currently not compatible with modulo constraints in the domain. Differential Revision: http://reviews.llvm.org/D12499 llvm-svn: 247279
* Disable support for modulo expressionsJohannes Doerfert2015-09-101-0/+5
| | | | | | | | | The support for modulo expressions is not comlete and makes the new domain generation harder. As the currently broken domain generation needs to be replaced, we will first swap in the new, fixed domain generation and make it compatible with the modulo expressions later. llvm-svn: 247278
* Allow PHI nodes in the region exit blockJohannes Doerfert2015-09-082-3/+21
| | | | | | | | | | | | While we do not need to model PHI nodes in the region exit (as it is not part of the SCoP), we need to prepare for the case that the exit block is split in code generation to create a single exiting block. If this will happen, hence if the region did not have a single exiting block before, we will model the operands of the PHI nodes as escaping scalars in the SCoP. Differential Revision: http://reviews.llvm.org/D12051 llvm-svn: 247078
* Traverse the SCoP to compute non-loop-carried domain conditionsJohannes Doerfert2015-08-3010-29/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | In order to compute domain conditions for conditionals we will now traverse the region in the ScopInfo once and build the domains for each block in the region. The SCoP statements can then use these constraints when they build their domain. The reason behind this change is twofold: 1) This removes a big chunk of preprocessing logic from the TempScopInfo, namely the Conditionals we used to build there. Additionally to moving this logic it is also simplified. Instead of walking the dominance tree up for each basic block in the region (as we did before), we now traverse the region only once in order to collect the domain conditions. 2) This is the first step towards the isl based domain creation. The second step will traverse the region similar to this step, however it will propagate back edge conditions. Once both are in place this conditional handling will allow multiple exit loops additional logic. Reviewers: grosser Differential Revision: http://reviews.llvm.org/D12428 llvm-svn: 246398
* Do not detect Scops with only one loop.Tobias Grosser2015-08-277-13/+38
| | | | | | | | | | | | | | | | | | | | If a region does not have more than one loop, we do not identify it as a Scop in ScopDetection. The main optimizations Polly is currently performing (tiling, preparation for outer-loop vectorization and loop fusion) are unlikely to have a positive impact on individual loops. In some cases, Polly's run-time alias checks or conditional hoisting may still have a positive impact, but those are mostly enabling transformations which LLVM already performs for individual loops. As we do not focus on individual loops, we leave them untouched to not introduce compile time regressions and execution time noise. This results in good compile time reduction (oourafft: -73.99%, smg2000: -56.25%). Contributed-by: Pratik Bhatu <cs12b1010@iith.ac.in> Reviewers: grosser Differential Revision: http://reviews.llvm.org/D12268 llvm-svn: 246161
* Add option to control reduction detectionTobias Grosser2015-08-201-0/+9
| | | | llvm-svn: 245598
* Update isl to isl-0.15-117-ge42acfeTobias Grosser2015-08-111-1/+1
| | | | | | | | | | | Besides other changes this version of isl contains a fundamental fix to memory corruption issues we have seen with imath-32 backed isl_ints. This update also contains a fix that ensures that the schedule-tree based version of isl's dependence analysis takes the domain of the schedule into account. llvm-svn: 244585
* Keep track of ScopArrayInfo objects that model PHI node storageTobias Grosser2015-07-281-5/+5
| | | | | | | | | | | | | | | | | | | | | Summary: When translating PHI nodes into memory dependences during code generation we require two kinds of memory. 'Normal memory' as for all scalar dependences and 'PHI node memory' to store the incoming values of the PHI node. With this patch we now mark and track these two kinds of memories, which we previously incorrectly marked as a single memory object. Being aware of PHI node storage makes code generation easier, as we do not need to guess what kind of storage a scalar reference requires. This simplifies the code nicely. Reviewers: jdoerfert Subscribers: pollydev, llvm-commits Differential Revision: http://reviews.llvm.org/D11554 llvm-svn: 243420
* Use schedule trees to represent execution order of statementsTobias Grosser2015-07-142-3/+3
| | | | | | | | | | | | | | | | | | Instead of flat schedules, we now use so-called schedule trees to represent the execution order of the statements in a SCoP. Schedule trees make it a lot easier to analyze, understand and modify properties of a schedule, as specific nodes in the tree can be choosen and possibly replaced. This patch does not yet fully move our DependenceInfo pass to schedule trees, as some additional performance analysis is needed here. (In general schedule trees should be faster in compile-time, as the more structured representation is generally easier to analyze and work with). We also can not yet perform the reduction analysis on schedule trees. For more information regarding schedule trees, please see Section 6 of https://lirias.kuleuven.be/handle/123456789/497238 llvm-svn: 242130
OpenPOWER on IntegriCloud