path: root/polly/test/Isl/CodeGen
Commit message | Author | Age | Files | Lines
...
* Fix non-affine region dominance of implicitly stored values | Michael Kruse | 2015-11-07 | 1 | -0/+48
  After loop versioning, the dominance check against a non-affine subregion's exit node always fails for every block in the subregion if the subregion shares its exit block with the scop. The subregion's exit block has become polly_merge_new_and_old, which also receives the control flow of the generated code. As a result, any value for implicit stores was assumed not to come from the scop. We now check dominance with the generated exit node instead.
  This fixes llvm.org/PR25438
  llvm-svn: 252375
* Add missing '%loadPolly' to test case | Tobias Grosser | 2015-11-06 | 1 | -1/+1
  llvm-svn: 252302
* Fix reuse of non-dominating synthesized value in subregion exit | Michael Kruse | 2015-11-06 | 1 | -0/+30
  We were adding all generated values in non-affine subregions to the values usable in the subregion's generated exit block. The thought was that only values that dominate the original exit block can be used there. But synthesizable values can be expanded in any block. If the same value is also used for implicit writes, the code generator would try to reuse already synthesized values even if they do not dominate the exit block. The fix is to add a value to the list of values usable in the exit block only if it dominates the exit block.
  This fixes llvm.org/PR25412.
  llvm-svn: 252301
* Adjust debug metadata to LLVM changes in 252219 | Tobias Grosser | 2015-11-06 | 2 | -4/+4
  llvm-svn: 252273
* ScopInfo: Allocate globally unique memory access identifiers | Tobias Grosser | 2015-11-05 | 2 | -0/+102
  Before this commit, memory reference identifiers were only unique per basic block, but not per (non-affine) ScopStmt. This commit now uses the MemoryAccess base pointer to uniquely identify each memory access.
  llvm-svn: 252200
* Use per-BB value maps for non-exit BBs | Michael Kruse | 2015-11-05 | 1 | -0/+35
  For generating scalar writes of non-affine subregions, all except phi writes are generated in the exit block. The phi writes are generated in the incoming block, for which we erroneously used the same BBMap. This can conflict if a value for one block is synthesized and then reused for another block which is not dominated by the first block. This is fixed by using block-specific BBMaps for phi writes.
  llvm-svn: 252172
* [FIX] Simplify and correct preloading of base pointer origin | Johannes Doerfert | 2015-11-03 | 1 | -0/+73
  To simplify and correct the preloading of a base pointer origin, e.g., the base pointer for the current indirect invariant load, we now just check if there is an invariant access class that involves the base pointer of the current class.
  llvm-svn: 251962
* [FIX] Ensure base pointer origin was preloaded already | Johannes Doerfert | 2015-11-03 | 1 | -0/+91
  If a base pointer of a preloaded value has a base pointer origin, i.e., it is an indirect invariant load, we have to make sure the base pointer origin is preloaded first.
  llvm-svn: 251946
* [FIX] Correctly update SAI base pointer | Johannes Doerfert | 2015-11-03 | 1 | -0/+50
  If a base pointer load is preloaded, we have to change the base pointer of the derived SAI. However, as the derived SAI relationship is coarse grained, we need to check if we actually preloaded the base pointer or a different element of the base pointer SAI array.
  llvm-svn: 251881
* [FIX] Use appropriately sized types for big constants | Johannes Doerfert | 2015-11-03 | 1 | -0/+56
  llvm-svn: 251869
* tests: Add test case forgotten in 251191 | Tobias Grosser | 2015-10-25 | 1 | -0/+61
  llvm-svn: 251228
* ScopDetection: Update DetectionContextMap accordingly | Tobias Grosser | 2015-10-25 | 1 | -0/+51
  When verifying whether a scop is still valid, we rerun all analyses but did not update DetectionContextMap. This change ensures that information, e.g. about non-affine regions, is correctly updated.
  llvm-svn: 251227
* Add a missing '-S' | Tobias Grosser | 2015-10-24 | 1 | -1/+1
  llvm-svn: 251199
* BlockGenerator: Do not assert when finding model PHI nodes defined outside the scop | Tobias Grosser | 2015-10-24 | 1 | -0/+53
  Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed any scalar PHI node above the scop and used in the scop is modeled as scalar read access.
  llvm-svn: 251198
* Correct typo in CHECK line | Michael Kruse | 2015-10-19 | 1 | -1/+1
  Thanks Tobias for the hint.
  llvm-svn: 250695
* Synthesize phi arguments in incoming block | Michael Kruse | 2015-10-19 | 1 | -0/+61
  New values were always synthesized in the block of the instruction that needed them. This is incorrect for PHI nodes, whose values must be defined in the respective incoming block. This patch temporarily moves the builder's insert point to the incoming block while synthesizing phi node arguments.
  This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241)
  llvm-svn: 250693
* [FIX] Do not try to hoist "empty" accesses | Johannes Doerfert | 2015-10-18 | 1 | -0/+60
  Accesses that have a relative offset (in bytes) that is not divisible by the type size (in bytes) will be represented as empty in the SCoP description. This is a problem in its own right, but it also crashed the invariant load hoisting. This patch fixes the latter problem; the former still needs to be addressed.
  This fixes bug 25236.
  llvm-svn: 250664
* [FIX] Do not hoist invariant pointers with non-loaded base ptr in SCoP | Johannes Doerfert | 2015-10-18 | 1 | -0/+57
  If the base pointer of a load is invariant and defined in the SCoP but not loaded, we cannot hoist the load as we would not hoist the base pointer definition.
  This fixes bug 25237.
  llvm-svn: 250663
* [FIX] Restructure invariant load equivalence classes | Johannes Doerfert | 2015-10-18 | 2 | -0/+181
  Sorting is replaced by a demand-driven code generation that will pre-load a value when it is needed or, if it was not needed before, at some point determined by the order of invariant accesses in the program. This demand-driven pre-loading will kick in only in very few cases, though it will prevent us from generating faulty code. An example where it is needed is shown in:
    test/ScopInfo/invariant_loads_complicated_dependences.ll
  Invariant loads that appear in parameters but are not on the top-level (e.g., the parameter is not a SCEVUnknown) will now be treated correctly.
  Differential Revision: http://reviews.llvm.org/D13831
  llvm-svn: 250655
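  The demand-driven ordering described in this commit message can be illustrated with a small Python sketch. It is not Polly's implementation: the access names and the `depends_on` mapping are hypothetical, and the only idea shown is that a value's prerequisites are preloaded first, while everything else keeps program order.

```python
def preload_order(accesses, depends_on):
    """Return the order in which invariant accesses would be preloaded.

    accesses:   access names in program order (hypothetical names).
    depends_on: maps an access to the accesses it needs preloaded first.
    """
    order = []
    started = set()

    def preload(access):
        if access in started:
            return  # already preloaded (or currently being handled)
        started.add(access)
        for dependency in depends_on.get(access, ()):
            preload(dependency)  # demand-driven: pull in prerequisites first
        order.append(access)

    for access in accesses:  # fall back to program order otherwise
        preload(access)
    return order


example = preload_order(
    ["load_x", "load_base", "load_y"],
    {"load_x": ["load_base"]},
)
print(example)
```

  In this toy input, `load_base` is emitted before `load_x` even though it appears later in program order, while `load_y` stays in its original position.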
* Remove independent blocks pass | Johannes Doerfert | 2015-10-18 | 2 | -29/+7
  Polly can now be used as an analysis-only tool as long as the code generation is disabled. However, we do not have an alternative to the independent blocks pass in place yet, though in the relevant cases this does not seem to impact the performance much. Nevertheless, a virtual alternative that allows the same transformations without changing the input region will follow shortly.
  llvm-svn: 250652
* Revert to original BlockGenerator::getOrCreateAlloca(MemoryAccess &Access) | Tobias Grosser | 2015-10-18 | 1 | -0/+129
  Expressing this in terms of BlockGenerator::getOrCreateAlloca(const ScopArrayInfo *Array) does not work, as the MemoryAccess BasePtr differs from the ScopArrayInfo BasePtr when invariant load hoisting is used. Until this is investigated and fixed, we move back to code that just uses the BasePtr of the MemoryAccess.
  llvm-svn: 250637
* Load/Store scalar accesses before/after the statement itself | Michael Kruse | 2015-10-17 | 6 | -6/+6
  Instead of generating implicit loads within basic blocks, put them before the instructions of the statement itself, including non-affine subregions. The region's entry node dominates all blocks in the region and therefore the loaded value will be available there.
  Implicit writes in block-stmts were already stored back at the end of the block. Now, also generate the stores of non-affine subregions when leaving the statement, i.e. in the exiting block.
  This change is required for array-mapped implicits ("De-LICM") to ensure that there are no dependencies of demoted scalars within statements. Statements load all required values, operate on copies in registers, and then write back the changed value to the demoted memory. Lifetime analysis within statements becomes unnecessary.
  Differential Revision: http://reviews.llvm.org/D13487
  llvm-svn: 250625
* test: Correctly check for branch statements | Tobias Grosser | 2015-10-15 | 2 | -0/+8
  In r250408 'CHECK-NEXT: br' lines were removed as they also matched a '%polly.subregion.iv.inc' instruction and consequently did not check what they were supposed to check. However, without these lines we cannot test that the .s2a instructions, which are no longer generated since r250411, really are not emitted. Hence, we add back the CHECK-NEXT lines to ensure that no instructions are generated between the store that we check for and the branch at the end of the basic block. To ensure we do not match too early, we now check for 'br i1' or 'br label'.
  llvm-svn: 250435
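  The accidental match described here comes from FileCheck patterns matching as substrings of an output line. The following Python sketch only mimics that substring behaviour with a hypothetical helper `check_matches`; the two branch lines are made-up IR used for illustration, and only the `%polly.subregion.iv.inc` line is taken from the log.

```python
def check_matches(pattern: str, line: str) -> bool:
    """Rough stand-in for a CHECK line: a plain substring search."""
    return pattern in line


iv_inc = "%polly.subregion.iv.inc = add i32 %polly.subregion.iv, 1"
cond_branch = "br i1 %cmp, label %then, label %else"  # illustrative IR
uncond_branch = "br label %exit"                      # illustrative IR

# A bare 'br' also matches the 'br' inside 'subregion'.
print(check_matches("br", iv_inc))

# The stricter patterns only match real branch instructions.
print(check_matches("br i1", iv_inc), check_matches("br label", iv_inc))
print(check_matches("br i1", cond_branch), check_matches("br label", uncond_branch))
```

  The stricter patterns work because the 'br' in 'subregion' is never followed by a space and a type or the keyword 'label'.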
* Do not add accesses for intra-ScopStmt scalar def-use chains | Michael Kruse | 2015-10-15 | 2 | -14/+0
  When pulling an llvm::Value to be written as a PHI write, the former code only checked whether it is within the same basic block, but it could also be within the same non-affine subregion. In that case an unnecessary pair of MemoryAccesses would have been created.
  Two unit tests were explicitly checking for the unnecessary writes, including comments stating that the writes are unnecessary.
  llvm-svn: 250411
* Remove "CHECK: br" from some unit tests | Michael Kruse | 2015-10-15 | 2 | -5/+0
  They happen to match the "br" inside "subregion" in
    %polly.subregion.iv.inc = add i32 %polly.subregion.iv, 1
  that is, they are misleading in what they actually check.
  llvm-svn: 250408
* Add testcase for SCEV expansion in non-affine subregions | Michael Kruse | 2015-10-15 | 1 | -0/+89
  When sharing the same map from old to new value, CodeGeneration would reuse the same new value for each basic block. However, the SCEV expander might emit code in a basic block that does not dominate a use of the SCEV in another basic block. This test checks whether both such blocks have their own expanded new values.
  llvm-svn: 250389
* [tests] More testing for PHI-nodes in non-affine regions | Tobias Grosser | 2015-10-13 | 2 | -0/+60
  We harden one test case by ensuring no additional stores may possibly be introduced between the stores we check for and the basic block terminator statements.
  We also add a test case for the situation where a value that is passed from a non-affine region to a PHI node does not dominate the exit of the non-affine region. This case has come up in patch reviews, so we make sure it is properly handled today and in the future.
  llvm-svn: 250217
* Consolidate invariant loads | Johannes Doerfert | 2015-10-09 | 3 | -6/+82
  If an (assumed) invariant location is loaded multiple times, we generated a parameter for each location. However, this caused compile time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and BT in the NAS benchmarks). Additionally, the code we generate is suboptimal as we preload the same location multiple times and perform the same checks on all the parameters that refer to the same value.
  With this patch we consolidate the invariant loads in three steps:
    1) During SCoP initialization required invariant loads are put in equivalence classes based on their pointer operand. One representing load is used to generate a parameter for the whole class, thus we never generate multiple parameters for the same location.
    2) During the SCoP simplification we remove invariant memory accesses that are in the same equivalence class. While doing so we build the union of all execution domains as it is only important that the location is at least accessed once.
    3) During code generation we only preload one element of each equivalence class with the unified execution domain. All others are mapped to that preloaded value.
  Differential Revision: http://reviews.llvm.org/D13338
  llvm-svn: 249853
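  The three consolidation steps can be sketched in Python as follows. This is a simplified illustration, not Polly code: loads are hypothetical `(name, pointer_operand, domain)` tuples, and execution domains are modelled as plain sets of iteration numbers rather than isl sets.

```python
def consolidate_invariant_loads(loads):
    """Group loads into equivalence classes keyed by pointer operand.

    Each class keeps one representative load, all member loads, and the
    union of the members' execution domains.
    """
    classes = {}
    for name, pointer, domain in loads:
        cls = classes.setdefault(
            pointer, {"representative": name, "members": [], "domain": set()}
        )
        cls["members"].append(name)   # every load joins its class
        cls["domain"] |= set(domain)  # union of execution domains
    return classes


loads = [
    ("ld0", "%A", {0, 1}),
    ("ld1", "%B", {0}),
    ("ld2", "%A", {2}),
]
result = consolidate_invariant_loads(loads)

# Only the representative of each class would be preloaded; every other
# member is mapped to that preloaded value.
value_map = {
    member: cls["representative"]
    for cls in result.values()
    for member in cls["members"]
}
print(value_map)
```

  Here `ld0` and `ld2` share the pointer operand `%A`, so they end up in one class whose domain is the union `{0, 1, 2}`, and `ld2` is mapped to the value preloaded for `ld0`.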
* Allow invariant loads in the SCoP description | Johannes Doerfert | 2015-10-07 | 9 | -2/+431
  This patch allows invariant loads to be used in the SCoP description, e.g., as loop bounds, conditions or in memory access functions.
  First we collect "required invariant loads" during SCoP detection that would otherwise make an expression we care about non-affine. To this end a new level of abstraction was introduced before SCEVValidator::isAffineExpr(), namely ScopDetection::isAffine() and ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide if we want a load inside the region to be optimistically assumed invariant or not. If we do, it will be marked as required and in the SCoP generation we bail if it is actually not invariant. If we don't, it will be a non-affine expression as before. At the moment we optimistically assume all "hoistable" (namely non-loop-carried) loads to be invariant. This causes us to expand some SCoPs and dismiss them later, but it also allows us to detect a lot we would dismiss directly if we asked, e.g., AliasAnalysis::canBasicBlockModify(). We also allow potential aliases between optimistically assumed invariant loads and other pointers as our runtime alias checks are sound in case the loads are actually invariant. Together with the invariant checks this combination allows us to handle a lot more than LICM can.
  The code generation of the invariant loads had to be extended as we can now have dependences between parameters and invariant (hoisted) loads as well as the other way around, e.g.,
    test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
  First, it is important to note that we cannot have real cycles but only dependences from a hoisted load to a parameter and from another parameter to that hoisted load (and so on). To handle such cases we materialize llvm::Values for parameters that are referred to by a hoisted load on demand and then materialize the remaining parameters.
  Second, there are new kinds of dependences between hoisted loads caused by the constraints on their execution. If a hoisted load is conditionally executed it might depend on the value of another hoisted load. To deal with such situations we sort them already in the ScopInfo such that they can be generated in the order they are listed in the Scop::InvariantAccesses list (see compareInvariantAccesses). The dependences between hoisted loads caused by indirect accesses are handled the same way as before.
  llvm-svn: 249607
* tests: Drop -polly-detect-unprofitable and -polly-no-early-exit | Tobias Grosser | 2015-10-06 | 170 | -214/+214
  These flags are now always passed to all tests and need to be disabled if not needed. Disabling these flags, rather than passing them to almost all tests, significantly simplifies our RUN: lines.
  llvm-svn: 249422
* test: sdiv in loop bounds has been supported for a while | Tobias Grosser | 2015-10-06 | 1 | -9/+6
  By disabling our scop-profitability heuristics this also becomes visible in some older test cases.
  llvm-svn: 249411
* Remove non-executed statements during SCoP simplification | Johannes Doerfert | 2015-10-04 | 1 | -0/+47
  A statement with an empty domain complicates the invariant load hoisting and does not help any subsequent analysis or transformation. In fact it might introduce parameter dimensions or increase the schedule dimensionality. To this end, we remove statements with an empty domain early in the SCoP simplification.
  llvm-svn: 249276
* [FIX] Repair broken commit | Johannes Doerfert | 2015-10-02 | 1 | -3/+4
  The last invariant load fix was based on a later patch, not on polly/master, and thus needs to be adjusted.
  llvm-svn: 249145
* [FIX] Do not hoist from inside a non-affine subregion | Johannes Doerfert | 2015-10-02 | 1 | -4/+3
  We have to skip accesses in non-affine subregions during hoisting as they might not be executed under the same condition as the entry of the non-affine subregion.
  llvm-svn: 249139
* Hand down referenced & globally mapped values to the subfunction | Johannes Doerfert | 2015-10-02 | 3 | -1/+111
  If a value is globally mapped (IslNodeBuilder::ValueMap) and referenced in the code that will be put into a subfunction, we hand down the new value to the subfunction.
  This patch also removes code that handed down all invariant loads to the subfunction. Instead, only needed invariant loads are given to the subfunction. There are two possible reasons for an invariant load to be handed down:
    1) The invariant load is used in a block that is placed in the subfunction but which is not the parent of the load. In this case, the scalar access that will read the loaded value will cause its base pointer (the preloaded value) to be handed down to the subfunction.
    2) The invariant load is defined and used in a block that is placed in the subfunction. With this patch we will hand down the preloaded value to the subfunction as the invariant load is globally mapped to that value.
  llvm-svn: 249126
* [FIX] Parallel codegen for invariant loads | Johannes Doerfert | 2015-10-01 | 1 | -0/+33
  Hand down all preloaded values to the parallel subfunction.
  llvm-svn: 249010
* Reapply "BlockGenerator: Generate synthesisable instructions only on-demand" | Tobias Grosser | 2015-09-30 | 5 | -14/+47
  Instructions which we can synthesize from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instructions and works more reliably than first inserting them and then deleting them later on.
  This commit was reverted in r248860 due to a remaining miscompile, where we forgot to synthesize the operand values that were referenced from scalar writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we now do this correctly.
  llvm-svn: 248900
* [FIX] Use escape logic for invariant loads | Johannes Doerfert | 2015-09-30 | 2 | -0/+114
  Before, we unconditionally forced all users outside the SCoP to use the preloaded value. However, if the SCoP is not executed due to the runtime checks, we need to use the original value because it might not be invariant in the first place.
  llvm-svn: 248881
* Identify and hoist definitively invariant loads | Johannes Doerfert | 2015-09-29 | 11 | -120/+100
  As a first step in the direction of assumed invariant loads (loads that are not written in some context) we now detect and hoist definitively invariant loads. These invariant loads will be preloaded in the code generation and used in the optimized version of the SCoP. If the load is only conditionally executed, the preloaded version will also only be executed under the same condition, hence we will never access memory that wouldn't have been accessed otherwise. This is also the most distinguishing feature compared to LICM.
  As hoisting can make statements empty, we will simplify the SCoP and remove empty statements that would otherwise cause artifacts in the code generation.
  Differential Revision: http://reviews.llvm.org/D13194
  llvm-svn: 248861
* Revert "BlockGenerator: Generate synthesisable instructions only on-demand" | Johannes Doerfert | 2015-09-29 | 4 | -15/+14
  This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207.
  The commit broke at least one test in lnt: MultiSource/Benchmarks/Ptrdist/bc/number.c was miscompiled and the test produced a wrong result.
  One Polly test case that was added later was adjusted too.
  llvm-svn: 248860
* Codegen: Support memory accesses with different types | Tobias Grosser | 2015-09-29 | 2 | -0/+99
  Every once in a while we see code that accesses memory with different types, e.g. to perform operations on a piece of memory using type 'float', but to copy data to this memory using type 'int'. Modeled in C, such codes look like:

    void foo(float A[], float B[]) {
      for (long i = 0; i < 100; i++)
        *(int *)(&A[i]) = *(int *)(&B[i]);
      for (long i = 0; i < 100; i++)
        A[i] += 10;
    }

  We already used the correct types during normal operations, but fall back to our detected type as soon as we import changed memory access functions. For these memory accesses we may generate invalid IR due to a mismatch between the element type of the array we detect and the actual type used in the memory access. To address this issue, we always cast the newly created address of a memory access back to the type of the memory access where the address will be used.
  llvm-svn: 248781
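  As a side illustration of why the 'int'-typed copy in the C example still moves 'float' values intact, the following Python sketch reinterprets the four bytes of a 32-bit float as a 32-bit integer and back. It only demonstrates the bit-level view using the standard `struct` module; the helper names are hypothetical and nothing here reflects how Polly casts addresses.

```python
import struct


def float_bits_as_int(value: float) -> int:
    """Reinterpret the 4 bytes of a 32-bit float as a signed 32-bit int."""
    return struct.unpack("<i", struct.pack("<f", value))[0]


def int_bits_as_float(bits: int) -> float:
    """Reinterpret the 4 bytes of a signed 32-bit int as a 32-bit float."""
    return struct.unpack("<f", struct.pack("<i", bits))[0]


B = [1.0, 2.5, -4.0]
# Copy through the integer view, as the first loop of the C example does.
A = [int_bits_as_float(float_bits_as_int(b)) for b in B]
# Then operate on the data as floats, as the second loop does.
A = [a + 10 for a in A]
print(hex(float_bits_as_int(1.0)), A)
```

  The copy is lossless because only the interpretation of the bytes changes, not the bytes themselves; the access type still matters to the code generator because the generated load and store instructions must use the type of the access rather than the detected element type of the array.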
* OpenMP: Name addresses in subfunction structure | Tobias Grosser | 2015-09-28 | 1 | -2/+2
  While debugging, this makes it easier to understand which memory reference caused these stores to be introduced.
  llvm-svn: 248717
* BlockGenerator: Generate synthesisable instructions only on-demand | Tobias Grosser | 2015-09-28 | 3 | -10/+11
  Instructions which we can synthesize from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instructions and works more reliably than first inserting them and then deleting them later on.
  Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
  Differential Revision: http://reviews.llvm.org/D13208
  llvm-svn: 248712
* Allow switch instructions in SCoPs | Johannes Doerfert | 2015-09-28 | 2 | -0/+144
  This patch allows switch instructions with affine conditions in the SCoP. Switch instructions in non-affine subregions are allowed as well. Neither required many changes to the code, though some refactoring was needed to integrate them without code duplication.
  In the llvm-test suite the number of profitable SCoPs increased from 135 to 139, but more importantly we can handle more benchmarks and user inputs without preprocessing.
  Differential Revision: http://reviews.llvm.org/D13200
  llvm-svn: 248701
* BlockGenerator: Be less aggressive with deleting dead instructions | Tobias Grosser | 2015-09-27 | 2 | -6/+10
  We now only delete trivially dead instructions in the BB we copy (copyBB), but not in any other BB. Only for copyBB do we know that there will _never_ be any future uses of instructions that have no use after copyBB has been generated. Other instructions in the AST that have been generated by IslNodeBuilder may look dead at the moment, but may possibly still be referenced by GlobalMaps. If we delete them now, later uses would break surprisingly.
  We do not have a test case that breaks due to us deleting too many instructions. This issue was found by inspection.
  llvm-svn: 248688
* BlockGenerator: Simplify code generated for region statements | Tobias Grosser | 2015-09-27 | 3 | -14/+5
  After having generated a new user statement a couple of inefficient or trivially dead instructions may remain. This commit runs instruction simplification over the newly generated blocks to ensure unneeded instructions are removed right away.
  This commit adds simplification for non-affine subregions, which was not yet part of 248681.
  llvm-svn: 248683
* [CodeGen test] Replace undef values with some defined constants | Tobias Grosser | 2015-09-27 | 1 | -4/+4
  Otherwise, part of the computation will be just simplified away when we add instruction simplification support to the RegionGenerator.
  llvm-svn: 248682
* BlockGenerator: Simplify code generated for scop statements | Tobias Grosser | 2015-09-27 | 2 | -6/+5
  After having generated a new user statement a couple of inefficient or trivially dead instructions may remain. This commit runs instruction simplification over the newly generated blocks to ensure unneeded instructions are removed right away.
  This commit does not yet add simplification for non-affine subregions.
  llvm-svn: 248681
* Create parallel code in a separate block | Johannes Doerfert | 2015-09-26 | 6 | -3/+6
  This commit basically reverts r246427 but still solves the issue tackled by that commit. Instead of emitting initialization code in the beginning of the start block we now generate parallel code in its own block and thereby guarantee separation. This is necessary as we cannot generate code for hoisted loads prior to the start block but it still needs to be placed prior to everything else.
  llvm-svn: 248674
* Simplify domain generation | Johannes Doerfert | 2015-09-20 | 1 | -2/+2
  We now add loop-carried information during the second traversal of the region instead of in an intermediate step in between. This makes the generation simpler, removes code and should even be faster.
  llvm-svn: 248125