path: root/polly/lib/CodeGen/BlockGenerators.cpp
Commit message, Author, Age, Files, Lines
...
* BlockGenerators: Remove unnecessary const_castTobias Grosser2015-12-221-1/+1
| | | | llvm-svn: 256227
* Adjust formatting to clang-format changes in 256149Tobias Grosser2015-12-211-1/+1
| | | | llvm-svn: 256151
* BlockGenerator: Use getArrayAccessFor for vector code generationTobias Grosser2015-12-151-2/+2
| | | | | | | | | | | | | | | getAccessFor does not guarantee a certain access to be returned in case an instruction is related to multiple accesses. However, in the vector code generation we want to know the stride of the array access of a store instruction. By using getArrayAccessFor we ensure we always get the correct memory access. This patch fixes a potential bug, but I was unable to produce a failing test case. Several existing test cases cover this code, but all of them already passed by luck (or because of the specific but not-guaranteed order in which we build memory accesses). llvm-svn: 255715
* VectorBlockGenerator: Generate scalar loads for vector statementsTobias Grosser2015-12-151-0/+34
| | | | | | | | When generating scalar loads/stores separately the vector code has not been updated. This commit adds code to generate scalar loads for vector code as well as code to assert in case scalar stores are encountered within a vector loop. llvm-svn: 255714
* ScopInfo: Look up first (and only) array accessTobias Grosser2015-12-151-1/+1
| | | | | | | | | | When rewriting the access functions of load/store statements, we are only interested in the actual array memory location. The current code just took the very first memory access, which could be a scalar or an array access. As a result, we failed to update access functions even though this was requested via .jscop. llvm-svn: 255713
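The lookup described in the two messages above can be sketched in isolation. This is a minimal illustration, not Polly's code: `ToyAccess` and `findArrayAccess` are hypothetical names, and the real access objects carry far more state. It only shows why taking the very first access is fragile when a scalar access happens to be listed first.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one memory access of a statement.
struct ToyAccess {
  std::string Name;
  bool IsArray; // true for array accesses, false for scalar ones
};

// Returns the first array access, or nullptr if the statement has none.
// Unlike Accesses.front(), the result does not depend on whether a scalar
// access happens to be registered before the array access.
const ToyAccess *findArrayAccess(const std::vector<ToyAccess> &Accesses) {
  auto It = std::find_if(Accesses.begin(), Accesses.end(),
                         [](const ToyAccess &A) { return A.IsArray; });
  return It == Accesses.end() ? nullptr : &*It;
}
```

With the accesses `{scalar.x, A[i]}`, `front()` yields the scalar access, while `findArrayAccess` yields `A[i]`.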
* Fix typos; NFCMichael Kruse2015-12-141-1/+1
| | | | llvm-svn: 255580
* BlockGenerator: Do not use fast-path for external constantsTobias Grosser2015-12-141-3/+6
| | | | | | | This change should not alter the behavior of Polly today, but it allows external constants to be remapped, e.g. when targeting multiple LLVM modules. llvm-svn: 255506
* BlockGenerator: Drop unneeded const_castsTobias Grosser2015-12-141-3/+3
| | | | llvm-svn: 255505
* ScopInfo: Harmonize the different array kindsTobias Grosser2015-12-131-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Over time different vocabulary has been introduced to describe the different memory objects in Polly, resulting in different - often inconsistent - naming schemes in different parts of Polly. We now standardize this to the following scheme: KindArray, KindValue, KindPHI, KindExitPHI, where isScalar covers KindValue, KindPHI and KindExitPHI. In most cases this naming scheme has already been used previously (this minimizes changes and ensures we remain consistent with previous publications). The main change is that we remove KindScalar to clarify the difference between a scalar, as a memory object of kind Value, PHI or ExitPHI, and a value (former KindScalar), which is a memory object modeling a llvm::Value. We also move all documentation to the Kind* enum in the ScopArrayInfo class, remove the second enum in the MemoryAccess class and update documentation to be formulated from the perspective of the memory object, rather than the memory access. The terms "Implicit"/"Explicit", formerly used to describe memory accesses, have been dropped. From the perspective of memory accesses they described the different memory kinds well - especially from the perspective of code generation - but just from the perspective of a memory object it seems more straightforward to talk about scalars and arrays, rather than explicit and implicit arrays. The last comment is clearly subjective, though. A less subjective reason to go for these terms is the historic use both in mailing list discussions and publications. llvm-svn: 255467
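As a rough illustration of the naming scheme in this message, the four kinds and the scalar grouping could be modeled as below. The enum and function here are a standalone sketch with assumed spellings (`MemoryKind`, a free `isScalar`); the message only states that the real Kind* enum lives in the ScopArrayInfo class.

```cpp
// Hypothetical standalone mirror of the four kinds named in the message.
enum class MemoryKind { Array, Value, PHI, ExitPHI };

// Every kind except Array describes a scalar memory object.
constexpr bool isScalar(MemoryKind Kind) { return Kind != MemoryKind::Array; }

// Compile-time checks of the grouping described in the message.
static_assert(!isScalar(MemoryKind::Array), "arrays are not scalars");
static_assert(isScalar(MemoryKind::Value), "values are scalars");
static_assert(isScalar(MemoryKind::PHI), "PHIs are scalars");
static_assert(isScalar(MemoryKind::ExitPHI), "exit PHIs are scalars");
```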
* Introduce origin/kind for exit PHI node accessesMichael Kruse2015-11-261-2/+2
| | | | | | | | | | | | | | | Previously, accesses that originate from PHI nodes in the exit block were registered as SCALAR. In some contexts they are treated as scalars, but it makes a difference in others. We used to check whether the AccessInstruction is a terminator to differentiate the cases. This patch introduces a MemoryAccess origin EXIT_PHI and a ScopArrayInfo kind KIND_EXIT_PHI to make this case more explicit. No behavioural change intended. Differential Revision: http://reviews.llvm.org/D14688 llvm-svn: 254149
* RegionGenerator: Only introduce subregion.ivs for loops fully within a subregionTobias Grosser2015-11-121-1/+1
| | | | | | | | | | | | | | | IVs of loops whose header is in the subregion, but which are not entirely contained in it, may be incremented outside of the subregion and can consequently not be kept private to the subregion. Instead, they need to be, and are, modeled as virtual loops in the iteration domains. As this is the case, generating new subregion induction variables for such loops is not needed and indeed wrong, as they would hide the virtual induction variables modeled in the scop. This fixes a miscompile in MultiSource/Benchmarks/Ptrdist/bc and MultiSource/Benchmarks/nbench/. Thanks Michael and Johannes for their investigations and helpful observations regarding this bug. llvm-svn: 252860
* Fix non-affine generated entering node not being recognized as dominatingMichael Kruse2015-11-091-6/+14
| | | | | | | | | | | | | | Scalar reloads in the generated entering block were not recognized as dominating the subregion's blocks when there were multiple entering nodes. As a result, values defined there were not copied. As a fix, we unconditionally add the BBMap of the generated entering node to the generated entry. This fixes part of llvm.org/PR25439. This reverts 252449 and reapplies r252445. Its test was failing nondeterministically due to r252375, which was reverted in r252522. llvm-svn: 252540
* Fix dominance when subregion exit is outside scopMichael Kruse2015-11-091-2/+18
| | | | | | | | | | | | | | | The dominance of the generated non-affine subregion block was based on the scop's merge block, which resulted in an invalid DominanceTree. As a result, some values were assumed to be unusable in the actual generated exit block. We detect the case that the exit block has been moved and decide dominance using the BB at the original exit. If we create another exit node, that exit node is dominated by the one generated from where the original exit resides. This fixes llvm.org/PR25438 and part of llvm.org/PR25439. llvm-svn: 252526
* Revert r252375 "Fix non-affine region dominance of implicitely stored values"Michael Kruse2015-11-091-6/+4
| | | | | | | | It introduced nondeterminism as it was iterating over an address-indexed hashtable. The corresponding bug PR25438 will be fixed in a subsequent commit. llvm-svn: 252522
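The problem named in this revert, iterating over a table keyed by addresses, can be avoided by separating lookup from iteration order. The class below is a small sketch of that idea with assumed names (`InsertionOrderedMap`, `set`, `entries`). LLVM's ADT library offers `MapVector` for the same purpose; whether the later fix used it is not stated in this log.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal insertion-ordered map: lookups go through a hash table, while
// iteration walks a vector, so the order never depends on pointer values.
template <typename K, typename V> class InsertionOrderedMap {
  std::unordered_map<K, std::size_t> Index;  // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;      // entries in insertion order

public:
  // Inserts a new entry or overwrites the value of an existing key.
  void set(const K &Key, V Value) {
    auto It = Index.find(Key);
    if (It != Index.end()) {
      Entries[It->second].second = std::move(Value);
      return;
    }
    Index.emplace(Key, Entries.size());
    Entries.emplace_back(Key, std::move(Value));
  }

  // Entries in the order in which their keys were first inserted.
  const std::vector<std::pair<K, V>> &entries() const { return Entries; }
};
```

Iterating `entries()` gives the same sequence on every run, even when the keys are pointers whose numeric values change between runs.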
* Revert "Fix non-affine generated entering node not being recognized as ↵Johannes Doerfert2015-11-091-14/+6
| | | | | | | | | | | dominating" This reverts commit 9775824b265e574fc541e975d64d3e270243b59d due to a failing unit test. Please check and correct the unit test and commit again. llvm-svn: 252449
* Fix non-affine generated entering node not being recognized as dominatingMichael Kruse2015-11-091-6/+14
| | | | | | | | | | | Scalar reloads in the generated entering block were not recognized as dominating the subregion's blocks when there were multiple entering nodes. As a result, values defined there were not copied. As a fix, we unconditionally add the BBMap of the generated entering node to the generated entry. This fixes part of llvm.org/PR25439. llvm-svn: 252445
* [FIX] Initialize incoming scalar memory locations for PHIsJohannes Doerfert2015-11-091-4/+9
| | | | llvm-svn: 252437
* [NFC] Remove unused variable.Johannes Doerfert2015-11-091-1/+0
| | | | llvm-svn: 252436
* Fix non-affine region dominance of implicitely stored valuesMichael Kruse2015-11-071-4/+6
| | | | | | | | | | | | | | | After loop versioning, a dominance check against a non-affine subregion's exit node always fails for any block in the subregion if the subregion shares its exit block with the scop. The subregion's exit block has become polly_merge_new_and_old, which also receives the control flow of the generated code. As a result, any value for implicit stores was assumed not to come from the scop. We check dominance with the generated exit node instead. This fixes llvm.org/PR25438 llvm-svn: 252375
* polly/ADT: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-11-061-20/+22
| | | | | | | | | | | | | Remove all the implicit ilist iterator conversions from polly, in preparation for making them illegal in ADT. There was one oddity I came across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a post-increment `Builder.GetInsertPoint()++`. Since it was a no-op, I removed it, but I admit I wonder if it might be a bug (both before and after this change)? Perhaps it should be a pre-increment? llvm-svn: 252357
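Why the post-increment mentioned above has no effect can be shown with a toy builder. This is a sketch under assumptions: `ToyBuilder` is hypothetical and only mimics a getter that returns the insert position by value, which is what makes `GetInsertPoint()++` operate on a temporary copy.

```cpp
#include <iterator>
#include <list>

// Hypothetical builder that tracks an insert position inside one block.
struct ToyBuilder {
  std::list<int> Block{10, 20, 30};
  std::list<int>::iterator InsertPt = Block.begin();

  // Returns a copy of the insert position, not a reference to it.
  std::list<int>::iterator getInsertPoint() const { return InsertPt; }
  void setInsertPoint(std::list<int>::iterator It) { InsertPt = It; }
};

// The post-increment modifies only the temporary returned by the getter,
// so the builder's own insert position stays where it was.
int valueAfterTemporaryIncrement() {
  ToyBuilder B;
  B.getInsertPoint()++;
  return *B.getInsertPoint();
}

// Advancing the stored position requires writing the new iterator back.
int valueAfterExplicitAdvance() {
  ToyBuilder B;
  B.setInsertPoint(std::next(B.getInsertPoint()));
  return *B.getInsertPoint();
}
```

A pre-increment on the temporary would be just as ineffective here; only storing the advanced iterator back changes the builder's state.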
* Fix reuse of non-dominating synthesized value in subregion exitMichael Kruse2015-11-061-1/+2
| | | | | | | | | | | | | | | | We were adding all generated values in non-affine subregions to be used for the subregion's generated exit block. The thought was that only values that dominate the original exit block can be used there. But it is possible for synthesizable values to be expanded in any block. If the same value is also used for implicit writes, it would try to reuse already synthesized values even if they do not dominate the exit block. The fix is to add a value to the list of values usable in the exit block only if it dominates the exit block. This fixes llvm.org/PR25412. llvm-svn: 252301
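The filtering rule in this fix can be sketched on a toy CFG. The code assumes a precomputed immediate-dominator array in place of LLVM's dominator tree; `GeneratedValue`, `dominates` and `usableInExit` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// Idom[B] is the immediate dominator of block B; the entry block is its own
// immediate dominator. Block A dominates B if A is on B's idom chain.
bool dominates(const std::vector<int> &Idom, int A, int B) {
  while (true) {
    if (B == A)
      return true;
    int Parent = Idom[static_cast<std::size_t>(B)];
    if (Parent == B) // reached the entry block without meeting A
      return false;
    B = Parent;
  }
}

// Hypothetical record of a generated value and the block that defines it.
struct GeneratedValue {
  int Id;
  int DefBlock;
};

// Keeps only the values whose defining block dominates the exit block.
std::vector<int> usableInExit(const std::vector<GeneratedValue> &Values,
                              const std::vector<int> &Idom, int ExitBlock) {
  std::vector<int> Usable;
  for (const GeneratedValue &V : Values)
    if (dominates(Idom, V.DefBlock, ExitBlock))
      Usable.push_back(V.Id);
  return Usable;
}
```

In a diamond CFG (entry 0, branches 1 and 2, exit 3), a value defined in branch 1 is filtered out because block 1 does not dominate block 3.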
* Use per-BB value maps for non-exit BBsMichael Kruse2015-11-051-1/+6
| | | | | | | | | | | For generating scalar writes of non-affine subregions, all except phi writes are generated in the exit block. The phi writes are generated in the incoming block, for which we erroneously used the same BBMap. This can conflict if a value for one block is synthesized, and then reused for another block which is not dominated by the first block. This is fixed by using block-specific BBMaps for phi writes. llvm-svn: 252172
* RegionGenerator: Clear local maps after statement constructionTobias Grosser2015-10-261-0/+3
| | | | | | | | These maps are only needed during the construction of a single region statement. Clearing them is important, as we otherwise get an assert in case some of the referenced values are erased before the RegionGenerator is deleted. llvm-svn: 251341
* BlockGenerator: Do not assert when finding model PHI nodes defined outside ↵Tobias Grosser2015-10-241-4/+1
| | | | | | | | | | the scop Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed any scalar PHI node above the scop and used in the scop is modeled as scalar read access. llvm-svn: 251198
* BlockGenerator: Directly handle multi-exit PHI nodesTobias Grosser2015-10-241-16/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | This change adds code to directly code-generate multi-exit PHI nodes, instead of trying to reuse the EscapeMap infrastructure for this. Using escape maps adds a level of indirection that is hard to understand and - more importantly - breaks in certain cases. Specifically, the original code relied on simplifyRegion() to split the original PHI node in two PHI nodes, one merging the values coming from within the scop and a second that merges the first PHI node with the values that come from outside the scop. To generate code the first PHI node is then just handled like any other in-scop value that is used somewhere outside the scop. This fails for the case where all values from inside the scop are identical, as the first PHI node is in such cases automatically simplified and eliminated by LLVM right at construction. As a result, there is no instruction that can be passed to the EscapeMap handling, which means the references in the second PHI node are not updated and may still reference values from within the original scop that do not dominate it. Our new code iterates directly over all modeled ScopArrayInfo objects that represent multi-exit PHI nodes and generates code for them without relying on the EscapeMap infrastructure. Hence, it also works for the case where the first PHI node is eliminated. llvm-svn: 251191
* Synthesize phi arguments in incoming blockMichael Kruse2015-10-191-0/+4
| | | | | | | | | | | | New values were always synthesized in the block of the instruction that needed them. This is incorrect for PHI nodes, whose values must be defined in the respective incoming block. This patch temporarily moves the builder's insert point to the incoming block while synthesizing phi node arguments. This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241) llvm-svn: 250693
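Temporarily moving an insert point and restoring it afterwards is commonly done with a scope guard. The sketch below uses a hypothetical `ToyBuilder` and a hand-written guard to show the pattern. LLVM's IRBuilder has an `InsertPointGuard` helper with the same intent, but whether this patch used it is not stated here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical builder: records each emitted instruction together with the
// block that was the insert point at the time.
struct ToyBuilder {
  int InsertBlock = 0;
  std::vector<std::pair<int, std::string>> Emitted;
  void emit(const std::string &Inst) { Emitted.emplace_back(InsertBlock, Inst); }
};

// Saves the insert block on construction and restores it on destruction.
class InsertPointGuard {
  ToyBuilder &B;
  int Saved;

public:
  explicit InsertPointGuard(ToyBuilder &Builder)
      : B(Builder), Saved(Builder.InsertBlock) {}
  ~InsertPointGuard() { B.InsertBlock = Saved; }
  InsertPointGuard(const InsertPointGuard &) = delete;
  InsertPointGuard &operator=(const InsertPointGuard &) = delete;
};

// Emits the expression for a phi operand in the incoming block, then leaves
// the builder positioned where the caller had it.
void synthesizePhiOperand(ToyBuilder &B, int IncomingBlock,
                          const std::string &Expr) {
  InsertPointGuard Guard(B);
  B.InsertBlock = IncomingBlock;
  B.emit(Expr);
}
```

After `synthesizePhiOperand` returns, later `emit` calls land in the caller's original block again.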
* [FIX] Cast preloaded valuesJohannes Doerfert2015-10-181-1/+3
| | | | | | | Preloaded values have to match the type of their counterpart in the original code and not the type of the base array. llvm-svn: 250654
* Revert to original BlockGenerator::getOrCreateAlloca(MemoryAccess &Access)Tobias Grosser2015-10-181-1/+4
| | | | | | | | | | Expressing this in terms of BlockGenerator::getOrCreateAlloca(const ScopArrayInfo *Array) does not work as the MemoryAccess BasePtr is in case of invariant load hoisting different to the ScopArrayInfo BasePtr. Until this is investigated and fixed, we move back to code that just uses the baseptr of MemoryAccess. llvm-svn: 250637
* BlockGenerator: Add getOrCreateAlloca(const ScopArrayInfo *Array)Tobias Grosser2015-10-171-3/+7
| | | | | | | | This allows the caller to get the alloca locations of an array without the need to think about whether Array is a PHI or a non-PHI Array. We directly make use of this in BlockGenerator::getOrCreateAlloca(MemoryAccess &Access). llvm-svn: 250628
* Load/Store scalar accesses before/after the statement itselfMichael Kruse2015-10-171-48/+44
| | | | | | | | | | | | | | | | | | | | | Instead of generating implicit loads within basic blocks, put them before the instructions of the statement itself, including non-affine subregions. The region's entry node dominates all blocks in the region and therefore the loaded value will be available there. Implicit writes in block-stmts were already stored back at the end of the block. Now, also generate the stores of non-affine subregions when leaving the statement, i.e. in the exiting block. This change is required for array-mapped implicits ("De-LICM") to ensure that there are no dependencies of demoted scalars within statements. Statements load all required values, operate on copies in registers, and then write back the changed values to the demoted memory. Lifetime analysis within statements becomes unnecessary. Differential Revision: http://reviews.llvm.org/D13487 llvm-svn: 250625
* BlockGenerator: Register outside users of scalars directlyTobias Grosser2015-10-171-5/+28
| | | | | | | | | | | | Instead of checking at code generation time for each ScopStmt if a scalar has external uses, we just iterate over the ScopArrayInfo descriptions we have and check each of these for possible external uses. Besides being somewhat clearer, this approach has the benefit that we will always create valid LLVM-IR, even in case we disable the code generation of ScopStmt bodies, e.g. for testing purposes. llvm-svn: 250608
* Drop unused parameter from handleOutsideUsersTobias Grosser2015-10-171-4/+3
| | | | llvm-svn: 250606
* [NFC] Move helper functions to ScopHelperJohannes Doerfert2015-10-091-40/+0
| | | | | | | | Helper functions in the BlockGenerators.h/cpp introduce dependences from the frontend to the backend of Polly. As they are used in ScopDetection, ScopInfo, etc. we move them to the ScopHelper file. llvm-svn: 249919
* Allow invariant loads in the SCoP descriptionJohannes Doerfert2015-10-071-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows invariant loads to be used in the SCoP description, e.g., as loop bounds, conditions or in memory access functions. First we collect "required invariant loads" during SCoP detection that would otherwise make an expression we care about non-affine. To this end a new level of abstraction was introduced before SCEVValidator::isAffineExpr(), namely ScopDetection::isAffine() and ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide if we want a load inside the region to be optimistically assumed invariant or not. If we do, it will be marked as required and in the SCoP generation we bail if it is actually not invariant. If we don't, it will be a non-affine expression as before. At the moment we optimistically assume all "hoistable" (namely non-loop-carried) loads to be invariant. This causes us to expand some SCoPs and dismiss them later, but it also allows us to detect a lot we would dismiss directly if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also allow potential aliases between optimistically assumed invariant loads and other pointers, as our runtime alias checks are sound in case the loads are actually invariant. Together with the invariant checks this combination allows us to handle a lot more than LICM can. The code generation of the invariant loads had to be extended as we can now have dependences between parameters and invariant (hoisted) loads as well as the other way around, e.g., test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll First, it is important to note that we cannot have real cycles but only dependences from a hoisted load to a parameter and from another parameter to that hoisted load (and so on). To handle such cases we materialize llvm::Values for parameters that are referred to by a hoisted load on demand and then materialize the remaining parameters. 
Second, there are new kinds of dependences between hoisted loads caused by the constraints on their execution. If a hoisted load is conditionally executed it might depend on the value of another hoisted load. To deal with such situations we sort them already in the ScopInfo such that they can be generated in the order they are listed in the Scop::InvariantAccesses list (see compareInvariantAccesses). The dependences between hoisted loads caused by indirect accesses are handled the same way as before. llvm-svn: 249607
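One way to picture "generate the hoisted loads in an order that respects their dependences" is a topological ordering. The sketch below uses Kahn's algorithm on a plain dependence list. It is a swapped-in illustration, not the comparison-based sort (compareInvariantAccesses) that the message refers to, and `generationOrder` is a hypothetical name.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// DependsOn[I] lists the hoisted loads that must be generated before load I.
// Returns one valid generation order, or an empty vector if the dependences
// contain a cycle (which the message says cannot happen for real inputs).
std::vector<int>
generationOrder(const std::vector<std::vector<int>> &DependsOn) {
  const std::size_t N = DependsOn.size();
  std::vector<std::size_t> Pending(N);   // unmet dependences per load
  std::vector<std::vector<int>> Users(N); // reverse edges
  for (std::size_t I = 0; I < N; ++I) {
    Pending[I] = DependsOn[I].size();
    for (int Dep : DependsOn[I])
      Users[static_cast<std::size_t>(Dep)].push_back(static_cast<int>(I));
  }

  // A min-heap makes the result deterministic: among ready loads, the one
  // with the smallest index is generated first.
  std::priority_queue<int, std::vector<int>, std::greater<int>> Ready;
  for (std::size_t I = 0; I < N; ++I)
    if (Pending[I] == 0)
      Ready.push(static_cast<int>(I));

  std::vector<int> Order;
  while (!Ready.empty()) {
    int Load = Ready.top();
    Ready.pop();
    Order.push_back(Load);
    for (int User : Users[static_cast<std::size_t>(Load)])
      if (--Pending[static_cast<std::size_t>(User)] == 0)
        Ready.push(User);
  }
  if (Order.size() != N)
    Order.clear(); // a cycle left some loads unscheduled
  return Order;
}
```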
* BlockGenerator: Use plain Value * instead of const Value *Tobias Grosser2015-10-041-23/+20
| | | | | | | | | The use of const qualified Value pointers prevents the use of AssertingVH. We could probably think of adding const support to AssertingVH, but as const correctness seems to currently provide limited benefit in Polly, we do not do this yet. llvm-svn: 249266
* BlockGenerators: Use auto to be less sensitive to type changesTobias Grosser2015-10-041-13/+13
| | | | llvm-svn: 249265
* Consolidate the different ValueMapTypes we are usingTobias Grosser2015-10-041-2/+2
| | | | | | | | | | There have been various places where llvm::DenseMap<const llvm::Value *, llvm::Value *> types have been defined, but all types have been expected to be identical. We make this more clear by consolidating the different types and use BlockGenerator::ValueMapT wherever there is a need for types to match BlockGenerator::ValueMapT. llvm-svn: 249264
* BlockGenerator: Use AssertingVH in mapsTobias Grosser2015-10-031-5/+4
| | | | | | | By using asserting value handles, we will get assertions when we forget to clear any of the Value maps instead of difficult to debug undefined behavior. llvm-svn: 249237
* Move remapping functionality in the ScopExpanderJohannes Doerfert2015-09-301-3/+2
| | | | | | | Because we handle more than SCEV does, it is not possible to rewrite an expression on the top-level using the SCEVParameterRewriter only. With this patch we will do the rewriting on demand only and also recursively, thus not only on the top-level. llvm-svn: 248916
* Reapply "BlockGenerator: Generate synthesisable instructions only on-demand"Tobias Grosser2015-09-301-46/+17
| | | | | | | | | | | | | | | Instructions which we can synthesize from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instructions and works more reliably than first inserting them and then deleting them later on. This commit was reverted in r248860 due to a remaining miscompile, where we forgot to synthesize the operand values that were referenced from scalar writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this now correctly. llvm-svn: 248900
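The on-demand idea can be sketched as lazy materialization with a cache: a recipe is registered up front, but code is only "generated" when some instruction asks for the value as an operand. `LazyValues` and its members are hypothetical names; integers stand in for the IR values that SCEV expansion would produce.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical store of synthesizable values that are built only when used.
class LazyValues {
  std::map<std::string, std::function<int()>> Recipes;
  std::map<std::string, int> Materialized;
  int NumGenerated = 0;

public:
  // Registers how to compute a value without computing it yet.
  void registerSynthesizable(const std::string &Name,
                             std::function<int()> Recipe) {
    Recipes[Name] = std::move(Recipe);
  }

  // Called when an instruction needs the value as an operand. The recipe runs
  // at most once; later requests reuse the cached result. Throws
  // std::out_of_range if no recipe was registered under Name.
  int getOperand(const std::string &Name) {
    auto It = Materialized.find(Name);
    if (It != Materialized.end())
      return It->second;
    int Value = Recipes.at(Name)();
    ++NumGenerated;
    Materialized.emplace(Name, Value);
    return Value;
  }

  int numGenerated() const { return NumGenerated; }
};
```

Values that are registered but never requested cost nothing, so there is nothing to delete afterwards.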
* BlockGenerator: Extract value synthesis into its own function [NFC]Tobias Grosser2015-09-301-21/+31
| | | | | | This will allow us to reuse this code in a subsequent commit. llvm-svn: 248893
* Identify and hoist definitively invariant loadsJohannes Doerfert2015-09-291-0/+11
| | | | | | | | | | | | | | | | | | | As a first step in the direction of assumed invariant loads (loads that are not written in some context) we now detect and hoist definitively invariant loads. These invariant loads will be preloaded in the code generation and used in the optimized version of the SCoP. If the load is only conditionally executed the preloaded version will also only be executed under the same condition, hence we will never access memory that wouldn't have been accessed otherwise. This is also the most distinguishing feature to licm. As hoisting can make statements empty we will simplify the SCoP and remove empty statements that would otherwise cause artifacts in the code generation. Differential Revision: http://reviews.llvm.org/D13194 llvm-svn: 248861
* Revert "BlockGenerator: Generate synthesisable instructions only on-demand"Johannes Doerfert2015-09-291-1/+43
| | | | | | | | | | | | This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207. The commit broke at least one test in lnt: MultiSource/Benchmarks/Ptrdist/bc/number.c was miscompiled and the test produced a wrong result. One Polly test case that was added later was adjusted too. llvm-svn: 248860
* Codegen: Support memory accesses with different typesTobias Grosser2015-09-291-1/+17
| | | | | | | | | | | | | | | | | | | | | | Every once in a while we see code that accesses memory with different types, e.g. to perform operations on a piece of memory using type 'float', but to copy data to this memory using type 'int'. Modeled in C, such code looks like:

    void foo(float A[], float B[]) {
      for (long i = 0; i < 100; i++)
        *(int *)(&A[i]) = *(int *)(&B[i]);
      for (long i = 0; i < 100; i++)
        A[i] += 10;
    }

We already used the correct types during normal operations, but fell back to our detected type as soon as we imported changed memory access functions. For these memory accesses we may generate invalid IR due to a mismatch between the element type of the array we detect and the actual type used in the memory access. To address this issue, we always cast the newly created address of a memory access back to the type of the memory access where the address will be used. llvm-svn: 248781
* BlockGenerator: Generate synthesisable instructions only on-demandTobias Grosser2015-09-281-43/+1
| | | | | | | | | | | | | Instructions which we can synthesize from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instructions and works more reliably than first inserting them and then deleting them later on. Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de> Differential Revision: http://reviews.llvm.org/D13208 llvm-svn: 248712
* Allow switch instructions in SCoPsJohannes Doerfert2015-09-281-3/+1
| | | | | | | | | | | | | | | This patch allows switch instructions with affine conditions in the SCoP. Switch instructions in non-affine subregions are allowed as well. Neither required many changes to the code, though some refactoring was needed to integrate them without code duplication. In the llvm-test suite the number of profitable SCoPs increased from 135 to 139, but more importantly we can handle more benchmarks and user inputs without preprocessing. Differential Revision: http://reviews.llvm.org/D13200 llvm-svn: 248701
* BlockGenerator: Be less agressive with deleting dead instructionsTobias Grosser2015-09-271-2/+36
| | | | | | | | | | | | | | We now only delete trivially dead instructions in the BB we copy (copyBB), but not in any other BB. Only for copyBB we know that there will _never_ be any future uses of instructions that have no use after copyBB has been generated. Other instructions in the AST that have been generated by IslNodeBuilder may look dead at the moment, but may possibly still be referenced by GlobalMaps. If we delete them now, later uses would break surprisingly. We do not have a test case that breaks due to us deleting too many instructions. This issue was found by inspection. llvm-svn: 248688
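Restricting cleanup to the freshly copied block can be sketched as follows. The key assumption, labeled here because it is what the message argues for, is that all users of the block's instructions are inside the block itself, so a use count computed over the block alone is reliable. `ToyInst` and `removeTriviallyDead` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical instruction: a result name, operand names, and a flag for
// effects such as stores that must be kept even without users.
struct ToyInst {
  std::string Name;
  std::vector<std::string> Operands;
  bool HasSideEffects;
};

// Removes instructions that have no side effects and no users inside Block.
// Walking backwards lets a chain of dead instructions disappear in one pass.
std::vector<ToyInst> removeTriviallyDead(const std::vector<ToyInst> &Block) {
  std::map<std::string, int> UseCount;
  for (const ToyInst &I : Block)
    for (const std::string &Op : I.Operands)
      ++UseCount[Op];

  std::vector<bool> Dead(Block.size(), false);
  for (std::size_t Idx = Block.size(); Idx-- > 0;) {
    const ToyInst &I = Block[Idx];
    if (I.HasSideEffects || UseCount[I.Name] != 0)
      continue;
    Dead[Idx] = true;
    for (const std::string &Op : I.Operands)
      --UseCount[Op]; // operands lose one user
  }

  std::vector<ToyInst> Kept;
  for (std::size_t Idx = 0; Idx < Block.size(); ++Idx)
    if (!Dead[Idx])
      Kept.push_back(Block[Idx]);
  return Kept;
}
```

Applying the same pass to blocks whose values may still be referenced from elsewhere would break the assumption behind the use counts, which is the hazard the message describes.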
* BlockGenerator: Simplify code generated for region statementsTobias Grosser2015-09-271-0/+3
| | | | | | | | | | | | After having generated a new user statement a couple of inefficient or trivially dead instructions may remain. This commit runs instruction simplification over the newly generated blocks to ensure unneeded instructions are removed right away. This commit adds simplification for non-affine subregions, which was not yet part of 248681. llvm-svn: 248683
* BlockGenerator: Simplify code generated for scop statementsTobias Grosser2015-09-271-0/+5
| | | | | | | | | | | After having generated a new user statement a couple of inefficient or trivially dead instructions may remain. This commit runs instruction simplification over the newly generated blocks to ensure unneeded instructions are removed right away. This commit does not yet add simplification for non-affine subregions. llvm-svn: 248681
* Create parallel code in a separate blockJohannes Doerfert2015-09-261-1/+1
| | | | | | | | | | | This commit basically reverts r246427 but still solves the issue tackled by that commit. Instead of emitting initialization code in the beginning of the start block we now generate parallel code in its own block and thereby guarantee separation. This is necessary as we cannot generate code for hoisted loads prior to the start block but it still needs to be placed prior to everything else. llvm-svn: 248674