path: root/polly/lib/CodeGen/BlockGenerators.cpp
Commit message | Author | Age | Files | Lines
...
* BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts | Tobias Grosser | 2017-01-19 | 1 | -8/+10
  Before this change we created an additional reload in the copy of the incoming block of a PHI node to reload the incoming value, even though the necessary value has already been made available by the normally generated scalar loads. In this change, we drop the code that generates this redundant reload and instead just reuse the scalar value already available.

  Besides making the generated code slightly cleaner, this change also makes sure that scalar loads go through the normal logic, which means they can be remapped (e.g. to array slots) and corresponding code is generated to load from the remapped location. Without this change, the original scalar load at the beginning of the non-affine region would have been remapped, but the redundant scalar load would continue to load from the old PHI slot location.

  It might be possible to further simplify the code in addOperandToPHI, but this would mean not only pulling out getNewValue, but also changing the insertion point update logic. As this did not work when trying it the first time, this change is likely not trivial. To not introduce bugs at the last minute, we postpone further simplifications to a subsequent commit.

  We also document the current behavior a little bit better.

  Reviewed By: Meinersbur
  Differential Revision: https://reviews.llvm.org/D28892
  llvm-svn: 292486
* BlockGenerator: remove obfuscating const and const casts | Tobias Grosser | 2017-01-19 | 1 | -2/+2
  Making certain values 'const' only to cast the qualifier away a little later mainly obfuscates the code. Hence, we just drop the 'const' parts.

  Suggested-by: Michael Kruse <llvm@meinersbur.de>
  llvm-svn: 292480
* Use range-based for loop [NFC] | Tobias Grosser | 2017-01-19 | 1 | -2/+2
  llvm-svn: 292471
* Adjust formatting to commit r292110 [NFC] | Tobias Grosser | 2017-01-16 | 1 | -7/+9
  llvm-svn: 292123
* Use typed enums to model MemoryKind and move MemoryKind out of ScopArrayInfo | Tobias Grosser | 2017-01-14 | 1 | -2/+2
  To benefit from the type safety guarantees of C++11 typed enums, which would have caught the type mismatch fixed in r291960, we make MemoryKind a typed enum. This change also allows us to drop the 'MK_' prefix and to instead use the more descriptive full name of the enum as prefix. To reduce the amount of typing needed, we use this opportunity to move MemoryKind from ScopArrayInfo to a global scope, which means the ScopArrayInfo:: prefix is not needed.

  This move also makes sense historically. In the beginning of Polly we had different MemoryKind enums in both MemoryAccess and ScopArrayInfo, which were later canonicalized to one. During this canonicalization we just chose the enum in ScopArrayInfo, but did not consider moving this shared enum to global scope.

  Reviewed-by: Michael Kruse <llvm@meinersbur.de>
  Differential Revision: https://reviews.llvm.org/D28090
  llvm-svn: 292030
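The type-safety argument in the entry above can be illustrated with a small standalone sketch. It is not Polly's code: the enumerator names follow the kinds mentioned in this log, and `isScalarKind` is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-in for the scoped enum described above. The 'MK_' prefix is
// gone; callers write MemoryKind::Array, MemoryKind::Value, and so on.
enum class MemoryKind { Array, Value, PHI, ExitPHI };

// A scoped enum does not convert implicitly to int, so mixing it up with
// an integer or with an unrelated enum is rejected at compile time.
static_assert(!std::is_convertible<MemoryKind, int>::value,
              "scoped enums have no implicit conversion to int");

// Hypothetical helper (an assumption of this sketch): every kind other
// than Array is treated as a scalar kind.
inline bool isScalarKind(MemoryKind Kind) {
  return Kind != MemoryKind::Array;
}
```

With an unscoped enum the `static_assert` would fail, because unscoped enumerators convert to `int` silently; that silent conversion is the class of mismatch a typed enum rules out.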
* canSynthesize: Remove unused argument LI. NFC. | Michael Kruse | 2016-11-29 | 1 | -1/+1
  The helper function polly::canSynthesize() does not directly use the LoopInfo analysis, hence remove it from its argument list.

  llvm-svn: 288144
* [Polly CodeGen] Break critical edge from RTC to original loop. | Eli Friedman | 2016-11-02 | 1 | -24/+14
  This makes polly generate a CFG which is closer to what we want in LLVM IR, with a loop preheader for the original loop. This is just a cleanup, but it exposes some fragile assumptions.

  I'm not completely happy with the changes related to expandCodeFor; RTCBB->getTerminator() is basically a random insertion point which happens to work due to the way we generate runtime checks. I'm not sure what the right answer looks like, though.

  Differential Revision: https://reviews.llvm.org/D26053
  llvm-svn: 285864
* [polly] Fix non-determinism in polly BlockGenerators | Mandeep Singh Grang | 2016-10-19 | 1 | -2/+2
  Summary: Iterating over SeenBlocks, which is a SmallPtrSet, results in non-determinism in codegen.

  Reviewers: jdoerfert, zinob, grosser
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D25778
  llvm-svn: 284622
* [ScopInfo/CodeGen] ExitPHI reads are implicit. | Michael Kruse | 2016-10-12 | 1 | -1/+1
  Under some conditions MK_Value read accesses were converted to MK_ExitPHI read accesses. This is unexpected because MK_ExitPHI read accesses are implicit after the scop execution. This behaviour was introduced in r265261, which fixed a failed assertion/crash in CodeGen.

  Instead, we fix this failure in CodeGen itself. createExitPHINodeMerges(), despite its name, also handles accesses of kind MK_Value, only to skip them because they access values that are usually not PHI nodes in the SCoP region's exit block, except in the situation observed in r265261.

  Do not convert value accesses to ExitPHI accesses and do not handle value accesses like ExitPHI accesses in CodeGen anymore.

  llvm-svn: 284023
* [CodeGen] Change 'Scalar' to 'Array' in method names. NFC. | Michael Kruse | 2016-09-30 | 1 | -9/+9
  generateScalarLoad() and generateScalarStore() are used for explicit (MK_Array) memory accesses, therefore the method names were misleading. The names also were similar to generateScalarLoads() and generateScalarStores() (plural forms), which indeed handle scalar accesses. Presumably, they were originally named to contrast with VectorBlockGenerator::generateLoad().

  Rename the two methods to generateArrayLoad() and generateArrayStore(), respectively.

  llvm-svn: 282861
* [CodeGen] Add assertion for partial scalar accesses. NFC. | Michael Kruse | 2016-09-30 | 1 | -0/+18
  The code generator always adds unconditional LoadInst and StoreInst, hence the MemoryAccess must be defined over all statement instances.

  llvm-svn: 282853
* Perform copying to created arrays according to the packing transformation | Roman Gareev | 2016-09-14 | 1 | -1/+3
  This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation.

  Reviewed-by: Tobias Grosser <tobias@grosser.es>
  Differential Revision: https://reviews.llvm.org/D23260
  llvm-svn: 281441
* Allow mapping scalar MemoryAccesses to array elements. | Michael Kruse | 2016-09-01 | 1 | -19/+54
  Change the code around setNewAccessRelation to allow using an existing array element for memory instead of an ad-hoc alloca. This facility will be used for DeLICM/DeGVN to convert scalar dependencies into regular ones.

  The changes necessary include:
  - Make the code generator use the implicit locations instead of the alloca ones.
  - A test case.
  - Make the JScop importer accept changes of scalar accesses for that test case.
  - Adapt the MemoryAccess interface to the fact that the MemoryKind can change. The accessors are named (get|is)OriginalXXX() to get the status of the memory access before any change by setNewAccessRelation() (some properties such as getIncoming() do not change even if the kind is changed and are still required). To get the modified properties, there is (get|is)LatestXXX(). The old accessors without Original|Latest become synonyms of the (get|is)OriginalXXX() accessors so as not to make functional changes in unrelated code.

  Differential Revision: https://reviews.llvm.org/D23962
  llvm-svn: 280408
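The Original/Latest accessor split described in the entry above can be pictured with a toy class. This is a sketch under assumptions, not Polly's MemoryAccess: `AccessSketch`, `Kind`, and `remapToArrayElement` are hypothetical names standing in for the real interface.

```cpp
#include <cassert>

enum class Kind { Array, Value, PHI, ExitPHI };

// Toy model: it remembers the kind seen at analysis time and the kind
// after an optional remapping to an array element.
class AccessSketch {
  Kind Original;
  Kind Latest;

public:
  explicit AccessSketch(Kind K) : Original(K), Latest(K) {}

  // Hypothetical stand-in for installing a new access relation that
  // points a scalar access at an existing array element.
  void remapToArrayElement() { Latest = Kind::Array; }

  Kind getOriginalKind() const { return Original; }
  Kind getLatestKind() const { return Latest; }

  // The unsuffixed accessor stays a synonym of the Original one, so
  // callers that were not updated keep their old behaviour.
  Kind getKind() const { return getOriginalKind(); }
};
```

Keeping the unsuffixed accessor tied to the original state is what lets such a change land without touching unrelated callers; only code that cares about the remapped location needs to ask for the latest state.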
* [BlockGenerator] Invalidate SCEV values for instructions in scop | Tobias Grosser | 2016-08-18 | 1 | -0/+14
  We already invalidated a couple of critical values earlier on, but we now invalidate all instructions contained in a scop after the scop has been code generated. This is necessary as later scops may otherwise obtain SCEV expressions that reference values in the earlier scop that previously dominated the later scop, but which had been moved into the conditional branch and consequently do not dominate the later scop any more. If these very values are then used during code generation of the later scop, we generate uses that are not dominated by the values they use.

  This fixes: http://llvm.org/PR28984

  llvm-svn: 279047
* [BlockGenerator] Insert initializations at beginning of start block | Tobias Grosser | 2016-08-09 | 1 | -1/+1
  In case some code -- not guarded by control flow -- would be emitted directly in the start block, it may happen that this code would use uninitialized scalar values if the scalar initialization is only emitted at the end of the start block. This is not a problem today in normal Polly, as all statements are emitted in their own basic blocks, but Polly-ACC emits host-to-device copy statements into the start block.

  Additional Polly-ACC test coverage will be added in subsequent changes that improve the handling of PHI nodes in Polly-ACC.

  llvm-svn: 278124
* [BlockGenerator] Also eliminate dead code not originating from BB | Tobias Grosser | 2016-08-09 | 1 | -12/+9
  After having generated the code for a ScopStmt, we run a simple dead-code elimination that drops all instructions that are known to be and remain unused. Until this change, we only considered instructions for dead-code elimination if they had a corresponding instruction in the original BB that belongs to ScopStmt. However, when generating code we do not only copy code from the BB belonging to a ScopStmt, but also generate code for operands referenced from BB. After this change, we also consider code for dead-code elimination that does not have a corresponding instruction in BB.

  This fixes a bug in Polly-ACC where such dead code referenced CPU code from within a GPU kernel, which is possible as we do not guarantee that all variables that are used in known-dead code are moved to the GPU.

  llvm-svn: 278103
* [CodeGen] Use MapVector instead of DenseMap. | Michael Kruse | 2016-08-05 | 1 | -2/+2
  The map is iterated over when generating the values escaping the SCoP. The nondeterministic iteration order of DenseMap causes the output IR to change at every compilation, adding noise to comparisons. Replace DenseMap by a MapVector to ensure the same iteration order at every compilation.

  llvm-svn: 277832
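The idea behind trading a hash map for an insertion-ordered container can be sketched in standard C++. The class below is a toy written for this illustration, not llvm::MapVector itself; `OrderedMapSketch` and `joinKeys` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy insertion-ordered map: a hash index answers lookups, while a vector
// fixes the iteration order independently of hashing or pointer values.
class OrderedMapSketch {
  std::unordered_map<std::string, std::size_t> Index;
  std::vector<std::pair<std::string, int>> Entries;

public:
  // Inserts Key if absent; returns false if the key was already present.
  bool insert(const std::string &Key, int Value) {
    if (Index.count(Key))
      return false;
    Index[Key] = Entries.size();
    Entries.push_back(std::make_pair(Key, Value));
    return true;
  }

  // Iteration walks the vector, so the order depends only on the order
  // of the insert calls.
  const std::vector<std::pair<std::string, int>> &entries() const {
    return Entries;
  }
};

// Returns the keys in iteration order, joined by commas.
inline std::string joinKeys(const OrderedMapSketch &M) {
  std::string Out;
  for (const auto &E : M.entries()) {
    if (!Out.empty())
      Out += ",";
    Out += E.first;
  }
  return Out;
}
```

The cost of this design is extra memory for the second container; in exchange, anything emitted while iterating comes out in the same order on every run.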
* BlockGenerator: Assert that we do not get alloca of array access | Tobias Grosser | 2016-08-04 | 1 | -0/+4
  llvm-svn: 277698
* Fix a couple of spelling mistakes | Tobias Grosser | 2016-08-03 | 1 | -1/+1
  llvm-svn: 277569
* Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions | Roman Gareev | 2016-07-30 | 1 | -6/+3
  Extend the jscop interface to allow the user to export arrays. It is required that already existing arrays of the list of arrays correspond to arrays of the SCoP. Each array that is appended to the list will be newly created. Furthermore, we allow the user to modify access expressions to reference any array in case it has the same element type.

  Reviewed-by: Tobias Grosser <tobias@grosser.es>
  Differential Revision: https://reviews.llvm.org/D22828
  llvm-svn: 277263
* BlockGenerator: remove dead instructions in normal statements | Tobias Grosser | 2016-07-21 | 1 | -0/+22
  This ensures that no trivially dead code is generated. This is not only cleaner, but also avoids trouble in case code is generated in a separate function and some of this dead code contains references to values that are not available. This issue may happen in case the memory access functions have been updated and old getelementptr instructions remain in the code. With normal Polly, a test case is difficult to draft, but the upcoming GPU code generation can possibly trigger such problems. We will later extend this dead-code elimination to region and vector statements.

  llvm-svn: 276263
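A trivial dead-code sweep of the kind described in the entry above can be sketched on a toy instruction list. This is an illustration under assumptions, not the code in BlockGenerators.cpp: `InstSketch` and `removeDeadInstructions` are hypothetical, and operands are modelled as indices of earlier instructions in the same block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction: operands are indices of earlier instructions.
struct InstSketch {
  std::vector<std::size_t> Operands;
  bool HasSideEffects = false; // e.g. a store; never removed
  bool Erased = false;
};

// Walks the block backwards and erases instructions that have no remaining
// users and no side effects. Because operands only refer to earlier
// instructions, one backward pass also removes chains of dead code, such
// as an unused load together with the address computation feeding it.
inline std::size_t removeDeadInstructions(std::vector<InstSketch> &Block) {
  std::vector<unsigned> UseCount(Block.size(), 0);
  for (std::size_t Idx = 0; Idx < Block.size(); ++Idx)
    for (std::size_t Op : Block[Idx].Operands) {
      assert(Op < Idx && "operands must refer to earlier instructions");
      ++UseCount[Op];
    }

  std::size_t Removed = 0;
  for (std::size_t Idx = Block.size(); Idx-- > 0;) {
    InstSketch &I = Block[Idx];
    if (I.Erased || I.HasSideEffects || UseCount[Idx] != 0)
      continue;
    I.Erased = true;
    ++Removed;
    for (std::size_t Op : I.Operands)
      --UseCount[Op]; // the operand may become dead in turn
  }
  return Removed;
}
```

In the stale-getelementptr scenario from the entry above, the address computation has no users once the access function is rewritten, so a sweep like this drops it before it can reference a value that is unavailable in an outlined function.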
* [Polly] Remove usage of the `apply` function | Sanjoy Das | 2016-05-29 | 1 | -1/+1
  Summary: API-wise `apply` is a somewhat unidiomatic one-off function, and removing the only(?) use in polly will let me remove it from SCEV's exposed interface.

  Reviewers: jdoerfert, Meinersbur, grosser
  Subscribers: grosser, mcrosier, pollydev
  Differential Revision: http://reviews.llvm.org/D20779
  llvm-svn: 271177
* Remove some unused local variables. NFC. | Michael Kruse | 2016-05-23 | 1 | -13/+0
  Found by clang static analyzer (http://llvm.org/reports/scan-build/) and Visual Studio.

  llvm-svn: 270432
* Use the SCoP directly for canSynthesize [NFC] | Johannes Doerfert | 2016-05-23 | 1 | -1/+1
  llvm-svn: 270429
* Duplicate part of the Region interface in the Scop class [NFC] | Johannes Doerfert | 2016-05-23 | 1 | -13/+10
  This allows using the SCoP directly for various queries and thus hiding the underlying region more often.

  llvm-svn: 270426
* Add and use Scop::contains(Loop/BasicBlock/Instruction) [NFC] | Johannes Doerfert | 2016-05-23 | 1 | -7/+5
  llvm-svn: 270424
* Directly access information through the Scop class [NFC] | Johannes Doerfert | 2016-05-23 | 1 | -2/+1
  llvm-svn: 270421
* Simplify BlockGenerator::handleOutsideUsers interface [NFC] | Johannes Doerfert | 2016-05-23 | 1 | -4/+4
  llvm-svn: 270411
* BlockGenerator: Drop leftover debug statement | Tobias Grosser | 2016-04-28 | 1 | -1/+0
  llvm-svn: 267874
* Check only loop control of loops that are part of the region | Johannes Doerfert | 2016-04-25 | 1 | -3/+0
  This also removes a duplicated line of code in the region generator that caused a SPEC benchmark to fail with the new SCoPs.

  llvm-svn: 267404
* [FIX] Adjust the insert point for non-affine region PHIs | Johannes Doerfert | 2016-04-01 | 1 | -4/+7
  If a non-affine region PHI is generated, we should not move the insert point prior to the synthesized value in the same block, as we might split that block at the insert point later on. Only if the incoming value should be placed in a different block should we change the insertion point.

  llvm-svn: 265132
* [BlockGenerator] Fix PHI merges for MK_Arrays. | Michael Kruse | 2016-03-03 | 1 | -0/+10
  Value merging is only necessary for scalars when they are used outside of the scop. While an array's base pointer can be used after the scop, it gets an extra ScopArrayInfo of type MK_Value. We used to generate phis for both of them, where one was assuming the result of the other phi would be the original value, because it has already been replaced by the previous phi. This resulted in IR that the current IR verifier allows, but is probably illegal.

  This reduces the number of LNT test-suite fails with -polly-position=before-vectorizer -polly-process-unprofitable from 16 to 10.

  Also see llvm.org/PR26718.

  llvm-svn: 262629
* Fix non-synthesizable loop exit values. | Michael Kruse | 2016-03-01 | 1 | -1/+1
  Polly recognizes affine loops that ScalarEvolution does not, in particular those with loop conditions that depend on hoisted invariant loads. Check for SCEVAddRec dependencies on such loops and do not consider their exit values as synthesizable because SCEVExpander would generate them as expressions that depend on the original induction variables. These are not available in generated code.

  llvm-svn: 262404
* Use inline local variable declaration. NFC. | Michael Kruse | 2016-02-25 | 1 | -3/+2
  llvm-svn: 261876
* Fix DomTree preservation for generated subregions. | Michael Kruse | 2016-02-25 | 1 | -5/+49
  The generated dedicated subregion exit block was assumed to have the same dominance relation as the original exit block. This is incorrect if the exit block receives edges from outside the subregion as well, in which case, for example, the subregion's entry block does not dominate the exit block.

  llvm-svn: 261865
* Introduce ScopStmt::getEntryBlock(). NFC. | Michael Kruse | 2016-02-24 | 1 | -2/+1
  This replaces an ugly inline ternary operator pattern.

  llvm-svn: 261792
* Add assertions checking def dominates use. NFC. | Michael Kruse | 2016-02-24 | 1 | -0/+20
  This is also caught by the function verifier, but disconnected from the place that produced it. Catch it at creation time to be able to reason more directly about the cause.

  llvm-svn: 261790
* BlockGenerator: Drop unnecessary return value | Tobias Grosser | 2016-02-21 | 1 | -4/+3
  llvm-svn: 261473
* Replace getLoopForInst by getLoopForStmt | Johannes Doerfert | 2016-02-16 | 1 | -10/+12
  This patch was extracted from http://reviews.llvm.org/D13611.

  llvm-svn: 260958
* Support accesses with differently sized types to the same array | Tobias Grosser | 2016-02-04 | 1 | -5/+1
  This allows code such as:

      void multiple_types(char *Short, char *Float, char *Double) {
        for (long i = 0; i < 100; i++) {
          Short[i] = *(short *)&Short[2 * i];
          Float[i] = *(float *)&Float[4 * i];
          Double[i] = *(double *)&Double[8 * i];
        }
      }

  To model such code we use as canonical element type of the modeled array the smallest element type of all original array accesses, if type allocation sizes are multiples of each other. Otherwise, we use a newly created iN type, where N is the gcd of the allocation size of the types used in the accesses to this array. Accesses with types larger than the canonical element type are modeled as multiple accesses with the smaller type.

  For example the second load access is modeled as:

      { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

  To support code-generating these memory accesses, we introduce a new method getAccessAddressFunction that assigns each statement instance a single memory location, the address we load from/store to. Currently we obtain this address by taking the lexmin of the access function. We may consider keeping track of the memory location more explicitly in the future.

  We currently do _not_ handle multi-dimensional arrays and also keep the restriction of not supporting accesses where the offset expression is not a multiple of the access element type size. This patch adds tests that ensure we correctly invalidate a scop in case these accesses are found. Both types of accesses can be handled using the very same model, but are left to be added in the future.

  We also move the initialization of the scop-context into the constructor to ensure it is already available when invalidating the scop.

  Finally, we add this as a new item to the 2.9 release notes.

  Reviewers: jdoerfert, Meinersbur
  Differential Revision: http://reviews.llvm.org/D16878
  llvm-svn: 259784
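The size arithmetic in the entry above can be worked through in a short standalone sketch. It makes no claim about how Polly implements this: `gcdSketch`, `canonicalElementBits`, and `accessedElements` are hypothetical helpers that only reproduce the numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Greatest common divisor via Euclid's algorithm.
inline std::uint64_t gcdSketch(std::uint64_t A, std::uint64_t B) {
  while (B != 0) {
    std::uint64_t T = A % B;
    A = B;
    B = T;
  }
  return A;
}

// Canonical element size in bits for a set of access sizes. When every
// size is a multiple of the smallest one, the gcd is that smallest size;
// otherwise the gcd is the N of the iN type mentioned above.
inline std::uint64_t
canonicalElementBits(const std::vector<std::uint64_t> &AccessBits) {
  assert(!AccessBits.empty());
  std::uint64_t Result = 0;
  for (std::uint64_t Bits : AccessBits)
    Result = gcdSketch(Result, Bits);
  return Result;
}

// Inclusive range [first, last] of canonical elements touched by the
// access at index I when one access spans Ratio canonical elements.
// `first` is the lexicographic minimum of that range.
inline std::pair<std::int64_t, std::int64_t>
accessedElements(std::int64_t I, std::int64_t Ratio) {
  assert(Ratio >= 1);
  return std::make_pair(Ratio * I, Ratio * I + Ratio - 1);
}
```

For the `Float` array in the example, the 8-bit store and the 32-bit load give a canonical size of 8 bits, so each load spans four elements: `accessedElements(i, 4)` yields the range `4i .. 4i + 3`, matching the constraint `4i0 <= o0 <= 3 + 4i0`, and its lower bound is the single address used for code generation.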
* Revert "Support loads with differently sized types from a single array" | Tobias Grosser | 2016-02-03 | 1 | -1/+5
  This reverts commit (@259587). It needs some further discussions.

  llvm-svn: 259629
* Support loads with differently sized types from a single array | Tobias Grosser | 2016-02-02 | 1 | -5/+1
  We support now code such as:

      void multiple_types(char *Short, char *Float, char *Double) {
        for (long i = 0; i < 100; i++) {
          Short[i] = *(short *)&Short[2 * i];
          Float[i] = *(float *)&Float[4 * i];
          Double[i] = *(double *)&Double[8 * i];
        }
      }

  To support such code we use as element type of the modeled array the smallest element type of all original array accesses. Accesses with larger types are modeled as multiple accesses with the smaller type.

  For example the second load access is modeled as:

      { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

  To support jscop-rewritable memory accesses we need each statement instance to only be assigned a single memory location, which will be the address at which we load the value. Currently we obtain this address by taking the lexmin of the access function. We may consider keeping track of the memory location more explicitly in the future.

  llvm-svn: 259587
* Add const keyword to MemoryAccess argument [NFC] | Johannes Doerfert | 2016-02-02 | 1 | -1/+1
  llvm-svn: 259504
* Introduce MemAccInst helper class; NFC | Michael Kruse | 2016-01-27 | 1 | -17/+17
  MemAccInst wraps the common members of LoadInst and StoreInst. Also use this class in:
  - ScopInfo::buildMemoryAccess
  - BlockGenerator::generateLocationAccessed
  - ScopInfo::addArrayAccess
  - Scop::buildAliasGroups
  - Replace every use of polly::getPointerOperand

  Reviewers: jdoerfert, grosser
  Differential Revision: http://reviews.llvm.org/D16530
  llvm-svn: 258947
* Unique phi write accesses | Michael Kruse | 2016-01-26 | 1 | -31/+80
  Ensure that there is at most one phi write access per PHINode and ScopStmt. In particular, this would be possible for non-affine subregions with multiple exiting blocks. We replace multiple MAY_WRITE accesses by one MUST_WRITE access. The written value is constructed using a PHINode of all exiting blocks. The interpretation of the PHI WRITE's "accessed value" changed from the incoming value to the PHI, as for PHI READs, since there is no unique incoming value.

  Because region simplification shuffles around PHI nodes -- particularly with exit node PHIs -- the PHINodes at analysis time do not always exist anymore in the code generation pass. We instead remember the incoming block/value pair in the MemoryAccess.

  Differential Revision: http://reviews.llvm.org/D15681
  llvm-svn: 258809
* BlockGenerators: Replace getNewScalarValue with getNewValue | Tobias Grosser | 2016-01-26 | 1 | -52/+6
  Both functions implement the same functionality, with the difference that getNewScalarValue assumes that globals and out-of-scop scalars can be directly reused without loading them from their corresponding stack slot. This is correct for sequential code generation, but causes issues with outlining code e.g. for OpenMP code generation. getNewValue handles such cases correctly.

  Hence, we can replace getNewScalarValue with getNewValue. This is not only more future proof, but also eliminates a bunch of code.

  The only functionality that was available in getNewScalarValue that is lost is the on-demand creation of scalar values. However, this is not necessary any more as scalars are always loaded at the beginning of each basic block and will consequently always be available when scalar stores are generated. As this was not the case in older versions of Polly, it seems the on-demand loading is just some older code that has not yet been removed.

  Finally, generateScalarLoads also generated loads for values that are loop invariant, available in GlobalMap and which are preferred over the ones loaded in generateScalarLoads. Hence, we can just skip the code generation of such scalar values, avoiding the generation of dead code.

  Differential Revision: http://reviews.llvm.org/D16522
  llvm-svn: 258799
* BlockGenerators: Avoid redundant map lookup [NFC] | Tobias Grosser | 2016-01-24 | 1 | -2/+2
  llvm-svn: 258660
* Refactor canSynthesize in the BlockGenerators [NFC] | Johannes Doerfert | 2015-12-22 | 1 | -6/+9
  llvm-svn: 256269
* Treat inline assembly as a constant in the code generation. | Johannes Doerfert | 2015-12-22 | 1 | -0/+4
  llvm-svn: 256267
* Reduce indentation in BlockGenerator::trySynthesizeNewValue [NFC] | Johannes Doerfert | 2015-12-22 | 1 | -23/+26
  llvm-svn: 256266