path: root/polly/test
Commit message | Author | Age | Files | Lines
...
* Add rewrite by-reference parameter pass | Tobias Grosser | 2017-08-17 | 1 | -0/+40
  Summary: This pass detangles induction variables from functions that take variables by reference. Most Fortran functions compiled with gfortran pass variables by reference. Unfortunately, a common pattern, printf calls of induction variables, prevents the promotion of the induction variable to a register in this situation, which in turn inhibits any kind of loop analysis. To work around this issue we developed a specialized pass that introduces separate alloca slots for known-read-only references. These slots indicate to the mem2reg pass that the induction variables can be promoted to registers and consequently enable SCEV to work.
  We currently hardcode the information that a function _gfortran_transfer_integer_write does not read its second parameter, as dragonegg does not add the right annotations and we cannot change old dragonegg releases. Hopefully flang will produce the right annotations.
  Reviewers: Meinersbur, bollu, singam-sanjay
  Reviewed By: bollu
  Subscribers: mgorny, pollydev, llvm-commits
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D36800
  llvm-svn: 311066
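A source-level sketch of the idea behind the by-reference rewrite, under stated assumptions: `record` is a hypothetical stand-in for a read-only by-reference callee (it is not the gfortran runtime function), and the actual pass works on LLVM IR alloca slots rather than on C++ source. In the first loop the address of the induction variable escapes into the callee; in the second, only the address of a separate copy escapes, so the induction variable itself can stay in a register.

```cpp
#include <cassert>
#include <vector>

// Hypothetical read-only by-reference callee (stand-in for an I/O routine).
static void record(const int &value, std::vector<int> &log) {
  log.push_back(value);
}

// Before: the induction variable itself is passed by reference.
std::vector<int> loopPassingInductionVariable(int n) {
  assert(n >= 0);
  std::vector<int> log;
  for (int i = 0; i < n; ++i)
    record(i, log);
  return log;
}

// After: a separate slot receives a copy; only the copy's address escapes.
std::vector<int> loopPassingCopy(int n) {
  assert(n >= 0);
  std::vector<int> log;
  for (int i = 0; i < n; ++i) {
    int slot = i; // separate slot for the known-read-only reference
    record(slot, log);
  }
  return log;
}
```

Both functions record the same values; the rewrite only changes which memory location the callee sees.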
* Add missing 'REQUIRES' line | Tobias Grosser | 2017-08-16 | 1 | -0/+2
  llvm-svn: 311046
* [GPGPU] Also record invariant loads as kernel subtree values | Tobias Grosser | 2017-08-16 | 1 | -0/+35
  Before this change kernels that used invariant loads would have resulted in invalid PTX code.
  llvm-svn: 311042
* [Polly] XFAIL ReportLoopHasNoExit tests after r310940 | Jakub Kuderski | 2017-08-16 | 1 | -0/+6
  ReportLoopHasNoExit started failing after r310940, which added infinite loops to postdominators. That change made regions no longer contain infinite loops. This patch unbreaks the polly tree by XFAILing the ReportLoopHasNoExit test. The full fix is under review in D36776.
  llvm-svn: 310980
* [JSON] Make the failure to parse a jscop file a hard error | Philip Pfaffe | 2017-08-10 | 24 | -27/+23
  Summary: Before, if we failed to parse a jscop file, this was reported as an error and importing was aborted. However, this isn't actually strong enough: although the import is aborted, the scop has already been modified and is very likely broken. Instead, make this a hard failure and throw an LLVM error. This new behaviour requires small changes to the tests for the legacy pass, namely using `not` to verify the error.
  Further, fixed the jscop file for the base_pointer_load_is_inst_inside_invariant_1 testcase.
  Reviewed By: Meinersbur
  Split out of D36578.
  llvm-svn: 310599
* Fix 310555: Require pollyacc instead of asserts | Philip Pfaffe | 2017-08-10 | 1 | -1/+1
  llvm-svn: 310595
* Fix r310304: Fix the lit testcases. | Philip Pfaffe | 2017-08-10 | 3 | -5/+5
  In opt, Polly passes are only available after -load.
  llvm-svn: 310581
* Add missing 'REQUIRES' line | Tobias Grosser | 2017-08-10 | 1 | -0/+2
  llvm-svn: 310555
* [GPGPU] Make the ast_build available to block generator | Tobias Grosser | 2017-08-10 | 2 | -0/+95
  This is necessary for partial writes (as used by delicm) to work.
  llvm-svn: 310553
* [ManagedMemoryRewrite] [Polly] Erase original malloc and free. [NFC] | Siddharth Bhat | 2017-08-09 | 1 | -0/+4
  We do not need to keep `malloc` and `free` around since they are replaced by `polly_{malloc,free}Managed`.
  llvm-svn: 310504
* [ManagedMemoryRewrite] Remove test case that was submitted by mistake. [NFC] | Siddharth Bhat | 2017-08-09 | 1 | -77/+0
  llvm-svn: 310473
* [ManagedMemoryRewrite] Introduce a new pass to rewrite modules to use managed memory. | Siddharth Bhat | 2017-08-09 | 2 | -0/+167
  This pass is useful to automatically convert a codebase that uses malloc/free to use their managed memory counterparts.
  Currently, it rewrites malloc and free to the `polly_{malloc,free}Managed` variants.
  A future patch will teach ManagedMemoryRewrite to rewrite global arrays as pointers to globally allocated managed memory.
  Differential Revision: https://reviews.llvm.org/D36513
  llvm-svn: 310471
* [CodeGen] Use isLatestArrayKind(). | Michael Kruse | 2017-08-09 | 1 | -0/+58
  Codegen with -polly-parallel queried the unmapped MemoryAccess, but only the MemoryKind after mapping is relevant for codegen.
  This should fix various failures of the perf-x86_64-penryn-O3-polly-parallel-fast buildbot.
  llvm-svn: 310466
* [PPCGCodeGeneration] Compute element size in bytes for arrays correctly. | Siddharth Bhat | 2017-08-09 | 1 | -0/+50
  Previously, we computed this with `elementSizeInBits / 8`. This yielded an element size of 0 when the array's element size was less than 8 bits.
  To fix this, ask the data layout what the size in bytes should be.
  Differential Revision: https://reviews.llvm.org/D36459
  llvm-svn: 310448
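The arithmetic behind this fix can be sketched as follows. This is only an illustration of the truncation problem: `truncatedBytes` and `roundedUpBytes` are hypothetical helpers, not LLVM API, and the size the data layout actually reports may differ from a plain round-up (for example because of padding).

```cpp
#include <cassert>
#include <cstdint>

// Truncating division, as in the old computation: sub-byte types give 0.
std::uint64_t truncatedBytes(std::uint64_t bits) { return bits / 8; }

// Hypothetical helper: round the bit width up to whole bytes.
std::uint64_t roundedUpBytes(std::uint64_t bits) {
  assert(bits > 0);
  return (bits + 7) / 8;
}
```

For a 1-bit element the truncating form returns 0 bytes while the rounded-up form returns 1; for widths that are multiples of 8 the two agree.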
* [test] Add descriptions and pseudocode to tests. NFC. | Michael Kruse | 2017-08-08 | 8 | -1/+94
  llvm-svn: 310385
* Use SCEV information for the second level aliasing | Roman Gareev | 2017-08-08 | 3 | -0/+174
  We introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter-iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. To distinguish two accesses, the comparison of raw pointers representing base pointers is used. In the case of, for example, ublas's prod function that implements GEMM, and DeLICM, we can get accesses to the same location represented by different raw pointers. Consequently, we create different alias sets that can prevent accesses from, for example, being sunk or hoisted. To avoid the issue, we compare the corresponding SCEV information instead of the corresponding raw pointers.
  Reviewed-by: Tobias Grosser <tobias@grosser.es>
  Differential Revision: https://reviews.llvm.org/D35761
  llvm-svn: 310380
* Do not use isl_set_project_out to get all loop prefixes | Roman Gareev | 2017-08-08 | 2 | -0/+475
  Currently, only convex isolation sets can be efficiently processed by isl. Consequently, as a temporary solution, we use a different algorithm for partial tile isolation that helps to build convex isolation sets in some cases.
  Reviewed-by: Tobias Grosser <tobias@grosser.es>
  Differential Revision: https://reviews.llvm.org/D36278
  llvm-svn: 310374
* [NFC] [PPCGCodeGen] Add missing REQUIRES: pollyacc line. | Siddharth Bhat | 2017-08-08 | 1 | -0/+2
  llvm-svn: 310354
* [Polly] [PPCGCodeGeneration] Handle failing of invariant load hoisting gracefully. | Siddharth Bhat | 2017-08-08 | 1 | -0/+56
  To do this, we replicate what `CodeGeneration` does. We expose `markNodeUnreachable` from `CodeGeneration` to `PPCGCodeGeneration`.
  Differential Revision: https://reviews.llvm.org/D36457
  llvm-svn: 310350
* [DeLICM] Properly handle PHI writes becoming empty partial writes. | Michael Kruse | 2017-08-08 | 1 | -0/+80
  It is possible that partial writes are empty (the write is never executed). This happens when a PHINode's incoming edge is never taken, so that the incoming write becomes an empty partial write, if partial writes are enabled.
  The issue is that when converting the union_map to a map, its space cannot be derived from the union_map itself. Rather, we need to determine its space independently.
  This fixes test-suite's MultiSource/Benchmarks/ASC_Sequoia/CrystalMk.
  llvm-svn: 310348
* [ScheduleOptimizer] Make matmul pattern detection work with delicm output | Tobias Grosser | 2017-08-08 | 1 | -0/+76
  In certain cases delicm might decide not to leave the original array write in the loop body, but to remove it and instead leave a transformed phi node as write access. This commit teaches the matmul pattern detection to order the memory accesses according to when the access actually happens and to use this information to detect the new pattern.
  This makes pattern-based matmul optimization work for 2mm and 3mm in polybench 4 after polly-position=before-vectorizer has been enabled.
  llvm-svn: 310338
* [test] Add some missing options that become necessary after the recent default changes | Tobias Grosser | 2017-08-07 | 1 | -2/+2
  llvm-svn: 310315
* [test] Add one more test case for the previous commit | Tobias Grosser | 2017-08-07 | 1 | -0/+108
  llvm-svn: 310312
* [ZoneAlgo] Allow two writes that write identical values into same array slot | Tobias Grosser | 2017-08-07 | 1 | -0/+56
  Two write statements that write into the very same array slot generally conflict. However, if the value that is written is identical, this does not cause any problem. Hence, allow such write pairs in this specific situation.
  llvm-svn: 310311
* [Polly] Fully-Indexed static expansion | Andreas Simbuerger | 2017-08-07 | 3 | -0/+317
  This commit implements the initial version of fully-indexed static expansion.

  ```
  for(int i = 0; i<Ni; i++)
    for(int j = 0; j<Ni; j++)
  S:  B[j] = j;
  T: A[i] = B[i]
  ```

  After the pass, we want this:

  ```
  for(int i = 0; i<Ni; i++)
    for(int j = 0; j<Ni; j++)
  S:  B[i][j] = j;
  T: A[i] = B[i][i]
  ```

  For now we bail (fail) in the following cases:
  - Scalar access
  - Multiple writes per SAI
  - MayWrite Access
  - Expansion that leads to an access to the original array

  Furthermore: We still miss checks for escaping references to the array base pointers. A future commit will add the missing escape-checks to stay correct in those cases. The expansion is still locked behind a CLI-Option and should not yet be used.
  Patch contributed by: Nicholas Bonfante <bonfante.nicolas@gmail.com>
  Reviewers: simbuerg, Meinersbur, bollu
  Reviewed By: Meinersbur
  Subscribers: mgorny, llvm-commits, pollydev
  Differential Revision: https://reviews.llvm.org/D34982
  llvm-svn: 310304
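A runnable sketch of the expansion described in this commit, written as plain C++ rather than as a Polly transformation. It assumes that statement T sits inside the `i` loop after the `j` loop (the flattened pseudocode leaves the nesting ambiguous); the function names `runOriginal` and `runExpanded` are illustrative only.

```cpp
#include <cassert>
#include <vector>

// Original form: B is one-dimensional and rewritten in every i iteration.
std::vector<int> runOriginal(int n) {
  assert(n > 0);
  std::vector<int> A(n), B(n);
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n; ++j)
      B[j] = j;      // S
    A[i] = B[i];     // T
  }
  return A;
}

// Expanded form: B gains the i dimension, so each iteration of the outer
// loop writes its own row and no element is overwritten.
std::vector<int> runExpanded(int n) {
  assert(n > 0);
  std::vector<int> A(n);
  std::vector<std::vector<int>> B(n, std::vector<int>(n));
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n; ++j)
      B[i][j] = j;   // S
    A[i] = B[i][i];  // T
  }
  return A;
}
```

Both versions compute the same `A`; the expanded one trades memory for the removal of the write-after-write dependence on `B` across outer iterations.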
* [ForwardOpTree] Use known array content analysis to forward load instructions. | Michael Kruse | 2017-08-07 | 7 | -0/+480
  This is an addition to the -polly-optree pass that reuses the array content analysis from DeLICM to find array elements that contain the same value as the value loaded when the target statement instance is executed.
  The analysis is now enabled by default.
  The known content analysis could also be used to rematerialize any llvm::Value that was written to some array element, but currently only loads are forwarded.
  Differential Revision: https://reviews.llvm.org/D36380
  llvm-svn: 310279
* Add missing 'REQUIRES: pollyacc' line | Tobias Grosser | 2017-08-06 | 1 | -0/+2
  llvm-svn: 310197
* [GPGPU] Make sure managed arrays are prepared at the beginning of the scop | Tobias Grosser | 2017-08-06 | 2 | -5/+110
  Summary: This resolves some "instruction does not dominate use" errors, as we used to prepare the arrays at the location of the first kernel, which did not necessarily dominate all other kernel calls.
  Reviewers: Meinersbur, bollu, singam-sanjay
  Subscribers: nemanjai, pollydev, llvm-commits, kbarton
  Differential Revision: https://reviews.llvm.org/D36372
  llvm-svn: 310196
* [GPGPU] Rename all, not only the first libdevice function | Tobias Grosser | 2017-08-06 | 2 | -0/+7
  llvm-svn: 310194
* [Polly] [PPCGCodeGeneration] Deal with loops outside the Scop correctly in PPCGCodeGeneration. | Siddharth Bhat | 2017-08-06 | 1 | -0/+67
  A Scop with a loop outside it is not handled currently by PPCGCodeGeneration. The test case is such that the Scop has only one inner loop that is detected. This currently breaks codegen.
  The fix is to reuse the existing mechanism in `IslNodeBuilder` within `GPUNodeBuilder`.
  Differential Revision: https://reviews.llvm.org/D36290
  llvm-svn: 310193
* Add missing REQUIRES line | Tobias Grosser | 2017-08-03 | 1 | -0/+2
  llvm-svn: 309943
* Make sure that all parameter dimensions are set in schedule | Tobias Grosser | 2017-08-03 | 1 | -0/+51
  Summary: In case the option -polly-ignore-parameter-bounds is set, not all parameters will be added to context and domains. This is useful to keep the size of the sets and maps we work with small. Unfortunately, for AST generation it is necessary to ensure all parameters are part of the schedule tree. Hence, we modify the GPGPU code generation to make sure this is the case.
  To obtain the necessary information we expose a new function Scop::getFullParamSpace(). We also make a couple of functions const to be able to make Scop::getFullParamSpace() const.
  Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg
  Subscribers: nemanjai, kbarton, pollydev, llvm-commits
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D36243
  llvm-svn: 309939
* [test] Fix test case without Polly-ACC. | Michael Kruse | 2017-08-03 | 1 | -0/+2
  llvm-svn: 309938
* [PPCGCodeGeneration] Construct `isl_multi_pw_aff` of PPCGArray.bounds even when polly-ignore-parameter-bounds is turned on. | Siddharth Bhat | 2017-08-03 | 1 | -0/+53
  When we have `-polly-ignore-parameter-bounds`, `Scop::Context` does not contain all the parameters present in the program.
  The construction of the `isl_multi_pw_aff` requires all the individual `pw_aff` to have the same parameter dimensions. To achieve this, we used to realign every `pw_aff` with `Scop::Context`. However, in conjunction with `-polly-ignore-parameter-bounds`, this is now incorrect, since `Scop::Context` does not contain all parameters.
  We set this up correctly by creating a space that has all the parameters used by all the `isl_pw_aff`. Then, we realign all `isl_pw_aff` to this space.
  llvm-svn: 309934
* Remove debug metadata from copied instruction to prevent GPUModule verification failure | Singapuram Sanjay Srivallabh | 2017-08-02 | 1 | -0/+104
  Summary: **Remove debug metadata from instruction to be copied to prevent the source file's debug metadata being copied into GPUModule and eventually failing Module verification and ASM string code generation.**
  When copying the instruction onto the Module meant for the GPU, debug metadata attached to an instruction causes all related metadata to be pulled into the Module, including the DICompileUnit, which is not listed in llvm.dbg.cu of the Module. This fails the verification of the Module and generation of the ASM string.
  The only debug metadata of the instruction, the DebugLoc, is unset by this patch.
  This patch reattempts https://reviews.llvm.org/D35630 by targeting only those instructions that are to end up in a Module meant for the GPU.
  Reviewers: grosser, bollu
  Reviewed By: grosser
  Subscribers: pollydev
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D36161
  llvm-svn: 309822
* [Simplify] Rewrite redundant write detection algorithm. | Michael Kruse | 2017-08-01 | 15 | -7/+594
  The previous algorithm was to search for a write and the source of its value operand, and see whether the write just stores the same read value back, which includes a search for whether there is another write access between them. This is O(n^2) in the max number of accesses in a statement (+ the complexity of isl comparing the access functions).
  The new algorithm is more similar to the one used for searching for overwrites and coalescable writes. It scans over all accesses in order of execution while tracking which array elements still have the same value since they were read. This is O(n), not counting the complexity within isl. It should be more reliable than trying to catch all non-conforming cases in the previous approach. It is also less code.
  We now also support the case where the write is a partial write of the read's domain, and to some extent non-affine subregions.
  Differential Revision: https://reviews.llvm.org/D36137
  llvm-svn: 309734
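The single-pass idea can be illustrated with a toy model. This is a sketch under strong simplifying assumptions, not the Simplify pass itself: accesses are concrete element indices and value ids instead of isl maps, and `Access`, `Kind`, and `findRedundantStores` are hypothetical names. The scan keeps, per element, the value id it is known to still hold since it was loaded; a store of that same id to that element is reported as redundant, and any other store invalidates the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

enum class Kind { Load, Store };

// Toy access: which element is touched and which value id is moved.
struct Access {
  Kind kind;
  int element;
  int value;
};

// Returns the positions of stores that write back the value last loaded
// from the same element, with no other store to that element in between.
std::vector<std::size_t>
findRedundantStores(const std::vector<Access> &accesses) {
  std::map<int, int> unchangedSinceLoad; // element -> value id it still holds
  std::vector<std::size_t> redundant;
  for (std::size_t i = 0; i < accesses.size(); ++i) {
    const Access &a = accesses[i];
    assert(a.element >= 0); // element indices are non-negative in this model
    if (a.kind == Kind::Load) {
      unchangedSinceLoad[a.element] = a.value;
      continue;
    }
    auto it = unchangedSinceLoad.find(a.element);
    if (it != unchangedSinceLoad.end() && it->second == a.value)
      redundant.push_back(i); // stores back what was just read
    else
      unchangedSinceLoad.erase(a.element); // element content changed
  }
  return redundant;
}
```

The accesses are visited once in execution order; the `std::map` lookup adds a logarithmic factor in this toy version, which plays the role of the isl cost that the commit message excludes from its O(n) count.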
* [Simplify] Improve scalability. | Michael Kruse | 2017-08-01 | 2 | -0/+292
  With a lot of reads and writes to the same array in a statement, some isl sets that capture the state between accesses can become so complex that isl takes considerable time and memory for operations on them.
  The problems identified were:
  - is_subset() takes considerable time with many disjoints in the arguments. We limit the number of disjoints to 4; any additional information is thrown away.
  - subtract() can lead to many disjoints. We instead assume that any array element is possibly accessed, which removes all disjoints.
  - subtract_domain() may lead to considerable processing, even if all elements are to be removed. Instead, we determine and remove the affected spaces manually. No behaviour is changed.
  llvm-svn: 309728
* [NFC] Add 'REQUIRES: pollyacc' on 'test/GPGPU/invariant-load-hoisting-of-array.ll' | Siddharth Bhat | 2017-08-01 | 1 | -0/+2
  - Should fix broken build due to `r309681`.
  llvm-svn: 309686
* [PPCGCodeGeneration] Correct usage of llvm::Value with getLatestValue. | Siddharth Bhat | 2017-08-01 | 1 | -0/+100
  It is possible that the `HostPtr` that corresponds to an array could be invariant load hoisted. Make sure we use the invariant load hoisted value by using `IslNodeBuilder::getLatestValue`.
  Differential Revision: https://reviews.llvm.org/D36001
  llvm-svn: 309681
* [ForwardOpTree] Support synthesizable values. | Michael Kruse | 2017-07-31 | 5 | -43/+272
  This allows -polly-optree to move instructions that depend on synthesizable values.
  The difficulty for synthesizable values is that their value depends on the location. When it is moved over a loop header, and the SCEV expression depends on the loop induction variable (SCEVAddRecExpr), it would use the current induction variable instead of the last one.
  At the moment we cannot forward PHI nodes, so crossing the header of loops referenced by SCEVAddRecExpr is not possible (assuming the loop header has at least two incoming blocks, one for entering the loop and one for the backedge, any instruction to be forwarded must have a phi between use and definition).
  A remaining issue is when the forwarded value is used after the loop, but is only synthesizable inside the loop. This happens e.g. if ScalarEvolution is unable to determine the number of loop iterations or the initial loop value. We do not forward in this situation.
  Differential Revision: https://reviews.llvm.org/D36102
  llvm-svn: 309609
* [Simplify] Remove all kinds of redundant scalar writes. | Michael Kruse | 2017-07-31 | 4 | -7/+136
  In addition to array and PHI writes, also allow scalar value writes. The only kind of write not allowed is a write by a function (including memcpy/memmove/memset).
  llvm-svn: 309582
* [GPGPU] Add support for NVIDIA libdevice | Tobias Grosser | 2017-07-31 | 3 | -0/+82
  Summary: This allows us to map functions such as exp, expf, expl, for which no LLVM intrinsics exist. Instead, we link to NVIDIA's libdevice which provides high-performance implementations of a wide range of (math) functions. We currently link only a small subset, the exp, cos and copysign functions. Other functions will be enabled as needed.
  Reviewers: bollu, singam-sanjay
  Reviewed By: bollu
  Subscribers: tstellar, tra, nemanjai, pollydev, mgorny, llvm-commits, kbarton
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D35703
  llvm-svn: 309560
* Revert "Remove Debug metadata from copied instruction to prevent Module verification failure" | Tobias Grosser | 2017-07-31 | 1 | -104/+0
  This reverts commit r309490 as it triggers on our AOSP buildbot error messages of the form:
  inlinable function call in a function with debug info must have a !dbg location
  llvm-svn: 309556
* Remove Debug metadata from copied instruction to prevent Module verification failure | Singapuram Sanjay Srivallabh | 2017-07-29 | 1 | -0/+104
  Summary: **Remove debug metadata from instruction to be copied to prevent the source file's debug metadata being copied into GPUModule and eventually failing Module verification and ASM string code generation.**
  When copying the instruction onto the Module meant for the GPU, debug metadata attached to an instruction causes all related metadata to be pulled into the Module, including the DICompileUnit, which is not listed in llvm.dbg.cu of the Module. This fails the verification of the Module and generation of the ASM string.
  The only debug metadata of the instruction, the DebugLoc, is unset by this patch.
  Reviewers: grosser, bollu, Meinersbur
  Reviewed By: grosser, bollu
  Subscribers: pollydev
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D35630
  llvm-svn: 309490
* [Simplify] Implement write accesses coalescing. | Michael Kruse | 2017-07-29 | 25 | -10/+855
  Write coalescing combines write accesses that
  - Write the same llvm::Value.
  - Write to the same array.
  - Unless they do not write anything in a statement instance (partial writes), write to the same element.
  - Have no other access between them that accesses the same element.
  This is particularly useful after DeLICM, which leaves partial writes to disjoint domains.
  Differential Revision: https://reviews.llvm.org/D36010
  llvm-svn: 309489
* [test] Add test case for -polly-simplify. NFC. | Michael Kruse | 2017-07-29 | 1 | -0/+44
  llvm-svn: 309458
* [Simplify] Do not remove dependencies of phis within region stmts. | Michael Kruse | 2017-07-28 | 1 | -0/+50
  These were wrongly assumed to be phi nodes that require MemoryKind::PHI accesses.
  llvm-svn: 309454
* Remove offset parameter from llvm.dbg.value intrinsics in testcase | Adrian Prantl | 2017-07-28 | 1 | -5/+5
  llvm-svn: 309433
* [test] Fix typo in filename. NFC. | Michael Kruse | 2017-07-28 | 1 | -0/+0
  llvm-svn: 309403
* [Simplify] Fix typo in statistics output. NFC. | Michael Kruse | 2017-07-28 | 1 | -1/+1
  llvm-svn: 309402