path: root/polly/test
Commit message | Author | Age | Files | Lines
...
* Store the size of the outermost dimension in case of newly created arrays that require memory allocation. | Roman Gareev | 2016-09-12 | 6 | -17/+18
    We do not need the size of the outermost dimension in most cases, but if we allocate memory for newly created arrays, that size is needed.
    Reviewed-by: Michael Kruse <llvm@meinersbur.de>
    Differential Revision: https://reviews.llvm.org/D23991
    llvm-svn: 281234
* GPGPU: Bail out gracefully in case of invalid IR | Tobias Grosser | 2016-09-12 | 1 | -0/+78
    Instead of aborting, we now bail out gracefully in case the kernel IR we generate is invalid. This can currently happen in case the SCoP stores pointer values, which we model as arrays, as data values into other arrays. In this case, the original pointer value is not available on the device and can consequently not be stored. As detecting this ahead of time is not so easy, we detect these situations after the invalid IR has been generated and bail out.
    llvm-svn: 281193
* Add missing 'REQUIRES' line | Tobias Grosser | 2016-09-11 | 1 | -0/+2
    llvm-svn: 281166
* GPGPU: Do not fail in case of arrays never accessed | Tobias Grosser | 2016-09-11 | 1 | -0/+272
    If these arrays have never been accessed we failed to derive an upper bound of the accesses and consequently a size for the outermost dimension. We now explicitly check for empty access sets and then just use zero as size for the outermost dimension.
    llvm-svn: 281165
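The empty-access check described above can be illustrated with a minimal sketch. This is not Polly's isl-based code; the function name `outermost_size` is hypothetical, and the sketch assumes zero-based, non-negative indices so that the size is the largest index plus one.

```python
def outermost_size(first_dim_indices):
    """Size of the outermost dimension implied by the accessed indices.

    Hypothetical helper: an empty access set has no upper bound to
    derive, so zero is returned explicitly instead of failing.
    """
    indices = list(first_dim_indices)
    if not indices:
        # Array is never accessed: use zero as the outermost size.
        return 0
    # Assumes zero-based, non-negative indices.
    return max(indices) + 1


never_accessed = outermost_size([])
accessed = outermost_size([0, 3, 1])
```

Without the explicit check, `max` on an empty list would raise an error, which mirrors the failure mode the commit describes.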
* Add -polly-flatten-schedule pass. | Michael Kruse | 2016-09-08 | 2 | -0/+110
    The -polly-flatten-schedule pass reduces the number of scattering dimensions in its isl_union_map form to make them easier to understand. It is not meant to be used in production, only for debugging and regression tests.
    To illustrate how it can make sets simpler, here is a lifetime set computed by the proposed DeLICM pass without flattening:
      { Stmt_reduction_for[0, 4] -> [0, 2, o2, o3] : o2 < 0;
        Stmt_reduction_for[0, 4] -> [0, 1, o2, o3] : o2 >= 5;
        Stmt_reduction_for[0, 4] -> [0, 1, 4, o3] : o3 > 0;
        Stmt_reduction_for[0, i1] -> [0, 1, i1, 1] : 0 <= i1 <= 3;
        Stmt_reduction_for[0, 4] -> [0, 2, 0, o3] : o3 <= 0 }
    And here is the same lifetime for a semantically identical one-dimensional schedule:
      { Stmt_reduction_for[0, i1] -> [2 + 3i1] : 0 <= i1 <= 4 }
    Differential Revision: https://reviews.llvm.org/D24310
    llvm-svn: 280948
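The idea of collapsing several schedule dimensions into one while keeping the execution order can be sketched with a mixed-radix encoding. This is only an illustration under the assumption that every dimension has a known constant bound; the pass itself works on isl_union_map objects, and the names `flatten` and `extents` are hypothetical.

```python
from itertools import product


def flatten(point, extents):
    """Map a multi-dimensional time point to one integer (mixed radix).

    Assumes 0 <= point[k] < extents[k] for every dimension k, so that
    lexicographic order of points equals numeric order of the results.
    """
    flat = 0
    for coord, extent in zip(point, extents):
        if not 0 <= coord < extent:
            raise ValueError("coordinate outside its assumed bound")
        flat = flat * extent + coord
    return flat


# Hypothetical schedule space: 5 loop iterations, 3 time slots each.
extents = (5, 3)
points = sorted(product(range(5), range(3)))  # lexicographic order
flat_times = [flatten(p, extents) for p in points]
```

With these assumed extents, slot 2 of iteration `i1` lands at `2 + 3*i1`, the same shape as the one-dimensional schedule quoted in the commit message.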
* Add check-polly-tests build target. | Michael Kruse | 2016-09-05 | 1 | -5/+9
    The check-polly-tests target runs regression/unit tests but without checking formatting. This is useful to avoid having to reload a file in an open editor (which e.g. clears the undo buffer, moves cursor/window position) when running polly-update-format.
    After this change, the following test targets exist:
    - check-polly-unittests to run unittests only
    - check-polly-tests to run unit and regression tests
    - polly-check-format to check formatting using clang-format
    - check-polly to run them all
    As a side-effect, when running check-polly, polly-check-format and the tests run in parallel (instead of polly-check-format running first).
    Differential Revision: https://reviews.llvm.org/D24191
    llvm-svn: 280654
* Allow mapping scalar MemoryAccesses to array elements. | Michael Kruse | 2016-09-01 | 3 | -0/+315
    Change the code around setNewAccessRelation to allow using an existing array element for memory instead of an ad-hoc alloca. This facility will be used for DeLICM/DeGVN to convert scalar dependencies into regular ones.
    The changes necessary include:
    - Make the code generator use the implicit locations instead of the alloca ones.
    - A test case
    - Make the JScop importer accept changes of scalar accesses for that test case.
    - Adapt the MemoryAccess interface to the fact that the MemoryKind can change. They are named (get|is)OriginalXXX() to get the status of the memory access before any change by setNewAccessRelation() (some properties such as getIncoming() do not change even if the kind is changed and are still required). To get the modified properties, there is (get|is)LatestXXX(). The old accessors without Original|Latest become synonyms of the (get|is)OriginalXXX() to not make functional changes in unrelated code.
    Differential Revision: https://reviews.llvm.org/D23962
    llvm-svn: 280408
* Add space between access string and follow-up. | Michael Kruse | 2016-08-26 | 3 | -5/+5
    llvm-svn: 279826
* Add "New access function" to update_check.py classifier. | Michael Kruse | 2016-08-26 | 1 | -0/+2
    Lines with this prefix are printed by JSONImporter.
    llvm-svn: 279825
* [FIX] Access dimensions should correspond to number of dimensions of the accesses array. | Roman Gareev | 2016-08-26 | 2 | -4/+6
    llvm-svn: 279821
* Introduce unittests. | Michael Kruse | 2016-08-25 | 3 | -1/+180
    Add the infrastructure for unittests to Polly and two simple tests for conversion between isl_val and APInt. In addition, a build target check-polly-unittests is added to run only the unittests but not the regression tests.
    Clang's unittest mechanism served as a blueprint which then was adapted to Polly.
    Differential Revision: https://reviews.llvm.org/D23833
    llvm-svn: 279734
* Use configure_lit_site_cfg instead of configure_file. | Michael Kruse | 2016-08-25 | 1 | -4/+19
    configure_lit_site_cfg defines some more parameters that are used in lit.site.cfg.in. configure_file would leave those empty. These additional definitions seem to be unimportant for regression tests, but unittests do not work without them.
    In case of out-of-tree builds, define the additional parameters with default values. These may not take all configuration parameters into account, as configure_lit_site_cfg would.
    llvm-svn: 279733
* Add %loadPolly to test command line. | Michael Kruse | 2016-08-24 | 1 | -1/+1
    Required for out-of-tree builds of Polly.
    llvm-svn: 279657
* Add a flag to dump SCoP optimized with the IslScheduleOptimizer pass | Roman Gareev | 2016-08-21 | 1 | -25/+8
    Dump polyhedral descriptions of Scops optimized with the isl scheduling optimizer and the set of post-scheduling transformations applied on the schedule tree to be able to check the work of the IslScheduleOptimizer pass at the polyhedral level.
    Reviewed-by: Tobias Grosser <tobias@grosser.es>
    Differential Revision: https://reviews.llvm.org/D23740
    llvm-svn: 279395
* [SCEVValidator] Don't reorder multiplies in extractConstantFactor. | Eli Friedman | 2016-08-18 | 1 | -0/+39
    The existing code would add the operands in the wrong order, and eventually crash because the SCEV expression doesn't exactly match the parameter SCEV expression in SCEVAffinator::visit. (SCEV doesn't sort the operands to getMulExpr in general.)
    Differential Revision: https://reviews.llvm.org/D23592
    llvm-svn: 279087
* [BlockGenerator] Invalidate SCEV values for instructions in scop | Tobias Grosser | 2016-08-18 | 1 | -1/+3
    We already invalidated a couple of critical values earlier on, but we now invalidate all instructions contained in a scop after the scop has been code generated. This is necessary as later scops may otherwise obtain SCEV expressions that reference values in the earlier scop that previously dominated the later scop, but which have been moved into the conditional branch and consequently do not dominate the later scop any more. If these very values are then used during code generation of the later scop, we generate uses that are not dominated by the values they use.
    This fixes: http://llvm.org/PR28984
    llvm-svn: 279047
* [ScopInfo] Make scalars used by PHIs in non-affine regions available | Tobias Grosser | 2016-08-16 | 1 | -0/+61
    Normally this is ensured when adding PHI nodes, but as PHI node dependences do not need to be added in case all incoming blocks are within the same non-affine region, this was missed.
    This corrects an issue visible in LNT's sqlite3, in case invariant load hoisting was disabled.
    llvm-svn: 278792
* [ScopDetect] Do not assert in case of AddRecs with non-constant start expression | Tobias Grosser | 2016-08-15 | 1 | -0/+50
    llvm-svn: 278738
* [test] Force invariant load hoisting one last time | Tobias Grosser | 2016-08-15 | 1 | -1/+2
    Without invariant load hoisting an (unrelated) bug is exposed in this test case: http://llvm.org/PR28984
    llvm-svn: 278680
* [tests] Force invariant load hoisting for test cases that need it -- III | Tobias Grosser | 2016-08-15 | 29 | -22/+58
    llvm-svn: 278673
* [tests] Force invariant load hoisting for test cases that need it II | Tobias Grosser | 2016-08-15 | 19 | -22/+45
    llvm-svn: 278669
* [test] Correct spelling in test case | Tobias Grosser | 2016-08-15 | 1 | -1/+3
    and explicitly enable invariant load hoisting for this test case.
    llvm-svn: 278668
* [tests] Force invariant load hoisting for test cases that need it | Tobias Grosser | 2016-08-15 | 58 | -72/+72
    This will make it easier to switch the default of Polly's invariant load hoisting strategy and also makes it very clear that these test cases indeed require invariant load hoisting to work.
    llvm-svn: 278667
* Perform replacement of access relations and creation of new arrays according to the packing transformation | Roman Gareev | 2016-08-15 | 1 | -0/+86
    This is the third patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update.
    In this change we perform replacement of the access relations and create empty arrays, which are steps to implement the packing transformation. In subsequent changes we will implement copying to created arrays.
    Reviewed-by: Tobias Grosser <tobias@grosser.es>
    Differential Revision: http://reviews.llvm.org/D22187
    llvm-svn: 278666
* [GPGPU] Ensure arrays where only parts are modified are copied to GPU | Tobias Grosser | 2016-08-10 | 2 | -0/+43
    To do so we change the way array extents are computed. Instead of the precise set of memory locations accessed, we now compute the extent as the range between minimal and maximal address in the first dimension and the full extent defined by the sizes of the inner array dimensions.
    We also move the computation of the may_persist region after the construction of the arrays, as it relies on array information. Without arrays being constructed no useful information is computed at all.
    llvm-svn: 278212
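The extent rule described above (min/max range in the first dimension, full size in every inner dimension) can be sketched as follows. This is a simplified illustration over explicit index tuples rather than isl sets; the function name `array_extent` and the `(lo, hi)` inclusive-range representation are assumptions made for this example.

```python
def array_extent(accesses, inner_sizes):
    """Over-approximate the region of an array touched by `accesses`.

    accesses:    list of index tuples, one entry per array dimension.
    inner_sizes: declared sizes of all dimensions except the first.
    Returns one inclusive (lo, hi) range per dimension, or None when
    nothing is accessed (that case is outside this sketch).
    """
    firsts = [index[0] for index in accesses]
    if not firsts:
        return None
    # First dimension: range between minimal and maximal accessed index.
    extent = [(min(firsts), max(firsts))]
    # Inner dimensions: always the full declared size.
    extent += [(0, size - 1) for size in inner_sizes]
    return extent


# Two accesses into a hypothetical array A[][10].
extent = array_extent([(2, 5), (7, 1)], inner_sizes=[10])
```

The result covers rows 2 through 7 in full, including elements that were never accessed, which is what allows a partially modified array to be copied as one contiguous block.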
* [GPGPU] Support PHI nodes used in GPU kernel | Tobias Grosser | 2016-08-09 | 1 | -0/+87
    Ensure the right scalar allocations are used as the host location of data transfers. For the device code, we clear the allocation cache before device code generation to be able to generate new device-specific allocation and we need to make sure to add back the old host allocations as soon as the device code generation is finished.
    llvm-svn: 278126
* [GPGPU] Use separate basic block for GPU initialization code | Tobias Grosser | 2016-08-09 | 1 | -0/+3
    This increases the readability of the IR and also clarifies that the GPU initialization is executed _after_ the scalar initialization, which needs to happen before the code of the transformed scop is executed.
    Besides increased readability, the IR should not change. Specifically, I do not expect any changes in program semantics due to this patch.
    llvm-svn: 278125
* [BlockGenerator] Insert initializations at beginning of start block | Tobias Grosser | 2016-08-09 | 3 | -3/+3
    In case some code -- not guarded by control flow -- would be emitted directly in the start block, it may happen that this code would use uninitialized scalar values if the scalar initialization is only emitted at the end of the start block. This is not a problem today in normal Polly, as all statements are emitted in their own basic blocks, but Polly-ACC emits host-to-device copy statements into the start block.
    Additional Polly-ACC test coverage will be added in subsequent changes that improve the handling of PHI nodes in Polly-ACC.
    llvm-svn: 278124
* [tests] Add two missing 'REQUIRES' lines | Tobias Grosser | 2016-08-09 | 2 | -0/+4
    llvm-svn: 278104
* [BlockGenerator] Also eliminate dead code not originating from BB | Tobias Grosser | 2016-08-09 | 4 | -21/+118
    After having generated the code for a ScopStmt, we run a simple dead-code elimination that drops all instructions that are known to be and remain unused. Until this change, we only considered instructions for dead-code elimination if they had a corresponding instruction in the original BB that belongs to ScopStmt. However, when generating code we do not only copy code from the BB belonging to a ScopStmt, but also generate code for operands referenced from BB. After this change, we also consider code for dead-code elimination which does not have a corresponding instruction in BB.
    This fixes a bug in Polly-ACC where such dead-code referenced CPU code from within a GPU kernel, which is possible as we do not guarantee that all variables that are used in known-dead-code are moved to the GPU.
    llvm-svn: 278103
* [GPGPU] Pass parameters always by using their own type | Tobias Grosser | 2016-08-09 | 1 | -0/+41
    llvm-svn: 278100
* [GPGPU] Support Values referenced from both isl expr and llvm instructions | Tobias Grosser | 2016-08-08 | 1 | -0/+67
    When adding code that avoids passing values used in isl expressions and LLVM instructions twice, we forgot to make the single variable passed to the kernel available in the ValueMap that makes it usable for instructions that are not replaced with isl ast expressions. This change adds the variable that is passed to the kernel to the ValueMap to ensure it is available for such use cases as well.
    llvm-svn: 278039
* [GPGPU] Create code to verify run-time conditions | Tobias Grosser | 2016-08-08 | 1 | -0/+58
    llvm-svn: 278026
* GPGPU: Sort dimension sizes of multi-dimensional shared memory arrays correctly | Tobias Grosser | 2016-08-05 | 1 | -0/+103
    Before this commit we generated the array type in reverse order and we also added the outermost dimension size to the new array declaration, which is incorrect as Polly assumes an additional unsized outermost dimension, such that we had an off-by-one error in the linearization of access expressions.
    llvm-svn: 277802
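Why the order of the dimension sizes matters for linearization can be seen in a small sketch. It assumes row-major layout with an unsized outermost dimension, so only the inner sizes enter the computation; the function name `linearize` is hypothetical and this is not the code Polly generates.

```python
def linearize(indices, inner_sizes):
    """Row-major offset of `indices` in an array whose outermost
    dimension is unsized; `inner_sizes` lists the remaining sizes,
    outermost first."""
    if len(indices) != len(inner_sizes) + 1:
        raise ValueError("expected one more index than inner sizes")
    offset = indices[0]
    for index, size in zip(indices[1:], inner_sizes):
        offset = offset * size + index
    return offset


# Hypothetical array A[][4][8], element A[1][2][3].
correct = linearize((1, 2, 3), inner_sizes=(4, 8))   # (1*4 + 2)*8 + 3
reversed_sizes = linearize((1, 2, 3), inner_sizes=(8, 4))
```

Passing the sizes in reverse order yields a different offset for the same element, and passing the outermost size as well would shift every size by one position, which is the kind of mismatch the commit message describes.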
* Add missing 'REQUIRES' line | Tobias Grosser | 2016-08-05 | 1 | -0/+2
    llvm-svn: 277800
* GPGPU: Add cuda annotations to specify maximal number of threads per block | Tobias Grosser | 2016-08-05 | 1 | -0/+35
    These annotations ensure that the NVIDIA PTX assembler limits the number of registers used such that we can be certain the resulting kernel can be executed for the number of threads in a thread block that we are planning to use.
    llvm-svn: 277799
* GPGPU: Support scalars that are mapped to shared memory | Tobias Grosser | 2016-08-04 | 1 | -0/+75
    llvm-svn: 277726
* GPGPU: Add private memory support | Tobias Grosser | 2016-08-04 | 1 | -0/+82
    llvm-svn: 277722
* GPGPU: Add support for shared memory | Tobias Grosser | 2016-08-04 | 1 | -0/+83
    llvm-svn: 277721
* GPGPU: Handle scalar array references | Tobias Grosser | 2016-08-04 | 1 | -0/+13
    Pass the content of scalar array references to the alloca on the kernel side and do not pass them additionally as normal LLVM scalar values.
    llvm-svn: 277699
* GPGPU: Pass subtree values correctly to the kernel | Tobias Grosser | 2016-08-04 | 1 | -0/+15
    llvm-svn: 277697
* GPGPU: Mark kernel functions as polly.skip | Tobias Grosser | 2016-08-03 | 4 | -5/+6
    Otherwise, we would try to re-optimize them with Polly-ACC and possibly even generate kernels that try to offload themselves, which does not work as the GPURuntime is not available on the accelerator and also does not make any sense.
    llvm-svn: 277589
* Add missing prefixes. | Roman Gareev | 2016-07-30 | 1 | -7/+7
    llvm-svn: 277264
* Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions | Roman Gareev | 2016-07-30 | 3 | -0/+165
    Extend the jscop interface to allow the user to export arrays. It is required that already existing arrays of the list of arrays correspond to arrays of the SCoP. Each array that is appended to the list will be newly created. Furthermore, we allow the user to modify access expressions to reference any array in case it has the same element type.
    Reviewed-by: Tobias Grosser <tobias@grosser.es>
    Differential Revision: https://reviews.llvm.org/D22828
    llvm-svn: 277263
* Add missing REQUIRES line | Tobias Grosser | 2016-07-28 | 1 | -1/+3
    llvm-svn: 276964
* GPGPU: Pass context parameters to GPU kernel | Tobias Grosser | 2016-07-28 | 1 | -0/+60
    llvm-svn: 276963
* GPGPU: Pass host iterators to kernel | Tobias Grosser | 2016-07-28 | 1 | -0/+4
    llvm-svn: 276962
* GPGPU: use current 'Index' to find slot in parameter array | Tobias Grosser | 2016-07-28 | 1 | -1/+14
    Before this change we used the array index, which would result in us accessing the parameter array out-of-bounds. This bug was visible for test cases where not all arrays in a scop are passed to a given kernel.
    llvm-svn: 276961
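The difference between the array's position in the scop and its slot in the kernel's parameter array can be shown with a small sketch. The names `pack_kernel_params`, `scop_arrays`, and `used_by_kernel` are hypothetical, and Python lists stand in for the parameter array.

```python
def pack_kernel_params(scop_arrays, used_by_kernel):
    """Fill a parameter array that holds only the arrays a kernel uses.

    A running `index` tracks the next free slot. Using the position of
    the array within `scop_arrays` instead would run past the end of
    `params` as soon as one array is skipped.
    """
    params = [None] * len(used_by_kernel)
    index = 0
    for array in scop_arrays:
        if array not in used_by_kernel:
            continue  # not passed to this kernel, consumes no slot
        params[index] = array
        index += 1
    return params


# "C" sits at position 2 in the scop but belongs in slot 1.
params = pack_kernel_params(["A", "B", "C"], {"A", "C"})
```

Here the parameter array has two slots, so writing "C" at its scop position 2 would be out of bounds, matching the situation where not all arrays in a scop are passed to a given kernel.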
* GPGPU: Generate kernel parameter allocation with right size | Tobias Grosser | 2016-07-28 | 1 | -1/+1
    Before this change we miscounted the number of function parameters.
    llvm-svn: 276960
* GPGPU: Add basic support for kernel launches | Tobias Grosser | 2016-07-27 | 2 | -2/+10
    llvm-svn: 276863