path: root/polly/test
Commit message (Collapse)AuthorAgeFilesLines
...
* GPGPU: Load GPU kernelsTobias Grosser2016-07-252-0/+4
| | | | | | | We embed the PTX code into the host IR as a global variable and compile it at run-time into a GPU kernel. llvm-svn: 276645
* [GSoC] Add PolyhedralInfo pass - new interface to polly analysisJohannes Doerfert2016-07-2519-16/+74
| | | | | | | | | | | | | Adding a new pass PolyhedralInfo. This pass will be the interface to Polly. Initially, we will provide the following interface: - #IsParallel(Loop *L) - return a bool depending on whether the loop is parallel or not for the given program order. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: https://reviews.llvm.org/D21486 llvm-svn: 276637
* GPGPU: Emit data-transfer codeTobias Grosser2016-07-251-0/+4
| | | | | | | Also factor out getArraySize() to avoid code duplication and reorder some function arguments to indicate the direction in which data is transferred. llvm-svn: 276636
* GPGPU: Complete code to allocate and free device arraysTobias Grosser2016-07-251-1/+2
| | | | | | | At the beginning of each SCoP, we allocate device arrays for all arrays used on the GPU and we free such arrays after the SCoP has been executed. llvm-svn: 276635
* [GSoC] Do not process SCoPs with infeasible runtime contextJohannes Doerfert2016-07-251-0/+69
| | | | | | | | | | | | Do not process SCoPs with infeasible runtime context in the new ScopInfoWrapperPass. Do not compute dependences for such SCoPs in the new DependenceInfoWrapperPass. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: https://reviews.llvm.org/D22402 llvm-svn: 276631
* Apply all necessary tilings and interchangings to get a macro-kernelRoman Gareev2016-07-251-0/+51
| | | | | | | | | | | | | | | | | | | This is the second patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we create the BLIS macro-kernel by applying a combination of tiling and interchanging. In subsequent changes we will implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D21491 llvm-svn: 276627
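The loop structure described in this commit message can be sketched as follows. This is only an illustrative model in plain Python, not Polly's transformation or BLIS's implementation: the function names and the block sizes `MC`, `NC`, `KC`, `MR`, `NR` are assumptions chosen for the example, and the two packing routines are omitted.

```python
def naive_matmul(A, B, M, N, K):
    """Reference triple loop: C[i][j] += A[i][p] * B[p][j]."""
    C = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for p in range(K):
                C[i][j] += A[i][p] * B[p][j]
    return C


def micro_kernel(A, B, C, i0, i1, j0, j1, p0, p1):
    """A loop over p around a rank-1 (outer product) update of a small C tile."""
    for p in range(p0, p1):
        for i in range(i0, i1):
            for j in range(j0, j1):
                C[i][j] += A[i][p] * B[p][j]


def macro_kernel(A, B, C, i0, i1, j0, j1, p0, p1, MR, NR):
    """Two additional loops around the micro-kernel."""
    for jr in range(j0, j1, NR):
        for ir in range(i0, i1, MR):
            micro_kernel(A, B, C,
                         ir, min(ir + MR, i1),
                         jr, min(jr + NR, j1),
                         p0, p1)


def blis_like_matmul(A, B, M, N, K, MC=4, NC=6, KC=3, MR=2, NR=3):
    """Three nested loops around the macro-kernel (block sizes are hypothetical)."""
    C = [[0.0] * N for _ in range(M)]
    for jc in range(0, N, NC):
        for pc in range(0, K, KC):
            for ic in range(0, M, MC):
                macro_kernel(A, B, C,
                             ic, min(ic + MC, M),
                             jc, min(jc + NC, N),
                             pc, min(pc + KC, K),
                             MR, NR)
    return C
```

For every element of `C`, the blocked version accumulates the products in the same order of `p` as the reference loop, so the two functions should agree exactly on the same inputs.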
* GPGPU: initialize GPU context and simplify the corresponding GPURuntime interface.Tobias Grosser2016-07-251-0/+3
| | | | | | | | | There is no need to expose the selected device at the moment. We also pass back pointers as return values, as this simplifies the interface. llvm-svn: 276623
* GPGPU: Generate PTX assembly code for the kernel modulesTobias Grosser2016-07-229-346/+287
| | | | | | | | | | | | | | | | | Run the NVPTX backend over the GPUModule IR and write the resulting assembly code in a string. To work correctly, it is important to invalidate analysis results that still reference the IR in the kernel module. Hence, this change clears all references to dominators, loop info, and scalar evolution. Finally, the NVPTX backend has trouble generating code for various special floating point types (not surprising), but also for uncommon integer types. This commit does not resolve these issues, but pulls out problematic test cases into separate files to XFAIL them individually and resolve them in future (not immediate) changes one by one. llvm-svn: 276396
* GPGPU: generate code for ScopStatementsTobias Grosser2016-07-213-13/+104
| | | | | | | | | | | | | | | This change introduces the actual compute code in the GPU kernels. To ensure all values referenced from the statements in the GPU kernel are indeed available we scan all ScopStmts in the GPU kernel for references to llvm::Values that are not yet covered by already modeled outer loop iterators, parameters, or array base pointers and also pass these additional llvm::Values to the GPU kernel. For arrays used in the GPU kernel we introduce a new ScopArrayInfo object, which is referenced by the newly generated access functions within the GPU kernel and which is used to help with code generation. llvm-svn: 276270
* BlockGenerator: remove dead instructions in normal statementsTobias Grosser2016-07-211-5/+0
| | | | | | | | | | | | | This ensures that no trivially dead code is generated. This is not only cleaner, but also avoids troubles in case code is generated in a separate function and some of this dead code contains references to values that are not available. This issue may happen, in case the memory access functions have been updated and old getelementptr instructions remain in the code. With normal Polly, a test case is difficult to draft, but the upcoming GPU code generation can possibly trigger such problems. We will later extend this dead-code elimination to region and vector statements. llvm-svn: 276263
* tests: make test cases more robust using regexpTobias Grosser2016-07-212-5/+5
| | | | llvm-svn: 276262
* tests: fix order of memory accesses to ensure import succeedsTobias Grosser2016-07-213-20/+28
| | | | | | | | | It seems the order in which we generated memory accesses changed such that the import of these updated memory accesses failed for the 'loop3' statement in this test case. Unfortunately, the existing CHECK lines were not strict enough to catch this. Hence, besides fixing the order of the memory access lines we also ensure that the memory access changes are both clearly visible and well checked. llvm-svn: 276247
* GPGPU: Bail out of scops with hoisted invariant loadsTobias Grosser2016-07-191-5/+7
| | | | | | | This is currently not supported and will only be added later. Also update the test cases to ensure no invariant code hoisting is applied. llvm-svn: 275987
* test: Add missing 'REQUIRES' lineTobias Grosser2016-07-191-0/+2
| | | | llvm-svn: 275962
* test: Add missing 'REQUIRES' lineTobias Grosser2016-07-191-0/+2
| | | | llvm-svn: 275960
* GPGPU: Emit in-kernel synchronization statementsTobias Grosser2016-07-191-0/+5
| | | | | | | We use this opportunity to further classify the different user statements that can arise and add TODOs for the ones not yet implemented. llvm-svn: 275957
* GPGPU: generate control flow within the kernelTobias Grosser2016-07-193-9/+29
| | | | llvm-svn: 275956
* GPGPU: add scop parameters to kernel argumentsTobias Grosser2016-07-191-0/+43
| | | | llvm-svn: 275955
* GPGPU: add host iterators to kernel argumentsTobias Grosser2016-07-191-0/+12
| | | | llvm-svn: 275954
* GPGPU: add intrinsic functions to obtain a kernel's thread and block idsTobias Grosser2016-07-192-0/+25
| | | | llvm-svn: 275953
* GPGPU: create kernel function skeletonTobias Grosser2016-07-191-0/+76
| | | | | | | | | Create for each kernel a separate LLVM-IR module containing a single function marked as kernel function and taking one pointer for each array referenced by this kernel. Add debugging output to verify the kernels are generated correctly. llvm-svn: 275952
* GPGPU: collect array referencesTobias Grosser2016-07-185-26/+28
| | | | | | | | | | | Initialize the list of references to a GPU array to ensure that the arrays that need to be passed to kernel calls are computed correctly. Furthermore, the very same information is also necessary to compute synchronization correctly. As the functionality to compute these references is already available, what is left for us to do is only to connect the necessary functionality to compute array reference information. llvm-svn: 275798
* test: Add missing 'REQUIRES' lineTobias Grosser2016-07-181-0/+2
| | | | llvm-svn: 275784
* GPGPU: Create host control flowTobias Grosser2016-07-182-0/+96
| | | | | | | | | | | | Create LLVM-IR for all host-side control flow of a given GPU AST. We implement this by introducing a new GPUNodeBuilder class derived from IslNodeBuilder. The IslNodeBuilder will take care of generating all general-purpose ast nodes, but we provide our own createUser implementation to handle the different GPU specific user statements. For now, we just skip any user statement and only generate a host-code skeleton, but in subsequent commits we will add handling of normal ScopStmts performing computations, kernel calls, as well as host-device data transfers. We will also introduce run-time check generation and LICM in subsequent commits. llvm-svn: 275783
* GPGPU: Format statements scheduled on the host ourselvesTobias Grosser2016-07-151-0/+198
| | | | | | Otherwise ppcg would try to call into pet functionality that is not available, which obviously will cause trouble. As we can easily print these statements ourselves, we just do so. llvm-svn: 275579
* GPGPU: Use schedule whole components for schedulerTobias Grosser2016-07-151-20/+27
| | | | | | | | | | | This option increases the scalability of the scheduler and allows us to remove the 'gisting' workaround we introduced in r275565 to handle a more complicated test case. Another benefit of using this option is also that the generated code looks a lot more streamlined. Thanks to Sven Verdoolaege for reminding me of this option. llvm-svn: 275573
* GPGPU: Drop domain constraints from flow dependencesTobias Grosser2016-07-151-0/+174
| | | | | | | | This works around a shortcoming of the isl scheduler, which even for some smaller test cases does not terminate in case domain constraints are part of the flow dependences. llvm-svn: 275565
* GPGPU: Test scalar/array types i1/i3/i8/i32/i60/i64/i80/i120/i128/i3000Tobias Grosser2016-07-151-0/+490
| | | | | | | | | | | | | | | | | | Arrays with integer base type are similar to arrays with floating point types, with the exception that LLVM's integer types can take some odd values. We add a selection of different values to make sure we correctly round these types when necessary. References to scalar integer types are special, as we currently do not model these types as array accesses as they are considered 'synthesizable' by Polly. As a result, we do not generate explicit data-transfers for them, but instead will need to keep track of all references to 'synthesizable' values separately. At the current stage, this is only visible by missing host-to-device data-transfer calls. In the future, we will also require special code generation strategies. llvm-svn: 275551
* GPGPU: Test scalar parameters of type half/float/double/fp128/x86_fp80/ppc_fp128Tobias Grosser2016-07-151-1/+251
| | | | | | | | | | | We currently only test that the code structure we generate for these scalar parameters is correct and we add these types to make sure later code generation additions have sufficient test coverage. In case some of these types cannot be mapped due to missing hardware support on the GPU some of these test cases may need to be updated later on. llvm-svn: 275548
* GPGPU: Make sure scops with more than one array workTobias Grosser2016-07-151-0/+55
| | | | | | We use this opportunity to add a test case containing a scalar parameter. llvm-svn: 275547
* GPGPU: Model array access informationTobias Grosser2016-07-151-7/+23
| | | | | | This allows us to derive host-device and device-host data-transfers. llvm-svn: 275535
* GPGPU: Use CHECK-NEXT to harden test casesTobias Grosser2016-07-151-44/+45
| | | | | | | | A sequence of CHECK lines allows additional statements to appear in the output of the tested program without any test failures appearing. As we do not want this to happen, switch this test case to use CHECK-NEXT. llvm-svn: 275534
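The difference between the two directives can be illustrated with a deliberately simplified model. The sketch below is not FileCheck's implementation: it assumes literal substring patterns and line-granular matching, and `run_checks` plus the sample output lines are hypothetical names made up for the example.

```python
def run_checks(output_lines, directives):
    """Return True if all directives match, in order.

    directives is a list of (kind, pattern) pairs, where kind is
    "CHECK" or "CHECK-NEXT". Simplified model: patterns are literal
    substrings and each directive consumes one whole line.
    """
    last = -1  # index of the line matched by the previous directive
    for kind, pattern in directives:
        if kind == "CHECK-NEXT":
            if last < 0:
                raise ValueError("CHECK-NEXT needs a preceding match")
            nxt = last + 1
            # Must match on the line directly after the previous match.
            if nxt >= len(output_lines) or pattern not in output_lines[nxt]:
                return False
            last = nxt
        elif kind == "CHECK":
            # May skip any number of intermediate lines.
            for idx in range(last + 1, len(output_lines)):
                if pattern in output_lines[idx]:
                    last = idx
                    break
            else:
                return False
        else:
            raise ValueError("unknown directive: " + kind)
    return True


expected_output = ["stmt A", "stmt B"]
output_with_extra = ["stmt A", "stray stmt", "stmt B"]

loose = [("CHECK", "stmt A"), ("CHECK", "stmt B")]
strict = [("CHECK", "stmt A"), ("CHECK-NEXT", "stmt B")]
```

Under this model, the loose directives accept both outputs, while the strict ones reject the output containing the extra line, which is the hardening the commit message describes.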
* GPGPU: Generate an AST for the GPU-mapped scheduleTobias Grosser2016-07-141-0/+18
| | | | | | | | For this we need to provide an explicit list of statements as they occur in the polly::Scop to ppcg. We also set up basic AST printing facilities to facilitate debugging. To allow code reuse some (minor) changes in ppcg have been necessary. llvm-svn: 275436
* GPGPU: Use a tile size of 32 by defaultTobias Grosser2016-07-141-6/+5
| | | | | | | The tile size was previously uninitialized. As a result, it was often zero (i.e., no tiling), which is not what we want in general. More importantly, there was the risk for arbitrary tile sizes to be chosen, which we did not observe, but which still is highly problematic. llvm-svn: 275418
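As a rough illustration of what a tile size of 32 means for a two-dimensional loop nest, the following sketch enumerates iterations in tiled order. It models loop tiling in general, not ppcg's GPU mapping; `tiled_2d` is a hypothetical helper and only its default of 32 is taken from the commit.

```python
def tiled_2d(n, m, tile=32):
    """Yield the points of an n x m iteration space in tiled order."""
    if tile <= 0:
        # This sketch rejects non-positive sizes rather than treating
        # them as "no tiling".
        raise ValueError("tile size must be positive")
    for ti in range(0, n, tile):          # loop over tiles (rows)
        for tj in range(0, m, tile):      # loop over tiles (columns)
            for i in range(ti, min(ti + tile, n)):      # points in the tile
                for j in range(tj, min(tj + tile, m)):
                    yield (i, j)


points = list(tiled_2d(70, 40))
```

Every point is still visited exactly once; only the order changes, so that the first tile covers rows 0..31 and columns 0..31 before any point with column 32 or higher is reached.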
* GPGPU: Map initial schedule to GPU scheduleTobias Grosser2016-07-141-3/+27
| | | | | | | | This change now applies ppcg's GPU mapping on our initial schedule. For this to work, we need to also initialize the set of all names (isl_ids) used in the scop as well as the program context. llvm-svn: 275396
* GPGPU: compute new schedule from polly scopTobias Grosser2016-07-141-2/+11
| | | | | | | | | | | | | | | | | | | | To do so we copy the necessary information to compute an initial schedule from polly::Scop to ppcg's scop. Most of the necessary information is directly available and only needs to be passed on to ppcg, with the exception of 'tagged' access relations, access relations that additionally carry information about which memory access an access relation originates from. We could possibly perform the construction of tagged accesses as part of ScopInfo, but as this format is currently specific to ppcg we do not do this yet, but keep this functionality local to our GPU code generation. After the scop has been initialized, we compute data dependences and ask ppcg to compute an initial schedule. Some of this functionality is already available in polly::DependenceInfo and polly::ScheduleOptimizer, but to keep differences to ppcg small we use ppcg's functionality here. We may later investigate if a closer integration of these tools makes sense. llvm-svn: 275390
* GPGPU: create default initialized PPCG scop and gpu programTobias Grosser2016-07-142-1/+66
| | | | | | | | | | | | | | At this stage, we do not yet modify the IR but just generate a default initialized ppcg_scop and gpu_prog and free both immediately. Both will later be filled with data from the polly::Scop and are needed to use PPCG for GPU schedule generation. This commit does not yet perform any GPU code generation, but ensures that the basic infrastructure has been put in place. We also add a simple test case to ensure the new code is run and use this opportunity to verify that GPU_CODEGEN tests are only run if GPU code generation has been enabled in cmake. llvm-svn: 275389
* Add CHECK line to test case. NFC.Michael Kruse2016-07-121-2/+7
| | | | | | | | Check not only that the compiler is not crashing, but also whether the problematic part (the sequence of instructions simplified to '4') is reflected in the output. Thanks to Tobias for the hint. llvm-svn: 275189
* [SCEVAffinator] Fix assertion checking for constant divisor.Michael Kruse2016-07-121-0/+55
| | | | | | | | | | | An assertion in visitSDivInstruction() checked whether the divisor is constant by checking whether the argument is a ConstantInt. However, SCEVValidator allows the divisor to be simplified to a constant by ScalarEvolution. We synchronize the implementation of SCEVValidator and SCEVAffinator to both accept simplified SCEV expressions. llvm-svn: 275174
* Add test case forgotten in r275053Tobias Grosser2016-07-111-0/+59
| | | | llvm-svn: 275055
* Fix assertion due to buildMemoryAccess.Michael Kruse2016-07-082-0/+119
| | | | | | | | | | | For llvm the memory accesses from nonaffine loops should be visible, however for polly those nonaffine loops should be invisible/boxed. This fixes llvm.org/PR28245 Contributed-by: Huihui Zhang <huihuiz@codeaurora.org> Differential Revision: http://reviews.llvm.org/D21591 llvm-svn: 274842
* test: Drop unnecessary -polly-code-generator=isl flagTobias Grosser2016-07-061-2/+2
| | | | | | | isl is already the default code generator since we switched from CLooG several years ago. llvm-svn: 274609
* Ensure parameter names are isl-compatibleTobias Grosser2016-07-012-37/+37
| | | | | | | Without this change it is not possible for isl to parse the resulting objects from their string representation. llvm-svn: 274350
* Fix assertion due to loop overlap with nonaffine region.Michael Kruse2016-06-271-0/+112
| | | | | | | | | | Reject and report regions that contain loops overlapping a nonaffine region. This situation typically happens in the presence of infinite loops. This addresses bug llvm.org/PR28071. Differential Revision: http://reviews.llvm.org/D21312 Contributed-by: Huihui Zhang <huihuiz@codeaurora.org> llvm-svn: 273905
* [GSoC 2016] New function pass DependenceInfoWrapperPassJohannes Doerfert2016-06-274-0/+37
| | | | | | | | | | | | | | This patch addresses: - A new function pass to compute polyhedral dependences. This is required to avoid the region pass manager. - Stores a map of Scop to Dependence object for all the scops present in a function. By default, access wise dependences are stored. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: http://reviews.llvm.org/D21105 llvm-svn: 273881
* [GSoC 2016] New function pass ScopInfoWrapperPassJohannes Doerfert2016-06-2714-0/+17
| | | | | | | | | | | | This patch adds a new function pass ScopInfoWrapperPass so that the polyhedral description of a region, the SCoP, can be constructed and used in a function pass. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: http://reviews.llvm.org/D20962 llvm-svn: 273856
* Apply all necessary tilings and unrollings to get a micro-kernelRoman Gareev2016-06-221-0/+128
| | | | | | | | | | | | | | | | | | | | | This is the first patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we create the BLIS micro-kernel by applying a combination of tiling and unrolling. In subsequent changes we will add the extraction of the BLIS macro-kernel and implement the packing transformation. Contributed-by: Roman Gareev <gareevroman@gmail.com> Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D21140 llvm-svn: 273397
* Update isl to isl-0.17.1-57-g1879898Tobias Grosser2016-06-125-29/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | With this update the isl AST generation extracts disjunctive constraints early on. As a result, code that previously resulted in two branches with (close-to) identical code within them:

    if (P <= -1) {
      for (int c0 = 0; c0 < N; c0 += 1)
        Stmt_store(c0);
    } else if (P >= 1)
      for (int c0 = 0; c0 < N; c0 += 1)
        Stmt_store(c0);

results now in only a single branch body:

    if (P <= -1 || P >= 1)
      for (int c0 = 0; c0 < N; c0 += 1)
        Stmt_store(c0);

This resolves http://llvm.org/PR27559

Besides the above change, this isl update brings better simplification of sets/maps containing existentially quantified dimensions and fixes a bug in isl's coalescing. llvm-svn: 272500
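That the two forms quoted in this commit message execute the same statement instances can be checked with a small sketch. The Python functions below are hand-written transcriptions of the two C snippets, with `Stmt_store(c0)` modeled as appending `c0` to a trace; the function names are made up for the example and nothing here calls isl.

```python
def two_branches(P, N):
    """Transcription of the code generated before the isl update."""
    trace = []
    if P <= -1:
        for c0 in range(N):
            trace.append(c0)  # stands in for Stmt_store(c0)
    elif P >= 1:
        for c0 in range(N):
            trace.append(c0)  # stands in for Stmt_store(c0)
    return trace


def single_branch(P, N):
    """Transcription of the code generated after the isl update."""
    trace = []
    if P <= -1 or P >= 1:
        for c0 in range(N):
            trace.append(c0)  # stands in for Stmt_store(c0)
    return trace


# Compare both forms on a small range of parameter values.
same = all(two_branches(P, N) == single_branch(P, N)
           for P in range(-3, 4) for N in range(0, 4))
```

For `P == 0` neither form executes the statement, and for every other tested value both produce the same sequence of `c0` values.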
* Expand test cases affected by next commitTobias Grosser2016-06-125-38/+54
| | | | | | | | | As these test cases will be changed in a subsequent commit, we expand and tighten them to make the subsequent changes to them more obvious. As part of this we add more context to some test cases and add CHECK-NEXT lines to ensure no intermediate lines are missed by accident. llvm-svn: 272499
* Recommit: "[FIX] Determine insertion point during SCEV expansion"Tobias Grosser2016-06-111-0/+43
| | | | | | | This patch was originally contributed by Johannes Doerfert in r271892, but was in conflict with the revert in r272483. llvm-svn: 272486