path: root/polly/test/GPGPU/kernel-params-only-some-arrays.ll
Commit message | Author | Age | Files | Lines
* Revert "[GPGPU] Simplify PPCGSCop to reduce compile time [NFC]" | Tobias Grosser | 2017-08-19 | 1 | -4/+4
  We still see some issues with parameter space mismatches. Revert this to get a clean baseline. We will recommit after these issues have been resolved.
  This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49.
  llvm-svn: 311268
* [PPCG] Only add Kernel argument sizes for OpenCL, not CUDA runtime | Philipp Schaad | 2017-08-19 | 1 | -2/+2
  Kernel argument sizes are now appended to the kernel launch parameter list only if the OpenCL runtime is selected, not if the CUDA runtime is chosen.
  Differential revision: D36925
  llvm-svn: 311248
* [GPGPU] Simplify PPCGSCop to reduce compile time [NFC] | Tobias Grosser | 2017-08-18 | 1 | -4/+4
  Summary: Drop unused parameter dimensions to reduce the size of the sets we are working with. The computed dependences in particular tend to accumulate many parameters that are present in the input memory accesses but are often not needed to express the actual dependences. As isl represents maps and sets with dense matrices, reducing the dimensionality of isl sets commonly improves code generation performance.
  This reduces compile time from 17 to 11 seconds for our test case. While this is not impressive, this patch helped me to identify the previous two performance improvements and also increases the readability of the isl data structures we use.
  Reviewers: Meinersbur, bollu, singam-sanjay
  Reviewed By: bollu
  Subscribers: nemanjai, pollydev, llvm-commits, kbarton
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D36869
  llvm-svn: 311161
* [GPGPU] Fix compilation issue with latest CUDA upgrade to i128 | Tobias Grosser | 2017-07-28 | 1 | -2/+2
  llvm-svn: 309366
* [PPCGCodeGen] [3/3] Update PPCGCodeGen + tests to latest ppcg. | Siddharth Bhat | 2017-07-20 | 1 | -4/+5
  This commit *WILL COMPILE*.
  1. `PPCG` now uses `isl_multi_pw_aff` instead of an array of `pw_aff`. This requires us to adjust how we index array bounds and how we construct array bounds.
  2. `PPCG` introduces two new kinds of nodes: `init_device` and `clear_device`. We should investigate what the correct way to handle these is.
  3. `PPCG` has gotten smarter with its use of live range reordering, so some of the tests have a qualitative improvement.
  4. `PPCG` changed its output style, so many test cases need to be updated to fit the new style for `polly-acc-dump-code` checks.
  Differential Revision: https://reviews.llvm.org/D35677
  llvm-svn: 308625
* [PPCGCodeGen] Differentiate kernels based on their parent Scop | Singapuram Sanjay Srivallabh | 2017-07-12 | 1 | -6/+6
  Summary: Add a sequence number that identifies a ptx_kernel's parent Scop within a function to its name, to differentiate it from other kernels produced from the same function but from different Scops.
  Kernels produced from different Scops can end up having the same name. Consider a function with 2 Scops, each able to produce just one kernel. Both of these kernels have the name "kernel_0". This can lead to the wrong kernel being launched when the runtime picks a kernel from its cache based on the name alone. This patch supplements D33985 by differentiating kernels across Scops as well.
  Previously (even before D33985), while profiling kernels generated through JIT, e.g. Julia, kernels associated with different functions, and even different SCoPs within a function, would be grouped together due to the common name (https://groups.google.com/d/msg/polly-dev/J1j587H3-Qw/mR-jfL16BgAJ). This patch prevents this grouping and the kernels are reported separately.
  Reviewers: grosser, bollu
  Reviewed By: grosser
  Subscribers: mehdi_amini, nemanjai, pollydev, kbarton
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D35176
  llvm-svn: 307814
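
  The naming scheme described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: `makeKernelName` is a hypothetical helper, and the `_SCOP_<n>` spelling of the sequence-number segment is an assumption (the commit message only states that a sequence number is added to the `FUNC_<function>_KERNEL_<n>` style of name introduced by D33985).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not taken from the patch): build a kernel name from
// the host function name, the sequence number of the parent SCoP within
// that function, and the kernel number within that SCoP.
// ASSUMPTION: the "_SCOP_<n>" segment spelling is chosen for illustration.
std::string makeKernelName(const std::string &hostFunction, unsigned scopId,
                           unsigned kernelId) {
  return "FUNC_" + hostFunction + "_SCOP_" + std::to_string(scopId) +
         "_KERNEL_" + std::to_string(kernelId);
}
```

  With such a scheme, the first kernel of the first SCoP in `gemm` and the first kernel of the second SCoP in `gemm` get distinct names, so a name-keyed kernel cache can no longer confuse them.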
* Prefix the name of the calling host function in the name of callee GPU kernel | Singapuram Sanjay Srivallabh | 2017-07-05 | 1 | -6/+6
  Summary: Provide more context to the name of a GPU kernel by prefixing its name with the host function that calls it. E.g. the first kernel called by `gemm` would be `FUNC_gemm_KERNEL_0`.
  Kernels currently follow the "kernel_#" (# = 0,1,2,3,...) nomenclature. This patch makes it easier to map host caller and device callee, especially when there are many kernels produced by Polly-ACC.
  Reviewers: grosser, Meinersbur, bollu, philip.pfaffe, kbarton!
  Reviewed By: grosser
  Subscribers: nemanjai, pollydev
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D33985
  llvm-svn: 307173
* [Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen | Siddharth Bhat | 2017-05-09 | 1 | -2/+2
  Summary: PPCGCodeGeneration now attaches the sizes of the kernel launch parameters at the end of the parameter list. The existing CUDA runtime ignores them, but the OpenCL runtime knows to check for kernel argument sizes at the end of the parameter list. (The resulting parameter list is twice as long. This has been accounted for in the corresponding test cases.)
  Reviewers: grosser, Meinersbur, bollu
  Reviewed By: bollu
  Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D32961
  llvm-svn: 302515
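
  A minimal sketch of the parameter-list layout this commit describes, assuming a list of untyped pointers: the first N slots point at the N kernel arguments and the next N slots point at their sizes. `buildLaunchParams` and the types used are hypothetical and only illustrate the "twice as long" layout; they are not the generated IR or the GPURuntime interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of the layout: [arg0 .. argN-1, &size0 .. &sizeN-1].
// The caller owns both the arguments and the size values; the returned slots
// only point at them, so `sizes` must outlive the returned vector.
std::vector<const void *>
buildLaunchParams(const std::vector<const void *> &args,
                  const std::vector<std::size_t> &sizes) {
  std::vector<const void *> slots(args); // first half: the arguments
  for (const std::size_t &size : sizes)  // second half: one size per argument
    slots.push_back(&size);
  return slots;
}
```

  A runtime that does not need the sizes can read only the first half of the list, while a runtime that has to pass an explicit size per argument can find it at offset N + i for argument i.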
* [PPCGCodeGeneration] Update PPCG Code Generation for OpenCL compatibility | Siddharth Bhat | 2017-04-25 | 1 | -4/+4
  Added a small change to the way pointer arguments are set in the kernel code generation. The way the pointer is retrieved now specifically requests the global address space to be annotated. This is necessary if the IR should be run through NVPTX to generate OpenCL compatible PTX.
  The changes do not affect the PTX strings generated for the CUDA target (nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl).
  Additionally, the data layout has been updated to what the NVPTX backend requests/recommends.
  Contributed-by: Philipp Schaad
  Reviewers: Meinersbur, grosser, bollu
  Reviewed By: grosser, bollu
  Subscribers: jlebar, pollydev, llvm-commits, nemanjai, yaxunl, Anastasia
  Tags: #polly
  Differential Revision: https://reviews.llvm.org/D32215
  llvm-svn: 301299
* GPGPU: Detect read-only scalar arrays ... | Tobias Grosser | 2016-09-17 | 1 | -2/+6
  and pass these by value rather than by reference.
  llvm-svn: 281837
* GPGPU: Mark kernel functions as polly.skip | Tobias Grosser | 2016-08-03 | 1 | -2/+2
  Otherwise, we would try to re-optimize them with Polly-ACC and possibly even generate kernels that try to offload themselves. This does not work, as the GPURuntime is not available on the accelerator, and it also does not make any sense.
  llvm-svn: 277589
* GPGPU: use current 'Index' to find slot in parameter array | Tobias Grosser | 2016-07-28 | 1 | -1/+14
  Before this change we used the array index, which would result in us accessing the parameter array out-of-bounds. This bug was visible for test cases where not all arrays in a scop are passed to a given kernel.
  llvm-svn: 276961
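
  The difference between the array index and the running 'Index' can be illustrated with a small model. This is a sketch under stated assumptions, not the Polly code: `assignParameterSlots` and the `usedByKernel` flags are hypothetical stand-ins for "which arrays of the scop are passed to this kernel".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: usedByKernel[i] is true if scop array i is passed to
// the kernel. Returns, per scop array, its slot in the kernel's parameter
// array, or -1 if the array is not passed at all.
std::vector<int> assignParameterSlots(const std::vector<bool> &usedByKernel) {
  std::vector<int> slot(usedByKernel.size(), -1);
  int index = 0; // number of parameters emitted so far
  for (std::size_t i = 0; i < usedByKernel.size(); ++i) {
    if (!usedByKernel[i])
      continue;        // skipped arrays must not consume a slot
    slot[i] = index++; // use the running index, not the array index i
  }
  return slot;
}
```

  For a scop with four arrays of which only arrays 1 and 3 reach the kernel, the parameter array has two entries. Indexing it with the array index 3 would read past its end, whereas the running index yields slots 0 and 1.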
* GPGPU: generate code for ScopStatements | Tobias Grosser | 2016-07-21 | 1 | -2/+4
  This change introduces the actual compute code in the GPU kernels. To ensure all values referenced from the statements in the GPU kernel are indeed available, we scan all ScopStmts in the GPU kernel for references to llvm::Values that are not yet covered by already modeled outer loop iterators, parameters, or array base pointers, and also pass these additional llvm::Values to the GPU kernel.
  For arrays used in the GPU kernel we introduce a new ScopArrayInfo object, which is referenced by the newly generated access functions within the GPU kernel and which is used to help with code generation.
  llvm-svn: 276270
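
  The scan for additional values amounts to a set difference, which the following sketch illustrates. It is not the Polly implementation: plain strings stand in for llvm::Value pointers, and `findExtraKernelValues` is a hypothetical name.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// ASSUMPTION: a string name stands in for an llvm::Value* in this sketch.
using ValueName = std::string;

// Collect every value referenced by any statement that is not already
// covered by modeled iterators, parameters, or array base pointers.
std::set<ValueName>
findExtraKernelValues(const std::vector<std::vector<ValueName>> &stmtRefs,
                      const std::set<ValueName> &covered) {
  std::set<ValueName> extra;
  for (const std::vector<ValueName> &refs : stmtRefs)
    for (const ValueName &value : refs)
      if (covered.count(value) == 0)
        extra.insert(value); // must be passed to the kernel explicitly
  return extra;
}
```

  Using a set for the result means a value referenced by several statements is passed to the kernel only once.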
* test: Add missing 'REQUIRES' line | Tobias Grosser | 2016-07-19 | 1 | -0/+2
  llvm-svn: 275960
* GPGPU: add intrinsic functions to obtain a kernels thread and block ids | Tobias Grosser | 2016-07-19 | 1 | -0/+8
  llvm-svn: 275953
* GPGPU: create kernel function skeleton | Tobias Grosser | 2016-07-19 | 1 | -0/+76
  Create for each kernel a separate LLVM-IR module containing a single function marked as kernel function and taking one pointer for each array referenced by this kernel. Add debugging output to verify the kernels are generated correctly.
  llvm-svn: 275952