summaryrefslogtreecommitdiffstats
path: root/polly/lib/CodeGen/PPCGCodeGeneration.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [IR] Include more target specific intrinsic headersHeejin Ahn2019-12-141-0/+1
| | | | After D71320, target-specific intrinsic headers should be included.
* [GPGPU] Fix depricated warning.Michael Kruse2019-11-141-1/+1
| | | | | | | setAlignment(unsigned) was deprecated in commit: 0e62011df891d0e7ad904524edf705d07d12d5d4 [Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned)
* [GPGPU] Fix #includes.Michael Kruse2019-11-141-0/+1
| | | | | | Adapt for 05da2fe52162 "Sink all InitializePasses.h includes" which forgot the GPGPU files (presumably because POLLY_ENABLE_GPGPU_CODEGEN is OFF by default).
* Move CodeGenFileType enum to Support/CodeGen.hReid Kleckner2019-11-131-2/+1
| | | | | | | Avoids the need to include TargetMachine.h from various places just for an enum. Various other enums live here, such as the optimization level, TLS model, etc. Data suggests that this change probably doesn't matter, but it seems nice to have anyway.
* Apply include-what-you-use #include removal suggestions. NFC.Michael Kruse2019-03-281-9/+1
| | | | | | | | | | | | This removes unused includes (and forward declarations) as suggested by include-what-you-use. If a transitive include of a removed include is required to compile a file, I added the required header (or forward declaration if suggested by include-what-you-use). This should reduce compilation time and reduce the number of iterative recompilations when a header was changed. llvm-svn: 357209
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* PPCG codegenTobias Grosser2018-08-011-1/+2
| | | | | | | | The latest version of the isl C++ bindings does not export the 'set' method yet. Fall back to the C interface until this method can be exported. llvm-svn: 338512
* [Polly-ACC] Fix compilation after r338450. NFC.Michael Kruse2018-08-011-1/+1
| | | | llvm-svn: 338462
* [PPCGCodeGen] Change printf to outs() to prevent garbled output. [NFC]Siddharth Bhat2018-07-041-8/+8
| | | | | | | | | | | | | | | | | Summary: It appears that llvm uses unbuffered C++ streams. So, we should not mix C and C++ stream operations, because that will give us mixed up output. Reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, zinob, huihuiz, pollydev, grosser, singam-sanjay, philip.pfaffe Reviewed By: philip.pfaffe Subscribers: nemanjai, kbarton Differential Revision: https://reviews.llvm.org/D40126 llvm-svn: 336288
* [polly-acc] change cl_get_* return types to 32/64bitPhilip Pfaffe2018-07-021-9/+17
| | | | | | | | | | | | | | | | | | | Summary: This patch changes the return types for ocl_get_* functions during SPIR code generation. Because these functions return size_t types, the return type needs to be changed to the actual size of size_t on the device. Based on work by Michal Babej and Pekka Jääskeläinen Patch by: Alain Denzler Reviewers: grosser, philip.pfaffe, bollu Reviewed By: grosser, philip.pfaffe Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D48774 llvm-svn: 336080
* Simplify the implementation of getCUDALibDeviceFunction. NFC.Philip Pfaffe2018-06-141-13/+9
| | | | | | | | | | | | | | | | Summary: The function is currently awfully complicated. Drop the IILE and use StringRef over std::string. Reviewers: Meinersbur, grosser, bollu Reviewed By: Meinersbur Subscribers: nemanjai, kbarton, bollu, llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D48070 llvm-svn: 334695
* Run clang-formatPhilip Pfaffe2018-06-071-1/+1
| | | | llvm-svn: 334172
* Fix a missing lambda return type that tripped the buildersPhilip Pfaffe2018-06-071-1/+1
| | | | llvm-svn: 334166
* [Acc] Re-land r326643 to finally fix PR33208.Philip Pfaffe2018-05-231-2/+4
| | | | | | | Other than before, don't clear out LI entirely but only those relevant loops. llvm-svn: 333089
* CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it ↵Peter Collingbourne2018-05-211-2/+3
| | | | | | | | | | up to dwo output. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47089 llvm-svn: 332881
* [polly] Update uses of DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-151-17/+17
| | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM Differential Revision: https://reviews.llvm.org/D44978 llvm-svn: 332352
* Remove another set or release() callsTobias Grosser2018-04-291-3/+3
| | | | llvm-svn: 331129
* Revert "[Acc] Fix for PR33208"Philip Pfaffe2018-03-031-5/+9
| | | | | | This reverts commit r326643. Fix didn't really fix anything. llvm-svn: 326656
* [Acc] Fix for PR33208Philip Pfaffe2018-03-031-9/+5
| | | | | | | | | | | | | During codegen, Polly attempts to clear all loops from ScalarEvolution and LoopInfo, and it does so one block at a time. This causes undefined behaviour, since this way a loop header might be removed from a loop before the entire loop is erased, causing ScalarEvolution to run into an error. Instead, just delete the entire loop atomically. This fixes currently failing testcases. llvm-svn: 326643
* Use isl::manage_copy to simplify calls to isl::manage(isl_.._copy())Tobias Grosser2018-02-201-12/+7
| | | | | | | | | | | As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) pattern are eliminated as well. We checked for all potential cleanups by scanning for: "grep -R isl::manage\( lib/ | grep copy" llvm-svn: 325558
* Run polly-update-format. NFC.Michael Kruse2017-11-211-2/+2
| | | | | | | polly-check-format has been failing since at least r318517, due to more than one cause. llvm-svn: 318795
* Port ScopInfo to the isl cpp bindingsPhilip Pfaffe2017-11-191-20/+21
| | | | | | | | | | | | | | | | | | | | | Summary: Most changes are mechanical, but in one place I changed the program semantics by fixing a likely bug: In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the error-case. Before, when the call to `addNonEmptyDomainConstraints()` returned a null set, this (probably) accidentally worked because isl_bool_error converts to true. I'm checking for nullptr now. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: nemanjai, kbarton, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D39971 llvm-svn: 318632
* [polly] Remove redundant return [NFC]Mandeep Singh Grang2017-11-101-1/+0
| | | | | | | | | | | | | | Reviewers: grosser, bollu Reviewed By: grosser Subscribers: nemanjai, kbarton, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D39916 llvm-svn: 317922
* [Acc] Do not statically dispatch into IslNodeBuilder's createForPhilip Pfaffe2017-10-291-0/+7
| | | | | | | | | | | | | | | | | | | | Summary: When GPUNodeBuilder creates loops inside the kernel, it dispatches to IslNodeBuilder. This however is surprisingly dangerous, since it accesses the AST Node's user through the wrong type. This patch fixes this problem by overriding createFor correctly. This fixes PR35010. Reviewers: grosser, bollu, Meinersbur Reviewed By: Meinersbur Subscribers: Meinersbur, nemanjai, pollydev, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D39364 llvm-svn: 316872
* [GPGPU] Make sure escaping invariant load hoisted scalars are preservedTobias Grosser2017-10-041-1/+3
| | | | | | | | | | | | | | | We make sure that the final reload of an invariant scalar memory access uses the same stack slot into which the invariant memory access was stored originally. Earlier, this was broken as we introduce a new stack slot aside of the preload stack slot, which remained uninitialized and caused our escaping loads to contain garbage. This happened due to us clearing the pre-populated values in EscapeMap after kernel code generation. We address this issue by preserving the original host values and restoring them after kernel code generation. EscapeMap is not expected to be used during kernel code generation, hence we clear it during kernel generation to make sure that any unintended uses are noticed. llvm-svn: 314894
* [GPGPU] Set Polly's RTC to false in case invariant load hoisting failsTobias Grosser2017-10-011-0/+6
| | | | | | | | | | | This matches the behavior we already have in lib/Codegen/CodeGeneration.cpp and makes sure that we fall back to the original code. It seems when invariant load hoisting was introduced to the GPGPU backend we missed to reset the RTC flag, such that kernels where invariant load hoisting failed executed the 'optimized' SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this results in hard to debug issues that are a lot of fun to debug. llvm-svn: 314624
* [PPCGCodeGen] Document pre-composition with Zero in getExtent. [NFC]Siddharth Bhat2017-09-071-0/+26
| | | | | | | It's weird at first glance that we do this, so I wrote up some documentation on why we need to perform this process. llvm-svn: 312715
* [PPCGCodeGen] Convert intrinsics to libdevice functions whenever possible.Siddharth Bhat2017-08-311-7/+41
| | | | | | | | | | | | | | | | This is useful when we face certain intrinsics such as `llvm.exp.*` which cannot be lowered by the NVPTX backend while other intrinsics can. So, we would need to keep blacklists of intrinsics that cannot be handled by the NVPTX backend. It is much simpler to try and promote all intrinsics to libdevice versions. This patch makes function/intrinsic very uniform, and will always try to use a libdevice version if it exists. Differential Revision: https://reviews.llvm.org/D37056 llvm-svn: 312239
* [PM] Properly require and preserve OptimizationRemarkEmitter. NFCI.Michael Kruse2017-08-281-11/+2
| | | | | | | | | | | | | | | | | | | | | | Properly require and preserve the OptimizationRemarkEmitter for use in ScopPass. Previously one had to get the ORE from ScopDetection because CodeGeneration did not mark it as preserved. It would need to be recomputed which results in the legacy PM to throw away all previous SCoP analysis. This also changes the implementation of ScopPass::getAnalysisUsage to not unconditionally preserve all passes, but only those needed to be preserved by any SCoP pass (at least when using the legacy PM). This allows invalidating DependenceInfo (and IslAstInfo) in case the pass would cause them to change (e.g. OpTree, DeLICM, MaximalArrayExpansion) JSONImporter should also invalidate the DependenceInfo. In this patch it marks DependenceInfo as preserved anyway because some regression tests depend on it. Differential Revision: https://reviews.llvm.org/D37010 llvm-svn: 311888
* [Polly] [PPCGCodeGeneration] Mild refactoring of checking validity of ↵Siddharth Bhat2017-08-241-9/+10
| | | | | | | | | | | | functions in a kernel. This is a stylistic change to make the function a little more readable. Also add a debug print to show what instruction contains a use of a function we don't understand in the kernel. Differential Revision: https://reviews.llvm.org/D37058 llvm-svn: 311648
* [PPCGCodeGen] Fix compiler warning: '<': signed/unsigned mismatch. NFC.Michael Kruse2017-08-231-6/+6
| | | | | | | | | | | MSVC warns about comparison between a signed and unsigned integer. The rules of C(++) define that an unsigned comparison has to be carried-out in this case. This is unlikely to be intended. Fix by assigning the loop's upper bound to a signed integer first. This also avoids repeated evaluation of the invariant upper bound. llvm-svn: 311548
* [PPCGCodeGeneration] Enable `polly-codegen-perf-monitoring` for PPCGCodegen.Siddharth Bhat2017-08-211-0/+19
| | | | | | | | | | This feature was not enabled for `PPCGCodeGeneration`. Now that this is enabled, we can benchmark Scops that have been optimised with `-polly-codegen-ppcg` with the `-polly-codegen-perf-monitoring` option. Differential Revision: https://reviews.llvm.org/D36934 llvm-svn: 311328
* [GPGPU] Add llvm.powi to the libdevice supported functionsTobias Grosser2017-08-211-1/+1
| | | | | | These intrinsics are used in COSMO. llvm-svn: 311324
* [GPGPU] Add log / logf to the libdevice supported functionsTobias Grosser2017-08-211-2/+2
| | | | | | These two functions are used in COSMO llvm-svn: 311322
* Revert "[GPGPU] Simplify PPCGSCop to reduce compile time [NFC]"Tobias Grosser2017-08-191-79/+3
| | | | | | | | | We still see some issues with parameter space mismatches. Revert this to get a clean baseline. We will recommit after these issues have been resolved. This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49. llvm-svn: 311268
* [GPGPU] Correctly initialize array order and fixed_element informationTobias Grosser2017-08-191-7/+7
| | | | | | | | | | | | | | | | | | | | | | Summary: This information is necessary for PPCG to perform correct life range reordering. With these changes applied we can live-range reorder some of the important kernels in COSMO. We also update and rename one test case, which previously could not be optimized and now is optimized thanks to live-range reordering. To preserve test coverage we add a new test case scalar-writes-in-scop-requires-abort.ll, which exercises our automatic abort in case of scalar writes in the kernel. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36929 llvm-svn: 311259
* [PPCG] Only add Kernel argument sizes for OpenCL, not CUDA runtimePhilipp Schaad2017-08-191-14/+28
| | | | | | | | Kernel argument sizes now only get appended to the kernel launch parameter list if the OpenCL runtime is selected, not if CUDA runtime is chosen. Differential revision: D36925 llvm-svn: 311248
* [GPGPU] Collect parameter dimension used in MemoryAccessesTobias Grosser2017-08-191-5/+17
| | | | | | | | | | | When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory we add parameter dimensions lazily to the domains, which results in PPCG not including parameter dimensions that are only used in memory accesses in the kernel space. To make sure these parameters are still passed to the kernel, we collect these parameter dimensions and align the kernel's parameter space before code-generating it. llvm-svn: 311239
* [GPGPU] Simplify PPCGSCop to reduce compile time [NFC]Tobias Grosser2017-08-181-3/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Drop unused parameter dimensions to reduce the size of the sets we are working with. Especially the computed dependences tend to accumulate a lot of parameters that are present in the input memory accesses, but often not necessary to express the actual dependences. As isl represents maps and sets with dense matrices, reducing the dimensionality of isl sets commonly reduces code generation performance. This reduces compile time from 17 to 11 seconds for our test case. While this is not impressive, this patch helped me to identify the previous two performance improvements and additionally also increases readability of the isl data structures we use. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36869 llvm-svn: 311161
* [Polly] [PPCGCodeGeneration] Print current Scop and loop depth in ↵Siddharth Bhat2017-08-181-0/+3
| | | | | | | | PPCGCodeGen. [NFC] Differential Revision: https://reviews.llvm.org/D36871 llvm-svn: 311158
* [GPGPU] Do not create copy statements when targetting managed memoryTobias Grosser2017-08-181-1/+2
| | | | | | | | | | | | | | | | | | Summary: They are not used and consequently do not even need to be computed. This reduces the overall compile time for our kernel from 1m33s to 17s. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36868 llvm-svn: 311157
* [GPGPU] Synchronize after each kernel, not each copy outTobias Grosser2017-08-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change reduces the overall number of synchronize calls for kernels with a lot of output data at the cost of additional synchronize calls for kernels launched in sequence without any device to host transfers in between. As the latter pattern is a lot less frequent, this seems a better tradeoff. Even though the above motivation would be motivation enough, this is just a step towards enabling ppcg to not compute to and from device copy calls at all, which would be incorrect in case we still relied on these calls to place our synchronization statements. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, kbarton, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36867 llvm-svn: 311155
* [GPGPU] Only collect the access that belong to an array [NFC]Tobias Grosser2017-08-171-6/+5
| | | | | | | | This avoid the construction of very large sets and in many cases also keeps the number of parameters low. As a result, we see a compile time reduction from 5 minutes to only slightly above 1 minute for one of our larger test cases. llvm-svn: 311127
* [GPGPU] Move getExtend to C++ [NFC]Tobias Grosser2017-08-171-54/+35
| | | | llvm-svn: 311123
* [GPGPU] Make the ast_build available to block generatorTobias Grosser2017-08-101-0/+2
| | | | | | This is necessary for partial writes (as used by delicm) to work. llvm-svn: 310553
* [ManagedMemoryRewrite] Introduce a new pass to rewrite modules to use ↵Siddharth Bhat2017-08-091-30/+37
| | | | | | | | | | | | | | | | managed memory. This pass is useful to automatically convert a codebase that uses malloc/free to use their managed memory counterparts. Currently, rewrite malloc and free to the `polly_{malloc,free}Managed` variants. A future patch will teach ManagedMemoryRewrite to rewrite global arrays as pointers to globally allocated managed memory. Differential Revision: https://reviews.llvm.org/D36513 llvm-svn: 310471
* [PPCGCodeGeneration] Compute element size in bytes for arrays correctly.Siddharth Bhat2017-08-091-1/+14
| | | | | | | | | | | | Previously, we used to compute this with `elementSizeInBits / 8`. This would yield an element size of 0 when the array had element size < 8 in bits. To fix this, ask data layout what the size in bytes should be. Differential Revision: https://reviews.llvm.org/D36459 llvm-svn: 310448
* [Polly] [PPCGCodeGeneration] Handle failing of invariant load hoisting ↵Siddharth Bhat2017-08-081-9/+26
| | | | | | | | | | | gracefully. To do this, we replicate what `CodeGeneration` does. We expose `markNodeUnreachable` from `CodeGeneration` to `PPCGCodeGeneration`. Differential Revision: https://reviews.llvm.org/D36457 llvm-svn: 310350
* [GPGPU] Remove redundant constructorsTobias Grosser2017-08-071-4/+4
| | | | llvm-svn: 310284
* [ScopInfo] Move Scop::getPwAffOnly to isl++ [NFC]Tobias Grosser2017-08-061-1/+1
| | | | llvm-svn: 310231
OpenPOWER on IntegriCloud