path: root/polly
Commit message | Author | Age | Files | Lines
* [ScopInfo] Fix: use raw source pointer. | Michael Kruse | 2016-10-25 | 2 | -1/+62
| | | | | | | | | | | | When adding an llvm.memcpy instruction to AliasSetTracker, it uses the raw source and target pointers which preserve bitcasts. MemAccInst::getPointerOperand() also returns the raw target pointers, but Scop::buildAliasGroups() did not for the source pointer. This led to mismatches between AliasSetTracker and ScopInfo on which pointer to use. Fixed by also using raw pointers in Scop::buildAliasGroups(). llvm-svn: 285071
* [polly] Change SmallPtrSet which is being iterated to SmallSetVector in ↵ | Mandeep Singh Grang | 2016-10-21 | 1 | -2/+2
| | | | | | | | | | | | | | | | ScopInfo.h Summary: This will avoid non-deterministic iteration order. Reviewers: grosser, jdoerfert, zinob, mgrang Subscribers: #polly Tags: #polly Differential Revision: https://reviews.llvm.org/D25880 llvm-svn: 284883
* [SCEVAffinator] Make precise modular math more correct. | Eli Friedman | 2016-10-21 | 7 | -70/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | Integer math in LLVM IR is modular. Integer math in isl is arbitrary-precision. Modeling LLVM IR math correctly in isl requires either adding assumptions that math doesn't actually overflow, or explicitly wrapping the math. However, expressions with the "nsw" flag are special; we can pretend they're arbitrary-precision because it's undefined behavior if the result wraps. SCEV expressions based on IR instructions with an nsw flag also carry an nsw flag (roughly; actually, the real rule is a bit more complicated, but the details don't matter here). Before this patch, SCEV flags were also overloaded with an additional function: the ZExt code was mutating SCEV expressions as a hack to indicate to checkForWrapping that we don't need to add assumptions to the operand of a ZExt; it'll add explicit wrapping itself. This kind of works... the problem is that if anything else ever touches that SCEV expression, it'll get confused by the incorrect flags. Instead, with this patch, we make the decision about whether to explicitly wrap the math a bit earlier, basing the decision purely on the SCEV expression itself, and not its users. Differential Revision: https://reviews.llvm.org/D25287 llvm-svn: 284848
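The "explicitly wrapping the math" option described in the entry above can be illustrated with a small sketch. It is not Polly's or isl's code: `wrapSigned` is a hypothetical helper that reduces an unbounded result to the signed range of an N-bit integer, which is the effect an explicit wrap has to model. It assumes `1 <= Bits <= 62` so the intermediate values fit in `int64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Polly or isl): map a value computed with
// "unbounded" arithmetic into the signed range of a Bits-wide integer,
// i.e. ((Value + 2^(Bits-1)) mod 2^Bits) - 2^(Bits-1).
// Assumes 1 <= Bits <= 62 so that all intermediates fit into int64_t.
int64_t wrapSigned(int64_t Value, unsigned Bits) {
  const int64_t Modulus = int64_t(1) << Bits;
  const int64_t Half = Modulus / 2;
  int64_t Shifted = (Value + Half) % Modulus;
  if (Shifted < 0) // C++ '%' keeps the sign of the dividend
    Shifted += Modulus;
  return Shifted - Half;
}
```

For an 8-bit addition without nsw, `127 + 1` has to be modeled as `wrapSigned(128, 8)`; with nsw the wrap can be skipped because overflow would be undefined behavior anyway.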
* [polly] Change SmallPtrSet which are being iterated into SmallSetVector | Mandeep Singh Grang | 2016-10-21 | 1 | -1/+1
| | | | | | | | | | | | Summary: Otherwise the lack of an iteration order results in non-determinism in codegen. Reviewers: _jdoerfert, zinob, grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D25863 llvm-svn: 284845
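Why switching the container removes the non-determinism can be shown with a standard-library analogue. The class below is a hypothetical stand-in, not LLVM's `SmallSetVector`: it pairs a hash set (for uniqueness) with a vector (for order), so iteration follows insertion order instead of pointer or hash values.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical analogue of a set-vector: unique elements, iterated in the
// order in which they were first inserted.
template <typename T> class InsertionOrderedSet {
  std::unordered_set<T> Seen; // membership test only, never iterated
  std::vector<T> Order;       // defines the iteration order

public:
  // Returns true if the element was newly inserted.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }
  bool contains(const T &V) const { return Seen.count(V) != 0; }
  std::size_t size() const { return Order.size(); }
  typename std::vector<T>::const_iterator begin() const { return Order.begin(); }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
};
```

Code generation that walks such a container emits its output in the same order on every run, independent of where the elements happen to be allocated.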
* [cmake] Avoid warnings in feature tests. NFC. | Michael Kruse | 2016-10-20 | 1 | -5/+5
| | | | | | | | | | Apply the __attribute__((unused)) before the function to unambiguously apply to the function declaration. Add more casts-to-void to mark return values unused as intended. Contributed-by: Andy Gibbs <andyg1001@hotmail.co.uk> llvm-svn: 284718
* Update isl to isl-0.17.1-236-ga9c6cc7 | Tobias Grosser | 2016-10-20 | 7 | -3/+43
| | | | | | This includes isl_id_to_str, which is used in Michael's upcoming DeLICM patch. llvm-svn: 284689
* [polly] Fix non-determinism in polly BlockGenerators | Mandeep Singh Grang | 2016-10-19 | 1 | -2/+2
| | | | | | | | | | | | Summary: Iterating over SeenBlocks which is a SmallPtrSet results in non-determinism in codegen Reviewers: jdoerfert, zinob, grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D25778 llvm-svn: 284622
* [test] Fix buildbot after SCEV change. | Michael Kruse | 2016-10-18 | 1 | -1/+1
| | | | | | | | Update test after commit r284501: [SCEV] Make CompareValueComplexity a little bit smarter Contributed-by: Sanjoy Das <sanjoy@playingwithpointers.com> llvm-svn: 284543
* Handle multi-dimensional invariant load. | Eli Friedman | 2016-10-17 | 2 | -0/+73
| | | | | | | If the address of a load depends on another load, make sure to emit the loads in the right order. llvm-svn: 284426
* [ScopDetect] Depend transitively on ScalarEvolution. | Michael Kruse | 2016-10-17 | 2 | -1/+48
| | | | | | | ScopDetection might be queried by -dot-scops or -view-scops passes for which it accesses ScalarEvolution. llvm-svn: 284385
* [test] Add missing colon. | Michael Kruse | 2016-10-16 | 1 | -1/+1
| | | | llvm-svn: 284349
* [cmake] Add polly-isl-test dependency to lit tests. | Michael Kruse | 2016-10-16 | 1 | -1/+1
| | | | | | Also handle the in-llvm-tree case forgotten in r284339. llvm-svn: 284347
* [cmake] Add polly-isl-test dependency to lit tests. | Michael Kruse | 2016-10-16 | 1 | -1/+1
| | | | | | | | | | | lit recursively iterates through the test subdirectories and finds the ISL unittest. For this test to work, the polly-isl-test executable needs to be compiled. Add the polly-isl-test dependency to POLLY_TEST_DEPS. This makes check-polly and check-polly-tests work from a fresh build directory. llvm-svn: 284339
* [test] Add -polly-unprofitable-scalar-accs to test that needs it. | Michael Kruse | 2016-10-16 | 1 | -0/+1
| | | | | | The test non_affine_loop_used_later.ll also tests the profitability heuristic. Add the option -polly-unprofitable-scalar-accs explicitly to ensure that the test succeeds if the default value is changed. llvm-svn: 284338
* cmake: avoid "zero-length gnu_printf format string" warning in gcc 6.1.1 | Tobias Grosser | 2016-10-15 | 1 | -2/+2
| | | | | Contributed-by: Andy Gibbs <andyg1001@hotmail.co.uk> llvm-svn: 284302
* [ScopInfo/CodeGen] ExitPHI reads are implicit. | Michael Kruse | 2016-10-12 | 4 | -9/+63
| | | | | | | | | | | | | | | Under some conditions MK_Value read accesses were converted to MK_ExitPHI read accesses. This is unexpected because MK_ExitPHI read accesses are implicit after the scop execution. This behaviour was introduced in r265261, which fixed a failed assertion/crash in CodeGen. Instead, we fix this failure in CodeGen itself. createExitPHINodeMerges(), despite its name, also handles accesses of kind MK_Value, only to skip them because they access values that are usually not PHI nodes in the SCoP region's exit block. Except in the situation observed in r265261. Do not convert value accesses to ExitPHI accesses and do not handle value accesses like ExitPHI accesses in CodeGen anymore. llvm-svn: 284023
* [DepInfo] Print -debug output outside of max-operations scope. | Michael Kruse | 2016-10-10 | 1 | -5/+5
| | | | | | | | | ISL tries to simplify the polyhedral operations before printing its objects. This increases the operations counter and therefore can contribute to hitting the operations limit. Therefore the result could be different when -debug output is enabled, making debugging harder. llvm-svn: 283745
* [Support/DepInfo] Introduce IslMaxOperationsGuard and make DepInfo use it. NFC. | Michael Kruse | 2016-10-10 | 2 | -51/+110
| | | | | | | | | IslMaxOperationsGuard defines a scope in which ISL may abort a computation if it takes too many operations. Replace the call to the raw ISL interface by a use of the guard. IslMaxOperationsGuard provides a uniform way to define a maximal computation time for a code region in C++ using RAII. llvm-svn: 283744
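The RAII idea behind such a guard can be sketched as follows. This is not the actual IslMaxOperationsGuard: `FakeContext`, `MaxOperationsGuardSketch` and `limitSeenInsideScope` are hypothetical names, and the real guard talks to an ISL context rather than to a plain struct.

```cpp
#include <cassert>

// Hypothetical stand-in for a solver context with an operation limit
// (0 means "unlimited" in this sketch).
struct FakeContext {
  unsigned long MaxOperations = 0;
};

// Sets a limit for the lifetime of the guard and restores the previous value
// on every exit path out of the scope (normal return, early return, exception).
class MaxOperationsGuardSketch {
  FakeContext &Ctx;
  unsigned long Saved;

public:
  MaxOperationsGuardSketch(FakeContext &C, unsigned long Limit)
      : Ctx(C), Saved(C.MaxOperations) {
    Ctx.MaxOperations = Limit;
  }
  ~MaxOperationsGuardSketch() { Ctx.MaxOperations = Saved; }
  MaxOperationsGuardSketch(const MaxOperationsGuardSketch &) = delete;
  MaxOperationsGuardSketch &operator=(const MaxOperationsGuardSketch &) = delete;
};

// Returns the limit that code inside the guarded region observes.
unsigned long limitSeenInsideScope(FakeContext &Ctx, unsigned long Limit) {
  MaxOperationsGuardSketch Guard(Ctx, Limit);
  return Ctx.MaxOperations; // value is read before Guard is destroyed
}
```

Because the limit is tied to a C++ scope, work such as debug printing can simply be placed after the closing brace so that it no longer counts against the budget.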
* Fix formatting after recent cl:: changes | Tobias Grosser | 2016-10-09 | 1 | -10/+13
| | | | | | This fixes 'make check-polly' llvm-svn: 283693
* Turn cl::values() (for enum) from a vararg function to using C++ variadic ↵ | Mehdi Amini | 2016-10-08 | 3 | -17/+10
| | | | | | | | | | | | | | | template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671
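The general shape of replacing a C vararg function with a variadic template can be sketched like this. It is not the real `cl::values()` implementation: `EnumOption` and `makeValues` are hypothetical names chosen for illustration. The point is that every argument keeps its static type, so neither `va_start` nor a terminating sentinel argument is needed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one enum value of a command-line option.
struct EnumOption {
  std::string Name;
  int Value;
  std::string Description;
};

// Type-safe replacement for a vararg interface: the parameter pack is
// expanded at compile time into an initializer list.
template <typename... OptionTs>
std::vector<EnumOption> makeValues(OptionTs... Options) {
  return std::vector<EnumOption>{Options...};
}
```

Passing anything that cannot be converted to `EnumOption` is a compile-time error here, whereas a vararg function would only misbehave at run time.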
* Define PATH_MAX on windows | Hongbin Zheng | 2016-10-07 | 1 | -0/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D25372 llvm-svn: 283600
* [cmake] Unify disabling upstream project warnings. | Michael Kruse | 2016-10-07 | 1 | -25/+14
| | | | | | Handle MSVC, ISL and PPCG in one place. The only functional change is that warnings are also disabled for MSVC compiling PPCG (which currently fails anyway). llvm-svn: 283547
* [cmake] Move isl_test artifacts to Polly folder. | Michael Kruse | 2016-10-07 | 2 | -0/+2
| | | | | | | Folders in Visual Studio solutions help organize the build artifacts from all LLVM projects. There is a folder to keep Polly-built files in. llvm-svn: 283546
* Build and run isl_test as part of check-polly | Tobias Grosser | 2016-10-04 | 5 | -1/+94
| | | | | | | | | | | | | | | | | | | | Running isl tests is important to gain confidence that the isl build we created works as expected. Besides the actual isl tests, there are also isl AST generation tests shipped with isl. This change only adds support for the isl unit tests. AST generation test support is left for a later commit. There is a choice to run tests directly through the build system or in the context of lit. We choose to run tests as part of lit, as this allows us to easily set environment variables, print output only on error and generally run the tests directly from the lit command. Reviewers: brad.king, Meinersbur Subscribers: modocache, brad.king, pollydev, beanz, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D25155 llvm-svn: 283245
* [ScopInfo] Add -polly-unprofitable-scalar-accs option. | Michael Kruse | 2016-10-04 | 2 | -1/+103
| | | | | | | With this option one can disable the heuristic that assumes that statements with a scalar write access cannot be profitably optimized. Instances of such a statement necessarily have WAW-dependences on one another. With DeLICM, scalar accesses can be changed to array accesses, which can avoid these WAW-dependences. llvm-svn: 283233
* [ScopInfo] Scalar accesses do not have indirect base pointers. | Michael Kruse | 2016-10-04 | 3 | -5/+5
| | | | | | | | | | | | | | ScopArrayInfo used to determine base pointer origins by looking up whether the base pointer is a load. The "base pointer" for scalar accesses is the llvm::Value being accessed. This is only a symbolic base pointer; it represents the alloca variable (.s2a or .phiops) generated for it at code generation. This patch disables determining base pointer origin for scalars. A test case where this caused a crash will be added in the next commit. In that test, SAI tried to get the origin base pointer that was only declared later and therefore did not exist yet. This is probably only possible for scalars used in PHINode incoming blocks. llvm-svn: 283232
* [ScopInfo] Make simplifySCoP() public. NFC. | Michael Kruse | 2016-10-04 | 1 | -3/+7
| | | | | | This function may need to be called after the scop construction. The upcoming DeLICM will use this to clean up statements from which all write accesses have been removed. llvm-svn: 283221
* isl: update to isl-0.17.1-233-gc911e6a | Tobias Grosser | 2016-10-01 | 21 | -517/+1168
| | | | llvm-svn: 283049
* [CodeGen] Add assertion for indirect array index expression generation. NFC. | Michael Kruse | 2016-09-30 | 1 | -0/+3
| | | | | | | | | | | Currently Polly cannot generate code for index expressions if the base pointer is computed within the scop. The base pointer must be generated as well, but there is no code that triggers that. Add an assertion to detect when this would occur and miscompile. The IR verifier should catch it as well. llvm-svn: 282893
* [Support] Complete ISL annotations to IslPtr<>. NFC. | Michael Kruse | 2016-09-30 | 1 | -9/+9
| | | | | | | | | Add missing __isl_(give/take/keep) annotations to IslPtr<> and NonowningIslPtr<> methods. Because IslPtr's constructor's annotation would depend on the TakeOwnership parameter, the parameter has been removed. Callers must copy the object themselves if they do not want to take ownership. llvm-svn: 282883
* [Support] Compile fix for gcc. NFC. | Michael Kruse | 2016-09-30 | 1 | -2/+4
| | | | | | | | | | | | gcc 5.4 insists on template specialization to be in a namespace polly { ... } block, instead of being prefixed with 'polly::'. Error message: root/src/llvm/tools/polly/lib/Support/GICHelper.cpp:203:54: error: specialization of ‘template<class T> void polly::IslPtr<T>::dump() const’ in different namespace [-fpermissive] template <> void polly::IslPtr<isl_##TYPE>::dump() const { \ ^ msvc14 and clang 3.8 did not complain. llvm-svn: 282874
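The form of specialization that the entry above settles on can be sketched with a self-contained example. `demo::Handle` and `describe()` are hypothetical names; only the placement of the explicit specialization inside a `namespace ... { }` block mirrors the fix described in the commit.

```cpp
#include <cassert>
#include <string>

namespace demo {
// Primary template: the member function is declared here, and only
// specializations of it are ever defined in this sketch.
template <typename T> struct Handle {
  std::string describe() const;
};
} // namespace demo

// Form that the entry above reports gcc 5.4 as rejecting:
//   template <> std::string demo::Handle<int>::describe() const { ... }
// Form accepted by all compilers mentioned there: reopen the namespace.
namespace demo {
template <> std::string Handle<int>::describe() const { return "int handle"; }
} // namespace demo
```

Reopening the namespace costs two extra lines per specialization (or per macro expansion) and avoids relying on how a particular compiler version treats the qualified form.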
* [Support] Add (Nonowning-)IslPtr::dump(). NFC. | Michael Kruse | 2016-09-30 | 2 | -0/+37
| | | | | | | | | | The dump() methods can be called from a debugger instead of e.g. isl_*_dump(Var.Obj) where Var is a variable of type IslPtr/NonowningIslPtr. To ensure that the existence of the function pointers does not depend on whether the methods are used somewhere, they are declared with external linkage. llvm-svn: 282870
* [Support] Call isl_*_free() only on non-null pointers. NFC. | Michael Kruse | 2016-09-30 | 1 | -2/+6
| | | | | | | | | | | | Add a non-NULL check before calling the free function into functions that are supposed to be inlined. First, this is a form of partial inlining of the free function, namely the nullptr test that free has to do. Secondly, and more importantly, it allows the compiler to remove the call to isl_*_free() when it knows that the object is nullptr, for instance because the last call is a take(). "Consuming" the last use of an ISL object using take() (instead of copy()) is a common pattern. llvm-svn: 282864
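The take() pattern and the null check described above can be sketched with a toy owning pointer. Nothing here is Polly's IslPtr: `FakeObj`, `fakeFree`, `OwningPtrSketch` and `freeCallsAfterTake` are hypothetical, and `fakeFree` merely counts calls so the effect of the check is observable.

```cpp
#include <cassert>

struct FakeObj {
  int Dummy = 0;
};

static int FreeCalls = 0;

// Stand-in for an out-of-line free function; it counts how often it is called.
void fakeFree(FakeObj *Obj) {
  ++FreeCalls;
  delete Obj;
}

class OwningPtrSketch {
  FakeObj *Obj;

public:
  explicit OwningPtrSketch(FakeObj *O) : Obj(O) {}
  OwningPtrSketch(const OwningPtrSketch &) = delete;
  OwningPtrSketch &operator=(const OwningPtrSketch &) = delete;
  // The inlined null check lets the call disappear entirely when the
  // compiler can see that take() has already emptied the pointer.
  ~OwningPtrSketch() {
    if (Obj)
      fakeFree(Obj);
  }
  // Hand ownership to the caller and leave this object empty.
  FakeObj *take() {
    FakeObj *Result = Obj;
    Obj = nullptr;
    return Result;
  }
};

// Returns how many times the destructor called fakeFree() after a take().
int freeCallsAfterTake() {
  FreeCalls = 0;
  FakeObj *Raw = nullptr;
  {
    OwningPtrSketch P(new FakeObj());
    Raw = P.take(); // last use of P "consumes" the object
  }
  int CallsFromDestructor = FreeCalls;
  fakeFree(Raw); // the new owner releases the object exactly once
  return CallsFromDestructor;
}
```

In this sketch the destructor makes no call into the free function once the object has been taken; whether a real compiler also removes the branch depends on inlining and optimization level.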
* [CodeGen] Change 'Scalar' to 'Array' in method names. NFC. | Michael Kruse | 2016-09-30 | 2 | -15/+14
| | | | | | | | | | | generateScalarLoad() and generateScalarStore() are used for explicit (MK_Array) memory accesses, therefore the method names were misleading. The names also were similar to generateScalarLoads() and generateScalarStores() (plural forms) which indeed handle scalar accesses. Presumably, they were originally named to contrast VectorBlockGenerator::generateLoad(). Rename the two methods to generateArrayLoad() and generateArrayStore(), respectively. llvm-svn: 282861
* [CodeGen] Add assertion for partial scalar accesses. NFC. | Michael Kruse | 2016-09-30 | 1 | -0/+18
| | | | | | | The code generator always adds unconditional LoadInst and StoreInst, hence the MemoryAccess must be defined over all statement instances. llvm-svn: 282853
* www: Add Loopy publication | Tobias Grosser | 2016-09-29 | 1 | -0/+5
| | | | llvm-svn: 282747
* www: add new code coverage link to Polly website | Tobias Grosser | 2016-09-25 | 1 | -1/+1
| | | | llvm-svn: 282351
* [ScopDetection] Remove redundant checks for endless loops | Tobias Grosser | 2016-09-20 | 4 | -98/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Both `canUseISLTripCount()` and `addOverApproximatedRegion()` contained checks to reject endless loops which are now removed and replaced by a single check in `isValidLoop()`. For reporting such loops the `ReportLoopOverlapWithNonAffineSubRegion` is renamed to `ReportLoopHasNoExit`. The test case `ReportLoopOverlapWithNonAffineSubRegion.ll` is adapted and renamed as well. The schedule generation in `buildSchedule()` is based on the following assumption: Given some block B that is contained in a loop L and a SESE region R, we assume that L is contained in R or the other way around. However, this assumption is broken in the presence of endless loops that are nested inside other loops. Therefore, in order to prevent erroneous behavior in `buildSchedule()`, r265280 introduced a corresponding check in `canUseISLTripCount()` to reject endless loops. Unfortunately, it was possible to bypass this check with -polly-allow-nonaffine-loops which was fixed by adding another check to reject endless loops in `allowOverApproximatedRegion()` in r273905. Hence there existed two separate locations that handled this case. Thank you Johannes Doerfert for helping to provide the above background information. Reviewers: Meinersbur, grosser Subscribers: _jdoerfert, pollydev Differential Revision: https://reviews.llvm.org/D24560 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 281987
* Fix spelling in CMakeLists | Tobias Grosser | 2016-09-19 | 1 | -1/+1
| | | | llvm-svn: 281897
* GPGPU: add missing REQUIRES line to test case | Tobias Grosser | 2016-09-18 | 1 | -2/+3
| | | | llvm-svn: 281850
* GPGPU: Do not run mostly sequential kernels in GPU | Tobias Grosser | 2016-09-18 | 2 | -0/+131
| | | | | | | | In case sequential kernels are found deeper in the loop tree than any parallel kernel, the overall scop is probably mostly sequential. Hence, run it on the CPU. llvm-svn: 281849
* GPGPU: Dynamically ensure 'sufficient compute' | Tobias Grosser | 2016-09-18 | 2 | -2/+132
| | | | | | | | | | | Offloading to a GPU is only beneficial if there is a sufficient amount of compute that can be accelerated. Many kernels just have a very small amount of dynamic compute, which means GPU acceleration is not beneficial. We compute at run-time an approximation of how many dynamic instructions will be executed and fall back to CPU code in case this number is not sufficiently large. To keep the run-time checking code simple, we over-approximate the number of instructions executed in each statement by computing the volume of the rectangular hull of its iteration space. llvm-svn: 281848
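The over-approximation described above can be illustrated with a small sketch. It is not the generated run-time check: the names `Bounds`, `rectangularHullVolume` and `hasSufficientCompute`, the per-iteration instruction count and the threshold are all assumptions made for illustration, and overflow of the product is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One inclusive [lower, upper] bound per loop dimension (hypothetical type).
using Bounds = std::vector<std::pair<int64_t, int64_t>>;

// Volume of the rectangular hull: the product of the per-dimension extents.
// For a triangular space {(i, j) : 0 <= i < N, 0 <= j <= i} the hull is the
// full N x N box, so the result over-approximates the true iteration count.
uint64_t rectangularHullVolume(const Bounds &Dims) {
  uint64_t Volume = 1;
  for (const auto &Dim : Dims) {
    if (Dim.second < Dim.first)
      return 0; // empty dimension, nothing executes
    Volume *= static_cast<uint64_t>(Dim.second - Dim.first + 1);
  }
  return Volume;
}

// Decide whether offloading looks worthwhile under an assumed threshold.
bool hasSufficientCompute(const Bounds &Dims, uint64_t InstructionsPerIteration,
                          uint64_t MinInstructions) {
  return rectangularHullVolume(Dims) * InstructionsPerIteration >=
         MinInstructions;
}
```

For the triangular example with N = 10 the hull yields 100 iterations instead of the exact 55, which is the price paid for a check that needs only a few multiplications at run time.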
* GPGPU: Make test cases independent of register numbering [NFC] | Tobias Grosser | 2016-09-18 | 3 | -13/+13
| | | | llvm-svn: 281847
* GPGPU: Store back non-read-only scalars | Tobias Grosser | 2016-09-17 | 2 | -2/+231
| | | | | | | | | | | | | | We may generate GPU kernels that store into scalars in case we run some sequential code on the GPU because the remaining data is expected to already be on the GPU. For these kernels it is important to not keep the scalar values in thread-local registers, but to store them back to the corresponding device memory objects that back them up. We currently only store scalars back at the end of a kernel. This is only correct if precisely one thread is executed. In case more than one thread may be run, we currently invalidate the scop. To support such cases correctly, we would need to always load and store back from a corresponding global memory slot instead of a thread-local alloca slot. llvm-svn: 281838
* GPGPU: Detect read-only scalar arrays ... | Tobias Grosser | 2016-09-17 | 9 | -95/+107
| | | | | | and pass these by value rather than by reference. llvm-svn: 281837
* Update CFGPrinter -> CFGPrinterLegacyPass | Tobias Grosser | 2016-09-16 | 1 | -1/+1
| | | | | | .. to match recent changes in LLVM that broke the Polly compilation. llvm-svn: 281705
* GPGPU: Do not assume arrays start at 0 | Tobias Grosser | 2016-09-15 | 3 | -6/+216
| | | | | | | | | | | | | | | | | Our alias checks precisely check that the minimal and maximal accessed elements do not overlap in a kernel. Hence, we must ensure that our host <-> device transfers do not touch additional memory locations that are not covered in the alias check. To ensure this, we make sure that the data we copy for a given array is only the data from the smallest element accessed to the largest element accessed. We also adjust the size of the array according to the offset at which the array is actually accessed. An interesting result of this is: In case arrays are accessed with negative subscripts, e.g., A[-100], we automatically allocate and transfer _more_ data to cover the full array. This is important as such code indeed exists in the wild. llvm-svn: 281611
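The arithmetic behind "copy only from the smallest to the largest element accessed" can be sketched as follows. `TransferRange` and `computeTransferRange` are hypothetical names for a one-dimensional array; the actual code derives the bounds from the polyhedral access description rather than from two integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a host <-> device copy for a 1-D array.
struct TransferRange {
  int64_t FirstElement; // smallest index accessed; may be negative
  std::size_t NumBytes; // bytes spanning the smallest..largest index accessed
};

// Cover exactly the elements in [MinIndex, MaxIndex], both inclusive.
TransferRange computeTransferRange(int64_t MinIndex, int64_t MaxIndex,
                                   std::size_t ElementSize) {
  assert(MinIndex <= MaxIndex && "expects a non-empty access range");
  const std::size_t NumElements =
      static_cast<std::size_t>(MaxIndex - MinIndex + 1);
  return TransferRange{MinIndex, NumElements * ElementSize};
}
```

With accesses from A[-100] to A[49] on a 4-byte element type this gives a transfer of 150 elements (600 bytes) starting 100 elements before the base pointer, which matches the observation in the entry above that negative subscripts lead to more data being transferred.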
* Perform copying to created arrays according to the packing transformation | Roman Gareev | 2016-09-14 | 15 | -44/+368
| | | | | | | | | | | | | | | | This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441
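The innermost pieces mentioned above, a packing routine and a micro-kernel built from rank-1 updates, can be sketched in plain C++. This is a simplified illustration rather than the code Polly generates: the function names, the row-major layouts and the absence of blocking parameters, vectorization and the surrounding loops are all assumptions.

```cpp
#include <cassert>
#include <vector>

// Pack an MR x KC block of A (row-major) so that the MR entries needed for
// step K of the micro-kernel are contiguous: APacked[K * MR + I] = A(I, K).
std::vector<double> packA(const std::vector<double> &A, int MR, int KC) {
  std::vector<double> APacked(static_cast<std::size_t>(MR) * KC);
  for (int K = 0; K < KC; ++K)
    for (int I = 0; I < MR; ++I)
      APacked[K * MR + I] = A[I * KC + K];
  return APacked;
}

// Micro-kernel: a loop over K around a rank-1 (outer product) update of the
// MR x NR block C. BPacked holds row K of B contiguously at BPacked[K * NR].
void microKernel(int KC, int MR, int NR, const std::vector<double> &APacked,
                 const std::vector<double> &BPacked, std::vector<double> &C) {
  for (int K = 0; K < KC; ++K)
    for (int I = 0; I < MR; ++I)
      for (int J = 0; J < NR; ++J)
        C[I * NR + J] += APacked[K * MR + I] * BPacked[K * NR + J];
}
```

Packing pays off because the micro-kernel then reads both operands with unit stride, regardless of how the original matrices are laid out in memory.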
* cmake: PollyPPCG depends on PollyISL | Tobias Grosser | 2016-09-13 | 1 | -0/+2
| | | | | | | This line makes BUILD_SHARED_LIBS=ON work for Polly-ACC. Without it, ld complains about missing isl symbols when constructing the shared library. llvm-svn: 281396
* GPGPU: Use const_cast to avoid compiler warning [NFC] | Tobias Grosser | 2016-09-13 | 1 | -1/+1
| | | | llvm-svn: 281333