summaryrefslogtreecommitdiffstats
path: root/polly/lib/CodeGen/IslNodeBuilder.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)Guillaume Chatelet2019-09-301-1/+2
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68141 llvm-svn: 373207
* [Alignment] Fix polly buildGuillaume Chatelet2019-09-301-1/+2
| | | | llvm-svn: 373199
* [CodeGen] Handle outlining of CopyStmts.Michael Kruse2019-09-171-3/+4
| | | | | | | | | | | | | | | | Since the removal of extensions nodes from schedule trees in r362257 it is possible to emit parallel code for SCoPs containing matrix-multiplications. However, the code looking for references used in outlined statement was not prepared to handle CopyStmts introduced by the matrix-matrix multiplication detection. In this case, CopyStmts do not introduce references in addition to the ones captured by MemoryAccesses, i.e. we change the assertion to accept CopyStmts and add a regression test for this case. This fixes llvm.org/PR43164 llvm-svn: 372188
* Apply include-what-you-use #include removal suggestions. NFC.Michael Kruse2019-03-281-2/+0
| | | | | | | | | | | | This removes unused includes (and forward declarations) as suggested by include-what-you-use. If a transitive include of a removed include is required to compile a file, I added the required header (or forward declaration if suggested by include-what-you-use). This should reduce compilation time and reduce the number of iterative recompilations when a header was changed. llvm-svn: 357209
* [CodeGen] LLVM OpenMP Backend.Michael Kruse2019-03-191-4/+25
| | | | | | | | | | | | | | | | | | | | | The ParallelLoopGenerator class is changed such that GNU OpenMP specific code was removed, allowing to use it as super class in a template-pattern. Therefore, the code has been reorganized and one may not use the ParallelLoopGenerator directly anymore, instead specific implementations have to be provided. These implementations contain the library-specific code. As such, the "GOMP" (code completely taken from the existing backend) and "KMP" variant were created. For "check-polly" all tests that involved "GOMP": equivalents were added that test the new functionalities, like static scheduling and different chunk sizes. "docs/UsingPollyWithClang.rst" shows how the alternative backend may be used. Patch by Michael Halkenhäuser <michaelhalk@web.de> Differential Revision: https://reviews.llvm.org/D59100 llvm-svn: 356434
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [CodeGen] Convert IslNodeBuilder::getNumberOfIterations to isl++. NFC.Michael Kruse2018-07-311-30/+18
| | | | llvm-svn: 338451
* [CodeGen] Convert IslNodeBuilder::createForSequential to isl++. NFC.Michael Kruse2018-07-311-25/+17
| | | | llvm-svn: 338450
* [CodeGen] Convert IslNodeBuilder::getUpperBound to isl++. NFC.Michael Kruse2018-07-311-30/+17
| | | | llvm-svn: 338449
* [IslNodeBuilder] Use isl++ to replace foreach_set with for loopTobias Grosser2018-07-171-14/+13
| | | | llvm-svn: 337247
* Remove the last uses of isl::give and isl::takeTobias Grosser2018-04-291-9/+9
| | | | llvm-svn: 331126
* Use isl::manage_copy to simplify calls to isl::manage(isl_.._copy())Tobias Grosser2018-02-201-4/+3
| | | | | | | | | | | As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) pattern are eliminated as well. We checked for all potential cleanups by scanning for: "grep -R isl::manage\( lib/ | grep copy" llvm-svn: 325558
* [CodeGen] Detect empty domain because of parameters context.Michael Kruse2017-11-211-0/+2
| | | | | | | | | | | | | | | | | | | Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an empty domain (i.e. has no pieces). We already detected the case if the isl_pw_aff comes with an empty domain. isl_ast_build also considers the domain empty if it is disjoint with the parameter context (e.g. parameters values that we exclude by runtime versioning). Intersect the access relation domain with the parameter context to also detect such practically empty access domains. The effective pointer used in the generated code is unimportand because it will never be executed. This fixes llvm.org/PR35362 llvm-svn: 318806
* Run polly-update-format. NFC.Michael Kruse2017-11-211-1/+1
| | | | | | | polly-check-format has been failing since at least r318517, due to more than one cause. llvm-svn: 318795
* Port ScopInfo to the isl cpp bindingsPhilip Pfaffe2017-11-191-8/+8
| | | | | | | | | | | | | | | | | | | | | Summary: Most changes are mechanical, but in one place I changed the program semantics by fixing a likely bug: In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the error-case. Before, when the call to `addNonEmptyDomainConstraints()` returned a null set, this (probably) accidentally worked because isl_bool_error converts to true. I'm checking for nullptr now. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: nemanjai, kbarton, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D39971 llvm-svn: 318632
* [OpenMP] Fix reference collection of latest base ptrs.Michael Kruse2017-10-311-1/+1
| | | | | | | | When collecting base pointers that need to be made available in parallel subfunctions, use the base pointer associated with the latest ScopArrayInfo, instead of the original one. llvm-svn: 316983
* [Acc] Do not statically dispatch into IslNodeBuilder's createForPhilip Pfaffe2017-10-291-8/+6
| | | | | | | | | | | | | | | | | | | | Summary: When GPUNodeBuilder creates loops inside the kernel, it dispatches to IslNodeBuilder. This however is surprisingly dangerous, since it accesses the AST Node's user through the wrong type. This patch fixes this problem by overriding createFor correctly. This fixes PR35010. Reviewers: grosser, bollu, Meinersbur Reviewed By: Meinersbur Subscribers: Meinersbur, nemanjai, pollydev, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D39364 llvm-svn: 316872
* [IslExprBuilder] Do not generate RTC with more than 64 bitTobias Grosser2017-09-231-0/+11
| | | | | | | | | | Such RTCs may introduce integer wrapping intrinsics with more than 64 bit, which are translated to library calls on AOSP that are not part of the runtime and will consequently cause linker errors. Thanks to Eli Friedman for reporting this issue and reducing the test case. llvm-svn: 314065
* [ISLNodeBuilder] Materialize Fortran array sizes of arrays without memory ↵Siddharth Bhat2017-09-011-31/+22
| | | | | | | | | | | | | | | | | | | | | | accesses. In Polly, we specifically add a paramter to represent the outermost dimension size of fortran arrays. We do this because this information is statically available from the fortran metadata generated by dragonegg. However, we were only materializing these parameters (meaning, creating an llvm::Value to back the isl_id) from *memory accesses*. This is wrong, we should materialize parameters from *scop array info*. It is wrong because if there is a case where we detect 2 fortran arrays, but only one of them is accessed, we may not materialize the other array's dimensions at all. This is incorrect. We fix this by looping over all `polly::ScopArrayInfo` in a scop, rather that just all `polly::MemoryAccess`. Differential Revision: https://reviews.llvm.org/D37379 llvm-svn: 312350
* [Polly] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-08-241-23/+36
| | | | | | other minor fixes (NFC). llvm-svn: 311704
* Add more statistics.Michael Kruse2017-08-231-0/+13
| | | | | | | | | | | | | | | | Add statistics about - Which optimizations are applied - Number of loops in Scops at various stages - Number of scalar/singleton writes at various stages representative for scalar false dependencies - Number of parallel loops These will be useful to find regressions due to moving Polly further down of LLVM's pass pipeline. Differential Revision: https://reviews.llvm.org/D37049 llvm-svn: 311553
* [NFC] Fix the broken comment.Roman Gareev2017-08-221-1/+1
| | | | llvm-svn: 311477
* Disable the Loop Vectorizer in case of GEMMRoman Gareev2017-08-221-1/+26
| | | | | | | | | | | | | | Currently, in case of GEMM and the pattern matching based optimizations, we use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop Vectorizer can get in the way of optimal code generation, we disable the Loop Vectorizer for the innermost loop using mark nodes and emitting the corresponding metadata. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36928 llvm-svn: 311473
* Clarify the intend of the run-time checkTobias Grosser2017-08-191-2/+6
| | | | llvm-svn: 311243
* [GPGPU] Collect parameter dimension used in MemoryAccessesTobias Grosser2017-08-191-1/+7
| | | | | | | | | | | When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory we add parameter dimensions lazily to the domains, which results in PPCG not including parameter dimensions that are only used in memory accesses in the kernel space. To make sure these parameters are still passed to the kernel, we collect these parameter dimensions and align the kernel's parameter space before code-generating it. llvm-svn: 311239
* [GPGPU] Also record invariant loads as kernel subtree valuesTobias Grosser2017-08-161-3/+9
| | | | | | | Before this change kernels that used invariant loads would have resulted in invalid PTX code. llvm-svn: 311042
* [CodeGen] Use isLatestArrayKind().Michael Kruse2017-08-091-1/+1
| | | | | | | | | | Codegen with -polly-parallel queried the unmapped MemoryAccess, but only the MemoryKind after mapping is relevant for codegen. This should fix various fails of the perf-x86_64-penryn-O3-polly-parallel-fast buildbot. llvm-svn: 310466
* [ScopInfo] Move Scop::getPwAffOnly to isl++ [NFC]Tobias Grosser2017-08-061-1/+1
| | | | llvm-svn: 310231
* [ScopInfo] Translate Scop::getParamSpace to isl++ [NFC]Tobias Grosser2017-08-061-4/+6
| | | | llvm-svn: 310224
* [ScopInfo] Translate Scop::getContext to isl++ [NFC]Tobias Grosser2017-08-061-3/+3
| | | | llvm-svn: 310221
* [ScopInfo] Translate Scop::getIdForParam to isl++ [NFC]Tobias Grosser2017-08-061-2/+2
| | | | llvm-svn: 310220
* [ScopInfo] Move ScopStmt::setAstBuild/getAstBuild to isl++Tobias Grosser2017-08-061-1/+1
| | | | llvm-svn: 310216
* Move ScopInfo::getDomain(), getDomainSpace(), getDomainId() to isl++Tobias Grosser2017-08-061-3/+3
| | | | llvm-svn: 310209
* [Polly] [PPCGCodeGeneration] Deal with loops outside the Scop correctly in ↵Siddharth Bhat2017-08-061-0/+1
| | | | | | | | | | | | | | | PPCGCodeGeneration. A Scop with a loop outside it is not handled currently by PPCGCodeGeneration. The test case is such that the Scop has only one inner loop that is detected. This currently breaks codegen. The fix is to reuse the existing mechanism in `IslNodeBuilder` within `GPUNodeBuilder. Differential Revision: https://reviews.llvm.org/D36290 llvm-svn: 310193
* [IslNodeBuilder] [NFC] Refactor creation of loop induction variables of ↵Siddharth Bhat2017-08-061-11/+23
| | | | | | | | | | | | loops outside scops. This logic is duplicated, so we refactor it into a separate function. This will be used in a later patch to teach PPCGCodeGen code generation for loops that are outside the scop. Differential Revision: https://reviews.llvm.org/D36310 llvm-svn: 310192
* [NFC] [IslNodeBuilder, GPUNodeBuilder] Unify mechanism for looking up ↵Siddharth Bhat2017-08-011-5/+8
| | | | | | | | | | | | | | | | | | replacement Values. We populate `IslNodeBuilder::ValueMap` which contains replacements for `llvm::Value`s. There was no simple method to pick up a replacement if it exists, otherwise fall back to the original. Create a method `IslNodeBuilder::getLatestValue` which provides this functionality. This will be used in a later patch to fix bugs in `PPCGCodeGeneration` where the latest value is not being used. Differential Revision: https://reviews.llvm.org/D36000 llvm-svn: 309674
* [IslNodeBuilder] Remove unused instructionTobias Grosser2017-07-311-1/+0
| | | | | Suggested-by: Maximilian Falkenstein <falkensm@student.ethz.ch> llvm-svn: 309533
* Move applyScheduleToAccessRelation to isl++Tobias Grosser2017-07-231-1/+2
| | | | llvm-svn: 308842
* Move MemoryAccess::getAddressFunction to isl++Tobias Grosser2017-07-231-1/+1
| | | | llvm-svn: 308841
* Move MemoryAccess::NewAccessRelation to isl++Tobias Grosser2017-07-231-1/+1
| | | | | | We also move related accessor functions llvm-svn: 308840
* Move MemoryAccess::id to isl++Tobias Grosser2017-07-231-3/+5
| | | | llvm-svn: 308836
* Move ScopArrayInfo to isl++Tobias Grosser2017-07-211-1/+1
| | | | | | This moves the full ScopArrayInfo class to isl++ llvm-svn: 308801
* [IslNodeBuilder] Relax complexity check in invariant loads and run it earlyTobias Grosser2017-07-201-23/+0
| | | | | | | | | | | | | | | | | | | | | | | | When performing invariant load hoisting we check that invariant load expressions are not too complex. Up to this commit, we performed this check by counting the sum of dimensions in the access range as a very simple heuristic. This heuristic is a little too conservative, as it prevents hoisting for any scops with a very large number of parameters. Hence, we update the heuristic to only count existentially quantified dimensions and set dimensions. We expect this to still detect the problematic expressions in h264 because of which this check was originally introduced. For some unknown reason, this complexity check was originally committed in IslNodeBuilder. It really belongs in ScopInfo, as there is no point in optimizing a program which we could have known earlier cannot be code generated. The benefit of running the check early is that we can avoid to even hoist checks that are expensive to code generate as invariant loads. This can be seen in the changed tests, where we now indeed detect the scop, but just not invariant load hoist the complicated access. We also improve the formatting of the code, document it, and use isl++ to simplify expressions. llvm-svn: 308659
* [Invariant Loads] Do not consider invariant loads to have dependences.Siddharth Bhat2017-07-131-0/+16
| | | | | | | | | | | | | | | | | We need to relax constraints on invariant loads so that they do not create fake RAW dependences. So, we do not consider invariant loads as scalar dependences in a region. During these changes, it turned out that we do not consider `llvm::Value` replacements correctly within `PPCGCodeGeneration` and `ISLNodeBuilder`. The replacements dictated by `ValueMap` were not being followed in all places. This was fixed in this commit. There is no clean way to decouple this change because this bug only seems to arise when the relaxed version of invariant load hoisting was enabled. Differential Revision: https://reviews.llvm.org/D35120 llvm-svn: 307907
* Heap allocation for new arrays.Michael Kruse2017-06-281-7/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch aims to implement the option of allocating new arrays created by polly on heap instead of stack. To enable this option, a key named 'allocation' must be written in the imported json file with the value 'heap'. We need such a feature because in a next iteration, we will implement a mechanism of maximal static expansion which will need a way to allocate arrays on heap. Indeed, the expansion is very costly in terms of memory and doing the allocation on stack is not worth considering. The malloc and the free are added respectively at polly.start and polly.exiting such that there is no use-after-free (for instance in case of Scop in a loop) and such that all memory cells allocated with a malloc are free'd when we don't need them anymore. We also add : - In the class ScopArrayInfo, we add a boolean as member called IsOnHeap which represents the fact that the array in allocated on heap or not. - A new branch in the method allocateNewArrays in the ISLNodeBuilder for the case of heap allocation. allocateNewArrays now takes a BBPair containing polly.start and polly.exiting. allocateNewArrays takes this two blocks and add the malloc and free calls respectively to polly.start and polly.exiting. - As IntPtrTy for the malloc call, we use the DataLayout one. To do that, we have modified : - createScopArrayInfo and getOrCreateScopArrayInfo such that it returns a non-const SAI, in order to be able to call setIsOnHeap in the JSONImporter. - executeScopConditionnaly such that it return both start block and end block of the scop, because we need this two blocs to be able to add the malloc and the free calls at the right position. Differential Revision: https://reviews.llvm.org/D33688 llvm-svn: 306540
* Fix a lot of typos. NFC.Michael Kruse2017-06-081-7/+7
| | | | llvm-svn: 304974
* [CodeGen] Support partial write accesses.Michael Kruse2017-05-211-15/+52
| | | | | | | | | | | | | | | | | | | Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then execute the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517
* [Fortran Support] Materialize outermost dimension for Fortran array.Siddharth Bhat2017-05-191-0/+92
| | | | | | | | | | | | | | | | | | | - We use the outermost dimension of arrays since we need this information to generate GPU transfers. - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code. - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays. - This patch uses the Fortran array representation to generate code that computes the outermost dimension size. Differential Revision: https://reviews.llvm.org/D32967 llvm-svn: 303429
* [Polly] Canonicalize arrays according to base-ptr equivalence classTobias Grosser2017-05-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::get*BaseAddr interface. We removed already all references to get*BaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636
* [Polly] Do not introduce address space castHongbin Zheng2017-04-271-1/+2
| | | | | | | | Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally. Differential Revision: https://reviews.llvm.org/D32581 llvm-svn: 301519
OpenPOWER on IntegriCloud