path: root/polly/lib/CodeGen
Commit message (Author, Age, Files, Lines)
...
* Adjust to clang-format changes (Tobias Grosser, 2018-03-20, 3 files, -4/+0)
| | | | llvm-svn: 328005
* Revert "[Acc] Fix for PR33208" (Philip Pfaffe, 2018-03-03, 1 file, -5/+9)
| | | | | | This reverts commit r326643. Fix didn't really fix anything. llvm-svn: 326656
* [Acc] Fix for PR33208 (Philip Pfaffe, 2018-03-03, 1 file, -9/+5)
| | | | | | | | | | | | | During codegen, Polly attempts to clear all loops from ScalarEvolution and LoopInfo, and it does so one block at a time. This causes undefined behaviour, since this way a loop header might be removed from a loop before the entire loop is erased, causing ScalarEvolution to run into an error. Instead, just delete the entire loop atomically. This fixes currently failing testcases. llvm-svn: 326643
* Use isl::manage_copy to simplify calls to isl::manage(isl_.._copy()) (Tobias Grosser, 2018-02-20, 4 files, -27/+18)
| | | | | | | | | As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) patterns are eliminated as well. We checked for all potential cleanups by scanning for: "grep -R isl::manage\( lib/ | grep copy" llvm-svn: 325558
* [CodeGen] Fix noalias annotations for memcpy/memmove. (Michael Kruse, 2017-12-22, 1 file, -0/+6)
| | | | | | | | | | | | | | | | Memory transfer instructions take two pointers. It is not defined to which of those a noalias annotation applies. To ensure correctness, do not add noalias annotations to memcpy/memmove instructions anymore. This caused a miscompile with test-suite's MultiSource/Applications/obsequi. Since r321138, the MemCpyOpt pass would remove memcpy/memmove calls if known to copy uninitialized memory. In that case, it was initialized by another memcpy, but the annotation for the target pointer said it would not alias. The annotation was actually meant for the source pointer, which was an alloca and could not alias with the target pointer. llvm-svn: 321371
* [CodeGen] Detect empty domain because of parameters context. (Michael Kruse, 2017-11-21, 1 file, -0/+2)
| | | | | | | | | | | | | | | | | | | Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an empty domain (i.e. has no pieces). We already detected the case if the isl_pw_aff comes with an empty domain. isl_ast_build also considers the domain empty if it is disjoint with the parameter context (e.g. parameter values that we exclude by runtime versioning). Intersect the access relation domain with the parameter context to also detect such practically empty access domains. The effective pointer used in the generated code is unimportant because it will never be executed. This fixes llvm.org/PR35362 llvm-svn: 318806
* Run polly-update-format. NFC. (Michael Kruse, 2017-11-21, 4 files, -6/+4)
| | | | | | | polly-check-format has been failing since at least r318517, due to more than one cause. llvm-svn: 318795
* Port ScopInfo to the isl cpp bindings (Philip Pfaffe, 2017-11-19, 3 files, -40/+40)
| | | | | | | | | | | | | | | | | | | | | Summary: Most changes are mechanical, but in one place I changed the program semantics by fixing a likely bug: In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitly handling the error-case. Before, when the call to `addNonEmptyDomainConstraints()` returned a null set, this (probably) accidentally worked because isl_bool_error converts to true. I'm checking for nullptr now. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: nemanjai, kbarton, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D39971 llvm-svn: 318632
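The pitfall described in this message can be illustrated with a minimal, self-contained C++ sketch. It assumes a three-valued boolean whose error value is -1, as in isl's `isl_bool`. The names `tri_bool`, `isFeasibleImplicit` and `isFeasibleExplicit` are hypothetical and do not appear in Polly or isl. The commit message does not say what the fixed code returns on error, so returning false here is an assumption.

```cpp
#include <cassert>

// Hypothetical mirror of a three-valued boolean; the error value is -1,
// so it is non-zero and converts to `true` in a boolean context.
enum tri_bool { tri_error = -1, tri_false = 0, tri_true = 1 };

// Implicit pattern: any non-zero value, including the error value,
// is treated as true.
inline bool isFeasibleImplicit(tri_bool NonEmpty) {
  return NonEmpty; // tri_error (-1) converts to true
}

// Explicit pattern: handle the error state first (assumption: an error
// is reported as "not feasible"), then test for the true value.
inline bool isFeasibleExplicit(tri_bool NonEmpty) {
  if (NonEmpty == tri_error)
    return false;
  return NonEmpty == tri_true;
}
```

With the implicit conversion an error is indistinguishable from success, which is why the explicit check is the safer form.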
* [polly] Remove redundant return [NFC] (Mandeep Singh Grang, 2017-11-10, 1 file, -1/+0)
| | | | | | | | | | | | | | Reviewers: grosser, bollu Reviewed By: grosser Subscribers: nemanjai, kbarton, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D39916 llvm-svn: 317922
* [OpenMP] Fix reference collection of latest base ptrs. (Michael Kruse, 2017-10-31, 1 file, -1/+1)
| | | | | | | | When collecting base pointers that need to be made available in parallel subfunctions, use the base pointer associated with the latest ScopArrayInfo, instead of the original one. llvm-svn: 316983
* [Acc] Do not statically dispatch into IslNodeBuilder's createFor (Philip Pfaffe, 2017-10-29, 2 files, -8/+13)
| | | | | | | | | | | | | | | | | | | | Summary: When GPUNodeBuilder creates loops inside the kernel, it dispatches to IslNodeBuilder. This however is surprisingly dangerous, since it accesses the AST Node's user through the wrong type. This patch fixes this problem by overriding createFor correctly. This fixes PR35010. Reviewers: grosser, bollu, Meinersbur Reviewed By: Meinersbur Subscribers: Meinersbur, nemanjai, pollydev, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D39364 llvm-svn: 316872
* [GPGPU] Make sure escaping invariant load hoisted scalars are preserved (Tobias Grosser, 2017-10-04, 1 file, -1/+3)
| | | | | | | | | | | | | | | We make sure that the final reload of an invariant scalar memory access uses the same stack slot into which the invariant memory access was stored originally. Earlier, this was broken as we introduce a new stack slot aside of the preload stack slot, which remained uninitialized and caused our escaping loads to contain garbage. This happened due to us clearing the pre-populated values in EscapeMap after kernel code generation. We address this issue by preserving the original host values and restoring them after kernel code generation. EscapeMap is not expected to be used during kernel code generation, hence we clear it during kernel generation to make sure that any unintended uses are noticed. llvm-svn: 314894
* [GPGPU] Set Polly's RTC to false in case invariant load hoisting fails (Tobias Grosser, 2017-10-01, 1 file, -0/+6)
| | | | | | | | | This matches the behavior we already have in lib/CodeGen/CodeGeneration.cpp and makes sure that we fall back to the original code. It seems when invariant load hoisting was introduced to the GPGPU backend we forgot to reset the RTC flag, such that kernels where invariant load hoisting failed executed the 'optimized' SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this results in hard to debug issues that are a lot of fun to debug. llvm-svn: 314624
* Fix the build after r314375 (Philip Pfaffe, 2017-09-28, 1 file, -1/+1)
| | | | | | r314375 privatized Loop's constructor and replaced it with an Allocator. llvm-svn: 314412
* [IslExprBuilder] Do not generate RTC with more than 64 bit (Tobias Grosser, 2017-09-23, 2 files, -0/+37)
| | | | | | | | | | Such RTCs may introduce integer wrapping intrinsics with more than 64 bit, which are translated to library calls on AOSP that are not part of the runtime and will consequently cause linker errors. Thanks to Eli Friedman for reporting this issue and reducing the test case. llvm-svn: 314065
* Check whether IslAstInfo and DependenceInfo were computed for the same Scop. (Michael Kruse, 2017-09-21, 2 files, -2/+27)
| | | | | | | | | | | | | | | | | | | | | | | | | | Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo, we might get those analyses that were computed by a different ScopInfo for a different Scop structure. This would be unfortunate because DependenceInfo and IslAstInfo hold references to resources allocated by ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable things can happen. When the ScopInfo/Scop object is freed, there is a high probability that the new ScopInfo/Scop object will be created at the same heap position with the same address. Comparing whether the Scop or ScopInfo address is the expected one is therefore unreliable. Instead, we compare the address of the isl_ctx object. Both DependenceInfo and IslAstInfo must hold a reference to the isl_ctx object to ensure it is not freed before the destruction of those analyses, which might happen after the destruction of the Scop/ScopInfo they refer to. Hence, the isl_ctx will not be freed and its address not reused as long as there is a DependenceInfo or IslAstInfo around. This fixes llvm.org/PR34441 llvm-svn: 313842
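The reasoning in this message, that an object kept alive by the analysis cannot have its address reused, can be sketched in self-contained C++. This is only an illustration under stated assumptions: `Ctx`, `Scop`, `AnalysisResult`, `isComputedFor` and `staleResultIsDetected` are hypothetical stand-ins, and `std::shared_ptr` stands in for the reference the analyses hold on the isl_ctx. It is not Polly's actual implementation.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the shared context object.
struct Ctx {};

// Hypothetical stand-in for a Scop that owns a reference to its context.
struct Scop {
  std::shared_ptr<Ctx> Context = std::make_shared<Ctx>();
};

// An analysis result that keeps the context alive for its own lifetime.
struct AnalysisResult {
  std::shared_ptr<Ctx> Context;
  explicit AnalysisResult(const Scop &S) : Context(S.Context) {}

  // Compare context addresses instead of Scop addresses. The context held
  // here is still alive, so no newer context can share its address.
  bool isComputedFor(const Scop &S) const {
    return Context.get() == S.Context.get();
  }
};

// Frees a Scop while its analysis result survives, then creates a new Scop.
inline bool staleResultIsDetected() {
  auto Old = std::make_unique<Scop>();
  AnalysisResult Result(*Old);
  Old.reset(); // the Scop is freed, but Result still owns the old Ctx
  Scop New;    // the Scop storage may be reused; the old Ctx cannot be
  return !Result.isComputedFor(New);
}
```

Comparing `Scop` addresses in the same scenario could report a false match whenever the allocator hands out the freed storage again.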
* [CodegenCleanup] Update cleanup passes according to (old) PassManagerBuilder. (Michael Kruse, 2017-09-09, 1 file, -8/+18)
| Update CodegenCleanup using the function-level passes added by populatePassManager that run between EP_EarlyAsPossible and EP_VectorizerStart in -O3.
|
| The changes in particular are:
| - Added pass create arguments, e.g. ExpensiveCombines for InstCombine.
| - Remove reroll pass. The option -reroll-loops is disabled by default.
| - Add passes run with UnitAtATime, which is the default.
| - Add instances of LibCallsShrinkWrap, TailCallElimination, SCCP (sparse conditional constant propagation), Float2Int that did not run before.
| - Add instances of GVN as in the default pipeline.
|
| Notes:
| - GVNHoist, GVNSink, NewGVN are still disabled in the -O3 pipeline.
| - The optimization level and other optimization parameters are not accessible outside of PassManagerBuilder, hence we cannot add passes depending on these.
|
| Differential Revision: https://reviews.llvm.org/D37571
| llvm-svn: 312875
* [CodeGen] Bitcast scalar writes to actual value. (Michael Kruse, 2017-09-07, 1 file, -0/+7)
| | | | | | The type of NewValue might change due to ScalarEvolution looking through bitcasts. The synthesized NewValue therefore becomes the type before the bitcast. llvm-svn: 312718
* [PPCGCodeGen] Document pre-composition with Zero in getExtent. [NFC] (Siddharth Bhat, 2017-09-07, 1 file, -0/+26)
| | | | | | | It's weird at first glance that we do this, so I wrote up some documentation on why we need to perform this process. llvm-svn: 312715
* [CodegenCleanup] Use old GVN pass instead of NewGVN (Tobias Grosser, 2017-09-04, 1 file, -1/+3)
| | | | | | | It seems NewGVN still has some problems: llvm.org/PR34452, we will switch back after they have been resolved. llvm-svn: 312480
* [IslAst] Do not assert in case of empty min/max alias locations (Tobias Grosser, 2017-09-03, 1 file, -15/+45)
| | | | | | | | | | | In certain situations, the context in the isl_ast_build could cause the min/max locations of our alias sets to become empty, which would cause an internal error in isl, which is then unable to derive a value for these expressions. Check these conditions before code generating expressions and instead assume that the alias check succeeded. This is valid, as the corresponding memory accesses will not be executed under any valid context. This fixed llvm.org/PR34432. Thanks to Qirun Zhang for reporting. llvm-svn: 312455
* [IslAst] Move buildCondition to isl++ (Tobias Grosser, 2017-09-03, 1 file, -26/+28)
| | | | llvm-svn: 312452
* [ISLNodeBuilder] Materialize Fortran array sizes of arrays without memory accesses. (Siddharth Bhat, 2017-09-01, 1 file, -31/+22)
| | | | | | | | | | | | | | | | | | | | In Polly, we specifically add a parameter to represent the outermost dimension size of Fortran arrays. We do this because this information is statically available from the Fortran metadata generated by dragonegg. However, we were only materializing these parameters (meaning, creating an llvm::Value to back the isl_id) from *memory accesses*. This is wrong, we should materialize parameters from *scop array info*. It is wrong because if there is a case where we detect 2 Fortran arrays, but only one of them is accessed, we may not materialize the other array's dimensions at all. This is incorrect. We fix this by looping over all `polly::ScopArrayInfo` in a scop, rather than just all `polly::MemoryAccess`. Differential Revision: https://reviews.llvm.org/D37379 llvm-svn: 312350
* Run GVN during the cleanup (Roman Gareev, 2017-09-01, 1 file, -0/+1)
| | | | | | | | | | | | | Currently, GVN can be necessary to eliminate redundant instructions in case of, for instance, GEMM and float type. This patch makes GVN be run during the cleanup. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D37340 llvm-svn: 312307
* [PPCGCodeGen] Convert intrinsics to libdevice functions whenever possible. (Siddharth Bhat, 2017-08-31, 1 file, -7/+41)
| | | | | | | | | | | | | | | | This is useful when we face certain intrinsics such as `llvm.exp.*` which cannot be lowered by the NVPTX backend while other intrinsics can. So, we would need to keep blacklists of intrinsics that cannot be handled by the NVPTX backend. It is much simpler to try and promote all intrinsics to libdevice versions. This patch makes the handling of functions and intrinsics uniform, and will always try to use a libdevice version if it exists. Differential Revision: https://reviews.llvm.org/D37056 llvm-svn: 312239
* [BlockGenerator] Generate entry block of regions from instruction lists (Tobias Grosser, 2017-08-31, 1 file, -1/+6)
| | | | | | | The patch adds code generation support for the previous commit. It has been re-applied, after the memory issue in the previous patch has been fixed. llvm-svn: 312211
* Revert "[BlockGenerator] Generate entry block of regions from instruction lists" (Tobias Grosser, 2017-08-31, 1 file, -6/+1)
| | | | | | This reverts commit r312129. It caused some memory issues. llvm-svn: 312208
* [BlockGenerator] Generate entry block of regions from instruction lists (Tobias Grosser, 2017-08-30, 1 file, -1/+6)
| | | | The patch adds code generation support for the previous commit. llvm-svn: 312129
* [IslAst] Do not compare arrays in alias check which are known to be identical (Tobias Grosser, 2017-08-28, 1 file, -0/+14)
| | | | | | This possibly helps to avoid run-time check failures in the COSMO kernels. llvm-svn: 311920
* [PM] Properly require and preserve OptimizationRemarkEmitter. NFCI. (Michael Kruse, 2017-08-28, 3 files, -22/+6)
| | | | | | | | | | | | | | | | | | | | Properly require and preserve the OptimizationRemarkEmitter for use in ScopPass. Previously one had to get the ORE from ScopDetection because CodeGeneration did not mark it as preserved. It would need to be recomputed, which causes the legacy PM to throw away all previous SCoP analyses. This also changes the implementation of ScopPass::getAnalysisUsage to not unconditionally preserve all passes, but only those needed to be preserved by any SCoP pass (at least when using the legacy PM). This allows invalidating DependenceInfo (and IslAstInfo) in case the pass would cause them to change (e.g. OpTree, DeLICM, MaximalArrayExpansion). JSONImporter should also invalidate the DependenceInfo. In this patch it marks DependenceInfo as preserved anyway because some regression tests depend on it. Differential Revision: https://reviews.llvm.org/D37010 llvm-svn: 311888
* [Polly] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). (Eugene Zelenko, 2017-08-24, 3 files, -66/+109)
| | | | llvm-svn: 311704
* [CodeGen] Detect impossible partial write conditions more reliably. (Michael Kruse, 2017-08-24, 1 file, -5/+7)
| Whether a partial write is tautological/unsatisfiable not only depends on the access domain, but also on the domain covered by its node in the AST.
|
| In the example below, there are two instances of Stmt_cond_false. It may have a partial write access that is not executed in instance Stmt_cond_false(0).
|
|     for (int c0 = 0; c0 < tmp5; c0 += 1) {
|       Stmt_for_body344(c0);
|       if (tmp5 >= c0 + 2)
|         Stmt_cond_false(c0);
|       Stmt_cond_end(c0);
|     }
|     if (tmp5 <= 0) {
|       Stmt_for_body344(0);
|       Stmt_cond_false(0);
|       Stmt_cond_end(0);
|     }
|
| Isl cannot derive a subscript for an array element that is never accessed. This caused an error in that no subscript expression has been generated in IslNodeBuilder::createNewAccesses, but BlockGenerator expected one to exist because there is an execution of that write, just not in that AST node.
|
| Fixed by inspecting whether isl generated a constant "false" AST expression in the current AST node, instead of determining whether the access domain is empty.
|
| This should fix a compiler crash of the aosp buildbot.
|
| llvm-svn: 311663
* [Polly] [PPCGCodeGeneration] Mild refactoring of checking validity of functions in a kernel. (Siddharth Bhat, 2017-08-24, 1 file, -9/+10)
| | | | | | | | | | This is a stylistic change to make the function a little more readable. Also add a debug print to show what instruction contains a use of a function we don't understand in the kernel. Differential Revision: https://reviews.llvm.org/D37058 llvm-svn: 311648
* Add more statistics. (Michael Kruse, 2017-08-23, 3 files, -0/+85)
| Add statistics about
| - Which optimizations are applied
| - Number of loops in Scops at various stages
| - Number of scalar/singleton writes at various stages representative for scalar false dependencies
| - Number of parallel loops
|
| These will be useful to find regressions due to moving Polly further down of LLVM's pass pipeline.
|
| Differential Revision: https://reviews.llvm.org/D37049
| llvm-svn: 311553
* [PPCGCodeGen] Fix compiler warning: '<': signed/unsigned mismatch. NFC. (Michael Kruse, 2017-08-23, 1 file, -6/+6)
| | | | | | | | | MSVC warns about comparison between a signed and unsigned integer. The rules of C(++) define that an unsigned comparison has to be carried out in this case. This is unlikely to be intended. Fix by assigning the loop's upper bound to a signed integer first. This also avoids repeated evaluation of the invariant upper bound. llvm-svn: 311548
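The conversion rule and the fix described in this message can be sketched in standalone C++. The function names below are hypothetical and the code is not taken from PPCGCodeGeneration; it only illustrates what a mixed `int` versus `std::size_t` comparison does, and how storing the bound in a signed integer avoids it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// What `I < V.size()` does under the usual arithmetic conversions:
// the signed operand is converted to the unsigned type first, so a
// negative value becomes a very large one.
inline bool lessAsUnsigned(int I, const std::vector<int> &V) {
  return static_cast<std::size_t>(I) < V.size();
}

// The approach from the commit message: assign the upper bound to a
// signed integer first, then compare signed against signed.
inline bool lessAsSigned(int I, const std::vector<int> &V) {
  const int Size = static_cast<int>(V.size());
  return I < Size;
}

// A loop whose signed upper bound is evaluated once, which also avoids
// re-evaluating the invariant bound on every iteration.
inline int countIterations(const std::vector<int> &V) {
  int Count = 0;
  for (int I = 0, E = static_cast<int>(V.size()); I < E; ++I)
    ++Count;
  return Count;
}
```

For a vector of three elements, `lessAsUnsigned(-1, V)` is false while `lessAsSigned(-1, V)` is true, which is the mismatch the warning points at.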
* [IRBuilder] Only emit alias scope metadata for arrays, but not scalars (Tobias Grosser, 2017-08-22, 1 file, -3/+11)
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: There is no need to emit alias metadata for scalars, as basicaa will easily distinguish them from arrays. This reduces the size of the metadata we generate. This is especially useful after we moved to -polly-position=before-vectorizer, where a lot more scalar dependences are introduced, which increased the size of the alias analysis metadata and made us commonly reach the limits after which we do not emit alias metadata that have been introduced to prevent quadratic growth of this alias metadata. This improves 2mm performance from 1.5 seconds to 0.17 seconds. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37028 llvm-svn: 311498
* [NFC] Fix the broken comment. (Roman Gareev, 2017-08-22, 1 file, -1/+1)
| | | | llvm-svn: 311477
* Disable the Loop Vectorizer in case of GEMM (Roman Gareev, 2017-08-22, 3 files, -12/+49)
| | | | | | | | | | | | | | Currently, in case of GEMM and the pattern matching based optimizations, we use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop Vectorizer can get in the way of optimal code generation, we disable the Loop Vectorizer for the innermost loop using mark nodes and emitting the corresponding metadata. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36928 llvm-svn: 311473
* [ManagedMemoryRewrite] Use `uint64_t` to store size, not `int`. (Siddharth Bhat, 2017-08-22, 1 file, -1/+2)
| | | | llvm-svn: 311440
* [ManagedMemoryRewrite] Get size in bytes rather than in bits and dividing by 8. (Siddharth Bhat, 2017-08-22, 1 file, -1/+1)
| | | | llvm-svn: 311439
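Neither commit message gives a reason for the change. One plausible motivation, stated here as an assumption, is that dividing a bit width by 8 truncates for widths that are not a multiple of 8. The following standalone C++ sketch shows that arithmetic only. The helper names are hypothetical and no LLVM API is used.

```cpp
#include <cassert>
#include <cstdint>

// Truncating division: under-counts any width that is not a multiple
// of 8 (a 1-bit value would be reported as 0 bytes).
inline std::uint64_t bytesByTruncation(std::uint64_t Bits) {
  return Bits / 8;
}

// Rounds up to whole bytes, so every non-zero width needs at least one.
inline std::uint64_t bytesRoundedUp(std::uint64_t Bits) {
  return (Bits + 7) / 8;
}
```

Both helpers agree for widths such as 32 or 64 bits and differ only for widths like 1 or 65 bits. Using a 64-bit unsigned result also avoids overflow for sizes that do not fit in an `int`.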
* [ManagedMemoryRewrite] slightly tweak debug output style. [NFC] (Siddharth Bhat, 2017-08-21, 1 file, -10/+10)
| | | | llvm-svn: 311361
* [ManagedMemoryRewrite] Print reasons for skipping global array to dbgs(). [NFC] (Siddharth Bhat, 2017-08-21, 1 file, -2/+12)
| | | | llvm-svn: 311360
* [ManagedMemoryRewrite] hide debug output behind DEBUG(...). [NFC] (Siddharth Bhat, 2017-08-21, 1 file, -1/+1)
| | | | llvm-svn: 311331
* [PPCGCodeGeneration] Enable `polly-codegen-perf-monitoring` for PPCGCodegen. (Siddharth Bhat, 2017-08-21, 2 files, -4/+25)
| | | | | | | | | | This feature was not enabled for `PPCGCodeGeneration`. Now that this is enabled, we can benchmark Scops that have been optimised with `-polly-codegen-ppcg` with the `-polly-codegen-perf-monitoring` option. Differential Revision: https://reviews.llvm.org/D36934 llvm-svn: 311328
* [GPGPU] Add llvm.powi to the libdevice supported functions (Tobias Grosser, 2017-08-21, 1 file, -1/+1)
| | | | | | These intrinsics are used in COSMO. llvm-svn: 311324
* [GPGPU] Add log / logf to the libdevice supported functions (Tobias Grosser, 2017-08-21, 1 file, -2/+2)
| | | | | | These two functions are used in COSMO llvm-svn: 311322
* Revert "[GPGPU] Simplify PPCGSCop to reduce compile time [NFC]" (Tobias Grosser, 2017-08-19, 1 file, -79/+3)
| | | | | | | | | We still see some issues with parameter space mismatches. Revert this to get a clean baseline. We will recommit after these issues have been resolved. This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49. llvm-svn: 311268
* [ManagedMemoryRewrite] Make pass more robust and fix memory issue (Tobias Grosser, 2017-08-19, 1 file, -2/+4)
| | | | | | Instead of using Twines and temporary expressions, we do string manipulation through a std::string. This resolves a memory corruption issue, which likely was caused by Twines losing their underlying string too soon. llvm-svn: 311264
* [ManagedMemoryRewrite] Iterate over operands of the expanded instruction, not the constantexpr itself. (Siddharth Bhat, 2017-08-19, 1 file, -6/+11)
| | | | | | | | | | | - We should iterate over `I`, which is `Cur` expanded out to an instruction, and not `Cur` itself. - This is a bugfix. Differential Revision: https://reviews.llvm.org/D36923 llvm-svn: 311261
* [GPGPU] Correctly initialize array order and fixed_element information (Tobias Grosser, 2017-08-19, 1 file, -7/+7)
| | | | | | | | | | | | | | | | | | | | Summary: This information is necessary for PPCG to perform correct live-range reordering. With these changes applied we can live-range reorder some of the important kernels in COSMO. We also update and rename one test case, which previously could not be optimized and now is optimized thanks to live-range reordering. To preserve test coverage we add a new test case scalar-writes-in-scop-requires-abort.ll, which exercises our automatic abort in case of scalar writes in the kernel. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36929 llvm-svn: 311259