summaryrefslogtreecommitdiffstats
path: root/polly/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [PPCGCodeGeneration] Update PPCG Code Generation for OpenCL compatibilitySiddharth Bhat2017-04-251-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | Added a small change to the way pointer arguments are set in the kernel code generation. The way the pointer is retrieved now, specifically requests global address space to be annotated. This is necessary, if the IR should be run through NVPTX to generate OpenCL compatible PTX. The changes do not affect the PTX Strings generated for the CUDA target (nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl). Additionally, the data layout has been updated to what the NVPTX Backend requests/recommends. Contributed-by: Philipp Schaad Reviewers: Meinersbur, grosser, bollu Reviewed By: grosser, bollu Subscribers: jlebar, pollydev, llvm-commits, nemanjai, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32215 llvm-svn: 301299
* [Polly] [DependenceInfo] change WAR generation, Read will not block ReadSiddharth Bhat2017-04-241-27/+101
| | | | | | | | | | | | | | | | | | | | | Earlier, the call to buildFlow was: WAR = buildFlow(Write, Read, MustWrite, Schedule). This meant that Read could block another Read, since must-sources can block each other. Fixed the call to buildFlow to correctly compute Read. The resulting code needs to do some ISL juggling to get the output we want. Bug report: https://bugs.llvm.org/show_bug.cgi?id=32623 Reviewers: Meinersbur Tags: #polly Differential Revision: https://reviews.llvm.org/D32011 llvm-svn: 301266
* [isl C++ bindings] Add explicit const casts for *foreach* bindingsTobias Grosser2017-04-231-7/+7
| | | | | | | This avoids a compiler warning about lost 'const' attributes. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 301108
* [DeLICM] Use Known information when comparing Existing.Written and ↵Michael Kruse2017-04-201-8/+47
| | | | | | | | | | | | Proposed.Written. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32027 llvm-svn: 300874
* Update isl bindings to latest version (+ Polly extensions)Tobias Grosser2017-04-152-83/+87
| | | | | | | | | After the isl C++ binding generator is now close to being upstreamed to isl, we synchronize the latest changes to Polly. These are mostly formatting changes plus a small interface change for the foreach callback function and some naming changes in isl::boolean. llvm-svn: 300398
* Use isl C++ foreach implementationTobias Grosser2017-04-144-134/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit switches Polly over to the isl::obj::foreach_* implementation, which is part of the new isl bindings and follows the foreach pattern established in Polly by Michael Kruse. The original isl C function: isl_stat isl_union_set_foreach_set(__isl_keep isl_union_set *uset, isl_stat (*fn)(__isl_take isl_set *set, void *user), void *user); which required the user to define a static callback function to which all interesting parameters are passed via a 'void *' user-pointer, is on the C++ side available as a function that takes a std::function<>, which can carry any additional arguments without the need for a user pointer: stat UnionSet::foreach_set(const std::function<stat(set)> &fn) const; The following code illustrates the use of the new C++ interface: auto Lambda = [=, &Result](isl::set Set) -> isl::stat { auto Shifted = shiftDimension(Set, Pos, Amount); Result = Result.add(Shifted); return isl::stat::ok; } UnionSet.foreach_set(Lambda); Polly had some specialized foreach functions which did not require the lambdas to return a status flag. We remove these functions in this commit to move Polly completely over to the new isl interface. We may in the future discuss if functors without return values can be supported easily. Another extension proposed by Michael Kruse is the use of C++ iterators to allow the use of normal for loops to iterate over these sets. Such an extension would allow us to further simplify the code. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D30620 llvm-svn: 300323
* [DeLICM] Export Known and Written to DeLICMTests. NFC.Michael Kruse2017-04-131-9/+16
| | | | | | | This will allow unittesting of new functionality based on Known and Written. llvm-svn: 300211
* [DeLICM] Add Knowledge::Known. NFC.Michael Kruse2017-04-131-2/+29
| | | | | | | This field will later contain a ValInst that is known to be stored in an occupied array element. llvm-svn: 300210
* [DeLICM] Make Knowledge::Written an isl::union_map. NFC.Michael Kruse2017-04-131-14/+21
| | | | | | The map will later point to a ValInst that is written. llvm-svn: 300208
* Exploit BasicBlock::getModule to shorten codeTobias Grosser2017-04-115-9/+5
| | | | | Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914
* SAdjust to recent change in constructor definition of AllocaInstTobias Grosser2017-04-113-18/+23
| | | | llvm-svn: 299913
* Update for alloca construction changesMatt Arsenault2017-04-114-5/+17
| | | | llvm-svn: 299905
* [FIX] Fix ScheduleTreeOptimizer::optimizeMatMulPatternRoman Gareev2017-04-061-1/+1
| | | | | | Use new values of the dimensions during their permutation. llvm-svn: 299663
* Restore the initial ordering of dimensions before applying the pattern matchingRoman Gareev2017-04-061-1/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dimensions of band nodes can be implicitly permuted by the algorithm applied during the schedule generation. For example, in case of the following matrix-matrix multiplication, for (i = 0; i < 1024; i++) for (k = 0; k < 1024; k++) for (j = 0; j < 1024; j++) C[i][j] += A[i][k] * B[k][j]; it can produce the following schedule tree domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and 0 <= i2 <= 1023 }" child: schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] }, { Stmt_for_body6[i0, i1, i2] -> [(i1)] }, { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]" permutable: 1 coincident: [ 1, 1, 0 ] The current implementation of the pattern matching optimizations relies on the initial ordering of dimensions. Otherwise, it can produce the miscompilation (e.g., [1]). This patch helps to restore the initial ordering of dimensions by recreating the band node when the corresponding conditions are satisfied. Refs.: [1] - https://bugs.llvm.org/show_bug.cgi?id=32500 Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D31741 llvm-svn: 299662
* [Polly] [ScheduleOptimizer] Prevent incorrect tile size computationSiddharth Bhat2017-04-061-0/+11
| | | | | | | | | | | | | | | | | Because Polly exposes parameters that directly influence tile size calculations, one can setup situations like divide-by-zero. Check against a possible divide-by-zero in getMacroKernelParams and return early. Also assert at the end of getMacroKernelParams that the block sizes computed for matrices are positive (>= 1). Tags: #polly Differential Revision: https://reviews.llvm.org/D31708 llvm-svn: 299633
* Update to isl-0.18-417-gb9e7334Tobias Grosser2017-04-0613-413/+425
| | | | | | This is a regular maintenance update. llvm-svn: 299617
* Remove llvm.lifetime.start/end in original region.Michael Kruse2017-04-051-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | The current StackColoring algorithm does not correctly handle the situation when some, but not all paths from a BB to the entry node cross a llvm.lifetime.start. According to an interpretation of the language reference at http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic this might be correct, but it would cost too much effort to handle in StackColoring. To be on the safe side, remove all lifetime markers even in the original code version (they have never been copied to the optimized version) to ensure that no path to the entry block will cross a llvm.lifetime.start. The same principle applies to paths the a function return and the llvm.lifetime.end marker, so we remove them as well. This fixes llvm.org/PR32251. Also see the discussion at http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html llvm-svn: 299585
* [Polly] [DependenceInfo] change WAR, WAW generation to correct semanticsSiddharth Bhat2017-04-042-33/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | = Change of WAR, WAW generation: = - `buildFlow(Sink, MustSource, MaySource, Sink)` treates any flow of the form `sink <- may source <- must source` as a *may* dependence. - we used to call: ```lang=cpp, name=old-flow-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This caused some WAW dependences to be treated as WAR dependences. - Incorrect semantics. - Now, we call WAR and WAW correctly. == Correct WAW: == ```lang=cpp, name=new-waw-call.cpp Flow = buildFlow(Write, MustWrite, MayWrite, Schedule); WAW = isl_union_flow_get_may_dependence(Flow); isl_union_flow_free(Flow); ``` == Correct WAR: == ```lang=cpp, name=new-war-call.cpp Flow = buildFlow(Write, Read, MustaWrite, Schedule); WAR = isl_union_flow_get_must_dependence(Flow); isl_union_flow_free(Flow); ``` - We want the "shortest" WAR possible (exact dependences). - We mark all the *must-writes* as may-source, reads as must-souce. - Then, we ask for *must* dependence. - This removes all the reads that flow through a *must-write* before reaching a sink. - Note that we only block ealier writes with *must-writes*. This is intuitively correct, as we do not want may-writes to block must-writes. - Leaves us with direct (R -> W). - This affects reduction generation since RED is built using WAW and WAR. = New StrictWAW for Reductions: = - We used to call: ```lang=cpp,name=old-waw-war-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This *is* the right model of WAW we need for reductions, just not in general. - Reductions need to track only *strict* WAW, without any interfering reductions. = Explanation: Why the new WAR dependences in tests are correct: = - We no longer set WAR = WAR - WAW - Hence, we will have WAR dependences that were originally removed. - These may look incorrect, but in fact make sense. == Code: == ```lang=llvm, name=new-war-dependence.ll ; void manyreductions(long *A) { ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S0: *A += 42; ; ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S1: *A += 42; ; ``` === WAR dependence: === { S0[1023, 1023] -> S1[0, 0] } - Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences: ```lang=cpp, name=dependence-incorrect, counterexample S0[1023, 1023]: *-- tmp = *A (load0)--* WAR 2 add = tmp + 42 | *-> *A = add (store0) | WAR 1 S1[0, 0]: | tmp = *A (load1) | add = tmp + 42 | A = add (store1)<-* ``` - One may assume that WAR2 *hides* WAR1 (since store0 happens before store1). However, within a statement, Polly has no idea about the ordering of loads and stores. - Hence, according to Polly, the code may have looked like this: ```lang=cpp, name=dependence-correct S0[1023, 1023]: A = add (store0) tmp = A (load0) ---* add = A + 42 | WAR 1 S1[0, 0]: | tmp = A (load1) | add = A + 42 | A = add (store1) <-* ``` - So, Polly generates (correct) WAR dependences. It does not make sense to remove these dependences, since they are correct with respect to Polly's model. Reviewers: grosser, Meinersbur tags: #polly Differential revision: https://reviews.llvm.org/D31386 llvm-svn: 299429
* Fix formatting in LoopGeneratorsPhilip Pfaffe2017-04-041-2/+2
| | | | llvm-svn: 299424
* [Polly][NewPM] Pull references to the legacy PM interface from utilities and ↵Philip Pfaffe2017-04-045-15/+13
| | | | | | | | | | | | | | | | | | | | | helpers Summary: A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly. This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: nemanjai, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31653 llvm-svn: 299423
* [PerfMonitor] Use Intrinsics::getDeclarationTobias Grosser2017-04-031-11/+2
| | | | | | | | Instead of creating the declaration ourselves, we obtain it directly from the LLVM intrinsic definitions. This addresses a post-review comment for r299359. Suggested-by: Hongzing Zheng <etherzhhb@gmail.com> llvm-svn: 299360
* [CodeGen] Add Performance MonitorTobias Grosser2017-04-033-0/+254
| | | | | | | | | | | | | | | | | | | | | | | | | | Add support for -polly-codegen-perf-monitoring. When performance monitoring is enabled, we emit performance monitoring code during code generation that prints after program exit statistics about the total number of cycles executed as well as the number of cycles spent in scops. This gives an estimate on how useful polyhedral optimizations might be for a given program. Example output: Polly runtime information ------------------------- Total: 783110081637 Scops: 663718949365 In the future, we might also add functionality to measure how much time is spent in optimized scops and how many cycles are spent in the fallback code. Reviewers: bollu,sebpop Tags: #polly Differential Revision: https://reviews.llvm.org/D31599 llvm-svn: 299359
* [ScopInfo] Fix typos in option description.Michael Kruse2017-04-031-2/+2
| | | | llvm-svn: 299356
* [PollyIRBuilder] Bound size of alias metadataTobias Grosser2017-04-031-0/+8
| | | | | | | | | | | | No-alias metadata grows quadratic in the size of arrays involved, which can become very costly for large programs. This commit bounds the number of arrays for which we construct no-alias information to ten. This is conservatively correct, as we just provide less information to LLVM and speeds up the compile time of one of my internal test cases from 'does-not-terminate' to 'finishes-in-less-than-a-minute'. In the future we might try to be more clever here, but this change should provide a good baseline. llvm-svn: 299352
* Update to isl-0.18-410-gc253447Tobias Grosser2017-04-038-69/+103
| | | | | | | This is a regular maintenance update to ensure latest isl changes are tested in our buildbots. llvm-svn: 299350
* revert test commit r299024Huihui Zhang2017-03-291-1/+0
| | | | llvm-svn: 299026
* test commit, add blank lineHuihui Zhang2017-03-291-0/+1
| | | | llvm-svn: 299024
* [ScopInfo] Introduce ScopStmt::contains(BB*). NFC.Michael Kruse2017-03-231-4/+2
| | | | | | | | | Provide an common way for testing if a statement contains something for region and block statements. First user is RegionGenerator::addOperandToPHI. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298617
* Update to isl-0.18-402-ga30c537Tobias Grosser2017-03-239-17/+41
| | | | | | This is a regular maintenance update. llvm-svn: 298595
* [DeLICM] Add const qualifiers. NFC.Michael Kruse2017-03-221-2/+2
| | | | llvm-svn: 298546
* [Support] Add functions to ISLTools.Michael Kruse2017-03-221-0/+139
| | | | | | | | | | | Add shiftDim and convertZoneToTimepoints overloads for isl maps. Add distributeDomain, liftDomains and applyDomainRange functions. These are going to be used in https://reviews.llvm.org/D31247 (Add known array contents to Knowledge) llvm-svn: 298543
* [DeLICM] Remove overloaded Knowledge constructor. NFC.Michael Kruse2017-03-221-6/+0
| | | | | | | | The isl C++ bindings now has implicit conversions from isl::set to isl::union_set. Therefore the additional overload accepting isl::set is not required anymore. llvm-svn: 298529
* [DeLICM] Remove AllElements. NFC.Michael Kruse2017-03-221-14/+0
| | | | | | It is not used and will not be used (anymore) in future commits. llvm-svn: 298522
* Introduce another level of metadata to distinguish non-aliasing accessesRoman Gareev2017-03-223-5/+76
| | | | | | | | | | | | | | Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30606 llvm-svn: 298510
* Map the new load to the base pointer of the invariant load hoisted loadRoman Gareev2017-03-221-0/+3
| | | | | | | | | | | Map the new load to the base pointer of the invariant load hoisted load to be able to find the alias information for it. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30605 llvm-svn: 298507
* [DependenceInfo] change name Write to MustWrite to remove ambiguity [NFC]Siddharth Bhat2017-03-211-13/+15
| | | | | | | | | | | | | | | "Write" is an overloaded term. In collectInfo() till buildFlow(), it is used to mean "must writes". However, within the memory based analysis, it is used to mean "both may and must writes". Renaming the Write variable helps clarify this difference. Reviewers: grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D31181 llvm-svn: 298361
* Update isl to isl-0.18-395-g77701b3Tobias Grosser2017-03-2114-17/+81
| | | | | | This is a normal maintenance update. llvm-svn: 298352
* [CodeGen] Remove need for all parameters to be in scop context for load ↵Tobias Grosser2017-03-181-6/+14
| | | | | | | | | | | hoisting. When not adding constraints on parameters using -polly-ignore-parameter-bounds, the context may not necessarily list all parameter dimensions. To support code generation in this situation, we now always iterate over the actual parameter list, rather than relying on the context to list all parameter dimensions. llvm-svn: 298197
* [IslExprBuilder] Print accessed memory locations with RuntimeDebugBuilderTobias Grosser2017-03-182-4/+17
| | | | | | | | | | | | | | | | | | | After this change, enabling -polly-codegen-add-debug-printing in combination with -polly-codegen-generate-expressions allows us to instrument the compiled binaries to not only print the values stored and loaded to a given memory access, but also to print the accessed location with array name and per-dimension offset: MemRef_A[3][2] Store to 6299784: 5.000000 MemRef_A[3][3] Load from 6299788: 0.000000 MemRef_A[3][3] Store to 6299788: 6.000000 This can be very helpful for debugging. llvm-svn: 298194
* [OpenMP] Do not emit lifetime markers for contextTobias Grosser2017-03-181-9/+0
| | | | | | | | | | | | | | | | | | In commit r219005 lifetime markers have been introduced to mark the lifetime of the OpenMP context data structure. However, their use seems incorrect and recently caused a miscompile in ASC_Sequoia/CrystalMk after r298053 which was not at all related to r298053. r298053 only caused a change in the loop order, as this change resulted in a different isl internal representation which caused the scheduler to derive a different schedule. This change then caused the IR to change, which apparently created a pattern in which LLVM exploites the lifetime markers. It seems we are using the OpenMP context outside of the lifetime markers. Even though CrystalMk could probably be fixed by expanding the scope of the lifetime markers, it is not clear what happens in case the OpenMP function call is in a loop which will cause a sequence of starting and ending lifetimes. As it is unlikely that the lifetime markers give any performance benefit, we just drop them to remove complexity. llvm-svn: 298192
* [ScheduleOptimiser] fix typos in top comment [NFC]Siddharth Bhat2017-03-171-2/+2
| | | | | | | coice -> choice Transations -> Transactions llvm-svn: 298095
* Revert "Remove references to AssumptionCache. NFC."Michael Kruse2017-03-172-73/+74
| | | | | | | | | | The AssumptionCache removal of r289756 has been reverted in r290086/r290087. A different solution has been implemented in r291671 which keeps the AssumptionCache. We can therefore use it again in Polly. This reverts r289791. llvm-svn: 298089
* [DependenceInfo] Remove idempotent union: must-writes with may-writes [NFC]Siddharth Bhat2017-03-171-2/+1
| | | | | | | Since may-writes are always a superset of the must-writes, there is no point in taking a union of one with the other. llvm-svn: 298085
* [ScopInfo/PruneUnprofitable] Move default profitability check.Michael Kruse2017-03-172-2/+2
| | | | | | | | | | | | | | | | | | | | In the previous default ScopInfo applied the profitability heuristic for scalar accesses (-polly-unprofitable-scalar-accs=true) and the -polly-prune-unprofitable was disabled by default (-polly-enable-prune-unprofitable=false) as that pruning was already done. This changes switches the defaults to -polly-unprofitable-scalar-accs=true -polly-enable-prune-unprofitable=false such that the scalar access heuristic check is done by the pass. This allows passes between ScopInfo and PruneUnprofitable to optimize away scalar accesses. Without enabling such intermediate passes, there is no change in behaviour of profitability checks in a PassManagerBuilder built pass chain, but it allows us to cover this configuration with the buildbots. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298081
* [PruneUnprofitable] Add -polly-prune-unprofitable pass.Michael Kruse2017-03-174-5/+86
| | | | | | | | | | | | | | | | | | | | | | ScopInfo's normal profitability heuristic considers SCoPs where all statements have scalar writes as not profitably optimizable and invalidate the SCoP in that case. However, -polly-delicm and -polly-simplify may be able to remove some of the scalar writes such that the flag -polly-unprofitable-scalar-accs=false allows disabling that part of the heuristic. In cases where DeLICM (or other passes after ScopInfo) are not successful in removing scalar writes, the SCoP is still not profitably optimizable. The schedule optimizer would again try computing another schedule, resulting in slower compilation. The -polly-prune-unprofitable pass applies the profitability heuristic again before the schedule optimizer Polly can still bail out even with -polly-unprofitable-scalar-accs=false. Differential Revision: https://reviews.llvm.org/D31033 llvm-svn: 298080
* [ScopInfo] Add option to not add parameter bounds to context [NFC]Tobias Grosser2017-03-171-0/+9
| | | | | | | | For experiments it is sometimes helpful to provide parameter bound information to polly and to not use these parameter bounds for simplification. Add a new option "-polly-ignore-parameter-bounds" which does precisely this. llvm-svn: 298077
* [DependenceInfo] Replace use of deprecated isl_dim_n_out [NFC]Siddharth Bhat2017-03-171-1/+2
| | | | | | Change isl_dim_n_out to isl_map_dim(*, isl_dim_out) llvm-svn: 298075
* [DependenceInfo] Track may-writes and build flow information inSiddharth Bhat2017-03-171-0/+3
| | | | | | | | | | | | Dependences::calculateDependences. This ensures that we handle may-writes correctly when building dependence information. Also add a test case checking correctness of may-write information. Not handling it before was an oversight. Differential Revision: https://reviews.llvm.org/D31075 llvm-svn: 298074
* [ScopInfo] Do not take inbounds assumptions [NFC]Tobias Grosser2017-03-171-0/+7
| | | | | | | For experiments it is sometimes helpful to not take any inbounds assumptions. Add a new option "-polly-ignore-inbounds" which does precisely this. llvm-svn: 298073
* [ScopInfo] Do not try to eliminate parameter dimensions that do not existTobias Grosser2017-03-171-1/+2
| | | | | | | | | | In subsequent changes we will make Polly a little bit more lazy in adding parameter dimensions to different sets. As a result, not all parameters will always be part of the parameter space. This change ensures that we do not use the '-1' returned when a parameter dimension cannot be found, but instead just do not try to eliminate the anyhow non-existing dimension. llvm-svn: 298054
OpenPOWER on IntegriCloud