summaryrefslogtreecommitdiffstats
path: root/polly/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [DeLICM] Fix wrong comment. NFC.Michael Kruse2017-02-221-1/+1
| | | | | | | Correct a comment that claimed that a store after load was detected when the code checks a load after a store. llvm-svn: 295835
* [DeLICM] Print message when zone analysis is not available on -analysis.Michael Kruse2017-02-221-0/+5
| | | | | | | This is to distinguish the cases that analysis has failed from the case where not transformation was performed. llvm-svn: 295833
* [DeLICM] Use opt<int>.Michael Kruse2017-02-221-1/+1
| | | | | | | | There is no template specialization for cl::parser<unsigned long> such that parsing an cl::opt<unsigned long> command line argument will fail. Use opt<int> instead which has an associated parser. llvm-svn: 295832
* [DependenceInfo] Simplify creation and subsequent use of AccessSchedule [NFC]Tobias Grosser2017-02-211-31/+22
| | | | | | | | | | | | | | We only ever use the wrapped domain of AccessSchedule, so stop creating an entire union_map and then pulling the domain out. Reviewers: grosser Tags: #polly Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30179 llvm-svn: 295726
* [DeLICM] Map values hoisted by LICM back to the array.Michael Kruse2017-02-211-5/+1249
| | | | | | | | | | | | | | | | | | | | Implement the -polly-delicm pass. The pass intends to undo the effects of LoopInvariantCodeMotion (LICM) which adds additional scalar dependencies into SCoPs. DeLICM will try to map those scalars back to the array elements they were promoted from, as long as the array element is unused. The is the main patch from the DeLICM/DePRE patch series. It does not yet undo GVN PRE for which additional information about known values is needed and does not handle PHI write accesses that have have no target. As such its usefulness is limited. Patches for these issues including regression tests for error situatons will follow. Reviewers: grosser Differential Revision: https://reviews.llvm.org/D24716 llvm-svn: 295713
* [Cmake] Install the isl headers into the install tree.Michael Kruse2017-02-201-0/+14
| | | | | | | | | | | | | | | | | | | | isl headers are currently missing in a Polly installation. Because the Polly headers depend on those, code can't be compiled against an installed Polly. This patch installs the isl headers. I left a TODO, as optionally it should be possible to use a system version of isl instead of the one shipped with Polly. When compiling, clients of the installation need to add -I${PREFIX}/include/polly/ to there include path right now, because there currently is no way to export this path automatically. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D29931 llvm-svn: 295671
* [ScopInfo] Count read-only arrays when computing complexity of alias checkTobias Grosser2017-02-181-1/+3
| | | | | | | | | | | Instead of counting the number of read-only accesses, we now count the number of distinct read-only array references when checking if a run-time alias check may be too complex. The run-time alias check is quadratic in the number of base pointers, not the number of accesses. Before this change we accidentally skipped SPEC's lbm test case. llvm-svn: 295567
* [DependenceInfo] Pull out statement [NFC]Tobias Grosser2017-02-181-3/+2
| | | | | | This simplifies the code slightly. llvm-svn: 295551
* [Dependences] Compute reduction dependences on schedule tree [NFC]Tobias Grosser2017-02-181-39/+27
| | | | | | | | | | | | | | | | | | | This change gets rid of the need for zero padding, makes the reduction computation code more similar to the normal dependence computation, and also better documents what we do at the moment. Making the dependence computation for reductions a little bit easier to understand will hopefully help us to further reduce code duplication. This reduces the time spent only in the reduction dependence pass from 260ms to 150ms for test/DependenceInfo/reduction_sequence.ll. This is a reduction of over 40% in dependence computation time. This change was inspired by discussions with Michael Kruse, Utpal Bora, Siddharth Bhat, and Johannes Doerfert. It can hopefully lay the base for further cleanups of the reduction code. llvm-svn: 295550
* Drop leftover debug statementTobias Grosser2017-02-171-1/+0
| | | | llvm-svn: 295444
* [ScopInfo] Add statistics to count loops after scop modelingTobias Grosser2017-02-172-9/+47
| | | | llvm-svn: 295431
* [ScopDetection] Compute the maximal loop depth correctlyTobias Grosser2017-02-171-1/+1
| | | | | | | Before this change, we obtained loop depth numbers that were deeper then the actual loop depth. llvm-svn: 295430
* Updated isl to isl-0.18-254-g6bc184dTobias Grosser2017-02-1790-1726/+2481
| | | | | | | This update includes a couple more coalescing changes as well as a large number of isl-internal code cleanups (dead assigments, ...). llvm-svn: 295419
* [ScopInfo] Do not try to fold array dimensions of size zeroTobias Grosser2017-02-171-0/+4
| | | | | | | | | | Trying to fold such kind of dimensions will result in a division by zero, which crashes the compiler. As such arrays are likely to invalidate the scop anyhow (but are not illegal in LLVM-IR), there is no point in trying to optimize the array layout. Hence, we just avoid the folding of constant dimensions of size zero. llvm-svn: 295415
* [ScopInfo] Rename MaxDisjunctions -> MaxDisjuncts [NFC]Tobias Grosser2017-02-161-8/+8
| | | | | | | There is only a single disjunction. However, we bound the number of 'disjuncts' in this disjunction. Name the variable accordingly. llvm-svn: 295362
* [ScopInfo] Bound the number of disjuncts in contextTobias Grosser2017-02-161-0/+8
| | | | | | | | | Before this change wrapping range metadata resulted in exponential growth of the context, which made context construction of large scops very slow. Instead, we now just do not model the range information precisely, in case the number of disjuncts in the context has already reached a certain limit. llvm-svn: 295360
* [ScopInfo] Use uppercase variable name [NFC]Tobias Grosser2017-02-161-5/+5
| | | | llvm-svn: 295350
* [ScopInfo] Always derive upper and lower bounds for parametersTobias Grosser2017-02-161-13/+22
| | | | | | | | | | | | | | | | | | | | Commit r230230 introduced the use of range metadata to derive bounds for parameters, instead of just looking at the type of the parameter. As part of this commit support for wrapping ranges was added, where the lower bound of a parameter is larger than the upper bound: { 255 < p || p < 0 } However, at the same time, for wrapping ranges support for adding bounds given by the size of the containing type has acidentally been dropped. As a result, the range of the parameters was not guaranteed to be bounded any more. This change makes sure we always add the bounds given by the size of the type and then additionally add bounds based on signed wrapping, if available. For a parameter p with a type size of 32 bit, the valid range is then: { -2147483648 <= p <= 2147483647 and (255 < p or p < 0) } llvm-svn: 295349
* [FIX] Fix the typo in ScheduleOptimizer.cpp.Roman Gareev2017-02-161-1/+2
| | | | llvm-svn: 295292
* [DeLICM] Add Knowledge class. NFC.Michael Kruse2017-02-151-0/+363
| | | | | | | | | | | | | | | The Knowledge class remembers the state of data at any timepoint of a SCoP's execution. Currently, it tracks whether an array element is unused or is occupied by some value, and the writes to it. A future addition will be to also remember which value it contains. Objects are used to determine whether two Knowledge contain conflicting information, i.e. two states cannot be true a the same time. This commit was extracted from the DeLICM algorithm at https://reviews.llvm.org/D24716. llvm-svn: 295197
* [ScopDetectDiagnostics] Do not format unnamed array namesTobias Grosser2017-02-121-1/+1
| | | | | | | | | | | | | Formatting unnamed array names is expensive in LLVM as the this requires deriving the numbered virtual instruction name (e.g., %12) for an llvm::Value, which is currently not implemented efficiently. As instruction numberes anyhow do not really carry a lot of information for the user, we just print 'unknown' instead. This change reduces the scop detection time from 24 to 19 seconds, for one of our large-scale inputs. This is a reduction by 21%. llvm-svn: 294894
* [ScopDetection] Add statistics to count the maximal number of scops in loopTobias Grosser2017-02-121-0/+7
| | | | llvm-svn: 294893
* Do not use wrapping ranges to bound non-affine accessesTobias Grosser2017-02-121-0/+6
| | | | | | | | | | | | | When deriving the range of valid values of a scalar evolution expression might be a range [12, 8), where the upper bound is smaller than the lower bound and where the range is expected to possibly wrap around. We theoretically could model such a range as a union of two non-wrapping ranges, but do not do this as of yet. Instead, we just do not derive any bounds. Before this change, we could have obtained bounds where the maximal possible value is strictly smaller than the minimal possible value, which is incorrect and also caused assertions during scop modeling. llvm-svn: 294891
* Check reduction dependencies in case of the matrix multiplication optimizationRoman Gareev2017-02-111-3/+6
| | | | | | | | | | | | | | To determine parameters of the matrix multiplication, we check RAW dependencies that can be expressed using only reduction dependencies. Consequently, we should check the reduction dependencies, if this is the case. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Sven Verdoolaege <skimo-polly@kotnet.org> Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29814 llvm-svn: 294836
* [FIX] Fix the potential issue of containsOnlyMatMulDep.Roman Gareev2017-02-111-0/+2
| | | | llvm-svn: 294835
* [NFC] Fix the style issue of lib/Transform/ScheduleOptimizer.cpp.Roman Gareev2017-02-111-2/+1
| | | | llvm-svn: 294834
* [NFC] Fix style issues of lib/Transform/ScheduleOptimizer.cpp.Roman Gareev2017-02-111-9/+6
| | | | llvm-svn: 294831
* Use the size of the widest type of the matrix multiplication operandsRoman Gareev2017-02-111-8/+48
| | | | | | | | | | | | | The size of the operands type is the one of the parameters required to determine the BLIS micro-kernel. We get the size of the widest type of the matrix multiplication operands in case there are several different types. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29269 llvm-svn: 294828
* [ScopInfo] Use original base address when building ScopArrayInfo [NFC]Tobias Grosser2017-02-101-2/+2
| | | | | | | | | This change clarfies that we want to indeed use the original base address when creating the ScopArrayInfo that corresponds to a given memory access. This change prepares for https://reviews.llvm.org/D28518. llvm-svn: 294734
* [ScopInfo] Use getAccessValue to obtain the accessed valueTobias Grosser2017-02-101-1/+1
| | | | | | | | | | | | | | | This replaces the use of getOriginalAddrPtr, a value that is stored in ScopArrayInfo and might at some point not be unique any more. However, the access value is defined to be unique. This change is an update on r294576, which only clarified that we need the original memory access, but where we still remained dependent to have one base pointer per scop. This change removes unnecessary uses of MemoryAddress::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294733
* [BlockGenerator] Use MemoryAccess::getAccessValue to get load instructionTobias Grosser2017-02-091-2/+2
| | | | | | | | | | | | | | | | | | | | When generating code in the BlockGenerator we copy all (interesting) instructions and keep track of the new values in a basic block map. To obtain the original llvm::Value that belongs to a load memory access, we use getAccessValue() instead of getOriginalBaseAddr(). The former always references the instruction we use to load values from. The latter, on the other hand, is obtaine from the corresponding ScopArrayInfo and would not be unique in case ScopArrayInfo objects at some point allow memory accesses with different base addresses. This change is an update on r294566, which only clarified that we need the original memory access, but where we still remained dependent to have one base pointer per scop. This change removes unnecessary uses of MemoryAddress::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294669
* [ScopInfo] Use MemoryAccess::getScopArrayInfo() interface to access Array [NFC]Tobias Grosser2017-02-091-3/+2
| | | | | | | | | | | | | | By using the public interface MemoryAccess::getScopArrayInfo() we avoid the direct access to the ScopArrayInfoMap and as a result also do not need to use the BasePtr as key. This change makes the code cleaner. The const-cast we introduce is a little ugly. We may consider to drop const correctness for getScopArrayInfo() at some point. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294655
* [ScopInfo] Use types instead of 'auto' and use more descriptive variable ↵Tobias Grosser2017-02-091-8/+10
| | | | | | | | | | names [NFC] LLVM's coding conventions suggest to use auto only in obvious cases. Hence, we move this code to actually declare the types used. We also replace the variable name 'SAI', with the name 'Array', as this improves readability. llvm-svn: 294654
* [ScopInfo] Use ScopArrayInfo instead of base addressTobias Grosser2017-02-091-11/+11
| | | | | | | | | | | | When building alias groups, we sort different ScopArrays into unrelated groups. Historically we identified arrays through their base pointer, as no ScopArrayInfo class was yet available. This change changes the alias group construction to reference arrays through their ScopArrayInfo object. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294649
* [ScopInfo] Expect the OriginalBaseAddr when looking at underlying ↵Tobias Grosser2017-02-091-3/+3
| | | | | | | | | | | | | | | | instructions [NFC] During SCoP construction we sometimes inspect the underlying IR by looking at the base address of a MemoryAccess. In such cases, we always want the original base address. Make this clear by calling getOriginalBaseAddr(). This is a non-functional change as getBaseAddr maps to getOriginalBaseAddr at the moment. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294576
* [ScopInfo] Remove unnecessary indirection through SCEV [NFC]Tobias Grosser2017-02-091-8/+4
| | | | | | | | | The base address of a memory access is already an llvm::Value. Hence, there is no need to go through SCEV, but we can directly work with the llvm::Value. Also use 'Value *' instead of 'auto' for cases where the type is not obvious. llvm-svn: 294575
* [IRBuilder] Extract base pointers directly from ScopArrayTobias Grosser2017-02-091-13/+7
| | | | | | | | | | | | | Instead of iterating over statements and their memory accesses to extract the set of available base pointers, just directly iterate over all ScopArray objects. This reflects more the actual intend of the code: collect all arrays (and their base pointers) to emit alias information that specifies that accesses to different arrays cannot alias. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294574
* [IslAst] Print the ScopArray name to mark reductionsTobias Grosser2017-02-091-1/+1
| | | | | | | | | | | | | Before this change we used the name of the base pointer to mark reductions. This is imprecise as the canonical reference is the ScopArray itself and not the basepointer of a reduction. Using the base pointer of reductions is problematic in cases where a single ScopArray is referenced through two different base pointers. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294568
* [DependenceInfo] Use ScopArrayInfo to keep track of arrays [NFC]Tobias Grosser2017-02-091-4/+4
| | | | | | | | | | | | | | When computing reduction dependences we first identify all ScopArrays which are part of reductions and then only compute for these ScopArrays the more detailed data dependences that allow us to identify reductions and optimize across them. Instead of using the base pointer as identifier of a ScopArray, it is clearer and more understandable to directly use the ScopArray as identifier. This change implements such a switch. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294567
* [BlockGenerator] BBMap uses original BaseAddress for scalar loads [NFC]Tobias Grosser2017-02-091-2/+2
| | | | | | | | | | | | | | | | | When regenerating code in the BlockGenerator we copy instructions that may references scalar values, for which the new value of a given scalar is looked up in BBMap using the original scalar llvm::Value as index. It is consequently necessary that (re)loaded scalar values are made available in BBMap using the original llvm::Value as key independently if the llvm::Value was (re)loaded from the original scalar or a new access function has been specified that caused the value to be reloaded from an array with a differnet base address. We make this clear by using MemoryAccess::getOriginalBaseAddr() instead of MemoryAccess::getBaseAddr() as index to BBMap. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294566
* Isolate a set of partial tile prefixes in case of the matrix multiplicationRoman Gareev2017-02-091-6/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimization Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication. In case it cannot be proved that the number of loop iterations can be evenly divided by tile sizes and we tile and unroll the point loop, the isl generates conditional expressions. Subsequently, the conditional expressions can prevent stores and loads of the unrolled loops from being sunk and hoisted. The patch isolates a set of partial tile prefixes, which have exactly Mr x Nr iterations of the two innermost loops, the result of the loop tiling performed by the matrix multiplication optimization, where Mr and Mr are parameters of the micro-kernel. This helps to get rid of the conditional expressions of the unrolled innermost loops. Probably this approach can be replaced with padding in future. In case of, for example, the gemm from Polybench/C 3.2 and parametric loop bounds, it helps to increase the performance from 7.98 GFlops (27.71% of theoretical peak) to 21.47 GFlops (74.57% of theoretical peak). Hence, we get the same performance as in case of scalar loops bounds. It also cause compile time regression. The compile-time is increased from 0.795 seconds to 0.837 seconds in case of scalar loops bounds and from 1.222 seconds to 1.490 seconds in case of parametric loops bounds. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29244 llvm-svn: 294564
* [NFC] Make ScheduleTreeOptimizer::optimizeBand return a schedule node optimizedRoman Gareev2017-02-081-1/+1
| | | | | | | | | | | | | | with optimizeMatMulPattern This patch makes ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern. Otherwise, it could not use the isolate option, because standardBandOpts could try to tile a band node with anchored subtree and get the error, since the use of the isolate option causes any tree containing the node to be considered anchored. Furthermore, it is not intended to apply standard optimizations, when the matrix multiplication has been detected. llvm-svn: 294444
* [External] Move lib/JSON to lib/External/JSON. NFC.Michael Kruse2017-02-0518-3/+3
| | | | | | For consistency with isl and ppcg which are already in lib/External. llvm-svn: 294126
* [Support] Add convertZoneToTimepoints. NFC.Michael Kruse2017-02-041-0/+16
| | | | | | | | | | | | This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). In contrast to computeReachingWrite and computeArrayUnused, convertZoneToTimepoints implies a format for zones (ranges between timepoints). Zones at the moment are unique to DeLICM, but convertZoneToTimepoints makes most sense in conjunction with the previous two functions. llvm-svn: 294094
* [Support] Add computeArrayUnused. NFC.Michael Kruse2017-02-041-0/+47
| | | | | | | This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). llvm-svn: 294093
* [Support] Add computeReachingWrite. NFC.Michael Kruse2017-02-041-0/+60
| | | | | | | This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). llvm-svn: 294092
* [Support] Remove unused function hasInvokeEdge. NFC.Michael Kruse2017-02-031-9/+0
| | | | llvm-svn: 294062
* A new algorithm for identification of a SCoP statement that implement a matrixRoman Gareev2017-02-021-260/+396
| | | | | | | | | | | | | | | | | | | | multiplication The current identification of a SCoP statement that implement a matrix multiplication does not help to identify different permutations of loops that contain it and check for dependencies, which can prevent it from being optimized. It also requires external determination of the operands of the matrix multiplication. This patch contains the implementation of a new algorithm that helps to avoid these issues. It also modifies the test cases that generate matrix multiplications with linearized accesses, because the new algorithm does not support them. Reviewed-by: Michael Kruse <llvm@meinersbur.de>, Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D28357 llvm-svn: 293890
* Update to recent formatting changesTobias Grosser2017-02-017-32/+22
| | | | llvm-svn: 293756
* Fix format after recent clang-format change.Daniel Jasper2017-02-011-3/+2
| | | | llvm-svn: 293753
OpenPOWER on IntegriCloud