summaryrefslogtreecommitdiffstats
path: root/polly
Commit message (Collapse)AuthorAgeFilesLines
* Drop leftover debug statementTobias Grosser2017-02-171-1/+0
| | | | llvm-svn: 295444
* [ScopInfo] Add statistics to count loops after scop modelingTobias Grosser2017-02-174-22/+302
| | | | llvm-svn: 295431
* [ScopDetection] Compute the maximal loop depth correctlyTobias Grosser2017-02-172-1/+250
| | | | | | | Before this change, we obtained loop depth numbers that were deeper then the actual loop depth. llvm-svn: 295430
* Updated isl to isl-0.18-254-g6bc184dTobias Grosser2017-02-17101-1749/+2506
| | | | | | | This update includes a couple more coalescing changes as well as a large number of isl-internal code cleanups (dead assigments, ...). llvm-svn: 295419
* [ScopInfo] Do not try to fold array dimensions of size zeroTobias Grosser2017-02-172-0/+63
| | | | | | | | | | Trying to fold such kind of dimensions will result in a division by zero, which crashes the compiler. As such arrays are likely to invalidate the scop anyhow (but are not illegal in LLVM-IR), there is no point in trying to optimize the array layout. Hence, we just avoid the folding of constant dimensions of size zero. llvm-svn: 295415
* [ScopInfo] Rename MaxDisjunctions -> MaxDisjuncts [NFC]Tobias Grosser2017-02-161-8/+8
| | | | | | | There is only a single disjunction. However, we bound the number of 'disjuncts' in this disjunction. Name the variable accordingly. llvm-svn: 295362
* [tests] Fix some misspellings [NFC]Tobias Grosser2017-02-161-1/+1
| | | | llvm-svn: 295361
* [ScopInfo] Bound the number of disjuncts in contextTobias Grosser2017-02-162-0/+215
| | | | | | | | | Before this change wrapping range metadata resulted in exponential growth of the context, which made context construction of large scops very slow. Instead, we now just do not model the range information precisely, in case the number of disjuncts in the context has already reached a certain limit. llvm-svn: 295360
* [ScopInfo] Use uppercase variable name [NFC]Tobias Grosser2017-02-161-5/+5
| | | | llvm-svn: 295350
* [ScopInfo] Always derive upper and lower bounds for parametersTobias Grosser2017-02-162-14/+23
| | | | | | | | | | | | | | | | | | | | Commit r230230 introduced the use of range metadata to derive bounds for parameters, instead of just looking at the type of the parameter. As part of this commit support for wrapping ranges was added, where the lower bound of a parameter is larger than the upper bound: { 255 < p || p < 0 } However, at the same time, for wrapping ranges support for adding bounds given by the size of the containing type has acidentally been dropped. As a result, the range of the parameters was not guaranteed to be bounded any more. This change makes sure we always add the bounds given by the size of the type and then additionally add bounds based on signed wrapping, if available. For a parameter p with a type size of 32 bit, the valid range is then: { -2147483648 <= p <= 2147483647 and (255 < p or p < 0) } llvm-svn: 295349
* [FIX] Fix the typo in ScheduleOptimizer.cpp.Roman Gareev2017-02-161-1/+2
| | | | llvm-svn: 295292
* [DeLICM] Add forgotten unittests in previous commit. NFC.Michael Kruse2017-02-152-0/+189
| | | | llvm-svn: 295204
* [DeLICM] Add Knowledge class. NFC.Michael Kruse2017-02-153-1/+378
| | | | | | | | | | | | | | | The Knowledge class remembers the state of data at any timepoint of a SCoP's execution. Currently, it tracks whether an array element is unused or is occupied by some value, and the writes to it. A future addition will be to also remember which value it contains. Objects are used to determine whether two Knowledge contain conflicting information, i.e. two states cannot be true a the same time. This commit was extracted from the DeLICM algorithm at https://reviews.llvm.org/D24716. llvm-svn: 295197
* [ScopDetectDiagnostics] Do not format unnamed array namesTobias Grosser2017-02-121-1/+1
| | | | | | | | | | | | | Formatting unnamed array names is expensive in LLVM as the this requires deriving the numbered virtual instruction name (e.g., %12) for an llvm::Value, which is currently not implemented efficiently. As instruction numberes anyhow do not really carry a lot of information for the user, we just print 'unknown' instead. This change reduces the scop detection time from 24 to 19 seconds, for one of our large-scale inputs. This is a reduction by 21%. llvm-svn: 294894
* [ScopDetection] Add statistics to count the maximal number of scops in loopTobias Grosser2017-02-121-0/+7
| | | | llvm-svn: 294893
* Do not use wrapping ranges to bound non-affine accessesTobias Grosser2017-02-1211-11/+58
| | | | | | | | | | | | | When deriving the range of valid values of a scalar evolution expression might be a range [12, 8), where the upper bound is smaller than the lower bound and where the range is expected to possibly wrap around. We theoretically could model such a range as a union of two non-wrapping ranges, but do not do this as of yet. Instead, we just do not derive any bounds. Before this change, we could have obtained bounds where the maximal possible value is strictly smaller than the minimal possible value, which is incorrect and also caused assertions during scop modeling. llvm-svn: 294891
* Check reduction dependencies in case of the matrix multiplication optimizationRoman Gareev2017-02-112-3/+371
| | | | | | | | | | | | | | To determine parameters of the matrix multiplication, we check RAW dependencies that can be expressed using only reduction dependencies. Consequently, we should check the reduction dependencies, if this is the case. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Sven Verdoolaege <skimo-polly@kotnet.org> Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29814 llvm-svn: 294836
* [FIX] Fix the potential issue of containsOnlyMatMulDep.Roman Gareev2017-02-111-0/+2
| | | | llvm-svn: 294835
* [NFC] Fix the style issue of lib/Transform/ScheduleOptimizer.cpp.Roman Gareev2017-02-111-2/+1
| | | | llvm-svn: 294834
* [NFC] Fix style issues of lib/Transform/ScheduleOptimizer.cpp.Roman Gareev2017-02-111-9/+6
| | | | llvm-svn: 294831
* Use the size of the widest type of the matrix multiplication operandsRoman Gareev2017-02-113-8/+330
| | | | | | | | | | | | | The size of the operands type is the one of the parameters required to determine the BLIS micro-kernel. We get the size of the widest type of the matrix multiplication operands in case there are several different types. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29269 llvm-svn: 294828
* Porting the example illustrating Polly from HTML to reStructuredTextTobias Grosser2017-02-1020-618/+1100
| | | | | | | | | | | | | http://polly.llvm.org/example_manual_matmul.html which illustrates individual passes of Polly, has been ported to reStructuredText and necessary changes have been made to the configuration files used by SPHINX to include the new source as a part of the documentation. Contributed-by: Singapuram Sanjay Srivallabh <singapuram.sanjay@gmail.com> Differential Revision: https://reviews.llvm.org/D25163 llvm-svn: 294735
* [ScopInfo] Use original base address when building ScopArrayInfo [NFC]Tobias Grosser2017-02-101-2/+2
| | | | | | | | | This change clarfies that we want to indeed use the original base address when creating the ScopArrayInfo that corresponds to a given memory access. This change prepares for https://reviews.llvm.org/D28518. llvm-svn: 294734
* [ScopInfo] Use getAccessValue to obtain the accessed valueTobias Grosser2017-02-101-1/+1
| | | | | | | | | | | | | | | This replaces the use of getOriginalAddrPtr, a value that is stored in ScopArrayInfo and might at some point not be unique any more. However, the access value is defined to be unique. This change is an update on r294576, which only clarified that we need the original memory access, but where we still remained dependent to have one base pointer per scop. This change removes unnecessary uses of MemoryAddress::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294733
* [BlockGenerator] Use MemoryAccess::getAccessValue to get load instructionTobias Grosser2017-02-091-2/+2
| | | | | | | | | | | | | | | | | | | | When generating code in the BlockGenerator we copy all (interesting) instructions and keep track of the new values in a basic block map. To obtain the original llvm::Value that belongs to a load memory access, we use getAccessValue() instead of getOriginalBaseAddr(). The former always references the instruction we use to load values from. The latter, on the other hand, is obtaine from the corresponding ScopArrayInfo and would not be unique in case ScopArrayInfo objects at some point allow memory accesses with different base addresses. This change is an update on r294566, which only clarified that we need the original memory access, but where we still remained dependent to have one base pointer per scop. This change removes unnecessary uses of MemoryAddress::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294669
* [ScopInfo] Use MemoryAccess::getScopArrayInfo() interface to access Array [NFC]Tobias Grosser2017-02-091-3/+2
| | | | | | | | | | | | | | By using the public interface MemoryAccess::getScopArrayInfo() we avoid the direct access to the ScopArrayInfoMap and as a result also do not need to use the BasePtr as key. This change makes the code cleaner. The const-cast we introduce is a little ugly. We may consider to drop const correctness for getScopArrayInfo() at some point. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294655
* [ScopInfo] Use types instead of 'auto' and use more descriptive variable ↵Tobias Grosser2017-02-091-8/+10
| | | | | | | | | | names [NFC] LLVM's coding conventions suggest to use auto only in obvious cases. Hence, we move this code to actually declare the types used. We also replace the variable name 'SAI', with the name 'Array', as this improves readability. llvm-svn: 294654
* [ScopInfo] Use ScopArrayInfo instead of base addressTobias Grosser2017-02-092-18/+18
| | | | | | | | | | | | When building alias groups, we sort different ScopArrays into unrelated groups. Historically we identified arrays through their base pointer, as no ScopArrayInfo class was yet available. This change changes the alias group construction to reference arrays through their ScopArrayInfo object. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294649
* [ScopInfo] Expect the OriginalBaseAddr when looking at underlying ↵Tobias Grosser2017-02-091-3/+3
| | | | | | | | | | | | | | | | instructions [NFC] During SCoP construction we sometimes inspect the underlying IR by looking at the base address of a MemoryAccess. In such cases, we always want the original base address. Make this clear by calling getOriginalBaseAddr(). This is a non-functional change as getBaseAddr maps to getOriginalBaseAddr at the moment. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294576
* [ScopInfo] Remove unnecessary indirection through SCEV [NFC]Tobias Grosser2017-02-091-8/+4
| | | | | | | | | The base address of a memory access is already an llvm::Value. Hence, there is no need to go through SCEV, but we can directly work with the llvm::Value. Also use 'Value *' instead of 'auto' for cases where the type is not obvious. llvm-svn: 294575
* [IRBuilder] Extract base pointers directly from ScopArrayTobias Grosser2017-02-092-16/+10
| | | | | | | | | | | | | Instead of iterating over statements and their memory accesses to extract the set of available base pointers, just directly iterate over all ScopArray objects. This reflects more the actual intend of the code: collect all arrays (and their base pointers) to emit alias information that specifies that accesses to different arrays cannot alias. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294574
* [FIX] Disable the problematic run linesRoman Gareev2017-02-092-19/+19
| | | | | | | | There are problems with using the machine information to derive the precise vector size on polly-amd64-linux and polly-arm-linux. We temporarily disable the problematic run lines. llvm-svn: 294571
* [FIX] Specify the CPU to overwrite the machine info and set a fixed vectorRoman Gareev2017-02-092-2/+3
| | | | | | size. llvm-svn: 294569
* [IslAst] Print the ScopArray name to mark reductionsTobias Grosser2017-02-096-9/+9
| | | | | | | | | | | | | Before this change we used the name of the base pointer to mark reductions. This is imprecise as the canonical reference is the ScopArray itself and not the basepointer of a reduction. Using the base pointer of reductions is problematic in cases where a single ScopArray is referenced through two different base pointers. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294568
* [DependenceInfo] Use ScopArrayInfo to keep track of arrays [NFC]Tobias Grosser2017-02-091-4/+4
| | | | | | | | | | | | | | When computing reduction dependences we first identify all ScopArrays which are part of reductions and then only compute for these ScopArrays the more detailed data dependences that allow us to identify reductions and optimize across them. Instead of using the base pointer as identifier of a ScopArray, it is clearer and more understandable to directly use the ScopArray as identifier. This change implements such a switch. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294567
* [BlockGenerator] BBMap uses original BaseAddress for scalar loads [NFC]Tobias Grosser2017-02-091-2/+2
| | | | | | | | | | | | | | | | | When regenerating code in the BlockGenerator we copy instructions that may references scalar values, for which the new value of a given scalar is looked up in BBMap using the original scalar llvm::Value as index. It is consequently necessary that (re)loaded scalar values are made available in BBMap using the original llvm::Value as key independently if the llvm::Value was (re)loaded from the original scalar or a new access function has been specified that caused the value to be reloaded from an array with a differnet base address. We make this clear by using MemoryAccess::getOriginalBaseAddr() instead of MemoryAccess::getBaseAddr() as index to BBMap. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294566
* Isolate a set of partial tile prefixes in case of the matrix multiplicationRoman Gareev2017-02-093-6/+427
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimization Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication. In case it cannot be proved that the number of loop iterations can be evenly divided by tile sizes and we tile and unroll the point loop, the isl generates conditional expressions. Subsequently, the conditional expressions can prevent stores and loads of the unrolled loops from being sunk and hoisted. The patch isolates a set of partial tile prefixes, which have exactly Mr x Nr iterations of the two innermost loops, the result of the loop tiling performed by the matrix multiplication optimization, where Mr and Mr are parameters of the micro-kernel. This helps to get rid of the conditional expressions of the unrolled innermost loops. Probably this approach can be replaced with padding in future. In case of, for example, the gemm from Polybench/C 3.2 and parametric loop bounds, it helps to increase the performance from 7.98 GFlops (27.71% of theoretical peak) to 21.47 GFlops (74.57% of theoretical peak). Hence, we get the same performance as in case of scalar loops bounds. It also cause compile time regression. The compile-time is increased from 0.795 seconds to 0.837 seconds in case of scalar loops bounds and from 1.222 seconds to 1.490 seconds in case of parametric loops bounds. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29244 llvm-svn: 294564
* [NFC] Make ScheduleTreeOptimizer::optimizeBand return a schedule node optimizedRoman Gareev2017-02-083-7/+1
| | | | | | | | | | | | | | with optimizeMatMulPattern This patch makes ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern. Otherwise, it could not use the isolate option, because standardBandOpts could try to tile a band node with anchored subtree and get the error, since the use of the isolate option causes any tree containing the node to be considered anchored. Furthermore, it is not intended to apply standard optimizations, when the matrix multiplication has been detected. llvm-svn: 294444
* [External] Move lib/JSON to lib/External/JSON. NFC.Michael Kruse2017-02-0521-9/+8
| | | | | | For consistency with isl and ppcg which are already in lib/External. llvm-svn: 294126
* [Support] Add convertZoneToTimepoints. NFC.Michael Kruse2017-02-043-0/+116
| | | | | | | | | | | | This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). In contrast to computeReachingWrite and computeArrayUnused, convertZoneToTimepoints implies a format for zones (ranges between timepoints). Zones at the moment are unique to DeLICM, but convertZoneToTimepoints makes most sense in conjunction with the previous two functions. llvm-svn: 294094
* [Support] Add computeArrayUnused. NFC.Michael Kruse2017-02-043-0/+179
| | | | | | | This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). llvm-svn: 294093
* [Support] Add computeReachingWrite. NFC.Michael Kruse2017-02-043-0/+222
| | | | | | | This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). llvm-svn: 294092
* [Support] Remove unused function hasInvokeEdge. NFC.Michael Kruse2017-02-032-18/+0
| | | | llvm-svn: 294062
* A new algorithm for identification of a SCoP statement that implement a matrixRoman Gareev2017-02-025-491/+624
| | | | | | | | | | | | | | | | | | | | multiplication The current identification of a SCoP statement that implement a matrix multiplication does not help to identify different permutations of loops that contain it and check for dependencies, which can prevent it from being optimized. It also requires external determination of the operands of the matrix multiplication. This patch contains the implementation of a new algorithm that helps to avoid these issues. It also modifies the test cases that generate matrix multiplications with linearized accesses, because the new algorithm does not support them. Reviewed-by: Michael Kruse <llvm@meinersbur.de>, Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D28357 llvm-svn: 293890
* Update to recent formatting changesTobias Grosser2017-02-018-35/+24
| | | | llvm-svn: 293756
* Fix format after recent clang-format change.Daniel Jasper2017-02-011-3/+2
| | | | llvm-svn: 293753
* Update the documentation on how the packing transformation is implementedRoman Gareev2017-01-291-2/+19
| | | | | | | | | | | | Add a simple example to update the documentation on how the packing transformation is implemented. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D28021 llvm-svn: 293429
* Add forgotten test case for r293169Tobias Grosser2017-01-281-0/+26
| | | | llvm-svn: 293383
* [BlockGenerator] Comment corretions for r293374 [NFC]Tobias Grosser2017-01-282-7/+10
| | | | | | | This addresses some additional comments from Michael Kruse for commit r293374 as expressed in https://reviews.llvm.org/D28901. llvm-svn: 293378
* [Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMapTobias Grosser2017-01-285-158/+92
| | | | | | | | | | | | | | | | | | | | | | | | | Instead of keeping two separate maps from Value to Allocas, one for MemoryType::Value and the other for MemoryType::PHI, we introduce a single map from ScopArrayInfo to the corresponding Alloca. This change is intended, both as a general simplification and cleanup, but also to reduce our use of MemoryAccess::getBaseAddr(). Moving away from using getBaseAddr() makes sure we have only a single place where the array (and its base pointer) for which we generate code for is specified, which means we can more easily introduce new access functions that use a different ScopArrayInfo as base. We already today experiment with modifiable access functions, so this change does not address a specific bug, but it just reduces the scope one needs to reason about. Another motivation for this patch is https://reviews.llvm.org/D28518, where memory accesses with different base pointers could possibly be mapped to a single ScopArrayInfo object. Such a mapping is currently not possible, as we currently generate alloca instructions according to the base addresses of the memory accesses, not according to the ScopArrayInfo object they belong to. By making allocas ScopArrayInfo specific, a mapping to a single ScopArrayInfo object will automatically mean that the same stack slot is used for these arrays. For D28518 this is not a problem, as only MemoryType::Array objects are mapping, but resolving this inconsistency will hopefully avoid confusion. llvm-svn: 293374
OpenPOWER on IntegriCloud