path: root/polly/test
Commit message | Author | Age | Files | Lines
...
* Heap allocation for new arrays. | Michael Kruse | 2017-06-28 | 3 | -0/+255
This patch implements the option of allocating new arrays created by polly on the heap instead of the stack. To enable this option, a key named 'allocation' must be written in the imported json file with the value 'heap'.

We need this feature because, in a next iteration, we will implement a mechanism of maximal static expansion, which will need a way to allocate arrays on the heap. Indeed, the expansion is very costly in terms of memory, and doing the allocation on the stack is not worth considering.

The malloc and the free are added at polly.start and polly.exiting, respectively, such that there is no use-after-free (for instance, in case of a Scop in a loop) and such that all memory cells allocated with a malloc are freed when we don't need them anymore.

We also add:
- In the class ScopArrayInfo, a boolean member called IsOnHeap, which represents whether the array is allocated on the heap or not.
- A new branch in the method allocateNewArrays in the IslNodeBuilder for the case of heap allocation. allocateNewArrays now takes a BBPair containing polly.start and polly.exiting, and adds the malloc and free calls to polly.start and polly.exiting, respectively.
- As IntPtrTy for the malloc call, we use the DataLayout one.

To do that, we have modified:
- createScopArrayInfo and getOrCreateScopArrayInfo such that they return a non-const SAI, in order to be able to call setIsOnHeap in the JSONImporter.
- executeScopConditionally such that it returns both the start block and the end block of the scop, because we need these two blocks to be able to add the malloc and the free calls at the right position.

Differential Revision: https://reviews.llvm.org/D33688
llvm-svn: 306540
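A minimal sketch of what an array entry requesting heap allocation might look like. Only the 'allocation' key and its 'heap' value come from the message above; the array name, the 'sizes' and 'type' values, and the enclosing 'arrays' list are assumptions made for illustration, not a verified schema of the imported file.

```python
import json

# Hypothetical array entry for an imported JSON file. Only the
# "allocation": "heap" pair is taken from the commit message; the
# remaining keys and values are illustrative assumptions.
new_array = {
    "name": "D",
    "sizes": ["270", "200"],
    "type": "double",
    "allocation": "heap",
}

jscop_fragment = {"arrays": [new_array]}
print(json.dumps(jscop_fragment, indent=2))
```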
* [JSONImport] Check if the size of an imported array is positive | Andreas Simbuerger | 2017-06-27 | 3 | -0/+205
llvm-svn: 306479
* [FIX] Add %loadPolly to test | Andreas Simbuerger | 2017-06-27 | 1 | -1/+1
This test fails if polly is not linked into LLVM's tools. Our lit site-config already deals with this by not adding the -load option if polly is linked into LLVM's tools.

llvm-svn: 306395
* [PPCGCodeGeneration] Add flag to allow polly to fail if GPU kernel fails. | Siddharth Bhat | 2017-06-26 | 1 | -0/+47
- This is useful for debugging GPU code.

llvm-svn: 306290
* [PPCGCodeGeneration] Allow intrinsics within kernels. | Siddharth Bhat | 2017-06-26 | 1 | -0/+66
- In D33414, if any function call was found within a kernel, we would bail out.
- This is an over-approximation. This patch changes this by allowing the `llvm.sqrt.*` family of intrinsics.
- This introduces an additional step when creating a separate llvm::Module for a kernel (GPUModule). We now copy function declarations from the original module to the new module.
- We also populate IslNodeBuilder::ValueMap so it replaces the function references to the old module with the ones in the new module (GPUModule).

Differential Revision: https://reviews.llvm.org/D34145
llvm-svn: 306284
* [tests] Add forgotten pollyacc REQUIRES line | Tobias Grosser | 2017-06-26 | 1 | -0/+2
llvm-svn: 306273
* [PPCGCodeGeneration] Enable GPU code generation with invariant loads. | Siddharth Bhat | 2017-06-25 | 1 | -0/+116
The condition that disallowed code generation in PPCGCodeGeneration with invariant loads is not required. I haven't been able to construct a counterexample where this generates invalid code.

Differential Revision: https://reviews.llvm.org/D34604
llvm-svn: 306245
* [ScopInfo] Bound the number of array disjuncts in run-time bounds checks | Tobias Grosser | 2017-06-25 | 1 | -0/+241
This reduces the compilation time of one reduced test case from Android from 16 seconds to 100 milliseconds (we bail out), without negatively impacting any other test case we currently have.

We still occasionally saw compilation timeouts on the AOSP buildbot. Hopefully, those will go away with this change.

llvm-svn: 306235
* [FIX] A small addition to r305675. | Roman Gareev | 2017-06-25 | 1 | -8/+7
llvm-svn: 306234
* [ScopInfo] Fix crash with sum of invariant load and AddRec. | Eli Friedman | 2017-06-20 | 1 | -0/+50
r303971 added an assertion that SCEV addition involving an AddRec and a SCEVUnknown must involve a dominance relation: either the SCEVUnknown value dominates the AddRec's loop, or the AddRec's loop header dominates the SCEVUnknown. This is generally fine for most usage of SCEV because it isn't possible to write an expression in IR which would violate it, but it's a bit inconvenient here for polly.

To solve the issue, just avoid creating a SCEV expression which triggers the assertion.

I'm not really happy with this solution, but I don't have any better ideas.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33464.

Differential Revision: https://reviews.llvm.org/D34259
llvm-svn: 305864
* [CodeGen] Emit aliasing metadata for new arrays. | Michael Kruse | 2017-06-19 | 2 | -8/+20
Ensure that all array base pointers are assigned before generating aliasing metadata by allocating new arrays beforehand.

Before this patch, getBasePtr() returned nullptr for new arrays because the arrays were created at a later point. Nullptr did not match any array once the created array base pointers had been assigned and the loads/stores were generated.

llvm-svn: 305675
* Don't check side effects for functions outside of SCoP | Eli Friedman | 2017-06-14 | 1 | -0/+36
In r304074 we introduced a patch to accept results from side-effect-free functions into SCEV modeling. This causes rejection of cases where the call is happening outside the SCoP. This patch checks if the call is outside the Region and treats the results as a parameter (SCEVType::PARAM) to the SCoP instead of returning SCEVType::INVALID.

Patch by Sameer Abu Asal.

llvm-svn: 305423
* [Polly] [PPCGCodeGeneration] Skip Scops which contain function pointers. | Siddharth Bhat | 2017-06-12 | 1 | -0/+82
In `PPCGCodeGeneration`, we try to take the references of every `Value` that is used within a Scop to offload to the kernel. This occurs in `GPUNodeBuilder::createLaunchParameters`.

This breaks if one of the values is a function pointer, since one of these cases will trigger:

1. We try to take the reference of an intrinsic function, and this breaks at `verifyModule`, since it is illegal to take the reference of an intrinsic.
2. We manage to take the reference to a function, but this fails at `verifyModule` since the function will not be present in the module that is created in the kernel.
3. Even if `verifyModule` succeeds (which should not occur), we would then try to call a *host function* from the *device*, which is illegal runtime behaviour.

So, we disable this entire range of possibilities by simply not allowing function references within a `Scop` which corresponds to a kernel.

However, note that this is too conservative. We *can* allow intrinsics within kernels if the backend can lower the intrinsic correctly. For example, an intrinsic like `llvm.powi.*` can actually be lowered by the `NVPTX` backend.

We will now gradually whitelist intrinsics which are known to be safe.

Differential Revision: https://reviews.llvm.org/D33414
llvm-svn: 305185
* [Polly] [ScopDetection] Allow passing multiple functions to `-polly-only-func`. | Siddharth Bhat | 2017-06-09 | 1 | -0/+100
- This is useful to run optimisations on only certain functions.

Differential Revision: https://reviews.llvm.org/D33990
llvm-svn: 305060
* [Simplify] Use execution order of memory accesses. | Michael Kruse | 2017-06-06 | 3 | -0/+126
Iterate through memory accesses in execution order (first all implicit reads, then explicit accesses, then implicit writes).

In the test case this caused an implicit load to be handled as if it was loaded after the write. That is, the value was being written before it was available.

This fixes llvm.org/PR33323

llvm-svn: 304810
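The ordering described above can be sketched as a stable sort by phase. This is only an illustration: the kind labels and the tuple representation are hypothetical and do not reflect Polly's MemoryAccess class.

```python
# Hypothetical access kinds; the ordering follows the commit message:
# implicit reads first, then explicit accesses, then implicit writes.
ORDER = {"implicit_read": 0, "explicit": 1, "implicit_write": 2}


def in_execution_order(accesses):
    """Return accesses sorted by phase; sorted() is stable, so the
    original order within each phase is preserved."""
    return sorted(accesses, key=lambda acc: ORDER[acc[0]])


accesses = [
    ("implicit_write", "phi.out"),
    ("explicit", "store A[i]"),
    ("implicit_read", "val.in"),
    ("explicit", "load B[i]"),
]
ordered = in_execution_order(accesses)
```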
* [Polly] [BlockGen] Support partial writes in regions | Tobias Grosser | 2017-06-06 | 6 | -0/+344
Summary:
The RegionGenerator traditionally kept a BlockMap that mapped from original basic blocks to newly generated basic blocks. With the introduction of partial writes such a 1:1 mapping is not possible any more, as a single basic block can be code generated into multiple basic blocks. Hence, depending on the use case we need to either use the first basic block or the last basic block.

This is intended to address the last four cases of incorrect code generation in our AOSP buildbot and hopefully should turn it green.

Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg
Reviewed By: Meinersbur
Subscribers: pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D33767
llvm-svn: 304808
* Add test coverage for regions with non-affine loops | Tobias Grosser | 2017-06-03 | 1 | -0/+62
This adds test coverage for regions with non-affine loops, which we unfortunately missed when committing this feature years ago. We will add more test coverage over time.

llvm-svn: 304672
* [CodeGen] Track trip counts per-scop for performance measurement. | Siddharth Bhat | 2017-06-02 | 3 | -98/+150
- Add a counter that is incremented once on exit from a scop.
- Test cases got split into two: one to test the cycles, and another one to test trip counts.
- Sample output:

```name=sample-output.txt
scop function, entry block name, exit block name, total time, trip count
warmup, %entry.split, %polly.merge_new_and_old, 5180, 1
f, %entry.split, %polly.merge_new_and_old, 409944, 500
g, %entry.split, %polly.merge_new_and_old, 1226, 1
```

Differential Revision: https://reviews.llvm.org/D33822
llvm-svn: 304543
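One possible use of the extra column, sketched with Python's standard csv module: dividing total time by trip count gives an average per scop execution. The sample text is copied from the message above; the helper name is hypothetical, and the unit of "total time" is not stated in the message.

```python
import csv
import io

SAMPLE = """\
scop function, entry block name, exit block name, total time, trip count
warmup, %entry.split, %polly.merge_new_and_old, 5180, 1
f, %entry.split, %polly.merge_new_and_old, 409944, 500
g, %entry.split, %polly.merge_new_and_old, 1226, 1
"""


def time_per_trip(text):
    """Map each scop function to total time divided by trip count."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return {
        row["scop function"]: int(row["total time"]) / int(row["trip count"])
        for row in reader
    }


per_trip = time_per_trip(SAMPLE)
```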
* [CodeGen] Print performance counter information in CSV. | Siddharth Bhat | 2017-06-02 | 1 | -7/+5
This ensures that tools can easily parse the performance information which Polly generates.

- Sample output:

```name=out.csv
scop function, entry block name, exit block name, total time
warmup, %entry.split, %polly.merge_new_and_old, 1960
f, %entry.split, %polly.merge_new_and_old, 1238
g, %entry.split, %polly.merge_new_and_old, 1218
```

- Example code to parse output:

```lang=python, name=example-parse.py
import asciitable
import sys

# Read the CSV that Polly printed, then echo it back as CSV.
table = asciitable.read('out.csv', delimiter=',')
asciitable.write(table, sys.stdout, delimiter=',')
```

llvm-svn: 304533
* [CodeGen] Extend Performance Counter to track per-scop information. | Siddharth Bhat | 2017-06-02 | 2 | -3/+100
Previously, we would generate one performance counter for all scops. Now, we generate both the old information, as well as a per-scop performance counter to generate finer grained information.

This patch needed a way to generate a unique name for a `Scop`. The start region, end region, and function name combined provide a unique `Scop` name. So, `Scop` has a new public API to provide its start and end region names.

Differential Revision: https://reviews.llvm.org/D33723
llvm-svn: 304528
* [ScopBuilder] Exclude ignored intrinsics from explicit instruction list. | Michael Kruse | 2017-06-01 | 1 | -0/+46
Ignored intrinsics are ignored at code generation, therefore do not need to be part of the instruction list. Specifically, llvm.lifetime.* intrinsics are removed before code generation; referencing them would cause a use-after-free error.

Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D33768
llvm-svn: 304483
* [BlockGenerator] Take context into account when identifying partial writes | Tobias Grosser | 2017-06-01 | 1 | -0/+34
A partial write is a write where the domain of the values written is a subset of the execution domain of the parent statement containing the write. Originally, we directly checked this subset relation whereas it is indeed only important that the subset relation holds for the parameter values that are known to be valid in the execution context of the scop. We update our check to avoid the unnecessary introduction of partial writes in situations where the write appears to be partial without context information, but where context information allows us to understand that a full write can be generated.

This change fixes (hides) a recent regression introduced in r303517, which broke our AOSP builds. The part that is correctly fixed in this change is that we no longer unnecessarily generate a partial write. This is good performance wise and, as we currently do not yet explicitly introduce partial writes in the default configuration, this also hides possible bugs in the partial writes implementation. The crashes that we have originally seen were caused by such a bug, where partial writes were incorrectly generated in region statements. An additional patch in a subsequent commit is needed to address this problem.

Reported-by: Eli Friedman <efriedma@codeaurora.org>
Differential Revision: https://reviews.llvm.org/D33759
llvm-svn: 304398
* [test] Add a short explanation to test | Tobias Grosser | 2017-05-31 | 1 | -0/+6
llvm-svn: 304279
* [ScopInfo] Do not add terminator & synthesizable instructions to the output instructions. | Michael Kruse | 2017-05-29 | 1 | -5/+0
Such instructions are generated on demand by the CodeGenerator and thus do not need representation in a statement.

Differential Revision: https://reviews.llvm.org/D33642
llvm-svn: 304151
* Delinearize memory accesses that reference parameters coming from function calls | Tobias Grosser | 2017-05-27 | 1 | -0/+118
Certain affine memory accesses which we model today might contain products of parameters which we might combine into a new parameter to be able to create an affine expression that represents these memory accesses. Especially in the context of OpenCL, this approach loses information as memory accesses such as:

  A[get_global_id(0) * N + get_global_id(1)]

are assumed to be linear. We correctly recover their multi-dimensional structure by assuming that parameters that are the result of a function call at IR level likely are not parameters, but indeed induction variables. The resulting access is now A[get_global_id(0)][get_global_id(1)] for an array A[][N].

llvm-svn: 304075
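The index arithmetic behind this can be sketched as follows. It shows only why the two-dimensional subscripts are recoverable from the linear offset when the inner index stays below N; it is not Polly's SCEV-based delinearization, and the function names and the value of N are hypothetical.

```python
def linearize(i, j, n):
    """Offset of A[i][j] in a row-major array A[][n]."""
    return i * n + j


def delinearize(offset, n):
    """Recover (i, j) from a linear offset, assuming 0 <= j < n."""
    return divmod(offset, n)


N = 8  # hypothetical inner dimension size
pairs = [(i, j) for i in range(4) for j in range(N)]
recovered = [delinearize(linearize(i, j, N), N) for i, j in pairs]
```

If the inner index could reach N or more, distinct subscript pairs would share an offset and the recovery would no longer be unique.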
* Allow side-effect free function calls in valid affine SCEVs | Tobias Grosser | 2017-05-27 | 1 | -0/+67
Side-effect free function calls with only constant parameters can be easily re-generated and consequently do not prevent us from modeling a SCEV. This change allows array subscripts to reference function calls such as 'get_global_id()' as used in OpenCL.

We use the function name plus the constant operands to name the parameter. This is possible as the function name is required and is not dropped in release builds the same way names of llvm::Values are dropped. We also provide more readable names for common OpenCL functions, to make it easy to understand the polyhedral model we generate.

llvm-svn: 304074
* Update some tests to changes in isl's internal representation | Tobias Grosser | 2017-05-27 | 5 | -9/+11
This was forgotten as part of r304069.

llvm-svn: 304070
* [Polly] Added the list of Instructions to output in ScopInfo pass | Tobias Grosser | 2017-05-27 | 1 | -0/+49
Summary: This patch outputs the list of all instructions in BlockStmts.

Reviewers: Meinersbur, grosser, bollu
Subscribers: bollu, llvm-commits, pollydev
Differential Revision: https://reviews.llvm.org/D33163
llvm-svn: 304062
* [Polly] Add handling of Top Level Regions | Philip Pfaffe | 2017-05-24 | 1 | -0/+3
Summary:
My goal is to make the newly added `AllowWholeFunctions` option more usable/powerful.

The changes to ScopBuilder.cpp are exclusively checks to prevent `Region.getExit()` from being dereferenced, since Top Level Regions (TLRs) don't have an exit block.

In ScopDetection's `isValidCFG`, I removed a check that disallowed ReturnInstructions to have return values. This might of course have been intentional, so I would welcome your feedback on this and maybe a small explanation why return values are forbidden. Maybe it can be done but needs more changes elsewhere?

The remaining changes in ScopDetection are simply to consider the AllowWholeFunctions option in more places, i.e. allow TLRs when it is set and once again avoid dereferencing `getExit()` if it doesn't exist.

Finally, in ScopHelper.cpp I extended `polly::isErrorBlock` to handle regions without exit blocks as well: The original check was if a given BasicBlock dominates all predecessors of the exit block. Therefore I do the same for TLRs by regarding all BasicBlocks terminating with a ReturnInst as predecessors of a "virtual" function exit block.

Patch by: Lukas Boehm
Reviewers: philip.pfaffe, grosser, Meinersbur
Reviewed By: grosser
Subscribers: pollydev, llvm-commits, bollu
Tags: #polly
Differential Revision: https://reviews.llvm.org/D33411
llvm-svn: 303790
* [DeLICM] Partial writes for PHIs. | Michael Kruse | 2017-05-24 | 3 | -0/+166
Enable the use of partial writes for PHI write accesses with a switch. This simply skips the test for whether a PHI write would be partial.

The analogous test for partial value writes also protects against partial reads, which we do not support (yet). It is possible to test for partial reads separately such that we could skip the partial write check as well. In case this turns out to be useful, I can implement it as well.

Differential Revision: https://reviews.llvm.org/D33487
llvm-svn: 303762
* [JSONImporter] misses checks whether the data it imports makes sense. | Michael Kruse | 2017-05-24 | 40 | -0/+1939
Without this patch, the JSONImporter did not verify if the data it loads were correct or not (Bug llvm.org/PR32543). I add some checks in the JSONImporter class and some test cases.

Here are the checks (and test cases) I added:

JSONImporter::importContext
- The "context" key does not exist.
- The context was not parsed successfully by ISL.
- The isl_set has the wrong number of parameters.
- The isl_set is not a parameter set.

JSONImporter::importSchedule
- The "statements" key does not exist.
- There is not the right number of statements in the file.
- The "schedule" key does not exist.
- The schedule was not parsed successfully by ISL.

JSONImporter::importAccesses
- The "statements" key does not exist.
- There is not the right number of statements in the file.
- The "accesses" key does not exist.
- There is not the right number of memory accesses in the file.
- The "relation" key does not exist.
- The memory access was not parsed successfully by ISL.

JSONImporter::areArraysEqual
- The "type" key does not exist.
- The "sizes" key does not exist.
- The "name" key does not exist.

JSONImporter::importArrays
/!\ Do not check if there is a key named "arrays" because it is not considered as an error. All checks are already in place or implemented in JSONImporter::areArraysEqual.

Contributed-by: Nicolas Bonfante <nicolas.bonfante@insa-lyon.fr>
Differential Revision: https://reviews.llvm.org/D32739
llvm-svn: 303759
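The key-presence part of these checks can be sketched in Python. This is a loose illustration, not the C++ code in JSONImporter: the nesting of "schedule" and "accesses" inside each statement and of "relation" inside each access is assumed from the grouping above, the function name is hypothetical, and the ISL parsing checks and the memory-access count check are left out.

```python
def check_jscop(doc, num_statements):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if "context" not in doc:
        problems.append("missing key: context")
    statements = doc.get("statements")
    if statements is None:
        return problems + ["missing key: statements"]
    if len(statements) != num_statements:
        problems.append("wrong number of statements")
    for idx, stmt in enumerate(statements):
        for key in ("schedule", "accesses"):
            if key not in stmt:
                problems.append(f"statement {idx}: missing key: {key}")
        for acc in stmt.get("accesses", []):
            if "relation" not in acc:
                problems.append(f"statement {idx}: access without relation")
    for arr in doc.get("arrays", []):  # a missing "arrays" key is not an error
        for key in ("type", "sizes", "name"):
            if key not in arr:
                problems.append(f"array: missing key: {key}")
    return problems


good = {"context": "[n] -> { : n > 0 }",
        "statements": [{"schedule": "{ S[i] -> [i] }",
                        "accesses": [{"relation": "{ S[i] -> A[i] }"}]}]}
bad = {"statements": [{"accesses": [{}]}]}
```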
* [ScopInfo] Simplify domains early | Tobias Grosser | 2017-05-23 | 1 | -0/+378
This speeds up scop modeling for scops with many redundant existentially quantified constraints. For the attached test case, this change reduces scop modeling time from minutes (hours?) to 0.15 seconds. This change resolves a compilation timeout on the AOSP build.

Thanks Eli for reporting _and_ reducing the test case!

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 303600
* [CodeGen] Add invalidation of the loop SCEVs after merge block generation. | Michael Kruse | 2017-05-22 | 1 | -0/+70
The SCEVs of loops surrounding the escape users of a merge block are forgotten, so that loop trip counts based on old values can be revoked.

This fixes llvm.org/PR32536

Contributed-by: Baranidharan Mohan <mbdharan@gmail.com>
Differential Revision: https://reviews.llvm.org/D33195
llvm-svn: 303561
* [CodeGen] Support partial write accesses. | Michael Kruse | 2017-05-21 | 15 | -0/+589
Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then executes the access.

Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed.

Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read).

Differential Revision: https://reviews.llvm.org/D33255
llvm-svn: 303517
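The guarded write described above can be pictured with a small Python model. It is only a sketch of the idea: the function and parameter names are hypothetical, and the real code generator emits LLVM IR from isl sets rather than iterating over Python ranges.

```python
def run_statement(domain, write_domain, array, value_of):
    """Execute a statement over `domain`; the write is guarded so it
    only happens for iterations inside `write_domain`."""
    for i in domain:
        in_subdomain = 1 if i in write_domain else 0  # generated condition
        if in_subdomain == 1:
            array[i] = value_of(i)
    return array


A = [0] * 8
run_statement(range(8), range(2, 5), A, lambda i: i * 10)
```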
* Fix buildbots after r303429 | Tobias Grosser | 2017-05-20 | 1 | -8/+3
A test case with a GPU runline was added without setting 'REQUIRES=pollyacc'. We drop the GPU run line, as the basic functionality can already be tested with the normal code generation.

llvm-svn: 303485
* [Fortran Support] Materialize outermost dimension for Fortran array. | Siddharth Bhat | 2017-05-19 | 2 | -2/+84
- We use the outermost dimension of arrays since we need this information to generate GPU transfers.
- In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code.
- However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays.
- This patch uses the Fortran array representation to generate code that computes the outermost dimension size.

Differential Revision: https://reviews.llvm.org/D32967
llvm-svn: 303429
* [ScopDetection] Allow detection of full functions | Tobias Grosser | 2017-05-19 | 1 | -0/+64
This is useful when only analyzing functions.

llvm-svn: 303420
* [ScopInfo] Gracefully handle long compile times | Tobias Grosser | 2017-05-19 | 1 | -0/+235
The following test case tried to compute the lexicographic minimum of the following set during alias analysis, which caused very long compile time:

  [p_0, p_1, p_2, p_3, p_4, p_5] -> { MemRef0[i0] :
    (517p_3 >= 70944 - 298p_2 and 256i0 >= -71199 + 298p_2 + 517p_3 and 256i0 <= -70944 + 298p_2 + 517p_3) or
    (409p_4 >= 57120 - 298p_2 and 256i0 >= -57375 + 298p_2 + 409p_4 and 256i0 <= -57120 + 298p_2 + 409p_4) or
    (104p_4 >= 17329 + 149p_2 - 50p_3 and 128i0 >= 17328 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17455 + 149p_2 - 50p_3 - 104p_4) or
    (104p_4 <= 17328 + 149p_2 - 50p_3 and 128i0 >= 17201 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17328 + 149p_2 - 50p_3 - 104p_4) or
    (409p_4 <= 57119 - 298p_2 and 256i0 >= -57120 + 298p_2 + 409p_4 and 256i0 <= -56865 + 298p_2 + 409p_4) or
    (517p_3 <= 70943 - 298p_2 and 256i0 >= -70944 + 298p_2 + 517p_3 and 256i0 <= -70689 + 298p_2 + 517p_3) or
    (p_1 >= 2 + 2p_0 and 298p_5 >= 70944 - 517p_3 and 256i0 >= -71199 + 517p_3 + 298p_5 and 256i0 <= -70944 + 517p_3 + 298p_5) or
    (p_1 >= 2 + 2p_0 and 298p_5 >= 57120 - 409p_4 and 256i0 >= -57375 + 409p_4 + 298p_5 and 256i0 <= -57120 + 409p_4 + 298p_5) or
    (p_1 >= 2 + 2p_0 and 149p_5 <= -17329 + 50p_3 + 104p_4 and 128i0 >= 17328 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17455 - 50p_3 - 104p_4 + 149p_5) or
    (p_1 >= 2 + 2p_0 and 149p_5 >= -17328 + 50p_3 + 104p_4 and 128i0 >= 17201 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17328 - 50p_3 - 104p_4 + 149p_5) or
    (p_1 >= 2 + 2p_0 and 298p_5 <= 57119 - 409p_4 and 256i0 >= -57120 + 409p_4 + 298p_5 and 256i0 <= -56865 + 409p_4 + 298p_5) or
    (p_1 >= 2 + 2p_0 and 298p_5 <= 70943 - 517p_3 and 256i0 >= -70944 + 517p_3 + 298p_5 and 256i0 <= -70689 + 517p_3 + 298p_5) }

We now guard the potentially expensive functions in Polly's scop analysis to gracefully bail out in case of overly long compilation times.

llvm-svn: 303404
* [Fortran Support] Change "global" pattern match to work for params | Siddharth Bhat | 2017-05-18 | 2 | -1/+95
Summary:
- Rename global / local naming convention that did not make much sense to Visible / Invisible, where the visible refers to whether the ALLOCATE call to the Fortran array is present in the current module or not.
- This match now works on both cross fortran module globals and on parameters to functions since neither of them are necessarily allocated at the point of their usage.
- Add testcase that matches against both a load and a store against function parameters.

Differential Revision: https://reviews.llvm.org/D33190
llvm-svn: 303356
* [Polly][Fortran Support] Fix two testcases for the loadable-library use-case | Philip Pfaffe | 2017-05-15 | 2 | -2/+2
llvm-svn: 303057
* [Fortran Support] Add pattern match for Fortran Arrays that are parameters. | Siddharth Bhat | 2017-05-15 | 1 | -0/+67
- This breaks the previous assumption that Fortran Arrays are `GlobalValue`.
- The names of functions were getting unwieldy. So, I renamed the Fortran related functions.

Differential Revision: https://reviews.llvm.org/D33075
llvm-svn: 303040
* [Polly] Fix code generation of llvm.expect intrinsic | Tobias Grosser | 2017-05-14 | 1 | -0/+34
At the time of code generation, an instruction with an llvm intrinsic is ignored in copyBB. However, if the value of the instruction is used later in the program, the value needs to be synthesized. This, in turn, causes some issues with the instructions being generated in a hoisted basic block.

Removing llvm.expect from the list of ignored intrinsics fixes this bug. This resolves http://llvm.org/PR32324.

Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in>
Tags: #polly
Differential Revision: https://reviews.llvm.org/D32992
llvm-svn: 303006
* [Simplify] Remove identical write removal. NFC. | Michael Kruse | 2017-05-13 | 6 | -6/+6
Removal of overwritten writes currently encompasses all the cases of the identical write removal.

There is an observable behavioral change in that the last, instead of the first, MemoryAccess is kept. This should not affect the generated code, however.

Differential Revision: https://reviews.llvm.org/D33143
llvm-svn: 302987
* [Simplify] Remove writes that are overwritten. | Michael Kruse | 2017-05-13 | 6 | -0/+286
Remove memory writes that are overwritten by later writes. This works for StoreInsts:

  store double 21.0, double* %A
  store double 42.0, double* %A

scalar writes at the end of a statement and mixes of these.

Multiple writes can be the result of DeLICM, which might map multiple writes to the same location when it knows that these do not conflict (for instance because they write the same value). Such writes interfere with pattern-matched optimization such as gemm and may not get removed by other LLVM passes after code generation.

Differential Revision: https://reviews.llvm.org/D33142
llvm-svn: 302986
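A rough model of the idea, assuming accesses can be compared by a plain location label. The actual pass reasons about isl access relations, so the backward scan, the tuple layout, and the function name below are illustrative assumptions only.

```python
def remove_overwritten(accesses):
    """accesses: list of ("write" | "read", location, value) in execution
    order. Drop writes that a later write to the same location
    overwrites with no read of that location in between."""
    overwritten = set()  # locations rewritten later with no read before
    kept = []
    for kind, loc, val in reversed(accesses):
        if kind == "write":
            if loc in overwritten:
                continue  # dead: a later write replaces it unread
            overwritten.add(loc)
        else:
            overwritten.discard(loc)  # the read needs the earlier write
        kept.append((kind, loc, val))
    kept.reverse()
    return kept


stores = [("write", "A", 21.0), ("write", "A", 42.0)]
with_read = [("write", "A", 21.0), ("read", "A", None), ("write", "A", 42.0)]
```

In this model the two stores from the message collapse to the second one, while a read between them keeps both.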
* [NFC] [Fortran Support] Run -instnamer on testcases | Siddharth Bhat | 2017-05-12 | 2 | -40/+40
llvm-svn: 302892
* [FIX] Fix regression caused by c29f4ed, testcase matches output | Siddharth Bhat | 2017-05-12 | 1 | -1/+1
- Commit changed codegen for induction variables
- Updated testcase

llvm-svn: 302891
* [NFC] [Fortran Support] Cleanup Fortran Array pattern match testcases | Siddharth Bhat | 2017-05-12 | 2 | -88/+63
- Move the testcases to ScopInfo/ since the processing takes place in ScopBuilder.
- Cleanup testcases, run -polly-canonicalize on them, find minimal set of opt parameters.

llvm-svn: 302886
* [Polly] Generate more 'canonical' induction variable | Hongbin Zheng | 2017-05-12 | 9 | -26/+23
Today Polly generates induction variables in this way:

  polly.indvar = phi 0, polly.indvar.next
  ...
  polly.indvar.next = polly.indvar + stride
  polly.loop_cond = predicate polly.indvar, (UB - stride)

Instead of:

  polly.indvar = phi 0, polly.indvar.next
  ...
  polly.indvar.next = polly.indvar + stride
  polly.loop_cond = predicate polly.indvar.next, UB

The way Polly generates induction variables causes some problems in the indvar simplify pass. This patch makes Polly generate the latter form, by assuming the induction variable never overflows.

Differential Revision: https://reviews.llvm.org/D33089
llvm-svn: 302866
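To see why the two exit tests agree when nothing overflows, here is a small Python model of both loop shapes. The predicate is assumed to be a signed "less than or equal" purely for illustration, since the message leaves it abstract; Python integers are unbounded, which stands in for the no-overflow assumption.

```python
def trips_old(ub, stride):
    """Old form: compare polly.indvar against UB - stride."""
    indvar, trips = 0, 0
    while True:
        trips += 1  # loop body
        indvar_next = indvar + stride
        if not (indvar <= ub - stride):
            return trips
        indvar = indvar_next


def trips_new(ub, stride):
    """New form: compare polly.indvar.next against UB."""
    indvar, trips = 0, 0
    while True:
        trips += 1  # loop body
        indvar_next = indvar + stride
        if not (indvar_next <= ub):
            return trips
        indvar = indvar_next


same = all(trips_old(ub, s) == trips_new(ub, s)
           for ub in range(0, 50) for s in range(1, 5))
```

With fixed-width integers the two forms can differ, because UB - stride and polly.indvar + stride wrap around at different inputs; that is where the no-overflow assumption comes in.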
* [Simplify] Remove identical scalar writes. | Michael Kruse | 2017-05-11 | 6 | -0/+321
After DeLICM, it is possible to have two writes of the same value to the same location in the same statement when it determined that those writes do not conflict (write the same value).

Teach -polly-simplify to remove one of the writes. It interferes with the pattern matching of matrix-multiplication kernels and also seems not to be optimized away by LLVM.

The algorithm is simple, has O(n^2) behaviour (n = max number of MemoryAccesses in a statement) and only matches the most obvious cases, but seems to be enough to pattern-match Boost ublas gemm.

Not handled cases include:
- StoreInst instructions (a.k.a. explicit writes), since the value might be loaded or overwritten between the two stores.
- PHINode, especially LCSSA, when the PHI value matches with another's.
- Partial writes (in preparation)

llvm-svn: 302805
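The quadratic pairwise scan can be sketched like this. It is a simplified model: writes are hypothetical (location, value) pairs compared by equality, whereas the pass compares MemoryAccess objects and their access relations.

```python
def remove_identical_writes(writes):
    """writes: list of (location, value) scalar writes of one statement.
    Pairwise O(n^2) scan keeping the first of each identical pair."""
    kept = []
    for cand in writes:
        duplicate = False
        for prev in kept:
            if prev == cand:  # same location and same value
                duplicate = True
                break
        if not duplicate:
            kept.append(cand)
    return kept


writes = [("A[0]", "%v"), ("B[0]", "%v"), ("A[0]", "%v"), ("A[0]", "%w")]
```

A write of a different value to the same location, such as the last entry in `writes`, is left alone because it is not an identical write.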
* [NFC] [Fortran Support] move Fortran array detection testcases | Siddharth Bhat | 2017-05-10 | 2 | -0/+0
move these testcases to where they belong: ScopDetect

llvm-svn: 302735