path: root/polly/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [CodeGen] Support partial write accesses. | Michael Kruse | 2017-05-21 | 2 | -71/+190
    Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 when executing the subdomain, and only then executes the access.

    Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed.

    Partial writes make DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read).

    Differential Revision: https://reviews.llvm.org/D33255
    llvm-svn: 303517
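The guarded write described in this commit can be pictured with a small standalone C++ sketch. It does not use Polly or LLVM APIs; `inSubdomain` and `partialWrite` are hypothetical names, and the choice of "last iteration only" as the subdomain is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the generated condition: true only for
// iterations that belong to the subdomain on which the write is defined.
// Assumption: the subdomain is the last iteration of the loop.
static bool inSubdomain(int I, int N) { return I == N - 1; }

// Runs the statement for every I in [0, N) but executes the write only
// where the condition holds, leaving all other elements untouched.
static std::vector<int> partialWrite(int N, int NewValue) {
  std::vector<int> Mem(N, 0);
  for (int I = 0; I < N; ++I) {
    if (inSubdomain(I, N)) // evaluates to true only inside the subdomain
      Mem[I] = NewValue;   // the guarded (partial) write
  }
  return Mem;
}
```

A read could not be guarded the same way without supplying some value for the iterations where it is skipped, which is the limitation the commit message mentions.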
* [Fortran Support] Materialize outermost dimension for Fortran array. | Siddharth Bhat | 2017-05-19 | 2 | -1/+101
    - We use the outermost dimension of arrays since we need this information to generate GPU transfers.
    - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code.
    - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays.
    - This patch uses the Fortran array representation to generate code that computes the outermost dimension size.

    Differential Revision: https://reviews.llvm.org/D32967
    llvm-svn: 303429
* [IR] De-virtualize ~Value to save a vptr | Reid Kleckner | 2017-05-18 | 1 | -1/+1
    Summary:
    Implements PR889

    Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data:

    https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing

    This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call to Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>.

    I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection.

    Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap.

    Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv
    Reviewed By: chandlerc
    Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits
    Differential Revision: https://reviews.llvm.org/D31261
    llvm-svn: 303362
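The pattern this commit describes (no virtual destructor, deletion routed through a kind tag, plus a smart pointer with a custom deleter) might look roughly like the following. This is a self-contained imitation, not LLVM's code: the two-kind hierarchy, the `LiveInstructions` counter, and `ValueDeleter` are assumptions made for the sketch, and `unique_value` here only mimics the idea behind llvm::unique_value.

```cpp
#include <cassert>
#include <memory>

// Miniature hierarchy without a virtual destructor. A kind tag replaces
// the vtable for finding the most derived type at deletion time.
struct Value {
  enum Kind { ConstantKind, InstructionKind };
  explicit Value(Kind K) : ID(K) {}
  void deleteValue(); // dispatches on ID to the most derived type

protected:
  ~Value() = default; // non-virtual and protected: `delete V` is rejected

private:
  Kind ID;
};

static int LiveInstructions = 0; // only here to observe destruction

struct Constant : Value {
  Constant() : Value(ConstantKind) {}
};

struct Instruction : Value {
  Instruction() : Value(InstructionKind) { ++LiveInstructions; }
  ~Instruction() { --LiveInstructions; }
};

void Value::deleteValue() {
  switch (ID) {
  case ConstantKind:
    delete static_cast<Constant *>(this);
    break;
  case InstructionKind:
    delete static_cast<Instruction *>(this);
    break;
  }
}

// Custom deleter so a smart pointer never calls `delete` on a Value *.
struct ValueDeleter {
  void operator()(Value *V) const {
    if (V)
      V->deleteValue();
  }
};
using unique_value = std::unique_ptr<Value, ValueDeleter>;
```

With this shape, `unique_value V(new Instruction());` destroys the full Instruction when `V` goes out of scope, even though Value has no vtable.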
* [Polly][NewPM] Port ScopDetection to the new PassManager | Philip Pfaffe | 2017-05-12 | 2 | -6/+6
    Summary:
    This is a proof of concept of how to port polly-passes to the new PassManager architecture. This approach works out of the box for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now.

    Reviewers: Meinersbur, grosser
    Reviewed By: grosser
    Subscribers: pollydev, sanjoy, nemanjai, llvm-commits
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D31459
    llvm-svn: 302902
* [Polly] Remove unused header | Hongbin Zheng | 2017-05-12 | 1 | -2/+0
    llvm-svn: 302868
* [Polly] Generate more 'canonical' induction variable | Hongbin Zheng | 2017-05-12 | 1 | -4/+7
    Today Polly generates induction variables in this way:

      polly.indvar = phi 0, polly.indvar.next
      ...
      polly.indvar.next = polly.indvar + stride
      polly.loop_cond = predicate polly.indvar, (UB - stride)

    Instead of:

      polly.indvar = phi 0, polly.indvar.next
      ...
      polly.indvar.next = polly.indvar + stride
      polly.loop_cond = predicate polly.indvar.next, UB

    The way Polly generates induction variables causes some problems in the indvar simplify pass. This patch makes Polly generate the latter form, by assuming the induction variable never overflows.

    Differential Revision: https://reviews.llvm.org/D33089
    llvm-svn: 302866
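To see why the two exit tests in this commit are interchangeable when nothing overflows, here is a plain C++ sketch of both loop shapes. The commit message does not name the predicate; the signed `<=` used below is an assumption, and `tripCountOld` / `tripCountNew` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Old form: the exit test compares the current value with UB - Stride.
static int64_t tripCountOld(int64_t UB, int64_t Stride) {
  int64_t Trips = 0, IndVar = 0;
  bool Continue;
  do {
    ++Trips; // stands in for the loop body
    int64_t Next = IndVar + Stride;
    Continue = IndVar <= UB - Stride; // predicate indvar, (UB - stride)
    IndVar = Next;
  } while (Continue);
  return Trips;
}

// New form: the exit test compares the incremented value with UB.
static int64_t tripCountNew(int64_t UB, int64_t Stride) {
  int64_t Trips = 0, IndVar = 0;
  bool Continue;
  do {
    ++Trips; // stands in for the loop body
    int64_t Next = IndVar + Stride;
    Continue = Next <= UB; // predicate indvar.next, UB
    IndVar = Next;
  } while (Continue);
  return Trips;
}
```

Since `IndVar <= UB - Stride` and `IndVar + Stride <= UB` are the same inequality over the integers, both functions return the same trip count as long as neither expression wraps around, which is exactly the no-overflow assumption the patch relies on.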
* [Polly] Canonicalize arrays according to base-ptr equivalence class | Tobias Grosser | 2017-05-10 | 1 | -1/+1
    Summary:
    In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class).

    This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical.

    As part of this change we remove most of the MemoryAccess::get*BaseAddr interface. We already removed all references to get*BaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct from the base pointer of the scop array referenced by this memory access.

    Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert
    Reviewed By: Meinersbur
    Subscribers: etherzhhb
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D28518
    llvm-svn: 302636
* Fix formatting in Polly | Tobias Grosser | 2017-05-10 | 1 | -3/+3
    llvm-svn: 302620
* Update Polly for LLVM API change r302571 that removed varargs functions | Chandler Carruth | 2017-05-10 | 1 | -2/+2
    with a nullptr sentinel in favor of nicely typed variadic templates.

    llvm-svn: 302618
* [Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen | Siddharth Bhat | 2017-05-09 | 1 | -17/+54
    Summary:
    PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for the kernel-argument size at the end of the parameter list. (The resulting parameter list is twice as long. This has been accounted for in the corresponding test cases.)

    Reviewers: grosser, Meinersbur, bollu
    Reviewed By: bollu
    Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D32961
    llvm-svn: 302515
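The "sizes appended after the arguments" layout from this commit can be illustrated with a small sketch. Nothing here is taken from PPCGCodeGeneration or the GPURuntime library: `buildParameterList` and `argSize` are hypothetical helpers, and storing each size as an `int` reachable through a `void *` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a list of 2 * N entries: the first N point to the kernel
// arguments, the last N point to the byte size of each argument.
static std::vector<void *> buildParameterList(const std::vector<void *> &Args,
                                              std::vector<int> &Sizes) {
  assert(Args.size() == Sizes.size());
  std::vector<void *> Params(Args);
  for (int &Size : Sizes)
    Params.push_back(&Size);
  return Params;
}

// A consumer that needs sizes (as the OpenCL runtime is said to) reads
// entry N + I; one that does not (as with CUDA) just ignores the tail.
static int argSize(const std::vector<void *> &Params, std::size_t I) {
  std::size_t N = Params.size() / 2;
  return *static_cast<int *>(Params[N + I]);
}
```

Appending rather than interleaving keeps the first N entries identical to the old layout, which is why a runtime that ignores the tail keeps working unchanged.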
* [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen | Siddharth Bhat | 2017-05-07 | 1 | -19/+94
    Summary:
    When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls).

    Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

    Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay
    Reviewed By: grosser, Meinersbur
    Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D32431
    llvm-svn: 302379
* Revert "[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen" | Siddharth Bhat | 2017-05-05 | 1 | -94/+19
    This reverts commit 17a84e414adb51ee375d14836d4c2a817b191933.

    Patches should have been submitted in the order of:
    1. D32852
    2. D32854
    3. D32431

    I mistakenly pushed D32431(3) first. Reverting to push in the correct order.

    llvm-svn: 302217
* [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen | Siddharth Bhat | 2017-05-05 | 1 | -19/+94
    Summary:
    When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls).

    Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

    Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay
    Reviewed By: grosser, Meinersbur
    Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D32431
    llvm-svn: 302215
* Introduce VirtualUse. NFC. | Michael Kruse | 2017-05-04 | 1 | -32/+106
    If a ScopStmt references a (scalar) value, there are multiple possibilities for where this value can come from. The decision about what kind of use it is must be handled consistently at different places, which can be error-prone. VirtualUse is meant to centralize the handling of the different types of value uses.

    This patch makes ScopBuilder and CodeGeneration use VirtualUse. This already helps to show inconsistencies with the value handling. In order to keep this patch NFC, exceptions to the general rules are added. These might be fixed later if they turn out to be problems. Overall, this should result in fewer post-codegen IR-verification errors, but instead assertion failures in `getNewValue` that are closer to the actual error.

    Differential Revision: https://reviews.llvm.org/D32667
    llvm-svn: 302157
* [NFC] [IslAST] fix typo: "int the" -> "in the" | Siddharth Bhat | 2017-05-02 | 1 | -1/+1
    llvm-svn: 301925
* [Codegen] Disable Polly's codegen verification by default | Tobias Grosser | 2017-04-28 | 1 | -1/+1
    As has been reported in the previous commit, codegen verification can result in quadratic compile time increases for large functions with many scops. This is certainly not something we would like to have in the Polly default configuration. Hence, we disable codegen verification by default -- also to see if this resolves some of the compilation timeouts we currently see on the AOSP buildbots.

    We still leave this feature in Polly as it has proven _very_ useful for debugging. In fact, we may want to have a discussion if we can bring this feature back in a way that does not impact compilation time so much.

    Thanks to Eli Friedman <efriedma@codeaurora.org> for reporting this issue and for providing the test case in the previous commit (where I forgot to acknowledge him).

    llvm-svn: 301670
* [CodeGen] Skip verify if -polly-codegen-verify is set to false | Tobias Grosser | 2017-04-28 | 1 | -1/+1
    Before this change, we always tried to verify the function and printed verification errors, but just did not abort in case -polly-codegen-verify=false was set and verification failed. As verification can become very costly -- for large functions with many scops we may verify the very same function very often -- this can affect compile time very negatively. Hence, we respect the -polly-codegen-verify flag with this check, ensuring that no verification is run if -polly-codegen-verify=false.

    This reduces code generation time from 26 seconds to 4 seconds on the test case below with -polly-codegen-verify=false:

      struct X { int x; };
      void a();
      #define SIG (int x, X **y, X **z)
      typedef void (*fn)SIG;
      #define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); }
      #define FN5 FN FN FN FN FN
      #define FN25 FN5 FN5 FN5 FN5
      #define FN125 FN25 FN25 FN25 FN25 FN25
      #define FN250 FN125 FN125
      #define FN1250 FN250 FN250 FN250 FN250 FN250
      void x SIG { FN1250 }

    llvm-svn: 301669
* [Polly] [PPCGCodeGeneration] Add managed memory support to GPU code | Siddharth Bhat | 2017-04-28 | 1 | -8/+114
    generation.

    This needs changes to GPURuntime to expose synchronization between host and device.

    1. Needs better function naming, I want a better name than "getOrCreateManagedDeviceArray"
    2. DeviceAllocations is used by both the managed memory and the non-managed memory path. This exploits the fact that the two code paths are never run together. I'm not sure if this is the best design decision

    Reviewed by: PhilippSchaad
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D32215
    llvm-svn: 301640
* [Polly] Do not introduce address space cast | Hongbin Zheng | 2017-04-27 | 1 | -1/+2
    Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally.

    Differential Revision: https://reviews.llvm.org/D32581
    llvm-svn: 301519
* [PPCGCodeGeneration] Update PPCG Code Generation for OpenCL compatibility | Siddharth Bhat | 2017-04-25 | 1 | -6/+12
    Added a small change to the way pointer arguments are set in the kernel code generation. The way the pointer is retrieved now specifically requests the global address space to be annotated. This is necessary if the IR is to be run through NVPTX to generate OpenCL compatible PTX.

    The changes do not affect the PTX Strings generated for the CUDA target (nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl).

    Additionally, the data layout has been updated to what the NVPTX Backend requests/recommends.

    Contributed-by: Philipp Schaad
    Reviewers: Meinersbur, grosser, bollu
    Reviewed By: grosser, bollu
    Subscribers: jlebar, pollydev, llvm-commits, nemanjai, yaxunl, Anastasia
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D32215
    llvm-svn: 301299
* Exploit BasicBlock::getModule to shorten code | Tobias Grosser | 2017-04-11 | 3 | -5/+3
    Suggested-by: Roman Gareev <gareevroman@gmail.com>
    llvm-svn: 299914
* Adjust to recent change in constructor definition of AllocaInst | Tobias Grosser | 2017-04-11 | 3 | -18/+23
    llvm-svn: 299913
* Update for alloca construction changes | Matt Arsenault | 2017-04-11 | 4 | -5/+17
    llvm-svn: 299905
* Remove llvm.lifetime.start/end in original region. | Michael Kruse | 2017-04-05 | 1 | -0/+54
    The current StackColoring algorithm does not correctly handle the situation when some, but not all paths from a BB to the entry node cross a llvm.lifetime.start. According to an interpretation of the language reference at http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic this might be correct, but it would cost too much effort to handle in StackColoring.

    To be on the safe side, remove all lifetime markers even in the original code version (they have never been copied to the optimized version) to ensure that no path to the entry block will cross a llvm.lifetime.start. The same principle applies to paths to a function return and the llvm.lifetime.end marker, so we remove them as well.

    This fixes llvm.org/PR32251.

    Also see the discussion at http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html

    llvm-svn: 299585
* Fix formatting in LoopGenerators | Philip Pfaffe | 2017-04-04 | 1 | -2/+2
    llvm-svn: 299424
* [Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers | Philip Pfaffe | 2017-04-04 | 5 | -15/+13
    Summary:
    A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly.

    This shouldn't introduce any functional changes, although the API technically allowed obtaining two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however.

    Reviewers: grosser, Meinersbur
    Reviewed By: grosser
    Subscribers: nemanjai, pollydev, llvm-commits
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D31653
    llvm-svn: 299423
* [PerfMonitor] Use Intrinsics::getDeclaration | Tobias Grosser | 2017-04-03 | 1 | -11/+2
    Instead of creating the declaration ourselves, we obtain it directly from the LLVM intrinsic definitions. This addresses a post-review comment for r299359.

    Suggested-by: Hongbin Zheng <etherzhhb@gmail.com>
    llvm-svn: 299360
* [CodeGen] Add Performance Monitor | Tobias Grosser | 2017-04-03 | 2 | -0/+253
    Add support for -polly-codegen-perf-monitoring. When performance monitoring is enabled, we emit performance monitoring code during code generation that, after program exit, prints statistics about the total number of cycles executed as well as the number of cycles spent in scops. This gives an estimate of how useful polyhedral optimizations might be for a given program.

    Example output:

      Polly runtime information
      -------------------------
      Total: 783110081637
      Scops: 663718949365

    In the future, we might also add functionality to measure how much time is spent in optimized scops and how many cycles are spent in the fallback code.

    Reviewers: bollu,sebpop
    Tags: #polly
    Differential Revision: https://reviews.llvm.org/D31599
    llvm-svn: 299359
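The bookkeeping behind such a report can be sketched in a few lines of standalone C++. This is not the code Polly emits: `PerfMonitor`, `enterScop`, `leaveScop`, and `report` are hypothetical names, and cycle values are passed in explicitly (an assumption) so that the sketch does not depend on a hardware cycle counter.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Accumulates the cycles spent between matching enter/leave calls and
// formats a report in the shape of the example output above.
struct PerfMonitor {
  uint64_t CyclesInScops = 0;
  uint64_t ScopStart = 0;

  void enterScop(uint64_t Now) { ScopStart = Now; }
  void leaveScop(uint64_t Now) { CyclesInScops += Now - ScopStart; }

  // ProgramStart / ProgramEnd are the counter values at program start
  // and exit; their difference is the "Total" line.
  std::string report(uint64_t ProgramStart, uint64_t ProgramEnd) const {
    return "Polly runtime information\n"
           "-------------------------\n"
           "Total: " + std::to_string(ProgramEnd - ProgramStart) + "\n"
           "Scops: " + std::to_string(CyclesInScops) + "\n";
  }
};
```

The ratio of the two numbers is what gives the estimate mentioned in the commit message: in the example output, roughly 85% of all cycles were spent inside scops.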
* [PollyIRBuilder] Bound size of alias metadata | Tobias Grosser | 2017-04-03 | 1 | -0/+8
    No-alias metadata grows quadratically in the number of arrays involved, which can become very costly for large programs. This commit bounds the number of arrays for which we construct no-alias information to ten. This is conservatively correct, as we just provide less information to LLVM, and speeds up the compile time of one of my internal test cases from 'does-not-terminate' to 'finishes-in-less-than-a-minute'.

    In the future we might try to be more clever here, but this change should provide a good baseline.

    llvm-svn: 299352
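A back-of-the-envelope model shows where the quadratic growth comes from and what a bound does to it. The function name is hypothetical, and two things are assumptions rather than facts from the commit: that each array lists every other array in its no-alias set (giving N * (N - 1) entries), and that exceeding the limit skips the annotation entirely rather than truncating it.

```cpp
#include <cassert>
#include <cstddef>

// Number of (array, other array) no-alias entries emitted for N arrays,
// or 0 when N exceeds the bound and the annotation is skipped.
static std::size_t noAliasEntries(std::size_t NumArrays,
                                  std::size_t MaxArrays = 10) {
  if (NumArrays > MaxArrays)
    return 0; // conservatively emit nothing: less information, still correct
  return NumArrays == 0 ? 0 : NumArrays * (NumArrays - 1);
}
```

Emitting nothing is safe because no-alias metadata only ever adds optimization freedom; leaving it out cannot make LLVM miscompile.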
* [ScopInfo] Introduce ScopStmt::contains(BB*). NFC. | Michael Kruse | 2017-03-23 | 1 | -4/+2
    Provide a common way for testing if a statement contains something for region and block statements. First user is RegionGenerator::addOperandToPHI.

    Suggested-by: Tobias Grosser <tobias@grosser.es>
    llvm-svn: 298617
* Introduce another level of metadata to distinguish non-aliasing accesses | Roman Gareev | 2017-03-22 | 2 | -5/+60
    Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM.

    Reviewed-by: Tobias Grosser <tobias@grosser.es>
    Differential Revision: https://reviews.llvm.org/D30606
    llvm-svn: 298510
* Map the new load to the base pointer of the invariant load hoisted load | Roman Gareev | 2017-03-22 | 1 | -0/+3
    Map the new load to the base pointer of the invariant load hoisted load to be able to find the alias information for it.

    Reviewed-by: Tobias Grosser <tobias@grosser.es>
    Differential Revision: https://reviews.llvm.org/D30605
    llvm-svn: 298507
* [CodeGen] Remove need for all parameters to be in scop context for load hoisting. | Tobias Grosser | 2017-03-18 | 1 | -6/+14
    When not adding constraints on parameters using -polly-ignore-parameter-bounds, the context may not necessarily list all parameter dimensions. To support code generation in this situation, we now always iterate over the actual parameter list, rather than relying on the context to list all parameter dimensions.

    llvm-svn: 298197
* [IslExprBuilder] Print accessed memory locations with RuntimeDebugBuilder | Tobias Grosser | 2017-03-18 | 2 | -4/+17
    After this change, enabling -polly-codegen-add-debug-printing in combination with -polly-codegen-generate-expressions allows us to instrument the compiled binaries to not only print the values stored and loaded to a given memory access, but also to print the accessed location with array name and per-dimension offset:

      MemRef_A[3][2]
      Store to 6299784: 5.000000
      MemRef_A[3][3]
      Load from 6299788: 0.000000
      MemRef_A[3][3]
      Store to 6299788: 6.000000

    This can be very helpful for debugging.

    llvm-svn: 298194
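The location line in that output (array name followed by one bracketed offset per dimension) is easy to reproduce outside the compiler. The helper below is a hypothetical illustration of the format only; it is not RuntimeDebugBuilder's interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Formats an accessed location as Name[i0][i1]...[ik].
static std::string formatAccess(const std::string &ArrayName,
                                const std::vector<long> &Offsets) {
  std::string Out = ArrayName;
  for (long Offset : Offsets)
    Out += "[" + std::to_string(Offset) + "]";
  return Out;
}
```

For example, `formatAccess("MemRef_A", {3, 2})` yields the first location line of the sample output.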
* [OpenMP] Do not emit lifetime markers for context | Tobias Grosser | 2017-03-18 | 1 | -9/+0
    In commit r219005 lifetime markers have been introduced to mark the lifetime of the OpenMP context data structure. However, their use seems incorrect and recently caused a miscompile in ASC_Sequoia/CrystalMk after r298053 which was not at all related to r298053. r298053 only caused a change in the loop order, as this change resulted in a different isl internal representation which caused the scheduler to derive a different schedule. This change then caused the IR to change, which apparently created a pattern in which LLVM exploits the lifetime markers. It seems we are using the OpenMP context outside of the lifetime markers. Even though CrystalMk could probably be fixed by expanding the scope of the lifetime markers, it is not clear what happens in case the OpenMP function call is in a loop which will cause a sequence of starting and ending lifetimes. As it is unlikely that the lifetime markers give any performance benefit, we just drop them to remove complexity.

    llvm-svn: 298192
* Possible error in doc comment | Tobias Grosser | 2017-03-12 | 1 | -1/+1
    If a SCoP is most probably sequential, then it's better to run it on a CPU. Hence, there's no point in running it on a GPU.

    Reviewers: grosser
    Subscribers: nemanjai
    Tags: #polly
    Contributed-by: Singapuram Sanjay <singapuram.sanjay@gmail.com>
    Differential Revision: https://reviews.llvm.org/D30864
    llvm-svn: 297578
* [Simplify] Add -polly-simplify pass. | Michael Kruse | 2017-03-10 | 1 | -0/+8
    This new pass removes unnecessary accesses and writes. It currently supports 2 simplifications, but more are planned.

    It removes write accesses that write a loaded value back to the location it was loaded from. It is a typical artifact from DeLICM. Removing it will get rid of bogus dependencies later in dependency analysis.

    It also removes statements without side-effects. ScopInfo already removes these, but the removal of unnecessary writes can result in more side-effect free statements.

    Differential Revision: https://reviews.llvm.org/D30820
    llvm-svn: 297473
* Fix namespaces after clang-format update | Tobias Grosser | 2017-03-01 | 1 | -1/+1
    llvm-svn: 296635
* Disable the parallel code generation in case of extension nodes | Roman Gareev | 2017-02-27 | 1 | -0/+8
    We cannot perform the dependence analysis and, consequently, the parallel code generation in case the schedule tree contains extension nodes.

    Reviewed-by: Tobias Grosser <tobias@grosser.es>
    Differential Revision: https://reviews.llvm.org/D30394
    llvm-svn: 296325
* Remove all references to PostDominators. NFC. | Michael Kruse | 2017-02-23 | 3 | -5/+0
    Marking a pass as preserved is necessary if any Polly pass uses it, even if it is not preserved within the generated code. Not marking it would cause the Polly pass chain to be interrupted.

    It is not used by any Polly pass anymore, hence we can remove all references to it.

    llvm-svn: 295983
* [BlockGenerator] Use MemoryAccess::getAccessValue to get load instruction | Tobias Grosser | 2017-02-09 | 1 | -2/+2
    When generating code in the BlockGenerator we copy all (interesting) instructions and keep track of the new values in a basic block map. To obtain the original llvm::Value that belongs to a load memory access, we use getAccessValue() instead of getOriginalBaseAddr(). The former always references the instruction we use to load values from. The latter, on the other hand, is obtained from the corresponding ScopArrayInfo and would not be unique in case ScopArrayInfo objects at some point allow memory accesses with different base addresses.

    This change is an update on r294566, which only clarified that we need the original memory access, but where we still remained dependent on having one base pointer per scop.

    This change removes unnecessary uses of MemoryAccess::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518.

    llvm-svn: 294669
* [IRBuilder] Extract base pointers directly from ScopArray | Tobias Grosser | 2017-02-09 | 1 | -13/+7
    Instead of iterating over statements and their memory accesses to extract the set of available base pointers, just directly iterate over all ScopArray objects. This reflects more the actual intent of the code: collect all arrays (and their base pointers) to emit alias information that specifies that accesses to different arrays cannot alias.

    This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518.

    llvm-svn: 294574
* [IslAst] Print the ScopArray name to mark reductions | Tobias Grosser | 2017-02-09 | 1 | -1/+1
    Before this change we used the name of the base pointer to mark reductions. This is imprecise as the canonical reference is the ScopArray itself and not the base pointer of a reduction. Using the base pointer of reductions is problematic in cases where a single ScopArray is referenced through two different base pointers.

    This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518.

    llvm-svn: 294568
* [BlockGenerator] BBMap uses original BaseAddress for scalar loads [NFC] | Tobias Grosser | 2017-02-09 | 1 | -2/+2
    When regenerating code in the BlockGenerator we copy instructions that may reference scalar values, for which the new value of a given scalar is looked up in BBMap using the original scalar llvm::Value as index. It is consequently necessary that (re)loaded scalar values are made available in BBMap using the original llvm::Value as key, independently of whether the llvm::Value was (re)loaded from the original scalar or a new access function has been specified that caused the value to be reloaded from an array with a different base address. We make this clear by using MemoryAccess::getOriginalBaseAddr() instead of MemoryAccess::getBaseAddr() as index to BBMap.

    This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518.

    llvm-svn: 294566
* Update to recent formatting changes | Tobias Grosser | 2017-02-01 | 2 | -12/+8
    llvm-svn: 293756
* [BlockGenerator] Comment corrections for r293374 [NFC] | Tobias Grosser | 2017-01-28 | 1 | -6/+9
    This addresses some additional comments from Michael Kruse for commit r293374 as expressed in https://reviews.llvm.org/D28901.

    llvm-svn: 293378
* [Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMap | Tobias Grosser | 2017-01-28 | 3 | -62/+50
    Instead of keeping two separate maps from Value to Allocas, one for MemoryType::Value and the other for MemoryType::PHI, we introduce a single map from ScopArrayInfo to the corresponding Alloca. This change is intended both as a general simplification and cleanup, and also to reduce our use of MemoryAccess::getBaseAddr(). Moving away from using getBaseAddr() makes sure we have only a single place where the array (and its base pointer) for which we generate code is specified, which means we can more easily introduce new access functions that use a different ScopArrayInfo as base. We already today experiment with modifiable access functions, so this change does not address a specific bug, but it just reduces the scope one needs to reason about.

    Another motivation for this patch is https://reviews.llvm.org/D28518, where memory accesses with different base pointers could possibly be mapped to a single ScopArrayInfo object. Such a mapping is currently not possible, as we currently generate alloca instructions according to the base addresses of the memory accesses, not according to the ScopArrayInfo object they belong to. By making allocas ScopArrayInfo specific, a mapping to a single ScopArrayInfo object will automatically mean that the same stack slot is used for these arrays. For D28518 this is not a problem, as only MemoryType::Array objects are mapped, but resolving this inconsistency will hopefully avoid confusion.

    llvm-svn: 293374
* BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts | Tobias Grosser | 2017-01-19 | 1 | -8/+10
    Before this change we created an additional reload in the copy of the incoming block of a PHI node to reload the incoming value, even though the necessary value has already been made available by the normally generated scalar loads. In this change, we drop the code that generates this redundant reload and instead just reuse the scalar value already available.

    Besides making the generated code slightly cleaner, this change also makes sure that scalar loads go through the normal logic, which means they can be remapped (e.g. to array slots) and corresponding code is generated to load from the remapped location. Without this change, the original scalar load at the beginning of the non-affine region would have been remapped, but the redundant scalar load would continue to load from the old PHI slot location.

    It might be possible to further simplify the code in addOperandToPHI, but this would not only mean to pull out getNewValue, but to also change the insertion point update logic. As this did not work when trying it the first time, this change is likely not trivial. To not introduce bugs last minute, we postpone further simplifications to a subsequent commit.

    We also document the current behavior a little bit better.

    Reviewed By: Meinersbur
    Differential Revision: https://reviews.llvm.org/D28892
    llvm-svn: 292486
* BlockGenerator: remove obfuscating const and const casts | Tobias Grosser | 2017-01-19 | 1 | -2/+2
    Making certain values 'const' just to cast the const away a little later mainly obfuscates the code. Hence, we just drop the 'const' parts.

    Suggested-by: Michael Kruse <llvm@meinersbur.de>
    llvm-svn: 292480
* Use range-based for loop [NFC] | Tobias Grosser | 2017-01-19 | 1 | -2/+2
    llvm-svn: 292471