bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Post-commit fix of a comment	Philip Pfaffe	2017-05-23	1	-1/+1
\| \| \| \|	llvm-svn: 303628
*	[Polly][NewPM] Port DependenceInfo to the new ScopPassManager.	Philip Pfaffe	2017-05-23	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch ports DependenceInfo to the new ScopPassManager. Printing is implemented as a separate printer pass. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D33421 llvm-svn: 303621
*	[ScopInfo] Translate foldAccessRelation to isl C++ [NFC]	Tobias Grosser	2017-05-23	1	-50/+47
\| \| \| \|	llvm-svn: 303615
*	[ScopInfo] Translate buildMemIntrinsicAccessRelation to isl C++ [NFC]	Tobias Grosser	2017-05-23	1	-16/+16
\| \| \| \|	llvm-svn: 303612
*	[ScopInfo] Translate assumeNoOutOfBound to isl C++ [NFC]	Tobias Grosser	2017-05-23	1	-26/+19
\| \| \| \|	llvm-svn: 303611
*	[ScopInfo] Translate applyAndSetFAD to isl C++	Tobias Grosser	2017-05-23	1	-10/+6
\| \| \| \|	llvm-svn: 303610
*	[ScopInfo] Translate isReadOnly to isl C++	Tobias Grosser	2017-05-23	1	-8/+4
\| \| \| \|	llvm-svn: 303608
*	[ScopInfo] Simplify domains early	Tobias Grosser	2017-05-23	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	This speeds up scop modeling for scops with many redundant existentially quantified constraints. For the attached test case, this change reduces scop modeling time from minutes (hours?) to 0.15 seconds. This change resolves a compilation timeout on the AOSP build. Thanks Eli for reporting _and_ reducing the test case! Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 303600
*	[CodeGen] Support partial write accesses.	Michael Kruse	2017-05-21	1	-9/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then executes the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517
*	[ScopInfo] Translate updateDimensionality to isl C++ [NFC]	Tobias Grosser	2017-05-21	1	-33/+29
\| \| \| \|	llvm-svn: 303514
*	[ScopInfo] Translate wrapConstantDimensions to isl C++ [NFC]	Tobias Grosser	2017-05-21	1	-26/+22
\| \| \| \|	llvm-svn: 303511
*	[ScopInfo] Translate addRangeBoundsToSet to isl C++ [NFC]	Tobias Grosser	2017-05-21	1	-22/+23
\| \| \| \|	llvm-svn: 303510
*	[Fortran Support] Materialize outermost dimension for Fortran array.	Siddharth Bhat	2017-05-19	1	-3/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- We use the outermost dimension of arrays since we need this information to generate GPU transfers. - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code. - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays. - This patch uses the Fortran array representation to generate code that computes the outermost dimension size. Differential Revision: https://reviews.llvm.org/D32967 llvm-svn: 303429
*	[ScopDetection] Allow detection of full functions	Tobias Grosser	2017-05-19	1	-2/+8
\| \| \| \| \| \|	This is useful when only analyzing functions. llvm-svn: 303420
*	[ScopInfo] Fix typo in documentation	Tobias Grosser	2017-05-19	1	-5/+6
\| \| \| \|	llvm-svn: 303405
*	[ScopInfo] Gracefully handle long compile times	Tobias Grosser	2017-05-19	1	-6/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following test case tried to compute the lexicographic minimum of the following set during alias analysis, which caused very long compile time: [p_0, p_1, p_2, p_3, p_4, p_5] -> { MemRef0[i0] : (517p_3 >= 70944 - 298p_2 and 256i0 >= -71199 + 298p_2 + 517p_3 and 256i0 <= -70944 + 298p_2 + 517p_3) or (409p_4 >= 57120 - 298p_2 and 256i0 >= -57375 + 298p_2 + 409p_4 and 256i0 <= -57120 + 298p_2 + 409p_4) or (104p_4 >= 17329 + 149p_2 - 50p_3 and 128i0 >= 17328 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17455 + 149p_2 - 50p_3 - 104p_4) or (104p_4 <= 17328 + 149p_2 - 50p_3 and 128i0 >= 17201 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17328 + 149p_2 - 50p_3 - 104p_4) or (409p_4 <= 57119 - 298p_2 and 256i0 >= -57120 + 298p_2 + 409p_4 and 256i0 <= -56865 + 298p_2 + 409p_4) or (517p_3 <= 70943 - 298p_2 and 256i0 >= -70944 + 298p_2 + 517p_3 and 256i0 <= -70689 + 298p_2 + 517p_3) or (p_1 >= 2 + 2p_0 and 298p_5 >= 70944 - 517p_3 and 256i0 >= -71199 + 517p_3 + 298p_5 and 256i0 <= -70944 + 517p_3 + 298p_5) or (p_1 >= 2 + 2p_0 and 298p_5 >= 57120 - 409p_4 and 256i0 >= -57375 + 409p_4 + 298p_5 and 256i0 <= -57120 + 409p_4 + 298p_5) or (p_1 >= 2 + 2p_0 and 149p_5 <= -17329 + 50p_3 + 104p_4 and 128i0 >= 17328 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17455 - 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 149p_5 >= -17328 + 50p_3 + 104p_4 and 128i0 >= 17201 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17328 - 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 298p_5 <= 57119 - 409p_4 and 256i0 >= -57120 + 409p_4 + 298p_5 and 256i0 <= -56865 + 409p_4 + 298p_5) or (p_1 >= 2 + 2p_0 and 298p_5 <= 70943 - 517p_3 and 256i0 >= -70944 + 517p_3 + 298p_5 and 256i0 <= -70689 + 517p_3 + 298p_5) } We now guard the potentially expensive functions in Polly's scop analysis to gracefully bail out in case of overly long compilation times. llvm-svn: 303404
*	[ScopInfo] Fix r302231 to use logical or (\|\|). NFC.	Michael Kruse	2017-05-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	In r302231 we mistakenly use bitwise or (\|) instead of logical or (\|\|). This patch fixes that. Contributed-by: Sameer AbuAsal <sabuasal@codeaurora.org> Differential Revision: https://reviews.llvm.org/D33337 llvm-svn: 303386
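The difference between the two operators shows up mainly in short-circuiting: `||` skips its right operand once the left one is true, while `|` always evaluates both. The following is a minimal sketch of that behaviour, not the code touched by the patch; `countingTrue`, `Calls`, and the two wrapper functions are hypothetical names used only for illustration.

```cpp
// Hypothetical helper: counts how often it is evaluated.
static int Calls = 0;
static bool countingTrue() {
  ++Calls;
  return true;
}

// Logical or: the right operand is skipped because the left one is true.
static int callsWithLogicalOr() {
  Calls = 0;
  bool Result = countingTrue() || countingTrue();
  (void)Result;
  return Calls;
}

// Bitwise or: both operands are always evaluated. The explicit int
// conversions only make the (here intentional) bitwise use obvious.
static int callsWithBitwiseOr() {
  Calls = 0;
  int Bits = int(countingTrue()) | int(countingTrue());
  (void)Bits;
  return Calls;
}
```

With these definitions, `callsWithLogicalOr()` evaluates the helper once and `callsWithBitwiseOr()` evaluates it twice.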
*	[Fortran Support] Change "global" pattern match to work for params	Siddharth Bhat	2017-05-18	1	-44/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Rename global / local naming convention that did not make much sense to Visible / Invisible, where the visible refers to whether the ALLOCATE call to the Fortran array is present in the current module or not. - This match now works on both cross fortran module globals and on parameters to functions since neither of them are necessarily allocated at the point of their usage. - Add testcase that matches against both a load and a store against function parameters. Differential Revision: https://reviews.llvm.org/D33190 llvm-svn: 303356
*	[Polly][NewPM][WIP] Add a ScopPassManager	Philip Pfaffe	2017-05-15	2	-1/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds both a ScopAnalysisManager and a ScopPassManager. The ScopAnalysisManager is itself a Function-Analysis, and manages analyses on Scops. The ScopPassManager takes care of building Scop pass pipelines. This patch is marked WIP because I've left two FIXMEs which I need to think about some more. Both of these deal with invalidation: Deferred invalidation is currently not implemented. Deferred invalidation deals with analyses which cache references to other analysis results. If these results are invalidated, invalidation needs to be propagated into the caching analyses. The ScopPassManager as implemented assumes that ScopPasses do not affect other Scops in any way. There has been some discussion about this on other patch threads, however it makes sense to reiterate this for this specific patch. I'm uploading this patch even though it's incomplete to encourage discussion and give you an impression of how this is going to work. Differential Revision: https://reviews.llvm.org/D33192 llvm-svn: 303062
*	[Polly][NewPM] Port ScopInfo to the new PassManager	Philip Pfaffe	2017-05-15	3	-20/+50
\| \| \| \|	llvm-svn: 303056
*	[Fortran Support] Add pattern match for Fortran Arrays that are parameters.	Siddharth Bhat	2017-05-15	2	-84/+67
\| \| \| \| \| \| \| \| \| \| \|	- This breaks the previous assumption that Fortran Arrays are `GlobalValue`. - The names of functions were getting unwieldy. So, I renamed the Fortran related functions. Differential Revision: https://reviews.llvm.org/D33075 llvm-svn: 303040
*	[Polly][NewPM] Port ScopDetection to the new PassManager	Philip Pfaffe	2017-05-12	3	-159/+214
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a proof of concept of how to port polly-passes to the new PassManager architecture. This approach works ootb for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: pollydev, sanjoy, nemanjai, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31459 llvm-svn: 302902
*	[DeLICM] Lookup input accesses.	Michael Kruse	2017-05-11	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previous to this patch, we used VirtualUse to determine the input access of an llvm::Value in a statement. The input access is the READ MemoryAccess that makes a value available in that statement, which can either be a READ of a MemoryKind::Value or the MemoryKind::PHI for a PHINode in the statement. DeLICM uses the input access to heuristically find a candidate to map without searching all possible values. This might modify the behaviour in that PHI accesses were previously not considered input accesses. This was unintentionally lost when "VirtualUse" was extracted from the "Known Knowledge" patch. llvm-svn: 302838
*	[ScopInfo] Keep scalar access dictionaries up-to-date. NFC.	Michael Kruse	2017-05-11	1	-0/+24
\| \| \| \| \| \| \| \| \| \|	When removing a MemoryAccess, also remove it from maps pointing to it. This was already done for InstructionToAccess, but not yet for ValueReads, ValueWrites and PHIWrites as those were only used during the ScopBuilder phase. Keeping them updated allows us to use them later as well. llvm-svn: 302836
*	[Fix] [Fortran Support] Fix variable name & make testcase activate on release	Siddharth Bhat	2017-05-10	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	There was: #ifdef NDEBUG This should be: #ifndef NDEBUG Also, the variable name was incorrect. Fixed the variable name. llvm-svn: 302696
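The two guards select opposite build modes: `NDEBUG` is conventionally defined for release (assertions-disabled) builds, so `#ifndef NDEBUG` keeps code only when assertions are enabled. A minimal sketch, assuming nothing about the patched file; `debugChecksCompiledIn` is a hypothetical name.

```cpp
// Hypothetical helper: reports which branch of the guard was compiled.
static bool debugChecksCompiledIn() {
#ifndef NDEBUG
  return true;  // NDEBUG is not defined: assertions-enabled build.
#else
  return false; // NDEBUG is defined: release-style build.
#endif
}
```

Swapping `#ifndef` for `#ifdef` inverts the result, which is the kind of mistake the commit above corrects.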
*	[Fortran Support] Detect Fortran arrays & metadata from dragonegg output	Siddharth Bhat	2017-05-10	2	-5/+269
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ability to tag certain memory accesses as those belonging to Fortran arrays. We do this by pattern matching against known patterns of Dragonegg's LLVM IR output from Fortran code. Fortran arrays have metadata stored with them in a struct. This struct is called the "Fortran array descriptor", and a reference to this is stored in each MemoryAccess. Differential Revision: https://reviews.llvm.org/D32639 llvm-svn: 302653
*	[Polly] Canonicalize arrays according to base-ptr equivalence class	Tobias Grosser	2017-05-10	2	-1/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We removed already all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636
*	[DeLICM] Known knowledge.	Michael Kruse	2017-05-06	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Extend the Knowledge class to store information about the contents of array elements and which values are written. Two knowledges do not conflict if the known content is the same. The content information is computed from writes to and loads from the array elements, and represented by "ValInst": isl spaces that compare equal if the value represented is the same. Differential Revision: https://reviews.llvm.org/D31247 llvm-svn: 302339
*	[ScopBuilder] Move Scop::init to ScopBuilder. NFC.	Michael Kruse	2017-05-05	2	-68/+57
\| \| \| \| \| \| \| \| \| \| \|	Scop::init is used only during SCoP construction. Therefore ScopBuilder seems the more appropriate place for it. We integrate it onto its only caller ScopBuilder::buildScop where some other construction steps already took place. Differential Revision: https://reviews.llvm.org/D32908 llvm-svn: 302276
*	[ScopBuilder] Do not verify unfeasible SCoPs.	Michael Kruse	2017-05-05	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SCoPs with unfeasible runtime context are thrown away and therefore do not need their uses verified. The added test case requires a complexity limit to be exceeded. Normally, error statements are removed from the SCoP and for that reason are skipped during the verification. If there is an unfeasible runtime context (here: because of the complexity limit being reached), the removal of error statements and other SCoP construction steps are skipped to not waste time. Error statements are not modeled in SCoPs and therefore have no requirements on whether the scalars used in them are available. llvm-svn: 302234
*	Fix handling of signWrappedSets in access relations	Tobias Grosser	2017-05-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since r294891, in MemoryAccess::computeBoundsOnAccessRelation(), we skip manually bounding the access relation in case the parameter of the load instruction is already a wrapped set. Later on we assume that the lower bound on the set is always smaller or equal to the upper bound on the set. Bug 32715 manages to construct a sign wrapped set, in which case the assertion does not necessarily hold. Fix this by handling a sign wrapped set similar to a normal wrapped set, that is skipping the computation. Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch> Reviewers: grosser Subscribers: pollydev, llvm-commits Tags: #Polly Differential Revision: https://reviews.llvm.org/D32893 llvm-svn: 302231
*	[ScopBuilder] Add missing semicolon after LLVM_FALLTHROUGH.	Michael Kruse	2017-05-04	1	-1/+1
\| \| \| \| \| \|	It was forgotten in r302157. llvm-svn: 302163
*	Introduce VirtualUse. NFC.	Michael Kruse	2017-05-04	1	-43/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a ScopStmt references a (scalar) value, there are multiple possibilities where this value can come from. The decision about what kind of use it is must be handled consistently at different places, which can be error-prone. VirtualUse is meant to centralize the handling of the different types of value uses. This patch makes ScopBuilder and CodeGeneration use VirtualUse. This already helps to show inconsistencies with the value handling. In order to keep this patch NFC, exceptions to the general rules are added. These might be fixed later if they turn out to be problems. Overall, this should result in fewer post-codegen IR-verification errors, but instead assertion failures in `getNewValue` that are closer to the actual error. Differential Revision: https://reviews.llvm.org/D32667 llvm-svn: 302157
*	[ScopDetection] Check for already known required-invariant loads [NFC]	Tobias Grosser	2017-05-04	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \|	For certain test cases we spent over 50% of the scop detection time in checking if a load is likely invariant. We can avoid most of these checks by testing early on if a load is expected to be invariant. Doing this reduces scop-detection time on a large benchmark from 52 seconds to just 25 seconds. No functional change is expected. llvm-svn: 302134
*	[ScopInfo] Do not use LLVM names to identify statements, arrays, and parameters	Tobias Grosser	2017-05-03	1	-20/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LLVM-IR names are commonly available in debug builds, but often not in release builds. Hence, using LLVM-IR names to identify statements or memory reference results makes the behavior of Polly depend on the compile mode. This is undesirable. Hence, we now just number the statements instead of using LLVM-IR names to identify them (this issue has previously been brought up by Zino Benaissa). However, as LLVM-IR names help in making test cases more readable, we add an option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by default set in the polly tests to make test cases more readable. This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the following test case provided by Eli Friedman <efriedma@codeaurora.org> (already used in one of the previous commits): struct X { int x; }; void a(); #define SIG (int x, X y, X z) typedef void (fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } For a larger benchmark I have on-hand (10000 loops), this reduces the time for running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%. The reason for this large speedup is that our previous use of printAsOperand had a quadratic cost, as for each printed and unnamed operand the full function was scanned to find the instruction number that identifies the operand. We do not need to adjust the way memory reference ids are constructed, as they do not use LLVM values. Reviewed by: efriedma Tags: #polly Differential Revision: https://reviews.llvm.org/D32789 llvm-svn: 302072
*	[ScopInfo] Remove code not needed anymore after r302004	Tobias Grosser	2017-05-03	2	-7/+3
\| \| \| \|	llvm-svn: 302005
*	[ScopInfo] Do not add array name into memory reference ids	Tobias Grosser	2017-05-03	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this change a memory reference identifier had the form: <STMT>_<ACCESSTYPE><ID>_<MEMREF>, e.g., Stmt_bb9_Write0_MemRef_tmp11 After this change, we use the format: <STMT>_<ACCESSTYPE><ID>, e.g., Stmt_bb9_Write0 The name of the array that is accessed through a memory reference is not necessary to uniquely identify a memory reference, but was only added to provide additional information for debugging. We drop this information now for the following two reasons: 1) This shortens the names and consequently improves readability 2) This removes a second location where we decide on the name of a scop array, leaving us only with the location where the actual scop array is created. Having after 2) only a single location to name scop arrays will allow us to change the naming convention of scop arrays more easily, which we will do in a future commit to reduce compilation time. llvm-svn: 302004
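The shortened identifier format can be sketched as a small string helper. This is an illustration of the `<STMT>_<ACCESSTYPE><ID>` layout described above, not Polly's actual implementation; `makeAccessId` and its parameters are hypothetical.

```cpp
#include <string>

// Hypothetical helper: builds an id of the form <STMT>_<ACCESSTYPE><ID>.
// The array name is deliberately not part of the result.
static std::string makeAccessId(const std::string &Stmt,
                                const std::string &AccessType, unsigned Id) {
  return Stmt + "_" + AccessType + std::to_string(Id);
}
```

For example, `makeAccessId("Stmt_bb9", "Write", 0)` yields `Stmt_bb9_Write0`, matching the example in the commit message.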
*	[ScopInfo] Consider only write-free dereferencable loads as invariant	Tobias Grosser	2017-04-27	1	-12/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	When we introduced in r297375 support for hoisting loads that are known to be dereferencable without any conditional guard, we forgot to keep the check to verify that no other write into the very same location exists. This change ensures now that dereferencable loads are allowed to access everything, but can only be hoisted in case no conflicting write exists. This resolves llvm.org/PR32778 Reported-by: Huihui Zhang <huihuiz@codeaurora.org> llvm-svn: 301582
*	[Polly] [DependenceInfo] change WAR generation, Read will not block Read	Siddharth Bhat	2017-04-24	1	-27/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Earlier, the call to buildFlow was: WAR = buildFlow(Write, Read, MustWrite, Schedule). This meant that Read could block another Read, since must-sources can block each other. Fixed the call to buildFlow to correctly compute WAR. The resulting code needs to do some ISL juggling to get the output we want. Bug report: https://bugs.llvm.org/show_bug.cgi?id=32623 Reviewers: Meinersbur Tags: #polly Differential Revision: https://reviews.llvm.org/D32011 llvm-svn: 301266
*	Exploit BasicBlock::getModule to shorten code	Tobias Grosser	2017-04-11	1	-2/+1
\| \| \| \| \|	Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914
*	[Polly] [DependenceInfo] change WAR, WAW generation to correct semantics	Siddharth Bhat	2017-04-04	1	-27/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	= Change of WAR, WAW generation: = - `buildFlow(Sink, MustSource, MaySource, Sink)` treates any flow of the form `sink <- may source <- must source` as a may dependence. - we used to call: ```lang=cpp, name=old-flow-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This caused some WAW dependences to be treated as WAR dependences. - Incorrect semantics. - Now, we call WAR and WAW correctly. == Correct WAW: == ```lang=cpp, name=new-waw-call.cpp Flow = buildFlow(Write, MustWrite, MayWrite, Schedule); WAW = isl_union_flow_get_may_dependence(Flow); isl_union_flow_free(Flow); ``` == Correct WAR: == ```lang=cpp, name=new-war-call.cpp Flow = buildFlow(Write, Read, MustaWrite, Schedule); WAR = isl_union_flow_get_must_dependence(Flow); isl_union_flow_free(Flow); ``` - We want the "shortest" WAR possible (exact dependences). - We mark all the must-writes as may-source, reads as must-souce. - Then, we ask for must dependence. - This removes all the reads that flow through a must-write before reaching a sink. - Note that we only block ealier writes with must-writes. This is intuitively correct, as we do not want may-writes to block must-writes. - Leaves us with direct (R -> W). - This affects reduction generation since RED is built using WAW and WAR. 
= New StrictWAW for Reductions: = - We used to call: ```lang=cpp,name=old-waw-war-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This is the right model of WAW we need for reductions, just not in general. - Reductions need to track only strict WAW, without any interfering reductions. = Explanation: Why the new WAR dependences in tests are correct: = - We no longer set WAR = WAR - WAW - Hence, we will have WAR dependences that were originally removed. - These may look incorrect, but in fact make sense. == Code: == ```lang=llvm, name=new-war-dependence.ll ; void manyreductions(long A) { ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S0: A += 42; ; ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S1: A += 42; ; ``` === WAR dependence: === { S0[1023, 1023] -> S1[0, 0] } - Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences: ```lang=cpp, name=dependence-incorrect, counterexample S0[1023, 1023]: -- tmp = A (load0)-- WAR 2 add = tmp + 42 \| -> A = add (store0) \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = tmp + 42 \| A = add (store1)<- ``` - One may assume that WAR2 hides WAR1 (since store0 happens before store1). However, within a statement, Polly has no idea about the ordering of loads and stores. - Hence, according to Polly, the code may have looked like this: ```lang=cpp, name=dependence-correct S0[1023, 1023]: A = add (store0) tmp = A (load0) ---* add = A + 42 \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = A + 42 \| A = add (store1) <-* ``` - So, Polly generates (correct) WAR dependences. It does not make sense to remove these dependences, since they are correct with respect to Polly's model. Reviewers: grosser, Meinersbur tags: #polly Differential revision: https://reviews.llvm.org/D31386 llvm-svn: 299429
*	[ScopInfo] Fix typos in option description.	Michael Kruse	2017-04-03	1	-2/+2
\| \| \| \|	llvm-svn: 299356
*	revert test commit r299024	Huihui Zhang	2017-03-29	1	-1/+0
\| \| \| \|	llvm-svn: 299026
*	test commit, add blank line	Huihui Zhang	2017-03-29	1	-0/+1
\| \| \| \|	llvm-svn: 299024
*	[DependenceInfo] change name Write to MustWrite to remove ambiguity [NFC]	Siddharth Bhat	2017-03-21	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"Write" is an overloaded term. In collectInfo() till buildFlow(), it is used to mean "must writes". However, within the memory based analysis, it is used to mean "both may and must writes". Renaming the Write variable helps clarify this difference. Reviewers: grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D31181 llvm-svn: 298361
*	Revert "Remove references to AssumptionCache. NFC."	Michael Kruse	2017-03-17	2	-73/+74
\| \| \| \| \| \| \| \| \| \|	The AssumptionCache removal of r289756 has been reverted in r290086/r290087. A different solution has been implemented in r291671 which keeps the AssumptionCache. We can therefore use it again in Polly. This reverts r289791. llvm-svn: 298089
*	[DependenceInfo] Remove idempotent union: must-writes with may-writes [NFC]	Siddharth Bhat	2017-03-17	1	-2/+1
\| \| \| \| \| \| \|	Since may-writes are always a superset of the must-writes, there is no point in taking a union of one with the other. llvm-svn: 298085
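The underlying set identity is that the union of a set with one of its subsets is the set itself. A minimal sketch using `std::set<int>` as a stand-in for the isl access sets; `unite` is a hypothetical helper and the integer elements are placeholders for memory accesses.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical helper: returns the union of two ordered sets.
static std::set<int> unite(const std::set<int> &A, const std::set<int> &B) {
  std::set<int> Out;
  std::set_union(A.begin(), A.end(), B.begin(), B.end(),
                 std::inserter(Out, Out.begin()));
  return Out;
}
```

If `MustWrites` is a subset of `MayWrites`, then `unite(MayWrites, MustWrites)` compares equal to `MayWrites`, so the union can be dropped.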
*	[ScopInfo/PruneUnprofitable] Move default profitability check.	Michael Kruse	2017-03-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, ScopInfo applied the profitability heuristic for scalar accesses by default (-polly-unprofitable-scalar-accs=true) and the -polly-prune-unprofitable pass was disabled by default (-polly-enable-prune-unprofitable=false) as that pruning was already done. This change switches the defaults to -polly-unprofitable-scalar-accs=false -polly-enable-prune-unprofitable=true such that the scalar access heuristic check is done by the pass. This allows passes between ScopInfo and PruneUnprofitable to optimize away scalar accesses. Without enabling such intermediate passes, there is no change in behaviour of profitability checks in a PassManagerBuilder built pass chain, but it allows us to cover this configuration with the buildbots. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298081
*	[PruneUnprofitable] Add -polly-prune-unprofitable pass.	Michael Kruse	2017-03-17	2	-5/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ScopInfo's normal profitability heuristic considers SCoPs where all statements have scalar writes as not profitably optimizable and invalidates the SCoP in that case. However, -polly-delicm and -polly-simplify may be able to remove some of the scalar writes, so the flag -polly-unprofitable-scalar-accs=false allows disabling that part of the heuristic. In cases where DeLICM (or other passes after ScopInfo) are not successful in removing scalar writes, the SCoP is still not profitably optimizable. The schedule optimizer would again try computing another schedule, resulting in slower compilation. The -polly-prune-unprofitable pass applies the profitability heuristic again before the schedule optimizer, so Polly can still bail out even with -polly-unprofitable-scalar-accs=false. Differential Revision: https://reviews.llvm.org/D31033 llvm-svn: 298080
*	[ScopInfo] Add option to not add parameter bounds to context [NFC]	Tobias Grosser	2017-03-17	1	-0/+9
\| \| \| \| \| \| \| \|	For experiments it is sometimes helpful to provide parameter bound information to polly and to not use these parameter bounds for simplification. Add a new option "-polly-ignore-parameter-bounds" which does precisely this. llvm-svn: 298077