"frame-pointer"="none" as cleanups after D56351
|
compiler identification lines in test-cases.
(Doing so only because it's then easier to search for references which
are actually important and need fixing.)
llvm-svn: 351200
|
Summary:
When checking the parallelism of a scheduling dimension, we first check whether
the loop is parallel once reduction dependences are excluded.
If the loop is not parallel, we need to return the minimal dependence distance
over all data dependences, including the previously subtracted reduction
dependences.
Reviewers: grosser, Meinersbur, efriedma, eli.friedman, jdoerfert, bollu
Reviewed By: Meinersbur
Subscribers: llvm-commits, pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D45236
llvm-svn: 329214
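As an illustration (constructed for this note, not part of the original commit),
consider a dimension that carries both a reduction dependence and an ordinary
flow dependence:

  /* Sketch only: the i-loop carries a reduction dependence on *sum and a
     flow dependence through A. Excluding the reduction dependence alone does
     not make the loop parallel, so the reported minimal dependence distance
     (here 1) must also take the reduction dependence into account. */
  void accumulate(int n, float *A, float *sum) {
    for (int i = 1; i < n; i++) {
      *sum += A[i];      /* reduction dependence, distance 1 */
      A[i] += A[i - 1];  /* flow dependence, distance 1 */
    }
  }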
|
Splitting basic blocks into multiple statements if there are no additional
scalar dependencies gives more freedom to the scheduler, but more statements
also mean higher compile-time complexity. Switch to the finer statement
granularity; the additional compile time should be limited by the
number-of-operations quota.
The regression tests are written for the -polly-stmt-granularity=bb
setting, therefore we add that flag to those tests that break with the
new default. Some of the tests only fail because the statements are
named differently, as a basic block now results in multiple statements
that are later removed again when statements without side-effects are
simplified. Previous commits tried to reduce this effect, but it is
not completely avoidable.
Differential Revision: https://reviews.llvm.org/D42151
llvm-svn: 324169
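A minimal sketch (hypothetical code and statement names, not from the patch)
of a basic block that the finer granularity may split into two statements:

  for (int i = 0; i < n; i++) {
    A[i] = B[i] + 1;   /* could become its own statement, e.g. Stmt_body_a */
    C[i] = D[i] * 2;   /* could become its own statement, e.g. Stmt_body_b */
  }

The two stores do not exchange any scalar values, so modeling them as separate
statements introduces no additional scalar dependencies while giving the
scheduler the freedom to schedule them independently.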
|
In certain situations, the context in the isl_ast_build could cause the
min/max locations of our alias sets to become empty, which would trigger an
internal error in isl, as it is then unable to derive a value for these
expressions. Check this condition before generating code for these expressions
and instead assume that the alias check succeeded. This is valid, as the
corresponding memory accesses will not be executed under any valid context.
This fixes llvm.org/PR34432. Thanks to Qirun Zhang for reporting.
llvm-svn: 312455
|
This possibly helps to avoid run-time check failures in the COSMO kernels.
llvm-svn: 311920
|
This simplifies the test cases.
llvm-svn: 307645
|
When providing the option "-polly-ast-print-accesses" Polly also prints the
memory accesses that are generated:
  #pragma known-parallel
  for (int c0 = 0; c0 <= 1023; c0 += 4)
    #pragma simd
    for (int c1 = c0; c1 <= c0 + 3; c1 += 1)
      Stmt_for_body(
        /* read */ &MemRef_B[0]
        /* write */ MemRef_A[c1]
        );
This makes writing and debugging memory layout transformations easier.
Based on a patch contributed by Thomas Lang (ETH Zurich)
llvm-svn: 307579
|
Multi-disjunct access maps can easily result in inbound assumptions which
explode in case of many memory accesses and many parameters. This change reduces
compilation time of some larger kernel from over 15 minutes to less than 16
seconds.
An interesting test case is test/ScopInfo/multidim_param_in_subscript.ll,
which has a memory access

  [n] -> { Stmt_for_body3[i0, i1] -> MemRef_A[i0, -1 + n - i1] }

which requires folding, but where only a single disjunct remains. We can still
model this test case even when only using limited memory folding.
For people only reading commit messages, here is the comment that explains what
memory folding is:

  To recover memory accesses with array size parameters in the subscript
  expression we post-process the delinearization results.

  We would normally recover from an access A[exp0(i) * N + exp1(i)] into an
  array A[][N] the 2D access A[exp0(i)][exp1(i)]. However, another valid
  delinearization is A[exp0(i) - 1][exp1(i) + N] which - depending on the
  range of exp1(i) - may be preferable. Specifically, for cases where we
  know exp1(i) is negative, we want to choose the latter expression.

  As we commonly do not have any information about the range of exp1(i),
  we do not choose one of the two options, but instead create a piecewise
  access function that adds the (-1, N) offsets as soon as exp1(i) becomes
  negative. For a 2D array such an access function is created by applying
  the piecewise map:

    [i,j] -> [i, j]     : j >= 0
    [i,j] -> [i-1, j+N] : j < 0

After this patch we generate only the first case, except for situations where
we can prove the first case to be invalid and can consequently select the
second without introducing disjuncts.
llvm-svn: 296679
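A small worked note (my own reasoning, assuming the statement domain bounds i1
to 0 <= i1 < n as in the test): the second subscript -1 + n - i1 then lies in
the range 0 .. n - 1 and is provably non-negative, so the offset-free
delinearization alone is valid and no second disjunct is needed.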
|
Without this simplification for a loop nest:

  void foo(long n1_a, long n1_b, long n1_c, long n1_d,
           long p1_b, long p1_c, long p1_d,
           float A_1[][p1_b][p1_c][p1_d]) {
    for (long i = 0; i < n1_a; i++)
      for (long j = 0; j < n1_b; j++)
        for (long k = 0; k < n1_c; k++)
          for (long l = 0; l < n1_d; l++)
            A_1[i][j][k][l] += i + j + k + l;
  }

the assumption:

  n1_a <= 0 or (n1_a > 0 and n1_b <= 0) or
  (n1_a > 0 and n1_b > 0 and n1_c <= 0) or
  (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d <= 0) or
  (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d > 0 and
   p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d)

is taken rather than the simpler assumption:

  p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d.

The former is less strict, as it allows arbitrary values of p1_* in case the
loop is not executed at all. However, in practice these precise constraints
explode when combined across different accesses and loops. For now it seems
to make more sense to take less precise, but more scalable constraints by
default. In case we find a practical example where more precise constraints
are needed, we can think about allowing such precise constraints in specific
situations where they help.
This change speeds up the new test case from taking very long (we waited at
least a minute, but it probably takes a lot more) to below a second.
llvm-svn: 296456
|
This update includes a couple more coalescing changes as well as a large
number of isl-internal code cleanups (dead assignments, ...).
llvm-svn: 295419
|
Before this change we used the name of the base pointer to mark reductions. This
is imprecise, as the canonical reference is the ScopArray itself and not the
base pointer of a reduction. Using the base pointer of reductions is problematic
in cases where a single ScopArray is referenced through two different base
pointers.
This change removes unnecessary uses of MemoryAddress::getBaseAddr() in
preparation for https://reviews.llvm.org/D28518.
llvm-svn: 294568
|
Providing the context to the ast generator allows for additional simplifications
and -- more importantly -- allows us to generate loops with only partially bounded
domains, assuming the domains are bounded for all parameter configurations
that are valid as defined by the context.
This change fixes the crash reported in http://llvm.org/PR30956
The original reason why we did not include the context when generating an
AST was that CLooG and later isl used to sometimes transfer some of the
constraints that bound the size of parameters from the context into the
generated AST. This resulted in operations with very large constants, which
sometimes introduced problematic integer overflows. The latest versions of
the isl AST generator are careful to not introduce such constants.
Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286442
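A constructed illustration (not from the commit): for the domain
[n, b] -> { Stmt[i] : i >= 0 and (i < n or b < 0) }, the instance set is
unbounded whenever b < 0; passing the context [n, b] -> { : b >= 0 } tells the
AST generator that this case cannot occur, so it can still emit the bounded
loop for (int c0 = 0; c0 < n; c0 += 1).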
|
llvm-svn: 278673
|
Adding a new pass PolyhedralInfo. This pass will be the interface to Polly.
Initially, we will provide the following interface:
  - #IsParallel(Loop *L) - returns a bool indicating whether the loop is
    parallel for the given program order.
Patch by Utpal Bora <cs14mtech11017@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D21486
llvm-svn: 276637
|
With this update the isl AST generation extracts disjunctive constraints early
on. As a result, code that previously resulted in two branches with (close-to)
identical code within them:
  if (P <= -1) {
    for (int c0 = 0; c0 < N; c0 += 1)
      Stmt_store(c0);
  } else if (P >= 1)
    for (int c0 = 0; c0 < N; c0 += 1)
      Stmt_store(c0);

results now in only a single branch body:

  if (P <= -1 || P >= 1)
    for (int c0 = 0; c0 < N; c0 += 1)
      Stmt_store(c0);
This resolves http://llvm.org/PR27559
Besides the above change, this isl update brings better simplification of
sets/maps containing existentially quantified dimensions and fixes a bug in
isl's coalescing.
llvm-svn: 272500
|
As these test cases will be changed in a subsequent commit, we expand and
tighten them to make the subsequent changes to them more obvious. As part of
this we add more context to some test cases and add CHECK-NEXT lines to ensure
no intermediate lines are missed by accident.
llvm-svn: 272499
|
Min/max expressions are easier to read and can in some cases also result in
more concise generated IR, as the min/max -- when lowered to a cmp+select
pattern -- commonly has a simpler condition than the ternary condition isl
would normally generate.
llvm-svn: 268855
|
Utilizing the record option for assumptions we can simplify the wrapping
assumption generation a lot. Additionally, we can now report locations
together with wrapping assumptions, though they might not be accurate yet.
llvm-svn: 266069
|
Just an import to keep up with the latest version of isl. We are not looking
for specific features.
llvm-svn: 264452
|
Delinearization is now enabled by default and no longer needs to be explicitly
enabled in our tests.
llvm-svn: 264154
|
In order to speed up compile time and to avoid random timeouts we now
separately track assumptions and restrictions. In this context,
assumptions describe parameter valuations we need and restrictions
describe parameter valuations we do not allow. During AST generation
we create a runtime check for both, whereas the one for the
restrictions is negated before the conjunction is built.
Except for the in-bounds assumptions, we currently only track restrictions.
Differential Revision: http://reviews.llvm.org/D17247
llvm-svn: 262328
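A minimal sketch of how the two run-time checks are combined (pseudo-code with
invented names, not the actual Polly API):

  /* The optimized version may only run if the assumptions hold and none of
     the restricted parameter valuations occurs. */
  run_optimized = assumptions_hold(params) && !restrictions_hold(params);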
|
From now on we bail out only if a non-trivial alias group contains a non-affine
access, not as soon as we discover aliasing while non-affine accesses are allowed.
llvm-svn: 261863
|
This has been the default for a long time. Setting it again does not add value
in any of these test cases.
llvm-svn: 253800
|
In case the original parameter instruction does not have a name, but it comes
from a load instruction where the base pointer has a name, we used the name of
the load instruction to give some more intuition about where the parameter came
from. To ensure this works also through GEPs, which may have complex offsets,
we originally just dropped the offsets and _only_ used the base pointer name.
As this can result in multiple parameters getting the same name, we now prefix
the parameter ID to ensure parameter names are unique. This will make it easier
to understand debug output.
This change does not affect correctness, as parameter IDs (even of the same
name) can always be distinguished through the SCEV pointer stored inside them.
llvm-svn: 253330
|
If a SCoP contains error blocks we cannot use the domain constraints
to simplify the assumptions as the domain is already influenced by the
assumptions we took. Before this patch we did that and some assumptions
became self-fulfilling as they were implied by the domain constraints.
llvm-svn: 252424
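As an illustration (my own example, not from the commit): if the assumption
p >= 1 was already used to restrict a statement's domain, then simplifying that
assumption against the domain makes it look trivially satisfied, even though it
still has to be verified by the run-time check.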
|
These flags are now always passed to all tests and need to be disabled if
not needed. Disabling these flags, rather than passing them to almost all
tests, significantly simplifies our RUN: lines.
llvm-svn: 249422
|
If the GEP instructions give us enough insights, model scalar accesses as
multi-dimensional (and generate the relevant run-time checks to ensure
correctness). This will allow us to simplify the dependence computation in
a subsequent commit.
llvm-svn: 247906
|
This will allow us to generate non-wrap assumptions for integer expressions
that are part of the SCoP. We compare the common isl representation of
the expression with one computed with modulo semantics. For all parameter
combinations for which they are not equal we can have integer overflows.
The nsw flags are respected when the modulo representation is computed,
nuw and nw flags are ignored for now.
In order not to increase compile time too much, the non-wrap assumptions
are collected in a separate boundary context instead of the assumed
context. This helps compile time as the boundary context can become
complex and it is therefore not advised to use it in other operations
except runtime check generation. However, the assumed context is e.g.,
used to tighten dependences. While the boundary context might help to
tighten the assumed context, it is doubtful that it will help in practice
(it does not affect LNT much) as the boundary (or no-wrap) assumptions
only restrict the very end of the possible value range of parameters.
PET uses a different approach to compute the no-wrap context, though LNT runs
have shown that this version performs slightly better for us.
llvm-svn: 247732
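A small worked example (chosen for illustration, not from the patch): for a
signed 8-bit parameter p, the expression p + 1 evaluated with modulo semantics
is ((p + 129) mod 256) - 128, which differs from the mathematical value p + 1
exactly for p = 127; the recorded non-wrap assumption is therefore p <= 126.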
|
As we do not rely on ScalarEvolution any more, we do not need to get
the backedge taken count. Additionally, our domain generation handles
everything that is affine and has one latch, and our ScopDetection will
over-approximate everything else.
This change will therefore allow loops with:
  - one latch
  - exiting conditions that are affine
Additionally, it will not check for structured control flow anymore.
Hence, loops and conditionals are not necessarily single entry single
exit regions any more.
Differential Revision: http://reviews.llvm.org/D12758
llvm-svn: 247289
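An illustrative example (constructed for this note, not from the commit) of a
loop shape that can now be handled: a single latch with two exiting blocks
whose conditions are both affine:

  for (long i = 0; i < n; i++) {
    if (i >= m)
      break;        /* second exit, also with an affine condition */
    A[i] = i;
  }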
|
llvm-svn: 247198
|
If a region does not have more than one loop, we do not identify it as
a Scop in ScopDetection. The main optimizations Polly is currently performing
(tiling, preparation for outer-loop vectorization and loop fusion) are unlikely
to have a positive impact on individual loops. In some cases, Polly's run-time
alias checks or conditional hoisting may still have a positive impact, but those
are mostly enabling transformations which LLVM already performs for individual
loops. As we do not focus on individual loops, we leave them untouched to not
introduce compile time regressions and execution time noise. This results in
good compile time reduction (oourafft: -73.99%, smg2000: -56.25%).
Contributed-by: Pratik Bhatu <cs12b1010@iith.ac.in>
Reviewers: grosser
Differential Revision: http://reviews.llvm.org/D12268
llvm-svn: 246161
|
Instead of generating code for an empty assumed context we bail out
early. As the number of assumptions we generate increases this becomes
more and more important. Additionally, this change will allow us to
hide internal contexts that are only used in runtime checks, e.g., a
boundary context with constraints not suited for simplifications.
llvm-svn: 245540
|
As specified in PR23888, run-time alias check generation is expensive
in terms of compile-time. This reduces the compile time by computing
the minimal/maximal accesses only once for each base pointer.
Contributed-by: Pratik Bhatu <cs12b1010@iith.ac.in>
llvm-svn: 243024
|
Instead of flat schedules, we now use so-called schedule trees to represent the
execution order of the statements in a SCoP. Schedule trees make it a lot easier
to analyze, understand and modify properties of a schedule, as specific nodes
in the tree can be chosen and possibly replaced.
This patch does not yet fully move our DependenceInfo pass to schedule trees,
as some additional performance analysis is needed here. (In general schedule
trees should be faster in compile-time, as the more structured representation
is generally easier to analyze and work with.) We also cannot yet perform the
reduction analysis on schedule trees.
For more information regarding schedule trees, please see Section 6 of
https://lirias.kuleuven.be/handle/123456789/497238
llvm-svn: 242130
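A small illustration (not from the commit): the flat schedule
{ Stmt_S1[i] -> [i, 0]; Stmt_S2[i] -> [i, 1] } corresponds to a schedule tree
consisting of a band node that schedules the shared i dimension and, below it,
a sequence node whose two children filter Stmt_S1 and Stmt_S2; the ordering
between the statements is expressed by the tree structure instead of an extra
constant schedule dimension.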
|
I just learned that target triples prevent test cases from being run on other
architectures. Polly's test cases have so far been sufficiently target independent
to not require any target triples. Hence, we drop them.
llvm-svn: 235384
|
This allows us to delinearize code such as:

  A[][n]
  for (i ...)
    for (j ...)
      A[i][n-j-1] = ...

which would previously have been delinearized to an access A[i+1][-j-1].
To recover the correct access we apply the piecewise expression:

  { A[i][j] -> A[i][j] : j >= 0; A[i][j] -> A[i-1][j+N] : j < 0 }

This approach generalizes to higher dimensions.
llvm-svn: 233566
|
llvm-svn: 230796
|
llvm-svn: 230784
|
isl recently introduced a new interface to create run-time checks from
constraint sets. Use this interface to simplify our run-time check generation.
llvm-svn: 230640
|
Scops that only read seem generally uninteresting and scops that only write are
most likely initializations where there is also little to optimize. To not
waste compile time we bail early.
Differential Revision: http://reviews.llvm.org/D7735
llvm-svn: 229820
|
This commit imports the latest isl version into lib/External/isl. The changes
relevant for Polly are:
1) Schedule trees [1] have been introduced as a more structured way to
describe schedules. Polly does not yet use them, but we may switch to them
in the near future.
2) Another set of coalescing changes [2] simplifies some data dependences and
removes a couple of code generation artifacts.
We now understand that the following sets can be merged:

  { Stmt_S1[i0, i1] -> Stmt_S2[i0 + i1] :
      i0 >= 0 and i1 <= 1023 - i0 and i1 >= 1;
    Stmt_S1[i0, 0] -> Stmt_S2[i0] : i0 <= 1023 and i0 >= 1 }

into:

  { Stmt_S1[i0, i1] -> Stmt_S2[i0 + i1] : i1 <= 1023 - i0 and i1 >= 0 and
      i1 >= 1 - i0 and i0 >= 0 }

Changes of this kind reduce unnecessary specialization during code
generation.
- for (int c3 = 0; c3 <= 1023; c3 += 1) {
- if (c3 % 2 == 0) {
- Stmt_for_body3(c1, c3);
- } else
- Stmt_for_body3(c1, c3);
- }
+ for (int c3 = 0; c3 <= 1023; c3 += 1)
+ Stmt_for_body3(c1, c3);
[1] http://impact.gforge.inria.fr/impact2014/papers/impact2014-verdoolaege.pdf
[2] http://impact.gforge.inria.fr/impact2015/papers/impact2015-verdoolaege.pdf
llvm-svn: 229423
|
This allows us to skip ast and code generation if we did not optimize
a SCoP and will not generate parallel or alias annotations. The
initial heuristic to exit is simple but allows improvements later on.
All failing test cases have been modified to disable the early exit and
thus keep their coverage.
Differential Revision: http://reviews.llvm.org/D7254
llvm-svn: 228851
|
llvm-svn: 227801
|
Schedule dimensions that have the same constant value across all statements do
not carry any information but, due to the increased dimensionality of the
schedule, cost compile time. To not pay this cost, we remove such constant
dimensions if possible.
llvm-svn: 225067
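For example (constructed for this note): in the schedule
{ Stmt_S1[i] -> [0, i]; Stmt_S2[i] -> [0, i] } the leading dimension is the
constant 0 for every statement instance, so it can be dropped to obtain
{ Stmt_S1[i] -> [i]; Stmt_S2[i] -> [i] } without changing the execution order.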
|
Update tests for LLVM assembly format change in r224257 using the script
attached to PR21532. I'm hoping this unsticks the bot [1].
[1]: http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/25432
llvm-svn: 224269
|
Isl now specifically marks modulo operations that are compared against zero.
They can be implemented with the C/LLVM remainder operation.
We also update a couple of test cases where the output of isl has slightly
changed.
llvm-svn: 223607
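The reason this is safe (my own summary): for a positive divisor the C remainder
operation truncates towards zero, so for negative operands it can differ from
the mathematical modulo (e.g. -3 % 2 is -1 while -3 mod 2 is 1), but both are
zero for exactly the same operands, and hence a comparison against zero can be
lowered to the cheap remainder operation.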
|
In case a GEP instruction references into a fixed size array, e.g., an access
A[i][j] into an array A[100x100], LLVM-IR does not guarantee that the subscripts
always compute values that are within array bounds. We now derive the set of
parameter values for which all accesses are within bounds and add the assumption
that the scop is only ever executed with this set of parameter values.
Example:

  void foo(float A[][20], long n, long m) {
    for (long i = 0; i < n; i++)
      for (long j = 0; j < m; j++)
        A[i][j] = ...
  }

This loop yields out-of-bound accesses if m is larger than 20 and at the same
time at least one iteration of the outer loop is executed. Hence, we assume:

  n <= 0 or m <= 20.
Doing so simplifies the dependence analysis problem, allows us to perform
more optimizations and generate better code.
TODO: The location where the GEP instruction is executed is not necessarily the
location where the memory is actually accessed. As a result scanning for GEP[s]
is imprecise. Even though this is not a correctness problem, this imprecision
may result in missed optimizations or non-optimal run-time checks.
In polybench where this mismatch between parametric loop bounds and fixed size
arrays is common, we see with this patch significant reductions in compile time
(up to 50%) and execution time (up to 70%). We see two significant compile time
regressions (fdtd-2d, jacobi-2d-imper), and one execution time regression
(trmm). Both regressions arise due to additional optimizations that have been
enabled by this patch. They can be addressed in subsequent commits.
http://reviews.llvm.org/D6369
llvm-svn: 222754
|
Instead of parallelizing every parallel outermost loop, we now use a very
minimalistic cost model. Specifically, we assume innermost loops are not
worth parallelizing and all non-innermost loops are.
When parallelizing all loops in LNT we got several slowdowns/timeouts due to
us parallelizing innermost loops that are executed only a couple of times
(number of iterations not known statically). With this basic heuristic enabled
LNT does not show any more timeouts, while several interesting loops are still
parallelized.
There are many ways to obtain an improved heuristic. Constructing such an
improved heuristic from a position of minimal slow-down and zero code size
increase seems to be the best, as it allows us to track progress on LNT.
llvm-svn: 222096
|
We introduce a new flag -polly-parallel and use it to annotate the for-nodes in
the isl ast that we want to execute thread parallel (e.g., using OpenMP). We
previously already emitted OpenMP annotations, but we did this for various
kinds of parallel loops, including some which we cannot run in parallel.
With this patch we now have three annotations:
  1) #pragma known-parallel [reduction]
  2) #pragma omp for
  3) #pragma simd
meaning:
  1) loop has no loop carried dependences
  2) loop will be executed thread-parallel
  3) loop can possibly be vectorized
This patch introduces 1) and reduces the use of 2) to only the cases where we
will actually generate thread parallel code.
It is in preparation for OpenMP code generation in our isl backend.
Legacy:
- We also have a command line option -enable-polly-openmp. This option controls
  the OpenMP code generation in CLooG. It will become an alias of
  -polly-parallel after the CLooG code generation has been dropped.
http://reviews.llvm.org/D6142
llvm-svn: 221479
|