bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Migrate function attribute "no-frame-pointer-elim"="false" to ↵	Fangrui Song	2019-12-24	2	-2/+2
\| \| \| \|	"frame-pointer"="none" as cleanups after D56351
*	[CodeGen] Handle outlining of CopyStmts.	Michael Kruse	2019-09-17	1	-0/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the removal of extensions nodes from schedule trees in r362257 it is possible to emit parallel code for SCoPs containing matrix-multiplications. However, the code looking for references used in outlined statement was not prepared to handle CopyStmts introduced by the matrix-matrix multiplication detection. In this case, CopyStmts do not introduce references in addition to the ones captured by MemoryAccesses, i.e. we change the assertion to accept CopyStmts and add a regression test for this case. This fixes llvm.org/PR43164 llvm-svn: 372188
*	[CodeGen] LLVM OpenMP Backend.	Michael Kruse	2019-03-19	3	-8/+134
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ParallelLoopGenerator class is changed such that GNU OpenMP specific code was removed, allowing to use it as super class in a template-pattern. Therefore, the code has been reorganized and one may not use the ParallelLoopGenerator directly anymore, instead specific implementations have to be provided. These implementations contain the library-specific code. As such, the "GOMP" (code completely taken from the existing backend) and "KMP" variant were created. For "check-polly" all tests that involved "GOMP": equivalents were added that test the new functionalities, like static scheduling and different chunk sizes. "docs/UsingPollyWithClang.rst" shows how the alternative backend may be used. Patch by Michael Halkenhäuser <michaelhalk@web.de> Differential Revision: https://reviews.llvm.org/D59100 llvm-svn: 356434
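The template-pattern split described in this message can be pictured with a small model. This is only a sketch: the class names, method names, and returned strings below are hypothetical placeholders, not Polly's C++ classes or real OpenMP runtime calls. It shows a base class that fixes the order of steps while each backend supplies the library-specific parts.

```python
from abc import ABC, abstractmethod


class ParallelLoopGeneratorSketch(ABC):
    """Hypothetical stand-in for the shared super class."""

    def create_parallel_loop(self, lower, upper, stride):
        # Library-independent skeleton: the order of the steps is fixed here.
        steps = [f"outline loop body for [{lower}, {upper}) step {stride}"]
        steps.append(self.emit_launch())
        steps.append(self.emit_join())
        return steps

    @abstractmethod
    def emit_launch(self):
        """Library-specific code that starts the worker threads."""

    @abstractmethod
    def emit_join(self):
        """Library-specific code that waits for the worker threads."""


class GompBackendSketch(ParallelLoopGeneratorSketch):
    # Placeholder strings; a real backend would emit runtime calls instead.
    def emit_launch(self):
        return "gomp: launch threads"

    def emit_join(self):
        return "gomp: join threads"


class KmpBackendSketch(ParallelLoopGeneratorSketch):
    def emit_launch(self):
        return "kmp: launch threads"

    def emit_join(self):
        return "kmp: join threads"


for backend in (GompBackendSketch(), KmpBackendSketch()):
    print(backend.create_parallel_loop(0, 8, 1))
```

Because the base class is abstract, instantiating it directly fails, which mirrors the statement that the generator may no longer be used without a specific implementation.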
*	[ScopBuilder] Make -polly-stmt-granularity=scalar-indep the default.	Michael Kruse	2018-02-03	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Splitting basic blocks into multiple statements if there are no additional scalar dependencies gives more freedom to the scheduler, but more statements also mean higher compile-time complexity. Switch to finer statement granularity; the additional compile time should be limited by the number of operations quota. The regression tests are written for the -polly-stmt-granularity=bb setting, therefore we add that flag to those tests that break with the new default. Some of the tests only fail because the statements are named differently due to a basic block resulting in multiple statements, which are then removed during simplification of statements without side-effects. Previous commits tried to reduce this effect, but it is not completely avoidable. Differential Revision: https://reviews.llvm.org/D42151 llvm-svn: 324169
*	[OpenMP] Fix reference collection of latest base ptrs.	Michael Kruse	2017-10-31	2	-3/+47
\| \| \| \| \| \| \| \|	When collecting base pointers that need to be made available in parallel subfunctions, use the base pointer associated with the latest ScopArrayInfo, instead of the original one. llvm-svn: 316983
*	UnXFAIL tests that previously failed VerifyDFSNumbers	Jakub Kuderski	2017-10-03	2	-6/+0
\| \| \| \| \| \|	They started passing again by the DT::eraseNode fix in r314847. llvm-svn: 314850
*	XFAIL two test that fail VerifyDFSNumbers DominatorTree check	Jakub Kuderski	2017-10-03	2	-0/+6
\| \| \| \| \| \| \| \| \|	This patch XFAILs two tests that start to fail when verifying DT's DFS numbers, as per Tobias' suggestion. Related VerifyDFSNumbers patch: D38331. llvm-svn: 314800
*	[CodeGen] Use isLatestArrayKind().	Michael Kruse	2017-08-09	1	-0/+58
\| \| \| \| \| \| \| \| \| \|	Codegen with -polly-parallel queried the unmapped MemoryAccess, but only the MemoryKind after mapping is relevant for codegen. This should fix various fails of the perf-x86_64-penryn-O3-polly-parallel-fast buildbot. llvm-svn: 310466
*	[tests] Set -polly-import-jscop-dir=%S always	Tobias Grosser	2017-07-11	2	-4/+4
\| \| \| \| \| \|	This simplifies the test cases. llvm-svn: 307645
*	[Polly] Generate more 'canonical' induction variable	Hongbin Zheng	2017-05-12	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Today Polly generates the induction variable in this way:
	polly.indvar = phi 0, polly.indvar.next
	...
	polly.indvar.next = polly.indvar + stride
	polly.loop_cond = predicate polly.indvar, (UB - stride)
Instead of:
	polly.indvar = phi 0, polly.indvar.next
	...
	polly.indvar.next = polly.indvar + stride
	polly.loop_cond = predicate polly.indvar.next, UB
The way Polly generates the induction variable causes problems in the indvar simplify pass. This patch makes Polly generate the latter form, by assuming the induction variable never overflows. Differential Revision: https://reviews.llvm.org/D33089 llvm-svn: 302866
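The two exit tests in this message can be compared with a small model. This is a sketch under stated assumptions: the message only says "predicate", so the `<=` comparison is an assumption, and the function names are hypothetical. Python integers never wrap, which corresponds to the no-overflow assumption the patch relies on.

```python
def values_old_form(ub, stride):
    """Exit test on the current value: indvar <= UB - stride."""
    values = []
    indvar = 0
    while True:
        values.append(indvar)
        indvar_next = indvar + stride
        if not (indvar <= ub - stride):
            break
        indvar = indvar_next
    return values


def values_new_form(ub, stride):
    """Exit test on the incremented value: indvar.next <= UB."""
    values = []
    indvar = 0
    while True:
        values.append(indvar)
        indvar_next = indvar + stride
        if not (indvar_next <= ub):
            break
        indvar = indvar_next
    return values


# Without overflow, both forms visit the same induction variable values.
print(values_old_form(10, 3))
print(values_new_form(10, 3))
```

With fixed-width integers, `indvar + stride` can wrap where `UB - stride` does not, which is why the latter form is only valid when the induction variable is assumed never to overflow.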
*	[OpenMP] Do not emit lifetime markers for context	Tobias Grosser	2017-03-18	3	-11/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit r219005 lifetime markers have been introduced to mark the lifetime of the OpenMP context data structure. However, their use seems incorrect and recently caused a miscompile in ASC_Sequoia/CrystalMk after r298053 which was not at all related to r298053. r298053 only caused a change in the loop order, as this change resulted in a different isl internal representation which caused the scheduler to derive a different schedule. This change then caused the IR to change, which apparently created a pattern in which LLVM exploits the lifetime markers. It seems we are using the OpenMP context outside of the lifetime markers. Even though CrystalMk could probably be fixed by expanding the scope of the lifetime markers, it is not clear what happens in case the OpenMP function call is in a loop which will cause a sequence of starting and ending lifetimes. As it is unlikely that the lifetime markers give any performance benefit, we just drop them to remove complexity. llvm-svn: 298192
*	[ScopDetection] Require LoadInst base pointers to be hoisted.	Michael Kruse	2017-03-07	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Only when load-hoisted can we be sure the base pointer is invariant during the SCoP's execution. Most of the time it would be added to the required hoists for the alias checks anyway, except with -polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if AliasAnalysis is already sure it doesn't alias with anything (for instance if there is no other pointer to alias with). Two more parts in Polly assume that this load-hoisting took place: - setNewAccessRelation() which contains an assert which tests this. - BlockGenerator which would use the base ptr from the original code if not load-hoisted (if the access expression is regenerated) Differential Revision: https://reviews.llvm.org/D30694 llvm-svn: 297195
*	[tests] Make sure tests do not end in 'unreachable'	Tobias Grosser	2017-03-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	There is no point in optimizing unreachable code, hence our test cases should always return. This commit is part of a series that makes Polly more robust in the presence of unreachables. llvm-svn: 297147
*	[tests] Force invariant load hoisting for test cases that need it	Tobias Grosser	2016-08-15	4	-4/+4
\| \| \| \| \| \| \| \|	This will make it easier to switch the default of Polly's invariant load hoisting strategy and also makes it very clear that these test cases indeed require invariant code hoisting to work. llvm-svn: 278667
*	This reverts recent expression type changes	Tobias Grosser	2016-06-11	3	-10/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The recent expression type changes still need more discussion, which will happen on phabricator or on the mailing list. The precise list of commits reverted are: - "Refactor division generation code" - "[NFC] Generate runtime checks after the SCoP" - "[FIX] Determine insertion point during SCEV expansion" - "Look through IntToPtr & PtrToInt instructions" - "Use minimal types for generated expressions" - "Temporarily promote values to i64 again" - "[NFC] Avoid unnecessary comparison for min/max expressions" - "[Polly] Fix -Wunused-variable warnings (NFC)" - "[NFC] Simplify min/max expression generation" - "Simplify the type adjustment in the IslExprBuilder" Some of them are just reverted as we would otherwise get conflicts. I will try to re-commit them if possible. llvm-svn: 272483
*	Use minimal types for generated expressions	Johannes Doerfert	2016-06-06	3	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \|	We now use the minimal necessary bit width for the generated code. If operations might overflow (add/sub/mul) we will try to adjust the types in order to ensure a non-wrapping computation. If the type adjustment is not possible, thus the necessary type is bigger than the type value of --polly-max-expr-bit-width, we will use assumptions to verify the computation will not wrap. However, for run-time checks we cannot build assumptions but instead utilize overflow tracking intrinsics. llvm-svn: 271878
*	Check overflows in RTCs and bail accordingly	Johannes Doerfert	2016-05-12	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We utilize assumptions on the input to model IR in polyhedral world. To verify these assumptions we version the code and guard it with a runtime-check (RTC). However, since the RTCs are themselves generated from the polyhedral representation we generate them under the same assumptions that they should verify. In other words, the guarantees that we try to provide with the RTCs do not hold for the RTCs themselves. To this end it is necessary to employ a different check for the RTCs that will verify the assumptions did hold for them too. Differential Revision: http://reviews.llvm.org/D20165 llvm-svn: 269299
*	[FIX] Do not recompute SCEVs but pass them to subfunctions	Johannes Doerfert	2016-04-09	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2879c53e80e05497f408f21ce470d122e9f90f94. Additionally, it adds SDiv and SRem instructions to the set of values discovered by the findValues function even if we add the operands to be able to recompute the SCEVs. In subfunctions we do not want to recompute SDiv and SRem instructions but pass them instead as they might have been created through the IslExprBuilder and are more complicated than simple SDiv/SRem instructions in the code. llvm-svn: 265873
*	[FIX] Teach the ScopExpander about parallel subfunctions	Johannes Doerfert	2016-04-08	1	-0/+44
\| \| \| \|	llvm-svn: 265824
*	Separate more constant factors of parameters	Johannes Doerfert	2016-02-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So far we separated constant factors from multiplications, however, only when they are at the outermost level of a parameter SCEV. Now, we also separate constant factors from the parameter SCEV if the outermost expression is a SCEVAddRecExpr. With the changes to the SCEVAffinator we can now improve the extractConstantFactor(...) function at will without worrying about any other code part. Thus, if needed we can implement a more comprehensive extractConstantFactor(...) function that will traverse the SCEV instead of looking only at the outermost level. Four test cases were affected. One did not change much and the other three were simplified. llvm-svn: 260859
*	executeScopConditionally: Introduce special exiting block	Tobias Grosser	2015-12-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When introducing separate control flow for the original and optimized code we introduce now a special 'ExitingBlock':

        \   /
      EnteringBB
          |
      SplitBlock---------\
     _____|_____         |
    /  EntryBB  \    StartBlock
    |  (region) |        |
    \_ExitingBB_/   ExitingBlock
          |              |
      MergeBlock---------/
          |
        ExitBB
        /    \

This 'ExitingBlock' contains code such as the final_reloads for scalars, which previously were just added to whichever statement/loop_exit/branch-merge block had been generated last. Having an explicit basic block makes it easier to find these constructs when looking at the CFG. llvm-svn: 255107
*	Consolidate invariant loads	Johannes Doerfert	2015-10-09	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an (assumed) invariant location is loaded multiple times we generated a parameter for each location. However, this caused compile time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and BT in the NAS benchmarks). Additionally, the code we generate is suboptimal as we preload the same location multiple times and perform the same checks on all the parameters that refer to the same value. With this patch we consolidate the invariant loads in three steps: 1) During SCoP initialization required invariant loads are put in equivalence classes based on their pointer operand. One representative load is used to generate a parameter for the whole class, thus we never generate multiple parameters for the same location. 2) During the SCoP simplification we remove invariant memory accesses that are in the same equivalence class. While doing so we build the union of all execution domains as it is only important that the location is at least accessed once. 3) During code generation we only preload one element of each equivalence class with the unified execution domain. All others are mapped to that preloaded value. Differential Revision: http://reviews.llvm.org/D13338 llvm-svn: 249853
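The three consolidation steps can be modelled roughly as follows. This is a sketch only: the function name, the tuple layout, and the use of plain Python sets as execution domains are assumptions for illustration, whereas Polly works on isl sets and LLVM values.

```python
def consolidate_invariant_loads(loads):
    """Group loads by pointer operand into equivalence classes.

    `loads` is a list of (load_name, pointer_operand, execution_domain)
    tuples, with the domain modelled as a set of iteration points.
    """
    classes = {}
    for name, pointer, domain in loads:
        if pointer not in classes:
            # Step 1: the first load seen becomes the class representative.
            classes[pointer] = {
                "representative": name,
                "members": [name],
                "domain": set(domain),
            }
        else:
            # Step 2: later loads join the class; the domains are united.
            classes[pointer]["members"].append(name)
            classes[pointer]["domain"] |= set(domain)
    return classes


loads = [
    ("ld0", "A", {0, 1}),
    ("ld1", "B", {0}),
    ("ld2", "A", {2}),
]
classes = consolidate_invariant_loads(loads)

# Step 3: preload once per class and map every member to that value.
preloaded = {
    member: f"preload({info['representative']})"
    for info in classes.values()
    for member in info["members"]
}
print(classes["A"]["domain"], preloaded["ld2"])
```

One parameter per class, rather than one per load, is what keeps the number of parameters from growing with repeated loads of the same location.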
*	tests: Drop -polly-detect-unprofitable and -polly-no-early-exit	Tobias Grosser	2015-10-06	16	-27/+27
\| \| \| \| \| \| \| \|	These flags are now always passed to all tests and need to be disabled if not needed. Disabling these flags, rather than passing them to almost all tests, significantly simplifies our RUN: lines. llvm-svn: 249422
*	Hand down referenced & globally mapped values to the subfunction	Johannes Doerfert	2015-10-02	3	-1/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a value is globally mapped (IslNodeBuilder::ValueMap) and referenced in the code that will be put into a subfunction, we hand down the new value to the subfunction. This patch also removes code that handed down all invariant loads to the subfunction. Instead, only needed invariant loads are given to the subfunction. There are two possible reasons for an invariant load to be handed down: 1) The invariant load is used in a block that is placed in the subfunction but which is not the parent of the load. In this case, the scalar access that will read the loaded value, will cause its base pointer (the preloaded value) to be handed down to the subfunction. 2) The invariant load is defined and used in a block that is placed in the subfunction. With this patch we will hand down the preloaded value to the subfunction as the invariant load is globally mapped to that value. llvm-svn: 249126
*	[FIX] Parallel codegen for invariant loads	Johannes Doerfert	2015-10-01	1	-0/+33
\| \| \| \| \| \|	Hand down all preloaded values to the parallel subfunction. llvm-svn: 249010
*	Reapply "BlockGenerator: Generate synthesisable instructions only on-demand"	Tobias Grosser	2015-09-30	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instructions which we can synthesize from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instructions and works more reliably than first inserting them and then deleting them later on. This commit was reverted in r248860 due to a remaining miscompile, where we forgot to synthesize the operand values that were referenced from scalar writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we now do this correctly. llvm-svn: 248900
*	Revert "BlockGenerator: Generate synthesisable instructions only on-demand"	Johannes Doerfert	2015-09-29	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207. The commit broke at least one test in lnt: MultiSource/Benchmarks/Ptrdist/bc/number.c was miscompiled and the test produced a wrong result. One Polly test case that was added later was adjusted too. llvm-svn: 248860
*	OpenMP: Name addresses in subfunction structure	Tobias Grosser	2015-09-28	1	-2/+2
\| \| \| \| \| \| \|	While debugging, this makes it easier to understand due to which memory reference these stores have been introduced. llvm-svn: 248717
*	BlockGenerator: Generate synthesisable instructions only on-demand	Tobias Grosser	2015-09-28	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Instructions which we can synthesize from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instructions and works more reliably than first inserting them and then deleting them later on. Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de> Differential Revision: http://reviews.llvm.org/D13208 llvm-svn: 248712
*	BlockGenerator: Simplify code generated for scop statements	Tobias Grosser	2015-09-27	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	After having generated a new user statement a couple of inefficient or trivially dead instructions may remain. This commit runs instruction simplification over the newly generated blocks to ensure unneeded instructions are removed right away. This commit does not yet add simplification for non-affine subregions. llvm-svn: 248681
*	Create parallel code in a separate block	Johannes Doerfert	2015-09-26	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	This commit basically reverts r246427 but still solves the issue tackled by that commit. Instead of emitting initialization code in the beginning of the start block we now generate parallel code in its own block and thereby guarantee separation. This is necessary as we cannot generate code for hoisted loads prior to the start block but it still needs to be placed prior to everything else. llvm-svn: 248674
*	Propagate exit conditions as described in the PET paper	Johannes Doerfert	2015-09-14	2	-6/+9
\| \| \| \| \| \| \| \| \|	At some point we built loop trip counts using this method. It was replaced by a simpler trick that works only for affine (e.g., not modulo) constraints and relies on the removal of unbounded parts. In order to allow modulo constraints again we go back to the former, more accurate method. llvm-svn: 247540
*	Clean-up unit tests	Michael Kruse	2015-09-10	5	-6/+0
\| \| \| \| \| \|	Remove redundant flags and duplicate invocations of the same test. llvm-svn: 247285
*	Replace ScalarEvolution based domain generation	Johannes Doerfert	2015-09-10	3	-23/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch replaces the last legacy part of the domain generation, namely the ScalarEvolution part that was used to obtain loop bounds. We now iterate over the loops in the region and propagate the back edge condition to the header blocks. Afterwards we propagate the new information once through the whole region. In this process we simply ignore unbounded parts of the domain and thereby assume the absence of infinite loops. + This patch already identified a couple of broken unit tests we had for years. + We allow more loops already and the step to multiple exit and multiple back edges is minimal. + It allows to model the overflow checks properly as we actually visit every block in the SCoP and know where which condition is evaluated. - It is currently not compatible with modulo constraints in the domain. Differential Revision: http://reviews.llvm.org/D12499 llvm-svn: 247279
*	Do not use '.' in subfunction names	Tobias Grosser	2015-09-08	9	-15/+15
\| \| \| \| \| \| \| \| \| \|	Certain backends, e.g. NVPTX, do not support '.' in function names. Hence, we ensure all '.' are replaced by '_' when generating function names for subfunctions. For the current OpenMP code generation, this is not strictly necessary, but future use cases (e.g. GPU offloading) need this issue to be fixed. llvm-svn: 246980
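The renaming rule in this message is a plain character substitution. A minimal sketch, assuming a hypothetical helper name and a made-up example name (neither is taken from Polly):

```python
def sanitize_subfunction_name(name: str) -> str:
    """Replace every '.' with '_' so that backends which reject dots accept the name."""
    return name.replace(".", "_")


# Hypothetical input; the actual names Polly generates may differ.
print(sanitize_subfunction_name("kernel.polly.subfn"))
```

The substitution is not injective: `a.b` and `a_b` map to the same result, so a scheme like this relies on the surrounding name generation to keep names unique.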
*	Add forgotten .jscop file	Tobias Grosser	2015-09-05	1	-0/+21
\| \| \| \|	llvm-svn: 246925
*	OpenMP: Name the values passed to the subfunciton according to the original ↵	Tobias Grosser	2015-09-05	2	-18/+17
\| \| \| \| \| \|	llvm::Values llvm-svn: 246924
*	OpenMP codegen: support generation of multi-dimensional access functions	Tobias Grosser	2015-09-05	1	-0/+78
\| \| \| \| \| \| \| \| \| \|	When computing the index expressions for new, multi-dimensional memory accesses, these new index expressions may reference original llvm::Values that are not transferred into the OpenMP subfunction. Using GlobalMap we now replace references to such values with the rewritten values that have e.g. been passed to the OpenMP subfunction. llvm-svn: 246923
*	Generate scalar initialization loads at the beginning of the start BB	Tobias Grosser	2015-08-31	1	-0/+45
\| \| \| \| \| \| \| \| \|	Our OpenMP code generation generated part of its launching code directly into the start basic block, and without this change the scalar initialization was run _after_ the OpenMP threads have been launched. This resulted in uninitialized scalar values being used. llvm-svn: 246427
*	OpenMP-codegen: Correctly pass function arguments to subfunctions	Tobias Grosser	2015-08-31	1	-0/+59
\| \| \| \| \| \| \|	Before we only checked if certain instructions can be expanded by us. Now we check any value, including function arguments. llvm-svn: 246425
*	Traverse the SCoP to compute non-loop-carried domain conditions	Johannes Doerfert	2015-08-30	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to compute domain conditions for conditionals we will now traverse the region in the ScopInfo once and build the domains for each block in the region. The SCoP statements can then use these constraints when they build their domain. The reason behind this change is twofold: 1) This removes a big chunk of preprocessing logic from the TempScopInfo, namely the Conditionals we used to build there. Additionally to moving this logic it is also simplified. Instead of walking the dominance tree up for each basic block in the region (as we did before), we now traverse the region only once in order to collect the domain conditions. 2) This is the first step towards the isl based domain creation. The second step will traverse the region similar to this step, however it will propagate back edge conditions. Once both are in place this conditional handling will allow multiple exit loops additional logic. Reviewers: grosser Differential Revision: http://reviews.llvm.org/D12428 llvm-svn: 246398
*	Generate alias metadata even in OpenMP mode	Tobias Grosser	2015-08-19	1	-0/+52
\| \| \| \| \| \| \| \|	To make alias scope metadata generation work in OpenMP mode we now provide the ScopAnnotator with information about the base pointer rewrite that happens when passing arrays into the OpenMP subfunction. llvm-svn: 245451
*	Fix test case after recent LLVM changes	Tobias Grosser	2015-08-13	1	-2/+2
\| \| \| \|	llvm-svn: 244954
*	Changed renaming of local symbols by inserting a dot before the numeric suffix.	Sunil Srivastava	2015-05-12	2	-2/+2
\| \| \| \| \| \| \| \| \|	Modified two test cases to adjust to the above change in renaming. These two files were causing the buildbot failure in Polly, #30204 for example. Details in http://reviews.llvm.org/D9483 This checkin goes with r237150 and r237151 llvm-svn: 237203
*	Rename IslCodeGeneration to CodeGeneration	Tobias Grosser	2015-05-12	10	-17/+17
\| \| \| \| \| \| \| \| \|	Besides class, function and file names, we also change the command line option from -polly-codegen-isl to just -polly-codegen. The isl postfix is a leftover from the times when we still had the CLooG based -polly-codegen. Today it is just redundant and we drop it. llvm-svn: 237099
*	Remove target triples from test cases	Tobias Grosser	2015-04-21	8	-8/+0
\| \| \| \| \| \| \| \|	I just learned that target triples prevent test cases from being run on other architectures. Polly test cases have so far been sufficiently target independent to not require any target triples. Hence, we drop them. llvm-svn: 235384
*	Update Polly tests to handle explicitly typed load changes in LLVM.	David Blaikie	2015-02-27	4	-9/+9
\| \| \| \|	llvm-svn: 230796
*	Update Polly tests to handle explicitly typed gep changes in LLVM	David Blaikie	2015-02-27	10	-20/+20
\| \| \| \|	llvm-svn: 230784
*	ScopDetection: Only detect scops that have at least one read and one write	Tobias Grosser	2015-02-19	10	-27/+27
\| \| \| \| \| \| \| \| \| \|	Scops that only read seem generally uninteresting and scops that only write are most likely initializations where there is also little to optimize. To not waste compile time we bail early. Differential Revision: http://reviews.llvm.org/D7735 llvm-svn: 229820
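The bail-out condition in this message amounts to a simple predicate over the memory accesses of a candidate region. A sketch, assuming accesses are modelled as the strings "read" and "write" and using a hypothetical function name:

```python
def has_read_and_write(access_kinds):
    """Return True only if the region has at least one read and one write."""
    kinds = set(access_kinds)
    return "read" in kinds and "write" in kinds


print(has_read_and_write(["read", "read"]))   # read-only region
print(has_read_and_write(["write"]))          # write-only, e.g. an initialization
print(has_read_and_write(["read", "write"]))  # candidate worth analysing
```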
*	[FIX] Updated test case (fixed names -> regular expressions)	Johannes Doerfert	2015-02-02	1	-12/+12
\| \| \| \|	llvm-svn: 227807