bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[PR29121] Don't fold if it would produce atomic vector loads or stores	Philip Reames	2016-12-01	1	-0/+20
	The instcombine code which folds loads and stores into their use types can trip up if the use is a bitcast to a type which we can't directly load or store in the IR. In principle, such types shouldn't exist, but in practice they do today. This is a workaround to avoid a bug while we work towards the long term goal. Differential Revision: https://reviews.llvm.org/D24365 llvm-svn: 288415
*	[SLP] Fix for PR6246: vectorization for scalar ops on vector elements.	Alexey Bataev	2016-12-01	1	-114/+62
	When trying to vectorize trees that start at insertelement instructions, the function tryToVectorizeList() uses a vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes this does not work because the tree cost for this fixed vectorization factor is too high. The patch tries to improve the situation: it tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) down to MinVecRegSize/ScalarTypeSize and tries to choose the best one. Differential Revision: https://reviews.llvm.org/D27215 llvm-svn: 288412
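The search over vectorization factors described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the function names, the halving step between candidates, and the `tree_cost` callback (negative meaning profitable) are all assumptions made for the example.

```python
def power_of_2_floor(n: int) -> int:
    """Largest power of two that is <= n (0 when n <= 0)."""
    return 1 << (n.bit_length() - 1) if n > 0 else 0


def candidate_factors(num_values: int, min_vec_reg_size: int, scalar_type_size: int) -> list:
    """Candidate factors from the largest down to MinVecRegSize/ScalarTypeSize.

    Assumption: candidates are visited by repeated halving.
    """
    min_vf = max(1, min_vec_reg_size // scalar_type_size)
    vf = max(power_of_2_floor(num_values), min_vf)
    factors = []
    while vf >= min_vf:
        factors.append(vf)
        vf //= 2
    return factors


def pick_factor(factors, tree_cost):
    """Return the cheapest factor, or None if no candidate is profitable.

    tree_cost is a hypothetical callback; a negative cost means the
    vectorized tree is cheaper than the scalar code.
    """
    best = min(factors, key=tree_cost)
    return best if tree_cost(best) < 0 else None
```

With 8 values of 32 bits and a 128-bit minimum register size, the candidates are 8 and 4; before the change only the fixed factor 4 would have been tried.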
*	[SLP] Fixed cost model for horizontal reduction.	Alexey Bataev	2016-12-01	1	-4/+4
	Currently, when the cost of scalar operations is evaluated, the vector type is used for scalar operations. The patch fixes this issue and fixes evaluation of the vector operations cost. Several tests showed that the vector cost model is too optimistic. It allowed vectorization of 8 or fewer add/fadd operations, though scalar code is faster. Actually, vector code provides better performance only for 16 or more operations. Differential Revision: https://reviews.llvm.org/D26277 llvm-svn: 288398
*	[GVN, OptDiag] Print the interesting instructions involved in missed ↵	Adam Nemet	2016-12-01	1	-1/+44
	load-elimination [recommitting after the fix in r288307] This includes the intervening store and the load/store that we're trying to forward from in the optimization remark for the missed load elimination. This is hooked up under a new mode in ORE that allows for compile-time budget for a bit more analysis to print more insightful messages. This mode is currently enabled for -fsave-optimization-record (-Rpass is trickier since it is controlled in the front-end). With this we can now print the red remark in http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446 Differential Revision: https://reviews.llvm.org/D26490 llvm-svn: 288381
*	[GVN, OptDiag] Include the value that is forwarded in load elimination	Adam Nemet	2016-12-01	1	-0/+6
	[recommitting after the fix in r288307] This requires some changes to the opt-diag API. Hal and I discussed this at the Dev Meeting and came up with a streaming delimiter (setExtraArgs) to solve this. Arguments after this delimiter are only included in the optimization records and not in the remarks printed in the compiler output. (Note how in the test the content of the YAML file changes but the remarks on the compiler output don't.) This implements the green GVN message with a bug fix at line http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446 The fix is that now we properly include the constant value in the message: "load of type i32 eliminated in favor of 7" Differential Revision: https://reviews.llvm.org/D26489 llvm-svn: 288380
*	[SLP] Additional tests with the cost of vector operations.	Alexey Bataev	2016-12-01	1	-1/+19
	llvm-svn: 288377
*	Revert "[SLP] Additional tests with the cost of vector operations."	Alexey Bataev	2016-12-01	1	-18/+1
	This reverts commit a61718435fc4118c82f8aa6133fd81f803789c1e. llvm-svn: 288371
*	[GVN] Basic optimization remark support	Adam Nemet	2016-12-01	1	-0/+59
	[recommitting after the fix in r288307] Follow-on patches will add more interesting cases. The goal of this patch-set is to get the GVN messages printed in opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This is the optimization view for the function (the last remark in the function has a bug which is fixed in this series): http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430 Differential Revision: https://reviews.llvm.org/D26488 llvm-svn: 288370
*	[SLP] Additional tests with the cost of vector operations.	Alexey Bataev	2016-12-01	1	-1/+18
	llvm-svn: 288369
*	[GVN] When merging blocks update LoopInfo if it's available	Adam Nemet	2016-12-01	1	-0/+50
	If LoopInfo is available during GVN, BasicAA will use it. However MergeBlockIntoPredecessor does not update LI as it merges blocks. This didn't use to cause problems because LI was freed before GVN/BasicAA. Now with OptimizationRemarkEmitter, the lifetime of LI is extended so LI needs to be kept up-to-date during GVN. Differential Revision: https://reviews.llvm.org/D27288 llvm-svn: 288307
*	[LoopUnroll] Implement profile-based loop peeling	Michael Kuperstein	2016-11-30	2	-0/+143
	This implements PGO-driven loop peeling. The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we can expect a performance win by peeling off the first several iterations of that loop. Unlike unrolling based on a known trip count, or a trip count multiple, this doesn't save us the conditional check and branch on each iteration. However, it does allow us to simplify the straight-line code we get (constant-folding, etc.). This is important given that we know that we will usually only hit this code, and not the actual loop. This is currently disabled by default. Differential Revision: https://reviews.llvm.org/D25963 llvm-svn: 288274
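As a rough illustration of the peeling idea described above, the sketch below compares a plain loop with a version whose first iterations are peeled off. It is only a model of the control flow: `run_loop`, `run_peeled`, and `peel_count` are hypothetical names, and a compiler would emit the peeled iterations as separate straight-line copies rather than as the small `for` used here.

```python
def run_loop(body, state, trip_count):
    """Reference loop: apply body once per iteration."""
    for i in range(trip_count):
        state = body(state, i)
    return state


def run_peeled(body, state, trip_count, peel_count):
    """Same computation with the first peel_count iterations peeled off."""
    # Peeled iterations. Each copy keeps the exit check, which is why
    # peeling does not remove the per-iteration test and branch.
    for i in range(peel_count):
        if i >= trip_count:
            return state  # common case when the trip count is low
        state = body(state, i)
    # Residual loop, reached only when the trip count exceeds peel_count.
    for i in range(peel_count, trip_count):
        state = body(state, i)
    return state
```

In each peeled copy the iteration index is a known constant, which is what opens the door to constant folding in the straight-line code.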
*	[InstCombine] allow more narrowing transforms for logic ops	Sanjay Patel	2016-11-30	1	-12/+12
	We had a limited version of this for scalar 'and'; this expands the transform to 'or' and 'xor' and allows vector types too. llvm-svn: 288273
*	[InstCombine] add tests to show potentially missed logic+trunc transforms; NFC	Sanjay Patel	2016-11-30	1	-1/+81
	llvm-svn: 288270
*	[InstCombine] update test to use FileCheck and auto-generate checks; NFC	Sanjay Patel	2016-11-30	1	-14/+15
	llvm-svn: 288261
*	[InstCombine] auto-generate checks for select+bitwise logic tests; NFC	Sanjay Patel	2016-11-30	2	-259/+297
	llvm-svn: 288254
*	Revert "[GVN] Basic optimization remark support"	Adam Nemet	2016-11-30	1	-59/+0
	This reverts commit r288210. The failure on the stage2 LTO build is back. llvm-svn: 288226
*	[GVN] Basic optimization remark support	Adam Nemet	2016-11-29	1	-0/+59
	[recommitting patches one-by-one to see which breaks the stage2 LTO bot] Follow-on patches will add more interesting cases. The goal of this patch-set is to get the GVN messages printed in opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This is the optimization view for the function (the last remark in the function has a bug which is fixed in this series): http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430 Differential Revision: https://reviews.llvm.org/D26488 llvm-svn: 288210
*	[StructurizeCFG] Fix infinite loop in rebuildSSA.	Justin Lebar	2016-11-29	1	-0/+51
	Michel Dänzer reported that r288051, "[StructurizeCFG] Use range-based for loops", introduced a bug into rebuildSSA, wherein we were iterating over an instruction's use list while modifying it, without taking care to do this correctly. llvm-svn: 288200
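The hazard named above, mutating a list while iterating over it, can be shown with a small model. This is not the rebuildSSA code and does not claim to be how the bug was fixed; the `Value` class and `replace_uses` helper are hypothetical, and taking a snapshot before the loop is just one common way to make such a loop safe.

```python
class Value:
    """Toy stand-in for an IR value that tracks who uses it."""

    def __init__(self, name):
        self.name = name
        self.users = []  # users of this value, in order


def replace_uses(old, new, should_replace):
    """Move selected users from old to new.

    Iterating over list(old.users) takes a snapshot, so removing
    entries from old.users inside the loop cannot skip elements.
    """
    for user in list(old.users):
        if should_replace(user):
            old.users.remove(user)
            new.users.append(user)
```

Looping over `old.users` directly while removing from it would skip every other user, leaving stale uses behind.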
*	Revert "[GVN] Basic optimization remark support"	Adam Nemet	2016-11-29	1	-59/+0
	This reverts commit r288046. Trying to see if the revert fixes a compiler crash during a stage2 LTO build with a GVN backtrace. llvm-svn: 288179
*	Revert "[GVN, OptDiag] Include the value that is forwarded in load elimination"	Adam Nemet	2016-11-29	1	-6/+0
	This reverts commit r288047. Trying to see if the revert fixes a compiler crash during a stage2 LTO build with a GVN backtrace. llvm-svn: 288178
*	Revert "[GVN, OptDiag] Print the interesting instructions involved in missed ↵	Adam Nemet	2016-11-29	1	-44/+1
	load-elimination" This reverts commit r288090. Trying to see if the revert fixes a compiler crash during a stage2 LTO build with a GVN backtrace. llvm-svn: 288177
*	[CVP] Remove use of removed flag (-cvp-dont-process-adds) from the test	Artur Pilipenko	2016-11-29	1	-1/+1
	The flag was removed by 288154 llvm-svn: 288161
*	[SLP] Add a new test for tree vectorization starting from insertelement	Alexey Bataev	2016-11-29	1	-33/+508
	instruction. llvm-svn: 288148
*	[SLPVectorizer] Improved support of partial tree vectorization.	Alexey Bataev	2016-11-29	1	-87/+74
	Currently the SLP vectorizer tries to vectorize a binary operation and gives up immediately after the first unsuccessful attempt. The patch tries to improve the situation by trying to vectorize all binary operations of all children nodes in the binop tree. Differential Revision: https://reviews.llvm.org/D25517 llvm-svn: 288115
*	[GVN, OptDiag] Print the interesting instructions involved in missed ↵	Adam Nemet	2016-11-29	1	-1/+44
	load-elimination This includes the intervening store and the load/store that we're trying to forward from in the optimization remark for the missed load elimination. This is hooked up under a new mode in ORE that allows for compile-time budget for a bit more analysis to print more insightful messages. This mode is currently enabled for -fsave-optimization-record (-Rpass is trickier since it is controlled in the front-end). With this we can now print the red remark in http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446 Differential Revision: https://reviews.llvm.org/D26490 llvm-svn: 288090
*	[SROA] Drop lifetime.start/end intrinsics when they block promotion.	Eli Friedman	2016-11-28	1	-3/+14
	Preserving lifetime markers isn't as important as allowing promotion, so just drop the lifetime markers if necessary. This also fixes an assertion failure where other parts of SROA assumed that lifetime markers never block promotion. Fixes https://llvm.org/bugs/show_bug.cgi?id=29139. Differential Revision: https://reviews.llvm.org/D24854 llvm-svn: 288074
*	Revert r287553: [CodeGenPrep] Skip merging empty case blocks	Joerg Sonnenberger	2016-11-28	3	-150/+6
	It results in assertions in lib/Analysis/BlockFrequencyInfoImpl.cpp line 670 ("Expected irreducible CFG"). llvm-svn: 288052
*	[GVN, OptDiag] Include the value that is forwarded in load elimination	Adam Nemet	2016-11-28	1	-0/+6
	This requires some changes to the opt-diag API. Hal and I discussed this at the Dev Meeting and came up with a streaming delimiter (setExtraArgs) to solve this. Arguments after this delimiter are only included in the optimization records and not in the remarks printed in the compiler output. (Note how in the test the content of the YAML file changes but the remarks on the compiler output don't.) This implements the green GVN message with a bug fix at line http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446 The fix is that now we properly include the constant value in the message: "load of type i32 eliminated in favor of 7" Differential Revision: https://reviews.llvm.org/D26489 llvm-svn: 288047
*	[GVN] Basic optimization remark support	Adam Nemet	2016-11-28	1	-0/+59
	Follow-on patches will add more interesting cases. The goal of this patch-set is to get the GVN messages printed in opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This is the optimization view for the function (the last remark in the function has a bug which is fixed in this series): http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430 Differential Revision: https://reviews.llvm.org/D26488 llvm-svn: 288046
*	[InlineCost] Reduce inline thresholds to compensate for cost changes	James Molloy	2016-11-28	2	-10/+8
	In r286814, the algorithm for calculating inline costs changed. This caused more inlining to take place which is especially apparent in optsize and minsize modes. As the cost calculation removed a skewed behaviour (we were inconsistent about the cost of calls) it isn't possible to update the thresholds to get exactly the same behaviour as before. However, this threshold change accounts for the very common case where an inline candidate has no calls within it. In this case, r286814 would inline around 5-6 more (IR) instructions. The changes to -Oz have been heavily benchmarked. The "obvious" value for the inline threshold at -Oz is zero, but due to inaccuracies in the inline heuristics this can actually cause code size increases due to not inlining key thunk functions (that then disappear). Experimentally, 5 was the sweet spot for code size over the test-suite. For -Os, this change removes the outlier results shown up by green dragon (http://104.154.54.203/db_default/v4/nts/13248). Fixes D26848. llvm-svn: 288024
*	[SLP] Add new and update existing lit test for providing more context to ↵	Mohammad Shahid	2016-11-27	2	-4/+103
	incoming patch for vectorization of jumbled load Change-Id: Ifb9091bb0f84c1937c2c8bd2fc345734f250d2f9 llvm-svn: 287992
*	[InstCombine] add test to show missing vector optimization; NFC	Sanjay Patel	2016-11-26	1	-3/+16
	llvm-svn: 287982
*	[InstCombine] don't drop metadata in FoldOpIntoSelect()	Sanjay Patel	2016-11-26	1	-0/+11
	llvm-svn: 287980
*	[SimplifyCFG] auto-generate better checks; NFC	Sanjay Patel	2016-11-25	2	-12/+30
	llvm-svn: 287954
*	[SimplifyCFG] auto-generate better checks; NFC	Sanjay Patel	2016-11-25	1	-21/+36
	llvm-svn: 287953
*	[Loop Unswitch] Patch to selectively unswitch only the reachable branch ↵	Abhilash Bhandari	2016-11-25	1	-0/+62
	instructions. Summary: The iterative algorithm for Loop Unswitching may render some of the branches unreachable in the unswitched loops. Given the exponential nature of the algorithm, this is quite an overhead. This patch fixes this problem by selectively unswitching only those branches within a loop that are reachable from the loop header. Reviewers: Michael Zolothukin, Anna Thomas, Weiming Zhao. Subscribers: llvm-commits. Differential Revision: http://reviews.llvm.org/D26299 llvm-svn: 287925
*	[X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on ↵	Simon Pilgrim	2016-11-24	1	-8/+23
	AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances llvm-svn: 287882
*	[SLP] Add more tests for SLP Vectorizer.	Alexey Bataev	2016-11-23	1	-0/+302
	llvm-svn: 287801
*	[LoadStoreVectorizer] Enable vectorization of stores in the presence of an ↵	Alina Sbirlea	2016-11-23	2	-5/+61
	aliasing load Summary: The "getVectorizablePrefix" method would give up if it found an aliasing load for a store chain. In practice, the aliasing load can be treated as a memory barrier and all stores that precede it are a valid vectorizable prefix. Issue found by volkan in D26962. Testcase is a pruned version of the one in the original patch. Reviewers: jlebar, arsenm, tstellarAMD Subscribers: mzolotukhin, wdng, nhaehnle, anna, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D27008 llvm-svn: 287781
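The "aliasing load as a barrier" idea above can be modeled in a few lines. This is a simplified sketch, not the LoadStoreVectorizer implementation: instructions are plain strings, and `store_prefix` and the `aliases_chain` predicate are hypothetical names introduced for the example.

```python
def store_prefix(block, chain, aliases_chain):
    """Return the stores of the chain that precede the first aliasing load.

    block:         instructions of a basic block, in program order
    chain:         set of store instructions we would like to vectorize
    aliases_chain: hypothetical predicate, True for a load that may alias
                   the memory written by the chain
    """
    prefix = []
    for inst in block:
        if inst in chain:
            prefix.append(inst)
        elif aliases_chain(inst):
            # Treat the aliasing load as a barrier: keep what came
            # before it instead of giving up on the whole chain.
            break
    return prefix
```

For a block `s0, s1, ld, s2` where `ld` aliases the chain, the sketch returns `s0, s1`, whereas giving up on the first aliasing load would return nothing.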
*	[X86][AVX512] Add support for v4i64 fptosi/fptoui/sitofp/uitofp on ↵	Simon Pilgrim	2016-11-23	1	-28/+70
	AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances llvm-svn: 287762
*	[CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs	Simon Pilgrim	2016-11-23	1	-52/+118
	llvm-svn: 287760
*	[SCCP] Add a test for switches on undef.	Davide Italiano	2016-11-23	1	-0/+27
	Without this test, you can just remove the code fixing the switch to the first constant in ResolvedUndefsIn and everything passes. This test, instead, fails with an assertion if the code is removed. Found while refactoring SCCP to integrate undef in the solver. llvm-svn: 287731
*	Before sample pgo annotation, do not inline a function that has no debug ↵	Dehao Chen	2016-11-22	3	-1/+24
	info. (NFC) If there is no debug info in the callee, inlining it will not help the annotator. This avoids the infinite loop reported in PR/31119. llvm-svn: 287710
*	[SCCP] Remove code in visitBinaryOperator (and add tests).	Davide Italiano	2016-11-22	1	-4/+26
	When we visit and/or, we try to derive a lattice value for the instruction even if one of the operands is overdefined. If the non-overdefined value is still 'unknown', just return and wait for ResolvedUndefsIn to "plug in" the correct value. This simplifies the logic a bit. While I'm here, add tests for missing cases. llvm-svn: 287709
*	[InstCombine] change bitwise logic type to eliminate bitcasts	Sanjay Patel	2016-11-22	1	-4/+25
	In PR27925: https://llvm.org/bugs/show_bug.cgi?id=27925 ...we proposed adding this fold to eliminate a bitcast. In D20774, there was some concern about changing the type of a bitwise op as well as creating bitcasts that might not be free for a target. However, if we're strictly eliminating an instruction (by limiting this to one-use ops), then we should be able to do this in InstCombine. But we're cautiously restricting the transform for now to vector types to avoid possible backend problems. A transform to make sure the logic op is legal for the target should be added to reverse this transform and improve codegen. Differential Revision: https://reviews.llvm.org/D26641 llvm-svn: 287707
*	Fixed the lost FastMathFlags in GVN (Global Value Numbering).	Vyacheslav Klochkov	2016-11-22	1	-0/+29
	Reviewer: Hal Finkel. Differential Revision: https://reviews.llvm.org/D26952 llvm-svn: 287700
*	Fixed the lost FastMathFlags in Reassociate optimization.	Vyacheslav Klochkov	2016-11-22	1	-0/+14
	Reviewer: Hal Finkel. Differential Revision: https://reviews.llvm.org/D26957 llvm-svn: 287695
*	[CodeGenPrepare] Don't sink non-cheap addrspacecasts.	Justin Lebar	2016-11-21	1	-0/+21
	Summary: Previously, CGP would unconditionally sink addrspacecast instructions, even going so far as to sink them into a loop. Now we check that the cast is "cheap", as defined by TLI. We introduce a new "is-cheap" function to TLI rather than using isNopAddrSpaceCast because some GPU platforms want the ability to ask for non-nop casts to be sunk. Reviewers: arsenm, tra Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26923 llvm-svn: 287591
*	[LoopReroll] Make root-finding more aggressive.	Eli Friedman	2016-11-21	1	-0/+31
	Allow using an instruction other than a mul or phi as the base for root-finding. For example, the included testcase includes a loop which requires using a getelementptr as the base for root-finding. Differential Revision: https://reviews.llvm.org/D26529 llvm-svn: 287588
*	[InstCombine] canonicalize min/max constant to select's false value	Sanjay Patel	2016-11-21	6	-117/+233
	This is a first step towards canonicalization and improved folding/codegen for integer min/max as discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html Here, we're just matching the simplest min/max patterns and adjusting the icmp predicate while swapping the select operands. I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll so it's easier to see how this might be extended (corresponds to the TODO comment in the code). That's also why I'm using matchSelectPattern() rather than a simpler check; once the backend is patched, we can just remove some of the restrictions to allow the obfuscated min/max patterns in the FIXME tests to be matched. Differential Revision: https://reviews.llvm.org/D26525 llvm-svn: 287585
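The operand swap described above can be checked on a signed-min pattern with a small model. The two functions below are illustrative only: their names are hypothetical, and the exact predicates the patch emits are not stated in the message, so the pair shown is just one equivalent rewrite that moves the constant to the select's false value.

```python
def smin_const_true(x, c):
    """Models: select (icmp sgt x, c), c, x  (constant is the true value)."""
    return c if x > c else x


def smin_const_false(x, c):
    """Models: select (icmp slt x, c), x, c  (constant is the false value)."""
    return x if x < c else c
```

Both forms compute min(x, c). When x equals c either arm yields the same value, which is why the strict predicate can be kept after the swap.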