bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[LoopPass] Some minor cleanups	David Majnemer	2016-07-19	1	-7/+5
	No functional change is intended.
	llvm-svn: 275999
*	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR	Simon Pilgrim	2016-07-19	1	-9/+10
	D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NAN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding, which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. A companion clang patch is at D22105.
	Differential Revision: https://reviews.llvm.org/D22106
	llvm-svn: 275981
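To make the mismatch concrete, here is a minimal Python sketch of the truncating conversion described above, for a single lane. The helper name `cvtt_f32_to_i32` is hypothetical, and the model ignores single-precision rounding of the input. It only illustrates that NaN, infinity, and out-of-range inputs all produce the 0x80000000 bit pattern, which a constant folder that returns zero or undef would not reproduce.

```python
import math

# Bit pattern returned for NaN, infinity, and out-of-range inputs.
INT32_INDEFINITE = 0x80000000


def cvtt_f32_to_i32(x: float) -> int:
    """Return the 32-bit result pattern of a truncating float-to-int conversion.

    Hypothetical helper for illustration only; not LLVM or compiler code.
    """
    if math.isnan(x) or math.isinf(x):
        return INT32_INDEFINITE
    truncated = math.trunc(x)  # round toward zero
    if truncated < -2**31 or truncated > 2**31 - 1:
        return INT32_INDEFINITE
    # Encode the signed result as an unsigned 32-bit pattern.
    return truncated & 0xFFFFFFFF
```

In this model, in-range values truncate toward zero as usual, so only the special inputs differ from what generic IR folding might assume.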
*	refactor SimplifySelectInst; NFCI	Sanjay Patel	2016-07-18	1	-97/+115
	llvm-svn: 275911
*	[OptRemarkEmitter] Port to new PM	Adam Nemet	2016-07-18	2	-12/+31
	Summary: The main goal is to be able to start using the new OptRemarkEmitter analysis from the LoopVectorizer. Since the vectorizer was recently converted to the new PM, it makes sense to convert this analysis as well. This pass is currently tested through the LoopDistribution pass, so I am also porting LoopDistribution to get coverage for this analysis with the new PM.
	Reviewers: davidxl, silvas
	Subscribers: llvm-commits, mzolotukhin
	Differential Revision: https://reviews.llvm.org/D22436
	llvm-svn: 275810
*	[ThinLTO] Perform profile-guided indirect call promotion	Teresa Johnson	2016-07-17	1	-7/+27
	Summary: To enable profile-guided indirect call promotion in ThinLTO mode, we simply add call graph edges for each profitable target from the profile to the summaries, then the summary-guided importing will consider the callee for importing as usual. Also we need to enable the indirect call promotion pass creation in the PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO backend), so that the newly imported functions are considered for promotion in the backends. The IC promotion profiles refer to callees by GUID, which required adding GUIDs to the per-module VST in bitcode (and assigning them valueIds similar to how they are assigned valueIds in the combined index).
	Reviewers: mehdi_amini, xur
	Subscribers: mehdi_amini, davidxl, llvm-commits
	Differential Revision: http://reviews.llvm.org/D21932
	llvm-svn: 275707
*	[PM] Convert IVUsers analysis to new pass manager.	Dehao Chen	2016-07-16	2	-38/+60
	Summary: Convert IVUsers analysis to new pass manager.
	Reviewers: davidxl, silvas
	Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin
	Differential Revision: https://reviews.llvm.org/D22434
	llvm-svn: 275698
*	[CFLAA] Add attributes handling for CFLAnders.	George Burgess IV	2016-07-15	1	-9/+127
	This patch adds proper handling of stratified attributes into our anders-style CFLAA implementation. It also comes bundled with more CFLAnders tests. :) Patch by Jia Chen.
	Differential Revision: https://reviews.llvm.org/D22325
	llvm-svn: 275604
*	[CFLAA] Add an initial CFLAnders implementation.	George Burgess IV	2016-07-15	3	-27/+464
	This adds an incomplete anders-style implementation for CFLAA. It's incomplete in that it's missing interprocedural analysis, attrs handling, etc. and that it needs more tests. More tests and features will be added in future commits. Patch by Jia Chen.
	Differential Revision: https://reviews.llvm.org/D22291
	llvm-svn: 275602
*	[OptRemark,LDist] RFC: Add hotness attribute	Adam Nemet	2016-07-15	3	-0/+71
	Summary: This is the first set of changes implementing the RFC from http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
	This is a cross-sectional patch; rather than implementing the hotness attribute for all optimization remarks and all passes in a patch set, it implements it for the 'missed-optimization' remark for Loop Distribution. My goal is to shake out the design issues before scaling it up to other types and passes.
	Hotness is computed as an integer as the multiplication of the block frequency with the function entry count. It's only printed in opt currently since clang prints the diagnostic fields directly. E.g.:
	  remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)
	A new API added is similar to emitOptimizationRemarkMissed. The difference is that it additionally takes a code region that the diagnostic corresponds to. From this, hotness is computed using BFI. The new API is exposed via an analysis pass so that it can be made dependent on LazyBFI. (Thanks to Hal for the analysis pass idea.)
	This feature can all be enabled by setDiagnosticHotnessRequested in the LLVM context. If this is off, LazyBFI is not calculated (D22141) so there should be no overhead. A new command-line option is added to turn this on in opt.
	My plan is to switch all users of emitOptimizationRemark* to use this module instead.
	Reviewers: hfinkel
	Subscribers: rcox2, mzolotukhin, llvm-commits
	Differential Revision: http://reviews.llvm.org/D21771
	llvm-svn: 275583
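As a rough illustration of the hotness computation described above, the sketch below multiplies a block's frequency, taken relative to the function entry, by the function entry count. The function name and the use of a frequency relative to the entry block are assumptions made for this example; the commit message only states that hotness is the integer product of block frequency and entry count.

```python
from fractions import Fraction


def hotness(block_freq: int, entry_freq: int, entry_count: int) -> int:
    """Estimate how often a block runs, as an integer.

    Hypothetical helper: block_freq and entry_freq are raw frequencies,
    so their ratio is the block frequency relative to the function entry.
    """
    relative = Fraction(block_freq, entry_freq)
    return int(relative * entry_count)
```

Under these assumptions, a block that runs three times per function entry in a function entered 100 times gets a hotness of 300, matching the scale of the value in the sample remark.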
*	[AliasAnalysis] Give back AA results for fence instructions	David Majnemer	2016-07-15	1	-0/+3
	Calling getModRefInfo with a fence resulted in crashes because fences don't have a memory location. Add a new predicate to Instruction called isFenceLike which indicates that the instruction mutates memory but not any single memory location in particular. In practice, it is a proxy for the set of instructions which "mayWriteToMemory" but cannot be used with MemoryLocation::get. This fixes PR28570.
	llvm-svn: 275581
*	Re-submit r272891 "Prevent dangling pointer problems in BranchProbabilityInfo"	Igor Laevsky	2016-07-15	1	-0/+9
	Most likely the problem had the same cause as PR28400. This change bypasses it by using CallbackVH instead of AssertingVH.
	Differential Revision: https://reviews.llvm.org/D20957
	llvm-svn: 275563
*	[ValueTracking] Use Instruction::getFunction; NFC	Sanjoy Das	2016-07-14	1	-4/+2
	llvm-svn: 275465
*	GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals	Tom Stellard	2016-07-14	1	-1/+1
	Summary: In preparation for changing GlobalsAA to stop assuming that intrinsics can't read arbitrary globals, we need to make sure GlobalsAA is querying function attributes rather than relying on this assumption. This patch was inspired by: http://reviews.llvm.org/D20206
	Reviewers: jmolloy, hfinkel
	Subscribers: eli.friedman, llvm-commits
	Differential Revision: https://reviews.llvm.org/D21318
	llvm-svn: 275433
*	This implements a more optimal algorithm for selecting a base constant in constant hoisting.	Sjoerd Meijer	2016-07-14	1	-0/+8
	It not only takes into account the number of uses and the cost of expressions in which constants appear, but now also the resulting integer range of the offsets. Thus, the algorithm maximizes the number of uses within an integer range that will enable more efficient code generation. On ARM, for example, this will enable code size optimisations because fewer negative offsets will be created. Negative offsets/immediates are not supported by Thumb1, which prevents a more compact instruction encoding.
	Differential Revision: http://reviews.llvm.org/D21183
	llvm-svn: 275382
*	Simplify llvm.masked.load w/ undef masks	David Majnemer	2016-07-14	2	-19/+42
	We can always pick the passthru value if the mask is undef: we are permitted to treat the mask as-if it were filled with zeros.
	llvm-svn: 275379
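The reasoning above can be sketched with a small per-lane model of a masked load: enabled lanes read from memory and disabled lanes take the passthru value. The function below is a hypothetical Python model written for this illustration, not LLVM code.

```python
def masked_load(memory, mask, passthru):
    """Per-lane model of a masked load.

    A lane whose mask bit is set reads from memory; a lane whose mask bit
    is clear takes the corresponding passthru element.
    """
    return [m if bit else p for m, bit, p in zip(memory, mask, passthru)]
```

If an undef mask is treated as all zeros, every lane is disabled and the result is exactly the passthru vector, so the load can be replaced by passthru.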
*	[ConstantFolding] Fold masked loads	David Majnemer	2016-07-14	1	-1/+36
	We can constant fold a masked load if the operands are appropriately constant.
	Differential Revision: http://reviews.llvm.org/D22324
	llvm-svn: 275352
*	[ConstantFolding] Extend FoldReinterpretLoadFromConstPtr to handle negative offsets	David Majnemer	2016-07-13	1	-10/+20
	Treat loads which clip before the start of a global initializer the same way we treat clipping beyond the end of the initializer: use zeros.
	llvm-svn: 275345
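A minimal sketch of the clipping rule described above, assuming the initializer is modelled as a Python `bytes` object. The helper name `read_with_zero_fill` is hypothetical. Bytes that fall before the start or past the end of the initializer are read as zero.

```python
def read_with_zero_fill(initializer: bytes, offset: int, size: int) -> bytes:
    """Read `size` bytes starting at `offset`, which may be negative.

    Any byte outside the initializer is filled with zero, whether it lies
    before the start or beyond the end.
    """
    out = bytearray(size)  # starts as all zeros
    for i in range(size):
        src = offset + i
        if 0 <= src < len(initializer):
            out[i] = initializer[src]
    return bytes(out)
```

Both kinds of clipping go through the same bounds check, which mirrors the commit's point that a negative offset should be handled the same way as an overrun.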
*	Move a transform from InstCombine to InstSimplify.	David Majnemer	2016-07-13	1	-0/+9
	This transform doesn't require any new instructions, it can safely live in InstSimplify.
	llvm-svn: 275344
*	[LAA] Don't hold on to DominatorTree in the analysis result	Adam Nemet	2016-07-13	1	-4/+5
	llvm-svn: 275335
*	[LAA] Don't hold on to TargetLibraryInfo in the analysis result	Adam Nemet	2016-07-13	1	-3/+4
	llvm-svn: 275334
*	[LAA] Don't hold on to DataLayout in the analysis result	Adam Nemet	2016-07-13	1	-11/+8
	In fact, don't even pass this to the ctor since we can get it from the module.
	llvm-svn: 275326
*	[LAA] Don't hold on to LoopInfo in the analysis result	Adam Nemet	2016-07-13	1	-3/+3
	llvm-svn: 275325
*	[LAA] Don't hold on to AliasAnalysis in the analysis result	Adam Nemet	2016-07-13	1	-3/+3
	llvm-svn: 275322
*	Reverting r275284 due to platform-specific test failures	Andrew Kaylor	2016-07-13	1	-1/+0
	llvm-svn: 275304
*	Fix for Bug 26903, adds support to inline __builtin_mempcpy	Andrew Kaylor	2016-07-13	1	-0/+1
	Patch by Sunita Marathe
	Differential Revision: http://reviews.llvm.org/D21920
	llvm-svn: 275284
*	[ConstantFolding] Use sdiv_ov	David Majnemer	2016-07-13	1	-4/+4
	This is a simplification, there should be no functional change.
	llvm-svn: 275273
*	[ConstantFolding] Don't treat negative GEP offsets as positive	David Majnemer	2016-07-13	1	-4/+7
	GEP offsets are signed, don't treat them as huge positive numbers.
	llvm-svn: 275251
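To show what goes wrong when a signed offset is read as unsigned, the sketch below reinterprets a 64-bit pattern both ways. The 64-bit width and the helper names are assumptions chosen for the example.

```python
MASK64 = (1 << 64) - 1


def as_unsigned64(value: int) -> int:
    """Interpret the low 64 bits of `value` as an unsigned number."""
    return value & MASK64


def as_signed64(bits: int) -> int:
    """Interpret the low 64 bits of `bits` as a two's-complement number."""
    bits &= MASK64
    return bits - (1 << 64) if bits >= (1 << 63) else bits
```

An offset of -8 has the same bits as 18446744073709551608, so a check such as `offset < object_size` fails for the unsigned reading even though the signed value is only eight bytes before the start of the object.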
*	[BFI] Add new LazyBFI analysis pass	Adam Nemet	2016-07-13	4	-0/+76
	Summary: This is necessary for D21771. In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. However we don't want to pay for computing BFI unless the hotness attribute is requested. This is achieved by making BFI lazy at the very high-level through a new analysis pass -- BFI is not calculated unless requested. I am adding a test to check the laziness under D21771 where the first user of the analysis is added.
	Reviewers: hfinkel, dexonsmith, davidxl
	Subscribers: davidxl, dexonsmith, llvm-commits
	Differential Revision: http://reviews.llvm.org/D22141
	llvm-svn: 275250
*	[ConstantFolding] Cleanups	David Majnemer	2016-07-13	1	-67/+66
	No functional change is intended, just a minor cleanup.
	llvm-svn: 275249
*	[IR] Make getIndexedOffsetInType return a signed result	David Majnemer	2016-07-13	2	-3/+3
	A GEPed offset can go negative, the result of getIndexedOffsetInType should accordingly be a signed type.
	llvm-svn: 275246
*	Fix ScalarEvolutionExpander step scaling bug	Keno Fischer	2016-07-13	1	-0/+7
	The expandAddRecExprLiterally function incorrectly transforms `[Start + Step * X]` into `Step * [Start + X]` instead of the correct transform of `[Step * X] + Start`. This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219 due to what appeared to be sufficiently complicated loop interactions. Patch by Jameson Nash (jameson@juliacomputing.com).
	Reviewers: sanjoy
	Differential Revision: http://reviews.llvm.org/D16505
	llvm-svn: 275239
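The difference between the two expansions is easy to check numerically. The following sketch uses plain integers and hypothetical function names; it models only the arithmetic in the commit message, not the SCEV expander itself.

```python
def buggy_expansion(start: int, step: int, x: int) -> int:
    """The incorrect form: Step * (Start + X)."""
    return step * (start + x)


def correct_expansion(start: int, step: int, x: int) -> int:
    """The intended form: (Step * X) + Start."""
    return step * x + start
```

The two forms agree whenever Start is 0 or Step is 1, which may be why the bug only showed up in more complicated loops.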
*	Remove another unused variable from r275216	Teresa Johnson	2016-07-12	1	-2/+1
	Remove another variable added in r275216 that was only used in debug mode.
	llvm-svn: 275238
*	Refactor indirect call promotion profitability analysis (NFC)	Teresa Johnson	2016-07-12	3	-1/+112
	Summary: Refactored the profitability analysis out of the IC promotion pass and into lib/Analysis so that it can be accessed by the summary index builder in a follow-on patch to enable IC promotion in ThinLTO (D21932).
	Reviewers: davidxl, xur
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D22182
	llvm-svn: 275216
*	[LoopAccessAnalysis] Some minor cleanups	David Majnemer	2016-07-12	1	-20/+16
	Use range-based for loops. Use auto when appropriate. No functional change is intended.
	llvm-svn: 275213
*	Attempt to make buildbots happy.	George Burgess IV	2016-07-11	1	-3/+2
	Woohoo, unused variable warnings in builds without asserts (as a result of r275122).
	llvm-svn: 275126
*	[CFLAA] Simplify CFLGraphBuilder. NFC.	George Burgess IV	2016-07-11	4	-265/+185
	This patch simplifies the graph builder by encoding nodes as {Value, Dereference Level} pairs. This lets us kill edge types, and allows us to get rid of hacks in StratifiedSets (like addAttrsBelow/...). This simplification also allows us to remove InstantiatedRelations and InstantiatedAttrs. Patch by Jia Chen.
	Differential Revision: http://reviews.llvm.org/D22080
	llvm-svn: 275122
*	Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer	Alina Sbirlea	2016-07-11	1	-0/+8
	Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains. Add additional parameters: AddressSpace, Alignment, Fast.
	Reviewers: llvm-commits, jlebar
	Subscribers: arsenm, mzolotukhin
	Differential Revision: http://reviews.llvm.org/D21935
	llvm-svn: 275100
*	Implement callsite-hotness based inline cost for Sample-based PGO	Dehao Chen	2016-07-11	1	-1/+8
	Summary: For sample-based PGO, using BFI to calculate callsite count is sometimes not accurate. This is because with a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g.
	  if (A1 && A2 && A3 && ..... && A10) {
	    for (i=0; i < 100000000; i++) {
	      callsite();
	    }
	  }
	Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.
	In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.
	Reviewers: davidxl, eraman, dnovillo
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D22118
	llvm-svn: 275073
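The 1/(2^10) figure follows from multiplying the static probabilities of the ten guarding branches. A small sketch of that arithmetic, with hypothetical names and exact fractions to avoid rounding:

```python
from fractions import Fraction


def static_reach_fraction(num_guards: int,
                          taken_probability: Fraction = Fraction(1, 2)) -> Fraction:
    """Fraction of entries that a static heuristic expects to reach the loop.

    Each guard is assumed to be independent and taken with the same
    probability, which is the simplification used in the message above.
    """
    return taken_probability ** num_guards


def estimated_count(actual_count: int, num_guards: int) -> Fraction:
    """Count derived from static probabilities, given the true count."""
    return actual_count * static_reach_fraction(num_guards)
```

With ten guards the static estimate is 1/1024 of the real count, so a callsite that is actually hot can look cold unless its weight is recorded directly on the call instruction.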
*	AliasAnalysis: unify getModRefInfo(I, CS) semantics with other overloads	Nicolai Haehnle	2016-07-11	1	-1/+1
	This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when the Instruction is a call-site. This is now more in line with getModRefInfo generally: it returns Mod when I modifies a memory location that is accessed (read or written) by CS and Ref when I reads a memory location that is written by CS. From a grep of the code, the only uses of this particular getModRefInfo overload are in MemorySSA and MemCpyOptimizer, and they only care about whether the result is MR_NoModRef or not. Therefore, this change should have no visible effect. Separated out from D17279 upon request.
	llvm-svn: 275065
*	Pointer-comparison folding should look through returned-argument functions	Hal Finkel	2016-07-11	1	-0/+5
	For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis.
	Differential Revision: http://reviews.llvm.org/D9387
	llvm-svn: 275039
*	Teach isDereferenceablePointer to look through returned-argument functions	Hal Finkel	2016-07-11	1	-0/+5
	For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value.
	Differential Revision: http://reviews.llvm.org/D9384
	llvm-svn: 275038
*	Teach SCEV to look through returned-argument functions	Hal Finkel	2016-07-11	1	-0/+7
	When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value.
	Differential Revision: http://reviews.llvm.org/D9381
	llvm-svn: 275037
*	Teach computeKnownBits to look through returned-argument functions	Hal Finkel	2016-07-11	1	-3/+8
	If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value.
	Differential Revision: http://reviews.llvm.org/D9397
	llvm-svn: 275036
*	BasicAA should look through functions with returned arguments	Hal Finkel	2016-07-11	2	-2/+20
	Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't lose all other AA information when supplementing with llvm.noalias.
	Differential Revision: http://reviews.llvm.org/D9383
	llvm-svn: 275035
*	[CFLAA] Make a constant variable `const`. NFC.	George Burgess IV	2016-07-09	1	-1/+1
	`const` was dropped by r274958, and the lack of `const` makes GCC6 (correctly) complain.
	llvm-svn: 274961
*	[CFLAA] Move the graph builder out from CFLSteens. NFC.	George Burgess IV	2016-07-09	3	-422/+440
	Patch by Jia Chen.
	Differential Revision: http://reviews.llvm.org/D22022
	llvm-svn: 274958
*	[CFLAA] Simplify CFLGraphBuilder. NFC.	George Burgess IV	2016-07-09	1	-41/+25
	This removes a few fields from the graph builder by making us compute things (that we'd always compute anyway) more eagerly. Patch by Jia Chen.
	Differential Revision: http://reviews.llvm.org/D22009
	llvm-svn: 274957
*	Revert "InstCombine rule to fold truncs whose value is available"	Anna Thomas	2016-07-08	1	-27/+16
	This reverts commit r274853. Caused failure in ppcBE build
	llvm-svn: 274943
*	[TTI] Expose TTI::getGEPCost and use it in SLSR and NaryReassociate.	Jingyue Wu	2016-07-08	1	-0/+5
	NFC.
	llvm-svn: 274940
*	[PM] name the new PM LAA class LoopAccessAnalysis (LAA) /NFC	Xinliang David Li	2016-07-08	1	-3/+3
	llvm-svn: 274934