bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	use range loop; NFCI	Sanjay Patel	2016-04-04	1	-3/+3
\| \| \| \|	llvm-svn: 265360
*	Enable unroll for constant bound loops when TripCount is not modulo of ↵	Zia Ansari	2016-04-04	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit. Commit for Evgeny Stupachenko (evstupac@gmail.com) Differential Revision: http://reviews.llvm.org/D18290 llvm-svn: 265337
*	[PGO] Avoid instrumenting direct callee's at value sites.	Betul Buyukkurt	2016-04-04	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	Direct callees' that are cast to other function prototypes, show up in the Call/Invoke instructions as ConstantExpr's. Currently llvm::CallSite's getCalledFunction() fails to return the callees in such expressions as direct calls. Value profiling should avoid instrumenting such cases. Mostly NFC. llvm-svn: 265330
*	[ThinLTO] Augment FunctionImport dump with value name to GUID map	Teresa Johnson	2016-04-04	1	-3/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: To aid in debugging, dump out the correlation between value names and GUID for each source module when it is materialized. This will make it easier to comprehend the earlier summary-based function importing debug trace which only has access to and prints the GUIDs. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18556 llvm-svn: 265326
*	ValueMapper: Remove old FIXMEs; almost NFC	Duncan P. N. Exon Smith	2016-04-04	1	-21/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove a few old FIXMEs from the original commit of the Metadata/Value split in r223802. These are commented out assertions to the effect that calls between mapValue and mapMetadata never return nullptr. (The only behaviour change is that Mapper::mapSimpleMetadata memoizes the nullptr return.) When I originally rewrote the mapping code, I thought we could be stricter in the new metadata hierarchy and never return nullptr when RF_NullMapMissingGlobalValues was off. It's still not entirely clear to me why these assertions failed (a few months ago, I had a theory that I forgot to write down, but that's helping no one). Understood or not, I no longer see how these commented-out assertions would be useful. I'm relegating them to the annals of source control before making significant changes to ValueMapper.cpp. llvm-svn: 265282
*	ValueMapper: Disallow metadata mapping recursion through mapValue	Duncan P. N. Exon Smith	2016-04-03	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds an assertion to maintain the property from r265273. When Mapper::mapSimpleMetadata calls Mapper::mapValue, it should not find its way back to mapMetadataImpl. This guarantees that mapSimpleMetadata is not involved in any recursion. Since Mapper::mapValue calls out to arbitrary materializers, we need to save a bit on the ValueMap to make this assertion effective. There should be no functionality change here. This co-recursion should already have been impossible. llvm-svn: 265276
*	Work around MSVC failure from r265273	Duncan P. N. Exon Smith	2016-04-03	1	-0/+10
\| \| \| \| \| \|	http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19726 llvm-svn: 265275
*	ValueMapper: Avoid recursion in mapSimplifiedMetadata, NFC	Duncan P. N. Exon Smith	2016-04-03	1	-9/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main change is to delay materializing GlobalValue initializers from Mapper::mapValue until Mapper::~Mapper. This effectively removes all recursion from mapSimplifiedMetadata, as promised in r265270. mapSimplifiedMetadata calls mapValue for ConstantAsMetadata nodes to find the mapped constant, and now it shouldn't be possible for mapValue to indirectly re-invoke mapMetadata. I'll add an assertion to that effect in a follow-up (separated so that the assertion can easily be reverted independently, if it comes to that). This a step toward a broader goal: converting Mapper::mapMetadataImpl from a recursive to an iterative algorithm. When a BlockAddress points at a BasicBlock inside an unmaterialized function body, we need to delay it until the function body is materialized in Mapper::~Mapper. This commit creates a temporary BasicBlock and returns a new BlockAddress, then RAUWs the BasicBlock once it is known. This situation should be extremely rare since a BlockAddress is usually used from within the function it's referencing (and BlockAddress itself is rare). There should be no observable functionality change. llvm-svn: 265273
*	ValueMapper: Split out mapSimpleMetadata, NFC	Duncan P. N. Exon Smith	2016-04-03	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split out a helper for mapping metadata without operands. This is any metadata that is not an MDNode, and any MDNode where the answer is known without looking at operands. Through some weird twists, this function is co-recursive: mapSimpleMetadata => MapValue => materializeInitFor => linkFunctionBody => RemapInstructions => MapMetadata => mapSimpleMetadata I plan to break the recursion in a follow-up. llvm-svn: 265270
*	ValueMapper: Introduce Mapper helper class, NFC	Duncan P. N. Exon Smith	2016-04-03	1	-85/+101
\| \| \| \| \| \| \|	Remove a bunch of boilerplate from ValueMapper.cpp by using a new file-local class called Mapper. llvm-svn: 265268
*	[SimplifyLibCalls] Garbage collect dead code.	Davide Italiano	2016-04-03	1	-28/+7
\| \| \| \| \| \| \| \| \| \|	We already skip optimizations if the return value of printf() is used, so CI->use_empty() is always true. Differential Revision: http://reviews.llvm.org/D18656 llvm-svn: 265253
*	Linker: Remove IRMover::isMetadataUnneeded indirection; almost NFC	Duncan P. N. Exon Smith	2016-04-02	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of checking live during MapMetadata whether a subprogram is needed, seed the ValueMap with `nullptr` up-front. There is a small hypothetical functionality change. Previously, calling MapMetadataOp on a node whose "scope:" chain led to an unneeded subprogram would return nullptr. However, if that were ever called, then the subprogram would be needed; a situation that the IRMover is supposed to avoid a priori! Besides cleaning up the code a little, this restores a nice property: MapMetadataOp returns the same as MapMetadata. llvm-svn: 265229
*	ValueMapper: Add support for seeding metadata with nullptr	Duncan P. N. Exon Smith	2016-04-02	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Support seeding a ValueMap with nullptr for Metadata entries, a situation I didn't consider in the Metadata/Value split. I added a ValueMapper::getMappedMD accessor that returns an Optional<Metadata*> with the mapped (possibly null) metadata. IRMover needs to use this to avoid modifying the map when it's checking for unneeded subprograms. I updated a call from bugpoint since I find the new code clearer. llvm-svn: 265228
*	Fix "warning: variabl 'XX’ set but not used" in release build (variable ↵	Mehdi Amini	2016-04-02	1	-1/+1
\| \| \| \| \| \| \|	used in assertion, NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265220
*	Create a typedef GlobalValue::GUID for uint64_t and RAUW (NFC)	Mehdi Amini	2016-04-02	1	-6/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This should make the code more readable, especially all the map declarations. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18721 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265215
*	[PGO] Use a helper function to find all indirect call-sites	Rong Xu	2016-04-01	2	-26/+46
\| \| \| \| \| \| \| \| \| \|	Use a helper function to find all the direct-calls-sites in a function. Also split the code into a separated file as this will be use by indirect-call-promotion transformation. Differential Revision: http://reviews.llvm.org/D18704 llvm-svn: 265199
*	LowerBitSets: Move declarations to separate namespace.	Peter Collingbourne	2016-04-01	1	-0/+1
\| \| \| \| \| \|	Should fix modules build. llvm-svn: 265176
*	[sancov] save entry block from pruning (it is always full dominator)	Mike Aizatsky	2016-04-01	1	-3/+3
\| \| \| \|	llvm-svn: 265168
*	[InstCombine] Don't sink an instr after a catchswitch	David Majnemer	2016-04-01	1	-1/+5
\| \| \| \| \| \|	A catchswitch is a terminator, instructions cannot be inserted after it. llvm-svn: 265158
*	[SLPVectorizer] Don't insert an extractelement before a catchswitch	David Majnemer	2016-04-01	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	A catchswitch cannot be preceded by another instruction in the same basic block (other than a PHI node). Instead, insert the extract element right after the materialization of the vectorized value. This isn't optimal but is a reasonable compromise given the constraints of WinEH. This fixes PR27163. llvm-svn: 265157
*	[PGO] Refactor PGOFuncName meta data code to be used in clang	Rong Xu	2016-04-01	1	-8/+2
\| \| \| \| \| \| \| \| \|	Refactor the code that gets and creates PGOFuncName meta data so that it can be used in clang's value profile annotation. Differential Revision: http://reviews.llvm.org/D18623 llvm-svn: 265149
*	Add a module Hash in the bitcode and the combined index, implementing a kind ↵	Mehdi Amini	2016-04-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	of "build-id" This is intended to be used for ThinLTO incremental build. Differential Revision: http://reviews.llvm.org/D18213 This is a recommit of r265095 after fixing the Windows issues. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265111
*	Revert "Add support for computing SHA1 in LLVM"	Mehdi Amini	2016-04-01	1	-1/+1
\| \| \| \| \| \| \| \|	This reverts commit r265096, r265095, and r265094. Windows build is broken, and the validation does not pass. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265102
*	Don't insert stackrestore on deoptimizing returns	Sanjoy Das	2016-04-01	1	-2/+4
\| \| \| \| \| \| \| \|	They're not necessary (since the stack pointer is trivially restored on return), and the way LLVM inserts the stackrestore calls breaks the IR (we get a stackrestore between the deoptimize call and the return). llvm-svn: 265101
*	Don't insert lifetime end markers on deoptimizing returns	Sanjoy Das	2016-04-01	1	-2/+5
\| \| \| \| \| \| \| \| \|	They're not necessary (since the lifetime of the alloca is trivially over due to the return), and the way LLVM inserts the lifetime.end markers breaks the IR (we get a lifetime end marker between the deoptimize call and the return). llvm-svn: 265100
*	Add a module Hash in the bitcode and the combined index, implementing a kind ↵	Mehdi Amini	2016-04-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	of "build-id" This is intended to be used for ThinLTO incremental build. Differential Revision: http://reviews.llvm.org/D18213 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265095
*	Preserve blockaddress use edges in the module splitter.	Evgeniy Stepanov	2016-03-31	1	-45/+46
\| \| \| \| \| \| \| \|	"blockaddress" can not apply to an external function. All blockaddress constant uses must belong to the same module as the definition of the target function. llvm-svn: 265061
*	[NVPTX] Infer __nvvm_reflect as nounwind, readnone	David Majnemer	2016-03-31	1	-0/+9
\| \| \| \| \| \| \| \| \| \|	This patch simply mirrors the attributes we give to @llvm.nvvm.reflect to the __nvvm_reflect libdevice call. This shaves about 30% of the code in libdevice away because of CSE opportunities. It's also helps us figure out that libdevice implementations of transcendental functions don't have side-effects. llvm-svn: 265060
*	Preserve extern_weak linkage in CloneModule.	Evgeniy Stepanov	2016-03-31	1	-10/+15
\| \| \| \| \| \| \|	Only force "extern" linkage if the function used to be a definition in the source module. Declarations keep their original linkage. llvm-svn: 265043
*	Minor code cleanup /NFC	Xinliang David Li	2016-03-31	1	-4/+6
\| \| \| \|	llvm-svn: 265025
*	Introduce a @llvm.experimental.guard intrinsic	Sanjoy Das	2016-03-31	4	-5/+117
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As discussed on llvm-dev[1]. This change adds the basic boilerplate code around having this intrinsic in LLVM: - Changes in Intrinsics.td, and the IR Verifier - A lowering pass to lower @llvm.experimental.guard to normal control flow - Inliner support [1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18527 llvm-svn: 264976
*	AMDGPU: Add frexp_exp intrinsic	Matt Arsenault	2016-03-30	1	-5/+16
\| \| \| \|	llvm-svn: 264944
*	AMDGPU: Constant folding for frexp_mant	Matt Arsenault	2016-03-30	1	-0/+14
\| \| \| \|	llvm-svn: 264943
*	Cloning: Reduce complexity of debug info cloning and fix correctness issue.	Peter Collingbourne	2016-03-30	2	-3/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit r260791 contained an error in that it would introduce a cross-module reference in the old module. It also introduced O(N^2) complexity in the module cloner by requiring the entire module to be visited for each function. Fix both of these problems by avoiding use of the CloneDebugInfoMetadata function (which is only designed to do intra-module cloning) and cloning function-attached metadata in the same way that we clone all other metadata. Differential Revision: http://reviews.llvm.org/D18583 llvm-svn: 264935
*	Silencing warnings from MSVC 2015 Update 2. All of these changes silence ↵	Aaron Ballman	2016-03-30	2	-5/+5
\| \| \| \| \| \|	"C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC. llvm-svn: 264929
*	[IndVarSimplify] Don't insert after a catchswitch	David Majnemer	2016-03-30	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	Widening a PHI requires us to insert a trunc. The logical place for this trunc is in the same BB as the PHI. This is not possible if the BB is terminated by a catchswitch. This fixes PR27133. llvm-svn: 264926
*	[PassManager] Make PassManagerBuilder::addExtension take an std::function, ↵	Justin Lebar	2016-03-30	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rather than a function pointer. Summary: This gives callers flexibility to pass lambdas with captures, which lets callers avoid the C-style void*-ptr closure style. (Currently, callers in clang store state in the PassManagerBuilderBase arg.) No functional change, and the new API is backwards-compatible. Reviewers: chandlerc Subscribers: joker.eph, cfe-commits Differential Revision: http://reviews.llvm.org/D18613 llvm-svn: 264918
*	[LoopVectorize] Don't vectorize loops when everything will be scalarized	Hal Finkel	2016-03-30	1	-18/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change prevents the loop vectorizer from vectorizing when all of the vector types it generates will be scalarized. I've run into this problem on the PPC's QPX vector ISA, which only holds floating-point vector types. The loop vectorizer will, however, happily vectorize loops with purely integer computation. Here's an example: LV: The Smallest and Widest types: 32 / 32 bits. LV: The Widest register is: 256 bits. LV: Found an estimated cost of 0 for VF 1 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 1 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 1 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 1 for VF 1 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 1 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 1 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 1 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Scalar loop costs: 3. LV: Found an estimated cost of 0 for VF 2 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 2 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 2 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 2 for VF 2 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 2 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 2 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 2 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 2 costs: 2. LV: Found an estimated cost of 0 for VF 4 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 4 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 4 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 4 for VF 4 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 4 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 4 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 4 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 4 costs: 1. ... LV: Selecting VF: 8. LV: The target has 32 registers LV(REG): Calculating max register usage: LV(REG): At #0 Interval # 0 LV(REG): At #1 Interval # 1 LV(REG): At #2 Interval # 2 LV(REG): At #4 Interval # 1 LV(REG): At #5 Interval # 1 LV(REG): VF = 8 The problem is that the cost model here is not wrong, exactly. Since all of these operations are scalarized, their cost (aside from the uniform ones) are indeed VF*(scalar cost), just as the model suggests. In fact, the larger the VF picked, the lower the relative overhead from the loop itself (and the induction-variable update and check), and so in a sense, picking the largest VF here is the right thing to do. The problem is that vectorizing like this, where all of the vectors will be scalarized in the backend, isn't really vectorizing, but rather interleaving. By itself, this would be okay, but then the vectorizer itself also interleaves, and that's where the problem manifests itself. There's aren't actually enough scalar registers to support the normal interleave factor multiplied by a factor of VF (8 in this example). In other words, the problem with this is that our register-pressure heuristic does not account for scalarization. While we might want to improve our register-pressure heuristic, I don't think this is the right motivating case for that work. Here we have a more-basic problem: The job of the vectorizer is to vectorize things (interleaving aside), and if the IR it generates won't generate any actual vector code, then something is wrong. Thus, if every type looks like it will be scalarized (i.e. will be split into VF or more parts), then don't consider that VF. This is not a problem specific to PPC/QPX, however. The problem comes up under SSE on x86 too, and as such, this change fixes PR26837 too. I've added Sanjay's reduced test case from PR26837 to this commit. Differential Revision: http://reviews.llvm.org/D18537 llvm-svn: 264904
*	[PGO] PGOFuncName in LTO optimizations	Rong Xu	2016-03-30	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PGOFuncNames are used as the key to retrieve the Function definition from the MD5 stored in the profile. For internal linkage function, we prefix the source file name to the PGOFuncNames. LTO's internalization privatizes many global linkage symbols. This happens after value profile annotation, but those internal linkage functions should not have a source prefix. To differentiate compiler generated internal symbols from original ones, PGOFuncName meta data are created and attached to the original internal symbols in the value profile annotation step. If a symbol does not have the meta data, its original linkage must be non-internal. Also add a new map that maps PGOFuncName's MD5 value to the function definition. Differential Revision: http://reviews.llvm.org/D17895 llvm-svn: 264902
*	Remove HasFnAttribute guards to getFnAttribute calls	Nirav Dave	2016-03-30	2	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \|	These checks are redundant and can be removed Reviewers: hans Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D18564 llvm-svn: 264872
*	[MemorySSA] Make the visitor more careful with calls.	George Burgess IV	2016-03-30	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to this patch, the MemorySSA caching visitor would cache all calls that it visited. When paired with phi optimization, this can be problematic. Consider: define void @foo() { ; 1 = MemoryDef(liveOnEntry) call void @clobberFunction() br i1 undef, label %if.end, label %if.then if.then: ; MemoryUse(??) call void @readOnlyFunction() ; 2 = MemoryDef(1) call void @clobberFunction() br label %if.end if.end: ; 3 = MemoryPhi(...) ; MemoryUse(?) call void @readOnlyFunction() ret void } When optimizing MemoryUse(?), we visit defs 1 and 2, so we note to cache them later. We ultimately end up not being able to optimize passed the Phi, so we set MemoryUse(?) to point to the Phi. We then cache the clobbering call for def 1 to be the Phi. This commit changes this behavior so that we wipe out any calls added to VisistedCalls while visiting the defs of a phi we couldn't optimize. Aside: With this patch, we now can bootstrap clang/LLVM without a single MemorySSA verifier failure. Woohoo. :) llvm-svn: 264820
*	[PGO] Handle invoke inst in IR based icall instrumentation	Xinliang David Li	2016-03-30	1	-5/+7
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D18580 llvm-svn: 264818
*	[MemorySSA] Change how the walker views/walks visited phis.	George Burgess IV	2016-03-30	1	-22/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch teaches the caching MemorySSA walker a few things: 1. Not to walk Phis we've walked before. It seems that we tried to do this before, but it didn't work so well in cases like: define void @foo() { %1 = alloca i8 %2 = alloca i8 br label %begin begin: ; 3 = MemoryPhi({%0,liveOnEntry},{%end,2}) ; 1 = MemoryDef(3) store i8 0, i8* %2 br label %end end: ; MemoryUse(?) load i8, i8* %1 ; 2 = MemoryDef(1) store i8 0, i8* %2 br label %begin } Because we wouldn't put Phis in Q.Visited until we tried to visit them. So, when trying to optimize MemoryUse(?): - We would visit 3 above - ...Which would make us put {%0,liveOnEntry} in Q.Visited - ...Which would make us visit {%0,liveOnEntry} - ...Which would make us put {%end,2} in Q.Visited - ...Which would make us visit {%end,2} - ...Which would make us visit 3 - ...Which would realize we've already visited everything in 3 - ...Which would make us conservatively return 3. In the added test-case, (@looped_visitedonlyonce) this behavior would cause us to give incorrect results. Specifically, we'd visit 4 twice in the same query, but on the second visit, we'd skip while.cond because it had been visited, visit if.then/if.then2, and cache "1" as the clobbering def on the way back. 2. If we try to walk the defs of a {Phi,MemLoc} and see it has been visited before, just hand back the Phi we're trying to optimize. I promise this isn't as terrible as it seems. :) We now insert {Phi,MemLoc} pairs just before walking the Phi's upward defs. So, we check the cache for the {Phi,MemLoc} pair before checking if we've already walked the Phi. The {Phi,MemLoc} pair is (almost?) always guaranteed to have a cache entry if we've already fully walked it, because we cache as we go. So, if the {Phi,MemLoc} pair isn't in cache, either: (a) we must be in the process of visiting it (in which case, we can't give a better answer in a cache-as-we-go DFS walker) (b) we visited it, but didn't cache it on the way back (...which seems to require `ModifyingAccess` to not dominate `StartingAccess`, so I'm 99% sure that would be an error. If it's not an error, I haven't been able to get it to happen locally, so I suspect it's rare.) - - - - - As a consequence of this change, we no longer skip upward defs of phis, so we can kill the `VisitedOnlyOne` check. This gives us better accuracy than we had before, at the cost of potentially doing a bit more work when we have a loop. llvm-svn: 264814
*	[LoopDataPrefetch] Centralize the tuning cl::opts under the pass	Adam Nemet	2016-03-29	1	-4/+35
\| \| \| \| \| \| \| \| \|	This is effectively NFC, minus the renaming of the options (-cyclone-prefetch-distance -> -prefetch-distance). The change was requested by Tim in D17943. llvm-svn: 264806
*	[tsan] Do not instrument reads/writes to instruction profile counters.	Anna Zaks	2016-03-29	1	-1/+25
\| \| \| \| \| \| \| \| \|	We have known races on profile counters, which can be reproduced by enabling -fsanitize=thread and -fprofile-instr-generate simultaneously on a multi-threaded program. This patch avoids reporting those races by not instrumenting the reads and writes coming from the instruction profiler. llvm-svn: 264805
*	ADCE: Remove debug info intrinsics in dead scopes	Duncan P. N. Exon Smith	2016-03-29	1	-6/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During ADCE, track which debug info scopes still have live references from the code, and delete debug info intrinsics for the dead ones. These intrinsics describe the locations of variables (in registers or stack slots). If there's no code left corresponding to a variable's scope, then there's no way to reference the variable in the debugger and it doesn't matter what its value is. I add a DEBUG printout when the described location in an SSA register, in case it helps some trying to track down why locations get lost. However, we still delete these; the scope itself isn't attached to any real code, so the ship has already sailed. llvm-svn: 264800
*	[LoopDataPrefetch] Make more member functions private, NFC.	Adam Nemet	2016-03-29	1	-1/+2
\| \| \| \|	llvm-svn: 264798
*	[ThinLTO] Remove post-pass metadata linking support	Teresa Johnson	2016-03-29	1	-64/+21
\| \| \| \| \| \| \| \| \| \| \|	Since we have moved to a model where functions are imported in bulk from each source module after making summary-based importing decisions, there is no longer a need to link metadata as a postpass, and all users have been removed. This essentially reverts r255909 and follow-on fixes. llvm-svn: 264763
*	Make InlineSimple's one-arg constructor explicit. NFC	Justin Lebar	2016-03-29	1	-1/+2
\| \| \| \|	llvm-svn: 264744
*	Reformat a comment in InlineSimple.cpp. NFC	Justin Lebar	2016-03-29	1	-3/+3
\| \| \| \|	llvm-svn: 264743