path: root/llvm/lib/Transforms
...
* [InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing memcpy with ld/st (Dorit Nuzman, 2016-09-04; 1 file, -0/+6)

  When InstCombine replaces a memcpy with loads+stores, it does not copy
  over the llvm.mem.parallel_loop_access metadata from the memcpy
  instruction. This patch fixes that.

  Differential Revision: https://reviews.llvm.org/D23499
  llvm-svn: 280617
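  An illustrative sketch (not taken from the patch; the names and the
  i32 element type are made up) of the metadata that must be carried
  over when the memcpy is expanded:

    ; before: a small memcpy inside a parallel loop
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 4, i32 4, i1 false), !llvm.mem.parallel_loop_access !0

    ; after: the replacement load/store must keep the same metadata
    %v = load i32, i32* %src.i32, align 4, !llvm.mem.parallel_loop_access !0
    store i32 %v, i32* %dst.i32, align 4, !llvm.mem.parallel_loop_access !0

    !0 = distinct !{!0}  ; the enclosing loop's identifier

  Without the metadata on the new instructions, the loop would no longer
  be recognized as parallel by the vectorizer.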
* Test commit. (Dorit Nuzman, 2016-09-04; 1 file, -0/+1)

  llvm-svn: 280615
* Fix inliner funclet unwind memoization (Joseph Tremoulet, 2016-09-04; 1 file, -7/+79)

  Summary:
  The inliner may need to determine where a given funclet unwinds to,
  and this determination may depend on other funclets throughout the
  funclet tree. The code that performs this walk in getUnwindDestToken
  memoizes results to avoid redundant computations. In the case that a
  funclet's unwind destination is derived from its ancestor, there's
  code to walk back down the tree from the ancestor, updating the memo
  map of its descendants to record the unwind destination. This change
  fixes that code to account for the case that some descendant has a
  different unwind destination, which can happen if that unwind dest is
  a descendant of the EHPad being queried and thus didn't determine its
  unwind destination.

  Also update test inline-funclets.ll, which is supposed to cover such
  scenarios, to include a case that fails an assertion without this fix
  but passes with it.

  Fixes PR29151.

  Reviewers: majnemer
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24117
  llvm-svn: 280610
* Cleanup: Use metadata-preserving API for branch creation (Xinliang David Li, 2016-09-03; 1 file, -9/+4)

  Use the wrapper API in IRBuilder that copies metadata when creating a
  new branch in LoopUnswitch.

  llvm-svn: 280602
* AMDGPU: Do basic folding of class intrinsic (Matt Arsenault, 2016-09-03; 1 file, -0/+79)

  This allows more of the OCML builtin library to be constant folded.

  llvm-svn: 280586
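  A hedged sketch of the kind of fold this enables (the exact cases
  handled are in the patch; the mask values here are illustrative):

    declare i1 @llvm.amdgcn.class.f32(float, i32)

    ; an all-zero test mask can never match any FP class:
    %a = call i1 @llvm.amdgcn.class.f32(float %x, i32 0)     ; -> false

    ; fully-constant operands fold to a constant result, e.g. testing
    ; 1.0 against the "positive normal" class bit (1 << 8):
    %b = call i1 @llvm.amdgcn.class.f32(float 1.0, i32 256)  ; -> true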
* ADT: Do not inherit from std::iterator in ilist_iterator (Duncan P. N. Exon Smith, 2016-09-03; 1 file, -1/+1)

  Inheriting from std::iterator uses more boilerplate than manual
  typedefs. Avoid that in both ilist_iterator and
  MachineInstrBundleIterator.

  This has the side effect of removing ilist_iterator from certain ADL
  lookups in namespace std; calls to std::next now need to be qualified
  with "std::" where they didn't before. The one case of this in-tree
  was operating on a temporary, so I used the more compact operator++.

  llvm-svn: 280570
* [Profile] handle select instruction in 'expect' lowering (Xinliang David Li, 2016-09-02; 1 file, -11/+25)

  Builtin 'expect' lowering currently ignores select instructions; this
  patch fixes that.

  Differential Revision: http://reviews.llvm.org/D24166
  llvm-svn: 280547
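  A sketch of the pattern this covers (illustrative IR, not from the
  patch):

    declare i64 @llvm.expect.i64(i64, i64)

    ; the expected value reaches llvm.expect through a select:
    %t = select i1 %cmp, i64 1, i64 0
    %e = call i64 @llvm.expect.i64(i64 %t, i64 1)
    ; lowering can now map the expectation back through the select and
    ; attach !prof weights favoring the %cmp-is-true side, instead of
    ; silently dropping the hint.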
* [SLP] Don't pass a global CL option as an argument. NFC. (Chad Rosier, 2016-09-02; 1 file, -8/+7)

  Differential Revision: https://reviews.llvm.org/D24199
  llvm-svn: 280527
* [InstCombine] fold insertelement of constant into shuffle with constant operand (PR29126) (Sanjay Patel, 2016-09-02; 1 file, -0/+76)

  The motivating case occurs with SSE/AVX scalar intrinsics, so this is
  a first step towards shrinking that to a single shufflevector.

  Note that the transform is intentionally limited to shuffles that are
  equivalent to vector selects, to avoid creating arbitrary shuffle
  masks that may not lower well.

  This should solve PR29126:
  https://llvm.org/bugs/show_bug.cgi?id=29126

  Differential Revision: https://reviews.llvm.org/D23886
  llvm-svn: 280504
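  An illustrative example of the fold (values made up; the real tests
  are in the patch):

    ; the shuffle below is select-equivalent: each lane picks either
    ; %x or the constant vector at the same position
    %shuf = shufflevector <4 x float> %x, <4 x float> <float 1.0, float 2.0, float 3.0, float 4.0>, <4 x i32> <i32 0, i32 5, i32 6, i32 7>
    %ins  = insertelement <4 x float> %shuf, float 5.0, i32 1

    ; -> the inserted constant folds into the shuffle's constant operand:
    %res  = shufflevector <4 x float> %x, <4 x float> <float 1.0, float 5.0, float 3.0, float 4.0>, <4 x i32> <i32 0, i32 5, i32 6, i32 7>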
* [LV] Ensure reverse interleaved group GEPs remain uniform (Matthew Simpson, 2016-09-02; 1 file, -1/+11)

  For uniform instructions, we're only required to generate a scalar
  value for the first vector lane of each unroll iteration. Thus, if we
  have a reverse interleaved group, computing the member index off the
  scalar GEP corresponding to the last vector lane of its pointer
  operand technically makes the GEP non-uniform. We should compute the
  member index off the first scalar GEP instead.

  I've added the updated member index computation to the existing
  reverse interleaved group test.

  llvm-svn: 280497
* [SimplifyCFG] Add a workaround to fix PR30188 (James Molloy, 2016-09-02; 1 file, -0/+10)

  We're sinking stores, which is a good thing, but in the process we
  create selects for the store address operand. SROA/Mem2Reg can't look
  through those selects, which caused serious regressions. The real fix
  is in SROA, which I'll be looking into.

  llvm-svn: 280470
* revert r280429 and r280425: (Dehao Chen, 2016-09-02; 1 file, -22/+24)

  r280425 | dehao | 2016-09-01 16:15:50 -0700 (Thu, 01 Sep 2016) | 9 lines
    Refactor LICM pass in preparation for LoopSink pass.
    Summary: LoopSink pass uses some common function in LICM. This patch
    refactor the LICM code to make it usable by LoopSink pass
    (https://reviews.llvm.org/D22778).

  r280429 | dehao | 2016-09-01 16:31:25 -0700 (Thu, 01 Sep 2016) | 9 lines
    Refactor LICM to expose canSinkOrHoistInst to LoopSink pass.
    Summary: LoopSink pass shares the same canSinkOrHoistInst
    functionality with LICM pass. This patch exposes this function in
    preparation of https://reviews.llvm.org/D22778

  llvm-svn: 280453
* revert r280432: (Dehao Chen, 2016-09-02; 1 file, -6/+5)

  r280432 | dehao | 2016-09-01 16:51:37 -0700 (Thu, 01 Sep 2016) | 9 lines
    Explicitly require DominatorTreeAnalysis pass for instsimplify pass.
    Summary: DominatorTreeAnalysis is always required by instsimplify.

  llvm-svn: 280452
* Explicitly require DominatorTreeAnalysis pass for instsimplify pass. (Dehao Chen, 2016-09-01; 1 file, -5/+6)

  Summary: DominatorTreeAnalysis is always required by instsimplify.

  Reviewers: davidxl, danielcdh
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24173
  llvm-svn: 280432
* Refactor LICM to expose canSinkOrHoistInst to LoopSink pass. (Dehao Chen, 2016-09-01; 1 file, -7/+3)

  Summary: LoopSink pass shares the same canSinkOrHoistInst
  functionality with LICM pass. This patch exposes this function in
  preparation for https://reviews.llvm.org/D22778.

  Reviewers: chandlerc, davidxl, danielcdh
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24171
  llvm-svn: 280429
* Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself. (Dehao Chen, 2016-09-01; 2 files, -4/+6)

  Summary: This is in preparation for the LoopSink pass, which calls
  replaceDominatedUsesWith to update uses after sinking.

  Reviewers: chandlerc, davidxl, danielcdh
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24170
  llvm-svn: 280427
* Refactor LICM pass in preparation for LoopSink pass. (Dehao Chen, 2016-09-01; 1 file, -21/+23)

  Summary: LoopSink pass uses some common functions in LICM. This patch
  refactors the LICM code to make it usable by LoopSink pass
  (https://reviews.llvm.org/D22778).

  Reviewers: chandlerc, davidxl, danielcdh
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24168
  llvm-svn: 280425
* [LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI) (Matthew Simpson, 2016-09-01; 1 file, -22/+9)

  We can now maintain scalar values in VectorLoopValueMap. Thus, we no
  longer have to create temporary vectors with insertelement
  instructions when handling pointer induction variables. This case was
  mistakenly missed in r279649 when refactoring the other scalarization
  code.

  llvm-svn: 280405
* [LV] Move VectorParts allocation and mapping into PHI widening (NFC) (Matthew Simpson, 2016-09-01; 1 file, -29/+38)

  This patch moves the allocation of VectorParts for PHI nodes into the
  actual PHI widening code. Previously, we allocated these VectorParts
  in vectorizeBlockInLoop, and passed them by reference to
  widenPHIInstruction. Upon returning, we would then map the VectorParts
  in VectorLoopValueMap.

  This behavior is problematic for the cases where we only want to
  generate a scalar version of a PHI node. For example, if in the future
  we only generate a scalar version of an induction variable, we would
  end up inserting an empty vector entry into the map once we return to
  vectorizeBlockInLoop.

  We now no longer need to pass VectorParts to the various PHI widening
  functions, and we can keep VectorParts allocation as close as possible
  to the point at which they are actually mapped in VectorLoopValueMap.

  llvm-svn: 280390
* [EarlyCSE] Change C API pass interface for EarlyCSE w/ MemorySSA (Geoff Berry, 2016-09-01; 1 file, -2/+6)

  A previous change broke the C API for creating an EarlyCSE pass with
  MemorySSA by adding a bool parameter to control whether MemorySSA was
  used or not. This broke the OCaml bindings. Instead, change the old C
  API entry point back and add a new one to request an EarlyCSE pass
  with MemorySSA.

  llvm-svn: 280379
* [InstCombine] remove fold of an icmp pattern that should never happen (Sanjay Patel, 2016-09-01; 1 file, -15/+0)

  While removing a scalar shackle from an icmp fold, I noticed that I
  couldn't find any tests to trigger this code path.

  The 'and' shrinking transform should be handled by
  InstCombiner::foldCastedBitwiseLogic() or eliminated with
  InstSimplify. The icmp narrowing is part of
  InstCombiner::foldICmpWithCastAndCast().

  Differential Revision: https://reviews.llvm.org/D24031
  llvm-svn: 280370
* [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches (James Molloy, 2016-09-01; 1 file, -28/+90)

  This was a real restriction in the original version of
  SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be
  lifted.

  As part of this, we handle a very common and useful case where one of
  the incoming branches is actually conditional. Consider:

    if (a)
      x(1);
    else if (b)
      x(2);

  This produces the following CFG:

           [if]
          /    \
      [x(1)]  [if]
         |     |  \
         |     |   \
         |  [x(2)]  |
          \    |   /
           [ end ]

  [end] has two unconditional predecessor arcs and one conditional. The
  conditional refers to the implicit empty 'else' arc. This same pattern
  can also be caused by an empty default block in a switch.

  We can't sink the call to x() down to end because no call to x()
  happens on the third incoming arc (assume that x() has side effects
  for the sake of argument; if something is safe to speculate we could
  indeed sink nevertheless, but this cannot happen in the general case
  and causes many extra selects).

  We are now able to detect this case and split off the unconditional
  arcs to a common successor:

           [if]
          /    \
      [x(1)]  [if]
         |     |  \
         |     |   \
         |  [x(2)]  |
          \    /    |
      [sink.split]  |
            \      /
            [ end ]

  Now we can sink the call to x() into %sink.split. This can cause
  significant code simplification in many testcases.

  llvm-svn: 280364
* [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd (James Molloy, 2016-09-01; 1 file, -90/+149)

  r279460 rewrote this function to be able to handle more than two
  incoming edges and took pains to ensure this didn't regress anything.

  This time we change the logic for determining if an instruction should
  be sunk. Previously we used a single-pass greedy algorithm - sink
  instructions until one requires more than one PHI node or we run out
  of instructions to sink.

  This had the problem that sinking instructions that had non-identical
  but trivially the same operands needed extra logic so we sunk them
  aggressively. For example:

    %a = load i32* %b
    %d = load i32* %b
    %c = gep i32* %a, i32 0
    %e = gep i32* %d, i32 1

  Sinking %c and %e would naively require two PHI merges as %a != %d.
  But the loads are obviously equivalent (and maybe can't be hoisted
  because there is no common predecessor).

  This is why we implemented the fairly complex function
  areValuesTriviallySame(), to look through trivial differences like
  this. However it's just not clever enough.

  Instead, throw areValuesTriviallySame away, use pointer equality to
  check equivalence of operands and switch to a two-stage algorithm.

  In the "scan" stage, we look at every sinkable instruction in
  isolation from end of block to front. If it's sinkable, we keep track
  of all operands that required PHI merging.

  In the "sink" stage, we iteratively sink the last non-terminator in
  the source blocks. But when calculating how many PHIs are actually
  required to be inserted (to work out if we should stop or not) we
  remove any values that have already been sunk from the set of
  PHI-merges required, which allows us to be more aggressive.

  This turns an algorithm with potentially recursive lookahead (looking
  through GEPs, casts, loads and any other instruction potentially not
  CSE'd) into two linear scans.

  llvm-svn: 280351
* [SimplifyCFG] Fix nondeterministic iteration order (James Molloy, 2016-09-01; 1 file, -2/+2)

  We iterate over the result from SafeToMergeTerminators, so make it a
  SmallSetVector instead of a SmallPtrSet. Should fix stage3 convergence
  builds.

  llvm-svn: 280342
* [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases (James Molloy, 2016-09-01; 1 file, -6/+21)

  A very important case is not handled here: multiple arcs to a single
  block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

  FoldValueComparisonIntoPredecessors will refuse to fold this, as it
  doesn't know how to deal with two arcs to a common destination with
  different PHI values. The answer is obvious - just split all
  conflicting arcs.

  llvm-svn: 280338
* Add cast to appease windows builder. Fixes build break introduced in r280306. (Nick Lewycky, 2016-08-31; 1 file, -1/+1)

  llvm-svn: 280311
* Add -fprofile-dir= to clang. (Nick Lewycky, 2016-08-31; 1 file, -13/+30)

  -fprofile-dir=path allows the user to specify where .gcda files should
  be emitted when the program is run. In particular, this is the first
  flag that causes the .gcno and .o files to have different paths; LLVM
  is extended to support this.

  -fprofile-dir= does not change the file name in the .gcno (and thus
  where lcov looks for the source), but it does change the name in the
  .gcda (and thus where the runtime library writes the .gcda file). It's
  different from GCOV_PREFIX because a user can observe that
  GCOV_PREFIX_STRIP will strip paths off of -fprofile-dir= but not off
  of a supplied GCOV_PREFIX.

  To implement this we split -coverage-file into -coverage-data-file and
  -coverage-notes-file to specify the two different names. The
  !llvm.gcov metadata node grows from a 2-element form
  {string coverage-file, node dbg.cu} to a 3-element form
  {string coverage-notes-file, string coverage-data-file, node dbg.cu}.
  In the 3-element form, the file name is already "mangled" with
  .gcno/.gcda suffixes, while the 2-element form left that to the middle
  end pass.

  llvm-svn: 280306
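  A sketch of the two metadata forms described above (the file names and
  the exact string contents are hypothetical):

    !llvm.gcov = !{!0}

    ; old 2-element form: {coverage-file, dbg.cu};
    ; the pass appends the .gcno/.gcda suffixes itself
    !0 = !{!"foo", !1}

    ; new 3-element form: {coverage-notes-file, coverage-data-file,
    ; dbg.cu}; the names arrive pre-mangled
    !0 = !{!"foo.gcno", !"/prof/dir/foo.gcda", !1}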
* [InstCombine] allow icmp (shr exact X, C2), C fold for splat constant vectors (Sanjay Patel, 2016-08-31; 1 file, -5/+0)

  The enhancement to foldICmpDivConstant
  (http://llvm.org/viewvc/llvm-project?view=revision&revision=280299)
  allows us to remove the ConstantInt check; no other changes needed.

  llvm-svn: 280300
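  For example, a splat-vector version of the fold now works too
  (illustrative, not taken from the patch's tests):

    %s = lshr exact <2 x i32> %x, <i32 3, i32 3>
    %c = icmp eq <2 x i32> %s, <i32 1, i32 1>
    ; since the shift is exact (no bits shifted out), this folds to:
    ;   %c = icmp eq <2 x i32> %x, <i32 8, i32 8>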
* [InstCombine] allow icmp (div X, Y), C folds for splat constant vectors (Sanjay Patel, 2016-08-31; 1 file, -37/+26)

  Converting all of the overflow ops to APInt looked risky, so I've left
  that as a TODO.

  llvm-svn: 280299
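  One member of this fold family, shown with splat vectors
  (illustrative values):

    %d = udiv <2 x i32> %x, <i32 5, i32 5>
    %c = icmp eq <2 x i32> %d, <i32 0, i32 0>
    ; x / 5 == 0 (unsigned) exactly when x < 5, so this folds to:
    ;   %c = icmp ult <2 x i32> %x, <i32 5, i32 5>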
* [InstCombine] change insertRangeTest() to use APInt instead of Constant; NFCI (Sanjay Patel, 2016-08-31; 3 files, -20/+29)

  This is prep work before changing the callers to also use APInt, which
  will allow folds for splat vectors. Currently, the callers have
  ConstantInt guards in place, so no functional change is intended with
  this commit.

  llvm-svn: 280282
* [LoopInfo] Add verification by recomputation. (Michael Zolotukhin, 2016-08-31; 1 file, -1/+1)

  Summary:
  The current implementation of the LI verifier isn't ideal and fails to
  detect some cases when LI is incorrect. For instance, it checks that
  all recorded loops are in a correct form, but it has no way to check
  whether there are other loops in the function that are unrecorded in
  LI. This patch adds a way to detect such bugs.

  Reviewers: chandlerc, sanjoy, hfinkel
  Subscribers: llvm-commits, silvas, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D23437
  llvm-svn: 280280
* [EarlyCSE] Optionally use MemorySSA. NFC. (Geoff Berry, 2016-08-31; 2 files, -19/+128)

  Summary:
  Use MemorySSA, if requested, to do less conservative memory dependency
  checking. This change doesn't enable the MemorySSA-enhanced EarlyCSE
  in the default pipelines, so it should be NFC.

  Reviewers: dberlin, sanjoy, reames, majnemer
  Subscribers: mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19821
  llvm-svn: 280279
* [EarlyCSE] Allow forwarding a non-invariant load into an invariant load. (Geoff Berry, 2016-08-31; 1 file, -5/+5)

  Reviewers: sanjoy
  Subscribers: mcrosier, llvm-commits
  Differential Revision: https://reviews.llvm.org/D23935
  llvm-svn: 280265
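  A minimal sketch of the forwarding now allowed (illustrative IR):

    %a = load i32, i32* %p
    ; ... no intervening clobber of %p ...
    %b = load i32, i32* %p, !invariant.load !0  ; -> replaced by %a

    !0 = !{}

  The earlier, plain load can safely satisfy the later invariant load:
  invariance is a promise about the memory, so the value the plain load
  observed is the same one the invariant load would produce.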
* [SLP] Update the debug message based on Michael's suggestion. (Chad Rosier, 2016-08-31; 1 file, -2/+3)

  Passing the types/opcode check still doesn't guarantee we'll actually
  vectorize. Therefore, just make it clear we're attempting to
  vectorize.

  llvm-svn: 280263
* [SLP] Sink debug after checking for matching types/opcode. (Chad Rosier, 2016-08-31; 1 file, -2/+2)

  Differential Revision: https://reviews.llvm.org/D24090
  llvm-svn: 280260
* s/static inline/static/ for headers I have changed in r279475. NFC. (Tim Shen, 2016-08-31; 1 file, -5/+3)

  llvm-svn: 280257
* [statepoints][experimental] Add support for live-in semantics of values in deopt bundles (Philip Reames, 2016-08-31; 1 file, -0/+26)

  This is a first step towards supporting deopt value lowering and
  reporting entirely with the register allocator. I hope to build on
  this in the near future to support live-on-return semantics, but I
  have a use case which allows me to test and investigate code quality
  with just the live-in semantics, so I've chosen to start there. For
  those curious, my use case is our implementation of the
  "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing
  not to hard code that fact in the patch and instead make it
  configurable via function attributes.

  The basic approach here is modelled on what is done for the "Live In"
  values on stackmaps and patchpoints. (A secondary goal here is to
  remove one of the last barriers to merging the pseudo instructions.)
  We start by adding the operands directly to the STATEPOINT SDNode.
  Once we've lowered to MI, we extend the remat logic used by the
  register allocator to fold virtual register uses into
  StackMap::Indirect entries as needed. This does rely on the fact that
  the register allocator rematerializes. If it didn't along some code
  path, we could end up with more vregs than physical registers and fail
  to allocate.

  Today, we *only* fold in the register allocator. This can create some
  weird effects when combined with arguments passed on the stack because
  we don't fold them appropriately. I have an idea how to fix that, but
  it needs this patch in place to work on effectively. (There's some
  weird interaction with the scheduler as well, more investigation
  needed.)

  My near term plan is to land this patch off-by-default, experiment in
  my local tree to identify any correctness issues, and then start
  fixing codegen problems one by one as I find them. Once I have the
  live-in lowering fully working (both correctness and code quality),
  I'm hoping to move on to the live-on-return semantics.

  Note: I don't have any *known* miscompiles with this patch enabled,
  but I'm pretty sure I'll find at least a couple. Thus, the
  "experimental" tag and the fact it's off by default.

  Differential Revision: https://reviews.llvm.org/D24000
  llvm-svn: 280250
* [SLP] Arguments should be camel case, and start with an upper case letter. NFC. (Chad Rosier, 2016-08-31; 1 file, -2/+2)

  llvm-svn: 280248
* Revert "[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle ↵James Molloy2016-08-311-20/+6
| | | | | | | | more cases" This reverts commit r280218. This *also* causes buildbot errors. Sigh. Not a successful day all around! llvm-svn: 280239
* Revert "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd"James Molloy2016-08-311-127/+86
| | | | | | This reverts commit r280216 - it caused buildbot failures. llvm-svn: 280234
* Revert "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"James Molloy2016-08-311-89/+27
| | | | | | This reverts commit r280217. r280216 caused buildbot failures - backing out the entire chain. llvm-svn: 280233
* Revert "[SimplifyCFG] Add a workaround to fix PR30188"James Molloy2016-08-311-10/+0
| | | | | | This reverts commit r280219. r280216 caused buildbot failures - backing out the entire chain. llvm-svn: 280232
* Revert "[SimplifyCFG] Fix bootstrap failure after r280220"James Molloy2016-08-311-21/+5
| | | | | | This reverts commit r280228. r280216 caused buildbot failures - backing out the entire sequence. llvm-svn: 280231
* [SimplifyCFG] Fix bootstrap failure after r280220 (James Molloy, 2016-08-31; 1 file, -5/+21)

  We check that a sinking candidate is used by only one PHI node during
  our legality checks. However, for instructions that are used by other
  sinking candidates our heuristic is less conservative. This can result
  in a candidate actually being illegal when we come to sink it, because
  of how we sunk a predecessor. Do the used-by-only-one-PHI check again
  during sinking to ensure we don't crash.

  llvm-svn: 280228
* [SimplifyCFG] Add a workaround to fix PR30188 (James Molloy, 2016-08-31; 1 file, -0/+10)

  We're sinking stores, which is a good thing, but in the process we
  create selects for the store address operand. SROA/Mem2Reg can't look
  through those selects, which caused serious regressions. The real fix
  is in SROA, which I'll be looking into.

  llvm-svn: 280219
* [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases (James Molloy, 2016-08-31; 1 file, -6/+20)

  A very important case is not handled here: multiple arcs to a single
  block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

  FoldValueComparisonIntoPredecessors will refuse to fold this, as it
  doesn't know how to deal with two arcs to a common destination with
  different PHI values. The answer is obvious - just split all
  conflicting arcs.

  llvm-svn: 280218
* [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches (James Molloy, 2016-08-31; 1 file, -27/+89)

  This was a real restriction in the original version of
  SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be
  lifted.

  As part of this, we handle a very common and useful case where one of
  the incoming branches is actually conditional. Consider:

    if (a)
      x(1);
    else if (b)
      x(2);

  This produces the following CFG:

           [if]
          /    \
      [x(1)]  [if]
         |     |  \
         |     |   \
         |  [x(2)]  |
          \    |   /
           [ end ]

  [end] has two unconditional predecessor arcs and one conditional. The
  conditional refers to the implicit empty 'else' arc. This same pattern
  can also be caused by an empty default block in a switch.

  We can't sink the call to x() down to end because no call to x()
  happens on the third incoming arc (assume that x() has side effects
  for the sake of argument; if something is safe to speculate we could
  indeed sink nevertheless, but this cannot happen in the general case
  and causes many extra selects).

  We are now able to detect this case and split off the unconditional
  arcs to a common successor:

           [if]
          /    \
      [x(1)]  [if]
         |     |  \
         |     |   \
         |  [x(2)]  |
          \    /    |
      [sink.split]  |
            \      /
            [ end ]

  Now we can sink the call to x() into %sink.split. This can cause
  significant code simplification in many testcases.

  llvm-svn: 280217
* [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd (James Molloy, 2016-08-31; 1 file, -86/+127)

  r279460 rewrote this function to be able to handle more than two
  incoming edges and took pains to ensure this didn't regress anything.

  This time we change the logic for determining if an instruction should
  be sunk. Previously we used a single-pass greedy algorithm - sink
  instructions until one requires more than one PHI node or we run out
  of instructions to sink.

  This had the problem that sinking instructions that had non-identical
  but trivially the same operands needed extra logic so we sunk them
  aggressively. For example:

    %a = load i32* %b
    %d = load i32* %b
    %c = gep i32* %a, i32 0
    %e = gep i32* %d, i32 1

  Sinking %c and %e would naively require two PHI merges as %a != %d.
  But the loads are obviously equivalent (and maybe can't be hoisted
  because there is no common predecessor).

  This is why we implemented the fairly complex function
  areValuesTriviallySame(), to look through trivial differences like
  this. However it's just not clever enough.

  Instead, throw areValuesTriviallySame away, use pointer equality to
  check equivalence of operands and switch to a two-stage algorithm.

  In the "scan" stage, we look at every sinkable instruction in
  isolation from end of block to front. If it's sinkable, we keep track
  of all operands that required PHI merging.

  In the "sink" stage, we iteratively sink the last non-terminator in
  the source blocks. But when calculating how many PHIs are actually
  required to be inserted (to work out if we should stop or not) we
  remove any values that have already been sunk from the set of
  PHI-merges required, which allows us to be more aggressive.

  This turns an algorithm with potentially recursive lookahead (looking
  through GEPs, casts, loads and any other instruction potentially not
  CSE'd) into two linear scans.

  llvm-svn: 280216
* [SimplifyCFG] Tail-merge calls with sideeffects (James Molloy, 2016-08-31; 1 file, -7/+3)

  This was deliberately disabled during my rewrite of SinkIfThenToEnd to
  keep behaviour at least vaguely consistent with the previous version
  and keep it as close to NFC as I could. There's no real reason not to
  merge calls with side effects though, so let's do it! Small fixup
  along the way to ensure we don't create indirect calls.

  Should fix PR28964.

  llvm-svn: 280215
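  A sketch of the effect (illustrative IR; @foo is a made-up function
  with side effects):

    then:
      call void @foo(i32 1)
      br label %end
    else:
      call void @foo(i32 2)
      br label %end

    ; after tail-merging, a single call with a PHI'd argument:
    end:
      %arg = phi i32 [ 1, %then ], [ 2, %else ]
      call void @foo(i32 %arg)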
* [Coroutines] Part 10: Add coroutine promise support. (Gor Nishanov, 2016-08-31; 5 files, -9/+127)

  Summary:
  1) CoroEarly now lowers the llvm.coro.promise intrinsic, which allows
     obtaining a coroutine promise pointer from a coroutine frame and
     vice versa.
  2) CoroFrame now interprets the Promise argument of llvm.coro.begin to
     place the coroutine promise alloca at a deterministic offset from
     the coroutine frame.

  Now the coroutine promise example from docs/Coroutines.rst compiles
  and produces the expected result (see
  test/Transforms/Coroutines/ex4.ll).

  Reviewers: majnemer
  Subscribers: llvm-commits, mehdi_amini
  Differential Revision: https://reviews.llvm.org/D23993
  llvm-svn: 280184
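  A hedged sketch of the new intrinsic in use (the alignment and promise
  type are assumed for illustration; see docs/Coroutines.rst for the
  real contract):

    declare i8* @llvm.coro.promise(i8*, i32, i1)

    ; recover the promise pointer from a coroutine handle; the i32 is
    ; the promise alignment and the i1 selects the lookup direction
    %p.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 8, i1 false)
    %p = bitcast i8* %p.raw to %promise.t*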