bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[JumpThreading] Re-enable JumpThreading for guards	Sanjoy Das	2017-02-17	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: JumpThreading for guards feature has been reverted at https://reviews.llvm.org/rL295200 due to the following problem: the feature used the following algorithm for detection of diamond patters: 1. Find a block with 2 predecessors; 2. Check that these blocks have a common single parent; 3. Check that the parent's terminator is a branch instruction. The problem is that these checks are insufficient. They may pass for a non-diamond construction in case if those two predecessors are actually the same block. This may happen if parent's terminator is a br (either conditional or unconditional) to a block that ends with "switch" instruction with exactly two branches going to one block. This patch re-enables the JumpThreading for guards and fixes this issue by adding the check that those found predecessors are actually different blocks. This guarantees that parent's terminator is a conditional branch with exactly 2 different successors, which is now ensured by assertions. It also adds two more tests for this situation (with parent's terminator being a conditional and an unconditional branch). Patch by Max Kazantsev! Reviewers: anna, sanjoy, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30036 llvm-svn: 295410
*	Revert "[JumpThreading] Thread through guards"	Anna Thomas	2017-02-15	1	-37/+0
\| \| \| \| \| \| \| \| \|	This reverts commit r294617. We fail on an assert while trying to get a condition from an unconditional branch. llvm-svn: 295200
*	[InlineFunction] use getFunction(); NFC	Sanjay Patel	2017-02-15	1	-3/+3
\| \| \| \|	llvm-svn: 295185
*	[InlineFunction] use getCaller(); NFCI	Sanjay Patel	2017-02-15	1	-3/+2
\| \| \| \|	llvm-svn: 295181
*	[InlineFunction] use range-for loop; NFCI	Sanjay Patel	2017-02-15	1	-10/+8
\| \| \| \|	llvm-svn: 295179
*	SimplifyCFG: Register cloned assume intrinsics with assumption cache when ↵	Peter Collingbourne	2017-02-15	1	-3/+10
\| \| \| \| \| \| \| \|	creating critical edge. Differential Revision: https://reviews.llvm.org/D29976 llvm-svn: 295145
*	Fix a bug in caller's BFI update code after inlining.	Easwaran Raman	2017-02-14	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Multiple blocks in the callee can be mapped to a single cloned block since we prune the callee as we clone it. The existing code iterates over the value map and clones the block frequency (and eventually scales the frequencies of the cloned blocks). Value map's iteration is not deterministic and so the cloned block might get the frequency of any of the original blocks. The fix is to set the max of the original frequencies to the cloned block. The first block in the sequence must have this max frequency and, in the call context, subsequent blocks must have its frequency. Differential Revision: https://reviews.llvm.org/D29696 llvm-svn: 295115
*	[BasicBlockUtils] Use getFirstNonPHIOrDbg to set debugloc for instructions ↵	Taewook Oh	2017-02-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	created in SplitBlockPredecessors Summary: When setting debugloc for instructions created in SplitBlockPredecessors, current implementation copies debugloc from the first-non-phi instruction of the original basic block. However, if the first-non-phi instruction is a call for @llvm.dbg.value, the debugloc of the instruction may point the location outside of the block itself. For the example code of ``` 1 typedef struct _node_t { 2 struct _node_t next; 3 } node_t; 4 5 extern node_t root; 6 7 int foo() { 8 node_t node, tmp; 9 int ret = 0; 10 11 node = tmp = root->next; 12 while (node != root) { 13 while (node) { 14 tmp = node; 15 node = node->next; 16 ret++; 17 } 18 } 19 20 return ret; 21 } ``` , below is the basicblock corresponding to line 12 after Reassociate expressions pass: ``` while.cond: ; preds = %while.cond2, %entry %node.0 = phi %struct._node_t* [ %1, %entry ], [ null, %while.cond2 ] %ret.0 = phi i32 [ 0, %entry ], [ %ret.1, %while.cond2 ] tail call void @llvm.dbg.value(metadata i32 %ret.0, i64 0, metadata !19, metadata !20), !dbg !21 tail call void @llvm.dbg.value(metadata %struct._node_t* %node.0, i64 0, metadata !11, metadata !20), !dbg !31 %cmp = icmp eq %struct._node_t* %node.0, %0, !dbg !33 br i1 %cmp, label %while.end5, label %while.cond2, !dbg !35 ``` As you can see, the first-non-phi instruction is a call for @llvm.dbg.value, and the debugloc is ``` !21 = !DILocation(line: 9, column: 7, scope: !6) ``` , which is a definition of 'ret' variable and outside of the scope of the basicblock itself. However, current implementation picks up this debugloc for the instructions created in SplitBlockPredecessors. This patch addresses this problem by picking up debugloc from the first-non-phi-non-dbg instruction. Reviewers: dblaikie, samsonov, eugenis Reviewed By: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29867 llvm-svn: 295106
*	PredicateInfo: Handle critical edges	Daniel Berlin	2017-02-12	1	-63/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds support for placing predicateinfo such that it affects critical edges. This fixes the issues mentioned by Nuno on the mailing list. Depends on D29519 Reviewers: davide, nlopes Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29606 llvm-svn: 294921
*	Encode duplication factor from loop vectorization and loop unrolling to ↵	Dehao Chen	2017-02-10	2	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	discriminator. Summary: This patch starts the implementation as discuss in the following RFC: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html When optimization duplicates code that will scale down the execution count of a basic block, we will record the duplication factor as part of discriminator so that the offline process tool can find the duplication factor and collect the accurate execution frequency of the corresponding source code. Two important optimization that fall into this category is loop vectorization and loop unroll. This patch records the duplication factor for these 2 optimizations. The recording will be guarded by a flag encode-duplication-in-discriminators, which is off by default. Reviewers: probinson, aprantl, davidxl, hfinkel, echristo Reviewed By: hfinkel Subscribers: mehdi_amini, anemet, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26420 llvm-svn: 294782
*	[JumpThreading] Thread through guards	Sanjoy Das	2017-02-09	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch allows JumpThreading also thread through guards. Virtually, guard(cond) is equivalent to the following construction: if (cond) { do something } else {deoptimize} Yet it is not explicitly converted into IFs before lowering. This patch enables early threading through guards in simple cases. Currently it covers the following situation: if (cond1) { // code A } else { // code B } // code C guard(cond2) // code D If there is implication cond1 => cond2 or !cond1 => cond2, we can transform this construction into the following: if (cond1) { // code A // code C } else { // code B // code C guard(cond2) } // code D Thus, removing the guard from one of execution branches. Patch by Max Kazantsev! Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29620 llvm-svn: 294617
*	NVPTX: Extract mem intrinsic expansions into utilities	Matt Arsenault	2017-02-08	2	-0/+232
\| \| \| \|	llvm-svn: 294490
*	PredicateInfo: Some compilers are unhappy with naming Use *'s Use. Change ↵	Daniel Berlin	2017-02-07	1	-13/+13
\| \| \| \| \| \|	the name. llvm-svn: 294364
*	Add PredicateInfo utility and printing pass	Daniel Berlin	2017-02-07	3	-0/+642
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds a utility to build extended SSA (see "ABCD: eliminating array bounds checks on demand"), and an intrinsic to support it. This is then used to get functionality equivalent to propagateEquality in GVN, in NewGVN (without having to replace instructions as we go). It would work similarly in SCCP or other passes. This has been talked about a few times, so i built a real implementation and tried to productionize it. Copies are inserted for operands used in assumes and conditional branches that are based on comparisons (see below for more) Every use affected by the predicate is renamed to the appropriate intrinsic result. E.g. %cmp = icmp eq i32 %x, 50 br i1 %cmp, label %true, label %false true: ret i32 %x false: ret i32 1 will become %cmp = icmp eq i32, %x, 50 br i1 %cmp, label %true, label %false true: ; Has predicate info ; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 } %x.0 = call @llvm.ssa_copy.i32(i32 %x) ret i32 %x.0 false: ret i23 1 (you can use -print-predicateinfo to get an annotated-with-predicateinfo dump) This enables us to easily determine what operations are affected by a given predicate, and how operations affected by a chain of predicates. Reviewers: davide, sanjoy Subscribers: mgorny, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D29519 Update for review comments Fix a bug Nuno noticed where we are giving information about and/or on edges where the info is not useful and easy to use wrong Update for review comments llvm-svn: 294351
*	[ValueTracking] emit a remark when we detect a conflicting assumption (PR31809)	Sanjay Patel	2017-02-06	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-up to D29395 where we try to be good citizens and let the user know that we've probably gone off the rails. This should allow us to resolve: https://llvm.org/bugs/show_bug.cgi?id=31809 Differential Revision: https://reviews.llvm.org/D29404 llvm-svn: 294208
*	NFC: [LoopUnroll] More meaningful message in tracing	Anna Thomas	2017-02-03	1	-1/+1
\| \| \| \|	llvm-svn: 294017
*	FunctionImport: Use IRMover directly.	Peter Collingbourne	2017-02-03	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \|	The importer was previously using ModuleLinker in a sort of "IRMover mode". Use IRMover directly instead in order to remove a level of indirection. I will remove all importing support from ModuleLinker in a separate change. Differential Revision: https://reviews.llvm.org/D29468 llvm-svn: 294014
*	Shut up another GCC warning about operator precedence. NFC.	Michael Kuperstein	2017-02-01	1	-1/+1
\| \| \| \|	llvm-svn: 293812
*	[LoopUnroll] Use addClonedBlockToLoopInfo to add loop header to LI (NFC).	Florian Hahn	2017-02-01	1	-11/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I have a similar patch up for review already (D29173). If you prefer I can squash them both together. Also I think there more potential for code sharing between LoopUnroll.cpp and LoopUnrollRuntime.cpp. Do you think patches for that would be worthwhile? Reviewers: mkuper, mzolotukhin Reviewed By: mkuper, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29311 llvm-svn: 293758
*	[LoopUnroll] Use addClonedBlockToLoopInfo to clone the top level loop (NFC)	Florian Hahn	2017-01-31	1	-14/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: rL293124 added the necessary infrastructure to properly add the cloned top level loop to LoopInfo, which means we do not have to do it manually in CloneLoopBlocks. @mkuper sorry for not pointing this out during my review of D29156, I just realized that today. Reviewers: mzolotukhin, chandlerc, mkuper Reviewed By: mkuper Subscribers: llvm-commits, mkuper Differential Revision: https://reviews.llvm.org/D29173 llvm-svn: 293615
*	Revert "[MemorySSA] Revert r293361 and r293363, as the tests fail under asan."	Daniel Berlin	2017-01-30	2	-15/+34
\| \| \| \| \| \| \|	This reverts commit r293471, reapplying r293361 and r293363 with a fix for an out-of-bounds read. llvm-svn: 293474
*	[MemorySSA] Revert r293361 and r293363, as the tests fail under asan.	Sam McCall	2017-01-30	2	-27/+12
\| \| \| \|	llvm-svn: 293471
*	[MemorySSA] Correct an assertion surrounding with parentheses.	Davide Italiano	2017-01-30	1	-3/+2
\| \| \| \|	llvm-svn: 293453
*	[InstCombine] Merge DebugLoc when speculatively hoisting store instruction	Taewook Oh	2017-01-28	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Along with https://reviews.llvm.org/D27804, debug locations need to be merged when hoisting store instructions as well. Not sure if just dropping debug locations would make more sense for this case, but as the branch instruction will have at least different discriminator with the hoisted store instruction, I think there will be no difference in practice. Reviewers: aprantl, andreadb, danielcdh Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29062 llvm-svn: 293372
*	MemorySSA: Allow movement to arbitrary places	Daniel Berlin	2017-01-28	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extend the MemorySSAUpdater API to allow movement to arbitrary places Reviewers: davide, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29239 llvm-svn: 293363
*	MemorySSA: Fix block numbering invalidation and replacement bugs discovered ↵	Daniel Berlin	2017-01-28	2	-11/+20
\| \| \| \| \| \|	by updater llvm-svn: 293361
*	Cleanup dump() functions.	Matthias Braun	2017-01-28	2	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html For reference: - Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) - The definition of a dump method should look like this: #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) LLVM_DUMP_METHOD void MyClass::dump() { // print stuff to dbgs()... } #endif llvm-svn: 293359
*	MemorySSA: Move updater to its own file	Daniel Berlin	2017-01-28	3	-339/+373
\| \| \| \|	llvm-svn: 293357
*	Introduce a basic MemorySSA updater, that supports insertDef,	Daniel Berlin	2017-01-28	1	-28/+368
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	insertUse, moveBefore and moveAfter operations. Summary: This creates a basic MemorySSA updater that handles arbitrary insertion of uses and defs into MemorySSA, as well as arbitrary movement around the CFG. It replaces the current splice API. It can be made to handle arbitrary control flow changes. Currently, it uses the same updater algorithm from D28934. The main difference is because MemorySSA is single variable, we have the complete def and use list, and don't need anyone to give it to us as part of the API. We also have to rename stores below us in some cases. If we go that direction in that patch, i will merge all the updater implementations (using an updater_traits or something to provide the get* functions we use, called read/write in that patch). Sadly, the current SSAUpdater algorithm is way too slow to use for what we are doing here. I have updated the tests we have to basically build memoryssa incrementally using the updater api, and make sure it still comes out the same. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29047 llvm-svn: 293356
*	NFC: Add debug tracing for more cases where loop unrolling fails.	Anna Thomas	2017-01-27	1	-2/+8
\| \| \| \|	llvm-svn: 293313
*	[LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.	Justin Lebar	2017-01-27	1	-5/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some frontends emit a speculate-and-select idiom for sqrt, wherein they compute sqrt(x), check if x is negative, and select NaN if it is: %cmp = fcmp olt double %a, -0.000000e+00 %sqrt = call double @llvm.sqrt.f64(double %a) %ret = select i1 %cmp, double 0x7FF8000000000000, double %sqrt This is technically UB as the LangRef is written today if %a is ever less than -0. But emitting code that's compliant with the current definition of sqrt would require a branch, which would then prevent us from matching this idiom in SelectionDAG (which we do today -- ISD::FSQRT has defined behavior on negative inputs), because SelectionDAG looks at one BB at a time. Nothing in LLVM takes advantage of this undefined behavior, as far as we can tell, and the fact that llvm.sqrt has UB dates from its initial addition to the LangRef. Reviewers: arsenm, mehdi_amini, hfinkel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D28797 llvm-svn: 293242
*	[LoopUnroll] Properly update loopinfo for runtime unrolling by 2	Michael Kuperstein	2017-01-26	2	-9/+17
\| \| \| \| \| \| \| \| \| \| \|	Even when we don't create a remainder loop (that is, when we unroll by 2), we may duplicate nested loops into the remainder. This is complicated by the fact the remainder may itself be either inserted into an outer loop, or at the top level. In the latter case, we may need to create new top-level loops. Differential Revision: https://reviews.llvm.org/D29156 llvm-svn: 293124
*	MemorySSA: Link all defs together into an intrusive defslist, to make ↵	Daniel Berlin	2017-01-25	1	-36/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	updater easier Summary: This is the first in a series of patches to add a simple, generalized updater to MemorySSA. For MemorySSA, every def is may-def, instead of the normal must-def. (the best way to think of memoryssa is "everything is really one variable, with different versions of that variable at different points in the program). This means when updating, we end up having to do a bunch of work to touch defs below and above us. In order to support this quickly, i have ilist'd all the defs for each block. ilist supports tags, so this is quite easy. the only slightly messy part is that you can't have two iplists for the same type that differ only whether they have the ownership part enabled or not, because the traits are for the value type. The verifiers have been updated to test that the def order is correct. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29046 llvm-svn: 293085
*	[SimplifyCFG] Do not sink and merge inline-asm instructions.	Akira Hatanaka	2017-01-25	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conservatively disable sinking and merging inline-asm instructions as doing so can potentially create arguments that cannot satisfy the inline-asm constraints. For example, SimplifyCFG used to do the following transformation: (before) if.then: %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 8) br label %if.end if.else: %1 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 6) br label %if.end (after) %.sink = select i1 %tobool, i32 6, i32 8 %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 %.sink) This would result in a crash in the backend since only immediate integer operands are permitted for constraint "n". rdar://problem/30110806 Differential Revision: https://reviews.llvm.org/D29111 llvm-svn: 293025
*	[PH] Replace uses of AssertingVH from members of analysis results with	Chandler Carruth	2017-01-24	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a lazy-asserting PoisoningVH. AssertVH is fundamentally incompatible with cache-invalidation of analysis results. The invaliadtion happens after the AssertingVH has already fired. Instead, use a PoisoningVH that will assert if the dangling handle is ever used rather than merely be assigned or destroyed. This patch also removes all of the (numerous) doomed attempts to work around this fundamental incompatibility. It is a pretty significant simplification IMO. The most interesting change is in the Inliner where we still do some clearing because we don't want to rely on the coarse grained invalidation strategy of the containing pass manager. However, I prefer the approach that contains this logic to the cleanup phase of the Inliner, and I think we could enhance the CGSCC analysis management layer to make this even better in the future if desired. The rest is straight cleanup. I've also added a test for one of the harder cases to work around: when a module analysis contains many AssertingVHes pointing at functions. Differential Revision: https://reviews.llvm.org/D29006 llvm-svn: 292928
*	Update domtree incrementally in loop peeling.	Serge Pavlov	2017-01-24	1	-7/+30
\| \| \| \| \| \| \| \| \|	With this change dominator tree remains in sync after each step of loop peeling. Differential Revision: https://reviews.llvm.org/D29029 llvm-svn: 292895
*	SimplifyLibCalls: Replace more unary libcalls with intrinsics	Matt Arsenault	2017-01-23	1	-28/+21
\| \| \| \|	llvm-svn: 292855
*	[LoopUnroll] First form LCSSA, then loop-simplify	Michael Kuperstein	2017-01-23	1	-18/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	Running non-LCSSA-preserving LoopSimplify followed by LCSSA on (roughly) the same loop is incorrect, since LoopSimplify may break LCSSA arbitrarily higher in the loop nest. Instead, run LCSSA first, and then run LCSSA-preserving LoopSimplify on the result. This fixes PR31718. Differential Revision: https://reviews.llvm.org/D29055 llvm-svn: 292854
*	[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)	David L. Jones	2017-01-23	4	-446/+446
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The LibFunc::Func enum holds enumerators named for libc functions. Unfortunately, there are real situations, including libc implementations, where function names are actually macros (musl uses "#define fopen64 fopen", for example; any other transitively visible macro would have similar effects). Strictly speaking, a conforming C++ Standard Library should provide any such macros as functions instead (via <cstdio>). However, there are some "library" functions which are not part of the standard, and thus not subject to this rule (fopen64, for example). So, in order to be both portable and consistent, the enum should not use the bare function names. The old enum naming used a namespace LibFunc and an enum Func, with bare enumerators. This patch changes LibFunc to be an enum with enumerators prefixed with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override macros.) There are additional changes required in clang. Reviewers: rsmith Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28476 llvm-svn: 292848
*	Tweak ASCII art in Simplify CFG. NFC	Amaury Sechet	2017-01-23	1	-1/+1
\| \| \| \|	llvm-svn: 292792
*	[PM] Sink an LCSSA preservation assert from the LoopSimplify pass into	Chandler Carruth	2017-01-21	2	-14/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the library routine shared with the new PM and other code. This assert checks that when LCSSA preservation is requested we start in LCSSA form. Without this early assert, given very complex test cases we can hit an assert or crash much later on when trying to preserve LCSSA. The new PM's loop simplify doesn't need to (and indeed can't) preserve LCSSA as the new PM doesn't deal in transforms in the dependency graph. But we asked the library to and shockingly, this didn't work very well! Stop doing that. Now the assert will tell us immediately with existing test cases. Before this, it took a pretty convoluted input to trigger this. However, sinking the assert also found a bug in LoopUnroll where we asked simplifyLoop to preserve LCSSA right before we reform it. That's kinda silly and unsurprising that it wasn't available. =D Stop doing that too. We also would assert that the unrolled loop was in LCSSA even if preserving LCSSA was never requested! I don't have a test case or anything here. I spotted it by inspection and it seems quite obvious. No logic change anyways, that's just avoiding a spurrious assert. llvm-svn: 292710
*	Improve PGO support for the new inliner	Easwaran Raman	2017-01-20	1	-4/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds the following to the new PM based inliner in PGO mode: * Use block frequency analysis to derive callsite's profile count and use that to adjust thresholds of hot and cold callsites. * Incrementally update the BFI of the caller after a callee gets inlined into it. This incremental update is only within an invocation of the run method - BFI is not preserved across calls to run. Update the function entry count of the callee after inlining it into a caller. * I've tuned the thresholds for the hot and cold callsites using a hacked up version of the old inliner that explicitly computes BFI on a set of internal benchmarks and spec. Once the new PM based pipeline stabilizes (IIRC Chandler mentioned there are known issues) I'll benchmark this again and adjust the thresholds if required. Inliner PGO support. Differential revision: https://reviews.llvm.org/D28331 llvm-svn: 292666
*	Preserve domtree and loop-simplify for runtime unrolling.	Eli Friedman	2017-01-18	3	-21/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mostly straightforward changes; we just didn't do the computation before. One sort of interesting change in LoopUnroll.cpp: we weren't handling dominance for children of the loop latch correctly, but foldBlockIntoPredecessor hid the problem for complete unrolling. Currently punting on loop peeling; made some minor changes to isolate that problem to LoopUnrollPeel.cpp. Adds a flag -unroll-verify-domtree; it verifies the domtree immediately after we finish updating it. This is on by default for +Asserts builds. Differential Revision: https://reviews.llvm.org/D28073 llvm-svn: 292447
*	Cloning: Copy comdats when cloning globals.	Peter Collingbourne	2017-01-18	1	-0/+13
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D28838 llvm-svn: 292430
*	Fix up a comment. NFC.	Michael Kuperstein	2017-01-18	1	-1/+0
\| \| \| \|	llvm-svn: 292425
*	[LV] Allow reductions that have several uses outside the loop	Michael Kuperstein	2017-01-18	1	-5/+12
\| \| \| \| \| \| \| \| \| \| \|	We currently check whether a reduction has a single outside user. We don't really need to require that - we just need to make sure a single value is used externally. The number of external users of that value shouldn't actually matter. Differential Revision: https://reviews.llvm.org/D28830 llvm-svn: 292424
*	[Target, Transforms] Fix some Clang-tidy modernize and Include What You Use ↵	Eugene Zelenko	2017-01-18	1	-9/+18
\| \| \| \| \| \|	warnings; other minor fixes (NFC). llvm-svn: 292320
*	SimplifyLibCalls: Remove checks for fabs	Matt Arsenault	2017-01-17	1	-5/+7
\| \| \| \| \| \| \|	Use the intrinsic instead of emitting the libcall which will be replaced by the intrinsic. llvm-svn: 292176
*	SimplifyLibCalls: Replace fabs libcalls with intrinsics	Matt Arsenault	2017-01-17	1	-6/+8
\| \| \| \| \| \| \| \|	Add missing fabs(fpext) optimzation that worked with the call, and also fixes it creating a second fpext when there were multiple uses. llvm-svn: 292172
*	[PM] Introduce an analysis set used to preserve all analyses over	Chandler Carruth	2017-01-15	3	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a function's CFG when that CFG is unchanged. This allows transformation passes to simply claim they preserve the CFG and analysis passes to check for the CFG being preserved to remove the fanout of all analyses being listed in all passes. I've gone through and removed or cleaned up as many of the comments reminding us to do this as I could. Differential Revision: https://reviews.llvm.org/D28627 llvm-svn: 292054