bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Revert "[LICM] Make LICM able to hoist phis"	Benjamin Kramer	2018-11-19	1	-302/+15
\| \| \| \| \| \|	This reverts commit r347190. llvm-svn: 347225
*	[LV] Avoid vectorizing unsafe dependencies in uniform address	Anna Thomas	2018-11-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, when vectorizing stores to uniform addresses, the only instance we prevent vectorization is if there are multiple stores to the same uniform address causing an unsafe dependency. This patch teaches LAA to avoid vectorizing loops that have an unsafe cross-iteration dependency between a load and a store to the same uniform address. Fixes PR39653. Reviewers: Ayal, efriedma Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D54538 llvm-svn: 347220
*	[LICM] Make LICM able to hoist phis	John Brawn	2018-11-19	1	-15/+302
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general approach taken is to make note of loop invariant branches, then when we see something conditional on that branch, such as a phi, we create a copy of the branch and (empty versions of) its successors and hoist using that. This has no impact by itself that I've been able to see, as LICM typically doesn't see such phis as they will have been converted into selects by the time LICM is run, but once we start doing phi-to-select conversion later it will be important. Differential Revision: https://reviews.llvm.org/D52827 llvm-svn: 347190
*	[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches	Max Kazantsev	2018-11-19	1	-0/+313
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces infrastructure and the simplest case for constant-folding of branch and switch instructions within loop into unconditional branches. It is useful as a cleanup for such passes as loop unswitching that sometimes produce such branches. Only the simplest case supported in this patch: after the folding, no block should become dead or stop being part of the loop. Support for more sophisticated cases will go separately in follow-up patches. Differential Revision: https://reviews.llvm.org/D54021 Reviewed By: anna llvm-svn: 347183
*	[ProfileSummary] Standardize methods and fix comment	Vedant Kumar	2018-11-19	6	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Every Analysis pass has a get method that returns a reference of the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument of variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop:getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182
*	[CorrelatedValuePropagation] Preserve debug locations (PR38178)	Vedant Kumar	2018-11-18	1	-15/+16
\| \| \| \| \| \| \| \| \|	Fix all of the missing debug location errors in CVP found by debugify. This includes the missing-location-after-udiv-truncation case described in llvm.org/PR38178. llvm-svn: 347147
*	Use llvm::copy. NFC	Fangrui Song	2018-11-17	3	-6/+5
\| \| \| \|	llvm-svn: 347126
*	[SimpleLoopUnswitch] adding cost multiplier to cap exponential unswitch with	Fedor Sergeev	2018-11-16	1	-2/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to control exponential behavior of loop-unswitch so we do not get run-away compilation. Suggested solution is to introduce a multiplier for an unswitch cost that makes cost prohibitive as soon as there are too many candidates and too many sibling loops (meaning we have already started duplicating loops by unswitching). It does solve the currently known problem with compile-time degradation (PR 39544). Tests are built on top of a recently implemented CHECK-COUNT-<num> FileCheck directives. Reviewed By: chandlerc, mkazantsev Differential Revision: https://reviews.llvm.org/D54223 llvm-svn: 347097
*	GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics.	Adrian Prantl	2018-11-16	1	-5/+10
\| \| \| \| \| \| \|	This fixes PR39669. https://bugs.llvm.org/show_bug.cgi?id=39669 llvm-svn: 347065
*	[ThinLTO] Internalize readonly globals	Eugene Leviant	2018-11-16	2	-6/+54
\| \| \| \| \| \| \| \|	An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033
*	[LTO] Load sample profile in LTO link step.	Xin Tong	2018-11-15	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Load sample profile in LTO link step. ThinLTO calls populateModulePassManager to load the profile Reviewers: tejohnson, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54564 llvm-svn: 346971
*	[InstCombine] fix rotate narrowing bug for non-pow-2 types	Sanjay Patel	2018-11-15	1	-2/+7
\| \| \| \|	llvm-svn: 346968
*	[InstCombine] Remove a couple of asserts based on incorrect assumptions	Mandeep Singh Grang	2018-11-14	1	-11/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same. This fixes PR39595. Reviewers: craig.topper, spatel, dmgreen Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54359 llvm-svn: 346874
*	[InstCombine] fix formatting for matchBSwap(); NFC	Sanjay Patel	2018-11-14	2	-7/+9
\| \| \| \| \| \| \|	We should have a similar function for matching rotate and/or funnel shift, so tidy up the related existing call. llvm-svn: 346871
*	[VPlan, SLP] Use SmallPtrSet for Candidates.	Florian Hahn	2018-11-14	2	-27/+26
\| \| \| \| \| \|	This slightly improves the candidate handling in getBest(). llvm-svn: 346870
*	[VPlan] Remove LLVM_DEBUG from VPlanSlp::dumpBundle.	Florian Hahn	2018-11-14	1	-4/+4
\| \| \| \| \| \|	The caller should take care of only calling it with debug enabled. llvm-svn: 346860
*	[VPlan] Update ifdef.	Florian Hahn	2018-11-14	1	-2/+2
\| \| \| \|	llvm-svn: 346858
*	[VPlan, SLP] Add simple SLP analysis on top of VPlan.	Florian Hahn	2018-11-14	5	-1/+623
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds an initial implementation of the look-ahead SLP tree construction described in 'Look-Ahead SLP: Auto-vectorization in the Presence of Commutative Operations, CGO 2018 by Vasileios Porpodas, Rodrigo C. O. Rocha, Luís F. W. Góes'. It returns an SLP tree represented as VPInstructions, with combined instructions represented as a single, wider VPInstruction. This initial version does not support instructions with multiple different users (either inside or outside the SLP tree) or non-instruction operands; it won't generate any shuffles or insertelement instructions. It also just adds the analysis that builds an SLP tree rooted in a set of stores. It does not include any cost modeling or memory legality checks. The plan is to integrate it with VPlan based cost modeling, once available and to only apply it to operations that can be widened. A follow-up patch will add a support for replacing instructions in a VPlan with their SLP counter parts. Reviewers: Ayal, mssimpso, rengolin, mkuper, hfinkel, hsaito, dcaballe, vporpo, RKSimon, ABataev Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D4949 llvm-svn: 346857
*	Recommit r346483: [CallSiteSplitting] Only record conditions up to the ↵	Florian Hahn	2018-11-14	1	-13/+26
\| \| \| \| \| \| \| \| \|	IDom(call site). The underlying problem causing the expensive-check failure was fixed in rL346769. llvm-svn: 346843
*	Revert r346810 "Preserve loop metadata when splitting exit blocks"	Reid Kleckner	2018-11-14	1	-32/+0
\| \| \| \| \| \| \|	It broke the Windows self-host: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457 llvm-svn: 346823
*	[InstCombine] fold funnel shift amount based on demanded bits	Sanjay Patel	2018-11-13	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The shift amount of a funnel shift is modulo the scalar bitwidth: http://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic ...so we can use demanded bits analysis on that operand to simplify it when we have a power-of-2 bitwidth. This is another step towards canonicalizing {shift/shift/or} to the intrinsics in IR. Differential Revision: https://reviews.llvm.org/D54478 llvm-svn: 346814
*	Preserve loop metadata when splitting exit blocks	Craig Topper	2018-11-13	1	-0/+32
\| \| \| \| \| \| \| \| \| \|	LoopUtils.cpp contains a utility that splits an loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.) Patch by Chang Lin (clin1) Differential Revision: https://reviews.llvm.org/D53876 llvm-svn: 346810
*	[InstCombine] canonicalize rotate patterns with cmp/select	Sanjay Patel	2018-11-13	1	-0/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cmp+branch variant of this pattern is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 ...and as discussed there, we probably can't transform that without a rotate intrinsic. We do have that now via funnel shift, but we're not quite ready to canonicalize IR to that form yet. The case with 'select' should already be transformed though, so that's this patch. The sequence with negation followed by masking is what we use in the backend and partly in clang (though that part should be updated). https://rise4fun.com/Alive/TplC %cmp = icmp eq i32 %shamt, 0 %sub = sub i32 32, %shamt %shr = lshr i32 %x, %shamt %shl = shl i32 %x, %sub %or = or i32 %shr, %shl %r = select i1 %cmp, i32 %x, i32 %or => %neg = sub i32 0, %shamt %masked = and i32 %shamt, 31 %maskedneg = and i32 %neg, 31 %shl2 = lshr i32 %x, %masked %shr2 = shl i32 %x, %maskedneg %r = or i32 %shl2, %shr2 llvm-svn: 346807
*	[CSP, Cloning] Update DuplicateInstructionsInSplitBetween to use DomTreeUpdater.	Florian Hahn	2018-11-13	3	-39/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch updates DuplicateInstructionsInSplitBetween to update a DTU instead of applying updates to the DT directly. Given that there only are 2 users, also updated them in this patch to avoid churn. I slightly moved the code in CallSiteSplitting around to reduce the places where we have to pass in DTU. If necessary, I could split those changes in a separate patch. This fixes missing DT updates when dealing with musttail calls in CallSiteSplitting, by using DTU->deleteBB. Reviewers: junbuml, kuhar, NutshellySima, indutny, brzycki Reviewed By: NutshellySima llvm-svn: 346769
*	Revert "[ThinLTO] Internalize readonly globals"	Steven Wu	2018-11-13	2	-59/+7
\| \| \| \| \| \|	This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2. llvm-svn: 346768
*	[VPlan] VPlan version of InterleavedAccessInfo.	Florian Hahn	2018-11-13	4	-9/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch turns InterleaveGroup into a template with the instruction type being a template parameter. It also adds a VPInterleavedAccessInfo class, which only contains a mapping from VPInstructions to their respective InterleaveGroup. As we do not have access to scalar evolution in VPlan, we can re-use convert InterleavedAccessInfo to VPInterleavedAccess info. Reviewers: Ayal, mssimpso, hfinkel, dcaballe, rengolin, mkuper, hsaito Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D49489 llvm-svn: 346758
*	Introduce DebugCounter into ConstProp pass	Zhizhou Yang	2018-11-13	1	-26/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch introduces DebugCounter into ConstProp pass at per-transformation level. It will provide an option to skip first n or stop after n transformations for the whole ConstProp pass. This will make debug easier for the pass, also providing chance to do transformation level bisecting. Reviewers: davide, fhahn Reviewed By: fhahn Subscribers: llozano, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D50094 llvm-svn: 346720
*	[InstCombine] narrow width of rotate patterns, part 3	Sanjay Patel	2018-11-12	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a longer variant for the pattern handled in rL346713 This one includes zexts. Eventually, we should canonicalize all rotate patterns to the funnel shift intrinsics, but we need a bit more infrastructure to make sure the vectorizers handle those intrinsics as well as the shift+logic ops. https://rise4fun.com/Alive/FMn Name: narrow rotateright %neg = sub i8 0, %shamt %rshamt = and i8 %shamt, 7 %rshamtconv = zext i8 %rshamt to i32 %lshamt = and i8 %neg, 7 %lshamtconv = zext i8 %lshamt to i32 %conv = zext i8 %x to i32 %shr = lshr i32 %conv, %rshamtconv %shl = shl i32 %conv, %lshamtconv %or = or i32 %shl, %shr %r = trunc i32 %or to i8 => %maskedShAmt2 = and i8 %shamt, 7 %negShAmt2 = sub i8 0, %shamt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shl2 = lshr i8 %x, %maskedShAmt2 %shr2 = shl i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346716
*	[InstCombine] narrow width of rotate patterns, part 2 (PR39624)	Sanjay Patel	2018-11-12	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sub-pattern for the shift amount in a rotate can take on several different forms, and there's apparently no way to canonicalize those without seeing the entire rotate sequence. This is the form noted in: https://bugs.llvm.org/show_bug.cgi?id=39624 https://rise4fun.com/Alive/qnT %zx = zext i8 %x to i32 %maskedShAmt = and i32 %shAmt, 7 %shl = shl i32 %zx, %maskedShAmt %negShAmt = sub i32 0, %shAmt %maskedNegShAmt = and i32 %negShAmt, 7 %shr = lshr i32 %zx, %maskedNegShAmt %rot = or i32 %shl, %shr %r = trunc i32 %rot to i8 => %truncShAmt = trunc i32 %shAmt to i8 %maskedShAmt2 = and i8 %truncShAmt, 7 %shl2 = shl i8 %x, %maskedShAmt2 %negShAmt2 = sub i8 0, %truncShAmt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shr2 = lshr i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346713
*	[InstCombine] refactor code for matching shift amount of a rotate; NFC	Sanjay Patel	2018-11-12	1	-12/+17
\| \| \| \| \| \| \| \|	As shown in existing test cases and with: https://bugs.llvm.org/show_bug.cgi?id=39624 ...we're missing at least 2 more patterns for rotate narrowing. llvm-svn: 346711
*	[GC][InstCombine] Fix a potential iteration issue	Philip Reames	2018-11-12	1	-1/+4
\| \| \| \| \| \|	Noticed via inspection. Appears to be largely innocious in practice, but slight code change could have resulted in either visit order dependent missed optimizations or infinite loops. May be a minor compile time problem today. llvm-svn: 346698
*	[CostModel] Add more realistic SK_ExtractSubvector generic costs.	Simon Pilgrim	2018-11-12	1	-1/+2
\| \| \| \| \| \| \| \|	Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles. This exposes an issue in LoopVectorize which could call SK_ExtractSubvector with a scalar subvector type. llvm-svn: 346656
*	[LICM] Hoist guards from non-header blocks	Max Kazantsev	2018-11-12	1	-11/+3
\| \| \| \| \| \| \| \| \| \| \|	This patch relaxes overconservative checks on whether or not we could write memory before we execute an instruction. This allows us to hoist guards out of loops even if they are not in the header block. Differential Revision: https://reviews.llvm.org/D50891 Reviewed By: fedor.sergeev llvm-svn: 346643
*	[GCOV] Add options to filter files which must be instrumented.	Calixte Denizet	2018-11-12	1	-2/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When making code coverage, a lot of files (like the ones coming from /usr/include) are removed when post-processing gcno/gcda so finally they doen't need to be instrumented nor to appear in gcno/gcda. The goal of the patch is to be able to filter the files we want to instrument, there are several advantages to do that: - improve speed (no overhead due to instrumentation on files we don't care) - reduce gcno/gcda size - it gives the possibility to easily instrument only few files (e.g. ones modified in a patch) without changing the build system - need to accept this patch to be enabled in clang: https://reviews.llvm.org/D52034 Reviewers: marco-c, vsk Reviewed By: marco-c Subscribers: llvm-commits, sylvestre.ledru Differential Revision: https://reviews.llvm.org/D52033 llvm-svn: 346641
*	[IPSCCP,PM] Preserve PDT in the new pass manager.	Florian Hahn	2018-11-11	2	-7/+8
\| \| \| \| \| \| \| \| \| \|	Reviewers: kuhar, chandlerc, NutshellySima, brzycki Reviewed By: NutshellySima, brzycki Differential Revision: https://reviews.llvm.org/D54317 llvm-svn: 346618
*	[InstCombine] simplify code for merging stores; NFCI	Sanjay Patel	2018-11-10	2	-51/+28
\| \| \| \|	llvm-svn: 346596
*	[ThinLTO] Internalize readonly globals	Eugene Leviant	2018-11-10	2	-7/+59
\| \| \| \| \| \| \| \| \|	This patch allows internalising globals if all accesses to them (from live functions) are from non-volatile load instructions Differential revision: https://reviews.llvm.org/D49362 llvm-svn: 346584
*	[JumpThreading] Fix exponential time algorithm computing known values.	Eli Friedman	2018-11-09	1	-19/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ComputeValueKnownInPredecessors has a "visited" set to prevent infinite loops, since a value can be visited more than once. However, the implementation didn't prevent the algorithm from taking exponential time. Instead of removing elements from the RecursionSet one at a time, we should keep around the whole set until ComputeValueKnownInPredecessors finishes, then discard it. The testcase is synthetic because I was having trouble effectively reducing the original. But it's basically the same idea. Instead of failing, we could theoretically cache the result instead. But I don't think it would help substantially in practice. Differential Revision: https://reviews.llvm.org/D54239 llvm-svn: 346562
*	Revert r346483: [CallSiteSplitting] Only record conditions up to the ↵	Florian Hahn	2018-11-09	1	-38/+15
\| \| \| \| \| \| \| \|	IDom(call site). This cause a failure with EXPENSIVE_CHECKS llvm-svn: 346492
*	[IPSCCP,PM] Preserve DT in the new pass manager.	Florian Hahn	2018-11-09	2	-40/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	After D45330, Dominators are required for IPSCCP and can be preserved. This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis. Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima Reviewed By: chandlerc, kuhar, NutshellySima Differential Revision: https://reviews.llvm.org/D47259 llvm-svn: 346486
*	[CallSiteSplitting] Only record conditions up to the IDom(call site).	Florian Hahn	2018-11-09	1	-15/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can stop recording conditions once we reached the immediate dominator for the block containing the call site. Conditions in predecessors of the that node will be the same for all paths to the call site and splitting is not beneficial. This patch makes CallSiteSplitting dependent on the DT anlysis. because the immediate dominators seem to be the easiest way of finding the node to stop at. I had to update some exiting tests, because they were checking for conditions that were true/false on all paths to the call site. Those should now be handled by instcombine/ipsccp. Reviewers: davide, junbuml Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D44627 llvm-svn: 346483
*	[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.	Carlos Alberto Enciso	2018-11-09	1	-0/+16
\| \| \| \| \| \| \| \|	In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines. Differential Revision: https://reviews.llvm.org/D53390 llvm-svn: 346481
*	[NFC] Add utility function for SafetyInfo updates for moveBefore	Max Kazantsev	2018-11-09	1	-3/+11
\| \| \| \|	llvm-svn: 346472
*	[LoopInterchange] Support reductions across inner and outer loop.	Florian Hahn	2018-11-08	1	-44/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds logic to detect reductions across the inner and outer loop by following the incoming values of PHI nodes in the outer loop. If the incoming values take part in a reduction in the inner loop or come from outside the outer loop, we found a reduction spanning across inner and outer loop. With this change, ~10% more loops are interchanged in the LLVM test-suite + SPEC2006. Fixes https://bugs.llvm.org/show_bug.cgi?id=30472 Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43245 llvm-svn: 346438
*	[LTO] Drop non-prevailing definitions only if linkage is not local or appending	Pirama Arumuga Nainar	2018-11-08	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes PR 37422 In ELF, non-weak symbols can also be non-prevailing. In this particular PR, the __llvm_profile_* symbols are non-prevailing but weren't getting dropped - causing multiply-defined errors with lld. Also add a test, strong_non_prevailing.ll, to ensure that multiple copies of a strong symbol are dropped. To fix the test regressions exposed by this fix, - do not mark prevailing copies for symbols with 'appending' linkage. There's no one prevailing copy for such symbols. - fix the prevailing version in dead-strip-fulllto.ll - explicitly pass exported symbols to llvm-lto in fumcimport.ll and funcimport_var.ll Reviewers: tejohnson, pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D54125 llvm-svn: 346436
*	InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfe	Tom Stellard	2018-11-08	1	-13/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When the 3rd argument to these intrinsics is zero, lowering them to shift instructions produces poison values, since we end up with shift amounts equal to the number of bits in the shifted value. This means we can only lower these intrinsics if we can prove that the 3rd argument is not zero. Reviewers: arsenm Reviewed By: arsenm Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D53739 llvm-svn: 346422
*	[CodeExtractor] Mark functions noreturn when applicable	Vedant Kumar	2018-11-08	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	This eliminates the outlining penalty for llvm.trap/unreachable, because callers no longer have to emit cleanup/ret instructions after calling an outlined `noreturn` function. rdar://45523626 llvm-svn: 346421
*	Return "[IndVars] Smart hard uses detection"	Max Kazantsev	2018-11-08	1	-13/+26
\| \| \| \| \| \| \| \| \| \|	The patch has been reverted because it ended up prohibiting propagation of a constant to exit value. For such values, we should skip all checks related to hard uses because propagating a constant is always profitable. Differential Revision: https://reviews.llvm.org/D53691 llvm-svn: 346397
*	[LSR] Combine unfolded offset into invariant register	Gil Rapaport	2018-11-08	1	-12/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LSR reassociates constants as unfolded offsets when the constants fit as immediate add operands, which currently prevents such constants from being combined later with loop invariant registers. This patch modifies GenerateCombinations() to generate a second formula which includes the unfolded offset in the combined loop-invariant register. This commit fixes a bug in the original patch (committed at r345114, reverted at r345123). Differential Revision: https://reviews.llvm.org/D51861 llvm-svn: 346390
*	[MergeFuncs] Improve ordering of equal functions	whitequark	2018-11-08	1	-9/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: MergeFunctions currently tries to process strong functions before weak functions, because weak functions can simply call strong functions, while a strong/weak function cannot call a weak function (a backing strong function is needed). This patch additionally tries to process external functions before local functions, because we definitely have to keep the external function, but may be able to drop the local one (and definitely can if it is also unnamed_addr). Unfortunately, this exposes an existing bug in the implementation: The FnTree and FNodesInTree structures can currently go out of sync in the case where two weak functions are merged, because the function in FnTree/FNodesInTree is RAUWed. This leaves it behind in FnTree (this is intended, as it is the strong backing function which should be used for further merges), while it is replaced in FNodesInTree (this is not intended). This is fixed by switching FNodesInTree from using a ValueMap to using a DenseMap of AssertingVH. This exposes another minor issue: Currently FNodesInTree is not cleared after MergeFunctions finishes running. Currently, this is potentially dangerous (e.g. if something else wants to RAUW a function with a non-function), but at the very least it is unnecessary/inefficient. After the change to use AssertingVH it becomes more problematic, because there are certainly passes that remove functions. This issue is fixed by clearing FNodesInTree at the end of the pass. Reviewers: jfb, whitequark Reviewed By: whitequark Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D53271 llvm-svn: 346386