bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[InstCombine] not(sub X, Y) --> add (not X), Y	Sanjay Patel	2018-07-27	1	-0/+4
	The tests with constants show a missing optimization. Analysis for adds is better than subs, so this can also help with other transforms. And codegen is better with adds for targets like x86 (destructive ops, no sub-from). https://rise4fun.com/Alive/llK llvm-svn: 338118
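The identity behind this fold can be checked exhaustively at a small bit width. The sketch below is only an illustration in Python, not the InstCombine code; the 8-bit width and the helper names are assumptions chosen for the example.

```python
# Exhaustive 8-bit check that ~(X - Y) == (~X) + Y under two's-complement
# wraparound. WIDTH and the helper names are illustrative assumptions.
WIDTH = 8
MASK = (1 << WIDTH) - 1


def not_of_sub(x, y):
    """Original form: not (sub X, Y), truncated to WIDTH bits."""
    return ~(x - y) & MASK


def add_of_not(x, y):
    """Rewritten form: add (not X), Y, truncated to WIDTH bits."""
    return ((~x & MASK) + y) & MASK


def identity_holds():
    """Compare both forms for every pair of WIDTH-bit inputs."""
    return all(
        not_of_sub(x, y) == add_of_not(x, y)
        for x in range(1 << WIDTH)
        for y in range(1 << WIDTH)
    )


print(identity_holds())
```

Both forms reduce to `-X + Y - 1` modulo `2**WIDTH`, which is why the check passes for every input pair.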
*	[SimplifyIndVar] Canonicalize comparisons to unsigned while eliminating truncs	Max Kazantsev	2018-07-27	1	-2/+23
	This is a follow-up for the patch rL335020. When we replace compares against trunc with compares against wide IV, we can also replace signed predicates with unsigned where it is legal. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D48763 llvm-svn: 338115
*	PatternMatch: Add wrappers for fabs and canonicalize	Matt Arsenault	2018-07-27	1	-3/+3
	llvm-svn: 338111
*	Revert "[LV][DebugInfo] Set DL to the middle block Icmp instruction"	Anastasis Grammenos	2018-07-27	1	-3/+1
	This reverts commit r338106. llvm-svn: 338109
*	[LV][DebugInfo] Set DL to the middle block Icmp instruction	Anastasis Grammenos	2018-07-27	1	-1/+3
	Reviewers: hsaito Differential Revision: https://reviews.llvm.org/D49746 llvm-svn: 338106
*	[InstCombine] canonicalize abs pattern	Chen Zheng	2018-07-27	1	-20/+50
	Differential Revision: https://reviews.llvm.org/D48754 llvm-svn: 338092
*	[DebugInfo] LowerDbgDeclare: Add derefs when handling CallInst users	Vedant Kumar	2018-07-26	1	-6/+7
	LowerDbgDeclare inserts a dbg.value before each use of an address described by a dbg.declare. When inserting a dbg.value before a CallInst use, however, it fails to append DW_OP_deref to the DIExpression.
	The DW_OP_deref is needed to reflect the fact that a dbg.value describes a source variable directly (as opposed to a dbg.declare, which relies on pointer indirection).
	This patch adds in the DW_OP_deref where needed. This results in the correct values being shown during a debug session for a program compiled with ASan and optimizations (see https://reviews.llvm.org/D49520). Note that ConvertDebugDeclareToDebugValue is already correct -- no changes there were needed.
	One complication is that SelectionDAG is unable to distinguish between direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also fixes this long-standing issue in order to not regress integration tests relying on the incorrect assumption that all frame-index SDDbgValues are indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot be lowered properly otherwise. Basically the fix prevents a direct SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice by a debugger. There were a handful of tests relying on this incorrect "FRAMEIX => indirect" assumption which actually had incorrect DW_AT_locations: these are all fixed up in this patch.
	Testing:
	- check-llvm, and an end-to-end test using lldb to debug an optimized program.
	- Existing unit tests for DIExpression::appendToStack fully cover the new DIExpression::append utility.
	- check-debuginfo (the debug info integration tests)
	Differential Revision: https://reviews.llvm.org/D49454 llvm-svn: 338069
*	[InstCombine] fold udiv with common factor from muls with nuw	Sanjay Patel	2018-07-26	1	-0/+15
	Unfortunately, sdiv isn't as simple because of UB due to overflow. This fold is mentioned in PR38239: https://bugs.llvm.org/show_bug.cgi?id=38239 llvm-svn: 338059
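The role of the nuw flags can be illustrated with a brute-force check. The sketch assumes the fold has the shape `(A *nuw B) /u (A *nuw X) --> B /u X`, which is one reading of the commit title rather than something the message spells out; the 5-bit width and all helper names are hypothetical.

```python
# Brute-force check of a udiv fold with a common factor, at a small width.
# Assumed shape (not stated in the commit message):
#   (A *nuw B) /u (A *nuw X)  -->  B /u X
WIDTH = 5
LIMIT = 1 << WIDTH


def fold_holds_with_nuw():
    """Whenever neither multiply wraps, the fold must preserve the quotient."""
    for a in range(1, LIMIT):
        for b in range(LIMIT):
            for x in range(1, LIMIT):
                if a * b >= LIMIT or a * x >= LIMIT:
                    continue  # nuw would be violated; the fold does not apply
                if (a * b) // (a * x) != b // x:
                    return False
    return True


def wrapping_counterexample():
    """Find inputs where the fold would be wrong if the multiplies may wrap."""
    for a in range(1, LIMIT):
        for b in range(LIMIT):
            for x in range(1, LIMIT):
                num = (a * b) % LIMIT
                den = (a * x) % LIMIT
                if den != 0 and num // den != b // x:
                    return (a, b, x)
    return None


print(fold_holds_with_nuw())
print(wrapping_counterexample() is not None)
```

The second function shows why the flags matter: once a product may wrap, the common factor no longer cancels.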
*	[UnJ] Common some code. NFC	David Green	2018-07-26	1	-40/+54
	Create a processHeaderPhiOperands for analysing the instructions in the aft blocks that must be moved before the loop. Differential Revision: https://reviews.llvm.org/D49061 llvm-svn: 338033
*	[LoadStoreVectorizer] Use const reference	Fangrui Song	2018-07-26	1	-4/+6
	llvm-svn: 337992
*	[LSV] Look through selects for consecutive addresses	Roman Tereshin	2018-07-25	1	-15/+62
	In some cases LSV sees (load/store _ (select _ <pointer expression> <pointer expression>)) patterns in input IR, often due to sinking and other forms of CFG simplification, sometimes interspersed with bitcasts and all-constant-indices GEPs. With this patch, the `areConsecutivePointers` method attempts to handle select instructions. This leads to an increased number of successful vectorizations.
	Technically, select instructions could appear in index arithmetic as well; however, we don't see those in our test suites / benchmarks. Also, there is a lot more freedom in IR shapes computing integral indices in general than in what's common in pointer computations, and it appears that it's quite unreliable to do anything short of making select instructions first-class citizens of Scalar Evolution, which for the purposes of this patch is most definitely overkill.
	Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D49428 llvm-svn: 337965
*	Revert r337904: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.	Florian Hahn	2018-07-25	2	-140/+10
	I suspect it is causing the clang-stage2-Rthinlto failures. llvm-svn: 337956
*	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.	Florian Hahn	2018-07-25	2	-10/+140
	r337828 resolves a PredicateInfo issue with unnamed types. Original message: This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin llvm-svn: 337904
*	[profile] Support profiling runtime on Fuchsia	Petr Hosek	2018-07-25	1	-0/+1
	This ports the profiling runtime to Fuchsia and enables the instrumentation. Unlike on other platforms, Fuchsia doesn't use files to dump the instrumentation data, since on Fuchsia the filesystem may not be accessible to the instrumented process. We instead use the data sink to pass the profiling data to the system, the same way sanitizer runtimes do. Differential Revision: https://reviews.llvm.org/D47208 llvm-svn: 337881
*	[LV] Fix for PR38110, LV encountered llvm_unreachable()	Hideki Saito	2018-07-24	1	-2/+3
	Summary: truncateToMinimalBitWidths() doesn't handle all Instructions, and the worst case is a compiler crash via llvm_unreachable(). The fix adds a case to handle PHINode and changes the worst case to a no-op (instead of a compiler crash). Reviewers: sbaranga, mssimpso, hsaito Reviewed By: hsaito Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49461 llvm-svn: 337861
*	Use SCEV to avoid inserting some bounds checks.	Joel Galenson	2018-07-24	1	-12/+28
	This patch uses SCEV to avoid inserting some bounds checks when they are not needed. This slightly improves the performance of code compiled with the bounds check sanitizer. Differential Revision: https://reviews.llvm.org/D49602 llvm-svn: 337830
*	[PredicateInfo] Use custom mangling to support ssa_copy with unnamed types.	Florian Hahn	2018-07-24	1	-6/+56
	This is a workaround and it would be better to fix this generally, but doing it generally is quite tricky. See D48541 and PR38117. Doing it in PredicateInfo directly allows us to use the type address to differentiate different unnamed types, because neither the created declarations nor the ssa_copy calls should be visible after PredicateInfo has been destroyed. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D49126 llvm-svn: 337828
*	[ThinLTO] Ensure the TargetLibraryInfo is constructed early enough	Teresa Johnson	2018-07-23	1	-0/+2
	Summary: Without this change, the WholeProgramDevirt pass, which requires the TargetLibraryInfo, will construct one from the default triple. Fixes PR38139. Reviewers: pcc Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49278 llvm-svn: 337750
*	[DebugCounters] Keep track of total counts	George Burgess IV	2018-07-23	1	-1/+1
	This patch makes debug counters keep track of the total number of times we've called `shouldExecute` for each counter, so it's easier to build automated tooling on top of these. A patch to print these counts is coming soon. Patch by Zhizhou Yang! Differential Revision: https://reviews.llvm.org/D49560 llvm-svn: 337748
*	[GVN] Don't use the eliminated load as an available value in phi construction	John Brawn	2018-07-23	1	-0/+9
	In ConstructSSAForLoadSet, if an available value is actually the load that we're doing SSA construction to eliminate, then we can omit it, as SSAUpdate will add in the value for the phi that will be replacing it anyway. This can result in simpler IR which can allow further optimisation. Differential Revision: https://reviews.llvm.org/D44160 llvm-svn: 337686
*	[GVNHoist] safeToHoistLdSt allows illegal hoisting	Alexandros Lamprineas	2018-07-23	1	-1/+1
	Bug fix for PR36787. When reasoning about whether it's safe to hoist a load, we want to make sure that the defining memory access dominates the new insertion point of the hoisted instruction. safeToHoistLdSt calls firstInBB(InsertionPoint,DefiningAccess), which returns false if InsertionPoint == DefiningAccess, and therefore it falsely thinks it's safe to hoist. Differential Revision: https://reviews.llvm.org/D49555 llvm-svn: 337674
*	Early exit with cheaper checks	Aditya Kumar	2018-07-21	1	-13/+12
	Reviewers: sebpop,davide,fhahn,trentxintong Differential Revision: https://reviews.llvm.org/D49617 llvm-svn: 337643
*	Change the cap on the amount of padding for each vtable to 32-byte (previously it was 128-byte)	Peter Collingbourne	2018-07-20	1	-4/+6
	We tested different cap values with a recent commit of Chromium. Our results show that the 32-byte cap yields the smallest binary and all the caps yield similar performance. Based on the results, we propose to change the cap value to 32-byte. Patch by Zhaomo Yang! Differential Revision: https://reviews.llvm.org/D49405 llvm-svn: 337622
*	Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size"	Roman Tereshin	2018-07-20	1	-46/+65
	This reapplies commit r337489, reverted by r337541. Additionally, this commit contains a speculative fix for the issue reported in r337541 (the report does not contain an actionable reproducer, just a stack trace). llvm-svn: 337606
*	[MSan] Hotfix compilation	Alexander Potapenko	2018-07-20	1	-0/+1
	Make sure NewSI is used in materializeStores() llvm-svn: 337577
*	[MSan] run materializeChecks() before materializeStores()	Alexander Potapenko	2018-07-20	1	-7/+6
	When pointer checking is enabled, it's important that every pointer is checked before its value is used. For stores, MSan used to generate code that calculates shadow/origin addresses from a pointer before checking it. For userspace this isn't a problem, because the shadow calculation code is quite simple and the compiler is able to move it after the check on -O2. But for KMSAN, getShadowOriginPtr() creates a runtime call, so we want the check to be performed strictly before that call.
	Swapping materializeChecks() and materializeStores() resolves the issue: both functions insert code before the given IR location, so the new insertion order guarantees that the code calculating the shadow address is between the address check and the memory access.
	llvm-svn: 337571
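The ordering argument can be illustrated with a toy model in which a basic block is a Python list and every step inserts immediately before the same anchor instruction. This is only a sketch: the instruction names and the `instrument_store` helper are hypothetical and do not reflect the pass's real IRBuilder-based code.

```python
# Toy model of "insert before the given IR location": whichever step runs
# first ends up earliest in the block. All names here are illustrative.
def insert_before(block, anchor, new_instr):
    """Insert new_instr immediately before anchor in the block."""
    block.insert(block.index(anchor), new_instr)


def instrument_store(checks_first):
    """Instrument a single store, running the check step first or second."""
    block = ["store"]
    steps = ["check_pointer", "compute_shadow_addr"]
    if not checks_first:
        steps.reverse()
    for instr in steps:
        insert_before(block, "store", instr)
    return block


print(instrument_store(checks_first=True))
print(instrument_store(checks_first=False))
```

With the check step running first, the shadow-address computation lands between the check and the store, which matches the order the message describes.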
*	[IPSCCP] Fix for bot failure caused by r337548	Florian Hahn	2018-07-20	1	-1/+1
	llvm-svn: 337554
*	Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters.	Florian Hahn	2018-07-20	1	-110/+80
	This version contains a fix to add values for which the state in ParamState changes to the worklist if the state in ValueState did not change. To avoid adding the same value multiple times, mergeInValue returns true if it added the value to the worklist. The value is added to the worklist depending on its state in ValueState.
	Original message: For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to using ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore.
	Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 337548
*	Revert "[LSV] Refactoring + supporting bitcasts to a type of different size"	Sam McCall	2018-07-20	1	-62/+46
	This reverts commit r337489. It causes asserts to fire in some TensorFlow tests, e.g. tensorflow/compiler/tests/gather_test.py on GPU.
	Example stack trace:
	Start test case: GatherTest.testHigherRank
	assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits"
	@ 0x5559446ebe10 __assert_fail
	@ 0x55593ef32f5e llvm::APInt::trunc()
	@ 0x55593d78f86e (anonymous namespace)::Vectorizer::lookThroughComplexAddresses()
	@ 0x55593d78f2bc (anonymous namespace)::Vectorizer::areConsecutivePointers()
	@ 0x55593d78d128 (anonymous namespace)::Vectorizer::isConsecutiveAccess()
	@ 0x55593d78c926 (anonymous namespace)::Vectorizer::vectorizeInstructions()
	@ 0x55593d78c221 (anonymous namespace)::Vectorizer::vectorizeChains()
	@ 0x55593d78b948 (anonymous namespace)::Vectorizer::run()
	@ 0x55593d78b725 (anonymous namespace)::LoadStoreVectorizer::runOnFunction()
	@ 0x55593edf4b17 llvm::FPPassManager::runOnFunction()
	@ 0x55593edf4e55 llvm::FPPassManager::runOnModule()
	@ 0x55593edf563c (anonymous namespace)::MPPassManager::runOnModule()
	@ 0x55593edf5137 llvm::legacy::PassManagerImpl::run()
	@ 0x55593edf5b71 llvm::legacy::PassManager::run()
	@ 0x55593ced250d xla::gpu::IrDumpingPassManager::run()
	@ 0x55593ced5033 xla::gpu::(anonymous namespace)::EmitModuleToPTX()
	@ 0x55593ced40ba xla::gpu::(anonymous namespace)::CompileModuleToPtx()
	@ 0x55593ced33d0 xla::gpu::CompileToPtx()
	@ 0x55593b26b2a2 xla::gpu::NVPTXCompiler::RunBackend()
	@ 0x55593b21f973 xla::Service::BuildExecutable()
	@ 0x555938f44e64 xla::LocalService::CompileExecutable()
	@ 0x555938f30a85 xla::LocalClient::Compile()
	@ 0x555938de3c29 tensorflow::XlaCompilationCache::BuildExecutable()
	@ 0x555938de4e9e tensorflow::XlaCompilationCache::CompileImpl()
	@ 0x555938de3da5 tensorflow::XlaCompilationCache::Compile()
	@ 0x555938c5d962 tensorflow::XlaLocalLaunchBase::Compute()
	@ 0x555938c68151 tensorflow::XlaDevice::Compute()
	@ 0x55593f389e1f tensorflow::(anonymous namespace)::ExecutorState::Process()
	@ 0x55593f38a625 tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()()
	* SIGABRT received by PID 7798 (TID 7837) from PID 7798; *
	llvm-svn: 337541
*	[SCCP] Don't use markForcedConstant on branch conditions.	Eli Friedman	2018-07-19	1	-73/+62
	It's more aggressive than we need to be, and leads to strange workarounds in other places like call return value inference. Instead, just directly mark an edge viable. Tests by Florian Hahn. Differential Revision: https://reviews.llvm.org/D49408 llvm-svn: 337507
*	[LSV] Refactoring + supporting bitcasts to a type of different size	Roman Tereshin	2018-07-19	1	-46/+62
	This is mostly preparation work for adding limited support for select instructions. It proved to be difficult to do due to the size and irregularity of Vectorizer::isConsecutiveAccess; this is fixed here, I believe.
	It also turned out that these changes make it simpler to finish one of the TODOs and fix a number of other small issues, namely:
	1. Looking through bitcasts to a type of a different size (requires careful tracking of the original load/store size and some math converting sizes in bytes to expected differences in indices of GEPs).
	2. Reusing partial analysis of pointers done by the first attempt at proving them consecutive instead of starting from scratch. This added limited support for nested GEPs co-existing with difficult sext/zext instructions. This also required careful handling of negative differences between constant parts of offsets.
	3. Handling a case where the first pointer index is not an add, but something else (a function parameter for instance).
	I observe an increased number of successful vectorizations on a large set of shader programs. Only a few shaders are affected, but those that are affected sport >5% fewer loads and stores than before the patch.
	Reviewed By: rampitec Differential-Revision: https://reviews.llvm.org/D49342 llvm-svn: 337489
*	[LoadStoreVectorizer] Use getMinusSCEV() to compute the distance between two pointers.	Farhana Aleen	2018-07-19	1	-0/+8
	Summary: Currently, isConsecutiveAccess() detects two pointers (PtrA and PtrB) as consecutive by comparing PtrB with BaseDelta+PtrA. This works when both pointers are factorized or both of them are not factorized. But isConsecutiveAccess() fails if one of the pointers is factorized but the other one is not.
	Here is an example:
	PtrA = 4 * (A + B)
	PtrB = 4 + 4A + 4B
	This patch uses getMinusSCEV() to compute the distance between two pointers. getMinusSCEV() allows combining the expressions and computing the simplified distance.
	Author: FarhanaAleen Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D49516 llvm-svn: 337471
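The PtrA/PtrB example can be worked through by normalising both expressions to a sum of terms and subtracting. The sketch below is a stand-in for the idea only: it models affine expressions as Python dictionaries and does not use or imitate the ScalarEvolution API; `CONST`, `add`, `scale` and `minus` are hypothetical helpers.

```python
# Affine expressions as {term: coefficient}; CONST is the key of the constant
# term. This only illustrates why subtracting normalised forms gives a plain
# constant distance even when one input is factorized and the other is not.
CONST = "1"


def add(lhs, rhs):
    """Add two expressions term by term, dropping zero coefficients."""
    out = dict(lhs)
    for term, coeff in rhs.items():
        out[term] = out.get(term, 0) + coeff
    return {t: c for t, c in out.items() if c != 0}


def scale(expr, factor):
    """Multiply every coefficient by a constant factor."""
    return {t: c * factor for t, c in expr.items() if c * factor != 0}


def minus(lhs, rhs):
    """Compute lhs - rhs on the normalised forms."""
    return add(lhs, scale(rhs, -1))


A = {"A": 1}
B = {"B": 1}
ptr_a = scale(add(A, B), 4)                              # 4 * (A + B)
ptr_b = add(add({CONST: 4}, scale(A, 4)), scale(B, 4))   # 4 + 4A + 4B
distance = minus(ptr_b, ptr_a)
print(distance)
```

After normalisation the `A` and `B` terms cancel and only the constant 4 remains, which is the distance the message says a direct structural comparison misses.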
*	[ThinLTO] Enable ThinLTO WholeProgramDevirt and LowerTypeTests in new PM	Teresa Johnson	2018-07-19	2	-4/+3
	Summary: Enable these passes for CFI and WPD in ThinLTO and LTO with the new pass manager. Add a couple of tests for both PMs based on the clang tests tools/clang/test/CodeGen/thinlto-distributed-cfi*.ll, but just test through llvm-lto2 and not with distributed ThinLTO. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49429 llvm-svn: 337461
*	Rename __asan_gen_* symbols to ___asan_gen_*.	Peter Collingbourne	2018-07-18	1	-1/+1
	This prevents gold from printing a warning when trying to export these symbols via the asan dynamic list after ThinLTO promotes them from private symbols to external symbols with hidden visibility. Differential Revision: https://reviews.llvm.org/D49498 llvm-svn: 337428
*	Skip debuginfo intrinsic in markLiveBlocks.	Xin Tong	2018-07-18	1	-39/+38
	Summary: The optimizer is 10%+ slower with vs without debuginfo. I started checking where the difference is coming from. I compiled sqlite3.c from CTMark with and without debug info and compared the time difference. I used Xcode Instruments to find where time is spent. This brings about 20ms, out of ~20s. Reviewers: davide, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D49337 llvm-svn: 337416
*	[SLPVectorizer] Avoid duplicate scalar cost calculations in BoUpSLP::getEntryCost. NFCI.	Simon Pilgrim	2018-07-18	1	-50/+37
	Pulled out from D49225, we have a lot of repeated scalar cost calculations, often with arguments that don't look the same but turn out to be. llvm-svn: 337390
*	[InstCombine] Re-commit: Fold 'check for [no] signed truncation' pattern	Roman Lebedev	2018-07-18	1	-0/+78
	Summary: PR38149 (https://bugs.llvm.org/show_bug.cgi?id=38149)
	As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf
	^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it.
	The DAGCombine will reverse this transform, see https://reviews.llvm.org/D49266
	This transform is surprisingly frustrating. This does not deal with non-splat shift amounts, or with undef shift amounts. I've outlined what I think the solution should be:
	```
	// Potential handling of non-splats: for each element:
	// * if both are undef, replace with constant 0.
	//   Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0.
	// * if both are not undef, and are different, bailout.
	// * else, only one is undef, then pick the non-undef one.
	```
	This is a re-commit, as the original patch, committed in rL337190 was reverted in rL337344 as it broke chromium build: https://bugs.llvm.org/show_bug.cgi?id=38204 and https://crbug.com/864832
	Proofs that the fixed folds are ok: https://rise4fun.com/Alive/VYM
	Differential Revision: https://reviews.llvm.org/D49320 llvm-svn: 337376
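The per-element rule outlined in the comment above could be sketched as follows. This is an illustration of the proposal only, not code from the patch: `UNDEF`, `merge_shift_amounts`, and the use of `None` to signal a bailout are hypothetical, and treating two equal defined elements as "keep that value" is an assumption the outline leaves implicit.

```python
# Sketch of the proposed non-splat handling: merge two vectors of shift
# amounts element by element. UNDEF marks an undef lane.
UNDEF = None


def merge_shift_amounts(first, second):
    """Return the merged shift amounts, or None when the fold must bail out."""
    merged = []
    for a, b in zip(first, second):
        if a is UNDEF and b is UNDEF:
            merged.append(0)      # both undef: replace with constant 0
        elif a is not UNDEF and b is not UNDEF:
            if a != b:
                return None       # both defined but different: bail out
            merged.append(a)      # assumption: equal defined values are kept
        else:
            merged.append(a if a is not UNDEF else b)  # pick the non-undef one
    return merged


print(merge_shift_amounts([3, UNDEF, UNDEF, 5], [3, 4, UNDEF, UNDEF]))
print(merge_shift_amounts([3, 7], [3, 8]))
```

The first call exercises all three non-bailout cases; the second hits the mismatch case and returns `None`.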
*	Revert "[InstCombine] Fold 'check for [no] signed truncation' pattern"	Bob Haarman	2018-07-18	1	-69/+0
	This reverts r337190 (and a few follow-up commits), which caused the Chromium build to fail. See https://bugs.llvm.org/show_bug.cgi?id=38204 and https://crbug.com/864832 llvm-svn: 337344
*	[InstCombine] Preserve debug value when simplifying cast-of-select	Vedant Kumar	2018-07-17	1	-1/+3
	InstCombine has a cast transform that matches a cast-of-select:
	Orig = cast (Src = select Cond TV FV)
	And tries to replace it with a select which has the cast folded in:
	NewSel = select Cond (cast TV) (cast FV)
	The combiner does RAUW(Orig, NewSel), so any debug values for Orig would survive the transform. But debug values for Src would be lost. This patch teaches InstCombine to replace all debug uses of Src with NewSel (taking care of doing any necessary DIExpression rewriting).
	Differential Revision: https://reviews.llvm.org/D49270 llvm-svn: 337310
*	[IPSCCP] Run Solve each time we resolved an undef in a function.	Florian Hahn	2018-07-17	1	-3/+7
	Once we have resolved an undef in a function we can run Solve, which could lead to finding a constant return value for the function, which in turn could turn undefs into constants in other functions that call it, before resolving undefs there. Computationally the amount of work we are doing stays the same, just the order in which we process things is slightly different and potentially there are a few fewer undefs to resolve.
	We are still relying on the order of functions in the IR, which means depending on the order, we are able to resolve the optimal undef first or not. For example, if @test1 comes before @testf, we find the constant return value of @testf too late and we cannot use it while solving @test1.
	This on its own does not lead to more constants removed in the test-suite, probably because currently we have to be very lucky to visit applicable functions in the right order. Maybe we can come up with a better way of resolving undefs in more 'profitable' functions first.
	Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma, davide Differential Revision: https://reviews.llvm.org/D49385 llvm-svn: 337283
*	[SLPVectorizer] Don't attempt horizontal reduction on pointer types (PR38191)	Simon Pilgrim	2018-07-17	1	-0/+2
	TTI::getMinMaxReductionCost typically can't handle pointer types - until this is changed it's better to limit horizontal reduction to integer/float vector types only. llvm-svn: 337280
*	[LLVM-C] Fix name mangling on AggressiveInstCombine	whitequark	2018-07-17	1	-0/+1
	Similarly to rL336736, at least one more C API function does not properly get declared as extern "C" due to a missing header, causing name mangling and linking errors. This patch fixes calls to LLVMAddAggressiveInstCombinerPass(). Differential Revision: https://reviews.llvm.org/D49416 Reviewed By: whitequark llvm-svn: 337264
*	Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.	Simon Pilgrim	2018-07-17	1	-2/+2
	llvm-svn: 337257
*	[InstCombine] Fold 'check for [no] signed truncation' pattern	Roman Lebedev	2018-07-16	1	-0/+69
	Summary: PR38149 (https://bugs.llvm.org/show_bug.cgi?id=38149)
	As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf
	^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it.
	Proofs for this transform: https://rise4fun.com/Alive/mgu
	This transform is surprisingly frustrating. This does not deal with non-splat shift amounts, or with undef shift amounts. I've outlined what I think the solution should be:
	```
	// Potential handling of non-splats: for each element:
	// * if both are undef, replace with constant 0.
	//   Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0.
	// * if both are not undef, and are different, bailout.
	// * else, only one is undef, then pick the non-undef one.
	```
	The DAGCombine will reverse this transform, see https://reviews.llvm.org/D49266
	Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: JDevlieghere, rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D49320 llvm-svn: 337190
*	Restore "[ThinLTO] Ensure we always select the same function copy to import"	Teresa Johnson	2018-07-16	1	-69/+88
	This reverts commit r337081, therefore restoring r337050 (and fix in r337059), with test fix for bot failure described after the original description below.
	In order to always import the same copy of a linkonce function, even when encountering it with different thresholds (a higher one then a lower one), keep track of the summary we decided to import. This ensures that the backend only gets a single definition to import for each GUID, so that it doesn't need to choose one.
	Move the largest threshold the GUID was considered for import into the current module out of the ImportMap (which is part of a larger map maintained across the whole index), and into a new map just maintained for the current module we are computing imports for. This saves some memory since we no longer have the thresholds maintained across the whole index (and throughout the in-process backends when doing a normal non-distributed ThinLTO build), at the cost of some additional information being maintained for each invocation of ComputeImportForModule (the selected summary pointer for each import).
	There is an additional map lookup for each callee being considered for importing, however, this was able to subsume a map lookup in the Worklist iteration that invokes computeImportForFunction. We also are able to avoid calling selectCallee if we already failed to import at the same or higher threshold.
	I compared the run time and peak memory for the SPEC2006 471.omnetpp benchmark (running in-process ThinLTO backends), as well as for a large internal benchmark with a distributed ThinLTO build (so just looking at the thin link time/memory). Across a number of runs with and without this change there was no significant change in the time and memory. (I tried a few other variations of the change but they also didn't improve time or peak memory).
	The new commit removes a test that no longer makes sense (Transforms/FunctionImport/hotness_based_import2.ll), as exposed by the reverse-iteration bot. The test depends on the order of processing the summary call edges, and actually depended on the old problematic behavior of selecting more than one summary for a given GUID when encountered with different thresholds. There was no guarantee even before that we would eventually pick the linkonce copy with the hottest call edges, it just happened to work with the test and the old code, and there was no guarantee that we would end up importing the selected version of the copy that had the hottest call edges (since the backend would effectively import only one of the selected copies).
	Reviewers: davidxl Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D48670 llvm-svn: 337184
*	MSan: minor fixes, NFC	Alexander Potapenko	2018-07-16	1	-7/+6
	- remove an extra space after |ID| declaration
	- drop the unused |FirstInsn| parameter in getShadowOriginPtrUserspace()
	llvm-svn: 337159
*	[MSan] factor userspace-specific declarations into createUserspaceApi(). NFC	Alexander Potapenko	2018-07-16	1	-38/+53
	This patch introduces createUserspaceApi() that creates function/global declarations for symbols used by MSan in the userspace. This is a step towards the upcoming KMSAN implementation patch. Reviewed at https://reviews.llvm.org/D49292 llvm-svn: 337155
*	[InstCombine] add more SPFofSPF folding	Chen Zheng	2018-07-16	1	-0/+5
	Differential Revision: https://reviews.llvm.org/D49238 llvm-svn: 337143
*	[InstCombine] fold icmp pred (sub 0, X) C for vector type	Chen Zheng	2018-07-16	1	-2/+2
	Differential Revision: https://reviews.llvm.org/D49283 llvm-svn: 337141
*	Recommit r335794 "Add support for generating a call graph profile from Branch Frequency Info." with fix for removed functions.	Michael J. Spencer	2018-07-16	2	-0/+101
	llvm-svn: 337140