bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	hwasan: add tag_offset DWARF attribute to optimized debug info	Evgenii Stepanov	2019-12-12	1	-12/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Support alloca-referencing dbg.value in hwasan instrumentation. Update AsmPrinter to emit DW_AT_LLVM_tag_offset when location is in loclist format. Reviewers: pcc Subscribers: srhines, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70753
*	[Attributor][FIX] Do treat byval arguments special	Johannes Doerfert	2019-12-12	1	-12/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we reason about the pointer argument that is byval we actually reason about a local copy of the value passed at the call site. This was not the case before and we wrongly introduced attributes based on the surrounding function. AAMemoryBehaviorArgument, AAMemoryBehaviorCallSiteArgument and AANoCaptureCallSiteArgument are made aware of byval now. The code to skip "subsuming positions" for reasoning follows a common pattern and we should refactor it. A TODO was added. Discovered by @efriedma as part of D69748.
*	[Matrix] Add first set of matrix intrinsics and initial lowering pass.	Florian Hahn	2019-12-12	4	-0/+493
	This is the first patch adding an initial set of matrix intrinsics and a corresponding lowering pass. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html The first patch introduces four new intrinsics (transpose, multiply, columnwise load and store) and a LowerMatrixIntrinsics pass that lowers those intrinsics to vector operations. Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix embedded in a <16 x float> vector) and the intrinsics take the dimension information as parameters. Those parameters need to be ConstantInt. For the memory layout, we initially assume column-major, but in the RFC we also described how to extend the intrinsics to support row-major as well. For the initial lowering, we split the input of the intrinsics into a set of column vectors, transform those column vectors and concatenate the result columns to a flat result vector. This allows us to lower the intrinsics without any shape propagation, as mentioned in the RFC. In follow-up patches, we plan to submit the following improvements: * Shape propagation to eliminate the embedding/splitting for each intrinsic. * Fused & tiled lowering of multiply and other operations. * Optimization remarks highlighting matrix expressions and costs. * Generate loops for operations on large matrixes. * More general block processing for operations on large vectors, exploiting shape information. We would like to add dedicated transpose, columnwise load and store intrinsics, even though they are not strictly necessary. For example, we could emit a large shufflevector instruction instead of the transpose.
	But we expect that to (1) become unwieldy for larger matrixes (even for 16x16 matrixes, the resulting shufflevector masks would be huge), and (2) risk instcombine making small changes, causing us to fail to detect the transpose and preventing better lowerings. For the load/store, we are additionally planning on exploiting the intrinsics for better alias analysis. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70456
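The column-splitting strategy described in this message can be illustrated with a small model. This is only a sketch in Python, not the LowerMatrixIntrinsics pass itself: the helper names (`split_columns`, `concat_columns`, `transpose`, `multiply`) are hypothetical, and plain lists stand in for the flat column-major vectors.

```python
def split_columns(flat, rows, cols):
    """Split a column-major flat vector into a list of column vectors."""
    assert len(flat) == rows * cols
    return [flat[c * rows:(c + 1) * rows] for c in range(cols)]


def concat_columns(columns):
    """Concatenate column vectors back into one flat column-major vector."""
    return [x for col in columns for x in col]


def transpose(flat, rows, cols):
    """Transpose a rows x cols matrix; the result is cols x rows."""
    columns = split_columns(flat, rows, cols)
    # Column r of the result is row r of the input.
    result_columns = [[columns[c][r] for c in range(cols)] for r in range(rows)]
    return concat_columns(result_columns)


def multiply(a, b, m, k, n):
    """Multiply an m x k matrix by a k x n matrix, both flat column-major."""
    a_cols = split_columns(a, m, k)
    b_cols = split_columns(b, k, n)
    result_cols = []
    for j in range(n):
        col = [0] * m
        for p in range(k):
            scale = b_cols[j][p]
            # Accumulate one scaled column of A into result column j.
            col = [acc + a_cols[p][i] * scale for i, acc in enumerate(col)]
        result_cols.append(col)
    return concat_columns(result_cols)
```

Because every operation starts from the flat vector and the explicit dimensions, no shape information has to flow between operations, which is the property the message relies on for the initial lowering.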
*	[Attributor][NFC] Fix comments and unnecessary comma	Hideto Ueno	2019-12-12	1	-6/+7
*	[Attributor] [NFC] Use `checkForAllUses` helper in `AAHeapToStackImpl::updateImpl`	Hideto Ueno	2019-12-12	1	-35/+18
	Summary: Remove `Worklist` iteration and make use of `checkForAllUses`. There is no test change. Reviewers: sstefan1, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71352
*	[Attributor][NFC] Refactoring `AANoFreeArgument::updateImpl`	Hideto Ueno	2019-12-12	1	-38/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Refactoring `AANoFreeArgument::updateImpl`. There is no test change. Reviewers: sstefan1, jdoerfert Reviewed By: sstefan1 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71349
*	Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same."	Nicola Zaghen	2019-12-12	4	-13/+13
	This reverts commit 5f6208778ff92567c57d7c1e2e740c284d7e69a5. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
*	[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.	Nicola Zaghen	2019-12-12	4	-13/+13
	GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit test was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!
*	[AutoFDO] Statistic for context sensitive profile guided inlining	Wenlei He	2019-12-11	1	-3/+40
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: AutoFDO compilation has two places that do inlining - the sample profile loader that does inlining with context sensitive profile, and the regular inliner as CGSCC pass. Ideally we want most inlining to come from sample profile loader as that is driven by context sensitive profile and also retains context sensitivity after inlining. However the reality is most of the inlining actually happens during regular inliner. To track the number of inline instances from sample profile loader and help move more inlining to sample profile loader, I'm adding statistics and optimization remarks for sample profile loader's inlining. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70584
*	[IR] Split out target specific intrinsic enums into separate headers	Reid Kleckner	2019-12-11	3	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
*	Rename TTI::getIntImmCost for instructions and intrinsics	Reid Kleckner	2019-12-11	2	-14/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381
*	[InstCombine] Optimize overflow check based on uadd.with.overflow result	Nikita Popov	2019-12-11	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix for https://bugs.llvm.org/show_bug.cgi?id=40846. This adds a combine for cases where a (a + b) < a style overflow check is performed, but with a + b being the result of uadd.with.overflow, so the overflow result is also already available and we can just use it. Subsequently GVN/CSE will deduplicate the extracts. We can run into this situation if you have both a uadd.with.overflow and a manual add + overflow check in the same function (on the same operands), in which case GVN will rewrite the add to the with.overflow result and leave you with this pattern. The implementation is a bit ugly because I'm handling the various canonicalization edge cases. This does not yet handle the negated version of this pattern. Differential Revision: https://reviews.llvm.org/D58644
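The equivalence this combine depends on can be checked with a small model. The sketch below is Python, not LLVM code; `uadd_with_overflow` and `manual_overflow_check` are hypothetical helpers that model 32-bit unsigned arithmetic with a mask.

```python
MASK32 = 0xFFFFFFFF


def uadd_with_overflow(a, b):
    """Model a 32-bit unsigned add returning (wrapped sum, overflow flag)."""
    total = a + b
    return total & MASK32, total > MASK32


def manual_overflow_check(a, b):
    """The (a + b) < a idiom, evaluated on the wrapped 32-bit sum."""
    return ((a + b) & MASK32) < a
```

For unsigned operands the wrapped sum is smaller than `a` exactly when the add overflowed, so the manual comparison and the overflow flag always agree, and the comparison can be replaced by the flag that is already available.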
*	[MergeFuncs] Remove incorrect attribute copying	Nikita Popov	2019-12-11	1	-22/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix for https://bugs.llvm.org/show_bug.cgi?id=44236. This code was originally introduced in rG36512330041201e10f5429361bbd79b1afac1ea1. However, the attribute copying was done in the wrong place (in general call replacement, not thunk generation) and a proper fix was implemented in D12581. Previously this code was just unnecessary but harmless (because FunctionComparator ensured that the attributes of the two functions are exactly the same), but since byval was changed to accept a type this copying is actively wrong and may result in malformed IR. Differential Revision: https://reviews.llvm.org/D71173
*	[Alignment][NFC] Introduce Align in IRBuilder	Guillaume Chatelet	2019-12-11	1	-25/+23
	Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71343
*	Rollback assumeAligned in MemorySanitizer	Guillaume Chatelet	2019-12-11	1	-18/+22
	Summary: Rollback of parts of D71213. After digging more into the code I think we should leave 0 when creating the instructions (CreateMemcpy, CreateMaskedStore, CreateMaskedLoad). It's probably fine for MemorySanitizer because Alignment is resolved but I'm having a hard time convincing myself it has no impact at all (although tests are passing). Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71332
*	[Alignment][NFC] Introduce Align in SROA	Guillaume Chatelet	2019-12-11	1	-26/+26
	Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71277
*	Revert "Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty..."	Vlad Tsyrklevich	2019-12-10	2	-64/+22
	This reverts commit f2ba93971ccc236c0eef5323704d31f48107e04f, it was causing build timeouts on sanitizer-x86_64-linux-autoconf such as http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/44917
*	[VectorUtils] Introduce the Vector Function Database (VFDatabase).	Francesco Petrogalli	2019-12-10	4	-20/+33
	This patch introduces the VFDatabase, the framework proposed in http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. In this patch the VFDatabase is used to bridge the TargetLibraryInfo (TLI) calls that were previously used to query for the availability of vector counterparts of scalar functions. The VFISAKind field `ISA` of VFShape has been moved into VFInfo, under the assumption that different vector ISAs may provide the same vector signature. At the moment, the vectorizer accepts any of the available ISAs as long as the signature provided by the VFDatabase matches the one expected in the vectorization process. For example, when targeting AVX or AVX2, which both have 256-bit registers, the IR signature of the two vector functions associated to the two ISAs is the same. The `getVectorizedFunction` method at the moment returns the first available match. We will need to add more heuristics to the search system to decide which of the available versions (TLI, AVX, AVX2, ...) the system should prefer, when multiple versions with the same VFShape are present. Some of the code in this patch is based on the work done by Sumedh Arani in https://reviews.llvm.org/D66025. Notice that in the proposal the VFDatabase was called SVFS. The name VFDatabase is more in line with LLVM recommendations for naming classes and variables. Differential Revision: https://reviews.llvm.org/D67572
*	[InstCombine] replace shuffle's insertelement operand if inserted scalar is not demanded	Sanjay Patel	2019-12-10	1	-1/+27
	This pattern is noted as a regression from: D70246 ...where we removed an over-aggressive shuffle simplification. SimplifyDemandedVectorElts fails to catch this case when the insert has multiple uses, so I'm proposing to pattern match the minimal sequence directly. This fold does not conflict with any of our current shuffle undef/poison semantics. Differential Revision: https://reviews.llvm.org/D71220
*	[Alignment][NFC] CreateMemSet use MaybeAlign	Guillaume Chatelet	2019-12-10	7	-135/+138
	Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71213
*	Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty...	stozer	2019-12-10	2	-22/+64
\| \| \| \| \| \| \| \| \| \| \|	basic blocks Originally applied in 72ce759928e6dfee6a9efa310b966c19722352ba. Fixed a build failure caused by incorrect use of cast instead of dyn_cast. This reverts commit 8b0780f795eb58fca0a2456e308adaaa1a0b5013.
*	[DebugInfo][EarlyCSE] Use the salvageDebugInfoOrMarkUndef(); NFC	Djordje Todorovic	2019-12-09	1	-2/+2
\| \| \| \| \| \|	Use the newest API. Differential Revision: https://reviews.llvm.org/D71061
*	[ARM] Teach the Arm cost model that a Shift can be folded into other instructions	David Green	2019-12-09	2	-5/+5
	This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instructions that the shift can be folded into includes ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
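The single-user rule described above can be modeled in a few lines. This Python sketch is only an illustration of the heuristic, not the code in getArithmeticInstrCost: the function name `shift_cost`, the string opcodes and the cost values 0 and 1 are all hypothetical, and details such as which operand position the shift must occupy are not modeled.

```python
# IR-level opcodes the message lists as able to absorb a shift.
FOLDABLE_USER_OPCODES = {"add", "sub", "and", "or", "xor", "icmp"}


def shift_cost(user_opcodes, base_cost=1):
    """Return a modeled cost for a shift given the opcodes of its users.

    The shift is treated as free only when it has exactly one user and
    that user is an instruction the shift can be folded into.
    """
    if len(user_opcodes) == 1 and user_opcodes[0] in FOLDABLE_USER_OPCODES:
        return 0
    return base_cost
```

Under this model `shift_cost(["and"])` is free, while a shift feeding two users or a multiply keeps its base cost.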
*	[LV] Pick correct BB as insert point when fixing PHI for FORs.	Florian Hahn	2019-12-07	1	-10/+20
	Currently we fail to pick the right insertion point when PreviousLastPart of a first-order-recurrence is a PHI node not in the LoopVectorBody. This can happen when PreviousLastPart is produced in a predicated block. In that case, we should pick the insertion point in the BB the PHI is in. Fixes PR44020. Reviewers: hsaito, fhahn, Ayal, dorit Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D71071
*	[SimplifyCFG] Account for N being null.	Florian Hahn	2019-12-07	1	-5/+6
\| \| \| \| \|	Fixes a crash, e.g. http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15119/
*	[SimplifyCFG] Handle AssumptionCache being null.	Rodrigo Caetano Rocha	2019-12-07	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \|	AssumptionCache can be null in SimplifyCFGOptions. However, FoldCondBranchOnPHI() was not properly handling that when passing a null AssumptionCache to simplifyCFG. Patch by Rodrigo Caetano Rocha <rcor.cs@gmail.com> Reviewers: fhahn, lebedev.ri, spatel Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D69963
*	[VPlan] Rename VPlanHCFGTransforms to VPlanTransforms (NFC).	Florian Hahn	2019-12-07	6	-14/+12
\| \| \| \| \| \| \| \| \| \| \| \|	The file is intended to gather various VPlan transformations, not only CFG related transforms. Actually, the only transformation there is not CFG related. Reviewers: Ayal, gilr, hsaito, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D70732
*	[WPD] Remove unused parameter (NFC)	Teresa Johnson	2019-12-06	1	-4/+3
\| \| \| \|	Remove unused parameter.
*	[AutoFDO] Inline replay for cold/small callees from sample profile loader	Wenlei He	2019-12-06	1	-3/+35
	Summary: Sample profile loader of AutoFDO tries to replay previous inlining using context sensitive profile. The replay only repeats inlining if the call site block is hot. As a result it punts inlining of small functions, some of which can be beneficial for size, and will still be inlined by CGSCC inliner later. The oscillation between sample profile loader's inlining and regular CGSCC inlining causes unnecessary loss of context-sensitive profile. It doesn't have much impact on the inline decision itself, but it negatively affects post-inline profile quality as CGSCC inliner has to scale counts, which is not as accurate as the original context sensitive profile, and bad post-inline profile can misguide code layout. This change added regular Inline Cost calculation for sample profile loader, so we can inline small functions upfront under switch -sample-profile-inline-size. In addition -sample-profile-cold-inline-threshold is added so we can tune the separate size threshold - currently the default is chosen to be the same as regular inliner's cold call-site threshold. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70750
*	Revert "[InstCombine] reduce code duplication; NFC"	Sanjay Patel	2019-12-06	1	-40/+38
\| \| \| \| \|	This reverts commit db5739658467e20a52f20e769d3580412e13ff87. At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.
*	Revert "[InstCombine] improve readability; NFC"	Sanjay Patel	2019-12-06	1	-6/+10
\| \| \| \| \|	This reverts commit 7250ef3613cc6b81145b9543bafb86d7f9466cde. At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.
*	Revert "[InstCombine] reduce indentation; NFC"	Sanjay Patel	2019-12-06	1	-25/+28
\| \| \| \| \|	This reverts commit 8bf8ef7116bd0daec570b35480ca969b74e66c6e. At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.
*	[InstCombine] reduce indentation; NFC	Sanjay Patel	2019-12-06	1	-28/+25
*	[InstCombine] improve readability; NFC	Sanjay Patel	2019-12-06	1	-10/+6
	CreateIntCast returns the input if its type matches, so there is no need to duplicate that check.
*	[InstCombine] reduce code duplication; NFC	Sanjay Patel	2019-12-06	1	-38/+40
*	[InstCombine] improve readability; NFC	Sanjay Patel	2019-12-06	1	-33/+33
*	[LV] Record GEP widening decisions in recipe (NFCI)	Gil Rapaport	2019-12-06	5	-76/+147
	InnerLoopVectorizer's code called during VPlan execution still relies on original IR's def-use relations to decide which vector code to generate, limiting VPlan transformations' ability to modify def-use relations and still have ILV generate the vector code. This commit moves GEP operand queries controlling how GEPs are widened to a dedicated recipe and extracts GEP widening code to its own ILV method taking those recorded decisions as arguments. This reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Differential revision: https://reviews.llvm.org/D69067
*	[LCSSA] Don't use VH callbacks to invalidate SCEV when creating LCSSA phis	Daniil Suchkov	2019-12-06	2	-12/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In general ValueHandleBase::ValueIsRAUWd shouldn't be called when not all uses of the value were actually replaced, though, currently formLCSSAForInstructions calls it when it inserts LCSSA-phis. Calls of ValueHandleBase::ValueIsRAUWd were added to LCSSA specifically to update/invalidate SCEV. In the best case these calls duplicate some of the work already done by SE->forgetValue, though in case when SCEV of the value is SCEVUnknown, SCEV replaces the underlying value of SCEVUnknown with the new value (i.e. acts like LCSSA-phi actually fully replaces the value it is created for), which leads to SCEV being corrupted because LCSSA-phi rarely dominates all uses of its inputs. Fixes bug https://bugs.llvm.org/show_bug.cgi?id=44058. Reviewers: fhahn, efriedma, reames, sanjoy.google Reviewed By: fhahn Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70593
*	[ThinLTO] Add option to disable readonly/writeonly attribute propagation	Teresa Johnson	2019-12-05	1	-12/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add an option to allow the attribute propagation on the index to be disabled, to allow a workaround for issues (such as that fixed by D70977). Also move the setting of the WithAttributePropagation flag on the index into propagateAttributes(), and remove some old stale code that predated this flag and cleared the maybe read/write only bits when we need to disable the propagation (previously only when importing disabled, now also when the new option disables it). Reviewers: evgeny777, steven_wu Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70984
*	[AutoFDO] Top-down Inlining for specialization with context-sensitive profile	Wenlei He	2019-12-05	1	-9/+46
	Summary: AutoFDO's sample profile loader processes functions in arbitrary source code order, so if I change the order of two functions in source code, the inline decision can change. This also prevented the use of context-sensitive profile to do specialization while inlining. This commit enforces SCC top-down order for sample profile loader. With this change, we can now do specialization, as illustrated by the added test case: Say we have A->B->C and D->B->C call paths, and we want to inline C into B when root inliner is B, but not when root inliner is A or D; this is not possible without enforcing top-down order. E.g. once C is inlined into B, A and D can only choose to inline (B->C) as a whole or nothing, but what we want is to inline only B into A and D, not its recursive callee C. If we process functions in top-down order, this is no longer a problem, which is what this commit is doing. This change is guarded with a new switch "-sample-profile-top-down-load" for tuning, and it depends on D70653. Eventually, top-down can be the default order for sample profile loader. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits, tejohnson Tags: #llvm Differential Revision: https://reviews.llvm.org/D70655
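The top-down order from the A->B->C and D->B->C example can be sketched as a topological sort in which every caller precedes its callees. This is a Python illustration, not the sample profile loader's code: `top_down_order` is a hypothetical helper, the call graph is assumed to be acyclic (the commit works on SCCs, which this sketch does not model), and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def top_down_order(call_graph):
    """Return functions ordered so that callers come before their callees.

    call_graph maps each caller name to the list of functions it calls.
    """
    sorter = TopologicalSorter()
    for caller, callees in call_graph.items():
        sorter.add(caller)
        for callee in callees:
            # Declare the caller as a predecessor of the callee.
            sorter.add(callee, caller)
    return list(sorter.static_order())
```

With `{"A": ["B"], "D": ["B"], "B": ["C"], "C": []}`, both A and D are visited before B, and B before C, so the decision about inlining B into A or D is made while B still has a separate call to C.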
*	[AutoFDO] Properly merge context-sensitive profile of inlinee back to outlined function	Wenlei He	2019-12-05	1	-3/+25
	Summary: When sample profile loader decides not to inline a previously inlined call-site, we adjust the profile of outlined function simply by scaling up its profile counts by call-site count. This means the context-sensitive profile of that inlined instance will be thrown away. This commit tries to keep context-sensitive profile for such cases: - Instead of scaling outlined function's profile, we now properly merge the FunctionSamples of inlined instance into outlined function, including all recursively inlined profile. - Instead of adjusting the profile for negative inline decision at the end of the sample profile loader pass, we do the profile merge right after processing each function. This change paired with top-down ordering of annotation/inline-replay (a separate diff) will make sure we recursively merge profile back before the profile is used for annotation and inline replay. A new switch -sample-profile-merge-inlinee is added to enable the new profile merge for tuning. It should be the default behavior eventually. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70653
*	Revert "[DSE] Fix for a dangling point bug in DeadStoreElimination."	Florian Hahn	2019-12-05	1	-42/+17
\| \| \| \| \| \| \|	The commit causes a failure: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20911 This reverts commit 1847fd9d85506ecee692230cb2500e3774ec628e.
*	LowerDbgDeclare: look through bitcasts.	Evgenii Stepanov	2019-12-05	1	-16/+26
	Summary: Emit a value debug intrinsic (with OP_deref) when an alloca address is passed to a function call after going through a bitcast. This generates an FP or SP-relative location for the local variable in the following case: int x; use((void *)&x); Reviewers: aprantl, vsk, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70752
*	Revert "[InstCombine] keep assumption before sinking calls"	Bob Haarman	2019-12-05	1	-21/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This reverts commit c3b06d0c393e533eab712922911d14e5a079fa5d. Reason for revert: Caused miscompiles when inserting assume for undef. Also adds a test to prevent similar breakage in future. Fixes PR44154. Reviewers: rnk, jdoerfert, efriedma, xbolva00 Reviewed By: rnk Subscribers: thakis, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70933
*	[InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to `sub A, zext B -> add A, sext B`)	Roman Lebedev	2019-12-05	1	-6/+12
	Summary: D68408 proposes to greatly improve our negation sinking abilities. But in current canonicalization, we produce `sub A, zext(B)`, which we will consider non-canonical and try to sink that negation, undoing the existing canonicalization. So unless we explicitly stop producing previous canonicalization, we will have two conflicting folds, and will end up endlessly looping. This inverts canonicalization, and adds back the obvious fold that we'd miss: * `sub [nsw] Op0, sext/zext (bool Y) -> add [nsw] Op0, zext/sext (bool Y)` https://rise4fun.com/Alive/xx4 * `sext(bool) + C -> bool ? C - 1 : C` https://rise4fun.com/Alive/fBl It is obvious that `@ossfuzz_9880()` / `@lshr_out_of_range()`/`@ashr_out_of_range()` (oss-fuzz 4871) are no longer folded as much, though those aren't really worrying. Reviewers: spatel, efriedma, t.p.northover, hfinkel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71064
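Both identities in this message can be checked exhaustively on a small bit width. The Python sketch below models 8-bit wrapping arithmetic with a mask; the helper names are hypothetical and the `nsw` flags are not modeled.

```python
BITS = 8
MASK = (1 << BITS) - 1


def zext_bool(b):
    """Zero-extend an i1: true becomes 1."""
    return 1 if b else 0


def sext_bool(b):
    """Sign-extend an i1: true becomes all-ones, i.e. -1 in two's complement."""
    return MASK if b else 0


def sub_zext(a, b):
    """sub A, zext(B) with wrapping."""
    return (a - zext_bool(b)) & MASK


def add_sext(a, b):
    """add A, sext(B) with wrapping."""
    return (a + sext_bool(b)) & MASK


def sext_plus_const(b, c):
    """sext(bool) + C with wrapping."""
    return (sext_bool(b) + c) & MASK


def select_form(b, c):
    """bool ? C - 1 : C with wrapping."""
    return (c - 1) & MASK if b else c
```

Subtracting 1 and adding the all-ones value differ by exactly `2**BITS`, so the two forms agree for every input after wrapping.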
*	[DSE] Fix for a dangling point bug in DeadStoreElimination.	Ankit	2019-12-05	1	-17/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction. While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration. In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container. Patch by Ankit <quic_aankit@quicinc.com> Reviewers: fhahn, bcahoon, efriedma Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65326
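The idea of keeping every throwing instruction seen so far, rather than a single pointer, can be sketched as follows. This is a Python illustration under assumptions: the message does not say which container the patch uses, so the list-plus-deleted-set model and the name `last_live_throwing` are hypothetical.

```python
def last_live_throwing(throwing_stack, deleted):
    """Return the most recent throwing instruction that has not been deleted.

    throwing_stack holds throwing instructions in the order they were seen;
    deleted is the set of instructions removed so far. Deleted entries are
    dropped from the end of the stack, so no stale entry is ever returned.
    """
    while throwing_stack and throwing_stack[-1] in deleted:
        throwing_stack.pop()
    return throwing_stack[-1] if throwing_stack else None
```

If the newest throwing instruction is deleted as a side effect of removing a dead store, the query falls back to the previous one instead of following a dangling reference.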
*	[InstCombine] narrow select with FP casts	Sanjay Patel	2019-12-05	1	-0/+18
\| \| \| \|	Select doesn't change values, so truncate of extended operand cancels out.
*	[InstCombine] add FMF guard to builder in fptrunc transform; NFC	Sanjay Patel	2019-12-05	1	-0/+5
\| \| \| \| \| \| \| \|	This makes no difference currently because we don't apply FMF to FP casts, but that may change. This could also be a place to add a fold for select with fptrunc, so it will make that patch easier/smaller.
*	[InstCombine] Extend `0 - (X sdiv C) -> (X sdiv -C)` fold to non-splat vectors	Roman Lebedev	2019-12-05	1	-8/+10
\| \| \| \|	Split off from https://reviews.llvm.org/D68408
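The scalar identity behind this fold can be checked exhaustively on 8-bit values; for a non-splat vector the same reasoning applies to each lane with its own constant. The Python sketch below uses hypothetical helpers and skips C = 0, 1, -1 and INT_MIN, where either the division is undefined or overflowing, or negating C wraps. Whether the actual fold uses exactly these guards is an assumption here.

```python
BITS = 8
INT_MIN = -(1 << (BITS - 1))
INT_MAX = (1 << (BITS - 1)) - 1


def wrap(v):
    """Wrap an integer to a signed BITS-bit two's complement value."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v > INT_MAX else v


def sdiv(x, c):
    """Signed division truncating toward zero; c must be non-zero."""
    q = abs(x) // abs(c)
    return q if (x < 0) == (c < 0) else -q


def fold_holds(x, c):
    """Check 0 - (x sdiv c) == x sdiv (-c) under BITS-bit wrapping."""
    return wrap(0 - sdiv(x, c)) == sdiv(x, wrap(-c))
```

The INT_MIN case shows why a guard is needed: negating INT_MIN wraps back to INT_MIN, so `fold_holds(INT_MIN, INT_MIN)` is false.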
*	[ThinLTO] Fix importing of writeonly variables in distributed ThinLTO	Teresa Johnson	2019-12-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: D69561/dde5893 enabled importing of readonly variables with references, however, it introduced a bug relating to importing/internalization of writeonly variables with references. A fix for this was added in D70006/7f92d66. But this didn't work in distributed ThinLTO mode. The reason is that the fix (importing the writeonly var with a zeroinitializer) was only applied when there were references on the writeonly var summary. In distributed ThinLTO mode, where we only have a small slice of the index, we will not have the references on the importing side if we are not importing those referenced values. Rather than changing this handshaking (which will require a lot of other changes, since that's how we know what to import in the distributed backend clang invocation), we can simply always give the writeonly variable a zero initializer. Reviewers: evgeny777, steven_wu Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70977