bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	LiveIntervals: Fix handleMoveUp with subreg def moving across a def	Matt Arsenault	2019-10-18	1	-1/+16
\| \| \| \| \| \| \| \| \|	If a subregister def was moved across another subregister def and another use, the main range was not correctly updated. The end point of the moved interval ended too early and missed the use from the other lanes in the subreg def. llvm-svn: 375300
*	[AMDGPU] move PHI nodes to AGPR class	Stanislav Mekhanoshin	2019-10-18	1	-5/+16
\| \| \| \| \| \| \| \| \|	If all uses of a PHI are in AGPR register class we should avoid unneeded copies via VGPRs. Differential Revision: https://reviews.llvm.org/D69200 llvm-svn: 375297
*	[SampleFDO] Add profile remapping support for profile on-demand loading used	Wei Mi	2019-10-18	2	-52/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	by ExtBinary format profile Profile on-demand loading was added for ExtBinary format profile in rL374233, but currently profile on-demand loading doesn't work well with profile remapping. The patch adds the support. Suppose a function in the current module has outline instance in the profile. The function name in the module is different from the name of the outline instance, but remapper knows the two names are equal. When loading profile on-demand, the outline instance has to be loaded with remapper's help. At the same time SampleProfileReaderItaniumRemapper is changed from a proxy of SampleProfileReader to a helper member in SampleProfileReader. Differential Revision: https://reviews.llvm.org/D68901 llvm-svn: 375295
*	[AMDGPU] Remove -amdgpu-spill-sgpr-to-smem.	Jay Foad	2019-10-18	2	-156/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The implementation was never completed and never used except in tests. Reviewers: arsenm, mareko Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69163 llvm-svn: 375293
*	[CVP] setDeducedOverflowingFlags(): actually inc per-opcode stats	Roman Lebedev	2019-10-18	1	-4/+4
\| \| \| \| \| \| \|	This is really embarrassing. Those are pointers, so that offsets the pointers, not the statistics pointed-by the pointer... llvm-svn: 375290
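The bug class described in this message can be reproduced in isolation. The following is a minimal sketch and not the CVP code itself: `Counter`, `bumpWrong`, and `bumpRight` are hypothetical names standing in for the statistic objects and the pass logic, which the log does not show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a statistic counter; this is not LLVM's class.
struct Counter {
  std::uint64_t Value = 0;
  Counter &operator++() {
    ++Value;
    return *this;
  }
};

// Buggy shape: ++ is applied to the pointer, so only the local pointer copy
// moves (to one past the object) and the counter is never modified.
inline void bumpWrong(Counter *Stat) {
  assert(Stat && "expected a valid counter");
  ++Stat;
  (void)Stat; // the advanced pointer is discarded
}

// Fixed shape: dereference first, then increment the counter itself.
inline void bumpRight(Counter *Stat) {
  assert(Stat && "expected a valid counter");
  ++*Stat;
}
```

Calling `bumpWrong(&C)` leaves `C.Value` unchanged, while `bumpRight(&C)` increases it by one, which is the difference the commit title refers to.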
*	Disable exit-on-SIGPIPE in lldb	Vedant Kumar	2019-10-18	2	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Occasionally, during test teardown, LLDB writes to a closed pipe. Sometimes the communication is inherently unreliable, so LLDB tries to avoid being killed due to SIGPIPE (it calls `signal(SIGPIPE, SIG_IGN)`). However, LLVM's default SIGPIPE behavior overrides LLDB's, causing it to exit with IO_ERR. Opt LLDB out of the default SIGPIPE behavior. I expect that this will resolve some LLDB test suite flakiness (tests randomly failing with IO_ERR) that we've seen since r344372. rdar://55750240 Differential Revision: https://reviews.llvm.org/D69148 llvm-svn: 375288
*	[X86] Fix register parsing in .seh_* in Intel syntax	Reid Kleckner	2019-10-18	1	-4/+3
\| \| \| \| \| \| \| \| \| \|	Previously, the parser checked for a '%' prefix to indicate a register. In Intel syntax mode, LLVM does not print a '%' prefix on registers, so LLVM could not parse its own assembly output. Instead, require that register numbers be integer literals, or at least start with an integer literal, which is consistent with .cfi_* directive register parsing. llvm-svn: 375287
*	[WebAssembly] Allow multivalue signatures in object files	Thomas Lively	2019-10-18	3	-14/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Also changes the wasm YAML format to reflect the possibility of having multiple return types and to put the returns after the params for consistency with the binary encoding. Reviewers: aheejin, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, arphaman, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69156 llvm-svn: 375283
*	[GISel][CallLowering] Make isIncomingArgumentHandler a pure virtual method	Quentin Colombet	2019-10-18	4	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	The default implementation of isIncomingArgumentHandler could lead to generating incorrect code. Make it a pure virtual method, so that targets know they have to override it to produce correct code. NFC Differential Revision: https://reviews.llvm.org/D69187 llvm-svn: 375277
*	[CVP] After proving that @llvm.with.overflow()/@llvm.sat() don't overflow, ↵	Roman Lebedev	2019-10-18	1	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	also try to prove other no-wrap Summary: CVP, unlike InstCombine, does not run until exhaustion. It only does a single pass. When dealing with those special binops, if we prove that they can safely be demoted into their usual binop form, we do set the no-wrap we deduced. But when dealing with usual binops, we try to deduce both no-wraps. So if we convert e.g. @llvm.uadd.with.overflow() to `add nuw`, we won't attempt to check whether it can be `add nuw nsw`. This patch proposes to call `processBinOp()` on newly-created binop, which is identical to what we do for div/rem already. Reviewers: nikic, spatel, reames Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69183 llvm-svn: 375273
*	AMDGPU: Relax 32-bit SGPR register class	Matt Arsenault	2019-10-18	6	-34/+39
\| \| \| \| \| \| \| \| \| \| \|	Mostly use SReg_32 instead of SReg_32_XM0 for arbitrary values. This will allow the register coalescer to do a better job eliminating copies to m0. For GlobalISel, as a terrible hack, use SGPR_32 for things that should use SCC until booleans are solved. llvm-svn: 375267
*	AMDGPU: Fix SMEM WAR hazard for gfx10 readlane	Austin Kerbow	2019-10-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Hazard recognizer fails to see hazard with V_READLANE_B32_gfx10. Reviewers: rampitec Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69172 llvm-svn: 375265
*	[PGO][PGSO] SizeOpts changes.	Hiroshi Yamauchi	2019-10-18	6	-18/+192
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: (Split off of D67120) SizeOpts/MachineSizeOpts changes for profile guided size optimization. Reviewers: davidxl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69070 llvm-svn: 375254
*	[X86] combineX86ShufflesRecursively - pull out isTargetShuffleVariableMask. ↵	Simon Pilgrim	2019-10-18	1	-1/+2
\| \| \| \| \| \|	NFCI. llvm-svn: 375253
*	Update MinidumpYAML to use minidump::Exception for exception stream	Joseph Tremoulet	2019-10-18	2	-2/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: labath, jhenderson, clayborg, MaskRay, grimar Reviewed By: grimar Subscribers: lldb-commits, grimar, MaskRay, hiraditya, llvm-commits Tags: #llvm, #lldb Differential Revision: https://reviews.llvm.org/D68657 llvm-svn: 375242
*	[AMDGPU][MC][GFX10] Added sdwa/dpp versions of v_cndmask_b32	Dmitry Preobrazhensky	2019-10-18	2	-52/+80
\| \| \| \| \| \| \| \| \| \|	See https://bugs.llvm.org/show_bug.cgi?id=43608 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D69096 llvm-svn: 375241
*	Revert r375152 as it is causing failures on EXPENSIVE_CHECKS bot	Nemanja Ivanovic	2019-10-18	1	-1/+1
\| \| \| \|	llvm-svn: 375233
*	[SCEV] Removing deprecated comment in ScalarEvolutionExpander	Victor Campos	2019-10-18	1	-3/+0
\| \| \| \| \| \| \|	Removing a comment in the ScalarEvolutionExpander.cpp file that was about the class SCEVSDivExpr, which has been long gone from LLVM. llvm-svn: 375232
*	[AMDGPU][MC][GFX9] Corrected parsing of v_cndmask_b32_sdwa	Dmitry Preobrazhensky	2019-10-18	2	-10/+22
\| \| \| \| \| \| \| \| \| \|	See https://bugs.llvm.org/show_bug.cgi?id=43607 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D69095 llvm-svn: 375231
*	[NFC][CVP] Count all the no-wraps we proved	Roman Lebedev	2019-10-18	1	-20/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It looks like this is the only missing statistic in the CVP pass. Since we prove NSW and NUW separately i'd think we should count them separately too. Reviewers: nikic, spatel, reames Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68740 llvm-svn: 375230
*	[AArch64] Adding support for PMMIR_EL1 register	Victor Campos	2019-10-18	4	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The PMMIR_EL1 register is present in Armv8.4 with PMU extension. This patch adds support for it. Reviewers: t.p.northover, dnsampaio Reviewed By: dnsampaio Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68940 llvm-svn: 375228
*	[AArch64][SVE] Add SPLAT_VECTOR ISD Node	Graham Hunter	2019-10-18	11	-17/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adds a new ISD node to replicate a scalar value across all elements of a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot be used. Fixes up default type legalization for scalable vectors after the new MVT type ranges were introduced. At present I only use this node for scalable vectors. A DAGCombine has been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all elements are the same, but only if the default operation action of Expand has been overridden by the target. I've only added result promotion legalization for scalable vector i8/i16/i32/i64 types in AArch64 for now. Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy Reviewed By: jmolloy Differential Revision: https://reviews.llvm.org/D47775 llvm-svn: 375222
*	[ThinLTOCodeGenerator] Add support for index-based WPD	Eugene Leviant	2019-10-18	1	-21/+47
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D68950 llvm-svn: 375219
*	[AArch64] Don't combine callee-save and local stack adjustment when ↵	David Green	2019-10-18	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	optimizing for size For arm64, D18619 introduced the ability to combine bumping the stack pointer upfront in case it needs to be bumped for both the callee-save area as well as the local stack area. That diff already remarks that "This change can cause an increase in instructions", but argues that even when that happens, it should still be a performance benefit because the number of micro-ops is reduced. We have observed that this code-size increase can be significant in practice. This diff disables combining stack bumping for methods that are marked as optimize-for-size. Example of a prologue with the behavior before this diff (combining stack bumping when possible): sub sp, sp, #0x40 stp d9, d8, [sp, #0x10] stp x20, x19, [sp, #0x20] stp x29, x30, [sp, #0x30] add x29, sp, #0x30 [... compute x8 somehow ...] stp x0, x8, [sp] And after this diff, if the method is marked as optimize-for-size: stp d9, d8, [sp, #-0x30]! stp x20, x19, [sp, #0x10] stp x29, x30, [sp, #0x20] add x29, sp, #0x20 [... compute x8 somehow ...] stp x0, x8, [sp, #-0x10]! Note that without combining the stack bump there are two auto-decrements, nicely folded into the stp instructions, whereas otherwise there is a single sub sp, ... instruction, but not folded. Patch by Nikolai Tillmann! Differential Revision: https://reviews.llvm.org/D68530 llvm-svn: 375217
*	Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warnings. ↵	Simon Pilgrim	2019-10-18	1	-1/+1
\| \| \| \| \| \|	NFCI. llvm-svn: 375213
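The warning quoted in this commit title is typically produced by code of the following shape. This is a generic sketch, since the log does not show the changed line; `maskNarrow` and `maskWide` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Shape that draws the warning: the literal 1 is a 32-bit int, so the shift
// is evaluated in 32 bits and only the result is widened to 64 bits.
inline std::uint64_t maskNarrow(unsigned Bit) {
  assert(Bit < 31 && "the shift happens in 32 bits");
  return 1 << Bit;
}

// Usual fix: widen the operand first so the shift itself is 64 bits wide.
inline std::uint64_t maskWide(unsigned Bit) {
  assert(Bit < 64 && "the shift happens in 64 bits");
  return std::uint64_t(1) << Bit;
}
```

For small shift amounts both functions return the same value, which is consistent with the NFCI tag; only the wide form remains valid once the shift amount reaches 31 or more.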
*	[Codegen] Alter the default promotion for saturating adds and subs	David Green	2019-10-18	1	-33/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The default promotion for the add_sat/sub_sat nodes currently does: ANY_EXTEND iN to iM SHL by M-N [US][ADD\|SUB]SAT L/ASHR by M-N If the promoted add_sat or sub_sat node is not legal, this can produce code that effectively does a lot of shifting (and requiring large constants to be materialised) just to use the overflow flag. It is simpler to just do the saturation manually, using the higher bitwidth addition and a min/max against the saturating bounds. That is what this patch attempts to do. Differential Revision: https://reviews.llvm.org/D68926 llvm-svn: 375211
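The manual saturation described above (a higher-bitwidth addition followed by a min/max against the saturating bounds) can be sketched on scalars. This is an illustration under assumptions, not the legalizer code: it fixes N = 8 and M = 32, and `saddSat8` and `uaddSat8` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

// Signed case: sign-extend to 32 bits, add there (cannot overflow), then
// clamp to the i8 range with a min and a max.
inline std::int8_t saddSat8(std::int8_t A, std::int8_t B) {
  std::int32_t Wide = std::int32_t(A) + std::int32_t(B);
  Wide = std::min<std::int32_t>(Wide, INT8_MAX);
  Wide = std::max<std::int32_t>(Wide, INT8_MIN);
  return static_cast<std::int8_t>(Wide);
}

// Unsigned case: zero-extend, add, then a single min against the upper bound.
inline std::uint8_t uaddSat8(std::uint8_t A, std::uint8_t B) {
  std::uint32_t Wide = std::uint32_t(A) + std::uint32_t(B);
  return static_cast<std::uint8_t>(std::min<std::uint32_t>(Wide, UINT8_MAX));
}
```

For example, `saddSat8(100, 100)` clamps to 127 and `uaddSat8(200, 100)` clamps to 255, with no shifts or large constants involved.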
*	[AArch64][SVE] Implement unpack intrinsics	Kerry McLaughlin	2019-10-18	5	-5/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implements the following intrinsics: - int_aarch64_sve_sunpkhi - int_aarch64_sve_sunpklo - int_aarch64_sve_uunpkhi - int_aarch64_sve_uunpklo This patch also adds AArch64ISD nodes for UNPK instead of implementing the intrinsics directly, as they are required for a future patch which implements the sign/zero extension of legal vectors. This patch includes tests for the Subdivide2Argument type added by D67549 Reviewers: sdesmalen, SjoerdMeijer, greened, rengolin, rovka Reviewed By: greened Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D67550 llvm-svn: 375210
*	[InstCombine] Fix miscompile bug in canEvaluateShuffled	Bjorn Pettersson	2019-10-18	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add restrictions in canEvaluateShuffled to prevent that we for example transform %0 = insertelement <2 x i16> undef, i16 %a, i32 0 %1 = srem <2 x i16> %0, <i16 2, i16 1> %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0> into %1 = insertelement <2 x i16> undef, i16 %a, i32 1 %2 = srem <2 x i16> %1, <i16 undef, i16 2> as having an undef denominator makes the srem undefined (for all vector elements). Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689 Reviewers: spatel, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69038 llvm-svn: 375208
*	[X86] Emit KTEST when possible	David Zarzycki	2019-10-18	1	-8/+23
\| \| \| \| \| \|	https://reviews.llvm.org/D69111 llvm-svn: 375197
*	[IndVars] Factor out some common code into a utility function	Philip Reames	2019-10-17	1	-16/+13
\| \| \| \| \| \|	As requested in review of D69009 llvm-svn: 375191
*	DebugInfo: Move loclist base address from DwarfFile to DebugLocStream	David Blaikie	2019-10-17	5	-27/+18
\| \| \| \| \| \| \| \| \|	There's no need to have more than one of these (there can be two DwarfFiles - one for the .o, one for the .dwo - but only one loc/loclist section (either in the .o or the .dwo) & certainly one per DebugLocStream, which is currently singular in DwarfDebug) llvm-svn: 375183
*	DebugInfo: Remove unused parameter (from ↵	David Blaikie	2019-10-17	1	-3/+3
\| \| \| \| \| \|	DwarfDebug.cpp:emitListsTableHeaderStart) llvm-svn: 375180
*	[AMDGPU] drop getIsFP td helper	Stanislav Mekhanoshin	2019-10-17	3	-23/+13
\| \| \| \| \| \| \| \| \|	We already have isFloatType helper, and they are out of sync. Drop one and merge the type list. Differential Revision: https://reviews.llvm.org/D69138 llvm-svn: 375175
*	[NFC][InstCombine] Some more preparatory cleanup for ↵	Roman Lebedev	2019-10-17	1	-4/+4
\| \| \| \| \| \|	dropRedundantMaskingOfLeftShiftInput() llvm-svn: 375153
*	[PowerPC] Turn on CR-Logical reducer pass	Nemanja Ivanovic	2019-10-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Quite a while ago, we implemented a pass that will reduce the number of CR-logical operations we emit. It does so by converting a CR-logical operation into a branch. We have kept this off by default because it seemed to cause a significant regression with one benchmark. However, that regression turned out to be due to a completely unrelated reason - AADB introducing a self-copy that is a priority-setting nop and it was just exacerbated by this pass. Now that we understand the reason for the only degradation, we can turn this pass on by default. We have long since fixed the cause for the degradation. Differential revision: https://reviews.llvm.org/D52431 llvm-svn: 375152
*	Reapply r375051: [support] GlobPattern: add support for `\` and `[!...]`, ↵	Jordan Rupprecht	2019-10-17	1	-7/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and allow `]` in more places Reland r375051 (reverted in r375052) after fixing lld tests on Windows in r375126 and r375131. Original description: Update GlobPattern in libSupport to handle a few more cases. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway). This will be used to implement the `--wildcard` flag in llvm-objcopy to be more compatible with GNU objcopy. This is split off of D66613 to land the libSupport changes separately. The llvm-objcopy part will land soon. Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap Reviewed By: MaskRay Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66613 llvm-svn: 375149
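The pattern features named in this message (`\` escapes, `[!...]` negation, and `]` as the first character of a bracket set) can be sketched with a small recursive matcher. This is not the libSupport implementation and makes no claim to match its behaviour in every corner case; `globMatch`, `matchFrom`, and `matchBracket` are hypothetical names, and treating an unterminated `[` as a literal is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Parses the bracket expression starting at Pat[I] == '['. On success, sets
// Next to the index after the closing ']' and Hit to whether C is accepted.
static bool matchBracket(const std::string &Pat, std::size_t I, char C,
                         std::size_t &Next, bool &Hit) {
  std::size_t J = I + 1;
  bool Negate = false;
  if (J < Pat.size() && (Pat[J] == '!' || Pat[J] == '^')) {
    Negate = true;
    ++J;
  }
  bool InSet = false;
  bool First = true; // a ']' in first position is a literal member
  while (J < Pat.size() && (First || Pat[J] != ']')) {
    First = false;
    char Lo = Pat[J];
    if (J + 2 < Pat.size() && Pat[J + 1] == '-' && Pat[J + 2] != ']') {
      char Hi = Pat[J + 2]; // range such as a-z
      if (Lo <= C && C <= Hi)
        InSet = true;
      J += 3;
    } else {
      if (C == Lo)
        InSet = true;
      ++J;
    }
  }
  if (J >= Pat.size())
    return false; // no closing ']'
  Next = J + 1;
  Hit = (InSet != Negate);
  return true;
}

static bool matchFrom(const std::string &Pat, std::size_t P,
                      const std::string &Str, std::size_t S) {
  while (P < Pat.size()) {
    char PC = Pat[P];
    if (PC == '*') {
      // Try every length for the star, shortest first.
      for (std::size_t K = S; K <= Str.size(); ++K)
        if (matchFrom(Pat, P + 1, Str, K))
          return true;
      return false;
    }
    if (S >= Str.size())
      return false;
    if (PC == '?') {
      ++P;
      ++S;
      continue;
    }
    if (PC == '[') {
      std::size_t Next = 0;
      bool Hit = false;
      if (matchBracket(Pat, P, Str[S], Next, Hit)) {
        if (!Hit)
          return false;
        P = Next;
        ++S;
        continue;
      }
      // Unterminated bracket: fall through and compare '[' literally.
    } else if (PC == '\\' && P + 1 < Pat.size()) {
      PC = Pat[++P]; // escaped character is matched literally
    }
    if (PC != Str[S])
      return false;
    ++P;
    ++S;
  }
  return S == Str.size();
}

bool globMatch(const std::string &Pat, const std::string &Str) {
  return matchFrom(Pat, 0, Str, 0);
}
```

With this sketch, `globMatch("[!a-c]x", "dx")` succeeds while `globMatch("[!a-c]x", "bx")` does not, and `globMatch("a\\*b", "a*b")` matches only a literal star. The backtracking on `*` is exponential in the worst case, which is acceptable for an illustration but not for production use.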
*	NFC: Fix variable only used in asserts by propagating the value.	Sterling Augustine	2019-10-17	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes builds with assertions disabled that would otherwise fail with unused variable warnings Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69123 llvm-svn: 375148
*	[IndVars] Split loop predication out of optimizeLoopExits [NFC]	Philip Reames	2019-10-17	1	-11/+42
\| \| \| \| \| \|	In the process of writing D69009, I realized we have two distinct sets of invariants within this single function, and basically no shared logic. The optimize loop exit transforms (including the new one in D69009) only care about analyzeable exits. Loop predication, on the other hand, has to reason about all exits. At the moment, we have the property (due to the requirement for an exact btc) that all exits are analyzeable, but that will likely change in the future as we add widenable condition support. llvm-svn: 375138
*	[codeview] Workaround for PR43479, don't re-emit instr labels	Reid Kleckner	2019-10-17	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In the long run we should come up with another mechanism for marking call instructions as heap allocation sites, and remove this workaround. For now, we've had two bug reports about this, so let's apply this workaround. SLH (the other client of instruction labels) probably has the same bug, but the solution there is more likely to be to mark the call instruction as not duplicatable, which doesn't work for debug info. Reviewers: akhuang Subscribers: aprantl, hiraditya, aganea, chandlerc, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69068 llvm-svn: 375137
*	[IndVars] Factor out a helper function for readability [NFC]	Philip Reames	2019-10-17	1	-7/+20
\| \| \| \|	llvm-svn: 375133
*	[AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large models	Xiangling Liao	2019-10-17	4	-11/+111
\| \| \| \| \| \| \| \| \|	This patch provides support for pseudo ops including ADDIStocHA8, ADDIStocHA, LWZtocL, LDtoc, LDtocL for AIX, lowering them from MIR to assembly. Differential Revision: https://reviews.llvm.org/D68341 llvm-svn: 375113
*	[AMDGPU] Improve code size cost model	Daniil Fukalov	2019-10-17	3	-3/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Added estimation for zero size insertelement, extractelement and llvm.fabs operators. Updated inline/unroll parameters default values. Reviewers: rampitec, arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68881 llvm-svn: 375109
*	[ARM][MVE] Enable truncating masked stores	Sam Parker	2019-10-17	2	-33/+35
\| \| \| \| \| \| \| \| \| \|	Allow us to generate truncating masked stores which take v4i32 and v8i16 vectors and can store to v4i8, v4i16 and v8i8 memory. Removed support for unaligned masked stores. Differential Revision: https://reviews.llvm.org/D68461 llvm-svn: 375108
*	JumpThreadingPass::UnfoldSelectInstr - silence static analyzer dyn_cast<> ↵	Simon Pilgrim	2019-10-17	1	-1/+1
\| \| \| \| \| \| \| \|	null dereference warning. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 375103
*	[LoopIdiom] BCmp: check, not assert that loop exits exit out of the loop ↵	Roman Lebedev	2019-10-17	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR43687) We can't normally stumble into that assertion because a tautological conditional `br` in loop body is required, one that always branches to loop latch. But that should have been always folded to an unconditional branch before we get it. But that is not guaranteed if the pass is run standalone. So let's just promote the assertion into a proper check. Fixes https://bugs.llvm.org/show_bug.cgi?id=43687 llvm-svn: 375100
*	Reland: Dead Virtual Function Elimination	Oliver Stannard	2019-10-17	6	-36/+215
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove dead virtual functions from vtables with replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up correctly. Original commit message: Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. 
I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to always be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. 
It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 375094
*	[ARM][MVE] Change VPST to use, not def, VPR	Sam Parker	2019-10-17	1	-1/+1
\| \| \| \| \| \| \| \|	Unlike VPT, VPST just uses the current value of VPR.P0. Differential Revision: https://reviews.llvm.org/D69037 llvm-svn: 375087
*	[DFAPacketizer] Use DFAEmitter. NFC.	James Molloy	2019-10-17	1	-67/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a NFC change that removes the NFA->DFA construction and emission logic from DFAPacketizerEmitter and instead uses the generic DFAEmitter logic. This allows DFAPacketizer to use the Automaton class from Support and remove a bunch of logic there too. After this patch, DFAPacketizer is mostly logic for grepping Itineraries and collecting functional units, with no state machine logic. This will allow us to modernize by removing the 16-functional-unit limit and supporting non-itinerary functional units. This is all for followup patches. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68992 llvm-svn: 375086
*	[DAGCombine][ARM] Enable extending masked loads	Sam Parker	2019-10-17	5	-37/+141
\| \| \| \| \| \| \| \| \| \| \|	Add generic DAG combine for extending masked loads. Allow us to generate sext/zext masked loads which can access v4i8, v8i8 and v4i16 memory to produce v4i32, v8i16 and v4i32 respectively. Differential Revision: https://reviews.llvm.org/D68337 llvm-svn: 375085
*	[Alignment][NFC] Use Align for TargetFrameLowering/Subtarget	Guillaume Chatelet	2019-10-17	33	-83/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68993 llvm-svn: 375084