bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Revert r372893 "[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets"	Hans Wennborg	2019-09-27	3	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This caused severe compile-time regressions, see PR43455. > Modern processors predict the targets of an indirect branch regardless of > the size of any jump table used to glean its target address. Moreover, > branch predictors typically use resources limited by the number of actual > targets that occur at run time. > > This patch changes the semantics of the option `-max-jump-table-size` to limit > the number of different targets instead of the number of entries in a jump > table. Thus, it is now renamed to `-max-jump-table-targets`. > > Before, when `-max-jump-table-size` was specified, it could happen that > cluster jump tables could have targets used repeatedly, but each one was > counted and typically resulted in tables with the same number of entries. > With this patch, when specifying `-max-jump-table-targets`, tables may have > different lengths, since the number of unique targets is counted towards the > limit, but the number of unique targets in tables is the same, but for the > last one containing the balance of targets. > > Differential revision: https://reviews.llvm.org/D60295 llvm-svn: 373060
*	[MC][ARM] vscclrm disassembles as vldmia	Alexandros Lamprineas	2019-09-27	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Happens only when the mve.fp subtarget feature is enabled: $ llvm-mc -triple thumbv8.1m.main -mattr=+mve.fp,+8msecext -disassemble <<< "0x9f,0xec,0x08,0x0b" .text vldmia pc, {d0, d1, d2, d3} $ llvm-mc -triple thumbv8.1m.main -mattr=+8msecext -disassemble <<< "0x9f,0xec,0x08,0x0b" .text vscclrm {d0, d1, d2, d3, vpr} Assembling returns the correct encoding with or without mve.fp: $ llvm-mc -triple thumbv8.1m.main -mattr=+mve.fp,+8msecext -show-encoding <<< "vscclrm {d0-d3, vpr}" .text vscclrm {d0, d1, d2, d3, vpr} @ encoding: [0x9f,0xec,0x08,0x0b] $ llvm-mc -triple thumbv8.1m.main -mattr=+8msecext -show-encoding <<< "vscclrm {d0-d3, vpr}" .text vscclrm {d0, d1, d2, d3, vpr} @ encoding: [0x9f,0xec,0x08,0x0b] The problem seems to be in the TableGen description of VSCCLRMD. The least significant bit should be set to zero. Differential Revision: https://reviews.llvm.org/D68025 llvm-svn: 373052
*	[WebAssembly] v128.andnot	Thomas Lively	2019-09-27	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As specified at https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#bitwise-and-not Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68113 llvm-svn: 373041
*	[WebAssembly] SIMD Load and extend operations	Thomas Lively	2019-09-27	4	-2/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As specified at https://github.com/webassembly/simd/blob/master/proposals/simd/SIMD.md#load-and-extend. These instructions are behind the unimplemented-simd128 target feature for now because they have not been implemented in V8 yet. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68058 llvm-svn: 373040
*	Speculative fix for gcc build.	Peter Collingbourne	2019-09-27	1	-2/+4
\| \| \| \|	llvm-svn: 373038
*	hwasan: Compatibility fixes for short granules.	Peter Collingbourne	2019-09-27	2	-75/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can't use short granules with stack instrumentation when targeting older API levels because the rest of the system won't understand the short granule tags stored in shadow memory. Moreover, we need to be able to let old binaries (which won't understand short granule tags) run on a new system that supports short granule tags. Such binaries will call the __hwasan_tag_mismatch function when their outlined checks fail. We can compensate for the binary's lack of support for short granules by implementing the short granule part of the check in the __hwasan_tag_mismatch function. Unfortunately we can't do anything about inline checks, but I don't believe that we can generate these by default on aarch64, nor did we do so when the ABI was fixed. A new function, __hwasan_tag_mismatch_v2, is introduced that lets code targeting the new runtime avoid redoing the short granule check. Because tag mismatches are rare this isn't important from a performance perspective; the main benefit is that it introduces a symbol dependency that prevents binaries targeting the new runtime from running on older (i.e. incompatible) runtimes. Differential Revision: https://reviews.llvm.org/D68059 llvm-svn: 373035
*	[X86] Remove CodeGenOnly instructions added in r373021, but keep the isel patterns and add COPY_TO_REGCLASS to them.	Craig Topper	2019-09-26	1	-16/+10
\| \| \| \| \| \|	llvm-svn: 373031
*	[X86] Remove unused arguments from a tablegen multiclass. NFC	Craig Topper	2019-09-26	1	-13/+13
\| \| \| \|	llvm-svn: 373026
*	[X86] Add VMOVSSZrrk/VMOVSDZrrk/VMOVSSZrrkz/VMOVSDZrrkz to getUndefRegClearance.	Craig Topper	2019-09-26	1	-6/+15
\| \| \| \| \| \| \| \| \|	We have isel patterns that can put an IMPLICIT_DEF on one of the sources for these instructions. So we should make sure we break any dependencies there. This should be done by just using one of the other sources. llvm-svn: 373025
*	Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint	Changpeng Fang	2019-09-26	12	-23/+12
\| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D58360 llvm-svn: 373024
*	[X86] Add CodeGenOnly instructions for (f32 (X86selects $mask, (loadf32 addr), fp32imm0) to use masked MOVSS from memory.	Craig Topper	2019-09-26	1	-1/+23
\| \| \| \| \| \| \| \| \| \| \| \|	Similar for f64 and having a non-zero passthru value. We were previously not trying to fold the load at all. Using a CodeGenOnly instruction allows us to use FR32X/FR64X as the register class to avoid a bunch of COPY_TO_REGCLASS. llvm-svn: 373021
*	[AMDGPU] copy OtherPredicates from pseudo to VOP3_Real	Stanislav Mekhanoshin	2019-09-26	1	-0/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D68102 llvm-svn: 373015
*	[AIX]Emit function descriptor csect in assembly	Xiangling Liao	2019-09-26	1	-0/+57
\| \| \| \| \| \| \| \| \|	This patch emits the function descriptor csect for functions with definitions under both 32-bit/64-bit mode on AIX. Differential Revision: https://reviews.llvm.org/D66724 llvm-svn: 373009
*	ARMBaseInstrInfo getOperandLatency - silence static analyzer dyn_cast<> null dereference warnings. NFCI.	Simon Pilgrim	2019-09-26	1	-2/+2
\| \| \| \| \| \| \| \|	The static analyzer is warning about potential null dereferences, but we should be able to use cast<> directly, and if not, the assert will fire for us. llvm-svn: 372992
*	[PowerPC] Fix typo in rL372985	Jinsong Ji	2019-09-26	1	-1/+1
\| \| \| \|	llvm-svn: 372991
*	Updated comments in LWZtoc pseudo expansion.	Sean Fertile	2019-09-26	1	-4/+5
\| \| \| \| \| \| \|	Refined a couple of the comments in the LWZtoc expansion code based on a post commit review comment. llvm-svn: 372986
*	[PowerPC] Add missing pattern for VSX Scalar Negative Multiply-Subtract Single Precision	Jinsong Ji	2019-09-26	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This was found during review of https://reviews.llvm.org/D66050. In the simple test of fdiv, we fail to fold ``` fneg 2, 2 xsmaddasp 3, 2, 0 ``` to ``` xsnmsubasp 3, 2, 0 ``` We have the patterns for Double Precision and vectors, but the Single Precision pattern was missing; this patch adds it. Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang Reviewed By: #powerpc, steven.zhang Subscribers: wuzish, hiraditya, kbarton, MaskRay, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67595 llvm-svn: 372985
*	[BPF] Remove unused variables. NFCI.	Simon Pilgrim	2019-09-26	1	-5/+1
\| \| \| \| \| \|	Fixes a dyn_cast<> null dereference warning. llvm-svn: 372958
*	[MIPS GlobalISel] Lower aggregate structure return arguments	Petar Avramovic	2019-09-26	2	-25/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement aggregate structure split to simpler types in splitToValueTypes. splitToValueTypes is used for return values. According to MipsABIInfo from clang/lib/CodeGen/TargetInfo.cpp, aggregate structure arguments for O32 always get simplified and thus will remain unsupported by the MIPS GlobalISel for the time being. For O32, aggregate structures can be encountered only for complex number returns e.g. 'complex float' or 'complex double' from <complex.h>. Differential Revision: https://reviews.llvm.org/D67963 llvm-svn: 372957
*	HexagonAsmParser::ParseDirectiveFalign - silence static analyzer dyn_cast<MCConstantExpr> null dereference warning. NFCI.	Simon Pilgrim	2019-09-26	1	-1/+1
\| \| \| \| \| \| \| \|	The static analyzer is warning about a potential null dereference, but we should be able to use cast<MCConstantExpr> directly, and if not, the assert will fire for us. llvm-svn: 372956
*	[CostModel][X86] Fix SLM <2 x i64> icmp costs	Simon Pilgrim	2019-09-26	1	-0/+9
\| \| \| \| \| \| \| \|	SLM is 2 x slower for <2 x i64> comparison ops than other vector types, we should account for this like we do for SLM <2 x i64> add/sub/mul costs. This should remove some of the SLM codegen diffs in D43582 llvm-svn: 372954
*	[SystemZ] Recognize mnop-mcount in backend	Jonas Paulsson	2019-09-26	2	-0/+11
\| \| \| \| \| \| \| \| \| \|	With -pg -mfentry -mnop-mcount, a nop is emitted instead of the call to fentry. Review: Ulrich Weigand https://reviews.llvm.org/D67765 llvm-svn: 372950
*	[X86] Remove isCodeGenOnly from (V)ROUND.*_Int and put it on the non _Int form instead.	Craig Topper	2019-09-26	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This matches what's done for VRNDSCALE and most other instructions. This mainly determines which instruction will be preferred by the disassembler and assembly parser. The printing and encoding information is the same. We prefer the _Int form since it uses the VR128 class due to the intrinsic interface. For some EVEX features like embedded rounding, we only select from intrinsics today, so there is only a VR128 version. Making the VR128 version the preferred one is therefore consistent overall. llvm-svn: 372947
*	[X86] Mark the EVEX encoded PSADBW instructions as commutable to enable load folding of the other operand.	Craig Topper	2019-09-26	1	-0/+1
\| \| \| \| \| \| \| \|	The SSE and VEX versions are already correct. llvm-svn: 372941
*	[TargetLowering] Make allowsMemoryAccess method virtual.	Thomas Raoux	2019-09-26	5	-18/+22
\| \| \| \| \| \| \| \| \| \| \|	Rename the old function to explicitly show that it cares only about alignment. The new allowsMemoryAccess calls the alignment-related function by default and can be overridden by a target to indicate whether the memory access is legal or not. Differential Revision: https://reviews.llvm.org/D67121 llvm-svn: 372935
*	[X86] Use VR512_0_15RegClass instead of VR512RegClass in X86VZeroUpper.	Craig Topper	2019-09-25	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pass is only concerned with ZMM0-15 and YMM0-15. For YMM we use VR256 which only contains YMM0-15, but for ZMM we were using VR512 which contains ZMM0-31. Using VR512_0_15 is more correct. Given that the ABI and register allocator will use registers in order, it's unlikely that a register from 16-31 would be used without also using 0-15. So this probably doesn't functionally matter. llvm-svn: 372933
*	[MSP430] Allow msp430_intrcc functions to not have interrupt attribute.	Vadzim Dambrouski	2019-09-25	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Useful in case you want to have control over interrupt vector generation. For example in Rust language we have an arrangement where all unhandled ISR vectors gets mapped to a single default handler function. Which is hard to implement when LLVM tries to generate vectors on its own. Reviewers: asl, krisb Subscribers: hiraditya, JDevlieghere, awygle, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67313 llvm-svn: 372910
*	[AMDGPU] gfx10 v_fmac_f16 operand folding	Stanislav Mekhanoshin	2019-09-25	1	-8/+15
\| \| \| \| \| \| \| \|	Fold immediates into v_fmac_f16. Differential Revision: https://reviews.llvm.org/D68037 llvm-svn: 372906
*	[AArch64][GlobalISel] Choose CCAssignFns per-argument for tail call lowering	Jessica Paquette	2019-09-25	1	-17/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When checking for tail call eligibility, we should use the correct CCAssignFn for each argument, rather than just checking if the caller/callee is varargs or not. This is important for tail call lowering with varargs. If we don't check it, then basically any varargs callee with parameters cannot be tail called on Darwin, for one thing. If the parameters are all guaranteed to be in registers, this should be entirely safe. On top of that, not checking for this could potentially make it so that we have the wrong stack offsets when checking for tail call eligibility. Also refactor some of the stuff for CCAssignFnForCall and pull it out into a helper function. Update call-translator-tail-call.ll to show that we can now correctly tail call on Darwin. Also add two extra tail call checks. The first verifies that we still respect the caller's stack size, and the second verifies that we still don't tail call when a varargs function has a memory argument. Differential Revision: https://reviews.llvm.org/D67939 llvm-svn: 372897
*	[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets	Evandro Menezes	2019-09-25	3	-8/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Modern processors predict the targets of an indirect branch regardless of the size of any jump table used to glean its target address. Moreover, branch predictors typically use resources limited by the number of actual targets that occur at run time. This patch changes the semantics of the option `-max-jump-table-size` to limit the number of different targets instead of the number of entries in a jump table. Thus, it is now renamed to `-max-jump-table-targets`. Before, when `-max-jump-table-size` was specified, it could happen that cluster jump tables could have targets used repeatedly, but each one was counted and typically resulted in tables with the same number of entries. With this patch, when specifying `-max-jump-table-targets`, tables may have different lengths, since the number of unique targets is counted towards the limit, but the number of unique targets in tables is the same, but for the last one containing the balance of targets. Differential revision: https://reviews.llvm.org/D60295 llvm-svn: 372893
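The difference between limiting entries and limiting distinct targets, as described in the message above, can be illustrated with a small sketch. This is not the LLVM implementation: `JumpTable`, `numEntries`, `numUniqueTargets`, `fitsEntryLimit`, and `fitsTargetLimit` are hypothetical names, and a table is modelled simply as a list of target block IDs.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical model: one target block ID per case value covered by the table.
using JumpTable = std::vector<int>;

// Entry-based measure: every slot counts, even when targets repeat.
std::size_t numEntries(const JumpTable &Table) { return Table.size(); }

// Target-based measure: only distinct targets count.
std::size_t numUniqueTargets(const JumpTable &Table) {
  return std::set<int>(Table.begin(), Table.end()).size();
}

// A limit in the spirit of -max-jump-table-size (entries).
bool fitsEntryLimit(const JumpTable &Table, std::size_t Max) {
  return numEntries(Table) <= Max;
}

// A limit in the spirit of -max-jump-table-targets (distinct targets).
bool fitsTargetLimit(const JumpTable &Table, std::size_t Max) {
  return numUniqueTargets(Table) <= Max;
}
```

With a table such as `{1, 1, 2, 2, 2, 3, 1, 3}` and a limit of 4, the entry-based check rejects the table (8 entries) while the target-based check accepts it (3 distinct targets), which is why tables may have different lengths under the target-based option.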
*	[TargetInstrInfo] Let findCommutedOpIndices take const MachineInstr&	Simon Pilgrim	2019-09-25	8	-8/+12
\| \| \| \| \| \| \| \| \| \|	Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future. Committed on behalf of @hvdijk (Harald van Dijk) Differential Revision: https://reviews.llvm.org/D66138 llvm-svn: 372882
*	[Dominators][AMDGPU] Don't use virtual exit node in findNearestCommonDominator. Cleanup MachinePostDominators.	Jakub Kuderski	2019-09-25	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch fixes a bug that originated from passing a virtual exit block (nullptr) to `MachinePostDominatorTree::findNearestCommonDominator` and resulted in assertion failures inside its callee. It also applies a small cleanup to the class. The patch introduces a new function in PDT that given a list of `MachineBasicBlock`s finds their NCD. The new overload of `findNearestCommonDominator` handles virtual root correctly. Note that similar handling of virtual root nodes is not necessary in (forward) `DominatorTree`s, as right now they don't use virtual roots. Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao Reviewed By: hliao Subscribers: hliao, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits Tags: #amdgpu, #llvm Differential Revision: https://reviews.llvm.org/D67974 llvm-svn: 372874
*	[SystemZ] Improve emitSelect()	Jonas Paulsson	2019-09-25	1	-33/+58
\| \| \| \| \| \| \| \| \| \| \| \| \|	Merge more Select pseudo instructions in emitSelect() by allowing other instructions between them as long as they do not clobber CC. Debug value instructions are now moved down to below the new PHIs instead of erasing them. Review: Ulrich Weigand https://reviews.llvm.org/D67619 llvm-svn: 372873
*	[ARM] Ensure we do not attempt to create lsll #0	David Green	2019-09-25	3	-5/+6
\| \| \| \| \| \| \| \| \| \| \|	During legalisation we can end up with some pretty strange nodes, like shifts of 0. We need to make sure we don't try to make long shifts of these, ending up with invalid assembly instructions. A long shift with a zero immediate actually encodes a shift by 32. Differential Revision: https://reviews.llvm.org/D67664 llvm-svn: 372839
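A guard of the kind this message describes might look like the following sketch. It relies only on the statement above that a zero immediate encodes a shift by 32; the helper name `encodeLongShiftImm` and the accepted range of 1 to 32 are assumptions for illustration, not code from the ARM backend.

```cpp
#include <optional>

// Hypothetical helper: returns the immediate field for a long shift by
// Amount, or std::nullopt when no long shift should be created.
// Assumption (from the commit message): a zero field means "shift by 32",
// so a shift by 0 has no valid encoding and must be rejected.
std::optional<unsigned> encodeLongShiftImm(unsigned Amount) {
  if (Amount == 0 || Amount > 32)
    return std::nullopt;
  return Amount % 32; // 32 maps to field 0; 1..31 map to themselves.
}
```

A caller would fall back to the ordinary shift lowering whenever the helper returns an empty value, rather than emitting an `lsll #0`.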
*	Add tracing in pickNodeFromQueue.	Jay Foad	2019-09-25	1	-0/+1
\| \| \| \| \| \| \|	This matches GenericScheduler::pickNodeFromQueue, from which this function was mostly cut and pasted. llvm-svn: 372829
*	[AArch64] Convert neon_ushl and neon_sshl with positive constants to VSHL.	Florian Hahn	2019-09-25	1	-19/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I think we should be able to use shl instead of sshl and ushl for positive constant shift values, unless I am missing something. We already have the machinery in place to ensure we only replace nodes, if the shift value is positive and <= the element width. This is a generalization of an earlier patch rL372565. Reviewers: t.p.northover, samparker, dmgreen, anemet Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D67955 llvm-svn: 372824
*	[AArch64][GlobalISel] Tweak legalization rule for G_BSWAP to handle widening s16.	Amara Emerson	2019-09-25	1	-1/+1
\| \| \| \| \| \|	llvm-svn: 372812
*	[NFC] Add { } to silence compiler warning [-Wmissing-braces].	Huihui Zhang	2019-09-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	/local/mnt/workspace/huihuiz/llvm-comm-git-2/llvm-project/llvm/lib/Object/MachOObjectFile.cpp:2731:7: warning: suggest braces around initialization of subobject [-Wmissing-braces] "i386", "x86_64", "x86_64h", "armv4t", "arm", "armv5e", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ { 1 warning generated. /local/mnt/workspace/huihuiz/llvm-comm-git-2/llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:355:46: warning: suggest braces around initialization of subobject [-Wmissing-braces] return addMappingFromTable<1>(MI, MRI, { 0 }, Table); ^ {} 1 warning generated. /local/mnt/workspace/huihuiz/llvm-comm-git-2/llvm-project/llvm/tools/llvm-objcopy/ELF/Object.cpp:400:57: warning: suggest braces around initialization of subobject [-Wmissing-braces] static constexpr std::array<uint8_t, 4> ZlibGnuMagic = {'Z', 'L', 'I', 'B'}; ^~~~~~~~~~~~~~~~~~ { } 1 warning generated. llvm-svn: 372811
*	[Powerpc][LoopPreIncPrep] NFC - refactor this pass for ds/dq form.	Chen Zheng	2019-09-25	1	-295/+375
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D67431 llvm-svn: 372803
*	[WebAssembly][NFC] Remove duplicate SIMD instructions and predicates	Thomas Lively	2019-09-25	2	-59/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Instead of having different v128.load and v128.store instructions for each MVT, just have one of each that is reused in all the patterns. Also removes the HasSIMD128 predicate where accompanied by HasUnimplementedSIMD128, since the latter implies the former. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67930 llvm-svn: 372792
*	[BPF] Generate array dimension size properly for zero-size elements	Yonghong Song	2019-09-24	1	-26/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, if an array element type size is 0, the number of array elements will be set to 0, regardless of what the user specified. This implementation dates from the beginning, when BTF was mostly used to calculate the member offset. For example, struct s {}; struct s1 { int b; struct s a[2]; }; struct s1 s1; The BTF will have struct "s1" member "a" with element count 0. Now BTF types are used for compile-once and run-everywhere relocations and we need a more precise type representation for type comparison. Andrii reported the issue as there are differences between the original structure and the BTF-generated structure. This patch makes the change to correctly assign "2" as the number of elements of member "a". Some dead code related to the ElemSize computation is also removed. Differential Revision: https://reviews.llvm.org/D67979 llvm-svn: 372785
*	Extends the expansion of the LWZtoc pseudo op for AIX.	Sean Fertile	2019-09-24	1	-15/+38
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D67853 llvm-svn: 372772
*	[X86] Add MMX MOVD/MOVQ stores to folding tables to support stack folding	Simon Pilgrim	2019-09-24	1	-0/+2
\| \| \| \|	llvm-svn: 372770
*	Regex: Make "match" and "sub" const member functions	Thomas Preud'homme	2019-09-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The Regex "match" and "sub" member functions were previously not "const" because they wrote to the "error" member variable. This commit removes those assignments, and instead assumes that the validity of the regex is already known after the initial compilation of the regular expression. As a result, these member functions were possible to make "const". This makes it easier to do things like pre-compile Regexes up-front, and makes "match" and "sub" thread-safe. The error status is now returned as an optional output, which also makes the API of "match" and "sub" more consistent with each other. Also, some uses of Regex that could be refactored to be const were made const. Patch by Nicolas Guillemot Reviewers: jankratochvil, thopre Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67241 llvm-svn: 372764
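The shape of the API change described above (validity decided once at compile time, `match` being `const`, and the error reported through an optional output) can be sketched as follows. This is not the `llvm::Regex` class: `PrecompiledRegex` is a hypothetical wrapper around `std::regex`, used here only to show the pattern.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper; llvm::Regex itself is implemented differently.
class PrecompiledRegex {
public:
  explicit PrecompiledRegex(const std::string &Pattern) {
    try {
      Re = std::regex(Pattern);
      Valid = true;
    } catch (const std::regex_error &E) {
      // The error is recorded once, at compile time of the pattern.
      CompileError = std::string("invalid regex: ") + E.what();
    }
  }

  bool isValid() const { return Valid; }

  // const: matching never writes to the object, so a shared instance can
  // be used from several threads. The error is an optional output.
  bool match(const std::string &Text, std::string *Error = nullptr) const {
    if (!Valid) {
      if (Error)
        *Error = CompileError;
      return false;
    }
    if (Error)
      Error->clear();
    return std::regex_search(Text, Re);
  }

private:
  std::regex Re;
  bool Valid = false;
  std::string CompileError;
};
```

Because `match` no longer mutates state, an instance can be declared `const` and built up-front, for example as a `static const` object reused across calls.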
*	Revert r372333: [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863)	Ilya Biryukov	2019-09-24	2	-129/+17
\| \| \| \| \| \| \| \| \| \|	Reason: this caused severe compile time regressions in JAX. See email thread of original revision on llvm-commits for details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190923/697042.html llvm-svn: 372756
*	[ARM] Split large widening MVE loads	David Green	2019-09-24	1	-3/+72
\| \| \| \| \| \| \| \| \| \| \| \|	Similar to rL372717, we can force the splitting of extends of vector loads in MVE, in order to use the better widening loads as opposed to going through expensive extends. This adds a combine to early-on detect extends of loads and split the load in two, from where normal legalisation will kick in and we get a series of widening loads. Differential Revision: https://reviews.llvm.org/D67909 llvm-svn: 372721
*	[ARM] Split large truncating MVE stores	David Green	2019-09-24	1	-82/+148
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MVE does not have a simple sign extend instruction that can move elements across lanes. We currently often end up moving each lane into and out of a GPR, in order to get elements into the correct places. When we have a store of a trunc (or a extend of a load), we can instead just split the store/load in two, using the narrowing/widening load/store instructions from each half of the vector. This does that for stores. It happens very early in a store combine, so as to easily detect the truncates. (It would be possible to do this later, but that would involve looking through a buildvector of extract elements. Not impossible but this way seemed simpler). By enabling store combines we also get a vmovdrr combine for free, helping some other tests. Differential Revision: https://reviews.llvm.org/D67828 llvm-svn: 372717
*	MCRegisterInfo: Merge getLLVMRegNum and getLLVMRegNumFromEH	Pavel Labath	2019-09-24	3	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The functions differed in two ways: - getLLVMRegNum could return both "eh" and "other" dwarf register numbers, while getLLVMRegNumFromEH only returned the "eh" number. - getLLVMRegNum asserted if the register was not found, while the second function returned -1. The second distinction was pretty important, but it was very hard to infer that from the function name. Additionally, for the use case of dumping dwarf expressions, we needed a function which can work with both kinds of number, but does not assert. This patch solves both of these issues by merging the two functions into one, returning an Optional<unsigned> value. While the same thing could be achieved by adding an "IsEH" argument to the (renamed) getLLVMRegNumFromEH function, it seemed better to avoid the confusion of two functions and put the choice of asserting into the hands of the caller -- if he checks the Optional value, he can safely process "untrusted" input, and if he blindly dereferences the Optional, he gets the assertion. I've updated all call sites to the new API, choosing between the two options according to the function they were calling originally, except that I've updated the usage in DWARFExpression.cpp to use the "safe" method instead, and added a test case which would have previously triggered an assertion failure when processing (incorrect?) dwarf expressions. Reviewers: dsanders, arsenm, JDevlieghere Subscribers: wdng, aprantl, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67154 llvm-svn: 372710
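The caller-side choice described above, between checking the optional result and dereferencing it blindly, can be sketched like this. The sketch uses `std::optional` in place of `llvm::Optional`, omits the "IsEH" distinction, and all names (`RegisterMap`, `lookup`, `describeRegister`) are hypothetical rather than taken from `MCRegisterInfo`.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a DWARF-to-LLVM register number table.
class RegisterMap {
public:
  void add(unsigned DwarfNum, unsigned LLVMNum) { Map[DwarfNum] = LLVMNum; }

  // Single lookup function: an empty result means "not found", and the
  // caller decides whether that is an error.
  std::optional<unsigned> lookup(unsigned DwarfNum) const {
    auto It = Map.find(DwarfNum);
    if (It == Map.end())
      return std::nullopt;
    return It->second;
  }

private:
  std::map<unsigned, unsigned> Map;
};

// "Safe" caller, suitable for untrusted input such as a dumped DWARF
// expression: it checks the result instead of dereferencing it blindly.
std::string describeRegister(const RegisterMap &Regs, unsigned DwarfNum) {
  if (std::optional<unsigned> LLVMNum = Regs.lookup(DwarfNum))
    return "reg" + std::to_string(*LLVMNum);
  return "<unknown DWARF register " + std::to_string(DwarfNum) + ">";
}
```

A caller that trusts its input would instead write `*Regs.lookup(N)` and rely on the dereference of an empty optional being caught, which mirrors the asserting behaviour of the old function.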
*	Fix uninitialized variable warning. NFCI.	Simon Pilgrim	2019-09-23	1	-1/+1
\| \| \| \|	llvm-svn: 372662
*	[WebAssembly] vNxM.load_splat instructions	Thomas Lively	2019-09-23	4	-1/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds the new load_splat instructions as specified at https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#load-and-splat. DAGISel does not allow matching multiple copies of the same load in a single pattern, so we use a new node in WebAssemblyISD to wrap loads that should be splatted. Depends on D67783. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67784 llvm-svn: 372655