bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[SelectionDAG] move constant or splat functions to common location	Sanjay Patel	2018-11-25	2	-39/+28
\| \| \| \| \| \| \| \|	rL347502 moved the null sibling, so we should group all of these together. I'm not sure why these aren't methods of the SDValue class itself, but that's another patch if that's possible. llvm-svn: 347523
*	[X86] Synchronize a macro in getAvailableFeatures in Host.cpp with the same ↵	Craig Topper	2018-11-24	1	-3/+3
\| \| \| \| \| \|	macro in compiler-rt to fix a negative shift amount warning. llvm-svn: 347518
*	[InstCombine] Determine demanded and known bits for funnel shifts	Nikita Popov	2018-11-24	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support funnel shifts in InstCombine demanded bits simplification. If the shift amount is constant, we can determine both the demanded bits of the operands, as well as the known bits of the result. If one of the operands has no demanded bits, it will be replaced by undef and the funnel shift will be simplified into a simple shift due to the simplifications added in D54778. Differential Revision: https://reviews.llvm.org/D54869 llvm-svn: 347515
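As a rough illustration of the demanded-bits reasoning above (not the code from D54869), the sketch below models `fshl` in Python on a hypothetical 4-bit width. For a constant shift amount `s = C % BW`, result bit `i` comes from bit `i - s` of the first operand when `i >= s`, and otherwise from bit `i + BW - s` of the second operand, so the demanded mask of the result can be shifted back onto each operand. The helper names and the 4-bit width are assumptions, chosen so the check can be exhaustive.

```python
BW = 4                      # hypothetical bit width, small enough to enumerate
MASK = (1 << BW) - 1


def fshl(x, y, c):
    """Funnel shift left: the top BW bits of (x:y) shifted left by c % BW."""
    s = c % BW
    return ((((x << BW) | y) << s) >> BW) & MASK


def demanded_operand_bits(demanded, c):
    """Map demanded result bits back onto the two fshl operands."""
    s = c % BW
    demanded_x = demanded >> s                    # result bit i reads x bit i - s
    demanded_y = (demanded << (BW - s)) & MASK    # result bit i reads y bit i + BW - s
    return demanded_x, demanded_y


def check_demanded_bits():
    """Clearing non-demanded operand bits must not change any demanded result bit."""
    for c in range(BW):
        for demanded in range(1 << BW):
            dx, dy = demanded_operand_bits(demanded, c)
            for x in range(1 << BW):
                for y in range(1 << BW):
                    full = fshl(x, y, c) & demanded
                    reduced = fshl(x & dx, y & dy, c) & demanded
                    assert full == reduced
    return True
```

When `demanded_x` or `demanded_y` comes out as zero, that operand cannot influence any demanded bit, which corresponds to the case in the commit message where the operand is replaced by undef.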
*	Revert unapproved commit	Joel Jones	2018-11-24	1	-154/+4
\| \| \| \|	llvm-svn: 347511
*	[AArch64] Enable libm vectorized functions via SLEEF	Joel Jones	2018-11-24	1	-4/+154
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This changeset is modeled after Intel's submission for SVML. It enables trigonometry functions vectorization via SLEEF: http://sleef.org/. * A new vectorization library enum is added to TargetLibraryInfo.h: SLEEF. * A new option is added to TargetLibraryInfoImpl - ClVectorLibrary: SLEEF. * A comprehensive test case is included in this changeset. * In a separate changeset (for clang), a new vectorization library argument is added to -fveclib: -fveclib=SLEEF. Trigonometry functions that are vectorized by sleef: acos asin atan atanh cos cosh exp exp2 exp10 lgamma log10 log2 log sin sinh sqrt tan tanh tgamma Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D53927 llvm-svn: 347510
*	[ARM] Add dependency from ARMAsmParser to ARMAsmPrinter after r347494	Fangrui Song	2018-11-23	1	-1/+1
\| \| \| \| \| \|	This fixes -DBUILD_SHARED_LIBS=on llvm-svn: 347506
*	[InstCombine] Simplify funnel shift with zero/undef operand to shift	Nikita Popov	2018-11-23	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following simplifications are implemented: * `fshl(X, 0, C) -> shl X, C%BW` * `fshl(X, undef, C) -> shl X, C%BW` (assuming undef = 0) * `fshl(0, X, C) -> lshr X, BW-C%BW` * `fshl(undef, X, C) -> lshr X, BW-C%BW` (assuming undef = 0) * `fshr(X, 0, C) -> shl X, BW-C%BW` * `fshr(X, undef, C) -> shl X, BW-C%BW` (assuming undef = 0) * `fshr(0, X, C) -> lshr X, C%BW` * `fshr(undef, X, C) -> lshr X, C%BW` (assuming undef = 0) The simplification is only performed if the shift amount C is constant, because we can explicitly compute C%BW and BW-C%BW in this case. Differential Revision: https://reviews.llvm.org/D54778 llvm-svn: 347505
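The identities listed in this commit can be sanity-checked with a small Python model. This is only a sketch, assuming an 8-bit width and treating `undef` as 0, as the commit message does. The function names are hypothetical and not LLVM APIs. The `BW-C%BW` forms are checked only when `C%BW` is nonzero, since a shift by the full bit width is not a valid LLVM shift amount.

```python
BW = 8                      # assumed bit width for the exhaustive check
MASK = (1 << BW) - 1


def fshl(x, y, c):
    """Top BW bits of the concatenation x:y shifted left by c % BW."""
    s = c % BW
    return ((((x << BW) | y) << s) >> BW) & MASK


def fshr(x, y, c):
    """Bottom BW bits of the concatenation x:y shifted right by c % BW."""
    s = c % BW
    return (((x << BW) | y) >> s) & MASK


def shl(x, s):
    return (x << s) & MASK


def lshr(x, s):
    return x >> s


def check_funnel_identities():
    for x in range(1 << BW):
        for c in range(2 * BW):         # also covers shift amounts >= BW
            s = c % BW
            assert fshl(x, 0, c) == shl(x, s)
            assert fshr(0, x, c) == lshr(x, s)
            if s != 0:                  # BW - s would equal BW otherwise
                assert fshl(0, x, c) == lshr(x, BW - s)
                assert fshr(x, 0, c) == shl(x, BW - s)
    return True
```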
*	[DAG] consolidate shift simplifications	Sanjay Patel	2018-11-23	2	-74/+58
\| \| \| \| \| \| \| \| \| \|	...and use them to avoid creating obviously undef values as discussed in the post-commit thread for r347478. The diffs in vector div/rem show that we were missing real optimizations by creating bogus shift nodes. llvm-svn: 347502
*	Revert r347490 as it breaks address sanitizer builds	Luke Cheeseman	2018-11-23	14	-88/+11
\| \| \| \|	llvm-svn: 347499
*	[ARM][AsmParser] Improve debug printing of parsed asm operands	Oliver Stannard	2018-11-23	1	-19/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ARMOperand::print: - Print human-readable register names, instead of numbers. - Print the correct names for IT condition masks (these were in the wrong order before). - Print all parts of memory operands, not just the base register. This makes the output of llvm-mc -show-inst-operands more readable. Differential revision: https://reviews.llvm.org/D54850 llvm-svn: 347494
*	Attempt to fix buildbot after r347489	Eugene Leviant	2018-11-23	1	-1/+1
\| \| \| \|	llvm-svn: 347492
*	Revert r343341	Luke Cheeseman	2018-11-23	14	-11/+88
\| \| \| \| \| \| \|	- Cannot reproduce the build failure locally and the build logs have been deleted. llvm-svn: 347490
*	[ThinLTO] Assembly representation of ReadOnly attribute	Eugene Leviant	2018-11-23	5	-24/+75
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D54754 llvm-svn: 347489
*	Disable LoopSimplifyCFG terminator folding by default	Max Kazantsev	2018-11-23	1	-0/+6
\| \| \| \|	llvm-svn: 347486
*	[LoopSimplifyCFG] Don't delete LCSSA Phis	Max Kazantsev	2018-11-23	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \|	When removing edges, we also update Phi inputs and may end up removing a Phi if it has only one input. We should not do it for edges that leave the current loop because these Phis are LCSSA Phis and need to be preserved. Thanks @dmgreen for finding this! Differential Revision: https://reviews.llvm.org/D54841 llvm-svn: 347484
*	[LegalizeVectorTypes] Don't use SplitVecOp_TruncateHelper if we're heading ↵	Craig Topper	2018-11-23	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \|	towards scalarizing the type. This code takes a truncate, fp_to_int, or int_to_fp with a legal result type and an input type that needs to be split and enlarges the elements in the result type before doing the split. It then inserts a follow-up truncate or fp_round after concatenating the two halves back together. But if the input type of the original op is being split on its way to ultimately being scalarized we're just going to end up building a vector from scalars and then truncating or rounding it in the vector register. Seems kind of silly to enlarge the result element type of the operation only to end up with scalar code and then building a vector with large elements only to make the elements smaller again in the vector register. Seems better to just try to get away with producing smaller result types in the scalarized code. The X86 test case that changes is a pretty contrived test case that exists because of a bug we used to have in our AVG matching code. I think the code is better now, but it's not realistic anyway. llvm-svn: 347482
*	[LegalizeVectorTypes] Have SplitVecOp_TruncateHelper fall back to ↵	Craig Topper	2018-11-22	1	-1/+7
\| \| \| \| \| \| \| \| \| \|	SplitVecOp_UnaryOp if splitting the output type would be a legal type. SplitVecOp_TruncateHelper tries to introduce a multilevel truncate to avoid scalarization. But if splitting the result type would still be a legal type we don't need to do that. The comment block at the top of the function implied that this was already implemented. I looked back through the history and it doesn't look to have ever been checked. llvm-svn: 347479
*	[DAGCombiner] form 'not' ops ahead of shifts (PR39657)	Sanjay Patel	2018-11-22	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'), but that would not matter without this patch because DAGCombiner was reversing that transform. I think we need this transform in the backend regardless of what happens in IR to catch cases where the shift-xor is formed late from GEP or other ops. https://rise4fun.com/Alive/NC1 Name: shl Pre: (-1 << C2) == C1 %shl = shl i8 %x, C2 %r = xor i8 %shl, C1 => %not = xor i8 %x, -1 %r = shl i8 %not, C2 Name: shr Pre: (-1 u>> C2) == C1 %sh = lshr i8 %x, C2 %r = xor i8 %sh, C1 => %not = xor i8 %x, -1 %r = lshr i8 %not, C2 https://bugs.llvm.org/show_bug.cgi?id=39657 llvm-svn: 347478
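The two Alive rules quoted in this commit can be checked exhaustively for `i8` with a few lines of Python. This is a sketch of the arithmetic identity only, not of the DAGCombiner code, and the function name is hypothetical.

```python
BW = 8
MASK = (1 << BW) - 1        # the i8 value -1


def check_not_before_shift():
    for c2 in range(BW):
        c1_shl = (MASK << c2) & MASK    # precondition: (-1 << C2) == C1
        c1_shr = MASK >> c2             # precondition: (-1 u>> C2) == C1
        for x in range(1 << BW):
            not_x = x ^ MASK            # xor i8 %x, -1
            # xor (shl x, C2), C1  ==  shl (not x), C2
            assert (((x << c2) & MASK) ^ c1_shl) == ((not_x << c2) & MASK)
            # xor (lshr x, C2), C1  ==  lshr (not x), C2
            assert ((x >> c2) ^ c1_shr) == (not_x >> c2)
    return True
```

Both identities hold because a shift distributes over xor, so xor-ing with the shifted all-ones mask is the same as inverting the value before shifting.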
*	[NFC] Assert that all blocks staying in loop are live	Max Kazantsev	2018-11-22	1	-0/+2
\| \| \| \|	llvm-svn: 347458
*	[NFC] Ensure deterministic order of dead exit blocks	Max Kazantsev	2018-11-22	1	-6/+11
\| \| \| \|	llvm-svn: 347457
*	[AArch64] Fix SelectionDAG infinite loop for v1i64 SCALAR_TO_VECTOR	John Brawn	2018-11-22	2	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	A consequence of r347274 is that SCALAR_TO_VECTOR can be converted into BUILD_VECTOR by SimplifyDemandedBits, but LowerBUILD_VECTOR can turn BUILD_VECTOR into SCALAR_TO_VECTOR so we get an infinite loop. Fix this by making LowerBUILD_VECTOR not do this transformation for those vectors that would get transformed back, i.e. BUILD_VECTOR of a single-element constant vector. Doing that means we get a DUP, which we then need to recognise in ISel as a copy. llvm-svn: 347456
*	[NFC] Simplify code by using standard exit blocks collection	Max Kazantsev	2018-11-22	1	-10/+8
\| \| \| \|	llvm-svn: 347454
*	[TI removal] Leverage the fact that TerminatorInst is gone to create	Chandler Carruth	2018-11-22	2	-37/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a normal base class that provides all common "call" functionality. This merges two complex CRTP mixins for the common "call" logic and common operand bundle logic into a single, normal base class of `CallInst` and `InvokeInst`. Going forward, users can typically `dyn_cast<CallBase>` and use the resulting API. No more need for the `CallSite` wrapper. I'm planning to migrate current usage of the wrapper to directly use the base class and then it can be removed, but those are simpler and much more incremental steps. The big change is to introduce this abstraction into the type system. I've tried to do some basic simplifications of the APIs that I couldn't really help but touch as part of this: - I've tried to organize the attribute API and bundle API into groups to make understanding the API of `CallBase` easier. Without this, I wasn't able to navigate the API sanely for all of the ways I needed to modify it. - I've added what seem like more clear and consistent APIs for getting at the called operand. These ended up being especially useful to consolidate the numerous duplicated code paths trying to do this. - I've largely reworked the organization and implementation of the APIs for computing the argument operands as they needed to change to work with the new subclass approach. To minimize any cost associated with this abstraction, I've moved the operand layout in memory to store the called operand last. This makes its position relative to the end of the operand array the same, regardless of the subclass. It should make it much cheaper to reference from the `CallBase` abstraction, and this is likely one of the most frequent things to query. 
We do still pay one abstraction penalty here: we have to branch to determine whether there are 0 or 2 extra operands when computing the end of the argument operand sequence. However, that seems both rare and should optimize well. I've implemented this in a way specifically designed to allow it to optimize fairly well. If this shows up in profiles, we can add overrides of the relevant methods to the subclasses that bypass this penalty. It seems very unlikely that this will be an issue as the code was already dealing with an ever present abstraction of whether or not there are operand bundles, so this isn't the first branch to go into the computation. I've tried to remove as much of the obvious vestigial API surface of the old CRTP implementation as I could, but I suspect there is further cleanup that should now be possible, especially around the operand bundle APIs. I'm leaving all of that for future work in this patch as enough things are changing here as-is. One thing that made this harder for me to reason about and debug was the pervasive use of unsigned values in subtraction and other arithmetic computations. I had to debug more than one unintentional wrap. I've switched a few of these to use `int` which seems substantially simpler, but I've held back from doing this more broadly to avoid creating confusing divergence within a single class's API. I also worked to remove all of the magic numbers used to index into operands, putting them behind named constants or putting them into a single method with a comment and strictly using the method elsewhere. This was necessary to be able to re-layout the operands as discussed above. Thanks to Ben for reviewing this (somewhat large and awkward) patch! Differential Revision: https://reviews.llvm.org/D54788 llvm-svn: 347452
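To make the operand-layout argument concrete, here is a toy Python model. It is not the real `CallBase` API: the class, its fields, and the exact ordering of bundle operands and extra operands are assumptions for illustration. It only shows why storing the called operand last gives it a subclass-independent position, and where the "0 or 2 extra operands" branch appears when computing the end of the argument list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyCall:
    # Assumed layout: args, bundle operands, extra operands, then the callee.
    operands: List[str]
    num_bundle_operands: int = 0
    is_invoke: bool = False     # invoke-like instructions carry 2 extra operands

    def called_operand(self) -> str:
        # Always the last operand, regardless of the instruction kind.
        return self.operands[-1]

    def num_extra_operands(self) -> int:
        # The one remaining branch: 0 extra operands for a call, 2 for an invoke.
        return 2 if self.is_invoke else 0

    def args(self) -> List[str]:
        end = (len(self.operands) - 1
               - self.num_extra_operands()
               - self.num_bundle_operands)
        return self.operands[:end]
```

For example, `ToyCall(["a", "b", "f"]).args()` yields `["a", "b"]`, while an invoke-like `ToyCall(["a", "bundle0", "normal", "unwind", "f"], num_bundle_operands=1, is_invoke=True)` yields `["a"]`. In both cases `called_operand()` is simply the last element.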
*	[SystemZTTIImpl] Give correct cost values for vector bswap intrinsics.	Jonas Paulsson	2018-11-22	2	-0/+33
\| \| \| \| \| \| \| \| \| \|	Implement getIntrinsicInstrCost() and return costs reflecting that bswap can be done with a vperm per vector register. Review: Ulrich Weigand https://reviews.llvm.org/D54789 llvm-svn: 347445
*	[PM] correcting return value for new-pass-manager version of Scalarizer	Fedor Sergeev	2018-11-21	1	-2/+2
\| \| \| \| \| \|	Obvious mistake missed during D54695 review. llvm-svn: 347432
*	[mingw] Use unmangled name after the $ in the section name	Reid Kleckner	2018-11-21	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GCC does it this way, and we have to be consistent. This includes stdcall and fastcall functions with suffixes. I confirmed that a fastcall function named "foo" ends up in ".text$foo", not ".text$@foo@8". Based on a patch by Andrew Yohn! Fixes PR39218. Differential Revision: https://reviews.llvm.org/D54762 llvm-svn: 347431
*	[PowerPC][NFC] Split PPCMCCodeEmitter into header and cpp file.	Stefan Pintilie	2018-11-21	2	-91/+110
\| \| \| \| \| \| \| \| \|	This is further cleanup for PPCMCCodeEmitter. The class had been contained within the cpp file alone. Now it has been split up between a header file and a cpp file which allows other classes to make use of the functions in this class if required. llvm-svn: 347428
*	[DAGCombiner] refactor select-of-FP-constants transform	Sanjay Patel	2018-11-21	1	-53/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This transform needs to be limited. We are converting to a constant pool load very early, and we are turning loads that are independent of the select condition (and therefore speculatable) into a dependent non-speculatable load. We may also be transferring a condition code from an FP register to integer to create that dependent load. llvm-svn: 347424
*	[PowerPC][NFC] Minor Code Cleanup for PPCMCCodeEmitter.	Stefan Pintilie	2018-11-21	1	-30/+41
\| \| \| \|	llvm-svn: 347422
*	[DAGCombiner] reduce code duplication; NFC	Sanjay Patel	2018-11-21	1	-33/+30
\| \| \| \|	llvm-svn: 347410
*	[MergeFuncs] Generate alias instead of thunk if possible	Nikita Popov	2018-11-21	1	-14/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MergeFunctions pass was originally intended to emit aliases instead of thunks where possible (unnamed_addr). However, for a long time this functionality was behind a flag hardcoded to false, bitrotted and was eventually removed in r309313. Originally the functionality was first disabled in r108417 due to lack of support for aliases in Mach-O. I believe that this is no longer the case nowadays, but I am not really familiar with this area. In the interest of being conservative, this patch reintroduces the aliasing functionality behind a default disabled -mergefunc-use-aliases flag. Differential Revision: https://reviews.llvm.org/D53285 llvm-svn: 347407
*	[x86] fix predicate for avoiding vblendv	Sanjay Patel	2018-11-21	1	-6/+3
\| \| \| \| \| \| \|	It only makes sense to produce the logic ops when 1 of the constants is +0.0. Otherwise, go with vblendv to reduce code. llvm-svn: 347403
*	[mips][mc] Add basic support for R_MIPS_JALR/R_MICROMIPS_JALR	Vladimir Stefanovic	2018-11-21	3	-2/+18
\| \| \| \| \| \| \| \| \|	R_MIPS_JALR/R_MICROMIPS_JALR can now be parsed in .s files and emitted to .o. They are still not generated with JALR. Differential revision: https://reviews.llvm.org/D54721 llvm-svn: 347398
*	[MC] Support labels as offsets in .reloc directive	Vladimir Stefanovic	2018-11-21	2	-20/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, expressions like `.reloc 1f, R_MIPS_JALR, foo` (with the label `1: nop` defined later) are not allowed, i.e. an offset in .reloc can only be an absolute value. This patch adds support for labels as offsets. If offset is a forward declared label, MCObjectStreamer keeps the fixup locally and adds it to the fixups vector after the label (and its offset) is defined. label+number is not supported yet. Differential revision: https://reviews.llvm.org/D53990 llvm-svn: 347397
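The deferral described here can be pictured with a toy model. The Python sketch below is not `MCObjectStreamer`: the class and method names are hypothetical. It only illustrates the idea of holding a fixup whose offset is a not-yet-defined label and recording it once the label's offset is known.

```python
class ToyStreamer:
    """Toy stand-in for an object streamer; all names are hypothetical."""

    def __init__(self):
        self.offset = 0
        self.labels = {}     # label name -> offset
        self.fixups = []     # resolved (offset, reloc_kind, symbol) entries
        self.pending = {}    # label name -> [(reloc_kind, symbol), ...]

    def emit_bytes(self, count):
        self.offset += count

    def emit_reloc(self, where, reloc_kind, symbol):
        if isinstance(where, int):          # absolute offset
            self.fixups.append((where, reloc_kind, symbol))
        elif where in self.labels:          # label already defined
            self.fixups.append((self.labels[where], reloc_kind, symbol))
        else:                               # forward label: keep the fixup locally
            self.pending.setdefault(where, []).append((reloc_kind, symbol))

    def emit_label(self, name):
        self.labels[name] = self.offset
        # The label now has an offset, so flush the fixups that waited for it.
        for reloc_kind, symbol in self.pending.pop(name, []):
            self.fixups.append((self.offset, reloc_kind, symbol))
```

In this model, a relocation that names a label defined four bytes later shows up in `fixups` with offset 4 only after `emit_label` runs.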
*	[TargetLowering] SimplifyDemandedBits - only reduce known bits for integer ↵	Simon Pilgrim	2018-11-21	1	-1/+3
\| \| \| \| \| \| \| \|	constants Avoids fuzzing crash found by Mikael Holmén. llvm-svn: 347393
*	[PM] Port Scalarizer to the new pass manager.	Mikael Holmen	2018-11-21	4	-55/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch by: markus (Markus Lavin) Reviewers: chandlerc, fedor.sergeev Reviewed By: fedor.sergeev Subscribers: llvm-commits, Ka-Ka, bjope Differential Revision: https://reviews.llvm.org/D54695 llvm-svn: 347392
*	[nios2] Add missing Nios2CodeGen -> Nios2AsmPrinter linkage	Michal Gorny	2018-11-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add missing linkage from Nios2CodeGen library to Nios2AsmPrinter library. The missing dependency causes shared-lib build to fail with the following reason: lib/Target/Nios2/CMakeFiles/LLVMNios2CodeGen.dir/Nios2AsmPrinter.cpp.o: In function `(anonymous namespace)::Nios2AsmPrinter::PrintAsmMemoryOperand(llvm::MachineInstr const, unsigned int, unsigned int, char const, llvm::raw_ostream&)': Nios2AsmPrinter.cpp:(.text._ZN12_GLOBAL__N_115Nios2AsmPrinter21PrintAsmMemoryOperandEPKN4llvm12MachineInstrEjjPKcRNS1_11raw_ostreamE+0x2b): undefined reference to `llvm::Nios2InstPrinter::getRegisterName(unsigned int)' lib/Target/Nios2/CMakeFiles/LLVMNios2CodeGen.dir/Nios2AsmPrinter.cpp.o: In function `(anonymous namespace)::Nios2AsmPrinter::PrintAsmOperand(llvm::MachineInstr const, unsigned int, unsigned int, char const, llvm::raw_ostream&)': Nios2AsmPrinter.cpp:(.text._ZN12_GLOBAL__N_115Nios2AsmPrinter15PrintAsmOperandEPKN4llvm12MachineInstrEjjPKcRNS1_11raw_ostreamE+0x97): undefined reference to `llvm::Nios2InstPrinter::getRegisterName(unsigned int)' collect2: error: ld returned 1 exit status Differential Revision: https://reviews.llvm.org/D47810 llvm-svn: 347387
*	[X86][AVX] Remove BROADCAST if we only need the 0'th element	Simon Pilgrim	2018-11-21	1	-0/+7
\| \| \| \| \| \|	We don't catch this with target shuffle simplification if the src/dst types are different. llvm-svn: 347386
*	Test commit: Delete trailing space in comment	Nikita Popov	2018-11-21	1	-1/+1
\| \| \| \|	llvm-svn: 347385
*	[X86] In getScalarMaskingNode, replace scalar_to_vector with a bitcast to ↵	Craig Topper	2018-11-21	1	-1/+3
\| \| \| \| \| \| \| \|	v8i1 and an extract_subvector to convert i8 to v1i1. The bitcast can be nicely merged with any i8 loads that exist for argument passing in 32-bit mode for example. llvm-svn: 347380
*	[LVI] run transfer function for binary operator even when the RHS isn't a ↵	John Regehr	2018-11-21	1	-36/+39
\| \| \| \| \| \| \| \| \| \| \| \| \|	constant LVI was symbolically executing binary operators only when the RHS was constant, missing the case where we have a ConstantRange for the RHS, but not an actual constant. Tested using check-all and by bootstrapping. Compile time is not impacted measurably. Differential Revision: https://reviews.llvm.org/D19859 llvm-svn: 347379
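A simplified picture of what this commit changes, using plain closed intervals in Python instead of LLVM's `ConstantRange`. Wraparound is ignored here, which the real class does handle. The function and its `constant_rhs_only` flag are hypothetical and exist only to contrast the old and new behaviour for an `add`.

```python
def transfer_add(lhs, rhs, constant_rhs_only):
    """Bound lhs + rhs, where each operand is a closed interval (lo, hi)."""
    if constant_rhs_only and rhs[0] != rhs[1]:
        return None     # old behaviour: give up unless the RHS is a single constant
    return (lhs[0] + rhs[0], lhs[1] + rhs[1])
```

With a left operand known to be in `(0, 10)` and a right operand known to be in `(1, 3)`, the constant-only variant returns `None`, while the range-based variant returns `(1, 13)`.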
*	[PowerPC] Do not use vectors to codegen bswap with Altivec turned off	Nemanja Ivanovic	2018-11-21	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have efficient codegen on P9 for lowering bswap that involves moving the value into a vector reg and moving it back. However, the check under which we custom lowered it did not adequately reflect the actual requirements. It required only that the subtarget be an implementation of ISA 3.0 since all compliant implementations have to provide the vector instructions. However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9 (i.e. don't emit vector code, don't have to save vector regs for context switch). So we should require the correct features for this lowering. Fixes https://bugs.llvm.org/show_bug.cgi?id=39334 llvm-svn: 347376
*	[X86] Correct 256 vpmovzx/vpmovsx isel patterns to check HasAVX2 instead of ↵	Craig Topper	2018-11-21	1	-8/+8
\| \| \| \| \| \| \| \| \| \|	HasAVX to prevent fast-isel from using them incorrectly. These are AVX2 instructions, but have been incorrectly marked in tablegen for a while. This wasn't a problem until r346784 switched the patterns to use target independent ISD opcodes. This made the patterns visible to fast isel. Fixes PR39733 llvm-svn: 347375
*	[X86] Emit a PACKUS instead of a VECTOR_SHUFFLE from LowerTRUNCATE for ↵	Craig Topper	2018-11-20	1	-7/+2
\| \| \| \| \| \| \| \| \| \|	v16i16->v16i8. We can't guarantee that demanded bits passing through the vector shuffle won't cause the AND in front of this to be removed. This would prevent the PACKUS from being matched during shuffle lowering. Unfortunately, this adds a packuswb to one of the vector-reduce-mul.ll tests since we were removing the shuffle via SimplifyDemandedVectorElts. We appear to have similar issues with vpmovwb on the same test case on other targets. llvm-svn: 347361
*	[DAGCombiner] look through bitcasts when trying to narrow vector binops	Sanjay Patel	2018-11-20	1	-13/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is another step in vector narrowing - a follow-up to D53784 (and hoping to eventually squash potential regressions seen in D51553). The x86 test diffs are wins, but the AArch64 diff is probably not. That problem already exists independent of this patch (see PR39722), but it went unnoticed in the previous patch because there were no regression tests that showed the possibility. The x86 diff in i64-mem-copy.ll is close. Given the frequency throttling concerns with using wider vector ops, an extra extract to reduce vector width is the right trade-off at this level of codegen. Differential Revision: https://reviews.llvm.org/D54392 llvm-svn: 347356
*	[CodeView] Add support for ref-qualified member functions.	Zachary Turner	2018-11-20	3	-21/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When you have a member function with a ref-qualifier, for example: `struct Foo { void Func() &; void Func2() &&; };` clang-cl was not emitting this information. Doing so is a bit awkward, because it's not a property of the LF_MFUNCTION type, which is what you'd expect. Instead, it's a property of the `this` pointer, which is actually an LF_POINTER. This record has an attributes bitmask on it, and our handling of this bitmask was all wrong. We had some parts of the bitmask defined incorrectly, but importantly for this bug, we didn't know about these extra 2 bits that represent the ref qualifier at all. Differential Revision: https://reviews.llvm.org/D54667 llvm-svn: 347354
*	[CodeView] Mark this pointers as const.	Zachary Turner	2018-11-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	This is for compatibility with MSVC, which also marks this pointers as being const-qualified. Fixes llvm.org/pr36526 Differential Revision: https://reviews.llvm.org/D54736 llvm-svn: 347353
*	[X86] Emit a single shuffle for the v16i8->v4i32 step of a ↵	Craig Topper	2018-11-20	1	-54/+28
\| \| \| \| \| \| \| \| \| \|	SIGN_EXTEND_VECTOR_INREG lowering on pre-sse4.1 targets. Previously we emitted two separate shuffles, one for unpcklbw and one for unpcklwd. Instead emit a single shuffle equivalent to both of the original shuffles. Shuffle lowering seems able to handle it. This avoids a bitcast between the two shuffles which seems helpful to DAG combine. Remove the custom type legalization for v8i8->v8i32. I had put that in to avoid some almost duplicate punpcklbw instructions I was seeing, but this lowering change seems to fix that. It also fixes some duplicate shuffles seen in vector-sext.ll llvm-svn: 347348
*	[WebAssembly] WebAssemblyLowerEmscriptenEHSjLj: use getter/setter for ↵	Sam Clegg	2018-11-20	1	-40/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	accessing tempRet0 Rather than assuming that `tempRet0` exists in linear memory only assume the getter/setter functions exist. This avoids conflicting with binaryen which declares a wasm global for this purpose and defines its own getter and setter for that. The other advantage of doing things this way is that it leaves it up to the linker/finalizer to decide how to actually store this temporary. As it happens binaryen uses a wasm global which is more appropriate since it is thread safe. This also allows us to change the way this is stored in the future (memory, TLS memory, wasm global) without modifying LLVM. This is part of a 4 part change: LLVM: https://reviews.llvm.org/D53240 fastcomp: https://github.com/kripken/emscripten-fastcomp/pull/237 emscripten: https://github.com/kripken/emscripten/pull/7358 binaryen: https://github.com/WebAssembly/binaryen/pull/1709 Differential Revision: https://reviews.llvm.org/D53240 llvm-svn: 347340
*	[InstSimplify] fold funnel shifts with undef operands	Sanjay Patel	2018-11-20	1	-1/+10
\| \| \| \| \| \| \| \|	Splitting these off from the D54666. Patch by: nikic (Nikita Popov) llvm-svn: 347332