bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[AArch64][GlobalISel] Flesh out vector load/store support for more types.	Amara Emerson	2019-04-11	1	-0/+8
	Some of these were legalizing into smaller vector types unnecessarily, others were simply not supported yet. llvm-svn: 358223
*	[AArch64][GlobalISel] Legalization and ISel support for load/stores of ↵	Amara Emerson	2019-04-11	3	-9/+67
	vectors of pointers. Loads and stores of values with types like <2 x p0> currently don't get imported because SelectionDAG has no knowledge of pointer types. To leverage the existing support for vector load/stores, we can bitcast the value to have s64 element types instead. We do this as a custom legalization. This patch also adds support for general loads of <2 x s64>, and relaxes some type conditions on selecting G_BITCAST. Differential Revision: https://reviews.llvm.org/D60534 llvm-svn: 358221
*	[X86] Restrict vselect handling in scalarizeExtEltFP to only case to pre ↵	Craig Topper	2019-04-11	1	-0/+4
	type legalization where the setcc result type is vXi1. If the vector setcc has been legalized then we will need to convert a vector boolean of 0 or -1 to a scalar boolean of 0 or 1. The added test case previously crashed in 32-bit mode by creating a setcc with an i64 condition that type legalization couldn't expand. llvm-svn: 358218
*	[X86] Add patterns for using movss/movsd for atomic load/store of f32/64. ↵	Craig Topper	2019-04-11	2	-70/+51
	Remove atomic fadd pseudos use isel patterns instead. This patch adds patterns for turning bitcasted atomic load/store into movss/sd. It also removes the pseudo instructions for atomic RMW fadd. Instead, it just adds isel patterns for folding an atomic load into addss/sd and relies on the new movss/sd store pattern to handle the write part. This also makes the fadd patterns use VEX and EVEX instructions when AVX or AVX512F are enabled. Differential Revision: https://reviews.llvm.org/D60394 llvm-svn: 358215
*	Recommit r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit ↵	Craig Topper	2019-04-11	3	-20/+75
	targets with X87, but no SSE2" With correct test checks this time. If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness. This matches what gcc and icc do for this case and removes an existing FIXME. llvm-svn: 358214
*	Revert r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit ↵	Craig Topper	2019-04-11	3	-75/+20
	targets with X87, but no SSE2" I seem to have messed up the test checks. llvm-svn: 358212
*	[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, ↵	Craig Topper	2019-04-11	3	-20/+75
	but no SSE2 If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness. This matches what gcc and icc do for this case and removes an existing FIXME. Differential Revision: https://reviews.llvm.org/D60156 llvm-svn: 358211
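A rough arithmetic illustration of why the fild/fist route described in r358211 preserves the value: the 80-bit extended format has a 64-bit significand, so every i64 fits exactly, whereas a 64-bit double (53-bit significand) would lose some values. This is only a model in Python, not code from the patch; the helper name `fits_in_significand` is hypothetical.

```python
def fits_in_significand(value: int, significand_bits: int) -> bool:
    """True if |value| is exactly representable with the given significand width.

    Exponent range is ignored; it is not a limit for 64-bit integers in
    either double or x87 extended precision.
    """
    magnitude = abs(value)
    if magnitude == 0:
        return True
    # Trailing zero bits are absorbed by the exponent, so drop them.
    while magnitude % 2 == 0:
        magnitude //= 2
    return magnitude.bit_length() <= significand_bits


I64_MIN, I64_MAX = -(2**63), 2**63 - 1
X87_SIGNIFICAND_BITS = 64     # 80-bit extended precision
DOUBLE_SIGNIFICAND_BITS = 53  # IEEE 754 binary64, counting the implicit bit

samples = [0, 1, -1, 2**53 + 1, I64_MAX, I64_MIN]
all_fit_x87 = all(fits_in_significand(v, X87_SIGNIFICAND_BITS) for v in samples)
double_loses = not fits_in_significand(2**53 + 1, DOUBLE_SIGNIFICAND_BITS)
```

The sample list is illustrative rather than exhaustive; the general argument is that any magnitude up to 2^63 needs at most 64 significant bits.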
*	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV3 mask support	Simon Pilgrim	2019-04-11	1	-1/+1
	Completes SimplifyDemandedVectorElts's basic variable shuffle mask support which should help D60512 + D60562 llvm-svn: 358186
*	[RISCV] Diagnose invalid second input register operand when using %tprel_add	Roger Ferrer Ibanez	2019-04-11	1	-2/+26
	RISCVMCCodeEmitter::expandAddTPRel asserts that the second operand must be x4/tp. As we are not currently checking this in the RISCVAsmParser, the assert is easy to trigger due to wrong assembly input. This patch does a late check of this constraint. An alternative could be using a singleton register class for x4/tp similar to the current one for sp. Unfortunately it does not result in a good diagnostic. Because add is an overloaded mnemonic, if no matching is possible, the diagnostic of the first failing alternative seems to be used as the diagnostic itself. This means that in this case the %tprel_add is diagnosed as an invalid operand (because the real add instruction only has 3 operands). Differential Revision: https://reviews.llvm.org/D60528 llvm-svn: 358183
*	[X86] Add MM register mapping from CodeView to MC register id	Luo, Yuanke	2019-04-11	1	-0/+9
	Differential Revision: https://reviews.llvm.org/D60437 Change-Id: I2183a6d825d0284b22705d423b88882992b236c5 llvm-svn: 358179
*	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV mask support	Simon Pilgrim	2019-04-11	1	-0/+8
	llvm-svn: 358174
*	[AArch64] Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64	Diogo N. Sampaio	2019-04-11	1	-0/+2
	Summary: Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64 Reviewers: pbarrio, DavidSpickett, LukeGeeson Reviewed By: LukeGeeson Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60259 llvm-svn: 358171
*	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMILPV mask support	Simon Pilgrim	2019-04-11	1	-1/+2
	llvm-svn: 358170
*	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMIL2 mask support	Simon Pilgrim	2019-04-11	1	-2/+2
	llvm-svn: 358167
*	[X86] SimplifyDemandedVectorElts - add VPPERM support	Simon Pilgrim	2019-04-11	1	-0/+9
	We need to add support for all variable shuffle mask ops, but VPPERM is the only one that already has test coverage. llvm-svn: 358165
*	Test commit access	Oliver Stannard	2019-04-11	1	-0/+1
	llvm-svn: 358162
*	[RISCV] Put data smaller than eight bytes to small data section	Shiva Chen	2019-04-11	2	-0/+120
	Because gp = sdata_start_address + 0x800, gp with a signed twelve-bit offset can cover most of the small data section. Linker relaxation can then convert multi-instruction data accesses into a single gp-based instruction with a signed twelve-bit offset. Differential Revision: https://reviews.llvm.org/D57493 llvm-svn: 358150
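The reach claimed in r358150 can be checked with a little arithmetic. The sketch below assumes only what the message states (gp sits 0x800 past the start of the small data section) plus the fact that a signed twelve-bit offset spans -2048..2047; the function names and the example address are hypothetical.

```python
def gp_relative_reach(sdata_start: int) -> range:
    """Addresses reachable as gp + imm12, with gp = sdata_start + 0x800."""
    gp = sdata_start + 0x800
    lowest = gp - 2048   # most negative signed 12-bit offset
    highest = gp + 2047  # most positive signed 12-bit offset
    return range(lowest, highest + 1)


def is_gp_reachable(address: int, sdata_start: int) -> bool:
    """True if a single gp-based access can address this byte."""
    return address in gp_relative_reach(sdata_start)


# Example: a small data section starting at a made-up address.
reach = gp_relative_reach(0x10000)
window_bytes = len(reach)
```

Biasing gp by 0x800 makes the window start exactly at the section start and span 4096 bytes, which is why only data beyond the first 4 KiB of the section falls outside it.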
*	[AArch64][GlobalISel] Make <2 x p0> = G_BUILD_VECTOR legal.	Amara Emerson	2019-04-10	1	-0/+1
	The existing isel support already works for p0 once the legalizer accepts it. llvm-svn: 358144
*	[AArch64][GlobalISel] Add legalizer support for <8 x s16> and <16 x s8> G_ADD.	Amara Emerson	2019-04-10	1	-1/+1
	llvm-svn: 358143
*	[AArch64][GlobalISel] Scalarize vector SDIV.	Amara Emerson	2019-04-10	1	-1/+2
	llvm-svn: 358142
*	[X86] Teach foldMaskedShiftToScaledMask to look through an any_extend from ↵	Craig Topper	2019-04-10	1	-22/+44
	i32 to i64 between the and & shl foldMaskedShiftToScaledMask tries to reorder and & shl to enable the shl to fold into an LEA. But if there is an any_extend between them it doesn't work. This patch modifies the code to look through any_extend from i32 to i64 when the and mask only uses bits that weren't from the extended part. This will prevent a regression from D60358 caused by 64-bit SHL being narrowed to 32 bits when their upper bits aren't demanded. Differential Revision: https://reviews.llvm.org/D60532 llvm-svn: 358139
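The reordering mentioned in r358139 rests on a simple bit identity. The sketch below assumes the rewrite goes from `(x << c) & mask` to `(x & (mask >> c)) << c`, which leaves the shift outermost so it can become an LEA scale; that direction is my reading of the message, not something it spells out, and the function names are hypothetical. The any_extend handling the patch adds is not modelled.

```python
from itertools import product

MASK64 = (1 << 64) - 1  # model 64-bit wraparound


def and_of_shl(x: int, shift: int, mask: int) -> int:
    """Original form: shift first, then mask."""
    return ((x << shift) & MASK64) & mask


def shl_of_and(x: int, shift: int, mask: int) -> int:
    """Reordered form: mask (pre-shifted right) first, then shift."""
    return ((x & (mask >> shift)) << shift) & MASK64


values = [0, 1, 0xFF, 0x1234_5678, 0xDEAD_BEEF_CAFE_F00D, MASK64]
masks = [0, 0xFF, 0x3FC, 0xFFFF_FFFF, 0xFFFF_0000_FFFF_0000, MASK64]
shifts = [1, 2, 3]  # shift amounts that correspond to LEA scales 2, 4, 8

forms_agree = all(
    and_of_shl(x, s, m) == shl_of_and(x, s, m)
    for x, s, m in product(values, shifts, masks)
)
```

Both forms compute bit `i` of the result as bit `i - c` of `x` ANDed with bit `i` of the mask, which is why they agree for every input, not just the sampled ones.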
*	[X86] Make _Int instructions the preferred instruction for the assembly ↵	Craig Topper	2019-04-10	6	-147/+152
	parser and disassembly parser to remove inconsistencies between VEX and EVEX. Many of our instructions have both a _Int form used by intrinsics and a form used by other IR constructs. In the EVEX space the _Int versions usually cover all the capabilities, including broadcasting and rounding, while the other version only covers simple register/register or register/load forms. For this reason in EVEX, the non-intrinsic form is usually marked isCodeGenOnly=1. In the VEX encoding space we were less consistent, but usually the _Int version was the isCodeGenOnly version. This commit makes the VEX instructions match the EVEX instructions. This was done by manually studying the AsmMatcher table so it's possible I missed some cases, but we should be closer now. I'm thinking about using the isCodeGenOnly bit to simplify the EVEX2VEX tablegen code that disambiguates the _Int and non _Int versions. Currently it checks register class sizes and the Record the memory operands come from. I have some other changes I was looking into for D59266 that may break the memory check. I had to make a few scheduler hacks to keep the _Int versions from being treated differently than the non _Int version. Differential Revision: https://reviews.llvm.org/D60441 llvm-svn: 358138
*	[X86] Replace some if statements in isel address matching that should never ↵	Craig Topper	2019-04-10	1	-8/+10
	be true with asserts. And move them earlier before we looked through operands that don't change size. NFC These ifs were ensuring we don't have to handle types larger than 64 bits probably because we use getZExtValue in several places below them. None of the callers of this code pass types larger than 64 bits so we can just assert instead of branching in release code. I've also moved them earlier since we're just looking through operations that don't affect bit width. This is prep work for some refactoring I plan to do to the (and (shl)) handling code. llvm-svn: 358123
*	[X86AsmPrinter] refactor to limit use of Modifier. NFC	Nick Desaulniers	2019-04-10	1	-47/+57
	Summary: The Modifier memory operand is used in 2 cases of memory references (H & P ExtraCodes). Rather than pass around the likely nullptr Modifier, refactor the handling of the Modifier out from printOperand().
	The refactorings in this patch:
	- Don't forward declare printOperand, move its definition up.
	- The diff makes it look like there's a change to printPCRelImm (narrator: there's not).
	- Create printModifiedOperand()
	- Move logic for Modifier to there from printOperand
	- Use printModifiedOperand in 3 call sites that actually create Modifiers.
	- Remove now unused Modifier parameter from printOperand
	- Remove default parameter from printLeaMemReference as it only has 1 call site that explicitly passes a parameter.
	- Remove default parameter from printMemReference, make the lone call site explicitly pass nullptr.
	- Drop Modifier parameter from printIntelMemReference, as Intel style memory references don't support the Modifiers in question.
	This will allow future changes to printOperand() to make it a pure virtual method on the base AsmPrinter class, allowing for more generic handling of some architecture generic constraints. X86AsmPrinter was the only derived class of AsmPrinter to have additional parameters on its printOperand function.
	Reviewers: craig.topper, echristo Reviewed By: echristo Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60526 llvm-svn: 358122
*	[X86] X86ScheduleBdVer2: use !listsplat operator to cleanup loadres calculation	Roman Lebedev	2019-04-10	1	-4/+7
	The problem is that one can't concatenate an empty list (implied all-ones) with non-empty list here. The result will be the non-empty list, and it won't match the length of the ExePorts list. The problems begin when LoadRes != 1 here, which is the case in PdWriteResYMMPair, and more importantly I think it will be the case for PdWriteResExPair. llvm-svn: 358118
*	Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg	David Green	2019-04-10	1	-3/+3
	Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not seeing through to the constant in other blocks. Revert this patch while we come up with a better way to handle that. I will try to follow this up with some better tests. llvm-svn: 358113
*	[AArch64] Teach getTestBitOperand to look through ANY_EXTENDS	Craig Topper	2019-04-10	1	-0/+6
	This patch teaches getTestBitOperand to look through ANY_EXTENDs when the extended bits aren't used. The test case changed here is based on what D60358 did to test16 in tbz-tbnz.ll. So this patch will avoid that regression. Differential Revision: https://reviews.llvm.org/D60482 llvm-svn: 358108
*	[AsmPrinter] refactor to remove AsmVariant. NFC	Nick Desaulniers	2019-04-10	25	-154/+87
	Summary: The InlineAsm::AsmDialect is only required for X86; no other architecture makes use of it and as such it gets passed around between arch-specific and general code while being unused for all architectures but X86. Since the AsmDialect is queried from a MachineInstr, which we also pass around, remove the additional AsmDialect parameter and query for it deep in the X86AsmPrinter only when needed/as late as possible. This refactor should help later planned refactors to AsmPrinter, as this difference in the X86AsmPrinter makes it harder to make AsmPrinter more generic. Reviewers: craig.topper Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60488 llvm-svn: 358101
*	[X86][AVX] getTargetConstantBitsFromNode - extract bits from ↵	Simon Pilgrim	2019-04-10	1	-0/+13
	X86ISD::SUBV_BROADCAST llvm-svn: 358096
*	[AArch64] Add lowering pattern for scalar fp16 facge and facgt	Diogo N. Sampaio	2019-04-10	1	-0/+10
	Summary: The fp16 scalar version of facge and facgt requires custom pattern matching, as the result type is not the same width as the operands. Reviewers: olista01, javed.absar, pbarrio Reviewed By: javed.absar Subscribers: kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60212 llvm-svn: 358083
*	[ARM] [FIX] Add missing f16 vector operations lowering	Diogo N. Sampaio	2019-04-10	2	-1/+6
	Summary: Add missing <8xhalf> shufflevectors pattern, when using concat_vector dag node. Also allow <8xhalf> and <4xhalf> vldup1 operations. These instructions are required for v8.2a fp16 lowering of vmul_n_f16, vmulq_n_f16 and vmulq_lane_f16 intrinsics. Reviewers: olista01, pbarrio, LukeGeeson, efriedma Reviewed By: efriedma Subscribers: efriedma, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60319 llvm-svn: 358081
*	[NFC] Fix unused variable warning.	Clement Courbet	2019-04-10	1	-3/+0
	llvm-svn: 358080
*	Fixup r358063	Diana Picus	2019-04-10	1	-2/+2
	Fix warning/error about mixed signedness. llvm-svn: 358065
*	[ARM GlobalISel] Add some asserts. NFC.	Diana Picus	2019-04-10	1	-0/+2
	Make sure some arm opcodes don't unintentionally sneak into thumb mode. llvm-svn: 358064
*	[ARM GlobalISel] Select G_FCONSTANT for VFP3	Diana Picus	2019-04-10	2	-10/+59
	Make it possible to TableGen code for FCONSTS and FCONSTD. We need to make two changes to the TableGen descriptions of vfp_f32imm and vfp_f64imm respectively: * add GISelPredicateCode to check that the immediate fits in 8 bits; * extract the SDNodeXForms into separate definitions and create a GISDNodeXFormEquiv and a custom renderer function for each of them. There's a lot of boilerplate to get the actual value of the immediate, but it basically just boils down to calling ARM_AM::getFP32Imm or ARM_AM::getFP64Imm. llvm-svn: 358063
*	[ARM GlobalISel] Select G_FCONSTANT into pools	Diana Picus	2019-04-10	1	-0/+21
	Put all floating point constants into constant pools and load their values from there. llvm-svn: 358062
*	[ARM GlobalISel] Map G_FCONSTANT	Diana Picus	2019-04-10	1	-0/+8
	llvm-svn: 358061
*	[X86] Move the 2 byte VEX optimization for MOV instructions back to the ↵	Craig Topper	2019-04-10	3	-54/+63
	X86AsmParser::processInstruction where it used to be. Block when {vex3} prefix is present. Years ago I moved this to an InstAlias using VR128H/VR128L. But now that we support {vex3} pseudo prefix, we need to block the optimization when it is set to match gas behavior. llvm-svn: 358046
*	[Sparc] Fix incorrect MI insertion position for spilling f128.	Jim Lin	2019-04-10	1	-2/+2
	Summary: Obviously, the newly built MIs (sethi+add or sethi+xor+add) for constructing a large offset should be inserted before the newly created MI that stores the even register into memory. So the insertion position should be *StMI instead of II.
	before fixed:
	  std %f0, [%g1+80]
	  sethi 4, %g1          <<<
	  add %g1, %sp, %g1     <<< these two instructions should be put before "std %f0, [%g1+80]".
	  sethi 4, %g1
	  add %g1, %sp, %g1
	  std %f2, [%g1+88]
	after fixed:
	  sethi 4, %g1
	  add %g1, %sp, %g1
	  std %f0, [%g1+80]
	  sethi 4, %g1
	  add %g1, %sp, %g1
	  std %f2, [%g1+88]
	Reviewers: venkatra, jyknight Reviewed By: jyknight Subscribers: jyknight, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60397 llvm-svn: 358042
*	[X86] Support the EVEX versions vcvt(t)ss2si and vcvt(t)sd2si with the ↵	Craig Topper	2019-04-10	2	-32/+27
	{evex} pseudo prefix in the assembler. The EVEX versions are ambiguous with the VEX versions based on operands alone so we had explicitly dropped them from the AsmMatcher table. Unfortunately, when we add them they incorrectly show in the table before their VEX counterparts. This is different from how the prioritization normally works. To fix this we have to explicitly reject the instructions unless the {evex} prefix has been seen. llvm-svn: 358041
*	[X86] Add VEX_LIG to scalar VEX/EVEX instructions that were missing it.	Craig Topper	2019-04-09	2	-39/+40
	Scalar VEX/EVEX instructions don't use the L bit and don't look at it for decoding either. So we should ignore it in our disassembler. The missing instructions here were found by grepping the raw tablegen class definitions in the tablegen debug output. llvm-svn: 358040
*	[X86] Fix a dangling StringRef issue introduced in r358029.	Craig Topper	2019-04-09	1	-3/+4
	I was attempting to convert mnemonics to lower case after processing a pseudo prefix. But the ParseOperands just holds a StringRef for tokens so there is nowhere to allocate the memory. Add FIXMEs for the lower case issue which also exists in the prefix parsing code. llvm-svn: 358036
*	[AArch64][GlobalISel] Add isel support for vector G_ICMP and G_ASHR & G_SHL	Amara Emerson	2019-04-09	1	-2/+259
	The selection for G_ICMP is unfortunately not currently importable from SDAG due to the use of custom SDNodes. To support this, this selection method has an opcode table which has been generated by a script, indexed by various instruction properties. Ideally in future we will have GISel-native selection patterns that we can write in tablegen to improve on this. For selection of some types we also need support for G_ASHR and G_SHL which are generated as a result of legalization. This patch also adds support for them, generating the same code as SelectionDAG currently does. Differential Revision: https://reviews.llvm.org/D60436 llvm-svn: 358035
*	[AArch64][GlobalISel] Legalize vector G_ICMP.	Amara Emerson	2019-04-09	1	-2/+27
	Selection support will be coming in a later patch. Differential Revision: https://reviews.llvm.org/D60435 llvm-svn: 358034
*	[AArch64][GlobalISel] Add legalization for some vector G_SHL and G_ASHR.	Amara Emerson	2019-04-09	1	-4/+6
	This is needed for some future support for vector ICMP. Differential Revision: https://reviews.llvm.org/D60433 llvm-svn: 358033
*	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally	Amara Emerson	2019-04-09	3	-7/+60
	required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able to truncate it back. Likewise, when dealing with returns of these types, they need to be widened back in the appropriate way. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032
*	[X86] Add support for {vex2}, {vex3}, and {evex} to the assembler to match ↵	Craig Topper	2019-04-09	3	-9/+111
	gas. Use {evex} to improve one of our 32-bit AVX512 tests. These can be used to force the encoding used for instructions. {vex2} will fail if the instruction is not VEX encoded, but otherwise won't do anything since we prefer vex2 when possible. Might need to skip use of the _REV MOV instructions for this too, but I haven't done that yet. {vex3} will force the instruction to use the 3 byte VEX encoding or fail if there is no VEX form. {evex} will force the instruction to use the EVEX version or fail if there is no EVEX version. Differential Revision: https://reviews.llvm.org/D59266 llvm-svn: 358029
*	[PowerPC] fix trivial typos in comment, NFC	Hiroshi Inoue	2019-04-09	3	-6/+6
	llvm-svn: 357981
*	[X86] Have EVEX2VEX tablegenerator use HasVEX_L and HasEVEX_L2 fields ↵	Craig Topper	2019-04-09	1	-4/+1
	instead of the composite EVEX_LL field. Remove the EVEX_LL field. NFCI The composite existed to simplify some other tablegen code and not really in an important way. Remove the combined field and just calculate the vector size using two ifs. llvm-svn: 357972
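One way to picture the "two ifs" from r357972 is shown below. It is a guess at the shape of the logic, not the actual tablegen emitter code: the flag names come from the commit title, the function name is hypothetical, and the 128/256/512 mapping follows the usual x86 vector-length encoding (L selects 256-bit, L' selects 512-bit).

```python
def vector_size_bits(has_vex_l: bool, has_evex_l2: bool) -> int:
    """Derive the vector width from the two separate length flags."""
    if has_evex_l2:
        return 512  # EVEX.L'L = 10
    if has_vex_l:
        return 256  # L = 1
    return 128      # neither flag set


sizes = {
    (False, False): vector_size_bits(False, False),
    (True, False): vector_size_bits(True, False),
    (False, True): vector_size_bits(False, True),
}
```

With the flags kept separate there is no need for a combined two-bit field; the combination with both flags set is not a valid vector length and is simply not expected here.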
*	[X86] Use VEX_WIG for VPINSRB/W and VPEXTRB/W to match what is done for EVEX.	Craig Topper	2019-04-09	1	-5/+5
	The instruction documentation lists this as W0 for the VEX encoding. But there's a footnote mentioning that VEX.W is ignored in 64-bit mode. And the main VEX encoding description says the VEX.W bit is ignored for instructions that are equivalent to a legacy SSE instruction that uses REX.W to select a GPR which would apply here. By making this match EVEX we can remove a special case of allowing EVEX2VEX to turn an EVEX.WIG instruction into VEX.W0. llvm-svn: 357971