bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[SystemZ] Wait with selection of legal vector/FP constants until Select().	Jonas Paulsson	2019-02-26	5	-163/+173
	This patch aims to make sure that any such constant that can be generated with a vector instruction (for example VGBM) is recognized as such during legalization and kept as a target-independent node through post-legalize DAGCombining. Two new functions named isVectorConstantLegal() and loadVectorConstant() replace old ways of handling vector/FP constants. A new struct named SystemZVectorConstantInfo is used to cache the results of isVectorConstantLegal() and pass them on to loadVectorConstant(). Support for fp128 constants in the presence of FeatureVectorEnhancements1 (z14) has been added. Review: Ulrich Weigand https://reviews.llvm.org/D58270 llvm-svn: 354896
*	[InstCombine] canonicalize more unsigned saturated add with 'not'	Sanjay Patel	2019-02-26	1	-0/+11
	Yet another pattern variation suggested by: https://bugs.llvm.org/show_bug.cgi?id=14613
	There are 8 more potential commuted patterns here on top of the 8 that were already handled (rL354221, rL354276, rL354393). We have the obvious commute of the 'add' + commute of the cmp predicate/operands (ugt/ult) + commute of the select operands:
	Name: base
	%notx = xor i32 %x, -1
	%a = add i32 %notx, %y
	%c = icmp ult i32 %x, %y
	%r = select i1 %c, i32 -1, i32 %a
	=>
	%c2 = icmp ult i32 %a, %y
	%r = select i1 %c2, i32 -1, i32 %a
	Name: ugt
	%notx = xor i32 %x, -1
	%a = add i32 %notx, %y
	%c = icmp ugt i32 %y, %x
	%r = select i1 %c, i32 -1, i32 %a
	=>
	%c2 = icmp ult i32 %a, %y
	%r = select i1 %c2, i32 -1, i32 %a
	Name: commute select
	%notx = xor i32 %x, -1
	%a = add i32 %notx, %y
	%c = icmp ult i32 %y, %x
	%r = select i1 %c, i32 %a, i32 -1
	=>
	%c2 = icmp ult i32 %a, %y
	%r = select i1 %c2, i32 -1, i32 %a
	Name: ugt + commute select
	%notx = xor i32 %x, -1
	%a = add i32 %notx, %y
	%c = icmp ugt i32 %x, %y
	%r = select i1 %c, i32 %a, i32 -1
	=>
	%c2 = icmp ult i32 %a, %y
	%r = select i1 %c2, i32 -1, i32 %a
	https://rise4fun.com/Alive/den
	llvm-svn: 354887
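The four source patterns above can be checked against the canonical form by brute force. The following is only an illustrative sketch, not part of the commit: it models the IR at 8 bits instead of i32, and all function names are made up for this example.

```python
# Brute-force check, at 8 bits, that the four source patterns from the
# commit message produce the same value as the canonical target pattern.
BITS = 8
MASK = (1 << BITS) - 1  # all-ones, i.e. the IR constant -1 at this width


def add_not(x, y):
    """%a = add (xor %x, -1), %y, wrapped to BITS bits."""
    return ((x ^ MASK) + y) & MASK


def canonical(x, y):
    """select (icmp ult %a, %y), -1, %a"""
    a = add_not(x, y)
    return MASK if a < y else a


def base(x, y):
    """select (icmp ult %x, %y), -1, %a"""
    return MASK if x < y else add_not(x, y)


def ugt(x, y):
    """select (icmp ugt %y, %x), -1, %a"""
    return MASK if y > x else add_not(x, y)


def commute_select(x, y):
    """select (icmp ult %y, %x), %a, -1"""
    return add_not(x, y) if y < x else MASK


def ugt_commute_select(x, y):
    """select (icmp ugt %x, %y), %a, -1"""
    return add_not(x, y) if x > y else MASK


def all_patterns_match():
    """True if every source pattern equals the canonical form for all inputs."""
    patterns = (base, ugt, commute_select, ugt_commute_select)
    return all(
        p(x, y) == canonical(x, y)
        for x in range(MASK + 1)
        for y in range(MASK + 1)
        for p in patterns
    )
```

The commuted-select variants differ from the others only when x equals y, where the add already yields all-ones, so both select arms agree.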
*	[DAG] Fix constant store folding to handle non-byte sizes.	Nirav Dave	2019-02-26	2	-12/+14
	Avoid crashes from zero-byte values due to sub-byte store sizes. Reviewers: uabelho, courbet, rnk Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58626 llvm-svn: 354884
*	[mips] Emit `.module softfloat` directive	Simon Atanasyan	2019-02-26	2	-3/+7
	This change fixes an assertion failure when using the `soft float` ABI for the mips32r6 target. llvm-svn: 354882
*	[MCA] Always check if scheduler resources are unavailable when reporting dispatch stalls.	Andrea Di Biagio	2019-02-26	2	-4/+13
	Dispatch stall cycles may be associated with multiple dispatch stall events. Before this patch, each stall cycle was associated with a single stall event. This patch also improves a couple of code comments, and adds a helper method to query the Scheduler for dispatch stalls. llvm-svn: 354877
*	[yaml2obj][obj2yaml] - Add support for the architecture specific dynamic tags.	George Rimar	2019-02-26	1	-4/+29
	This allows tools to parse/dump the architecture specific tags like DT_MIPS_*, DT_PPC64_* and DT_HEXAGON_*. Also fixes a bug in DynamicTags.def which was revealed in this patch. Differential revision: https://reviews.llvm.org/D58667 llvm-svn: 354876
*	[llvm-objdump] Implement -Mreg-names-raw/-std options.	Igor Kudrin	2019-02-26	3	-6/+33
	The --disassembler-options, or -M, are used to customize the disassembler and affect its output. The two implemented options allow selecting register names on ARM:
	* With -Mreg-names-raw, the disassembler uses rNN for all registers.
	* With -Mreg-names-std it prints sp, lr and pc for r13, r14 and r15, which is the default behavior of llvm-objdump.
	Differential Revision: https://reviews.llvm.org/D57680 llvm-svn: 354870
*	[ARM] Add Cortex-M35P	Luke Cheeseman	2019-02-26	1	-0/+10
	- Add LLVM backend support for Cortex-M35P
	- Documentation can be found at https://developer.arm.com/products/processors/cortex-m/cortex-m35p
	Differential Revision: https://reviews.llvm.org/D57763 llvm-svn: 354868
*	[LegalizeDAG] Use APInt::getSplat helper to create bitreverse masks. NFCI.	Simon Pilgrim	2019-02-26	1	-10/+6
	llvm-svn: 354867
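As a rough illustration of what a splat helper buys when building bit-reverse masks, here is a sketch in Python. It assumes the common expansion (byte swap, then swap nibbles, bit pairs and single bits using the 0x0F, 0x33 and 0x55 patterns); the commit message itself does not spell out the expansion, and `get_splat` is a hypothetical stand-in for the role APInt::getSplat plays, not the LLVM code.

```python
def get_splat(new_bits, value, value_bits):
    """Repeat a value_bits-wide pattern until it fills new_bits bits."""
    assert new_bits % value_bits == 0
    pattern = value & ((1 << value_bits) - 1)
    result = 0
    for shift in range(0, new_bits, value_bits):
        result |= pattern << shift
    return result


def bitreverse(x, bits):
    """Reverse the bit order of a bits-wide value (bits is a multiple of 8)."""
    # Step 1: reverse the byte order.
    x = int.from_bytes(x.to_bytes(bits // 8, "little"), "big")
    # Step 2: within each byte, swap nibbles, then bit pairs, then single bits.
    for width, byte_pattern in ((4, 0x0F), (2, 0x33), (1, 0x55)):
        lo = get_splat(bits, byte_pattern, 8)  # e.g. 0x0F0F0F0F for 32 bits
        hi = lo << width                       # e.g. 0xF0F0F0F0
        x = ((x & lo) << width) | ((x & hi) >> width)
    return x
```

Building each mask from a one-byte pattern keeps the code independent of the value width, which is the point of using a splat helper instead of hand-written wide constants.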
*	[LegalizeDAG] Expand SADDO/SSUBO using SADDSAT/SSUBSAT (PR37763)	Simon Pilgrim	2019-02-26	1	-5/+17
	If SADDSAT/SSUBSAT are legal, then we can expand SADDO/SSUBO by performing an ADD/SUB and a SADDSAT/SSUBSAT and then comparing the results. I looked at doing this for UADDO/USUBO as well but as we don't have to do as many range comparisons I didn't see any/much benefit. Differential Revision: https://reviews.llvm.org/D58637 llvm-svn: 354866
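The idea of detecting signed overflow by comparing a wrapping result with a saturating one can be sketched as follows. This is a minimal model at 8 bits with hypothetical helper names, not the LegalizeDAG code: the wrapped and saturated results differ exactly when the true result does not fit.

```python
BITS = 8
MIN = -(1 << (BITS - 1))      # -128
MAX = (1 << (BITS - 1)) - 1   # 127


def wrap(v):
    """Two's-complement wrap of an arbitrary integer to BITS bits."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v > MAX else v


def sadd_sat(a, b):
    """Signed saturating add."""
    return max(MIN, min(MAX, a + b))


def ssub_sat(a, b):
    """Signed saturating subtract."""
    return max(MIN, min(MAX, a - b))


def saddo(a, b):
    """Return (wrapped sum, overflow flag) using only add and saturating add."""
    s = wrap(a + b)
    return s, s != sadd_sat(a, b)


def ssubo(a, b):
    """Return (wrapped difference, overflow flag) using sub and saturating sub."""
    d = wrap(a - b)
    return d, d != ssub_sat(a, b)
```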
*	[ThinLTO] Use defined node and edge order when dumping DOT file	Eugene Leviant	2019-02-26	1	-15/+5
	Differential revision: https://reviews.llvm.org/D58631 llvm-svn: 354850
*	Revert "Improve "llvm-nm -f sysv" output for Elf files"	Vlad Tsyrklevich	2019-02-26	1	-10/+0
	This reverts commit r354833; it was causing ASan test failures on sanitizer-x86_64-linux-fast. llvm-svn: 354849
*	[WebAssembly] Properly align fp128 arguments in outgoing varargs arguments	Dan Gohman	2019-02-26	1	-3/+6
	For outgoing varargs arguments, it's necessary to check the OrigAlign field of the corresponding OutputArg entry to determine argument alignment, rather than just computing an alignment from the argument value type. This is because types like fp128 are split into multiple argument values, with narrower types that don't reflect the ABI alignment of the full fp128. This fixes the printf("printfL: %4.*Lf\n", 2, lval); testcase. Differential Revision: https://reviews.llvm.org/D58656 llvm-svn: 354846
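The effect of using the original alignment instead of the split value type's alignment can be sketched with a small layout calculation. The sizes and alignments below are assumptions made for illustration (an i32 with 4-byte alignment, an fp128 split into two 8-byte halves, and a 16-byte ABI alignment for the full fp128); the function names are hypothetical and this is not the WebAssembly backend code.

```python
def align_to(offset, align):
    """Round offset up to the next multiple of align."""
    return (offset + align - 1) // align * align


def layout(args):
    """Assign buffer offsets to (size, align) pairs in order."""
    offsets = []
    offset = 0
    for size, align in args:
        offset = align_to(offset, align)
        offsets.append(offset)
        offset += size
    return offsets


# An i32 followed by an fp128 that was split into two i64-sized values.
# Alignment computed from the split value type (assumed 8 bytes per half):
by_value_type = layout([(4, 4), (8, 8), (8, 8)])
# Alignment taken from the original argument (assumed 16 bytes for fp128),
# applied to the first half so the whole fp128 starts on a 16-byte boundary:
by_orig_align = layout([(4, 4), (8, 16), (8, 8)])
```

Under these assumptions the first layout places the fp128 at offset 8, while the second places it at offset 16, which is where a callee reading a 16-byte-aligned fp128 would look.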
*	[ARM] Be super conservative about atomics	Philip Reames	2019-02-26	1	-2/+5
	As requested during review of D57601 <https://reviews.llvm.org/D57601>, be equally conservative for atomic MMOs as for volatile MMOs in all in-tree backends. At the moment, all atomic MMOs are also volatile, but I'm about to change that. Differential Revision: https://reviews.llvm.org/D58490 Note: D58498 landed in several pieces as individual backends were approved. This is the last chunk. llvm-svn: 354845
*	[WebAssembly] Fix a bug deleting instruction in a ranged for loop	Heejin Ahn	2019-02-26	1	-2/+6
	Summary: We shouldn't delete elements while iterating a ranged for loop. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58519 llvm-svn: 354844
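The hazard has a close analog in Python, shown here only as an illustration (the actual fix is in C++, where erasing during a range-based for loop invalidates the iterator rather than merely skipping elements). The function names are hypothetical.

```python
def erase_in_place(items, should_erase):
    """Buggy: mutates the list that the for loop is walking."""
    for item in items:
        if should_erase(item):
            items.remove(item)  # shifts later elements under the iterator
    return items


def erase_safely(items, should_erase):
    """Iterate over a snapshot so removals cannot disturb the traversal."""
    for item in list(items):
        if should_erase(item):
            items.remove(item)
    return items


# The buggy version skips the element that slides into the erased slot.
buggy = erase_in_place([1, 2, 2, 3], lambda v: v == 2)
fixed = erase_safely([1, 2, 2, 3], lambda v: v == 2)
```

The same two remedies apply in C++: iterate over a copy or a collected worklist, or use an explicit iterator loop that advances before erasing.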
*	[CodeView] Emit HasConstructorOrDestructor class option for non-trivial constructors	Aaron Smith	2019-02-26	1	-4/+12
	Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov Reviewed By: zturner, rnk Subscribers: jdoerfert, majnemer, asmith Tags: #llvm Differential Revision: https://reviews.llvm.org/D44406 llvm-svn: 354841
*	[llvm-cov] Fix llvm-cov on Windows and un-XFAIL test	Reid Kleckner	2019-02-26	1	-1/+18
	Summary: The llvm-cov tool needs to be able to find coverage names in the executable, so the .lprfn and .lcovmap sections cannot be merged into .rdata. Also, the linker merges .lprfn$M into .lprfn, so llvm-cov needs to handle that when looking up sections. It has to support running on both relocatable object files and linked PE files. Lastly, when loading .lprfn from a PE file, llvm-cov needs to skip the leading zero byte added by the profile runtime. Reviewers: vsk Subscribers: hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58661 llvm-svn: 354840
*	[X86] Fix bug in x86_intrcc with arg copy elision	Reid Kleckner	2019-02-26	4	-43/+51
	Summary: Use a custom calling convention handler for interrupts instead of fixing up the locations in LowerMemArgument. This way, the offsets are correct when constructed and we don't need to account for them in as many places. Depends on D56883 Replaces D56275 Reviewers: craig.topper, phil-opp Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56944 llvm-svn: 354837
*	Improve "llvm-nm -f sysv" output for Elf files	Sunil Srivastava	2019-02-26	1	-0/+10
	Specifically, compute and print the Type and Section columns. Differential Revision: https://reviews.llvm.org/D58263 llvm-svn: 354833
*	RegBankSelect: Handle slightly more complex value mappings	Matt Arsenault	2019-02-25	3	-20/+63
	Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU. llvm-svn: 354828
*	AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes	Matt Arsenault	2019-02-25	1	-0/+2
	llvm-svn: 354825
*	AMDGPU/GlobalISel: Clamp max implicit_def elements	Matt Arsenault	2019-02-25	1	-1/+2
	llvm-svn: 354818
*	RegisterScavenger: Allow fail without spill	Matt Arsenault	2019-02-25	1	-15/+23
	AMDGPU wants to use this in some contexts where the spilling is either impossible, or a worse alternative to doing something else. llvm-svn: 354816
*	AMDGPU: Remove IntrReadMem from memtime/memrealtime intrinsics	Matt Arsenault	2019-02-25	1	-2/+10
	EarlyCSE with MemorySSA was able to use this to merge multiple calls with no intervening store. llvm-svn: 354814
*	[X86] Improve detection of unneeded shift amount masking to also handle the case that the LHS has known zeroes in it	Craig Topper	2019-02-25	2	-47/+63
	If the LHS has known zeros, the RHS immediate will have had bits removed. So call computeKnownBits to get the known zeroes so we can handle this case. Differential Revision: https://reviews.llvm.org/D58475 llvm-svn: 354811
*	Fix a sign compare warning breaking the -Werror build.	Andrea Di Biagio	2019-02-25	1	-1/+1
	The warning was introduced in r354793. llvm-svn: 354810
*	AMDGPU: Correct definitions for bitset instructions	Matt Arsenault	2019-02-25	2	-13/+21
	These really read and write the result register, so these need a tied input. llvm-svn: 354809
*	[Mips] Fix missing masking in fast-isel of br (PR40325)	Nikita Popov	2019-02-25	1	-11/+24
	Fixes https://bugs.llvm.org/show_bug.cgi?id=40325 by zero extending (and x, 1) the condition before branching on it. To avoid regressing trivial cases, I'm combining emission of cmp+br sequences for the single-use + same block case (similar to what we do in x86). icmpbr1.ll still regresses due to the cross-bb usage of the condition. Differential Revision: https://reviews.llvm.org/D58576 llvm-svn: 354808
*	[AArch64][GlobalISel] Refactor selectBuildVector to use MachineIRBuilder. NFC.	Amara Emerson	2019-02-25	1	-60/+43
	This is a preparatory change as I want to use emitScalarToVector() elsewhere, and in general we want to transition to MIRBuilder instead of using BuildMI directly. Differential Revision: https://reviews.llvm.org/D58528 llvm-svn: 354807
*	[Lanai] Be super conservative about atomics	Philip Reames	2019-02-25	1	-1/+2
	As requested during review of D57601 <https://reviews.llvm.org/D57601>, be equally conservative for atomic MMOs as for volatile MMOs in all in-tree backends. At the moment, all atomic MMOs are also volatile, but I'm about to change that. Reviewed as part of https://reviews.llvm.org/D58490, with other backends still pending review. llvm-svn: 354800
*	[SelectionDAG] Add demanded elts variants to isConstOrConstSplat helpers. NFCI.	Simon Pilgrim	2019-02-25	1	-37/+74
	These helpers extend the existing isConstOrConstSplat helper checks to support DemandedElts masks as well. We already had a local version of this in SelectionDAG that computeKnownBits/ComputeNumSignBits made use of, but this adds the functionality directly to the BuildVectorSDNode node and extends isConstOrConstSplat etc. to use that. This will allow us to reuse the functionality in SimplifyDemandedVectorElts/SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D58503 llvm-svn: 354797
*	[DAGCombine] Add undef shuffle elt support to partitionShuffleOfConcats	Simon Pilgrim	2019-02-25	1	-28/+29
	Support undef shuffle mask indices in the shuffle(concat_vectors, concat_vectors) -> concat_vectors fold. Differential Revision: https://reviews.llvm.org/D58585 llvm-svn: 354793
*	[ARM] Add some more missing T1 opcodes for the peephole optimiser	David Green	2019-02-25	1	-12/+24
	This adds a few extra Thumb1 opcodes to improve the peephole optimiser's ability to remove redundant cmp instructions. tADC and tSBC require a small fixup to prevent MOVS being moved past the instruction, giving the wrong flags. Differential Revision: https://reviews.llvm.org/D58281 llvm-svn: 354791
*	[Vectorizer] Add vectorization support for fixed smul/umul intrinsics	Simon Pilgrim	2019-02-25	3	-29/+43
	This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix it's the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790
*	[AArch64] Add support for Cortex-A76 and Cortex-A76AE	Luke Cheeseman	2019-02-25	6	-1/+36
	- Add LLVM backend support for Cortex-A76 and Cortex-A76AE
	- Documentation can be found at https://developer.arm.com/products/processors/cortex-a/cortex-a76
	llvm-svn: 354788
*	[X86] Merge ISD::ADD/SUB nodes into X86ISD::ADD/SUB equivalents (PR40483)	Simon Pilgrim	2019-02-25	1	-10/+28
	Avoid ADD/SUB instruction duplication by reusing the X86ISD::ADD/SUB results. Includes ADD commutation - I tried to include NEG+SUB SUB commutation as well but this causes regressions as we don't have good combine coverage to simplify X86ISD::SUB. Differential Revision: https://reviews.llvm.org/D58597 llvm-svn: 354771
*	[yaml2obj]Re-allow dynamic sections to have raw content	James Henderson	2019-02-25	1	-0/+1
	Recently, support was added to yaml2obj to allow dynamic sections to have a list of entries, to make it easier to write tests with dynamic sections. However, this change also removed the ability to provide custom contents to the dynamic section, making it hard to test malformed contents (e.g. because the section is not a valid size to contain an array of entries). This change reinstates this. An error is emitted if raw content and dynamic entries are both specified. Reviewed by: grimar, ruiu Differential Review: https://reviews.llvm.org/D58543 llvm-svn: 354770
*	[ARM] Make fullfp16 instructions not conditionalisable.	Simon Tatham	2019-02-25	4	-8/+62
	More or less all the instructions defined in the v8.2a full-fp16 extension are defined as UNPREDICTABLE if you put them in an IT block (Thumb) or use with any condition other than AL (ARM). LLVM didn't know that, and was happy to conditionalise them. In order to force these instructions to count as not predicable, I had to make a small Tablegen change. The code generation back end mostly decides if an instruction was predicable by looking for something it can identify as a predicate operand; there's an isPredicable bit flag that overrides that check in the positive direction, but nothing that overrides it in the negative direction. (I considered the alternative approach of actually removing the predicate operand from those instructions, but thought that it would be more painful overall for instructions differing only in data type to have different shapes of operand list. This way, the only code that has to notice the difference is the if-converter.) So I've added an isUnpredicable bit alongside isPredicable, and set that bit on the right subset of FP16 instructions, and also on the VSEL, VMAXNM/VMINNM and VRINT[ANPM] families which should be unpredicable for all data types. I've included a couple of representative regression tests, both of which previously caused an fp16 instruction to be conditionalised in ARM state and (with -arm-no-restrict-it) to be put in an IT block in Thumb. Reviewers: SjoerdMeijer, t.p.northover, efriedma Reviewed By: efriedma Subscribers: jdoerfert, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57823 llvm-svn: 354768
*	[SelectionDAG] Add a OPC_CheckChild2CondCode to SelectionDAGISel to remove a MoveChild and MoveParent pair.	Craig Topper	2019-02-25	1	-0/+14
	OPC_CheckCondCode is always used as operand 2 of a setcc, and it's always surrounded by a MoveChild2 and a MoveParent. By having a dedicated opcode for this case we can reduce the number of bytes needed for this pattern from 4 bytes to 2. This saves ~3000 bytes in the X86 table. llvm-svn: 354763
*	[PowerPC] Enhance the fast selection of fptoi & fptrunc instructions and clean up related asserts	Kang Zhang	2019-02-25	1	-4/+18
	Summary: Fast selection of the LLVM fptoi and fptrunc instructions does not handle VSX instruction support well. We should use the VSX float-to-integer conversion instruction instead of the non-VSX one if the operand register class is VSSRC or VSFRC, because i32 and i64 are mapped to VSSRC and VSFRC respectively when the VSX feature is enabled. For the float truncation instruction, we do similar work to try to use a VSX instruction. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D58430 llvm-svn: 354762
*	[X86] Combine zext(packus(x),packus(y)) -> concat(x,y) (PR39637)	Simon Pilgrim	2019-02-24	1	-0/+14
	It's proving tricky to combine shuffles across multiple vector sizes, so for now I'm adding this more specific combine - the pattern is common enough to be worth it as a first step. llvm-svn: 354757
*	[X86] Fix tls variable lowering issue with large code model	Craig Topper	2019-02-24	1	-5/+13
	Summary: The problem here is the lowering for tls variable. Below is the DAG for the code.
	SelectionDAG has 11 nodes:
	t0: ch = EntryToken
	t8: i64,ch = load<(load 8 from `i8 addrspace(257)* null`, addrspace 257)> t0, Constant:i64<0>, undef:i64
	t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]
	t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64
	t12: i64 = add t8, t11
	t4: i32,ch = load<(dereferenceable load 4 from @x)> t0, t12, undef:i64
	t6: ch = CopyToReg t0, Register:i32 %0, t4
	And when mcmodel is large, the instructions below can NOT be folded.
	t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]
	t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64
	So "t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64" is lowered to "Morphed node: t11: i64,ch = MOV64rm<Mem:(load 8 from got)> t10, TargetConstant:i8<1>, Register:i64 $noreg, TargetConstant:i32<0>, Register:i32 $noreg, t0"
	When llvm starts to lower "t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]", it fails. The patch is to fold the load and X86ISD::WrapperRIP.
	Fixes PR26906 Patch by LuoYuanke Reviewers: craig.topper, rnk, annita.zhang, wxiao3 Reviewed By: rnk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58336 llvm-svn: 354756
*	[X86][SSE] Use pblendw for v4i32/v2i64 during isel.	Craig Topper	2019-02-24	1	-13/+63
	Summary: Previously we used BLENDPS/BLENDPD but that puts the blend in the FP domain. Under optsize, the two address instruction pass can cause blendps/blendpd to commute to blendps/blendpd. But we probably shouldn't do that if the original type was an integer. So use pblendw instead. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58574 llvm-svn: 354755
*	[X86] Correct some ADC/SBB with immediate scheduler data for Broadwell and Skylake.	Craig Topper	2019-02-24	3	-10/+13
	Summary: The AX/EAX/RAX with immediate forms are 2 uops just like the AL with immediate. The modrm form with r8 and immediate is a single uop just like r16/r32/r64 with immediate. Reviewers: RKSimon, andreadb Reviewed By: RKSimon Subscribers: gbedwell, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58581 llvm-svn: 354754
*	[LegalizeTypes][AArch64][X86] Make type legalization of vector (S/U)ADD/SUB/MULO follow getSetCCResultType for the overflow bits. Make UnrollVectorOverflowOp properly convert from scalar boolean contents to vector boolean contents	Craig Topper	2019-02-24	3	-9/+18
	Summary: When promoting the overflow vector for these ops we should use the target's desired setcc result type. This way a v8i32 result type will use a v8i32 overflow vector instead of a v8i16 overflow vector. A v8i16 overflow vector will cause LegalizeDAG/LegalizeVectorOps to have to use v8i32 and truncate to v8i16 in its expansion. By doing this in type legalization instead, we get the truncate into the DAG earlier and give DAG combine more of a chance to optimize it. We also have to fix unrolling to use the scalar setcc result type for the scalarized operation, and convert it to the required vector element type after the scalar operation. We have to observe the vector boolean contents when doing this conversion. The previous code was just taking the scalar result and putting it in the vector. But for X86 and AArch64 that would have only put the boolean value in bit 0 of the element and left all other bits in the element 0. We need to ensure all bits in the element are the same. I'm using a select with constants here because that's what setcc unrolling in LegalizeVectorOps used. Reviewers: spatel, RKSimon, nikic Reviewed By: nikic Subscribers: javed.absar, kristof.beyls, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58567 llvm-svn: 354753
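The conversion from a scalar overflow bit to a vector element whose bits are all the same can be modelled as a select between two constants. The sketch below assumes a target whose vector booleans are all-ones or all-zeros, uses an unsigned add as the example operation, and uses hypothetical function names; it is not the UnrollVectorOverflowOp code.

```python
def bool_to_element(flag, elt_bits):
    """select(flag, all-ones, 0): every bit of the element carries the boolean."""
    return (1 << elt_bits) - 1 if flag else 0


def unrolled_uaddo(xs, ys, elt_bits):
    """Unroll a vector unsigned add-with-overflow one element at a time."""
    mask = (1 << elt_bits) - 1
    sums, overflow = [], []
    for x, y in zip(xs, ys):
        total = x + y
        sums.append(total & mask)
        # The scalar op yields a one-bit result; widen it before inserting
        # into the vector, instead of storing the raw 0/1 value.
        overflow.append(bool_to_element(total > mask, elt_bits))
    return sums, overflow
```

Storing the raw scalar result would leave only bit 0 set in an overflowing element, which is the behavior the commit message describes as incorrect for X86 and AArch64.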
*	[X86][AVX] Rename lowerShuffleByMerging128BitLanes to lowerShuffleAsLanePermuteAndRepeatedMask. NFC.	Simon Pilgrim	2019-02-24	1	-10/+11
	Name better matches the other similar 'lane permute' and 'repeated mask' functions we have. llvm-svn: 354749
*	[InstCombine] canonicalize add/sub with bool	Sanjay Patel	2019-02-24	1	-0/+6
	add A, sext(B) --> sub A, zext(B)
	We have to choose 1 of these forms, so I'm opting for the zext because that's easier for value tracking. The backend should be prepared for this change after: D57401 rL353433
	This is also a preliminary step towards reducing the amount of bit hackery that we do in IR to optimize icmp/select. That should be waiting to happen at a later optimization stage.
	The seeming regression in the fuzzer test was discussed in: D58359
	We were only managing that fold in instcombine by luck, and other passes should be able to deal with that better anyway.
	llvm-svn: 354748
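The equivalence behind this canonicalization is that sign-extending a true i1 gives -1 while zero-extending it gives 1, so adding one is the same as subtracting the other. A minimal check at 8 bits, with hypothetical helper names:

```python
BITS = 8
MASK = (1 << BITS) - 1


def sext_i1(b):
    """Sign-extend a boolean: true becomes all-ones (-1), false becomes 0."""
    return MASK if b else 0


def zext_i1(b):
    """Zero-extend a boolean: true becomes 1, false becomes 0."""
    return 1 if b else 0


def add_sext(a, b):
    """add A, sext(B), wrapped to BITS bits."""
    return (a + sext_i1(b)) & MASK


def sub_zext(a, b):
    """sub A, zext(B), wrapped to BITS bits."""
    return (a - zext_i1(b)) & MASK


def forms_agree():
    """True if both forms give the same result for every A and both B values."""
    return all(
        add_sext(a, b) == sub_zext(a, b)
        for a in range(MASK + 1)
        for b in (False, True)
    )
```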
*	[CGP] add special-cases to form unsigned add with overflow (PR40486)	Sanjay Patel	2019-02-24	1	-8/+27
	There's likely a missed IR canonicalization for at least 1 of these patterns. Otherwise, we wouldn't have needed the pattern-matching enhancement in D57516. Note that -- unlike usubo added with D57789 -- the TLI hook for this transform defaults to 'on'. So if there's any perf fallout from this, targets should look at how they're lowering the uaddo node in SDAG and/or override that hook. The x86 diffs suggest that there's some missing pattern-matching for forming inc/dec. This should fix the remaining known problems in: https://bugs.llvm.org/show_bug.cgi?id=40486 https://bugs.llvm.org/show_bug.cgi?id=31754 llvm-svn: 354746
*	Fix "enumeral and non-enumeral type in conditional expression" gcc7 warning. NFCI.	Simon Pilgrim	2019-02-24	1	-1/+2
	llvm-svn: 354745
*	[WebAssembly] Rename a variable in CFGStackify (NFC)	Heejin Ahn	2019-02-24	1	-7/+7
	llvm-svn: 354744