bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[DAGCombiner] Enable UADDO/USUBO vector combine support	Simon Pilgrim	2019-03-06	1	-11/+8
	Differential Revision: https://reviews.llvm.org/D58965 llvm-svn: 355517
*	[TargetLowering] simplify code for uaddsat/usubsat expansion; NFC	Sanjay Patel	2019-03-06	1	-17/+13
	We had 2 local variable names for the same type. llvm-svn: 355516
*	Revert "[CodeGen] Omit range checks from jump tables when lowering switches ↵	Alexander Kornienko	2019-03-06	2	-32/+53
	with unreachable default"
	This reverts commit 2a0f2c5ef3330846149598220467d9f3c6e8b99c (r355490).
	The commit causes an assertion failure when compiling LLVM code:
	$ cat repro.cpp
	class QQQ {
	public:
	  bool x() const;
	  bool y() const;
	  unsigned getSizeInBits() const {
	    if (y() || x())
	      return getScalarSizeInBits();
	    return getScalarSizeInBits() * 2;
	  }
	  unsigned getScalarSizeInBits() const;
	};
	int f(const QQQ &Ty) {
	  switch (Ty.getSizeInBits()) {
	  case 1:
	  case 8:
	    return 0;
	  case 16:
	    return 1;
	  case 32:
	    return 2;
	  case 64:
	    return 3;
	  default:
	    __builtin_unreachable();
	  }
	}
	$ clang -O2 -o repro.o repro.cpp
	assert.h assertion failed at llvm/include/llvm/ADT/ilist_iterator.h:139 in llvm::ilist_iterator::reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, true, false>::operator() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, IsReverse = true, IsConst = false]: !NodePtr->isKnownSentinel()
	Check failure stack trace:
	@ 0x558aab4afc10 __assert_fail
	@ 0x558aa885479b llvm::ilist_iterator<>::operator()
	@ 0x558aa8854715 llvm::MachineInstrBundleIterator<>::operator()
	@ 0x558aa92c33c3 llvm::X86InstrInfo::optimizeCompareInstr()
	@ 0x558aa9a9c251 (anonymous namespace)::PeepholeOptimizer::optimizeCmpInstr()
	@ 0x558aa9a9b371 (anonymous namespace)::PeepholeOptimizer::runOnMachineFunction()
	@ 0x558aa99a4fc8 llvm::MachineFunctionPass::runOnFunction()
	@ 0x558aab019fc4 llvm::FPPassManager::runOnFunction()
	@ 0x558aab01a3a5 llvm::FPPassManager::runOnModule()
	@ 0x558aab01aa9b (anonymous namespace)::MPPassManager::runOnModule()
	@ 0x558aab01a635 llvm::legacy::PassManagerImpl::run()
	@ 0x558aab01afe1 llvm::legacy::PassManager::run()
	@ 0x558aa5914769 (anonymous namespace)::EmitAssemblyHelper::EmitAssembly()
	@ 0x558aa5910f44 clang::EmitBackendOutput()
	@ 0x558aa5906135 clang::BackendConsumer::HandleTranslationUnit()
	@ 0x558aa6d165ad clang::ParseAST()
	@ 0x558aa6a94e22 clang::ASTFrontendAction::ExecuteAction()
	@ 0x558aa590255d clang::CodeGenAction::ExecuteAction()
	@ 0x558aa6a94840 clang::FrontendAction::Execute()
	@ 0x558aa6a38cca clang::CompilerInstance::ExecuteAction()
	@ 0x558aa4e2294b clang::ExecuteCompilerInvocation()
	@ 0x558aa4df6200 cc1_main()
	@ 0x558aa4e1b37f ExecuteCC1Tool()
	@ 0x558aa4e1a725 main
	@ 0x7ff20d56abbd __libc_start_main
	@ 0x558aa4df51c9 _start
	llvm-svn: 355515
*	[CGP] Avoid repeatedly building DominatorTree causing long compile-time (NFC)	Teresa Johnson	2019-03-06	1	-21/+25
	Summary: In r354298 a DominatorTree construction was added via new function combineToUSubWithOverflow, which was subsequently restructured into replaceMathCmpWithIntrinsic in r354689. We are hitting a very long compile time due to this repeated construction, once per math cmp in the function. We shouldn't need to build the DominatorTree more than once per function, except when a transformation invalidates it. There is already a boolean flag that is returned from these methods indicating whether the DT has been modified. We can simply build the DT once per Function walk in CodeGenPrepare::runOnFunction, since any time a change is made we break out of the Function walk and restart it. I modified the code so that both replaceMathCmpWithIntrinsic as well as mergeSExts (which was also building a DT) use the DT constructed by the run method. From -mllvm -time-passes: Before this patch: CodeGen Prepare user time is 328s. With this patch: CodeGen Prepare user time is 21s. Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58995 llvm-svn: 355512
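The restart-on-change structure described above can be sketched outside LLVM. This is a minimal illustration only: `Analysis`, `tryTransform`, `runOnFunction` and the global counter are hypothetical stand-ins, not the CodeGenPrepare code. The expensive analysis is built once per walk, and a walk is abandoned and restarted as soon as a transform reports a change.

```cpp
#include <cassert>
#include <vector>

// Counts how often the (hypothetical) expensive analysis is constructed.
int g_analysisBuilds = 0;

// Hypothetical stand-in for an expensive analysis such as a dominator tree.
struct Analysis {
  explicit Analysis(const std::vector<int> &) { ++g_analysisBuilds; }
};

// Hypothetical transform: turns a negative value into its absolute value
// and reports whether it changed anything.
inline bool tryTransform(int &value, const Analysis &) {
  if (value >= 0)
    return false;
  value = -value;
  return true;
}

// Walks the "function". The analysis is built once per walk rather than
// once per candidate; any change ends the walk and triggers a restart,
// so the analysis is never used after it may have gone stale.
inline int runOnFunction(std::vector<int> &func) {
  int walks = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    ++walks;
    Analysis analysis(func);
    for (int &value : func) {
      if (tryTransform(value, analysis)) {
        changed = true;
        break;
      }
    }
  }
  // Every walk except the last one fixes exactly one element.
  assert(walks <= static_cast<int>(func.size()) + 1);
  return walks;
}
```

With two negative elements the sketch performs three walks and three analysis builds, independent of how many candidates each walk inspects.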
*	Revert "[Remarks] Refactor remark diagnostic emission in a RemarkStreamer"	Francis Visoiu Mistrih	2019-03-06	1	-1/+0
	This reverts commit 2e8c4997a2089f8228c843fd81b148d903472e02. Breaks bots. llvm-svn: 355511
*	[TargetLowering] simplify code for uaddsat/usubsat expansion; NFC	Sanjay Patel	2019-03-06	1	-8/+5
	llvm-svn: 355508
*	[Remarks] Refactor remark diagnostic emission in a RemarkStreamer	Francis Visoiu Mistrih	2019-03-06	1	-0/+1
	This allows us to store more info about where we're emitting the remarks without cluttering LLVMContext. This is needed for future support for the remark section. Differential Revision: https://reviews.llvm.org/D58996 llvm-svn: 355507
*	[DAGCombiner] Add SADDO/SSUBO combine support	Simon Pilgrim	2019-03-06	1	-0/+54
	Basic constant handling folds, for both scalars and vectors. Differential Revision: https://reviews.llvm.org/D58967 llvm-svn: 355506
*	[DAGCombiner] Enable SMULO/UMULO vector combine support (PR40442)	Simon Pilgrim	2019-03-06	1	-2/+2
	Differential Revision: https://reviews.llvm.org/D58968 llvm-svn: 355495
*	[CodeGen] Omit range checks from jump tables when lowering switches with ↵	Ayonam Ray	2019-03-06	2	-53/+32
	unreachable default. During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table: if the switch value lies outside the jump table range, a conditional branch jumps to the default block. If the default block is unreachable, this conditional branch can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Friedman, Roman Lebedev llvm-svn: 355490
*	Reversing the commit of revision 355483 since it is giving a regression on a ↵	Ayonam Ray	2019-03-06	2	-32/+53
	newly added test. llvm-svn: 355487
*	[CodeGen] Omit range checks from jump tables when lowering switches with ↵	Ayonam Ray	2019-03-06	2	-53/+32
	unreachable default. During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table: if the switch value lies outside the jump table range, a conditional branch jumps to the default block. If the default block is unreachable, this conditional branch can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Friedman, Roman Lebedev llvm-svn: 355483
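As a rough source-level picture of what the range check buys and what omitting it assumes, the two dispatch shapes can be sketched as follows. The table contents and function names are hypothetical; the actual change operates on the lowered switch, not on C++ code like this.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical jump table: index is the switch value, entry is the result
// of the corresponding case block.
constexpr std::array<int, 4> kTargets = {10, 20, 30, 40};

// Lowering with a reachable default: compare against the table size first
// and branch to the default block (modelled here as -1) when out of range.
inline int dispatchWithRangeCheck(std::size_t value) {
  if (value >= kTargets.size())
    return -1;
  return kTargets[value];
}

// Lowering with an unreachable default: the compare-and-branch is dropped
// and the table is indexed directly. The assert only documents the
// assumption in debug builds; it is not part of the emitted dispatch.
inline int dispatchUnreachableDefault(std::size_t value) {
  assert(value < kTargets.size() && "default was declared unreachable");
  return kTargets[value];
}
```

The second form is only valid when every value that can reach the switch has a table entry, which is exactly what an unreachable default promises.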
*	Revert "[AtomicExpand] Allow libcall expansion for non-zero address spaces" ↵	Mitch Phillips	2019-03-06	1	-8/+2
	for buildbot failures. llvm-svn: 355461
*	[AtomicExpand] Allow libcall expansion for non-zero address spaces	Philip Reames	2019-03-05	1	-2/+8
	Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355453
*	Revert r355224 "[TableGen][SelectionDAG][X86] Add specific isel matchers for ↵	Craig Topper	2019-03-05	1	-6/+0
	immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary." This caused the first matcher in the isel table for many targets to be Opc_Scope instead of Opc_SwitchOpcode. This leads to a significant increase in isel match failures. llvm-svn: 355433
*	[Subtarget] Merge ProcSched and ProcDesc arrays in MCSubtargetInfo into a ↵	Craig Topper	2019-03-05	1	-2/+2
	single array. These arrays are both keyed by CPU name and go into the same tablegen-generated file. Merge them so we only need to store keys once. This also removes a weird space saving quirk where we used the ProcDesc.size() to build an ArrayRef for ProcSched. Differential Revision: https://reviews.llvm.org/D58939 llvm-svn: 355431
*	[Subtarget] Create a separate SubtargetSubtargetKV struct for ProcDesc to ↵	Craig Topper	2019-03-05	1	-1/+1
	remove fields from the stack tables that aren't needed for CPUs. The description for CPUs was just the CPU name wrapped with "Select the " and " processor". We can just do that directly in the help printer instead of making a separate version in the binary for each CPU. Also remove the Value field that isn't needed and was always 0. Differential Revision: https://reviews.llvm.org/D58938 llvm-svn: 355429
*	[SDAG] move FP constant folding to helper function; NFC	Sanjay Patel	2019-03-05	1	-67/+72
	llvm-svn: 355411
*	[CodeGenPrepare] avoid crashing on non-canonical/degenerate code	Sanjay Patel	2019-03-04	1	-0/+5
	The test is reduced from an example in the post-commit thread for: rL354746 http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html While we must avoid dying here, the real question should be: Why is non-canonical and/or degenerate code making it to CGP when using the new pass manager? llvm-svn: 355345
*	[DAGCombiner][X86][SystemZ][AArch64] Combine some cases of (bitcast ↵	Craig Topper	2019-03-04	1	-6/+9
	(build_vector constants)) between legalize types and legalize dag. This patch enables combining integer bitcasts of integer build vectors when the new scalar type is legal. I've avoided floating point because the implementation bitcasts float to int along the way and we would need to check the intermediate types for legality. Differential Revision: https://reviews.llvm.org/D58884 llvm-svn: 355324
*	[DebugInfo] Construct nested types on behalf of owner CU	Eugene Leviant	2019-03-04	2	-22/+31
	Differential revision: https://reviews.llvm.org/D58786 llvm-svn: 355303
*	[WebAssembly] Delete ThrowUnwindDest map from WasmEHFuncInfo	Heejin Ahn	2019-03-03	3	-25/+2
	Summary: Previously, when we implemented the first EH proposal, a 'catch <tag>' instruction might not catch an exception, so there were multiple EH pads an exception could unwind to. That means a BB could have multiple EH pad successors. Now that we have switched to the new proposal, every 'catch' instruction catches an exception, and there is only one catchpad per catchswitch, so we have at most one EH pad successor, making the `ThrowUnwindDest` map in `WasmEHFuncInfo` unnecessary. Keeping the `ThrowUnwindDest` map in `WasmEHFuncInfo` has its own problems, because other optimization passes can split a BB that contains possibly throwing calls (previously invokes), and we have to update the map every time that happens, which is not easy for common CodeGen passes. This also correctly updates successor info in LateEHPrepare when we add a rethrow instruction. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58486 llvm-svn: 355296
*	Use SDValue::getConstantOperandAPInt helper where possible. NFCI.	Simon Pilgrim	2019-03-02	2	-7/+4
	llvm-svn: 355267
*	[TableGen][SelectionDAG][X86] Add specific isel matchers for ↵	Craig Topper	2019-03-01	1	-0/+6
	immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary. Previously we had build_vector PatFrags that called ISD::isBuildVectorAllZeros/Ones. Internally the ISD::isBuildVectorAllZeros/Ones look through bitcasts, but we aren't able to take advantage of that in isel. Instead we have to canonicalize the types of the all zeros/ones build_vectors and insert bitcasts. Then we have to pattern match those exact bitcasts. By emitting specific matchers for these 2 nodes, we can make isel look through any bitcasts without needing to explicitly match them. We should also be able to remove the canonicalization to vXi32 from lowering, but I've left that for a follow up. This removes something like 40,000 bytes from the X86 isel table. Differential Revision: https://reviews.llvm.org/D58595 llvm-svn: 355224
*	[WebAssembly] Remove uses of ThreadModel	Thomas Lively	2019-02-28	1	-8/+1
	Summary: In the clang UI, replaces -mthread-model posix with -matomics as the source of truth on threading. In the backend, replaces -thread-model=posix with the atomics target feature, which is now collected on the WebAssemblyTargetMachine along with all other used features. These collected features will also be used to emit the target features section in the future. The default configuration for the backend is thread-model=posix and no atomics, which was previously an invalid configuration. This change makes the default valid because the thread model is ignored. A side effect of this change is that objects are never emitted with passive segments. It will instead be up to the linker to decide whether sections should be active or passive based on whether atomics are used in the final link. Reviewers: aheejin, sbc100, dschuff Subscribers: mehdi_amini, jgravelle-google, hiraditya, sunfish, steven_wu, dexonsmith, rupprecht, jfb, jdoerfert, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D58742 llvm-svn: 355112
*	Add support for computing "zext of value" in KnownBits. NFCI	Bjorn Pettersson	2019-02-28	3	-12/+9
	Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly claimed that the operation is equivalent to zero extending the value we're tracking. That has not been true; instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control whether the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099
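The difference between the two behaviours of the extended bits can be shown with a small model. This is a sketch, not the llvm::KnownBits class: `Known`, its fields and the `zext` free function are hypothetical, and widths are limited to fewer than 64 bits so plain `uint64_t` masks suffice.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a known-bits record (hypothetical, not LLVM's).
struct Known {
  unsigned width;  // number of tracked bits
  uint64_t zero;   // mask of bits known to be 0
  uint64_t one;    // mask of bits known to be 1
};

// Widen the record. When extendedBitsAreKnownZero is true the new high bits
// are recorded as known zero (a real zero extension of the tracked value);
// otherwise they stay unknown, which matches the old behaviour described in
// the commit message.
inline Known zext(const Known &k, unsigned newWidth,
                  bool extendedBitsAreKnownZero) {
  assert(newWidth >= k.width && newWidth < 64 && "sketch handles widths < 64");
  Known result{newWidth, k.zero, k.one};
  if (extendedBitsAreKnownZero) {
    const uint64_t oldMask = (uint64_t{1} << k.width) - 1;
    const uint64_t newMask = (uint64_t{1} << newWidth) - 1;
    result.zero |= newMask & ~oldMask;
  }
  return result;
}
```

For an 8-bit record widened to 16 bits, only the call with the flag set adds bits 8 through 15 to the known-zero mask; the known-one mask is unchanged in both cases.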
*	GlobalISel: Implement fewerElementsVector for phi	Matt Arsenault	2019-02-28	1	-8/+85
	llvm-svn: 355048
*	GlobalISel: Implement moreElementsVector for phi	Matt Arsenault	2019-02-28	1	-0/+21
	llvm-svn: 355047
*	Seperate volatility and atomicity/ordering in SelectionDAG	Philip Reames	2019-02-27	1	-18/+18
	At the moment, we mark every atomic memory access as being also volatile. This is unnecessarily conservative and prohibits many legal transforms (DCE, folding, etc.). This patch removes MOVolatile from the MachineMemOperands of atomic, but not volatile, instructions. This should be strictly NFC after a series of previous patches which have gone in to ensure backend code is conservative about handling of isAtomic MMOs. Once it's in and baked for a bit, we'll start working through removing unnecessary bailouts one by one. We applied this same strategy to the middle end a few years ago, with good success. To make sure this patch itself is NFC, it is built on top of a series of other patches which adjust code to (for the moment) be as conservative for an atomic access as for a volatile access and build up a test corpus (mostly in test/CodeGen/X86/atomics-unordered.ll).
	Previously landed:
	D57593 Fix a bug in the definition of isUnordered on MachineMemOperand
	D57596 [CodeGen] Be conservative about atomic accesses as for volatile
	D57802 Be conservative about unordered accesses for the moment
	rL353959: [Tests] First batch of cornercase tests for unordered atomics.
	rL353966: [Tests] RMW folding tests w/unordered atomic operations.
	rL353972: [Tests] More unordered atomic lowering tests.
	rL353989: [SelectionDAG] Inline a single use helper function, and remove last non-MMO interface
	rL354740: [Hexagon, SystemZ] Be super conservative about atomics
	rL354800: [Lanai] Be super conservative about atomics
	rL354845: [ARM] Be super conservative about atomics
	Attention Out of Tree Backend Owners: This patch may break you. If it does, you can use the TLI getMMOFlags hook to restore the MOVolatile to any instruction you need to. (See llvm-dev thread titled "PSA: Changes to how atomics are handled in backends" started Feb 27, 2019.)
	Differential Revision: https://reviews.llvm.org/D57601 llvm-svn: 355025
*	[DebugInfo] Apply subprogram attributes on behalf of owner CU	Eugene Leviant	2019-02-27	2	-2/+3
	When using full LTO it is possible that a template function definition DIE is bound to one compilation unit and its declaration to another. We should add function declaration attributes on behalf of its owner CU, otherwise we may end up with a malformed file identifier in the function declaration's DW_AT_decl_file attribute. Differential revision: https://reviews.llvm.org/D58538 llvm-svn: 354978
*	[MIPS GlobalISel] Select G_UADDO	Petar Avramovic	2019-02-26	1	-0/+12
	Lower G_UADDO. Legalize G_UADDO for MIPS32. Differential Revision: https://reviews.llvm.org/D58671 llvm-svn: 354900
*	[DAG] Fix constant store folding to handle non-byte sizes.	Nirav Dave	2019-02-26	2	-12/+14
	Avoid crashes from zero-byte values due to sub-byte store sizes. Reviewers: uabelho, courbet, rnk Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58626 llvm-svn: 354884
*	[LegalizeDAG] Use APInt::getSplat helper to create bitreverse masks. NFCI.	Simon Pilgrim	2019-02-26	1	-10/+6
	llvm-svn: 354867
*	[LegalizeDAG] Expand SADDO/SSUBO using SADDSAT/SSUBSAT (PR37763)	Simon Pilgrim	2019-02-26	1	-5/+17
	If SADDSAT/SSUBSAT are legal, then we can expand SADDO/SSUBO by performing an ADD/SUB and a SADDSAT/SSUBSAT and then comparing the results. I looked at doing this for UADDO/USUBO as well but as we don't have to do as many range comparisons I didn't see any/much benefit. Differential Revision: https://reviews.llvm.org/D58637 llvm-svn: 354866
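The compare-the-results idea can be checked on 8-bit integers. The sketch below is a scalar model with hypothetical helper names, not the LegalizeDAG code: signed overflow occurred exactly when the wrapping sum differs from the saturating sum, because on overflow the saturated value is the type's minimum or maximum while the wrapped value has the opposite sign.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement wrapping add on 8 bits (models a plain ADD).
inline int8_t wrappingAdd(int8_t a, int8_t b) {
  int wide = int{a} + int{b};
  if (wide > 127)
    wide -= 256;
  else if (wide < -128)
    wide += 256;
  assert(wide >= -128 && wide <= 127);
  return static_cast<int8_t>(wide);
}

// Saturating add on 8 bits (models SADDSAT): clamps to [-128, 127].
inline int8_t saturatingAdd(int8_t a, int8_t b) {
  int wide = int{a} + int{b};
  if (wide > 127)
    wide = 127;
  else if (wide < -128)
    wide = -128;
  return static_cast<int8_t>(wide);
}

struct SAddoResult {
  int8_t sum;     // wrapped result, as SADDO returns
  bool overflow;  // true if the signed add overflowed
};

// SADDO expanded as: sum = ADD, sat = SADDSAT, overflow = (sum != sat).
inline SAddoResult saddo(int8_t a, int8_t b) {
  const int8_t sum = wrappingAdd(a, b);
  const int8_t sat = saturatingAdd(a, b);
  return {sum, sum != sat};
}
```

The subtraction case follows the same pattern with a wrapping subtract and a saturating subtract.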
*	[CodeView] Emit HasConstructorOrDestructor class option for non-trivial ↵	Aaron Smith	2019-02-26	1	-4/+12
	constructors. Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov Reviewed By: zturner, rnk Subscribers: jdoerfert, majnemer, asmith Tags: #llvm Differential Revision: https://reviews.llvm.org/D44406 llvm-svn: 354841
*	RegBankSelect: Handle slightly more complex value mappings	Matt Arsenault	2019-02-25	1	-12/+16
	Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU. llvm-svn: 354828
*	RegisterScavenger: Allow fail without spill	Matt Arsenault	2019-02-25	1	-15/+23
	AMDGPU wants to use this in some contexts where the spilling is either impossible, or a worse alternative to doing something else. llvm-svn: 354816
*	Fix a sign compare warning breaking the -Werror build.	Andrea Di Biagio	2019-02-25	1	-1/+1
	The warning was introduced at r354793. llvm-svn: 354810
*	[SelectionDAG] Add demanded elts variants to isConstOrConstSplat helpers. NFCI.	Simon Pilgrim	2019-02-25	1	-37/+74
	These helpers extend the existing isConstOrConstSplat helper checks to support DemandedElts masks as well. We already had a local version of this in SelectionDAG that computeKnownBits/ComputeNumSignBits made use of, but this adds the functionality directly to the BuildVectorSDNode node and extends isConstOrConstSplat etc. to use that. This will allow us to reuse the functionality in SimplifyDemandedVectorElts/SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D58503 llvm-svn: 354797
*	[DAGCombine] Add undef shuffle elt support to partitionShuffleOfConcats	Simon Pilgrim	2019-02-25	1	-28/+29
	Support undef shuffle mask indices in the shuffle(concat_vectors, concat_vectors) -> concat_vectors fold. Differential Revision: https://reviews.llvm.org/D58585 llvm-svn: 354793
*	[SelectionDAG] Add a OPC_CheckChild2CondCode to SelectionDAGISel to remove a ↵	Craig Topper	2019-02-25	1	-0/+14
	MoveChild and MoveParent pair. OPC_CheckCondCode is always used as operand 2 of a setcc, and it's always surrounded by a MoveChild2 and a MoveParent. By having a dedicated opcode for this case we can reduce the number of bytes needed for this pattern from 4 bytes to 2. This saves ~3000 bytes in the X86 table. llvm-svn: 354763
*	[LegalizeTypes][AArch64][X86] Make type legalization of vector ↵	Craig Topper	2019-02-24	2	-7/+17
	(S/U)ADD/SUB/MULO follow getSetCCResultType for the overflow bits. Make UnrollVectorOverflowOp properly convert from scalar boolean contents to vector boolean contents. Summary: When promoting the overflow vector for these ops we should use the target's desired setcc result type. This way a v8i32 result type will use a v8i32 overflow vector instead of a v8i16 overflow vector. A v8i16 overflow vector will cause LegalizeDAG/LegalizeVectorOps to have to use v8i32 and truncate to v8i16 in its expansion. By doing this in type legalization instead, we get the truncate into the DAG earlier and give DAG combine more of a chance to optimize it. We also have to fix unrolling to use the scalar setcc result type for the scalarized operation, and convert it to the required vector element type after the scalar operation. We have to observe the vector boolean contents when doing this conversion. The previous code was just taking the scalar result and putting it in the vector. But for X86 and AArch64 that would have only put the boolean value in bit 0 of the element and left all other bits in the element 0. We need to ensure all bits in the element are the same. I'm using a select with constants here because that's what setcc unrolling in LegalizeVectorOps used. Reviewers: spatel, RKSimon, nikic Reviewed By: nikic Subscribers: javed.absar, kristof.beyls, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58567 llvm-svn: 354753
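The boolean-contents conversion described above can be modelled on a four-lane unsigned add. This is a sketch under the assumption of an all-bits-equal vector boolean convention (each overflow lane is either all zeros or all ones); the struct and function names are hypothetical and it is not the UnrollVectorOverflowOp code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct UAddoLanes {
  std::array<uint32_t, 4> sum;       // per-lane wrapped sums
  std::array<uint32_t, 4> overflow;  // per-lane mask: 0 or 0xFFFFFFFF
};

// Unroll a vector unsigned add-with-overflow lane by lane. The scalar
// overflow flag is a single bit; storing it directly would set only bit 0
// of the lane. A select between two constants widens it so that every bit
// of the lane agrees.
inline UAddoLanes unrollUAddo(const std::array<uint32_t, 4> &a,
                              const std::array<uint32_t, 4> &b) {
  UAddoLanes result{};
  for (std::size_t i = 0; i < a.size(); ++i) {
    result.sum[i] = a[i] + b[i];  // unsigned arithmetic wraps by definition
    const bool scalarOverflow = result.sum[i] < a[i];
    result.overflow[i] = scalarOverflow ? 0xFFFFFFFFu : 0u;
    assert(result.overflow[i] == 0u || result.overflow[i] == 0xFFFFFFFFu);
  }
  return result;
}
```

A consumer that treats the overflow vector as a blend mask then sees a well-formed mask in every lane instead of the value 1.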
*	[CGP] add special-cases to form unsigned add with overflow (PR40486)	Sanjay Patel	2019-02-24	1	-8/+27
	There's likely a missed IR canonicalization for at least 1 of these patterns. Otherwise, we wouldn't have needed the pattern-matching enhancement in D57516. Note that -- unlike usubo added with D57789 -- the TLI hook for this transform defaults to 'on'. So if there's any perf fallout from this, targets should look at how they're lowering the uaddo node in SDAG and/or override that hook. The x86 diffs suggest that there's some missing pattern-matching for forming inc/dec. This should fix the remaining known problems in: https://bugs.llvm.org/show_bug.cgi?id=40486 https://bugs.llvm.org/show_bug.cgi?id=31754 llvm-svn: 354746
*	[TwoAddressInstructionPass] After commuting an instruction and before trying ↵	Craig Topper	2019-02-23	1	-0/+5
	to look for more commutable operands, resample the number of operands. The new instruction might have fewer operands than the original instruction. If we don't resample, the next loop iteration might read an operand that doesn't exist. X86 can commute blends to movss/movsd which reduces from 4 operands to 3. This happened in the test case that caused r354363 & company to be reverted. A reduced version of that has been committed here. Really this whole checking for more commutable operands is a little fragile. It assumes that the new instruction's operands are in the same order and positions as the original except for the pair that was swapped. I don't know of anything that breaks this assumption today, but I've left a fixme. Fixing this will likely require an interface change. llvm-svn: 354738
*	Recommit r354647 and r354648 "[LegalizeTypes] When promoting the result of ↵	Craig Topper	2019-02-23	1	-3/+7
	EXTRACT_SUBVECTOR, also check if the input needs to be promoted. Use that to determine the element type to extract" r354648 was a follow up to fix a regression "[X86] Add a DAG combine for (aext_vector_inreg (aext_vector_inreg X)) -> (aext_vector_inreg X) to fix a regression from my previous commit." These were reverted in r354713 as their context depended on other patches that were reverted for a bug. llvm-svn: 354734
*	[NFC] Fix typos: preceeding -> preceding	Jordan Rupprecht	2019-02-23	2	-5/+5
	llvm-svn: 354715
*	Revert r354363 & co "[X86][SSE] Generalize X86ISD::BLENDI support to more ↵	Reid Kleckner	2019-02-23	1	-7/+3
	value types" r354363 caused https://crbug.com/934963#c1, which has a plain C reduced test case. I also had to revert some dependent changes: - r354648 - r354647 - r354640 - r354511 llvm-svn: 354713
*	[LegalizeTypes] Use PromoteTargetBoolean in PromoteIntOp_ADDSUBCARRY instead ↵	Craig Topper	2019-02-23	1	-13/+1
	of reimplementing it. NFCI llvm-svn: 354710
*	Restore ability for C++ API users to Enable IPRA.	Daniel Sanders	2019-02-22	1	-1/+1
	Summary: Prior to r310876 one of our out-of-tree targets was enabling IPRA by modifying the TargetOptions::EnableIPRA. This no longer works on current trunk since the useIPRA() hook overrides any values that are set in advance. This patch adjusts the behaviour of the hook so that API users and useIPRA() can both enable it but useIPRA() cannot disable it if the API user already enabled it. Reviewers: arsenm Reviewed By: arsenm Subscribers: wdng, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D38043 llvm-svn: 354692
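The "either side can enable, the hook cannot disable" rule amounts to combining the two sources with a logical OR rather than an assignment. A minimal model, with a hypothetical options struct and function name and without any command-line handling, might look like this:

```cpp
#include <cassert>

// Hypothetical stand-in for the options object an API user can modify.
struct OptionsSketch {
  bool EnableIPRA = false;
};

// Merge the target hook's preference into the options. Using |= instead of
// = means a value the API user set in advance survives a hook that
// returns false.
inline void applyTargetHook(OptionsSketch &options, bool hookWantsIPRA) {
  const bool userEnabled = options.EnableIPRA;
  options.EnableIPRA |= hookWantsIPRA;
  assert((!userEnabled || options.EnableIPRA) && "hook must not disable IPRA");
}
```

With plain assignment, the case "user enabled, hook returns false" would end up disabled, which is the behaviour the commit message describes as broken.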
*	[CGP] move overflow intrinsic insertion to common location; NFCI	Sanjay Patel	2019-02-22	1	-17/+28
	We need to enhance the uaddo matching to handle special-cases as seen in PR40486 and PR31754. That means we won't necessarily have a def-use pattern, so we'll need to check dominance to determine where to place the intrinsic (as we already do for usubo). This preliminary patch is just rearranging the code, so the planned follow-up to improve uaddo will be more clear. llvm-svn: 354689