bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Merging r341642:	Hans Wennborg	2018-09-10	2	-0/+121
	r341642 | tnorthover | 2018-09-07 11:21:25 +0200 (Fri, 07 Sep 2018) | 8 lines
	ARM: fix Thumb2 CodeGen for ldrex with folded frame-index.
	Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a proper addressing-mode and tells the rewriter about it so that encodable offsets are exploited and others are rejected. Should fix PR38828.
	llvm-svn: 341783
*	Merging r339225:	Hans Wennborg	2018-08-13	2	-122/+307
	r339225 | thopre | 2018-08-08 11:35:26 +0200 (Wed, 08 Aug 2018) | 11 lines
	Support inline asm with multiple 64bit output in 32bit GPR
	Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (e.g. double in a 32-bit GPR as in the PR).
	Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma
	Reviewed By: efriedma
	Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits
	Differential Revision: https://reviews.llvm.org/D45437
	llvm-svn: 339539
*	Revert r338354 "[ARM] Revert r337821"	Reid Kleckner	2018-07-31	3	-11/+11
	Disable ARMCodeGenPrepare by default again. It is causing verifier failures in V8 that look like:
	  Duplicate integer as switch case
	  switch i32 %trunc, label %if.end13 [
	    i32 0, label %cleanup36
	    i32 0, label %if.then8
	  ], !dbg !4981
	  i32 0
	  fatal error: error in backend: Broken function found, compilation aborted!
	I will continue reducing the test case and send it along.
	llvm-svn: 338452
*	[ARM] Revert r337821	Sam Parker	2018-07-31	3	-11/+11
	Re-enabling ARMCodeGenPrepare by default after failing to reproduce the bootstrap issues that I was concerned it was causing. llvm-svn: 338354
*	Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"	Thomas Preud'homme	2018-07-30	1	-0/+80
	This reapplies commit r338206, reverted by r338214, since the bug that r338206 uncovered has been fixed in r338268. Add support for inline assembly with a matching input operand that does not naturally go in the register class it is constrained to (e.g. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338269
*	Fix uninitialized read in ARM's PrintAsmOperand	Thomas Preud'homme	2018-07-30	1	-0/+8
	Summary: Fix read of uninitialized RC variable in ARM's PrintAsmOperand when hasRegClassConstraint returns false. This was causing inline-asm-operand-implicit-cast test to fail in r338206. Reviewers: t.p.northover, weimingz, javed.absar, chill Reviewed By: chill Subscribers: chill, eraman, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49984 llvm-svn: 338268
*	[ARM] Fix over-alignment in arguments that are HA of 128-bit vectors	Petr Pavlu	2018-07-30	1	-0/+16
	Code in `CC_ARM_AAPCS_Custom_Aggregate()` is responsible for handling homogeneous aggregates for `CC_ARM_AAPCS_VFP`. When an aggregate ends up fully on stack, the function tries to pack all resulting items of the aggregate as tightly as possible according to AAPCS. Once the first item was laid out, the alignment used for consecutive items was the size of one item. This logic went wrong for 128-bit vectors because their alignment is normally only 64 bits, and so could result in inserting unexpected padding between the first and second element. The patch fixes the problem by updating the alignment with the item size only if this results in reducing it. Differential Revision: https://reviews.llvm.org/D49720 llvm-svn: 338233
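The padding problem described in this commit message can be illustrated with a small model. This is a hypothetical sketch, not the code in `CC_ARM_AAPCS_Custom_Aggregate()`: the helper names `align_up` and `layout_items`, the `clamp` switch, and the starting stack offset of 8 are all assumptions chosen for illustration.

```python
def align_up(offset, align):
    """Round offset up to the next multiple of align (a power of two)."""
    return (offset + align - 1) & ~(align - 1)


def layout_items(start, count, size, align, clamp):
    """Return the stack offsets of `count` items of `size` bytes each.

    The first item uses the type's own alignment. For later items the
    alignment is replaced by the item size; with clamp=True it is only
    replaced when that makes it smaller, as the commit message describes.
    """
    offsets = []
    offset = start
    for i in range(count):
        if i > 0:
            align = min(align, size) if clamp else size
        offset = align_up(offset, align)
        offsets.append(offset)
        offset += size
    return offsets


# Two 128-bit (16-byte) vectors whose alignment is only 64 bits (8 bytes),
# placed on a stack whose next free slot is assumed to be at offset 8.
old_offsets = layout_items(start=8, count=2, size=16, align=8, clamp=False)
new_offsets = layout_items(start=8, count=2, size=16, align=8, clamp=True)
print(old_offsets, new_offsets)
```

In this model the unclamped layout leaves an 8-byte hole between the two vectors, while the clamped layout packs them back to back.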
*	revert r338206 because the test does not pass	Sanjay Patel	2018-07-29	1	-80/+0
	Example of bot failure: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll llvm-svn: 338214
*	Fix crash on inline asm with 64bit matching input in 32bit GPR	Thomas Preud'homme	2018-07-28	1	-0/+80
	Add support for inline assembly with a matching input operand that does not naturally go in the register class it is constrained to (e.g. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338206
*	[DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)	Craig Topper	2018-07-28	1	-2/+2
	This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181
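The identity behind this fold can be checked with a short model. This is only a sketch, assuming 32-bit wrapping integer arithmetic; the function names `sub_of_sub` and `add_of_sub` are hypothetical and do not appear in the DAG combiner.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wrapping arithmetic


def sub_of_sub(a, b, c):
    """A - (B - C), with every intermediate result wrapped to 32 bits."""
    return (a - ((b - c) & MASK32)) & MASK32


def add_of_sub(a, b, c):
    """A + (C - B), with every intermediate result wrapped to 32 bits."""
    return (a + ((c - b) & MASK32)) & MASK32


# Include boundary values so that wrap-around is exercised.
samples = [0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
all_equal = all(
    sub_of_sub(a, b, c) == add_of_sub(a, b, c)
    for a in samples
    for b in samples
    for c in samples
)
print(all_equal)
```

Because both forms agree modulo 2^32, the rewritten expression can take advantage of the commutativity of the outer addition without changing the result.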
*	[ARM] Add new target feature to fuse literal generation	Evandro Menezes	2018-07-27	1	-0/+39
	This feature enables the fusion of such operations on Cortex A57 and Cortex A72, as recommended in their Software Optimisation Guides, sections 4.14 and 4.11, respectively. Differential revision: https://reviews.llvm.org/D49563 llvm-svn: 338147
*	Fix PR34170: Crash on inline asm with 64bit output in 32bit GPR	Thomas Preud'homme	2018-07-25	1	-0/+42
	Add support for inline assembly with an output operand that does not naturally go in the register class it is constrained to (e.g. double in a 32-bit GPR as in the PR). llvm-svn: 337903
*	[ARM] Disable ARMCodeGenPrepare by default	Sam Parker	2018-07-24	3	-11/+11
	ARM Stage 2 builders have been suspiciously broken since the pass was committed. Disabling to hopefully fix the bots and give me time to debug. llvm-svn: 337821
*	[ARM] ARMCodeGenPrepare backend pass	Sam Parker	2018-07-23	3	-0/+905
	Arm specific codegen prepare is implemented to perform type promotion on icmp operands, which can enable the removal of uxtb and uxth (unsigned extend) instructions. This is possible because performing type promotion before ISel alleviates this duty from the DAG builder which has to perform legalisation, but has a limited view on data ranges. The pass visits any instruction operand of an icmp and creates a worklist to traverse the use-def tree to determine whether the values can simply be promoted. Our concern is values in the registers overflowing the narrow (i8, i16) data range, so instructions marked with nuw can be promoted easily. For add and sub instructions, we are able to use the parallel dsp instructions to operate on scalar data types and avoid overflowing bits. Underflowing adds and subs are also permitted when the result is only used by an unsigned icmp. Differential Revision: https://reviews.llvm.org/D48832 llvm-svn: 337687
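The overflow concern in this commit message can be made concrete with a small model. This is an illustrative sketch only, not the pass's logic: the function names and the choice of an unsigned less-than compare are assumptions.

```python
def icmp_ult_i8(a, b, limit):
    """Unsigned compare on an i8 add: the sum wraps modulo 256."""
    return ((a + b) & 0xFF) < limit


def icmp_ult_promoted(a, b, limit):
    """Same compare with the add naively widened to 32 bits: no wrap at 8 bits."""
    return ((a + b) & 0xFFFFFFFF) < limit


# No overflow of the narrow range: both forms agree, so promotion is safe.
agree_without_overflow = icmp_ult_i8(10, 20, 50) == icmp_ult_promoted(10, 20, 50)

# 200 + 100 = 300 wraps to 44 in i8, but stays 300 once promoted.
narrow_result = icmp_ult_i8(200, 100, 50)
promoted_result = icmp_ult_promoted(200, 100, 50)
print(agree_without_overflow, narrow_result, promoted_result)
```

When the add is known not to wrap in the narrow type, which is what the nuw flag asserts, the two compares always agree; this is why such instructions are the easy case for promotion.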
*	[ARM] Add new feature to enable optimizing the VFP registers	Evandro Menezes	2018-07-20	1	-16/+10
	Enable the optimization of operations on DPR and SPR via a feature instead of checking the target. Differential revision: https://reviews.llvm.org/D49463 llvm-svn: 337575
*	ARM: switch armv7em MachO triple to hard-float defaults and libcalls.	Tim Northover	2018-07-19	2	-1/+38
	We were emitting incorrect calls to libm functions that LLVM had decided it knew about because the default is soft-float. Recommitted without breaking ELF this time. llvm-svn: 337450
*	Revert "ARM: switch armv7em triple to hard-float defaults and libcalls."	Tim Northover	2018-07-18	2	-37/+1
	This reverts commit r337385 until it can be targeted at MachO only. llvm-svn: 337424
*	ARM: switch armv7em triple to hard-float defaults and libcalls.	Tim Northover	2018-07-18	2	-1/+37
	We were emitting incorrect calls to libm functions that LLVM had decided it knew about because the default is soft-float. llvm-svn: 337385
*	[DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELT	Simon Pilgrim	2018-07-17	1	-2/+0
	If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops. Differential Revision: https://reviews.llvm.org/D49262 llvm-svn: 337258
*	[ARM] Regenerated arg endian test	Simon Pilgrim	2018-07-13	1	-48/+224
	As requested on D49262 llvm-svn: 336980
*	[FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests	Joel E. Denny	2018-07-11	1	-3/+3
	See https://reviews.llvm.org/D47106 for details. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D47171 This commit drops that patch's changes to: llvm/test/CodeGen/NVPTX/f16x2-instructions.ll llvm/test/CodeGen/NVPTX/param-load-store.ll For some reason, the DOS line endings there prevent me from committing via the monorepo. A follow-up commit (not via the monorepo) will finish the patch. llvm-svn: 336843
*	[ARM] ParallelDSP: multiple reduction stmts in loop	Sjoerd Meijer	2018-07-11	1	-1/+76
	This fixes an issue that we were not properly supporting multiple reduction stmts in a loop, and not generating SMLADs for these cases. The alias analysis checks were done too early, making it too conservative. Differential revision: https://reviews.llvm.org/D49125 llvm-svn: 336795
*	[ARM] Treat cmn immediates as legal in isLegalICmpImmediate.	Eli Friedman	2018-07-10	2	-5/+5
	The original code attempted to do this, but the std::abs() call didn't actually do anything due to implicit type conversions. Fix the type conversions, and perform the correct check for negative immediates. This probably has very little practical impact, but it's worth fixing just to avoid confusion in the future, I think. Differential Revision: https://reviews.llvm.org/D48907 llvm-svn: 336742
*	Revert 336426 (and follow-ups 428, 440), it very likely caused PR38084.	Nico Weber	2018-07-06	1	-98/+0
	llvm-svn: 336453
*	[ARM] ParallelDSP: added statistics, NFC.	Sjoerd Meijer	2018-07-06	13	-17/+18
	Added statistics for the number of SMLAD instructions created, and also renamed the pass name to -arm-parallel-dsp. Differential Revision: https://reviews.llvm.org/D48971 llvm-svn: 336441
*	Commit rL336426 cause buildbot failures	Diogo N. Sampaio	2018-07-06	1	-3/+3
	http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/50537/testReport/junit/LLVM/CodeGen_AArch64/FoldRedundantShiftedMasking_ll/ This removes the comments of the function label causing this error. llvm-svn: 336440
*	[SelectionDAG] https://reviews.llvm.org/D48278	Diogo N. Sampaio	2018-07-06	1	-0/+98
	D48278 Allow to reduce redundant shift masks. For example:
	  x1 = x & 0xAB00
	  x2 = (x >> 8) & 0xAB
	can be reduced to:
	  x1 = x & 0xAB00
	  x2 = x1 >> 8
	It only allows folding when the masks and shift values are constants.
	llvm-svn: 336426
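The equivalence claimed in this commit message can be checked exhaustively for 16-bit inputs. This is a minimal sketch, with the hypothetical function names `original` and `reduced` standing in for the two forms; it models the arithmetic only, not the SelectionDAG fold.

```python
def original(x):
    """Both values computed independently from x."""
    x1 = x & 0xAB00
    x2 = (x >> 8) & 0xAB
    return x1, x2


def reduced(x):
    """x2 reuses the already masked x1, saving one mask operation."""
    x1 = x & 0xAB00
    x2 = x1 >> 8
    return x1, x2


# 0xAB00 is exactly 0xAB shifted left by 8, so the two forms must agree.
forms_agree = all(original(x) == reduced(x) for x in range(1 << 16))
print(forms_agree)
```

The fold is valid here because the second mask equals the first mask shifted right by the same amount as the value.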
*	[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses	Ivan A. Kosarev	2018-07-05	1	-0/+43
	Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325
*	Partial revert of "NFC - Various typo fixes in tests"	Mikael Holmen	2018-07-05	1	-10/+11
	This partially reverts r336268 since it causes buildbot failures. Added FIXME at the places where the CHECKs are misspelled. llvm-svn: 336323
*	[ARM] ParallelDSP: only support i16 loads for now	Sjoerd Meijer	2018-07-05	1	-1/+46
	We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
*	NFC - Various typo fixes in tests	Gabor Buella	2018-07-04	6	-19/+19
	llvm-svn: 336268
*	[ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m.	Vadzim Dambrouski	2018-07-02	1	-0/+5
	Reviewers: efriedma, rogfer01, javed.absar Reviewed By: efriedma, rogfer01 Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D48846 llvm-svn: 336144
*	[ARM] Parallel DSP Pass	Sjoerd Meijer	2018-06-28	13	-0/+677
	Armv6 introduced instructions to perform 32-bit SIMD operations. The purpose of this pass is to do some straightforward IR pattern matching to create ACLE DSP intrinsics, which map on these 32-bit SIMD operations. Currently, only the SMLAD instruction gets recognised. This instruction performs two multiplications with 16-bit operands, and stores the result in an accumulator. We will follow this up with patches to recognise SMLAD in more cases, and also to generate other DSP instructions (like e.g. SADD16). Patch by: Sam Parker and Sjoerd Meijer Differential Revision: https://reviews.llvm.org/D48128 llvm-svn: 335850
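The arithmetic performed by SMLAD, as described in this commit message, can be modelled in a few lines. This is a sketch of the value computation only: the helper names `to_signed`, `pack`, and `smlad` are hypothetical, and the model ignores the instruction's overflow (Q) flag.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pack(lo, hi):
    """Pack two signed 16-bit values into one 32-bit register image."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def smlad(rn, rm, ra):
    """Accumulate the two signed 16x16 products of the halves of rn and rm."""
    lo = to_signed(rn, 16) * to_signed(rm, 16)
    hi = to_signed(rn >> 16, 16) * to_signed(rm >> 16, 16)
    return (ra + lo + hi) & 0xFFFFFFFF


# 10 + 2*4 + 3*5
positive_case = smlad(pack(2, 3), pack(4, 5), 10)
# 10 + (-2)*4 + 3*5
mixed_sign_case = smlad(pack(-2, 3), pack(4, 5), 10)
print(positive_case, mixed_sign_case)
```

A loop that sums products of adjacent 16-bit elements has this shape, which is the kind of pattern the pass is described as matching.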
*	[NEON] Support vldNq intrinsics in AArch32 (LLVM part)	Ivan A. Kosarev	2018-06-27	1	-0/+234
	This patch adds support for the q versions of the dup (load-to-all-lanes) NEON intrinsics, such as vld2q_dup_f16() for example.
	Currently, non-q versions of the dup intrinsics are implemented in clang by generating IR that first loads the elements of the structure into the first lane with the lane (to-single-lane) intrinsics, and then propagating it to other lanes. There are at least two problems with this approach.
	First, there are no double-spaced to-single-lane byte-element instructions. For example, there is no such instruction as 'vld2.8 { d0[0], d2[0] }, [r0]'. That means we cannot rely on the to-single-lane intrinsics and instructions to implement the q versions of the dup intrinsics. Note that to-all-lanes instructions do support all sizes of data items, including bytes.
	The second problem with the current approach is that we need a separate vdup instruction to propagate the structure to each lane. So for vld4q_dup_f16() we would need four vdup instructions in addition to the initial vld instruction.
	This patch introduces dup LLVM intrinsics and reworks handling of the currently supported (non-q) NEON dup intrinsics to expand them into those LLVM intrinsics, thus eliminating the need for using to-single-lane intrinsics and instructions.
	Additionally, this patch adds support for u64 and s64 dup NEON intrinsics. These are marked as AArch64-only in the ARM NEON Reference, but it seems there are no reasons to not support them in AArch32 mode. Please correct, if that is wrong.
	That's what we generate with this patch applied:
	  vld2q_dup_f16:
	    vld2.16 {d0[], d2[]}, [r0]
	    vld2.16 {d1[], d3[]}, [r0]
	  vld3q_dup_f16:
	    vld3.16 {d0[], d2[], d4[]}, [r0]
	    vld3.16 {d1[], d3[], d5[]}, [r0]
	  vld4q_dup_f16:
	    vld4.16 {d0[], d2[], d4[], d6[]}, [r0]
	    vld4.16 {d1[], d3[], d5[], d7[]}, [r0]
	Differential Revision: https://reviews.llvm.org/D48439
	llvm-svn: 335733
*	[X86,ARM] Retain split-stack prolog check for sibling calls	Than McIntosh	2018-06-26	1	-0/+18
	Summary: If a routine with no stack frame makes a sibling call, we need to preserve the stack space check even if the local stack frame is empty, since the call target could be a "no-split" function (in which case the linker needs to be able to fix up the prolog sequence in order to switch to a larger stack). This fixes PR37807. Reviewers: cherry, javed.absar Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D48444 llvm-svn: 335604
*	Recommit r335333 "[MC] - Add .stack_size sections into groups and link them with .text"	George Rimar	2018-06-22	1	-2/+2
	With compilation fix. Original commit message: D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does the following two things on top: 1) Consider the case where the -ffunction-sections flag is given and there are text sections in COMDATs. The patch adds a '.stack-size' section into the corresponding COMDAT group, so that the linker will be able to eliminate them quickly while resolving the COMDATs. 2) The patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that, the linker will be able to do -gc-sections on dead stack size sections. Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335336
*	Revert r335332 "[MC] - Add .stack_size sections into groups and link them with .text"	George Rimar	2018-06-22	1	-2/+2
	It broke bots. http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443 http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551 llvm-svn: 335333
*	[MC] - Add .stack_size sections into groups and link them with .text	George Rimar	2018-06-22	1	-2/+2
	D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does the following two things on top: 1) Consider the case where the -ffunction-sections flag is given and there are text sections in COMDATs. The patch adds a '.stack-size' section into the corresponding COMDAT group, so that the linker will be able to eliminate them quickly while resolving the COMDATs. 2) The patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that, the linker will be able to do -gc-sections on dead stack size sections. Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335332
*	Recommit of r335326, with the test fixed that I missed.	Sjoerd Meijer	2018-06-22	1	-10/+9
	llvm-svn: 335331
*	Reverting r335326 while I look at the test failure	Sjoerd Meijer	2018-06-22	1	-9/+10
	llvm-svn: 335328
*	[ARM] ARMv6m and v8m.baseline strict align	Sjoerd Meijer	2018-06-22	1	-10/+9
	This sets target feature FeatureStrictAlign for Armv6-m and Armv8-m.baseline, because they have no support for unaligned accesses. It looks like we always pass target feature "+strict-align" from Clang, so this is not a user facing problem, but querying the subtarget (in e.g. llc) for unaligned access support is incorrect. Differential Revision: https://reviews.llvm.org/D48437 llvm-svn: 335326
*	[ARM] Enable useAA() for the in-order Cortex-R52	David Green	2018-06-21	1	-0/+26
	This option allows codegen (such as DAGCombine or MI scheduling) to use alias analysis information, which can help with the codegen on in-order CPUs, especially machine scheduling. Here I have done things the same way as AArch64, adding a subtarget feature to enable this for specific cores, and enabled it for the R52 where we have a schedule to make use of it. Differential Revision: https://reviews.llvm.org/D48074 llvm-svn: 335249
*	[NFC][ARM] ldrd/strd negative tests	Sam Parker	2018-06-21	1	-0/+52
	Add negative tests for loads and stores of alignment 2. llvm-svn: 335241
*	[DAGCombine] Fix alignment for offset loads/stores	David Green	2018-06-21	1	-0/+25
	The alignment parameter to getExtLoad is treated as a base alignment, not the alignment of the load (base + offset). When we infer a better alignment for a Ptr we need to ensure that it applies to the base to prevent the alignment on the load from being wrong. This fixes a bug where the alignment could then be used to incorrectly prove noalias between a load and a store, leading to a miscompile. Differential Revision: https://reviews.llvm.org/D48029 llvm-svn: 335210
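The difference between a base alignment and the alignment of an access at base + offset can be sketched as follows. This is a generic illustration of the arithmetic, not the DAGCombine code; the function name `alignment_of` is hypothetical.

```python
def alignment_of(base_align, offset):
    """Largest power of two guaranteed to divide base + offset.

    Assumes base is a multiple of base_align, and that base_align is a
    power of two.
    """
    if offset == 0:
        return base_align
    lowest_set_bit = offset & -offset  # largest power of two dividing offset
    return min(base_align, lowest_set_bit)


# An 8-byte aligned base accessed at offset 4 is only 4-byte aligned.
at_offset_4 = alignment_of(8, 4)
# The same base accessed at offset 16 keeps its 8-byte alignment.
at_offset_16 = alignment_of(8, 16)
print(at_offset_4, at_offset_16)
```

Treating an alignment inferred for the full pointer as if it were the base alignment, or the reverse, overstates what is known about one of the two addresses, which is the kind of mismatch the commit message says led to an incorrect noalias conclusion.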
*	Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.	Alina Sbirlea	2018-06-20	1	-1/+1
	Summary: Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
	1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
	2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor
	Prior to the patch:
	1. MergeBasicBlockIntoOnlyPred
	   Updates either DomTree or DeferredDominance
	   Moves all instructions from Pred to BB, deletes Pred
	   Asserts BB has single predecessor
	   If address was taken, replace the block address with constant 1 (?)
	2. MergeBlockIntoPredecessor
	   Updates DomTree, LoopInfo and MemoryDependenceResults
	   Moves all instruction from BB to Pred, deletes BB
	   Returns if doesn't have a single predecessor
	   Returns if BB's address was taken
	After the patch:
	Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
	   Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
	   Moves all instruction from BB to Pred, deletes BB
	   Returns if doesn't have a single predecessor
	   Returns if BB's address was taken
	Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:
	1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
	   Updated in this patch. No challenges.
	2. lib/CodeGen/CodeGenPrepare.cpp
	   Updated in this patch.
	   i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
	   ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
	   Some interesting aspects:
	   - Since Pred is not deleted (BB is), the entry block does not need updating.
	   - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
	   - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
	   - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
	   - adding some test owner as subscribers for the interesting tests modified:
	     test/CodeGen/X86/avx-cmp.ll
	     test/CodeGen/AMDGPU/nested-loop-conditions.ll
	     test/CodeGen/AMDGPU/si-annotate-cf.ll
	     test/CodeGen/X86/hoist-spill.ll
	     test/CodeGen/X86/2006-11-17-IllegalMove.ll
	3. lib/Transforms/Scalar/JumpThreading.cpp
	   Not covered in this patch. It is the only use case using the DeferredDominance. I would defer to Brian Rzycki to make this replacement.
	Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar
	Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits
	Differential Revision: https://reviews.llvm.org/D48202
	llvm-svn: 335183
*	ARM: convert ORR instructions to ADD where possible on Thumb.	Tim Northover	2018-06-20	2	-1/+42
	Thumb has more 16-bit encoding space dedicated to ADD than ORR, allowing both a 3-address encoding and a wider range of immediates. So, particularly when optimizing for code size (but it doesn't make things worse elsewhere) it's beneficial to select an OR operation to an ADD if we know overflow won't occur. This is made even better by LLVM's penchant for putting operations in canonical form by converting the other way. llvm-svn: 335119
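The condition under which an OR can be selected as an ADD can be checked with a short model. This is a sketch under one stated assumption: "overflow won't occur" is taken to mean that the two operands have no set bits in common, so the addition produces no carries. The function name `or_equals_add` is hypothetical.

```python
def or_equals_add(a, b):
    """True when a | b and a + b produce the same value."""
    return (a | b) == (a + b)


# a + b == (a | b) + (a & b), so the two agree exactly when a & b == 0.
matches_disjoint_bits = all(
    or_equals_add(a, b) == ((a & b) == 0)
    for a in range(256)
    for b in range(256)
)

# Typical case: OR-ing a small offset into an aligned value.
aligned_case = or_equals_add(0x1000, 0x4)
# Counter-example: a shared bit makes the addition carry.
shared_bit_case = or_equals_add(1, 1)
print(matches_disjoint_bits, aligned_case, shared_bit_case)
```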
*	[ARM] Add Thumb1 coverage for cmn testcases.	Eli Friedman	2018-06-19	1	-6/+45
	There's a missed optimization for immediates: we can save two instructions by using adds instead of movs+mvns+cmp. llvm-svn: 335002
*	[ARM] Testcase for missed optimization with i16 compare.	Eli Friedman	2018-06-19	1	-0/+21
	The result looks weird because the DAG actually has an explicit shift; I haven't figured out why, exactly. llvm-svn: 335000
*	easing the constraint for isNegatibleForFree and GetNegatedExpression	Michael Berg	2018-06-14	1	-7/+5
	Summary: Here we relax the old constraint which utilized unsafe with the TargetOption flag HonorSignDependentRoundingFPMathOption, with the assertion that unsafe is no longer needed or never was required for correctness on FDIV/FMUL. Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar Reviewed By: spatel Subscribers: efriedma, wdng, tpr Differential Revision: https://reviews.llvm.org/D48057 llvm-svn: 334769
*	DAG: Fix extract_subvector combine for a single element	Matt Arsenault	2018-06-11	1	-2/+15
	This would fail before because 1x vectors aren't legal, so instead just use the scalar type. Avoids regressions in a future AMDGPU commit to add v4i16/v4f16 as legal types. Test update is just the one test that this triggers on in tree now. It wasn't checking anything before. The result is completely changed since the selects are eliminated. Not sure if it's considered better or not. llvm-svn: 334440