bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AArch64] Armv8.4-A: LDAPR & STLR with immediate offset instructions	Sjoerd Meijer	2018-07-12	3	-0/+43
	These instructions are added to AArch64 only. llvm-svn: 336913
*	[InstCombine] Fold x & (-1 >> y) != x to x u> (-1 >> y)	Roman Lebedev	2018-07-12	1	-0/+4
	Summary: A complementary fold to D49179. https://bugs.llvm.org/show_bug.cgi?id=38123 https://rise4fun.com/Alive/Rny Caveat: one more thing in `test/Transforms/InstCombine/icmp-logical.ll` breaks. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49205 llvm-svn: 336911
*	[ThinLTO] Escape module paths when printing	Andrew Ng	2018-07-12	1	-2/+3
	We have located a bug in AssemblyWriter::printModuleSummaryIndex(). This function outputs path strings incorrectly. Backslashes in the strings are not correctly escaped. Consequently, if a path name contains a backslash followed by two hexadecimal characters, the sequence is incorrectly interpreted when the output is read by another component. This mangles the path and results in an error. This patch fixes this issue by calling printEscapedString() to output the module paths. Patch by Chris Jackson. Differential Revision: https://reviews.llvm.org/D49090 llvm-svn: 336908
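	The failure mode described above can be illustrated with a small sketch. It is not the LLVM implementation: escape_path() is only modeled loosely on what printEscapedString() is understood to do (double the backslash, emit other non-printable bytes as a backslash plus two hex digits), and unescape() is a hypothetical reader that interprets such sequences.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def escape_path(text: str) -> str:
    """Hypothetical escaper: double backslashes, hex-encode quotes and non-printables."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch == "\\":
            out.append("\\\\")
        elif 0x20 <= byte <= 0x7E and ch != '"':
            out.append(ch)
        else:
            out.append("\\%02X" % byte)
    return "".join(out)


def unescape(text: str) -> bytes:
    """Hypothetical reader: '\\\\' becomes one backslash, '\\XX' becomes the byte 0xXX."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            if text[i + 1] == "\\":
                out.append(ord("\\"))
                i += 2
                continue
            if (i + 2 < len(text)
                    and text[i + 1] in HEX_DIGITS
                    and text[i + 2] in HEX_DIGITS):
                out.append(int(text[i + 1:i + 3], 16))
                i += 3
                continue
        out.extend(text[i].encode("utf-8"))
        i += 1
    return bytes(out)


path = "C:\\ab\\file.o"                 # the two characters after the first backslash are hex digits
mangled = unescape(path)                # printed without escaping: "\ab" is read back as byte 0xAB
round_trip = unescape(escape_path(path))
```

	In this sketch the unescaped path comes back mangled, while the escaped one survives the round trip unchanged.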
*	[X86][SSE] Utilize ZeroableElements for canWidenShuffleElements	Simon Pilgrim	2018-07-12	1	-2/+31
	canWidenShuffleElements can do a better job if given a mask with ZeroableElements info. Apparently, ZeroableElements was only being used to identify AllZero candidates, but possibly we could plug it into more shuffle matchers. Original Patch by Zvi Rackover @zvi Differential Revision: https://reviews.llvm.org/D42044 llvm-svn: 336903
*	[X86][AVX] Use Zeroable mask to improve shuffle mask widening	Simon Pilgrim	2018-07-12	1	-2/+17
	Noticed while updating D42044, lowerV2X128VectorShuffle can improve the shuffle mask with the zeroable data to create a target shuffle mask to recognise more 'zero upper 128' patterns. NOTE: lowerV4X128VectorShuffle could benefit as well but the code needs refactoring first to discriminate between SM_SentinelUndef and SM_SentinelZero for negative shuffle indices. Differential Revision: https://reviews.llvm.org/D49092 llvm-svn: 336900
*	[UnJ] Use SmallPtrSets for block collections. NFC	David Green	2018-07-12	1	-30/+27
	We no longer care about the order of blocks in these collections, so can change to SmallPtrSets, making contains checks quicker. Differential revision: https://reviews.llvm.org/D49060 llvm-svn: 336897
*	[mips] Mark standard encoded instructions as not being in MIPS16e	Simon Atanasyan	2018-07-12	2	-3/+3
	Mark standard encoded instructions and pseudo "standard encoded" as not being in MIPS16e by default. Patch by Simon Dardis. Differential revision: https://reviews.llvm.org/D48379 llvm-svn: 336893
*	[X86] Remove i128 type from FR128 regclass.	Craig Topper	2018-07-12	3	-18/+1
	i128 isn't a legal type in our x86 implementation today. So remove this and the few patterns that used it until it becomes necessary. llvm-svn: 336889
*	Fix few typos in comments (write access test commit)	Stefan Granitz	2018-07-12	1	-2/+2
	llvm-svn: 336887
*	[X86] Remove patterns and ISD nodes for the old scalar FMA intrinsic lowering.	Craig Topper	2018-07-12	5	-165/+19
	We now use llvm.fma.f32/f64 or llvm.x86.fmadd.f32/f64 intrinsics that use scalar types rather than vector types. So we don't need these special ISD nodes that operate on the lowest element of a vector. llvm-svn: 336883
*	[InstSimplify] simplify add instruction if two operands are negative	Chen Zheng	2018-07-12	2	-0/+24
	Differential Revision: https://reviews.llvm.org/D49216 llvm-svn: 336881
*	[AsmParser] Fix inconsistent declaration parameter name	Fangrui Song	2018-07-12	3	-41/+41
	llvm-svn: 336879
*	Temporarily revert "Recommit r328307: [IPSCCP] Use constant range ↵	Eric Christopher	2018-07-12	1	-81/+111
	information for comparisons of parameters." as it's causing miscompiles. A testcase was provided in the original review thread. This reverts commit r336098. llvm-svn: 336877
*	[x86] Fix another trivial bug in x86 flags copy lowering that has been	Chandler Carruth	2018-07-12	1	-3/+6
	there for a long time. The boolean tracking whether we saw a kill of the flags was supposed to be per-block we are scanning and instead was outside that loop and never cleared. It requires a quite contrived test case to hit this as you have to have multiple levels of successors and interleave them with kills. I've included such a test case here. This is another bug found testing SLH and extracted to its own focused patch. llvm-svn: 336876
*	[X86] Add patterns to use VMOVSS/SD zero masking for scalar f32/f64 select ↵	Craig Topper	2018-07-12	1	-0/+8
	with zero. These showed up in some of the upgraded FMA code. We really need to improve these test cases more, but this helps for now. llvm-svn: 336875
*	[x86] Fix EFLAGS copy lowering to correctly handle walking past uses in	Chandler Carruth	2018-07-12	1	-1/+1
	multiple successors where some of the uses end up killing the EFLAGS register. There was a bug where rather than skipping to the next basic block queued up with uses once we saw a kill, we stopped processing the blocks entirely. =/ Test case produces completely nonsensical code w/o this tiny fix. This was found testing Speculative Load Hardening and split out of that work. Differential Revision: https://reviews.llvm.org/D49211 llvm-svn: 336874
*	[X86] Remove and autoupgrade the scalar fma intrinsics with masking.	Craig Topper	2018-07-12	7	-136/+129
	This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches. llvm-svn: 336871
*	IR: Skip -print--all after -print-	Duncan P. N. Exon Smith	2018-07-11	1	-3/+3
	This changes `-print-*` from transformation passes to analysis passes so that `-print-after-all` and `-print-before-all` don't trigger. This avoids some redundant output. Patch by Son Tuan Vu! llvm-svn: 336869
*	[CodeGen] Emit more precise AssertZext/AssertSext nodes.	Eli Friedman	2018-07-11	2	-26/+9
	This is marginally helpful for removing redundant extensions, and the code is easier to read, so it seems like an all-around win. In the new test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an AssertZext i2, which allows the extension of the return value to be eliminated. Differential Revision: https://reviews.llvm.org/D49004 llvm-svn: 336868
*	[LoopIdiomRecognize] Don't convert a do while loop to ctlz.	Craig Topper	2018-07-11	1	-10/+15
	This commit suppresses turning loops like this into "(bitwidth - ctlz(input))".
	    unsigned foo(unsigned input) {
	      unsigned num = 0;
	      do {
	        ++num;
	        input >>= 1;
	      } while (input != 0);
	      return num;
	    }
	The loop version returns a value of 1 for both an input of 0 and an input of 1. Converting to a naive ctlz does not preserve that. Theoretically we could do better if we checked isKnownNonZero or we could insert a select to handle the divergence. But until we have motivating cases for that, this is the easiest solution. llvm-svn: 336864
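	The divergence between the do-while loop and the naive rewrite can be reproduced with a small model. This is only a sketch: it assumes a 32-bit unsigned input, and ctlz() here is a plain Python stand-in for the intrinsic that returns the bit width for an input of 0.

```python
BITWIDTH = 32  # assumed width of 'unsigned' for this sketch


def loop_count(value: int) -> int:
    """Model of the do-while loop from the commit message."""
    num = 0
    while True:
        num += 1
        value >>= 1
        if value == 0:
            break
    return num


def ctlz(value: int) -> int:
    """Count leading zeros of a BITWIDTH-bit value; yields BITWIDTH for 0."""
    return BITWIDTH - value.bit_length()


def naive_rewrite(value: int) -> int:
    """The '(bitwidth - ctlz(input))' form that the commit refuses to emit."""
    return BITWIDTH - ctlz(value)
```

	The two forms agree for every non-zero input tried here and differ only at 0, where the loop still executes its body once.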
*	AMDGPU/SI: Initialize InstrInfo before TargetLoweringInfo in GCNSubtarget	Tom Stellard	2018-07-11	2	-3/+3
	SITargetLowering queries SIInstrInfo in its constructor, so SIInstrInfo must be initialized first. This fixes msan buildbot failures introduced by r336851. llvm-svn: 336861
*	[MemorySSA] Add APIs to move memory accesses between blocks, following CFG ↵	Alina Sbirlea	2018-07-11	2	-1/+61
	changes. Summary: The move APIs added in this patch will be used to update MemorySSA when CFG changes merge or split blocks, by moving memory accesses accordingly in MemorySSA's internal data structures. [Split from D45299 for easier review] Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D48897 llvm-svn: 336860
*	AMDGPU: Remove duplicate call to initializeSubtargetDependencies()	Tom Stellard	2018-07-11	1	-1/+0
	This was added in r336851. llvm-svn: 336853
*	AMDGPU: Refactor Subtarget classes	Tom Stellard	2018-07-11	74	-381/+340
	Summary: This is a follow-up to r335942.
	- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
	- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
	- Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation.
	Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
*	[DebugInfo] Fix getPreviousSibling after r336823	Fangrui Song	2018-07-11	1	-1/+2
	llvm-svn: 336837
*	[InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y)	Roman Lebedev	2018-07-11	1	-0/+34
	Summary: https://bugs.llvm.org/show_bug.cgi?id=38123 This pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in unsigned case, therefore it is probably a good idea to improve it. https://rise4fun.com/Alive/Rny ^ there are more opportunities for folds, I will follow up with them afterwards. Caveat: this somehow exposes a missed opportunity in `test/Transforms/InstCombine/icmp-logical.ll`. It seems the problem is in `foldLogOpOfMaskedICmps()` in `InstCombineAndOrXor.cpp`. But I'm not quite sure what is wrong, because it calls `getMaskedTypeForICmpPair()`, which calls `decomposeBitTestICmp()` which should already work for these cases... As @spatel notes in https://reviews.llvm.org/D49179#1158760, that code is a rather complex mess, so we'll let it slide. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: yamauchi, majnemer, t.p.northover, llvm-commits Differential Revision: https://reviews.llvm.org/D49179 llvm-svn: 336834
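	The equivalence behind this fold, and behind the complementary `!=` / `u>` fold in D49205, can be checked exhaustively at a small bit width. This is a sketch only: it assumes 8-bit unsigned values and shift amounts below the bit width, and it models `-1 >> y` as a logical shift of the all-ones pattern.

```python
BITS = 8                      # small width so every input can be tried
ALL_ONES = (1 << BITS) - 1    # unsigned bit pattern of -1


def fold_counterexamples():
    """Return every (x, y) where the original and folded forms disagree."""
    failures = []
    for y in range(BITS):
        mask = ALL_ONES >> y              # -1 >> y (logical shift)
        for x in range(ALL_ONES + 1):
            eq_before = (x & mask) == x   # x & (-1 >> y) == x
            eq_after = x <= mask          # x u<= (-1 >> y)
            ne_before = (x & mask) != x   # x & (-1 >> y) != x
            ne_after = x > mask           # x u> (-1 >> y)
            if eq_before != eq_after or ne_before != ne_after:
                failures.append((x, y))
    return failures
```

	The mask keeps only the low bits, so `x & mask` equals `x` exactly when `x` has no bits set above the mask, which is the same as `x` being no greater than the mask.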
*	[X86] Remove patterns for inserting a load into a zero vector.	Craig Topper	2018-07-11	2	-90/+50
	We can instead block the load folding in isProfitableToFold. Then isel will emit a register->register move for the zeroing part and a separate load. The PostProcessISelDAG should be able to remove the register->register move. This saves us patterns and fixes the fact that we only had unaligned load patterns. The test changes show places where we should have been using an aligned load. llvm-svn: 336828
*	[TargetTransformInfo] Add pow2 analysis for scalar constants	Simon Pilgrim	2018-07-11	1	-0/+6
	Add ConstantInt analysis to getOperandInfo so we get more realistic div/rem expansion costs comparable to the vector costs. llvm-svn: 336827
*	AMDGPU/NFC: Use already available explicit kernarg	Konstantin Zhuravlyov	2018-07-11	1	-1/+2
	size instead of calculating it again when filling out the metadata. llvm-svn: 336825
*	[DebugInfo] Make children iterator bidirectional	Jonas Devlieghere	2018-07-11	2	-0/+45
	Make the DIE iterator bidirectional so we can move to the previous sibling of a DIE. Differential revision: https://reviews.llvm.org/D49173 llvm-svn: 336823
*	[X86] Fix MayLoad/HasSideEffect flag for (V)MOVLPSrm instructions.	Andrea Di Biagio	2018-07-11	2	-1/+3
	Before revision 336728, the "mayLoad" flag for instruction (V)MOVLPSrm was inferred directly from the "default" pattern associated with the instruction definition. r336728 removed special node X86Movlps, and all the patterns associated to it. Now instruction (V)MOVLPSrm doesn't have a pattern associated to it, and the 'mayLoad/hasSideEffects' flags are left unset. When the instruction info is emitted by tablegen, method CodeGenDAGPatterns::InferInstructionFlags() sees that (V)MOVLPSrm doesn't have a pattern, and flags are undefined. So, it conservatively sets the "hasSideEffects" flag for it. As a consequence, we were losing the 'mayLoad' flag, and we were gaining a 'hasSideEffect' flag in its place. This patch fixes the issue (originally reported by Michael Holmen). The mca tests show the differences in the instruction info flags. Instructions that were affected by this problem were: MOVLPSrm/VMOVLPSrm/VMOVLPSZ128rm. Differential Revision: https://reviews.llvm.org/D49182 llvm-svn: 336818
*	[SLPVectorizer] Add initial alternate opcode support for cast instructions. ↵	Simon Pilgrim	2018-07-11	1	-22/+62
	(REAPPLIED) We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well; this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:
	1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
	2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
	3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
	4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.
	Reapplied with fix to only accept 2 different casts if they come from the same source type. Differential Revision: https://reviews.llvm.org/D49135 llvm-svn: 336812
*	Revert rL336804: [SLPVectorizer] Add initial alternate opcode support for ↵	Simon Pilgrim	2018-07-11	1	-58/+22
	cast instructions. Reverting due to buildbot failures llvm-svn: 336806
*	[SLPVectorizer] Add initial alternate opcode support for cast instructions.	Simon Pilgrim	2018-07-11	1	-22/+58
	We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well; this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:
	1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
	2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
	3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
	4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.
	Differential Revision: https://reviews.llvm.org/D49135 llvm-svn: 336804
*	[CodeGen] Ignore debug uses in MachineCopyPropagation	Krzysztof Parzyszek	2018-07-11	1	-1/+1
	Debug uses should not count as real uses, since the presence of debug information could affect the generated code. llvm-svn: 336803
*	[mips] Update the P5600 scheduler model not to use instruction itineraries.	Simon Atanasyan	2018-07-11	1	-63/+93
	This brings the P5600 scheduler model to a mostly complete status. There are a number of instructions which trigger the `error:'MipsP5600Model' lacks information for` error. These are certain codegen only instructions relating to MIPS64 which can be addressed by using the correct predicates for them. That will be done in a follow-up patch. Patch by Simon Dardis. Differential revision: https://reviews.llvm.org/D45245 llvm-svn: 336802
*	[NFC][InstCombine] Converts isLegalNarrowLoad into isLegalNarrowLdSt	Diogo N. Sampaio	2018-07-11	1	-41/+55
	Reuse this function to test correctness and profitability of reducing width of either load or store operations. Reviewers: samparker Differential Revision: https://reviews.llvm.org/D48624 llvm-svn: 336800
*	[ARM] ParallelDSP: multiple reduction stmts in loop	Sjoerd Meijer	2018-07-11	1	-40/+75
	This fixes an issue that we were not properly supporting multiple reduction stmts in a loop, and not generating SMLADs for these cases. The alias analysis checks were done too early, making it too conservative. Differential revision: https://reviews.llvm.org/D49125 llvm-svn: 336795
*	Use debug-prefix-map for AT_NAME	Jonas Devlieghere	2018-07-11	3	-21/+20
	AT_NAME was being emitted before the directory paths were remapped. This ensures that all paths are remapped before anything is emitted. An additional test case has been added. Note that this only works if the replacement string is an absolute path. If not, then AT_decl_file believes the new path is a relative path, and joins that path with the compilation directory. I do not know of a good way to resolve this. Patch by: Siddhartha Bagaria (starsid) Differential revision: https://reviews.llvm.org/D49169 llvm-svn: 336793
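	The absolute-path caveat above can be shown with a toy model. This is a sketch, not the compiler's code: remap() and resolve_decl_file() are hypothetical helpers, the first-match policy is an assumption, and the paths are made up for illustration.

```python
import posixpath


def remap(path: str, prefix_map) -> str:
    """Hypothetical prefix remapping: replace the first matching old prefix."""
    for old, new in prefix_map:
        if path.startswith(old):
            return new + path[len(old):]
    return path


def resolve_decl_file(path: str, comp_dir: str) -> str:
    """Hypothetical consumer: relative paths are joined with the compilation directory."""
    if posixpath.isabs(path):
        return path
    return posixpath.join(comp_dir, path)


source = "/home/user/project/a.c"          # made-up source path
absolute = remap(source, [("/home/user/project", "/src")])
relative = remap(source, [("/home/user/project", "src")])
```

	With an absolute replacement the remapped path is used as-is; with a relative one the consumer in this sketch prepends the compilation directory, which is the behaviour the commit message warns about.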
*	[AArch64][SVE] Asm: Support for COMPACT instruction.	Sander de Smalen	2018-07-11	2	-0/+24
	The compact instruction shuffles the active elements of a vector into the lowest-numbered elements and sets the remaining elements to zero. e.g. compact z0.s, p0, z1.s llvm-svn: 336789
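	The behaviour described for COMPACT can be modeled on plain lists. This is only a sketch of the sentence above: the predicate is a list of 0/1 flags, vector length and element size are ignored, and compact() is a hypothetical helper name.

```python
def compact(predicate, source):
    """Pack the active elements of 'source' into the lowest positions, zero the rest."""
    active = [elem for flag, elem in zip(predicate, source) if flag]
    return active + [0] * (len(source) - len(active))
```

	For example, with the predicate [0, 1, 0, 1] and the source [10, 20, 30, 40], the two active elements move to positions 0 and 1 and the rest of the result is zero.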
*	[AArch64][SVE] Asm: Support for LAST(A|B) and CLAST(A|B) instructions.	Sander de Smalen	2018-07-11	2	-0/+148
	The LASTB and LASTA instructions extract the last active element, or element after the last active, from the source vector. The added variants are:
	Scalar:
	    last(a|b) w0, p0, z0.b
	    last(a|b) w0, p0, z0.h
	    last(a|b) w0, p0, z0.s
	    last(a|b) x0, p0, z0.d
	SIMD & FP Scalar:
	    last(a|b) b0, p0, z0.b
	    last(a|b) h0, p0, z0.h
	    last(a|b) s0, p0, z0.s
	    last(a|b) d0, p0, z0.d
	The CLASTB and CLASTA conditionally extract the last or element after the last active element from the source vector. The added variants are:
	Scalar:
	    clast(a|b) w0, p0, w0, z0.b
	    clast(a|b) w0, p0, w0, z0.h
	    clast(a|b) w0, p0, w0, z0.s
	    clast(a|b) x0, p0, x0, z0.d
	SIMD & FP Scalar:
	    clast(a|b) b0, p0, b0, z0.b
	    clast(a|b) h0, p0, h0, z0.h
	    clast(a|b) s0, p0, s0, z0.s
	    clast(a|b) d0, p0, d0, z0.d
	Vector:
	    clast(a|b) z0.b, p0, z0.b, z1.b
	    clast(a|b) z0.h, p0, z0.h, z1.h
	    clast(a|b) z0.s, p0, z0.s, z1.s
	    clast(a|b) z0.d, p0, z0.d, z1.d
	Please refer to the architecture specification for more details on the semantics of the added instructions. llvm-svn: 336783
*	[llvm-readobj] Add -hex-dump (-x) option	Paul Semel	2018-07-11	2	-0/+36
	Differential Revision: https://reviews.llvm.org/D48281 llvm-svn: 336782
*	[SelectionDAG] Add constant buildvector support to isKnownNeverZero	Simon Pilgrim	2018-07-11	2	-6/+9
	This allows us to use SelectionDAG::isKnownNeverZero in DAGCombiner::visitREM (visitSDIVLike/visitUDIVLike handle the checking for constants). llvm-svn: 336779
*	[mips] Remove dead code. NFC	Simon Atanasyan	2018-07-11	5	-38/+0
	llvm-svn: 336777
*	[DAGCombiner] Support non-uniform X%C -> X-(X/C)*C folds	Simon Pilgrim	2018-07-11	1	-1/+4
	First stage in PR38057 - support non-uniform constant vectors in the combine to reuse the division-by-constant logic. We can definitely do better for srem pow2 remainders (and avoid that extra multiply....) but this at least helps keep everything on the vector unit. Differential Revision: https://reviews.llvm.org/D48975 llvm-svn: 336774
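	The identity behind the X%C -> X-(X/C)*C fold can be sketched lane by lane for a non-uniform constant vector. This sketch assumes unsigned lanes with non-zero divisors and uses plain Python lists in place of vectors; rem_via_div() is a hypothetical helper name.

```python
def rem_via_div(xs, cs):
    """Per-lane remainder written as X - (X / C) * C, with one divisor per lane."""
    return [x - (x // c) * c for x, c in zip(xs, cs)]


values = [7, 10, 100, 15]
divisors = [3, 4, 7, 16]      # non-uniform: every lane has its own constant
folded = rem_via_div(values, divisors)
direct = [x % c for x, c in zip(values, divisors)]
```

	Both lists hold the same per-lane remainders, so the division-by-constant logic can be reused to compute the remainder.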
*	[DAGCombiner] Add (urem X, -1) -> select(X == -1, 0, x) fold	Simon Pilgrim	2018-07-11	1	-0/+6
	llvm-svn: 336773
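	The (urem X, -1) fold can be checked exhaustively at a small width. This sketch assumes 8-bit unsigned values, where -1 is the all-ones pattern 255.

```python
BITS = 8
ALL_ONES = (1 << BITS) - 1    # unsigned bit pattern of -1


def urem_by_all_ones(x: int) -> int:
    """urem X, -1 computed directly."""
    return x % ALL_ONES


def folded(x: int) -> int:
    """select(X == -1, 0, X)."""
    return 0 if x == ALL_ONES else x
```

	Every value below the all-ones pattern is its own remainder, and the all-ones pattern itself divides evenly, which is why a compare and a select are enough.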
*	[TableGen] Add missing std::moves to fix build failure.	Simon Tatham	2018-07-11	1	-7/+7
	gcc 4.7 seems to disagree with gcc 5.3 about whether you need to say 'return std::move(thing)' instead of just 'return thing'. All the json::Arrays and json::Objects that I was implicitly turning into json::Values by returning them from functions now have explicit std::move wrappers, so hopefully 4.7 will be happy now. llvm-svn: 336772
*	[TableGen] Add a general-purpose JSON backend.	Simon Tatham	2018-07-11	2	-0/+190
	The aim of this backend is to output everything TableGen knows about the record set, similarly to the default -print-records backend. But where -print-records produces output in TableGen's input syntax (convenient for humans to read), this backend produces it as structured JSON data, which is convenient for loading into standard scripting languages such as Python, in order to extract information from the data set in an automated way. The output data contains a JSON representation of the variable definitions in output 'def' records, and a few pieces of metadata such as which of those definitions are tagged with the 'field' prefix and which defs are derived from which classes. It doesn't dump out absolutely every piece of knowledge it _could_ produce, such as type information and complicated arithmetic operator nodes in abstract superclasses; the main aim is to allow consumers of this JSON dump to essentially act as new backends, and backends don't generally need to depend on that kind of data. The new backend is implemented as an EmitJSON() function similar to all of llvm-tblgen's other EmitFoo functions, except that it lives in lib/TableGen instead of utils/TableGen on the basis that I'm expecting to add it to clang-tblgen too in a future patch. To test it, I've written a Python script that loads the JSON output and tests properties of it based on comments in the .td source - more or less like FileCheck, except that the CHECK: lines have Python expressions after them instead of textual pattern matches. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arichardson, labath, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D46054 llvm-svn: 336771
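	A consumer acting as a "new backend" might look roughly like the sketch below. Everything in it is an assumption for illustration: the sample document is hand-written rather than real llvm-tblgen output, the record names are made up, and the metadata keys ("!instanceof", "!superclasses", "!fields", "!name") are assumed names that should be checked against the TableGen backend documentation before relying on them.

```python
import json

# Hand-written stand-in for a JSON dump; not produced by llvm-tblgen.
SAMPLE = """
{
  "!instanceof": {"Base": ["Foo", "Bar"], "Derived": ["Bar"]},
  "Foo": {"!name": "Foo", "!superclasses": ["Base"], "!fields": [], "Size": 4},
  "Bar": {"!name": "Bar", "!superclasses": ["Base", "Derived"], "!fields": ["Size"], "Size": 8}
}
"""


def defs_of_class(dump: dict, cls: str) -> dict:
    """Return the defs derived from 'cls', keyed by def name (assumed layout)."""
    names = dump.get("!instanceof", {}).get(cls, [])
    return {name: dump[name] for name in names}


dump = json.loads(SAMPLE)
base_defs = defs_of_class(dump, "Base")
sizes = {name: record["Size"] for name, record in base_defs.items()}
```

	The point of the sketch is only that ordinary dictionary lookups are enough to pull out the defs of a class and their variable values once the dump is loaded.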
*	[WebAssembly] Only call llvm::value::dump() in debug build.	Eric Liu	2018-07-11	1	-0/+2
	This fixes a compile error in r336759. llvm::value::dump is not available in release builds. llvm-svn: 336770
*	[X86] The TEST instruction is eliminated when BSF/TZCNT is used	Craig Topper	2018-07-11	2	-0/+37
	Summary: These changes cover PR#31399. Now the ffs(x) function is lowered to (x != 0) ? llvm.cttz(x) + 1 : 0 and it corresponds to the following llvm code:
	    %cnt = tail call i32 @llvm.cttz.i32(i32 %v, i1 true)
	    %tobool = icmp eq i32 %v, 0
	    %.op = add nuw nsw i32 %cnt, 1
	    %add = select i1 %tobool, i32 0, i32 %.op
	and x86 asm code:
	    bsfl %edi, %ecx
	    addl $1, %ecx
	    testl %edi, %edi
	    movl $0, %eax
	    cmovnel %ecx, %eax
	In this case the 'test' instruction can't be eliminated because the 'add' instruction modifies the EFLAGS, namely, ZF flag that is set by the 'bsf' instruction when 'x' is zero. We now produce the following code:
	    bsfl %edi, %ecx
	    movl $-1, %eax
	    cmovnel %ecx, %eax
	    addl $1, %eax
	Patch by Ivan Kulagin Reviewers: davide, craig.topper, spatel, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48765 llvm-svn: 336768
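	The new instruction sequence can be modeled step by step to see why the TEST is no longer needed. This is a sketch under stated assumptions: registers are plain Python integers, the zero flag is modeled as "the input was zero", and the destination of bsf is treated as unusable when the input is zero.

```python
def cttz(value: int) -> int:
    """Count trailing zeros; only meaningful for a non-zero value."""
    return (value & -value).bit_length() - 1


def ffs_reference(value: int) -> int:
    """(x != 0) ? cttz(x) + 1 : 0"""
    return cttz(value) + 1 if value != 0 else 0


def ffs_new_sequence(value: int) -> int:
    """Model of: bsfl; movl $-1; cmovnel; addl $1."""
    zero_flag = value == 0                    # bsfl sets ZF when the source is zero
    ecx = cttz(value) if not zero_flag else None
    eax = -1                                  # movl $-1, %eax (leaves flags untouched)
    if not zero_flag:                         # cmovnel %ecx, %eax
        eax = ecx
    return eax + 1                            # addl $1, %eax
```

	Because the add now comes last, the flag set by bsf is still intact when the conditional move reads it, and the -1 fallback turns into 0 after the final increment.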