bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[TargetLowering] Move the setBooleanContents check on (xor (setcc), (setcc)) ↵	Craig Topper	2019-11-01	1	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	== / != 1 -> (setcc) != / == (setcc) to the right place We need to be checking the value types for the inner setccs not the outer setcc. We need to ensure those setccs produce a 0/1 value or that the xor is on the i1 type. I think at the time this code was originally written, getBooleanContents didn't take any arguments so this was probably correct. But now we can have a different boolean contents for integer and floating point. Not sure why the other combines below the xor were also checking the boolean contents. None of them involve any setccs other than the outer one and they only produce a new setcc. Differential Revision: https://reviews.llvm.org/D69480
*	[TargetLowering] Add getBooleanContents contents check to "SETCC (SETCC), ↵	Craig Topper	2019-10-27	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	[0\|1], [EQ\|NE] -> SETCC" combine. This combine is only valid if the inner setcc produces a 0/1 result or the inner type is MVT::i1. I haven't seen this cause any issues, just happened to notice it while reviewing combines in this function. While there also fix another call to use the value type from the SDValue for the operand instead of calling SDNode::getValueType(0). Though its likely the use is result 0, its not guaranteed.
*	Revert 4334892e7b "[DAGCombine][ARM] x ==/!= c -> (x - c) ==/!= 0 iff ↵	Hans Wennborg	2019-10-23	1	-65/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	'-c' can be folded into the x node." This broke various Windows builds, see comments on the Phabricator review. This also reverts the follow-up 20bf0cf. > Summary: > This fold, helps recover from the rest of the D62266 ARM regressions. > https://rise4fun.com/Alive/TvpC > > Note that while the fold is quite flexible, i've restricted it > to the single interesting pattern at the moment. > > Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix > > Reviewed By: deadalnix > > Subscribers: javed.absar, kristof.beyls, llvm-commits > > Tags: #llvm > > Differential Revision: https://reviews.llvm.org/D62450
*	[TargetLowering] optimizeSetCCToComparisonWithZero(): add extra sanity ↵	Roman Lebedev	2019-10-23	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \|	checks (PR43769) We should do the fold only if both constants are plain, non-opaque constants, at least that is the DAG.FoldConstantArithmetic() requirement. And if the constant we are comparing with is zero - we shouldn't be trying to do this fold in the first place. Fixes https://bugs.llvm.org/show_bug.cgi?id=43769
*	[DAGCombine][ARM] x ==/!= c -> (x - c) ==/!= 0 iff '-c' can be folded ↵	Roman Lebedev	2019-10-22	1	-0/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	into the x node. Summary: This fold, helps recover from the rest of the D62266 ARM regressions. https://rise4fun.com/Alive/TvpC Note that while the fold is quite flexible, i've restricted it to the single interesting pattern at the moment. Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix Reviewed By: deadalnix Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62450
*	[TargetLowering][DAGCombine][MSP430] add/use hook for Shift Amount Threshold ↵	Sanjay Patel	2019-10-19	1	-12/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(1/2) Provides a TLI hook to allow targets to relax the emission of shifts, thus enabling codegen improvements on targets with no multiple shift instructions and cheap selects or branches. Contributes to a Fix for PR43559: https://bugs.llvm.org/show_bug.cgi?id=43559 Patch by: @joanlluch (Joan LLuch) Differential Revision: https://reviews.llvm.org/D69116 llvm-svn: 375347
*	[DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook ↵	Simon Pilgrim	2019-10-01	1	-0/+240
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR42863) This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks. The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants). Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.). Partial reversion of rL372756 - I've identified the infinite loop issue inside the X86 override but haven't fixed it yet so I've only (re)committed the common TargetLowering refactoring part of the patch. Differential Revision: https://reviews.llvm.org/D67557 llvm-svn: 373343
*	[globalisel][knownbits] Allow targets to call ↵	Daniel Sanders	2019-09-30	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GISelKnownBits::computeKnownBitsImpl() Summary: It seems we missed that the target hook can't query the known-bits for the inputs to a target instruction. Fix that oversight Reviewers: aditya_nandakumar Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67380 llvm-svn: 373264
*	[TargetLowering] Simplify expansion of S{ADD,SUB}O	Roger Ferrer Ibanez	2019-09-30	1	-18/+13
\| \| \| \| \| \| \| \| \| \|	ISD::SADDO uses the suggested sequence described in the section §2.4 of the RISCV Spec v2.2. ISD::SSUBO uses the dual approach but checking for (non-zero) positive. Differential Revision: https://reviews.llvm.org/D47927 llvm-svn: 373187
*	Revert r372333: [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression ↵	Ilya Biryukov	2019-09-24	1	-240/+0
\| \| \| \| \| \| \| \| \| \|	to a target hook (PR42863) Reason: this caused severe compile time regressions in JAX. See email thread of original revision on llvm-commits for details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190923/697042.html llvm-svn: 372756
*	[SelectionDAG][Mips][Sparc] Don't allow SimplifyDemandedBits to constant ↵	Craig Topper	2019-09-20	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fold TargetConstant nodes to a Constant. Summary: After the switch in SimplifyDemandedBits, it tries to create a constant when possible. If the original node is a TargetConstant the default in the switch will call computeKnownBits on the TargetConstant which will succeed. This results in the TargetConstant becoming a Constant. But TargetConstant exists to avoid being changed. I've fixed the two cases that relied on this in tree by explicitly making the nodes constant instead of target constant. The Sparc case is an old bug. The Mips case was recently introduced now that ImmArg on intrinsics gets turned into a TargetConstant when the SelectionDAG is created. I've removed the ImmArg since it lowers to generic code. Reviewers: arsenm, RKSimon, spatel Subscribers: jyknight, sdardis, wdng, arichardson, hiraditya, fedor.sergeev, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67802 llvm-svn: 372409
*	[DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook ↵	Simon Pilgrim	2019-09-19	1	-0/+240
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR42863) This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation. The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants). Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.). I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negatation propagation). We can build on this in future patches. Differential Revision: https://reviews.llvm.org/D67557 llvm-svn: 372333
*	[DAG] Add SelectionDAG::MaxRecursionDepth constant	Simon Pilgrim	2019-09-19	1	-3/+4
\| \| \| \| \| \| \| \| \| \|	As commented on D67557 we have a lot of uses of depth checks all using magic numbers. This patch adds the SelectionDAG::MaxRecursionDepth constant and moves over some general cases to use this explicitly. Differential Revision: https://reviews.llvm.org/D67711 llvm-svn: 372315
*	[SimplifyDemandedBits] Use APInt::intersects to instead of ANDing and ↵	Craig Topper	2019-09-17	1	-2/+3
\| \| \| \| \| \|	comparing to 0 separately. NFC llvm-svn: 372158
*	[TargetLowering] SimplifyDemandedBits - add EXTRACT_SUBVECTOR support.	Simon Pilgrim	2019-09-14	1	-0/+15
\| \| \| \| \| \|	Call SimplifyDemandedBits on the source vector. llvm-svn: 371923
*	[SDAG] Update generic code to conservatively check for isAtomic in addition ↵	Philip Reames	2019-09-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	to isVolatile This is the first sweep of generic code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time. That will come later. See D66309 for context. Differential Revision: https://reviews.llvm.org/D66318 llvm-svn: 371786
*	GlobalISel: add combiner to form indexed loads.	Tim Northover	2019-09-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Loosely based on DAGCombiner version, but this part is slightly simpler in GlobalIsel because all address calculation is performed by G_GEP. That makes the inc/dec distinction moot so there's just pre/post to think about. No targets can handle it yet so testing is via a special flag that overrides target hooks. llvm-svn: 371384
*	[CodeGen] Handle SMULFIXSAT with scale zero in ↵	Bjorn Pettersson	2019-09-07	1	-10/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TargetLowering::expandFixedPointMul Summary: Normally TargetLowering::expandFixedPointMul would handle SMULFIXSAT with scale zero by using an SMULO to compute the product and determine if saturation is needed (if overflow happened). But if SMULO isn't custom/legal it falls through and uses the same technique, using MULHS/SMUL_LOHI, as used for non-zero scales. Problem was that when checking for overflow (handling saturation) when not using MULO we did not expect to find a zero scale. So we ended up in an assertion when doing APInt::getLowBitsSet(VTSize, Scale - 1) This patch fixes the problem by adding a new special case for how saturation is computed when scale is zero. Reviewers: RKSimon, bevinh, leonardchan, spatel Reviewed By: RKSimon Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67071 llvm-svn: 371309
*	[Intrinsic] Add the llvm.umul.fix.sat intrinsic	Bjorn Pettersson	2019-09-07	1	-19/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Patch by: leonardchan, bjope Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel Reviewed By: leonardchan Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57836 llvm-svn: 371308
*	[TargetLowering] Fix Bugzilla ID 43183 to avoid soften comparison broken ↵	Shiva Chen	2019-09-01	1	-10/+7
\| \| \| \| \| \| \| \| \| \|	with constant inputs Summary: This fixes the bugzilla id 43183 which triggerd by the following commit: [RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall llvm-svn: 370604
*	[TargetLowering] SimplifyDemandedBits ADD/SUB/MUL - correctly inherit ↵	Simon Pilgrim	2019-08-30	1	-4/+2
\| \| \| \| \| \| \| \| \| \|	SDNodeFlags from the original node. Just disable NSW/NUW flags. This matches what we're already doing for the other situations for these nodes, it was just missed for the demanded constant case. Noticed by inspection - confirmed in offline discussion with @spatel. I've checked we have test coverage in the x86 extract-bits.ll and extract-lowbits.ll tests llvm-svn: 370497
*	[RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall	Shiva Chen	2019-08-28	1	-5/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch fixed the issue that RV64 didn't clear the upper bits when return complex floating value with lp64 ABI. float _Complex complex_add(float _Complex a, float _Complex b) { return a + b; } RealResult = zero_extend(RealA + RealB) ImageResult = ImageA + ImageB Return (RealResult \| (ImageResult << 32)) The patch introduces shouldExtendTypeInLibCall target hook to suppress the AssertZext generation when lowering floating LibCall. Thanks to Eli's comments from the Bugzilla https://bugs.llvm.org/show_bug.cgi?id=42820 Differential Revision: https://reviews.llvm.org/D65497 llvm-svn: 370275
*	[FPEnv] Add fptosi and fptoui constrained intrinsics.	Kevin P. Neal	2019-08-28	1	-9/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This implements constrained floating point intrinsics for FP to signed and unsigned integers. Quoting from D32319: The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed. Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton Approved by: Craig Topper Differential Revision: http://reviews.llvm.org/D63782 llvm-svn: 370228
*	[TargetLowering] Add buildLegalVectorShuffle facility to help build legal ↵	Amaury Sechet	2019-08-28	1	-5/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	shuffles Summary: There are at least 2 ways to express the same shuffle. Various pieces of code explicit check for both option, but other places do not when they would benefit from doing it. This patches refactor the codebase to use buildLegalVectorShuffle in order to make that behavior more consistent. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66804 llvm-svn: 370190
*	[SelectionDAG][X86] Enable iX SimplifyDemandedBits to vXi1 ↵	Craig Topper	2019-08-23	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SimplifyDemandedVectorElts simplification. Add a hack to X86 to avoid a regression Patch showing the effect of enabling bool vector oversimplification. Non-VLX builds can simplify a kshift shuffle, but VLX builds simplify: insert_subvector v8i zeroinitializer, v2i --> insert_subvector v8i undef, v2i Preventing the removal of the AND to clear the upper bits of result Differential Revision: https://reviews.llvm.org/D53022 llvm-svn: 369780
*	[TargetLowering] Remove optional arguments passing to makeLibCall	Shiva Chen	2019-08-22	1	-20/+18
\| \| \| \| \| \| \| \| \| \|	The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contain argument flags which will pass to makeLibCall function. The patch should not has any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622
*	[TargetLowering] x s% C == 0 fold: vector divisor with INT_MIN handling	Roman Lebedev	2019-08-19	1	-13/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The general fold is only valid for positive divisors. Which effectively means, it is invalid for `INT_MIN` divisors, and we currently bailout if we see them. But that is too strict, we can just fix-up the results. For that, let's do a second computation 'in parallel': ``` Name: srem -> and Pre: isPowerOf2(C) %o = srem i8 %X, C %r = icmp eq %o, 0 => %n = and i8 %X, C-1 %r = icmp eq %n, 0 ``` https://rise4fun.com/Alive/Sup And then just blend results: if the divisor was `INT_MIN`, pick the value we got via bit-test, else pick the value from general fold. There's interesting observation - `ISD::ROTR` is set to `LegalizeAction::Expand` before AVX512, so we should not treat `INT_MIN` divisor as even; and as it can be seen while `@test_srem_odd_even_one` improves on all run-lines, `@test_srem_odd_even_INT_MIN` only improves for AVX512. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66300 llvm-svn: 369268
*	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM	Daniel Sanders	2019-08-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
*	Remove BitVector.h include. NFCI.	Simon Pilgrim	2019-08-15	1	-1/+0
\| \| \| \| \| \|	BitVector type isn't used at all in the cpp file. llvm-svn: 369007
*	[CodeGen][SelectionDAG] More efficient code for X % C == 0 (SREM case)	Roman Lebedev	2019-08-13	1	-5/+221
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. One huge caveat: this signed case is only valid for positive divisors. While we can freely negate negative divisors, we can't negate `INT_MIN`, so for now if `INT_MIN` is encountered, we bailout. As a follow-up, it should be possible to handle that more gracefully via extra `and`+`setcc`+`select`. This passes llvm's test-suite, and from cursory(!) cross-examination the folds (the assembly) match those of GCC, and manual checking via alive did not reveal any issues (other than the `INT_MIN` case) Reviewers: RKSimon, spatel, hermord, craig.topper, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: xbolva00, thakis, javed.absar, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65366 llvm-svn: 368702
*	[TargetLowering][NFC] prepareUREMEqFold(): fixup comment	Roman Lebedev	2019-08-13	1	-1/+1
\| \| \| \| \| \| \| \|	The comment initially matched the code, but the code was incorrect and was fixed after the initial revert back back when it was introduced, but the comment was never updated. llvm-svn: 368701
*	Revert r368276 "[TargetLowering] SimplifyDemandedBits - call ↵	Hans Wennborg	2019-08-13	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT" This introduced a false positive MemorySanitizer warning about use of uninitialized memory in a vectorized crc function in Chromium. That suggests maybe something is not right with this transformation. See https://crbug.com/992853#c7 for a reproducer. This also reverts the follow-up commits r368307 and r368308 which depended on this. > This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. > > In particular this helps remove some unnecessary scalar->vector->scalar patterns. > > The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. > > Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368660
*	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits ↵	Simon Pilgrim	2019-08-12	1	-0/+5
\| \| \| \| \| \|	for ISD::TRUNCATE llvm-svn: 368553
*	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits ↵	Simon Pilgrim	2019-08-08	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	for ISD::EXTRACT_VECTOR_ELT This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368276
*	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits ↵	Simon Pilgrim	2019-08-07	1	-4/+19
\| \| \| \| \| \| \| \|	for ISD::VECTOR_SHUFFLE In particular this helps the SSE vector shift cvttps2dq+add+shl pattern by avoiding the need for zeros in shuffle style extensions to vXi32 types as we'll be shifting out those bits anyway llvm-svn: 368155
*	[GISel]: Add GISelKnownBits analysis	Aditya Nandakumar	2019-08-06	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	https://reviews.llvm.org/D65698 This adds a KnownBits analysis pass for GISel. This was done as a pass (compared to static functions) so that we can add other features such as caching queries(within a pass and across passes) in the future. This patch only adds the basic pass boiler plate, and implements a lazy non caching knownbits implementation (ported from SelectionDAG). I've also hooked up the AArch64PreLegalizerCombiner pass to use this - there should be no compile time regression as the analysis is lazy. llvm-svn: 368065
*	[TargetLowering] SimplifyMultipleUseDemandedBits - return UNDEF for ↵	Simon Pilgrim	2019-08-06	1	-1/+10
\| \| \| \| \| \| \| \|	undemanded ops If we demand no bits/elts from an Op, just return UNDEF llvm-svn: 368043
*	[TargetLowering][X86] Teach SimplifyDemandedVectorElts to replace the base ↵	Craig Topper	2019-08-04	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vector of INSERT_SUBVECTOR with undef if none of the elements are demanded even if the node has other users. Summary: The SimplifyDemandedVectorElts function can replace with undef when no elements are demanded, but due to how it interacts with TargetLoweringOpts, it can only do this when the node has no other users. Remove a now unneeded DAG combine from the X86 backend. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65713 llvm-svn: 367788
*	Emit diagnostic if an inline asm constraint requires an immediate	Bill Wendling	2019-08-03	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: An inline asm call can result in an immediate after inlining. Therefore emit a diagnostic here if constraint requires an immediate but one isn't supplied. Reviewers: joerg, mgorny, efriedma, rsmith Reviewed By: joerg Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, MaskRay, jyknight, dylanmckay, javed.absar, fedor.sergeev, jrtc27, Jim, krytarowski, eraman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60942 llvm-svn: 367750
*	[TargetLowering] SimplifyMultipleUseDemandedBits - don't assume ↵	Simon Pilgrim	2019-08-02	1	-1/+1
\| \| \| \| \| \| \| \|	INSERT_VECTOR_ELT value type is simple. Noticed by inspection - this was copied from the X86 target equivalent where we can assume its legal/simple. llvm-svn: 367721
*	[TargetLowering] SimplifyMultipleUseDemandedBits - Add ↵	Simon Pilgrim	2019-08-01	1	-0/+10
\| \| \| \| \| \| \| \|	ISD::INSERT_VECTOR_ELT handling Allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367588
*	[TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through ↵	Simon Pilgrim	2019-07-27	1	-2/+59
\| \| \| \| \| \| \| \| \| \| \| \|	support (Reapplied) This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back. This reapplies rL367091 which was reverted at rL367118 - we were inconsistently peeking through the bitcasts to the source value. Fixes PR42777 llvm-svn: 367174
*	[SelectionDAG] Check for any recursion depth greater than or equal to limit ↵	Simon Pilgrim	2019-07-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	instead of just equal the limit. If anything called the recursive isKnownNeverNaN/computeKnownBits/ComputeNumSignBits/SimplifyDemandedBits/SimplifyMultipleUseDemandedBits with an incorrect depth then we could continue to recurse if we'd already exceeded the depth limit. This replaces the limit check (Depth == 6) with a (Depth >= 6) to make sure that we don't circumvent it. This causes a couple of regressions as a mixture of calls (SimplifyMultipleUseDemandedBits + combineX86ShufflesRecursively) were calling with depths that were already over the limit. I've fixed SimplifyMultipleUseDemandedBits to not do this. combineX86ShufflesRecursively is trickier as we get a lot of regressions if we reduce its own limit from 8 to 6 (it also starts at Depth == 1 instead of Depth == 0 like the others....) - I'll see what I can do in future patches. llvm-svn: 367171
*	[TargetLowering] Add depth limit to SimplifyMultipleUseDemandedBits	Simon Pilgrim	2019-07-27	1	-0/+3
\| \| \| \| \| \|	We're getting reports of massive compile time increases because SimplifyMultipleUseDemandedBits was losing track of the depth and not earlying-out. No repro yet, but consider this a pre-emptive commit. llvm-svn: 367169
*	Revert r367091, it caused PR42777.	Nico Weber	2019-07-26	1	-59/+2
\| \| \| \|	llvm-svn: 367118
*	[TargetLowering] SimplifyMultipleUseDemandedBits - add SIGN_EXTEND_INREG ↵	Simon Pilgrim	2019-07-26	1	-0/+7
\| \| \| \| \| \|	support. llvm-svn: 367096
*	[TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through ↵	Simon Pilgrim	2019-07-26	1	-2/+59
\| \| \| \| \| \| \| \|	support. This allows us to peek through BITCASTs and attempt simplify the source operand, and then bitcast back. llvm-svn: 367091
*	[Codegen] (X & (C l>>/<< Y)) ==/!= 0 --> ((X <</l>> Y) & C) ==/!= 0 fold	Roman Lebedev	2019-07-24	1	-0/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This was originally reported in D62818. https://rise4fun.com/Alive/oPH InstCombine does the opposite fold, in hope that `C l>>/<< Y` expression will be hoisted out of a loop if `Y` is invariant and `X` is not. But as it is seen from the diffs here, if it didn't get hoisted, the produced assembly is almost universally worse. Much like with my recent "hoist add/sub by/from const" patches, we should get almost universal win if we hoist constant, there is almost always an "and/test by imm" instruction, but "shift of imm" not so much, so we may avoid having to materialize the immediate, and thus need one less register. And since we now shift not by constant, but by something else, the live-range of that something else may reduce. Special care needs to be applied not to disturb x86 `BT` / hexagon `tstbit` instruction pattern. And to not get into endless combine loop. Reviewers: RKSimon, efriedma, t.p.northover, craig.topper, spatel, arsenm Reviewed By: spatel Subscribers: hiraditya, MaskRay, wuzish, xbolva00, nikic, nemanjai, jvesely, wdng, nhaehnle, javed.absar, tpr, kristof.beyls, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62871 llvm-svn: 366955
*	[SDAG] convert (sub x, 1) to (add x, -1) in ctpop expansion; NFC	Sanjay Patel	2019-07-24	1	-3/+3
\| \| \| \| \| \|	We canonicalize to the add form, so create that directly for efficiency. llvm-svn: 366914
*	[TargetLowering] SimplifyMultipleUseDemandedBits - add VECTOR_SHUFFLE support.	Simon Pilgrim	2019-07-23	1	-0/+23
\| \| \| \| \| \| \| \|	If all the demanded elts are from one operand and are inline, then we can use the operand directly. The changes are mainly from SSE41 targets which has blendvpd but not cmpgtq, allowing the v2i64 comparison to be simplified as we only need the signbit from alternate v4i32 elements. llvm-svn: 366817