path: root/llvm/test/CodeGen/X86
Commit message | Author | Age | Files | Lines
* [DAGCombiner] convert logic-of-setcc into bit magic (PR40611) | Sanjay Patel | 2019-02-12 | 1 | -8/+4
  If we're comparing some value for equality against 2 constants and those constants have an absolute difference of just 1 bit, then we can offset and mask off that 1 bit and reduce to a single compare against zero:
  and/or (setcc X, C0, ne), (setcc X, C1, ne/eq) --> setcc ((add X, -C1), ~(C0 - C1)), 0, ne/eq
  https://rise4fun.com/Alive/XslKj
  This transform is disabled by default using a TLI hook ("convertSetCCLogicToBitwiseLogic()"). That should be overridden for AArch64, MIPS, Sparc and possibly others based on the asm shown in: https://bugs.llvm.org/show_bug.cgi?id=40611
  llvm-svn: 353859
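The fold described in this commit can be checked exhaustively at a small bit width. The following is a minimal Python sketch, not LLVM code: `fold_ne`, `fold_eq`, and the 8-bit width are assumptions made for illustration, and the operation combining `(add X, -C1)` with `~(C0 - C1)` is read as a bitwise AND, following the "mask off that 1 bit" wording.

```python
BITS = 8                      # assumed width, small enough to test exhaustively
MASK = (1 << BITS) - 1


def fold_ne(x, c0, c1):
    """Single-compare form of (x != c0) and (x != c1)."""
    diff = (c0 - c1) & MASK
    # Precondition from the commit message: the constants differ by one bit.
    assert diff != 0 and diff & (diff - 1) == 0
    offset = (x - c1) & MASK              # add X, -C1 (wrapping)
    return (offset & ~diff & MASK) != 0   # mask off the differing bit, compare to 0


def fold_eq(x, c0, c1):
    """Single-compare form of (x == c0) or (x == c1)."""
    return not fold_ne(x, c0, c1)
```

After the offset, `x == c1` maps to 0 and `x == c0` maps to the single bit `diff`; clearing that bit leaves zero in exactly those two cases.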
* [x86] add negative tests for setcc folds; NFC | Sanjay Patel | 2019-02-12 | 1 | -0/+70
  llvm-svn: 353855
* [x86] add tests for logic of setcc (PR40611); NFC | Sanjay Patel | 2019-02-12 | 1 | -0/+30
  llvm-svn: 353789
* [Test] Use autogenerated checks for more statepoint tests | Philip Reames | 2019-02-12 | 4 | -125/+303
  llvm-svn: 353776
* [Tests] Fill out a few tests around gc relocation uniquing | Philip Reames | 2019-02-12 | 1 | -3/+36
  llvm-svn: 353773
* [Test] Autogenerate a statepoint test and actually show the reload | Philip Reames | 2019-02-11 | 1 | -2/+32
  llvm-svn: 353770
* [X86] Regenerate insertelement tests | Simon Pilgrim | 2019-02-11 | 1 | -66/+50
  Add common X86/X64 prefixes (and use X86 instead of X32)
  llvm-svn: 353716
* Revert "[X86][SSE] Generalize X86ISD::BLENDI support to more value types" | Sam McCall | 2019-02-11 | 17 | -227/+325
  This reverts commit r353610. It causes a miscompile visible in macro expansion in a bootstrapped clang.
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190211/626590.html
  llvm-svn: 353699
* [DWARF] LLVM ERROR: Broken function found, while removing Debug Intrinsics. | Carlos Alberto Enciso | 2019-02-11 | 1 | -50/+0
  Check that when SimplifyCFG is flattening a 'br', all their debug intrinsic instructions are removed, including any dbg.label referencing a label associated with the basic blocks being removed.
  As the test case involves a CFG transformation, move it to the correct location.
  Differential Revision: https://reviews.llvm.org/D57444
  llvm-svn: 353682
* [CodeGen][X86] Don't scalarize vector saturating add/sub | Nikita Popov | 2019-02-10 | 4 | -6078/+4322
  Now that we have vector support for [US](ADD|SUB)O we no longer need to scalarize when expanding [US](ADD|SUB)SAT. This matches what the cost model already does.
  Differential Revision: https://reviews.llvm.org/D57348
  llvm-svn: 353651
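To illustrate why vector overflow ops make scalarization unnecessary, here is a small Python model of an unsigned saturating add built from an add-with-overflow primitive and applied lane by lane. It is only a sketch of the idea: `uaddo`, `uadd_sat`, `uadd_sat_vec`, and the 8-bit lane width are hypothetical, and this is not the expansion code in LLVM.

```python
BITS = 8                      # assumed lane width
MAX = (1 << BITS) - 1


def uaddo(a, b):
    """Unsigned add with overflow: returns (wrapped sum, overflow flag)."""
    total = a + b
    return total & MAX, total > MAX


def uadd_sat(a, b):
    """Saturating add: select the maximum value when the add overflowed."""
    result, overflow = uaddo(a, b)
    return MAX if overflow else result


def uadd_sat_vec(xs, ys):
    """Apply the same overflow-then-select pattern to every lane."""
    return [uadd_sat(a, b) for a, b in zip(xs, ys)]
```

Because every lane follows the same add, overflow test, and select sequence, the whole computation can stay in vector form once the overflow op itself is available on vectors.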
* [X86] Add basic bitreverse/bswap combine tests | Simon Pilgrim | 2019-02-10 | 2 | -0/+159
  Shows missing SimplifyDemandedBits support
  llvm-svn: 353647
* [DAGCombine] Simplify funnel shifts with undef/zero args to bitshifts | Simon Pilgrim | 2019-02-10 | 1 | -40/+36
  Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero.
  Differential Revision: https://reviews.llvm.org/D58009
  llvm-svn: 353645
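The zero-argument case can be modelled in a few lines of Python. This sketch assumes the usual funnel-shift semantics (concatenate the two operands, shift by the amount modulo the bit width, keep one half); `fshl`, `fshr`, and the 8-bit width are illustrative names and choices, not LLVM code.

```python
BITS = 8                      # assumed operand width
MASK = (1 << BITS) - 1


def fshl(a, b, c):
    """Funnel shift left: high BITS of (a:b) shifted left by c mod BITS."""
    c %= BITS
    wide = (a << BITS) | b
    return ((wide << c) >> BITS) & MASK


def fshr(a, b, c):
    """Funnel shift right: low BITS of (a:b) shifted right by c mod BITS."""
    c %= BITS
    wide = (a << BITS) | b
    return (wide >> c) & MASK
```

With an in-range shift amount, `fshl(x, 0, c)` equals `x << c` and `fshr(0, y, c)` equals `y >> c`. The other two zero cases become shifts by `BITS - c`, which is only a valid plain shift when `c` is non-zero.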
* [X86] Add masked variable tests for funnel undef/zero argument combines | Simon Pilgrim | 2019-02-10 | 1 | -0/+90
  I've avoided 'modulo' masks as we'll SimplifyDemandedBits those in the future, and we just need to check that the shift variable is 'in range'
  llvm-svn: 353644
* [x86] narrow 256-bit horizontal ops via demanded elements | Sanjay Patel | 2019-02-10 | 2 | -59/+21
  256-bit horizontal math ops are an x86 monstrosity (and thankfully have not been extended to 512-bit AFAIK).
  The two 128-bit halves operate on separate halves of the inputs. So if we don't demand anything in the upper half of the result, we can extract the low halves of the inputs, do the math, and then insert that result into a 256-bit output.
  All of the extract/insert is free (ymm<-->xmm), so we're left with a narrower (cheaper) version of the original op.
  In the affected tests based on: https://bugs.llvm.org/show_bug.cgi?id=33758 https://bugs.llvm.org/show_bug.cgi?id=38971 ...we see that the h-op narrowing can result in further narrowing of other math via existing generic transforms.
  I originally drafted this patch as an exact pattern match starting from extract_vector_elt, but I thought we might see diffs starting from extract_subvector too, so I changed it to a more general demanded elements solution. There are no extra existing regression test improvements from that switch though, so we could go back.
  Differential Revision: https://reviews.llvm.org/D57841
  llvm-svn: 353641
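The narrowing argument can be seen in a toy Python model of a horizontal add over 32-bit lanes, with lists standing in for vector registers. The helper names are hypothetical and `None` marks lanes that are not demanded; this is a sketch of the lane layout only, not of the DAG combine.

```python
def hadd128(a, b):
    """4-lane horizontal add: pairwise sums of a, then pairwise sums of b."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def hadd256(a, b):
    """8-lane version: each 128-bit half is computed independently."""
    return hadd128(a[:4], b[:4]) + hadd128(a[4:], b[4:])


def hadd256_low_only(a, b):
    """Narrowed form for when only the low 4 result lanes are demanded."""
    low = hadd128(a[:4], b[:4])     # extract low halves, do the narrow op
    return low + [None] * 4         # upper lanes are left undefined
```

Since the low result half never reads the upper input halves, the narrow op yields the same low lanes as the wide one.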
* [X86] Add additional tests for funnel undef/zero argument combines | Simon Pilgrim | 2019-02-10 | 1 | -0/+216
  As suggested on D58009
  llvm-svn: 353640
* [TargetLowering] refactor setcc folds to fix another miscompile (PR40657) | Sanjay Patel | 2019-02-10 | 1 | -2/+8
  SimplifySetCC still has much room for improvement, but this should fix the remaining problem examples from: https://bugs.llvm.org/show_bug.cgi?id=40657
  The initial fix for this problem was rL353615.
  llvm-svn: 353639
* [X86][SSE] Add SimplifyDemandedBits test for BLENDVPD | Simon Pilgrim | 2019-02-10 | 1 | -0/+26
  llvm-svn: 353638
* [X86] Add tests for funnel undef argument combines | Simon Pilgrim | 2019-02-09 | 1 | -0/+138
  If one of the shifted arguments is undef we should be folding to a regular shift.
  llvm-svn: 353628
* [X86] CombineOr - fold to generic funnel shifts | Simon Pilgrim | 2019-02-09 | 2 | -4/+12
  As discussed on D57389, this is a first step towards moving the SHLD/SHRD matching code to DAGCombiner using FSHL/FSHR instead.
  There's a bit of work to do before I can do that, so this just folds to FSHL/FSHR in the existing code (handling the different SHRD/FSHR argument ordering), which fixes the issue we had with i16 shift amounts not being correctly masked.
  llvm-svn: 353626
* [x86] add another test for setcc miscompile (PR40657); NFC | Sanjay Patel | 2019-02-09 | 1 | -0/+19
  llvm-svn: 353625
* [TargetLowering] add tests to show effect of setcc sub->shift; NFC | Sanjay Patel | 2019-02-09 | 1 | -0/+38
  There's effectively no difference for the cases with variables. We just trade a sub for an add on those. But the case with a subtract from constant would require an extra move instruction on x86, so this looks like a reasonable generic combine.
  llvm-svn: 353619
* [x86] add test for setcc sub->shift transform; NFC | Sanjay Patel | 2019-02-09 | 1 | -0/+14
  llvm-svn: 353618
* [X86] Regenerate test. | Simon Pilgrim | 2019-02-09 | 1 | -2/+22
  llvm-svn: 353616
* [TargetLowering] avoid miscompile in setcc transform (PR40657) | Sanjay Patel | 2019-02-09 | 1 | -2/+5
  llvm-svn: 353615
* [X86][SSE] Generalize X86ISD::BLENDI support to more value types | Simon Pilgrim | 2019-02-09 | 17 | -325/+227
  D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required.
  With this ability, we can avoid most bitcasts/scaling in the DAG that was occurring with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalent vectors.
  This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts.
  I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold, there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case).
  The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV.
  Differential Revision: https://reviews.llvm.org/D57888
  llvm-svn: 353610
* [x86] add test for miscompiling setcc transform (PR40657); NFC | Sanjay Patel | 2019-02-08 | 1 | -0/+18
  llvm-svn: 353580
* [X86] Add FPCW as an implicit use on floating point load instructions. | Craig Topper | 2019-02-08 | 2 | -2/+2
  These instructions can generate a stack overflow exception so technically they read the stack overflow exception mask bit.
  llvm-svn: 353564
* Implementation of asm-goto support in LLVM | Craig Topper | 2019-02-08 | 6 | -0/+441
  This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
  This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today.
  This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction.
  There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model.
  Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
  Differential Revision: https://reviews.llvm.org/D53765
  llvm-svn: 353563
* [DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X)) | Nemanja Ivanovic | 2019-02-08 | 1 | -0/+48
  The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros.
  Patch by Whitney Tsang (Whitney)
  Differential revision: https://reviews.llvm.org/D57434
  llvm-svn: 353557
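The identity behind this combine, x^0.75 = x^0.5 * x^0.25, can be tried numerically in Python. This is a sketch for non-negative finite inputs and zeros only; `pow_075` is a hypothetical helper, and the tolerance is an assumption, since the two forms may round differently.

```python
import math


def pow_075(x):
    """x**0.75 computed as sqrt(x) * sqrt(sqrt(x))."""
    root = math.sqrt(x)              # x**0.5
    return root * math.sqrt(root)    # x**0.5 * x**0.25
```

For a negative zero input, both square roots are -0.0 and their product is +0.0, matching the sign of `math.pow(-0.0, 0.75)`. That is one way to read the commit's remark about signed zeros.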
* [TargetLowering] Use ISD::FSHR in expandFixedPointMul | Simon Pilgrim | 2019-02-08 | 1 | -6/+6
  Replace OR(SHL,SRL) pattern with ISD::FSHR (legalization expands this later if necessary) - this helps with the scale == 0 'undefined' drop-through case that was discussed on D55720.
  llvm-svn: 353546
* [TargetLowering] Add SimplifyDemandedBits funnel shift support | Simon Pilgrim | 2019-02-08 | 1 | -15/+7
  llvm-svn: 353539
* [X86] Add basic funnel shift demanded bits tests | Simon Pilgrim | 2019-02-08 | 1 | -0/+48
  llvm-svn: 353534
* [DWARF] LLVM ERROR: Broken function found, while removing Debug Intrinsics. | Carlos Alberto Enciso | 2019-02-08 | 1 | -0/+50
  Check that when SimplifyCFG is flattening a 'br', all their debug intrinsic instructions are removed, including any dbg.label referencing a label associated with the basic blocks being removed.
  Differential Revision: https://reviews.llvm.org/D57444
  llvm-svn: 353511
* Fix the lowering issue of intrinsics llvm.localaddress on X86 | Craig Topper | 2019-02-08 | 1 | -0/+78
  Patch by Yuanke Luo
  Reviewers: craig.topper, annita.zhang, smaslov, rnk, wxiao3
  Reviewed By: rnk
  Subscribers: efriedma, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D57501
  llvm-svn: 353492
* [X86] Add FPCW as a register and start using it as an implicit use on floating point instructions. | Craig Topper | 2019-02-08 | 3 | -5/+5
  Summary: FPCW contains the rounding mode control which we manipulate to implement fp to integer conversion by changing the rounding mode, storing the value to the stack, and then changing the rounding mode back. Because we didn't model FPCW and its dependency chain, other instructions could be scheduled into the middle of the sequence.
  This patch introduces the register and adds it as an implicit def of FLDCW and implicit use of the FP binary arithmetic instructions and store instructions. There are more instructions that need to be updated, but this is a good start.
  I believe this fixes at least the reduced test case from PR40529.
  Reviewers: RKSimon, spatel, rnk, efriedma, andrew.w.kaylor
  Subscribers: dim, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D57735
  llvm-svn: 353489
* [CodeGen] Handle vector UADDO, SADDO, USUBO, SSUBO | Nikita Popov | 2019-02-07 | 4 | -0/+6909
  This is part of https://bugs.llvm.org/show_bug.cgi?id=40442.
  Vector legalization is implemented for the add/sub overflow opcodes. UMULO/SMULO are also handled as far as legalization is concerned, but they don't support vector expansion yet (so no tests for them).
  The vector result widening implementation is suboptimal, because it could result in a legalization loop.
  Differential Revision: https://reviews.llvm.org/D57639
  llvm-svn: 353464
* [x86] split more 256/512-bit shuffles in lowering | Sanjay Patel | 2019-02-07 | 5 | -103/+46
  This is intentionally a small step because it's hard to know exactly where we might introduce a conflicting transform with the code that tries to form wider shuffles. But I think this is safe - if we have a wide shuffle with 2 operands, then we should do better with an extract + narrow shuffle.
  Differential Revision: https://reviews.llvm.org/D57867
  llvm-svn: 353427
* [X86] Change the CPU on the test case for pr40529.ll to really show the bug. NFC | Craig Topper | 2019-02-06 | 1 | -4/+4
  llvm-svn: 353334
* [x86] add tests for horizontal ops (PR38971, PR33758); NFC | Sanjay Patel | 2019-02-06 | 2 | -0/+342
  llvm-svn: 353332
* [InlineAsm][X86] Add backend support for X86 flag output parameters. | Nirav Dave | 2019-02-06 | 1 | -0/+954
  Allow custom handling of inline assembly output parameters and add X86 flag parameter support.
  llvm-svn: 353307
* [x86] vectorize cast ops in lowering to avoid register file transfers | Sanjay Patel | 2019-02-06 | 3 | -41/+70
  The proposal in D56796 may cross the line because we're trying to avoid vectorization transforms in generic DAG combining. So this is an alternate, later, x86-specific translation of that patch.
  There are several potential follow-ups to enhance this:
  1. Allow extraction from non-zero element index.
  2. Peek through extends of smaller width integers.
  3. Support x86-specific conversion opcodes like X86ISD::CVTSI2P
  Differential Revision: https://reviews.llvm.org/D56864
  llvm-svn: 353302
* [Test] Add codegen tests for unordered and monotonic integer operations | Philip Reames | 2019-02-06 | 2 | -0/+236
  llvm-svn: 353266
* [x86] add tests for extract+sitofp; NFC | Sanjay Patel | 2019-02-06 | 1 | -0/+49
  llvm-svn: 353249
* [X86] Regenerate tests missed in r353061. NFC | Craig Topper | 2019-02-05 | 2 | -40/+40
  We now print the implicit %st register on these instructions, but since they occur at the end of the line, FileCheck didn't see they were missing.
  llvm-svn: 353222
* [X86][SSE] Disable ZERO_EXTEND shuffle combining | Simon Pilgrim | 2019-02-05 | 2 | -19/+34
  rL352997 enabled ZERO_EXTEND from non-shuffle-able value types. I've disabled it for now to fix a regression identified by @asbirlea until I can fix this properly.
  llvm-svn: 353198
* [X86][AVX] Attempt to combine shuffles to subvector broadcast load | Simon Pilgrim | 2019-02-05 | 1 | -4/+2
  llvm-svn: 353189
* [X86][AVX] Add PR34041 subvector broadcast test cases | Simon Pilgrim | 2019-02-05 | 1 | -0/+170
  llvm-svn: 353182
* [AArch64][x86] add tests for unsigned subtract with overflow; NFC | Sanjay Patel | 2019-02-05 | 1 | -0/+205
  llvm-svn: 353178
* [X86][SSE] Rename SimplifyDemandedVectorElts BLENDV tests | Simon Pilgrim | 2019-02-05 | 1 | -6/+6
  I'm going to be adding SimplifyDemandedBits tests shortly.
  llvm-svn: 353171
* [X86][SSE] Add SimplifyDemandedVectorElts support for X86ISD::BLENDV | Simon Pilgrim | 2019-02-05 | 1 | -11/+4
  llvm-svn: 353165