If we're comparing a value for equality against 2 constants
and the difference between those constants is a power of 2 (just 1 bit),
then we can offset and mask off that 1 bit and reduce to a single
compare against zero:
and/or (setcc X, C0, ne/eq), (setcc X, C1, ne/eq) -->
setcc (and (add X, -C1), ~(C0 - C1)), 0, ne/eq
https://rise4fun.com/Alive/XslKj
This transform is disabled by default using a TLI hook
("convertSetCCLogicToBitwiseLogic()").
That should be overridden for AArch64, MIPS, Sparc and possibly
others based on the asm shown in:
https://bugs.llvm.org/show_bug.cgi?id=40611
llvm-svn: 353859
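The fold described in r353859 can be checked by brute force on a small bit width. The following is only a sketch in Python, not the DAGCombiner code: the function name `fold_holds` and the 8-bit default are assumptions made for illustration, and arithmetic is modeled as wrapping modulo 2^bits.

```python
def fold_holds(bits: int = 8) -> bool:
    """Exhaustively check the 2-constant compare fold at a given bit width."""
    mask = (1 << bits) - 1
    for c1 in range(1 << bits):
        for k in range(bits):
            c0 = (c1 + (1 << k)) & mask      # C0 - C1 has exactly one bit set
            keep = ~(c0 - c1) & mask         # ~(C0 - C1), truncated to `bits`
            for x in range(1 << bits):
                folded = ((x - c1) & mask) & keep
                # 'or' of two 'eq' compares becomes a single 'eq 0'
                if ((x == c0) or (x == c1)) != (folded == 0):
                    return False
                # 'and' of two 'ne' compares becomes a single 'ne 0'
                if ((x != c0) and (x != c1)) != (folded != 0):
                    return False
    return True
```

After subtracting C1, the only values that survive the mask as zero are 0 and the single differing bit, which correspond to X == C1 and X == C0.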
llvm-svn: 353855
llvm-svn: 353789
llvm-svn: 353776
llvm-svn: 353773
llvm-svn: 353770
Add common X86/X64 prefixes (and use X86 instead of X32)
llvm-svn: 353716
This reverts commit r353610.
It causes a miscompile visible in macro expansion in a bootstrapped clang.
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190211/626590.html
llvm-svn: 353699
Check that when SimplifyCFG is flattening a 'br', all of the associated debug intrinsic instructions are removed, including any dbg.label referencing a label associated with the basic blocks being removed.
As the test case involves a CFG transformation, move it to the correct location.
Differential Revision: https://reviews.llvm.org/D57444
llvm-svn: 353682
Now that we have vector support for [US](ADD|SUB)O we no longer
need to scalarize when expanding [US](ADD|SUB)SAT.
This matches what the cost model already does.
Differential Revision: https://reviews.llvm.org/D57348
llvm-svn: 353651
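As a rough illustration of r353651's point that the saturating ops can be built on the overflow ops lane by lane, here is a Python model of the unsigned cases. It is a sketch only: the helper names and the list-based `per_lane` wrapper are assumptions for illustration, the signed variants are omitted, and none of this is the legalizer's code.

```python
def uaddo(a, b, bits=8):
    """Unsigned add with overflow flag, modeled on a `bits`-wide lane."""
    mask = (1 << bits) - 1
    total = (a + b) & mask
    return total, total < a          # a wrapped sum is smaller than an operand


def usubo(a, b, bits=8):
    """Unsigned subtract with overflow (borrow) flag."""
    mask = (1 << bits) - 1
    return (a - b) & mask, a < b


def uaddsat(a, b, bits=8):
    total, overflow = uaddo(a, b, bits)
    return ((1 << bits) - 1) if overflow else total   # clamp to all-ones


def usubsat(a, b, bits=8):
    diff, overflow = usubo(a, b, bits)
    return 0 if overflow else diff                    # clamp to zero


def per_lane(op, xs, ys, bits=8):
    """Apply a scalar op to each pair of lanes, standing in for a vector op."""
    return [op(x, y, bits) for x, y in zip(xs, ys)]
```

For example, `per_lane(uaddsat, [250, 1], [10, 1])` clamps the first lane and leaves the second lane as a plain sum.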
Shows missing SimplifyDemandedBits support
llvm-svn: 353647
Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero.
Differential Revision: https://reviews.llvm.org/D58009
llvm-svn: 353645
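To make the r353645 simplification concrete, the sketch below models funnel shifts on small integers and checks that a zero operand reduces them to ordinary shifts. It assumes the usual funnel shift definition (shift the concatenation x:y by the amount modulo the bit width); the function names are illustrative, and the undef case is not modeled.

```python
def fshl(x, y, z, bits=8):
    """Funnel shift left: high `bits` bits of (x:y) << (z mod bits)."""
    mask = (1 << bits) - 1
    concat = (x << bits) | y
    return ((concat << (z % bits)) >> bits) & mask


def fshr(x, y, z, bits=8):
    """Funnel shift right: low `bits` bits of (x:y) >> (z mod bits)."""
    mask = (1 << bits) - 1
    concat = (x << bits) | y
    return (concat >> (z % bits)) & mask


def zero_operand_folds_hold(bits=8):
    """Check each zero-operand funnel shift against a plain shift."""
    mask = (1 << bits) - 1
    for v in range(1 << bits):
        for z in range(bits):
            if fshl(v, 0, z, bits) != (v << z) & mask:
                return False
            if fshr(0, v, z, bits) != v >> z:
                return False
            if z != 0:  # a zero amount returns the other operand unchanged
                if fshl(0, v, z, bits) != v >> (bits - z):
                    return False
                if fshr(v, 0, z, bits) != (v << (bits - z)) & mask:
                    return False
    return True
```

The two folds guarded by `z != 0` need the shift amount to be known nonzero, since the inverted amount would otherwise equal the bit width.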
I've avoided 'modulo' masks as we'll SimplifyDemandedBits those in the future, and we just need to check that the shift variable is 'in range'
llvm-svn: 353644
256-bit horizontal math ops are an x86 monstrosity (and thankfully have
not been extended to 512-bit AFAIK).
The two 128-bit halves operate on separate halves of the inputs. So if we
don't demand anything in the upper half of the result, we can extract the
low halves of the inputs, do the math, and then insert that result into a
256-bit output.
All of the extract/insert is free (ymm<-->xmm), so we're left with a
narrower (cheaper) version of the original op.
In the affected tests based on:
https://bugs.llvm.org/show_bug.cgi?id=33758
https://bugs.llvm.org/show_bug.cgi?id=38971
...we see that the h-op narrowing can result in further narrowing of other
math via existing generic transforms.
I originally drafted this patch as an exact pattern match starting from
extract_vector_elt, but I thought we might see diffs starting from
extract_subvector too, so I changed it to a more general demanded elements
solution. There are no extra existing regression test improvements from
that switch though, so we could go back.
Differential Revision: https://reviews.llvm.org/D57841
llvm-svn: 353641
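The lane structure that makes the r353641 narrowing valid can be modeled with plain lists. This is a sketch assuming 8-element inputs (as for a 256-bit op on 32-bit elements) and a horizontal add; the function names are hypothetical and integers stand in for the element type.

```python
def hadd128(a, b):
    """128-bit horizontal add on 4-element inputs: pairwise sums of a, then b."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def hadd256(a, b):
    """256-bit horizontal add: each 128-bit half works on matching input halves."""
    return hadd128(a[:4], b[:4]) + hadd128(a[4:], b[4:])


def narrowed_low_half(a, b):
    """Extract the low halves of the inputs, then do the narrow op."""
    return hadd128(a[:4], b[:4])
```

When only the low half of the wide result is demanded, `hadd256(a, b)[:4]` and `narrowed_low_half(a, b)` agree, and the upper input halves never participate.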
As suggested on D58009
llvm-svn: 353640
SimplifySetCC still has much room for improvement, but this should
fix the remaining problem examples from:
https://bugs.llvm.org/show_bug.cgi?id=40657
The initial fix for this problem was rL353615.
llvm-svn: 353639
llvm-svn: 353638
If one of the shifted arguments is undef we should be folding to a regular shift.
llvm-svn: 353628
As discussed on D57389, this is a first step towards moving the SHLD/SHRD matching code to DAGCombiner using FSHL/FSHR instead.
There's a bit of work to do before I can do that, so this just folds to FSHL/FSHR in the existing code (handling the different SHRD/FSHR argument ordering), which fixes the issue we had with i16 shift amounts not being correctly masked.
llvm-svn: 353626
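The argument-ordering difference mentioned in r353626 can be sketched with a small model. The `shld`/`shrd` functions below are simplified stand-ins that only cover counts strictly between 0 and the operand width; the behaviour of the real instructions for other counts is not modeled. The funnel shifts take their amount modulo the bit width.

```python
def fshl(hi, lo, amt, bits=16):
    """Funnel shift left: high `bits` bits of (hi:lo) << (amt mod bits)."""
    mask = (1 << bits) - 1
    return ((((hi << bits) | lo) << (amt % bits)) >> bits) & mask


def fshr(hi, lo, amt, bits=16):
    """Funnel shift right: low `bits` bits of (hi:lo) >> (amt mod bits)."""
    mask = (1 << bits) - 1
    return (((hi << bits) | lo) >> (amt % bits)) & mask


def shld(dst, src, cnt, bits=16):
    """Simplified SHLD: shift dst left, filling low bits from the top of src."""
    mask = (1 << bits) - 1
    return ((dst << cnt) | (src >> (bits - cnt))) & mask


def shrd(dst, src, cnt, bits=16):
    """Simplified SHRD: shift dst right, filling high bits from the bottom of src."""
    mask = (1 << bits) - 1
    return ((dst >> cnt) | (src << (bits - cnt))) & mask


def orderings_match(bits=6):
    """SHLD keeps the operand order of FSHL; SHRD swaps it relative to FSHR."""
    for dst in range(1 << bits):
        for src in range(1 << bits):
            for cnt in range(1, bits):
                if shld(dst, src, cnt, bits) != fshl(dst, src, cnt, bits):
                    return False
                if shrd(dst, src, cnt, bits) != fshr(src, dst, cnt, bits):
                    return False
    return True
```

In this model an out-of-range amount such as 17 on a 16-bit funnel shift behaves like 1, which is the masking behaviour the fold relies on.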
llvm-svn: 353625
There's effectively no difference for the cases with variables.
We just trade a sub for an add on those. But the case with a
subtract from constant would require an extra move instruction
on x86, so this looks like a reasonable generic combine.
llvm-svn: 353619
llvm-svn: 353618
llvm-svn: 353616
llvm-svn: 353615
D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required.
With this ability, we can avoid most of the bitcasts/scaling in the DAG that were occurring with X86ISD::BLENDI lowering/combining: we blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalents.
This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts.
I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold; there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case).
The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV.
Differential Revision: https://reviews.llvm.org/D57888
llvm-svn: 353610
llvm-svn: 353580
These instructions can generate a stack overflow exception so technically they read the stack overflow exception mask bit.
llvm-svn: 353564
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
This patch adds a new CallBr IR instruction to support asm-goto
inline assembly, as supported by gcc and used by the Linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.
This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.
There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.
Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
Differential Revision: https://reviews.llvm.org/D53765
llvm-svn: 353563
The sqrt case is faster and we already do this for the case where
the exponent is 0.25. This adds the 0.75 case which is also not
sensitive to signed zeros.
Patch by Whitney Tsang (Whitney)
Differential revision: https://reviews.llvm.org/D57434
llvm-svn: 353557
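The identity behind the 0.25 and 0.75 cases in r353557 is x^0.25 = sqrt(sqrt(x)) and x^0.75 = sqrt(x) * sqrt(sqrt(x)). The Python sketch below is only a numerical illustration with assumed function names; it is not the DAG combine, and it says nothing about the fast-math conditions the real transform requires.

```python
import math


def pow_025(x):
    """x ** 0.25 written as two nested square roots."""
    return math.sqrt(math.sqrt(x))


def pow_075(x):
    """x ** 0.75 written as sqrt(x) * sqrt(sqrt(x))."""
    root = math.sqrt(x)
    return root * math.sqrt(root)
```

For an input of -0.0, both square roots in `pow_075` are -0.0 and their product is +0.0, so the sign of zero does not leak into the 0.75 result in this model.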
Replace OR(SHL,SRL) pattern with ISD::FSHR (legalization expands this later if necessary) - this helps with the scale == 0 'undefined' drop-through case that was discussed on D55720.
llvm-svn: 353546
llvm-svn: 353539
llvm-svn: 353534
Check that when SimplifyCFG is flattening a 'br', all of the associated debug intrinsic instructions are removed, including any dbg.label referencing a label associated with the basic blocks being removed.
Differential Revision: https://reviews.llvm.org/D57444
llvm-svn: 353511
Patch by Yuanke Luo
Reviewers: craig.topper, annita.zhang, smaslov, rnk, wxiao3
Reviewed By: rnk
Subscribers: efriedma, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57501
llvm-svn: 353492
floating point instructions.
Summary:
FPCW contains the rounding mode control which we manipulate to implement fp to integer conversion by changing the rounding mode, storing the value to the stack, and then changing the rounding mode back. Because we didn't model FPCW and its dependency chain, other instructions could be scheduled into the middle of the sequence.
This patch introduces the register and adds it as an implicit def of FLDCW and implicit use of the FP binary arithmetic instructions and store instructions. There are more instructions that need to be updated, but this is a good start. I believe this fixes at least the reduced test case from PR40529.
Reviewers: RKSimon, spatel, rnk, efriedma, andrew.w.kaylor
Subscribers: dim, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57735
llvm-svn: 353489
This is part of https://bugs.llvm.org/show_bug.cgi?id=40442.
Vector legalization is implemented for the add/sub overflow opcodes.
UMULO/SMULO are also handled as far as legalization is concerned, but
they don't support vector expansion yet (so no tests for them).
The vector result widening implementation is suboptimal, because it
could result in a legalization loop.
Differential Revision: https://reviews.llvm.org/D57639
llvm-svn: 353464
This is intentionally a small step because it's hard to know exactly
where we might introduce a conflicting transform with the code that
tries to form wider shuffles. But I think this is safe - if we have
a wide shuffle with 2 operands, then we should do better with an
extract + narrow shuffle.
Differential Revision: https://reviews.llvm.org/D57867
llvm-svn: 353427
llvm-svn: 353334
llvm-svn: 353332
Allow custom handling of inline assembly output parameters and add X86
flag parameter support.
llvm-svn: 353307
The proposal in D56796 may cross the line because we're trying to avoid vectorization
transforms in generic DAG combining. So this is an alternate, later, x86-specific
translation of that patch.
There are several potential follow-ups to enhance this:
1. Allow extraction from non-zero element index.
2. Peek through extends of smaller width integers.
3. Support x86-specific conversion opcodes like X86ISD::CVTSI2P
Differential Revision: https://reviews.llvm.org/D56864
llvm-svn: 353302
llvm-svn: 353266
llvm-svn: 353249
We now print the implicit %st register on these instructions, but since they occur at the end of the line, FileCheck didn't see they were missing.
llvm-svn: 353222
rL352997 enabled ZERO_EXTEND from non-shuffle-able value types. I've disabled it for now to fix a regression identified by @asbirlea until I can fix this properly.
llvm-svn: 353198
llvm-svn: 353189
llvm-svn: 353182
llvm-svn: 353178
I'm going to be adding SimplifyDemandedBits tests shortly.
llvm-svn: 353171
llvm-svn: 353165