| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Still need to remove masking from the 512-bit versions.
llvm-svn: 339841
|
|
|
|
|
|
| |
MSVC only accepts if-else chains up to 127 blocks long. I've had to merge a number of intrinsic cases together to get back below this limit, resulting in some duplication of string matches; this shouldn't cause any notable increase in runtime (and even then only for old IR, nothing that clang currently emits).
llvm-svn: 339666
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This revision improves previous version (rL330322) which has been reverted due to crashes.
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.
Reviewers: craig.topper, spatel, RKSimon
Reviewed By: craig.topper
Subscribers: mike.dvoretsky, DavidKreitzer, sroland, llvm-commits
Differential Revision: https://reviews.llvm.org/D46179
llvm-svn: 339650
|
|
|
|
|
|
| |
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}
llvm-svn: 338293
|
|
|
|
|
|
| |
This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches.
llvm-svn: 336871
|
|
|
|
|
|
|
|
| |
The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector.
There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG.
llvm-svn: 336416
|
|
|
|
|
|
|
|
|
|
| |
unmasked 512-bit intrinsics with rounding mode.
This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking.
This matches how clang uses the intrinsics these days.
llvm-svn: 336409
|
|
|
|
|
|
|
|
| |
'llvm.fma'. Add upgrade tests for all.
Still need to remove the AVX512 masked versions.
llvm-svn: 336383
|
|
|
|
|
|
| |
independent FMA and extractelement/insertelement.
llvm-svn: 336315
|
|
|
|
|
|
|
|
|
|
| |
in clang.
There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes.
More removals to come, but I wanted to stop and fix the regression that showed up in this first.
llvm-svn: 336303
|
|
|
|
| |
llvm-svn: 336035
|
|
|
|
|
|
|
|
| |
IR instead.
While there improve the coverage of the intrinsic testing and add fast-isel tests.
llvm-svn: 335944
|
|
|
|
|
|
|
|
| |
that don't take a mask as input to exclude '.mask.' from their name.
I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking builtin. When we reach that goal, we should have no intrinsics named "avx512.mask".
llvm-svn: 335744
|
|
|
|
|
|
|
|
|
|
|
|
| |
implement the mask input argument using an 'and' IR instruction.
This recommits r335562 and 335563 as a single commit.
The frontend will surround the intrinsic with the appropriate marshalling to/from a scalar type to match the sigature of the builtin that software expects.
By exposing the vXi1 type directly in the llvm intrinsic we make it available to optimizers much earlier. This can enable the scalar marshalling code to be optimized away.
llvm-svn: 335568
|
|
|
|
|
|
|
|
| |
to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction."
These were supposed to have been squashed to a single commit.
llvm-svn: 335566
|
|
|
|
| |
llvm-svn: 335562
|
|
|
|
|
|
| |
instruction instead.
llvm-svn: 335199
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Complementary patch to lowering sqrt intrinsics in Clang.
Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k
Reviewed By: craig.topper
Subscribers: tkrupa, mike.dvoretsky, llvm-commits
Differential Revision: https://reviews.llvm.org/D41599
llvm-svn: 334849
|
|
|
|
|
|
| |
intrinsics. Use select in IR instead.
llvm-svn: 334576
|
|
|
|
| |
llvm-svn: 334384
|
|
|
|
|
|
| |
We use the target independent intrinsics now.
llvm-svn: 334381
|
|
|
|
|
|
| |
intrinsics. Use a select in IR instead.
llvm-svn: 334358
|
|
|
|
|
|
| |
intrinsics and select instructions.
llvm-svn: 333857
|
|
|
|
|
|
| |
We have unmasked intrinsics now and wrap them with a select. This is a net reduction of 36 intrinsics from before the unmasked intrinsics were added.
llvm-svn: 333388
|
|
|
|
|
|
| |
This allows us to avoid having mask and maskz variant. Reducing from 12 intrinsics to 6.
llvm-svn: 333346
|
|
|
|
|
|
| |
These can all be implemented with sitofp/uitofp instructions.
llvm-svn: 332916
|
|
|
|
|
|
|
|
| |
This removes 6 intrinsics since we no longer need separate mask and maskz intrinsics.
Differential Revision: https://reviews.llvm.org/D47124
llvm-svn: 332890
|
|
|
|
|
|
|
|
| |
in IR instead.
Someday maybe we'll use selects for all intrinsics.
llvm-svn: 332824
|
|
|
|
|
|
| |
intrinsics.
llvm-svn: 332271
|
|
|
|
|
|
| |
uitofp+insertelement instead.
llvm-svn: 332206
|
|
|
|
| |
llvm-svn: 332198
|
|
|
|
| |
llvm-svn: 332187
|
|
|
|
|
|
| |
clang has used for a very long time.
llvm-svn: 332186
|
|
|
|
|
|
|
|
| |
with an older intrinsic and a select.
This is what clang already uses.
llvm-svn: 332170
|
|
|
|
|
|
| |
used by clang.
llvm-svn: 332146
|
|
|
|
| |
llvm-svn: 332079
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is one of the initial commit of "RFC: Devirtualization v2" proposal:
https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing
Reviewers: rsmith, amharc, kuhar, sanjoy
Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45111
llvm-svn: 331448
|
|
|
|
|
|
|
|
| |
The LLVM commit introduces a crash in LLVM's instruction selection.
I filed http://llvm.org/PR37260 with the test case.
llvm-svn: 330997
|
|
|
|
| |
llvm-svn: 330614
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.
Patch by tkrupa
Differential Revision: https://reviews.llvm.org/D44785
llvm-svn: 330322
|
|
|
|
|
|
| |
Older compiler issued '#' instead of ';'
llvm-svn: 330173
|
|
|
|
|
|
|
|
| |
This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics.
We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well.
llvm-svn: 329990
|
|
|
|
|
|
|
|
| |
512-bit masked intrinsic with unmasked intrinsic and a select.
The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency.
llvm-svn: 329774
|
|
|
|
|
|
|
|
| |
need to upgrade to an unmasked version plus a select. NFCI
These are were previously grouped in small groups of similarish intrinsics. But all the intrinsics have the same number of arguments and the same order. So we can move them all into a larger group for handling.
llvm-svn: 329549
|
|
|
|
|
|
| |
Older compiler issued '#' instead of ';'
llvm-svn: 329248
|
|
|
|
|
|
|
|
| |
auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics.
The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select.
llvm-svn: 325559
|
|
|
|
|
|
|
|
| |
The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior.
Fixes PR36360.
llvm-svn: 324953
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vXi1 mask type to be closer to an fcmp.
Summary:
This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit and in the IR.
This makes the intrinsic look much more similar to an fcmp instruction that we wish we could use for these but can't. We already use icmp instructions for integer compares.
Previously the lowering step of isel would turn the intrinsic into an X86 specific ISD node and a emit the masking nodes as well as some bitcasts. This means DAG combines can't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it.
This should make any issues with gpr<->mask sequences the same between integer and fp. Meaning we only have to fix them once.
Reviewers: spatel, delena, RKSimon, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43137
llvm-svn: 324827
|
|
|
|
| |
llvm-svn: 324646
|
|
|
|
|
|
|
|
| |
Clang already stopped using these a couple months ago.
The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around.
llvm-svn: 324177
|