Commit message | Author | Date | Files | Lines
---|---|---|---|---
* Remove trailing space | Fangrui Song | 2018-07-30 | 1 | -1/+1
  sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}
  llvm-svn: 338293
* [X86] Remove and autoupgrade the scalar fma intrinsics with masking. | Craig Topper | 2018-07-12 | 1 | -40/+99
  This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still; I'll try to address those with follow-up patches.
  llvm-svn: 336871
* [X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead. | Craig Topper | 2018-07-06 | 1 | -0/+16
  The intrinsics can be implemented with an f32/f64 llvm.fma intrinsic and an insert into a zero vector. There are a couple of regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not too worried about this, though, as InstCombine should be able to do it before we get to SelectionDAG.
  llvm-svn: 336416
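The pattern described here, as a minimal IR sketch (function name hypothetical; llvm.fma is the real target-independent intrinsic):

```llvm
; Hypothetical upgrade of an FMA4 scalar fma: do the math in scalar with
; llvm.fma, then insert into a zero vector, since the FMA4 scalar ops
; zero the upper elements.
define <4 x float> @fma4_ss_upgraded(<4 x float> %a, <4 x float> %b, <4 x float> %c) {
  %a0 = extractelement <4 x float> %a, i64 0
  %b0 = extractelement <4 x float> %b, i64 0
  %c0 = extractelement <4 x float> %c, i64 0
  %f  = call float @llvm.fma.f32(float %a0, float %b0, float %c0)
  %r  = insertelement <4 x float> zeroinitializer, float %f, i64 0
  ret <4 x float> %r
}
declare float @llvm.fma.f32(float, float, float)
```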
* [X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or unmasked 512-bit intrinsics with rounding mode. | Craig Topper | 2018-07-06 | 1 | -2/+128
  This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd, and uses a select instruction for masking. This matches how clang uses the intrinsics these days.
  llvm-svn: 336409
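A sketch of the fneg-plus-select shape for a masked 512-bit fmsub (written with the modern fneg instruction; at the time this was spelled as an fsub from -0.0):

```llvm
; fmsub = fma(a, b, -c); masking becomes a select against the passthru
; operand (%a for the _mask form).
define <8 x double> @fmsub_pd_512_masked(<8 x double> %a, <8 x double> %b, <8 x double> %c, i8 %mask) {
  %negc = fneg <8 x double> %c
  %fma  = call <8 x double> @llvm.fma.v8f64(<8 x double> %a, <8 x double> %b, <8 x double> %negc)
  %m    = bitcast i8 %mask to <8 x i1>
  %res  = select <8 x i1> %m, <8 x double> %fma, <8 x double> %a
  ret <8 x double> %res
}
declare <8 x double> @llvm.fma.v8f64(<8 x double>, <8 x double>, <8 x double>)
```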
* [X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to 'llvm.fma'. Add upgrade tests for all. | Craig Topper | 2018-07-05 | 1 | -19/+25
  Still need to remove the AVX512 masked versions.
  llvm-svn: 336383
* [X86] Remove X86 specific scalar FMA intrinsics and upgrade to target independent FMA and extractelement/insertelement. | Craig Topper | 2018-07-05 | 1 | -52/+33
  llvm-svn: 336315
* [X86] Remove some of the packed FMA3 intrinsics since we no longer use them in clang. | Craig Topper | 2018-07-05 | 1 | -40/+32
  There's a regression in here due to the inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first.
  llvm-svn: 336303
* [X86] Remove masking from avx512 rotate intrinsics. Use select in IR instead. | Craig Topper | 2018-06-30 | 1 | -0/+64
  llvm-svn: 336035
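This "unmasked operation plus a select" shape recurs throughout the entries below. A generic sketch, with the operation as a stand-in function rather than a real intrinsic:

```llvm
; Stand-in for whichever unmasked avx512 intrinsic is being wrapped.
declare <16 x i32> @unmasked_op(<16 x i32>, <16 x i32>)

; Masked form: call the unmasked op, then select lanes with the mask;
; lanes with a clear mask bit take the passthru value.
define <16 x i32> @masked_op(<16 x i32> %a, <16 x i32> %b, <16 x i32> %passthru, i16 %mask) {
  %v = call <16 x i32> @unmasked_op(<16 x i32> %a, <16 x i32> %b)
  %m = bitcast i16 %mask to <16 x i1>
  %r = select <16 x i1> %m, <16 x i32> %v, <16 x i32> %passthru
  ret <16 x i32> %r
}
```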
* [X86] Remove masking from the avx512 packed sqrt intrinsics. Use select in IR instead. | Craig Topper | 2018-06-29 | 1 | -8/+15
  While there, improve the coverage of the intrinsic testing and add fast-isel tests.
  llvm-svn: 335944
* [X86] Rename the autoupgraded versions of the packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name. | Craig Topper | 2018-06-27 | 1 | -111/+65
  I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic, rather than using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually no intrinsic will have masking built in; when we reach that goal, we should have no intrinsics named "avx512.mask".
  llvm-svn: 335744
* [X86] Redefine avx512 packed fpclass intrinsics to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction. | Craig Topper | 2018-06-26 | 1 | -0/+43
  This recommits r335562 and r335563 as a single commit. The frontend will surround the intrinsic with the appropriate marshalling to/from a scalar type to match the signature of the builtin that software expects. By exposing the vXi1 type directly in the llvm intrinsic we make it available to optimizers much earlier. This can enable the scalar marshalling code to be optimized away.
  llvm-svn: 335568
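A sketch of the new shape, assuming the post-change name and signature of the 512-bit pd variant (the i32 operand is the class-test immediate):

```llvm
; The intrinsic now returns <8 x i1> directly; the input mask becomes an
; explicit 'and', and the scalar marshalling is an explicit bitcast.
define i8 @fpclass_pd_512_masked(<8 x double> %x, i8 %mask) {
  %c  = call <8 x i1> @llvm.x86.avx512.fpclass.pd.512(<8 x double> %x, i32 4)
  %m  = bitcast i8 %mask to <8 x i1>
  %am = and <8 x i1> %c, %m
  %r  = bitcast <8 x i1> %am to i8
  ret i8 %r
}
declare <8 x i1> @llvm.x86.avx512.fpclass.pd.512(<8 x double>, i32)
```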
* Revert r335562 and r335563 "[X86] Redefine avx512 packed fpclass intrinsics to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction." | Craig Topper | 2018-06-26 | 1 | -43/+0
  These were supposed to have been squashed into a single commit.
  llvm-svn: 335566
* foo | Craig Topper | 2018-06-26 | 1 | -0/+43
  (Accidental commit message; this is the fpclass change reverted and recommitted above.)
  llvm-svn: 335562
* [X86] Remove masking from 512-bit floating max/min intrinsics. Use select instruction instead. | Craig Topper | 2018-06-21 | 1 | -12/+32
  llvm-svn: 335199
* [X86] Lowering sqrt intrinsics to native IR | Tomasz Krupa | 2018-06-15 | 1 | -0/+32
  Summary: Complementary patch to lowering sqrt intrinsics in Clang.
  Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k
  Reviewed By: craig.topper
  Subscribers: tkrupa, mike.dvoretsky, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41599
  llvm-svn: 334849
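For example, a 512-bit packed sqrt without rounding or masking now comes out as the generic intrinsic:

```llvm
; _mm512_sqrt_ps-style operation lowered to target-independent IR.
define <16 x float> @sqrt_ps_512(<16 x float> %x) {
  %r = call <16 x float> @llvm.sqrt.v16f32(<16 x float> %x)
  ret <16 x float> %r
}
declare <16 x float> @llvm.sqrt.v16f32(<16 x float>)
```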
* [X86] Remove masking from avx512vbmi2 concat and shift by immediate intrinsics. Use select in IR instead. | Craig Topper | 2018-06-13 | 1 | -0/+44
  llvm-svn: 334576
* [X86] Remove masking from dbpsadbw intrinsics, use select in IR instead. | Craig Topper | 2018-06-11 | 1 | -0/+10
  llvm-svn: 334384
* [X86] Remove and autoupgrade the expandload and compressstore intrinsics. | Craig Topper | 2018-06-11 | 1 | -0/+32
  We use the target independent intrinsics now.
  llvm-svn: 334381
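The target-independent replacements are llvm.masked.expandload and llvm.masked.compressstore; a sketch of the load side, written with today's opaque pointers:

```llvm
; Expand-load: consecutive elements from %p are scattered into the lanes
; whose mask bit is set; other lanes take the passthru value.
define <8 x double> @expandload_pd_512(ptr %p, i8 %mask, <8 x double> %passthru) {
  %m = bitcast i8 %mask to <8 x i1>
  %v = call <8 x double> @llvm.masked.expandload.v8f64(ptr %p, <8 x i1> %m, <8 x double> %passthru)
  ret <8 x double> %v
}
declare <8 x double> @llvm.masked.expandload.v8f64(ptr, <8 x i1>, <8 x double>)
```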
* [X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead. | Craig Topper | 2018-06-10 | 1 | -21/+57
  llvm-svn: 334358
* [X86] Remove and autoupgrade masked avx512vnni intrinsics using the unmasked intrinsics and select instructions. | Craig Topper | 2018-06-03 | 1 | -0/+68
  llvm-svn: 333857
* [X86] Remove masked vpermi2var/vpermt2var intrinsics and autoupgrade. | Craig Topper | 2018-05-29 | 1 | -0/+64
  We have unmasked intrinsics now and wrap them with a select. This is a net reduction of 36 intrinsics from before the unmasked intrinsics were added.
  llvm-svn: 333388
* [X86] Remove masking from avx512ifma intrinsics. Use a select instead. | Craig Topper | 2018-05-26 | 1 | -0/+30
  This allows us to avoid having separate mask and maskz variants, reducing from 12 intrinsics to 6.
  llvm-svn: 333346
* [X86] Remove 128/256-bit cvtdq2ps, cvtudq2ps, cvtqq2pd, cvtuqq2pd intrinsics. | Craig Topper | 2018-05-21 | 1 | -24/+33
  These can all be implemented with sitofp/uitofp instructions.
  llvm-svn: 332916
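That is, the conversions become plain IR casts:

```llvm
; cvtdq2ps: signed 32-bit integers to floats.
define <4 x float> @cvtdq2ps(<4 x i32> %x) {
  %r = sitofp <4 x i32> %x to <4 x float>
  ret <4 x float> %r
}
; cvtudq2ps: unsigned 32-bit integers to floats.
define <4 x float> @cvtudq2ps(<4 x i32> %x) {
  %r = uitofp <4 x i32> %x to <4 x float>
  ret <4 x float> %r
}
```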
* [X86] Remove masking from vpternlog intrinsics. Use a select in IR instead. | Craig Topper | 2018-05-21 | 1 | -0/+30
  This removes 6 intrinsics, since we no longer need separate mask and maskz intrinsics.
  Differential Revision: https://reviews.llvm.org/D47124
  llvm-svn: 332890
* [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. | Craig Topper | 2018-05-20 | 1 | -6/+34
  Someday maybe we'll use selects for all intrinsics.
  llvm-svn: 332824
* [X86] Remove and autoupgrade avx512.vbroadcast.ss/avx512.vbroadcast.sd intrinsics. | Craig Topper | 2018-05-14 | 1 | -1/+3
  llvm-svn: 332271
* [X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use uitofp+insertelement instead. | Craig Topper | 2018-05-14 | 1 | -0/+5
  llvm-svn: 332206
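The replacement pattern, sketched (function name hypothetical): convert the scalar, then insert it into element 0 of the pass-through vector:

```llvm
; cvtusi2sd: unsigned i32 -> double in the low element, upper element
; taken from %a.
define <2 x double> @cvtusi2sd(<2 x double> %a, i32 %x) {
  %c = uitofp i32 %x to double
  %r = insertelement <2 x double> %a, double %c, i64 0
  ret <2 x double> %r
}
```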
* [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics. | Craig Topper | 2018-05-13 | 1 | -7/+13
  llvm-svn: 332198
* [X86] Remove and autoupgrade the legacy cvtss2sd intrinsic. | Craig Topper | 2018-05-13 | 1 | -0/+5
  llvm-svn: 332187
* [X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what clang has used for a very long time. | Craig Topper | 2018-05-12 | 1 | -0/+11
  llvm-svn: 332186
* [X86] Remove some unused masked conversion intrinsics that can be replaced with an older intrinsic and a select. | Craig Topper | 2018-05-12 | 1 | -3/+33
  This is what clang already uses.
  llvm-svn: 332170
* [X86] Remove and autoupgrade a bunch of FMA intrinsics that are no longer used by clang. | Craig Topper | 2018-05-11 | 1 | -0/+83
  llvm-svn: 332146
* [X86] Remove and autoupgrade the avx512.mask.store.ss intrinsic. | Craig Topper | 2018-05-11 | 1 | -0/+11
  llvm-svn: 332079
* Rename invariant.group.barrier to launder.invariant.group | Piotr Padlewski | 2018-05-03 | 1 | -0/+11
  Summary: This is one of the initial commits of the "RFC: Devirtualization v2" proposal: https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing
  Reviewers: rsmith, amharc, kuhar, sanjoy
  Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45111
  llvm-svn: 331448
* [x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics | Chandler Carruth | 2018-04-26 | 1 | -104/+2
  The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case.
  llvm-svn: 330997
* Fix -Wtautological-compare warning with npos on Windows | Reid Kleckner | 2018-04-23 | 1 | -2/+1
  llvm-svn: 330614
* Lowering x86 adds/addus/subs/subus intrinsics (llvm part) | Alexander Ivchenko | 2018-04-19 | 1 | -2/+104
  This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns, so that the IR emits the same machine instructions as the intrinsics.
  Patch by tkrupa
  Differential Revision: https://reviews.llvm.org/D44785
  llvm-svn: 330322
* [IR] Upgrade comment token in objc retain release marker for asm call | Gerolf Hoflehner | 2018-04-17 | 1 | -0/+13
  Older compilers issued '#' instead of ';'.
  llvm-svn: 330173
* [X86] Remove the pmuldq/pmuludq intrinsics and replace with native IR. | Craig Topper | 2018-04-13 | 1 | -18/+45
  This completes the work started in r329604 and r329605, when we changed clang to no longer use the intrinsics. We lost some InstCombine SimplifyDemandedBits optimizations through this change, as we aren't able to fold 'and', bitcast, shuffle very well.
  llvm-svn: 329990
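A sketch of the native IR for the unsigned case (pmuludq): reinterpret the inputs as i64 lanes, mask each lane down to its low 32 bits, and multiply. The signed pmuldq instead sign-extends each lane with a shl/ashr-by-32 pair:

```llvm
; pmuludq: multiply the low 32 bits of each 64-bit lane, zero-extended.
define <2 x i64> @pmuludq_128(<4 x i32> %a, <4 x i32> %b) {
  %a64 = bitcast <4 x i32> %a to <2 x i64>
  %b64 = bitcast <4 x i32> %b to <2 x i64>
  %alo = and <2 x i64> %a64, <i64 4294967295, i64 4294967295>
  %blo = and <2 x i64> %b64, <i64 4294967295, i64 4294967295>
  %r   = mul <2 x i64> %alo, %blo
  ret <2 x i64> %r
}
```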
* [X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select. | Craig Topper | 2018-04-11 | 1 | -0/+20
  The 128/256-bit versions were no longer used by clang; it uses the legacy SSE/AVX2 versions and a select. The 512-bit version was changed to the same for consistency.
  llvm-svn: 329774
* [X86] Merge some of the autoupgrade handling for masked intrinsics that just need to upgrade to an unmasked version plus a select. NFCI | Craig Topper | 2018-04-09 | 1 | -170/+149
  These were previously grouped in small sets of similar intrinsics. But all of these intrinsics have the same number of arguments in the same order, so we can move them all into one larger group for handling.
  llvm-svn: 329549
* [IR] Upgrade comment token in objc retain release marker | Gerolf Hoflehner | 2018-04-05 | 1 | -0/+24
  Older compilers issued '#' instead of ';'.
  llvm-svn: 329248
* [X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and autoupgrade 128/256/512-bit masked pmulhrsw/pmulhw/pmulhuw intrinsics. | Craig Topper | 2018-02-20 | 1 | -0/+43
  The 128- and 256-bit versions were already not used by clang. This adds an equivalent unmasked 512-bit version, then autoupgrades all sizes to use unmasked intrinsics plus select.
  llvm-svn: 325559
* [X86] Reverse the operand order of the autoupgrade of the kunpack builtins. | Craig Topper | 2018-02-12 | 1 | -1/+2
  The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior. Fixes PR36360.
  llvm-svn: 324953
* [X86] Change signatures of avx512 packed fp compare intrinsics to return a vXi1 mask type to be closer to an fcmp. | Craig Topper | 2018-02-10 | 1 | -0/+59
  Summary: This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit 'and' in the IR.
  This makes the intrinsic look much more similar to an fcmp instruction, which we wish we could use for these but can't. We already use icmp instructions for integer compares.
  Previously the lowering step of isel would turn the intrinsic into an X86-specific ISD node and emit the masking nodes as well as some bitcasts. This meant DAG combines couldn't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it.
  This should make any issues with gpr<->mask sequences the same between integer and fp, meaning we only have to fix them once.
  Reviewers: spatel, delena, RKSimon, zvi
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43137
  llvm-svn: 324827
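A sketch of the post-change shape, using the later de-'.mask.'-ed intrinsic name from the rename entry above; the exact signature shown here (compare-predicate immediate plus sae immediate) is an assumption:

```llvm
; cc=2 is LE; sae=4 is "current rounding direction". The input mask is
; now an explicit 'and', and the scalar cast an explicit bitcast.
define i16 @cmp_ps_512_masked(<16 x float> %a, <16 x float> %b, i16 %mask) {
  %c  = call <16 x i1> @llvm.x86.avx512.cmp.ps.512(<16 x float> %a, <16 x float> %b, i32 2, i32 4)
  %m  = bitcast i16 %mask to <16 x i1>
  %am = and <16 x i1> %c, %m
  %r  = bitcast <16 x i1> %am to i16
  ret i16 %r
}
declare <16 x i1> @llvm.x86.avx512.cmp.ps.512(<16 x float>, <16 x float>, i32, i32)
```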
* [X86] Remove kortest intrinsics and replace with native IR. | Craig Topper | 2018-02-08 | 1 | -0/+15
  llvm-svn: 324646
* [X86] Remove and autoupgrade kand/kandn/kor/kxor/kxnor/knot intrinsics. | Craig Topper | 2018-02-03 | 1 | -0/+37
  Clang already stopped using these a couple of months ago. The test cases aren't great, as there is nothing forcing the operations to stay in k-registers, so some of them moved back to scalar ops due to the bitcasts being moved around.
  llvm-svn: 324177
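The upgraded IR is just bitcasts to vXi1 plus ordinary bitwise instructions, e.g. for kand on 16-bit masks:

```llvm
; kand: bitcast each i16 mask to <16 x i1>, 'and' them, bitcast back.
; kor/kxor/knot follow the same shape with or/xor.
define i16 @kand(i16 %a, i16 %b) {
  %va = bitcast i16 %a to <16 x i1>
  %vb = bitcast i16 %b to <16 x i1>
  %vr = and <16 x i1> %va, %vb
  %r  = bitcast <16 x i1> %vr to i16
  ret i16 %r
}
```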
* Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) | Daniel Neilson | 2018-01-19 | 1 | -3/+67
  Summary: This is a resurrection of work first proposed and discussed in Aug 2015 (http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html) and initially landed (but then backed out) in Nov 2015 (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html).
  The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments.
  In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal.
  For example, code which used to read:
      call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
  will now read:
      call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
  Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns, so some manual checking and updating will be required:
      s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
      s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g
  The remaining changes in the series will:
  Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments.
  Step 3) Update Clang to use the new IRBuilder API.
  Step 4) Update Polly to use the new IRBuilder API.
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead.
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
  Reviewers: pete, hfinkel, lhames, reames, bollu
  Reviewed By: reames
  Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41675
  llvm-svn: 322965
* [X86] Autoupgrade kunpck intrinsics using vector operations instead of scalar operations | Craig Topper | 2018-01-14 | 1 | -5/+17
  Summary: This patch changes the kunpck intrinsic autoupgrade to use vXi1 shufflevector operations to perform vector extracts and concats. This more closely matches the definition of the kunpck instructions. Currently we rely on a DAG combine to turn the scalar shift/and/or code into a concat vectors operation. By doing it in the IR we get this for free.
  Reviewers: spatel, RKSimon, zvi, jina.nahias
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D42018
  llvm-svn: 322462
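A sketch of the shufflevector form for kunpckbw, with the operand order as corrected in the r324953 entry above (the second operand supplies the low half):

```llvm
define i16 @kunpckbw(i16 %a, i16 %b) {
  %va = bitcast i16 %a to <16 x i1>
  %vb = bitcast i16 %b to <16 x i1>
  ; extract the low 8 bits of each mask
  %la = shufflevector <16 x i1> %va, <16 x i1> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %lb = shufflevector <16 x i1> %vb, <16 x i1> %vb, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  ; concat: %b's low half lands in the low bits of the result
  %cat = shufflevector <8 x i1> %lb, <8 x i1> %la, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  %r   = bitcast <16 x i1> %cat to i16
  ret i16 %r
}
```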
* [X86] Remove llvm.x86.avx512.cvt*2mask.* intrinsics and autoupgrade to (icmp slt X, 0) | Craig Topper | 2018-01-09 | 1 | -3/+18
  I had to drop fast-isel-abort from a test because we can't fast-isel some of the mask stuff. When we used intrinsics, we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error; with native IR that doesn't happen the same way.
  llvm-svn: 322050
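The replacement pattern, e.g. for the 512-bit dword variant:

```llvm
; cvtd2mask becomes a sign-bit test: compare against zero, then bitcast
; the <16 x i1> result to the i16 mask type.
define i16 @cvtd2mask_512(<16 x i32> %x) {
  %c = icmp slt <16 x i32> %x, zeroinitializer
  %r = bitcast <16 x i1> %c to i16
  ret i16 %r
}
```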