summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86/avx512-vbroadcasti128.ll
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Pass v32i16/v64i8 in zmm registers on KNL target.Craig Topper2019-08-301-12/+16
| | | | | | | | | | | | | | | gcc and icc pass these types in zmm registers in zmm registers. This patch implements a quick hack to override the register type before calling convention handling to one that is legal. Longer term we might want to do something similar to 256-bit integer registers on AVX1 where we just split all the operations. Fixes PR42957 Differential Revision: https://reviews.llvm.org/D66708 llvm-svn: 370495
* [X86] Add custom execution domain fixing for 128/256-bit integer logic ↵Craig Topper2018-07-151-20/+6
| | | | | | | | | | | | operations with AVX512F, but not AVX512DQ. AVX512F only has integer domain logic instructions. AVX512DQ added FP domain logic instructions. Execution domain fixing runs before EVEX->VEX. So if we have AVX512F and not AVX512DQ we fail to do execution domain switching of the logic operations. This leads to mismatches in execution domain and more test differences. This patch adds custom domain fixing that switches EVEX integer logic operations to VEX fp logic operations if XMM16-31 are not used. llvm-svn: 337137
* [DAG, X86] Revert r327197 "Revert r327170, r327171, r327172"Nirav Dave2018-03-191-6/+3
| | | | | | | Reland ISel cycle checking improvements after simplifying node id invariant traversal and correcting typo. llvm-svn: 327898
* Revert "[DAG, X86] Revert r327197 "Revert r327170, r327171, r327172""Nirav Dave2018-03-171-3/+6
| | | | | | as it times out building test-suite on PPC. llvm-svn: 327778
* [DAG, X86] Revert r327197 "Revert r327170, r327171, r327172"Nirav Dave2018-03-171-6/+3
| | | | | | | Reland ISel cycle checking improvements after simplifying and reducing node id invariant traversal. llvm-svn: 327777
* Revert: r327172 "Correct load-op-store cycle detection analysis"Nirav Dave2018-03-101-3/+6
| | | | | | | | | | r327171 "Improve Dependency analysis when doing multi-node Instruction Selection" r328170 "[DAG] Enforce stricter NodeId invariant during Instruction selection" Reverting patch as NodeId invariant change is causing pathological increases in compile time on PPC llvm-svn: 327197
* Improve Dependency analysis when doing multi-node Instruction SelectionNirav Dave2018-03-091-6/+3
| | | | | | | | | | | | | | | | | | | | Relanding after fixing NodeId Invariant. Cleanup cycle/validity checks in ISel (IsLegalToFold, HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full search for cycles / dependencies pruning the search when topological property of NodeId allows. As part of this propogate the NodeId-based cutoffs to narrow hasPreprocessorHelper searches. Reviewers: craig.topper, bogner Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D41293 llvm-svn: 327171
* [DAG, X86] Revert r324797, r324491, and r324359.Chandler Carruth2018-02-171-3/+6
| | | | | | | | | | | | Sadly, r324359 caused at least PR36312. There is a patch out for review but it seems to be taking a bit and we've already had these crashers in tree for too long. We're hitting this PR in real code now and are blocked on shipping new compilers as a consequence so I'm reverting us back to green. Sorry for the churn due to the stacked changes that I had to revert. =/ llvm-svn: 325420
* [DAG, X86] Improve Dependency analysis when doing multi-nodeNirav Dave2018-02-061-6/+3
| | | | | | | | | | | | | | | | | | | | Instruction Selection Cleanup cycle/validity checks in ISel (IsLegalToFold, HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full search for cycles / dependencies pruning the search when topological property of NodeId allows. As part of this propogate the NodeId-based cutoffs to narrow hasPreprocessorHelper searches. Reviewers: craig.topper, bogner Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D41293 llvm-svn: 324359
* [CodeGen] Unify MBB reference format in both MIR and debug outputFrancis Visoiu Mistrih2017-12-041-19/+19
| | | | | | | | | | | | | | | | As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665
* [AVX-512] Don't change which instructions we use for unmasked subvector ↵Craig Topper2017-08-211-34/+10
| | | | | | | | | | broadcasts when AVX512DQ is enabled. There's no functional difference between the AVX512DQ instructions if we're not masking. This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently. llvm-svn: 311308
* [AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the ↵Craig Topper2017-08-211-34/+10
| | | | | | EVEX->VEX table. llvm-svn: 311307
* [X86] SET0 to use XMM registers where possible PR26018 PR32862Dinar Temirbulatov2017-08-031-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D35965 llvm-svn: 309926
* [AVX-512] Add unmasked subvector inserts and extract to the execution domain ↵Craig Topper2017-07-311-6/+6
| | | | | | tables. llvm-svn: 309632
* [X86][AVX512] Add missing entries to EVEX2VEX tablesAyman Musa2017-03-071-12/+12
| | | | | | | | evex2vex pass defines 2 tables which maps EVEX instructions to their VEX identical when possible. Adding all missing entries. Differential Revision: https://reviews.llvm.org/D30501 llvm-svn: 297126
* [AVX-512] Teach EVEX to VEX conversion pass to handle VINSERT and VEXTRACT ↵Craig Topper2017-01-031-3/+3
| | | | | | instructions. llvm-svn: 290869
* This is a large patch for X86 AVX-512 of an optimization for reducing code ↵Gadi Haber2016-12-281-7/+7
| | | | | | | | | | | | size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible. There are cases of AVX-512 instructions that have two possible encodings. This is the case with instructions that use vector registers with low indexes of 0 - 15 and do not use the zmm registers or the mask k registers. The EVEX encoding prefix requires 4 bytes whereas the VEX prefix can take only up to 3 bytes. Consequently, using the VEX encoding for these instructions results in a code size reduction of ~2 bytes even though it is compiled with the AVX-512 features enabled. Reviewers: Craig Topper, Zvi Rackoover, Elena Demikhovsky Differential Revision: https://reviews.llvm.org/D27901 llvm-svn: 290663
* [AVX-512] Teach isel lowering that a subvector broadcast being inserted into ↵Craig Topper2016-10-191-56/+18
| | | | | | | | | | | | | | | | | | | | | both halves of a 512-bit vector can be combined into a larger subvector broadcast. Summary: This allows us to create broadcasts of 128-bit vector loads into 512-bit vectors. New patterns added to support 8-bit and 16-bit vector types and v2f64/v2i64->v8f64/v8i64 without DQI instructions. There also fallback patterns when the load can't be folded. These patterns are a little complex as we first need to insert the lower 128-bits into the second 128-bits using a zmm subvector insert instruction. We need to use a zmm insert in case VLX isn't available. Then use another zmm sub vector insert to take those 256-bits and insert them into the upper bits. Since we used a zmm insert to create the 256-bits we also need to do a extract_subreg to get just the lower 256-bits to pass to the second insert. The outer insert for the fallback patterns should have its type correct because eventually we should also supported masked operations here too. So we need a DQI and a NoDQI version of the v16f32/v16i32 patterns. Reviewers: RKSimon, delena, igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25651 llvm-svn: 284567
* [AVX-512] Add shuffle comments for vbroadcast instructions.Craig Topper2016-10-151-28/+28
| | | | llvm-svn: 284305
* [X86][AVX] Don't use SubVectorBroadcast if there are additional users of the ↵Simon Pilgrim2016-08-221-0/+30
| | | | | | | | chain (PR29088) We could improve on this by making X86SubVBroadcast a full memory intrinsic similar to X86vzload llvm-svn: 279441
* [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 ↵Simon Pilgrim2016-07-221-92/+36
| | | | | | | | | | | | | | | | (reapplied) As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be be submitted shortly). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276416
* Revert "[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128"Benjamin Kramer2016-07-221-36/+92
| | | | | | | | It caused PR28657. This reverts commit r276281. llvm-svn: 276405
* [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128Simon Pilgrim2016-07-211-92/+36
| | | | | | | | | | | | As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276281
* [X86][AVX512] Added AVX512 subvector broadcast testsSimon Pilgrim2016-07-191-0/+326
llvm-svn: 275994
OpenPOWER on IntegriCloud