path: root/llvm/test/CodeGen/X86/fma_patterns_wide.ll
Commit message | Author | Age | Files | Lines
* Revert "[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns_wide.ll" | Cameron McInally | 2019-06-13 | 1 | -219/+0
    This reverts commit f1b8c6ac4f9d31899a2bc128f8a37b5a1c3e1f77.
    llvm-svn: 363315
* [DAGCombine] GetNegatedExpression - constant float vector support (PR42105) | Simon Pilgrim | 2019-06-11 | 1 | -28/+28
    Add support for negation of constant build vectors.
    Differential Revision: https://reviews.llvm.org/D62963
    llvm-svn: 363040
* [NFC][CodeGen] Add unary fneg tests to X86/fma_patterns_wide.ll | Cameron McInally | 2019-06-06 | 1 | -0/+219
    llvm-svn: 362720
* [X86] Force floating point values in constant pool decoding to print in scientific notation so they can't be confused with integers. | Craig Topper | 2018-10-29 | 1 | -36/+36
    When the floating point constants are whole numbers they have no decimal point so look like integers, but mean something very different in something like an 'and' instruction.
    Ideally we would just print a decimal point and a 0, but I couldn't see how to make APFloat::toString do that.
    llvm-svn: 345488
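The ambiguity this commit describes can be illustrated with ordinary number formatting. The sketch below uses Python's `%g` and `%e` conversions purely as stand-ins; it does not reproduce the output of `APFloat::toString`, and the helper names `plain` and `scientific` are made up for illustration.

```python
def plain(value: float) -> str:
    # Shortest general formatting: whole numbers lose their decimal point.
    return "%g" % value


def scientific(value: float) -> str:
    # Scientific notation always carries an exponent marker.
    return "%e" % value


whole = 4.0
print(plain(whole))       # indistinguishable from the integer 4 in a comment
print(scientific(whole))  # unmistakably a floating point value
```

With the exponent always present, a constant-pool comment next to an integer-domain instruction such as an 'and' cannot be misread as an integer mask.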
* [X86] Standardize floating point assembly comments | Simon Pilgrim | 2018-10-02 | 1 | -30/+30
    Consistently try to use APFloat::toString for floating point constant comments to get rid of differences between Constant / ConstantDataSequential values - it should help stop some of the linux-windows buildbot failures matching NaN/INF etc. as well.
    Differential Revision: https://reviews.llvm.org/D52702
    llvm-svn: 343562
* [X86] Regenerate fma comments. | Simon Pilgrim | 2018-09-29 | 1 | -105/+105
    llvm-svn: 343376
* [CodeGen] Unify MBB reference format in both MIR and debug output | Francis Visoiu Mistrih | 2017-12-04 | 1 | -141/+141
    As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'.
    The MIR printer prints the IR name of a MBB only for block definitions.
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
    * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
    * grep -nr 'BB#' and fix
    Differential Revision: https://reviews.llvm.org/D40422
    llvm-svn: 319665
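The third `sed` command in this commit message, which rewrites `BB#<n>` to `%bb.<n>`, can be sketched as a Python regular-expression substitution. This is only an illustration of the pattern: `rewrite_mbb_refs` is a hypothetical helper name, and the sample text is invented rather than taken from the test file.

```python
import re

# Same pattern as the sed expression 's/BB#([0-9]+)/%bb.\1/g'.
MBB_OLD_STYLE = re.compile(r"BB#([0-9]+)")


def rewrite_mbb_refs(text: str) -> str:
    # Replace every old-style reference with the unified '%bb.N' form.
    return MBB_OLD_STYLE.sub(r"%bb.\1", text)


# Invented sample resembling an assembly comment and a branch label.
sample = "# BB#0:\n    jmp .LBB0_1\n# BB#12:"
print(rewrite_mbb_refs(sample))
```

Labels such as `.LBB0_1` contain no `BB#` sequence, so the pattern leaves them alone. This is presumably why the commit ends with a manual `grep -nr 'BB#'` pass, to catch whatever the regular expressions missed.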
* [X86] SET0 to use XMM registers where possible PR26018 PR32862 | Dinar Temirbulatov | 2017-08-03 | 1 | -2/+2
    Differential Revision: https://reviews.llvm.org/D35965
    llvm-svn: 309926
* [X86] SET0 to use XMM registers where possible PR26018 PR32862 | Dinar Temirbulatov | 2017-07-27 | 1 | -4/+4
    Differential Revision: https://reviews.llvm.org/D35839
    llvm-svn: 309298
* [X86][FMA] Regenerate test with broadcast comments. | Simon Pilgrim | 2017-07-26 | 1 | -6/+6
    llvm-svn: 309093
* X86: Add checks for fma_patterns[_wide].ll with -enable-no-infs-fp-math | Nicolai Haehnle | 2016-12-08 | 1 | -342/+594
    This re-adds checks for the patterns that were disabled with r288506.
    Reviewers: spatel, delena, craig.topper
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D27346
    llvm-svn: 289049
* [DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default | Nicolai Haehnle | 2016-12-02 | 1 | -70/+174
    Summary:
    When X = 0 and Y = inf, the original code produces inf, but the transformed code produces nan. So this transform (and its relatives) should only be used when the no-infs-fp-math flag is explicitly enabled.
    Also disable the transform using fmad (intermediate rounding) when unsafe-math is not enabled, since it can reduce the precision of the result; consider this example with binary floating point numbers with two bits of mantissa:
      x = 1.01
      y = 111
      x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step)
      x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero)
    The example relies on rounding towards zero at least in the second step.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578
    Reviewers: RKSimon, tstellarAMD, spatel, arsenm
    Subscribers: wdng, llvm-commits
    Differential Revision: https://reviews.llvm.org/D26602
    llvm-svn: 288506
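Both failure modes described in this commit message can be reproduced outside the compiler. The following is a minimal Python sketch, not LLVM code. The helper names `original`, `transformed`, and `truncate` are made up for illustration. The toy format is modelled with exact fractions truncated to three significant bits (the leading bit plus the two mantissa bits of the example), and no fused operation is modelled.

```python
import math
from fractions import Fraction


def original(x, y):
    # Shape of the source expression: (fmul (fadd X, 1), Y)
    return (x + 1.0) * y


def transformed(x, y):
    # Shape of the folded expression: X * Y + Y
    return x * y + y


def truncate(value, sig_bits=3):
    """Round a positive Fraction toward zero to sig_bits significant bits."""
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    exponent = 0
    while Fraction(2) ** (exponent + 1) <= value:
        exponent += 1
    while Fraction(2) ** exponent > value:
        exponent -= 1
    quantum = Fraction(2) ** (exponent - sig_bits + 1)
    return (value // quantum) * quantum


# Case 1: X = 0 and Y = inf. The fold turns inf into nan.
inf_original = original(0.0, math.inf)
inf_transformed = transformed(0.0, math.inf)

# Case 2: the precision example, rounding toward zero after every step.
x = Fraction(5, 4)  # 1.01 in binary
y = Fraction(7)     # 111 in binary
exact_path = truncate(x * truncate(y + 1))  # x * (y + 1)
split_path = truncate(truncate(x * y) + x)  # x * y + x

print(inf_original, inf_transformed)
print(exact_path, split_path)
```

Under these assumptions the first pair differs because `0 * inf` is already nan before the addition, and the second pair differs because `x * y` loses its low bits before `x` is added.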
* [X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. Just create a ConstantFPSDNode and let that be lowered. | Craig Topper | 2016-08-29 | 1 | -3/+3
    This allows broadcast loads to be used when available.
    llvm-svn: 279958
* [AVX-512] Fix duplicate column in AVX512 execution dependency table that was preventing VMOVDQU32/VMOVDQA32 from being recognized. | Craig Topper | 2016-08-01 | 1 | -2/+2
    Fix a bug in the code that stops execution dependency fix from turning operations on 32-bit integer element types into operations on 64-bit integer element types.
    llvm-svn: 277327
* [AVX512] Add load folding support for the unmasked forms of the FMA instructions. | Craig Topper | 2016-07-25 | 1 | -4/+2
    llvm-svn: 276615
* [AVX512] Implement commuting support for EVEX encoded FMA3 instructions. | Craig Topper | 2016-07-23 | 1 | -12/+7
    llvm-svn: 276521
* [AVX512] Add initial support for the Execution Domain fixing pass to change some EVEX instructions. | Craig Topper | 2016-07-22 | 1 | -2/+2
    llvm-svn: 276393
* [X86][FMA] Optimize FNEG(FMUL) Patterns | Simon Pilgrim | 2015-12-02 | 1 | -1/+82
    On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0)
    Fix for PR24366
    Differential Revision: http://reviews.llvm.org/D14909
    llvm-svn: 254495
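The identity behind this fold, that a negated multiply equals `-(X*Y) - 0`, can be checked numerically. The sketch below is an assumption-laden model rather than the X86 instruction. `fnmsub` is a hypothetical, unfused Python helper (a real FNMSUB rounds only once), and `same_float` is an illustrative comparison that also distinguishes +0.0 from -0.0.

```python
import math


def fnmsub(a, b, c):
    # Unfused model of a negated multiply-subtract: -(a * b) - c
    return -(a * b) - c


def same_float(p, q):
    # Equal values with equal signs, so +0.0 and -0.0 count as different.
    if math.isnan(p) or math.isnan(q):
        return math.isnan(p) and math.isnan(q)
    return p == q and math.copysign(1.0, p) == math.copysign(1.0, q)


samples = [(1.5, 2.0), (-3.0, 0.25), (0.0, 5.0), (-0.0, 5.0), (math.inf, 2.0)]
all_match = all(same_float(fnmsub(x, y, 0.0), -(x * y)) for x, y in samples)
print(all_match)
```

Subtracting +0.0 leaves every value unchanged, including both signed zeros, which is why the zero operand can stand in for an explicit sign-flip constant in this model.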
* [X86][FMA4] Prefer FMA4 to FMA | Simon Pilgrim | 2015-11-30 | 1 | -1/+1
    We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUs bdver2/bdver3/bdver4). This patch flips this so FMA4 is preferred; this is for several reasons:
    1 - FMA4 is non-destructive, reducing the need for mov instructions.
    2 - It's more straightforward to commute and fold inputs (although the recent work on FMA has reduced this difference).
    3 - All supported targets have FMA4 performance equal to or better than FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions.
    It looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs.
    Differential Revision: http://reviews.llvm.org/D14997
    llvm-svn: 254339
* [X86][FMA] Added 512-bit tests to match 128/256-bit tests coverage | Simon Pilgrim | 2015-11-28 | 1 | -0/+487
    As discussed on D14909
    llvm-svn: 254233
* [X86][FMA] More thorough FMA tests | Simon Pilgrim | 2015-11-28 | 1 | -37/+152
    Added FMADD/FMSUB/FNMADD/FNMSUB tests for all types
    Added load folding tests for 512-bit vectors
    NOTE: Many of the AVX512 FMA instructions don't yet commute/fold correctly
    As discussed on D14909
    llvm-svn: 254232
* [X86][FMA] Begun adding AVX512 FMA tests | Simon Pilgrim | 2015-11-26 | 1 | -63/+94
    As discussed on D14909
    llvm-svn: 254180
* Make utils/update_llc_test_checks.py note that the assertions are autogenerated. | James Y Knight | 2015-11-23 | 1 | -0/+1
    Also update existing test cases which appear to be generated by it and weren't modified (other than addition of the header) by rerunning it.
    llvm-svn: 253917
* [X86][FMA] Refreshed fma tests | Simon Pilgrim | 2015-09-12 | 1 | -47/+69
    llvm-svn: 247508
* Start using CHECK-LABEL in some tests. | Stephen Lin | 2013-07-12 | 1 | -11/+11
    llvm-svn: 186163
* AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all | Stephen Lin | 2013-07-09 | 1 | -0/+84
    in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in order to resolve the following issues with fmuladd (i.e. optional FMA) intrinsics:
    1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd intrinsics even if the subtarget does not support FMA instructions, leading to laughably bad code generation in some situations.
    2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128, resulting in a call to a software fp128 FMA implementation.
    3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize, etc. to types that support hardware FMAs.
    The function has also been slightly renamed for consistency and to force a merge/build conflict for any out-of-tree target implementing it. To resolve, see comments and fixed in-tree examples.
    llvm-svn: 185956