path: root/llvm/test/CodeGen/X86/fma_patterns.ll
Commit message | Author | Age | Files | Lines
* [X86] Force floating point values in constant pool decoding to print in scientific notation so they can't be confused with integers. | Craig Topper | 2018-10-29 | 1 | -30/+30
    When the floating point constants are whole numbers they have no decimal point so look like integers, but mean something very different in something like an 'and' instruction.
    Ideally we would just print a decimal point and a 0, but I couldn't see how to make APFloat::toString do that.
    llvm-svn: 345488
* [DAGCombiner] allow undef elts in vector fma matching | Sanjay Patel | 2018-10-15 | 1 | -66/+126
    llvm-svn: 344528
* [x86] add tests for fma with undef elts; NFC | Sanjay Patel | 2018-10-15 | 1 | -0/+98
    llvm-svn: 344527
* [DAGCombiner] allow undef elts in vector fma matching | Sanjay Patel | 2018-10-15 | 1 | -30/+60
    llvm-svn: 344525
* [x86] add tests for fma with undef elts; NFC | Sanjay Patel | 2018-10-15 | 1 | -0/+46
    llvm-svn: 344523
* [X86] Standardize floating point assembly comments | Simon Pilgrim | 2018-10-02 | 1 | -17/+17
    Consistently try to use APFloat::toString for floating point constant comments to get rid of differences between Constant / ConstantDataSequential values - it should help stop some of the linux-windows buildbot failures matching NaN/INF etc. as well.
    Differential Revision: https://reviews.llvm.org/D52702
    llvm-svn: 343562
* [X86] Regenerate fma comments. | Simon Pilgrim | 2018-09-29 | 1 | -128/+128
    llvm-svn: 343376
* [CodeGen] Unify MBB reference format in both MIR and debug output | Francis Visoiu Mistrih | 2017-12-04 | 1 | -216/+216
    As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'.
    The MIR printer prints the IR name of a MBB only for block definitions.
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
    * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
    * grep -nr 'BB#' and fix
    Differential Revision: https://reviews.llvm.org/D40422
    llvm-svn: 319665
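The third `sed` command in the message above is the one that touches test files such as this one. As a rough illustration only, the same substitution can be sketched in Python; the function name `rewrite_mbb_refs` and the sample strings are hypothetical and not part of the commit.

```python
import re

# Same pattern as the sed expression 's/BB#([0-9]+)/%bb.\1/g' quoted in the
# commit message: a literal "BB#" followed by the block number.
MBB_REF = re.compile(r"BB#([0-9]+)")


def rewrite_mbb_refs(text: str) -> str:
    """Rewrite old-style 'BB#N' references to the unified '%bb.N' form."""
    return MBB_REF.sub(r"%bb.\1", text)


if __name__ == "__main__":
    # Hypothetical CHECK line in the style of an X86 CodeGen test.
    print(rewrite_mbb_refs("; CHECK: # BB#0:"))
```

Text that contains no `BB#` followed by digits passes through unchanged, which is why the commit follows the bulk rewrite with a manual `grep -nr 'BB#'` pass for the remaining cases.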
* [X86] SET0 to use XMM registers where possible PR26018 PR32862 | Dinar Temirbulatov | 2017-08-03 | 1 | -1/+1
    Differential Revision: https://reviews.llvm.org/D35965
    llvm-svn: 309926
* [X86] SET0 to use XMM registers where possible PR26018 PR32862 | Dinar Temirbulatov | 2017-07-27 | 1 | -2/+2
    Differential Revision: https://reviews.llvm.org/D35839
    llvm-svn: 309298
* [X86][FMA] Regenerate test with broadcast comments. | Simon Pilgrim | 2017-07-26 | 1 | -7/+7
    llvm-svn: 309093
* [AVX-512] Fix the execution domain for scalar FMA instructions. | Craig Topper | 2017-02-25 | 1 | -1/+1
    llvm-svn: 296271
* X86: Add checks for fma_patterns[_wide].ll with -enable-no-infs-fp-math | Nicolai Haehnle | 2016-12-08 | 1 | -359/+650
    This re-adds checks for the patterns that were disabled with r288506.
    Reviewers: spatel, delena, craig.topper
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D27346
    llvm-svn: 289049
* [DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default | Nicolai Haehnle | 2016-12-02 | 1 | -54/+138
    Summary:
    When X = 0 and Y = inf, the original code produces inf, but the transformed code produces nan. So this transform (and its relatives) should only be used when the no-infs-fp-math flag is explicitly enabled.
    Also disable the transform using fmad (intermediate rounding) when unsafe-math is not enabled, since it can reduce the precision of the result; consider this example with binary floating point numbers with two bits of mantissa:
      x = 1.01
      y = 111
      x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step)
      x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero)
    The example relies on rounding towards zero at least in the second step.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578
    Reviewers: RKSimon, tstellarAMD, spatel, arsenm
    Subscribers: wdng, llvm-commits
    Differential Revision: https://reviews.llvm.org/D26602
    llvm-svn: 288506
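Both hazards named in the message above can be reproduced numerically. The sketch below is an illustration, not LLVM code: `original`, `folded`, `truncate`, `exact_form` and `mad_form` are hypothetical helper names, Python doubles stand in for the hardware floats, and the toy format keeps three significant bits (one leading bit plus the two mantissa bits of the commit's example) with rounding towards zero.

```python
import math


def original(x: float, y: float) -> float:
    """(fmul (fadd X, 1), Y) evaluated as written."""
    return (x + 1.0) * y


def folded(x: float, y: float) -> float:
    """The folded form X * Y + Y (unfused here; the inf/nan result is the same)."""
    return x * y + y


def truncate(value: float, sig_bits: int = 3) -> float:
    """Round towards zero to `sig_bits` significant binary digits (toy format)."""
    if value == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    kept = math.trunc(mantissa * (1 << sig_bits))
    return math.ldexp(float(kept), exponent - sig_bits)


def exact_form(x: float, y: float) -> float:
    """x * (y + 1), rounding after every step in the toy format."""
    return truncate(x * truncate(y + 1.0))


def mad_form(x: float, y: float) -> float:
    """x * y + x with intermediate rounding, as in the commit's fmad example."""
    return truncate(truncate(x * y) + x)


if __name__ == "__main__":
    inf = float("inf")
    print(original(0.0, inf))     # inf
    print(folded(0.0, inf))       # nan, because 0 * inf is nan
    print(exact_form(1.25, 7.0))  # 10.0, binary 1010
    print(mad_form(1.25, 7.0))    # 8.0, binary 1000
```

Here 1.25 and 7.0 are the decimal values of the binary operands 1.01 and 111 from the message.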
* [X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. | Craig Topper | 2016-08-29 | 1 | -5/+17
    Just create a ConstantFPSDNode and let that be lowered. This allows broadcast loads to be used when available.
    llvm-svn: 279958
* [AVX-512] Fix duplicate column in AVX512 execution dependency table that was preventing VMOVDQU32/VMOVDQA32 from being recognized. | Craig Topper | 2016-08-01 | 1 | -2/+2
    Fix a bug in the code that stops execution dependency fix from turning operations on 32-bit integer element types into operations on 64-bit integer element types.
    llvm-svn: 277327
* [AVX512] Add load folding support for the unmasked forms of the FMA instructions. | Craig Topper | 2016-07-25 | 1 | -4/+2
    llvm-svn: 276615
* [AVX512] Implement commuting support for EVEX encoded FMA3 instructions. | Craig Topper | 2016-07-23 | 1 | -41/+23
    llvm-svn: 276521
* [AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported. | Craig Topper | 2016-07-18 | 1 | -18/+18
    Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this is a good first step.
    llvm-svn: 275763
* [AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used. | Craig Topper | 2016-05-08 | 1 | -2/+3
    llvm-svn: 268884
* X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes. | Vyacheslav Klochkov | 2015-12-09 | 1 | -1/+1
    Reviewer: Simon Pilgrim.
    Differential Revision: http://reviews.llvm.org/D15317
    llvm-svn: 255080
* [X86][FMA4] Explicitly set the domain of FMA4 float/double scalar instructions | Simon Pilgrim | 2015-12-05 | 1 | -1/+1
    Both were defaulting to the float domain - now matches the packed instructions.
    llvm-svn: 254841
* [X86][FMA] Optimize FNEG(FMUL) Patterns | Simon Pilgrim | 2015-12-02 | 1 | -1/+84
    On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0)
    Fix for PR24366
    Differential Revision: http://reviews.llvm.org/D14909
    llvm-svn: 254495
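The identity behind the FNMSUB rewrite in the message above is plain arithmetic: -(X*Y) - 0 has the same value and the same sign as -(X*Y), including when the product is a zero. The sketch below only checks that identity with Python doubles; `fnmsub_model` and `fneg_fmul` are hypothetical names and the code is not a model of the X86 instruction's single-rounding behaviour.

```python
import math


def fnmsub_model(x: float, y: float, z: float) -> float:
    """Arithmetic shape of FNMSUB, -(x * y) - z, computed unfused."""
    return -(x * y) - z


def fneg_fmul(x: float, y: float) -> float:
    """The pattern being replaced: negate the result of a multiply."""
    return -(x * y)


def same_value_and_sign(a: float, b: float) -> bool:
    """Equality that also distinguishes +0.0 from -0.0."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


if __name__ == "__main__":
    for x, y in [(1.5, 2.0), (-3.0, 0.25), (0.0, 1.0), (-0.0, 1.0)]:
        print(x, y, same_value_and_sign(fnmsub_model(x, y, 0.0), fneg_fmul(x, y)))
```

The zero cases are the ones worth checking, since subtracting +0.0 could in principle disturb the sign of a zero result; under round-to-nearest it does not.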
* [X86][FMA4] Prefer FMA4 to FMA | Simon Pilgrim | 2015-11-30 | 1 | -1/+1
    We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUs bdver2/bdver3/bdver4).
    This patch flips this so FMA4 is preferred; this is for several reasons:
    1 - FMA4 is non-destructive, reducing the need for mov instructions.
    2 - It's more straightforward to commute and fold inputs (although the recent work on FMA has reduced this difference).
    3 - All supported targets have FMA4 performance equal to or better than FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions.
    It looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs.
    Differential Revision: http://reviews.llvm.org/D14997
    llvm-svn: 254339
* [X86][FMA] More thorough FMA tests | Simon Pilgrim | 2015-11-28 | 1 | -149/+358
    Added FMADD/FMSUB/FNMADD/FNMSUB tests for all types
    Added load folding tests for 512-bit vectors
    NOTE: Many of the AVX512 FMA instructions don't yet commute/fold correctly
    As discussed on D14909
    llvm-svn: 254232
* [X86][FMA] Begun adding AVX512 FMA tests | Simon Pilgrim | 2015-11-26 | 1 | -379/+612
    As discussed on D14909
    llvm-svn: 254180
* [X86][FMA] Optimize FNEG(FMA) Patterns | Simon Pilgrim | 2015-11-24 | 1 | -0/+68
    X86 needs to use its own FMA opcodes, preventing the standard FNEG(FMA) pattern table recognition method used by other platforms. This patch adds support for lowering FNEG(FMA(X,Y,Z)) into a single suitably negated FMA instruction.
    Fix for PR24364
    Differential Revision: http://reviews.llvm.org/D14906
    llvm-svn: 254016
* [X86][FMA] Regenerate tests. | Simon Pilgrim | 2015-11-22 | 1 | -8/+8
    Fixes some broken checks.
    llvm-svn: 253830
* Improved the operands commute transformation for X86-FMA3 instructions. | Andrew Kaylor | 2015-11-06 | 1 | -12/+9
    All 3 operands of FMA3 instructions are commutable now.
    Patch by Slava Klochkov
    Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab).
    Differential Revision: http://reviews.llvm.org/D13269
    llvm-svn: 252335
* [DAGCombiner] Improved FMA combine support for vectors | Simon Pilgrim | 2015-10-11 | 1 | -151/+185
    Enabled constant canonicalization for all constants.
    Improved combining of constant vectors.
    llvm-svn: 249993
* [DAGCombiner] Improve FMA support for interpolation patterns | Simon Pilgrim | 2015-09-21 | 1 | -0/+305
    This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.
    This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))))
    Differential Revision: http://reviews.llvm.org/D13003
    llvm-svn: 248210
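The algebra behind the interpolation combine in the message above can be sketched as follows. This is one plausible expansion under stated assumptions, not the DAGCombiner's actual output: `fma_model` is an unfused stand-in for a fused multiply-add, and the function names and sample values are hypothetical. The inputs are chosen so that every intermediate is exactly representable; as the 2016-12-02 entry in this log explains, the two forms can differ for infinities or when intermediate rounding matters.

```python
def fma_model(a: float, b: float, c: float) -> float:
    """Unfused stand-in for a fused multiply-add: a * b + c."""
    return a * b + c


def lerp_direct(x: float, y: float, t: float) -> float:
    """FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))) evaluated as written."""
    return x * t + y * (1.0 - t)


def lerp_fma(x: float, y: float, t: float) -> float:
    """Same expression after distributing: y * (1 - t) == -(t * y) + y."""
    y_part = fma_model(-t, y, y)     # plays the role of an FNMADD-style op
    return fma_model(x, t, y_part)   # x * t + y_part


def fadd_one_direct(x: float, y: float) -> float:
    """FMUL(FADD(1.0, x), y) evaluated as written."""
    return (1.0 + x) * y


def fadd_one_fma(x: float, y: float) -> float:
    """The FMA equivalent: x * y + y."""
    return fma_model(x, y, y)


if __name__ == "__main__":
    print(lerp_direct(2.0, 4.0, 0.25), lerp_fma(2.0, 4.0, 0.25))
    print(fadd_one_direct(2.0, 4.0), fadd_one_fma(2.0, 4.0))
```

The rewritten form needs two multiply-add operations where the direct form needs a subtract, two multiplies and an add, which is the motivation the commit gives for recognising these patterns.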
* [X86][FMA] Refreshed fma tests | Simon Pilgrim | 2015-09-12 | 1 | -105/+159
    llvm-svn: 247508
* [opaque pointer type] Add textual IR support for explicit type parameter to load instruction | David Blaikie | 2015-02-27 | 1 | -2/+2
    Essentially the same as the GEP change in r230786.
    A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278)

    import fileinput
    import sys
    import re

    pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

    for line in sys.stdin:
      sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

    Reviewers: rafael, dexonsmith, grosser
    Differential Revision: http://reviews.llvm.org/D7649
    llvm-svn: 230794
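To see what the migration script in the message above does to a single line, the sketch below wraps the same regular expression and replacement string in a function. The pattern and the replacement are copied from the commit message; the function name `add_explicit_load_type` and the sample IR lines are hypothetical.

```python
import re

# Pattern and replacement copied from the migration script in the commit message.
LOAD_PAT = re.compile(
    r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*"
    r"($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast"
    r"|inttoptr|\[\[[a-zA-Z]|\{\{).*$)"
)
LOAD_REPL = r"\1, \2\3*\4"


def add_explicit_load_type(line: str) -> str:
    """Rewrite 'load T* %p' to 'load T, T* %p'; other lines pass through."""
    return re.sub(LOAD_PAT, LOAD_REPL, line)


if __name__ == "__main__":
    # Hypothetical IR lines in the pre-migration syntax.
    print(add_explicit_load_type("  %x = load float* %p"))
    print(add_explicit_load_type("  %v = load volatile i32* @g"))
    print(add_explicit_load_type("  %r = fadd float %a, %b"))
```

Group 2 captures the pointee type, so the replacement repeats it as the new explicit result type ahead of the original pointer operand.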
* Improve logic that decides if it's profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to physical register. | Craig Topper | 2014-11-05 | 1 | -2/+2
    Fix up the tests that this change improves.
    llvm-svn: 221336
* Disambiguate function names in some CodeGen tests. | Stephen Lin | 2013-07-18 | 1 | -4/+4
    (Some tests were using function names that also were names of instructions and/or doing other unusual things that were making the test not amenable to otherwise scriptable pattern matching.) No functionality change.
    llvm-svn: 186621
* Mark FMA4 instructions as commutable and add them to the folding tables. | Craig Topper | 2012-08-31 | 1 | -0/+29
    llvm-svn: 163035
* Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted. | Craig Topper | 2012-08-31 | 1 | -15/+15
    llvm-svn: 163001
* Add support for converting llvm.fma to fma4 instructions. | Craig Topper | 2012-08-31 | 1 | -0/+43
    llvm-svn: 162999
* llvm/test/CodeGen/X86/fma_patterns.ll: Add -mtriple=x86_64. | NAKAMURA Takumi | 2012-08-27 | 1 | -1/+1
    It was incompatible on i686 and Windows x64.
    llvm-svn: 162664
* FMA3 tests on bdver2 target for changes made in rev 162012. Also made | Anitha Boyapati | 2012-08-27 | 1 | -0/+1
    corresponding changes to existing tests for darwin triple to ensure that same pattern is tested for bdver2 target.
    llvm-svn: 162655
* Line endings. | Matt Beaumont-Gay | 2012-08-01 | 1 | -139/+139
    llvm-svn: 161117
* Added FMA functionality to X86 target. | Elena Demikhovsky | 2012-08-01 | 1 | -0/+139
    llvm-svn: 161110