path: root/llvm/lib/Target/X86/X86InstrFPStack.td
Commit message | Author | Age | Files | Lines
* [X86] Use SDNPOptInGlue instead of SDNPInGlue on a couple SDNodes. | Craig Topper | 2020-01-12 | 1 | -2/+2
  At least one of these is used without a Glue. This doesn't seem to change the X86GenDAGISel.inc output so maybe it doesn't matter?
* [X86] Preserve fpexcept property when turning strict_fp_extend and strict_fp_round into stack operations. | Craig Topper | 2020-01-10 | 1 | -0/+3
  We use the stack for X87 fp_round and for moving from SSE f32/f64 to X87 f64/f80, or from X87 f64/f80 to SSE f32/f64. Note that for the SSE<->X87 conversions the conversion always happens in the X87 domain. The load/store ops in the X87 instructions are able to signal exceptions.
* [FPEnv][X86] Constrained FCmp intrinsics enabling on X86 | Wang, Pengfei | 2019-12-11 | 1 | -6/+21
  Summary: This is a follow-up to D69281; it enables X86 backend support for FP comparison.
  Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor
  Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70582
* [X86] Add strict fp support for operations of X87 instructions | Craig Topper | 2019-11-26 | 1 | -17/+17
  This is the follow-up patch to D68854. It adds basic operations of X87 instructions, including +, -, *, /, fp extensions and fp truncations.
  Patch by Chen Liu (LiuChen3)
  Differential Revision: https://reviews.llvm.org/D68857
* [X86] add mayRaiseFPException flag and FPCW registers for X87 instructions | Pengfei Wang | 2019-11-01 | 1 | -24/+42
  Summary: This patch adds the flag "mayRaiseFPException", FPCW and FPSW for X87 instructions which could raise a floating-point exception.
  Reviewers: pengfei, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally, craig.topper
  Reviewed By: craig.topper
  Subscribers: thakis, hiraditya, llvm-commits
  Patch by LiuChen.
  Differential Revision: https://reviews.llvm.org/D68854
* Revert "[X86] add mayRaiseFPException flag and FPCW registers for X87 instructions" | Nico Weber | 2019-10-31 | 1 | -42/+24
  This reverts commit a678677da498a45f59c16ee74fea438e34a801ce.
  It broke CodeGen/ms-inline-asm.c on most bots.
* [X86] add mayRaiseFPException flag and FPCW registers for X87 instructions | Craig Topper | 2019-10-31 | 1 | -24/+42
  This patch adds the flag "mayRaiseFPException", FPCW and FPSW for X87 instructions which could raise a floating-point exception.
  Patch by LiuChen. With a couple small fixes from me.
  Differential Revision: https://reviews.llvm.org/D68854
* [X86] Remove FSIN/FCOS isel patterns and the pseudo instructions that they selected for the FP stackifier. | Craig Topper | 2019-10-31 | 1 | -5/+2
  We always expand these to libcalls so get rid of the last vestiges of using the instructions.
* Recommit r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2" | Craig Topper | 2019-04-11 | 1 | -0/+13
  With correct test checks this time.
  If we have X87, but not SSE2, we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness.
  This matches what gcc and icc do for this case and removes an existing FIXME.
  llvm-svn: 358214
* Revert r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2" | Craig Topper | 2019-04-11 | 1 | -13/+0
  I seem to have messed up the test checks.
  llvm-svn: 358212
* [X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2 | Craig Topper | 2019-04-11 | 1 | -0/+13
  If we have X87, but not SSE2, we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness.
  This matches what gcc and icc do for this case and removes an existing FIXME.
  Differential Revision: https://reviews.llvm.org/D60156
  llvm-svn: 358211
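  Why the fild/fist route loses no bits can be illustrated with a short C++ sketch. This is not code from the commit and it says nothing about atomicity; it only checks that a 64-bit integer survives a trip through a floating-point type whose significand has at least 64 bits, as the x87 80-bit format does. The names `survivesLongDoubleRoundTrip` and `kLongDoubleHolds64Bits` are hypothetical, and the sketch assumes `long double` maps to the x87 extended format, which is typical for GCC and Clang on x86 Linux but not guaranteed elsewhere.

```cpp
#include <cstdint>
#include <limits>

// True when long double has a significand of at least 64 bits, as the x87
// 80-bit extended format does. On other ABIs this may be false.
constexpr bool kLongDoubleHolds64Bits =
    std::numeric_limits<long double>::digits >= 64;

// Hypothetical helper: reports whether `value` comes back unchanged after
// being widened to long double and converted back to a 64-bit integer.
// Only call it with values near the int64 limits when
// kLongDoubleHolds64Bits is true; otherwise the widening may round up and
// the conversion back would overflow.
bool survivesLongDoubleRoundTrip(std::int64_t value) {
    long double wide = static_cast<long double>(value);   // loosely analogous to fild
    std::int64_t back = static_cast<std::int64_t>(wide);  // loosely analogous to fistp
    return back == value;
}
```

  With a 64-bit significand every i64 value is exactly representable, so the check holds even for the extreme values; with a 53-bit `double` it would already fail for 2^53 + 1.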
* [X86] Remove hasSideEffects=1 from the X87 pseudos with folded load. | Craig Topper | 2019-02-21 | 1 | -2/+4
  This was done in r321424 to prevent scheduling from reordering things. But now that we model FPCW as a dependency, I don't think the same scheduling we were trying to prevent can occur.
  llvm-svn: 354628
* [X86] Mark FP32_TO_INT16_IN_MEM/FP32_TO_INT32_IN_MEM/FP32_TO_INT64_IN_MEM as clobbering EFLAGS to prevent mis-scheduling during conversion from SelectionDAG to MIR. | Craig Topper | 2019-02-19 | 1 | -1/+3
  After r354178, these instructions expand to a sequence that uses an OR instruction. That OR clobbers EFLAGS so we need to state that to avoid accidentally using the clobbered flags.
  Our tests show the bug, but I didn't notice because the SETcc instructions didn't move after r354178 since it used to be safe to do the fp->int conversion first.
  We should probably convert this whole sequence to SelectionDAG instead of a custom inserter to avoid mistakes like this.
  Fixes PR40779
  llvm-svn: 354395
* [X86] Collapse FP_TO_INT16_IN_MEM/FP_TO_INT32_IN_MEM/FP_TO_INT64_IN_MEM into a single opcode using memory VT to distinguish. NFC | Craig Topper | 2019-02-12 | 1 | -7/+15
  llvm-svn: 353798
* [X86] Remove the value type operand from the floating point load/store MemIntrinsicSDNodes. Use the MemoryVT instead. NFCI | Craig Topper | 2019-02-12 | 1 | -43/+73
  We already have the memory VT, we can just match from that during isel.
  llvm-svn: 353797
* [X86] Removed unused SDTypeProfile. NFC | Craig Topper | 2019-02-11 | 1 | -2/+0
  llvm-svn: 353659
* [X86] Add FPCW as an implicit use on floating point load instructions. | Craig Topper | 2019-02-08 | 1 | -7/+7
  These instructions can generate a stack overflow exception so technically they read the stack overflow exception mask bit.
  llvm-svn: 353564
* [X86] Remove isReMaterializable from X87 floating point constant loads and constant pool loads. | Craig Topper | 2019-02-08 | 1 | -3/+2
  Summary: These instructions update FPSW so they aren't generically safe to rematerialize into any location if FPSW is live for a comparison result. They also use FPCW for exception masking control. Though the only exception they can generate is stack overflow and we manage the stack ourselves so that's not really going to occur.
  Reviewers: RKSimon, spatel
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D57934
  llvm-svn: 353536
* [X86] Add FPCW as a register and start using it as an implicit use on floating point instructions. | Craig Topper | 2019-02-08 | 1 | -29/+27
  Summary: FPCW contains the rounding mode control which we manipulate to implement fp to integer conversion by changing the rounding mode, storing the value to the stack, and then changing the rounding mode back. Because we didn't model FPCW and its dependency chain, other instructions could be scheduled into the middle of the sequence.
  This patch introduces the register and adds it as an implicit def of FLDCW and implicit use of the FP binary arithmetic instructions and store instructions. There are more instructions that need to be updated, but this is a good start.
  I believe this fixes at least the reduced test case from PR40529.
  Reviewers: RKSimon, spatel, rnk, efriedma, andrew.w.kaylor
  Subscribers: dim, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D57735
  llvm-svn: 353489
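  The save / switch / convert / restore sequence described in this message has a rough portable analogue in `<cfenv>`, sketched below in C++. This is an illustration only, not what the backend emits: `truncateViaRoundingMode` is a hypothetical name, `std::fesetround` stands in for fldcw, and `std::lrint` stands in for the x87 integer store. The `volatile` qualifiers are an assumption-laden workaround to discourage the compiler from moving the conversion out of the window between the two mode changes, which is the same class of reordering problem the commit addresses; strictly conforming code would also need the compiler's floating-point environment access support enabled.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper: converts `input` to an integer by truncation,
// temporarily switching the rounding mode and restoring it afterwards.
long truncateViaRoundingMode(double input) {
    volatile double value = input;            // keep the load inside the window
    const int savedMode = std::fegetround();  // save the current mode (like fnstcw)
    assert(savedMode >= 0);                   // fegetround reports failure as negative
    std::fesetround(FE_TOWARDZERO);           // switch to truncation (like fldcw)
    volatile long result = std::lrint(value); // convert under the new mode (like fistp)
    std::fesetround(savedMode);               // restore the saved mode (like fldcw)
    return result;
}
```

  If the conversion were evaluated before the first mode change or after the restore, 2.7 would round to 3 under the default round-to-nearest mode instead of truncating to 2.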
* [X86] Print all register forms of x87 fadd/fsub/fdiv/fmul as having two arguments where one is %st. | Craig Topper | 2019-02-04 | 1 | -16/+18
  All of these instructions consume one encoded register and the other register is %st. They either write the result to %st or the encoded register. Previously we printed both arguments when the encoded register was written. And we printed one argument when the result was written to %st. For the stack popping forms the encoded register is always the destination and we didn't print both operands. This was inconsistent with gcc and objdump and just makes the output assembly code harder to read.
  This patch changes things to always print both operands making us consistent with gcc and objdump. The parser should still be able to handle the single register forms just as it did before. This also matches the GNU assembler behavior.
  llvm-svn: 353061
* [X86] Print %st(0) as %st when it's implicit to the instruction. Continue printing it as %st(0) when it's encoded in the instruction. | Craig Topper | 2019-02-04 | 1 | -37/+37
  This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior.
  llvm-svn: 353015
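  The printing rule stated in this commit can be sketched as a small C++ function. This is a hypothetical illustration, not LLVM's instruction printer: the name `formatX87StackReg` and its `encodedInInstruction` parameter are invented here to capture the distinction the message draws between an implicit stack top and a register encoded in the instruction.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: formats an x87 stack register in AT&T syntax.
// The stack top is printed as "%st" only when it is implicit; any register
// that is encoded in the instruction keeps its explicit index.
std::string formatX87StackReg(unsigned index, bool encodedInInstruction) {
    assert(index < 8 && "x87 has eight stack registers");
    if (index == 0 && !encodedInInstruction) {
        return "%st";
    }
    return "%st(" + std::to_string(index) + ")";
}
```

  Under this rule an implicit stack top prints as `%st`, while an encoded stack top still prints as `%st(0)`.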
* Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly" | Craig Topper | 2019-02-04 | 1 | -14/+14
  After looking into gcc and objdump behavior more, this was overly aggressive. If the register is encoded in the instruction we should print %st(0); if it's implicit we should print %st.
  I'll be making a more directed change in a future patch.
  llvm-svn: 353013
* [X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly | Craig Topper | 2019-02-03 | 1 | -14/+14
  Summary: When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check.
  This was found when I attempted to make emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name.
  This also matches what objdump disassembly prints. It's also what is printed by gcc -S.
  Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri
  Reviewed By: rnk
  Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits
  Differential Revision: https://reviews.llvm.org/D57621
  llvm-svn: 352985
* [X86] Add FPSW as a Def on some FP instructions that were missing it. | Craig Topper | 2019-01-30 | 1 | -5/+5
  llvm-svn: 352607
* Update the file headers across all of the LLVM projects in the monorepo | Chandler Carruth | 2019-01-19 | 1 | -4/+3
  to reflect the new license.
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351636
* [X86] Mark the FUCOMI instructions as requiring CMOV to be enabled. NFCI | Craig Topper | 2018-08-28 | 1 | -0/+3
  These instructions were added on the PentiumPro along with CMOV.
  This was already comprehended by the lowering process which should emit an alternate sequence using FCOM and FNSTW. This just makes it an explicit error if that doesn't work for some reason.
  llvm-svn: 340844
* [X86] Introduce WriteFLDC for x87 constant loads. | Clement Courbet | 2018-05-31 | 1 | -5/+8
  Summary: {FLDL2E, FLDL2T, FLDLG2, FLDLN2, FLDPI} were using WriteMicrocoded.
  - I've measured the values for Broadwell, Haswell, SandyBridge, Skylake.
  - For ZnVer1 and Atom, values were transferred from InstRWs.
  - For SLM and BtVer2, I've guessed some values :(
  Reviewers: RKSimon, craig.topper, andreadb
  Subscribers: gbedwell, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47585
  llvm-svn: 333656
* [X86] Extract latency of fldz/fld1 in separate classes. | Clement Courbet | 2018-05-31 | 1 | -2/+3
  Summary:
  - I've measured the values for Broadwell, Haswell, SandyBridge, Skylake.
  - For ZnVer1 and Atom, values were transferred from `InstRW`s.
  - For SLM and BtVer2, values are from Agner.
  This is split off from https://reviews.llvm.org/D47377
  Reviewers: RKSimon, andreadb
  Subscribers: gbedwell, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47523
  llvm-svn: 333642
* [X86] Add WriteFCMOV scheduler class for x87 CMOVs | Simon Pilgrim | 2018-05-12 | 1 | -1/+1
  llvm-svn: 332173
* [X86] Split WriteFRcp/WriteFRsqrt/WriteFSqrt schedule classes | Simon Pilgrim | 2018-05-07 | 1 | -1/+1
  WriteFRcp/WriteFRsqrt are split to support scalar, XMM and YMM/ZMM instructions.
  WriteFSqrt is split into single/double/long-double sizes and scalar, XMM, YMM and ZMM instructions.
  This removes all InstrRW overrides for these instructions.
  NOTE: There were a couple of typos in the Znver1 model - notably a 1cy throughput for SQRT that is highly unlikely and doesn't tally with Agner.
  NOTE: I had to add Agner's numbers for several targets for WriteFSqrt80.
  llvm-svn: 331629
* [X86] Remove 'opaque ptr' from the intel syntax parser and printer. | Craig Topper | 2018-05-01 | 1 | -4/+4
  Previously for instructions like fxsave we would print "opaque ptr" as part of the memory operand. Now we print nothing.
  We also no longer accept "opaque ptr" in the parser. We still accept any size to be specified for these instructions, but we may want to consider only parsing when no explicit size is specified. This is what gas does.
  llvm-svn: 331243
* [X86] Add WriteFSign/WriteFLogic scheduler classes | Simon Pilgrim | 2018-04-20 | 1 | -1/+1
  Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes.
  This unearthed a couple of things that are also handled in this patch:
  (1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic
  (2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015.
  (3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops.
  Differential Revision: https://reviews.llvm.org/D45629
  llvm-svn: 330480
* [X86] Add FP comparison scheduler classes | Simon Pilgrim | 2018-04-17 | 1 | -4/+6
  Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd
  Differential Revision: https://reviews.llvm.org/D45656
  llvm-svn: 330179
* [X86] Remove X87 schedule itineraries (PR37093) | Simon Pilgrim | 2018-04-12 | 1 | -125/+97
  First of a number of commits to remove x86 schedule itineraries entirely - approved off-line with @craig.topper
  llvm-svn: 329893
* [x86] put nops into the WriteNop class and customize for Jaguar | Sanjay Patel | 2018-03-19 | 1 | -3/+3
  1. Given that we already have a classification bucket with 'nop' in the name, that's where 'nop' belongs. Right now, it's only used for prefix bytes and 'pause'.
  2. Make the latency of this class '1' for Jaguar to tell the scheduler (and presumably llvm-mca) how to model the resource requirements better even though a nop has no dependencies.
  Differential Revision: https://reviews.llvm.org/D44608
  llvm-svn: 327853
* [X86][X87] Mark pseudo memory fold instructions as load/sideeffects (PR21160, PR34080, PR34454). | Simon Pilgrim | 2017-12-24 | 1 | -4/+2
  Match regular x87 memory fold instructions with load/sideeffects tags, to prevent the schedulers from re-ordering them across the fnstcw/fldcw sequences for truncating stores while they are still pseudo during the stack conversion pass.
  llvm-svn: 321424
* [X86] Fix XSAVE64 and similar instructions to not be allowed by the assembler in 32-bit mode. | Craig Topper | 2017-12-15 | 1 | -13/+12
  There was a top level "let Predicates =" in the .td file that was overriding the Requires on each instruction.
  I've added an assert to the code emitter to catch more cases like this. I'm sure this isn't the only place where the right predicates aren't being applied. This assert already found that we don't block btq/btsq/btrq in 32-bit mode.
  llvm-svn: 320830
* [X86][X87] Tag x87 load/store instructions scheduler classes | Simon Pilgrim | 2017-12-08 | 1 | -5/+11
  llvm-svn: 320192
* [X86][X87] Tag x87 float compare instructions scheduler classes | Simon Pilgrim | 2017-12-08 | 1 | -11/+15
  llvm-svn: 320189
* [X86][X87] X87 math binop pseudo instructions don't need scheduling info | Simon Pilgrim | 2017-12-07 | 1 | -0/+5
  llvm-svn: 320044
* [X86][X87] Tag FCMOV instruction scheduler classes | Simon Pilgrim | 2017-12-05 | 1 | -15/+19
  llvm-svn: 319804
* [X86][X87] Tag FP_TO_INT_IN_MEM pseudos with hasNoSchedulingInfo | Simon Pilgrim | 2017-11-28 | 1 | -2/+2
  We don't need scheduling info for pseudos
  llvm-svn: 319197
* [X86][X87] Tag FTST x87 instruction scheduler class | Simon Pilgrim | 2017-11-28 | 1 | -1/+2
  Looking through Agner, FTST is very similar to generic float compare behaviour, so I've added them to the existing IIC_FCOMI (WriteFAdd) tags.
  llvm-svn: 319184
* [X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes | Simon Pilgrim | 2017-11-28 | 1 | -16/+26
  Atom's FABS/FCHS/FSQRT latencies taken from Agner.
  Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction.
  llvm-svn: 319175
* [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf. | Daniel Sanders | 2017-10-13 | 1 | -8/+8
  Summary: There's only a tablegen testcase for IntImmLeaf and not a CodeGen one because the relevant rules are rejected for other reasons at the moment. On AArch64, it's because there's an SDNodeXForm attached to the operand. On X86, it's because the rule either emits multiple instructions or has another predicate using PatFrag which cannot easily be supported at the same time.
  Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
  Reviewed By: qcolombet
  Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D36569
  llvm-svn: 315761
* [X86][X87] Ensure x87 instructions are tagged as altering the FPSW reg | Simon Pilgrim | 2017-09-06 | 1 | -7/+8
  As noted in PR34080, a lot of x87 instructions alter the FPSW status register (or leave it in an undefined state) but aren't tagged as such in the tablegen.
  This patch tags the control word, stack, wait and math instructions as altering FPSW, which matches what the AMD APMs suggest happens.
  Differential Revision: https://reviews.llvm.org/D36414
  llvm-svn: 312629
* [X86] Add comment to match closing Defs = [FPSW]. NFCI. | Simon Pilgrim | 2017-08-06 | 1 | -1/+1
  llvm-svn: 310202
* Revert r295004 (Add MXCSR) due to errors reported by MachineVerifier | Andrew Kaylor | 2017-03-13 | 1 | -15/+11
  I am leaving the code in clang which filters mxcsr from the clobber list because that is still technically correct and will be useful again when the MXCSR register is reintroduced.
  llvm-svn: 297664
* [X86] Add MXCSR register | Andrew Kaylor | 2017-02-13 | 1 | -9/+14
  This adds MXCSR to the set of recognized registers for X86 targets and updates the instructions that read or write it. I do not intend for all of the various floating point instructions that implicitly use the control bits or update the status bits of this register to ever have that usage modeled by default. However, when constrained floating point modes (such as strict FP exception status modeling or dynamic rounding modes) are enabled, implicit use/def information for MXCSR will be added to those instructions.
  Until those additional updates are made this should cause (almost?) no functional changes. Theoretically, this will prevent instructions like LDMXCSR and STMXCSR from being moved past one another, but that should be prevented anyway and I haven't found a case where it is happening now.
  Differential Revision: https://reviews.llvm.org/D29903
  llvm-svn: 295004
* [X86] Adding FFREEP instruction. | Chris Ray | 2017-01-27 | 1 | -0/+3
  Summary: Small change to get the FFREEP instruction to decode properly.
  Reviewers: craig.topper
  Reviewed By: craig.topper
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29193
  llvm-svn: 293314