summaryrefslogtreecommitdiffstats
path: root/llvm/lib
diff options
context:
space:
mode:
authorRoman Lebedev <lebedev.ri@gmail.com>2019-02-01 11:15:13 +0000
committerRoman Lebedev <lebedev.ri@gmail.com>2019-02-01 11:15:13 +0000
commit7857215f8ea6850e78819e5c37a7904700bb10cf (patch)
tree9f7377e02a6478a75ee697f4595d079deb4b1f9f /llvm/lib
parent2c15fc56f8f1548b0ab3fdaf32132e683199bfa6 (diff)
downloadbcm5719-llvm-7857215f8ea6850e78819e5c37a7904700bb10cf.tar.gz
bcm5719-llvm-7857215f8ea6850e78819e5c37a7904700bb10cf.zip
[X86][BdVer2] Transfer delays from the integer to the floating point unit.
Summary: I'm unable to find this number in the "AMD SOG for family 15h". llvm-exegesis measures the latencies of these instructions as `2`, which matches the latencies specified in "AMD SOG for family 15h". However if we look at Agner, Microarchitecture, "AMD Bulldozer, Piledriver, Steamroller and Excavator pipeline", "Data delay between different execution domains", the int->ivec transfer is listed as `8`..`10`cy of additional latency. Also, Agner's "Instruction tables", for Piledriver, lists their latencies as `12`, which is consistent with `2cy` from exegesis / AMD SOG + `10cy` transfer delay. Additional data point comes from the fact that Agner's "Instruction tables", for Jaguar, lists their latencies as `8`; and "AMD SOG for family 16h" does state the `+6cy` int->ivec delay, which is consistent with instr latency of `1` or `2`. Reviewers: andreadb, RKSimon, craig.topper Reviewed By: andreadb Subscribers: gbedwell, courbet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57300 llvm-svn: 352861
Diffstat (limited to 'llvm/lib')
-rw-r--r--llvm/lib/Target/X86/X86ScheduleBdVer2.td5
1 files changed, 4 insertions, 1 deletions
diff --git a/llvm/lib/Target/X86/X86ScheduleBdVer2.td b/llvm/lib/Target/X86/X86ScheduleBdVer2.td
index 90ca79915fa..8e8fc6fd1ff 100644
--- a/llvm/lib/Target/X86/X86ScheduleBdVer2.td
+++ b/llvm/lib/Target/X86/X86ScheduleBdVer2.td
@@ -250,7 +250,10 @@ def : ReadAdvance<ReadAfterVecLd, 5>;
def : ReadAdvance<ReadAfterVecXLd, 5>;
def : ReadAdvance<ReadAfterVecYLd, 5>;
-def : ReadAdvance<ReadInt2Fpu, 0>;
+// Transfer from int domain to ivec domain incurs additional latency of 8..10cy
+// Reference: Agner, Microarchitecture, "AMD Bulldozer, Piledriver, Steamroller
+// and Excavator pipeline", "Data delay between different execution domains"
+def : ReadAdvance<ReadInt2Fpu, -10>;
// A folded store needs a cycle on the PdStore for the store data.
def : WriteRes<WriteRMW, [PdStore]>;
OpenPOWER on IntegriCloud