author: Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> 2018-09-21 12:43:07 +0000
committer: Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> 2018-09-21 12:43:07 +0000
commit: 4cd5cf9fc8ef39c6041aec26809813c6cb3dbf41
tree: dee5ac7e786c74f7e78a98c05c9e23ea22e6fc20 /llvm/lib
parent: 7cf529c11f33f1a74a63f8923c627d4981fc00c4
[X86][BtVer2] Fix latency and resource cycles of AVX 256-bit zero-idioms.
This patch introduces a SchedWriteVariant to describe zero-idiom
VXORP(S|D)Yrr and VANDNP(S|D)Yrr.

This is a follow-up of r342555.

On Jaguar, a VXORPSYrr is 2 macro opcodes. Only one opcode is eliminated at
the register-renaming stage. The other opcode has to be executed to set the
upper half of the destination YMM. The same applies to VANDNP(S|D)Yrr.

Differential Revision: https://reviews.llvm.org/D52347

llvm-svn: 342728
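The variant selection described in the commit message can be pictured with a small model. The following Python sketch is illustrative only and is not LLVM code: the `Inst` class and the `resolve_sched_write` helper are hypothetical, and it assumes that `ZeroIdiomPredicate` amounts to "both source operands name the same register" for these opcodes. Alternatives are tried in order, and `NoSchedPred` acts as the catch-all.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Inst:
    """Hypothetical stand-in for a three-operand AVX instruction."""
    opcode: str
    dst: str
    src1: str
    src2: str


def zero_idiom_predicate(inst: Inst) -> bool:
    # Assumption: the predicate only checks that both sources are the
    # same register (e.g. vxorps %ymm0, %ymm0, %ymm0).
    return inst.src1 == inst.src2


def no_sched_pred(inst: Inst) -> bool:
    # Catch-all alternative; always matches.
    return True


# Ordered alternatives mirroring JWriteFZeroIdiomY from the patch.
J_WRITE_F_ZERO_IDIOM_Y: List[Tuple[Callable[[Inst], bool], str]] = [
    (zero_idiom_predicate, "JWriteZeroIdiomYmm"),
    (no_sched_pred, "WriteFLogicY"),
]


def resolve_sched_write(inst: Inst, variant=J_WRITE_F_ZERO_IDIOM_Y) -> str:
    """Return the name of the first write whose predicate accepts inst."""
    for predicate, write in variant:
        if predicate(inst):
            return write
    raise ValueError("variant has no catch-all alternative")
```

With this model, an instruction such as `Inst("VXORPSYrr", "ymm0", "ymm0", "ymm0")` resolves to `JWriteZeroIdiomYmm` (the 2-micro-op write that still executes one opcode), while the same opcode with distinct sources falls through to `WriteFLogicY`.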
Diffstat (limited to 'llvm/lib')
-rw-r--r-- llvm/lib/Target/X86/X86ScheduleBtVer2.td | 11
1 file changed, 11 insertions, 0 deletions
diff --git a/llvm/lib/Target/X86/X86ScheduleBtVer2.td b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
index af5ce7bbfe8..0c1b3c60e05 100644
--- a/llvm/lib/Target/X86/X86ScheduleBtVer2.td
+++ b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
@@ -595,6 +595,10 @@ def JWriteZeroLatency : SchedWriteRes<[]> {
   let Latency = 0;
 }
 
+def JWriteZeroIdiomYmm : SchedWriteRes<[JFPU01, JFPX]> {
+  let NumMicroOps = 2;
+}
+
 // Certain instructions that use the same register for both source
 // operands do not have a real dependency on the previous contents of the
 // register, and thus, do not have to wait before completing. They can be
@@ -619,6 +623,13 @@ def : InstRW<[JWriteFZeroIdiom], (instrs XORPSrr, VXORPSrr, XORPDrr, VXORPDrr,
                                          ANDNPSrr, VANDNPSrr,
                                          ANDNPDrr, VANDNPDrr)>;
 
+def JWriteFZeroIdiomY : SchedWriteVariant<[
+    SchedVar<MCSchedPredicate<ZeroIdiomPredicate>, [JWriteZeroIdiomYmm]>,
+    SchedVar<NoSchedPred, [WriteFLogicY]>
+]>;
+def : InstRW<[JWriteFZeroIdiomY], (instrs VXORPSYrr, VXORPDYrr,
+                                          VANDNPSYrr, VANDNPDYrr)>;
+
 def JWriteVZeroIdiomLogic : SchedWriteVariant<[
     SchedVar<MCSchedPredicate<ZeroIdiomPredicate>, [JWriteZeroLatency]>,
     SchedVar<NoSchedPred, [WriteVecLogic]>