| author | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-09-21 12:43:07 +0000 |
|---|---|---|
| committer | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-09-21 12:43:07 +0000 |
| commit | 4cd5cf9fc8ef39c6041aec26809813c6cb3dbf41 (patch) | |
| tree | dee5ac7e786c74f7e78a98c05c9e23ea22e6fc20 /llvm/lib | |
| parent | 7cf529c11f33f1a74a63f8923c627d4981fc00c4 (diff) | |
[X86][BtVer2] Fix latency and resource cycles of AVX 256-bit zero-idioms.
This patch introduces a SchedWriteVariant to describe zero-idiom VXORP(S|D)Yrr
and VANDNP(S|D)Yrr.
This is a follow-up to r342555.
On Jaguar, a VXORPSYrr is 2 macro opcodes. Only one opcode is eliminated at
the register-renaming stage. The other opcode has to be executed to set the
upper half of the destination YMM.
The same applies to VANDNP(S|D)Yrr.
Differential Revision: https://reviews.llvm.org/D52347
llvm-svn: 342728
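
The variant selection described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not LLVM's implementation: the Python function names are hypothetical, the scheduling classes are represented as plain strings, and the zero-idiom check is assumed to be "both source operands name the same register".

```python
# Toy model only: these helpers are hypothetical and are not LLVM APIs.
# The opcode and scheduling-class names are taken from the patch.
ZERO_IDIOM_YMM_OPCODES = {"VXORPSYrr", "VXORPDYrr", "VANDNPSYrr", "VANDNPDYrr"}


def is_zero_idiom(src1, src2):
    """Assumed predicate: both source operands are the same register."""
    return src1 == src2


def select_sched_write(opcode, src1, src2):
    """Mimic the first-match selection of a SchedWriteVariant."""
    if opcode not in ZERO_IDIOM_YMM_OPCODES:
        raise ValueError(f"{opcode} is not covered by JWriteFZeroIdiomY")
    if is_zero_idiom(src1, src2):
        # Two opcodes: one is eliminated at register renaming, the other
        # still executes to set the upper half of the destination YMM.
        return "JWriteZeroIdiomYmm"
    # Fallback variant guarded by NoSchedPred in the patch.
    return "WriteFLogicY"


if __name__ == "__main__":
    print(select_sched_write("VXORPSYrr", "ymm0", "ymm0"))
    print(select_sched_write("VXORPSYrr", "ymm0", "ymm1"))
```

In this model, `vxorps %ymm0, %ymm0, %ymm0` resolves to the zero-idiom write, while the same opcode with two different source registers falls back to the regular 256-bit logic write.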
Diffstat (limited to 'llvm/lib')
| -rw-r--r-- | llvm/lib/Target/X86/X86ScheduleBtVer2.td | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/llvm/lib/Target/X86/X86ScheduleBtVer2.td b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
index af5ce7bbfe8..0c1b3c60e05 100644
--- a/llvm/lib/Target/X86/X86ScheduleBtVer2.td
+++ b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
@@ -595,6 +595,10 @@ def JWriteZeroLatency : SchedWriteRes<[]> {
   let Latency = 0;
 }
 
+def JWriteZeroIdiomYmm : SchedWriteRes<[JFPU01, JFPX]> {
+  let NumMicroOps = 2;
+}
+
 // Certain instructions that use the same register for both source
 // operands do not have a real dependency on the previous contents of the
 // register, and thus, do not have to wait before completing. They can be
@@ -619,6 +623,13 @@ def : InstRW<[JWriteFZeroIdiom], (instrs XORPSrr, VXORPSrr, XORPDrr, VXORPDrr,
                                          ANDNPSrr, VANDNPSrr,
                                          ANDNPDrr, VANDNPDrr)>;
 
+def JWriteFZeroIdiomY : SchedWriteVariant<[
+    SchedVar<MCSchedPredicate<ZeroIdiomPredicate>, [JWriteZeroIdiomYmm]>,
+    SchedVar<NoSchedPred, [WriteFLogicY]>
+]>;
+def : InstRW<[JWriteFZeroIdiomY], (instrs VXORPSYrr, VXORPDYrr,
+                                          VANDNPSYrr, VANDNPDYrr)>;
+
 def JWriteVZeroIdiomLogic : SchedWriteVariant<[
     SchedVar<MCSchedPredicate<ZeroIdiomPredicate>, [JWriteZeroLatency]>,
     SchedVar<NoSchedPred, [WriteVecLogic]>

