| author | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-06-04 15:43:09 +0000 |
|---|---|---|
| committer | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-06-04 15:43:09 +0000 |
| commit | 39e5a5695fd3e560565b40d1b60b4a1e78665875 (patch) | |
| tree | 3945b3ad7cc4465a52e631ce891e42a057ddb79c /llvm/lib/Target | |
| parent | ab60a2823f1a6548c17c57abbbafdb4ddb3bb785 (diff) | |
[RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca.
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html
This fixes PR36672.
The main goal of this patch is to teach llvm-mca how to resolve variant scheduling
classes. In addition, it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e.
dependency-breaking instructions that are known to generate zero, and that are
optimized out in hardware at the register renaming stage).
Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
1) a change that teaches llvm-mca how to resolve variant scheduling classes.
2) a change to the BtVer2 scheduling model that allows us to special-case
packed XOR zero-idioms (this partially fixes PR36671).
Differential Revision: https://reviews.llvm.org/D47374
llvm-svn: 333909
Diffstat (limited to 'llvm/lib/Target')
| -rw-r--r-- | llvm/lib/Target/X86/X86Schedule.td | 5 | ||||
| -rw-r--r-- | llvm/lib/Target/X86/X86ScheduleBtVer2.td | 33 |
2 files changed, 37 insertions, 1 deletions
diff --git a/llvm/lib/Target/X86/X86Schedule.td b/llvm/lib/Target/X86/X86Schedule.td
index ccee972c482..77e7f2e0f79 100644
--- a/llvm/lib/Target/X86/X86Schedule.td
+++ b/llvm/lib/Target/X86/X86Schedule.td
@@ -559,6 +559,11 @@ def SchedWriteFShuffleSizes
  : X86SchedWriteSizes<SchedWriteFShuffle, SchedWriteFShuffle>;
 
 //===----------------------------------------------------------------------===//
+// Common MCInstPredicate definitions used by variant scheduling classes.
+
+def ZeroIdiomPredicate : CheckSameRegOperand<1, 2>;
+
+//===----------------------------------------------------------------------===//
 // Generic Processor Scheduler Models.
 
 // IssueWidth is analogous to the number of decode units. Core and its
diff --git a/llvm/lib/Target/X86/X86ScheduleBtVer2.td b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
index 764d097e369..721088457a3 100644
--- a/llvm/lib/Target/X86/X86ScheduleBtVer2.td
+++ b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
@@ -546,5 +546,36 @@ def JWriteJVZEROUPPER: SchedWriteRes<[]> {
   let NumMicroOps = 37;
 }
 def : InstRW<[JWriteJVZEROUPPER], (instrs VZEROUPPER)>;
 
-} // SchedModel
+///////////////////////////////////////////////////////////////////////////////
+// SchedWriteVariant definitions.
+///////////////////////////////////////////////////////////////////////////////
+
+def JWriteZeroLatency : SchedWriteRes<[]> {
+  let Latency = 0;
+}
+
+// Vector XOR instructions that use the same register for both source
+// operands do not have a real dependency on the previous contents of the
+// register, and thus, do not have to wait before completing. They can be
+// optimized out at register renaming stage.
+// Reference: Section 10.8 of the "Software Optimization Guide for AMD Family
+// 15h Processors".
+// Reference: Agner's Fog "The microarchitecture of Intel, AMD and VIA CPUs",
+// Section 21.8 [Dependency-breaking instructions].
+
+def JWriteFZeroIdiom : SchedWriteVariant<[
+  SchedVar<MCSchedPredicate<ZeroIdiomPredicate>, [JWriteZeroLatency]>,
+  SchedVar<MCSchedPredicate<TruePred>, [WriteFLogic]>
+]>;
+
+def : InstRW<[JWriteFZeroIdiom], (instrs XORPSrr, VXORPSrr, XORPDrr, VXORPDrr)>;
+
+def JWriteVZeroIdiom : SchedWriteVariant<[
+  SchedVar<MCSchedPredicate<ZeroIdiomPredicate>, [JWriteZeroLatency]>,
+  SchedVar<MCSchedPredicate<TruePred>, [WriteVecLogicX]>
+]>;
+
+def : InstRW<[JWriteVZeroIdiom], (instrs PXORrr, VPXORrr)>;
+
+} // SchedModel
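The variant definitions in the diff can be read as an ordered list of (predicate, write) pairs, where the first pair whose predicate holds for a given instruction selects the write. The following standalone C++ sketch illustrates that reading only; it is not the llvm-mca implementation and uses no LLVM API. The names `Inst`, `Variant`, `isZeroIdiom`, `alwaysTrue`, `resolveLatency` and `makeZeroIdiomVariants` are all hypothetical, the operand layout (operand 0 as destination, operands 1 and 2 as sources) is an assumption, and the default latency is a caller-supplied placeholder rather than a value taken from the BtVer2 model.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a machine instruction: a list of register ids.
// Assumption: operand 0 is the destination, operands 1 and 2 are the sources.
struct Inst {
  std::vector<unsigned> Regs;
};

// One alternative of a variant class: a predicate and the latency of the
// write it selects (a simplification; a real write carries more than latency).
struct Variant {
  std::function<bool(const Inst &)> Pred;
  unsigned Latency;
};

// Models the intent of CheckSameRegOperand<1, 2>: both sources are the
// same register.
bool isZeroIdiom(const Inst &I) {
  return I.Regs.size() > 2 && I.Regs[1] == I.Regs[2];
}

// Models TruePred: the fallback alternative that always matches.
bool alwaysTrue(const Inst &) { return true; }

// Returns the latency of the first alternative whose predicate holds.
unsigned resolveLatency(const std::vector<Variant> &Variants, const Inst &I) {
  for (const Variant &V : Variants)
    if (V.Pred(I))
      return V.Latency;
  assert(false && "no variant matched; a TruePred fallback is expected");
  return 0;
}

// Builds the two-entry list used by both variants in the diff:
// zero latency for the idiom, DefaultLatency (placeholder) otherwise.
std::vector<Variant> makeZeroIdiomVariants(unsigned DefaultLatency) {
  return {{isZeroIdiom, 0}, {alwaysTrue, DefaultLatency}};
}
```

With a placeholder default latency of 1, an instruction whose two sources are the same register resolves to latency 0, while one with distinct sources falls through to the default. Order matters in this sketch: placing the always-true entry first would shadow the zero-idiom case.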

