author | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-10-12 11:23:04 +0000 |
---|---|---|
committer | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-10-12 11:23:04 +0000 |
commit | 6eebbe0a971f7c350571c3788111da14198d01f2 (patch) | |
tree | fe22600a6f0ccf9854943112781ca7719fd8cafa /llvm/lib | |
parent | e02d09d3db6e5c9d0f5f0a384179e9cf6bec87a4 (diff) | |
[tblgen][llvm-mca] Add the ability to describe move elimination candidates via tablegen.
This patch adds the ability to identify instructions that are "move elimination
candidates". It also allows scheduling models to describe processor register
files that allow move elimination.
A move elimination candidate is an instruction that can be eliminated at
the register renaming stage.
Each subtarget can specify which instructions are move elimination candidates
with the help of tablegen class "IsOptimizableRegisterMove" (see
llvm/Target/TargetInstrPredicate.td).
For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated.
The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:
```
def : IsOptimizableRegisterMove<[
  InstructionEquivalenceClass<[
    // GPR variants.
    MOV32rr, MOV64rr,
    // MMX variants.
    MMX_MOVQ64rr,
    // SSE variants.
    MOVAPSrr, MOVUPSrr,
    MOVAPDrr, MOVUPDrr,
    MOVDQArr, MOVDQUrr,
    // AVX variants.
    VMOVAPSrr, VMOVUPSrr,
    VMOVAPDrr, VMOVUPDrr,
    VMOVDQArr, VMOVDQUrr
  ], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```
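As a rough illustration of what the definition above expresses, the following C++ sketch models the check by hand: an instruction qualifies when its opcode belongs to the equivalence class and its two register operands differ (the `CheckNot<CheckSameRegOperand<0, 1>>` part). The `Opcode` enum, the `SimpleInst` struct, and `isMoveEliminationCandidate` are hypothetical stand-ins; they are not LLVM types and not the code that tablegen emits.

```cpp
#include <algorithm>
#include <array>

// Hypothetical opcode set: a subset of the names listed above, plus one
// non-move opcode (ADD32rr) for contrast.
enum class Opcode { MOV32rr, MOV64rr, MMX_MOVQ64rr, MOVAPSrr, VMOVAPSrr, ADD32rr };

// Hypothetical, simplified instruction: an opcode and two register operands.
struct SimpleInst {
  Opcode Op;
  unsigned DstReg; // operand 0
  unsigned SrcReg; // operand 1
};

// Returns true if Inst is in the equivalence class and operand 0 differs
// from operand 1.
bool isMoveEliminationCandidate(const SimpleInst &Inst) {
  static constexpr std::array<Opcode, 5> Moves = {
      Opcode::MOV32rr, Opcode::MOV64rr, Opcode::MMX_MOVQ64rr,
      Opcode::MOVAPSrr, Opcode::VMOVAPSrr};
  const bool InClass =
      std::find(Moves.begin(), Moves.end(), Inst.Op) != Moves.end();
  return InClass && Inst.DstReg != Inst.SrcReg;
}
```

In this model, a `MOV64rr` between two different registers is a candidate, while a `MOV64rr` whose source and destination are the same register is not.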
Definitions of IsOptimizableRegisterMove from processor models of the same
target are processed by the SubtargetEmitter to auto-generate a target-specific
override for each of the following predicate methods:
```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI) const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned CPUID) const;
```
By default, those methods return false (i.e. conservatively assume that there
are no move elimination candidates).
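The default-plus-override arrangement can be pictured with the C++ sketch below. It is a simplified model under stated assumptions: `InstrAnalysisBase`, `X86LikeInstrAnalysis`, the opcode constants, and the processor ID values are all hypothetical, and the shape of the real auto-generated override is not shown in this commit.

```cpp
// Hypothetical opcode and processor identifiers, for illustration only.
constexpr unsigned OpMOV64rr = 1;
constexpr unsigned OpADD64rr = 2;
constexpr unsigned BtVer2Id = 7;
constexpr unsigned OtherCpuId = 8;

// Stand-in for a base analysis class: conservatively reports no candidates.
struct InstrAnalysisBase {
  virtual ~InstrAnalysisBase() = default;
  virtual bool isOptimizableRegisterMove(unsigned Opcode, unsigned CPUID) const {
    (void)Opcode;
    (void)CPUID;
    return false;
  }
};

// Stand-in for a target-specific override: only a processor model that
// defines the predicate reports candidates; every other model falls back
// to the conservative default.
struct X86LikeInstrAnalysis : InstrAnalysisBase {
  bool isOptimizableRegisterMove(unsigned Opcode, unsigned CPUID) const override {
    if (CPUID == BtVer2Id)
      return Opcode == OpMOV64rr;
    return InstrAnalysisBase::isOptimizableRegisterMove(Opcode, CPUID);
  }
};
```

The `CPUID` parameter matters in this model because one target can host several processor models, and only some of them may describe move elimination.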
Tablegen class RegisterFile has been extended with the following information:
- The set of register classes that allow move elimination.
- Maximum number of moves that can be eliminated every cycle.
- Whether move elimination is restricted to moves from registers that are
known to be zero.
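To make the three properties listed above concrete, here is a minimal C++ sketch of how a simulator might apply them when deciding whether a move can be eliminated. `RegisterFileModel` and `MoveEliminator` are hypothetical names, register classes are represented as plain strings for brevity, and treating a per-cycle maximum of 0 as "no limit" is an assumption of this sketch rather than something stated in the commit message.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical model of the move-elimination properties of one register file.
struct RegisterFileModel {
  std::set<std::string> EliminableClasses; // classes that allow elimination
  unsigned MaxMovesPerCycle;               // assumption: 0 means "no limit"
  bool OnlyFromZeroRegs;                   // only sources known to be zero
};

// Tracks how many moves have been eliminated in the current cycle.
class MoveEliminator {
public:
  explicit MoveEliminator(RegisterFileModel M) : Model(std::move(M)) {}

  // Resets the per-cycle counter at the start of every simulated cycle.
  void beginCycle() { EliminatedThisCycle = 0; }

  // Returns true (and counts the move) if all three constraints are met.
  bool tryEliminate(const std::string &RegClass, bool SrcKnownZero) {
    if (Model.EliminableClasses.count(RegClass) == 0)
      return false;
    if (Model.OnlyFromZeroRegs && !SrcKnownZero)
      return false;
    if (Model.MaxMovesPerCycle != 0 &&
        EliminatedThisCycle >= Model.MaxMovesPerCycle)
      return false;
    ++EliminatedThisCycle;
    return true;
  }

private:
  RegisterFileModel Model;
  unsigned EliminatedThisCycle = 0;
};
```

With a model of `{{"GR64"}, 1, true}`, a second zero-register move in the same cycle is rejected until `beginCycle()` is called again.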
This patch is structured in three parts:
A first part (which is mostly boilerplate) adds the new
'isOptimizableRegisterMove' target hooks, and extends existing register file
descriptors in MC by introducing new fields to describe properties related to
move elimination.
A second part uses the new tablegen constructs to describe move elimination in
the BtVer2 scheduling model.
A third part teaches llvm-mca how to query the new 'isOptimizableRegisterMove'
hook to mark instructions that are candidates for move elimination. It also
teaches class RegisterFile how to describe constraints on move elimination at
PRF granularity.
llvm-mca tests for btver2 show differences before/after this patch.
Differential Revision: https://reviews.llvm.org/D53134
llvm-svn: 344334
Diffstat (limited to 'llvm/lib')
-rw-r--r-- | llvm/lib/Target/X86/X86ScheduleBtVer2.td | 34 |
1 files changed, 32 insertions, 2 deletions
diff --git a/llvm/lib/Target/X86/X86ScheduleBtVer2.td b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
index 2c1a4b6c7f5..33a6b01546d 100644
--- a/llvm/lib/Target/X86/X86ScheduleBtVer2.td
+++ b/llvm/lib/Target/X86/X86ScheduleBtVer2.td
@@ -48,12 +48,22 @@ def JFPU1 : ProcResource<1>; // Vector/FPU Pipe1: VALU1/STC/FPM
 // part of it.
 // Reference: Section 21.10 "AMD Bobcat and Jaguar pipeline: Partial register
 // access" - Agner Fog's "microarchitecture.pdf".
-def JIntegerPRF : RegisterFile<64, [GR64, CCR]>;
+def JIntegerPRF : RegisterFile<64, [GR64, CCR], [1, 1], [1, 0],
+                               0,  // Max moves that can be eliminated per cycle.
+                               1>; // Restrict move elimination to zero regs.
 
 // The Jaguar FP Retire Queue renames SIMD and FP uOps onto a pool of 72 SSE
 // registers. Operations on 256-bit data types are cracked into two COPs.
 // Reference: www.realworldtech.com/jaguar/4/
-def JFpuPRF: RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]>;
+
+// The PRF in the floating point unit can eliminate a move from a MMX or SSE
+// register that is know to be zero (i.e. it has been zeroed using a zero-idiom
+// dependency breaking instruction, or via VZEROALL).
+// Reference: Section 21.8 "AMD Bobcat and Jaguar pipeline: Dependency-breaking
+// instructions" - Agner Fog's "microarchitecture.pdf"
+def JFpuPRF: RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2], [1, 1, 0],
+                          0,  // Max moves that can be eliminated per cycle.
+                          1>; // Restrict move elimination to zero regs.
 
 // The retire control unit (RCU) can track up to 64 macro-ops in-flight. It can
 // retire up to two macro-ops per cycle.
@@ -805,4 +815,24 @@ def : IsDepBreakingFunction<[
   ], ZeroIdiomPredicate>
 ]>;
 
+def : IsOptimizableRegisterMove<[
+  InstructionEquivalenceClass<[
+    // GPR variants.
+    MOV32rr, MOV64rr,
+
+    // MMX variants.
+    MMX_MOVQ64rr,
+
+    // SSE variants.
+    MOVAPSrr, MOVUPSrr,
+    MOVAPDrr, MOVUPDrr,
+    MOVDQArr, MOVDQUrr,
+
+    // AVX variants.
+    VMOVAPSrr, VMOVUPSrr,
+    VMOVAPDrr, VMOVUPDrr,
+    VMOVDQArr, VMOVDQUrr
+  ], TruePred >
+]>;
+
 } // SchedModel