diff options
author | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-07-19 16:42:15 +0000 |
---|---|---|
committer | Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> | 2018-07-19 16:42:15 +0000 |
commit | b6022aa8d900d175da00a5ca7b948a27ec61f4bf (patch) | |
tree | b593879065f37bb5c48bc069f10522d8f42c24a2 /llvm/lib/Target/X86/X86FixupLEAs.cpp | |
parent | b3638135432b66aa56831556b51963aef44a352c (diff) | |
download | bcm5719-llvm-b6022aa8d900d175da00a5ca7b948a27ec61f4bf.tar.gz bcm5719-llvm-b6022aa8d900d175da00a5ca7b948a27ec61f4bf.zip |
[X86][BtVer2] correctly model the latency/throughput of LEA instructions.
This patch fixes the latency/throughput of LEA instructions in the BtVer2
scheduling model.
On Jaguar, A 3-operands LEA has a latency of 2cy, and a reciprocal throughput of
1. That is because it uses one cycle of SAGU followed by 1cy of ALU1. An LEA
with a "Scale" operand is also slow, and it has the same latency profile as the
3-operands LEA. An LEA16r has a latency of 3cy, and a throughput of 0.5 (i.e.
RThrouhgput of 2.0).
This patch adds a new TIIPredicate named IsThreeOperandsLEAFn to X86Schedule.td.
The tablegen backend (for instruction-info) expands that definition into this
(file X86GenInstrInfo.inc):
```
static bool isThreeOperandsLEA(const MachineInstr &MI) {
return (
(
MI.getOpcode() == X86::LEA32r
|| MI.getOpcode() == X86::LEA64r
|| MI.getOpcode() == X86::LEA64_32r
|| MI.getOpcode() == X86::LEA16r
)
&& MI.getOperand(1).isReg()
&& MI.getOperand(1).getReg() != 0
&& MI.getOperand(3).isReg()
&& MI.getOperand(3).getReg() != 0
&& (
(
MI.getOperand(4).isImm()
&& MI.getOperand(4).getImm() != 0
)
|| (MI.getOperand(4).isGlobal())
)
);
}
```
A similar method is generated in the X86_MC namespace, and included into
X86MCTargetDesc.cpp (the declaration lives in X86MCTargetDesc.h).
Back to the BtVer2 scheduling model:
A new scheduling predicate named JSlowLEAPredicate now checks if either the
instruction is a three-operands LEA, or it is an LEA with a Scale value
different than 1.
A variant scheduling class uses that new predicate to correctly select the
appropriate latency profile.
Differential Revision: https://reviews.llvm.org/D49436
llvm-svn: 337469
Diffstat (limited to 'llvm/lib/Target/X86/X86FixupLEAs.cpp')
-rw-r--r-- | llvm/lib/Target/X86/X86FixupLEAs.cpp | 11 |
1 files changed, 3 insertions, 8 deletions
diff --git a/llvm/lib/Target/X86/X86FixupLEAs.cpp b/llvm/lib/Target/X86/X86FixupLEAs.cpp index 157b07d819b..d85389a0a7f 100644 --- a/llvm/lib/Target/X86/X86FixupLEAs.cpp +++ b/llvm/lib/Target/X86/X86FixupLEAs.cpp @@ -286,6 +286,8 @@ static inline bool isRegOperand(const MachineOperand &Op) { } /// hasIneffecientLEARegs - LEA that uses base and index registers /// where the base is EBP, RBP, or R13 +// TODO: use a variant scheduling class to model the latency profile +// of LEA instructions, and implement this logic as a scheduling predicate. static inline bool hasInefficientLEABaseReg(const MachineOperand &Base, const MachineOperand &Index) { return Base.isReg() && isInefficientLEAReg(Base.getReg()) && @@ -296,13 +298,6 @@ static inline bool hasLEAOffset(const MachineOperand &Offset) { return (Offset.isImm() && Offset.getImm() != 0) || Offset.isGlobal(); } -// LEA instruction that has all three operands: offset, base and index -static inline bool isThreeOperandsLEA(const MachineOperand &Base, - const MachineOperand &Index, - const MachineOperand &Offset) { - return isRegOperand(Base) && isRegOperand(Index) && hasLEAOffset(Offset); -} - static inline int getADDrrFromLEA(int LEAOpcode) { switch (LEAOpcode) { default: @@ -477,7 +472,7 @@ FixupLEAPass::processInstrForSlow3OpLEA(MachineInstr &MI, const MachineOperand &Offset = MI.getOperand(4); const MachineOperand &Segment = MI.getOperand(5); - if (!(isThreeOperandsLEA(Base, Index, Offset) || + if (!(TII->isThreeOperandsLEA(MI) || hasInefficientLEABaseReg(Base, Index)) || !TII->isSafeToClobberEFLAGS(*MFI, MI) || Segment.getReg() != X86::NoRegister) |