| | | |
|---|---|---|
| author | Ahmed Bougacha <ahmed.bougacha@gmail.com> | 2017-04-04 22:55:53 +0000 |
| committer | Ahmed Bougacha <ahmed.bougacha@gmail.com> | 2017-04-04 22:55:53 +0000 |
| commit | d3c03a5ddd9523386ef766a397483d09a60f7156 (patch) | |
| tree | aae82cca474dc693b9f0f1257dbe6c0cab892b6c /llvm/lib/Target | |
| parent | e73e00c9b29b3d9b28b728849b06ed6b0b9bea88 (diff) | |
| download | bcm5719-llvm-d3c03a5ddd9523386ef766a397483d09a60f7156.tar.gz bcm5719-llvm-d3c03a5ddd9523386ef766a397483d09a60f7156.zip | |
[AArch64] Avoid partial register deps on insertelt of load into lane 0.
This improves upon r246462: that prevented FMOVs from being emitted
for the cross-class INSERT_SUBREGs by disabling the formation of
INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting instead
introduced a partial (read-modify-write) dependency on the full vector
register.
Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that is
folded away by many patterns, including the scalar LDRS (an LDR into an
S register) that we want in this case.
Credit goes to Adam for finding the issue!
llvm-svn: 299482
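For context, here is a minimal NEON-intrinsics sketch (hypothetical example, not part of this commit; the function name, file name, and whether the optimizer actually forms a BUILD_VECTOR from it are assumptions) of the kind of pattern this lowering affects: a vector whose lane 0 comes from a scalar load. On AArch64, `ld1 { v0.s }[0], [x0]` inserts into a lane and therefore reads and merges the previous contents of the destination vector register, while `ldr s0, [x0]` overwrites the whole register; folding the lane-0 load through SCALAR_TO_VECTOR lets instruction selection pick the LDR form and avoid the partial-register dependency.

```cpp
// Hypothetical reduced example (not from the commit). Compile for AArch64,
// e.g. clang++ -O2 --target=aarch64-linux-gnu -S example.cpp, and check
// whether lane 0 is materialized with "ldr s0" or "ld1 { v0.s }[0]".
#include <arm_neon.h>

// Build a v4f32 from four scalar loads. Lane 0 is the interesting one:
// an "ld1" into the lane merges with (and so depends on) the previous value
// of the destination vector register; an "ldr s0" simply overwrites it.
float32x4_t build_from_loads(const float *a, const float *b,
                             const float *c, const float *d) {
  float32x4_t v = vdupq_n_f32(0.0f);
  v = vsetq_lane_f32(*a, v, 0);
  v = vsetq_lane_f32(*b, v, 1);
  v = vsetq_lane_f32(*c, v, 2);
  v = vsetq_lane_f32(*d, v, 3);
  return v;
}
```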
Diffstat (limited to 'llvm/lib/Target')
| | | |
|---|---|---|
| -rw-r--r-- | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | 16 |

1 file changed, 5 insertions, 11 deletions
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 0d4a9943ecc..ea184d55e44 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -6587,19 +6587,13 @@ FailedModImm:
     SDValue Op0 = Op.getOperand(0);
     unsigned ElemSize = VT.getScalarSizeInBits();
     unsigned i = 0;
-    // For 32 and 64 bit types, use INSERT_SUBREG for lane zero to
+    // For 32 and 64 bit types, use SCALAR_TO_VECTOR for lane zero to
     // a) Avoid a RMW dependency on the full vector register, and
     // b) Allow the register coalescer to fold away the copy if the
-    //    value is already in an S or D register.
-    // Do not do this for UNDEF/LOAD nodes because we have better patterns
-    // for those avoiding the SCALAR_TO_VECTOR/BUILD_VECTOR.
-    if (!Op0.isUndef() && Op0.getOpcode() != ISD::LOAD &&
-        (ElemSize == 32 || ElemSize == 64)) {
-      unsigned SubIdx = ElemSize == 32 ? AArch64::ssub : AArch64::dsub;
-      MachineSDNode *N =
-          DAG.getMachineNode(TargetOpcode::INSERT_SUBREG, dl, VT, Vec, Op0,
-                             DAG.getTargetConstant(SubIdx, dl, MVT::i32));
-      Vec = SDValue(N, 0);
+    //    value is already in an S or D register, and we're forced to emit an
+    //    INSERT_SUBREG that we can't fold anywhere.
+    if (!Op0.isUndef() && (ElemSize == 32 || ElemSize == 64)) {
+      Vec = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Op0);
       ++i;
     }
     for (; i < NumElts; ++i) {

