| | | |
|---|---|---|
| author | Ahmed Bougacha <ahmed.bougacha@gmail.com> | 2017-04-04 22:55:53 +0000 |
| committer | Ahmed Bougacha <ahmed.bougacha@gmail.com> | 2017-04-04 22:55:53 +0000 |
| commit | d3c03a5ddd9523386ef766a397483d09a60f7156 (patch) | |
| tree | aae82cca474dc693b9f0f1257dbe6c0cab892b6c /llvm/lib/Target | |
| parent | e73e00c9b29b3d9b28b728849b06ed6b0b9bea88 (diff) | |
| download | bcm5719-llvm-d3c03a5ddd9523386ef766a397483d09a60f7156.tar.gz bcm5719-llvm-d3c03a5ddd9523386ef766a397483d09a60f7156.zip | |
[AArch64] Avoid partial register deps on insertelt of load into lane 0.
This improves upon r246462: that prevented FMOVs from being emitted
for the cross-class INSERT_SUBREGs by disabling the formation of
INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting instead
introduced a partial (read-modify-write) dependency on the full vector
register.
Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that is
folded away by many patterns, including the scalar LDRS (an LDR into an
S register) that we want in this case.
Credit goes to Adam for finding the issue!
llvm-svn: 299482
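For context, here is a minimal NEON-intrinsics sketch (hypothetical example, not part of this commit; the function name, file name, and whether the optimizer actually forms a BUILD_VECTOR from it are assumptions) of the kind of pattern this lowering affects: a vector whose lane 0 comes from a scalar load. On AArch64, `ld1 { v0.s }[0], [x0]` inserts into a lane and therefore reads and merges the previous contents of the destination vector register, while `ldr s0, [x0]` overwrites the whole register; folding the lane-0 load through SCALAR_TO_VECTOR lets instruction selection pick the LDR form and avoid the partial-register dependency.

```cpp
// Hypothetical reduced example (not from the commit). Compile for AArch64,
// e.g. clang++ -O2 --target=aarch64-linux-gnu -S example.cpp, and check
// whether lane 0 is materialized with "ldr s0" or "ld1 { v0.s }[0]".
#include <arm_neon.h>

// Build a v4f32 from four scalar loads. Lane 0 is the interesting one:
// an "ld1" into the lane merges with (and so depends on) the previous value
// of the destination vector register; an "ldr s0" simply overwrites it.
float32x4_t build_from_loads(const float *a, const float *b,
                             const float *c, const float *d) {
  float32x4_t v = vdupq_n_f32(0.0f);
  v = vsetq_lane_f32(*a, v, 0);
  v = vsetq_lane_f32(*b, v, 1);
  v = vsetq_lane_f32(*c, v, 2);
  v = vsetq_lane_f32(*d, v, 3);
  return v;
}
```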
Diffstat (limited to 'llvm/lib/Target')
| | | |
|---|---|---|
| -rw-r--r-- | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | 16 |

1 file changed, 5 insertions, 11 deletions
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 0d4a9943ecc..ea184d55e44 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -6587,19 +6587,13 @@ FailedModImm:
     SDValue Op0 = Op.getOperand(0);
     unsigned ElemSize = VT.getScalarSizeInBits();
     unsigned i = 0;
-    // For 32 and 64 bit types, use INSERT_SUBREG for lane zero to
+    // For 32 and 64 bit types, use SCALAR_TO_VECTOR for lane zero to
     // a) Avoid a RMW dependency on the full vector register, and
     // b) Allow the register coalescer to fold away the copy if the
-    //    value is already in an S or D register.
-    // Do not do this for UNDEF/LOAD nodes because we have better patterns
-    // for those avoiding the SCALAR_TO_VECTOR/BUILD_VECTOR.
-    if (!Op0.isUndef() && Op0.getOpcode() != ISD::LOAD &&
-        (ElemSize == 32 || ElemSize == 64)) {
-      unsigned SubIdx = ElemSize == 32 ? AArch64::ssub : AArch64::dsub;
-      MachineSDNode *N =
-          DAG.getMachineNode(TargetOpcode::INSERT_SUBREG, dl, VT, Vec, Op0,
-                             DAG.getTargetConstant(SubIdx, dl, MVT::i32));
-      Vec = SDValue(N, 0);
+    //    value is already in an S or D register, and we're forced to emit an
+    //    INSERT_SUBREG that we can't fold anywhere.
+    if (!Op0.isUndef() && (ElemSize == 32 || ElemSize == 64)) {
+      Vec = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Op0);
       ++i;
     }
     for (; i < NumElts; ++i) {

