author | Sanjay Patel <spatel@rotateright.com> | 2018-07-16 22:59:31 +0000 |
---|---|---|
committer | Sanjay Patel <spatel@rotateright.com> | 2018-07-16 22:59:31 +0000 |
commit | c71adc8040b1e382b195a0096015cb5c39628b23 (patch) | |
tree | 8711ea739eab9d1354abf5fed2412f7d00f75293 /llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | |
parent | c4846a551e0c1499e67f4aa287abe89be20ffe5f (diff) | |
[Intrinsics] define funnel shift IR intrinsics + DAG builder support
As discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html
http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html
We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions,
and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation
by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift"
operation which may also be a single hardware instruction.
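For illustration only (this is not part of the patch): a minimal scalar sketch of the funnel-shift semantics described above, assuming 32-bit operands. The two inputs are concatenated into a 64-bit value, shifted by the amount modulo the bit width, and one half is extracted. The names `fshl32`, `fshr32`, and `rotl32` are hypothetical helpers, not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical reference model, assuming 32-bit operands.
// fshl: shift the 64-bit concatenation hi:lo left, keep the high 32 bits.
constexpr uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t amt) {
  uint64_t concat = (uint64_t(hi) << 32) | lo;
  return uint32_t((concat << (amt % 32)) >> 32);
}

// fshr: shift the 64-bit concatenation hi:lo right, keep the low 32 bits.
constexpr uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t amt) {
  uint64_t concat = (uint64_t(hi) << 32) | lo;
  return uint32_t(concat >> (amt % 32));
}

// A rotate is the special case where both inputs are the same value.
constexpr uint32_t rotl32(uint32_t x, uint32_t amt) {
  return fshl32(x, x, amt);
}
```

With this model, a shift amount of 0 (or any multiple of 32) returns the first input for `fshl32` and the second input for `fshr32`.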
Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working
on that (a much larger patch), but then I concluded that we don't need them. At least as a first
step, we already have all of the backend support necessary to match these ops, because matching
the expanded shift/or patterns required it.
And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are
likely all that we'll ever need.
There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes
(along with improving the oversized shift documentation). Again, I don't think that's strictly
necessary (as the test results here prove). That can be an efficiency improvement as a small
follow-up patch.
So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support.
Differential Revision: https://reviews.llvm.org/D49242
llvm-svn: 337221
Diffstat (limited to 'llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp')
-rw-r--r-- | llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 37 |
1 files changed, 37 insertions, 0 deletions
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 7a808b12ea4..73a07d56a41 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -5656,6 +5656,43 @@ SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, unsigned Intrinsic) {
     setValue(&I, DAG.getNode(ISD::CTPOP, sdl, Ty, Arg));
     return nullptr;
   }
+  case Intrinsic::fshl:
+  case Intrinsic::fshr: {
+    bool IsFSHL = Intrinsic == Intrinsic::fshl;
+    SDValue X = getValue(I.getArgOperand(0));
+    SDValue Y = getValue(I.getArgOperand(1));
+    SDValue Z = getValue(I.getArgOperand(2));
+    EVT VT = X.getValueType();
+
+    // TODO: When X == Y, this is rotate. Create the node directly if legal.
+
+    // Get the shift amount and inverse shift amount, modulo the bit-width.
+    SDValue BitWidthC = DAG.getConstant(VT.getScalarSizeInBits(), sdl, VT);
+    SDValue ShAmt = DAG.getNode(ISD::UREM, sdl, VT, Z, BitWidthC);
+    SDValue NegZ = DAG.getNode(ISD::SUB, sdl, VT, BitWidthC, Z);
+    SDValue InvShAmt = DAG.getNode(ISD::UREM, sdl, VT, NegZ, BitWidthC);
+
+    // fshl: (X << (Z % BW)) | (Y >> ((BW - Z) % BW))
+    // fshr: (X << ((BW - Z) % BW)) | (Y >> (Z % BW))
+    SDValue ShX = DAG.getNode(ISD::SHL, sdl, VT, X, IsFSHL ? ShAmt : InvShAmt);
+    SDValue ShY = DAG.getNode(ISD::SRL, sdl, VT, Y, IsFSHL ? InvShAmt : ShAmt);
+    SDValue Res = DAG.getNode(ISD::OR, sdl, VT, ShX, ShY);
+
+    // If (Z % BW == 0), then (BW - Z) % BW is also zero, so the result would
+    // be X | Y. If X == Y (rotate), that's fine. If not, we have to select.
+    if (X != Y) {
+      SDValue Zero = DAG.getConstant(0, sdl, VT);
+      EVT CCVT = MVT::i1;
+      if (VT.isVector())
+        CCVT = EVT::getVectorVT(*Context, CCVT, VT.getVectorNumElements());
+      // For fshl, 0 shift returns the 1st arg (X).
+      // For fshr, 0 shift returns the 2nd arg (Y).
+      SDValue IsZeroShift = DAG.getSetCC(sdl, CCVT, ShAmt, Zero, ISD::SETEQ);
+      Res = DAG.getSelect(sdl, VT, IsZeroShift, IsFSHL ? X : Y, Res);
+    }
+    setValue(&I, Res);
+    return nullptr;
+  }
   case Intrinsic::stacksave: {
     SDValue Op = getRoot();
     Res = DAG.getNode(
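For readers who want to check the expansion by hand, here is a rough scalar sketch (not LLVM code) of the shift/or/select sequence that the DAG builder emits in the hunk above, assuming a 32-bit scalar type. The function names `loweredFshl32` and `loweredFshr32` are hypothetical. The sketch relies on unsigned wraparound of `bw - z` composing correctly with `% bw`, which holds here because 32 is a power of two.

```cpp
#include <cstdint>

// Hypothetical scalar model of the fshl expansion, assuming BW = 32.
uint32_t loweredFshl32(uint32_t x, uint32_t y, uint32_t z) {
  const uint32_t bw = 32;
  uint32_t shAmt = z % bw;           // Z % BW
  uint32_t invShAmt = (bw - z) % bw; // (BW - Z) % BW, with unsigned wraparound
  // Both shift amounts are in [0, 31], so neither shift is out of range.
  uint32_t res = (x << shAmt) | (y >> invShAmt);
  // When the shift amount is zero, res would be x | y, so select x instead.
  return shAmt == 0 ? x : res;
}

// Hypothetical scalar model of the fshr expansion, assuming BW = 32.
uint32_t loweredFshr32(uint32_t x, uint32_t y, uint32_t z) {
  const uint32_t bw = 32;
  uint32_t shAmt = z % bw;
  uint32_t invShAmt = (bw - z) % bw;
  uint32_t res = (x << invShAmt) | (y >> shAmt);
  // For fshr, a zero shift returns the second argument.
  return shAmt == 0 ? y : res;
}
```

The final select mirrors the `IsZeroShift` check in the patch: without it, a shift amount that is a multiple of the bit width would yield `x | y` rather than one of the inputs.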