|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | (REAPPLIED)
We currently only support binary instructions in the alternate opcode shuffles.
This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:
1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.
Reapplied with fix to only accept 2 different casts if they come from the same source type.
Differential Revision: https://reviews.llvm.org/D49135
llvm-svn: 336812 | 
| | 
| 
| 
| 
| 
| 
| 
| | cast instructions.
Reverting due to buildbot failures
llvm-svn: 336806 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | We currently only support binary instructions in the alternate opcode shuffles.
This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:
1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.
Differential Revision: https://reviews.llvm.org/D49135
llvm-svn: 336804 | 
| | 
| 
| 
| 
| 
| | Differential Revision: https://reviews.llvm.org/D48968
llvm-svn: 336667 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch introduces a VPValue in VPBlockBase to represent the condition
bit that is used as successor selector when a block has multiple successors.
This information wasn't necessary until now, when we are about to introduce
outer loop vectorization support in VPlan code gen.
Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48814
llvm-svn: 336554 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | from opcodes. NFCI.
This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism.
Differential Revision: https://reviews.llvm.org/D48945
llvm-svn: 336344 | 
| | 
| 
| 
| | llvm-svn: 336291 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | When creating `phi` instructions to resume at the scalar part of the loop,
copy the DebugLoc from the original phi over to the new one.
Differential Revision: https://reviews.llvm.org/D48769
llvm-svn: 336256 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.
         %1 = extractelement <2 x i32> %a, i32 0
         %2 = extractelement <2 x i32> %a, i32 1
         %cond = icmp sgt i32 %1, %2
         %3 = extractelement <2 x i32> %a, i32 0
         %4 = extractelement <2 x i32> %a, i32 1
         %select = select i1 %cond, i32 %3, i32 %4
Author: FarhanaAleen
Reviewed By: ABataev, RKSimon, spatel
Differential Revision: https://reviews.llvm.org/D47608
llvm-svn: 336130 | 
| | 
| 
| 
| 
| 
| 
| 
| | getEntryCost
This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure.
llvm-svn: 336102 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | handle SK_Select patterns.
We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case.
This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now...
llvm-svn: 336095 | 
| | 
| 
| 
| 
| 
| 
| 
| | getEntryCost/vectorizeTree. NFCI.
Add assertions - we're already assuming this in how we use the AltOpcode and treat everything as BinaryOperators.
llvm-svn: 336092 | 
| | 
| 
| 
| 
| 
| | instead of an opcode. NFCI.
llvm-svn: 336069 | 
| | 
| 
| 
| 
| 
| 
| 
| | helper. NFCI.
This is a basic step towards matching more general instructions types than just opcodes.
llvm-svn: 336068 | 
| | 
| 
| 
| | llvm-svn: 336063 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them.
As SLP works with arrays of values I don't think we can easily use the pattern match helpers here.
Differential Revision: https://reviews.llvm.org/D48214
llvm-svn: 335621 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree.
NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc.
Differential Revision: https://reviews.llvm.org/D48488
llvm-svn: 335364 | 
| | 
| 
| 
| 
| 
| 
| 
| | InstructionsState. NFCI.
All calls were extracting the InstructionsState Opcode/AltOpcode values so we might as well pass it directly
llvm-svn: 335359 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | SLP currently only accepts (F)Add/(F)Sub alternate counterpart ops to be merged into an alternate shuffle.
This patch relaxes this to accept any pair of BinaryOperator opcodes instead, assuming the target's cost model accepts the vectorization+shuffle.
Differential Revision: https://reviews.llvm.org/D48477
llvm-svn: 335349 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | call tree
As described in D48359, this patch pushes InstructionsState down the BoUpSLP call hierarchy instead of the corresponding raw OpValue. This makes it easier to track the alternate opcode etc. and avoids us having to call getAltOpcode which makes it difficult to support more than one alternate opcode.
Differential Revision: https://reviews.llvm.org/D48382
llvm-svn: 335170 | 
| | 
| 
| 
| 
| 
| | A future patch will have isOneOf use InstructionsState.
llvm-svn: 335142 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is part of a move towards generalizing the alternate opcode mechanism and not just supporting (F)Add/(F)Sub counterparts.
The patch embeds the AltOpcode in the InstructionsState instead of calling getAltOpcode so often.
I'm hoping to eventually remove all uses of getAltOpcode and handle alternate opcode selection entirely within getSameOpcode, that will require us to use InstructionsState throughout the BoUpSLP call hierarchy (similar to some of the changes in D28907), which I will begin in future patches.
Differential Revision: https://reviews.llvm.org/D48359
llvm-svn: 335134 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | SK_Select shuffle pattern
D47985 saw the old SK_Alternate 'alternating' shuffle mask replaced with the SK_Select mask which accepts either input operand for each lane, equivalent to a vector select with a constant condition operand.
This patch updates SLPVectorizer to make full use of this SK_Select shuffle pattern by removing the 'isOdd()' limitation.
The AArch64 regression will be fixed by D48172.
Differential Revision: https://reviews.llvm.org/D48174
llvm-svn: 335130 | 
| | 
| 
| 
| | llvm-svn: 335110 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | getArithmeticInstrCost calls (NFC)
The getArithmeticInstrCost calls for shuffle vectors entry costs specify TargetTransformInfo::OperandValueKind arguments, but are just using the method's default values. This seems to be a copy + paste issue and doesn't affect the costs in anyway. The TargetTransformInfo::OperandValueProperties default arguments are already not being used.
Noticed while working on D47985.
Differential Revision: https://reviews.llvm.org/D48008
llvm-svn: 335045 | 
| | 
| 
| 
| 
| 
| | Minor step towards making the alternate opcode system work with a wider range of opcode pairs.
llvm-svn: 335032 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch introduces a VPInstructionToVPRecipe transformation, which
allows us to generate code for a VPInstruction based VPlan re-using the
existing infrastructure.
Reviewers: dcaballe, hsaito, mssimpso, hfinkel, rengolin, mkuper, javed.absar, sguggill
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D46827
llvm-svn: 334969 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Ensure we keep track of the input vectors in all cases instead of just for SK_Select.
Ideally we'd reuse the shuffle mask pattern matching in TargetTransformInfo::getInstructionThroughput here to easily add support for all TargetTransformInfo::ShuffleKind without mass code duplication, I've added a TODO for now but D48236 should help us here.
Differential Revision: https://reviews.llvm.org/D48023
llvm-svn: 334958 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: dcaballe, hsaito, mkuper, hfinkel
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D48081
llvm-svn: 334951 | 
| | 
| 
| 
| | llvm-svn: 334943 | 
| | 
| 
| 
| | llvm-svn: 334934 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: dcaballe, mkuper, hfinkel, hsaito, mssimpso
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D48080
llvm-svn: 334933 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is a minor fix for LV cost model, where the cost for VF=2 was
computed twice when the vectorization of the loop was forced without
specifying a VF.
Reviewers: xusx595, hsaito, fhahn, mkuper
Reviewed By: hsaito, xusx595
Differential Revision: https://reviews.llvm.org/D48048
llvm-svn: 334840 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | getSameOpcode
This is part of the work to cleanup use of 'alternate' ops so we can use the more general SK_Select shuffle type.
Only getSameOpcode calls getMainOpcode and much of the logic is repeated in both functions. This will require some reworking of D28907 but that patch has hit trouble and is unlikely to be completed anytime soon.
Differential Revision: https://reviews.llvm.org/D48120
llvm-svn: 334701 | 
| | 
| 
| 
| 
| 
| | There's no need to cast the base Value to an Instruction
llvm-svn: 334588 | 
| | 
| 
| 
| 
| 
| | We early-out for the case where we don't use alternate opcodes, so no need to check for it later.
llvm-svn: 334587 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | (PR33744)
As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate which requires shuffle masks to only match an alternating pattern from its 2 sources:
e.g. v4f32: <0,5,2,7> or <4,1,6,3>
This seems far too restrictive as most SIMD hardware which will implement it using a general blend/bit-select instruction, so replaces it with SK_Select, permitting elements from either source as long as they are inline:
e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc.
This initial patch just updates the name and cost model shuffle mask analysis, later patch reviews will update SLP to better utilise this - it still limits itself to SK_Alternate style patterns.
Differential Revision: https://reviews.llvm.org/D47985
llvm-svn: 334513 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This
patch replaces such types with SmallPtrSet, because IMO it is slightly
clearer and allows us to get rid of unnecessarily including SmallSet.h
Reviewers: dblaikie, craig.topper
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D47836
llvm-svn: 334492 | 
| | 
| 
| 
| 
| 
| 
| 
| | SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.
These places were found by hiding the begin/end methods in the SmallSet forwarding
llvm-svn: 334343 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch moves the recipe-creation functions out of
LoopVectorizationPlanner, which should do the high-level
orchestration of the transformations.
Reviewers: dcaballe, rengolin, hsaito, Ayal
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D47595
llvm-svn: 334305 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.
Reviewers: Ayal, rengolin, dcaballe, hsaito, mkuper, mzolotukhin
Reviewed By: dcaballe
Subscribers: bollu, tschuett, rkruppe, llvm-commits
Differential Revision: https://reviews.llvm.org/D47477
llvm-svn: 334284 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | There could be more than one PHIs in exit block using same loop recurrence.
Don't assume there is only one and fix each user.
Differential Revision: https://reviews.llvm.org/D47788
llvm-svn: 334271 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)
llvm-svn: 333954 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Minor replacement. LLVM_ATTRIBUTE_USED was introduced to silence
a warning but using #ifndef NDEBUG makes more sense in this case.
Reviewers: dblaikie, fhahn, hsaito
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D47498
llvm-svn: 333476 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: It was fully replaced back in 2014, and the implementation was removed 11 months ago by r306797.
Reviewers: hfinkel, chandlerc, whitequark, deadalnix
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47436
llvm-svn: 333378 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
It's internal to the VPlanHCFGBuilder and should not be visible outside of its
translation unit.
Reviewers: dcaballe, fhahn
Reviewed By: fhahn
Subscribers: rengolin, bollu, tschuett, llvm-commits, rkruppe
Differential Revision: https://reviews.llvm.org/D47312
llvm-svn: 333187 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Now that the LLVM_DEBUG() macro landed on the various sub-projects
the DEBUG macro can be removed.
Also change the new uses of DEBUG to LLVM_DEBUG.
Differential Revision: https://reviews.llvm.org/D46952
llvm-svn: 333091 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | r332654 tried to fix an unused function warning with
a void cast. This approach worked for clang and gcc 
but not for MSVC. This commit replaces the void cast
with the LLVM_ATTRIBUTE_USED approach.
llvm-svn: 332910 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | r332654 was reverted due to an unused function warning in
release build. This commit includes the same code with the
warning silenced.
Differential Revision: https://reviews.llvm.org/D44338
llvm-svn: 332860 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | time.
The introduced problem is:
llvm.src/lib/Transforms/Vectorize/VPlanVerifier.cpp:29:13: error: unused function 'hasDuplicates' [-Werror,-Wunused-function]
static bool hasDuplicates(const SmallVectorImpl<VPBlockBase *> &VPBlockVec) {
            ^
llvm-svn: 332747 |