| author | Daniel Sanders <daniel_l_sanders@apple.com> | 2018-01-29 19:54:49 +0000 |
|---|---|---|
| committer | Daniel Sanders <daniel_l_sanders@apple.com> | 2018-01-29 19:54:49 +0000 |
| commit | 79cb839fcdee3ce545b8d7fc855544e712c94f08 (patch) | |
| tree | 50f40b4b44840c5d46cee5e96ac6cc23ac506eae /llvm/lib | |
| parent | e899a0b824d1fb8fdbd2dc0c38011a7e354a95fb (diff) | |
[globalisel][legalizer] Adapt LegalizerInfo to support inter-type dependencies and other things.
Summary:
As discussed in D42244, we have difficulty describing the legality of some
operations. We're not able to specify relationships between types.
For example, declaring the following
  setAction({..., 0, s32}, Legal)
  setAction({..., 0, s64}, Legal)
  setAction({..., 1, s32}, Legal)
  setAction({..., 1, s64}, Legal)
currently declares these type combinations as legal:
  {s32, s32}
  {s64, s32}
  {s32, s64}
  {s64, s64}
but we currently have no means to say that, for example, {s64, s32} is
not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/
G_UNMERGE_VALUES have relationships between the types that are currently
described incorrectly.
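The difference between per-index legality and tuple legality can be sketched in a few lines. This is a standalone toy model, not LLVM code: types are reduced to a bit width, and the names `legalPerIndex` and `legalAsPair`, as well as the particular set of legal pairs, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model: a scalar type is represented only by its size in bits.
using Bits = unsigned;

// Per-index legality, in the spirit of setAction({..., Idx, Ty}, Legal):
// each type index is checked on its own, so every combination of
// individually legal types is accepted.
bool legalPerIndex(const std::vector<Bits> &Types) {
  static const std::vector<std::set<Bits>> LegalForIdx = {{32u, 64u},
                                                          {32u, 64u}};
  assert(Types.size() == LegalForIdx.size() && "expected one type per index");
  for (std::size_t I = 0; I < Types.size(); ++I)
    if (LegalForIdx[I].count(Types[I]) == 0)
      return false;
  return true;
}

// Tuple legality: the combination as a whole is looked up, so an
// individual pair such as {s64, s32} can be left out.
bool legalAsPair(Bits Ty0, Bits Ty1) {
  static const std::set<std::pair<Bits, Bits>> LegalPairs = {
      {32u, 32u}, {32u, 64u}, {64u, 64u}};
  return LegalPairs.count(std::make_pair(Ty0, Ty1)) != 0;
}
```

With the per-index model there is no way to reject {s64, s32} without also rejecting s64 at index 0 or s32 at index 1 everywhere else; the pair model rejects exactly that one combination.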
Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics
differently to atomics. The necessary information is in the MMO but we have no
way to use this in the legalizer. Similarly, there is currently no way for the
register type and the memory type to differ so there is no way to cleanly
represent extending-load/truncating-store in a way that can't be broken by
optimizers (resulting in illegal MIR).
It's also difficult to control the legalization strategy. We've added support
for legalizing non-power-of-2 types but there are still some hardcoded assumptions
about the strategy. The main one I've noticed is that type0 is always legalized
before type1 which is not a good strategy for `type0 = G_EXTRACT type1, ...` if
you need to widen the container. It will converge on the same result eventually
but it will take a much longer route when legalizing type0 than if you legalize
type1 first.
Lastly, the definition of legality and the legalization strategy are kept
separate, which is not ideal. It's helpful to be able to look at one piece of
code and see both what is legal and the method the legalizer will use to make
illegal MIR more legal.
This patch adds a layer onto the LegalizerInfo (to be removed when all targets
have been migrated) which resolves all these issues.
Here are the rules for shift and division:
  for (unsigned BinOp : {G_LSHR, G_ASHR, G_SDIV, G_UDIV})
    getActionDefinitions(BinOp)
        .legalFor({s32, s64})     // If type0 is s32/s64 then it's Legal
        .clampScalar(0, s32, s64) // If type0 is <s32 then WidenScalar to s32
                                  // If type0 is >s64 then NarrowScalar to s64
        .widenScalarToPow2(0)     // Round type0 scalars up to powers of 2
        .unsupported();           // Otherwise, it's unsupported
This describes everything needed to both define legality and describe how to
make illegal things legal.
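As a rough illustration of how such a ruleset is evaluated, here is a standalone sketch of first-match-wins rule application for the shift and division rules. It is a toy model under stated assumptions, not the LegalizerInfo implementation: types are plain bit widths, only type index 0 is modelled, and `Rule`, `Step`, `applyRules`, `nextPow2` and `shiftAndDivisionRules` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <vector>

enum class Action { Legal, WidenScalar, NarrowScalar, Unsupported };

// Result of one legalization step: what to do and the size to change to.
struct Step {
  Action Act;
  unsigned NewBits;
};

// A rule pairs a predicate with an action and a mutation of the size.
struct Rule {
  std::function<bool(unsigned)> Matches;
  Action Act;
  std::function<unsigned(unsigned)> Mutate;
};

// Smallest power of two that is >= Bits.
unsigned nextPow2(unsigned Bits) {
  assert(Bits > 0 && "size must be non-zero");
  unsigned P = 1;
  while (P < Bits)
    P <<= 1;
  return P;
}

// Rules are appended in the order the builder calls appear.
std::vector<Rule> shiftAndDivisionRules() {
  std::vector<Rule> Rules;
  // .legalFor({s32, s64})
  Rules.push_back({[](unsigned B) { return B == 32 || B == 64; },
                   Action::Legal, [](unsigned B) { return B; }});
  // .clampScalar(0, s32, s64), lower bound
  Rules.push_back({[](unsigned B) { return B < 32; }, Action::WidenScalar,
                   [](unsigned) { return 32u; }});
  // .clampScalar(0, s32, s64), upper bound
  Rules.push_back({[](unsigned B) { return B > 64; }, Action::NarrowScalar,
                   [](unsigned) { return 64u; }});
  // .widenScalarToPow2(0)
  Rules.push_back({[](unsigned B) { return (B & (B - 1)) != 0; },
                   Action::WidenScalar, nextPow2});
  return Rules;
}

// The first matching rule decides the step; no match means unsupported.
Step applyRules(const std::vector<Rule> &Rules, unsigned Bits) {
  for (const Rule &R : Rules) {
    if (!R.Matches(Bits))
      continue;
    Step S{R.Act, R.Mutate(Bits)};
    // A non-legal step that changes nothing would repeat forever.
    assert((S.Act == Action::Legal || S.NewBits != Bits) &&
           "Simple loop detected");
    return S;
  }
  return {Action::Unsupported, Bits};
}
```

In this model an s8 operand is widened to s32, an s128 operand is narrowed to s64, and an s48 operand falls through the clamp rules and is widened to s64 by the power-of-2 rule. The legalizer would then query the ruleset again with the new type until the step is Legal.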
Here's an example of a complex rule:
  getActionDefinitions(G_INSERT)
      .unsupportedIf([=](const LegalityQuery &Query) {
        // If type0 is not bigger than type1 then it's unsupported
        return Query.Types[0].getSizeInBits() <= Query.Types[1].getSizeInBits();
      })
      .legalIf([=](const LegalityQuery &Query) {
        // If type0 is s32/s64/p0 and type1 is a power of 2 other than 2 or 4 then it's legal
        // We don't need to worry about large type1's because unsupportedIf caught that.
        const LLT &Ty0 = Query.Types[0];
        const LLT &Ty1 = Query.Types[1];
        if (Ty0 != s32 && Ty0 != s64 && Ty0 != p0)
          return false;
        return isPowerOf2_32(Ty1.getSizeInBits()) &&
               (Ty1.getSizeInBits() == 1 || Ty1.getSizeInBits() >= 8);
      })
      .clampScalar(0, s32, s64)
      .widenScalarToPow2(0)
      .maxScalarIf(typeInSet(0, {s32}), 1, s16) // If type0 is s32 and type1 is bigger than s16 then NarrowScalar type1 to s16
      .maxScalarIf(typeInSet(0, {s64}), 1, s32) // If type0 is s64 and type1 is bigger than s32 then NarrowScalar type1 to s32
      .widenScalarToPow2(1)                     // Round type1 scalars up to powers of 2
      .unsupported();
This uses a lambda to say that G_INSERT is unsupported unless type0 is bigger than
type1 (in practice, this would be a default rule for G_INSERT). It also uses one
to describe the legal cases. This particular predicate is equivalent to:
  .legalFor({{s32, s1}, {s32, s8}, {s32, s16}, {s64, s1}, {s64, s8}, {s64, s16}, {s64, s32}})
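The listed pairs can be cross-checked by enumerating the two G_INSERT predicates over scalar sizes. The sketch below is a standalone model, not LLVM code: it covers scalar types only (p0 is deliberately left out), represents a type by its bit width, and the function names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <utility>

using Bits = unsigned;

bool isPow2(Bits B) { return B != 0 && (B & (B - 1)) == 0; }

// Mirrors the unsupportedIf lambda: type0 must be strictly bigger than type1.
bool insertUnsupported(Bits Ty0, Bits Ty1) { return Ty0 <= Ty1; }

// Mirrors the legalIf lambda for scalar type0. The unsupportedIf rule comes
// first in the ruleset, so it is checked first here as well.
bool insertLegal(Bits Ty0, Bits Ty1) {
  if (insertUnsupported(Ty0, Ty1))
    return false;
  if (Ty0 != 32 && Ty0 != 64)
    return false;
  return isPow2(Ty1) && (Ty1 == 1 || Ty1 >= 8);
}

// Collects every legal (type0, type1) size pair up to MaxBits.
std::set<std::pair<Bits, Bits>> enumerateLegalInsertPairs(Bits MaxBits) {
  assert(MaxBits > 0 && "nothing to enumerate");
  std::set<std::pair<Bits, Bits>> Legal;
  for (Bits Ty0 = 1; Ty0 <= MaxBits; ++Ty0)
    for (Bits Ty1 = 1; Ty1 <= MaxBits; ++Ty1)
      if (insertLegal(Ty0, Ty1))
        Legal.emplace(Ty0, Ty1);
  return Legal;
}
```

Enumerating up to 128 bits yields exactly the seven scalar pairs in the list: s1, s8 and s16 for an s32 container, and additionally s32 for an s64 container.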
In terms of performance, I saw a slight (~6%) performance improvement when
AArch64 was around 30% ported but it's pretty much break even right now.
I'm going to take a look at constexpr as a means to reduce the initialization
cost.
Future work:
* Make it possible for opcodes to share rulesets. There's no need for
G_LSHR/G_ASHR/G_SDIV/G_UDIV to have separate rule and ruleset objects. There's
no technical barrier to this, it just hasn't been done yet.
* Replace the type-index numbers with an enum to get .clampScalar(Type0, s32, s64)
* Better names for things like .maxScalarIf() (clampMaxScalar?) and the vector rules.
* Improve initialization cost using constexpr
Possible future work:
* It's possible to make these rulesets change the MIR directly instead of
returning a description of how to change the MIR. This should remove a little
overhead caused by parsing the description and routing to the right code, but
the real motivation is that it removes the need for LegalizeAction::Custom.
With Custom removed, there's no longer a requirement that Custom legalization
change the opcode to something that's considered legal.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner
Reviewed By: bogner
Subscribers: hintonda, bogner, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42251
llvm-svn: 323681
Diffstat (limited to 'llvm/lib')
| -rw-r--r-- | llvm/lib/CodeGen/GlobalISel/CMakeLists.txt | 4 |
| -rw-r--r-- | llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp | 78 |
| -rw-r--r-- | llvm/lib/CodeGen/GlobalISel/LegalizeMutations.cpp | 44 |
| -rw-r--r-- | llvm/lib/CodeGen/GlobalISel/LegalizerInfo.cpp | 108 |
| -rw-r--r-- | llvm/lib/Target/AArch64/AArch64LegalizerInfo.cpp | 554 |
5 files changed, 484 insertions, 304 deletions
diff --git a/llvm/lib/CodeGen/GlobalISel/CMakeLists.txt b/llvm/lib/CodeGen/GlobalISel/CMakeLists.txt
index e68ed179558..4c1da3756b1 100644
--- a/llvm/lib/CodeGen/GlobalISel/CMakeLists.txt
+++ b/llvm/lib/CodeGen/GlobalISel/CMakeLists.txt
@@ -6,8 +6,10 @@ add_llvm_library(LLVMGlobalISel
   IRTranslator.cpp
   InstructionSelect.cpp
   InstructionSelector.cpp
-  LegalizerHelper.cpp
+  LegalityPredicates.cpp
+  LegalizeMutations.cpp
   Legalizer.cpp
+  LegalizerHelper.cpp
   LegalizerInfo.cpp
   Localizer.cpp
   MachineIRBuilder.cpp
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp b/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp
new file mode 100644
index 00000000000..0d30e0781cd
--- /dev/null
+++ b/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp
@@ -0,0 +1,78 @@
+//===- lib/CodeGen/GlobalISel/LegalizerPredicates.cpp - Predicates --------===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// A library of predicate factories to use for LegalityPredicate.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"
+
+using namespace llvm;
+
+LegalityPredicate
+LegalityPredicates::all(LegalityPredicate P0, LegalityPredicate P1) {
+  return [=](const LegalityQuery &Query) {
+    return P0(Query) && P1(Query);
+  };
+}
+
+LegalityPredicate
+LegalityPredicates::typeInSet(unsigned TypeIdx,
+                              std::initializer_list<LLT> TypesInit) {
+  SmallVector<LLT, 4> Types = TypesInit;
+  return [=](const LegalityQuery &Query) {
+    return std::find(Types.begin(), Types.end(), Query.Types[TypeIdx]) != Types.end();
+  };
+}
+
+LegalityPredicate LegalityPredicates::typePairInSet(
+    unsigned TypeIdx0, unsigned TypeIdx1,
+    std::initializer_list<std::pair<LLT, LLT>> TypesInit) {
+  SmallVector<std::pair<LLT, LLT>, 4> Types = TypesInit;
+  return [=](const LegalityQuery &Query) {
+    std::pair<LLT, LLT> Match = {Query.Types[TypeIdx0], Query.Types[TypeIdx1]};
+    return std::find(Types.begin(), Types.end(), Match) != Types.end();
+  };
+}
+
+LegalityPredicate LegalityPredicates::isScalar(unsigned TypeIdx) {
+  return [=](const LegalityQuery &Query) {
+    return Query.Types[TypeIdx].isScalar();
+  };
+}
+
+LegalityPredicate LegalityPredicates::narrowerThan(unsigned TypeIdx,
+                                                   unsigned Size) {
+  return [=](const LegalityQuery &Query) {
+    const LLT &QueryTy = Query.Types[TypeIdx];
+    return QueryTy.isScalar() && QueryTy.getSizeInBits() < Size;
+  };
+}
+
+LegalityPredicate LegalityPredicates::widerThan(unsigned TypeIdx,
+                                                unsigned Size) {
+  return [=](const LegalityQuery &Query) {
+    const LLT &QueryTy = Query.Types[TypeIdx];
+    return QueryTy.isScalar() && QueryTy.getSizeInBits() > Size;
+  };
+}
+
+LegalityPredicate LegalityPredicates::sizeNotPow2(unsigned TypeIdx) {
+  return [=](const LegalityQuery &Query) {
+    const LLT &QueryTy = Query.Types[TypeIdx];
+    return QueryTy.isScalar() && !isPowerOf2_32(QueryTy.getSizeInBits());
+  };
+}
+
+LegalityPredicate LegalityPredicates::numElementsNotPow2(unsigned TypeIdx) {
+  return [=](const LegalityQuery &Query) {
+    const LLT &QueryTy = Query.Types[TypeIdx];
+    return QueryTy.isVector() && isPowerOf2_32(QueryTy.getNumElements());
+  };
+}
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizeMutations.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizeMutations.cpp
new file mode 100644
index 00000000000..961988515e5
--- /dev/null
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizeMutations.cpp
@@ -0,0 +1,44 @@
+//===- lib/CodeGen/GlobalISel/LegalizerMutations.cpp - Mutations ----------===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// A library of mutation factories to use for LegalityMutation.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"
+
+using namespace llvm;
+
+LegalizeMutation LegalizeMutations::identity(unsigned TypeIdx, LLT Ty) {
+  return
+      [=](const LegalityQuery &Query) { return std::make_pair(TypeIdx, Ty); };
+}
+
+LegalizeMutation LegalizeMutations::widenScalarToNextPow2(unsigned TypeIdx,
+                                                          unsigned Min) {
+  return [=](const LegalityQuery &Query) {
+    unsigned NewSizeInBits =
+        1 << Log2_32_Ceil(Query.Types[TypeIdx].getSizeInBits());
+    if (NewSizeInBits < Min)
+      NewSizeInBits = Min;
+    return std::make_pair(TypeIdx, LLT::scalar(NewSizeInBits));
+  };
+}
+
+LegalizeMutation LegalizeMutations::moreElementsToNextPow2(unsigned TypeIdx,
+                                                           unsigned Min) {
+  return [=](const LegalityQuery &Query) {
+    const LLT &VecTy = Query.Types[TypeIdx];
+    unsigned NewNumElements = 1 << Log2_32_Ceil(VecTy.getNumElements());
+    if (NewNumElements < Min)
+      NewNumElements = Min;
+    return std::make_pair(
+        TypeIdx, LLT::vector(NewNumElements, VecTy.getScalarSizeInBits()));
+  };
+}
diff --git
a/llvm/lib/CodeGen/GlobalISel/LegalizerInfo.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerInfo.cpp index 12f5524aa8c..054acc1d2fe 100644 --- a/llvm/lib/CodeGen/GlobalISel/LegalizerInfo.cpp +++ b/llvm/lib/CodeGen/GlobalISel/LegalizerInfo.cpp @@ -24,6 +24,7 @@ #include "llvm/CodeGen/MachineRegisterInfo.h" #include "llvm/CodeGen/TargetOpcodes.h" #include "llvm/MC/MCInstrDesc.h" +#include "llvm/Support/Debug.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/LowLevelTypeImpl.h" #include "llvm/Support/MathExtras.h" @@ -33,6 +34,40 @@ using namespace llvm; using namespace LegalizeActions; +#define DEBUG_TYPE "legalizer-info" + +raw_ostream &LegalityQuery::print(raw_ostream &OS) const { + OS << Opcode << ", {"; + for (const auto &Type : Types) { + OS << Type << ", "; + } + OS << "}"; + return OS; +} + +LegalizeActionStep LegalizeRuleSet::apply(const LegalityQuery &Query) const { + DEBUG(dbgs() << "Applying legalizer ruleset to: "; Query.print(dbgs()); + dbgs() << "\n"); + if (Rules.empty()) { + DEBUG(dbgs() << ".. fallback to legacy rules (no rules defined)\n"); + return {LegalizeAction::UseLegacyRules, 0, LLT{}}; + } + for (const auto &Rule : Rules) { + if (Rule.match(Query)) { + DEBUG(dbgs() << ".. match\n"); + std::pair<unsigned, LLT> Mutation = Rule.determineMutation(Query); + DEBUG(dbgs() << ".. .. " << (unsigned)Rule.getAction() << ", " + << Mutation.first << ", " << Mutation.second << "\n"); + assert(Query.Types[Mutation.first] != Mutation.second && + "Simple loop detected"); + return {Rule.getAction(), Mutation.first, Mutation.second}; + } else + DEBUG(dbgs() << ".. no match\n"); + } + DEBUG(dbgs() << ".. unsupported\n"); + return {LegalizeAction::Unsupported, 0, LLT{}}; +} + LegalizerInfo::LegalizerInfo() : TablesInitialized(false) { // Set defaults. // FIXME: these two (G_ANYEXT and G_TRUNC?) 
can be legalized to the @@ -188,17 +223,79 @@ static LLT getTypeFromTypeIdx(const MachineInstr &MI, return MRI.getType(MI.getOperand(OpIdx).getReg()); } -LegalizerInfo::LegalizeActionStep +unsigned LegalizerInfo::getOpcodeIdxForOpcode(unsigned Opcode) const { + assert(Opcode >= FirstOp && Opcode <= LastOp && "Unsupported opcode"); + return Opcode - FirstOp; +} + +unsigned LegalizerInfo::getActionDefinitionsIdx(unsigned Opcode) const { + unsigned OpcodeIdx = getOpcodeIdxForOpcode(Opcode); + if (unsigned Alias = RulesForOpcode[OpcodeIdx].getAlias()) { + DEBUG(dbgs() << ".. opcode " << Opcode << " is aliased to " << Alias + << "\n"); + OpcodeIdx = getOpcodeIdxForOpcode(Alias); + DEBUG(dbgs() << ".. opcode " << Alias << " is aliased to " + << RulesForOpcode[OpcodeIdx].getAlias() << "\n"); + assert(RulesForOpcode[OpcodeIdx].getAlias() == 0 && "Cannot chain aliases"); + } + + return OpcodeIdx; +} + +const LegalizeRuleSet & +LegalizerInfo::getActionDefinitions(unsigned Opcode) const { + unsigned OpcodeIdx = getActionDefinitionsIdx(Opcode); + return RulesForOpcode[OpcodeIdx]; +} + +LegalizeRuleSet &LegalizerInfo::getActionDefinitionsBuilder(unsigned Opcode) { + unsigned OpcodeIdx = getActionDefinitionsIdx(Opcode); + auto &Result = RulesForOpcode[OpcodeIdx]; + assert(!Result.isAliasedByAnother() && "Modifying this opcode will modify aliases"); + return Result; +} + +LegalizeRuleSet &LegalizerInfo::getActionDefinitionsBuilder( + std::initializer_list<unsigned> Opcodes) { + unsigned Representative = *Opcodes.begin(); + + for (auto I = Opcodes.begin() + 1, E = Opcodes.end(); I != E; ++I) + aliasActionDefinitions(Representative, *I); + + auto &Return = getActionDefinitionsBuilder(Representative); + Return.setIsAliasedByAnother(); + return Return; +} + +void LegalizerInfo::aliasActionDefinitions(unsigned OpcodeTo, + unsigned OpcodeFrom) { + assert(OpcodeTo != OpcodeFrom && "Cannot alias to self"); + assert(OpcodeTo >= FirstOp && OpcodeTo <= LastOp && "Unsupported opcode"); + 
const unsigned OpcodeFromIdx = getOpcodeIdxForOpcode(OpcodeFrom); + RulesForOpcode[OpcodeFromIdx].aliasTo(OpcodeTo); +} + +LegalizeActionStep LegalizerInfo::getAction(const LegalityQuery &Query) const { + LegalizeActionStep Step = getActionDefinitions(Query.Opcode).apply(Query); + if (Step.Action != LegalizeAction::UseLegacyRules) { + return Step; + } + for (unsigned i = 0; i < Query.Types.size(); ++i) { auto Action = getAspectAction({Query.Opcode, i, Query.Types[i]}); - if (Action.first != Legal) + if (Action.first != Legal) { + DEBUG(dbgs() << ".. (legacy) Type " << i << " Action=" + << (unsigned)Action.first << ", " << Action.second << "\n"); return {Action.first, i, Action.second}; + } else + DEBUG(dbgs() << ".. (legacy) Type " << i << " Legal\n"); } + DEBUG(dbgs() << ".. (legacy) Legal\n"); return {Legal, 0, LLT{}}; } -LegalizerInfo::LegalizeActionStep +LegalizeActionStep LegalizerInfo::getAction(const MachineInstr &MI, const MachineRegisterInfo &MRI) const { SmallVector<LLT, 2> Types; @@ -323,6 +420,7 @@ LegalizerInfo::findAction(const SizeAndActionsVec &Vec, const uint32_t Size) { case Unsupported: return {Size, Unsupported}; case NotFound: + case UseLegacyRules: llvm_unreachable("NotFound"); } llvm_unreachable("Action has an unknown enum value"); @@ -333,7 +431,7 @@ LegalizerInfo::findScalarLegalAction(const InstrAspect &Aspect) const { assert(Aspect.Type.isScalar() || Aspect.Type.isPointer()); if (Aspect.Opcode < FirstOp || Aspect.Opcode > LastOp) return {NotFound, LLT()}; - const unsigned OpcodeIdx = Aspect.Opcode - FirstOp; + const unsigned OpcodeIdx = getOpcodeIdxForOpcode(Aspect.Opcode); if (Aspect.Type.isPointer() && AddrSpace2PointerActions[OpcodeIdx].find(Aspect.Type.getAddressSpace()) == AddrSpace2PointerActions[OpcodeIdx].end()) { @@ -364,7 +462,7 @@ LegalizerInfo::findVectorLegalAction(const InstrAspect &Aspect) const { // lanes in the vector. 
if (Aspect.Opcode < FirstOp || Aspect.Opcode > LastOp) return {NotFound, Aspect.Type}; - const unsigned OpcodeIdx = Aspect.Opcode - FirstOp; + const unsigned OpcodeIdx = getOpcodeIdxForOpcode(Aspect.Opcode); const unsigned TypeIdx = Aspect.Idx; if (TypeIdx >= ScalarInVectorActions[OpcodeIdx].size()) return {NotFound, Aspect.Type}; diff --git a/llvm/lib/Target/AArch64/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/AArch64LegalizerInfo.cpp index 51a2bbb6ef7..16821fbe3d0 100644 --- a/llvm/lib/Target/AArch64/AArch64LegalizerInfo.cpp +++ b/llvm/lib/Target/AArch64/AArch64LegalizerInfo.cpp @@ -24,110 +24,7 @@ using namespace llvm; using namespace LegalizeActions; - -/// FIXME: The following static functions are SizeChangeStrategy functions -/// that are meant to temporarily mimic the behaviour of the old legalization -/// based on doubling/halving non-legal types as closely as possible. This is -/// not entirly possible as only legalizing the types that are exactly a power -/// of 2 times the size of the legal types would require specifying all those -/// sizes explicitly. -/// In practice, not specifying those isn't a problem, and the below functions -/// should disappear quickly as we add support for legalizing non-power-of-2 -/// sized types further. 
-static void -addAndInterleaveWithUnsupported(LegalizerInfo::SizeAndActionsVec &result, - const LegalizerInfo::SizeAndActionsVec &v) { - for (unsigned i = 0; i < v.size(); ++i) { - result.push_back(v[i]); - if (i + 1 < v[i].first && i + 1 < v.size() && - v[i + 1].first != v[i].first + 1) - result.push_back({v[i].first + 1, Unsupported}); - } -} - -static LegalizerInfo::SizeAndActionsVec -widen_1_narrow_128_ToLargest(const LegalizerInfo::SizeAndActionsVec &v) { - assert(v.size() >= 1); - assert(v[0].first > 2); - LegalizerInfo::SizeAndActionsVec result = {{1, WidenScalar}, - {2, Unsupported}}; - addAndInterleaveWithUnsupported(result, v); - auto Largest = result.back().first; - assert(Largest + 1 < 128); - result.push_back({Largest + 1, Unsupported}); - result.push_back({128, NarrowScalar}); - result.push_back({129, Unsupported}); - return result; -} - -static LegalizerInfo::SizeAndActionsVec -widen_16(const LegalizerInfo::SizeAndActionsVec &v) { - assert(v.size() >= 1); - assert(v[0].first > 17); - LegalizerInfo::SizeAndActionsVec result = {{1, Unsupported}, - {16, WidenScalar}, - {17, Unsupported}}; - addAndInterleaveWithUnsupported(result, v); - auto Largest = result.back().first; - result.push_back({Largest + 1, Unsupported}); - return result; -} - -static LegalizerInfo::SizeAndActionsVec -widen_1_8(const LegalizerInfo::SizeAndActionsVec &v) { - assert(v.size() >= 1); - assert(v[0].first > 9); - LegalizerInfo::SizeAndActionsVec result = { - {1, WidenScalar}, {2, Unsupported}, - {8, WidenScalar}, {9, Unsupported}}; - addAndInterleaveWithUnsupported(result, v); - auto Largest = result.back().first; - result.push_back({Largest + 1, Unsupported}); - return result; -} - -static LegalizerInfo::SizeAndActionsVec -widen_1_8_16(const LegalizerInfo::SizeAndActionsVec &v) { - assert(v.size() >= 1); - assert(v[0].first > 17); - LegalizerInfo::SizeAndActionsVec result = { - {1, WidenScalar}, {2, Unsupported}, - {8, WidenScalar}, {9, Unsupported}, - {16, WidenScalar}, {17, 
Unsupported}}; - addAndInterleaveWithUnsupported(result, v); - auto Largest = result.back().first; - result.push_back({Largest + 1, Unsupported}); - return result; -} - -static LegalizerInfo::SizeAndActionsVec -widen_1_8_16_narrowToLargest(const LegalizerInfo::SizeAndActionsVec &v) { - assert(v.size() >= 1); - assert(v[0].first > 17); - LegalizerInfo::SizeAndActionsVec result = { - {1, WidenScalar}, {2, Unsupported}, - {8, WidenScalar}, {9, Unsupported}, - {16, WidenScalar}, {17, Unsupported}}; - addAndInterleaveWithUnsupported(result, v); - auto Largest = result.back().first; - result.push_back({Largest + 1, NarrowScalar}); - return result; -} - -static LegalizerInfo::SizeAndActionsVec -widen_1_8_16_32(const LegalizerInfo::SizeAndActionsVec &v) { - assert(v.size() >= 1); - assert(v[0].first > 33); - LegalizerInfo::SizeAndActionsVec result = { - {1, WidenScalar}, {2, Unsupported}, - {8, WidenScalar}, {9, Unsupported}, - {16, WidenScalar}, {17, Unsupported}, - {32, WidenScalar}, {33, Unsupported}}; - addAndInterleaveWithUnsupported(result, v); - auto Largest = result.back().first; - result.push_back({Largest + 1, Unsupported}); - return result; -} +using namespace LegalityPredicates; AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) { using namespace TargetOpcode; @@ -138,45 +35,51 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) { const LLT s32 = LLT::scalar(32); const LLT s64 = LLT::scalar(64); const LLT s128 = LLT::scalar(128); + const LLT s256 = LLT::scalar(256); + const LLT s512 = LLT::scalar(512); + const LLT v16s8 = LLT::vector(16, 8); + const LLT v8s8 = LLT::vector(8, 8); + const LLT v4s8 = LLT::vector(4, 8); + const LLT v8s16 = LLT::vector(8, 16); + const LLT v4s16 = LLT::vector(4, 16); + const LLT v2s16 = LLT::vector(2, 16); const LLT v2s32 = LLT::vector(2, 32); const LLT v4s32 = LLT::vector(4, 32); const LLT v2s64 = LLT::vector(2, 64); - for (auto Ty : {p0, s1, s8, s16, s32, s64}) - setAction({G_IMPLICIT_DEF, 
Ty}, Legal); + getActionDefinitionsBuilder(G_IMPLICIT_DEF) + .legalFor({p0, s1, s8, s16, s32, s64}) + .clampScalar(0, s1, s64) + .widenScalarToNextPow2(0, 8); - for (auto Ty : {s16, s32, s64, p0}) - setAction({G_PHI, Ty}, Legal); + getActionDefinitionsBuilder(G_PHI) + .legalFor({p0, s16, s32, s64}) + .clampScalar(0, s16, s64) + .widenScalarToNextPow2(0); - setLegalizeScalarToDifferentSizeStrategy(G_PHI, 0, widen_1_8); + getActionDefinitionsBuilder(G_BSWAP) + .legalFor({s32, s64}) + .clampScalar(0, s16, s64) + .widenScalarToNextPow2(0); - for (auto Ty : { s32, s64 }) - setAction({G_BSWAP, Ty}, Legal); + getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL}) + .legalFor({s32, s64, v2s32, v4s32, v2s64}) + .clampScalar(0, s32, s64) + .widenScalarToNextPow2(0) + .clampNumElements(0, v2s32, v4s32) + .clampNumElements(0, v2s64, v2s64) + .moreElementsToNextPow2(0); - for (unsigned BinOp : {G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL}) { - // These operations naturally get the right answer when used on - // GPR32, even if the actual type is narrower. 
- for (auto Ty : {s32, s64, v2s32, v4s32, v2s64}) - setAction({BinOp, Ty}, Legal); + getActionDefinitionsBuilder(G_GEP) + .legalFor({{p0, s64}}) + .clampScalar(1, s64, s64); - if (BinOp != G_ADD) - setLegalizeScalarToDifferentSizeStrategy(BinOp, 0, - widen_1_8_16_narrowToLargest); - } + getActionDefinitionsBuilder(G_PTR_MASK).legalFor({p0}); - setAction({G_GEP, p0}, Legal); - setAction({G_GEP, 1, s64}, Legal); - - setLegalizeScalarToDifferentSizeStrategy(G_GEP, 1, widen_1_8_16_32); - - setAction({G_PTR_MASK, p0}, Legal); - - for (unsigned BinOp : {G_LSHR, G_ASHR, G_SDIV, G_UDIV}) { - for (auto Ty : {s32, s64}) - setAction({BinOp, Ty}, Legal); - - setLegalizeScalarToDifferentSizeStrategy(BinOp, 0, widen_1_8_16); - } + getActionDefinitionsBuilder({G_LSHR, G_ASHR, G_SDIV, G_UDIV}) + .legalFor({s32, s64}) + .clampScalar(0, s32, s64) + .widenScalarToNextPow2(0); for (unsigned BinOp : {G_SREM, G_UREM}) for (auto Ty : { s1, s8, s16, s32, s64 }) @@ -187,204 +90,259 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) { setAction({Op, 1, s1}, Legal); } - for (unsigned Op : {G_UADDE, G_USUBE, G_SADDO, G_SSUBO, G_SMULH, G_UMULH}) { - for (auto Ty : { s32, s64 }) - setAction({Op, Ty}, Legal); - - setAction({Op, 1, s1}, Legal); - } - - for (unsigned BinOp : {G_FADD, G_FSUB, G_FMA, G_FMUL, G_FDIV}) - for (auto Ty : {s32, s64}) - setAction({BinOp, Ty}, Legal); - - for (unsigned BinOp : {G_FREM, G_FPOW}) { - setAction({BinOp, s32}, Libcall); - setAction({BinOp, s64}, Libcall); - } - - for (auto Ty : {s32, s64, p0}) { - setAction({G_INSERT, Ty}, Legal); - setAction({G_INSERT, 1, Ty}, Legal); - } - setLegalizeScalarToDifferentSizeStrategy(G_INSERT, 0, - widen_1_8_16_narrowToLargest); - for (auto Ty : {s1, s8, s16}) { - setAction({G_INSERT, 1, Ty}, Legal); - // FIXME: Can't widen the sources because that violates the constraints on - // G_INSERT (It seems entirely reasonable that inputs shouldn't overlap). 
- } - - for (auto Ty : {s1, s8, s16, s32, s64, p0}) - setAction({G_EXTRACT, Ty}, Legal); - - for (auto Ty : {s32, s64}) - setAction({G_EXTRACT, 1, Ty}, Legal); - - for (unsigned MemOp : {G_LOAD, G_STORE}) { - for (auto Ty : {s8, s16, s32, s64, p0, v2s32}) - setAction({MemOp, Ty}, Legal); - - setLegalizeScalarToDifferentSizeStrategy(MemOp, 0, - widen_1_narrow_128_ToLargest); - - // And everything's fine in addrspace 0. - setAction({MemOp, 1, p0}, Legal); - } + getActionDefinitionsBuilder({G_SMULH, G_UMULH}).legalFor({s32, s64}); + + getActionDefinitionsBuilder({G_UADDE, G_USUBE, G_SADDO, G_SSUBO}) + .legalFor({{s32, s1}, {s64, s1}}); + + getActionDefinitionsBuilder({G_FADD, G_FSUB, G_FMA, G_FMUL, G_FDIV}) + .legalFor({s32, s64}); + + getActionDefinitionsBuilder({G_FREM, G_FPOW}).libcallFor({s32, s64}); + + getActionDefinitionsBuilder(G_INSERT) + .unsupportedIf([=](const LegalityQuery &Query) { + return Query.Types[0].getSizeInBits() <= Query.Types[1].getSizeInBits(); + }) + .legalIf([=](const LegalityQuery &Query) { + const LLT &Ty0 = Query.Types[0]; + const LLT &Ty1 = Query.Types[1]; + if (Ty0 != s32 && Ty0 != s64 && Ty0 != p0) + return false; + return isPowerOf2_32(Ty1.getSizeInBits()) && + (Ty1.getSizeInBits() == 1 || Ty1.getSizeInBits() >= 8); + }) + .clampScalar(0, s32, s64) + .widenScalarToNextPow2(0) + .maxScalarIf(typeInSet(0, {s32}), 1, s16) + .maxScalarIf(typeInSet(0, {s64}), 1, s32) + .widenScalarToNextPow2(1); + + getActionDefinitionsBuilder(G_EXTRACT) + .unsupportedIf([=](const LegalityQuery &Query) { + return Query.Types[0].getSizeInBits() >= Query.Types[1].getSizeInBits(); + }) + .legalIf([=](const LegalityQuery &Query) { + const LLT &Ty0 = Query.Types[0]; + const LLT &Ty1 = Query.Types[1]; + if (Ty1 != s32 && Ty1 != s64) + return false; + if (Ty1 == p0) + return true; + return isPowerOf2_32(Ty0.getSizeInBits()) && + (Ty0.getSizeInBits() == 1 || Ty0.getSizeInBits() >= 8); + }) + .clampScalar(1, s32, s64) + .widenScalarToNextPow2(1) + 
.maxScalarIf(typeInSet(1, {s32}), 0, s16) + .maxScalarIf(typeInSet(1, {s64}), 0, s32) + .widenScalarToNextPow2(0); + + getActionDefinitionsBuilder({G_LOAD, G_STORE}) + .legalFor( + {{s8, p0}, {s16, p0}, {s32, p0}, {s64, p0}, {p0, p0}, {v2s32, p0}}) + .clampScalar(0, s8, s64) + .widenScalarToNextPow2(0) + .clampNumElements(0, v2s32, v2s32); // Constants - for (auto Ty : {s32, s64}) { - setAction({TargetOpcode::G_CONSTANT, Ty}, Legal); - setAction({TargetOpcode::G_FCONSTANT, Ty}, Legal); - } - - setAction({G_CONSTANT, p0}, Legal); - - setLegalizeScalarToDifferentSizeStrategy(G_CONSTANT, 0, widen_1_8_16); - setLegalizeScalarToDifferentSizeStrategy(G_FCONSTANT, 0, widen_16); - - setAction({G_ICMP, 1, s32}, Legal); - setAction({G_ICMP, 1, s64}, Legal); - setAction({G_ICMP, 1, p0}, Legal); - - setLegalizeScalarToDifferentSizeStrategy(G_ICMP, 0, widen_1_8_16); - setLegalizeScalarToDifferentSizeStrategy(G_FCMP, 0, widen_1_8_16); - setLegalizeScalarToDifferentSizeStrategy(G_ICMP, 1, widen_1_8_16); - - setAction({G_ICMP, s32}, Legal); - setAction({G_FCMP, s32}, Legal); - setAction({G_FCMP, 1, s32}, Legal); - setAction({G_FCMP, 1, s64}, Legal); + getActionDefinitionsBuilder(G_CONSTANT) + .legalFor({p0, s32, s64}) + .clampScalar(0, s32, s64) + .widenScalarToNextPow2(0); + getActionDefinitionsBuilder(G_FCONSTANT) + .legalFor({s32, s64}) + .clampScalar(0, s32, s64); + + getActionDefinitionsBuilder(G_ICMP) + .legalFor({{s32, s32}, {s32, s64}, {s32, p0}}) + .clampScalar(0, s32, s32) + .clampScalar(1, s32, s64) + .widenScalarToNextPow2(1); + + getActionDefinitionsBuilder(G_FCMP) + .legalFor({{s32, s32}, {s32, s64}}) + .clampScalar(0, s32, s32) + .clampScalar(1, s32, s64) + .widenScalarToNextPow2(1); // Extensions - for (auto Ty : { s1, s8, s16, s32, s64 }) { - setAction({G_ZEXT, Ty}, Legal); - setAction({G_SEXT, Ty}, Legal); - setAction({G_ANYEXT, Ty}, Legal); - } + getActionDefinitionsBuilder({G_ZEXT, G_SEXT, G_ANYEXT}) + .legalFor({s1, s8, s16, s32, s64}) + .maxScalar(0, s64) + 
       .widenScalarToNextPow2(0);
   // FP conversions
-  for (auto Ty : { s16, s32 }) {
-    setAction({G_FPTRUNC, Ty}, Legal);
-    setAction({G_FPEXT, 1, Ty}, Legal);
-  }
-
-  for (auto Ty : { s32, s64 }) {
-    setAction({G_FPTRUNC, 1, Ty}, Legal);
-    setAction({G_FPEXT, Ty}, Legal);
-  }
+  getActionDefinitionsBuilder(G_FPTRUNC).legalFor(
+      {{s16, s32}, {s16, s64}, {s32, s64}});
+  getActionDefinitionsBuilder(G_FPEXT).legalFor(
+      {{s32, s16}, {s64, s16}, {s64, s32}});
   // Conversions
-  for (auto Ty : { s32, s64 }) {
-    setAction({G_FPTOSI, 0, Ty}, Legal);
-    setAction({G_FPTOUI, 0, Ty}, Legal);
-    setAction({G_SITOFP, 1, Ty}, Legal);
-    setAction({G_UITOFP, 1, Ty}, Legal);
-  }
-  setLegalizeScalarToDifferentSizeStrategy(G_FPTOSI, 0, widen_1_8_16);
-  setLegalizeScalarToDifferentSizeStrategy(G_FPTOUI, 0, widen_1_8_16);
-  setLegalizeScalarToDifferentSizeStrategy(G_SITOFP, 1, widen_1_8_16);
-  setLegalizeScalarToDifferentSizeStrategy(G_UITOFP, 1, widen_1_8_16);
-
-  for (auto Ty : { s32, s64 }) {
-    setAction({G_FPTOSI, 1, Ty}, Legal);
-    setAction({G_FPTOUI, 1, Ty}, Legal);
-    setAction({G_SITOFP, 0, Ty}, Legal);
-    setAction({G_UITOFP, 0, Ty}, Legal);
-  }
+  getActionDefinitionsBuilder({G_FPTOSI, G_FPTOUI})
+      .legalForCartesianProduct({s32, s64})
+      .clampScalar(0, s32, s64)
+      .widenScalarToNextPow2(0)
+      .clampScalar(1, s32, s64)
+      .widenScalarToNextPow2(1);
+
+  getActionDefinitionsBuilder({G_SITOFP, G_UITOFP})
+      .legalForCartesianProduct({s32, s64})
+      .clampScalar(1, s32, s64)
+      .widenScalarToNextPow2(1)
+      .clampScalar(0, s32, s64)
+      .widenScalarToNextPow2(0);
   // Control-flow
-  for (auto Ty : {s1, s8, s16, s32})
-    setAction({G_BRCOND, Ty}, Legal);
-  setAction({G_BRINDIRECT, p0}, Legal);
+  getActionDefinitionsBuilder(G_BRCOND).legalFor({s1, s8, s16, s32});
+  getActionDefinitionsBuilder(G_BRINDIRECT).legalFor({p0});
   // Select
-  setLegalizeScalarToDifferentSizeStrategy(G_SELECT, 0, widen_1_8_16);
-
-  for (auto Ty : {s32, s64, p0})
-    setAction({G_SELECT, Ty}, Legal);
-
-  setAction({G_SELECT, 1, s1}, Legal);
+  getActionDefinitionsBuilder(G_SELECT)
+      .legalFor({{s32, s1}, {s64, s1}, {p0, s1}})
+      .clampScalar(0, s32, s64)
+      .widenScalarToNextPow2(0);
   // Pointer-handling
-  setAction({G_FRAME_INDEX, p0}, Legal);
-  setAction({G_GLOBAL_VALUE, p0}, Legal);
+  getActionDefinitionsBuilder(G_FRAME_INDEX).legalFor({p0});
+  getActionDefinitionsBuilder(G_GLOBAL_VALUE).legalFor({p0});
-  for (auto Ty : {s1, s8, s16, s32, s64})
-    setAction({G_PTRTOINT, 0, Ty}, Legal);
+  getActionDefinitionsBuilder(G_PTRTOINT)
+      .legalForCartesianProduct({s1, s8, s16, s32, s64}, {p0})
+      .maxScalar(0, s64)
+      .widenScalarToNextPow2(0, /*Min*/ 8);
-  setAction({G_PTRTOINT, 1, p0}, Legal);
-
-  setAction({G_INTTOPTR, 0, p0}, Legal);
-  setAction({G_INTTOPTR, 1, s64}, Legal);
+  getActionDefinitionsBuilder(G_INTTOPTR)
+      .unsupportedIf([&](const LegalityQuery &Query) {
+        return Query.Types[0].getSizeInBits() != Query.Types[1].getSizeInBits();
+      })
+      .legalFor({s64, p0});
   // Casts for 32 and 64-bit width type are just copies.
   // Same for 128-bit width type, except they are on the FPR bank.
-  for (auto Ty : {s1, s8, s16, s32, s64, s128}) {
-    setAction({G_BITCAST, 0, Ty}, Legal);
-    setAction({G_BITCAST, 1, Ty}, Legal);
-  }
-
-  // For the sake of copying bits around, the type does not really
-  // matter as long as it fits a register.
-  for (int EltSize = 8; EltSize <= 64; EltSize *= 2) {
-    setAction({G_BITCAST, 0, LLT::vector(128/EltSize, EltSize)}, Legal);
-    setAction({G_BITCAST, 1, LLT::vector(128/EltSize, EltSize)}, Legal);
-    if (EltSize >= 64)
-      continue;
-
-    setAction({G_BITCAST, 0, LLT::vector(64/EltSize, EltSize)}, Legal);
-    setAction({G_BITCAST, 1, LLT::vector(64/EltSize, EltSize)}, Legal);
-    if (EltSize >= 32)
-      continue;
-
-    setAction({G_BITCAST, 0, LLT::vector(32/EltSize, EltSize)}, Legal);
-    setAction({G_BITCAST, 1, LLT::vector(32/EltSize, EltSize)}, Legal);
-  }
+  getActionDefinitionsBuilder(G_BITCAST)
+      // FIXME: This is wrong since G_BITCAST is not allowed to change the
+      // number of bits but it's what the previous code described and fixing
+      // it breaks tests.
+      .legalForCartesianProduct({s1, s8, s16, s32, s64, s128, v16s8, v8s8, v4s8,
+                                 v8s16, v4s16, v2s16, v4s32, v2s32, v2s64});
-  setAction({G_VASTART, p0}, Legal);
+  getActionDefinitionsBuilder(G_VASTART).legalFor({p0});
   // va_list must be a pointer, but most sized types are pretty easy to handle
   // as the destination.
-  setAction({G_VAARG, 1, p0}, Legal);
+  getActionDefinitionsBuilder(G_VAARG)
+      .customForCartesianProduct({s8, s16, s32, s64, p0}, {p0})
+      .clampScalar(0, s8, s64)
+      .widenScalarToNextPow2(0, /*Min*/ 8);
-  for (auto Ty : {s8, s16, s32, s64, p0})
-    setAction({G_VAARG, Ty}, Custom);
+  if (ST.hasLSE()) {
+    getActionDefinitionsBuilder(G_ATOMIC_CMPXCHG)
+        .legalForCartesianProduct({s8, s16, s32, s64}, {p0});
+  }
+  getActionDefinitionsBuilder(G_ATOMIC_CMPXCHG);
   if (ST.hasLSE()) {
     for (auto Ty : {s8, s16, s32, s64}) {
       setAction({G_ATOMIC_CMPXCHG_WITH_SUCCESS, Ty}, Lower);
-      setAction({G_ATOMIC_CMPXCHG, Ty}, Legal);
-    }
-    setAction({G_ATOMIC_CMPXCHG, 1, p0}, Legal);
-
-    for (unsigned Op :
-         {G_ATOMICRMW_XCHG, G_ATOMICRMW_ADD, G_ATOMICRMW_SUB, G_ATOMICRMW_AND,
-          G_ATOMICRMW_OR, G_ATOMICRMW_XOR, G_ATOMICRMW_MIN, G_ATOMICRMW_MAX,
-          G_ATOMICRMW_UMIN, G_ATOMICRMW_UMAX}) {
-      for (auto Ty : {s8, s16, s32, s64}) {
-        setAction({Op, Ty}, Legal);
-      }
-      setAction({Op, 1, p0}, Legal);
     }
+
+    getActionDefinitionsBuilder(
+        {G_ATOMICRMW_XCHG, G_ATOMICRMW_ADD, G_ATOMICRMW_SUB, G_ATOMICRMW_AND,
+         G_ATOMICRMW_OR, G_ATOMICRMW_XOR, G_ATOMICRMW_MIN, G_ATOMICRMW_MAX,
+         G_ATOMICRMW_UMIN, G_ATOMICRMW_UMAX})
+        .legalForCartesianProduct({s8, s16, s32, s64}, {p0});
   }
   // Merge/Unmerge
-  for (unsigned Op : {G_MERGE_VALUES, G_UNMERGE_VALUES})
-    for (int Sz : {8, 16, 32, 64, 128, 192, 256, 384, 512}) {
-      LLT ScalarTy = LLT::scalar(Sz);
-      setAction({Op, ScalarTy}, Legal);
-      setAction({Op, 1, ScalarTy}, Legal);
-      if (Sz < 32)
-        continue;
-      for (int EltSize = 8; EltSize <= 64; EltSize *= 2) {
-        if (EltSize >= Sz)
-          continue;
-        LLT VecTy = LLT::vector(Sz / EltSize, EltSize);
-        setAction({Op, VecTy}, Legal);
-        setAction({Op, 1, VecTy}, Legal);
+  for (unsigned Op : {G_MERGE_VALUES, G_UNMERGE_VALUES}) {
+    unsigned BigTyIdx = Op == G_MERGE_VALUES ? 0 : 1;
+    unsigned LitTyIdx = Op == G_MERGE_VALUES ? 1 : 0;
+
+    auto notValidElt = [](const LegalityQuery &Query, unsigned TypeIdx) {
+      const LLT &Ty = Query.Types[TypeIdx];
+      if (Ty.isVector()) {
+        const LLT &EltTy = Ty.getElementType();
+        if (EltTy.getSizeInBits() < 8 || EltTy.getSizeInBits() > 64)
+          return true;
+        if (!isPowerOf2_32(EltTy.getSizeInBits()))
+          return true;
       }
-    }
+      return false;
+    };
+    auto scalarize =
+        [](const LegalityQuery &Query, unsigned TypeIdx) {
+          const LLT &Ty = Query.Types[TypeIdx];
+          return std::make_pair(TypeIdx, Ty.getElementType());
+        };
+
+    // FIXME: This rule is horrible, but specifies the same as what we had
+    // before with the particularly strange definitions removed (e.g.
+    // s8 = G_MERGE_VALUES s32, s32).
+    // Part of the complexity comes from these ops being extremely flexible. For
+    // example, you can build/decompose vectors with it, concatenate vectors,
+    // etc. and in addition to this you can also bitcast with it at the same
+    // time. We've been considering breaking it up into multiple ops to make it
+    // more manageable throughout the backend.
+    getActionDefinitionsBuilder(Op)
+        // Break up vectors with weird elements into scalars
+        .fewerElementsIf(
+            [=](const LegalityQuery &Query) { return notValidElt(Query, 0); },
+            [=](const LegalityQuery &Query) { return scalarize(Query, 0); })
+        .fewerElementsIf(
+            [=](const LegalityQuery &Query) { return notValidElt(Query, 1); },
+            [=](const LegalityQuery &Query) { return scalarize(Query, 1); })
+        // Clamp the big scalar to s8-s512 and make it either a power of 2, 192,
+        // or 384.
+        .clampScalar(BigTyIdx, s8, s512)
+        .widenScalarIf(
+            [=](const LegalityQuery &Query) {
+              const LLT &Ty = Query.Types[BigTyIdx];
+              return !isPowerOf2_32(Ty.getSizeInBits()) &&
+                     Ty.getSizeInBits() % 64 != 0;
+            },
+            [=](const LegalityQuery &Query) {
+              // Pick the next power of 2, or a multiple of 64 over 128.
+              // Whichever is smaller.
+              const LLT &Ty = Query.Types[BigTyIdx];
+              unsigned NewSizeInBits = 1
+                                       << Log2_32_Ceil(Ty.getSizeInBits() + 1);
+              if (NewSizeInBits >= 256) {
+                unsigned RoundedTo = alignTo<64>(Ty.getSizeInBits() + 1);
+                if (RoundedTo < NewSizeInBits)
+                  NewSizeInBits = RoundedTo;
+              }
+              return std::make_pair(BigTyIdx, LLT::scalar(NewSizeInBits));
+            })
+        // Clamp the little scalar to s8-s256 and make it a power of 2. It's not
+        // worth considering the multiples of 64 since 2*192 and 2*384 are not
+        // valid.
+        .clampScalar(LitTyIdx, s8, s256)
+        .widenScalarToNextPow2(LitTyIdx, /*Min*/ 8)
+        // So at this point, we have s8, s16, s32, s64, s128, s192, s256, s384,
+        // s512, <X x s8>, <X x s16>, <X x s32>, or <X x s64>.
+        // At this point it's simple enough to accept the legal types.
+        .legalIf([=](const LegalityQuery &Query) {
+          const LLT &BigTy = Query.Types[BigTyIdx];
+          const LLT &LitTy = Query.Types[LitTyIdx];
+          if (BigTy.isVector() && BigTy.getSizeInBits() < 32)
+            return false;
+          if (LitTy.isVector() && LitTy.getSizeInBits() < 32)
+            return false;
+          return BigTy.getSizeInBits() % LitTy.getSizeInBits() == 0;
+        })
+        // Any vectors left are the wrong size. Scalarize them.
+        .fewerElementsIf([](const LegalityQuery &Query) { return true; },
+                         [](const LegalityQuery &Query) {
+                           return std::make_pair(
+                               0, Query.Types[0].getElementType());
+                         })
+        .fewerElementsIf([](const LegalityQuery &Query) { return true; },
+                         [](const LegalityQuery &Query) {
+                           return std::make_pair(
+                               1, Query.Types[1].getElementType());
+                         });
+  }
   computeTables();
 } |
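The `widenScalarIf` rule for the big type in the Merge/Unmerge hunk derives a new size from the current bit width. The sketch below reproduces only that arithmetic as standalone C++ so the results can be checked by hand. It is not part of the patch: `isPowerOf2`, `log2Ceil`, `alignTo64`, `needsWidening` and `widenedBigScalarBits` are hypothetical local stand-ins, assumed to behave like `isPowerOf2_32`, `Log2_32_Ceil` and `alignTo<64>` for the small positive widths used here.

```cpp
#include <cassert>

// Hypothetical stand-in for isPowerOf2_32: true for 1, 2, 4, 8, ...
static bool isPowerOf2(unsigned V) { return V != 0 && (V & (V - 1)) == 0; }

// Hypothetical stand-in for Log2_32_Ceil: smallest L with 2^L >= V (V >= 1).
static unsigned log2Ceil(unsigned V) {
  unsigned L = 0;
  while ((1u << L) < V)
    ++L;
  return L;
}

// Hypothetical stand-in for alignTo<64>: round V up to a multiple of 64.
static unsigned alignTo64(unsigned V) { return (V + 63) / 64 * 64; }

// Mirrors the predicate lambda: widen when the width is neither a power of 2
// nor a multiple of 64.
static bool needsWidening(unsigned Bits) {
  return !isPowerOf2(Bits) && Bits % 64 != 0;
}

// Mirrors the mutation lambda: take the next power of 2 above Bits, and once
// that reaches 256 prefer the next multiple of 64 if it is smaller.
static unsigned widenedBigScalarBits(unsigned Bits) {
  unsigned NewSizeInBits = 1u << log2Ceil(Bits + 1);
  if (NewSizeInBits >= 256) {
    unsigned RoundedTo = alignTo64(Bits + 1);
    if (RoundedTo < NewSizeInBits)
      NewSizeInBits = RoundedTo;
  }
  return NewSizeInBits;
}
```

Under these assumptions, `widenedBigScalarBits(24)` is 32, `widenedBigScalarBits(130)` is 192, and `widenedBigScalarBits(300)` is 320. Widths such as 64 or 192 fail `needsWidening` and are left alone.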

