summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen/TargetInfo.cpp
diff options
context:
space:
mode:
authorUlrich Weigand <ulrich.weigand@de.ibm.com>2014-07-21 00:56:36 +0000
committerUlrich Weigand <ulrich.weigand@de.ibm.com>2014-07-21 00:56:36 +0000
commit601957fa23ab90cbb24114ee9913244883453f96 (patch)
tree306890a8831d7cb6d9d3480e365437bc9ae65eb5 /clang/lib/CodeGen/TargetInfo.cpp
parentb712237da6acd42a9221787acea7880f8db3fa0c (diff)
downloadbcm5719-llvm-601957fa23ab90cbb24114ee9913244883453f96.tar.gz
bcm5719-llvm-601957fa23ab90cbb24114ee9913244883453f96.zip
[PowerPC] Optimize passing certain aggregates by value
In addition to enabling ELFv2 homogeneous aggregate handling, LLVM support to pass array types directly also enables a performance enhancement. We can now pass (non-homogeneous) aggregates that fit fully in registers as direct integer arrays, using an element type to encode the alignment requirement (that would otherwise go to the "byval align" field). This is preferable since "byval" forces the back-end to write the aggregate out to the stack, even if it could be passed fully in registers. This is particularly annoying on ELFv2, if there is no parameter save area available, since we then need to allocate space on the callee's stack just to hold those aggregates. Note that to implement this optimization, this patch does not attempt to fully anticipate register allocation rules as (defined in the ABI and) implemented in the back-end. Instead, the patch is simply passing *any* aggregate passed by value using the array mechanism if its size is up to 64 bytes. This means that some of those will end up being passed in stack slots anyway, but the generated code shouldn't be any worse either. (*Large* aggregates remain passed using "byval" to enable optimized copying via memcpy etc.) llvm-svn: 213495
Diffstat (limited to 'clang/lib/CodeGen/TargetInfo.cpp')
-rw-r--r--clang/lib/CodeGen/TargetInfo.cpp25
1 files changed, 25 insertions, 0 deletions
diff --git a/clang/lib/CodeGen/TargetInfo.cpp b/clang/lib/CodeGen/TargetInfo.cpp
index 2ed33b00bca..5da22c3e6cb 100644
--- a/clang/lib/CodeGen/TargetInfo.cpp
+++ b/clang/lib/CodeGen/TargetInfo.cpp
@@ -3178,6 +3178,31 @@ PPC64_SVR4_ABIInfo::classifyArgumentType(QualType Ty) const {
return ABIArgInfo::getDirect(CoerceTy);
}
+ // If an aggregate may end up fully in registers, we do not
+ // use the ByVal method, but pass the aggregate as array.
+ // This is usually beneficial since we avoid forcing the
+ // back-end to store the argument to memory.
+ uint64_t Bits = getContext().getTypeSize(Ty);
+ if (Bits > 0 && Bits <= 8 * GPRBits) {
+ llvm::Type *CoerceTy;
+
+ // Types up to 8 bytes are passed as integer type (which will be
+ // properly aligned in the argument save area doubleword).
+ if (Bits <= GPRBits)
+ CoerceTy = llvm::IntegerType::get(getVMContext(),
+ llvm::RoundUpToAlignment(Bits, 8));
+ // Larger types are passed as arrays, with the base type selected
+ // according to the required alignment in the save area.
+ else {
+ uint64_t RegBits = ABIAlign * 8;
+ uint64_t NumRegs = llvm::RoundUpToAlignment(Bits, RegBits) / RegBits;
+ llvm::Type *RegTy = llvm::IntegerType::get(getVMContext(), RegBits);
+ CoerceTy = llvm::ArrayType::get(RegTy, NumRegs);
+ }
+
+ return ABIArgInfo::getDirect(CoerceTy);
+ }
+
// All other aggregates are passed ByVal.
return ABIArgInfo::getIndirect(ABIAlign, /*ByVal=*/true,
/*Realign=*/TyAlign > ABIAlign);
OpenPOWER on IntegriCloud