author | Tobias Grosser <tobias@grosser.es> | 2017-08-24 09:46:25 +0000 |
---|---|---|
committer | Tobias Grosser <tobias@grosser.es> | 2017-08-24 09:46:25 +0000 |
commit | d7eb61929929fcb5dae77f63a5d9d9be026eaeb8 (patch) | |
tree | 392f0b56f374df29e5b1aa5c380306fd770e0458 /llvm/lib/Analysis/TargetTransformInfo.cpp | |
parent | afc2cd3c9e27d27fbffa2d73be8a27d99c9347cc (diff) | |
Model cache size and associativity in TargetTransformInfo
Summary:
We add the precise cache sizes and associativity for the following Intel
architectures:
- Penryn
- Nehalem
- Westmere
- Sandy Bridge
- Ivy Bridge
- Haswell
- Broadwell
- Skylake
- Kaby Lake
For several months, Polly has used a performance model for BLAS computations
that derives optimal cache and register tile sizes from cache and latency
information (based on ideas from "Analytical Modeling Is Enough for
High-Performance BLIS" by Tze Meng Low, published in TOMS 2016).
While this model was being bootstrapped, the target values were kept in Polly.
Now that our implementation is rather mature, it seems time to teach
LLVM itself about cache sizes.
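To illustrate how tile sizes might be derived from cache size and associativity, here is a minimal sketch loosely modelled on the set-associative reasoning of the cited analytical model. The exact formulas Polly uses are not part of this commit and may differ. The names `CacheInfo`, `MacroTile`, and `deriveMacroTile`, and the micro-kernel sizes `Mr` and `Nr`, are hypothetical and introduced only for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical description of one cache level (not an LLVM or Polly type).
struct CacheInfo {
  std::uint64_t SizeBytes;
  unsigned Associativity; // number of ways, assumed to be at least 2
};

// Hypothetical result: blocking factors for the K and M dimensions of a gemm.
struct MacroTile {
  std::uint64_t Kc;
  std::uint64_t Mc;
};

// Sketch: derive macro-tile sizes from L1/L2 parameters and an Mr x Nr
// register tile of elements that are ElemBytes wide. Returns {0, 0} when the
// caches are too small for this simple model to produce a usable tile.
inline MacroTile deriveMacroTile(CacheInfo L1, CacheInfo L2, unsigned Mr,
                                 unsigned Nr, unsigned ElemBytes) {
  // Reserve one L1 way and split the remaining ways between the A and B
  // micro-panels in the ratio Mr : Nr; Car is the share given to A.
  auto Car = static_cast<std::uint64_t>(std::floor(
      (L1.Associativity - 1) / (1.0 + static_cast<double>(Nr) / Mr)));

  // Car ways hold (Car * bytes-per-way) bytes; each column of the A
  // micro-panel needs Mr * ElemBytes bytes, which bounds Kc.
  std::uint64_t Kc = (Car * L1.SizeBytes) /
                     (static_cast<std::uint64_t>(Mr) * L1.Associativity *
                      ElemBytes);
  if (Kc == 0)
    return {0, 0};

  // Number of L2 ways consumed by one row of Kc elements.
  double Cac = static_cast<double>(Kc * ElemBytes * L2.Associativity) /
               static_cast<double>(L2.SizeBytes);

  // Keep two L2 ways free and fill the rest with rows of the A block.
  auto Mc = static_cast<std::uint64_t>(
      std::floor((L2.Associativity - 2) / Cac));
  return {Kc, Mc};
}
```

For example, `deriveMacroTile({32 * 1024, 8}, {256 * 1024, 8}, 4, 8, 8)` models a 32 KiB 8-way L1, a 256 KiB 8-way L2, and a 4 x 8 register tile of doubles; these inputs are assumptions chosen for the example, not values taken from the patch.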
Interestingly, L1 and L2 cache sizes are fairly constant across
micro-architectures, hence a set of architecture-specific default values
seems like a good start. They can be expanded to more target-specific values
if certain newer architectures require different values. For now, a set
of Intel architectures is provided.
As a small teaser: for a simple gemm kernel, this model allows us to
reduce the run time from 1.2s to 0.27s. For gemm kernels with less favorable
memory layouts, even larger speedups can be reported.
Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb
Reviewed By: fhahn, asb
Subscribers: lsaba, asb, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D37051
llvm-svn: 311647
Diffstat (limited to 'llvm/lib/Analysis/TargetTransformInfo.cpp')
-rw-r--r-- | llvm/lib/Analysis/TargetTransformInfo.cpp | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index 6cb7952d796..e09138168c9 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -321,6 +321,16 @@ unsigned TargetTransformInfo::getCacheLineSize() const {
   return TTIImpl->getCacheLineSize();
 }
 
+llvm::Optional<unsigned> TargetTransformInfo::getCacheSize(CacheLevel Level)
+  const {
+  return TTIImpl->getCacheSize(Level);
+}
+
+llvm::Optional<unsigned> TargetTransformInfo::getCacheAssociativity(
+  CacheLevel Level) const {
+  return TTIImpl->getCacheAssociativity(Level);
+}
+
 unsigned TargetTransformInfo::getPrefetchDistance() const {
   return TTIImpl->getPrefetchDistance();
 }
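Both new queries return an optional value, so a caller has to handle targets that report nothing. The sketch below shows one way a client might do that. It does not use LLVM headers: `FakeTTI` is a hypothetical stand-in that only mirrors the shape of the two methods in the diff, `std::optional` (C++17) replaces `llvm::Optional`, the enumerator names `L1D` and `L2D` are assumed, and the sizes and associativity it returns are illustrative rather than taken from the patch.

```cpp
#include <optional>

// Assumed enumerators; the diff only shows the type name CacheLevel.
enum class CacheLevel { L1D, L2D };

// Hypothetical stand-in for a target's TTI implementation.
struct FakeTTI {
  bool KnowsCaches; // false models a target without cache information

  std::optional<unsigned> getCacheSize(CacheLevel Level) const {
    if (!KnowsCaches)
      return std::nullopt;
    // Illustrative values only: 32 KiB L1D, 256 KiB L2D.
    return Level == CacheLevel::L1D ? 32u * 1024 : 256u * 1024;
  }

  std::optional<unsigned> getCacheAssociativity(CacheLevel /*Level*/) const {
    if (!KnowsCaches)
      return std::nullopt;
    return 8u; // illustrative: 8-way at both levels
  }
};

// Bytes held by a single way of the given cache level, or Fallback when the
// target does not report both size and a non-zero associativity.
inline unsigned bytesPerWay(const FakeTTI &TTI, CacheLevel Level,
                            unsigned Fallback) {
  std::optional<unsigned> Size = TTI.getCacheSize(Level);
  std::optional<unsigned> Assoc = TTI.getCacheAssociativity(Level);
  if (!Size || !Assoc || *Assoc == 0)
    return Fallback;
  return *Size / *Assoc;
}
```

Passing an explicit fallback keeps the decision about unknown targets with the caller, which fits an interface where a missing value means "no information" rather than "no cache".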