author: Benjamin Kramer <benny.kra@googlemail.com> 2018-04-27 13:36:05 +0000
committer: Benjamin Kramer <benny.kra@googlemail.com> 2018-04-27 13:36:05 +0000
commit: 733c7fc55d0dfa4d49f4becb2fb92e108611ef11 (patch)
tree: 8af400997cdce4f4309a02b601e5e08101d0168b /clang/lib/Frontend/CompilerInvocation.cpp
parent: aef5ca72998b6afbd16de39503c318214f7fc19d (diff)
[NVPTX] Turn on Loop/SLP vectorization
Since PTX has grown a <2 x half> datatype, vectorization has become more important. The late LoadStoreVectorizer intentionally only handles loads and stores, but arithmetic now has to be vectorized too for optimal throughput.

This is still very limited: SLP vectorization happily creates <2 x half> if it's a legal type, but there is still a lot of register moving to get that value fed into a vectorized store. Overall it's a small performance win from reducing the number of arithmetic instructions.

I haven't really checked what the loop vectorizer does to PTX code; the cost model there might need some more tweaks. I didn't see it causing harm, though.

Differential Revision: https://reviews.llvm.org/D46130
llvm-svn: 331035
Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions