[CUDA] Expand upon --cuda-gpu-arch flag in CompileCudaWithLLVM doc.

llvm-svn: 280848
author: Justin Lebar <jlebar@google.com> 2016-09-07 20:09:46 +0000
committer: Justin Lebar <jlebar@google.com> 2016-09-07 20:09:46 +0000
commit: 84473cdd412e34dc641b61efc0ef0778cb64cfb1 (patch)
tree: 905db98f696776a48679386ee089fc414e2fea33
parent: fbad5b2f1b7dfd1c7e16252315e63c6b3ebb1cc3 (diff)
1 files changed, 7 insertions, 0 deletions
diff --git a/llvm/docs/CompileCudaWithLLVM.rst b/llvm/docs/CompileCudaWithLLVM.rst
index f57839cec96..85aab5dda0f 100644
--- a/llvm/docs/CompileCudaWithLLVM.rst
+++ b/llvm/docs/CompileCudaWithLLVM.rst
@@ -119,6 +119,13 @@ your GPU <https://developer.nvidia.com/cuda-gpus>`_. For example, if you want
 to run your program on a GPU with compute capability of 3.5, you should specify
 ``--cuda-gpu-arch=sm_35``.
 
+Note: You cannot pass ``compute_XX`` as an argument to ``--cuda-gpu-arch``;
+only ``sm_XX`` is currently supported.  However, clang always includes PTX in
+its binaries, so a binary compiled with, say, ``--cuda-gpu-arch=sm_30`` is
+forwards-compatible with newer GPUs such as ``sm_35``.
+
+You can pass ``--cuda-gpu-arch`` multiple times to compile for multiple archs.
+
 Detecting clang vs NVCC
 =======================