author: Alexey Bataev <a.bataev@hotmail.com> 2019-02-20 16:36:22 +0000
committer: Alexey Bataev <a.bataev@hotmail.com> 2019-02-20 16:36:22 +0000
commit: 8061acd501f1cb6c00a886f4ee5cb9adc6cda39a
tree: a4c87ec4d6dfe10c577d2c352be2dc1a41ce6bb5 /clang/test/Driver/openmp-offload-gpu.c
parent: c4d07554e44867191492a102ea3639276ba1ece1
[OPENMP][NVPTX]Use faster teams reduction algorithm.
A faster way to reduce the values in teams reductions was found; the codegen is updated to use this faster algorithm and the new runtime functions that support it.

llvm-svn: 354479
Diffstat (limited to 'clang/test/Driver/openmp-offload-gpu.c')
-rw-r--r-- clang/test/Driver/openmp-offload-gpu.c 5
1 files changed, 5 insertions, 0 deletions
diff --git a/clang/test/Driver/openmp-offload-gpu.c b/clang/test/Driver/openmp-offload-gpu.c
index dfdc79b5f70..7a4dd95e541 100644
--- a/clang/test/Driver/openmp-offload-gpu.c
+++ b/clang/test/Driver/openmp-offload-gpu.c
@@ -273,3 +273,8 @@
 // RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target -march=sm_60 %s -fopenmp-cuda-force-full-runtime -fno-openmp-cuda-force-full-runtime 2>&1 \
 // RUN: | FileCheck -check-prefix=NO_FULL_RUNTIME %s
 // NO_FULL_RUNTIME-NOT: "-{{fno-|f}}openmp-cuda-force-full-runtime"
+
+// RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target -march=sm_60 %s -fopenmp-cuda-teams-reduction-recs-num=2048 2>&1 \
+// RUN: | FileCheck -check-prefix=CUDA_RED_RECS %s
+// CUDA_RED_RECS: clang{{.*}}"-cc1"{{.*}}"-triple" "nvptx64-nvidia-cuda"
+// CUDA_RED_RECS-SAME: "-fopenmp-cuda-teams-reduction-recs-num=2048"