summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGenCUDA
diff options
context:
space:
mode:
authorJonas Hahnfeld <hahnjo@hahnjo.de>2018-06-08 11:17:08 +0000
committerJonas Hahnfeld <hahnjo@hahnjo.de>2018-06-08 11:17:08 +0000
commit3b9cbba9a82bc6755828f2174a235101b16b4380 (patch)
treedcdd250c33df748b9414030b1731f36a33c672d2 /clang/test/CodeGenCUDA
parent1d6254f7e9960bd90a6519face87e6b73c24fa20 (diff)
downloadbcm5719-llvm-3b9cbba9a82bc6755828f2174a235101b16b4380.tar.gz
bcm5719-llvm-3b9cbba9a82bc6755828f2174a235101b16b4380.zip
[CUDA] Fix emission of constant strings in sections
CGM.GetAddrOfConstantCString() sets the adress of the created GlobalValue to unnamed. When emitting the object file LLVM will mark the surrounding section as SHF_MERGE iff the string is nul-terminated and contains no other nuls (see IsNullTerminatedString). This results in problems when saving temporaries because LLVM doesn't set an EntrySize, so reading in the serialized assembly file fails. This never happened for the GPU binaries because they usually contain a nul-character somewhere. Instead this only affected the module ID when compiling relocatable device code. However, this points to a potentially larger problem: If we put a constant string into a named section, we really want the data to end up in that section in the object file. To avoid LLVM merging sections this patch unmarks the GlobalVariable's address as unnamed which also fixes the problem of invalid serialized assembly files when saving temporaries. Differential Revision: https://reviews.llvm.org/D47902 llvm-svn: 334281
Diffstat (limited to 'clang/test/CodeGenCUDA')
-rw-r--r--clang/test/CodeGenCUDA/device-stub.cu6
1 files changed, 3 insertions, 3 deletions
diff --git a/clang/test/CodeGenCUDA/device-stub.cu b/clang/test/CodeGenCUDA/device-stub.cu
index 894b7205c30..3798b8cf731 100644
--- a/clang/test/CodeGenCUDA/device-stub.cu
+++ b/clang/test/CodeGenCUDA/device-stub.cu
@@ -65,7 +65,7 @@ void use_pointers() {
// ALL: private unnamed_addr constant{{.*}}kernelfunc{{.*}}\00"
// * constant unnamed string with GPU binary
// HIP: @[[FATBIN:__hip_fatbin]] = external constant i8, section ".hip_fatbin"
-// CUDA: @[[FATBIN:.*]] = private unnamed_addr constant{{.*GPU binary would be here.*}}\00",
+// CUDA: @[[FATBIN:.*]] = private constant{{.*GPU binary would be here.*}}\00",
// CUDANORDC-SAME: section ".nv_fatbin", align 8
// CUDARDC-SAME: section "__nv_relfatbin", align 8
// * constant struct that wraps GPU binary
@@ -81,7 +81,7 @@ void use_pointers() {
// * variable to save GPU binary handle after initialization
// NORDC: @__[[PREFIX]]_gpubin_handle = internal global i8** null
// * constant unnamed string with NVModuleID
-// RDC: [[MODULE_ID_GLOBAL:@.*]] = private unnamed_addr constant
+// RDC: [[MODULE_ID_GLOBAL:@.*]] = private constant
// CUDARDC-SAME: c"[[MODULE_ID:.+]]\00", section "__nv_module_id", align 32
// HIPRDC-SAME: c"[[MODULE_ID:.+]]\00", section "__hip_module_id", align 32
// * Make sure our constructor was added to global ctor list.
@@ -141,7 +141,7 @@ void hostfunc(void) { kernelfunc<<<1, 1>>>(1, 1, 1); }
// There should be no __[[PREFIX]]_register_globals if we have no
// device-side globals, but we still need to register GPU binary.
// Skip GPU binary string first.
-// CUDANOGLOBALS: @{{.*}} = private unnamed_addr constant{{.*}}
+// CUDANOGLOBALS: @{{.*}} = private constant{{.*}}
// HIPNOGLOBALS: @{{.*}} = external constant{{.*}}
// NOGLOBALS-NOT: define internal void @__{{.*}}_register_globals
// NOGLOBALS: define internal void @__[[PREFIX:cuda|hip]]_module_ctor
OpenPOWER on IntegriCloud