From cbe6411401e21d76d6d80997f62583c98822c518 Mon Sep 17 00:00:00 2001 From: Chandler Carruth Date: Thu, 1 Oct 2015 23:40:12 +0000 Subject: Fix the SSE4 byte sign extension in a cleaner way, and more thoroughly test that our intrinsics behave the same under -fsigned-char and -funsigned-char. This further testing uncovered that AVX-2 has a broken cmpgt for 8-bit elements, and has for a long time. This is fixed in the same way as SSE4 handles the case. The other ISA extensions currently work correctly because they use specific instruction intrinsics. As soon as they are rewritten in terms of generic IR, they will need to add these special casts. I've added the necessary testing to catch this however, so we shouldn't have to chase it down again. I considered changing the core typedef to be signed, but that seems like a bad idea. Notably, it would be an ABI break if anyone is reaching into the innards of the intrinsic headers and passing __v16qi on an API boundary. I can't be completely confident that this wouldn't happen due to a macro expanding in a lambda, etc., so it seems much better to leave it alone. It also matches GCC's behavior exactly. A fun side note is that for both GCC and Clang, -funsigned-char really does change the semantics of __v16qi. To observe this, consider: % cat x.cc #include #include int main() { __v16qi a = { 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; __v16qi b = _mm_set1_epi8(-1); std::cout << (int)(a / b)[0] << ", " << (int)(a / b)[1] << '\n'; } % clang++ -o x x.cc && ./x -1, 1 % clang++ -funsigned-char -o x x.cc && ./x 0, 1 However, while this may be surprising, both Clang and GCC agree. Differential Revision: http://reviews.llvm.org/D13324 llvm-svn: 249097 --- clang/test/CodeGen/mmx-builtins.c | 1 + 1 file changed, 1 insertion(+) (limited to 'clang/test/CodeGen/mmx-builtins.c') diff --git a/clang/test/CodeGen/mmx-builtins.c b/clang/test/CodeGen/mmx-builtins.c index e9f8d8696f9..f17d6eadff0 100644 --- a/clang/test/CodeGen/mmx-builtins.c +++ b/clang/test/CodeGen/mmx-builtins.c @@ -1,5 +1,6 @@ // REQUIRES: x86-registered-target // RUN: %clang_cc1 %s -O3 -triple=x86_64-apple-darwin -target-feature +ssse3 -S -o - | FileCheck %s +// RUN: %clang_cc1 %s -O3 -triple=x86_64-apple-darwin -target-feature +ssse3 -fno-signed-char -S -o - | FileCheck %s // FIXME: Disable inclusion of mm_malloc.h, our current implementation is broken // on win32 since we don't generally know how to find errno.h. -- cgit v1.2.3