| Commit message | Author | Age | Files | Lines |
|
These intrinsics are special because they directly take a memory operand (AVX2
adds the register counterparts). Typically, other non-memop intrinsics take
registers and then it's left to isel to fold memory operands.
In order to LICM intrinsics directly reading memory, we require that no stores
are in the loop (LICM) or that the folded load accesses constant memory
(MachineLICM). When neither is the case we fail to hoist a loop-invariant
broadcast.
We can work around this limitation if we expose the load as a regular load and
then just implement the broadcast using the vector initializer syntax. This
exposes the load to LICM and other optimizations.
At the IR level this is translated into a series of insertelements. The
sequence is already recognized as a broadcast so there is no impact on the
quality of codegen.
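A minimal sketch of the approach described above, not the actual header change. It assumes the GCC/Clang `vector_size` extension as a stand-in for `__m128`; the names `sketch_v4sf` and `sketch_broadcast_ss` are hypothetical. The point is that the memory access is an ordinary scalar load and the broadcast is written with initializer syntax.

```c
/* Hypothetical stand-in for __m128, built on the GCC/Clang vector extension. */
typedef float sketch_v4sf __attribute__((vector_size(16)));

/* The load of *p is a plain scalar load, so the optimizer can treat it like
   any other load. The initializer below is what gets translated into a
   series of insertelements at the IR level. */
static inline sketch_v4sf sketch_broadcast_ss(const float *p) {
  float f = *p;
  return (sketch_v4sf){f, f, f, f};
}
```

Whether a given backend turns the initializer back into a single broadcast instruction depends on the target and compiler version; the sketch only illustrates the source-level shape.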
_mm256_broadcast_pd and _mm256_broadcast_ps are not updated by this patch
because right now we lack the DAG-combiner smartness to recover the broadcast
instructions. This will be tackled in a follow-on.
There will be complementary changes on the LLVM side to remove the LLVM
intrinsics and to auto-upgrade bitcode files.
Fixes <rdar://problem/16494520>
llvm-svn: 209846
|
(fixes PR19431 - http://llvm.org/bugs/show_bug.cgi?id=19431)
llvm-svn: 209769
|
The last step of _mm_cvtps_pi16 should use _mm_packs_pi32, which is a function
that reads two __m64 values and packs four 32-bit values into four 16-bit
values.
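The packing step can be modelled in plain C as below. This is a sketch of the semantics only (signed saturation from 32 to 16 bits), not the code in the header; `sketch_saturate_i16` and `sketch_packs_pi32` are hypothetical names, and arrays stand in for the `__m64` operands.

```c
#include <stdint.h>

/* Clamp a 32-bit value to the int16_t range (signed saturation). */
static int16_t sketch_saturate_i16(int32_t v) {
  if (v > INT16_MAX) return INT16_MAX;
  if (v < INT16_MIN) return INT16_MIN;
  return (int16_t)v;
}

/* Model of the pack: the two 32-bit lanes of `lo` fill result lanes 0-1,
   the two 32-bit lanes of `hi` fill result lanes 2-3. */
static void sketch_packs_pi32(const int32_t lo[2], const int32_t hi[2],
                              int16_t out[4]) {
  out[0] = sketch_saturate_i16(lo[0]);
  out[1] = sketch_saturate_i16(lo[1]);
  out[2] = sketch_saturate_i16(hi[0]);
  out[3] = sketch_saturate_i16(hi[1]);
}
```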
<rdar://problem/16873717>
llvm-svn: 209489
|
llvm-svn: 208699
|
Summary:
Most of the clang header patch is by Simon Pilgrim @ SCEE.
Also fixed (or added) clang tests for these intrinsics.
LLVM tests to make sure we get the blend instruction out of these
shufflevectors are at http://reviews.llvm.org/D3600
Reviewers: eli.friedman, craig.topper, rafael
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D3601
llvm-svn: 208664
|
glibc expects that stddef.h only defines a single thing if either of these
defines is set. For example, before this change, a C file containing
#include <stdlib.h>
int ptrdiff_t = 0;
would compile with gcc but not with clang. Now it compiles with clang too.
This also fixes PR12997, where older versions of the Linux headers would define
NULL incorrectly, and glibc would define __need_NULL and expect stddef.h to
redefine NULL with the correct definition.
llvm-svn: 207606
|
llvm-svn: 207483
|
See the bug and the cfe-commits thread "[patch] Let stddef.h redefine NULL if
__need_NULL is set" for discussion.
Fixes PR12997 and is similar to the __need_wint_t bits already in this file.
llvm-svn: 207482
|
Since r207132, these are defined in ia32intrin.h.
llvm-svn: 207134
|
This patch:
1. Adds a definition for two new GCCBuiltins in BuiltinsX86.def:
__builtin_ia32_rdtsc;
__builtin_ia32_rdtscp;
2. Replaces the already existing definition of intrinsic __rdtsc in
ia32intrin.h with a simple call to the new GCC builtin __builtin_ia32_rdtsc.
3. Adds a definition for the new intrinsic __rdtscp in ia32intrin.h
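A small sketch of calling the builtin named in item 1. The wrapper name `sketch_read_ticks` is hypothetical, and the `time()` fallback for non-x86 targets is an assumption added only so the example compiles everywhere; it is not part of the patch.

```c
#include <time.h>

/* On x86, read the time-stamp counter via __builtin_ia32_rdtsc, which is
   what __rdtsc is described as forwarding to. Elsewhere, fall back to
   time() (assumption for portability of this sketch only). */
static unsigned long long sketch_read_ticks(void) {
#if defined(__x86_64__) || defined(__i386__)
  return __builtin_ia32_rdtsc();
#else
  return (unsigned long long)time(NULL);
#endif
}
```

A typical use is to take two readings around a region of code and subtract them; the difference is in TSC ticks, not in seconds.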
llvm-svn: 207132
|
Don't install a file using the legacy spelling.
llvm-svn: 206431
|
Don't include input and output regs in clobbers. Prefix some
identifiers with __. Add a memory constraint to __readcr3 to prevent
reordering. This constraint is heavy-handed, but conservatively
correct.
Thanks to PaX Team for the suggestions.
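The operand and clobber rules above can be illustrated with an unprivileged example. This is a sketch, not the Intrin.h code: `sketch_ordered_copy` is a hypothetical function, and a plain register move stands in for the control-register read, which cannot run in user mode.

```c
/* Copies `src` through a register move. The input and output appear only
   as operands; the clobber list names just "memory", which keeps the
   compiler from reordering memory accesses across the statement. */
static unsigned long sketch_ordered_copy(unsigned long src) {
  unsigned long dst;
#if defined(__x86_64__) || defined(__i386__)
  __asm__ __volatile__("mov %1, %0" : "=r"(dst) : "r"(src) : "memory");
#else
  /* Non-x86 fallback (assumption of this sketch): compiler barrier only. */
  dst = src;
  __asm__ __volatile__("" : : : "memory");
#endif
  return dst;
}
```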
llvm-svn: 205778
|
Fixes PR19301.
Based on a patch from Steven Graf!
llvm-svn: 205751
|
Differential Revision: http://llvm-reviews.chandlerc.com/D3212
llvm-svn: 205172
|
I'd gone too far pruning aarch64_simd.h this time and took out one
instance of arm_neon.h. This should restore us to the status quo.
llvm-svn: 205111
|
They were causing the autotools builds to fail.
llvm-svn: 205103
|
This adds Clang support for the ARM64 backend. There are definitely
still some rough edges, so please bring up any issues you see with
this patch.
As with the LLVM commit though, we think it'll be more useful for
merging with AArch64 from within the tree.
llvm-svn: 205100
|
llvm-svn: 204827
|
llvm-svn: 203816
|
llvm-svn: 203722
|
llvm-svn: 203715
|
They're already defined in ia32intrin.h, and this would cause including Intrin.h
in 64-bit mode to fail because of conflicting types. Update ms-intrin.cpp to
also run in 64-bit mode to catch things like this.
llvm-svn: 203714
|
Summary:
Our usual definition of max_align_t wouldn't match up with MSVC if it
was used in a template argument.
Reviewers: chandlerc, rsmith, rnk
Reviewed By: chandlerc
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2924
llvm-svn: 202911
|
and one for PCLMUL support. The current immintrin.h header only includes
wmmintrin.h if AES support is enabled. It should include it if either AES or
PCLMUL is enabled (GCC's version of immintrin.h does this).
Patch by John Baldwin!
llvm-svn: 202871
|
llvm-svn: 202792
|
(renamed res to __res)
llvm-svn: 202784
|
llvm-svn: 202778
|
No functional change. It's unclear if the word FIXME is relevant given
that the macro behaves as intended.
llvm-svn: 201920
|
Because GCC incorrectly defines _mm_prefetch to take anything that casts
to void*, people have started using that behavior. The previous patch
that made _mm_prefetch actually take a const char * broke compatibility
with existing code. This update to the patch leaves the macro that
defines _mm_prefetch with the (void*) cast when _MSC_VER is not defined.
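A sketch of why the `(void*)` cast matters, using a macro of the same shape as the `#define` quoted elsewhere in this log. The names `sketch_prefetch` and `sketch_sum` are hypothetical; `__builtin_prefetch` is the GCC/Clang builtin, and its locality argument must be a compile-time constant.

```c
#include <stddef.h>

/* Same shape as the quoted macro: the cast lets any pointer type through. */
#define sketch_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))

/* Sums an array while prefetching ahead. The argument is a const int *,
   which a parameter typed as plain char * would reject without a cast. */
static long sketch_sum(const int *data, size_t n) {
  long total = 0;
  for (size_t i = 0; i < n; ++i) {
    if (i + 16 < n)
      sketch_prefetch(&data[i + 16], 3);
    total += data[i];
  }
  return total;
}
```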
llvm-svn: 201901
|
This breaks backwards compatibility with existing code. Previously, this
was defined as
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
which accepts essentially any pointer. Changing this to char* breaks a
lot of existing code. I tried changing char* to "const void*", which
seems to be the right thing, since per the Intel specification this
should work on essentially any pointer. However, this apparently breaks
Windows compatibility (because of a conflicting declaration in
windows.h).
So, we probably need to #ifdef this based on whether clang is compiling
for Windows. According to Chandler, this might be done by introducing an
additional symbol to a fake type in BuiltinsX86.def and then conditioning
the type expansion on the platform.
llvm-svn: 201775
|
for C99 is '199901L' and we shouldn't be comparing it with anything
else.
Neither of these should have had any impact in practice.
llvm-svn: 201738
|
This patch adds several built-ins that are required for MS
compatibility. _mm_prefetch must be a built-in because it takes a
compile-time constant argument, and our prior approach of using a #define
to the current built-in doesn't work in the presence of a re-declaration
of _mm_prefetch. The others can be obtained by including the Windows
system headers. If a user includes the Windows system headers but not
intrin.h, they still need to work and therefore must be built-in, because
we don't get a chance to implement them in intrin.h in this case.
llvm-svn: 201734
|
This was broken because __has_include_next(...) would not be valid in a
preprocessor condition if __has_include_next is not defined.
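The usual way around this kind of problem is to nest the conditions so that the function-like test is only parsed when the operator exists. The sketch below uses `__has_include` as a stand-in, since `__has_include_next` is meant for use inside headers; the macro `SKETCH_HAVE_STDINT_H` and the function are hypothetical, and this is not necessarily how the commit fixed it.

```c
/* Writing `#if defined(__has_include) && __has_include(<stdint.h>)` would
   still fail on a compiler without the operator, because the whole
   expression must parse. Nesting avoids that: the inner #if sits in a
   skipped group when the outer condition is false. */
#if defined(__has_include)
#  if __has_include(<stdint.h>)
#    define SKETCH_HAVE_STDINT_H 1
#  endif
#endif
#ifndef SKETCH_HAVE_STDINT_H
#  define SKETCH_HAVE_STDINT_H 0
#endif

static int sketch_have_stdint_h(void) { return SKETCH_HAVE_STDINT_H; }
```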
llvm-svn: 201731
|
This definition is not chosen idly. There is an unfortunate reality with
max_align_t -- the specific nature of its definition leaks into the ABI
almost immediately. Because it is part of C11 and C++11 it becomes
essential for it to match with other systems on that ABI. There is an
effort to discourage any further use of this construct as a consequence
-- using max_align_t introduces an immediate ABI problem. We can never
update it to have larger alignment even as the microarchitecture changes
to necessitate higher alignment. =/
The particular definition here exactly matches the ABI of GCC's chosen
::max_align_t definition, for better or worse. This was written with the
help of Richard Smith who was decoding the exact ABI implications of the
selected definition in GCC. Notably, in-register arguments are impacted
by the particular definition chosen. =/
No one is under the illusion that this is a "good" or "useful"
definition of max_align_t, and we are working with the standards
committee to specify a more useful interface to address this need.
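For illustration, a GCC-compatible definition is commonly written as a struct with two over-aligned members. The sketch below assumes that shape; the type and member names are hypothetical and the exact definition in the header may differ.

```c
#include <stddef.h>

/* Sketch of a struct whose alignment is the larger of the preferred
   alignments of long long and long double. The explicit aligned attributes
   matter on targets where a member's in-struct alignment is smaller than
   what __alignof__ reports for the bare type. */
typedef struct {
  long long sketch_ll __attribute__((__aligned__(__alignof__(long long))));
  long double sketch_ld
      __attribute__((__aligned__(__alignof__(long double))));
} sketch_max_align_t;
```

Because the type is a struct rather than a typedef of a scalar, it is passed and mangled as a struct, which is one way the choice of definition leaks into the ABI.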
llvm-svn: 201729
|
The two identical implementations of __cpuid for X86 / X86_64 were
leftovers from my first iteration on the patch that implemented it.
llvm-svn: 200568
|
This makes sure that the ms-intrin.cpp test passes by providing
a mock setjmp.h as a test input.
llvm-svn: 200344
|
llvm-svn: 200343
|
This failed the ms-intrin.cpp test.
This reverts commit r200237.
This also comments out the _setjmpex declaration for now so that
intrin.h will work on x64 targets.
llvm-svn: 200243
|
Adds an implementation for _InterlockedCompareExchangePointer() and
__faststorefence().
Patch by David Ziman!
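A sketch of the semantics these two intrinsics provide, written with the GCC/Clang `__atomic` builtins. The function names are hypothetical, and the fence is a portable stand-in; the patch's actual implementation may use different primitives.

```c
/* Compare-and-swap on a pointer slot. Returns the value *dest held before
   the call; *dest becomes `exchange` only if it was equal to `comparand`.
   On failure the builtin writes the current value into `comparand`, so
   returning it yields the old value in both cases. */
static void *sketch_compare_exchange_pointer(void *volatile *dest,
                                             void *exchange,
                                             void *comparand) {
  __atomic_compare_exchange_n(dest, &comparand, exchange, 0,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return comparand;
}

/* Full fence used here as a stand-in for a store fence (assumption). */
static void sketch_store_fence(void) {
  __atomic_thread_fence(__ATOMIC_SEQ_CST);
}
```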
llvm-svn: 200239
|
This fixes an error on our _setjmpex declaration for 64-bit code and
allows us to declare _setjmp for 32-bit code.
llvm-svn: 200237
|
This avoids warnings visible with -Wsystem-headers.
llvm-svn: 200235
|
llvm-svn: 200061
|
and remove duplicate declarations.
llvm-svn: 199992
|
Differential Revision: http://llvm-reviews.chandlerc.com/D2606
llvm-svn: 199958
|
The declarations seem correct, but the definitions were using
chars instead of shorts.
llvm-svn: 199923
|
LLVM_*_OUTPUT_INTDIR should be available everywhere. It was my mistake when I introduced INTDIR stuff.
llvm-svn: 199597
|
The _cpuid() implementation is the same as in lib/Headers/cpuid.h
with the parameter names adjusted to match the interface.
_xgetbv just does what the Intel manual says.
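As a sketch of what a CPUID query through cpuid.h looks like, the example below reads the vendor string from leaf 0 with `__get_cpuid`. The function name `sketch_cpu_vendor` is hypothetical, and the non-x86 branch is an assumption added so the example compiles on other targets.

```c
#include <string.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Writes the CPU vendor string (up to 12 characters plus NUL) into out.
   Returns 1 on success, 0 if leaf 0 is unavailable or the target is not
   x86. Leaf 0 reports the vendor bytes in EBX, EDX, ECX order. */
static int sketch_cpu_vendor(char out[13]) {
  memset(out, 0, 13);
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
    return 0;
  memcpy(out + 0, &ebx, 4);
  memcpy(out + 4, &edx, 4);
  memcpy(out + 8, &ecx, 4);
  return 1;
#else
  return 0;
#endif
}
```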
Differential Revision: http://llvm-reviews.chandlerc.com/D2564
llvm-svn: 199439
|
${BINARY_DIR}/${BUILD_MODE}/(bin|lib)
We have been seeing a nasty directory layout with CMake multi-config generators, such as:
bin/Release/clang.exe
lib/clang/3.x/...
lib/Release/clang/3.x/.. (duplicated)
Move to a layout similar to autoconf's:
Release/bin/clang.exe
Release/lib/clang/3.x/...
Checked on Visual Studio 10. Could you please confirm my change on Xcode (and other multi-config builders)?
Note: Don't set the CMAKE_*_OUTPUT_DIRECTORY variables any more, or certain builders, for example msbuild.exe, would be confused.
llvm-svn: 198205
|
${CMAKE_CURRENT_BINARY_DIR}/arm_neon.h, instead of copied arm_neon.h.
llvm-svn: 197852
|
staged yet.
llvm-svn: 197441
|