| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
LLVM IR recently added a Type parameter to the byval Attribute, so that
when pointers become opaque and no longer have an element type the
information will still be present in IR.
For now the Type parameter is optional (which is why Clang didn't need
this change at the time), but it will become mandatory soon.
llvm-svn: 362652
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change is step three in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment()
and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
Reviewers: rjmccall
Subscribers: jyknight, nemanjai, nhaehnle, javed.absar, sbc100, aheejin, kbarton, fedor.sergeev, cfe-commits
Differential Revision: https://reviews.llvm.org/D41677
llvm-svn: 323617
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Step 1).
Summary:
Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the Clang tests for this change.
The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.
This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.
llvm-svn: 322964
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In C++ all variables are in default address space. Previously change has been
made to cast automatic variables to default address space. However that is
not sufficient since all temporary variables need to be casted to default
address space.
This patch casts all temporary variables to default address space except those
for passing indirect arguments since they are only used for load/store.
This patch only affects target having non-zero alloca address space.
Differential Revision: https://reviews.llvm.org/D33706
llvm-svn: 305711
|
|
|
|
|
|
|
|
|
|
|
|
| |
We processed unnamed bitfields after our logic for non-vector field
elements in records larger than 128 bits. The vector logic would
determine that the bit-field disqualifies the record from occupying a
register despite the unnamed bit-field not participating in the record
size nor its alignment.
N.B. This behavior matches GCC and ICC.
llvm-svn: 278656
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An __m512 vector type wrapped in a structure should be passed in a
vector register.
Our prior implementation was based on a draft version of the psABI.
This fixes PR28975.
N.B. The update to the ABI was made here:
https://github.com/hjl-tools/x86-psABI/commit/30f9c9
llvm-svn: 278655
|
|
|
|
|
|
|
|
|
|
| |
For compatibility with GCC, classify __m64 as SSE.
However, clang is a platform compiler for certain targets; retain our
old behavior on those targets: classify __m64 as integer.
This fixes PR26832.
llvm-svn: 262688
|
|
|
|
|
|
|
|
|
|
|
| |
Fix calculating address of arguments larger than 32 bit on stack for
variadic functions (rounding up address to alignment) on ppc32 architecture.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D14871
llvm-svn: 254670
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r253512.
This likely broke the bots in:
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787
llvm-svn: 253542
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow on from a similar LLVM commit: r253511.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
The only code change to clang is hidden in CGBuilder.h which now passes
both dest and source alignment to IRBuilder, instead of taking the minimum of
dest and source alignments.
Reviewed by Hal Finkel.
llvm-svn: 253512
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As specified in the SysV AVX512 ABI drafts. It follows the same scheme
as AVX2:
Arguments of type __m512 are split into eight eightbyte chunks.
The least significant one belongs to class SSE and all the others
to class SSEUP.
This also means we change the OpenMP SIMD default alignment on AVX512.
Based on r240337.
Differential Revision: http://reviews.llvm.org/D9894
llvm-svn: 240338
|
|
|
|
|
|
|
| |
We used to only check the differing tests on AVX, but we might
as well check all of them.
llvm-svn: 237818
|
|
|
|
|
|
| |
(migration for recent LLVM change to textual IR for calls)
llvm-svn: 235147
|
|
|
|
| |
llvm-svn: 230795
|
|
|
|
| |
llvm-svn: 230783
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pass and return in registers
This is a patch for PR22563 ( http://llvm.org/bugs/show_bug.cgi?id=22563 ).
We were not correctly unwrapping a single 256-bit AVX vector that was defined as an array of 1 inside a struct.
We would generate a <4 x float> param/return value instead of <8 x float> and lose half of the vector.
Differential Revision: http://reviews.llvm.org/D7614
llvm-svn: 229408
|
|
|
|
|
|
| |
tests fail.
llvm-svn: 188447
|
|
|
|
|
|
|
|
| |
AVX vectors when AVX is turned on.
Fixes <rdar://problem/10513611>.
llvm-svn: 183813
|
|
|
|
| |
llvm-svn: 183720
|
|
|
|
| |
llvm-svn: 183590
|
|
|
|
| |
llvm-svn: 175253
|
|
|
|
|
|
|
| |
This update coincides with r174110. That change ordered the attributes
alphabetically.
llvm-svn: 174111
|
|
|
|
| |
llvm-svn: 173762
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were emitting calls to blocks as if all arguments were
required --- i.e. with signature (A,B,C,D,...) rather than
(A,B,...). This patch fixes that and accounts for the
implicit block-context argument as a required argument.
In addition, this patch changes the function type under which
we call unprototyped functions on platforms like x86-64 that
guarantee compatibility of variadic functions with unprototyped
function types; previously we would always call such functions
under the LLVM type T (...)*, but now we will call them under
the type T (A,B,C,D,...)*. This last change should have no
material effect except for making the type conventions more
explicit; it was a side-effect of the most convenient implementation.
llvm-svn: 169588
|
|
|
|
|
|
| |
rdar://12723368
llvm-svn: 168821
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the original parameter or return type.
Since we do not accurately represent the data fields of a union, we should not
directly load or store a union type.
As an exmple, if we have i8,i8, i32, i32 as one field type and i32,i32 as
another field type, the first field type will be chosen to represent the union.
If we load with the union's type, the 3rd byte and the 4th byte will be skipped.
rdar://12723368
llvm-svn: 168820
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- We do this when it is easy to determine that the backend will pass them on
the stack properly by itself.
Currently LLVM codegen is really bad in some cases with byval, for example, on
the test case here (which is derived from Sema code, which likes to pass
SourceLocations around)::
struct s47 { unsigned a; };
void f47(int,int,int,int,int,int,struct s47);
void test47(int a, struct s47 b) { f47(a, a, a, a, a, a, b); }
we used to emit code like this::
...
movl %esi, -8(%rbp)
movl -8(%rbp), %ecx
movl %ecx, (%rsp)
...
to handle moving the struct onto the stack, which is just appalling.
Now we generate::
movl %esi, (%rsp)
which seems better, no?
llvm-svn: 152462
|
|
|
|
|
|
| |
use byval so we're sure the backend does the right thing. Fixes va_arg with illegal vectors and an obscure ABI mismatch with __m64 vectors.
llvm-svn: 145652
|
|
|
|
|
|
| |
the arguments is an AVX vector.
llvm-svn: 145574
|
|
|
|
|
|
|
|
| |
when they are greater than 128 bits. This was incorrectly coercing things like long3 into a double2.
Add test case.
llvm-svn: 145312
|
|
|
|
|
|
| |
Fixes <rdar://problem/10463281>.
llvm-svn: 144966
|
|
|
|
|
|
|
|
|
|
| |
emit call results into potentially aliased slots. This allows us
to properly mark indirect return slots as noalias, at the cost
of requiring an extra memcpy when assigning an aggregate call
result into a l-value. It also brings us into compliance with
the x86-64 ABI.
llvm-svn: 138599
|
|
|
|
|
|
| |
failures.
llvm-svn: 135091
|
|
|
|
| |
llvm-svn: 135004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
us to compile
stuff like this:
typedef struct {
int x, y, z;
} foo_t;
foo_t g;
into:
%"struct.<anonymous>" = type { i32, i32, i32 }
we now get:
%struct.foo_t = type { i32, i32, i32 }
This doesn't change the behavior of the compiler, but makes the IR much easier to read.
llvm-svn: 134969
|
|
|
|
|
|
| |
passing.
llvm-svn: 134951
|
|
|
|
|
|
| |
which is: { <4 x float>, <4 x float> } should continue to go through memory.
llvm-svn: 134946
|
|
|
|
|
|
| |
add one more testcase.
llvm-svn: 134934
|
|
|
|
| |
llvm-svn: 134831
|
|
|
|
| |
llvm-svn: 134765
|
|
|
|
| |
llvm-svn: 134754
|
|
|
|
|
|
| |
The start of some work on getting -mno-mmx working the way we want it to.
llvm-svn: 134300
|
|
|
|
|
|
|
| |
generator will give it something sufficient. This is important because
the mid-level optimizer doesn't know what alignment is required otherwise.
llvm-svn: 131879
|
|
|
|
|
|
|
| |
of which break strict compatibility with previous compilers. Implement
one of them and then immediately opt out on Darwin.
llvm-svn: 129899
|
|
|
|
|
|
|
| |
this fixes rdar://8358475 a failure of the gcc.dg/compat/vector_1 abi
test.
llvm-svn: 112205
|
|
|
|
|
|
| |
this fixes a miscompilation on the included testcase, rdar://8359248
llvm-svn: 112201
|
|
|
|
| |
llvm-svn: 112174
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
as a double in the x86-64 ABI. This allows us to generate much better
code for certain things, e.g.:
_Complex float f32(_Complex float A, _Complex float B) {
return A+B;
}
Used to compile into (look at the integer silliness!):
_f32: ## @f32
## BB#0: ## %entry
movd %xmm1, %rax
movd %eax, %xmm1
movd %xmm0, %rcx
movd %ecx, %xmm0
addss %xmm1, %xmm0
movd %xmm0, %edx
shrq $32, %rax
movd %eax, %xmm0
shrq $32, %rcx
movd %ecx, %xmm1
addss %xmm0, %xmm1
movd %xmm1, %eax
shlq $32, %rax
addq %rdx, %rax
movd %rax, %xmm0
ret
Now we get:
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $16, %xmm2, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm0
addss %xmm1, %xmm0
pshufd $16, %xmm0, %xmm1
movdqa %xmm2, %xmm0
unpcklps %xmm1, %xmm0
ret
and compile stuff like:
extern float _Complex ccoshf( float _Complex ) ;
float _Complex ccosf ( float _Complex z ) {
float _Complex iz;
(__real__ iz) = -(__imag__ z);
(__imag__ iz) = (__real__ z);
return ccoshf(iz);
}
into:
_ccosf: ## @ccosf
## BB#0: ## %entry
pshufd $1, %xmm0, %xmm1
xorps LCPI4_0(%rip), %xmm1
unpcklps %xmm0, %xmm1
movaps %xmm1, %xmm0
jmp _ccoshf ## TAILCALL
instead of:
_ccosf: ## @ccosf
## BB#0: ## %entry
movd %xmm0, %rax
movq %rax, %rcx
shlq $32, %rcx
shrq $32, %rax
xorl $-2147483648, %eax ## imm = 0xFFFFFFFF80000000
addq %rcx, %rax
movd %rax, %xmm0
jmp _ccoshf ## TAILCALL
There is still "stuff to be done" here for the struct case,
but this resolves rdar://6379669 - [x86-64 ABI] Pass and return
_Complex float / double efficiently
llvm-svn: 112111
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
end of a struct. This improves the case when the struct being passed
contains 3 floats, either due to a struct or array of 3 things. Before
we'd generate this IR for the testcase:
define float @bar(double %X.coerce0, double %X.coerce1) nounwind {
entry:
%X = alloca %struct.foof, align 8 ; <%struct.foof*> [#uses=2]
%0 = bitcast %struct.foof* %X to %1* ; <%1*> [#uses=2]
%1 = getelementptr %1* %0, i32 0, i32 0 ; <double*> [#uses=1]
store double %X.coerce0, double* %1
%2 = getelementptr %1* %0, i32 0, i32 1 ; <double*> [#uses=1]
store double %X.coerce1, double* %2
%tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
%tmp1 = load float* %tmp ; <float> [#uses=1]
ret float %tmp1
}
which compiled (with optimization) to:
_bar: ## @bar
## BB#0: ## %entry
movd %xmm1, %rax
movd %eax, %xmm0
ret
Now we produce:
define float @bar(double %X.coerce0, float %X.coerce1) nounwind {
entry:
%X = alloca %struct.foof, align 8 ; <%struct.foof*> [#uses=2]
%0 = bitcast %struct.foof* %X to %0* ; <%0*> [#uses=2]
%1 = getelementptr %0* %0, i32 0, i32 0 ; <double*> [#uses=1]
store double %X.coerce0, double* %1
%2 = getelementptr %0* %0, i32 0, i32 1 ; <float*> [#uses=1]
store float %X.coerce1, float* %2
%tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
%tmp1 = load float* %tmp ; <float> [#uses=1]
ret float %tmp1
}
and:
_bar: ## @bar
## BB#0: ## %entry
movaps %xmm1, %xmm0
ret
llvm-svn: 109776
|
|
|
|
|
|
| |
that Eli pointed out, rdar://8249586
llvm-svn: 109762
|