path: root/clang/test/CodeGen
Commit message [Author, Date, Files changed, Lines -deleted/+added]
* fix PR7280 by making the warning on code like this: [Chris Lattner, 2010-07-11, 1 file, -1/+1]
    int test1() { return; }
  default to an error.
  llvm-svn: 108108
* allow this to pass on 32-bit hosts. [Chris Lattner, 2010-07-08, 1 file, -2/+2]
  llvm-svn: 107845
* fix the clang side of PR7437: EmitAggregateCopy was not producing a memcpy
  with the right address spaces because of two places in it doing casts of
  the arguments to i8, one of which didn't preserve the address space. There
  is also an optimizer bug here. [Chris Lattner, 2010-07-08, 1 file, -0/+12]
  llvm-svn: 107842
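  A sketch of the kind of code affected (hypothetical, not the committed
  test; assumes a target where the Clang address_space attribute is
  meaningful):

    typedef struct { int x[4]; } S;
    typedef __attribute__((address_space(1))) S AS1_S;

    /* The aggregate copy lowers to llvm.memcpy; both pointer operands
       must keep addrspace(1) rather than being cast to plain i8*. */
    void copy(AS1_S *dst, AS1_S *src) { *dst = *src; }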
* filecheckize this test. [Chris Lattner, 2010-07-08, 1 file, -5/+17]
  llvm-svn: 107841
* Don't consider casted non-global pointers to be evaluatable. [John McCall, 2010-07-07, 1 file, -0/+6]
  Fixes rdar://problem/8154689
  llvm-svn: 107755
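  A rough illustration of the rule (a hypothetical sketch, not the committed
  test):

    int f(void) {
      int local;
      /* The address of a local, even laundered through casts, is not a
         compile-time constant; the evaluator must not fold it. */
      return (int)(long)&local != 0;
    }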
* in the "coerce" case, the ABI handling code ends up making theChris Lattner2010-07-051-1/+9
| | | | | | | | | alloca for an argument. Make sure the argument gets the proper decl alignment, which may be different than the type alignment. This fixes PR7567 llvm-svn: 107627
* fix PR7564, a case where the bitfield struct init code wasn't handling
  array padding elements right. [Chris Lattner, 2010-07-05, 1 file, -0/+8]
  llvm-svn: 107621
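  A sketch of the construct involved (hypothetical; assumes a layout where
  Clang models the struct's padding as an [N x i8] array element in the
  constant initializer):

    struct S {
      char c;          /* padding after 'c' is emitted as an i8 array */
      unsigned b : 4;  /* bitfield storage follows the padding */
    };
    struct S s = { 'x', 2 };  /* constant init must fill the padding too */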
* fix rdar://8147692 - yet another crash due to my abi work. [Chris Lattner, 2010-07-01, 1 file, -0/+15]
  llvm-svn: 107387
* Driver/IRgen: Add support for -momit-leaf-frame-pointer. [Daniel Dunbar, 2010-07-01, 1 file, -0/+29]
  llvm-svn: 107367
* Reapply: [Chris Lattner, 2010-06-30, 1 file, -0/+13]
  r107173, "fix PR7519: after thrashing around and remembering how all this stuff"
  r107216, "fix PR7523, which was caused by the ABI code calling ConvertType instead"
  This includes a fix to make ConvertTypeForMem handle the "recursive" case,
  and call it as such when lowering function types which have an indirect
  result.
  llvm-svn: 107310
* Revert r107173, "fix PR7519: after thrashing around and remembering how
  all this stuff", it broke bootstrap. [Daniel Dunbar, 2010-06-30, 1 file, -13/+0]
  llvm-svn: 107232
* IRgen: Assignment to Objective-C properties shouldn't reload the value
  (which would trigger an extra method call). [Daniel Dunbar, 2010-06-29, 2 files, -1/+49]
  - While in the area, I also changed Clang to not emit an unnecessary load
    from 'x' in cases like 'y = (x = 1)'.
  llvm-svn: 107210
* tests: Fix test to not depend on instruction names. [Daniel Dunbar, 2010-06-29, 1 file, -2/+3]
  llvm-svn: 107186
* fix PR7519: after thrashing around and remembering how all this stuff
  works, the fix is quite simple: just make sure to call ConvertTypeRecursive
  when the function type being lowered is in the midst of ConvertType. [Chris Lattner, 2010-06-29, 1 file, -0/+13]
  llvm-svn: 107173
* Change X86_64ABIInfo to have ASTContext and TargetData ivars to avoid
  passing ASTContext down through all the methods it has. [Chris Lattner, 2010-06-29, 1 file, -0/+10]

  When classifying an argument, or argument piece, as INTEGER, check to see
  if we have a pointer at exactly the same offset in the preferred type. If
  so, use that pointer type instead of i64. This allows us to compile a
  function taking a stringref into something like this:

    define i8* @foo(i64 %D.coerce0, i8* %D.coerce1) nounwind ssp {
    entry:
      %D = alloca %struct.DeclGroup, align 8  ; <%struct.DeclGroup*> [#uses=4]
      %0 = getelementptr %struct.DeclGroup* %D, i32 0, i32 0  ; <i64*> [#uses=1]
      store i64 %D.coerce0, i64* %0
      %1 = getelementptr %struct.DeclGroup* %D, i32 0, i32 1  ; <i8**> [#uses=1]
      store i8* %D.coerce1, i8** %1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp  ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1  ; <i8**> [#uses=1]
      %tmp3 = load i8** %tmp2  ; <i8*> [#uses=1]
      %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1  ; <i8*> [#uses=1]
      ret i8* %add.ptr
    }

  instead of this:

    define i8* @foo(i64 %D.coerce0, i64 %D.coerce1) nounwind ssp {
    entry:
      %D = alloca %struct.DeclGroup, align 8  ; <%struct.DeclGroup*> [#uses=3]
      %0 = insertvalue %0 undef, i64 %D.coerce0, 0  ; <%0> [#uses=1]
      %1 = insertvalue %0 %0, i64 %D.coerce1, 1  ; <%0> [#uses=1]
      %2 = bitcast %struct.DeclGroup* %D to %0*  ; <%0*> [#uses=1]
      store %0 %1, %0* %2, align 1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp  ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1  ; <i8**> [#uses=1]
      %tmp3 = load i8** %tmp2  ; <i8*> [#uses=1]
      %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1  ; <i8*> [#uses=1]
      ret i8* %add.ptr
    }

  This implements rdar://7375902 - [codegen quality] clang x86-64 ABI
  lowering code punishing StringRef
  llvm-svn: 107123
* add IR names to coerced arguments. [Chris Lattner, 2010-06-29, 1 file, -4/+4]
  llvm-svn: 107105
* Change CGCall to handle the "coerce" case where the coerce-to type is a
  FCA to pass each of the elements as individual scalars. [Chris Lattner, 2010-06-28, 1 file, -1/+1]

  This produces code fast isel is less likely to reject and is easier on the
  optimizers. For example, before we would compile:

    struct DeclGroup { long NumDecls; char * Y; };
    char * foo(DeclGroup D) { return D.NumDecls+D.Y; }

  to:

    %struct.DeclGroup = type { i64, i64 }

    define i64 @_Z3foo9DeclGroup(%struct.DeclGroup) nounwind {
    entry:
      %D = alloca %struct.DeclGroup, align 8  ; <%struct.DeclGroup*> [#uses=3]
      store %struct.DeclGroup %0, %struct.DeclGroup* %D, align 1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp  ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1  ; <i64*> [#uses=1]
      %tmp3 = load i64* %tmp2  ; <i64> [#uses=1]
      %add = add nsw i64 %tmp1, %tmp3  ; <i64> [#uses=1]
      ret i64 %add
    }

  Now we get:

    %0 = type { i64, i64 }
    %struct.DeclGroup = type { i64, i8* }

    define i8* @_Z3foo9DeclGroup(i64, i64) nounwind {
    entry:
      %D = alloca %struct.DeclGroup, align 8  ; <%struct.DeclGroup*> [#uses=3]
      %2 = insertvalue %0 undef, i64 %0, 0  ; <%0> [#uses=1]
      %3 = insertvalue %0 %2, i64 %1, 1  ; <%0> [#uses=1]
      %4 = bitcast %struct.DeclGroup* %D to %0*  ; <%0*> [#uses=1]
      store %0 %3, %0* %4, align 1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp  ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1  ; <i8**> [#uses=1]
      %tmp3 = load i8** %tmp2  ; <i8*> [#uses=1]
      %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1  ; <i8*> [#uses=1]
      ret i8* %add.ptr
    }

  Elimination of the FCA inside the function is still-to-come.
  llvm-svn: 107099
* X86-64: pass/return structs of float/int as float/i32 instead of
  double/i64 to make the code generated for the ABI cleaner. [Chris Lattner, 2010-06-28, 1 file, -3/+4]

  Passing in the low part of a double is the same as passing in a float. For
  example, we now compile:

    struct DeclGroup { float NumDecls; };
    float foo(DeclGroup D);
    void bar(DeclGroup *D) { foo(*D); }

  into:

    %struct.DeclGroup = type { float }

    define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
    entry:
      %D.addr = alloca %struct.DeclGroup*, align 8  ; <%struct.DeclGroup**> [#uses=2]
      %agg.tmp = alloca %struct.DeclGroup, align 4  ; <%struct.DeclGroup*> [#uses=2]
      store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
      %tmp = load %struct.DeclGroup** %D.addr  ; <%struct.DeclGroup*> [#uses=1]
      %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8*  ; <i8*> [#uses=1]
      %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
      %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0  ; <float*> [#uses=1]
      %0 = load float* %coerce.dive, align 1  ; <float> [#uses=1]
      %call = call float @_Z3foo9DeclGroup(float %0)  ; <float> [#uses=0]
      ret void
    }

  instead of:

    %struct.DeclGroup = type { float }

    define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
    entry:
      %D.addr = alloca %struct.DeclGroup*, align 8  ; <%struct.DeclGroup**> [#uses=2]
      %agg.tmp = alloca %struct.DeclGroup, align 4  ; <%struct.DeclGroup*> [#uses=2]
      %tmp3 = alloca double  ; <double*> [#uses=2]
      store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
      %tmp = load %struct.DeclGroup** %D.addr  ; <%struct.DeclGroup*> [#uses=1]
      %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8*  ; <i8*> [#uses=1]
      %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
      %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0  ; <float*> [#uses=1]
      %0 = bitcast double* %tmp3 to float*  ; <float*> [#uses=1]
      %1 = load float* %coerce.dive  ; <float> [#uses=1]
      store float %1, float* %0, align 1
      %2 = load double* %tmp3  ; <double> [#uses=1]
      %call = call float @_Z3foo9DeclGroup(double %2)  ; <float> [#uses=0]
      ret void
    }

  which is this machine code (at -O0):

    __Z3barP9DeclGroup:
      subq  $24, %rsp
      movq  %rdi, 16(%rsp)
      movq  16(%rsp), %rdi
      leaq  8(%rsp), %rax
      movl  (%rdi), %ecx
      movl  %ecx, (%rax)
      movss 8(%rsp), %xmm0
      callq __Z3foo9DeclGroup
      addq  $24, %rsp
      ret

  vs this:

    __Z3barP9DeclGroup:
      subq  $24, %rsp
      movq  %rdi, 16(%rsp)
      movq  16(%rsp), %rdi
      leaq  8(%rsp), %rax
      movl  (%rdi), %ecx
      movl  %ecx, (%rax)
      movss 8(%rsp), %xmm0
      movss %xmm0, (%rsp)
      movsd (%rsp), %xmm0
      callq __Z3foo9DeclGroup
      addq  $24, %rsp
      ret

  At -O3, it is the difference between this now:

    __Z3barP9DeclGroup:
      movss (%rdi), %xmm0
      jmp   __Z3foo9DeclGroup  # TAILCALL

  vs this before:

    __Z3barP9DeclGroup:
      movl  (%rdi), %eax
      movd  %rax, %xmm0
      jmp   __Z3foo9DeclGroup  # TAILCALL

  llvm-svn: 107048
* Have __func__ and siblings point to block's implementation function
  name. Fixes radar 7860965. [Fariborz Jahanian, 2010-06-28, 1 file, -0/+18]
  llvm-svn: 107044
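  A minimal sketch of the behavior (hypothetical; assumes blocks are enabled
  with -fblocks):

    const char *f(void) {
      const char *(^b)(void) = ^{
        /* __func__ inside a block now names the block's implementation
           function, not the enclosing 'f'. */
        return __func__;
      };
      return b();
    }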
* tweak test to pass on windows [Chris Lattner, 2010-06-28, 1 file, -1/+1]
  llvm-svn: 107040
* Fix UnitTests/2004-02-02-NegativeZero.c, which regressed when I broke
  negate of FP values. [Chris Lattner, 2010-06-28, 1 file, -0/+7]
  llvm-svn: 107019
* If coercing something from int or pointer type to int or pointer type
  (potentially after unwrapping it from a struct), do it without going
  through memory. [Chris Lattner, 2010-06-27, 1 file, -2/+1]

  We now compile:

    struct DeclGroup { unsigned NumDecls; };
    int foo(DeclGroup D) { return D.NumDecls; }

  into:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4  ; <%struct.DeclGroup*> [#uses=2]
      %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0  ; <i32*> [#uses=1]
      %coerce.val.ii = trunc i64 %0 to i32  ; <i32> [#uses=1]
      store i32 %coerce.val.ii, i32* %coerce.dive
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i32*> [#uses=1]
      %tmp1 = load i32* %tmp  ; <i32> [#uses=1]
      ret i32 %tmp1
    }

  instead of:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4  ; <%struct.DeclGroup*> [#uses=2]
      %tmp = alloca i64  ; <i64*> [#uses=2]
      %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0  ; <i32*> [#uses=1]
      store i64 %0, i64* %tmp
      %1 = bitcast i64* %tmp to i32*  ; <i32*> [#uses=1]
      %2 = load i32* %1, align 1  ; <i32> [#uses=1]
      store i32 %2, i32* %coerce.dive
      %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i32*> [#uses=1]
      %tmp2 = load i32* %tmp1  ; <i32> [#uses=1]
      ret i32 %tmp2
    }

  ... which is quite a bit less terrifying.
  llvm-svn: 106975
* Same patch as the previous on the store side. [Chris Lattner, 2010-06-27, 1 file, -3/+3]

  Before we compiled this:

    struct DeclGroup { unsigned NumDecls; };
    int foo(DeclGroup D) { return D.NumDecls; }

  to:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4  ; <%struct.DeclGroup*> [#uses=2]
      %tmp = alloca i64  ; <i64*> [#uses=2]
      store i64 %0, i64* %tmp
      %1 = bitcast i64* %tmp to %struct.DeclGroup*  ; <%struct.DeclGroup*> [#uses=1]
      %2 = load %struct.DeclGroup* %1, align 1  ; <%struct.DeclGroup> [#uses=1]
      store %struct.DeclGroup %2, %struct.DeclGroup* %D
      %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i32*> [#uses=1]
      %tmp2 = load i32* %tmp1  ; <i32> [#uses=1]
      ret i32 %tmp2
    }

  which caused fast isel bailouts due to the FCA load/store of %2. Now we
  generate this just blissful code:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4  ; <%struct.DeclGroup*> [#uses=2]
      %tmp = alloca i64  ; <i64*> [#uses=2]
      %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0  ; <i32*> [#uses=1]
      store i64 %0, i64* %tmp
      %1 = bitcast i64* %tmp to i32*  ; <i32*> [#uses=1]
      %2 = load i32* %1, align 1  ; <i32> [#uses=1]
      store i32 %2, i32* %coerce.dive
      %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0  ; <i32*> [#uses=1]
      %tmp2 = load i32* %tmp1  ; <i32> [#uses=1]
      ret i32 %tmp2
    }

  This avoids fastisel bailing out and is groundwork for a future patch.
  This reduces bailouts on CGStmt.ll to 911 from 935.
  llvm-svn: 106974
* merge two tests. [Chris Lattner, 2010-06-27, 2 files, -4/+6]
  llvm-svn: 106971
* Change IR generation for return (in the simple case) to avoid doing silly
  load/store nonsense in the epilog. [Chris Lattner, 2010-06-27, 4 files, -79/+80]

  For example, for:

    int foo(int X) { int A[100]; return A[X]; }

  we used to generate:

    %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom  ; <i32*> [#uses=1]
    %tmp1 = load i32* %arrayidx  ; <i32> [#uses=1]
    store i32 %tmp1, i32* %retval
    %0 = load i32* %retval  ; <i32> [#uses=1]
    ret i32 %0
    }

  which codegen'd to this code:

    _foo:                    ## @foo
    ## BB#0:                 ## %entry
      subq $408, %rsp        ## imm = 0x198
      movl %edi, 400(%rsp)
      movl 400(%rsp), %edi
      movslq %edi, %rax
      movl (%rsp,%rax,4), %edi
      movl %edi, 404(%rsp)
      movl 404(%rsp), %eax
      addq $408, %rsp        ## imm = 0x198
      ret

  Now we generate:

    %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom  ; <i32*> [#uses=1]
    %tmp1 = load i32* %arrayidx  ; <i32> [#uses=1]
    ret i32 %tmp1
    }

  and:

    _foo:                    ## @foo
    ## BB#0:                 ## %entry
      subq $408, %rsp        ## imm = 0x198
      movl %edi, 404(%rsp)
      movl 404(%rsp), %edi
      movslq %edi, %rax
      movl (%rsp,%rax,4), %eax
      addq $408, %rsp        ## imm = 0x198
      ret

  This actually does matter, cutting out 2000 lines of IR from CGStmt.ll for
  example. Another interesting effect is that altivec.h functions which are
  dead now get dce'd by the inliner. Hence all the changes to
  builtins-ppc-altivec.c to ensure the calls aren't dead.
  llvm-svn: 106970
* Implement rdar://7530813 - collapse multiple GEP instructions in IRgen [Chris Lattner, 2010-06-26, 4 files, -15/+13]

  This avoids generating two gep's for common array operations. Before we
  would generate something like:

    %tmp = load i32* %X.addr  ; <i32> [#uses=1]
    %arraydecay = getelementptr inbounds [100 x i32]* %A, i32 0, i32 0  ; <i32*> [#uses=1]
    %arrayidx = getelementptr inbounds i32* %arraydecay, i32 %tmp  ; <i32*> [#uses=1]
    %tmp1 = load i32* %arrayidx  ; <i32> [#uses=1]

  Now we generate:

    %tmp = load i32* %X.addr  ; <i32> [#uses=1]
    %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i32 %tmp  ; <i32*> [#uses=1]
    %tmp1 = load i32* %arrayidx  ; <i32> [#uses=1]

  Less IR is better at -O0.
  llvm-svn: 106966
* fix inc/dec to honor -fwrapv and -ftrapv, implementing PR7426. [Chris Lattner, 2010-06-26, 1 file, -5/+17]
  llvm-svn: 106962
* Fix unary minus to trap on overflow with -ftrapv, refactoring binop code
  so we can use it from VisitUnaryMinus. [Chris Lattner, 2010-06-26, 1 file, -1/+1]
  llvm-svn: 106957
* Implement support for -fwrapv, rdar://7221421 [Chris Lattner, 2010-06-26, 4 files, -18/+38]
  As part of this, pull together trapv handling into the same enum. This
  also adds support for NSW multiplies. This also makes PCH disagreement on
  overflow behavior silent, since it really doesn't matter except for
  warnings and codegen (no macros get defined etc).
  llvm-svn: 106956
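  A small example of the difference in emitted IR (a sketch, not the
  committed test):

    /* default:  'mul nsw i32' (signed overflow is undefined)
       -fwrapv:  plain 'mul i32' (signed overflow wraps)
       -ftrapv:  an overflow-checked multiply that traps on overflow */
    int square(int x) { return x * x; }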
* implement rdar://7432000 - signed negate should codegen as NSW. [Chris Lattner, 2010-06-26, 2 files, -4/+17]
  While I'm in there, adjust pointer to member adjustments as well.
  llvm-svn: 106955
* A bug I've introduced in STDIN handling surfaced a few broken tests; fix
  them. [Benjamin Kramer, 2010-06-25, 2 files, -3/+3]
  Lexer/hexfloat.cpp is now XFAIL'd; I'd appreciate it if someone could look
  into it.
  llvm-svn: 106840
* implement support for -finstrument-functions, patch by Nelson Elhage! [Chris Lattner, 2010-06-22, 1 file, -0/+18]
  llvm-svn: 106507
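  For reference, the GCC-compatible hooks this flag emits calls to on
  function entry and exit (usage sketch; hook implementations are supplied
  by the user and should themselves be exempt from instrumentation):

    /* Build with: clang -finstrument-functions file.c */
    void __cyg_profile_func_enter(void *this_fn, void *call_site);
    void __cyg_profile_func_exit(void *this_fn, void *call_site);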
* More AltiVec support. [Anton Korobeynikov, 2010-06-19, 1 file, -113/+1014]
  Patch by Anton Yartsev!
  llvm-svn: 106387
* Merge the "regparm" attribute from a previous declaration of aDouglas Gregor2010-06-181-0/+5
| | | | | | function to redeclarations of that function. Fixes PR7025. llvm-svn: 106317
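  A sketch of the pattern PR7025 covers (hypothetical, x86-32):

    void f(int a, int b) __attribute__((regparm(3)));
    /* The redefinition must inherit regparm(3) so both callers and the
       definition pass arguments in registers, not on the stack. */
    void f(int a, int b) { }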
* Change the test for which ABI/CC to use on ARM to be based on the
  environment (the last argument of the triple). [Rafael Espindola, 2010-06-16, 1 file, -1/+1]
  llvm-svn: 106131
* Add a new test for my previous patch. [Rafael Espindola, 2010-06-16, 1 file, -0/+18]
  llvm-svn: 106120
* Fix tests that I missed from my previous commit. [Rafael Espindola, 2010-06-16, 2 files, -34/+34]
  llvm-svn: 106118
* Enable basic testing of __builtin_fpclassify. [Benjamin Kramer, 2010-06-14, 1 file, -2/+3]
  llvm-svn: 105937
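  For reference, the builtin's fixed argument order (usage sketch):

    #include <math.h>
    int classify(double x) {
      /* Returns one of the first five arguments depending on the class
         of x: NaN, infinite, normal, subnormal, or zero, in that order. */
      return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL,
                                  FP_SUBNORMAL, FP_ZERO, x);
    }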
* Fix the constant evaluator for AltiVec-style vector literals so that the
  vector is filled with the given constant; we were just initializing the
  first element. [John McCall, 2010-06-11, 1 file, -0/+4]
  llvm-svn: 105824
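  The construct in question (a sketch; assumes AltiVec vector support is
  enabled):

    /* An AltiVec-style literal with a single scalar splats it into every
       element: v becomes (1, 1, 1, 1), not (1, 0, 0, 0). */
    vector int v = (vector int)(1);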
* Add a test to the previous commit. [Rafael Espindola, 2010-06-08, 1 file, -1/+7]
  llvm-svn: 105596
* Correctly align large arrays in x86-64. This fixes PR5599. [Rafael Espindola, 2010-06-04, 1 file, -2/+2]
  llvm-svn: 105500
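  The ABI rule involved, illustrated (a sketch of the fixed behavior):

    /* The x86-64 psABI gives array variables of 16 bytes or more an
       alignment of at least 16, regardless of the element type. */
    char buf[32];  /* expected: align 16, not align 1 */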
* Preserve more information from a block's original function declarator, if
  one was given. Remove some unnecessary accounting from BlockScopeInfo.
  Handle typedef'ed function types until such time as we decide not to. [John McCall, 2010-06-04, 1 file, -0/+6]
  llvm-svn: 105478
* An empty enum in C is now an error, matching gcc's behavior.
  (radar 8040068) [Fariborz Jahanian, 2010-05-28, 1 file, -1/+1]
  llvm-svn: 105011
* Fix a miscompile of wchar pascal strings. (radar 8020384) [Fariborz Jahanian, 2010-05-28, 1 file, -0/+31]
  llvm-svn: 104996
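  The construct involved (a hypothetical sketch; assumes -fpascal-strings):

    /* "\p" marks a Pascal string: element 0 holds the length. For a wide
       string the length must be counted in wide characters, not bytes. */
    const wchar_t *s = L"\phello";  /* s[0] == 5 */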
* Enable the implementation of __builtin_setjmp and __builtin_longjmp. Not
  all LLVM backends support these yet. [John McCall, 2010-05-27, 1 file, -0/+7]
  llvm-svn: 104867
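  A usage sketch (these builtins have a fixed, non-portable contract):

    void *buf[5];  /* __builtin_setjmp takes a five-word buffer, not a jmp_buf */

    int demo(void) {
      if (__builtin_setjmp(buf) == 0)
        __builtin_longjmp(buf, 1);  /* the value argument must be 1 */
      return 0;                     /* reached after the longjmp */
    }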
* Fix testsuite for blocks mangling change [Douglas Gregor, 2010-05-25, 1 file, -1/+1]
  llvm-svn: 104618
* Implement codegen for __builtin_isnormal. [Benjamin Kramer, 2010-05-19, 1 file, -0/+8]
  llvm-svn: 104118
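  Usage sketch:

    /* Nonzero iff x is normal: not zero, subnormal, infinite, or NaN. */
    int is_normal(double x) { return __builtin_isnormal(x); }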
* Add missing test case, provided by Steven Watanabe. [Douglas Gregor, 2010-05-18, 1 file, -0/+50]
  llvm-svn: 104037
* Add support for Microsoft's __thiscall, from Steven Watanabe! [Douglas Gregor, 2010-05-18, 1 file, -13/+24]
  llvm-svn: 104026
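  An illustrative declaration (a sketch; __thiscall is an x86-32 convention
  that passes the first argument, normally C++'s 'this', in ECX):

    /* GNU attribute spelling, usable from C on x86-32: */
    void __attribute__((thiscall)) method(void *self, int arg);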
* PR7117: Make sure we don't lose the calling convention for K&R-style
  definitions. [Eli Friedman, 2010-05-17, 1 file, -0/+6]
  llvm-svn: 103932
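  The pattern PR7117 guards against (hypothetical sketch, x86-32):

    void f(int x) __attribute__((stdcall));
    /* The K&R-style definition below must keep the stdcall convention
       from the earlier prototype instead of reverting to the default. */
    void f(x) int x; { }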