path: root/clang/lib/CodeGen
Commit message | Author | Age | Files | Lines
* Mangle Objective-C pointers and block pointers in the Microsoft C++ Mangler.  [Charles Davis, 2010-07-03, 1 file, -13/+22]
  ObjC pointers were easy enough (as far as the ABI is concerned, they're just pointers to structs), but I had to invent a new mangling for block pointers. This is particularly worrying with the Microsoft ABI, because it is a vendor-specific ABI; extending it could come back to bite us later when MS extends it on their own (and you know they will).
  llvm-svn: 107572
* Provide convenience routines to save and restore the current insertion point.  [John McCall, 2010-07-03, 1 file, -3/+52]
  llvm-svn: 107570
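  A minimal sketch of the save/restore pattern this enables, written against the generic llvm::IRBuilder interface rather than the actual CodeGenFunction helpers added here (the function and variable names are illustrative assumptions):

    #include "llvm/Support/IRBuilder.h"  // 2010-era header; newer trees use llvm/IR/IRBuilder.h

    // Emit into 'Other' temporarily, then put the builder back where it was.
    void emitElsewhere(llvm::IRBuilder<> &Builder, llvm::BasicBlock *Other) {
      llvm::BasicBlock *SavedBB = Builder.GetInsertBlock();
      llvm::BasicBlock::iterator SavedPt = Builder.GetInsertPoint();
      Builder.SetInsertPoint(Other);             // new instructions now land in 'Other'
      // ... emit whatever needs to go into 'Other' ...
      Builder.SetInsertPoint(SavedBB, SavedPt);  // restore the original insertion point
    }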
* Fix mangling of array dimensions in the Microsoft C++ Mangler.  [Charles Davis, 2010-07-03, 1 file, -7/+7]
  llvm-svn: 107568
* Mangle member pointer types in the Microsoft C++ Mangler.  [Charles Davis, 2010-07-03, 1 file, -3/+15]
  llvm-svn: 107567
* Fix mangling of function pointers in the Microsoft C++ Mangler.  [Charles Davis, 2010-07-03, 1 file, -0/+5]
  llvm-svn: 107564
* Fix mangling of array parameters for functions in the Microsoft C++ Mangler.  [Charles Davis, 2010-07-03, 1 file, -10/+21]
  Only actual functions get mangled correctly; I don't know how to fix it for function pointers yet. Thanks to John McCall for the hint. Also, mangle anonymous tag types. I don't have a suitable testcase yet; I have a feeling that that's going to need support for static locals, and I haven't figured out exactly how MSVC's scheme for mangling those works.
  llvm-svn: 107561
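  For context (not taken from the commit itself): the wrinkle with array parameters is that C++ adjusts a top-level array parameter to a pointer, so any mangling scheme has to treat the following two declarations as the same function:

    // Both declare the same function: 'int values[16]' adjusts to 'int *values',
    // so the mangler must produce identical names for the two declarations.
    void take(int values[16]);
    void take(int *values);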
* Remove unnecessary ASTContext parameter from CXXRecordDecl::getDestructor(); no functionality change.  [Douglas Gregor, 2010-07-01, 9 files, -15/+14]
  llvm-svn: 107394
* fix rdar://8147692 - yet another crash due to my abi work.  [Chris Lattner, 2010-07-01, 3 files, -15/+41]
  llvm-svn: 107387
* Driver/IRgen: Add support for -momit-leaf-frame-pointer.  [Daniel Dunbar, 2010-07-01, 1 file, -1/+15]
  llvm-svn: 107367
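  As a rough illustration (exact codegen is target- and version-dependent, and the build line is only an assumed invocation), the flag lets a leaf function such as the one below skip its frame-pointer prologue while non-leaf functions keep theirs:

    // Build with something like: clang -O2 -momit-leaf-frame-pointer leaf.c -S
    // 'scale' calls nothing, so it is a leaf and may omit its frame pointer.
    int scale(int x) {
      return x * 3 + 1;
    }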
* Revert "IRgen: Make sure any prolog instructions get debug info.", the lexicalDaniel Dunbar2010-07-011-1/+0
| | | | | | | scope hasn't been set up yet so this isn't valid. It was just a cleanup to the IR, so I'm going to ignore it for now. llvm-svn: 107356
* IRgen: Fix debug info regression in r106970; when we eliminate the return valueDaniel Dunbar2010-06-301-5/+6
| | | | | | | | | store make sure to move the debug metadata from the store (which is actual 'return' statement location) to the return instruction (which otherwise would have the function end location as its debug info). - Tested by gdb test suite. llvm-svn: 107322
* IRgen: Make sure any prolog instructions get debug info.Daniel Dunbar2010-06-301-0/+1
| | | | llvm-svn: 107320
* Reapply:  [Chris Lattner, 2010-06-30, 5 files, -56/+89]
  r107173, "fix PR7519: after thrashing around and remembering how all this stuff"
  r107216, "fix PR7523, which was caused by the ABI code calling ConvertType instead"
  This includes a fix to make ConvertTypeForMem handle the "recursive" case, and call it as such when lowering function types which have an indirect result.
  llvm-svn: 107310
* Use isFunctionOrMethod for vars declared locally in methods/blocks to decide not to mangle them.  [Fariborz Jahanian, 2010-06-30, 1 file, -1/+1]
  llvm-svn: 107309
* An extern variable declared locally in an Objective-C++ method should not be mangled either. Fixes radar 8016412.  [Fariborz Jahanian, 2010-06-30, 1 file, -1/+1]
  llvm-svn: 107303
* reduce nesting.  [Chris Lattner, 2010-06-30, 1 file, -23/+27]
  llvm-svn: 107292
* Mangle arrays in the Microsoft C++ Mangler.  [Charles Davis, 2010-06-30, 1 file, -27/+157]
  It's not quite finished (it doesn't mangle array parameters right), but I think that should be fixed in Sema (Doug, John, what do you think?). Also, stub out the remaining mangleType() routines.
  llvm-svn: 107264
* Revert r107173, "fix PR7519: after thrashing around and remembering how all this stuff"; it broke bootstrap.  [Daniel Dunbar, 2010-06-30, 4 files, -55/+26]
  llvm-svn: 107232
* Revert r107216, "fix PR7523, which was caused by the ABI code calling ConvertType instead"; it is part of a bootstrap-breaking sequence.  [Daniel Dunbar, 2010-06-30, 3 files, -26/+17]
  llvm-svn: 107231
* IRgen: Assignment to Objective-C properties shouldn't reload the value for complex values either.  [Daniel Dunbar, 2010-06-29, 1 file, -38/+36]
  Previously we did this properly for regular assignment, but not for compound assignment.
  - Also, tidy up the assignment code a bit to look more like the scalar path.
  llvm-svn: 107217
* fix PR7523, which was caused by the ABI code calling ConvertType instead of ConvertTypeRecursive when it needed to in a few cases, causing pointer types to get resolved at the wrong time.  [Chris Lattner, 2010-06-29, 3 files, -17/+26]
  llvm-svn: 107216
* IRgen: Assignment to Objective-C properties shouldn't reload the value (which would trigger an extra method call).  [Daniel Dunbar, 2010-06-29, 1 file, -30/+42]
  - While in the area, I also changed Clang to not emit an unnecessary load from 'x' in cases like 'y = (x = 1)'.
  llvm-svn: 107210
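  A minimal example of the second case mentioned above (plain C/C++, not taken from the commit): the value produced by the inner assignment is already at hand, so re-reading 'x' would be redundant.

    int x, y;

    void set(void) {
      y = (x = 1);  // the constant 1 feeds both stores; no extra load of 'x' is needed
    }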
* change ABIArgInfo to hold its llvm type with PATypeHolder so that it doesn't dangle as types get refined.  [Chris Lattner, 2010-06-29, 1 file, -4/+2]
  This fixes Shootout-C++/lists1 and probably also PR7522.
  llvm-svn: 107196
* relax the CGFunctionInfo::CGFunctionInfo ctor to allow any sequence of CanQualTypes to be passed in.  [Chris Lattner, 2010-06-29, 2 files, -12/+10]
  llvm-svn: 107176
* fix PR7519: after thrashing around and remembering how all this stuff works, the fix is quite simple: just make sure to call ConvertTypeRecursive when the function type being lowered is in the midst of ConvertType.  [Chris Lattner, 2010-06-29, 4 files, -26/+55]
  llvm-svn: 107173
* minor cleanups.  [Chris Lattner, 2010-06-29, 2 files, -10/+4]
  llvm-svn: 107150
* Change X86_64ABIInfo to have ASTContext and TargetData ivars to avoid passing ASTContext down through all the methods it has.  [Chris Lattner, 2010-06-29, 1 file, -43/+84]
  When classifying an argument, or argument piece, as INTEGER, check to see if we have a pointer at exactly the same offset in the preferred type. If so, use that pointer type instead of i64.
  This allows us to compile a function taking a StringRef into something like this:

    define i8* @foo(i64 %D.coerce0, i8* %D.coerce1) nounwind ssp {
    entry:
      %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup*> [#uses=4]
      %0 = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      store i64 %D.coerce0, i64* %0
      %1 = getelementptr %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
      store i8* %D.coerce1, i8** %1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
      %tmp3 = load i8** %tmp2 ; <i8*> [#uses=1]
      %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
      ret i8* %add.ptr
    }

  instead of this:

    define i8* @foo(i64 %D.coerce0, i64 %D.coerce1) nounwind ssp {
    entry:
      %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup*> [#uses=3]
      %0 = insertvalue %0 undef, i64 %D.coerce0, 0 ; <%0> [#uses=1]
      %1 = insertvalue %0 %0, i64 %D.coerce1, 1 ; <%0> [#uses=1]
      %2 = bitcast %struct.DeclGroup* %D to %0* ; <%0*> [#uses=1]
      store %0 %1, %0* %2, align 1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
      %tmp3 = load i8** %tmp2 ; <i8*> [#uses=1]
      %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
      ret i8* %add.ptr
    }

  This implements rdar://7375902 - [codegen quality] clang x86-64 ABI lowering code punishing StringRef.
  llvm-svn: 107123
* plumb preferred types down into X86_64ABIInfo::classifyArgumentType; no functionality change.  [Chris Lattner, 2010-06-29, 1 file, -4/+14]
  llvm-svn: 107115
* Pass the LLVM IR version of argument types down into computeInfo.  [Chris Lattner, 2010-06-29, 3 files, -10/+38]
  It is somewhat annoying to do this at this level, but it avoids having ABIInfo depend on CodeGenTypes for a hint. Nothing is using this yet, so no functionality change.
  llvm-svn: 107111
* add IR names to coerced arguments.  [Chris Lattner, 2010-06-29, 1 file, -0/+3]
  llvm-svn: 107105
* make the argument passing stuff in the FCA case smarter still, by avoiding making the FCA at all when the types exactly line up.  [Chris Lattner, 2010-06-29, 1 file, -21/+46]
  For example, before we made:

    %struct.DeclGroup = type { i64, i64 }

    define i64 @_Z3foo9DeclGroup(i64, i64) nounwind {
    entry:
      %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup*> [#uses=3]
      %2 = insertvalue %struct.DeclGroup undef, i64 %0, 0 ; <%struct.DeclGroup> [#uses=1]
      %3 = insertvalue %struct.DeclGroup %2, i64 %1, 1 ; <%struct.DeclGroup> [#uses=1]
      store %struct.DeclGroup %3, %struct.DeclGroup* %D
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
      %tmp3 = load i64* %tmp2 ; <i64> [#uses=1]
      %add = add nsw i64 %tmp1, %tmp3 ; <i64> [#uses=1]
      ret i64 %add
    }

  ... which has the pointless insertvalue, which fastisel hates. Now we make:

    %struct.DeclGroup = type { i64, i64 }

    define i64 @_Z3foo9DeclGroup(i64, i64) nounwind {
    entry:
      %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup*> [#uses=4]
      %2 = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      store i64 %0, i64* %2
      %3 = getelementptr %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
      store i64 %1, i64* %3
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
      %tmp3 = load i64* %tmp2 ; <i64> [#uses=1]
      %add = add nsw i64 %tmp1, %tmp3 ; <i64> [#uses=1]
      ret i64 %add
    }

  This only kicks in when x86-64 abi lowering decides it likes us.
  llvm-svn: 107104
* Change CGCall to handle the "coerce" case where the coerce-to type is a FCA to pass each of the elements as individual scalars.  [Chris Lattner, 2010-06-28, 1 file, -11/+60]
  This produces code fast isel is less likely to reject and is easier on the optimizers.
  For example, before we would compile:

    struct DeclGroup { long NumDecls; char * Y; };

    char * foo(DeclGroup D) {
      return D.NumDecls+D.Y;
    }

  to:

    %struct.DeclGroup = type { i64, i64 }

    define i64 @_Z3foo9DeclGroup(%struct.DeclGroup) nounwind {
    entry:
      %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup*> [#uses=3]
      store %struct.DeclGroup %0, %struct.DeclGroup* %D, align 1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
      %tmp3 = load i64* %tmp2 ; <i64> [#uses=1]
      %add = add nsw i64 %tmp1, %tmp3 ; <i64> [#uses=1]
      ret i64 %add
    }

  Now we get:

    %0 = type { i64, i64 }
    %struct.DeclGroup = type { i64, i8* }

    define i8* @_Z3foo9DeclGroup(i64, i64) nounwind {
    entry:
      %D = alloca %struct.DeclGroup, align 8 ; <%struct.DeclGroup*> [#uses=3]
      %2 = insertvalue %0 undef, i64 %0, 0 ; <%0> [#uses=1]
      %3 = insertvalue %0 %2, i64 %1, 1 ; <%0> [#uses=1]
      %4 = bitcast %struct.DeclGroup* %D to %0* ; <%0*> [#uses=1]
      store %0 %3, %0* %4, align 1
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
      %tmp1 = load i64* %tmp ; <i64> [#uses=1]
      %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
      %tmp3 = load i8** %tmp2 ; <i8*> [#uses=1]
      %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
      ret i8* %add.ptr
    }

  Elimination of the FCA inside the function is still-to-come.
  llvm-svn: 107099
* make the trivial forms of CreateCoerced{Load|Store} trivial.  [Chris Lattner, 2010-06-28, 1 file, -3/+12]
  llvm-svn: 107091
* pass/return structs of char and short as i8/i16 to avoid awful through-memory coercion, just like we do for i32 now.  [Chris Lattner, 2010-06-28, 1 file, -5/+9]
  llvm-svn: 107078
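  A hypothetical pair of aggregates in the spirit of this change: single-byte and two-byte structs that can now travel as i8 and i16 directly instead of being coerced through memory.

    struct Flag  { char  value; };  // small enough to fit in i8
    struct Count { short value; };  // small enough to fit in i16

    char  getFlag(struct Flag f)   { return f.value; }
    short getCount(struct Count c) { return c.value; }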
* more tidying up.  [Chris Lattner, 2010-06-28, 1 file, -32/+45]
  llvm-svn: 107076
* random acts of tidying.  [Chris Lattner, 2010-06-28, 1 file, -28/+47]
  llvm-svn: 107050
* X86-64: pass/return structs of float/int as float/i32 instead of double/i64 to make the code generated for the ABI cleaner.  [Chris Lattner, 2010-06-28, 1 file, -0/+12]
  Passing in the low part of a double is the same as passing in a float.
  For example, we now compile:

    struct DeclGroup { float NumDecls; };
    float foo(DeclGroup D);
    void bar(DeclGroup *D) {
      foo(*D);
    }

  into:

    %struct.DeclGroup = type { float }

    define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
    entry:
      %D.addr = alloca %struct.DeclGroup*, align 8 ; <%struct.DeclGroup**> [#uses=2]
      %agg.tmp = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
      %tmp = load %struct.DeclGroup** %D.addr ; <%struct.DeclGroup*> [#uses=1]
      %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
      %tmp2 = bitcast %struct.DeclGroup* %tmp to i8* ; <i8*> [#uses=1]
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
      %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float*> [#uses=1]
      %0 = load float* %coerce.dive, align 1 ; <float> [#uses=1]
      %call = call float @_Z3foo9DeclGroup(float %0) ; <float> [#uses=0]
      ret void
    }

  instead of:

    %struct.DeclGroup = type { float }

    define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
    entry:
      %D.addr = alloca %struct.DeclGroup*, align 8 ; <%struct.DeclGroup**> [#uses=2]
      %agg.tmp = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      %tmp3 = alloca double ; <double*> [#uses=2]
      store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
      %tmp = load %struct.DeclGroup** %D.addr ; <%struct.DeclGroup*> [#uses=1]
      %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
      %tmp2 = bitcast %struct.DeclGroup* %tmp to i8* ; <i8*> [#uses=1]
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
      %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float*> [#uses=1]
      %0 = bitcast double* %tmp3 to float* ; <float*> [#uses=1]
      %1 = load float* %coerce.dive ; <float> [#uses=1]
      store float %1, float* %0, align 1
      %2 = load double* %tmp3 ; <double> [#uses=1]
      %call = call float @_Z3foo9DeclGroup(double %2) ; <float> [#uses=0]
      ret void
    }

  which is this machine code (at -O0):

    __Z3barP9DeclGroup:
            subq    $24, %rsp
            movq    %rdi, 16(%rsp)
            movq    16(%rsp), %rdi
            leaq    8(%rsp), %rax
            movl    (%rdi), %ecx
            movl    %ecx, (%rax)
            movss   8(%rsp), %xmm0
            callq   __Z3foo9DeclGroup
            addq    $24, %rsp
            ret

  vs this:

    __Z3barP9DeclGroup:
            subq    $24, %rsp
            movq    %rdi, 16(%rsp)
            movq    16(%rsp), %rdi
            leaq    8(%rsp), %rax
            movl    (%rdi), %ecx
            movl    %ecx, (%rax)
            movss   8(%rsp), %xmm0
            movss   %xmm0, (%rsp)
            movsd   (%rsp), %xmm0
            callq   __Z3foo9DeclGroup
            addq    $24, %rsp
            ret

  At -O3, it is the difference between this now:

    __Z3barP9DeclGroup:
            movss   (%rdi), %xmm0
            jmp     __Z3foo9DeclGroup # TAILCALL

  vs this before:

    __Z3barP9DeclGroup:
            movl    (%rdi), %eax
            movd    %rax, %xmm0
            jmp     __Z3foo9DeclGroup # TAILCALL

  llvm-svn: 107048
* Minor refactoring of my last patch (radar 7860965 related).  [Fariborz Jahanian, 2010-06-28, 1 file, -1/+1]
  llvm-svn: 107047
* Have __func__ and siblings point to the block's implementation function name. Fixes radar 7860965.  [Fariborz Jahanian, 2010-06-28, 1 file, -1/+12]
  llvm-svn: 107044
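  A hedged sketch of the behavior described (requires clang's blocks extension, -fblocks, and a blocks runtime to link; not taken from the commit): inside the block body, __func__ now names the block's generated implementation function rather than the enclosing function.

    #include <stdio.h>

    void run(void) {
      void (^show)(void) = ^{
        printf("%s\n", __func__);  // prints the block's implementation function name
      };
      show();                      // outside the block, __func__ would still be "run"
    }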
* Fix UnitTests/2004-02-02-NegativeZero.c, which regressed when I broke negate of FP values.  [Chris Lattner, 2010-06-28, 1 file, -2/+6]
  llvm-svn: 107019
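  Context for why floating-point negation is easy to get wrong (a standard IEEE-754 point, not taken from the commit): negating +0.0 must produce -0.0, so negation cannot be lowered as a subtraction from zero.

    // negate(0.0) must return -0.0; lowering it as 0.0 - x would return +0.0 instead,
    // which is presumably the kind of regression the NegativeZero test guards against.
    double negate(double x) {
      return -x;
    }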
* Correctly destroy reference temporaries with global storage. Remove ErrorUnsupported call when binding a global reference to a non-lvalue. Fixes PR7326.  [Anders Carlsson, 2010-06-27, 2 files, -21/+27]
  llvm-svn: 106983
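  A minimal C++ illustration of the pattern involved (hypothetical, not the PR7326 test case): a temporary bound to a reference with global storage lives as long as the reference, and its destructor must still run at program exit.

    struct Logger {
      ~Logger() {}  // runs during static destruction, at program exit
    };

    // The Logger temporary gets global storage because the reference it is bound to does.
    const Logger &global_log = Logger();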
* Add a CreateReferenceTemporary that will do the right thing for variables with global storage.  [Anders Carlsson, 2010-06-27, 1 file, -6/+34]
  llvm-svn: 106982
* Simplify CodeGenFunction::EmitReferenceBindingToExpr as a first step towards fixing PR7326.  [Anders Carlsson, 2010-06-27, 1 file, -95/+90]
  llvm-svn: 106981
* Reduce indentation.  [Anders Carlsson, 2010-06-27, 1 file, -14/+11]
  llvm-svn: 106980
* misc tidying  [Chris Lattner, 2010-06-27, 2 files, -5/+2]
  llvm-svn: 106978
* finally get around to doing a significant cleanup to irgen: have CGF create and make accessible standard int32, int64 and intptr types.  [Chris Lattner, 2010-06-27, 16 files, -233/+164]
  This fixes a ton of 80 column violations introduced by LLVMContextification and cleans up stuff a lot.
  llvm-svn: 106977
* tidy up OrderGlobalInits  [Chris Lattner, 2010-06-27, 2 files, -14/+12]
  llvm-svn: 106976
* If coercing something from int or pointer type to int or pointer type (potentially after unwrapping it from a struct) do it without going through memory.  [Chris Lattner, 2010-06-27, 1 file, -0/+49]
  We now compile:

    struct DeclGroup { unsigned NumDecls; };

    int foo(DeclGroup D) {
      return D.NumDecls;
    }

  into:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
      %coerce.val.ii = trunc i64 %0 to i32 ; <i32> [#uses=1]
      store i32 %coerce.val.ii, i32* %coerce.dive
      %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
      %tmp1 = load i32* %tmp ; <i32> [#uses=1]
      ret i32 %tmp1
    }

  instead of:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      %tmp = alloca i64 ; <i64*> [#uses=2]
      %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
      store i64 %0, i64* %tmp
      %1 = bitcast i64* %tmp to i32* ; <i32*> [#uses=1]
      %2 = load i32* %1, align 1 ; <i32> [#uses=1]
      store i32 %2, i32* %coerce.dive
      %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
      %tmp2 = load i32* %tmp1 ; <i32> [#uses=1]
      ret i32 %tmp2
    }

  ... which is quite a bit less terrifying.
  llvm-svn: 106975
* Same patch as the previous on the store side.  [Chris Lattner, 2010-06-27, 1 file, -7/+13]
  Before we compiled this:

    struct DeclGroup { unsigned NumDecls; };

    int foo(DeclGroup D) {
      return D.NumDecls;
    }

  to:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      %tmp = alloca i64 ; <i64*> [#uses=2]
      store i64 %0, i64* %tmp
      %1 = bitcast i64* %tmp to %struct.DeclGroup* ; <%struct.DeclGroup*> [#uses=1]
      %2 = load %struct.DeclGroup* %1, align 1 ; <%struct.DeclGroup> [#uses=1]
      store %struct.DeclGroup %2, %struct.DeclGroup* %D
      %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
      %tmp2 = load i32* %tmp1 ; <i32> [#uses=1]
      ret i32 %tmp2
    }

  which caused fast isel bailouts due to the FCA load/store of %2. Now we generate this just blissful code:

    %struct.DeclGroup = type { i32 }

    define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
    entry:
      %D = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      %tmp = alloca i64 ; <i64*> [#uses=2]
      %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
      store i64 %0, i64* %tmp
      %1 = bitcast i64* %tmp to i32* ; <i32*> [#uses=1]
      %2 = load i32* %1, align 1 ; <i32> [#uses=1]
      store i32 %2, i32* %coerce.dive
      %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
      %tmp2 = load i32* %tmp1 ; <i32> [#uses=1]
      ret i32 %tmp2
    }

  This avoids fastisel bailing out and is groundwork for a future patch. This reduces bailouts on CGStmt.ll to 911 from 935.
  llvm-svn: 106974
* improve CreateCoercedLoad a bit to generate slightly less awful IR when handling X86-64 by-value struct stuff.  [Chris Lattner, 2010-06-27, 1 file, -1/+42]
  For example, we used to compile this:

    struct DeclGroup { unsigned NumDecls; };

    int foo(DeclGroup D);
    void bar(DeclGroup *D) {
      foo(*D);
    }

  into:

    define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) ssp nounwind {
    entry:
      %D.addr = alloca %struct.DeclGroup*, align 8 ; <%struct.DeclGroup**> [#uses=2]
      %agg.tmp = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      %tmp3 = alloca i64 ; <i64*> [#uses=2]
      store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
      %tmp = load %struct.DeclGroup** %D.addr ; <%struct.DeclGroup*> [#uses=1]
      %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
      %tmp2 = bitcast %struct.DeclGroup* %tmp to i8* ; <i8*> [#uses=1]
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
      %0 = bitcast i64* %tmp3 to %struct.DeclGroup* ; <%struct.DeclGroup*> [#uses=1]
      %1 = load %struct.DeclGroup* %agg.tmp ; <%struct.DeclGroup> [#uses=1]
      store %struct.DeclGroup %1, %struct.DeclGroup* %0, align 1
      %2 = load i64* %tmp3 ; <i64> [#uses=1]
      call void @_Z3foo9DeclGroup(i64 %2)
      ret void
    }

  which would cause fastisel to bail out due to the first class aggregate load %1. With this patch we now compile it into the (still awful):

    define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind ssp noredzone {
    entry:
      %D.addr = alloca %struct.DeclGroup*, align 8 ; <%struct.DeclGroup**> [#uses=2]
      %agg.tmp = alloca %struct.DeclGroup, align 4 ; <%struct.DeclGroup*> [#uses=2]
      %tmp3 = alloca i64 ; <i64*> [#uses=2]
      store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
      %tmp = load %struct.DeclGroup** %D.addr ; <%struct.DeclGroup*> [#uses=1]
      %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
      %tmp2 = bitcast %struct.DeclGroup* %tmp to i8* ; <i8*> [#uses=1]
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
      %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <i32*> [#uses=1]
      %0 = bitcast i64* %tmp3 to i32* ; <i32*> [#uses=1]
      %1 = load i32* %coerce.dive ; <i32> [#uses=1]
      store i32 %1, i32* %0, align 1
      %2 = load i64* %tmp3 ; <i64> [#uses=1]
      %call = call i32 @_Z3foo9DeclGroup(i64 %2) noredzone ; <i32> [#uses=0]
      ret void
    }

  which doesn't bail out. On CGStmt.ll, this reduces fastisel bail outs from 958 to 935, and is the precursor of better things to come.
  llvm-svn: 106973