Commit message | Author | Age | Files | Lines
...
* Generate better code for v8i16 shuffles on SSE2 | Nate Begeman | 2009-02-23 | 14 | -281/+445
    Generate better code for v16i8 shuffles on SSE2 (avoids stack)
    Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it is fewer uops.
    Document the shuffle matching logic and add some FIXMEs for later further cleanups.
    New tests that test the above.

    Examples:

    New:
    _shuf2:
        pextrw $7, %xmm0, %eax
        punpcklqdq %xmm1, %xmm0
        pshuflw $128, %xmm0, %xmm0
        pinsrw $2, %eax, %xmm0

    Old:
    _shuf2:
        pextrw $2, %xmm0, %eax
        pextrw $7, %xmm0, %ecx
        pinsrw $2, %ecx, %xmm0
        pinsrw $3, %eax, %xmm0
        movd %xmm1, %eax
        pinsrw $4, %eax, %xmm0
        ret

    =========

    New:
    _shuf4:
        punpcklqdq %xmm1, %xmm0
        pshufb LCPI1_0, %xmm0

    Old:
    _shuf4:
        pextrw $3, %xmm0, %eax
        movsd %xmm1, %xmm0
        pextrw $3, %xmm1, %ecx
        pinsrw $4, %ecx, %xmm0
        pinsrw $5, %eax, %xmm0

    ========

    New:
    _shuf1:
        pushl %ebx
        pushl %edi
        pushl %esi
        pextrw $1, %xmm0, %eax
        rolw $8, %ax
        movd %xmm0, %ecx
        rolw $8, %cx
        pextrw $5, %xmm0, %edx
        pextrw $4, %xmm0, %esi
        pextrw $3, %xmm0, %edi
        pextrw $2, %xmm0, %ebx
        movaps %xmm0, %xmm1
        pinsrw $0, %ecx, %xmm1
        pinsrw $1, %eax, %xmm1
        rolw $8, %bx
        pinsrw $2, %ebx, %xmm1
        rolw $8, %di
        pinsrw $3, %edi, %xmm1
        rolw $8, %si
        pinsrw $4, %esi, %xmm1
        rolw $8, %dx
        pinsrw $5, %edx, %xmm1
        pextrw $7, %xmm0, %eax
        rolw $8, %ax
        movaps %xmm1, %xmm0
        pinsrw $7, %eax, %xmm0
        popl %esi
        popl %edi
        popl %ebx
        ret

    Old:
    _shuf1:
        subl $252, %esp
        movaps %xmm0, (%esp)
        movaps %xmm0, 16(%esp)
        movaps %xmm0, 32(%esp)
        movaps %xmm0, 48(%esp)
        movaps %xmm0, 64(%esp)
        movaps %xmm0, 80(%esp)
        movaps %xmm0, 96(%esp)
        movaps %xmm0, 224(%esp)
        movaps %xmm0, 208(%esp)
        movaps %xmm0, 192(%esp)
        movaps %xmm0, 176(%esp)
        movaps %xmm0, 160(%esp)
        movaps %xmm0, 144(%esp)
        movaps %xmm0, 128(%esp)
        movaps %xmm0, 112(%esp)
        movzbl 14(%esp), %eax
        movd %eax, %xmm1
        movzbl 22(%esp), %eax
        movd %eax, %xmm2
        punpcklbw %xmm1, %xmm2
        movzbl 42(%esp), %eax
        movd %eax, %xmm1
        movzbl 50(%esp), %eax
        movd %eax, %xmm3
        punpcklbw %xmm1, %xmm3
        punpcklbw %xmm2, %xmm3
        movzbl 77(%esp), %eax
        movd %eax, %xmm1
        movzbl 84(%esp), %eax
        movd %eax, %xmm2
        punpcklbw %xmm1, %xmm2
        movzbl 104(%esp), %eax
        movd %eax, %xmm1
        punpcklbw %xmm1, %xmm0
        punpcklbw %xmm2, %xmm0
        movaps %xmm0, %xmm1
        punpcklbw %xmm3, %xmm1
        movzbl 127(%esp), %eax
        movd %eax, %xmm0
        movzbl 135(%esp), %eax
        movd %eax, %xmm2
        punpcklbw %xmm0, %xmm2
        movzbl 155(%esp), %eax
        movd %eax, %xmm0
        movzbl 163(%esp), %eax
        movd %eax, %xmm3
        punpcklbw %xmm0, %xmm3
        punpcklbw %xmm2, %xmm3
        movzbl 188(%esp), %eax
        movd %eax, %xmm0
        movzbl 197(%esp), %eax
        movd %eax, %xmm2
        punpcklbw %xmm0, %xmm2
        movzbl 217(%esp), %eax
        movd %eax, %xmm4
        movzbl 225(%esp), %eax
        movd %eax, %xmm0
        punpcklbw %xmm4, %xmm0
        punpcklbw %xmm2, %xmm0
        punpcklbw %xmm3, %xmm0
        punpcklbw %xmm1, %xmm0
        addl $252, %esp
        ret

    llvm-svn: 65311
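    Note on the pshufb entry above: the instruction selects each result byte through a control byte, which is why one pshufb can stand in for a chain of pextrw/pinsrw pairs. The following is a minimal scalar sketch of that behaviour in C++, not code from this commit. The names V16i8, pshufb_model and byte_swap_words_mask are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical alias for one 128-bit register viewed as 16 bytes.
using V16i8 = std::array<std::uint8_t, 16>;

// Scalar model of SSSE3 pshufb on a single 128-bit register.
// For each result byte, the low 4 bits of the control byte pick a
// source byte; if the control byte's high bit is set, the result is 0.
V16i8 pshufb_model(const V16i8& src, const V16i8& control) {
    V16i8 out{};
    for (std::size_t i = 0; i < 16; ++i) {
        out[i] = (control[i] & 0x80) ? 0 : src[control[i] & 0x0F];
    }
    return out;
}

// Illustrative control mask (an assumption, not taken from the commit):
// swaps the two bytes inside every 16-bit lane.
V16i8 byte_swap_words_mask() {
    V16i8 mask{};
    for (std::size_t i = 0; i < 16; ++i) {
        mask[i] = static_cast<std::uint8_t>(i ^ 1);
    }
    return mask;
}
```

    Calling pshufb_model(src, byte_swap_words_mask()) returns src with the bytes of each 16-bit lane exchanged. Any fixed byte permutation of one register can be written as such a mask, which on real hardware would live in a constant-pool entry like LCPI1_0 in the _shuf4 example.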
* If nobody minds, I'm using LTO to produce faster binaries. Switch fast codegen | Nick Lewycky | 2009-02-23 | 1 | -2/+2
    off in libLTO.
    llvm-svn: 65310
* Changed option name from inline-threshold to basic-inline-threshold because | Mon P Wang | 2009-02-23 | 1 | -1/+1
    inline-threshold option is used by the inliner.
    llvm-svn: 65309
* fix some typos that Duncan noticed | Chris Lattner | 2009-02-23 | 1 | -3/+3
    llvm-svn: 65306
* A few small improvements to Evaluate for stuff I noted in FIXMEs. | Eli Friedman | 2009-02-23 | 2 | -17/+97
    llvm-svn: 65305
* retain/release checker: For now don't track the retain count of NSWindow ↵ | Ted Kremenek | 2009-02-23 | 2 | -2/+7
    objects (opt for false negatives).
    llvm-svn: 65304
* More retain/release naming convention tests. | Ted Kremenek | 2009-02-23 | 1 | -0/+3
    llvm-svn: 65303
* Remove typo. | Ted Kremenek | 2009-02-23 | 1 | -1/+1
    llvm-svn: 65302
* '[NSAutoreleasePool addObject:]' has an 'autorelease' effect, not a ↵ | Ted Kremenek | 2009-02-23 | 1 | -2/+2
    DoNothing effect.
    llvm-svn: 65301
* Sema::ActOnInstanceMessage(): Tighten up the lookup rules for handling ↵ | Steve Naroff | 2009-02-23 | 1 | -9/+31
    messages to 'Class'. Also improve "super" handling.
    llvm-svn: 65300
* Add test case for PR 2599. | Ted Kremenek | 2009-02-23 | 1 | -0/+64
    llvm-svn: 65299
* Propagate debug loc info through prologue/epilogue. | Bill Wendling | 2009-02-23 | 7 | -28/+39
    llvm-svn: 65298
* Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR | Scott Michel | 2009-02-22 | 14 | -318/+397
    instruction. The class also consolidates the code for detecting constant splats that's shared across PowerPC and the CellSPU backends (and might be useful for other backends.)
    Also introduces SelectionDAG::getBUID_VECTOR() for generating new BUILD_VECTOR nodes.
    llvm-svn: 65296
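    Note on "detecting constant splats" in the entry above: a splat is a vector whose lanes all hold the same constant. The following C++ sketch shows the basic idea only. It is not the BuildVectorSDNode implementation or any LLVM API. Lane and is_constant_splat are hypothetical names, and treating undefined lanes as wildcards is an assumption made for this illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of one vector lane: either a known constant
// or an undefined value (defined == false).
struct Lane {
    bool defined;
    std::uint32_t value;
};

// Returns true and writes the shared constant to splat_out when every
// defined lane holds the same value. Undefined lanes match anything.
// Returns false if two defined lanes differ or if no lane is defined.
bool is_constant_splat(const std::vector<Lane>& lanes, std::uint32_t& splat_out) {
    bool seen = false;
    std::uint32_t candidate = 0;
    for (const Lane& lane : lanes) {
        if (!lane.defined) {
            continue;  // an undefined lane places no constraint on the splat
        }
        if (!seen) {
            candidate = lane.value;
            seen = true;
        } else if (lane.value != candidate) {
            return false;  // two different constants: not a splat
        }
    }
    if (seen) {
        splat_out = candidate;
    }
    return seen;
}
```

    A backend-quality check would also have to account for element bit width and for splats of sub-element or multi-element patterns, which this sketch deliberately leaves out.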
* Add an option to the gold plugin to make it emit a file with the public api | Nick Lewycky | 2009-02-22 | 1 | -1/+23
    list that can in turn be passed to -internalize pass through -internalize-public-api-file.
    Pass gold -plugin-opt=generate-api-file to produce "apifile.txt" in the current directory.
    llvm-svn: 65295
* Minor cleanup, replace bool with qual_empty(). | Steve Naroff | 2009-02-22 | 1 | -3/+1
    llvm-svn: 65293
* Contains the following (related to problems found while investigating ↵ | Steve Naroff | 2009-02-22 | 7 | -40/+57
    <rdar://problem/6497631> Message lookup is sometimes different than gcc's).
    - Implement instance/class overloading in ObjCContainerDecl (removing a FIXME). This involved hacking NamedDecl::declarationReplaces(), which took a while to figure out (didn't realize replace was the default).
    - Changed Sema::ActOnInstanceMessage() to remove redundant warnings when dealing with protocols. For now, I've omitted the "protocol" term in the diagnostic. It simplifies the code flow and wasn't always 100% accurate (e.g. "Foo<Prot>" looks in the class interface, not just the protocol).
    - Changed several test cases to jibe with the above changes.
    llvm-svn: 65292
* Make sure to reset the DidCallStackSave variable before emitting a compound ↵ | Anders Carlsson | 2009-02-22 | 1 | -0/+1
    statement. Fixes PR3649.
    llvm-svn: 65291
* More objc gc work. Match gcc's treatment of ivar access | Fariborz Jahanian | 2009-02-22 | 3 | -1/+32
    through a local pointer to objective-c object in generating write barriers.
    llvm-svn: 65290
* Revert the part of 64623 that attempted to align the source in a | Dan Gohman | 2009-02-22 | 1 | -1/+1
    memcpy to match the alignment of the destination. It isn't necessary for making loads and stores handled like the SSE loadu/storeu intrinsics, and it was causing a performance regression in MultiSource/Applications/JM/lencod.
    The problem appears to have been a memcpy that copies from some highly aligned array into an alloca; the alloca was then being assigned a large alignment, which required codegen to perform dynamic stack-pointer re-alignment, which forced the enclosing function to have a frame pointer, which led to increased spilling.
    llvm-svn: 65289
* Properly parenthesize this expression, fixing a real bug in the new | Dan Gohman | 2009-02-22 | 1 | -1/+1
    -full-lsr code, as well as a GCC warning.
    llvm-svn: 65288
* bug 3610: Test case. | Richard Pennington | 2009-02-22 | 1 | -0/+20
    llvm-svn: 65287
* Copy some cleanups from Eli to code that I copied. :-) | Mike Stump | 2009-02-22 | 1 | -6/+1
    llvm-svn: 65286
* Clean up code with some recent suggestions. | Mike Stump | 2009-02-22 | 4 | -23/+14
    llvm-svn: 65285
* A bit of Evaluate cleanup. Also, a full audit of what's missing that | Eli Friedman | 2009-02-22 | 1 | -22/+33
    someone would reasonably expect Evaluate to handle for C/ObjC.
    llvm-svn: 65284
* Update to checker-0.162 (fixed header issue in tar.bz2 package). | Ted Kremenek | 2009-02-22 | 1 | -1/+1
    llvm-svn: 65283
* Reverted back to checker-0.161 because of a header issue. | Ted Kremenek | 2009-02-22 | 1 | -1/+1
    llvm-svn: 65281
* Updated checker build. | Ted Kremenek | 2009-02-22 | 1 | -1/+1
    llvm-svn: 65280
* If a use operand is marked isKill, don't forget to add kill to its live ↵ | Evan Cheng | 2009-02-22 | 2 | -4/+30
    interval as well.
    llvm-svn: 65279
* x86_64 ABI: Actually, we can always pass things we want to pass in | Daniel Dunbar | 2009-02-22 | 1 | -10/+2
    memory using Indirect; this was a holdover from when CGCall wasn't as robust.
    llvm-svn: 65278
* ccc: Remove unknown host warning, it was breaking gcc's configure. | Daniel Dunbar | 2009-02-22 | 1 | -1/+0
    llvm-svn: 65276
* Add a note. | Evan Cheng | 2009-02-22 | 1 | -0/+28
    llvm-svn: 65275
* Be bug compatible with gcc by returning MMX values in RAX. | Evan Cheng | 2009-02-22 | 3 | -8/+18
    llvm-svn: 65274
* Do not consider MMX_MOVD64rr a move instruction. The source register is in ↵ | Evan Cheng | 2009-02-22 | 1 | -1/+0
    GR32, the destination is VR64. They are not compatible.
    llvm-svn: 65273
* Fix test to be legal on 64-bit systems. | Eli Friedman | 2009-02-22 | 1 | -1/+1
    llvm-svn: 65270
* Fix regression in naming convention derivation: a method only follows the ↵ | Ted Kremenek | 2009-02-22 | 1 | -1/+1
    copy 'rule' if it doesn't already start with 'init', etc.
    llvm-svn: 65269
* Only try to sink immediate when TLI is not null. It needs to check if ↵ | Evan Cheng | 2009-02-22 | 1 | -1/+1
    immediate would fit in target addressing field.
    llvm-svn: 65268
* Eliminate a bunch of code which should be dead. | Eli Friedman | 2009-02-22 | 1 | -194/+5
    llvm-svn: 65267
* x86_64 ABI: Make sure to pass vectors that we want to pass in memory | Daniel Dunbar | 2009-02-22 | 1 | -1/+1
    as byval. Otherwise LLVM will have its own opinion about where to put things.
    We now pass all gcc dg.compat tests on x86_64.
    llvm-svn: 65266
* Throw the switch to exclusively use Evaluate (along with the small | Eli Friedman | 2009-02-22 | 3 | -4/+15
    helper isConstantInitializer) to check whether an initializer is constant. This passes tests, but it's possible that it'll cause regressions with real-world code.
    Future work:
    1. The diagnostics obtained this way are lower quality at the moment; some work both here and in Evaluate is needed for accurate diagnostics.
    2. We probably need some extra code when we're in -pedantic mode so we can strictly enforce the rules in C99 6.6p7.
    3. Dead code cleanup (this should wait until after 2, because we might want to re-use some of the code).
    llvm-svn: 65265
* x86_64 ABI: Pass 32-bit vectors as Integer to match gcc. We don't care | Daniel Dunbar | 2009-02-22 | 1 | -1/+16
    about these much but <2 x i16> shows up in the gcc test suite.
    llvm-svn: 65264
* ABITestGen: Use explicit list of vector types instead of just a list | Daniel Dunbar | 2009-02-22 | 1 | -10/+30
    of sizes. Turns out we don't care very much about vector types that don't map to the hardware.
    llvm-svn: 65263
* x86_64 ABI: Classify <1 x i64> as INTEGER (match gcc not llvm-gcc). | Daniel Dunbar | 2009-02-22 | 1 | -6/+12
    Also, make sure to pass <1 x i64> as i64 (not <1 x i64>, which doesn't quite work yet in the backend).
    llvm-svn: 65262
* Enhance Evaluate to handle ObjC qualified id and class types; as far as | Eli Friedman | 2009-02-22 | 2 | -4/+10
    I know, these follow the exact same rules as pointers, so I just made them use the same codepath. Someone more familiar with ObjC should double-check this, though.
    llvm-svn: 65261
* Fix for PR3433: map __alignof__ to preferred alignment. (This was | Eli Friedman | 2009-02-22 | 3 | -6/+7
    partially done in r65258.)
    llvm-svn: 65260
* Last part of PR3254: use the same alignment computation in Sema and | Eli Friedman | 2009-02-22 | 1 | -4/+1
    CodeGen. I'm not sure whether this actually makes any visible difference, but it's better to be consistent anyway.
    llvm-svn: 65259
* Improvements to ASTContext::getDeclAlignInBytes; fixes the testcase in | Eli Friedman | 2009-02-22 | 3 | -32/+30
    PR3254 and part of PR3433.
    The isICE changes are necessary to keep the computed results consistent with Evaluate.
    llvm-svn: 65258
* Remove debugging statement. | Steve Naroff | 2009-02-22 | 1 | -1/+0
    llvm-svn: 65257
* Match gcc and always perform array/function conversion for asm input exprs. ↵ | Anders Carlsson | 2009-02-22 | 2 | -2/+8
    Fixes PR3641.
    llvm-svn: 65256
* Correctly encode incomplete and variable length arrays. Fixes PR3639. | Anders Carlsson | 2009-02-22 | 2 | -8/+30
    llvm-svn: 65255
* ccc: Remove temporary files used in compilation, and remove | Daniel Dunbar | 2009-02-22 | 1 | -3/+29
    compilation results on failures.
    llvm-svn: 65254