path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
...
* Add header... | Daniel Dunbar | 2010-12-20 | 1 | -0/+1
| | | | llvm-svn: 122247
* X86/MC/Mach-O: Split out createX86MachObjectWriter(). | Daniel Dunbar | 2010-12-20 | 4 | -17/+48
| | | | llvm-svn: 122246
* now that addc/adde are gone, "ADDC" in the X86 backend uses EFLAGS results, | Chris Lattner | 2010-12-20 | 1 | -0/+27
| | | | the same as setcc. Optimize ADDC(0,0,FLAGS) -> SET_CARRY(FLAGS). This is
| | | | a step towards finishing off PR5443. In the testcase in that bug we now get:
| | | |
| | | |     movq %rdi, %rax
| | | |     addq %rsi, %rax
| | | |     sbbq %rcx, %rcx
| | | |     testb $1, %cl
| | | |     setne %dl
| | | |     ret
| | | |
| | | | instead of:
| | | |
| | | |     movq %rdi, %rax
| | | |     addq %rsi, %rax
| | | |     movl $0, %ecx
| | | |     adcq $0, %rcx
| | | |     testq %rcx, %rcx
| | | |     setne %dl
| | | |     ret
| | | |
| | | | llvm-svn: 122219
* We lower setb to sbb with the hope that the and will go away, when it | Chris Lattner | 2010-12-20 | 1 | -0/+6
| | | | doesn't, match it back to setb. On a 64-bit version of the testcase
| | | | before we'd get:
| | | |
| | | |     movq %rdi, %rax
| | | |     addq %rsi, %rax
| | | |     sbbb %dl, %dl
| | | |     andb $1, %dl
| | | |     ret
| | | |
| | | | now we get:
| | | |
| | | |     movq %rdi, %rax
| | | |     addq %rsi, %rax
| | | |     setb %dl
| | | |     ret
| | | |
| | | | llvm-svn: 122217
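The two carry-flag commits above (r122219 and r122217) rest on small identities between `adc`, `sbb` and `setb`. The following is a minimal C++ sketch of those identities, not the backend's code: the helper names (`carry_of_add`, `sbb_self`, `adc_zero`, `setb_value`, `verify_carry_identities`) are hypothetical and only model what the instructions compute.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Carry flag left behind by an unsigned 64-bit add (what `addq` puts in CF):
// the sum wrapped around exactly when it is smaller than an operand.
inline bool carry_of_add(uint64_t a, uint64_t b) {
    return static_cast<uint64_t>(a + b) < a;
}

// Models `sbb %reg, %reg`: reg - reg - CF, i.e. 0 when CF is clear and
// all ones when CF is set.
inline uint64_t sbb_self(bool carry) {
    return carry ? ~uint64_t{0} : uint64_t{0};
}

// Models `adc $0, %reg` on a zeroed register: 0 + 0 + CF.
inline uint64_t adc_zero(bool carry) {
    return uint64_t{0} + uint64_t{0} + (carry ? 1u : 0u);
}

// Models `setb %reg8`: CF written as 0 or 1.
inline uint8_t setb_value(bool carry) {
    return carry ? 1 : 0;
}

// Checks the identities for both possible values of the carry flag.
void verify_carry_identities() {
    for (bool carry : {false, true}) {
        // ADC(0, 0, CF) is the low bit of the sbb mask.
        assert((sbb_self(carry) & 1) == adc_zero(carry));
        // sbb followed by `and $1` is the same value as setb.
        assert(static_cast<uint8_t>(sbb_self(carry) & 1) == setb_value(carry));
    }
}
```

Calling `verify_carry_identities()` exercises both flag values. The point of the sketch is only that the mask form and the 0/1 form carry the same information, which is why either can be matched back to the other.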
* use for loop over types. | Chris Lattner | 2010-12-20 | 1 | -20/+6
| | | | llvm-svn: 122214
* Change the X86 backend to stop using the evil ADDC/ADDE/SUBC/SUBE nodes (which | Chris Lattner | 2010-12-20 | 4 | -42/+173
| | | | model their carry dependencies with MVT::Flag operands) and use clean and
| | | | beautiful EFLAGS dependences instead. We do this by changing the
| | | | modelling of SBB/ADC to have EFLAGS input and outputs (which is what
| | | | requires the previous scheduler change) and changing X86 ISelLowering to
| | | | custom lower ADDC and friends down to X86ISD::ADD/ADC/SUB/SBB nodes.
| | | |
| | | | With the previous series of changes, this causes no changes in the
| | | | testsuite, woo.
| | | |
| | | | llvm-svn: 122213
* Prevents PerformShuffleCombine from creating a node with an illegal type | Mon P Wang | 2010-12-19 | 1 | -2/+7
| | | | after legalize types has run, e.g., prevent creating an i64 node from a
| | | | v2i64 when i64 is not a legal type.
| | | |
| | | | llvm-svn: 122206
* improve the setcc -> setcc_carry optimization to happen more | Chris Lattner | 2010-12-19 | 3 | -12/+34
| | | | consistently by moving it out of lowering into dag combine. Add some
| | | | missing patterns for matching away extended versions of setcc_c.
| | | |
| | | | llvm-svn: 122201
* simplify some code to just reuse a setcc if we can instead of | Chris Lattner | 2010-12-19 | 1 | -11/+16
| | | | going through the CSE maps to get it.
| | | |
| | | | llvm-svn: 122196
* now that generic vector types aren't selected onto MMX operations, | Chris Lattner | 2010-12-19 | 1 | -8/+4
| | | | we don't need -disable-mmx anymore.
| | | |
| | | | llvm-svn: 122189
* reduce copy/paste programming with the power of for loops. | Chris Lattner | 2010-12-19 | 1 | -40/+25
| | | | llvm-svn: 122187
* X86 supports i8/i16 overflow ops (except i8 multiplies), we should | Chris Lattner | 2010-12-19 | 1 | -17/+16
| | | | generate them. Now we compile:
| | | |
| | | |     define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
| | | |     entry:
| | | |       %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
| | | |       %cmp = extractvalue %0 %0, 1
| | | |       br i1 %cmp, label %if.then, label %if.end
| | | |
| | | | into:
| | | |
| | | |     _X:                                     ## @X
| | | |     ## BB#0:                                ## %entry
| | | |         subl $12, %esp
| | | |         movb 16(%esp), %al
| | | |         addb 20(%esp), %al
| | | |         jo LBB0_2
| | | |
| | | | Before we were generating:
| | | |
| | | |     _X:                                     ## @X
| | | |     ## BB#0:                                ## %entry
| | | |         pushl %ebp
| | | |         movl %esp, %ebp
| | | |         subl $8, %esp
| | | |         movb 12(%ebp), %al
| | | |         testb %al, %al
| | | |         setge %cl
| | | |         movb 8(%ebp), %dl
| | | |         testb %dl, %dl
| | | |         setge %ah
| | | |         cmpb %cl, %ah
| | | |         sete %cl
| | | |         addb %al, %dl
| | | |         testb %dl, %dl
| | | |         setge %al
| | | |         cmpb %al, %ah
| | | |         setne %al
| | | |         andb %cl, %al
| | | |         testb %al, %al
| | | |         jne LBB0_2
| | | |
| | | | llvm-svn: 122186
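The long "before" sequence in r122186 is a sign-comparison expansion of signed overflow, while the "after" sequence simply reads the overflow flag set by `addb`. A rough C++ sketch of the two views follows; the function names are hypothetical, and the exhaustive check is only there to show that both formulations agree for every pair of 8-bit inputs.

```cpp
#include <cassert>
#include <cstdint>

// Reference definition: do the add in a wider type and test the range.
// This is the condition the hardware overflow flag reports after `addb`.
inline bool sadd_overflows_wide(int8_t a, int8_t b) {
    const int wide = static_cast<int>(a) + static_cast<int>(b);
    return wide < INT8_MIN || wide > INT8_MAX;
}

// Sign-comparison expansion, mirroring the setge/cmpb/sete/setne/andb
// sequence: overflow happens when both operands have the same sign and
// the wrapped sum has a different sign.
inline bool sadd_overflows_signs(int8_t a, int8_t b) {
    const uint8_t sum_bits =
        static_cast<uint8_t>(static_cast<uint8_t>(a) + static_cast<uint8_t>(b));
    const bool a_nonneg = a >= 0;
    const bool b_nonneg = b >= 0;
    const bool sum_nonneg = (sum_bits & 0x80) == 0;
    return (a_nonneg == b_nonneg) && (a_nonneg != sum_nonneg);
}

// Exhaustively compares the two formulations over all 65536 input pairs.
void verify_sadd_overflow_i8() {
    for (int a = INT8_MIN; a <= INT8_MAX; ++a) {
        for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
            const int8_t x = static_cast<int8_t>(a);
            const int8_t y = static_cast<int8_t>(b);
            assert(sadd_overflows_wide(x, y) == sadd_overflows_signs(x, y));
        }
    }
}
```

Since the two predicates agree everywhere, replacing the expansion with a single flag-setting add plus `jo` does not change behavior; it only removes the redundant sign bookkeeping.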
* Remove the MCObjectFormat class. | Rafael Espindola | 2010-12-18 | 1 | -18/+0
| | | | llvm-svn: 122147
* Move some data to the TargetWriter. | Rafael Espindola | 2010-12-18 | 1 | -10/+9
| | | | llvm-svn: 122134
* Relax push instructions. | Rafael Espindola | 2010-12-18 | 1 | -0/+3
| | | | llvm-svn: 122121
* Add support for matching psign & pblendvb to the x86 target | Nate Begeman | 2010-12-17 | 4 | -41/+160
| | | | Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts
| | | |
| | | | llvm-svn: 122098
* Stub out explicit MCELFObjectTargetWriter interface. | Rafael Espindola | 2010-12-17 | 1 | -2/+8
| | | | llvm-svn: 122067
* Move createELFObjectWriter to its own header. | Rafael Espindola | 2010-12-17 | 1 | -0/+1
| | | | llvm-svn: 122064
* MC/Mach-O: On second thought, use a custom hook for enabling aggressive | Daniel Dunbar | 2010-12-17 | 1 | -1/+2
| | | | IsSymbolRefDifferenceFullyResolved, it turns out this does change
| | | | behavior on enough cases for x86-32 that I would rather wait a bit on it.
| | | |  - In practice, we will want to change this eventually because it only
| | | |    means we generate fewer relocations (it also eliminates the need for
| | | |    the horrible '.set' hack that Darwin requires in some places).
| | | |
| | | | llvm-svn: 122042
* MC/Target: Remove HasScatteredSymbols target hook variable, which has been | Daniel Dunbar | 2010-12-17 | 1 | -5/+1
| | | | superseded and was effectively dead.
| | | |
| | | | llvm-svn: 122024
* Make pushq produce signed relocations. | Rafael Espindola | 2010-12-16 | 1 | -1/+4
| | | | llvm-svn: 122005
* MC/Mach-O: Lift some MachObjectWriter arguments into the target specific | Daniel Dunbar | 2010-12-16 | 1 | -10/+14
| | | | interface.
| | | |
| | | | llvm-svn: 121981
* MC/Mach-O: Stub out explicit MCMachObjectTargetWriter interface. | Daniel Dunbar | 2010-12-16 | 1 | -2/+7
| | | | llvm-svn: 121973
* MC/Mach-O: Move createMachObjectWriter into MCMachObjectWriter.h. | Daniel Dunbar | 2010-12-16 | 1 | -0/+1
| | | | llvm-svn: 121971
* MC: Move target specific fixup info descriptors to TargetAsmBackend instead of | Daniel Dunbar | 2010-12-16 | 3 | -21/+25
| | | | the MCCodeEmitter, which seems like a better organization.
| | | |  - Also, cleaned up some magic constants while in the area.
| | | |
| | | | llvm-svn: 121953
* Only rr forms of ADD*_DB are commutable. | Evan Cheng | 2010-12-15 | 1 | -1/+3
| | | | llvm-svn: 121908
* Disable auto-detection of AVX support since AVX codegen support is not ready. | Evan Cheng | 2010-12-13 | 2 | -2/+5
| | | | llvm-svn: 121677
* Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it | Benjamin Kramer | 2010-12-11 | 1 | -40/+0
| | | | to catch cases where n != m with a shift.
| | | |
| | | | llvm-svn: 121608
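The identity behind r121608 can be written down directly: selecting `2^m` when bit `n` of `x` is set is the same as isolating bit `n` and shifting it to position `m`. The sketch below is an illustration under that reading of the commit message, not the instcombine implementation; `bit_select`, `bit_shift` and `verify_bit_select` are hypothetical names, and bit positions are assumed to be in the range 0..31.

```cpp
#include <cassert>
#include <cstdint>

// Select form: (x & 2^n) ? 2^m : 0, for bit positions n, m in [0, 31].
inline uint32_t bit_select(uint32_t x, unsigned n, unsigned m) {
    return (x & (uint32_t{1} << n)) ? (uint32_t{1} << m) : 0;
}

// Shift form: isolate bit n, then move it to position m. When n == m the
// shift amount is zero and this degenerates to a plain mask.
inline uint32_t bit_shift(uint32_t x, unsigned n, unsigned m) {
    const uint32_t bit = x & (uint32_t{1} << n);
    return n >= m ? bit >> (n - m) : bit << (m - n);
}

// Compares both forms for every (n, m) pair on a handful of sample values,
// including values with bit n set and with bit n clear.
void verify_bit_select() {
    for (unsigned n = 0; n < 32; ++n) {
        for (unsigned m = 0; m < 32; ++m) {
            const uint32_t bit = uint32_t{1} << n;
            const uint32_t samples[] = {0u, bit, ~bit, ~uint32_t{0}, 0x12345678u};
            for (uint32_t x : samples) {
                assert(bit_select(x, n, m) == bit_shift(x, n, m));
            }
        }
    }
}
```

The shift form avoids the branch or conditional move entirely, which is presumably why the transform is worth generalizing beyond the `n == m` case.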
* Fixed version of 121434 with no new memory leaks. | Rafael Espindola | 2010-12-10 | 1 | -23/+0
| | | | llvm-svn: 121471
* Revert my previous patch to make the valgrind bots happy. | Rafael Espindola | 2010-12-10 | 1 | -0/+23
| | | | llvm-svn: 121461
* Add some missing predicates. | Nate Begeman | 2010-12-10 | 1 | -2/+4
| | | | llvm-svn: 121445
* Formalize the notion that AVX and SSE are non-overlapping extensions from | Nate Begeman | 2010-12-10 | 7 | -49/+60
| | | | the compiler's point of view. Per email discussion, we either want to
| | | | always use VEX-prefixed instructions or never use them, and are taking
| | | | "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should
| | | | serve to restore legacy SSE support when desirable.
| | | |
| | | | llvm-svn: 121439
* Initial support for the cfi directives. This is just enough to get | Rafael Espindola | 2010-12-09 | 1 | -23/+0
| | | |
| | | |     f:
| | | |         .cfi_startproc
| | | |         nop
| | | |         .cfi_endproc
| | | |
| | | | assembled (on ELF).
| | | |
| | | | llvm-svn: 121434
* Add support for AVX to materialize +0.0 when doing scalar FP. | Nate Begeman | 2010-12-09 | 3 | -2/+12
| | | | llvm-svn: 121415
* Rewrite the darwin tlv support to use a chain and return to copying | Eric Christopher | 2010-12-09 | 2 | -6/+9
| | | | the output to the correct register. Fixes a hidden problem uncovered by
| | | | the last patch where we'd try to DAG combine our MVT::Other node oddly.
| | | |
| | | | llvm-svn: 121358
* Stop confusing people, it's not really a chain, or a tumor. | Eric Christopher | 2010-12-09 | 1 | -2/+2
| | | | llvm-svn: 121340
* Remove extraneous copy from DAG conversion for darwin tls. This was | Eric Christopher | 2010-12-09 | 1 | -3/+2
| | | | popping up at O0 when it wasn't folded and the fast allocator would
| | | | complain.
| | | |
| | | | llvm-svn: 121330
* Add rsp to the uses for the same reason as 32-bit. | Eric Christopher | 2010-12-09 | 1 | -1/+1
| | | | llvm-svn: 121328
* Allow a slash, '/', as a prefix separator for X86. rdar://8741045 | Kevin Enderby | 2010-12-08 | 1 | -0/+2
| | | | llvm-svn: 121320
* lib/Target/X86/X86MCAsmInfo.cpp: [PR8741] On Win64, specify explicit | NAKAMURA Takumi | 2010-12-07 | 1 | -1/+3
| | | | PrivateGlobalPrefix as ".L". Otherwise, global symbols @Lxxxx might be
| | | | treated as temporary symbols by MCSymbol.
| | | |
| | | | llvm-svn: 121103
* Remove the instruction fragment to data fragment lowering since it was causing | Rafael Espindola | 2010-12-06 | 1 | -3/+3
| | | | freed data to be read. I will open a bug to track it being reenabled.
| | | |
| | | | llvm-svn: 121028
* Second try at making direct object emission produce the same results | Rafael Espindola | 2010-12-06 | 1 | -4/+0
| | | | as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc
| | | | bootstraps on darwin10 using darwin9's assembler and linker.
| | | |
| | | | llvm-svn: 121006
* Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags | Chris Lattner | 2010-12-05 | 1 | -0/+3
| | | | result. This allows us to compile:
| | | |
| | | |     void *test12(long count) {
| | | |       return new int[count];
| | | |     }
| | | |
| | | | into:
| | | |
| | | |     test12:
| | | |         movl $4, %ecx
| | | |         movq %rdi, %rax
| | | |         mulq %rcx
| | | |         movq $-1, %rdi
| | | |         cmovnoq %rax, %rdi
| | | |         jmp __Znam                  ## TAILCALL
| | | |
| | | | instead of:
| | | |
| | | |     test12:
| | | |         movl $4, %ecx
| | | |         movq %rdi, %rax
| | | |         mulq %rcx
| | | |         seto %cl
| | | |         testb %cl, %cl
| | | |         movq $-1, %rdi
| | | |         cmoveq %rax, %rdi
| | | |         jmp __Znam
| | | |
| | | | Of course it would be even better if the regalloc inverted the cmov to
| | | | 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'.
| | | |
| | | | llvm-svn: 120936
* it turns out that when ".with.overflow" intrinsics were added to the X86 | Chris Lattner | 2010-12-05 | 5 | -20/+66
| | | | backend that they were all implemented except umul. This one fell back
| | | | to the default implementation that did a hi/lo multiply and compared the
| | | | top. Fix this to check the overflow flag that the 'mul' instruction
| | | | sets, so we can avoid an explicit test. Now we compile:
| | | |
| | | |     void *func(long count) {
| | | |       return new int[count];
| | | |     }
| | | |
| | | | into:
| | | |
| | | |     __Z4funcl:                      ## @_Z4funcl
| | | |         movl $4, %ecx               ## encoding: [0xb9,0x04,0x00,0x00,0x00]
| | | |         movq %rdi, %rax             ## encoding: [0x48,0x89,0xf8]
| | | |         mulq %rcx                   ## encoding: [0x48,0xf7,0xe1]
| | | |         seto %cl                    ## encoding: [0x0f,0x90,0xc1]
| | | |         testb %cl, %cl              ## encoding: [0x84,0xc9]
| | | |         movq $-1, %rdi              ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
| | | |         cmoveq %rax, %rdi           ## encoding: [0x48,0x0f,0x44,0xf8]
| | | |         jmp __Znam                  ## TAILCALL
| | | |
| | | | instead of:
| | | |
| | | |     __Z4funcl:                      ## @_Z4funcl
| | | |         movl $4, %ecx               ## encoding: [0xb9,0x04,0x00,0x00,0x00]
| | | |         movq %rdi, %rax             ## encoding: [0x48,0x89,0xf8]
| | | |         mulq %rcx                   ## encoding: [0x48,0xf7,0xe1]
| | | |         testq %rdx, %rdx            ## encoding: [0x48,0x85,0xd2]
| | | |         movq $-1, %rdi              ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
| | | |         cmoveq %rax, %rdi           ## encoding: [0x48,0x0f,0x44,0xf8]
| | | |         jmp __Znam                  ## TAILCALL
| | | |
| | | | Other than the silly seto+test, this is using the o bit directly, so
| | | | it's going in the right direction.
| | | |
| | | | llvm-svn: 120935
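The `new int[count]` examples in r120936 and r120935 boil down to an unsigned multiply whose result is replaced by all ones when it overflows, so that the allocation request fails. The sketch below shows that logic on 32-bit operands so the full product fits in a standard 64-bit type; the operand width and all function names are assumptions made for portability, and the real code works on 64-bit values with `mulq`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// "Hi/lo multiply and compare the top": the product overflows exactly when
// the high half of the double-width result is nonzero. On x86 the `mul`
// instruction sets its overflow flag in this same situation.
inline bool umul_overflows_high_half(uint32_t a, uint32_t b) {
    const uint64_t product = static_cast<uint64_t>(a) * b;
    return (product >> 32) != 0;
}

// Division-based reference check, used only to cross-check the above.
inline bool umul_overflows_reference(uint32_t a, uint32_t b) {
    return a != 0 && b > std::numeric_limits<uint32_t>::max() / a;
}

// Size computation for an array allocation: the byte count, or all ones
// when the multiply overflows (the `movq $-1` / `cmovno` pair in the log).
inline uint32_t array_bytes_or_all_ones(uint32_t count, uint32_t elem_size) {
    if (umul_overflows_high_half(count, elem_size)) {
        return std::numeric_limits<uint32_t>::max();
    }
    return count * elem_size;
}

// Cross-checks the two overflow predicates on boundary-heavy samples.
void verify_umul_overflow() {
    const uint32_t samples[] = {0u, 1u, 2u, 4u, 0xFFFFu, 0x10000u, 0x10001u,
                                0x3FFFFFFFu, 0x40000000u, 0x7FFFFFFFu,
                                0x80000000u, 0xFFFFFFFFu};
    for (uint32_t a : samples) {
        for (uint32_t b : samples) {
            assert(umul_overflows_high_half(a, b) == umul_overflows_reference(a, b));
        }
    }
}
```

For instance, `array_bytes_or_all_ones(3, 4)` yields 12, while a count of `0x40000000` with 4-byte elements overflows and yields all ones. Reading the flag that the multiply already produced is what lets the compiled code drop the separate test of the high half.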
* generalize the previous check to handle -1 on either side of the | Chris Lattner | 2010-12-05 | 1 | -7/+17
| | | | select, inserting a not to compensate. Add a missing isZero check
| | | | that I lost somehow.
| | | |
| | | | This improves codegen of:
| | | |
| | | |     void *func(long count) {
| | | |       return new int[count];
| | | |     }
| | | |
| | | | from:
| | | |
| | | |     __Z4funcl:                      ## @_Z4funcl
| | | |         movl $4, %ecx               ## encoding: [0xb9,0x04,0x00,0x00,0x00]
| | | |         movq %rdi, %rax             ## encoding: [0x48,0x89,0xf8]
| | | |         mulq %rcx                   ## encoding: [0x48,0xf7,0xe1]
| | | |         testq %rdx, %rdx            ## encoding: [0x48,0x85,0xd2]
| | | |         movq $-1, %rdi              ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
| | | |         cmoveq %rax, %rdi           ## encoding: [0x48,0x0f,0x44,0xf8]
| | | |         jmp __Znam                  ## TAILCALL
| | | |                                     ## encoding: [0xeb,A]
| | | |
| | | | to:
| | | |
| | | |     __Z4funcl:                      ## @_Z4funcl
| | | |         movl $4, %ecx               ## encoding: [0xb9,0x04,0x00,0x00,0x00]
| | | |         movq %rdi, %rax             ## encoding: [0x48,0x89,0xf8]
| | | |         mulq %rcx                   ## encoding: [0x48,0xf7,0xe1]
| | | |         cmpq $1, %rdx               ## encoding: [0x48,0x83,0xfa,0x01]
| | | |         sbbq %rdi, %rdi             ## encoding: [0x48,0x19,0xff]
| | | |         notq %rdi                   ## encoding: [0x48,0xf7,0xd7]
| | | |         orq %rax, %rdi              ## encoding: [0x48,0x09,0xc7]
| | | |         jmp __Znam                  ## TAILCALL
| | | |                                     ## encoding: [0xeb,A]
| | | |
| | | | llvm-svn: 120932
* Improve an integer select optimization in two ways: | Chris Lattner | 2010-12-05 | 1 | -21/+33
| | | |
| | | | 1. generalize
| | | |      (select (x == 0), -1, 0) -> (sign_bit (x - 1))
| | | |    to:
| | | |      (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y
| | | |
| | | | 2. Handle the identical pattern that happens with !=:
| | | |      (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y
| | | |
| | | | cmov is often high latency and can't fold immediates or
| | | | memory operands. For example for (x == 0) ? -1 : 1, before
| | | | we got:
| | | |
| | | | <   testb %sil, %sil
| | | | <   movl $-1, %ecx
| | | | <   movl $1, %eax
| | | | <   cmovel %ecx, %eax
| | | |
| | | | now we get:
| | | |
| | | | >   cmpb $1, %sil
| | | | >   sbbl %eax, %eax
| | | | >   orl $1, %eax
| | | |
| | | | llvm-svn: 120929
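The select rewrites in r120929 and r120932 can be checked with plain unsigned arithmetic. In the sketch below, `borrow_mask` models the `cmp $1, x` / `sbb r, r` pair (all ones when `x` is zero, since `x - 1` borrows only then); this is a hedged illustration of the identities in the commit messages, and every function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models `cmp $1, x ; sbb r, r`: comparing x with 1 borrows only when
// x == 0, and sbb then spreads that borrow into every bit of the result.
inline uint32_t borrow_mask(uint32_t x) {
    return uint32_t{0} - static_cast<uint32_t>(x < 1u);
}

// (select (x == 0), -1, y), written as an ordinary conditional.
inline uint32_t select_all_ones_if_zero(uint32_t x, uint32_t y) {
    return x == 0 ? ~uint32_t{0} : y;
}

// Branch-free form from r120929: mask | y.
inline uint32_t or_form_if_zero(uint32_t x, uint32_t y) {
    return borrow_mask(x) | y;
}

// (select (x == 0), y, -1): the -1 sits on the other arm of the select.
inline uint32_t select_all_ones_if_nonzero(uint32_t x, uint32_t y) {
    return x == 0 ? y : ~uint32_t{0};
}

// Branch-free form from r120932: invert the mask first ("inserting a not
// to compensate"), then or in y.
inline uint32_t or_form_if_nonzero(uint32_t x, uint32_t y) {
    return ~borrow_mask(x) | y;
}

// Compares each select with its branch-free form on sample inputs.
void verify_select_identities() {
    const uint32_t samples[] = {0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
    for (uint32_t x : samples) {
        for (uint32_t y : samples) {
            assert(select_all_ones_if_zero(x, y) == or_form_if_zero(x, y));
            assert(select_all_ones_if_nonzero(x, y) == or_form_if_nonzero(x, y));
        }
    }
}
```

Both rewrites trade a conditional move for `sbb` plus `or`, which matches the motivation given in the log: `cmov` is often high latency and cannot fold immediates or memory operands, whereas `or` can.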
* Initialize HasPOPCNT. | Bill Wendling | 2010-12-04 | 1 | -1/+2
| | | | llvm-svn: 120923
* Add patterns for the x86 popcnt instruction. | Benjamin Kramer | 2010-12-04 | 4 | -15/+32
| | | |  - Also adds a new POPCNT subtarget feature that is currently enabled if
| | | |    the target supports SSE4.2 (nehalem) or SSE4A (barcelona).
| | | |
| | | | llvm-svn: 120917
* Simplify code. No functionality change. | Benjamin Kramer | 2010-12-04 | 1 | -4/+3
| | | | llvm-svn: 120907
* There are two reasons why we might want to use | Rafael Espindola | 2010-12-04 | 1 | -0/+4
| | | |
| | | |     foo = a - b
| | | |     .long foo
| | | |
| | | | instead of just
| | | |
| | | |     .long a - b
| | | |
| | | | First, on darwin9 64 bits the assembler produces the wrong result.
| | | | Second, if "a" is the end of the section all darwin assemblers (9, 10
| | | | and mc) will not consider a - b to be a constant but will if the dummy
| | | | foo is created.
| | | |
| | | | Split how we handle these cases. The first one is something MC should
| | | | take care of. The second one has to be handled by the caller.
| | | |
| | | | llvm-svn: 120889