path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
...
* Second try at making direct object emission produce the same results as llc + llvm-mc. This time ELF is not changed, and I tested that llvm-gcc bootstraps on darwin10 using darwin9's assembler and linker.
  llvm-svn: 121006
  Rafael Espindola | 2010-12-06 | 1 file | -4/+0
* Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. This allows us to compile:

      void *test12(long count) {
        return new int[count];
      }

  into:

      test12:
        movl    $4, %ecx
        movq    %rdi, %rax
        mulq    %rcx
        movq    $-1, %rdi
        cmovnoq %rax, %rdi
        jmp     __Znam          ## TAILCALL

  instead of:

      test12:
        movl    $4, %ecx
        movq    %rdi, %rax
        mulq    %rcx
        seto    %cl
        testb   %cl, %cl
        movq    $-1, %rdi
        cmoveq  %rax, %rdi
        jmp     __Znam

  Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'.
  llvm-svn: 120936
  Chris Lattner | 2010-12-05 | 1 file | -0/+3
* it turns out that when the ".with.overflow" intrinsics were added to the X86 backend, they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile:

      void *func(long count) {
        return new int[count];
      }

  into:

      __Z4funcl:                      ## @_Z4funcl
        movl    $4, %ecx              ## encoding: [0xb9,0x04,0x00,0x00,0x00]
        movq    %rdi, %rax            ## encoding: [0x48,0x89,0xf8]
        mulq    %rcx                  ## encoding: [0x48,0xf7,0xe1]
        seto    %cl                   ## encoding: [0x0f,0x90,0xc1]
        testb   %cl, %cl              ## encoding: [0x84,0xc9]
        movq    $-1, %rdi             ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
        cmoveq  %rax, %rdi            ## encoding: [0x48,0x0f,0x44,0xf8]
        jmp     __Znam                ## TAILCALL

  instead of:

      __Z4funcl:                      ## @_Z4funcl
        movl    $4, %ecx              ## encoding: [0xb9,0x04,0x00,0x00,0x00]
        movq    %rdi, %rax            ## encoding: [0x48,0x89,0xf8]
        mulq    %rcx                  ## encoding: [0x48,0xf7,0xe1]
        testq   %rdx, %rdx            ## encoding: [0x48,0x85,0xd2]
        movq    $-1, %rdi             ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
        cmoveq  %rax, %rdi            ## encoding: [0x48,0x0f,0x44,0xf8]
        jmp     __Znam                ## TAILCALL

  Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction.
  llvm-svn: 120935
  Chris Lattner | 2010-12-05 | 5 files | -20/+66
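
The size computation behind `new int[count]` in the entry above can be approximated at the source level. This is only a sketch, not the backend code: `array_alloc_size` is a hypothetical helper, and it relies on the GCC/Clang builtin `__builtin_mul_overflow`, which reports unsigned overflow much as the overflow flag does after `mul`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the commit): number of bytes needed for
// `count` ints, or SIZE_MAX when the multiplication overflows. SIZE_MAX
// plays the role of the "movq $-1, %rdi" path in the assembly.
std::size_t array_alloc_size(std::size_t count) {
    std::size_t bytes = 0;
    // __builtin_mul_overflow (GCC/Clang) returns true if the product
    // does not fit in the result type.
    if (__builtin_mul_overflow(count, sizeof(int), &bytes)) {
        return SIZE_MAX;
    }
    return bytes;
}
```

Passing the all-ones size to the allocator makes the allocation fail instead of silently under-allocating, which is why the generated code selects -1 on overflow.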
* generalize the previous check to handle -1 on either side of the select, inserting a not to compensate. Add a missing isZero check that I lost somehow. This improves codegen of:

      void *func(long count) {
        return new int[count];
      }

  from:

      __Z4funcl:                      ## @_Z4funcl
        movl    $4, %ecx              ## encoding: [0xb9,0x04,0x00,0x00,0x00]
        movq    %rdi, %rax            ## encoding: [0x48,0x89,0xf8]
        mulq    %rcx                  ## encoding: [0x48,0xf7,0xe1]
        testq   %rdx, %rdx            ## encoding: [0x48,0x85,0xd2]
        movq    $-1, %rdi             ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
        cmoveq  %rax, %rdi            ## encoding: [0x48,0x0f,0x44,0xf8]
        jmp     __Znam                ## TAILCALL
                                      ## encoding: [0xeb,A]

  to:

      __Z4funcl:                      ## @_Z4funcl
        movl    $4, %ecx              ## encoding: [0xb9,0x04,0x00,0x00,0x00]
        movq    %rdi, %rax            ## encoding: [0x48,0x89,0xf8]
        mulq    %rcx                  ## encoding: [0x48,0xf7,0xe1]
        cmpq    $1, %rdx              ## encoding: [0x48,0x83,0xfa,0x01]
        sbbq    %rdi, %rdi            ## encoding: [0x48,0x19,0xff]
        notq    %rdi                  ## encoding: [0x48,0xf7,0xd7]
        orq     %rax, %rdi            ## encoding: [0x48,0x09,0xc7]
        jmp     __Znam                ## TAILCALL
                                      ## encoding: [0xeb,A]

  llvm-svn: 120932
  Chris Lattner | 2010-12-05 | 1 file | -7/+17
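
The cmp/sbb/not/or sequence in the entry above can be read as a branchless select. The sketch below mirrors it in C++ under that reading; `all_ones_if_nonzero` is a hypothetical name, and the comments map each step to the instruction it is assumed to correspond to.

```cpp
#include <cstdint>

// Hypothetical illustration: returns all ones when `hi` is nonzero,
// otherwise returns `lo`, without a conditional move.
std::uint64_t all_ones_if_nonzero(std::uint64_t hi, std::uint64_t lo) {
    // cmpq $1, hi ; sbbq r, r  -> r is all ones exactly when hi < 1,
    // i.e. when hi == 0 (the borrow from hi - 1).
    std::uint64_t borrow_mask =
        std::uint64_t{0} - static_cast<std::uint64_t>(hi < 1);
    // notq r -> all ones exactly when hi != 0 (the compensating not).
    std::uint64_t nonzero_mask = ~borrow_mask;
    // orq lo, r -> -1 when hi != 0, lo otherwise.
    return nonzero_mask | lo;
}
```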
* Improve an integer select optimization in two ways:

  1. generalize
       (select (x == 0), -1, 0) -> (sign_bit (x - 1))
     to:
       (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y

  2. Handle the identical pattern that happens with !=:
       (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y

  cmov is often high latency and can't fold immediates or memory operands. For example for (x == 0) ? -1 : 1, before we got:

      < testb %sil, %sil
      < movl $-1, %ecx
      < movl $1, %eax
      < cmovel %ecx, %eax

  now we get:

      > cmpb $1, %sil
      > sbbl %eax, %eax
      > orl $1, %eax

  llvm-svn: 120929
  Chris Lattner | 2010-12-05 | 1 file | -21/+33
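
The rewrite in the entry above can be checked with a small C++ model. This is a sketch under the assumption that the commit's `sign_bit (x - 1)` term corresponds to the borrow produced by `cmp $1, x` and broadcast by `sbb`; `all_ones_if_zero` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical illustration of (select (x == 0), -1, y) computed as
// (borrow of x - 1, broadcast to every bit) | y.
std::uint32_t all_ones_if_zero(std::uint32_t x, std::uint32_t y) {
    // cmp $1, x ; sbb r, r  -> r is all ones exactly when x == 0.
    std::uint32_t mask = 0u - static_cast<std::uint32_t>(x < 1u);
    // or y, r  -> -1 when x == 0, y otherwise.
    return mask | y;
}
```

With `y` fixed to 1 this matches the three-instruction cmpb/sbbl/orl sequence quoted in the commit message.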
* Initialize HasPOPCNT.
  llvm-svn: 120923
  Bill Wendling | 2010-12-04 | 1 file | -1/+2
* Add patterns for the x86 popcnt instruction.
  - Also adds a new POPCNT subtarget feature that is currently enabled if the target supports SSE4.2 (nehalem) or SSE4A (barcelona).
  llvm-svn: 120917
  Benjamin Kramer | 2010-12-04 | 4 files | -15/+32
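
For context, this is the kind of source-level operation that such patterns let a compiler map to a single `popcnt`. The sketch assumes a GCC or Clang toolchain (for `__builtin_popcountll`); `count_set_bits` is a hypothetical name, and whether `popcnt` is actually emitted depends on the target features selected at compile time.

```cpp
#include <cstdint>

// Hypothetical helper: number of 1 bits in a 64-bit value.
// __builtin_popcountll is a GCC/Clang builtin; on targets without the
// POPCNT feature the compiler falls back to a software sequence.
int count_set_bits(std::uint64_t value) {
    return __builtin_popcountll(value);
}
```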
* Simplify code. No functionality change.
  llvm-svn: 120907
  Benjamin Kramer | 2010-12-04 | 1 file | -4/+3
* There are two reasons why we might want to use

      foo = a - b
      .long foo

  instead of just

      .long a - b

  First, on darwin9 64 bits the assembler produces the wrong result. Second, if "a" is the end of the section all darwin assemblers (9, 10 and mc) will not consider a - b to be a constant but will if the dummy foo is created.

  Split how we handle these cases. The first one is something MC should take care of. The second one has to be handled by the caller.
  llvm-svn: 120889
  Rafael Espindola | 2010-12-04 | 1 file | -0/+4
* Revert this change since it breaks a couple of the AVX tests. I'm unclear if the tests are actually correct or not, but reverting for now.
  llvm-svn: 120847
  Nate Begeman | 2010-12-03 | 1 file | -7/+12
* Scalar f32/f64 are also subregs of ymm regs
  llvm-svn: 120844
  Nate Begeman | 2010-12-03 | 1 file | -0/+6
* Remove SSE1-4 disable when AVX is enabled. While this may be useful for development, it completely breaks scalar fp in xmm regs when AVX is enabled.
  llvm-svn: 120843
  Nate Begeman | 2010-12-03 | 1 file | -12/+7
* Revert r120580.
  llvm-svn: 120630
  Devang Patel | 2010-12-02 | 1 file | -14/+0
* Fix and re-enable tail call optimization of expanded libcalls.
  llvm-svn: 120622
  Evan Cheng | 2010-12-01 | 1 file | -18/+19
* Disable debug info for x86-darwin9 and earlier until PR 8715 and radar 8709290 are fixed.
  llvm-svn: 120580
  Devang Patel | 2010-12-01 | 1 file | -0/+14
* I don't think it makes any sense to assert that the target supports SSE3 here. The user (i.e. whoever generated a call to the intrinsic in the first place) is essentially asking for a particular instruction to be placed in the assembler. If that instruction won't execute on the target machine, that's their problem not ours. Two buildbots with processors that don't support SSE3 were barfing on the apm.ll test in CodeGen/X86 because of this assertion.
  llvm-svn: 120574
  Duncan Sands | 2010-12-01 | 1 file | -4/+0
* Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot.
  llvm-svn: 120549
  Evan Cheng | 2010-12-01 | 1 file | -0/+2
* Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now.
  rdar://8707777
  llvm-svn: 120501
  Evan Cheng | 2010-11-30 | 4 files | -33/+43
* Move X86InstrFPStack.td over to PseudoI as well.
  llvm-svn: 120470
  Eric Christopher | 2010-11-30 | 1 file | -27/+9
* Migrate X86InstrControl.td to use PseudoI and fix a couple of 80-col violations while I'm in there.
  llvm-svn: 120466
  Eric Christopher | 2010-11-30 | 1 file | -19/+15
* Fix some grammar in comments I noticed.
  llvm-svn: 120416
  Eric Christopher | 2010-11-30 | 1 file | -5/+5
* This defaults to GenericDomain.
  llvm-svn: 120415
  Eric Christopher | 2010-11-30 | 1 file | -1/+1
* Implement a PseudoI class and transfer the sse instructions over to use it.
  llvm-svn: 120412
  Eric Christopher | 2010-11-30 | 2 files | -12/+15
* Fix insertion point in pcmp expander. While I'm there, clean up too many \n even for me.
  llvm-svn: 120411
  Eric Christopher | 2010-11-30 | 1 file | -9/+2
* Fix some cleanups from my last patch.
  llvm-svn: 120410
  Eric Christopher | 2010-11-30 | 2 files | -5/+5
* Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573.
  llvm-svn: 120404
  Eric Christopher | 2010-11-30 | 3 files | -4/+75
* Merge System into Support.
  llvm-svn: 120298
  Michael J. Spencer | 2010-11-29 | 3 files | -3/+3
* Move lowering of TLS_addr32 and TLS_addr64 to X86MCInstLower.
  llvm-svn: 120263
  Rafael Espindola | 2010-11-28 | 3 files | -46/+67
* fix PR8686, accepting a 'b' suffix at the end of all the setcc instructions. I chose to handle this with an asmparser hack, though it could be handled by changing all the instruction definitions to be "setneb" instead of "setne". The asm parser hack is better in this case, because we want the disassembler to produce setne, not setneb.
  llvm-svn: 120260
  Chris Lattner | 2010-11-28 | 1 file | -0/+5
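
A mnemonic-level hack of the kind described in the entry above might look like the following sketch. It is not the parser's actual code: `strip_setcc_byte_suffix` is a hypothetical function, and the special-casing of "setb" and "setnb" is an assumption made here because those are already complete setcc mnemonics whose final 'b' is part of the condition code.

```cpp
#include <string>

// Hypothetical illustration: drop a trailing 'b' size suffix from setcc
// mnemonics so that "setneb" is treated as "setne".
std::string strip_setcc_byte_suffix(const std::string& mnemonic) {
    const bool starts_with_set = mnemonic.compare(0, 3, "set") == 0;
    const bool ends_with_b = !mnemonic.empty() && mnemonic.back() == 'b';
    // "setb" and "setnb" are real condition codes (below / not below),
    // so their final 'b' must be kept.
    if (starts_with_set && ends_with_b &&
        mnemonic != "setb" && mnemonic != "setnb") {
        return mnemonic.substr(0, mnemonic.size() - 1);
    }
    return mnemonic;
}
```

Doing this while parsing leaves the instruction definitions untouched, so the printed form stays `setne`, which is the property the commit message asks for.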
* Define generic 1, 2 and 4 byte pc relative relocations. They are common and at least the 4 byte one will be needed to implement the .cfi_* directives.
  llvm-svn: 120240
  Rafael Espindola | 2010-11-28 | 3 files | -23/+11
* Move more PEI-related hooks to TFI
  llvm-svn: 120229
  Anton Korobeynikov | 2010-11-27 | 4 files | -47/+48
* Move callee-saved regs spills / reloads to TFI
  llvm-svn: 120228
  Anton Korobeynikov | 2010-11-27 | 4 files | -81/+81
* Lower TLS_addr32 and TLS_addr64.
  llvm-svn: 120225
  Rafael Espindola | 2010-11-27 | 3 files | -9/+50
* Implement the data16 prefix.
  llvm-svn: 120224
  Rafael Espindola | 2010-11-27 | 2 files | -1/+4
* MC/Mach-O: Switch to using MachOFormat.h.
  - I'm leaving MachO.h, because I believe it has external consumers, but I would really like to eliminate it (we have stylistic disagreements with one another).
  llvm-svn: 120187
  Daniel Dunbar | 2010-11-27 | 1 file | -6/+7
* Remove the unused TheTarget member.
  llvm-svn: 120168
  Rafael Espindola | 2010-11-26 | 1 file | -1/+1
* Use multiple 0x66 prefixes so that all nops up to 15 bytes are a single instruction.
  llvm-svn: 120147
  Rafael Espindola | 2010-11-25 | 1 file | -24/+8
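
The idea in the entry above can be sketched as follows. The byte table holds the commonly used x86 long-NOP encodings for 1 to 10 bytes; it is an assumption for illustration and is not copied from this commit. `encode_single_nop` is a hypothetical name. Lengths 11 to 15 are formed by putting extra 0x66 operand-size prefixes in front of the 10-byte form, so the padding is still one instruction.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration: encode one NOP instruction of `length`
// bytes (1..15). Returns an empty vector for unsupported lengths.
std::vector<std::uint8_t> encode_single_nop(std::size_t length) {
    // Assumed base encodings for 1..10 bytes.
    static const std::vector<std::vector<std::uint8_t>> base = {
        {0x90},
        {0x66, 0x90},
        {0x0f, 0x1f, 0x00},
        {0x0f, 0x1f, 0x40, 0x00},
        {0x0f, 0x1f, 0x44, 0x00, 0x00},
        {0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00},
        {0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00},
        {0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
        {0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
        {0x66, 0x2e, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
    };
    if (length == 0 || length > 15) {
        return {};
    }
    const std::size_t base_length = length < 10 ? length : 10;
    // Extra 0x66 prefixes make up the difference for lengths 11..15.
    std::vector<std::uint8_t> bytes(length - base_length, std::uint8_t{0x66});
    const std::vector<std::uint8_t>& tail = base[base_length - 1];
    bytes.insert(bytes.end(), tail.begin(), tail.end());
    return bytes;
}
```

The 15-byte cap reflects the architectural limit on the length of a single x86 instruction.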
* Implement the rex64 prefix.
  llvm-svn: 120017
  Rafael Espindola | 2010-11-23 | 2 files | -1/+5
* Produce a relocation for pcrel absolute values. Based on a patch by David Meyer.
  llvm-svn: 120006
  Rafael Espindola | 2010-11-23 | 1 file | -8/+12
* Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.
  llvm-svn: 119990
  Wesley Peck | 2010-11-23 | 3 files | -154/+154
* Remove duplicated constants. Thanks to Jason for noticing it.
  llvm-svn: 119985
  Rafael Espindola | 2010-11-22 | 2 files | -42/+26
* apparently tailcalls are better on darwin/x86-64 than on linux?
  llvm-svn: 119947
  Chris Lattner | 2010-11-21 | 1 file | -0/+18
* implement PR8524, apparently mainline gas accepts movq as an alias for movd when transferring between i64 gprs and mmx regs.
  llvm-svn: 119931
  Chris Lattner | 2010-11-21 | 1 file | -0/+4
* Move some more hooks to TargetFrameInfo
  llvm-svn: 119904
  Anton Korobeynikov | 2010-11-20 | 4 files | -38/+37
* On X86, MEMBARRIER, MFENCE, SFENCE, LFENCE are not target memory intrinsics, so don't claim they are. They are allocated using DAG.getNode, so attempts to access MemSDNode fields result in reading off the end of the allocated memory. This fixes crashes with "llc -debug" due to debug code trying to print MemSDNode fields for these barrier nodes (since the crashes are not deterministic, use valgrind to see this). Add some nasty checking to try to catch this kind of thing in the future.
  llvm-svn: 119901
  Duncan Sands | 2010-11-20 | 1 file | -6/+6
* Move getInitialFrameState() to TargetFrameInfo
  llvm-svn: 119754
  Anton Korobeynikov | 2010-11-18 | 5 files | -39/+35
* Move hasFP() and a few related hooks to TargetFrameInfo.
  llvm-svn: 119740
  Anton Korobeynikov | 2010-11-18 | 6 files | -63/+72
* trivial QoI improvement. On this invalid input:

      sahf movl 344(%rdi),%r14d

  we used to produce:

      t.s:2:1: error: unexpected token in argument list
      ^

  we now produce:

      t.s:1:11: error: unexpected token in argument list
      sahf movl 344(%rdi),%r14d
      ^

  rdar://8581401
  llvm-svn: 119676
  Chris Lattner | 2010-11-18 | 1 file | -1/+2
* make isVirtualSection a virtual method on MCSection. Chris' suggestion.
  llvm-svn: 119547
  Rafael Espindola | 2010-11-17 | 1 file | -17/+0
* Reapply r118917. With pseudo-instruction expansion moved to a different pass, the complicated interaction between cmov expansion and fast isel is no longer a concern.
  llvm-svn: 119400
  Dan Gohman | 2010-11-16 | 1 file | -5/+5