path: root/llvm/lib/Target/X86/X86InstrInfo.cpp
Commit message [Author, Age, Files, Lines]
* Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions. [Evan Cheng, 2010-03-14, 1 file, -0/+17]
| | | | llvm-svn: 98465
* Don't try to fold V_SET0 and V_SETALLONES to loads in medium and large code models. [Dan Gohman, 2010-03-09, 1 file, -0/+5]
| | | | llvm-svn: 98042
* Revert r97766. It's deleting a tag. [Bill Wendling, 2010-03-05, 1 file, -52/+13]
| | | | llvm-svn: 97768
* Micro-optimization: [Bill Wendling, 2010-03-05, 1 file, -13/+52]
| | | | This code:
| | | |
| | | |   float floatingPointComparison(float x, float y) {
| | | |       double product = (double)x * y;
| | | |       if (product == 0.0)
| | | |           return product;
| | | |       return product - 1.0;
| | | |   }
| | | |
| | | | produces this:
| | | |
| | | |   _floatingPointComparison:
| | | |   0000000000000000  cvtss2sd %xmm1,%xmm1
| | | |   0000000000000004  cvtss2sd %xmm0,%xmm0
| | | |   0000000000000008  mulsd    %xmm1,%xmm0
| | | |   000000000000000c  pxor     %xmm1,%xmm1
| | | |   0000000000000010  ucomisd  %xmm1,%xmm0
| | | |   0000000000000014  jne      0x00000004
| | | |   0000000000000016  jp       0x00000002
| | | |   0000000000000018  jmp      0x00000008
| | | |   000000000000001a  addsd    0x00000006(%rip),%xmm0
| | | |   0000000000000022  cvtsd2ss %xmm0,%xmm0
| | | |   0000000000000026  ret
| | | |
| | | | The "jne/jp/jmp" sequence can be reduced to this instead:
| | | |
| | | |   _floatingPointComparison:
| | | |   0000000000000000  cvtss2sd %xmm1,%xmm1
| | | |   0000000000000004  cvtss2sd %xmm0,%xmm0
| | | |   0000000000000008  mulsd    %xmm1,%xmm0
| | | |   000000000000000c  pxor     %xmm1,%xmm1
| | | |   0000000000000010  ucomisd  %xmm1,%xmm0
| | | |   0000000000000014  jp       0x00000002
| | | |   0000000000000016  je       0x00000008
| | | |   0000000000000018  addsd    0x00000006(%rip),%xmm0
| | | |   0000000000000020  cvtsd2ss %xmm0,%xmm0
| | | |   0000000000000024  ret
| | | |
| | | | for a savings of 2 bytes. This xform can happen when we recognize that jne and jp jump to the same "true" MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch is the fall-through MBB.
| | | | llvm-svn: 97766
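The equivalence of the two branch sequences in r97766 can be checked with a small model. The sketch below is not LLVM code: `Flags`, `Target`, and the function names are hypothetical, and it only models the two EFLAGS bits that the branches read after `ucomisd` (ZF is set for "equal" or "unordered", PF only for "unordered").

```cpp
#include <cassert>

// Hypothetical model of the two EFLAGS bits read by the branches.
struct Flags {
  bool ZF; // set when the operands are equal or unordered
  bool PF; // set only when the operands are unordered (a NaN is involved)
};

enum class Target { TrueBB, FalseBB };

// Original sequence: jne TrueBB; jp TrueBB; jmp FalseBB.
Target originalSequence(Flags F) {
  if (!F.ZF) return Target::TrueBB; // jne
  if (F.PF)  return Target::TrueBB; // jp
  return Target::FalseBB;           // jmp
}

// Reduced sequence: jp TrueBB; je FalseBB; fall through into TrueBB.
Target reducedSequence(Flags F) {
  if (F.PF) return Target::TrueBB;  // jp
  if (F.ZF) return Target::FalseBB; // je
  return Target::TrueBB;            // fall-through
}

// Compares both sequences over every combination of the two flags.
bool sequencesAgree() {
  for (int Z = 0; Z < 2; ++Z)
    for (int P = 0; P < 2; ++P) {
      Flags F{Z != 0, P != 0};
      if (originalSequence(F) != reducedSequence(F))
        return false;
    }
  return true;
}
```

Here "TrueBB" stands for the block holding the `addsd`, matching the commit's description of the "true" MBB as the fall-through block.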
* Implement XMM subregs. [Dan Gohman, 2010-02-28, 1 file, -14/+4]
| | | | | | | | | | | Extracting the low element of a vector is now done with EXTRACT_SUBREG, and the zero-extension performed by load movss is now modeled with SUBREG_TO_REG, and so on. Register-to-register movss and movsd are no longer considered copies; they are two-address instructions which insert a scalar into a vector. llvm-svn: 97354
* movl is a cheaper way to materialize 0 without clobbering EFLAGS than movabsq. [Dan Gohman, 2010-02-26, 1 file, -1/+1]
| | | | llvm-svn: 97227
* Fix a typo in a comment. [Dan Gohman, 2010-02-22, 1 file, -1/+1]
| | | | llvm-svn: 96778
* add a bunch of mod/rm encoding types for fixed mod/rm bytes. [Chris Lattner, 2010-02-12, 1 file, -0/+8]
| | | | | | | This will work better for the disassembler for modeling things like lfence/monitor/vmcall etc. llvm-svn: 95960
* refactor the conditional jump instructions in the .td file to use a multipattern that generates both the 1-byte and 4-byte versions from the same defm [Chris Lattner, 2010-02-11, 1 file, -40/+40]
| | | | llvm-svn: 95901
* move target-independent opcodes out of TargetInstrInfo into TargetOpcodes.h. [Chris Lattner, 2010-02-09, 1 file, -5/+5]
| | | | #include the new TargetOpcodes.h into MachineInstr. Add new inline accessors (like isPHI()) to MachineInstr, and start using them throughout the codebase. llvm-svn: 95687
* port X86InstrInfo::determineREX over to the new encoder. [Chris Lattner, 2010-02-05, 1 file, -5/+4]
| | | | llvm-svn: 95440
* move functions for decoding X86II values into the X86II namespace. [Chris Lattner, 2010-02-05, 1 file, -9/+9]
| | | | llvm-svn: 95410
* change getSizeOfImm and getBaseOpcodeFor to just take TSFlags directly instead of a TargetInstrDesc. [Chris Lattner, 2010-02-05, 1 file, -19/+9]
| | | | llvm-svn: 95405
* use findDebugLoc in more places. [Dale Johannesen, 2010-01-26, 1 file, -10/+5]
| | | | llvm-svn: 94477
* Be more conservative with clustering f32 / f64 loads. [Evan Cheng, 2010-01-22, 1 file, -0/+2]
| | | | llvm-svn: 94254
* Add two target hooks to determine whether two loads are near and should be scheduled together. [Evan Cheng, 2010-01-22, 1 file, -0/+130]
| | | | llvm-svn: 94147
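The idea of "near" loads can be illustrated with a toy heuristic. This is a minimal sketch under stated assumptions, not the hooks added by r94147: the `LoadSketch` type, the function names, and both thresholds (`MaxDistance`, `MaxCluster`) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a load: which base pointer it uses and the
// constant byte offset from that base.
struct LoadSketch {
  int BaseId;
  std::int64_t Offset;
};

// First question: do both loads address memory off the same base?
bool sameBaseSketch(const LoadSketch &A, const LoadSketch &B) {
  return A.BaseId == B.BaseId;
}

// Second question: are the two loads close enough that scheduling them
// together is likely to help? Both limits below are assumed values.
bool shouldClusterSketch(const LoadSketch &A, const LoadSketch &B,
                         unsigned AlreadyClustered) {
  const std::int64_t MaxDistance = 512; // bytes (assumption)
  const unsigned MaxCluster = 4;        // loads per cluster (assumption)
  if (!sameBaseSketch(A, B))
    return false;
  std::int64_t Dist = B.Offset - A.Offset;
  if (Dist < 0)
    Dist = -Dist;
  return Dist <= MaxDistance && AlreadyClustered < MaxCluster;
}
```

Splitting the decision into two predicates mirrors the commit's "two target hooks" wording; a scheduler could ask the first to group candidates and the second to cap each group.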
* Fix a minor issue in x86 load / store folding table. movups does an unaligned load so it doesn't require 16-byte alignment. [Evan Cheng, 2010-01-21, 1 file, -1/+1]
| | | | llvm-svn: 94058
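A folding table with a per-entry alignment requirement might look like the sketch below. The struct, the table contents, and `foldLoadSketch` are hypothetical stand-ins for illustration; only the rule itself (an aligned move needs 16-byte alignment, movups needs none) comes from the commit message.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical folding-table entry: register form, memory form, and the
// minimum alignment (in bytes) the memory operand must have. 0 = none.
struct FoldEntrySketch {
  const char *RegForm;
  const char *MemForm;
  unsigned MinAlign;
};

static const FoldEntrySketch FoldTableSketch[] = {
    {"MOVAPSrr", "MOVAPSrm", 16}, // aligned move: needs 16-byte alignment
    {"MOVUPSrr", "MOVUPSrm", 0},  // unaligned move: no requirement
};

// Returns the memory form if a load with the given known alignment can be
// folded into RegForm, or nullptr if folding is not allowed.
const char *foldLoadSketch(const char *RegForm, unsigned KnownAlign) {
  for (const FoldEntrySketch &E : FoldTableSketch)
    if (std::strcmp(E.RegForm, RegForm) == 0)
      return KnownAlign >= E.MinAlign ? E.MemForm : nullptr;
  return nullptr;
}
```

With a 4-byte-aligned operand, the lookup refuses the aligned form but still folds the unaligned one, which is the behavior the fix restores for movups.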
* make findDebugLoc a class method [Dale Johannesen, 2010-01-20, 1 file, -2/+2]
| | | | llvm-svn: 94032
* Move findDebugLoc somewhere more central. Fix more cases where debug declarations affect debug line info. [Dale Johannesen, 2010-01-20, 1 file, -4/+2]
| | | | llvm-svn: 93953
* For aligned load/store instructions, it's only required to know whether a function can support dynamic stack realignment. [Jim Grosbach, 2010-01-19, 1 file, -4/+2]
| | | | That's a much easier question to answer at instruction selection stage than whether the function actually will have dynamic alignment prologue. This allows the removal of the stack alignment heuristic pass, and improves code quality for cases where the heuristic would result in dynamic alignment code being generated when it was not strictly necessary. llvm-svn: 93885
* For now, avoid issuing extract_subreg to reuse the lower 8 bits; it's not safe in 32-bit mode. [Evan Cheng, 2010-01-13, 1 file, -0/+4]
| | | | llvm-svn: 93307
* Add a quick pass to optimize sign / zero extension instructions. [Evan Cheng, 2010-01-13, 1 file, -9/+8]
| | | | For targets where the pre-extension values are available in the subreg of the result of the extension, replace the uses of the pre-extension value with the result + extract_subreg. For now, this pass is fairly conservative. It only performs the replacement when both the pre- and post-extension values are used in the block. It will miss cases where the post-extension values are live, but not used. llvm-svn: 93278
* Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS. [Dan Gohman, 2010-01-12, 1 file, -2/+12]
| | | | llvm-svn: 93229
* Add TargetInstrInfo::isCoalescableInstr. [Evan Cheng, 2010-01-12, 1 file, -0/+53]
| | | | It returns true if the specified instruction is copy-like, where the source and destination registers can overlap. This is to be used by the coalescer to coalesce the source and destination registers of instructions like X86::MOVSX64rr32. Apparently some crazy people believe the coalescer is too simple. llvm-svn: 93210
* Revert 93158. It's breaking quite a few x86_64 tests. [Evan Cheng, 2010-01-11, 1 file, -12/+2]
| | | | llvm-svn: 93185
* Re-instate MOV64r0 and MOV16r0, with adjustments to work with the new AsmPrinter. [Dan Gohman, 2010-01-11, 1 file, -2/+12]
| | | | This is perhaps less elegant than describing them in terms of MOV32r0 and subreg operations, but it allows the current register allocator to rematerialize them. llvm-svn: 93158
* Change errs() to dbgs(). [David Greene, 2010-01-05, 1 file, -1/+2]
| | | | llvm-svn: 92653
* Remove dead variable. [Bill Wendling, 2009-12-28, 1 file, -1/+0]
| | | | llvm-svn: 92184
* completely eliminate the MOV16r0 'instruction'. [Chris Lattner, 2009-12-23, 1 file, -6/+1]
| | | | The only interesting part of this is the divrem changes, which are already tested by CodeGen/X86/divrem.ll. llvm-svn: 91975
* Remove target attribute break-sse-dep. Instead, do not fold load into sse partial update instructions unless optimizing for size. [Evan Cheng, 2009-12-22, 1 file, -2/+2]
| | | | llvm-svn: 91910
* On recent Intel u-arch's, folding loads into some unary SSE instructions can be non-optimal. [Evan Cheng, 2009-12-18, 1 file, -0/+34]
| | | | To be precise, we should avoid folding loads if the instructions only update part of the destination register, and the non-updated part is not needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks the partial register dependency and it can improve performance. e.g.
| | | |
| | | |   movss (%rdi), %xmm0
| | | |   cvtss2sd %xmm0, %xmm0
| | | |
| | | | instead of
| | | |
| | | |   cvtss2sd (%rdi), %xmm0
| | | |
| | | | An alternative method to break dependency is to clear the register first. e.g.
| | | |
| | | |   xorps %xmm0, %xmm0
| | | |   cvtss2sd (%rdi), %xmm0
| | | |
| | | | llvm-svn: 91672
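Taken together with the later r91910 entry above ("do not fold load into sse partial update instructions unless optimizing for size"), the folding policy can be sketched as a single predicate. The names below are hypothetical and the struct carries only the one property the rule depends on; this is an illustration of the stated rule, not the code in X86InstrInfo.cpp.

```cpp
#include <cassert>

// Hypothetical instruction description for this sketch only.
struct InstrSketch {
  // True if the instruction writes only part of its destination register
  // (the commit names cvtss2sd and sqrtss as examples).
  bool PartialRegUpdate;
};

// Returns true if a load may be folded into the instruction's memory form.
bool mayFoldLoadSketch(const InstrSketch &I, bool OptForSize) {
  if (I.PartialRegUpdate && !OptForSize)
    return false; // keep the separate load so the dependency is broken
  return true;    // folding is fine, or code size matters more
}
```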
* Instruction fixes, added instructions, and AsmString changes in the X86 instruction tables. [Sean Callanan, 2009-12-18, 1 file, -1/+1]
| | | | Also (while I was at it) cleaned up the X86 tables, removing tabs and 80-column violations.
| | | | This patch was reviewed by Chris Lattner, but please let me know if there are any problems.
| | | |
| | | | * X86*.td
| | | |     Removed tabs and fixed 80-column violations
| | | | * X86Instr64bit.td
| | | |     (IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW) Added
| | | |     (CALL, CMOV) Added qualifiers
| | | |     (JMP) Added PC-relative jump instruction
| | | |     (POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate that it is 64-bit only (ambiguous since it has no REX prefix)
| | | |     (MOV) Added rr form going the other way, which is encoded differently
| | | |     (MOV) Changed immediates to offsets, which is more correct; also fixed MOV64o64a to have a 64-bit offset
| | | |     (MOV) Fixed qualifiers
| | | |     (MOV) Added debug-register and condition-register moves
| | | |     (MOVZX) Added more forms
| | | |     (ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which (as with MOV) are encoded differently
| | | |     (ROL) Made REX.W required
| | | |     (BT) Uncommented mr form for disassembly only
| | | |     (CVT__2__) Added several missing non-intrinsic forms
| | | |     (LXADD, XCHG) Reordered operands to make more sense for MRMSrcMem
| | | |     (XCHG) Added register-to-register forms
| | | |     (XADD, CMPXCHG, XCHG) Added non-locked forms
| | | | * X86InstrSSE.td
| | | |     (CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ) Added
| | | | * X86InstrFPStack.td
| | | |     (COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP, FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X, FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM, FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE, FXRSTOR) Added
| | | |     (FCOM, FCOMP) Added qualifiers
| | | |     (FSTENV, FSAVE, FSTSW) Fixed opcode names
| | | |     (FNSTSW) Added implicit register operand
| | | | * X86InstrInfo.td
| | | |     (opaque512mem) Added for FXSAVE/FXRSTOR
| | | |     (offset8, offset16, offset32, offset64) Added for MOV
| | | |     (NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR, LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS, LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT, LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC, CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC, SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL, VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD, VMWRITE, VMXOFF, VMXON) Added
| | | |     (NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier
| | | |     (JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL, JGE, JLE, JG, JCXZ) Added 32-bit forms
| | | |     (MOV) Changed some immediate forms to offset forms
| | | |     (MOV) Added reversed reg-reg forms, which are encoded differently
| | | |     (MOV) Added debug-register and condition-register moves
| | | |     (CMOV) Added qualifiers
| | | |     (AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV
| | | |     (BT) Uncommented memory-register forms for disassembler
| | | |     (MOVSX, MOVZX) Added forms
| | | |     (XCHG, LXADD) Made operand order make sense for MRMSrcMem
| | | |     (XCHG) Added register-register forms
| | | |     (XADD, CMPXCHG) Added unlocked forms
| | | | * X86InstrMMX.td
| | | |     (MMX_MOVD, MMV_MOVQ) Added forms
| | | | * X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table change
| | | | * X86RegisterInfo.td: Added debug and condition register sets
| | | | * x86-64-pic-3.ll: Fixed testcase to reflect call qualifier
| | | | * peep-test-3.ll: Fixed testcase to reflect test qualifier
| | | | * cmov.ll: Fixed testcase to reflect cmov qualifier
| | | | * loop-blocks.ll: Fixed testcase to reflect call qualifier
| | | | * x86-64-pic-11.ll: Fixed testcase to reflect call qualifier
| | | | * 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call qualifier
| | | | * x86-64-pic-2.ll: Fixed testcase to reflect call qualifier
| | | | * live-out-reg-info.ll: Fixed testcase to reflect test qualifier
| | | | * tail-opts.ll: Fixed testcase to reflect call qualifiers
| | | | * x86-64-pic-10.ll: Fixed testcase to reflect call qualifier
| | | | * bss-pagealigned.ll: Fixed testcase to reflect call qualifier
| | | | * x86-64-pic-1.ll: Fixed testcase to reflect call qualifier
| | | | * widen_load-1.ll: Fixed testcase to reflect call qualifier
| | | |
| | | | llvm-svn: 91638
* Whitespace changes, comment clarification. No functional changes. [Bill Wendling, 2009-12-14, 1 file, -15/+26]
| | | | llvm-svn: 91274
* Disable r91104 for x86. It causes partial register stalls, which pessimize code in 32-bit mode. [Evan Cheng, 2009-12-12, 1 file, -12/+12]
| | | | llvm-svn: 91223
* Add comment about potential partial register stall. [Evan Cheng, 2009-12-12, 1 file, -0/+5]
| | | | llvm-svn: 91220
* Add support to 3-addressify 16-bit instructions. [Evan Cheng, 2009-12-11, 1 file, -88/+130]
| | | | llvm-svn: 91104
* Remove the target hook TargetInstrInfo::BlockHasNoFallThrough in favor of MachineBasicBlock::canFallThrough(), which is target-independent and more thorough. [Dan Gohman, 2009-12-05, 1 file, -21/+0]
| | | | llvm-svn: 90634
* Remove an unneeded include. [David Greene, 2009-12-04, 1 file, -1/+0]
| | | | llvm-svn: 90625
* Have hasLoad/StoreFrom/ToStackSlot return the relevant MachineMemOperand. [David Greene, 2009-12-04, 1 file, -2/+9]
| | | | llvm-svn: 90608
* improve portability to avoid conflicting with std::next in C++0x. [Chris Lattner, 2009-12-03, 1 file, -2/+2]
| | | | | | Patch by Howard Hinnant! llvm-svn: 90365
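The kind of conflict this entry refers to can be sketched as follows. In C++0x, `std::next` is declared in `<iterator>`, so an unqualified call such as `next(It)` on a standard-library iterator can also find `std::next` through argument-dependent lookup and may become ambiguous with a project-local helper of the same name. The `sketch` namespace and both functions below are hypothetical; that the commit resolved the issue by qualifying the call is an assumption made for illustration.

```cpp
#include <cassert>
#include <vector>

namespace sketch {
// Hypothetical project-local helper: returns the iterator after It.
template <typename IterT> IterT next(IterT It) { return ++It; }
} // namespace sketch

// Returns the second element of V (V must hold at least two elements).
int secondElement(const std::vector<int> &V) {
  // The call is qualified, so argument-dependent lookup is suppressed and
  // std::next is never considered as a candidate.
  return *sketch::next(V.begin());
}
```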
* Remove ISD::DEBUG_LOC and ISD::DBG_LABEL, which are no longer used. [Dan Gohman, 2009-11-23, 1 file, -1/+0]
| | | | | | | | Note that "hasDotLocAndDotFile"-style debug info was already broken; people wanting this functionality should implement it in the AsmPrinter/DwarfWriter code. llvm-svn: 89711
* Re-apply 89011. It's not to be blamed. [Evan Cheng, 2009-11-17, 1 file, -1/+4]
| | | | llvm-svn: 89081
* Revert 89011. Buildbot thinks it might be breaking stuff. [Evan Cheng, 2009-11-17, 1 file, -4/+1]
| | | | llvm-svn: 89076
* A few more instructions that should be marked re-materializable. [Evan Cheng, 2009-11-17, 1 file, -1/+4]
| | | | llvm-svn: 89011
* - Check memoperand alignment instead of checking stack alignment. Most load / store folding instructions are not referencing spill stack slots. [Evan Cheng, 2009-11-16, 1 file, -16/+13]
| | | | - Mark MOVUPSrm re-materializable. llvm-svn: 88974
* - Change TargetInstrInfo::reMaterialize to pass in TargetRegisterInfo. [Evan Cheng, 2009-11-14, 1 file, -2/+3]
| | | | | | | | - If destination is a physical register and it has a subreg index, use the sub-register instead. This fixes PR5423. llvm-svn: 88745
* Fix a bootstrap failure. [David Greene, 2009-11-13, 1 file, -24/+53]
| | | | | | | | Provide special isLoadFromStackSlotPostFE and isStoreToStackSlotPostFE interfaces to explicitly request checking for post-frame ptr elimination operands. This uses a heuristic so it isn't reliable for correctness. llvm-svn: 87047
* Add hasLoadFromStackSlot and hasStoreToStackSlot to return whether a machine instruction loads or stores from/to a stack slot. [David Greene, 2009-11-12, 1 file, -12/+57]
| | | | Unlike isLoadFromStackSlot and isStoreToStackSlot, the instruction may be something other than a pure load/store (e.g. it may be an arithmetic operation with a memory operand). This helps AsmPrinter determine when to print a spill/reload comment. This is only a hint since we may not be able to figure this out in all cases. As such, it should not be relied upon for correctness. Implement for X86. Return false by default for other architectures. llvm-svn: 87026
* Fix DenseMap iterator constness. [Jeffrey Yasskin, 2009-11-10, 1 file, -5/+5]
| | | | | | | | | | | | | | | | | | | This patch forbids implicit conversion of DenseMap::const_iterator to DenseMap::iterator which was possible because DenseMapIterator inherited (publicly) from DenseMapConstIterator. Conversion the other way around is now allowed as one may expect. The template DenseMapConstIterator is removed and the template parameter IsConst which specifies whether the iterator is constant is added to DenseMapIterator. Actually IsConst parameter is not necessary since the constness can be determined from KeyT but this is not relevant to the fix and can be addressed later. Patch by Victor Zverovich! llvm-svn: 86636
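The design described in r86636, one iterator template with an `IsConst` parameter and conversion allowed only from non-const to const, can be sketched in a few lines. This is not the DenseMap code: `SketchIterator` is a hypothetical pointer wrapper, and it uses `<type_traits>` facilities that postdate the commit.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical iterator over a single T, parameterized on constness.
template <typename T, bool IsConst> class SketchIterator {
  using Pointer = typename std::conditional<IsConst, const T *, T *>::type;
  using Reference = typename std::conditional<IsConst, const T &, T &>::type;
  Pointer Ptr;

  // Let the const instantiation read Ptr from the non-const one.
  template <typename U, bool C> friend class SketchIterator;

public:
  explicit SketchIterator(Pointer P = nullptr) : Ptr(P) {}

  // Converting constructor, enabled only for non-const -> const.
  template <bool OtherConst,
            typename = typename std::enable_if<IsConst && !OtherConst>::type>
  SketchIterator(const SketchIterator<T, OtherConst> &Other)
      : Ptr(Other.Ptr) {}

  Reference operator*() const { return *Ptr; }
};

// iterator -> const_iterator is allowed; the reverse is rejected.
static_assert(std::is_convertible<SketchIterator<int, false>,
                                  SketchIterator<int, true>>::value,
              "non-const converts to const");
static_assert(!std::is_convertible<SketchIterator<int, true>,
                                   SketchIterator<int, false>>::value,
              "const must not convert to non-const");
```

Because neither instantiation inherits from the other, there is no implicit derived-to-base style conversion of the kind the commit message describes as the original bug.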
* Fix MachineLICM to use the correct virtual register class when unfolding loads for hoisting. [Dan Gohman, 2009-10-30, 1 file, -1/+4]
| | | | getOpcodeAfterMemoryUnfold returns the opcode of the original operation without the load, not the load itself; MachineLICM needs to know the operand index in order to get the correct register class. Extend getOpcodeAfterMemoryUnfold to return this information. llvm-svn: 85622