path: root/llvm/lib/Target/X86/X86InstrInfo.cpp
* Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that it doesn't have to guess. (Dan Gohman, 2010-05-06, 1 file, -2/+2)
  llvm-svn: 103194
* Add argument TargetRegisterInfo to loadRegFromStackSlot and storeRegToStackSlot. (Evan Cheng, 2010-05-06, 1 file, -4/+7)
  llvm-svn: 103193
* Frame index can be negative. (Evan Cheng, 2010-04-29, 1 file, -1/+1)
  llvm-svn: 102577
* On Darwin, empty functions need to codegen into something of non-zero length, otherwise labels get incorrectly merged. (Chris Lattner, 2010-04-26, 1 file, -0/+7)
  We handled this by emitting a ".byte 0", but this isn't correct on Thumb/ARM targets, where the text segment needs to be a multiple of 2/4 bytes. Handle this by emitting a noop instead. This is more gross than it should be because ARM/PPC are not fully MC'ized yet. This fixes rdar://7908505.
  llvm-svn: 102400
* Remove a redundant comment. (Evan Cheng, 2010-04-26, 1 file, -1/+0)
  llvm-svn: 102326
* Move TargetLowering::EmitTargetCodeForFrameDebugValue to TargetInstrInfo and rename it to emitFrameIndexDebugValue. Teach the spiller to modify DBG_VALUE instructions to reference spill slots. (Evan Cheng, 2010-04-26, 1 file, -0/+14)
  llvm-svn: 102323
* Add const qualifiers to CodeGen's use of LLVM IR constructs. (Dan Gohman, 2010-04-15, 1 file, -1/+1)
  llvm-svn: 101334
* Re-apply 101075 and fix it properly: just reuse the debug info of the branch instruction being optimized. There is no need to --I, which can deref off the start of the BB. (Evan Cheng, 2010-04-13, 1 file, -1/+44)
  llvm-svn: 101162
* Temporarily revert r101075; it's causing invalid iterator assertions in a nightly tester. (Eric Christopher, 2010-04-13, 1 file, -46/+1)
  llvm-svn: 101158
* Micro-optimization: (Bill Wendling, 2010-04-12, 1 file, -1/+46)
  If we have this situation:

        jCC  L1
        jmp  L2
    L1: ...
    L2: ...

  We can get a small performance boost by emitting this instead:

        jnCC L2
    L1: ...
    L2: ...

  This testcase shows an example of this:

    float func(float x, float y) {
      double product = (double)x * y;
      if (product == 0.0)
        return product;
      return product - 1.0;
    }

  llvm-svn: 101075
* Rename llvm::llvm_report_error -> llvm::report_fatal_error. (Chris Lattner, 2010-04-07, 1 file, -1/+1)
  llvm-svn: 100709
* Educate GetInstrSizeInBytes implementations that DBG_VALUE does not generate code. (Dale Johannesen, 2010-04-07, 1 file, -0/+1)
  llvm-svn: 100681
* Properly enable load clustering. (Jakob Stoklund Olesen, 2010-04-05, 1 file, -4/+0)
  Operand 2 on a load instruction does not have to be a RegisterSDNode for this to work.
  llvm-svn: 100497
* Use the DebugLoc default ctor instead of DebugLoc::getUnknownLoc(). (Chris Lattner, 2010-04-02, 1 file, -3/+3)
  llvm-svn: 100214
* Teach AnalyzeBranch, RemoveBranch, and the branch folder to be tolerant of debug info following the branch(es) at the end of a block. (Dale Johannesen, 2010-04-02, 1 file, -0/+4)
  llvm-svn: 100168
* Replace V_SET0 with variants for each SSE execution domain. (Jakob Stoklund Olesen, 2010-03-31, 1 file, -3/+8)
  llvm-svn: 99975
* Renumber SSE execution domains for better code size. (Jakob Stoklund Olesen, 2010-03-30, 1 file, -16/+16)
  SSEDomainFix will collapse to the domain with the lower number when it has a choice. The SSEPackedSingle domain often has smaller instructions, so prefer that.
  llvm-svn: 99952
* Remove the pmulld intrinsic and auto-upgrade it to a vector multiply. (Eric Christopher, 2010-03-30, 1 file, -1/+0)
  Rewrite the pmulld patterns, and make sure that they fold in loads of arguments into the instruction.
  llvm-svn: 99910
* Basic implementation of the SSEDomainFix pass. (Jakob Stoklund Olesen, 2010-03-29, 1 file, -39/+43)
  Cross-block inference is primitive and wrong, but the pass is working otherwise.
  llvm-svn: 99848
* Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings. (Jakob Stoklund Olesen, 2010-03-25, 1 file, -0/+43)
  On Nehalem and newer CPUs there is a 2-cycle latency penalty for using a register in a different domain than where it was defined. Some instructions have equivalents in different domains, like por/orps/orpd. The SSEDomainFix pass tries to minimize the number of domain crossings by switching between equivalent opcodes where possible.

  This is a work in progress; in particular, the pass doesn't do anything yet. SSE instructions are tagged with their execution domain in TableGen using the last two bits of TSFlags. Note that not all instructions are tagged correctly. Life just isn't that simple.

  The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline issue handled by NEONMoveFixPass. This pass may become target independent to handle both.
  llvm-svn: 99524
* Revert "Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings." (Jakob Stoklund Olesen, 2010-03-23, 1 file, -3/+0)
  This reverts commit 99345. It was breaking buildbots.
  llvm-svn: 99352
* Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings. (Jakob Stoklund Olesen, 2010-03-23, 1 file, -0/+3)
  This is work in progress. So far, SSE execution domain tables are added to X86InstrInfo, and a skeleton pass is enabled with -sse-domain-fix.
  llvm-svn: 99345
* Teach isSafeToClobberEFLAGS to ignore dbg_values. Do we need a MachineBasicBlock::iterator that does this automatically? (Evan Cheng, 2010-03-23, 1 file, -3/+13)
  llvm-svn: 99320
* Do not force indirect tail calls through fixed registers (eax, r11). Add support to allow loads to be folded into tail call instructions. (Evan Cheng, 2010-03-14, 1 file, -0/+17)
  llvm-svn: 98465
* Don't try to fold V_SET0 and V_SETALLONES to loads in the medium and large code models. (Dan Gohman, 2010-03-09, 1 file, -0/+5)
  llvm-svn: 98042
* Revert r97766. It's deleting a tag. (Bill Wendling, 2010-03-05, 1 file, -52/+13)
  llvm-svn: 97768
* Micro-optimization: (Bill Wendling, 2010-03-05, 1 file, -13/+52)
  This code:

    float floatingPointComparison(float x, float y) {
      double product = (double)x * y;
      if (product == 0.0)
        return product;
      return product - 1.0;
    }

  produces this:

    _floatingPointComparison:
    0000000000000000  cvtss2sd %xmm1,%xmm1
    0000000000000004  cvtss2sd %xmm0,%xmm0
    0000000000000008  mulsd    %xmm1,%xmm0
    000000000000000c  pxor     %xmm1,%xmm1
    0000000000000010  ucomisd  %xmm1,%xmm0
    0000000000000014  jne      0x00000004
    0000000000000016  jp       0x00000002
    0000000000000018  jmp      0x00000008
    000000000000001a  addsd    0x00000006(%rip),%xmm0
    0000000000000022  cvtsd2ss %xmm0,%xmm0
    0000000000000026  ret

  The "jne/jp/jmp" sequence can be reduced to this instead:

    _floatingPointComparison:
    0000000000000000  cvtss2sd %xmm1,%xmm1
    0000000000000004  cvtss2sd %xmm0,%xmm0
    0000000000000008  mulsd    %xmm1,%xmm0
    000000000000000c  pxor     %xmm1,%xmm1
    0000000000000010  ucomisd  %xmm1,%xmm0
    0000000000000014  jp       0x00000002
    0000000000000016  je       0x00000008
    0000000000000018  addsd    0x00000006(%rip),%xmm0
    0000000000000020  cvtsd2ss %xmm0,%xmm0
    0000000000000024  ret

  for a savings of 2 bytes. This xform can happen when we recognize that jne and jp jump to the same "true" MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch is the fall-through MBB.
  llvm-svn: 97766
* Implement XMM subregs. (Dan Gohman, 2010-02-28, 1 file, -14/+4)
  Extracting the low element of a vector is now done with EXTRACT_SUBREG, the zero-extension performed by load movss is now modeled with SUBREG_TO_REG, and so on. Register-to-register movss and movsd are no longer considered copies; they are two-address instructions which insert a scalar into a vector.
  llvm-svn: 97354
* movl is a cheaper way to materialize 0 without clobbering EFLAGS than movabsq. (Dan Gohman, 2010-02-26, 1 file, -1/+1)
  llvm-svn: 97227
* Fix a typo in a comment. (Dan Gohman, 2010-02-22, 1 file, -1/+1)
  llvm-svn: 96778
* Add a bunch of mod/rm encoding types for fixed mod/rm bytes. (Chris Lattner, 2010-02-12, 1 file, -0/+8)
  This will work better for the disassembler for modeling things like lfence/monitor/vmcall, etc.
  llvm-svn: 95960
* Refactor the conditional jump instructions in the .td file to use a multipattern that generates both the 1-byte and 4-byte versions from the same defm. (Chris Lattner, 2010-02-11, 1 file, -40/+40)
  llvm-svn: 95901
* Move target-independent opcodes out of TargetInstrInfo into TargetOpcodes.h, and #include the new TargetOpcodes.h into MachineInstr. Add new inline accessors (like isPHI()) to MachineInstr, and start using them throughout the codebase. (Chris Lattner, 2010-02-09, 1 file, -5/+5)
  llvm-svn: 95687
* Port X86InstrInfo::determineREX over to the new encoder. (Chris Lattner, 2010-02-05, 1 file, -5/+4)
  llvm-svn: 95440
* Move functions for decoding X86II values into the X86II namespace. (Chris Lattner, 2010-02-05, 1 file, -9/+9)
  llvm-svn: 95410
* Change getSizeOfImm and getBaseOpcodeFor to just take TSFlags directly instead of a TargetInstrDesc. (Chris Lattner, 2010-02-05, 1 file, -19/+9)
  llvm-svn: 95405
* Use findDebugLoc in more places. (Dale Johannesen, 2010-01-26, 1 file, -10/+5)
  llvm-svn: 94477
* Be more conservative with clustering f32/f64 loads. (Evan Cheng, 2010-01-22, 1 file, -0/+2)
  llvm-svn: 94254
* Add two target hooks to determine whether two loads are near and should be scheduled together. (Evan Cheng, 2010-01-22, 1 file, -0/+130)
  llvm-svn: 94147
* Fix a minor issue in the x86 load/store folding table: movups does an unaligned load, so it doesn't require 16-byte alignment. (Evan Cheng, 2010-01-21, 1 file, -1/+1)
  llvm-svn: 94058
* Make findDebugLoc a class method. (Dale Johannesen, 2010-01-20, 1 file, -2/+2)
  llvm-svn: 94032
* Move findDebugLoc somewhere more central. Fix more cases where debug declarations affect debug line info. (Dale Johannesen, 2010-01-20, 1 file, -4/+2)
  llvm-svn: 93953
* For aligned load/store instructions, it's only required to know whether a function can support dynamic stack realignment. (Jim Grosbach, 2010-01-19, 1 file, -4/+2)
  That's a much easier question to answer at instruction selection stage than whether the function actually will have a dynamic alignment prologue. This allows the removal of the stack alignment heuristic pass, and improves code quality for cases where the heuristic would result in dynamic alignment code being generated when it was not strictly necessary.
  llvm-svn: 93885
* For now, avoid issuing extract_subreg to reuse the lower 8 bits; it's not safe in 32-bit mode. (Evan Cheng, 2010-01-13, 1 file, -0/+4)
  llvm-svn: 93307
* Add a quick pass to optimize sign/zero extension instructions. (Evan Cheng, 2010-01-13, 1 file, -9/+8)
  For targets where the pre-extension values are available in the subreg of the result of the extension, replace the uses of the pre-extension value with the result + extract_subreg. For now, this pass is fairly conservative: it only performs the replacement when both the pre- and post-extension values are used in the block. It will miss cases where the post-extension values are live, but not used.
  llvm-svn: 93278
* Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS. (Dan Gohman, 2010-01-12, 1 file, -2/+12)
  llvm-svn: 93229
* Add TargetInstrInfo::isCoalescableInstr. (Evan Cheng, 2010-01-12, 1 file, -0/+53)
  It returns true if the specified instruction is copy-like, i.e. the source and destination registers can overlap. This is to be used by the coalescer to coalesce the source and destination registers of instructions like X86::MOVSX64rr32. Apparently some crazy people believe the coalescer is too simple.
  llvm-svn: 93210
* Revert 93158. It's breaking quite a few x86_64 tests. (Evan Cheng, 2010-01-11, 1 file, -12/+2)
  llvm-svn: 93185
* Re-instate MOV64r0 and MOV16r0, with adjustments to work with the new AsmPrinter. (Dan Gohman, 2010-01-11, 1 file, -2/+12)
  This is perhaps less elegant than describing them in terms of MOV32r0 and subreg operations, but it allows the current register allocator to rematerialize them.
  llvm-svn: 93158
* Change errs() to dbgs(). (David Greene, 2010-01-05, 1 file, -1/+2)
  llvm-svn: 92653