path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Add comments explaining why there's only one register for | Dan Gohman | 2009-03-23 | 1 | -1/+6
  i8 return values.
  llvm-svn: 67502
* Removed AFGR32 register class | Bruno Cardoso Lopes | 2009-03-21 | 5 | -176/+110
  Handle odd registers allocation in FGR32.
  llvm-svn: 67422
* Fix a few more indentation problems and an 80-column violation. | Bob Wilson | 2009-03-20 | 1 | -8/+8
  llvm-svn: 67416
* No functional changes. Fix indentation and whitespace only. | Bob Wilson | 2009-03-20 | 1 | -101/+91
  llvm-svn: 67412
* Fixed comment for libcalls. | Sanjiv Gupta | 2009-03-20 | 1 | -20/+23
  llvm-svn: 67373
* Reformatting. Inserted code comments. Cleaned interfaces. | Sanjiv Gupta | 2009-03-20 | 2 | -112/+68
  Removed unnecessary code. No functionality change.
  llvm-svn: 67371
* Added option to enable generating less precise mad (multiply addition) | Mon P Wang | 2009-03-20 | 1 | -0/+12
  for those architectures that support the instruction.
  llvm-svn: 67363
* Remove strange extra semicolons. | Nick Lewycky | 2009-03-19 | 2 | -2/+2
  llvm-svn: 67287
* Add support to tablegen for naming the nodes themselves, not just the operands, | Nate Begeman | 2009-03-19 | 1 | -1/+1
  in selectiondag patterns. This is required for the upcoming shuffle_vector rewrite, and as it turns out, cleans up a hack in the Alpha instruction info.
  llvm-svn: 67286
* Added support for Mips O32 Calling Convention | Bruno Cardoso Lopes | 2009-03-19 | 2 | -33/+133
  llvm-svn: 67280
* Disable the "call to immediate" optimization on x86-64. It is | Chris Lattner | 2009-03-18 | 1 | -1/+5
  not safe in general because the immediate could be an arbitrary value that does not fit in a 32-bit pcrel displacement. Conservatively fall back to loading the value into a register and calling through it. We still do the optzn on X86-32.
  llvm-svn: 67142
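The range problem described in r67142 can be sketched as a small check. This is an illustration only, not code from LLVM: `fitsInRel32` and its parameter names are hypothetical. It assumes the usual x86 encoding in which a direct `call rel32` stores the target address minus the address of the following instruction as a signed 32-bit value.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): returns true when a direct
// `call rel32` whose following instruction starts at `nextInsnAddr`
// can encode a jump to `target`.
constexpr bool fitsInRel32(std::uint64_t target, std::uint64_t nextInsnAddr) {
  // Subtract with wrapping unsigned arithmetic, then view the result as a
  // signed 64-bit displacement.
  const std::int64_t disp = static_cast<std::int64_t>(target - nextInsnAddr);
  return disp >= std::numeric_limits<std::int32_t>::min() &&
         disp <= std::numeric_limits<std::int32_t>::max();
}

// A nearby target is encodable; an arbitrary 64-bit immediate may not be.
static_assert(fitsInRel32(0x401000, 0x400000), "near target fits");
static_assert(!fitsInRel32(0x7fff00000000ULL, 0x400000), "far target does not fit");
```

On a 32-bit target every address is within reach of a 32-bit displacement, which is consistent with the commit keeping the optimization on X86-32.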
* CellSPU: | Scott Michel | 2009-03-17 | 2 | -16/+16
  Revert inadvertent mis-fix of fneg.
  llvm-svn: 67084
* Recognize bswapl as bswap too. | Dan Gohman | 2009-03-17 | 1 | -2/+5
  llvm-svn: 67072
* Recognize "bswapq" as an alternate spelling for the bswap instruction. | Dan Gohman | 2009-03-17 | 1 | -2/+2
  llvm-svn: 67071
* CellSPU: | Scott Michel | 2009-03-17 | 5 | -488/+417
  - Fix fabs, fneg for f32 and f64.
  - Use BuildVectorSDNode.isConstantSplat, now that the functionality exists
  - Continue to improve i64 constant lowering. Lower certain special constants to the constant pool when they correspond to SPU's shufb instruction's special mask values. This avoids the overhead of performing a shuffle on a zero-filled vector just to get the special constant when the memory load suffices.
  llvm-svn: 67067
* CellSPU: | Scott Michel | 2009-03-16 | 3 | -98/+114
  Incorporate Tilmann's 128-bit operation patch. Evidently, it gets the llvm-gcc bootstrap a bit further along.
  llvm-svn: 67048
* This causes incorrect stack frame allocation when the last object is an | Bruno Cardoso Lopes | 2009-03-15 | 1 | -1/+1
  array allocated on the stack which would lead the compiled program to run over its stack. Thanks to Gil Dogon
  llvm-svn: 67034
* Use %rip-relative addressing on x86-64 whenever practical, as | Dan Gohman | 2009-03-14 | 1 | -9/+10
  it has a smaller encoding than absolute addressing.
  llvm-svn: 67002
* Don't forego folding of loads into 64-bit adds when the other | Dan Gohman | 2009-03-14 | 1 | -10/+3
  operand is a signed 32-bit immediate. Unlike with the 8-bit signed immediate case, it isn't actually smaller to fold a 32-bit signed immediate instead of a load. In fact, it's larger in the case of 32-bit unsigned immediates, because they can be materialized with movl instead of movq.
  llvm-svn: 67001
* Improve FastISel's handling of truncates to i1, and implement | Dan Gohman | 2009-03-13 | 1 | -0/+13
  ptrtoint and inttoptr in X86FastISel. These casts aren't always handled in the generic FastISel code because X86 sometimes needs custom code to do truncation and zero-extension.
  llvm-svn: 66988
* Fix FastISel's assumption that i1 values are always zero-extended | Dan Gohman | 2009-03-13 | 1 | -2/+4
  by inserting explicit zero extensions where necessary. Included is a testcase where SelectionDAG produces a virtual register holding an i1 value which FastISel previously mistakenly assumed to be zero-extended.
  llvm-svn: 66941
* add 8 and 16 bit TLS moves. | Rafael Espindola | 2009-03-13 | 1 | -0/+20
  add a fixme note on how to remove code duplication.
  llvm-svn: 66932
* Improve sext and zext of TLS variables. | Rafael Espindola | 2009-03-13 | 1 | -0/+36
  llvm-svn: 66922
* generalize this code so that fast isel handles integer truncates to i1, which | Chris Lattner | 2009-03-13 | 1 | -2/+4
  codegen to the same thing as integer truncates to i8 (the top bits are just undefined). This implements rdar://6667338
  llvm-svn: 66902
* These instructions have special lowering that may lower them to SSE | Bill Wendling | 2009-03-13 | 1 | -12/+19
  instructions. Prevent that if we don't want implicit uses of SSE.
  llvm-svn: 66877
* Fix some significant problems with constant pools that resulted in | Evan Cheng | 2009-03-13 | 7 | -26/+24
  unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
  1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
  2. MachineConstantPool alignment field is also a log2 value.
  3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
  4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
  5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
  6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
  Solutions:
  1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
  2. MachineConstantPool alignment field is also changed to keep non-log2 value.
  3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
  4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
  5. Asm printer uses cheaper data structure to group constant pool entries.
  6. Asm printer compute entry offsets after grouping is done.
  7. Change JIT code to compute entry offsets on the fly.
  llvm-svn: 66875
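The idea of keeping a byte alignment per entry and computing offsets only after the entries have been grouped can be sketched roughly as follows. This is not the code from r66875: `PoolEntry`, `alignTo`, and `computeOffsets` are hypothetical names, and the sketch assumes every alignment is a power of two given in bytes (the "non-log2" convention).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical entry type: alignment is stored in bytes, not as a log2 value.
struct PoolEntry {
  std::uint64_t size;       // size of the constant in bytes
  std::uint64_t alignment;  // required alignment in bytes (power of two)
};

// Round `offset` up to the next multiple of `alignment` (a power of two).
constexpr std::uint64_t alignTo(std::uint64_t offset, std::uint64_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Compute each entry's offset on the fly, in the order the entries will be
// emitted, instead of trusting offsets recorded at creation time.
inline std::vector<std::uint64_t> computeOffsets(
    const std::vector<PoolEntry> &entries) {
  std::vector<std::uint64_t> offsets;
  offsets.reserve(entries.size());
  std::uint64_t offset = 0;
  for (const PoolEntry &entry : entries) {
    offset = alignTo(offset, entry.alignment);  // insert padding if needed
    offsets.push_back(offset);
    offset += entry.size;
  }
  return offsets;
}
```

For entries of (size 4, align 4), (size 16, align 16), (size 8, align 8) in that order, the computed offsets are 0, 16, and 32. The sketch also shows why mixing conventions is costly: a byte alignment of 16 misread as a log2 value would request 2^16 = 65536 bytes.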
* generalize the previous code to use the full generality of LEA | Chris Lattner | 2009-03-13 | 1 | -13/+109
  for i32/i64 expressions (we could also do i16 on cpus where i16 lea is fast, but I didn't add this). On the example, we now generate:
    _test:
      movl 4(%esp), %eax
      cmpl $42, (%eax)
      setl %al
      movzbl %al, %eax
      leal 4(%eax,%eax,8), %eax
      ret
  instead of:
    _test:
      movl 4(%esp), %eax
      cmpl $41, (%eax)
      movl $4, %ecx
      movl $13, %eax
      cmovg %ecx, %eax
      ret
  llvm-svn: 66869
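The arithmetic behind the `leal 4(%eax,%eax,8), %eax` in r66869 can be sketched in a few lines. The function names below are hypothetical and this is not the DAG combine itself; it only illustrates the identity `cond ? T : F == F + zext(cond) * (T - F)`, under the assumption that a single LEA can supply multipliers of 1, 2, 4, 8 (scale only) and 2, 3, 5, 9 (base plus scaled index using the same register).

```cpp
#include <cstdint>

// Hypothetical predicate: multipliers a single LEA can apply to one register.
constexpr bool isLeaMultiplier(std::int64_t diff) {
  return diff == 1 || diff == 2 || diff == 3 || diff == 4 ||
         diff == 5 || diff == 8 || diff == 9;
}

// Branch-free select between two constants: zext(cond) is 0 or 1, so the
// result is ifFalse or ifFalse + (ifTrue - ifFalse). Assumes the
// subtraction does not overflow.
constexpr std::int64_t selectConstants(bool cond, std::int64_t ifTrue,
                                       std::int64_t ifFalse) {
  return ifFalse + static_cast<std::int64_t>(cond) * (ifTrue - ifFalse);
}

// The example from the commit message: cond ? 13 : 4 is 4 + cond * 9.
static_assert(selectConstants(true, 13, 4) == 13, "true arm");
static_assert(selectConstants(false, 13, 4) == 4, "false arm");
static_assert(isLeaMultiplier(13 - 4), "9 = 1 + 8 fits base + index*8");
```

In the listing above, `setl` plus `movzbl` produce the 0-or-1 value, and the LEA computes `4 + %eax + %eax*8` in one instruction.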
* optimize the case of cond ? 42 : 41 and friends. This compiles the | Chris Lattner | 2009-03-13 | 1 | -0/+27
  example to:
    _test:
      movl 4(%esp), %eax
      cmpl $41, (%eax)
      setg %al
      movzbl %al, %eax
      orl $4294967294, %eax
      ret
  instead of:
      movl 4(%esp), %eax
      cmpl $41, (%eax)
      movl $4294967294, %ecx
      movl $4294967295, %eax
      cmova %ecx, %eax
      ret
  which is smaller in code size and faster. rdar://6668608
  llvm-svn: 66868
* Enhance address-mode folding of ISD::ADD to handle cases where the | Dan Gohman | 2009-03-13 | 1 | -0/+13
  operands can't both be fully folded at the same time. For example, in the included testcase, a global variable is being added with an add of two values. The global variable wants RIP-relative addressing, so it can't share the address with another base register, but it's still possible to fold the initial add.
  llvm-svn: 66865
* Re-apply 66024 with fixes: 1. Fixed indirect call to immediate address | Evan Cheng | 2009-03-12 | 4 | -6/+16
  assembly. 2. Fixed JIT encoding by making the address pc-relative.
  llvm-svn: 66803
* Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" | Chris Lattner | 2009-03-12 | 3 | -24/+185
  related transformations out of target-specific dag combine into the ARM backend. These were added by Evan in r37685 with no testcases and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
  Add some simple X86-specific (for now) DAG combines that turn things like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently with the recently added cp constant select optimization, but is a very general xform. For example, we now compile the second example in const-select.ll to:
    _test:
      movsd LCPI2_0, %xmm0
      ucomisd 8(%esp), %xmm0
      seta %al
      movzbl %al, %eax
      movl 4(%esp), %ecx
      movsbl (%ecx,%eax,4), %eax
      ret
  instead of:
    _test:
      movl 4(%esp), %eax
      leal 4(%eax), %ecx
      movsd LCPI2_0, %xmm0
      ucomisd 8(%esp), %xmm0
      cmovbe %eax, %ecx
      movsbl (%ecx), %eax
      ret
  This passes multisource and dejagnu.
  llvm-svn: 66779
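The `cond ? 8 : 0 -> (zext(cond) << 3)` rewrite named in r66779 can be sketched as below. The helper names are hypothetical and this is an illustration of the identity, not the DAG combine code; it assumes the non-zero arm is a power of two.

```cpp
#include <cstdint>

// Hypothetical helper: true when `v` has exactly one bit set.
constexpr bool isPowerOfTwo(std::uint32_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Hypothetical helper: log2 of a power of two, found by counting shifts.
constexpr unsigned log2Exact(std::uint32_t v) {
  unsigned n = 0;
  while ((v >>= 1) != 0) ++n;
  return n;
}

// cond ? pow2 : 0 without a branch or conditional move: zero-extend the
// condition to 0 or 1, then shift it into place. Callers are expected to
// pass a power of two.
constexpr std::uint32_t selectPow2OrZero(bool cond, std::uint32_t pow2) {
  return static_cast<std::uint32_t>(cond) << log2Exact(pow2);
}

static_assert(isPowerOfTwo(8) && log2Exact(8) == 3, "8 == 1 << 3");
static_assert(selectPow2OrZero(true, 8) == 8, "true arm");
static_assert(selectPow2OrZero(false, 8) == 0, "false arm");
```

In the new listing above, the shift does not appear as a separate instruction: the 0-or-1 value in `%eax` is scaled by 4 inside the addressing mode `(%ecx,%eax,4)`, which replaces the `leal`/`cmovbe` pair of the old code.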
* improve comment. | Chris Lattner | 2009-03-12 | 1 | -4/+4
  llvm-svn: 66778
* On x86, if the only use of a i64 load is a i64 store, generate a pair of | Evan Cheng | 2009-03-12 | 1 | -48/+67
  double load and store instead.
  llvm-svn: 66776
* Forgot to check-in this as part of 7761. | Sanjiv Gupta | 2009-03-12 | 1 | -3/+2
  llvm-svn: 66763
* Banksel optimization is now based on the section names of symbols, since the | Sanjiv Gupta | 2009-03-12 | 1 | -44/+52
  symbols in one section will always be put into one bank.
  llvm-svn: 66761
* Revert r66024. The JIT encoding for CALLpcrel32 is wrong -- see PR3773, and the | Dan Gohman | 2009-03-11 | 3 | -9/+4
  assembly text output uses an indirect call ("call *") instead of a direct call.
  llvm-svn: 66735
* optimize i8 and i16 tls values. | Rafael Espindola | 2009-03-11 | 1 | -0/+18
  llvm-svn: 66725
* Add a -no-implicit-float flag. This acts like -soft-float, but may generate | Bill Wendling | 2009-03-11 | 2 | -76/+73
  floating point instructions that are explicitly specified by the user.
  llvm-svn: 66719
* It makes no sense to have a ODR version of common | Duncan Sands | 2009-03-11 | 10 | -24/+12
  linkage, so remove it.
  llvm-svn: 66690
* For yonah, fix a vector shuffle case for v16i8 where we didn't properly | Mon P Wang | 2009-03-11 | 1 | -2/+19
  clear some bits.
  llvm-svn: 66684
* fix PR3785, a valgrind error on test/CodeGen/ARM/pr3502.ll | Chris Lattner | 2009-03-11 | 1 | -1/+5
  llvm-svn: 66660
* Remove the one-definition-rule version of extern_weak | Duncan Sands | 2009-03-11 | 2 | -6/+3
  linkage: this linkage type only applies to declarations, but ODR is only relevant to globals with definitions.
  llvm-svn: 66650
* Fixed a v8i16 shuffle case that should generate a pshufb instead of a | Mon P Wang | 2009-03-11 | 1 | -1/+4
  pshuflw/hw.
  llvm-svn: 66645
* formatting change, reduce indentation. No functionality change. | Chris Lattner | 2009-03-11 | 1 | -82/+80
  llvm-svn: 66642
* Mark the Defs and Uses of STATUS register correctly, plus some reformatting. | Sanjiv Gupta | 2009-03-10 | 1 | -41/+61
  llvm-svn: 66540
* Add more information to the EFLAGS note. | Dan Gohman | 2009-03-10 | 1 | -4/+12
  llvm-svn: 66515
* Add a note about EFLAGS optimization. | Dan Gohman | 2009-03-09 | 1 | -0/+15
  llvm-svn: 66508
* ARM target now also recognizes triplets like thumbv6-apple-darwin and sets | Evan Cheng | 2009-03-09 | 2 | -14/+24
  thumb mode and arch subversion. Eventually thumb triplets will go away and be replaced with function notes.
  llvm-svn: 66435
* ARM isLegalAddressImmediate should check if type is a simple type now that | Evan Cheng | 2009-03-09 | 1 | -0/+3
  optimizer can create values of funky scalar types.
  llvm-svn: 66429
* do not export all the X86FastISel symbols, ever. | Chris Lattner | 2009-03-08 | 1 | -1/+4
  llvm-svn: 66382