path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Add IIC_ prefix to PPC instruction-class names | Hal Finkel | 2013-11-27 | 13 | -2355/+2366
  This adds the IIC_ prefix to the instruction itinerary class names, giving the PPC backend a naming convention for itinerary classes that is more consistent with that used by the X86 and ARM backends.
  Instruction scheduling in the PPC backend needs a bunch of cleanup and improvement (especially for the ooo cores). This is just a preliminary step.
  No functionality change intended.
  llvm-svn: 195890
* Don't set GlobalPrefix to the default value. | Rafael Espindola | 2013-11-27 | 2 | -2/+0
  llvm-svn: 195884
* The R600 has its own asm printer which doesn't use GlobalPrefix. Drop it. | Rafael Espindola | 2013-11-27 | 1 | -1/+0
  llvm-svn: 195883
* R600: Expand vector FABS | Tom Stellard | 2013-11-27 | 1 | -0/+1
  NOTE: This is a candidate for the 3.4 branch.
  llvm-svn: 195881
* R600/SI: Implement spilling of SGPRs v5 | Tom Stellard | 2013-11-27 | 6 | -13/+161
  SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions.
  v2:
    - Fix encoding of Lane Mask
    - Use correct register flags, so we don't overwrite the low dword when restoring multi-dword registers.
  v3:
    - Register spilling seems to hang the GPU, so replace all shaders that need spilling with a dummy shader.
  v4:
    - Fix *LANE definitions
    - Change destination reg class for 32-bit SMRD instructions
  v5:
    - Remove small optimization that was crashing Serious Sam 3.
  https://bugs.freedesktop.org/show_bug.cgi?id=68224
  https://bugs.freedesktop.org/show_bug.cgi?id=71285
  NOTE: This is a candidate for the 3.4 branch.
  llvm-svn: 195880
* R600/SI: Use SGPR_32 register class for 32-bit SMRD outputs | Tom Stellard | 2013-11-27 | 1 | -2/+5
  Writing to the M0 register from an SMRD instruction hangs the GPU, so we need to use the SGPR_32 register class, which does not include M0.
  NOTE: This is a candidate for the 3.4 branch.
  llvm-svn: 195879
* R600: Add support for ISD::FROUND | Tom Stellard | 2013-11-27 | 3 | -4/+18
  NOTE: This is a candidate for the 3.4 branch.
  llvm-svn: 195878
* Remove dead code. | Rafael Espindola | 2013-11-27 | 1 | -36/+4
  MO_ExternalSymbol and MO_JumpTableIndex don't show up in inline asm.
  llvm-svn: 195861
* Convert two if sequences to switches. | Rafael Espindola | 2013-11-27 | 1 | -10/+21
  llvm-svn: 195859
* Use a switch. | Rafael Espindola | 2013-11-27 | 1 | -5/+11
  llvm-svn: 195857
* Remove more dead code now that this is only used for inline asm. | Rafael Espindola | 2013-11-27 | 1 | -4/+1
  MO_ConstantPoolIndex is handled in printLeaMemReference.
  MO_JumpTableIndex and MO_ExternalSymbol don't show up in inline asm.
  llvm-svn: 195847
* Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics. | Jiangning Liu | 2013-11-27 | 2 | -164/+82
  llvm-svn: 195843
* Convert more methods in static helpers. | Rafael Espindola | 2013-11-27 | 2 | -33/+26
  llvm-svn: 195826
* Convert these methods into static functions. | Rafael Espindola | 2013-11-27 | 2 | -58/+56
  llvm-svn: 195825
* Cleanup and test X86AsmPrinter::printPCRelImm. | Rafael Espindola | 2013-11-27 | 1 | -4/+0
  It is only used for asm printing. On X86 we put basic block addresses on register before passing them to inline asm, so the MO_MachineBasicBlock case was dead.
  MO_ExternalSymbol was dead since any symbol being passed to inline asm is represented as MO_GlobalAddress.
  The MO_GlobalAddress and MO_Register cases were not tested.
  llvm-svn: 195824
* Fix comment in PPCA2Model | Hal Finkel | 2013-11-27 | 1 | -1/+1
  llvm-svn: 195807
* Remove dead argument. | Rafael Espindola | 2013-11-27 | 1 | -17/+14
  llvm-svn: 195806
* [AArch64] Add support for NEON scalar floating-point absolute difference. | Chad Rosier | 2013-11-27 | 1 | -0/+5
  llvm-svn: 195803
* [AArch64] Add support for NEON scalar floating-point to integer convert instructions. | Chad Rosier | 2013-11-26 | 1 | -1/+63
  llvm-svn: 195788
* Fix a bug related to constant islands for Mips16 and mips16/32 dual mode. | Reed Kotler | 2013-11-26 | 1 | -3/+2
  The determination of when we are doing constant pools was being made too early in the asm printer.
  llvm-svn: 195781
* Fix PR18054 | Michael Liao | 2013-11-26 | 1 | -7/+15
  - Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG lowering where we need to check whether x is a vector type (in-reg type) of i8, i16 or i32; otherwise, that optimization is not valid.
  llvm-svn: 195779
* Darwin-ARM: use movw/movt for static relocations | Tim Northover | 2013-11-26 | 2 | -8/+4
  llvm-svn: 195759
* [SystemZ] Fix incorrect use of RISBG for a zero-extended right shift | Richard Sandiford | 2013-11-26 | 1 | -19/+8
  We would wrongly transform the testcase into the equivalent of an AND with 1. The problem was that, when testing whether the shifted-in bits of the right shift were significant, we used the width of the final zero-extended result rather than the width of the shifted value.
  llvm-svn: 195731
* Refactored the implementation of AArch64 NEON instruction ZIP, UZP and TRN. | Kevin Qin | 2013-11-26 | 3 | -328/+226
  Fix a bug with mixed use of vget_high_u8() and vuzp_u8().
  llvm-svn: 195716
* [AArch64] Implement 128 bit register copy with NEON. | Kevin Qin | 2013-11-26 | 1 | -17/+19
  llvm-svn: 195713
* StackMap: Implement support for DirectMemRefOp. | Andrew Trick | 2013-11-26 | 3 | -10/+71
  A Direct stack map location records the address of a frame index. This address is itself the value that the runtime requested. This differs from IndirectMemRefOp locations, which refer to stack locations from which the requested values must be loaded. Direct locations can directly communicate the address of an alloca, while IndirectMemRefOp locations handle register spills.
  For example:
    entry:
      %a = alloca i64...
      llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)
  Since both the alloca and stackmap intrinsic are in the entry block, and the intrinsic takes the address of the alloca, the runtime can assume that LLVM will not substitute alloca with any intervening value. This must be verified by the runtime by checking that the stack map's location is a Direct location type. The runtime can then determine the alloca's relative location on the stack immediately after compilation, or at any time thereafter. This differs from Register and Indirect locations, because the runtime can only read the values in those locations when execution reaches the instruction address of the stack map.
  llvm-svn: 195712
* whitespace | Andrew Trick | 2013-11-26 | 1 | -5/+5
  llvm-svn: 195711
* Add an intrinsic for the SSE2 PAUSE instruction. | Cameron McInally | 2013-11-26 | 1 | -1/+3
  llvm-svn: 195697
* Do the string comparison in the constructor instead of once per nop. | Rafael Espindola | 2013-11-25 | 1 | -6/+9
  Thanks to Roman Divacky for the suggestion.
  llvm-svn: 195684
* Don't use nopl in cpus that don't support it. | Rafael Espindola | 2013-11-25 | 1 | -1/+5
  Patch by Mikulas Patocka. I added the test. I checked that for cpu names that gas knows about, it also doesn't generate nopl.
  The modified cpus:
    i686 - there are i686-class CPUs that don't have nopl: Via c3, Transmeta Crusoe, Microsoft VirtualBox - see https://bbs.archlinux.org/viewtopic.php?pid=775414
    k6, k6-2, k6-3, winchip-c6, winchip2 - these are 586-class CPUs
    via c3 c3-2 - see https://bugs.archlinux.org/task/19733 as a proof that Via c3 and c3-Nehemiah don't have nopl
  llvm-svn: 195679
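The two nop commits above (r195679 and r195684) combine a per-CPU capability check with doing that check once at construction time. A minimal Python sketch of the idea follows. It is not the X86 backend code: the class name, the method name, and the simplified padding strategy are hypothetical, and the CPU set contains only the names listed in the r195679 message.

```python
# Hypothetical sketch: decide once, in the constructor, whether the target
# CPU supports the multi-byte nopl instruction, then reuse that flag.

# CPU names taken from the commit message of r195679 (assumed complete only
# for the purpose of this illustration).
CPUS_WITHOUT_NOPL = frozenset(
    {"i686", "k6", "k6-2", "k6-3", "winchip-c6", "winchip2", "c3", "c3-2"}
)

NOP_1_BYTE = b"\x90"          # single-byte nop
NOPL_3_BYTE = b"\x0f\x1f\x00"  # nopl (%eax), a 3-byte nop


class NopWriter:
    def __init__(self, cpu):
        # The string lookup happens here, once, not on every emitted nop.
        self.has_nopl = cpu not in CPUS_WITHOUT_NOPL

    def nop_bytes(self, count):
        """Return `count` bytes of padding (simplified: 3-byte nopl or 0x90)."""
        out = bytearray()
        while self.has_nopl and count >= 3:
            out += NOPL_3_BYTE
            count -= 3
        out += NOP_1_BYTE * count
        return bytes(out)
```

With this sketch, `NopWriter("c3").nop_bytes(4)` pads with four single-byte nops, while a CPU name outside the set (for example "core2", assumed here to support nopl) gets one 3-byte nopl followed by one single-byte nop.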
* Fix indentation typo | Tim Northover | 2013-11-25 | 1 | -1/+1
  llvm-svn: 195660
* ARM: remove special cases for Darwin dynamic-no-pic mode. | Tim Northover | 2013-11-25 | 11 | -104/+73
  These are handled almost identically to static mode (and ELF's global address materialisation), except that a symbol may have "$non_lazy_ptr" appended. This can be handled by passing appropriate flags along with the instruction instead of using entirely separate pseudo-instructions.
  llvm-svn: 195655
* ARM: remove unused patterns. | Tim Northover | 2013-11-25 | 3 | -6/+1
  There is no sane way for an LEApcrel (= single ADR) instruction to generate a global address on any ARM target I know of. Fortunately, no-one was trying to any more, but there were vestigial patterns.
  llvm-svn: 195644
* [ARM] Enable FeatureMP for Cortex-A5 by default. | Amara Emerson | 2013-11-25 | 1 | -1/+1
  Patch by Oliver Stannard.
  llvm-svn: 195640
* X86: enable AVX2 under Haswell native compilation | Tim Northover | 2013-11-25 | 1 | -1/+6
  Patch by Adam Strzelecki
  llvm-svn: 195632
* Fixed a bug about disassembling AArch64 post-index load/store single element instructions. | Hao Liu | 2013-11-25 | 1 | -9/+14
  i.e.
    echo "0x00 0x04 0x80 0x0d" | ../bin/llvm-mc -triple=aarch64 -mattr=+neon -disassemble
    echo "0x00 0x00 0x80 0x0d" | ../bin/llvm-mc -triple=aarch64 -mattr=+neon -disassemble
  will be disassembled into the same instruction st1 {v0b}[0], [x0], x0.
  llvm-svn: 195591
* SparcFrameLowering.cpp: Prune 'DL' [-Wunused-variable] | NAKAMURA Takumi | 2013-11-25 | 1 | -1/+0
  llvm-svn: 195590
* [Sparc] Emit large negative adjustments to SP/FP with sethi+xor instead of sethi+or. | Venkatraman Govindaraju | 2013-11-24 | 4 | -40/+108
  This generates correct code for both sparc32 and sparc64.
  llvm-svn: 195576
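The sethi+xor change in r195576 can be checked with a little bit arithmetic. The Python sketch below models a 64-bit register and is only an illustration, not the backend code. It assumes the usual SPARC semantics: sethi writes a 22-bit immediate into bits 31..10 and clears every other bit, and the 13-bit immediate of or/xor is sign-extended. The helper names are hypothetical. The hix/lox split mirrors the SPARC assembler's %hix/%lox operators, which the commit message itself does not mention.

```python
MASK64 = (1 << 64) - 1


def sethi(imm22):
    """Model SETHI: imm22 goes to bits 31..10, all other bits are zero."""
    return (imm22 & 0x3FFFFF) << 10


def sign_extend13(imm13):
    """Sign-extend a 13-bit immediate to a 64-bit register value."""
    imm13 &= 0x1FFF
    if imm13 & 0x1000:
        imm13 -= 0x2000
    return imm13 & MASK64


def to_signed64(bits):
    """Interpret a 64-bit pattern as a signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def sethi_or(value):
    """sethi hi(value) ; or lo(value): the upper 32 bits stay zero."""
    return sethi(value >> 10) | (value & 0x3FF)


def sethi_xor(value):
    """sethi hix(value) ; xor lox(value): for negative values the
    sign-extended immediate flips bits 63..10, yielding the full
    64-bit two's-complement pattern."""
    hix = (~value >> 10) & 0x3FFFFF
    lox = (value & 0x3FF) | 0x1C00
    return sethi(hix) ^ sign_extend13(lox)
```

For a negative adjustment such as -5000, `sethi_or` leaves 0x00000000FFFFEC78 in the register, which is only correct when truncated to 32 bits, while `sethi_xor` produces the sign-extended 64-bit pattern whose low 32 bits are the same 0xFFFFEC78. That is consistent with the message's claim that one sequence is correct for both sparc32 and sparc64.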
* [Sparc]: Implement LEA pattern for sparcv9. | Venkatraman Govindaraju | 2013-11-24 | 2 | -4/+11
  llvm-svn: 195575
* [SparcV9]: Do not emit .register directives for global registers that are clobbered by calls but not used in the function itself. | Venkatraman Govindaraju | 2013-11-24 | 1 | -1/+1
  llvm-svn: 195574
* [SparcV9] Enable custom lowering of DYNAMIC_STACKALLOC in sparc64. | Venkatraman Govindaraju | 2013-11-24 | 1 | -6/+11
  llvm-svn: 195573
* Make sure that for C++ emitting LwConstant32 pseudos, that it corresponds to what is needed for constant islands. | Reed Kotler | 2013-11-24 | 2 | -3/+3
  The prescan method for Mips16 constant islands will eventually go away. It is only temporary and should be done earlier when the instructions are first created or from the DAG. If we keep it here we need to handle better the situation where constant islands is called multiple times, since we don't want to prescan more than once.
  llvm-svn: 195569
* Fix a funny bug I introduced during conversion of ARM constant islands to Mips. | Reed Kotler | 2013-11-24 | 1 | -3/+5
  I had to move some code and I moved a declaration forward past its first use in the function, but by nutty coincidence there was another variable of the same name and type, with a completely unrelated function, that was declared globally in the class, so no compilation error ensued.
  It required some unusual conditions for it to even matter. Caused test case casts.c in test-suite to fail during compilation with a duplicate symbol error.
  I would have noticed it during final code review for this port.
  llvm-svn: 195565
* R600/SI: Fixing handling of condition codes | Tom Stellard | 2013-11-22 | 4 | -76/+98
  We were ignoring the ordered/unordered bits and also the signed/unsigned bits of condition codes when lowering the DAG to MachineInstrs.
  NOTE: This is a candidate for the 3.4 branch.
  llvm-svn: 195514
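To see why ignoring those bits in r195514 changes results, consider a small Python sketch. It is not the R600 lowering code, and the function names are hypothetical. It shows that ordered and unordered floating-point comparisons disagree when an operand is NaN, and that signed and unsigned integer comparisons disagree when the top bit is set.

```python
import math


def ordered_lt(a, b):
    """Ordered less-than: false whenever either operand is NaN."""
    return not (math.isnan(a) or math.isnan(b)) and a < b


def unordered_lt(a, b):
    """Unordered-or-less-than: true whenever either operand is NaN."""
    return math.isnan(a) or math.isnan(b) or a < b


def to_signed32(bits):
    """Interpret a 32-bit pattern as a signed integer."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & 0x80000000 else bits


def signed_lt(a, b):
    """Compare two 32-bit patterns as signed integers."""
    return to_signed32(a) < to_signed32(b)


def unsigned_lt(a, b):
    """Compare two 32-bit patterns as unsigned integers."""
    return (a & 0xFFFFFFFF) < (b & 0xFFFFFFFF)
```

For example, `ordered_lt(float("nan"), 1.0)` is false while `unordered_lt(float("nan"), 1.0)` is true, and the bit pattern 0xFFFFFFFF is less than 1 as a signed value but greater than 1 as an unsigned value. A lowering that drops either distinction therefore selects the wrong comparison for these inputs.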
* X86: Perform integer comparisons at i32 or larger. | Jim Grosbach | 2013-11-22 | 1 | -0/+29
  Utilizing the 8 and 16 bit comparison instructions, even when an input can be folded into the comparison instruction itself, is typically not worth it. There are too many partial register stalls as a result, leading to significant slowdowns. By always performing comparisons on at least 32-bit registers, performance of the calculation chain leading to the comparison improves.
  Continue to use the smaller comparisons when minimizing size, as that allows better folding of loads into the comparison instructions.
  rdar://15386341
  llvm-svn: 195496
* Teach ISel not to optimize 'optnone' functions (revised). | Paul Robinson | 2013-11-22 | 1 | -0/+5
  Improvements over r195317:
    - Set/restore EnableFastISel flag instead of just running FastISel within SelectAllBasicBlocks; the flag is checked in various places, and FastISel won't run properly if those places don't do the right thing.
    - Test looks for normal ISel versus FastISel behavior, and not something more subtle that doesn't work everywhere.
  Based on work by Andrea Di Biagio.
  llvm-svn: 195491
* Fix PR18014 | Michael Liao | 2013-11-22 | 1 | -0/+9
  - When simplifying the mask generation for BLEND, check whether that mask is also consumed by other non-BLEND insns. If true, skip that simplification.
  llvm-svn: 195476
* [SystemZ] Fix TMHH and TMHL usage for z10 with -O0 | Richard Sandiford | 2013-11-22 | 4 | -20/+34
  I've no idea why I decided to handle TMxx differently from all the other high/low logic operations, but it was a stupid thing to do. The high registers aren't available as separate 32-bit registers on z10, so subreg_h32 can't be used on a GR64 there.
  I've normally been testing with z196 and with -O3 and so hadn't noticed this until now.
  llvm-svn: 195473
* Don't produce tail calls when the caller is x86_thiscallcc. | Rafael Espindola | 2013-11-22 | 1 | -3/+7
  The callee will not pop the stack for us.
  llvm-svn: 195467
* Fix typo in a comment added in r195455. | Daniel Sanders | 2013-11-22 | 1 | -1/+1
  Credit to Matheus Almeida for spotting it.
  llvm-svn: 195456