path: root/llvm/test/CodeGen
Commit message  Author  Age  Files  Lines
* Check in conditional branches for constant islands. Still need to finishReed Kotler2013-11-283-3/+136
| | | | | | | | | | | | conditional branches for very large targets. That will be the next small patch. Everything should now in principle work as well (functionality-wise) as without constant islands, so we decided at Mips/Imagination to make constant islands the default for Mips16 now so that it gets exercised a lot. This port is still experimental, though hopefully we will change the status soon. Some more cleanup and code review is in order, but things are converging fast. llvm-svn: 195902
* [mips] Implement the following optimizations using dominance information toAkira Hatanaka2013-11-271-0/+83
| | | | | | | | | | | make PIC calls a little more efficient: 1. Remove instructions setting up $gp if it is known that a function has been called at least once. 2. Save the address of a called function in a register instead of loading it from the GOT at every call site. llvm-svn: 195892
* R600: Expand vector FABSTom Stellard2013-11-271-2/+34
| | | | | NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195881
* R600/SI: Implement spilling of SGPRs v5Tom Stellard2013-11-271-3/+877
| | | | | | | | | | | | | | | | | | | | | | | | | | SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions. v2: - Fix encoding of Lane Mask - Use correct register flags, so we don't overwrite the low dword when restoring multi-dword registers. v3: - Register spilling seems to hang the GPU, so replace all shaders that need spilling with a dummy shader. v4: - Fix *LANE definitions - Change destination reg class for 32-bit SMRD instructions v5: - Remove small optimization that was crashing Serious Sam 3. https://bugs.freedesktop.org/show_bug.cgi?id=68224 https://bugs.freedesktop.org/show_bug.cgi?id=71285 NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195880
* R600/SI: Use SGPR_32 register class for 32-bit SMRD outputsTom Stellard2013-11-271-0/+692
| | | | | | | | Writing to the M0 register from an SMRD instruction hangs the GPU, so we need to use the SGPR_32 register class, which does not include M0. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195879
* R600: Add support for ISD::FROUNDTom Stellard2013-11-271-0/+41
| | | | | NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195878
* Use FileCheck and expand the test a bit.Rafael Espindola2013-11-271-2/+9
| | | | | | In particular, check the name of the symbol we are putting in the constant pool. llvm-svn: 195865
* Fix the AArch64 NEON bug exposed by checking constant integer argument range ↵Jiangning Liu2013-11-271-375/+1591
| | | | | | of ACLE intrinsics. llvm-svn: 195843
* Cleanup and test X86AsmPrinter::printPCRelImm.Rafael Espindola2013-11-271-0/+15
| | | | | | | | | | | | | | It is only used for asm printing. On X86 we put basic block addresses in a register before passing them to inline asm, so the MO_MachineBasicBlock case was dead. MO_ExternalSymbol was dead since any symbol being passed to inline asm is represented as MO_GlobalAddress. The MO_GlobalAddress and MO_Register cases were not tested. llvm-svn: 195824
* [AArch64] Add support for NEON scalar floating-point absolute difference.Chad Rosier2013-11-271-0/+26
| | | | llvm-svn: 195803
* [AArch64] Add support for NEON scalar floating-point to integer convertChad Rosier2013-11-261-0/+255
| | | | | | instructions. llvm-svn: 195788
* Fix a bug related to constant islands for Mips16 and mips16/32 dual mode.Reed Kotler2013-11-261-1/+5
| | | | | | | The determination of when we are doing constant pools was being made too early in the asm printer. llvm-svn: 195781
* Fix PR18054Michael Liao2013-11-261-0/+10
| | | | | | | | - Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG lowering where we need to check whether x is a vector type (in-reg type) of i8, i16 or i32; otherwise, that optimization is not valid. llvm-svn: 195779
* Darwin-ARM: use movw/movt for static relocationsTim Northover2013-11-261-4/+4
| | | | llvm-svn: 195759
* [SystemZ] Fix incorrect use of RISBG for a zero-extended right shiftRichard Sandiford2013-11-261-0/+14
| | | | | | | | | We would wrongly transform the testcase into the equivalent of an AND with 1. The problem was that, when testing whether the shifted-in bits of the right shift were significant, we used the width of the final zero-extended result rather than the width of the shifted value. llvm-svn: 195731
* Refactored the implementation of AArch64 NEON instruction ZIP, UZPKevin Qin2013-11-261-0/+14
| | | | | | | and TRN. Fix a bug with mixed use of vget_high_u8() and vuzp_u8(). llvm-svn: 195716
* [AArch64]Implement 128 bit register copy with NEON.Kevin Qin2013-11-261-0/+3
| | | | llvm-svn: 195713
* StackMap: Implement support for DirectMemRefOp.Andrew Trick2013-11-262-10/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A Direct stack map location records the address of a frame index. This address is itself the value that the runtime requested. This differs from IndirectMemRefOp locations, which refer to stack locations from which the requested values must be loaded. Direct locations can directly communicate the address of an alloca, while IndirectMemRefOp locations handle register spills. For example: entry: %a = alloca i64... llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a) Since both the alloca and stackmap intrinsic are in the entry block, and the intrinsic takes the address of the alloca, the runtime can assume that LLVM will not substitute the alloca with any intervening value. This must be verified by the runtime by checking that the stack map's location is a Direct location type. The runtime can then determine the alloca's relative location on the stack immediately after compilation, or at any time thereafter. This differs from Register and Indirect locations, because the runtime can only read the values in those locations when execution reaches the instruction address of the stack map. llvm-svn: 195712
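The Direct versus Indirect distinction described in r195712 can be sketched as a toy model. This is only an illustration, not LLVM's stack map format or any runtime's API: the `resolve` helper, the register name `fp`, the offsets and the memory contents are all hypothetical.

```python
# Toy model of the two stack map location kinds described in r195712.
# Everything here (names, offsets, memory contents) is a made-up illustration.
DIRECT = "Direct"
INDIRECT = "Indirect"


def resolve(kind, base_reg, offset, registers, memory):
    """Return the value a runtime would obtain for one recorded location."""
    address = registers[base_reg] + offset
    if kind == DIRECT:
        # Direct: the computed address is itself the requested value
        # (for example, the address of an alloca).
        return address
    if kind == INDIRECT:
        # Indirect: the requested value must be loaded from that stack slot
        # (for example, a spilled register).
        return memory[address]
    raise ValueError("unknown location kind: " + kind)


registers = {"fp": 0x7FFF0000}
memory = {0x7FFF0000 - 16: 42}  # one spilled value at fp-16

alloca_address = resolve(DIRECT, "fp", -8, registers, memory)
spilled_value = resolve(INDIRECT, "fp", -16, registers, memory)
```

In this model only the Indirect lookup touches memory, which matches the commit's point that a Direct location can be interpreted without reading the stack slot.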
* Add an intrinsic for the SSE2 PAUSE instruction.Cameron McInally2013-11-261-0/+7
| | | | llvm-svn: 195697
* Unrevert r195599 with testcase fix.Bill Wendling2013-11-251-0/+31
| | | | | | | I'm not sure how it was checking for the wrong values... PR18023. llvm-svn: 195670
* [ARM] Enable FeatureMP for Cortex-A5 by default.Amara Emerson2013-11-251-1/+44
| | | | | | Patch by Oliver Stannard. llvm-svn: 195640
* Revert r195599 as it broke the builds.Amara Emerson2013-11-251-29/+0
| | | | llvm-svn: 195636
* Fixed tryFoldToZero() for vector types that need expansion.Daniel Sanders2013-11-251-0/+143
| | | | | | | | | | | | | | | | | | | | | | Summary: Moved the requirement for SelectionDAG::getConstant() to return legally typed nodes slightly earlier. There were two optional DAGCombine passes that were missed out and were required to produce type-legal DAGs. Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant(). This provides support for both promoted and expanded vector types whereas the previous code only supported promoted vector types. Fixes a "Type for zero vector elements is not legal" assertion detected by an llvm-stress generated test. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2251 llvm-svn: 195635
* Don't look past volatile loads.Bill Wendling2013-11-251-0/+29
| | | | | | | A volatile load should block us from trying to coalesce stores. PR18023 llvm-svn: 195599
* [Sparc] Emit large negative adjustments to SP/FP with sethi+xor instead of ↵Venkatraman Govindaraju2013-11-241-0/+21
| | | | | | sethi+or. This generates correct code for both sparc32 and sparc64. llvm-svn: 195576
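A small sketch of why sethi+xor is safer than sethi+or for negative constants, under stated assumptions: the constant fits in a signed 32-bit value, `sethi` zero-extends into a 64-bit register, and the immediate is split in the style of the SPARC `%hix`/`%lox` operators. The helper names below are hypothetical and are not taken from the patch.

```python
# Illustrative model only; helper names are hypothetical.
def to_signed(value, bits):
    """Interpret a bits-wide register value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sethi(imm22):
    """sethi: place a 22-bit immediate in bits 31..10, all other bits zero."""
    return (imm22 & 0x3FFFFF) << 10


def sign_extend13(imm13):
    """Sign-extend the 13-bit immediate of an arithmetic instruction."""
    imm13 &= 0x1FFF
    return imm13 - 0x2000 if imm13 & 0x1000 else imm13


def sethi_or(value, bits):
    """sethi %hi(value) followed by or %lo(value)."""
    hi22 = (value >> 10) & 0x3FFFFF
    lo10 = value & 0x3FF
    return (sethi(hi22) | lo10) & ((1 << bits) - 1)


def sethi_xor(value, bits):
    """sethi %hix(value) followed by xor %lox(value), for negative values."""
    mask = (1 << bits) - 1
    hix = (~value >> 10) & 0x3FFFFF   # upper bits of the complement
    lox = (value & 0x3FF) | 0x1C00    # low 10 bits, forced negative as simm13
    return (sethi(hix) ^ (sign_extend13(lox) & mask)) & mask


adjustment = -100000
or_32 = to_signed(sethi_or(adjustment, 32), 32)
or_64 = to_signed(sethi_or(adjustment, 64), 64)
xor_32 = to_signed(sethi_xor(adjustment, 32), 32)
xor_64 = to_signed(sethi_xor(adjustment, 64), 64)
```

In this model the or-based sequence yields the right value only in a 32-bit register, because the upper 32 bits stay zero on sparc64, while the xor with a sign-extended immediate flips those upper bits and gives the negative value at both widths.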
* [SparcV9]: Do not emit .register directives for global registers that are ↵Venkatraman Govindaraju2013-11-241-2/+0
| | | | | | clobbered by calls but not used in the function itself. llvm-svn: 195574
* [SparcV9] Enable custom lowering of DYNAMIC_STACKALLOC in sparc64.Venkatraman Govindaraju2013-11-241-6/+16
| | | | llvm-svn: 195573
* Make sure that for C++ emitting LwConstant32 pseudos, that it correspondsReed Kotler2013-11-241-1/+3
| | | | | | | | | | to what is needed for constant islands. The prescan method for Mips16 constant islands will eventually go away. It is only temporary and should be done earlier, when the instructions are first created or from the DAG. If we keep it here we need to better handle the situation where constant islands is called multiple times, since we don't want to prescan more than once. llvm-svn: 195569
* Update older test cases for latest patch.Reed Kotler2013-11-242-2/+2
| | | | llvm-svn: 195566
* Fix a funny bug I introduced during conversion of ARM constant islands to Mips.Reed Kotler2013-11-241-0/+39
| | | | | | | | | | | | I had to move some code and I moved a declaration forward past its first use in the function, but by nutty coincidence there was another variable of the same name and type, with a completely unrelated function, that was declared globally in the class, so no compilation error ensued. It required some unusual conditions for it to even matter. Caused test case casts.c in test-suite to fail during compilation with a duplicate symbol error. I would have noticed it during final code review for this port. llvm-svn: 195565
* Debug Info: update testing cases to specify the debug info version number.Manman Ren2013-11-232-0/+5
| | | | | | | | | | We are going to drop debug info without a version number or with a different version number, to make sure we don't crash when we see bitcode files with different debug info metadata format. Make tests more robust by removing hard-coded metadata numbers in CHECK lines. llvm-svn: 195535
* R600/SI: Fixing handling of condition codesTom Stellard2013-11-222-9/+577
| | | | | | | | We were ignoring the ordered/unordered bits and also the signed/unsigned bits of condition codes when lowering the DAG to MachineInstrs. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195514
* Debug Info: update testing cases to specify the debug info version number.Manman Ren2013-11-2234-3/+68
| | | | | | | | We are going to drop debug info without a version number or with a different version number, to make sure we don't crash when we see bitcode files with different debug info metadata format. llvm-svn: 195504
* X86: Perform integer comparisons at i32 or larger.Jim Grosbach2013-11-226-106/+21
| | | | | | | | | | | | | | | Utilizing the 8 and 16 bit comparison instructions, even when an input can be folded into the comparison instruction itself, is typically not worth it. There are too many partial register stalls as a result, leading to significant slowdowns. By always performing comparisons on at least 32-bit registers, performance of the calculation chain leading to the comparison improves. Continue to use the smaller comparisons when minimizing size, as that allows better folding of loads into the comparison instructions. rdar://15386341 llvm-svn: 195496
* Teach ISel not to optimize 'optnone' functions (revised).Paul Robinson2013-11-221-0/+42
| | | | | | | | | | | | | Improvements over r195317: - Set/restore EnableFastISel flag instead of just running FastISel within SelectAllBasicBlocks; the flag is checked in various places, and FastISel won't run properly if those places don't do the right thing. - Test looks for normal ISel versus FastISel behavior, and not something more subtle that doesn't work everywhere. Based on work by Andrea Di Biagio. llvm-svn: 195491
* patchpoint: factor SD builder code for live vars. Plain stackmap also ↵Andrew Trick2013-11-221-1/+19
| | | | | | optimizes Constant values now. llvm-svn: 195488
* Fix PR18014Michael Liao2013-11-221-0/+16
| | | | | | | - When simplifying the mask generation for BLEND, check whether that mask is also consumed by other non-BLEND insns. If true, skip that simplification. llvm-svn: 195476
* [SystemZ] Fix TMHH and TMHL usage for z10 with -O0Richard Sandiford2013-11-222-0/+50
| | | | | | | | | | | | I've no idea why I decided to handle TMxx differently from all the other high/low logic operations, but it was a stupid thing to do. The high registers aren't available as separate 32-bit registers on z10, so subreg_h32 can't be used on a GR64 there. I've normally been testing with z196 and with -O3 and so hadn't noticed this until now. llvm-svn: 195473
* [mips][msa] Add test case that should have been added in r195456.Daniel Sanders2013-11-221-0/+31
| | | | llvm-svn: 195469
* Don't produce tail calls when the caller is x86_thiscallcc.Rafael Espindola2013-11-221-0/+8
| | | | | | The callee will not pop the stack for us. llvm-svn: 195467
* ARM: use CHECK-LABEL on a test.Tim Northover2013-11-221-20/+20
| | | | llvm-svn: 195457
* Add support for Cortex-A12.Richard Barton2013-11-221-0/+32
| | | | | | Patch by Oliver Stannard! llvm-svn: 195448
* [mips][msa] Float vector constants cannot use ldi.[wd] directly. Bitcast ↵Daniel Sanders2013-11-221-0/+27
| | | | | | | | from the appropriate integer vector type. Fixes an instruction selection failure detected by llvm-stress. llvm-svn: 195444
* Revert r195318 as it causes miscompilation (PR18029)Kostya Serebryany2013-11-222-4/+6
| | | | llvm-svn: 195439
* Fix the bugs about AArch64 Load/Store vector types and bitcast between i64 ↵Hao Liu2013-11-222-0/+155
| | | | | | | | | and vector types. e.g. "%tmp = load <2 x i64>* %ptr" can't be selected. "%tmp = bitcast i64 %in to <2 x i32>" can't be selected. llvm-svn: 195424
* For AArch64 back-end instruction selection, lower Neon_Lowxxx with ↵Jiangning Liu2013-11-221-42/+42
| | | | | | EXTRACT_SUBREG. llvm-svn: 195408
* Tweak 3 tests in llvm/test/CodeGen/X86 to add -mcpu=generic since r195383.NAKAMURA Takumi2013-11-223-3/+3
| | | | | | | They failed on bdver2 buildslave. FIXME: FileCheck-ize them. llvm-svn: 195407
* SelectionDAG: Optimize expansion of vec_type = BITCAST scalar_typeTom Stellard2013-11-221-0/+15
| | | | | | | | | The legalizer can now do this type of expansion for more type combinations without loading and storing to and from the stack. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195398
* SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency ↵Ekaterina Romanova2013-11-214-0/+275
| | | | | | | | | | on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea, which are DirectPath instructions. These alternative instructions not only have a lower latency but also increase the decode bandwidth by allowing simultaneous decoding of a third DirectPath instruction. AMD's processor families K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size. It might be beneficial to disable this folding for some Intel processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel processors, I haven't disabled this peephole for Intel. llvm-svn: 195383
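The pattern `(x << c) | (y >> (64 - c))` mentioned in r195383 is the value a 64-bit double-precision left shift produces. A minimal sketch, assuming 64-bit operands and a shift count strictly between 0 and 64; the function names are illustrative and do not come from the patch.

```python
# Illustrative model of a 64-bit double shift; function names are made up.
MASK64 = (1 << 64) - 1


def shld64_reference(x, y, c):
    """Shift the 128-bit value x:y left by c and keep the high 64 bits."""
    assert 0 < c < 64
    wide = ((x & MASK64) << 64) | (y & MASK64)
    return ((wide << c) >> 64) & MASK64


def shld64_expanded(x, y, c):
    """The same result built from two plain shifts and an or."""
    assert 0 < c < 64
    return ((x << c) & MASK64) | ((y & MASK64) >> (64 - c))


x = 0x0123456789ABCDEF
y = 0xFEDCBA9876543210
folded = shld64_reference(x, y, 8)
expanded = shld64_expanded(x, y, 8)
```

Both functions agree, which is why the backend is free to either fold the shl/shr/or pattern into one SHLD or leave the separate instructions in place when the double shift is slow on the target.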
* [ARM] add the overlooked tests for Cortex-A7 build attributesArtyom Skrobov2013-11-211-0/+75
| | | | llvm-svn: 195365