path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* [X86] Add target combine rules for horizontal add/sub. | Andrea Di Biagio | 2014-06-09 | 1 | -0/+376
    This patch adds new target-specific combine rules to identify horizontal add/sub idioms from BUILD_VECTOR dag nodes.
    This patch also teaches the DAGCombiner how to canonicalize sequences of insert_vector_elt dag nodes according to the following rule:
      (insert_vector_elt (insert_vector_elt A, I0), I1) -> (insert_vector_elt (insert_vector_elt A, I1), I0)
    This new canonicalization rule only triggers if the inner insert_vector_elt dag node has exactly one use; also, both indices must be known constants, and I1 < I0.
    This last rule made it possible to write a simpler algorithm to identify horizontal add/sub patterns because now we don't have to worry about the ordering of insert_vector_elt dag nodes.
    llvm-svn: 210477
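As a rough illustration of the insert_vector_elt canonicalization described in r210477, the sketch below models a chain of inserts as a Python list of (index, value) pairs, innermost insert first. The names `apply_inserts` and `canonicalize` are hypothetical and are not LLVM APIs; the one-use and constant-index preconditions are assumed to hold rather than modeled.

```python
def apply_inserts(base, inserts):
    """Apply (index, value) inserts in order, innermost first, to a copy of base."""
    vec = list(base)
    for idx, val in inserts:
        vec[idx] = val
    return vec


def canonicalize(inserts):
    """Repeatedly swap adjacent inserts when the outer index is strictly smaller.

    This mirrors the rule (insert (insert A, I0), I1) -> (insert (insert A, I1), I0)
    for I1 < I0. Inserts with equal indices are never swapped, so the
    resulting vector is unchanged.
    """
    out = list(inserts)
    changed = True
    while changed:
        changed = False
        for i in range(len(out) - 1):
            inner, outer = out[i], out[i + 1]
            if outer[0] < inner[0]:
                out[i], out[i + 1] = outer, inner
                changed = True
    return out


chain = [(3, "d"), (1, "b"), (2, "c")]
canonical = canonicalize(chain)
```

After canonicalization the indices appear in ascending order from the innermost insert outward, which is what lets a pattern matcher walk the chain without considering every permutation.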
* R600/SI: Keep 64-bit not on SALU | Matt Arsenault | 2014-06-09 | 2 | -1/+41
    llvm-svn: 210476
* R600: Fix selection failure for vector bswap | Matt Arsenault | 2014-06-09 | 1 | -0/+50
    llvm-svn: 210475
* [PPC64LE] Generate correct little-endian code for v16i8 multiply | Bill Schmidt | 2014-06-09 | 1 | -0/+17
    The existing code in PPCTargetLowering::LowerMUL() for multiplying two v16i8 values assumes that vector elements are numbered in big-endian order. For little-endian targets, the vector element numbering is reversed, but the vmuleub, vmuloub, and vperm instructions still assume big-endian numbering. To account for this, we must adjust the permute control vector and reverse the order of the input registers on the vperm instruction.
    The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed on powerpc64 and powerpc64le targets as well as the original powerpc (32-bit) target.
    llvm-svn: 210474
* llvm/test/CodeGen/X86/2014-05-29-factorial.ll: Relax an expression to match Win32 x64. | NAKAMURA Takumi | 2014-06-09 | 1 | -2/+2
    llvm-svn: 210471
* [X86] Avoid emitting unnecessary test instructions. | Andrea Di Biagio | 2014-06-09 | 1 | -0/+24
    This patch teaches the backend how to check for the 'NoSignedWrap' flag on binary operations to improve the emission of 'test' instructions.
    If the result of a binary operation is known not to overflow, we know that resetting the Overflow flag is unnecessary and so we can avoid emitting the test instruction.
    Patch by Marcello Maggioni.
    llvm-svn: 210468
* [DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG. | Andrea Di Biagio | 2014-06-09 | 1 | -0/+20
    This patch modifies SelectionDAGBuilder to construct SDNodes with associated NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator instructions.
    Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing nsw/nuw/exact flags during codegen.
    Patch by Marcello Maggioni.
    llvm-svn: 210467
* R600: Add more and testcases | Matt Arsenault | 2014-06-09 | 1 | -18/+88
    llvm-svn: 210453
* [AArch64] Fix the ordering of the accumulate operand in SchedRW list. | Chad Rosier | 2014-06-09 | 1 | -3/+4
    Patch by Dave Estes <cestes@codeaurora.org>
    http://reviews.llvm.org/D4037
    llvm-svn: 210446
* [AArch64] When combining constant mul of power of 2 plus/minus 1, prefer shift | Chad Rosier | 2014-06-09 | 1 | -0/+8
    plus add. The shift can be folded into the add. This only affects codegen when the constant is 3.
    llvm-svn: 210445
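To make the arithmetic behind r210445 concrete: a multiply by 2^n + 1 can be written as a shift plus an add, and a multiply by 2^n - 1 as a shift minus the original value. The constant 3 fits both forms (2^1 + 1 and 2^2 - 1), which is why a preference between them only matters there. The sketch below is an illustration in Python, not the AArch64 backend code; `decompose` and `mul_by_const` are hypothetical names.

```python
def decompose(c):
    """Classify c as 2**n + 1 ('add') or 2**n - 1 ('sub'), preferring 'add'.

    Returns (kind, n), or None when c has neither form.
    """
    if c > 2 and (c - 1) & (c - 2) == 0:   # c - 1 is a power of two
        return ("add", (c - 1).bit_length() - 1)
    if c > 1 and (c + 1) & c == 0:         # c + 1 is a power of two
        return ("sub", (c + 1).bit_length() - 1)
    return None


def mul_by_const(x, c):
    """Multiply x by c using shift plus add/sub when c allows it."""
    form = decompose(c)
    if form is None:
        return x * c
    kind, n = form
    shifted = x << n
    return shifted + x if kind == "add" else shifted - x
```

Checking the "add" form first is how this sketch models the stated preference: 3 is handled as (x << 1) + x rather than (x << 2) - x.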
* Revert "Do materialize for floating point" | Alp Toker | 2014-06-08 | 1 | -39/+0
    1) The commit was made despite profound lack of understanding:
       "I did not understand the comment about using dyn_cast instead of isa. I will commit as is and make the update after. You can explain what you meant to me."
       Commit first, understand later isn't OK.
    2) Review comments were simply ignored:
       "Can you edit the summary to describe what the patch is for? It appears to be a list of commits at the moment."
    3) The patch got LGTM'd off-list without any indication of readiness.
    4) The public mailing list was excluded from patch review so all of this was hidden from the community.
    This reverts commit r210414.
    llvm-svn: 210424
* Do materialize for floating point | Reed Kotler | 2014-06-08 | 1 | -0/+39
    Summary: start to do simple constants finish simplestore add test case format Merge branch 'master' into 1756_8 Add basic functionality for assignment of ints. This creates a lot of core infrastructure in which to add, with little effort, quite a bit more to mips fast-isel Merge branch 'master' into 1756_8 Add basic functionality for assignment of ints. This creates a lot of core infrastructure in which to add, with little effort, quite a bit more to mips fast-isel in progress finish integer materialize test cases test cases in progress Finish up fast-isel materialize for ints. Finish materialize for ints test cases simplestorei.ll Merge branch 'master' into 1756_8 fix fp constants for fast-isel Merge branch '1758_1' of dmz-portal.mips.com:llvm into 1758_1 in progress latest for fp materialization clean up Merge branch 'master' into 1758_1 formatting add test case finish test case Merge branch 'master' into 1758_2
    Test Plan: simplestore.ll simplestore.ll
    Reviewers: dsanders
    Reviewed By: dsanders
    Differential Revision: http://reviews.llvm.org/D3659
    llvm-svn: 210414
* test: add test case for SVN r210406 | Saleem Abdulrasool | 2014-06-08 | 1 | -0/+12
    Add missing test case for constructor section selection. Thanks David Blaikie!
    llvm-svn: 210409
* Fix typos | Alp Toker | 2014-06-07 | 1 | -3/+3
    llvm-svn: 210401
* ARM: correct assertion for long-calls on WoA | Saleem Abdulrasool | 2014-06-07 | 1 | -0/+18
    COFF/PE, so the relocation model is never static. Loosen the assertion accordingly. The relocation can still be emitted properly, as it will be converted to an IMAGE_REL_ARM_ADDR32 which will be resolved by the loader taking the base relocation into account. This is necessary to permit the emission of long calls which can be controlled via the -mlong-calls option in the driver.
    llvm-svn: 210399
* X86: Don't turn shifts into ands if there's another use that may not check for equality. | Benjamin Kramer | 2014-06-06 | 1 | -0/+13
    Fixes PR19964.
    llvm-svn: 210371
* Fixed a bug in lowering shuffle_vectors to insertps | Filipe Cabecinhas | 2014-06-06 | 2 | -2/+15
    Summary: We were being too strict and not accounting for undefs. Added a test case and fixed another one where we improved codegen.
    Reviewers: grosbach, nadav, delena
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D4039
    llvm-svn: 210361
* [PPC64LE] Fix lowering of BUILD_VECTOR and SHUFFLE_VECTOR for little endian | Bill Schmidt | 2014-06-06 | 1 | -0/+66
    This patch fixes a couple of lowering issues for little endian PowerPC. The code for lowering BUILD_VECTOR contains a number of optimizations that are only valid for big endian. For now, we disable those optimizations for correctness. In the future, we will add analogous optimizations that are correct for little endian.
    When lowering a SHUFFLE_VECTOR to a VPERM operation, we again need to make the now-familiar transformation of swapping the input operands and complementing the permute control vector. Correctness of this transformation is tested by the accompanying test case.
    llvm-svn: 210336
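The "swap the inputs and complement the control vector" transformation mentioned in r210336 can be checked with a small model. The sketch below assumes a simplified vperm that concatenates its two 16-byte inputs and selects bytes by the low 5 bits of each control byte, with all registers indexed in big-endian order. It is an illustration only; `vperm_be`, `shuffle_le`, and `shuffle_le_via_vperm` are hypothetical names, not LLVM or ISA identifiers.

```python
def vperm_be(va, vb, vc):
    """Simplified vperm: all three 16-element lists use big-endian indexing."""
    cat = va + vb
    return [cat[c & 31] for c in vc]


def to_be(le_elems):
    """Reorder a register listed in little-endian element order into BE order."""
    return le_elems[::-1]


def shuffle_le(a_le, b_le, mask):
    """Reference two-input shuffle with little-endian element numbering."""
    cat = a_le + b_le
    return [cat[m] for m in mask]


def shuffle_le_via_vperm(a_le, b_le, mask):
    """Same shuffle, built from the BE-numbered vperm model.

    The inputs are swapped and each mask entry is complemented
    (31 - m in the low 5 bits) to compensate for the reversed numbering.
    """
    control_le = [(~m) & 31 for m in mask]
    result_be = vperm_be(to_be(b_le), to_be(a_le), to_be(control_le))
    return result_be[::-1]  # back to little-endian element order


a = list(range(100, 116))
b = list(range(200, 216))
mask = [(i * 7 + 3) % 32 for i in range(16)]
```

In this model, complementing an index m gives 31 - m, which flips both the position inside a register and the choice of register; swapping the operands undoes the latter.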
* Allow aliases to be unnamed_addr. | Rafael Espindola | 2014-06-06 | 6 | -9/+9
    Aliases with unnamed_addr were in a strange state. It is stored in GlobalValue, the language reference talks about "unnamed_addr aliases" but the verifier was rejecting them.
    It seems natural to allow unnamed_addr in aliases:
      * It is a property of how it is accessed, not of the data itself.
      * It is perfectly possible to write code that depends on the address of an alias.
    This patch then makes unnamed_addr legal for aliases. One side effect is that the syntax changes for a corner case: In globals, unnamed_addr is now printed before the address space.
    llvm-svn: 210302
* [PPC64LE] Add test case for r210282 commit | Bill Schmidt | 2014-06-05 | 1 | -0/+17
    Chandler correctly pointed out that I need an LLVM IR test for r210282, which modified the vperm -> shuffle transform for little endian PowerPC. This patch provides that test.
    llvm-svn: 210297
* Adding explicit triples to the ARM jumptable tests | Tom Roeder | 2014-06-05 | 1 | -2/+2
    llvm-svn: 210288
* Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute. | Tom Roeder | 2014-06-05 | 5 | -1/+384
    It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.
    This also adds backend support for generating the jump-instruction tables on ARM and X86.
    Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr.
    llvm-svn: 210280
* [mips] Modify long branch for NaCl: | Sasa Stankovic | 2014-06-05 | 1 | -0/+34
      * Move the instruction that changes sp outside of the branch delay slot.
      * Bundle-align the target of indirect branch.
    Differential Revision: http://llvm-reviews.chandlerc.com/D3928
    llvm-svn: 210262
* Prevent hoisting the instruction whose def might be clobbered by the terminator. | Sasa Stankovic | 2014-06-05 | 1 | -0/+144
    llvm-svn: 210261
* R600: Fix test. Using wrong check prefix. | Matt Arsenault | 2014-06-05 | 1 | -21/+21
    llvm-svn: 210244
* R600/SI: Match rsq instructions | Matt Arsenault | 2014-06-05 | 1 | -0/+26
    llvm-svn: 210226
* Revert r209381 as it isn't a local variable. Add a testcase so that | Eric Christopher | 2014-06-03 | 1 | -0/+23
    we know next time this happens.
    llvm-svn: 210127
* [AArch64] Add regression tests for the load/store optimizer which cover post-index update folding with sub rather than add. | Tilmann Scheller | 2014-06-03 | 1 | -0/+130
    The tests check that the following transform happens:
      (ldr|str) X, [x20]
      ...
      sub x20, x20, #16
      ->
      (ldr|str) X, [x20], #-16
    with X being either w0, x0, s0, d0 or q0.
    llvm-svn: 210113
* AArch64: mark small types (i1, i8, i16) as promoted | Tim Northover | 2014-06-03 | 1 | -13/+14
    This means the output of LowerFormalArguments returns a lowered SDValue with the correct type (expected in SelectionDAGBuilder). Without this, an assertion under a DEBUG macro triggers when those types are passed on the stack.
    llvm-svn: 210102
* [AArch64] Correctly deal with VPR stack parameter passing. | Jiangning Liu | 2014-06-03 | 1 | -0/+8
    llvm-svn: 210067
* Allow alias to point to an arbitrary ConstantExpr. | Rafael Espindola | 2014-06-03 | 3 | -3/+16
    This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is up to MC (or the system assembler) to decide if that expression is valid or not.
    This reduces our ability to diagnose invalid uses and how early we can spot them, but it also lets us do things like
      @test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32), i32 ptrtoint (i32* @bar to i32)) to i32*)
    An important implication of this patch is that the notion of aliased global doesn't exist any more. The alias has to encode the information needed to access it in its metadata (linkage, visibility, type, etc).
    Another consequence to notice is that getSection has to return a "const char *". It could return a NullTerminatedStringRef if there was such a thing, but when that was proposed the decision was to just use "const char*" for that.
    llvm-svn: 210062
* [X86] Fix checked arithmetic for i8 on X86. | Andrea Di Biagio | 2014-06-02 | 1 | -0/+24
    When lowering an ISD::BRCOND into a test+branch, make sure that we always use the correct condition code to emit the test operation.
    This fixes PR19858: "i8 checked mul is wrong on x86".
    Patch by Keno Fisher!
    llvm-svn: 210032
* [AArch64] Add some more regression tests for store pre-index update folding in the load/store optimizer. | Tilmann Scheller | 2014-06-02 | 1 | -0/+105
    Add tests for the following transform:
      add x8, x8, #16
      ...
      str X, [x8]
      ->
      str X, [x8, #16]!
    with X being either w0, x0, s0, d0 or q0.
    llvm-svn: 210021
* [AArch64] Add some more regression tests for load pre-index update folding in the load/store optimizer. | Tilmann Scheller | 2014-06-02 | 1 | -0/+106
    Add tests for the following transform:
      add x8, x8, #16
      ...
      ldr X, [x8]
      ->
      ldr X, [x8, #16]!
    with X being either w0, x0, s0, d0 or q0.
    llvm-svn: 210018
* ARMEB: Fix function return type f64 | Christian Pirker | 2014-06-01 | 1 | -0/+12
    Reviewed at http://reviews.llvm.org/D3968
    llvm-svn: 209990
* R600/SI: Fix [s|u]int_to_fp for i1 | Matt Arsenault | 2014-05-31 | 4 | -29/+124
    llvm-svn: 209971
* Make blend tests more specific | Filipe Cabecinhas | 2014-05-31 | 3 | -8/+17
    Following the lead set by r209324, I'm making these tests match the whole instruction, so we can be sure we're lowering them correctly.
    llvm-svn: 209947
* [X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type. | Andrea Di Biagio | 2014-05-30 | 2 | -27/+281
    This patch teaches the backend how to simplify/canonicalize dag node sequences normally introduced by the backend when promoting certain dag nodes with illegal vector type.
    This patch adds two new combine rules:
      1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) -> (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>)
      2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) -> (shuffle (BINOP A, B), Undef, <Mask>).
    Both rules are only triggered on the type-legalized DAG. In particular, rule 1 is a target-specific combine rule that attempts to sink a bitconvert into the operands of a binary operation. Rule 2 is a target-independent rule that attempts to move a shuffle immediately after a binary operation.
    llvm-svn: 209930
* Convert a vselect into a concat_vector if possible | Filipe Cabecinhas | 2014-05-30 | 2 | -1/+14
    Summary: If both vector args to vselect are concat_vectors and the condition is constant and picks half a vector from each argument, convert the vselect into a concat_vectors.
    Added a test.
    The ConvertSelectToConcatVector is assuming it doesn't get vselects with arguments of, for example, <undef, undef, true, true>. Those get taken care of in the checks above its call.
    Reviewers: nadav, delena, grosbach, hfinkel
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3916
    llvm-svn: 209929
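The vselect fold in r209929 can be pictured with plain lists: when a constant condition takes one whole half from each concatenated operand, the select is just a concatenation of the chosen halves. The sketch below is an assumption-laden illustration, not the ConvertSelectToConcatVector code; `vselect` and `vselect_as_concat` are hypothetical names, and undef condition lanes are not modeled.

```python
def vselect(cond, a, b):
    """Lane-wise select: take a[i] where cond[i] is true, else b[i]."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


def vselect_as_concat(cond, a_halves, b_halves):
    """Return the concatenation equivalent to the vselect, or None if the fold does not apply.

    a_halves and b_halves are (low, high) pairs whose concatenation forms
    each operand. The fold applies only when cond is uniform within each
    half and the two halves choose different operands.
    """
    half = len(cond) // 2
    lo, hi = cond[:half], cond[half:]
    if any(c != lo[0] for c in lo) or any(c != hi[0] for c in hi):
        return None
    if lo[0] == hi[0]:
        return None  # whole vector comes from one operand: a different fold
    low_part = a_halves[0] if lo[0] else b_halves[0]
    high_part = a_halves[1] if hi[0] else b_halves[1]
    return low_part + high_part


a_halves = ([1, 2], [3, 4])
b_halves = ([5, 6], [7, 8])
cond = [True, True, False, False]
folded = vselect_as_concat(cond, a_halves, b_halves)
```

A mixed condition such as `[True, False, True, False]` makes the helper return `None`, mirroring the requirement that each half be picked as a unit.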
* Separate the check for blend shuffle_vector masks | Filipe Cabecinhas | 2014-05-30 | 1 | -2/+2
    Summary: Separate the check for blend shuffle_vector masks into isBlendMask. This function will also be used to check if a vector shuffle is legal. No change in functionality was intended, but we ended up improving codegen on two tests, which were being (more) optimized only if the resulting shuffle was legal.
    Reviewers: nadav, delena, andreadb
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3964
    llvm-svn: 209923
* Fix MIPS exception personality encoding. | Logan Chien | 2014-05-30 | 1 | -0/+34
    For MIPS, we have to encode the personality routine with an indirect pointer to absptr; otherwise, a link warning will be raised, and the program might crash on some early MIPS Android devices.
    llvm-svn: 209907
* [pr19636] Fix known bit computation in urem instruction with power of two. | Rafael Espindola | 2014-05-30 | 1 | -0/+14
    Patch by Andrey Kuharev.
    llvm-svn: 209902
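For background on what a known-bits computation for urem by a power of two has to get right, the sketch below models the general idea in Python; it is not a reconstruction of the r209902 patch. An unsigned remainder by 2^k equals a mask with 2^k - 1, so every bit at position k and above is known zero, and the low bits inherit whatever was known about the dividend. The function name and the 8-bit default width are assumptions made for illustration.

```python
def urem_pow2_known_bits(known_zero, known_one, divisor, width=8):
    """Known-bit masks for (x urem divisor), where divisor is a power of two.

    known_zero / known_one are bit masks of the dividend's bits known to be
    0 / 1. Returns the corresponding pair of masks for the result.
    """
    assert divisor > 0 and divisor & (divisor - 1) == 0, "divisor must be a power of two"
    low_mask = divisor - 1
    full = (1 << width) - 1
    out_zero = (known_zero | ~low_mask) & full  # high bits are cleared by the remainder
    out_one = known_one & low_mask              # only low known-one bits survive
    return out_zero, out_one
```

A known-one bit above the divisor's low mask must not be carried into the result, since the remainder always clears it; the `known_one & low_mask` step is what enforces that here.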
* SelectionDAG: skip barriers for unordered atomic operations | Tim Northover | 2014-05-30 | 1 | -12/+25
    Unordered is strictly weaker than monotonic, so if the latter doesn't have any barriers then the former certainly shouldn't.
    rdar://problem/16548260
    llvm-svn: 209901
* ARM: use AAPCS-style prologues for embedded MachO. | Tim Northover | 2014-05-30 | 3 | -18/+16
    Darwin prologues save their GPRs in two stages: a narrow push of r0-r7 & lr, followed by a wide push of the remaining registers if there are any. AAPCS uses a single push.w instruction.
    It turns out that, on average, enough registers get pushed that code is smaller in the AAPCS prologue, which is a nice property for M-class programmers. They also have other options available for back-traces, so can hopefully deal with the fact that FP & LR aren't adjacent in memory.
    rdar://problem/15909583
    llvm-svn: 209895
* AArch64 & ARM: disable generic test that relies on no CFG changes. | Tim Northover | 2014-05-30 | 1 | -0/+3
    llvm-svn: 209885
* ARM & AArch64: make use of common cmpxchg idioms after expansion | Tim Northover | 2014-05-30 | 55 | -78/+265
    The C and C++ semantics for compare_exchange require it to return a bool indicating success. This gets mapped to LLVM IR which follows each cmpxchg with an icmp of the value loaded against the desired value.
    When lowered to ldxr/stxr loops, this extra comparison is redundant: its results are implicit in the control-flow of the function.
    This commit makes two changes: it replaces that icmp with appropriate PHI nodes, and then makes sure earlyCSE is called after expansion to actually make use of the opportunities revealed.
    I've also added -{arm,aarch64}-enable-atomic-tidy options, so that existing fragile tests aren't perturbed too much by the change. Many of them either rely on undef/unreachable too pervasively to be restored to something well-defined (particularly while making sure they test the same obscure assert from many years ago), or depend on a particular CFG shape, which is disrupted by SimplifyCFG.
    rdar://problem/16227836
    llvm-svn: 209883
* AArch64 & ARM: remove undefined behaviour from some tests. | Tim Northover | 2014-05-30 | 13 | -62/+66
    llvm-svn: 209880
* Naming test cases with dates is a legacy rule that is no longer used. Rename several test cases. | Hao Liu | 2014-05-30 | 4 | -0/+0
    llvm-svn: 209877
* [X86] Move test from r209863 to CodeGen/X86 | Adam Nemet | 2014-05-29 | 1 | -0/+41
    We should only run this if X86 is in the targets.
    llvm-svn: 209866
* [X86] Remove AVX1 vbroadcast intrinsics | Adam Nemet | 2014-05-29 | 1 | -24/+0
    The corresponding CFE patch replaces these intrinsics with vector initializers in avxintrin.h. This patch removes the LLVM intrinsics from the backend. We now stop lowering at X86ISD::VBROADCAST custom node rather than lowering that further to the intrinsics.
    The patch only changes VBROADCASTS* and leaves VBROADCAST[FI]128 to continue to use intrinsics. As explained in the CFE patch, the reason is that we currently don't generate as good code for them without the intrinsics.
    CodeGen/X86/avx-vbroadcast.ll already provides coverage for this change. It checks that for a series of insertelements we generate the appropriate vbroadcast instruction.
    Also verified that there was no assembly change in the test-suite before and after this patch.
    llvm-svn: 209864