path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* Fix unsupported addressing mode assertion for pld | David Peixotto | 2014-01-27 | 2 | -22/+18
  Summary: This commit gives an address mode to the PLD instruction. We were getting an assertion failure in the frame lowering code because we had code that was doing a pld of a stack-allocated address. The frame lowering was checking the address mode and then asserting because pld had none defined.
  This commit fixes pld for ARM mode. There was a previous fix for Thumb mode in a separate commit. The commit for Thumb mode added a test in a separate file because it would otherwise fail for ARM. This commit moves the Thumb test back into the prefetch.ll file and adds the corresponding ARM test.
  Differential Revision: http://llvm-reviews.chandlerc.com/D2622
  llvm-svn: 200248
* [DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. | Andrea Di Biagio | 2014-01-27 | 3 | -12/+308
  This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when its input operand is a build vector of constants (or UNDEFs).
  The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support.
  Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode.
  llvm-svn: 200234
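As a rough illustration of the fold described in r200234, the sketch below applies a sign or zero extension lane by lane to a vector of integer constants. It is a toy model in Python, not the DAGCombiner code: the function names are hypothetical, `None` stands in for an UNDEF lane, and keeping UNDEF lanes undefined is an assumption of this sketch rather than something the commit message states.

```python
UNDEF = None  # stand-in for an undefined vector lane (assumption of this sketch)

def zext(value, from_bits, to_bits):
    """Zero-extend: keep the low from_bits bits; the new upper bits are 0."""
    return value & ((1 << from_bits) - 1)

def sext(value, from_bits, to_bits):
    """Sign-extend: replicate the top source bit into the new upper bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):           # sign bit of the narrow type is set
        value -= 1 << from_bits            # reinterpret as a negative number
    return value & ((1 << to_bits) - 1)    # re-encode in the wide type

def fold_ext_of_build_vector(lanes, ext, from_bits, to_bits):
    """Fold ext(build_vector(c0, c1, ...)) into build_vector(ext(c0), ...)."""
    return [UNDEF if lane is UNDEF else ext(lane, from_bits, to_bits)
            for lane in lanes]

folded_sext = fold_ext_of_build_vector([0xFF, 0x01, UNDEF, 0x80], sext, 8, 32)
folded_zext = fold_ext_of_build_vector([0xFF, 0x01, UNDEF, 0x80], zext, 8, 32)
```

The point of the fold is that the extension disappears at compile time: later combines see a plain constant vector instead of an extension node wrapped around one.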
* Additional fix for r200201: the test depends on bit width, so it was moved to the X86 directory. | Stepan Dyatkovskiy | 2014-01-27 | 1 | -0/+0
  llvm-svn: 200202
* Fix for PR18102. | Stepan Dyatkovskiy | 2014-01-27 | 1 | -0/+23
  The issue stems from DAGCombiner::MergeConsecutiveStores, more precisely from the sorting of the mem-op sequence. Consider how MergeConsecutiveStores works for the following example:
    store i8 1, a[0]
    store i8 2, a[1]
    store i8 3, a[1] ; a[1] again.
    return ; DAG starts here
  1. The method collects all 3 stores.
  2. It sorts them by distance from the base pointer (farthest with highest index).
  3. It takes the first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction.
  The point is that we can't determine here which 'store' instruction will be second after sorting ('store 2' or 'store 3'). It happens that 'store 3' ends up second and 'store 2' third. So after merging we get the following result:
    store i16 (1 | 3 << 8), base ; is a[0] but bit-casted to i16
    store i8 2, a[1]
  So we effectively swapped 'store 3' and 'store 2' and got the wrong contents in a[1].
  Fix: in the sort routine, also take the mem-op sequence number into account.
  llvm-svn: 200201
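The ordering problem in PR18102 can be reproduced with a small Python model. This is a simplified sketch, not the DAGCombiner algorithm: stores are `(sequence_number, offset, value)` tuples, they are assumed to be collected walking backwards from the end of the DAG, the merged store is assumed to be emitted before the leftover stores (as in the result shown in the commit message), and all function names are hypothetical.

```python
def merge_first_pair(collected, key):
    """Sort the collected byte stores and merge the first two adjacent ones."""
    ordered = sorted(collected, key=key)
    first = ordered[0]
    # First store in sorted order that writes the byte right after `first`.
    second = next(s for s in ordered[1:] if s[1] == first[1] + 1)
    merged = (first[1], first[2] | (second[2] << 8))   # little-endian i16
    leftovers = sorted(s for s in collected if s not in (first, second))
    return merged, leftovers                            # leftovers in program order

def run(merged, leftovers, size=2):
    """Execute the merged i16 store, then the remaining i8 stores."""
    mem = [0] * size
    offset, value = merged
    mem[offset], mem[offset + 1] = value & 0xFF, value >> 8
    for _seq, off, val in leftovers:
        mem[off] = val
    return mem

# store i8 1, a[0]; store i8 2, a[1]; store i8 3, a[1], collected back to front.
collected = [(2, 1, 3), (1, 1, 2), (0, 0, 1)]

buggy = run(*merge_first_pair(collected, key=lambda s: s[1]))          # offset only
fixed = run(*merge_first_pair(collected, key=lambda s: (s[1], s[0])))  # offset, then sequence
```

Sorting by offset alone leaves the relative order of the two stores to `a[1]` up to the collection order, so the later store gets merged and the earlier one overwrites it afterwards. Adding the sequence number as a tie-breaker makes the earlier store the merge candidate, and the final contents match program order.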
* R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructions | Michel Danzer | 2014-01-27 | 1 | -0/+40
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 200196
* R600/SI: Add intrinsic for S_SENDMSG instruction | Michel Danzer | 2014-01-27 | 1 | -0/+21
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 200195
* [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR. | Kevin Qin | 2014-01-27 | 1 | -0/+270
  Replace r199791.
  llvm-svn: 200180
* Revert r199791. | Kevin Qin | 2014-01-27 | 1 | -270/+0
  It is an old version that has some bugs. I'll commit the latest patch soon.
  llvm-svn: 200179
* Clean up the Legal/Expand logic for SPARC popc. | Jakob Stoklund Olesen | 2014-01-26 | 1 | -2/+2
  llvm-svn: 200141
* Implement the missing bits corresponding to .mips_hack_elf_flags. | Rafael Espindola | 2014-01-26 | 1 | -22/+36
  These were:
  * noreorder handling on the target object streamer and asm parser.
  * setting the initial flag bits based on the enabled features.
  * setting the elf header flag for micromips
  It is *really* depressing I am the one doing this instead of someone at mips actually taking the time to understand the infrastructure.
  llvm-svn: 200138
* Only generate the popc instruction for SPARC CPUs that implement it. | Jakob Stoklund Olesen | 2014-01-26 | 1 | -6/+6
  The popc instruction is defined in the SPARCv9 instruction set architecture, but it was emulated on CPUs older than Niagara 2.
  llvm-svn: 200131
* Fix swapped CASA operands. | Jakob Stoklund Olesen | 2014-01-26 | 1 | -2/+2
  Found by SingleSource/UnitTests/AtomicOps.c
  llvm-svn: 200130
* Improve pattern match from v1i8 to v1i32 for AArch64 Neon. | Jiangning Liu | 2014-01-26 | 1 | -2/+1
  llvm-svn: 200119
* Remove -print-hack-directives from a test where we already do the right thing. | Rafael Espindola | 2014-01-26 | 1 | -1/+1
  llvm-svn: 200116
* Move tests that just use llc from test/MC/Mips to test/MC/Codegen. | Rafael Espindola | 2014-01-26 | 4 | -0/+16533
  This is an expanded version of r200064.
  llvm-svn: 200115
* Implement pattern match from v1xx to v1xx for AArch64 Neon. | Jiangning Liu | 2014-01-26 | 1 | -0/+114
  llvm-svn: 200113
* [AArch64 NEON] Add patterns for concat_vector on v2i32. | Kevin Qin | 2014-01-26 | 1 | -19/+46
  llvm-svn: 200111
* [AArch64 NEON] Add test case for vector FP_ROUND. | Kevin Qin | 2014-01-26 | 1 | -0/+18
  llvm-svn: 200110
* Add a TBAA CodeGen failure test case | Hal Finkel | 2014-01-25 | 1 | -0/+41
  I disabled the use of TBAA in CodeGen in r200093. This adds a test case that demonstrates the problems with inttoptr and TBAA in CodeGen (and, specifically, the problem that causes LLVM to miscompile itself in Release mode). This test will currently fail if -use-tbaa-in-sched-mi is enabled.
  llvm-svn: 200097
* XFAIL test/CodeGen/SystemZ/alias-01.ll which requires CodeGen TBAA | Hal Finkel | 2014-01-25 | 1 | -0/+3
  llvm-svn: 200094
* This reverts commit r200064 and r200051. | Rafael Espindola | 2014-01-25 | 2 | -134/+0
  r200064 depends on r200051.
  r200051 is broken: It tries to replace .mips_hack_elf_flags, which is a good thing, but what it replaces it with is even worse. The new emitMipsELFFlags it adds corresponds to no assembly directive, is not marked as a hack and is not even printed to the .s file.
  The patch also introduces more uses of hasRawTextSupport.
  The correct way to remove .mips_hack_elf_flags is to have the mips target streamer handle the default flags (and command line options). That way the same code path is used for asm and obj. The streamer interface should *really* correspond to what is printed in the .s file.
  llvm-svn: 200078
* [Mips] Move 2 test cases from MC to CodeGen. | Jack Carter | 2014-01-25 | 2 | -0/+134
  No code changes. Just reassignment of test case files.
  llvm-svn: 200064
* Revert "Revert "Add Constant Hoisting Pass" (r200034)" | Juergen Ributzka | 2014-01-25 | 2 | -2/+71
  This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings.
  llvm-svn: 200062
* Revert "Add Constant Hoisting Pass" (r200034) | Hans Wennborg | 2014-01-25 | 2 | -71/+2
  This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants.
  We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemented in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM.
  llvm-svn: 200058
* [AArch64] Removed unused i8 type from FPR8 register class. | Ana Pazos | 2014-01-24 | 1 | -0/+47
  The i8 type is not registered with any register class. This causes a segmentation fault in MachineLICM::getRegisterClassIDAndCost.
  The code selects the first type associated with register class FPR8, which happens to be i8. It uses this type (i8) to get the representative class pointer, which is 0. It then uses this pointer to access a field, resulting in a segmentation fault.
  Since the i8 type is not being used for printing any neon instruction, we can safely remove it.
  llvm-svn: 200046
* Add Constant Hoisting Pass | Juergen Ributzka | 2014-01-24 | 2 | -2/+71
  Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses.
  llvm-svn: 200034
* Add a testcase for the changes in r199938. | Lang Hames | 2014-01-24 | 1 | -3/+21
  <rdar://problem/15611947>
  llvm-svn: 200027
* Revert "Add Constant Hoisting Pass" | Juergen Ributzka | 2014-01-24 | 2 | -57/+2
  This reverts commit r200022 to unbreak the build bots.
  llvm-svn: 200024
* Add Constant Hoisting Pass | Juergen Ributzka | 2014-01-24 | 2 | -2/+57
  This pass identifies expensive constants to hoist and coalesces them to better prepare them for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach.
  First it scans all instructions for integer constants and calculates their cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free.
  If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code.
  When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant.
  This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast expressions. For example:
    %0 = load i64* inttoptr (i64 big_constant to i64*)
  Reviewed by Eric
  llvm-svn: 200022
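The selection step described in r200022 (skip constants that are free or cost one simple operation, collect the rest) can be sketched as follows. This is a toy Python model, not the pass itself: the cost function and its thresholds are hypothetical stand-ins for a target's getIntImmCost, and "coalescing" is reduced here to grouping uses of the identical value.

```python
from collections import defaultdict

# Cost levels named after the ones in the commit message.
TCC_FREE, TCC_BASIC, TCC_EXPENSIVE = 0, 1, 4

def toy_int_imm_cost(value):
    """Hypothetical target hook: small immediates fold into the instruction,
    32-bit ones take one simple operation, anything wider is expensive."""
    if -(1 << 15) <= value < (1 << 15):
        return TCC_FREE
    if -(1 << 31) <= value < (1 << 31):
        return TCC_BASIC
    return TCC_EXPENSIVE

def hoisting_candidates(uses, cost=toy_int_imm_cost):
    """Map each expensive constant to the basic blocks that use it.

    `uses` is an iterable of (basic_block_name, integer_constant) pairs.
    """
    groups = defaultdict(list)
    for block, value in uses:
        if cost(value) > TCC_BASIC:      # cheap constants are left alone
            groups[value].append(block)
    return dict(groups)

uses = [("entry", 7), ("entry", 1 << 40), ("loop", 1 << 40), ("loop", 100_000)]
candidates = hoisting_candidates(uses)
```

In this example only `1 << 40` qualifies, and it is used in two blocks, which is the situation where materializing it once and keeping it live across blocks can pay off.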
* Fix known typos | Alp Toker | 2014-01-24 | 11 | -15/+15
  Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt.
  llvm-svn: 200018
* Don't use "llc -filetype=obj" now that the codepath is the same. | Rafael Espindola | 2014-01-24 | 3 | -3/+3
  r200011 removed the special codepaths in MC for inline asm, so we can now test all the logic with just llc + llvm-mc.
  llvm-svn: 200013
* [AArch64 NEON] Fix a bug in implementing register copy between FPR16. | Kevin Qin | 2014-01-24 | 1 | -1/+12
  llvm-svn: 199978
* [X86] Prevent the creation of redundant ops for sadd and ssub with overflow. | Juergen Ributzka | 2014-01-24 | 1 | -0/+34
  This commit teaches the X86 backend to create the same X86 instructions when it lowers an sadd/ssub with overflow intrinsic and a conditional branch that uses that overflow result. This allows SelectionDAG to recognize and remove one of the redundant operations.
  This fixes <rdar://problem/15874016> and <rdar://problem/15661073>.
  Reviewed by Nadav
  llvm-svn: 199976
* Implement atomicrmw operations in 32 and 64 bits for SPARCv9. | Jakob Stoklund Olesen | 2014-01-24 | 1 | -1/+82
  These all use the compare-and-swap CASA/CASXA instructions.
  llvm-svn: 199975
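The general technique behind r199975, building a read-modify-write operation out of compare-and-swap, is a retry loop. The Python sketch below models it with a hypothetical `Cell` class whose `compare_and_swap` returns the old memory value, which is how a CAS-style instruction reports success or failure. It is single-threaded and therefore only illustrates the control flow, not real atomicity, and it is not the code the SPARC backend emits.

```python
class Cell:
    """Toy memory word with a compare-and-swap primitive (hypothetical)."""

    def __init__(self, value):
        self.value = value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell holds `expected`; return the old value."""
        old = self.value
        if old == expected:
            self.value = new
        return old

def atomic_rmw(cell, op, operand):
    """Apply `op(old, operand)` to the cell with a CAS retry loop.

    Returns the value the cell held before the update, like atomicrmw does.
    """
    old = cell.value                       # plain load of the current value
    while True:
        new = op(old, operand)
        seen = cell.compare_and_swap(old, new)
        if seen == old:                    # nobody changed the cell in between
            return old
        old = seen                         # retry with the value actually seen

cell = Cell(5)
previous = atomic_rmw(cell, lambda a, b: a + b, 3)
```

Because every operation (add, and, or, xor, min, max, and so on) only differs in `op`, a single loop shape covers all of them, which matches the commit's remark that they all use the compare-and-swap instructions.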
* Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator loops. | Lang Hames | 2014-01-23 | 1 | -0/+15
  Writing back to the accumulator (231-type) allows the coalescer to eliminate an extra copy.
  llvm-svn: 199933
* [Thumbv8] Fix the value of BLXOperandIndex of isV8EligibleForIT | Weiming Zhao | 2014-01-23 | 2 | -3/+23
  Originally, BLX was passed as operand #0 in MachineInstr and as operand #2 in MCInst. But now, it's operand #2 in both cases.
  This patch also removes unnecessary FileCheck in the test case added by r199127.
  llvm-svn: 199928
* Move test to x86 directory. | Eric Christopher | 2014-01-23 | 1 | -104/+0
  llvm-svn: 199927
* [AArch64] Added vselect patterns with float and double types | Ana Pazos | 2014-01-23 | 1 | -0/+13
  llvm-svn: 199925
* Avoid emitting a DWARF type attribute for an ObjC property of type void. | Eric Christopher | 2014-01-23 | 1 | -0/+104
  Patch by Scott Talbot.
  llvm-svn: 199924
* R600: Disable the BFE pattern | Tom Stellard | 2014-01-23 | 1 | -0/+2
  This pattern uses an SDNodeXForm, which isn't being emitted for some reason. I can get it to work by attaching the PatLeaf that has the XForm to the argument in the output pattern, but this results in an immediate being used in a register operand, which the backend can't handle yet.
  llvm-svn: 199918
* R600: Correctly handle vertex fetch clauses that precede ENDIFs | Tom Stellard | 2014-01-23 | 1 | -0/+29
  The control flow finalizer would sometimes use an ALU_POP_AFTER instruction before the vertex fetch clause instead of using a POP instruction after it.
  llvm-svn: 199917
* R600: Unconditionally unroll loops that contain GEPs with alloca pointers | Tom Stellard | 2014-01-23 | 1 | -0/+37
  Implement the getUnrollingPreferences() function for AMDGPUTargetTransformInfo so that loops that do address calculations on pointers derived from alloca are unconditionally unrolled.
  Unrolling these loops makes it more likely that SROA will be able to eliminate the allocas, which is a big win for R600 since memory allocated by alloca (private memory) is really slow.
  llvm-svn: 199916
* Move a unit test into the correct dir. Sorry if it broke Mips-only builds. | Andrew Trick | 2014-01-23 | 1 | -0/+0
  llvm-svn: 199911
* R600: Recommit 199842: Add work-around for the CF stack entry HW bug | Tom Stellard | 2014-01-23 | 1 | -0/+227
  The unit test is now disabled on non-asserts builds.
  The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when the number of sub-entries modulo 8 is either 7 or 0).
  We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules.
  reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 199905
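The relationship between the exact hardware condition and the conservative rule in this commit message can be written down as two predicates. The Python sketch below is only a restatement of the text: the function and parameter names are hypothetical, and the entry sizes used in the check are illustrative values, not documented hardware parameters.

```python
def hw_bug_possible(sub_entries, stack_entry_size, is_cedar=False):
    """Exact trigger condition as described in the commit message."""
    if sub_entries < stack_entry_size:
        return False
    if is_cedar:
        return sub_entries % 8 in (7, 0)
    return sub_entries % 4 in (0, 3)

def apply_workaround(sub_entries, stack_entry_size):
    """Conservative rule the commit chose: ignore the modulo condition."""
    return sub_entries >= stack_entry_size

def conservative_rule_covers_bug(max_sub_entries=64, entry_sizes=(4, 8)):
    """Check that every buggy case is covered by the conservative rule."""
    return all(
        apply_workaround(n, size)
        for size in entry_sizes            # illustrative sizes only
        for cedar in (False, True)
        for n in range(max_sub_entries)
        if hw_bug_possible(n, size, cedar)
    )

covered = conservative_rule_covers_bug()
```

The conservative predicate is a superset of the exact one by construction, so it applies the work-around more often than strictly needed in exchange for not depending on an exact sub-entry count.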
* AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions, they give better sequences than VPERMI | Elena Demikhovsky | 2014-01-23 | 1 | -7/+7
  llvm-svn: 199893
* ARM: use litpools for normal i32 imms when compiling minsize. | Tim Northover | 2014-01-23 | 1 | -0/+57
  With constant-sharing, litpool loads consume 4 + N*2 bytes of code, but movw/movt pairs consume 8*N. This means litpools are better than movw/movt even with just one use. Other materialisation strategies can still be better though, so the logic is a little odd.
  llvm-svn: 199891
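The size comparison in r199891 is simple arithmetic, worked through below in Python for a few use counts. The two formulas are taken directly from the commit message; the function names are made up for this sketch.

```python
def litpool_bytes(n_uses):
    """One shared 4-byte pool entry plus a 2-byte load per use."""
    return 4 + 2 * n_uses

def movw_movt_bytes(n_uses):
    """Each use needs its own movw/movt pair, 8 bytes in total."""
    return 8 * n_uses

# (uses, litpool size, movw/movt size) for N = 1..4
size_table = [(n, litpool_bytes(n), movw_movt_bytes(n)) for n in range(1, 5)]
```

For a single use this gives 6 bytes against 8, and the gap grows by 6 bytes with every additional use of the same constant, which is why the commit prefers litpools whenever code size is the priority.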
* [AArch64] Add CHECK for two test cases testing scalar_to_vector committed in r199461. | Hao Liu | 2014-01-23 | 1 | -6/+19
  llvm-svn: 199861
* Revert r162101 and replace it with a solution that works for targets where the pointer type is illegal. | Owen Anderson | 2014-01-22 | 1 | -1/+1
  This is a horrible bit of code. We're calling a simplification routine *in the middle* of type legalization. We tell the simplification routine that it's running after legalization, but some of the types it will encounter will be illegal! The fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated.
  llvm-svn: 199847
* Revert "R600: Add work-around for the CF stack entry HW bug" | Tom Stellard | 2014-01-22 | 1 | -225/+0
  This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba.
  The -debug-only flag for llc doesn't appear to be available in all build configurations.
  llvm-svn: 199845
* R600: Add work-around for the CF stack entry HW bug | Tom Stellard | 2014-01-22 | 1 | -0/+225
  The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when the number of sub-entries modulo 8 is either 7 or 0).
  We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules.
  reviewed-by: Vincent Lejeune <vljn at ovi.com>
  llvm-svn: 199842