path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
...
* Make sure SP is always aligned on a 2 byte boundary | Job Noorman | 2013-10-24 | 1 | -0/+17
    llvm-svn: 193320
* [AArch64] Fix NZCV reg live-in bug in F128CSEL codegen. | Amara Emerson | 2013-10-24 | 1 | -0/+17
    When generating the IfTrue basic block during the F128CSEL pseudo-instruction handling, the NZCV live-in for the newly created BB wasn't being added. This caused a fault during MI-sched/live range calculation when the predecessor for the fall-through BB didn't have a live-in for phys-reg as expected.
    llvm-svn: 193316
* AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsics | Elena Demikhovsky | 2013-10-24 | 1 | -0/+15
    llvm-svn: 193312
* Replace sse41/sse42 with sse4.1/sse4.2 in test command lines to fix bots. | Craig Topper | 2013-10-24 | 2 | -2/+2
    llvm-svn: 193311
* Add non-AVX tests for AES intrinsics. | Craig Topper | 2013-10-24 | 1 | -0/+48
    llvm-svn: 193310
* Add tests for SSE intrinsics in non-avx mode by copying from the AVX test cases. | Craig Topper | 2013-10-24 | 6 | -0/+1704
    Some of these may have been tested by other tests, but most weren't. Patch by Cameron McInally.
    llvm-svn: 193309
* X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate. | Benjamin Kramer | 2013-10-23 | 4 | -3/+37
    Also update the cost model.
    llvm-svn: 193270
* X86: Custom lower zext v16i8 to v16i16. | Benjamin Kramer | 2013-10-23 | 2 | -0/+21
    On sandy bridge (PR17654) we now get
        vpxor %xmm1, %xmm1, %xmm1
        vpunpckhbw %xmm1, %xmm0, %xmm2
        vpunpcklbw %xmm1, %xmm0, %xmm0
        vinsertf128 $1, %xmm2, %ymm0, %ymm0
    On haswell it's a simple
        vpmovzxbw %xmm0, %ymm0
    There is a maze of duplicated and dead transforms and patterns in this area. Remove the dead custom lowering of zext v8i16 to v8i32, that's already handled by LowerAVXExtend.
    llvm-svn: 193262
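Why interleaving with a zeroed register amounts to a zero-extension can be checked with a small model. The following is only a sketch in Python, not the backend code: the helper names are hypothetical, and vectors are modeled as lists of byte values with element 0 as the lowest byte.

```python
def unpack_bytes(a, b, high):
    """Model (v)punpcklbw / (v)punpckhbw on two 16-byte vectors.

    Interleaves the low (or high) 8 bytes of a and b: a0, b0, a1, b1, ...
    """
    assert len(a) == 16 and len(b) == 16
    start = 8 if high else 0
    out = []
    for i in range(start, start + 8):
        out.append(a[i])
        out.append(b[i])
    return out


def bytes_to_words(v):
    """Reinterpret 16 bytes as eight little-endian 16-bit words."""
    return [v[i] | (v[i + 1] << 8) for i in range(0, 16, 2)]


def zext_v16i8_to_v16i16(x):
    """Hypothetical model of the four-instruction Sandy Bridge sequence."""
    zero = [0] * 16                          # vpxor: all-zero register
    hi = unpack_bytes(x, zero, high=True)    # vpunpckhbw: upper 8 bytes
    lo = unpack_bytes(x, zero, high=False)   # vpunpcklbw: lower 8 bytes
    # vinsertf128 $1: the high half becomes the upper 128-bit lane.
    return bytes_to_words(lo) + bytes_to_words(hi)


x = list(range(240, 256))
assert zext_v16i8_to_v16i16(x) == x
```

Each 16-bit word gets a source byte in its low half and a zero byte in its high half, so the word value equals the original byte value.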
* Fix PR17631 | Michael Liao | 2013-10-23 | 1 | -0/+22
    - Skip instructions added in prolog. For specific targets, prolog may insert helper function calls (e.g. _chkstk will be called when there're more than 4K bytes allocated on stack). However, these helpers don't use/def YMM/XMM registers.
    llvm-svn: 193261
* [mips][msa] Added support for matching fexp2 from normal IR (i.e. not intrinsics) | Daniel Sanders | 2013-10-23 | 1 | -0/+69
    llvm-svn: 193239
* R600/SI: fix MIMG writemask adjustment | Tom Stellard | 2013-10-23 | 1 | -0/+93
    This fixes piglit:
    - shaders/glsl-fs-texture2d-masked
    - shaders/glsl-fs-texture2d-masked-4
    Patch by: Marek Olšák
    Signed-off-by: Marek Olšák <marek.olsak@amd.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 193222
* R600: Fix handling of vector kernel arguments | Tom Stellard | 2013-10-23 | 5 | -115/+502
    The SelectionDAGBuilder was promoting vector kernel arguments to legal types, but this won't work for R600 and SI since kernel arguments are stored in memory and can't be promoted. In order to handle vector arguments correctly we need to look at the original types from the LLVM IR function.
    llvm-svn: 193215
* R600/SI: Add support for i64 bitwise or | Tom Stellard | 2013-10-23 | 1 | -4/+17
    llvm-svn: 193213
* R600/SI: Use S_LOAD_DWORD instructions for v8i32 and v16i32 | Tom Stellard | 2013-10-23 | 1 | -5/+10
    llvm-svn: 193212
* R600: Simplify handling of private address space | Tom Stellard | 2013-10-22 | 1 | -0/+39
    The AMDGPUIndirectAddressing pass was previously responsible for lowering private loads and stores to indirect addressing instructions. However, this pass was buggy and way too complicated. The only advantage it had over the new simplified code was that it saved one instruction per direct write to private memory. This optimization likely has a minimal impact on performance, and we may be able to duplicate it using some other transformation.
    For the private address space, we now:
    1. Lower private loads/store to Register(Load|Store) instructions
    2. Reserve part of the register file as 'private memory'
    3. After regalloc lower the Register(Load|Store) instructions to MOV instructions that use indirect addressing.
    llvm-svn: 193179
* AVX-512: aligned / unaligned load and store for 512-bit integer vectors. | Elena Demikhovsky | 2013-10-22 | 1 | -0/+28
    llvm-svn: 193156
* Add testcase for PR3168. It was fixed over time. | Bill Wendling | 2013-10-22 | 1 | -0/+21
    PR3168
    llvm-svn: 193152
* Fix spelling, grammar, and match naming convention for test files. | Eric Christopher | 2013-10-21 | 1 | -1/+1
    llvm-svn: 193130
* [AArch64] Add the constraint to NEON scalar mla/mls instructions. | Chad Rosier | 2013-10-21 | 1 | -20/+24
    llvm-svn: 193117
* Fix CodeGen for vectors of pointers with address spaces. | Matt Arsenault | 2013-10-21 | 1 | -0/+30
    llvm-svn: 193112
* Fix CodeGen for different size address space GEPs | Matt Arsenault | 2013-10-21 | 1 | -0/+10
    llvm-svn: 193111
* X86 vector element shift-by-immediate instructions take i8 immediates. Make | Lang Hames | 2013-10-21 | 2 | -4/+4
    the instruction definitions and ISEL reflect this.
    Prior to this patch these instructions took an i32i8imm, and the high bits were dropped during encoding. This led to incorrect behavior for shifts by immediates higher than 255. This patch fixes that issue by detecting large immediate shifts and returning constant zero (for logical shifts) or capping the shift amount at an encodable value (for arithmetic shifts).
    Fixes <rdar://problem/14968098>
    llvm-svn: 193096
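The failure mode described above can be reproduced with a small model. The sketch below is in Python and is an illustration only, not the ISEL code: it assumes 32-bit elements, assumes 255 as the cap (the largest value an 8-bit immediate can hold), and all function names are hypothetical.

```python
ELEM_BITS = 32
ELEM_MASK = (1 << ELEM_BITS) - 1


def srl_hw(x, count):
    """Logical right shift as the vector instruction behaves: an
    out-of-range count clears the element."""
    return 0 if count >= ELEM_BITS else (x & ELEM_MASK) >> count


def sra_hw(x, count):
    """Arithmetic right shift: an out-of-range count fills the element
    with copies of the sign bit."""
    count = min(count, ELEM_BITS - 1)
    x &= ELEM_MASK
    if x >> (ELEM_BITS - 1):
        x -= 1 << ELEM_BITS          # reinterpret as a negative value
    return (x >> count) & ELEM_MASK


def old_encode(imm):
    """Old behavior: only the low 8 bits of the immediate were encoded."""
    return imm & 0xFF


def lower_logical(x, imm):
    """Fixed lowering (assumed threshold): large shifts fold to zero."""
    return 0 if imm > 255 else srl_hw(x, imm)


def lower_arith(x, imm):
    """Fixed lowering (assumed cap): clamp to an encodable amount."""
    return sra_hw(x, min(imm, 255))


x = 0x80000010
assert srl_hw(x, old_encode(256)) == x       # bug: 256 was encoded as 0
assert lower_logical(x, 256) == 0
assert lower_arith(x, 256) == 0xFFFFFFFF
```

With the truncating encoding, a shift by 256 silently becomes a shift by 0 and returns the input unchanged, which is why logical and arithmetic shifts need different fixes.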
* AVX-512: MUL operation lowering for v8i64 | Elena Demikhovsky | 2013-10-21 | 1 | -1/+10
    llvm-svn: 193083
* [mips][msa] Fix definition of SLD instruction. | Matheus Almeida | 2013-10-21 | 1 | -27/+27
    The second parameter of the SLD intrinsic is the number of columns (GPR) to slide left the source array.
    llvm-svn: 193076
* Emit prefix data after debug and EH directives. | Peter Collingbourne | 2013-10-20 | 1 | -0/+2
    This ensures that the prefix data is treated as part of the function for the purpose of debug info. This provides a better debugging experience, among other things by allowing a debug info client to correctly look up a function in debug info given a function pointer.
    llvm-svn: 193042
* Update PPC loop tests after SCEV non-unit-stride checkin r193015. | Andrew Trick | 2013-10-19 | 2 | -24/+12
    llvm-svn: 193021
* Test case for r192957 | David Majnemer | 2013-10-18 | 1 | -0/+21
    Forgot to 'svn add'
    llvm-svn: 192978
* [PATCH] Fix PR17168 (DAG scheduler inserts DBG_VALUE before PHI with fast-isel) | Bill Schmidt | 2013-10-18 | 1 | -0/+520
    PR17168 describes a test case that fails when compiling for debug with fast-isel. Investigation showed that the test was failing because a DBG_VALUE machine instruction was placed prior to a PHI.
    For this problem to occur requires the following:
      * Compile for debug
      * Compile with fast-isel
      * In a block B, fast-isel must partially succeed before punting to DAG-isel
      * B must start with a PHI
      * The first unhandled node in the DAG must not generate a machine instruction
      * A debug value with an order less than that of that first node exists
    When all of these circumstances apply, the existing test that an instruction was not inserted won't fire. Currently it tests whether the block is empty, or whether the last instruction generated is a phi. When fast-isel has partially succeeded, the last instruction generated will not be a phi. Instead, we need to check whether the current insert position is immediately following a phi. This patch adds that check, and adds the test case from the PR as a regression test.
    llvm-svn: 192976
* [AArch64] Add support for NEON scalar extract narrow instructions. | Chad Rosier | 2013-10-18 | 1 | -0/+104
    llvm-svn: 192970
* [mips][msa] Added a regression test that depended on multiple patches to pass. | Daniel Sanders | 2013-10-18 | 1 | -0/+150
    llvm-svn: 192961
* Revert "Re-commit r192758 - MC: quote tricky symbol names in asm output" | Hans Wennborg | 2013-10-18 | 3 | -7/+5
    This caused the clang-native-mingw32-win7 buildbot to break. The assembler was complaining about the following lines that were showing up in the asm for CrashRecoveryContext.cpp:
        movl $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax)
        calll "_AddVectoredExceptionHandler@8"
        .def "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4";
        "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4":
        calll "_RemoveVectoredExceptionHandler@4"
    Reverting for now.
    llvm-svn: 192940
* 17309 ARM backend incorrectly lowers COPY_STRUCT_BYVAL_I32 for thumb1 targets | David Peixotto | 2013-10-17 | 2 | -0/+1580
    This commit implements the correct lowering of the COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets. Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not have the post-increment form of these instructions so the generated assembly contained invalid instructions.
    Passing the generated assembly to gcc caused it to complain with an error like this:
        Error: cannot honor width suffix -- `ldrb r3,[r0],#1'
    and the integrated assembler would generate an object file with an invalid instruction encoding.
    This commit contains a small test case that demonstrates the problem with thumb1 targets as well as an expanded test case that more thoroughly tests the lowering of byval struct passing for arm, thumb1, and thumb2 targets.
    llvm-svn: 192916
* [AArch64] Add support for NEON scalar three register different instruction | Chad Rosier | 2013-10-17 | 1 | -0/+75
    class. The instruction class includes the signed saturating doubling multiply-add long, signed saturating doubling multiply-subtract long, and the signed saturating doubling multiply long instructions.
    llvm-svn: 192908
* Add testcase to make sure we don't generate a compact unwind section for ELF binaries. | Bill Wendling | 2013-10-17 | 1 | -0/+48
    This tests r190354.
    llvm-svn: 192903
* [mips][msa] Added lsa instruction | Daniel Sanders | 2013-10-17 | 1 | -0/+26
    llvm-svn: 192895
* Fix tests not to depend on specific regalloc or instruction order. | Benjamin Kramer | 2013-10-17 | 2 | -4/+4
    They were failing with -mcpu=atom.
    llvm-svn: 192890
* Fix r192888: test/CodeGen/Mips/msa/3r_ld_st.ll should have been deleted | Daniel Sanders | 2013-10-17 | 1 | -149/+0
    llvm-svn: 192889
* Replace sra with srl if a single sign bit is required | Richard Sandiford | 2013-10-17 | 2 | -5/+16
    E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2).
    llvm-svn: 192884
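The example rewrite can be checked numerically. This is a minimal sketch in Python that models i32 values as unsigned integers; the helper names are hypothetical and nothing here is taken from the DAG combiner itself.

```python
BITS = 32
MASK = (1 << BITS) - 1


def srl(x, n):
    """Logical shift right on a 32-bit value."""
    return (x & MASK) >> n


def sra(x, n):
    """Arithmetic shift right on a 32-bit value."""
    x &= MASK
    if x >> (BITS - 1):
        x -= 1 << BITS               # reinterpret as a negative value
    return (x >> n) & MASK


def before(x):
    return sra(x, 31) & 2            # (and (sra x 31) 2)


def after(x):
    return srl(x, 30) & 2            # (and (srl x 30) 2)


samples = [0, 1, 0x40000000, 0x7FFFFFFF, 0x80000000, 0xC0000000, 0xFFFFFFFF]
assert all(before(x) == after(x) for x in samples)
```

Both forms leave only the sign bit of x in bit position 1: the arithmetic shift replicates it everywhere and the mask picks one copy, while the logical shift moves it straight to bit 1 and the mask discards bit 30 of the input.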
* Fix edge condition in DAGCombiner to improve codegen of shift sequences. | Andrea Di Biagio | 2013-10-17 | 1 | -0/+8
    When canonicalizing dags according to the rule
        (shl (zext (shr X, c1) ), c1) ==> (zext (shl (shr X, c1), c1))
    remember to add the new shl dag to the DAGCombiner worklist of nodes. If we don't explicitly add it to the worklist of nodes to visit, we may not trigger later on the rule that folds the shift left + logical shift right into an AND instruction with bitmask.
    llvm-svn: 192883
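Both the canonicalization and the later AND fold can be verified on concrete values. The Python sketch below is illustrative only: it assumes a 16-bit narrow type and a 32-bit wide type, assumes shr is a logical shift, and uses hypothetical function names.

```python
NARROW = 16   # assumed width of X
WIDE = 32     # assumed width after zext


def mask(bits):
    return (1 << bits) - 1


def original(x, c1):
    """(shl (zext (shr X, c1)), c1): the left shift happens in the wide type."""
    shifted = (x & mask(NARROW)) >> c1
    return (shifted << c1) & mask(WIDE)


def canonical(x, c1):
    """(zext (shl (shr X, c1), c1)): both shifts happen in the narrow type."""
    shifted = (x & mask(NARROW)) >> c1
    return (shifted << c1) & mask(NARROW)


def folded(x, c1):
    """Shift pair folded into an AND that clears the low c1 bits."""
    return x & mask(NARROW) & ~mask(c1)


for c1 in range(NARROW):
    for x in range(0, 1 << NARROW, 251):
        assert original(x, c1) == canonical(x, c1) == folded(x, c1)
```

The narrow left shift cannot overflow because the preceding right shift already cleared the top c1 bits, which is what makes the canonical form safe and exposes the AND fold.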
* x86: Move bitcasts outside concat_vector. | Jim Grosbach | 2013-10-17 | 1 | -1/+24
    Consider the following:
        typedef unsigned short ushort4U __attribute__((ext_vector_type(4), aligned(2)));
        typedef unsigned short ushort4 __attribute__((ext_vector_type(4)));
        typedef unsigned short ushort8 __attribute__((ext_vector_type(8)));
        typedef int int4 __attribute__((ext_vector_type(4)));

        int4 __bbase_cvt_int(ushort4 v) {
          ushort8 a;
          a.lo = v;
          return _mm_cvtepu16_epi32(a);
        }
    This generates the, not unreasonable, IR:
        define <4 x i32> @foo0(double %v.coerce) nounwind ssp {
          %tmp = bitcast double %v.coerce to <4 x i16>
          %tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
          %tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1)
          ret <4 x i32> %tmp2
        }
    The problem is when type legalization gets hold of the v4i16. It legalizes that by spilling to the stack, then doing a zero-extending load. Things go even more silly from there, ending up with something like:
        _foo0:
          movsd %xmm0, -8(%rsp)     <== Spill to the stack.
          movq -8(%rsp), %xmm0      <== Reload it right back out.
          pmovzxwd %xmm0, %xmm1     <== Here's what we actually asked for.
          pblendw $1, %xmm1, %xmm0  <== We don't need this at all
          pmovzxwd %xmm0, %xmm0     <== We already did this
          ret
    The v8i8 to v8i16 zext intrinsic gives even worse results, with two table lookups via pshufb instructions(!!).
    To avoid all that, we can move the bitcasting until after we've formed the wider (legal) vector type. Then our normal codegen flows along nicely and we get the expected:
        _foo0:
          pmovzxwd %xmm0, %xmm0
          ret
    rdar://15245794
    llvm-svn: 192866
* Re-commit r192758 - MC: quote tricky symbol names in asm output | Hans Wennborg | 2013-10-17 | 3 | -5/+7
    The reason this got reverted was that the @feat.00 symbol which was emitted for every TU became quoted, and on cygwin/mingw we use the gas assembler which couldn't handle the quotes.
    This commit fixes the problem by only emitting @feat.00 for win32, where we use clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no loss there.
    With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since it uses the Itanium ABI, which doesn't put these funny characters in symbols.
    > Because of win32 mangling, we produce symbol and section names with
    > funny characters in them, most notably @ characters.
    >
    > MC would choke on trying to parse its own assembly output. This patch addresses
    > that by:
    >
    > - Making @ trigger quoting of symbol names
    > - Also quote section names in the same way
    > - Just parse section names like other identifiers (to allow for quotes)
    > - Don't assume @ signifies a symbol variant if it is in a string.
    llvm-svn: 192859
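The quoting rule in the quoted message can be illustrated with a short sketch. The Python below is hypothetical and is not MC's implementation: the set of characters treated as safe without quotes is an assumption made for the example, and escaping of embedded quotes is left out.

```python
import re

# Assumption for illustration: names built only from these characters are
# printed bare; anything else (notably '@') triggers quoting.
_PLAIN_NAME = re.compile(r"[A-Za-z0-9_$.]+")


def quote_if_needed(name):
    """Return the name as it would be printed in this simplified model."""
    if _PLAIN_NAME.fullmatch(name):
        return name
    return '"' + name + '"'


assert quote_if_needed("_foo0") == "_foo0"
assert quote_if_needed("_AddVectoredExceptionHandler@8") == '"_AddVectoredExceptionHandler@8"'
assert quote_if_needed("@feat.00") == '"@feat.00"'
```

Under this model `@feat.00` is quoted like any other name containing `@`, which matches the reason given above for the earlier revert on targets assembled with gas.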
* [AArch64] Add support for NEON scalar negate instruction. | Chad Rosier | 2013-10-16 | 1 | -0/+12
    llvm-svn: 192843
* [AArch64] Add support for NEON scalar absolute value instruction. | Chad Rosier | 2013-10-16 | 1 | -0/+12
    llvm-svn: 192842
* Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar, | Yunzhong Gao | 2013-10-16 | 1 | -0/+3
    bulldozer and piledriver. Support for the instruction itself seems to have already been added in r178040.
    Differential Revision: http://llvm-reviews.chandlerc.com/D1933
    llvm-svn: 192828
* R600: Fix a crash in the AMDILCFGStructurizer | Tom Stellard | 2013-10-16 | 1 | -0/+83
    We were calling llvm_unreachable() when failing to optimize the branch into an if case. However, it is still possible for us to structurize the CFG by duplicating blocks even if this optimization fails.
    Reviewed-by: Vincent Lejeune <vljn at ovi.com>
    llvm-svn: 192813
* Port to FileCheck. | Rafael Espindola | 2013-10-16 | 1 | -4/+17
    llvm-svn: 192810
* [AArch64] Add support for NEON scalar signed saturating accumulated of unsigned | Chad Rosier | 2013-10-16 | 1 | -0/+104
    value and unsigned saturating accumulate of signed value instructions.
    llvm-svn: 192800
* DAGCombiner: Don't fold xor into not if getNOT would introduce an illegal constant. | Benjamin Kramer | 2013-10-16 | 1 | -0/+14
    This happens e.g. with <2 x i64> -1 on x86_32. It cannot be generated directly because i64 is illegal. It would be nice if getNOT would handle this transparently, but I don't see a way to generate a legal constant there right now.
    Fixes PR17487.
    llvm-svn: 192795
* [SystemZ] Handle extensions in RxSBG optimizations | Richard Sandiford | 2013-10-16 | 1 | -3/+2
    The input to an RxSBG operation can be narrower as long as the upper bits are don't care. This fixes a FIXME added in r192783.
    llvm-svn: 192790
* [SystemZ] Improve handling of SETCC | Richard Sandiford | 2013-10-16 | 3 | -16/+260
    We previously used the default expansion to SELECT_CC, which in turn would expand to "LHI; BRC; LHI". In most cases it's better to use an IPM-based sequence instead.
    llvm-svn: 192784