path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
...
* Naming test cases with dates is a legacy rule that is no longer used. Rename several test cases.
  Hao Liu | 2014-05-30 | 4 files | -0/+0
  llvm-svn: 209877
* [X86] Move test from r209863 to CodeGen/X86
  Adam Nemet | 2014-05-29 | 1 file | -0/+41
  We should only run this if X86 is in the targets.
  llvm-svn: 209866
* [X86] Remove AVX1 vbroadcast intrinsics
  Adam Nemet | 2014-05-29 | 1 file | -24/+0
  The corresponding CFE patch replaces these intrinsics with vector initializers in avxintrin.h. This patch removes the LLVM intrinsics from the backend. We now stop lowering at the X86ISD::VBROADCAST custom node rather than lowering that further to the intrinsics.
  The patch only changes VBROADCASTS* and leaves VBROADCAST[FI]128 to continue to use intrinsics. As explained in the CFE patch, the reason is that we currently don't generate as good code for them without the intrinsics.
  CodeGen/X86/avx-vbroadcast.ll already provides coverage for this change. It checks that for a series of insertelements we generate the appropriate vbroadcast instruction.
  Also verified that there was no assembly change in the test-suite before and after this patch.
  llvm-svn: 209864
* Added tests for shufflevector lowering to blend instrs.
  Filipe Cabecinhas | 2014-05-29 | 3 files | -0/+69
  These tests ensure that a change I will propose in clang works as expected.
  Summary: Added tests for the generation of blend+immediate instructions from a shufflevector. These tests were proposed along with a patch that was dropped. I'm committing the tests anyway to protect against possible regressions in codegen.
  Reviewers: nadav, bkramer
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D3600
  llvm-svn: 209853
* [PPC] Use alias symbols in address computation.
  Rafael Espindola | 2014-05-29 | 1 file | -0/+31
  This seems to match what gcc does for ppc and what every other llvm backend does.
  This is a fixed version of r209638. The difference is to avoid any change in behavior for functions. The logic for using constant pools for function addresses is spread over a few places and we have to keep them in sync.
  llvm-svn: 209821
* Add a test showing the ppc code sequence for getting a function pointer.
  Rafael Espindola | 2014-05-29 | 1 file | -0/+21
  This would have found the miscompile in r209638.
  llvm-svn: 209820
* Rename a test case to contain correct date info.
  Hao Liu | 2014-05-29 | 1 file | -0/+0
  llvm-svn: 209799
* Fix an assertion failure caused by v1i64 in DAGCombiner Shrink.
  Hao Liu | 2014-05-29 | 1 file | -0/+14
  llvm-svn: 209798
* [x86] Fold extract_vector_elt of a load into the Load's address computation.
  Michael J. Spencer | 2014-05-29 | 1 file | -1/+19
  An address-only use of an extract element of a load can be simplified to a load. Without this, the result of the extract element is spilled to the stack so that an address is available.
  llvm-svn: 209788
* [pr19844] Add thread local mode to aliases.
  Rafael Espindola | 2014-05-28 | 3 files | -3/+19
  This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how they are accessed (linkage, visibility, dll storage).
  Clang still has to be updated to expose this feature to C.
  llvm-svn: 209759
* Revert "[DAGCombiner] Split up an indexed load if only the base pointer value is live"
  Hal Finkel | 2014-05-28 | 1 file | -0/+3
  This reverts r208640 (I've just XFAILed the test) because it broke ppc64/Linux self-hosting. Because nearly every regression test triggers a segfault, I hope this will be easy to fix.
  llvm-svn: 209747
* Revert "[PPC] Use alias symbols in address computation."
  Hal Finkel | 2014-05-28 | 1 file | -31/+0
  This reverts commit r209638 because it broke self-hosting on ppc64/Linux. (The Clang-compiled TableGen would segfault because it jumped to an invalid address from within _ZNK4llvm17ManagedStaticBase21RegisterManagedStaticEPFPvvEPFvS1_E, which is within the command-line parameter registration process.)
  llvm-svn: 209745
* [AArch64] Add store post-index update folding regression tests for the load/store optimizer.
  Tilmann Scheller | 2014-05-28 | 1 file | -0/+125
  Add regression tests for the following transformation:
      str X, [x20]
      ...
      add x20, x20, #32
      ->
      str X, [x20], #32
  with X being either w0, x0, s0, d0 or q0.
  llvm-svn: 209715
* [AArch64] Add load post-index update folding regression tests for the load/store optimizer.
  Tilmann Scheller | 2014-05-28 | 1 file | -0/+136
  Add regression tests for the following transformation:
      ldr X, [x20]
      ...
      add x20, x20, #32
      ->
      ldr X, [x20], #32
  with X being either w0, x0, s0, d0 or q0.
  llvm-svn: 209711
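The post-index folding described above can be illustrated with a toy model. This is only a sketch in Python, not the optimizer's code: the register file is a dict, memory is a dict of addresses, the function names are made up for illustration, and the instructions elided by "..." in the commit message are ignored.

```python
# Toy model: registers are a dict, memory maps addresses to values.
# Only the addressing behaviour is modelled; the "..." instructions
# between the load and the add in the commit message are ignored.

def load_then_add(regs, mem):
    """ldr X, [x20] followed by add x20, x20, #32."""
    out = dict(regs)
    out["X"] = mem[out["x20"]]
    out["x20"] += 32
    return out

def load_post_indexed(regs, mem):
    """ldr X, [x20], #32: access the old base, then write back base + 32."""
    out = dict(regs)
    base = out["x20"]
    out["X"] = mem[base]
    out["x20"] = base + 32
    return out

mem = {0x1000: 0xAB, 0x1020: 0xCD}
start = {"x20": 0x1000, "X": 0}

# Both sequences leave the same register state behind.
assert load_then_add(start, mem) == load_post_indexed(start, mem)
```

The two forms agree because the post-indexed load uses the old base for the access and only then applies the increment, which is exactly what the separate add did.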
* [mips] Optimize long branch for MIPS64 by removing %higher and %highest.
  Sasa Stankovic | 2014-05-27 | 1 file | -4/+1
  %higher and %highest can have non-zero values only for offsets greater than 2GB, which is highly unlikely, if not impossible when compiling a single function. This makes long branch for MIPS64 3 instructions smaller.
  Differential Revision: http://llvm-reviews.chandlerc.com/D3281.diff
  llvm-svn: 209678
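A small Python sketch shows why the two upper pieces vanish for function-sized offsets. It assumes the conventional definitions of the MIPS relocation operators (each piece is a 16-bit field that is sign-extended when added back); the helper names `split_offset` and `rebuild` are hypothetical and not part of LLVM.

```python
MASK64 = (1 << 64) - 1

def split_offset(offset):
    """Split a 64-bit offset into %highest/%higher/%hi/%lo style pieces."""
    x = offset & MASK64
    return {
        "lo": x & 0xFFFF,
        "hi": ((x + 0x8000) >> 16) & 0xFFFF,
        "higher": ((x + 0x80008000) >> 32) & 0xFFFF,
        "highest": ((x + 0x800080008000) >> 48) & 0xFFFF,
    }

def sign_extend16(value):
    return value - 0x10000 if value & 0x8000 else value

def rebuild(parts):
    """Recombine the pieces the way a shift-and-add sequence would."""
    total = (sign_extend16(parts["highest"]) << 48) \
        + (sign_extend16(parts["higher"]) << 32) \
        + (sign_extend16(parts["hi"]) << 16) \
        + sign_extend16(parts["lo"])
    return total & MASK64

small = split_offset(0x00123456)   # a plausible intra-function distance
large = split_offset(1 << 32)      # far beyond any single function

assert small["higher"] == 0 and small["highest"] == 0
assert rebuild(small) == 0x00123456
assert large["higher"] == 1
```

When both upper pieces are zero, the offset is fully described by the %hi and %lo parts, so the instructions that materialize the upper parts contribute nothing.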
* AArch64: add test for NZCV cross-copy save.
  Tim Northover | 2014-05-27 | 1 file | -0/+18
  llvm-svn: 209665
* AArch64: add AArch64-specific test for 'c' and 'n'.
  Tim Northover | 2014-05-27 | 1 file | -0/+10
  llvm-svn: 209664
* [PATCH] Correct type used for VADD_SPLAT optimization on PowerPC
  Bill Schmidt | 2014-05-27 | 2 files | -1/+18
  In PPCISelLowering.cpp: PPCTargetLowering::LowerBUILD_VECTOR(), there is an optimization for certain patterns to generate one or two vector splats followed by a vector add or subtract. This operation is represented by a VADD_SPLAT in the selection DAG. Prior to this patch, it was possible for the VADD_SPLAT to be assigned the wrong data type, causing incorrect code generation. This patch corrects the problem.
  Specifically, the code previously assigned the value type of the BUILD_VECTOR node to the newly generated VADD_SPLAT node. This is correct much of the time, but not always. The problem is that the call to isConstantSplat() may return a SplatBitSize that is not the same as the number of bits in the original element vector type. The correct type to assign is a vector type with the same element bit size as SplatBitSize.
  The included test case shows an example of this, where the BUILD_VECTOR node has a type of v16i8. The vector to be built is {0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16}. isConstantSplat detects that we can generate a splat of 16 for type v8i16, which is the type we must assign to the VADD_SPLAT node. If we do not, we generate a vspltisb of 8 and a vaddubm, which generates the incorrect result {16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16}. The correct code generation is a vspltish of 8 and a vadduhm.
  This patch also corrected code generation for CodeGen/PowerPC/2008-07-10-SplatMiscompile.ll, which had been marked as an XFAIL, so we can remove the XFAIL from the test case.
  llvm-svn: 209662
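The arithmetic in this example can be replayed in a few lines of Python. This is a sketch of the lane arithmetic only, assuming big-endian byte order within each halfword (which is what makes the byte pair 0, 16 read as the halfword 16); the helper names are invented for illustration and do not correspond to LLVM functions.

```python
def splat(value, lanes):
    """Model a splat-immediate: every lane holds the same value."""
    return [value] * lanes

def add_modulo(a, b, bits):
    """Lane-wise modulo add, as vaddubm (8 bits) or vadduhm (16 bits) would do."""
    mask = (1 << bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]

def to_bytes_be(lanes, bits):
    """Flatten lanes to a byte list, most significant byte first (assumed)."""
    out = []
    for lane in lanes:
        out.extend(lane.to_bytes(bits // 8, "big"))
    return out

wanted = [0, 16] * 8  # the v16i8 vector from the commit message

# Wrong element type: splat 8 into sixteen byte lanes, add bytes.
s8 = splat(8, 16)
wrong = to_bytes_be(add_modulo(s8, s8, 8), 8)

# Correct element type: splat 8 into eight halfword lanes, add halfwords.
s16 = splat(8, 8)
right = to_bytes_be(add_modulo(s16, s16, 16), 16)

assert wrong == [16] * 16
assert right == wanted
```

The value 16 is built as 8 + 8 because the splat-immediate instructions take a 5-bit signed immediate, so 16 itself does not fit in a single splat.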
* [ARM] Emit correct build attributes for the relocation models.
  Amara Emerson | 2014-05-27 | 1 file | -0/+10
  Patch by Asiri Rathnayake.
  llvm-svn: 209656
* ARM: teach AAPCS-VFP to deal with Cortex-M4.
  Tim Northover | 2014-05-27 | 1 file | -0/+111
  Cortex-M4 only has single-precision floating point support, so any LLVM "double" type will have been split into 2 i32s by now. Fortunately, the consecutive-register framework turns out to be precisely what's needed to reconstruct the double and follow AAPCS-VFP correctly!
  rdar://problem/17012966
  llvm-svn: 209650
* Convert some X86 blendv* intrinsics into IR.
  Filipe Cabecinhas | 2014-05-27 | 3 files | -0/+66
  Summary: Implemented an InstCombine transformation that takes a blendv* intrinsic call and translates it into an IR select, if the mask is constant.
  This will eventually get lowered into blends with immediates if possible, or pblendvb (with an option to further optimize if we can transform the pblendvb into a blend+immediate instruction, depending on the selector). It will also enable optimizations by the IR passes, which give up on sight of the intrinsic.
  Both the transformation and the lowering of its result to asm got shiny new tests.
  The transformation is a bit convoluted because of blendvp[sd]'s definition: its mask is a floating point value! This forces us to convert it and get the highest bit. I suppose this happened because the mask has type __m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.
  I will send an email to llvm-dev to discuss if we want to change this or not.
  Reviewers: grosbach, delena, nadav
  Differential Revision: http://reviews.llvm.org/D3859
  llvm-svn: 209643
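The "convert it and get the highest bit" step can be sketched in Python. This is an illustration of the idea rather than the InstCombine code: it assumes single-precision lanes and the usual blendvps behaviour of taking the second operand where the mask's sign bit is set; `sign_bit` and `blendv_ps` are hypothetical helper names.

```python
import struct

def sign_bit(f32):
    """Return the top bit of the IEEE-754 single-precision encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", f32))
    return bits >> 31

def blendv_ps(a, b, mask):
    """Per lane: take b where the mask's sign bit is set, otherwise a."""
    return [y if sign_bit(m) else x for x, y, m in zip(a, b, mask)]

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
mask = [-0.0, 1.0, -1.0, 0.0]   # a constant mask, as the transform requires

# With a constant mask, the float lanes reduce to plain booleans,
# which is the shape of condition a select needs.
condition = [bool(sign_bit(m)) for m in mask]

assert condition == [True, False, True, False]
assert blendv_ps(a, b, mask) == [10.0, 2.0, 30.0, 4.0]
```

Note that -0.0 selects the second operand even though it compares equal to 0.0, which is why the conversion has to look at the bit pattern rather than the numeric value.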
* [PPC] Use alias symbols in address computation.
  Rafael Espindola | 2014-05-26 | 1 file | -0/+31
  This seems to match what gcc does for ppc and what every other llvm backend does.
  llvm-svn: 209638
* AArch64: force i1 to be zero-extended at an ABI boundary.
  Tim Northover | 2014-05-26 | 1 file | -0/+55
  This commit is debatable. There are two possible approaches, neither of which is really satisfactory:
  1. Use "@foo(i1 zeroext)" to mean an extension to 32-bits on Darwin, and 8 bits otherwise.
  2. Redefine "@foo(i1)" to mean that the i1 is extended by the caller to 8 bits. This goes against the spirit of "zeroext" I think, but it's a bit of a vague construct anyway (by definition you're going to extend to the amount required by the ABI, that's why it's the ABI!).
  This implements option 2. The DAG machinery really isn't set up for the first (there's a fairly strong assumption that "zeroext" goes to at least the smallest register size), and even if it was the resulting DAG looks like it would be inferior in many cases.
  Theoretically we could add AssertZext nodes in the consumers of ABI-passed values too now, but this actually seems to make the code worse in practice by making truncation proceed in two steps. The code produced is equally valid if we continue to assume only the low bit is defined.
  Should fix PR19850
  llvm-svn: 209637
* AArch64: simplify calling conventions slightly.
  Tim Northover | 2014-05-26 | 1 file | -0/+8
  We can eliminate the custom C++ code in favour of some TableGen to check the same things. Functionality should be identical, except for a buffer overrun that was present in the C++ code and meant webkit failed if any small argument needed to be passed on the stack.
  llvm-svn: 209636
* [AArch64] Add store + add folding regression tests for the load/store optimization pass.
  Tilmann Scheller | 2014-05-26 | 1 file | -2/+66
  Add tests for the following transform:
      str X, [x0, #32]
      ...
      add x0, x0, #32
      ->
      str X, [x0, #32]!
  with X being either w1, x1, s0, d0 or q0.
  llvm-svn: 209627
* [AArch64] Add more regression tests for the load/store optimization pass.
  Tilmann Scheller | 2014-05-26 | 1 file | -11/+81
  Cover the following cases:
      ldr X, [x0, #32]
      ...
      add x0, x0, #32
      ->
      ldr X, [x0, #32]!
  with X being either w1, x1, s0, d0 or q0.
  llvm-svn: 209624
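The pre-indexed form covered by these tests differs from the post-indexed one in when the base register is updated. A toy Python model makes the equivalence concrete; it is only a sketch with invented function names, using a dict for registers and a dict for memory, and it ignores the instructions elided by "...".

```python
def offset_load_then_add(regs, mem):
    """ldr X, [x0, #32] followed by add x0, x0, #32."""
    out = dict(regs)
    out["X"] = mem[out["x0"] + 32]
    out["x0"] += 32
    return out

def load_pre_indexed(regs, mem):
    """ldr X, [x0, #32]!: update the base first, then access the new base."""
    out = dict(regs)
    out["x0"] += 32
    out["X"] = mem[out["x0"]]
    return out

memory = {0x2000: 1, 0x2020: 2}
initial = {"x0": 0x2000, "X": 0}

# Same loaded value and same final base register either way.
assert offset_load_then_add(initial, memory) == load_pre_indexed(initial, memory)
```

The fold is possible here because the load's immediate offset and the add's immediate are the same value, so the address accessed is already the updated base.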
* Remove accidentally committed whitespace.
  Tilmann Scheller | 2014-05-26 | 1 file | -2/+2
  llvm-svn: 209619
* [AArch64] Add a regression test for the load store optimizer.
  Tilmann Scheller | 2014-05-26 | 1 file | -0/+31
  We have a couple of regression tests for load/store pairing, but (to my knowledge) there are no regression tests for the load/store + add/sub folding. As a first step towards increased test coverage of this area, this commit adds a test for one instance of a load + add to pre-indexed load transformation.
  llvm-svn: 209618
* Just check the entire string.
  Rafael Espindola | 2014-05-26 | 2 files | -64/+64
  Thanks to David Blaikie for the suggestion.
  llvm-svn: 209610
* Emit data or code export directives based on the type.
  Rafael Espindola | 2014-05-25 | 1 file | -0/+4
  Currently we look at the Aliasee to decide what type of export directive to use. It seems better to use the type of the alias directly. This is similar to how we handle the alias having the same address but other attributes (linkage, visibility) from the aliasee.
  With this patch it is now possible to do things like
      target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
      target triple = "x86_64-pc-windows-msvc"
      @foo = global [6 x i8] c"\B8*\00\00\00\C3", section ".text", align 16
      @f = dllexport alias i32 (), [6 x i8]* @foo
      !llvm.module.flags = !{!0}
      !0 = metadata !{i32 6, metadata !"Linker Options", metadata !1}
      !1 = metadata !{metadata !2, metadata !3}
      !2 = metadata !{metadata !"/DEFAULTLIB:libcmt.lib"}
      !3 = metadata !{metadata !"/DEFAULTLIB:oldnames.lib"}
  llvm-svn: 209600
* Make these CHECKs a bit more strict.
  Rafael Espindola | 2014-05-25 | 2 files | -62/+62
  The " at the end of the line makes sure we matched the entire directive.
  llvm-svn: 209599
* AArch64/ARM64: move ARM64 into AArch64's place
  Tim Northover | 2014-05-24 | 370 files | -4416/+4234
  This commit starts with a "git mv ARM64 AArch64" and continues out from there, renaming the C++ classes, intrinsics, and other target-local objects for consistency.
  "ARM64" test directories are also moved. Tests that began their life in ARM64 use an arm64 triple; those from AArch64 use an aarch64 triple. Both should be equivalent though.
  This finishes the AArch64 merge, and everyone should feel free to continue committing as normal now.
  llvm-svn: 209577
* AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64.
  Tim Northover | 2014-05-24 | 180 files | -27097/+0
  I'm doing this in two phases for a better "git blame" record. This commit removes the previous AArch64 backend and redirects all functionality to ARM64. It also deduplicates test-lines and removes orphaned AArch64 tests.
  The next step will be "git mv ARM64 AArch64" and rewire most of the tests.
  Hopefully LLVM is still functional, though it would be even better if no-one ever had to care because the rename happens straight afterwards.
  llvm-svn: 209576
* ARM64: extract a 32-bit subreg when selecting an inreg extend
  Tim Northover | 2014-05-24 | 1 file | -2/+135
  After the load/store refactoring, we were sometimes trying to feed a GPR64 into a 32-bit register offset operand. This failed in copyPhysReg.
  llvm-svn: 209566
* Revert part of "Fix broken FileCheck prefixes"
  Nico Rieck | 2014-05-23 | 2 files | -11/+11
  This reverts part of commit r209538.
  llvm-svn: 209544
* Use alias linkage and visibility to decide tls access mode.
  Rafael Espindola | 2014-05-23 | 2 files | -3/+3
  This matches both what we do for the non-thread case and what gcc does.
  With this patch clang would match gcc's behaviour in
      static __thread int a = 42;
      extern __thread int b __attribute__((alias("a")));
      int *f(void) { return &a; }
      int *g(void) { return &b; }
  if not for pr19843. Manually writing the IL does produce the same access modes.
  It is also a step in the direction of fixing pr19844.
  llvm-svn: 209543
* Remove unused CHECK lines
  Nico Rieck | 2014-05-23 | 1 file | -4/+0
  llvm-svn: 209539
* Fix broken FileCheck prefixes
  Nico Rieck | 2014-05-23 | 4 files | -13/+13
  llvm-svn: 209538
* Convert test to use FileCheck.
  Rafael Espindola | 2014-05-23 | 1 file | -1/+5
  llvm-svn: 209528
* [mips][mips64r6] [ls][dw][lr] are not available in MIPS32r6/MIPS64r6
  Daniel Sanders | 2014-05-23 | 3 files | -102/+474
  Summary: Instead the system is required to provide some means of handling unaligned load/store without special instructions. Options include full hardware support, full trap-and-emulate, and hybrids such as hardware support within a cache line and trap-and-emulate for multi-line accesses.
  MipsSETargetLowering::allowsUnalignedMemoryAccesses() has been configured to assume that unaligned accesses are 'fast' on the basis that I expect few hardware implementations will opt for pure-software handling of unaligned accesses. The ones that do handle it purely in software can override this.
  mips64-load-store-left-right.ll has been merged into load-store-left-right.ll
  The stricter testing revealed a Bits!=Bytes bug in passByValArg(). This has been fixed and the variables renamed to clarify the units they hold.
  Reviewers: zoran.jovanovic, jkolek, vmedic
  Reviewed By: vmedic
  Differential Revision: http://reviews.llvm.org/D3872
  llvm-svn: 209512
* [ARM64] Fix a bug in shuffle vector lowering to generate correct vext ISD with swapped input vectors.
  Jiangning Liu | 2014-05-23 | 1 file | -0/+172
  llvm-svn: 209495
* R600: Try to convert BFE back to standard bit ops when possible.
  Matt Arsenault | 2014-05-22 | 4 files | -11/+276
  This allows existing DAG combines to work on them, and then we can re-match to BFE if necessary during instruction selection.
  llvm-svn: 209462
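As a rough illustration of what "standard bit ops" means for a bit-field extract, the sketch below writes an unsigned and a signed extract as shifts and a mask in Python. The commit message does not say which patterns are converted, so this is an assumption-laden example: the function names are hypothetical, values are 32 bits wide, and the field is assumed to lie entirely inside the word (offset + width <= 32, width > 0).

```python
def bfe_u32(src, offset, width):
    """Unsigned extract: shift the field down, then mask off the rest."""
    return (src >> offset) & ((1 << width) - 1)

def bfe_i32(src, offset, width):
    """Signed extract: shift the field to the top, then arithmetic shift down."""
    shifted = (src << (32 - offset - width)) & 0xFFFFFFFF
    if shifted & 0x80000000:          # reinterpret as a signed 32-bit value
        shifted -= 1 << 32
    return shifted >> (32 - width)    # Python's >> is arithmetic on negatives

assert bfe_u32(0xABCD1234, 8, 8) == 0x12
assert bfe_i32(0x0000F000, 12, 4) == -1
```

Expressed this way, the extract is made of the same shift and mask nodes that generic combines already understand, which is the benefit the commit message points to.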
* R600: Add dag combine for BFE
  Matt Arsenault | 2014-05-22 | 3 files | -0/+751
  llvm-svn: 209461
* R600: Implement ComputeNumSignBitsForTargetNode for BFE
  Matt Arsenault | 2014-05-22 | 1 file | -0/+15
  llvm-svn: 209460
* R600: Expand mul24 for GPUs without it
  Matt Arsenault | 2014-05-22 | 2 files | -3/+10
  llvm-svn: 209458
* R600: Expand mad24 for GPUs without it
  Matt Arsenault | 2014-05-22 | 2 files | -0/+14
  llvm-svn: 209457
* R600: Add intrinsics for mad24
  Matt Arsenault | 2014-05-22 | 2 files | -0/+26
  llvm-svn: 209456
* [X86] Improve the lowering of BITCAST from MVT::f64 to MVT::v4i16/MVT::v8i8.
  Andrea Di Biagio | 2014-05-22 | 3 files | -82/+157
  This patch teaches the x86 backend how to efficiently lower ISD::BITCAST dag nodes from MVT::f64 to MVT::v4i16 (and vice versa), and from MVT::f64 to MVT::v8i8 (and vice versa).
  This patch extends the logic from revision 208107 to also handle MVT::v4i16 and MVT::v8i8.
  Also, this patch correctly propagates Undef values when performing the widening of a vector (example: when widening from v2i32 to v4i32, the upper 64 bits of the resulting vector are 'undef').
  llvm-svn: 209451
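For readers unfamiliar with what a bitcast between f64 and v4i16 means at the value level, here is a small Python sketch. It only shows that the same 64 bits are reinterpreted, not how the backend lowers the node; it assumes little-endian lane order as on x86, and the helper names are made up for the example.

```python
import struct

def f64_to_v4i16(value):
    """Reinterpret the 8 bytes of a double as four unsigned 16-bit lanes."""
    return list(struct.unpack("<4H", struct.pack("<d", value)))

def v4i16_to_f64(lanes):
    """Reinterpret four 16-bit lanes as the 8 bytes of a double."""
    (value,) = struct.unpack("<d", struct.pack("<4H", *lanes))
    return value

lanes = f64_to_v4i16(1.0)   # 1.0 is encoded as 0x3FF0000000000000

assert lanes == [0, 0, 0, 0x3FF0]
assert v4i16_to_f64(lanes) == 1.0
```

No bits change in either direction, which is why an efficient lowering can avoid a round trip through memory.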
* Segmented stacks: omit __morestack call when there's no frame.
  Tim Northover | 2014-05-22 | 3 files | -11/+62
  Patch by Florian Zeitz
  llvm-svn: 209436
* [mips] Make unalignedload.ll test stricter and easier to modify for MIPS32r6/MIPS64r6
  Daniel Sanders | 2014-05-22 | 1 file | -12/+29
  Summary:
  * Split into two functions, one to test each struct.
  * R0 and R2 must be defined by an lw with a %got reference to the correct symbol.
  * Test for $4 (first argument) where appropriate instead of accepting any register.
  * Test that the two lbu's are correctly combined into $4
  Depends on D3844
  Reviewers: jkolek, zoran.jovanovic, vmedic
  Reviewed By: vmedic
  Differential Revision: http://reviews.llvm.org/D3845
  llvm-svn: 209424