summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Fix the root cause of PR15348 by correctly handling alignment 0 onChandler Carruth2013-02-252-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | memory intrinsics in the SDAG builder. When alignment is zero, the lang ref says that *no* alignment assumptions can be made. This is the exact opposite of the internal API contracts of the DAG where alignment 0 indicates that the alignment can be made to be anything desired. There is another, more explicit alignment that is better suited for the role of "no alignment at all": an alignment of 1. Map the intrinsic alignment to this early so that we don't end up generating aligned DAGs. It is really terrifying that we've never seen this before, but we suddenly started generating a large number of alignment 0 memcpys due to the new code to do memcpy-based copying of POD class members. That patch contains a bug that rounds bitfield alignments down when they are the first field. This can in turn produce zero alignments. This fixes weird crashes I've seen in library users of LLVM on 32-bit hosts, etc. llvm-svn: 176022
* Revert r169638 because it broke Mesa llvmpipe tests.Nadav Rotem2013-02-242-2/+2
| | | | | | Fix PR15239. llvm-svn: 175985
* X86: Disable cmov-memory patterns on subtargets without cmov.Benjamin Kramer2013-02-231-0/+11
| | | | | | Fixes PR15115. llvm-svn: 175962
* Expand pseudos/macros for Selt. This is the last of the complexReed Kotler2013-02-232-10/+11
| | | | | | macros.The rest is some small misc. stuff. llvm-svn: 175950
* [mips] Emit call16 operator instead of got_disp. The former allows lazy binding.Akira Hatanaka2013-02-222-14/+29
| | | | llvm-svn: 175920
* Fix test by matching movaps instead of AVX-only vmovapsPeter Collingbourne2013-02-221-2/+2
| | | | llvm-svn: 175914
* x86_64: designate most general purpose and SSE registers as callee save ↵Peter Collingbourne2013-02-221-0/+24
| | | | | | under coldcc llvm-svn: 175911
* Remove unused CHECK lines copied from another testPete Cooper2013-02-221-8/+0
| | | | llvm-svn: 175905
* Make ARMAsmPrinter generate the correct alignment specifier syntax in ↵Kristof Beyls2013-02-2218-105/+105
| | | | | | | | | instructions. The Printer will now print instructions with the correct alignment specifier syntax, like vld1.8 {d16}, [r0:64] llvm-svn: 175884
* Expand mips16 SelT form pseudso/macros.Reed Kotler2013-02-227-4/+85
| | | | llvm-svn: 175862
* Fix isa<> check which could never be true.Pete Cooper2013-02-221-0/+32
| | | | | | | | | | | | It was incorrectly checking a Function* being an IntrinsicInst* which isn't possible. It should always have been checking the CallInst* instead. Added test case for x86 which ensures we only get one constant load. It was 2 before this change. rdar://problem/13267920 llvm-svn: 175853
* Hexagon: Expand cttz, ctlz, and ctpop for now.Anshuman Dasgupta2013-02-211-0/+34
| | | | llvm-svn: 175783
* Make RAFast::UsedInInstr indexed by register units.Jakob Stoklund Olesen2013-02-211-0/+9
| | | | | | | | | | This fixes some problems with too conservative checking where we were marking all aliases of a register as used, and then also checking all aliases when allocating a register. <rdar://problem/13249625> llvm-svn: 175782
* Large code model support for PowerPC.Bill Schmidt2013-02-2110-94/+207
| | | | | | | | | | | Large code model is identical to medium code model except that the addis/addi sequence for "local" accesses is never used. All accesses use the addis/ld sequence. The coding changes are straightforward; most of the patch is taken up with creating variants of the medium model tests for large model. llvm-svn: 175767
* DAGCombiner: Make the post-legalize vector op optimization more aggressive.Benjamin Kramer2013-02-211-2/+0
| | | | | | | | A legal BUILD_VECTOR goes in and gets constant folded into another legal BUILD_VECTOR so we don't lose any legality here. The problematic PPC optimization that made this check necessary was fixed recently. llvm-svn: 175759
* R600: Fix for Unigine when MachineSched is enabledTom Stellard2013-02-211-0/+28
| | | | | | | | | | | Fixes for-loop.cl piglit test Patch By: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> NOTE: This is a candidate for the Mesa stable branch. llvm-svn: 175742
* R600/SI: Make sure M0 is loaded for V_INTERP_MOV_F32Michel Danzer2013-02-211-0/+23
| | | | | | | NOTE: This is a candidate for the Mesa stable branch. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 175733
* Expand the sel pseudo/macro. This generates basic blocks where previouslyReed Kotler2013-02-211-3/+3
| | | | | | | there were inline br .+4 instructions. Soon everything can enjoy the full instruction scheduling experience. llvm-svn: 175718
* PPCDAGToDAGISel::PostprocessISelDAG()Bill Schmidt2013-02-217-1/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the PPCDAGToDAGISel::PostprocessISelDAG virtual method to perform post-selection peephole optimizations on the DAG representation. One optimization is implemented here: folds to clean up complex addressing expressions for thread-local storage and medium code model. It will also be useful for large code model sequences when those are added later. I originally thought about doing this on the MI representation prior to register assignment, but it's difficult to do effective global dead code elimination at that point. DCE is trivial on the DAG representation. A typical example of a candidate code sequence in assembly: addis 3, 2, globalvar@toc@ha addi 3, 3, globalvar@toc@l lwz 5, 0(3) When the final instruction is a load or store with an immediate offset of zero, the offset from the add-immediate can replace the zero, provided the relocation information is carried along: addis 3, 2, globalvar@toc@ha lwz 5, globalvar@toc@l(3) Since the addi can in general have multiple uses, we need to only delete the instruction when the last use is removed. llvm-svn: 175697
* Stabilize vec_constants.llBill Schmidt2013-02-201-1/+4
| | | | llvm-svn: 175683
* DAGCombiner: Fold pointless truncate, bitcast, buildvector seriesArnold Schwaighofer2013-02-201-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | (2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y))) can be folded into a (2xi32) (buildvector i32 a, i32 b). Such a DAG would cause uneccessary vdup instructions followed by vmovn instructions. We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in the vectorized version of the code below. double A[N]; double B[N]; void test_double_compare_to_double() { int i; for(i=0;i<N;i++) A[i] = (double)(A[i] < B[i]); } radar://13191881 Fixes bug 15283. llvm-svn: 175670
* Additional fixes for bug 15155.Bill Schmidt2013-02-202-14/+85
| | | | | | | | This handles the cases where the 6-bit splat element is odd, converting to a three-instruction sequence to add or subtract two splats. With this fix, the XFAIL in test/CodeGen/PowerPC/vec_constants.ll is removed. llvm-svn: 175663
* Fix PR15267Michael Liao2013-02-201-0/+66
| | | | | | | | | - When extloading from a vector with non-byte-addressable element, e.g. <4 x i1>, the current logic breaks. Extend the current logic to fix the case where the element type is not byte-addressable by loading all bytes, bit-extracting/packing each element. llvm-svn: 175642
* Fix bug 14779 for passing anonymous aggregates [patch by Kai Nacke].Bill Schmidt2013-02-201-0/+99
| | | | | | | | The PPC backend doesn't handle these correctly. This patch uses logic similar to that in the X86 and ARM backends to track these arguments properly. llvm-svn: 175635
* Hexagon: Move HexagonMCInst.h to MCTargetDesc/HexagonMCInst.h.Jyotsna Verma2013-02-202-2/+59
| | | | | | | | Add HexagonMCInst class which adds various Hexagon VLIW annotations. In addition, this class also includes some APIs related to the constant extenders. llvm-svn: 175634
* Fix PR15155: lost vadd/vsplat optimization.Bill Schmidt2013-02-201-0/+77
| | | | | | | | | | | | | | During lowering of a BUILD_VECTOR, we look for opportunities to use a vector splat. When the splatted value fits in 5 signed bits, a single splat does the job. When it doesn't fit in 5 bits but does fit in 6, and is an even value, we can splat on half the value and add the result to itself. This last optimization hasn't been working recently because of improved constant folding. To circumvent this, create a pseudo VADD_SPLAT that can be expanded during instruction selection. llvm-svn: 175632
* I optimized the following patterns:Elena Demikhovsky2013-02-201-0/+23
| | | | | | | | | | | | | | | | sext <4 x i1> to <4 x i64> sext <4 x i8> to <4 x i64> sext <4 x i16> to <4 x i64> I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns: (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT))) The sext_in_reg (v4i32 x) may be lowered to shl+sar operations. The "sar" does not exist on 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution. I also added a cost of this operations to the AVX costs table. llvm-svn: 175619
* Fix thumbv5e frame lowering assertion failure.Logan Chien2013-02-201-0/+29
| | | | | | | | | | | | | | It is possible that frame pointer is not found in the callee saved info, thus FramePtrSpillFI may be incorrect if we don't check the result of hasFP(MF). Besides, if we enable the stack coloring algorithm, there will be an assertion to ensure the slot is live. But in the test case, %var1 is not live in the prologue of the function, and we will get the assertion failure. Note: There is similar code in ARMFrameLowering.cpp. llvm-svn: 175616
* Expand pseudos/macros:Reed Kotler2013-02-2013-15/+15
| | | | | | | | SltCCRxRy16, SltiCCRxImmX16, SltiuCCRxImmX16, SltuCCRxRy16 $T8 shows up as register $24 when emitted from C++ code so we had to change some tests that were already there for this functionality. llvm-svn: 175593
* [ms-inline asm] Force the use of a base pointer if the MachineFunction includesChad Rosier2013-02-191-2/+2
| | | | | | | | | | | | | MS-style inline assembly. This is a follow-on to r175334. Forcing a FP to be emitted doesn't ensure it will be used. Therefore, force the base pointer as well. We now treat MS inline assembly in the same way we treat functions with dynamic stack realignment and VLAs. This guarantees the BP will be used to reference parameters and locals. rdar://13218191 llvm-svn: 175576
* ARM: Allocation hints must make sure to be in the alloc order.Jim Grosbach2013-02-191-0/+53
| | | | | | | | | When creating an allocation hint for a register pair, make sure the hint for the physical register reference is still in the allocation order. rdar://13240556 llvm-svn: 175541
* Fix typoEli Bendersky2013-02-191-1/+1
| | | | llvm-svn: 175530
* Fix GCMetadaPrinter::finishAssembly not executed, patch by Yiannis Tsiouris.Benjamin Kramer2013-02-191-0/+31
| | | | | | | | | | | Due to the execution order of doFinalization functions, the GC information were deleted before AsmPrinter::doFinalization was executed. Thus, the GCMetadataPrinter::finishAssembly was never called. The patch fixes that by moving the code of the GCInfoDeleter::doFinalization to Printer::doFinalization. llvm-svn: 175528
* ARM NEON: Merge a f32 bitcast of a v2i32 extracteltArnold Schwaighofer2013-02-191-0/+25
| | | | | | | | | | | | | | A vectorized sitfp on doubles will get scalarized to a sequence of an extract_element of <2 x i32>, a bitcast to f32 and a sitofp. Due to the the extract_element, and the bitcast we will uneccessarily generate moves between scalar and vector registers. The patch fixes this by using a COPY_TO_REGCLASS and a EXTRACT_SUBREG to extract the element from the vector instead. radar://13191881 llvm-svn: 175520
* Expand pseudos/macros BteqzT8SltiX16, BteqzT8SltiuX16,Reed Kotler2013-02-192-0/+184
| | | | | | BtnezT8SltiX16, BtnezT8SltiuX16 . llvm-svn: 175486
* Expand pseudos BteqzT8CmpiX16 and BtnezT8CmpiX16.Reed Kotler2013-02-192-0/+198
| | | | llvm-svn: 175474
* Comment out the rdar number.Chad Rosier2013-02-181-1/+1
| | | | llvm-svn: 175460
* [fast-isel] Remove an invalid assert.Chad Rosier2013-02-181-0/+7
| | | | | | | | If the memcpy has an odd length with an alignment of 2, this would incorrectly assert on the last 1 byte copy. rdar://13202135 llvm-svn: 175459
* Support for HiPE-compatible code emission, patch by Yiannis Tsiouris.Benjamin Kramer2013-02-181-0/+67
| | | | llvm-svn: 175457
* R600/SI: Use MULADD_IEEE/V_MAD_F32 instruction for mad patternVincent Lejeune2013-02-181-0/+19
| | | | llvm-svn: 175446
* Expand macro/pseudo instructions BtnezT8SltX16 and BtnezT8SltuX16.Reed Kotler2013-02-181-0/+96
| | | | llvm-svn: 175420
* Expand pseudo/macro BteqzT8SltX16.Reed Kotler2013-02-181-0/+98
| | | | llvm-svn: 175417
* Expand macro/pseudo BteqzT8CmpX16.Reed Kotler2013-02-181-0/+96
| | | | llvm-svn: 175416
* Beginning of expanding all current mips16 macro/pseudo instruction sequences.Reed Kotler2013-02-181-0/+95
| | | | | | | | | | This expansion will be moved to expandISelPseudos as soon as I can figure out how to do that. There are other instructions which use this ExpandFEXT_T8I816_ins and as soon as I have finished expanding them all, I will delete the macro asm string text so it has no way to be used in the future. llvm-svn: 175413
* Force a cpu for test. It failed on atom due to different scheduling decisions.Benjamin Kramer2013-02-171-1/+1
| | | | llvm-svn: 175401
* Replace "check:" wth "CHECK:".Jakub Staszak2013-02-161-4/+4
| | | | | | Also fix one test by changing "vpermilps" to "vpshufd". llvm-svn: 175357
* Reinitialize the ivars in the subtarget so that they can be reset with the ↵Bill Wendling2013-02-161-1/+0
| | | | | | new features. llvm-svn: 175336
* [ms-inline asm] Do not omit the frame pointer if we have ms-inline assembly.Chad Rosier2013-02-161-2/+8
| | | | | | | | | | | If the frame pointer is omitted, and any stack changes occur in the inline assembly, e.g.: "pusha", then any C local variable or C argument references will be incorrect. I pass no judgement on anyone who would do such a thing. ;) rdar://13218191 llvm-svn: 175334
* Temporary revert of 175320.Bill Wendling2013-02-151-0/+1
| | | | llvm-svn: 175322
* Reinitialize the ivars in the subtarget.Bill Wendling2013-02-151-0/+66
| | | | | | | When we're recalculating the feature set of the subtarget, we need to have the ivars in their initial state. llvm-svn: 175320
OpenPOWER on IntegriCloud