path: root/llvm/test/CodeGen
Commit message, Author, Age, Files, Lines
...
* FileCheckize a test.Andrew Trick2011-03-191-7/+16
| | | | | | (one-by-one until valgrind is happy) llvm-svn: 127925
* Match a few more obvious patterns to revsh. rdar://9147637.Evan Cheng2011-03-181-2/+28
| | | | llvm-svn: 127913
* Revert r127852; it's apparently causing an ICE on mingw.Eli Friedman2011-03-182-30/+12
| | | | llvm-svn: 127909
* PTX: Fix various codegen issuesJustin Holewinski2011-03-183-46/+91
| - Emit mad instead of mad.rn for shader model 1.0
| - Emit explicit mov.u32 instructions for reading global variables
| - (most PTX instructions cannot take global variable immediates)
|
| llvm-svn: 127895
* ptx: fix parameter order that is reversedChe-Liang Chiou2011-03-181-0/+8
| | | | llvm-svn: 127874
* ptx: add unconditional and conditional branchChe-Liang Chiou2011-03-181-0/+21
| | | | llvm-svn: 127873
* Add a target-specific branchless method for double-width relationalEli Friedman2011-03-182-12/+30
| comparisons on x86. Essentially, the way this works is that SUB+SBB sets
| the relevant flags the same way a double-width CMP would.
|
| This is a substantial improvement over the generic lowering in LLVM. The output
| is also shorter than the gcc-generated output; I haven't done any detailed
| benchmarking, though.
|
| llvm-svn: 127852
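The SUB+SBB idea in the message above can be illustrated with a small emulation. This is only a sketch, not the actual x86 lowering: it models 32-bit halves of a 64-bit value, and the helper names (`sub32`, `split64`, `ult64`, `slt64`) are hypothetical. It shows that after SUB on the low halves and SBB on the high halves, the carry flag answers the unsigned "less than" question and SF != OF answers the signed one, just as a single double-width CMP would.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sub32(a, b, borrow_in=0):
    """Emulate 32-bit SUB (borrow_in=0) or SBB (borrow_in=CF).

    Returns (carry, sign, overflow), mirroring the CF, SF and OF flags.
    """
    full = a - b - borrow_in
    result = full & MASK32
    carry = int(full < 0)            # a borrow out of bit 31
    sign = result >> 31              # top bit of the 32-bit result
    sa, sb = a >> 31, b >> 31
    overflow = int(sa != sb and sign != sa)  # signed overflow of the subtraction
    return carry, sign, overflow


def split64(x):
    """Return (low, high) 32-bit halves of x viewed as a 64-bit pattern."""
    x &= MASK64
    return x & MASK32, x >> 32


def ult64(a, b):
    """Unsigned a < b: SUB low halves, SBB high halves, then read CF."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    carry, _, _ = sub32(a_lo, b_lo)
    carry, _, _ = sub32(a_hi, b_hi, carry)
    return carry == 1


def slt64(a, b):
    """Signed a < b: same SUB+SBB pair, then test SF != OF."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    carry, _, _ = sub32(a_lo, b_lo)
    _, sign, overflow = sub32(a_hi, b_hi, carry)
    return sign != overflow
```

The low-half result itself is discarded; only its borrow feeds the high-half subtraction, which is why no separate comparison of the low words is needed.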
* BuildUDIV: If the divisor is even we can simplify the fixup of the ↵Benjamin Kramer2011-03-171-0/+11
| multiplied value by introducing an early shift.
|
| This allows us to compile "unsigned foo(unsigned x) { return x/28; }" into
|     shrl $2, %edi
|     imulq $613566757, %rdi, %rax
|     shrq $32, %rax
|     ret
|
| instead of
|     movl %edi, %eax
|     imulq $613566757, %rax, %rcx
|     shrq $32, %rcx
|     subl %ecx, %eax
|     shrl %eax
|     addl %ecx, %eax
|     shrl $4, %eax
|
| on x86_64
|
| llvm-svn: 127829
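The two instruction sequences in the message above can be checked arithmetically. The following is a minimal sketch in Python, not code from the commit: the function names are hypothetical, and 32-bit register wraparound is modelled with an explicit mask. Both functions use the constant 613566757 from the message and should agree with plain integer division by 28 for any 32-bit unsigned input.

```python
MAGIC = 613566757          # constant taken from the commit message
MASK32 = 0xFFFFFFFF


def udiv28_early_shift(x):
    """Shorter sequence: shift out the factor 4 first, then multiply and shift."""
    y = x >> 2                 # shrl $2, %edi
    return (y * MAGIC) >> 32   # imulq ... ; shrq $32, %rax


def udiv28_fixup(x):
    """Longer sequence: multiply first, then apply the subtract/add fixup."""
    t = (x * MAGIC) >> 32      # imulq ... ; shrq $32, %rcx
    r = (x - t) & MASK32       # subl %ecx, %eax
    r >>= 1                    # shrl %eax
    r = (r + t) & MASK32       # addl %ecx, %eax
    return r >> 4              # shrl $4, %eax
```

The early shift works because 28 = 4 * 7: once the input is shifted right by 2 it is below 2^30, which leaves enough headroom for a 32-bit multiplier for 7 to be exact without the fixup step.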
* Add XCore intrinsic for setpsc.Richard Osborne2011-03-171-0/+8
| | | | llvm-svn: 127821
* test/CodeGen/X86/h-registers-1.ll: Add explicit -mtriple=x86_64-linux. It ↵NAKAMURA Takumi2011-03-171-1/+1
| | | | | | does not need to be checked on x86_64-win32 (aka Win64). llvm-svn: 127800
* test/CodeGen/X86/constant-pool-remat-0.ll: FileCheck-ize and add explicit ↵NAKAMURA Takumi2011-03-161-4/+12
| | | | | | -mtriple=x86_64-linux. llvm-svn: 127775
* The x86-64 ABI says that a bool is only guaranteed to be zero-extended to a byteCameron Zwarich2011-03-161-3/+15
| rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
| generate incorrect code.
|
| This just fixes the zext at the return. We still insert an i32 ZextAssert when
| reading a function's arguments, but it is followed by a truncate and another i8
| ZextAssert so it is not optimized.
|
| llvm-svn: 127766
* Rename a test to be more inclusive.Cameron Zwarich2011-03-161-0/+0
| | | | llvm-svn: 127765
* Revert r127757, "Patch to a fix dwarf relocation problem on ARM. One-line fixDaniel Dunbar2011-03-161-111/+0
| | | | | | | plus the test where it used to break.", which broke Clang self-host of a Debug+Asserts compiler, on OS X. llvm-svn: 127763
* Add XCore intrinsics for setclk, setrdy.Richard Osborne2011-03-161-0/+16
| | | | llvm-svn: 127761
* Patch to a fix dwarf relocation problem on ARM. One-line fix plus the test ↵Renato Golin2011-03-161-0/+111
| | | | | | where it used to break. llvm-svn: 127757
* Add a test for i1 zeroext arguments on x86-64. We currently generate code thatCameron Zwarich2011-03-161-0/+23
| | | | | | | conforms to the ABI, but DAGCombine could in theory recognize the sequence of zext asserts and truncates and generate incorrect code. llvm-svn: 127754
* Add checkevent intrinsic to check if any resources owned by the current threadRichard Osborne2011-03-161-0/+20
| | | | | | can event. llvm-svn: 127741
* test/CodeGen/X86: FileCheck-ize and add actions for x86_64-linux and ↵NAKAMURA Takumi2011-03-1610-24/+54
| | | | | | x86_64-win32. llvm-svn: 127734
* test/CodeGen/X86: Add a pattern for Win64.NAKAMURA Takumi2011-03-167-22/+141
| | | | llvm-svn: 127733
* test/CodeGen/X86: FileCheck-ize and add explicit -mtriple=x86_64-linux. They ↵NAKAMURA Takumi2011-03-1612-24/+62
| | | | | | are useless to Win64 target. llvm-svn: 127732
* test/CodeGen/X86/byval*.ll: Win64 has not supported byval yet.NAKAMURA Takumi2011-03-165-9/+102
| | | | llvm-svn: 127731
* test/CodeGen/X86/dyn-stackalloc.ll: FileCheck-ize.NAKAMURA Takumi2011-03-161-3/+6
| | | | llvm-svn: 127730
* Some minor cleanups based on feedback.Bill Wendling2011-03-151-0/+12
| | | | llvm-svn: 127694
* Do not form thumb2 ldrd / strd if the offset is by multiple of 4. rdar://9133587Evan Cheng2011-03-151-0/+55
| | | | llvm-svn: 127683
* On the XCore the scavenging slot should be closest to the SP.Richard Osborne2011-03-151-0/+52
| | | | llvm-svn: 127680
* Add XCore intrinsics for getps, setps, setsr and clrsr.Richard Osborne2011-03-152-0/+36
| | | | llvm-svn: 127678
* PTX: Set PTX 2.0 as the minimum supported versionJustin Holewinski2011-03-152-82/+83
| - Remove PTX 1.4 code generation
| - Change type of intrinsics to .v4.i32 instead of .v4.i16
| - Add and/or/xor integer instructions
|
| llvm-svn: 127677
* Add a peephole optimization to optimize pairs of bitcasts. e.g.Evan Cheng2011-03-151-0/+23
|   v2 = bitcast v1
|   ...
|   v3 = bitcast v2
|   ...
|      = v3
| =>
|   v2 = bitcast v1
|   ...
|      = v1
| if v1 and v3 are in the same register class.
|
| bitcast between i32 and fp (and others) are often not nops since they are in
| different register classes. These bitcast instructions are often left because
| they are in different basic blocks and cannot be eliminated by dag combine.
|
| rdar://9104514
|
| llvm-svn: 127668
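The rewrite described in the message above can be sketched on a toy instruction list. This is an illustration only and does not reflect LLVM's actual peephole pass or its data structures: the tuple format `(dst, op, srcs)`, the `reg_class` mapping, and the function name `fold_bitcast_pairs` are all assumptions made for the example, and each register is assumed to be defined once.

```python
def fold_bitcast_pairs(instrs, reg_class):
    """Fold `v3 = bitcast (bitcast v1)` to v1 when v1 and v3 share a class.

    instrs:    list of (dst, op, srcs) tuples in program order; dst may be None.
    reg_class: dict mapping each register name to its register class.
    """
    defs = {dst: (op, srcs) for dst, op, srcs in instrs if dst is not None}

    # Find every second-of-a-pair bitcast whose result can be replaced.
    replace = {}
    for dst, op, srcs in instrs:
        if op != "bitcast":
            continue
        inner = defs.get(srcs[0])
        if inner is not None and inner[0] == "bitcast":
            origin = inner[1][0]
            if reg_class[origin] == reg_class[dst]:
                replace[dst] = origin

    def resolve(reg):
        # Follow replacement chains so no use points at a removed bitcast.
        while reg in replace:
            reg = replace[reg]
        return reg

    out = []
    for dst, op, srcs in instrs:
        if dst in replace:
            continue  # the second bitcast is no longer used
        out.append((dst, op, tuple(resolve(s) for s in srcs)))
    return out
```

As in the message, the first bitcast (`v2 = bitcast v1`) is left in place, since other instructions may still use it; only the second bitcast and the uses of its result are rewritten.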
* sext(undef) = 0, because the top bits will all be the same.Evan Cheng2011-03-151-2/+2
| | | | | | zext(undef) = 0, because the top bits will be zero. llvm-svn: 127649
* Testcase for r127630.Bill Wendling2011-03-151-0/+18
| | | | llvm-svn: 127648
* Clean up ARM tail calls a bit. They're pseudo-instructions for normal branches.Jim Grosbach2011-03-151-1/+1
| | | | | | | Also more cleanly separate the ARM vs. Thumb functionality. Previously, the encoding would be incorrect for some Thumb instructions (the indirect calls). llvm-svn: 127637
* Generate a VTBL instruction instead of a series of loads and stores when weBill Wendling2011-03-141-12/+0
| can. As Nate pointed out, VTBL isn't super performant, but it *has* to be better
| than this:
|
| _shuf:
| @ BB#0:                                 @ %entry
|     push {r4, r7, lr}
|     add r7, sp, #4
|     sub sp, #12
|     mov r4, sp
|     bic r4, r4, #7
|     mov sp, r4
|     mov r2, sp
|     vmov d16, r0, r1
|     orr r0, r2, #6
|     orr r3, r2, #7
|     vst1.8 {d16[0]}, [r3]
|     vst1.8 {d16[5]}, [r0]
|     subs r4, r7, #4
|     orr r0, r2, #5
|     vst1.8 {d16[4]}, [r0]
|     orr r0, r2, #4
|     vst1.8 {d16[4]}, [r0]
|     orr r0, r2, #3
|     vst1.8 {d16[0]}, [r0]
|     orr r0, r2, #2
|     vst1.8 {d16[2]}, [r0]
|     orr r0, r2, #1
|     vst1.8 {d16[1]}, [r0]
|     vst1.8 {d16[3]}, [r2]
|     vldr.64 d16, [sp]
|     vmov r0, r1, d16
|     mov sp, r4
|     pop {r4, r7, pc}
|
| The "illegal" testcase in vext.ll is no longer illegal.
|
| <rdar://problem/9078775>
|
| llvm-svn: 127630
* Fix this test up a bit.Eric Christopher2011-03-141-3/+1
| | | | llvm-svn: 127621
* Minor optimization. sign-ext/anyext of undef is still undef.Evan Cheng2011-03-142-5/+19
| | | | llvm-svn: 127598
* PTX: Emit global arrays with proper sizesJustin Holewinski2011-03-142-40/+40
| - Emit all arrays as type .b8 and proper sizes in bytes to conform to the output
|   of nvcc
|
| llvm-svn: 127584
* PTX: Add support for sqrt/sin/cos intrinsicsJustin Holewinski2011-03-141-0/+56
| | | | llvm-svn: 127578
* ptx: add set.p instruction and related changes to predicate executionChe-Liang Chiou2011-03-141-0/+109
| | | | llvm-svn: 127577
* Saving files before committing is overrated.Eric Christopher2011-03-121-1/+1
| | | | | | Add a RUN line to this test. llvm-svn: 127520
* Sometimes isPredicable lies to us and tells us we don't need the operands.Eric Christopher2011-03-121-0/+60
| | | | | | | | | Go ahead and add them on when we might want to use them and let later passes remove them. Fixes rdar://9118569 llvm-svn: 127518
* Properly pseudo-ize the ARM LDMIA_RET instruction. This has the nice side-Jim Grosbach2011-03-1111-18/+18
| | | | | | effect that we get proper instruction printing using the "pop" mnemonic for it. llvm-svn: 127502
* Roll r127459 back in:Cameron Zwarich2011-03-1113-15/+14
| Optimize trivial branches in CodeGenPrepare, which often get created from the
| lowering of objectsize intrinsics. Unfortunately, a number of tests were relying
| on llc not optimizing trivial branches, so I had to add an option to allow them
| to continue to test what they originally tested.
|
| This fixes <rdar://problem/8785296> and <rdar://problem/9112893>.
|
| llvm-svn: 127498
* Fix the GCC test suite issue exposed by r127477, which was caused by stackCameron Zwarich2011-03-111-0/+19
| | | | | | | protector insertion not working correctly with unreachable code. Since that revision was rolled out, this test doesn't actually fail before this fix. llvm-svn: 127497
* Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often getDaniel Dunbar2011-03-1113-14/+15
| | | | | | created from the", it broke some GCC test suite tests. llvm-svn: 127477
* Optimize trivial branches in CodeGenPrepare, which often get created from theCameron Zwarich2011-03-1113-15/+14
| | | | | | | | | | lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127459
* Change the x86 32-bit scheduler to register pressure and fix up theEric Christopher2011-03-115-5/+4
| | | | | | | | corresponding testcases back to the previous versions. Fixes some performance regressions only seen on 32-bit. llvm-svn: 127441
* Avoid replacing the value of a directly stored load with the stored value if ↵Evan Cheng2011-03-111-0/+47
| | | | | | the load is indexed. rdar://9117613. llvm-svn: 127440
* Properly pseudo-ize MOVCCr and MOVCCs.Jim Grosbach2011-03-101-2/+2
| | | | llvm-svn: 127434
* PTX: Add preliminary support for floating-point divide and multiply-and-addJustin Holewinski2011-03-103-0/+47
| | | | llvm-svn: 127410
* ptx: add the rest of special registers of ISA version 2.0Che-Liang Chiou2011-03-101-34/+176
| | | | llvm-svn: 127397