path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* Peephole optimization of ptest-conditioned branch in X86 arch. Performs ↵Victor Umansky2012-01-052-0/+142
| instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.
|
| Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)
|
| Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
|
| llvm-svn: 147601
* Replace the uint64_t -> double conversion algorithm with one that's more ↵Bill Wendling2012-01-051-52/+38
| efficient. This small bit of ASM code is sufficient to do what the old algorithm did:
|
|     movq %rax, %xmm0
|     punpckldq (c0), %xmm0  // c0: (uint4){ 0x43300000U, 0x45300000U, 0U, 0U }
|     subpd (c1), %xmm0      // c1: (double2){ 0x1.0p52, 0x1.0p52 * 0x1.0p32 }
|   #ifdef __SSE3__
|     haddpd %xmm0, %xmm0
|   #else
|     pshufd $0x4e, %xmm0, %xmm1
|     addpd %xmm1, %xmm0
|   #endif
|
| It's arguably faster. One caveat, the 'haddpd' instruction isn't very fast on all processors.
|
| <rdar://problem/7719814>
|
| llvm-svn: 147593
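|
| Editorial sketch for r147593: the constants in the sequence above can be read as a bias trick. The following scalar C version is only an illustration, assuming IEEE-754 doubles; the function name u64_to_double is hypothetical and does not appear in the LLVM patch. Each 32-bit half of the input is placed in the mantissa of a double whose exponent encodes 2^52 (low half) or 2^84 (high half), the biases are subtracted exactly, and the two halves are added with a single rounding.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the punpckldq/subpd/add sequence.
 * Assumes IEEE-754 binary64 doubles. */
static double u64_to_double(uint64_t x) {
    /* 0x43300000 in the high word is the exponent of 2^52, so the
     * low 32 bits of x land in the mantissa: value = 2^52 + lo. */
    uint64_t lo_bits = 0x4330000000000000ULL | (x & 0xFFFFFFFFULL);
    /* 0x45300000 in the high word is the exponent of 2^84, so the
     * high 32 bits of x are scaled by 2^32: value = 2^84 + hi * 2^32. */
    uint64_t hi_bits = 0x4530000000000000ULL | (x >> 32);

    double lo, hi;
    memcpy(&lo, &lo_bits, sizeof lo);
    memcpy(&hi, &hi_bits, sizeof hi);

    /* Corresponds to subpd with c1 = { 2^52, 2^52 * 2^32 }.
     * Both subtractions are exact. */
    lo -= 0x1.0p52;
    hi -= 0x1.0p84;

    /* Corresponds to haddpd (or pshufd + addpd): the only rounding step. */
    return hi + lo;
}
```

| For inputs that fit in 53 bits the result is exact; larger inputs are rounded once in the final addition.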
* Reapply r146997, "Heed spill slot alignment on ARM."Jakob Stoklund Olesen2012-01-052-3/+4
| | | | | | | | | | | | Now that canRealignStack() understands frozen reserved registers, it is safe to use it for aligned spill instructions. It will only return true if the registers reserved at the beginning of register allocation allow for dynamic stack realignment. <rdar://problem/10625436> llvm-svn: 147579
* Avoid reserving an ARM base pointer during register allocation.Jakob Stoklund Olesen2012-01-051-2/+17
| | | | | | | | | | | | | | | | | Once register allocation has started the reserved registers are frozen. Fix the ARM canRealignStack() hook to respect the frozen register state. Now the hook returns false if register allocation was started with frame pointer elimination enabled. It also returns false if register allocation started without a reserved base pointer, and stack realignment would require a base pointer. This bug was breaking oggenc on armv6. No test case, an upcoming patch will use this functionality to realign the stack for spill slots when possible. llvm-svn: 147578
* Silence warnings of a mysterious compiler that still defaults to C89.Benjamin Kramer2012-01-041-2/+2
| | | | llvm-svn: 147553
* Enable -soft-float for MIPS.Akira Hatanaka2012-01-041-7/+10
| | | | llvm-svn: 147541
* Rename immLUiOpnd.Akira Hatanaka2012-01-042-3/+3
| | | | llvm-svn: 147519
* - Define base classes for Jump-and-link instructions and make 32-bit and 64-bitAkira Hatanaka2012-01-042-42/+27
| versions derive from them.
| - JALR64 is not needed since N64 does not emit jal.
| - Add template parameter to BranchLink that sets the rt field.
| - Fix the set of temporary registers for O32 and N64.
|
| llvm-svn: 147518
* Have getRegForInlineAsmConstraint return the correct register class when targetAkira Hatanaka2012-01-041-4/+9
| | | | | | is Mips64. llvm-svn: 147516
* Fix more places which should be checking for iOS, not darwin.Evan Cheng2012-01-043-18/+18
| | | | llvm-svn: 147513
* For x86, canonicalize maxEvan Cheng2012-01-041-0/+31
| (x > y) ? x : y
| =>
| (x >= y) ? x : y
|
| So for something like
| (x - y) > 0 ? (x - y) : 0
| it will be
| (x - y) >= 0 ? (x - y) : 0
|
| This makes it possible to test the sign bit and eliminate a comparison against zero, e.g.
|     subl %esi, %edi
|     testl %edi, %edi
|     movl $0, %eax
|     cmovgl %edi, %eax
| =>
|     xorl %eax, %eax
|     subl %esi, %edi
|     cmovsl %eax, %edi
|
| rdar://10633221
|
| llvm-svn: 147512
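|
| Editorial sketch for r147512: a small C illustration of why the canonical form is safe and why it helps. The helper names below are hypothetical and are not taken from the patch. When the difference is zero both arms of the select yield 0, so switching from > to >= does not change the result, and ">= 0" is the same as "sign bit clear", which needs no separate compare against zero.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the LLVM patch. */

/* Original form: select on "strictly greater than zero". */
static int32_t max0_gt(int32_t d) { return (d > 0) ? d : 0; }

/* Canonical form: select on "greater than or equal to zero".
 * For d == 0 both arms produce 0, so the result is unchanged. */
static int32_t max0_ge(int32_t d) { return (d >= 0) ? d : 0; }

/* The canonical condition is just a sign-bit test, which is what lets
 * the backend drop the explicit test against zero. */
static int32_t max0_signbit(int32_t d) {
    uint32_t sign = (uint32_t)d >> 31; /* 1 if d is negative, else 0 */
    return sign ? 0 : d;
}
```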
* Fix 80-column violations.Chad Rosier2012-01-031-4/+5
| | | | llvm-svn: 147495
* Revert r146997, "Heed spill slot alignment on ARM."Jakob Stoklund Olesen2012-01-032-4/+3
| | | | | | | | | This patch caused a miscompilation of oggenc because a frame pointer was suddenly needed halfway through register allocation. <rdar://problem/10625436> llvm-svn: 147487
* Revert 147426 because it caused pr11696.Nadav Rotem2012-01-031-18/+0
| | | | llvm-svn: 147485
* Enhance DAGCombine for transforming 128->256 casts into a vmovaps, ratherChad Rosier2012-01-032-0/+19
| | | | | | | than a vxorps + vinsertf128 pair if the original vector came from a load. rdar://10594409 llvm-svn: 147481
* Fix malformed assert.Matt Beaumont-Gay2012-01-031-1/+1
| | | | | | | | If anybody has strong feelings about 'default: assert(0 && "blah")' vs 'default: llvm_unreachable("blah")', feel free to regularize the instances of each in this file. llvm-svn: 147459
* Intel style asm variant does not need '%' prefix.Devang Patel2012-01-032-28/+28
| | | | llvm-svn: 147453
* Miscellaneous shuffle lowering cleanup. No functional changes. Primarily ↵Craig Topper2012-01-021-47/+43
| | | | | | converting the indexing loops to unsigned to be consistent across functions. llvm-svn: 147430
* Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. ↵Craig Topper2012-01-022-26/+22
| | | | | | Also make it return false if there's not even a load at all. This makes the code better match the code in DAGCombiner that it tries to match. These two changes prevent some cases where vector_shuffles were making it to instruction selection and causing the older shuffle selection code to be triggered. Also needed to fix a bad pattern that this change exposed. This is the first step towards getting rid of the old shuffle selection support. No test cases yet because there's no way to tell whether a shuffle was handled in the legalize stage or at instruction selection. llvm-svn: 147428
* Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend ↵Nadav Rotem2012-01-021-0/+18
| | | | | | instructions only look at the highest bit. llvm-svn: 147426
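|
| Editorial sketch for r147426: a scalar C model of the stated reasoning, with hypothetical helper names that are not from the patch. It models one 32-bit lane of a variable blend in which only the top bit of the mask selects the result, and shows that a shift-left mask and a sign-extended mask agree in that bit. Note that this log also records the change being reverted in r147485, so this is only an illustration of the idea, not of code that stayed in the tree.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of a variable blend:
 * only bit 31 of the mask lane decides between a and b. */
static uint32_t blend_lane(uint32_t mask, uint32_t a, uint32_t b) {
    return (mask >> 31) ? b : a;
}

/* sign_extend of a 1-bit condition: all ones or all zeros. */
static uint32_t mask_sext(uint32_t cond) { return 0u - (cond & 1u); }

/* shl only: moves the condition into bit 31, other bits stay zero.
 * The top bit matches mask_sext, which is all the blend looks at. */
static uint32_t mask_shl(uint32_t cond) { return (cond & 1u) << 31; }
```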
* Allow CRC32 instructions to be selected when AVX is enabled.Craig Topper2012-01-012-2/+3
| | | | llvm-svn: 147411
* Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX ↵Craig Topper2012-01-013-20/+21
| | | | | | is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers. llvm-svn: 147409
* X86Disassembler: Fix undefined behavior found by GCC 4.6Benjamin Kramer2012-01-011-3/+5
| | | | llvm-svn: 147404
* Merge X86 SHUFPS and SHUFPD node types.Craig Topper2011-12-314-58/+35
| | | | llvm-svn: 147394
* Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load.Craig Topper2011-12-311-0/+6
| | | | llvm-svn: 147393
* Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with ↵Craig Topper2011-12-311-2/+2
| | | | | | a load from being selected. llvm-svn: 147392
* Cleanup Mips code and rename some variables. Patch by Jack CarterBruno Cardoso Lopes2011-12-304-171/+79
| | | | llvm-svn: 147383
* Improve Mips JIT.Bruno Cardoso Lopes2011-12-303-6/+15
| | | | | | | | | | | Implement encoder methods getJumpTargetOpValue and getBranchTargetOpValue for jmptarget and brtarget Mips tablegen operand types in the code emitter for old-style JIT. Rename the pc relative relocation for branches - new name is Mips::reloc_mips_pc16. Patch by Sasa Stankovic llvm-svn: 147382
* Make FMA4 imply AVX so that YMM registers would be available. Necessitates ↵Craig Topper2011-12-301-6/+8
| | | | | | removing from Bulldozer CPU types since it would enable AVX code generation implicitly. Also make SSE4A imply SSE3. Without some level of SSE implied, XMM registers wouldn't be legal. llvm-svn: 147369
* Add disassembler support for VPERMIL2PD and VPERMIL2PS.Craig Topper2011-12-302-4/+12
| | | | llvm-svn: 147368
* Add FMA4 instructions to disassembler.Craig Topper2011-12-301-38/+53
| | | | llvm-svn: 147367
* Separate the concept of having memory access in operand 4 from the concept ↵Craig Topper2011-12-305-34/+26
| | | | | | of having the W bit set for XOP instructions. Removes ORing W-bits in the encoder and will similarly simplify the disassembler implementation. llvm-svn: 147366
* Combine FMA4 SS/SD patterns with the instruction definitions.Craig Topper2011-12-301-97/+24
| | | | llvm-svn: 147365
* Combine FMA4 PS/PD patterns with the instruction definitions.Craig Topper2011-12-301-219/+42
| | | | llvm-svn: 147364
* Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to ↵Craig Topper2011-12-301-58/+48
| | | | | | force alignment on these instructions. Add a couple testcases for memory forms. llvm-svn: 147361
* Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 ↵Craig Topper2011-12-301-60/+43
| | | | | | size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere. llvm-svn: 147360
* Cleanup stack/frame register define/kill states. This fixes two bugs:Hal Finkel2011-12-302-17/+17
| 1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test).
|
| 2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this.
|
| llvm-svn: 147359
* Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms of FMA3 ↵Craig Topper2011-12-292-19/+55
| | | | | | instructions. llvm-svn: 147353
* Expose FMA3 instructions to the disassembler.Craig Topper2011-12-291-17/+15
| | | | llvm-svn: 147351
* Make FMA3 imply AVX needs to be enabled. Particularly because 256-bit types ↵Craig Topper2011-12-291-3/+4
| | | | | | aren't valid unless AVX is enabled. llvm-svn: 147349
* Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit.Craig Topper2011-12-291-9/+13
| | | | llvm-svn: 147348
* Add FeaturePOPCNT to all CPU types that lost it when it was removed from SSE42/SSE4A ↵Craig Topper2011-12-291-14/+17
| | | | | | in r147339. llvm-svn: 147347
* Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled ↵Craig Topper2011-12-291-1/+1
| | | | | | along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior since the CLMUL instructions don't have patterns yet. llvm-svn: 147345
* Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along ↵Craig Topper2011-12-291-2/+2
| | | | | | with AES. Since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior though since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms. llvm-svn: 147344
* Remove the separate explicit AES instruction patterns. They are equivalent ↵Craig Topper2011-12-291-48/+5
| | | | | | to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns. llvm-svn: 147342
* Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled ↵Craig Topper2011-12-291-3/+2
| | | | | | on its own without disabling SSE4.2 or SSE4A. llvm-svn: 147339
* Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL ↵Craig Topper2011-12-291-9/+8
| | | | | | for v16i16 and v32i8. llvm-svn: 147337
* Remove some elses after returns.Craig Topper2011-12-291-7/+10
| | | | llvm-svn: 147336
* Remove trailing spaces. Fix an assert to use && instead of || before string. ↵Craig Topper2011-12-291-7/+5
| | | | | | Add same assert on similar code path. llvm-svn: 147335
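|
| Editorial sketch for r147335: the reason an assert written with || before its message string can never fire is that a string literal is a non-null pointer and therefore always true. The helpers below are hypothetical and only model the two condition shapes in C; they are not code from the patch.

```c
/* Hypothetical helpers modelling the two assert condition shapes.
 * msg stands in for the string literal used as the assert message. */

/* Broken shape: cond || "message". A non-null msg makes this 1 even
 * when cond is false, so the assertion can never trigger. */
static int guard_or(int cond, const char *msg) { return cond || msg; }

/* Intended shape: cond && "message". The message does not change the
 * truth value, so the result follows cond. */
static int guard_and(int cond, const char *msg) { return cond && msg; }
```

| With a false condition, guard_or still evaluates to 1 while guard_and evaluates to 0, which is the difference the fix restores.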
* Fix type-checking for load transformation which is not legal on ↵Eli Friedman2011-12-281-1/+2
| | | | | | floating-point types. PR11674. llvm-svn: 147323