path: root/llvm/lib/Target/X86/X86ISelLowering.cpp
Commit message | Author | Age | Files | Lines
...
* Add wider vector/integer support for PR12312Michael Liao2012-09-131-100/+101
| | | | | | | | - Enhance the fix to PR12312 to support wider integers, such as 256-bit integers. If more than one fully evaluated vector is found, POR them first, followed by the final PTEST. llvm-svn: 163832
* Fix PR11985Michael Liao2012-09-121-2/+3
| | | | | | | | | | | - BlockAddress has no support of BA + offset form and there is no way to propagate that offset into machine operand; - Add BA + offset support and a new interface 'getTargetBlockAddress' to simplify target block address forming; - All targets are modified to use new interface and X86 backend is enhanced to support BA + offset addressing. llvm-svn: 163743
* Indentation fixes. No functional change.Craig Topper2012-09-121-8/+8
| | | | llvm-svn: 163682
* Make a bunch of lowering helper functions static instead of member ↵Craig Topper2012-09-111-58/+55
| | | | | | functions. No functional change. llvm-svn: 163596
* Remove redundant semicolons which are null statements.Dmitri Gribenko2012-09-101-1/+1
| | | | llvm-svn: 163547
* Enhance PR11334 fix to support extload from v2f32/v4f32Michael Liao2012-09-101-0/+4
| | | | | | - Fix a remaining issue of PR11674 as well llvm-svn: 163528
* Add boolean simplification support from CMOVMichael Liao2012-09-101-12/+42
| | | | | | | | - If a boolean value is generated from CMOV and tested as boolean value, simplify the use of test result by referencing the original condition. The RDRAND intrinsic is one such case. llvm-svn: 163516
* The VPSHUFB 256-bit instruction may be generated when one of the input vectors is ↵Elena Demikhovsky2012-09-101-4/+15
| | | | | | | | undefined or zeroinitializer. I've added the "zeroinitializer" case in this patch. llvm-svn: 163506
* Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.Craig Topper2012-09-081-0/+5
| | | | llvm-svn: 163473
* Use 256-bit alignment for constant pool value for 256-bit vector FNEG lowering.Craig Topper2012-09-081-2/+3
| | | | llvm-svn: 163463
* Add support for lowering FABS of vector types.Craig Topper2012-09-081-12/+25
| | | | llvm-svn: 163461
* Set operation action for FFLOOR to Expand for all vector types for X86. Set ↵Craig Topper2012-09-081-0/+1
| | | | | | FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458
* AVX2 optimization.Elena Demikhovsky2012-09-061-0/+40
| | | | | | Added generation of VPSHUFB instruction for <32 x i8> vector shuffle when possible. llvm-svn: 163312
* Remove duplicated helper functionMichael Liao2012-09-061-17/+1
| | | | llvm-svn: 163295
* Use iPTR instead of i32 for extract_subvector/insert_subvector index in ↵Craig Topper2012-09-061-2/+2
| | | | | | lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder. llvm-svn: 163293
* Stop casting away const qualifier needlessly.Roman Divacky2012-09-051-1/+1
| | | | llvm-svn: 163258
* Generic Bypass Slow DivPreston Gurd2012-09-041-0/+4
| | | | | | | | | | | | | | | | | | | | | | | - CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presence of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150
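The run-time check described in the entry above can be sketched in plain C++. This is only a scalar model under stated assumptions: `DivRem` and `bypassSlowDiv` are hypothetical names, and the sketch tests both operands for fitting in 8 bits, which is an assumption about how the check is applied rather than something the commit message spells out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: quotient and remainder computed together,
// mirroring how the optimization reuses both results in one basic block.
struct DivRem {
  int32_t quot;
  int32_t rem;
};

// Hypothetical helper modelling the run-time test; not the CodeGenPrepare pass.
// INT32_MIN / -1 is not handled, just as it is not with a plain division.
inline DivRem bypassSlowDiv(int32_t a, int32_t b) {
  assert(b != 0 && "division by zero is undefined on both paths");
  // No bit above the low 8 is set in either operand, so both are
  // non-negative and less than 256.
  const uint32_t bits = static_cast<uint32_t>(a) | static_cast<uint32_t>(b);
  if ((bits & ~0xFFu) == 0) {
    // Fast path: an 8-bit unsigned divide (the role DIVB plays on x86).
    const uint8_t n = static_cast<uint8_t>(a);
    const uint8_t d = static_cast<uint8_t>(b);
    return {n / d, n % d};
  }
  // Slow path: the full-width signed divide (the role IDIV plays on x86).
  return {a / b, a % b};
}
```

Both paths return the same values for any input they share, so the only difference is which divide instruction a compiler could pick for each branch.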
* This patch optimizes shuffle instruction - generates 2 instructions instead ↵Elena Demikhovsky2012-09-041-16/+17
| | | | | | | | | | | | | | | | | | | | of 4. Since this specific shuffle is widely used in many workloads we see a ~10% performance improvement on them. shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14> vmovaps (%rdx), %ymm0 vshufps $8, %ymm0, %ymm0, %ymm0 vmovaps (%rcx), %ymm1 vshufps $8, %ymm0, %ymm1, %ymm1 vunpcklps %ymm0, %ymm1, %ymm0 vmovaps (%rcx), %ymm0 vmovsldup (%rdx), %ymm1 vblendps $85, %ymm0, %ymm1, %ymm0 llvm-svn: 163134
* TyposCraig Topper2012-09-011-1/+1
| | | | llvm-svn: 163053
* SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure itsManman Ren2012-08-311-0/+12
| | | | | | | | | output chain is set up correctly. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://11457792 llvm-svn: 163036
* Fix PR12359Michael Liao2012-08-311-3/+5
| | | | | | | | | - In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as well as PSHUFB will zero elements with negative indices. Patch by Sriram Murali <sriram.murali@intel.com> llvm-svn: 163018
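The reason a zero V2 needs no second PSHUFB can be seen from a scalar model of the 128-bit instruction. The sketch below assumes the documented byte-shuffle semantics (a control byte with its top bit set writes zero, otherwise its low four bits index the source); `Bytes16`, `pshufb` and `checkZeroedLanes` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<uint8_t, 16>;

// Scalar model of 128-bit PSHUFB: a control byte with bit 7 set (a
// "negative" index) produces zero; otherwise the low 4 bits pick a source byte.
inline Bytes16 pshufb(const Bytes16 &src, const Bytes16 &ctrl) {
  Bytes16 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = (ctrl[i] & 0x80) ? uint8_t{0} : src[ctrl[i] & 0x0F];
  return out;
}

// Usage check: lanes that would have come from an all-zero V2 are simply
// given a negative index, so one PSHUFB on V1 already yields the result.
inline void checkZeroedLanes() {
  Bytes16 v1{};
  for (std::size_t i = 0; i < v1.size(); ++i)
    v1[i] = static_cast<uint8_t>(i + 1);
  Bytes16 ctrl{};   // every lane selects v1[0] ...
  ctrl[2] = 0x80;   // ... except lane 2, which stands in for a V2 element
  const Bytes16 r = pshufb(v1, ctrl);
  assert(r[0] == 1 && r[2] == 0);
}
```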
* Add support for converting llvm.fma to fma4 instructions.Craig Topper2012-08-311-4/+6
| | | | llvm-svn: 162999
* Only perform DAG combine on FMAs of legal types.Craig Topper2012-08-301-0/+4
| | | | llvm-svn: 162892
* Convert FMA4 patterns to use target specific nodes instead of intrinsics to ↵Craig Topper2012-08-291-4/+0
| | | | | | align with FMA3. llvm-svn: 162829
* Add comments on the literal value used.Michael Liao2012-08-281-1/+1
| | | | llvm-svn: 162805
* Explicitly update the number of nodes to be traversedMichael Liao2012-08-281-1/+1
| | | | llvm-svn: 162780
* Fix PR12312Michael Liao2012-08-281-10/+112
| | | | | | | | - Add a target-specific DAG optimization to recognize a PTEST-able pattern. Such a pattern is an OR'd tree with X86ISD::OR as the root node. When the X86ISD::OR node has only its flag result being used as a boolean value and all its leaves are extracted from the same vector, it could be folded into an X86ISD::PTEST node. llvm-svn: 162735
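The equivalence behind this fold can be modelled on a 128-bit value held as two 64-bit lanes. The sketch assumes PTEST's documented zero-flag behaviour (ZF is set when the bitwise AND of its two operands is all zero); `V128`, `anyBitSetOrTree`, `ptestZF` and `checkFold` are hypothetical names, and nothing here is the DAG combine itself.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V128 = std::array<uint64_t, 2>;

// OR-tree form: extract every lane, OR them together, test the result.
inline bool anyBitSetOrTree(const V128 &v) { return (v[0] | v[1]) != 0; }

// Model of the zero flag produced by PTEST a, b: set when (a AND b) is
// zero across all 128 bits.
inline bool ptestZF(const V128 &a, const V128 &b) {
  return (a[0] & b[0]) == 0 && (a[1] & b[1]) == 0;
}

// Usage check: testing a vector against itself answers the same question
// as the OR tree, with the sense of the flag inverted.
inline void checkFold(const V128 &v) {
  assert(anyBitSetOrTree(v) == !ptestZF(v, v));
}
```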
* Remove MMX shift intrinsic handling code that also exists in ↵Craig Topper2012-08-271-56/+0
| | | | | | SelectionDAGBuilder. llvm-svn: 162661
* Custom lower FMA intrinsics to target specific nodes and remove the patterns.Craig Topper2012-08-241-0/+72
| | | | llvm-svn: 162534
* fix a case where all operands of BUILD_VECTOR are undefinedMichael Liao2012-08-201-0/+4
| | | | llvm-svn: 162214
* When unsafe math is used, we can use commutative FMAX and FMIN. In some casesNadav Rotem2012-08-191-0/+27
| | | | | | | | | | | | | | | | | | | this allows for better code generation. Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and FMINC, which are commutative. For example: movaps %xmm0, %xmm1 movsd LC(%rip), %xmm0 minsd %xmm1, %xmm0 becomes: minsd LC(%rip), %xmm0 llvm-svn: 162187
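Why the operand order of MINSD matters without unsafe math can be shown with a scalar model. The sketch assumes the documented behaviour that the second (source) operand is returned when either input is NaN or when both are zeros; `x86MinSD` and `operandOrderMatters` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Scalar model of MINSD dst, src: returns src unless dst is strictly smaller.
// With a NaN operand or with two zeros the comparison is false, so src wins.
inline double x86MinSD(double dst, double src) { return dst < src ? dst : src; }

// Reports whether swapping the operands changes the result, treating
// +0.0 and -0.0 as different values and all NaNs as the same value.
inline bool operandOrderMatters(double a, double b) {
  const double x = x86MinSD(a, b);
  const double y = x86MinSD(b, a);
  if (std::isnan(x) || std::isnan(y))
    return std::isnan(x) != std::isnan(y);
  return x != y || std::signbit(x) != std::signbit(y);
}

// Usage check: ordinary finite inputs commute, a NaN input does not.
inline void checkCommutativity() {
  assert(!operandOrderMatters(1.0, 2.0));
  assert(operandOrderMatters(1.0, std::numeric_limits<double>::quiet_NaN()));
}
```

When NaNs and signed zeros can be ignored, the two operand orders agree, which is what lets the operands be swapped so that the load folds into the instruction.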
* Reapply r162160 with a fix: Optimize Arith->Trunc->SETCC sequence to allow ↵Nadav Rotem2012-08-181-15/+60
| | | | | | better compare/branch code. llvm-svn: 162172
* Refactor code a bit to reduce number of calls in the final compiled code. No ↵Craig Topper2012-08-181-134/+144
| | | | | | functional change intended. llvm-svn: 162166
* Revert r162160 because it made a few buildbots fail.Nadav Rotem2012-08-181-43/+6
| | | | llvm-svn: 162164
* The X86 backend has a number of optimizations for SETCC nodes which useNadav Rotem2012-08-181-6/+43
| | | | | | | | | | | | | | | | | | | | | arithmetic instructions. However, when small data types are used, a truncate node appears between the SETCC node and the arithmetic operation. This patch adds support for this pattern. Before: xorl %esi, %edi testb %dil, %dil setne %al ret After: xorb %dil, %sil setne %al ret rdar://12081007 llvm-svn: 162160
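The before/after sequences above rely on a simple identity: truncating the result of a bitwise operation gives the same low bits as performing the operation on truncated operands. A minimal C++ sketch of that identity for XOR follows; the function names are hypothetical and the code is not the DAG combine.

```cpp
#include <cassert>
#include <cstdint>

// "Before" shape: 32-bit XOR, truncate to 8 bits, then compare with zero.
inline bool wideXorThenTrunc(uint32_t a, uint32_t b) {
  return static_cast<uint8_t>(a ^ b) != 0;
}

// "After" shape: truncate both operands first, XOR in 8 bits, and let the
// arithmetic result itself drive the comparison.
inline bool narrowXor(uint32_t a, uint32_t b) {
  const uint8_t lo =
      static_cast<uint8_t>(static_cast<uint8_t>(a) ^ static_cast<uint8_t>(b));
  return lo != 0;
}

// Usage check over a small grid of inputs, including bits above the low byte.
inline void checkTruncIdentity() {
  for (uint32_t a = 0; a < 0x400; a += 0x33)
    for (uint32_t b = 0; b < 0x400; b += 0x55)
      assert(wideXorThenTrunc(a, b) == narrowXor(a, b));
}
```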
* Use nested switch to select arguments to reduce calls to EmitPCMP.Craig Topper2012-08-171-5/+20
| | | | llvm-svn: 162089
* Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to ↵Craig Topper2012-08-171-16/+30
| | | | | | reduce to only a single call to it thus allowing it to be inlined by the compiler. llvm-svn: 162088
* minor fix of X86ISD::VSEXT_MOVL dumpMichael Liao2012-08-141-0/+1
| | | | llvm-svn: 161902
* fix PR11334Michael Liao2012-08-141-0/+81
| | | | | | | | | | - FP_EXTEND only supports extending from vectors with matching elements. This results in the scalarization of extending to v2f64 from v2f32, which will be legalized to v4f32 not matching with v2f64. - add X86-specific VFPEXT supporting extending from v4f32 to v2f64. - add BUILD_VECTOR lowering helper to recover back the original extending from v4f32 to v2f64. - test case is enhanced to include different vector widths. llvm-svn: 161894
* Factor duplicate calls to getUNDEF in several functions.Craig Topper2012-08-141-10/+10
| | | | llvm-svn: 161860
* Re-factor intrinsic lowering to combine common parts of similar intrinsics. ↵Craig Topper2012-08-141-34/+133
| | | | | | Reduces compiled code size a little bit. llvm-svn: 161859
* Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and ↵Craig Topper2012-08-131-7/+9
| | | | | | putting a couple of if conditions in a better order. llvm-svn: 161746
* Refactor code a bit to share commonalities. No functional change intended.Craig Topper2012-08-131-20/+21
| | | | llvm-svn: 161745
* Fix an unused variable warning from r161742.Craig Topper2012-08-131-3/+0
| | | | llvm-svn: 161743
* Remove the LowerMMXCONCAT_VECTORS function. It could never execute because ↵Craig Topper2012-08-131-39/+1
| | | | | | there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result. llvm-svn: 161742
* Remove call to setOperationAction for SETCC of v4f32. SETCC returns an ↵Craig Topper2012-08-121-1/+0
| | | | | | integer type not an FP type. llvm-svn: 161738
* Remove unnecessary call to setOperationAction for SETCC of v2i64 under ↵Craig Topper2012-08-121-3/+0
| | | | | | SSE42. It was already called for the same under SSE2. llvm-svn: 161737
* Replace many calls to getSizeInBits() with is128BitVector/is256BitVectorCraig Topper2012-08-121-60/+65
| | | | llvm-svn: 161734
* Use MVT.isXBitVector instead of EVT.isXBitVector when setting up operation ↵Craig Topper2012-08-121-41/+35
| | | | | | actions. Compiles to smaller code. llvm-svn: 161733
* fix PR13577, an issue introduced by r161687Michael Liao2012-08-111-1/+19
| | | | | | | | - FCMOV only supports a subset of X86 conditions. Skip boolean simplification if X86 condition is not valid for FCMOV. - add a minimal test case for PR13577. llvm-svn: 161732