path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
* [X86][AVX512] Tag subvector extract/insert instructions scheduler classes | Simon Pilgrim | 2017-12-01 | 1 | -32/+65
    llvm-svn: 319568
* Fix line endings. NFCI. | Simon Pilgrim | 2017-12-01 | 1 | -10/+10
    llvm-svn: 319559
* [X86][AVX512] Tag VPERM2I/VPERM2T instructions scheduler class | Simon Pilgrim | 2017-12-01 | 1 | -48/+64
    llvm-svn: 319558
* [X86][AVX512] Tag VFPCLASS instructions scheduler class | Simon Pilgrim | 2017-12-01 | 1 | -26/+43
    llvm-svn: 319554
* [X86][AVX512] Tag VPSHUFBITQMB instructions scheduler class | Simon Pilgrim | 2017-12-01 | 1 | -9/+12
    llvm-svn: 319553
* [X86][AVX512] Tag VPCOMPRESS/VPEXPAND instructions scheduler classes | Simon Pilgrim | 2017-12-01 | 1 | -39/+55
    llvm-svn: 319551
* [X86] Improvement in CodeGen instruction selection for LEAs. | Jatin Bhateja | 2017-12-01 | 2 | -11/+563
    Summary:
    1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operands appearing in the DAG, e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For the above DAG rooted at T3, X86AddressMode will now look like
         Base = B, Index = A, Scale = 2, Disp = 10
    2/ During OptimizeLEAPass down the pipeline, factorization is now performed over LEAs so that, if there is an opportunity, complex LEAs (having 3 operands) can be factored out, e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as follows:
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx) , %edx
    3/ Aggressive operand folding for AM-based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.
    4/ Simplify LEA converts lea (BASE,1,INDEX,0) --> add (BASE, INDEX), which offers better throughput.
    PR32755 will be taken care of by this patch.
    Previous patch revisions: r313343, r314886
    Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja
    Reviewed By: lsaba, RKSimon, jbhateja
    Subscribers: jmolloy, spatel, igorb, llvm-commits
    Differential Revision: https://reviews.llvm.org/D35014
    llvm-svn: 319543
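The Scale promotion described in item 1/ can be illustrated with a small sketch. This is not the LLVM implementation: `fold_address` is a hypothetical helper that works on a flat list of terms rather than a SelectionDAG, and it only handles the simple case where one register repeats a legal number of times.

```python
from collections import Counter

# Scale factors that an x86 SIB byte can encode.
LEGAL_SCALES = {1, 2, 4, 8}

def fold_address(terms):
    """Fold a flat sum of registers (str) and constants (int) into a tuple
    (base, index, scale, disp), or return None if this toy model cannot
    express the sum as a single address. Hypothetical helper."""
    disp = sum(t for t in terms if isinstance(t, int))
    counts = Counter(t for t in terms if isinstance(t, str))
    if len(counts) > 2:
        return None  # more than two distinct registers
    ordered = counts.most_common()  # most repeated register first
    base, index, scale = None, None, 1
    if ordered:
        index, scale = ordered[0]  # repeated register becomes the index
        if len(ordered) == 2:
            base, base_count = ordered[1]
            if base_count != 1:
                return None  # a base register can only appear once
    if scale not in LEGAL_SCALES:
        return None
    return base, index, scale, disp

# T3 = ((A + B) + 10) + A, flattened into its leaf terms.
print(fold_address(["A", "B", 10, "A"]))
```

For the DAG in the commit message this yields Base = B, Index = A, Scale = 2, Disp = 10. Sums such as A + A + A are rejected by the sketch even though real address matching has more options (for example using the same register as both base and index).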
* [X86][AVX512] Tag vshift/vpermv/pshufd/pshufb instructions scheduler classes | Simon Pilgrim | 2017-12-01 | 2 | -120/+158
    llvm-svn: 319540
* GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES | Volkan Keles | 2017-12-01 | 1 | -0/+73
    Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.
    Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls
    Reviewed By: dsanders
    Subscribers: rovka, javed.absar, igorb, llvm-commits
    Differential Revision: https://reviews.llvm.org/D39823
    llvm-svn: 319524
* [X86] Custom legalize v2i32 gathers via widening rather than promoting. | Craig Topper | 2017-12-01 | 1 | -33/+57
    The default legalization for v2i32 is promotion to v2i64. This results in a gather that reads 64-bit elements rather than 32. If one of the elements is near a page boundary this can cause an illegal access that can fault. We also miscalculate the scale for the gather which is an even worse problem, but we probably could have found a separate way to fix that.
    llvm-svn: 319521
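The page-boundary hazard mentioned above can be made concrete with a little arithmetic. The sketch below assumes a 4096-byte page and uses hypothetical helper names; it models only the element width, not the scale miscalculation the message also mentions.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only

def crosses_page(addr, width, page_size=PAGE_SIZE):
    """Return True if a load of `width` bytes at `addr` touches two pages."""
    return addr // page_size != (addr + width - 1) // page_size

def gather_touches_next_page(base, indices, scale, elt_bytes):
    """For each gathered lane, report whether its load spills into the
    following page. Hypothetical model of a gather's per-lane loads."""
    return [crosses_page(base + i * scale, elt_bytes) for i in indices]

# Two i32 elements, the second one occupying the last 4 bytes of a page.
indices = [0, 1023]
print(gather_touches_next_page(0, indices, 4, 4))  # 32-bit element reads
print(gather_touches_next_page(0, indices, 4, 8))  # 64-bit element reads
```

With 4-byte reads both lanes stay inside the page; with 8-byte reads the second lane runs into the next page, which may be unmapped.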
* [X86] Add a DAG combine to simplify masks for AVX2 gather instructions. | Craig Topper | 2017-12-01 | 1 | -0/+17
    AVX2 gathers only use the upper bit of the mask allowing us to simplify sign_extend_inreg to a shift left.
    llvm-svn: 319514
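Why the simplification is sound can be checked on a 32-bit lane model. The sketch below is an illustration with hypothetical helper names, not LLVM code: it compares the top bit of a sign_extend_inreg result with the top bit of a plain left shift.

```python
MASK32 = 0xFFFFFFFF  # model one 32-bit mask lane

def shl(x, n):
    """32-bit logical shift left."""
    return (x << n) & MASK32

def sext_inreg(x, bits):
    """Sign-extend the low `bits` bits of x to 32 bits."""
    low = x & ((1 << bits) - 1)
    if (low >> (bits - 1)) & 1:
        low |= MASK32 & ~((1 << bits) - 1)
    return low

def msb(x):
    """The only bit an AVX2 gather inspects in each mask lane."""
    return (x >> 31) & 1

def same_gather_mask(x, bits):
    """True if both forms enable or disable the lane identically."""
    return msb(sext_inreg(x, bits)) == msb(shl(x, 32 - bits))

print(all(same_gather_mask(x, 1) for x in range(256)))
```

The two values differ in their lower bits, but since only the top bit is read, the cheaper shift is enough for this use.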
* Mark all library options as hidden. | Zachary Turner | 2017-12-01 | 2 | -7/+8
    These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing. This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on.
    Differential Revision: https://reviews.llvm.org/D40674
    llvm-svn: 319505
* XOR the frame pointer with the stack cookie when protecting the stack | Reid Kleckner | 2017-11-30 | 4 | -0/+41
    Summary: This strengthens the guard and matches MSVC.
    Reviewers: hans, etienneb
    Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
    Differential Revision: https://reviews.llvm.org/D40622
    llvm-svn: 319490
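A rough model of the scheme, under the assumption that the value stored in the frame is simply cookie XOR frame pointer and that the epilogue undoes the XOR before comparing. The function names are hypothetical and the sketch says nothing about the actual code LLVM emits.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers

def protect(cookie, frame_pointer):
    """Value written to the guard slot in the prologue (assumed scheme)."""
    return (cookie ^ frame_pointer) & MASK64

def check(stored, cookie, frame_pointer):
    """Epilogue check: undo the XOR and compare with the global cookie."""
    return ((stored ^ frame_pointer) & MASK64) == (cookie & MASK64)

cookie = 0x2B992DDFA232          # arbitrary example value
fp_a, fp_b = 0x7FFC0000F000, 0x7FFC0000E000

slot_a = protect(cookie, fp_a)
print(check(slot_a, cookie, fp_a))  # guard value used in its own frame
print(check(slot_a, cookie, fp_b))  # same value replayed in another frame
```

In this model the guard value differs from frame to frame, so a value leaked from one frame does not pass the check in a frame with a different frame pointer.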
* [X86] Promote i8 CTPOP to i32 instead of i16 when we have the POPCNT instruction. | Craig Topper | 2017-11-30 | 1 | -1/+1
    The 32-bit version is shorter to encode and the zext we emit for the promotion is likely going to be a 32-bit zero extend anyway.
    llvm-svn: 319468
* [X86][AVX512] Tag fcmp/ptest/ternlog instructions scheduler classes | Simon Pilgrim | 2017-11-30 | 1 | -70/+90
    llvm-svn: 319433
* [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output | Francis Visoiu Mistrih | 2017-11-30 | 1 | -2/+2
    As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities).
    Basically:
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
    * grep -nr '%vreg' . and fix if needed
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
    * grep -nr 'vreg[0-9]\+' . and fix if needed
    Differential Revision: https://reviews.llvm.org/D40420
    llvm-svn: 319427
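The two sed substitutions above can be tried on a single string before running them over a tree. The sketch below mirrors the same two regular expressions with Python's `re` module; `strip_vreg` is a hypothetical name and the sample strings are made up for illustration.

```python
import re

def strip_vreg(text):
    """Apply the two rewrites from the commit message to one string."""
    # "%vreg12" becomes "%12"
    text = re.sub(r"%vreg([0-9]+)", r"%\1", text)
    # " vreg12" (space, no percent sign) becomes " %12"
    text = re.sub(r" vreg([0-9]+)", r" %\1", text)
    return text

print(strip_vreg("%vreg0 = COPY %vreg12"))
print(strip_vreg("spilling vreg3"))
```

As in the commit message, a follow-up search for remaining `vreg` occurrences is still needed, since neither pattern covers every spelling.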
* [X86][AVX512] Tag binop/rounding/sae instructions scheduler classes | Simon Pilgrim | 2017-11-30 | 1 | -124/+143
    llvm-svn: 319424
* [X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes | Simon Pilgrim | 2017-11-30 | 2 | -64/+100
    llvm-svn: 319418
* [X86] Optimize avx2 vgatherqps for v2f32 with v2i64 index type. | Craig Topper | 2017-11-30 | 1 | -7/+12
    Normal type legalization will widen everything. This requires forcing 0s into the mask register. We can instead choose the form that only reads 2 elements without zeroing the mask.
    llvm-svn: 319406
* [X86] Make sure we don't remove sign extends of masks with AVX2 masked gathers. | Craig Topper | 2017-11-30 | 1 | -3/+4
    We don't use k-registers and instead use the MSB, so we need to make sure we sign extend the mask to the MSB.
    llvm-svn: 319405
* [X86] Remove some questionable looking code that seems to be looking through a VZEXT to create a larger VSEXT. | Craig Topper | 2017-11-29 | 1 | -1/+1
    If the input to the vzext was signed this would do the wrong thing. Not sure how to test this.
    llvm-svn: 319382
* [X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes (REVERSION) | Simon Pilgrim | 2017-11-29 | 2 | -80/+53
    Accidental commit of incomplete patch
    llvm-svn: 319346
* [X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes | Simon Pilgrim | 2017-11-29 | 2 | -53/+80
    llvm-svn: 319338
* [X86][AVX512] Tag 3OP (shuffles, double-shifts and GFNI) instructions scheduler classes | Simon Pilgrim | 2017-11-29 | 1 | -76/+91
    llvm-svn: 319337
* [X86][AVX512] Add itinerary argument to all AVX512_maskable_* wrappers. NFCI | Simon Pilgrim | 2017-11-29 | 1 | -44/+49
    All default to NoItinerary
    llvm-svn: 319326
* [X86][AVX512] Tag VPERMILV instruction scheduler class | Simon Pilgrim | 2017-11-29 | 2 | -17/+32
    llvm-svn: 319316
* [X86][AVX512] Setup unary (PABS/VPLZCNT/VPOPCNT/VPCONFLICT/VMOV*DUP) instruction scheduler classes | Simon Pilgrim | 2017-11-29 | 2 | -55/+84
    llvm-svn: 319312
* [X86][SSE] Merged sse2_unpack and sse2_unpack PUNPCK instruction templates. NFCI. | Simon Pilgrim | 2017-11-29 | 1 | -71/+69
    llvm-svn: 319310
* [X86][SSE] Merged sse2_pack and sse2_pack_y PACKSS/PACKUS instruction templates. NFCI. | Simon Pilgrim | 2017-11-29 | 1 | -78/+46
    llvm-svn: 319308
* [X86] Remove setOperationAction Promote for ISD::SINT_TO_FP MVT::v8i16/v16i8/v16i16. | Craig Topper | 2017-11-29 | 1 | -3/+0
    A DAG combine ensures these ops are always promoted to vXi32.
    llvm-svn: 319298
* [X86] Promote fp_to_sint v16f32->v16i16/v16i8 to avoid scalarization. | Craig Topper | 2017-11-29 | 1 | -0/+2
    llvm-svn: 319266
* [X86] Mark ISD::FP_TO_UINT v16i8/v16i16 as Promote under AVX512 instead of legal. Fix infinite loop in op legalization when promotion requires 2 steps. | Craig Topper | 2017-11-28 | 2 | -7/+2
    Previously we had an isel pattern to add the truncate. Instead use Promote to add the truncate to the DAG before isel. The Promote legalization code had to be updated to prevent an infinite loop if promotion took multiple steps because it wasn't remembering the previously tried value.
    llvm-svn: 319259
* [X86] Tag CLFLUSHOPT with same scheduling behaviour as CLFLUSH | Simon Pilgrim | 2017-11-28 | 1 | -2/+3
    llvm-svn: 319253
* [X86][SSE] Add SSE_SHUFP OpndItins | Simon Pilgrim | 2017-11-28 | 1 | -11/+16
    Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents.
    llvm-svn: 319249
* [X86][SSE] Add SSE_UNPCK/SSE_PUNPCK OpndItins | Simon Pilgrim | 2017-11-28 | 1 | -49/+61
    Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents.
    llvm-svn: 319245
* [X86][SSE] Use SSE_PACK OpndItins in PACKSS/PACKUS instruction definitions | Simon Pilgrim | 2017-11-28 | 1 | -30/+30
    Update multi-classes to take the scheduling OpndItins instead of hard coding it. SSE_PACK will be reused in the AVX512 equivalents.
    llvm-svn: 319243
* [X86] Remove unused variable. | Craig Topper | 2017-11-28 | 1 | -1/+0
    llvm-svn: 319239
* [X86] Remove code from combineUIntToFP that tried to favor UINT_TO_FP if legal when zero extending from vXi8/vXi16. | Craig Topper | 2017-11-28 | 1 | -3/+1
    The UINT_TO_FP is immediately converted to SINT_TO_FP when the node is re-evaluated because we'll detect that the sign bit is zero.
    llvm-svn: 319234
* [X86] Remove custom lowering for uint_to_fp from vXi8/vXi16. | Craig Topper | 2017-11-28 | 1 | -20/+1
    We have a DAG combine that uses a zero extend that should prevent this from ever occurring now.
    llvm-svn: 319233
* [X86][SSE] Add SSE_HADDSUB/SSE_PABS/SSE_PALIGN OpndItins | Simon Pilgrim | 2017-11-28 | 1 | -45/+59
    Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents.
    llvm-svn: 319209
* [X86] In lowerVectorShuffleAsElementInsertion, if we're able to find a scalar i8 or i16 and need to zero extend it, make sure we use a vXi32 type of the full vector width. | Craig Topper | 2017-11-28 | 1 | -1/+1
    Previously, this was hardcoded to v4i32, but if the input type is 256 bits we need to use v8i32.
    Fixes PR35443
    llvm-svn: 319208
* [X86][X87] Tag FP_TO_INT_IN_MEM pseudos with hasNoSchedulingInfo | Simon Pilgrim | 2017-11-28 | 1 | -2/+2
    We don't need scheduling info for pseudos
    llvm-svn: 319197
* [CodeGen] Print register names in lowercase in both MIR and debug output | Francis Visoiu Mistrih | 2017-11-28 | 8 | -50/+50
    As part of the unification of the debug format and the MIR format, always print registers as lowercase.
    * Only debug printing is affected. It now follows MIR.
    Differential Revision: https://reviews.llvm.org/D40417
    llvm-svn: 319187
* [X86][X87] Tag FTST x87 instruction scheduler class | Simon Pilgrim | 2017-11-28 | 1 | -1/+2
    Looking through Agner, FTST is very similar to generic float compare behaviour, so I've added them to the existing IIC_FCOMI (WriteFAdd) tags.
    llvm-svn: 319184
* [X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes | Simon Pilgrim | 2017-11-28 | 3 | -16/+30
    Atom's FABS/FCHS/FSQRT latencies taken from Agner.
    Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction.
    llvm-svn: 319175
* [X86][3DNow] Add instruction itinerary and scheduling classes for femms/prefetch/prefetchw | Simon Pilgrim | 2017-11-28 | 1 | -6/+8
    llvm-svn: 319167
* [X86] Remove some unused pattern fragments from td file. NFC | Craig Topper | 2017-11-28 | 1 | -10/+0
    llvm-svn: 319143
* [X86] Make zero extend from v16i1/v8i1 to v16i8/v8i16/v16i16 not scalarize under AVX512. | Craig Topper | 2017-11-28 | 1 | -0/+4
    llvm-svn: 319136
* [X86] Remove unnecessary fp<->int setOperationAction lines from a hasVLX block. NFCI | Craig Topper | 2017-11-28 | 1 | -7/+0
    These lines all exist identically either under SSE2, AVX2 or AVX512. Given that VLX implies all of those, these aren't providing anything new.
    llvm-svn: 319124
* [X86] Remove duplicate calls to setOperationAction. NFCI | Craig Topper | 2017-11-28 | 1 | -2/+0
    These same calls exist a few lines down.
    llvm-svn: 319122