path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
* [X86][XOP] Add missing scheduler classes to XOP instructions | Simon Pilgrim | 2017-11-21 | 1 | -28/+39
  All match equivalent basic classes (WritePHAdd, WriteFAdd etc.) according to both the AMD 15h SOG and Agner's tables.
  llvm-svn: 318758
* [X86][LWP] Add missing LWP itinerary class to lwpins instructions | Simon Pilgrim | 2017-11-21 | 1 | -2/+2
  It's on all other LWP instructions but I missed it from lwpins, despite similar scheduling behaviour.
  llvm-svn: 318751
* [x86][icelake]BITALG | Coby Tayree | 2017-11-21 | 6 | -0/+25
  vpopcnt{b,w}
  Differential Revision: https://reviews.llvm.org/D40213
  llvm-svn: 318748
* [x86][icelake]VNNI | Coby Tayree | 2017-11-21 | 9 | -0/+98
  Introducing Vector Neural Network Instructions, consisting of:
    vpdpbusd{s}
    vpdpwssd{s}
  Differential Revision: https://reviews.llvm.org/D40208
  llvm-svn: 318746
* [x86][icelake]vbmi2 | Coby Tayree | 2017-11-21 | 9 | -10/+245
  introducing vbmi2, consisting of
    vpcompress{b,w}
    vpexpand{b,w}
    vpsh{l,r}d{w,d,q}
    vpsh{l,r}dv{w,d,q}
  Differential Revision: https://reviews.llvm.org/D40206
  llvm-svn: 318745
* [x86][icelake]vpclmulqdq introduction | Coby Tayree | 2017-11-21 | 8 | -63/+115
  an icelake promotion of pclmulqdq
  Differential Revision: https://reviews.llvm.org/D40101
  llvm-svn: 318741
* [x86][icelake]VAES introduction | Coby Tayree | 2017-11-21 | 6 | -26/+75
  an icelake promotion of AES
  Differential Revision: https://reviews.llvm.org/D40078
  llvm-svn: 318740
* [X86] Simplify type constraints for AVX2 masked gather. | Craig Topper | 2017-11-21 | 1 | -19/+14
  We don't need separate 32 and 64 node types. We can use SDTCisInt and SDTCisSameSizeAs to ensure the mask matches the size of the result type and is integer.
  llvm-svn: 318732
* [X86] Simplify the predicates for avx2 masked gather patterns. | Craig Topper | 2017-11-21 | 1 | -33/+17
  We don't need a dyn_cast and we only need to check the type of the index. The base ptr is guaranteed to be scalar.
  llvm-svn: 318730
* [AMDGPU] Fix DAGTypeLegalizer::SplitInteger for shift amount type | Yaxun Liu | 2017-11-21 | 1 | -2/+9
  DAGTypeLegalizer::SplitInteger uses default pointer size as shift amount constant type, which causes less performant ISA in amdgcn---amdgiz target since the default pointer type is i64 whereas the desired shift amount type is i32.
  This patch fixes that by using TLI.getScalarShiftAmountTy in DAGTypeLegalizer::SplitInteger.
  The X86 change is necessary since splitting i512 requires shifting amount of 256, which cannot be held by i8.
  Differential Revision: https://reviews.llvm.org/D40148
  llvm-svn: 318727
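The X86 part of the change above comes down to bit-width arithmetic: splitting an i512 into two i256 halves shifts by 256, and an 8-bit shift-amount type tops out at 255. A minimal Python sketch of that arithmetic follows. The helper names are hypothetical illustrations and are not LLVM APIs.

```python
from typing import Tuple


def fits_unsigned(value: int, bits: int) -> bool:
    """Return True if `value` is representable as an unsigned `bits`-bit integer."""
    return 0 <= value < (1 << bits)


def split_integer(x: int, width: int) -> Tuple[int, int]:
    """Split a `width`-bit unsigned value into (lo, hi) halves of width // 2 bits."""
    half = width // 2
    lo = x & ((1 << half) - 1)
    hi = x >> half  # the shift amount equals `half`; for a 512-bit value it is 256
    return lo, hi


# The shift amount needed to split a 512-bit integer:
shift_amount = 512 // 2
print(fits_unsigned(shift_amount, 8))   # an 8-bit shift-amount type
print(fits_unsigned(shift_amount, 32))  # a 32-bit shift-amount type
```

The first check fails because 256 needs nine bits, which is the situation the commit message describes for i8.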
* Revert r318678 to fix Clang test | Richard Trieu | 2017-11-21 | 5 | -16/+20
  r318678 caused the Clang test CodeGen/ms-inline-asm.c to start failing.
  llvm-svn: 318710
* Fix spelling in comment. NFCI. | Simon Pilgrim | 2017-11-20 | 1 | -1/+1
  llvm-svn: 318687
* [X86] Avoid unecessary opsize byte in segment move to memory | Nirav Dave | 2017-11-20 | 5 | -20/+16
  Summary:
  Segment moves to memory are always 16-bit. Remove invalid 32 and 64 bit variants.
  Fixes PR34478.
  Reviewers: rnk, craig.topper
  Subscribers: llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D39847
  llvm-svn: 318678
* [LV][X86] Support of AVX2 Gathers code generation and update the LV with this | Mohammed Agabaria | 2017-11-20 | 5 | -46/+172
  This patch depends on: https://reviews.llvm.org/D35348
  Support of pattern selection of masked gathers of AVX2 (X86\AVX2 code gen)
  Update LoopVectorize to generate gathers for AVX2 processors.
  Reviewers: delena, zvi, RKSimon, craig.topper, aaboud, igorb
  Reviewed By: delena, RKSimon
  Differential Revision: https://reviews.llvm.org/D35772
  llvm-svn: 318641
* [X86] Add test cases for rndscaless/sd intrinsics. | Craig Topper | 2017-11-19 | 1 | -1/+1
  Also fix the memop in the ins for these instructions. Not sure what effect this has.
  llvm-svn: 318624
* [X86] Improve load folding of scalar rcp28 and rsqrt28 instructions using sse_load_f32/f64. | Craig Topper | 2017-11-19 | 1 | -3/+2
  llvm-svn: 318623
* [X86] Switch cannonlake to use the SkylakeServer scheduling model instead of Haswell. | Craig Topper | 2017-11-19 | 1 | -1/+1
  Cannonlake comes after skylake and supports avx512 so this is probably a closer model for now.
  llvm-svn: 318613
* [X86] Add skeleton support for icelake CPU. | Craig Topper | 2017-11-19 | 2 | -1/+14
  There are several patches out for review right now to implement Icelake features. This adds a CPU to collect them under.
  llvm-svn: 318612
* [X86] Fix 80 column violation and remove trailing whitespace. NFC | Craig Topper | 2017-11-19 | 1 | -7/+8
  llvm-svn: 318611
* [X86] Simplify the gather/scatter isel predicates. | Craig Topper | 2017-11-18 | 1 | -54/+27
  We don't need a dyn_cast; the predicate already specified the base node. We only need to check the type of the index; the base ptr is guaranteed to be scalar.
  llvm-svn: 318596
* [X86] Qualify a few places with ExperimentalVectorWideningLegalization. | Craig Topper | 2017-11-18 | 1 | -4/+8
  I'm playing around with this flag and these places cause errors if not qualified.
  llvm-svn: 318595
* [X86] Add todo comment for TRUNC(SUB(X,C)) -> SUB(TRUNC(X),C') | Simon Pilgrim | 2017-11-18 | 1 | -0/+1
  As discussed on PR35295, but it causes regressions in combineSubToSubus which need to be addressed first.
  llvm-svn: 318594
* [X86] Output cfi directives for saved XMM registers even if no GPRs are saved | Martin Storsjo | 2017-11-18 | 1 | -2/+1
  This makes sure that functions that only clobber xmm registers (on win64) also get the right cfi directives, if dwarf exceptions are enabled.
  Differential Revision: https://reviews.llvm.org/D40191
  llvm-svn: 318591
* [X86] Fix typo in variable name. NFC | Craig Topper | 2017-11-18 | 1 | -4/+4
  llvm-svn: 318590
* Fix a bunch more layering of CodeGen headers that are in Target | David Blaikie | 2017-11-17 | 22 | -27/+27
  All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).
  llvm-svn: 318490
* [X86] Add DAG combine to remove sext i32->i64 from gather/scatter instructions. | Craig Topper | 2017-11-16 | 1 | -1/+22
  Only do this pre-legalize in case we're using the sign extend to legalize for KNL. This recovers all of the tests that changed when I stopped SelectionDAGBuilder from deleting sign extends. There's more work that could be done here, particularly to fix the i8->i64 test case that experienced a split.
  llvm-svn: 318468
* [X86] Pre-truncate gather/scatter indices that have element sizes larger than 64-bits before Legalize. | Craig Topper | 2017-11-16 | 1 | -2/+19
  The wider element type will normally cause legalize to try to split and scalarize the gather/scatter, but we can't handle that. Instead, truncate the index early so the gather/scatter node is insulated from the legalization. This really shouldn't happen in practice since InstCombine will normalize index types to the same size as pointers.
  llvm-svn: 318452
* [X86] DAGCombinerInfo is in TargetLowering not X86TargetLowering. | Craig Topper | 2017-11-16 | 1 | -1/+1
  llvm-svn: 318451
* [TTI][X86] update costs of interleaved load\store of i64\double | Mohammed Agabaria | 2017-11-16 | 1 | -0/+6
  This patch contains a more accurate cost of interleaved load\store of stride 2 for the types int64\double on AVX2.
  Reviewers: delena, RKSimon, craig.topper, dorit
  Reviewed By: dorit
  Differential Revision: https://reviews.llvm.org/D40008
  llvm-svn: 318385
* [X86] Update TTI to report that v1iX/v1fX types aren't legal for masked gather/scatter/load/store. | Craig Topper | 2017-11-16 | 1 | -2/+10
  The type legalizer will try to scalarize these operations if it sees them, but there is no handling for scalarizing them. This leads to a fatal error. With this change they will now be scalarized by the mem intrinsic scalarizing pass before SelectionDAG.
  llvm-svn: 318380
* [X86] Custom type legalize v2f32 masked gathers instead of trying to cleanup after type legalization. | Craig Topper | 2017-11-16 | 1 | -26/+28
  llvm-svn: 318368
* [globalisel][tablegen] Generate rule coverage and use it to identify untested rules | Daniel Sanders | 2017-11-16 | 1 | -17/+20
  Summary:
  This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV, causes TableGen to instrument the generated table to collect rule coverage information. However, LLVM_ENABLE_GISEL_COV goes a bit further than LLVM_ENABLE_DAGISEL_COV. The information is written to files (${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will read this information and use it to emit warnings about untested rules.
  This technique could also be used by SelectionDAG and can be further extended to detect hot rules and give them priority over colder rules.
  Usage:
    * Enable LLVM_ENABLE_GISEL_COV in CMake
    * Build the compiler and run some tests
    * cat gisel-coverage-[0-9]* > gisel-coverage-all
    * Delete lib/Target/*/*GenGlobalISel.inc*
    * Build the compiler
  Known issues:
    * ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual step due to a lack of a portable 'cat' command. It should be the concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
    * There's no mechanism to discard coverage information when the ruleset changes
  Depends on D39742
  Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka
  Reviewed By: rovka
  Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D39747
  llvm-svn: 318356
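The manual concatenation step described in the commit above (`cat gisel-coverage-[0-9]* > gisel-coverage-all`) can be reproduced without a `cat` binary. The following is a minimal Python sketch, assuming the default `gisel-coverage` prefix quoted in the message and assuming that plain byte-for-byte concatenation is all that is required. The function name and the `build_dir` parameter are illustrative and are not part of LLVM or its build system.

```python
import glob
import os


def concat_coverage(build_dir: str, prefix: str = "gisel-coverage") -> str:
    """Concatenate <prefix>-[0-9]* files found in build_dir into <prefix>-all.

    Returns the path of the combined file. Existing output is overwritten.
    """
    out_path = os.path.join(build_dir, prefix + "-all")
    # Escape the directory and prefix so only the trailing part is a glob pattern.
    pattern = os.path.join(glob.escape(build_dir), glob.escape(prefix) + "-[0-9]*")
    parts = sorted(glob.glob(pattern))  # "-all" cannot match: 'a' is not a digit
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                out.write(src.read())
    return out_path
```

A call such as `concat_coverage("/path/to/build")` would stand in for the third usage step. The remaining steps (deleting the generated `.inc` files and rebuilding) are unchanged.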
* Add backend name to Target to enable runtime info to be fed back into TableGen | Daniel Sanders | 2017-11-15 | 1 | -2/+2
  Summary:
  Make it possible to feed runtime information back to tablegen to enable profile-guided tablegen-eration, detection of untested tablegen definitions, etc.
  Being a cross-compiler by nature, LLVM will potentially collect data for multiple architectures (e.g. when running 'ninja check'). We therefore need a way for TableGen to figure out what data applies to the backend it is generating at the time. This patch achieves that by including the name of the 'def X : Target ...' for the backend in the TargetRegistry.
  Reviewers: qcolombet
  Reviewed By: qcolombet
  Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev
  Differential Revision: https://reviews.llvm.org/D39742
  llvm-svn: 318352
* [X86] Add a return to the end of a switch to prevent an accidental fallthrough in the future. | Craig Topper | 2017-11-15 | 1 | -0/+1
  llvm-svn: 318330
* [X86] Add CBW/CDQ/CDQE/CQO/CWD/CWDE to WriteALU schedule class | Simon Pilgrim | 2017-11-15 | 2 | -32/+31
  Some CPUs are already overriding these sign extension instructions but we should be able to use the WriteALU schedule class by default.
  Differential Revision: https://reviews.llvm.org/D39899
  llvm-svn: 318308
* [X86] Redefine the 128-bit version of VPGATHERQD and VGATHERQPS to use a VK2 mask instead of a VK4 mask. | Craig Topper | 2017-11-15 | 3 | -14/+24
  This allows us to remove extra extend creation during lowering and more accurately reflects the semantics of the instruction.
  While there, add an extra output VT to the X86 masked gather node to better match the isel pattern predicate. Currently we're exploiting the fact that the isel table doesn't count how many output results a node actually has if the result type of any can be inferred from the first result and the type constraints defined in tablegen. I think we might ultimately want to lower all MGATHER/MSCATTER to an X86ISD node with the extra mask result and stop relying on this hole in the isel checking.
  llvm-svn: 318278
* [X86] Fix typo in comment. NFC | Craig Topper | 2017-11-14 | 1 | -2/+2
  llvm-svn: 318156
* Update some code.google.com links | Hans Wennborg | 2017-11-13 | 1 | -1/+1
  llvm-svn: 318115
* [X86] Allow X86ISD::Wrapper to be folded into the base of gather/scatter address | Craig Topper | 2017-11-13 | 1 | -20/+35
  If the base of our gather corresponds to something contained in X86ISD::Wrapper we should be able to fold it into the address.
  This patch refactors some of the address matching to more fully use the X86ISelAddressMode struct and the getAddressOperands helper. A new helper function matchVectorAddress is added to call matchWrapper or fall back to matchAddressBase.
  We should also be able to support constant offsets from a wrapper, but I'll look into that in a future patch. We may even be able to completely reuse matchAddress here, but I wanted to start simple and work up to it.
  Differential Revision: https://reviews.llvm.org/D39927
  llvm-svn: 318057
* [X86] test/testn intrinsics lowering to IR. llvm part. | Uriel Korach | 2017-11-13 | 1 | -24/+0
  Remove builtins from llvm and add AutoUpgrade support. Also add fast-isel tests for the TEST and TESTN instructions.
  Differential Revision: https://reviews.llvm.org/D38736
  llvm-svn: 318036
* [x86][AVX512] Lowering shuffle i/f intrinsics to LLVM IR | Jina Nahias | 2017-11-13 | 1 | -16/+0
  This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR.
  Differential Revision: https://reviews.llvm.org/D38671
  Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661
  llvm-svn: 318026
* [X86][SKX] Adding scheduling info of non-intrinsic + commutable SKX opcodes. | Gadi Haber | 2017-11-13 | 1 | -102/+102
  Updated the scheduling information of the SKX subtarget in the file X86SchedSkylakeServer.td under lib/Target/X86 to:
    1. add regular opcodes in addition to the suffixed "_Int" opcodes
    2. add the (V)MAXCPD/MAXCPS/MAXCSD/MAXCSS/MINCPD/MINCPS/MINCSD/MINCSS instructions that are equivalent to their counterparts without the 'C' as they are part of a hack to make floating point min/max commutable under fast math.
  Reviewers: zvi, RKSimon, craig.topper
  Differential Revision: https://reviews.llvm.org/D39833
  Change-Id: Ie13702a5ce1b1a08af91ca637a52b6962881e7d6
  llvm-svn: 318024
* [X86] Limit NOPs to 7 bytes when 'slm' is spelled 'silvermont'. | Craig Topper | 2017-11-13 | 1 | -1/+1
  We support 2 spellings for silvermont and we should accept both here.
  llvm-svn: 318023
* [X86] Use sse_load_f32/f64 to improve load folding of scalar vfscalefss/sd, vrcp14ss/sd, rsqrt14ss/sd instructions. | Craig Topper | 2017-11-13 | 1 | -5/+4
  llvm-svn: 318022
* [X86] Use sse_load_f32/f64 to improve load folding for scalar VFPCLASS intrinsics. | Craig Topper | 2017-11-13 | 1 | -4/+4
  llvm-svn: 318019
* [X86] Fix SQRTSS/SQRTSD/RCPSS/RCPSD intrinsics to use sse_load_f32/sse_load_f64 to increase load folding opportunities. | Craig Topper | 2017-11-13 | 2 | -10/+13
  llvm-svn: 318016
* [X86] Attempt to fix signed and unsigned comparison warning. | Craig Topper | 2017-11-13 | 1 | -2/+2
  llvm-svn: 318010
* [X86] Use sse_load_f32/f64 in patterns for the memory forms of VRNDSCALESS/SD. | Craig Topper | 2017-11-13 | 1 | -3/+2
  llvm-svn: 318009
* [X86] Use EVEX encoded VRNDSCALE instructions to implement the legacy round intrinsics. | Craig Topper | 2017-11-13 | 4 | -29/+55
  The VRNDSCALE instructions implement a superset of the (V)ROUND instructions. They are equivalent if the upper 4-bits of the immediate are 0.
  This patch lowers the legacy intrinsics to the VRNDSCALE ISD node and masks the upper bits of the immediate to 0. This allows us to take advantage of the larger register encoding space.
  We should maybe consider converting VRNDSCALE back to VROUND in the EVEX to VEX pass if the extended registers are not being used.
  I notice some load folding opportunities being missed for the VRNDSCALESS/SD instructions that I'll try to fix in future patches.
  llvm-svn: 318008
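The immediate handling described in the commit above can be illustrated with a small bit-mask sketch. It assumes an 8-bit immediate and only shows the "clear the upper 4 bits" step stated in the message. It does not model the rounding behaviour of either instruction, and the function names are hypothetical rather than taken from the LLVM sources.

```python
def mask_round_immediate(imm: int) -> int:
    """Clear the upper 4 bits of an 8-bit immediate, keeping the low 4 bits."""
    return imm & 0x0F


def upper_bits_are_zero(imm: int) -> bool:
    """True when the upper 4 bits of an 8-bit immediate are already 0."""
    return (imm & 0xF0) == 0


# An immediate with stray upper bits is reduced to its low nibble.
masked = mask_round_immediate(0xF3)
print(hex(masked), upper_bits_are_zero(masked))
```

After masking, the equivalence condition quoted in the message (upper 4 bits equal to 0) holds by construction.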
* [X86] Split VRNDSCALE/VREDUCE/VGETMANT/VRANGE ISD nodes into versions with and without the rounding operand. NFCI | Craig Topper | 2017-11-13 | 5 | -99/+157
  I want to reuse the VRNDSCALE node for the legacy SSE rounding intrinsics so that those intrinsics can use EVEX instructions. All of these nodes share tablegen multiclasses so I split them all so that they all remain similar in their implementations.
  llvm-svn: 318007