summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
* [X86][SNB] Remove unnecessary CVT InstRW overridesSimon Pilgrim2018-05-161-82/+24
| | | | llvm-svn: 332536
* [X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441)Simon Pilgrim2018-05-161-16/+39
| | | | | | | | | | As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure. Some of this ideally would be done by combineTargetShuffle but its tricky to do as most of the shuffles are sharing inputs. Differential Revision: https://reviews.llvm.org/D46959 llvm-svn: 332524
* [X86] Fix typo in instregex for CVTSI642SDrrSimon Pilgrim2018-05-161-1/+1
| | | | llvm-svn: 332510
* [X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on ↵Craig Topper2018-05-161-8/+62
| | | | | | | | | | | | 32-bit targets As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead. Fixes PR3163. Differential Revision: https://reviews.llvm.org/D43441 llvm-svn: 332498
* [X86] Split WriteCvtI2F/WriteCvtF2I into I<->F32 and I<->F64 scheduler classesSimon Pilgrim2018-05-1614-359/+338
| | | | | | A lot of the models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first llvm-svn: 332451
* Remove unused variable introduced in r332336Mikael Holmen2018-05-161-1/+1
| | | | | | | | | | | The unused variable caused a compilation warning: ../lib/Target/X86/X86ISelLowering.cpp:34614:17: error: unused variable 'SMax' [-Werror,-Wunused-variable] if (SDValue SMax = MatchMinMax(SMin, ISD::SMAX, C1)) ^ 1 error generated. llvm-svn: 332431
* [x86][eflags] Fix PR37431 by teaching the EFLAGS copy lowering toChandler Carruth2018-05-151-3/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | specially handle SETB_C* pseudo instructions. Summary: While the logic here is somewhat similar to the arithmetic lowering, it is different enough that it made sense to have its own function. I actually tried a bunch of different optimizations here and none worked well so I gave up and just always do the arithmetic based lowering. Looking at code from the PR test case, we actually pessimize a bunch of code when generating these. Because SETB_C* pseudo instructions clobber EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple SETB_C* pseudos from a single set of EFLAGS. This in turn causes the lowering code to ruin all the clever code generation that SETB_C* was hoping to achieve. None of this is needed. Whenever we're generating multiple SETB_C* instructions from a single set of EFLAGS we should instead generate a single maximally wide one and extract subregs for all the different desired widths. That would result in substantially better code generation. But this patch doesn't attempt to address that. The test case from the PR is included as well as more directed testing of the specific lowering pattern used for these pseudos. Reviewers: craig.topper Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D46799 llvm-svn: 332389
* [X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classesSimon Pilgrim2018-05-1512-136/+181
| | | | | | | | BtVer2 - Fixes schedules for (V)CVTPS2PD instructions A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first llvm-svn: 332376
* [X86] Split off F16C WriteCvtPH2PS/WriteCvtPS2PH scheduler classesSimon Pilgrim2018-05-1512-139/+121
| | | | | | | | | Btver2 - VCVTPH2PSYrm needs to double pump the AGU Broadwell - missing VCVTPS2PH*mr stores extra latency Allows us to remove the WriteCvtF2FSt conversion store class llvm-svn: 332357
* [X86] Improve unsigned saturation downconvert detection.Artur Gainullin2018-05-151-19/+52
| | | | | | | | | | | | | | | | | | | | | | | Summary: New unsigned saturation downconvert patterns detection was implemented in X86 Codegen: (truncate (smin (smax (x, C1), C2)) to dest_type), where C1 >= 0 and C2 is unsigned max of destination type. (truncate (smax (smin (x, C2), C1)) to dest_type) where C1 >= 0, C2 is unsigned max of destination type and C1 <= C2. These two patterns are equivalent to: (truncate (umin (smax(x, C1), unsigned_max_of_dest_type)) to dest_type) Reviewers: RKSimon Subscribers: llvm-commits, a.elovikov Differential Revision: https://reviews.llvm.org/D45315 llvm-svn: 332336
* [X86] Add NT load/store scheduler classesSimon Pilgrim2018-05-1413-84/+148
| | | | llvm-svn: 332274
* [X86] Remove and autoupgrade avx512.vbroadcast.ss/avx512.vbroadcast.sd ↵Craig Topper2018-05-141-5/+0
| | | | | | intrinsics. llvm-svn: 332271
* [X86][BtVer2] Fix MMX/YMM integer vector nt store schedulesSimon Pilgrim2018-05-141-2/+8
| | | | | | MMX was missing and YMM was tagged as a fp nt store llvm-svn: 332269
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-1416-127/+142
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* [X86] Cleanup a multiclass that doesn't need as many parameters after recent ↵Craig Topper2018-05-141-25/+12
| | | | | | intrinsic removals. llvm-svn: 332207
* [X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use ↵Craig Topper2018-05-141-7/+0
| | | | | | uitofp+insertelement instead. llvm-svn: 332206
* [X86] Add patterns for combining movss+uint_to_fp into the intrinsic ↵Craig Topper2018-05-131-0/+40
| | | | | | | | instructions under AVX512. This matches what we do for sint_to_fp. llvm-svn: 332205
* [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics.Craig Topper2018-05-131-4/+0
| | | | llvm-svn: 332198
* [X86] Add some load folding patterns for cvtsi2ss/sd into intrinsic ↵Craig Topper2018-05-132-0/+60
| | | | | | instructions. llvm-svn: 332189
* [X86] Remove an autoupgrade legacy cvtss2sd intrinsics.Craig Topper2018-05-131-13/+7
| | | | llvm-svn: 332187
* [X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what ↵Craig Topper2018-05-122-25/+13
| | | | | | clang has used for a very long time. llvm-svn: 332186
* [x86] Remove a comment obviated by r330269. Should have deleted theChandler Carruth2018-05-121-5/+0
| | | | | | | | comment in the same revision but missed it. Thanks to Dimitry Andric for catching this! llvm-svn: 332177
* Clear converters map after X86 Domain Reassignment to avoid crashesDimitry Andric2018-05-121-2/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: As reported in PR37264, in some cases the X86 Domain Reassignment `runOnMachineFunction()` is called twice. Because it only deletes the `.second` members of its `InstrConverterBaseMap`, and does not clean up the map itself, this can lead to double frees and crashes. Use `DeleteContainerSeconds()` instead, so the `Converters` map can safely be reinitialized and its members re-deleted for each X86 Domain Reassignment pass. Reviewers: guyblank, craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46425 llvm-svn: 332176
* [X86] Add WriteFCMOV scheduler class for x87 CMOVsSimon Pilgrim2018-05-1211-16/+15
| | | | llvm-svn: 332173
* [X86] Remove some unused masked conversion intrinsics that can be replaced ↵Craig Topper2018-05-121-18/+0
| | | | | | | | with an older intrinsic and a select. This is what clang already uses. llvm-svn: 332170
* [X86] Remove and autoupgrade a bunch of FMA instrinsics that are no longer ↵Craig Topper2018-05-111-22/+0
| | | | | | used by clang. llvm-svn: 332146
* [X86][BtVer2] Model ymm move as double pumped instructionsSimon Pilgrim2018-05-111-7/+7
| | | | | | We still need to handle mmx/xmm moves as 'decode-only' no-pipe instructions llvm-svn: 332109
* [X86][MMX] Tag MMX Move/Load/Store as WriteVec schedule classesSimon Pilgrim2018-05-115-17/+5
| | | | | | Fixes an issue on SLM/Btver2 where we had instructions were being treated as scalar loads/stores llvm-svn: 332104
* [X86][SLM] Vector stores only use the MEC port.Simon Pilgrim2018-05-111-10/+10
| | | | | | | | Confirmed by both Agner and Intel's AOM - the IEC/FPC are not required for pure load/stores (even if its a partial update). Can't fix WriteStore until all RMW instructions are cleaned up though.... llvm-svn: 332096
* [X86] Split WriteF/WriteVec Move/Load/Store scheduler classes by vector widthSimon Pilgrim2018-05-1110-80/+137
| | | | | | Fixes a SNB issue that was missing vlddqu/vmovntdqa ymm instructions llvm-svn: 332094
* [X86] Added scheduler helper classes to split move/load/store by sizeSimon Pilgrim2018-05-114-198/+261
| | | | | | Nothing uses this yet but this will allow us to specialize MMX/XMM/YMM/ZMM vector moves. llvm-svn: 332090
* [X86] Remove and autoupgrade the avx512.mask.store.ss intrinsic.Craig Topper2018-05-111-4/+0
| | | | llvm-svn: 332079
* [X86] Add new patterns for masked scalar load/store to match clang's codegen ↵Craig Topper2018-05-101-0/+117
| | | | | | | | | | | | from r331958. Clang's codegen now uses 128-bit masked load/store intrinsics in IR. The backend will widen to 512-bits on AVX512F targets. So this patch adds patterns to detect codegen's widening and patterns for AVX512VL that don't get widened. We may be able to drop some of the old patterns, but I leave that for a future patch. llvm-svn: 332049
* [X86] Initialize HasPTWRITE member of X86SubtargetGabor Buella2018-05-101-0/+1
| | | | | | | This was missing from r331961. Caught by sanitizer bots. llvm-svn: 332024
* [X86] Convert/Merge more instregex patterns to reduce InstrRW compile time.Simon Pilgrim2018-05-106-243/+162
| | | | | | Use instrs lists or merge multiple instregex patterns. llvm-svn: 332022
* [X86][Znver1] Remove unnecessary SchedWritePMULLD InstRW overrides.Simon Pilgrim2018-05-101-17/+2
| | | | llvm-svn: 332006
* [X86][SNB] Fix typo in PEXTRDmr instregex, was missing VPEXTRDmr.Simon Pilgrim2018-05-101-4/+2
| | | | llvm-svn: 332002
* [X86] Split ↵Simon Pilgrim2018-05-1012-341/+136
| | | | | | | | WriteVecALU/WriteVecLogic/WriteShuffle/WriteVarShuffle/WritePSADBW/WritePHAdd scheduler classes Split off XMM classes from the default (MMX) classes. llvm-svn: 331999
* [x86] fix fmaxnum/fminnum with nnanSanjay Patel2018-05-101-9/+13
| | | | | | | | | | | | | | | | | | | | | | With nnan, there's no need for the masked merge / blend sequence (that probably costs much more than the min/max instruction). Somewhere between clang 5.0 and 6.0, we started producing these intrinsics for fmax()/fmin() in C source instead of libcalls or fcmp/select. The backend wasn't prepared for that, so we regressed perf in those cases. Note: it's possible that other targets have similar problems as seen here. Noticed while investigating PR37403 and related bugs: https://bugs.llvm.org/show_bug.cgi?id=37403 The IR FMF propagation cases still don't work. There's a proposal that might fix those cases in D46563. llvm-svn: 331992
* [X86] ptwrite intrinsicGabor Buella2018-05-104-13/+25
| | | | | | | | | | Reviewers: craig.topper, RKSimon Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D46539 llvm-svn: 331961
* [X86] Fix Broadwell's Shuffle256 schedule classes load latency values.Simon Pilgrim2018-05-091-21/+4
| | | | | | Allows us to remove some unnecessary InstRW overrides. llvm-svn: 331913
* [X86] Merge instregex patterns to reduce InstrRW compile time.Simon Pilgrim2018-05-095-1052/+365
| | | | llvm-svn: 331911
* [X86] Cleanup WriteFStore/WriteVecStore schedulesSimon Pilgrim2018-05-095-121/+10
| | | | | | | | MOVNTPD/MOVNTPS should be WriteFStore Standardized BDW/HSW/SKL/SKX WriteFStore/WriteVecStore - fixes some missed instregex patterns. (V)MASKMOVDQU was already using the default, its costs gets increased but is still nowhere near the real cost of that nasty instruction.... llvm-svn: 331864
* [X86] Combine (vXi1 (bitcast (-1)))) and (vXi1 (bitcast (0))) to all ones or ↵Craig Topper2018-05-091-0/+10
| | | | | | all zeros vXi1 vector. llvm-svn: 331847
* [DebugInfo] Examine all uses of isDebugValue() for debug instructions.Shiva Chen2018-05-097-16/+16
| | | | | | | | | | | | | | | | | | Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844
* Revert "[X86][CET] Shadow stack fix for setjmp/longjmp"Jessica Paquette2018-05-082-251/+5
| | | | | | | | | | | | This reverts commit 30962eca38ef02666ebcdded72a94f2cd0292d68. This commit has been causing test asan failures on a build bot. http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/45108/ Original commit: https://reviews.llvm.org/D46181 llvm-svn: 331813
* [X86] Tag PCONFIG instruction with WriteSystem scheduler classSimon Pilgrim2018-05-081-0/+2
| | | | llvm-svn: 331773
* [X86] Split off WriteIMul64 from WriteIMul schedule class (PR36931)Simon Pilgrim2018-05-0811-105/+84
| | | | | | | This fixes a couple of BtVer2 missing instructions that weren't been handled in the override. NOTE: There are still a lot of overrides that still need cleaning up! llvm-svn: 331770
* [X86] Split WriteIDiv into div/idiv 8/16/32/64 implementations (PR36930)Simon Pilgrim2018-05-0811-130/+113
| | | | | | | I've created the necessary classes but there are still a lot of overrides that need cleaning up. NOTE: The Znver1 model was missing some div/idiv variants in the instregex patterns and wasn't setting the resource cycles at all in the overrides. llvm-svn: 331767
* [X86] Add vector masked load/store scheduler classes (PR32857)Simon Pilgrim2018-05-0811-247/+163
| | | | | | Split off from existing vector load/store classes to remove InstRW overrides. llvm-svn: 331760
OpenPOWER on IntegriCloud