| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 332536
|
| |
|
|
|
|
|
|
|
|
| |
As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.
Ideally some of this would be done by combineTargetShuffle, but it's tricky to do as most of the shuffles share inputs.
Differential Revision: https://reviews.llvm.org/D46959
llvm-svn: 332524
|
| |
|
|
| |
llvm-svn: 332510
|
| |
|
|
|
|
|
|
|
|
|
|
| |
32-bit targets
As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead.
Fixes PR3163.
Differential Revision: https://reviews.llvm.org/D43441
llvm-svn: 332498
|
| |
|
|
|
|
| |
A lot of the models still have too many InstRW overrides for these new classes. This needs cleaning up, but I wanted to get the classes in first.
llvm-svn: 332451
|
| |
|
|
|
|
|
|
|
|
|
| |
The unused variable caused a compilation failure when building with -Werror:
../lib/Target/X86/X86ISelLowering.cpp:34614:17: error: unused variable 'SMax' [-Werror,-Wunused-variable]
if (SDValue SMax = MatchMinMax(SMin, ISD::SMAX, C1))
^
1 error generated.
llvm-svn: 332431
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
specially handle SETB_C* pseudo instructions.
Summary:
While the logic here is somewhat similar to the arithmetic lowering, it
is different enough that it made sense to have its own function.
I actually tried a bunch of different optimizations here and none worked
well so I gave up and just always do the arithmetic based lowering.
Looking at code from the PR test case, we actually pessimize a bunch of
code when generating these. Because SETB_C* pseudo instructions clobber
EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple
SETB_C* pseudos from a single set of EFLAGS. This in turn causes the
lowering code to ruin all the clever code generation that SETB_C* was
hoping to achieve. None of this is needed. Whenever we're generating
multiple SETB_C* instructions from a single set of EFLAGS we should
instead generate a single maximally wide one and extract subregs for all
the different desired widths. That would result in substantially better
code generation. But this patch doesn't attempt to address that.
The test case from the PR is included as well as more directed testing
of the specific lowering pattern used for these pseudos.
Reviewers: craig.topper
Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D46799
llvm-svn: 332389
|
| |
|
|
|
|
|
|
| |
BtVer2 - Fixes schedules for (V)CVTPS2PD instructions
A lot of the Intel models still have too many InstRW overrides for these new classes. This needs cleaning up, but I wanted to get the classes in first.
llvm-svn: 332376
|
| |
|
|
|
|
|
|
|
| |
Btver2 - VCVTPH2PSYrm needs to double pump the AGU
Broadwell - missing VCVTPS2PH*mr stores extra latency
Allows us to remove the WriteCvtF2FSt conversion store class
llvm-svn: 332357
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
New unsigned saturation downconvert patterns detection was implemented in
X86 Codegen:
(truncate (smin (smax (x, C1), C2)) to dest_type),
where C1 >= 0 and C2 is unsigned max of destination type.
(truncate (smax (smin (x, C2), C1)) to dest_type)
where C1 >= 0, C2 is unsigned max of destination type and C1 <= C2.
These two patterns are equivalent to:
(truncate (umin (smax(x, C1), unsigned_max_of_dest_type)) to dest_type)
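The equivalence of the three forms can be checked numerically. The following is a minimal sketch, assuming a 32-bit signed source and an 8-bit unsigned destination (so C2 is 255); the function names are hypothetical and are not taken from the X86 backend.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumption: i32 source, u8 destination, so C2 (unsigned max) is 255.
constexpr std::int32_t kUnsignedMax = 255;

// Pattern 1: truncate(smin(smax(x, C1), C2)), with C1 >= 0.
inline std::uint8_t truncSMinSMax(std::int32_t x, std::int32_t c1) {
  return static_cast<std::uint8_t>(std::min(std::max(x, c1), kUnsignedMax));
}

// Pattern 2: truncate(smax(smin(x, C2), C1)), with 0 <= C1 <= C2.
inline std::uint8_t truncSMaxSMin(std::int32_t x, std::int32_t c1) {
  return static_cast<std::uint8_t>(std::max(std::min(x, kUnsignedMax), c1));
}

// Canonical form: truncate(umin(smax(x, C1), unsigned_max_of_dest_type)).
inline std::uint8_t truncUMinSMax(std::int32_t x, std::int32_t c1) {
  // smax with a non-negative C1 guarantees a non-negative value,
  // so reinterpreting it as unsigned is lossless.
  const auto clamped = static_cast<std::uint32_t>(std::max(x, c1));
  return static_cast<std::uint8_t>(
      std::min(clamped, static_cast<std::uint32_t>(kUnsignedMax)));
}

// Returns true if all three forms agree for every x in [lo, hi].
inline bool patternsAgree(std::int32_t lo, std::int32_t hi, std::int32_t c1) {
  for (std::int64_t i = lo; i <= hi; ++i) {
    const auto x = static_cast<std::int32_t>(i);
    const std::uint8_t expected = truncUMinSMax(x, c1);
    if (truncSMinSMax(x, c1) != expected || truncSMaxSMin(x, c1) != expected)
      return false;
  }
  return true;
}
```

Calling `patternsAgree(-1000, 1000, 16)` exercises values below C1, inside the range, and above the unsigned maximum. The constraint C1 <= C2 matters for the second pattern: without it, the outer smax could push the result above the destination range.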
Reviewers: RKSimon
Subscribers: llvm-commits, a.elovikov
Differential Revision: https://reviews.llvm.org/D45315
llvm-svn: 332336
|
| |
|
|
| |
llvm-svn: 332274
|
| |
|
|
|
|
| |
intrinsics.
llvm-svn: 332271
|
| |
|
|
|
|
| |
MMX was missing and YMM was tagged as a fp nt store
llvm-svn: 332269
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manual change to the docs, as the regex doesn't match them.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
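The transition-period aliasing can be illustrated with a simplified stand-in. This is only a sketch: the macro bodies, the `DebugEnabled` flag, and the `DebugLog` string are hypothetical and do not reproduce LLVM's actual definitions in its debug support header.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the real debug flag and output stream.
static bool DebugEnabled = false;
static std::string DebugLog;

// Simplified form of the new, prefixed macro: run X only when enabled.
#define LLVM_DEBUG(X)                                                          \
  do {                                                                         \
    if (DebugEnabled) {                                                        \
      X;                                                                       \
    }                                                                          \
  } while (false)

// Transition alias: old call sites keep compiling and behave identically.
#define DEBUG(X) LLVM_DEBUG(X)

// Exercises both spellings and returns what was logged.
inline std::string runBothSpellings() {
  DebugLog.clear();
  DebugEnabled = true;
  DEBUG(DebugLog += "old;");      // legacy spelling, forwarded by the alias
  LLVM_DEBUG(DebugLog += "new;"); // new spelling
  DebugEnabled = false;
  LLVM_DEBUG(DebugLog += "skipped;"); // not executed while disabled
  return DebugLog;
}
```

Because the alias simply forwards to the prefixed macro, out-of-tree code can migrate at its own pace, and the unprefixed name can later be dropped without changing behavior.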
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
|
| |
|
|
|
|
| |
intrinsic removals.
llvm-svn: 332207
|
| |
|
|
|
|
| |
uitofp+insertelement instead.
llvm-svn: 332206
|
| |
|
|
|
|
|
|
| |
instructions under AVX512.
This matches what we do for sint_to_fp.
llvm-svn: 332205
|
| |
|
|
| |
llvm-svn: 332198
|
| |
|
|
|
|
| |
instructions.
llvm-svn: 332189
|
| |
|
|
| |
llvm-svn: 332187
|
| |
|
|
|
|
| |
clang has used for a very long time.
llvm-svn: 332186
|
| |
|
|
|
|
|
|
| |
comment in the same revision but missed it.
Thanks to Dimitry Andric for catching this!
llvm-svn: 332177
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As reported in PR37264, in some cases the X86 Domain Reassignment
`runOnMachineFunction()` is called twice. Because it only deletes the
`.second` members of its `InstrConverterBaseMap`, and does not clean up
the map itself, this can lead to double frees and crashes.
Use `DeleteContainerSeconds()` instead, so the `Converters` map can
safely be reinitialized and its members re-deleted for each X86 Domain
Reassignment pass.
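The fix pattern can be sketched outside of LLVM as follows. This is a self-contained analog under stated assumptions: `Converter`, `ConverterMap`, `initConverters`, and `deleteSecondsAndClear` are hypothetical names, with the last one mimicking what `DeleteContainerSeconds()` does (delete each `.second`, then clear the container).

```cpp
#include <cassert>
#include <map>

// Hypothetical converter type that tracks how many instances are alive.
struct Converter {
  static int Live;
  Converter() { ++Live; }
  ~Converter() { --Live; }
};
int Converter::Live = 0;

using ConverterMap = std::map<int, Converter *>;

// Analog of DeleteContainerSeconds(): free every mapped value and then
// clear the map, so no dangling pointers survive into the next run.
inline void deleteSecondsAndClear(ConverterMap &M) {
  for (auto &KV : M)
    delete KV.second;
  M.clear();
}

// Hypothetical per-run initialization of the converter table.
inline void initConverters(ConverterMap &M) {
  M.emplace(0, new Converter());
  M.emplace(1, new Converter());
}

// Simulates the pass running twice over the same map; returns the number
// of converters still alive afterwards.
inline int runPassTwice() {
  ConverterMap Converters;
  for (int Run = 0; Run < 2; ++Run) {
    initConverters(Converters);
    deleteSecondsAndClear(Converters);
  }
  return Converter::Live;
}
```

If the map were not cleared, the second `emplace` for an existing key would leave the stale, already-deleted pointer in place and leak the new object, and the following cleanup would delete the stale pointer again. Clearing the map after deletion avoids both problems.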
Reviewers: guyblank, craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46425
llvm-svn: 332176
|
| |
|
|
| |
llvm-svn: 332173
|
| |
|
|
|
|
|
|
| |
with an older intrinsic and a select.
This is what clang already uses.
llvm-svn: 332170
|
| |
|
|
|
|
| |
used by clang.
llvm-svn: 332146
|
| |
|
|
|
|
| |
We still need to handle mmx/xmm moves as 'decode-only' no-pipe instructions
llvm-svn: 332109
|
| |
|
|
|
|
| |
Fixes an issue on SLM/Btver2 where instructions were being treated as scalar loads/stores
llvm-svn: 332104
|
| |
|
|
|
|
|
|
| |
Confirmed by both Agner and Intel's AOM: the IEC/FPC are not required for pure loads/stores (even if it's a partial update).
Can't fix WriteStore until all RMW instructions are cleaned up, though.
llvm-svn: 332096
|
| |
|
|
|
|
| |
Fixes an SNB issue where the vlddqu/vmovntdqa ymm instructions were missing
llvm-svn: 332094
|
| |
|
|
|
|
| |
Nothing uses this yet but this will allow us to specialize MMX/XMM/YMM/ZMM vector moves.
llvm-svn: 332090
|
| |
|
|
| |
llvm-svn: 332079
|
| |
|
|
|
|
|
|
|
|
|
|
| |
from r331958.
Clang's codegen now uses 128-bit masked load/store intrinsics in IR. The backend will widen to 512-bits on AVX512F targets.
So this patch adds patterns to detect codegen's widening and patterns for AVX512VL that don't get widened.
We may be able to drop some of the old patterns, but I leave that for a future patch.
llvm-svn: 332049
|
| |
|
|
|
|
|
| |
This was missing from r331961.
Caught by sanitizer bots.
llvm-svn: 332024
|
| |
|
|
|
|
| |
Use instrs lists or merge multiple instregex patterns.
llvm-svn: 332022
|
| |
|
|
| |
llvm-svn: 332006
|
| |
|
|
| |
llvm-svn: 332002
|
| |
|
|
|
|
|
|
| |
WriteVecALU/WriteVecLogic/WriteShuffle/WriteVarShuffle/WritePSADBW/WritePHAdd scheduler classes
Split off XMM classes from the default (MMX) classes.
llvm-svn: 331999
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With nnan, there's no need for the masked merge / blend
sequence (that probably costs much more than the min/max
instruction).
Somewhere between clang 5.0 and 6.0, we started producing
these intrinsics for fmax()/fmin() in C source instead of
libcalls or fcmp/select. The backend wasn't prepared for
that, so we regressed perf in those cases.
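Why nnan removes the need for the blend can be shown with a scalar model. The sketch below assumes the documented behavior of the x86 max instructions, which return the second operand whenever either input is NaN; the function names are hypothetical, and the fix-up shown is one possible way to recover fmax() semantics, not necessarily the exact sequence the backend emits.

```cpp
#include <cassert>
#include <cmath>

// Models the operand selection of an x86 scalar max instruction: the
// second operand is returned when the comparison is false, which includes
// every case where either input is NaN.
inline float x86Max(float a, float b) { return a > b ? a : b; }

// fmax()-style result: a NaN input must be ignored in favor of the other
// operand, so an extra NaN test and select (the "blend") is required.
inline float fmaxWithFixup(float a, float b) {
  const float raw = x86Max(a, b);
  return std::isnan(b) ? a : raw;
}

// With nnan the inputs are assumed never to be NaN, so the bare max
// instruction already produces the right answer.
inline float fmaxNoNaNs(float a, float b) { return x86Max(a, b); }
```

For non-NaN inputs both functions return the same value, so the fix-up is pure overhead in that case. When a NaN does appear, `fmaxNoNaNs(1.0f, NAN)` yields NaN while `fmaxWithFixup(1.0f, NAN)` yields 1.0, which is why the shorter form is only valid under nnan.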
Note: it's possible that other targets have similar problems
as seen here.
Noticed while investigating PR37403 and related bugs:
https://bugs.llvm.org/show_bug.cgi?id=37403
The IR FMF propagation cases still don't work. There's
a proposal that might fix those cases in D46563.
llvm-svn: 331992
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: craig.topper, RKSimon
Reviewed By: craig.topper, RKSimon
Differential Revision: https://reviews.llvm.org/D46539
llvm-svn: 331961
|
| |
|
|
|
|
| |
Allows us to remove some unnecessary InstRW overrides.
llvm-svn: 331913
|
| |
|
|
| |
llvm-svn: 331911
|
| |
|
|
|
|
|
|
| |
MOVNTPD/MOVNTPS should be WriteFStore
Standardized BDW/HSW/SKL/SKX WriteFStore/WriteVecStore - fixes some missed instregex patterns. (V)MASKMOVDQU was already using the default; its cost gets increased but is still nowhere near the real cost of that nasty instruction.
llvm-svn: 331864
|
| |
|
|
|
|
| |
all zeros vXi1 vector.
llvm-svn: 331847
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because we create a new kind of debug instruction, DBG_LABEL, we need to
review all passes that use isDebugValue() to check whether a MachineInstr
is a debug instruction. When skipping debug instructions, we should skip
both DBG_VALUE and DBG_LABEL. So this patch adds a new function,
isDebugInstr(), to MachineInstr to check whether the MachineInstr is a
debug instruction.
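The shape of the new predicate can be sketched with a toy instruction type. This is an illustration only: `Opcode`, `Instr`, and `countRealInstrs` are hypothetical, and only the predicate names mirror the ones mentioned in this commit message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily reduced opcode set.
enum class Opcode { DBG_VALUE, DBG_LABEL, ADD, MOV };

// Toy stand-in for a machine instruction.
struct Instr {
  Opcode Op;

  bool isDebugValue() const { return Op == Opcode::DBG_VALUE; }
  bool isDebugLabel() const { return Op == Opcode::DBG_LABEL; }

  // Single predicate covering every kind of debug instruction, so that
  // passes do not need to be revisited when another kind is added.
  bool isDebugInstr() const { return isDebugValue() || isDebugLabel(); }
};

// Example of a pass-style query that must ignore all debug instructions.
inline int countRealInstrs(const std::vector<Instr> &Block) {
  int Count = 0;
  for (const Instr &I : Block)
    if (!I.isDebugInstr())
      ++Count;
  return Count;
}
```

A pass that only tested `isDebugValue()` would count the DBG_LABEL entry as a real instruction, which could make code generation differ depending on whether debug info is present.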
This patch has no new test case. I have run the regression tests and
there is no difference in the results.
Differential Revision: https://reviews.llvm.org/D45342
Patch by Hsiangkai Wang.
llvm-svn: 331844
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 30962eca38ef02666ebcdded72a94f2cd0292d68.
This commit has been causing ASan test failures on a build bot.
http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/45108/
Original commit: https://reviews.llvm.org/D46181
llvm-svn: 331813
|
| |
|
|
| |
llvm-svn: 331773
|
| |
|
|
|
|
|
| |
This fixes a couple of BtVer2 missing instructions that weren't being handled in the override.
NOTE: There are still a lot of overrides that still need cleaning up!
llvm-svn: 331770
|
| |
|
|
|
|
|
| |
I've created the necessary classes but there are still a lot of overrides that need cleaning up.
NOTE: The Znver1 model was missing some div/idiv variants in the instregex patterns and wasn't setting the resource cycles at all in the overrides.
llvm-svn: 331767
|
| |
|
|
|
|
| |
Split off from existing vector load/store classes to remove InstRW overrides.
llvm-svn: 331760
|