| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
|
|
|
|
| |
instructions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm not sure what the effect of this change will be on all of the affected
tests or a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc
The costs are based on reciprocal throughput numbers in Agner's tables for
PEXTR*; these appear to be very slow ops on Silvermont.
This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605
Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.
Differential Revision: https://reviews.llvm.org/D70607
|
|
|
|
|
|
|
| |
This fixes PR44129, which was broken in a7adc3185b (in 7.0.0
and newer).
Differential Revision: https://reviews.llvm.org/D70741
|
|
|
|
|
|
|
|
|
|
| |
This is the following patch of D68854.
This patch adds basic operations of X87 instructions, including +, -, *, / , fp extensions and fp truncations.
Patch by Chen Liu(LiuChen3)
Differential Revision: https://reviews.llvm.org/D68857
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MVE has a basic symmetry between it's normal loads/store operations and
the masked variants. This means that masked loads and stores can use
pre-inc and post-inc addressing modes, just like the standard loads and
stores already do.
To enable that, this patch adds all the relevant infrastructure for
treating masked loads/stores addressing modes in the same way as normal
loads/stores.
This involves:
- Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra
Offset operand that is added after the PtrBase.
- Extending the IndexedModeActions from 8bits to 16bits to store the
legality of masked operations as well as normal ones. This array is
fairly small, so doubling the size still won't make it very large.
Offset masked loads can then be controlled with
setIndexedMaskedLoadAction, similar to standard loads.
- The same methods that combine to indexed loads, such as
CombineToPostIndexedLoadStore, are adjusted to handle masked loads in
the same way.
- The ARM backend is then adjusted to make use of these indexed masked
loads/stores.
- The X86 backend is adjusted to hopefully be no functional changes.
Differential Revision: https://reviews.llvm.org/D70176
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix for PR24072:
X86 instructions jrcxz/jecxz/jcxz performs short jumps if rcx/ecx/cx register is 0
The maximum relative offset for a forward short jump is 127 Bytes (0x7F).
The maximum relative offset for a backward short jump is 128 Bytes (0x80).
Gnu assembler warns when the distance of the jump exceeds the maximum but llvm-as does not.
Patch by Konstantin Belochapka and Alexey Lapshin
Differential Revision: https://reviews.llvm.org/D70652
|
|
|
|
|
|
|
|
|
|
| |
Returning SDValue() means we didn't handle it and the common
code should try to expand it. But its a target intrinsic so
expanding won't do anything and just leave the node alone. But
it will print confusing debug messages.
By returning Op we tell the common code that the node is legal
and shouldn't receive any further processing.
|
|
|
|
|
|
|
|
|
|
| |
f32/f64/f80 in 64-bit mode.
These need to emit a libcall like we do for the non-strict version.
32-bit mode needs to SoftenFloat support to be implemented for strict FP nodes.
Differential Revision: https://reviews.llvm.org/D70504
|
| |
|
|
|
|
| |
Avoid MVT dependency which will be needed in a future patch.
|
|
|
|
|
|
|
|
|
|
| |
Add explicit setOperation actions for some to match their none
strict counterparts. This isn't required, but makes the code
self documenting that we didn't forget about strict fp. I've
used LibCall instead of Expand since that's more explicitly what
we want.
Only lrint/llrint/lround/llround are missing now.
|
|
|
|
|
| |
The Expand code would fall back to LibCall, but this makes it
more explicit.
|
|
|
|
|
|
|
| |
libcalls. Use it for fp128 on x86-64.
This requires a minor hack for f32/f64 strict fadd/fsub to avoid
turning those into libcalls.
|
|
|
|
|
|
|
| |
X86ISelDAGToDAG
The prevents LegalizeVectorOps from scalarizing them. We'll need
to remove the X86 mutation code when we add isel patterns.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
(Split of off D67120)
SelectionDAG::shouldOptForSize changes for profile guided size optimization.
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70095
|
|
|
|
|
|
|
|
|
|
|
| |
LibCall.
The custom code just emits a libcall, but we can do the same
with generic code. The only difference is that the generic code
can form tail calls where the custom code couldn't. This is
responsible for the test changes.
This avoids needing to modify the Custom handling for strict fp.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so. Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:
1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so. This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.
With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.
2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set. This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:
- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON
Reviewers: beanz, smeenai, compnerd, phosek
Reviewed By: beanz
Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70179
|
|
|
|
|
|
| |
libcall.
Previously one of the test cases added here gave an error.
|
|
|
|
|
|
| |
libcall.
Previously one of the test cases added here gave an error.
|
|
|
|
|
|
|
|
|
| |
The Custom handler doesn't do anything for these nodes anyway.
SelectionDAGISel won't mutate them if they are Legal or Custom.
X86 has custom code for mutating them due to missing isel patterns.
When the isel patterns are added Legal will be the right answer.
So go ahead a change it now since that's where we'll end up.
|
|
|
|
|
|
|
|
|
|
| |
DoInstructionSelection when the node is marked Expand rather than when it is not Legal.
This allows operations that are marked Custom, but have some type
combinations that are legal to get past this code.
Add custom mutation code to X86's Select function for the nodes
that don't have isel patterns yet.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AL is only used for varargs on SysV platforms. Don't forward it on
Windows. This allows control flow guard to set up an extra hidden
parameter in RAX, as described in PR44049.
This also has the effect of freeing up RAX for use in virtual member
pointer thunks, which may also be a nice little code size improvement on
Win64.
Fixes PR44049
Reviewers: ajpaverd, efriedma, hans
Differential Revision: https://reviews.llvm.org/D70413
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D70220
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
STRICT_FP_TO_SINT/UINT
This is a first pass at Custom lowering for these operations. I also updated some of the vector code where it was obviously easy and straightforward. More work needed in follow up.
This enables these operations to be handled with X87 where special rounding control adjustments are needed to perform a truncate.
Still need to fix Promotion in the target independent code in LegalizeDAG.
llrint/llround split into separate test file because we can't make a strict libcall properly yet either and we need to do that when i64 isn't a legal type.
This does not include any isel support. So we still rely on the mutation in SelectionDAGIsel to remove the strict from this stuff later. Except for the X87 stuff which goes through custom nodes that already had chains.
Differential Revision: https://reviews.llvm.org/D70214
|
|
|
|
|
|
|
|
| |
AMDGPU needs to know the FP mode for the function to answer this
correctly when this is removed from the subtarget.
AArch64 had to make this more complicated by using this from an IR
hook, so add an IR typed overload.
|
|
|
|
|
|
|
|
| |
infinite loops (PR43971)
As detailed in PR43971/D70267, the use of XFormVExtractWithShuffleIntoLoad causes issues where we end up in infinite loops of extract(targetshuffle(vecload)) -> extract(shuffle(vecload)) -> extract(vecload) -> extract(targetshuffle(vecload)), there are just too many legalization checks at every stage that we can't guarantee that extract(shuffle(vecload)) -> scalarload can occur.
At the moment we see a number of minor regressions as we don't fold extract(shuffle(vecload)) -> scalarload before legal ops, these can be addressed in future patches and extension of X86ISelLowering's combineExtractWithShuffle.
|
|
|
|
| |
surprising the next person to add a case after this one. NFC
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implements scalable size queries for MVTs, split out from D53137.
* Contains a fix for FindMemType to avoid using scalable vector type
to contain non-scalable types.
* Explicit casts for several places where implicit integer sign
changes or promotion from 32 to 64 bits caused problems.
* CodeGenDAGPatterns will treat scalable and non-scalable vector types
as different.
Reviewers: greened, cameron.mcinally, sdesmalen, rovka
Reviewed By: rovka
Differential Revision: https://reviews.llvm.org/D66871
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is a bug fix for further issues in PR43585.
Reviewers: rnk, RKSimon, craig.topper, andrew.w.kaylor
Subscribers: hiraditya, llvm-commits, annita.zhang
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70224
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
|
|
|
|
|
| |
No instruction takes mxcsr as a an operand so we should always
treat it as an identifier name.
|
|
|
|
|
|
|
| |
because SSE1 is enabled.
Instead do custom promotion in the handler so that we can still
allow i16 to be used with fp80. And f64 without sse2.
|
| |
|
|
|
|
|
|
|
|
|
| |
!useSoftFloat block. Qualify all of the Promote actions for these with !useSoftFloat too. NFCI
The Promote action doesn't apply until LegalizeDAG. By the time
we get there, we would have already softened all the FP operations
if useSoftFloat was true. So there wouldn't be any operation left
to Promote.
|
|
|
|
|
|
|
|
| |
instructions
These are really just placeholders that use approximately the right resources - once we have CPUs scheduler models that support these instructions they will need revisiting.
In the meantime this means that all instructions have a class of some kind., meaning models can be more easily flagged as complete.
|
|
|
|
|
|
|
|
| |
This is no longer needed after widening legalization as we
custom legalize v8i8 ourselves.
Added entries to the cost model, but bumped the cost slightly
to account for the truncate shuffle that wasn't costed before.
|
|
|
|
|
| |
This avoids some nasty issues with argument passing and lowering of
arbitrary v64i8 shuffles.
|
|
|
|
|
|
|
|
|
|
|
| |
type won't be split by prefer-vector-width=256
Otherwise just let the v64i8/v32i16 types be split to v32i8/v16i16.
In reality this shouldn't happen because it means we have a 512-bit
vector argument, but min-legal-vector-width says a value less than
512. But a 512-bit argument should have been factored into the
preferred vector width.
|
| |
|
|
|
|
|
|
|
|
| |
MVT::i1 should be removed by type legalization before we reach
any code that would act on the promote action.
Mainly to avoid replicating this for strict FP versions of these
operations.
|
|
|
|
|
|
|
|
|
| |
operations to Expand.
If we're using soft floats, then these operations shoudl be
softened during type legalization. They'll never get to
LegalizeVectorOps or LegalizeDAG so they don't need to be
Expanded there.
|
| |
|
|
|
|
| |
Fixes PR43952
|
|
|
|
|
|
|
|
|
|
| |
We had some code for this for 32-bit ARM, but this doesn't really need
to be in target-specific code; generalize it.
(I think this started showing up recently because we added an
optimization that converts pow to powi.)
Differential Revision: https://reviews.llvm.org/D69013
|
|
|
|
|
|
|
|
|
|
| |
Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods
to return optional machine operand pair of destination and source
registers.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D69622
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
Leftovers from before we switched to widening legalization.
Fixes PR43919.
|