path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
...
* [X86] Add support for STRICT_FP_TO_UINT/SINT from fp128. | Craig Topper | 2019-11-27 | 1 | -4/+9
* [X86] Add SSEPackedSingle/Double execution domain to COMI/UCOMI SSE/AVX instructions. | Craig Topper | 2019-11-27 | 2 | -32/+34
* [x86] make SLM extract vector element more expensive than default | Sanjay Patel | 2019-11-27 | 1 | -0/+14
  I'm not sure what the effect of this change will be on all of the affected tests or a larger benchmark, but it fixes the horizontal add/sub problems noted here:
  https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc

  The costs are based on reciprocal throughput numbers in Agner's tables for PEXTR*; these appear to be very slow ops on Silvermont.

  This is a small step towards the larger motivation discussed in PR43605:
  https://bugs.llvm.org/show_bug.cgi?id=43605

  Also, it seems likely that insert/extract is the source of perf regressions on other CPUs (up to 30%) that were cited as part of the reason to revert D59710, so maybe we'll extend the table-based approach to other subtargets.

  Differential Revision: https://reviews.llvm.org/D70607
* [X86] [Win64] Avoid truncating large (> 32 bit) stack allocations | Martin Storsjö | 2019-11-27 | 1 | -1/+1
  This fixes PR44129, which was broken in a7adc3185b (in 7.0.0 and newer).

  Differential Revision: https://reviews.llvm.org/D70741
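The kind of truncation named in the title can be sketched in isolation. This is only an illustration of what happens when a 64-bit byte count passes through a 32-bit value; `truncatedSize` is a hypothetical helper and does not reproduce the code path that the patch changes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): models a 64-bit allocation size
// being squeezed through a 32-bit value, which drops the upper 32 bits.
inline std::uint32_t truncatedSize(std::uint64_t Bytes) {
  return static_cast<std::uint32_t>(Bytes);
}

// Usage sketch: a 4 GiB + 4 KiB request silently becomes a 4 KiB request.
inline void truncationExample() {
  const std::uint64_t Requested = 0x100001000ULL;
  assert(truncatedSize(Requested) == 0x1000u);
  assert(truncatedSize(Requested) != Requested);
}
```

Keeping the size in a 64-bit type end to end avoids the loss shown here.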
* [X86] Add strict fp support for operations of X87 instructions | Craig Topper | 2019-11-26 | 3 | -19/+47
  This is a follow-up patch to D68854. This patch adds basic operations of X87 instructions, including +, -, *, /, fp extensions and fp truncations.

  Patch by Chen Liu (LiuChen3)

  Differential Revision: https://reviews.llvm.org/D68857
* [Codegen][ARM] Add addressing modes from masked loads and stores | David Green | 2019-11-26 | 2 | -33/+46
  MVE has a basic symmetry between its normal load/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do.

  To enable that, this patch adds all the relevant infrastructure for treating masked load/store addressing modes in the same way as normal loads/stores. This involves:
  - Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase.
  - Extending the IndexedModeActions from 8 bits to 16 bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads.
  - The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way.
  - The ARM backend is then adjusted to make use of these indexed masked loads/stores.
  - The X86 backend is adjusted to hopefully be no functional changes.

  Differential Revision: https://reviews.llvm.org/D70176
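The widening from 8 to 16 bits described above can be pictured as packing one small legality code per operation kind into a single integer. The following is a minimal sketch under stated assumptions, not the LLVM source: the `Action` and `Kind` enums, the 4-bits-per-kind layout, and the `setAction`/`getAction` helpers are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical legality codes; each must fit in 4 bits.
enum class Action : std::uint8_t { Legal = 0, Promote = 1, Expand = 2, LibCall = 3, Custom = 4 };

// Hypothetical bit offsets: one 4-bit field per operation kind.
// Two kinds fit in 8 bits; adding the masked variants needs 16 bits.
enum class Kind : unsigned { Store = 0, Load = 4, MaskedStore = 8, MaskedLoad = 12 };

// Overwrite the 4-bit field for kind K with action A.
inline std::uint16_t setAction(std::uint16_t Packed, Kind K, Action A) {
  const unsigned Shift = static_cast<unsigned>(K);
  const unsigned Cleared = Packed & ~(0xFu << Shift);
  return static_cast<std::uint16_t>(Cleared | (static_cast<unsigned>(A) << Shift));
}

// Read back the 4-bit field for kind K.
inline Action getAction(std::uint16_t Packed, Kind K) {
  return static_cast<Action>((Packed >> static_cast<unsigned>(K)) & 0xFu);
}

// Usage sketch: masked and normal entries coexist in one 16-bit value.
inline void packedActionsExample() {
  std::uint16_t Packed = 0;
  Packed = setAction(Packed, Kind::MaskedLoad, Action::Custom);
  Packed = setAction(Packed, Kind::Load, Action::Expand);
  assert(getAction(Packed, Kind::MaskedLoad) == Action::Custom);
  assert(getAction(Packed, Kind::Load) == Action::Expand);
  assert(getAction(Packed, Kind::Store) == Action::Legal);
}
```

With this layout, doubling the element width doubles the table size but keeps every lookup a single shift and mask, which matches the message's point that the array stays small.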
* [X86][MC] no error diagnostic for out-of-range jrcxz/jecxz/jcxz | Alexey Lapshin | 2019-11-26 | 1 | -6/+21
  Fix for PR24072:

  X86 instructions jrcxz/jecxz/jcxz perform short jumps if the rcx/ecx/cx register is 0. The maximum relative offset for a forward short jump is 127 bytes (0x7F). The maximum relative offset for a backward short jump is 128 bytes (0x80). Gnu assembler warns when the distance of the jump exceeds the maximum but llvm-as does not.

  Patch by Konstantin Belochapka and Alexey Lapshin

  Differential Revision: https://reviews.llvm.org/D70652
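The range limits in the message above can be sketched as a simple check. This is a minimal illustration, not the code from the patch: `fitsInShortJump` is a hypothetical helper, and it assumes the displacement has already been computed as a signed byte distance.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): true if a signed byte displacement
// fits the 8-bit field of a short jump, i.e. at most 127 bytes forward
// (0x7F) and at most 128 bytes backward (0x80).
constexpr bool fitsInShortJump(std::int64_t Displacement) {
  return Displacement >= -128 && Displacement <= 127;
}

// Usage sketch: the boundary values from the commit message.
inline void shortJumpExamples() {
  assert(fitsInShortJump(127));   // farthest legal forward jump
  assert(!fitsInShortJump(128));  // one byte too far forward
  assert(fitsInShortJump(-128));  // farthest legal backward jump
  assert(!fitsInShortJump(-129)); // one byte too far backward
}
```

An assembler that wants to diagnose the out-of-range case would report an error or warning whenever a check like this fails for a resolved jrcxz/jecxz/jcxz target.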
* [X86] Return Op instead of SDValue() for lowering flags_read/write intrinsics | Craig Topper | 2019-11-25 | 1 | -1/+1
  Returning SDValue() means we didn't handle it and the common code should try to expand it. But it's a target intrinsic, so expanding won't do anything and will just leave the node alone. But it will print confusing debug messages.

  By returning Op we tell the common code that the node is legal and shouldn't receive any further processing.
* [X86] Add support for STRICT_FP_ROUND/STRICT_FP_EXTEND from/to fp128 to/from f32/f64/f80 in 64-bit mode. | Craig Topper | 2019-11-25 | 1 | -17/+37
  These need to emit a libcall like we do for the non-strict version. 32-bit mode needs SoftenFloat support to be implemented for strict FP nodes.

  Differential Revision: https://reviews.llvm.org/D70504
* [X86] Add proper execution domain information to the avx512vnni instructions. | Craig Topper | 2019-11-25 | 1 | -0/+2
* [X86][SSE] Split off generic isLaneCrossingShuffleMask helper. NFC. | Simon Pilgrim | 2019-11-23 | 1 | -3/+14
  Avoid MVT dependency which will be needed in a future patch.
* [X86] Add test cases for most of the constrained fp libcalls with fp128. | Craig Topper | 2019-11-21 | 1 | -4/+8
  Add explicit setOperation actions for some to match their non-strict counterparts. This isn't required, but makes the code self-documenting that we didn't forget about strict fp. I've used LibCall instead of Expand since that's more explicitly what we want.

  Only lrint/llrint/lround/llround are missing now.
* [X86] Mark fp128 FMA as LibCall instead of Expand. Add STRICT_FMA as well. | Craig Topper | 2019-11-21 | 1 | -1/+2
  The Expand code would fall back to LibCall, but this makes it more explicit.
* [LegalizeDAG][X86] Add support for turning STRICT_FADD/SUB/MUL/DIV into libcalls. Use it for fp128 on x86-64. | Craig Topper | 2019-11-21 | 1 | -5/+16
  This requires a minor hack for f32/f64 strict fadd/fsub to avoid turning those into libcalls.
* [X86] Mark vector STRICT_FADD/STRICT_FSUB as Legal and add mutation to X86ISelDAGToDAG | Craig Topper | 2019-11-21 | 2 | -3/+17
  This prevents LegalizeVectorOps from scalarizing them. We'll need to remove the X86 mutation code when we add isel patterns.
* [PGO][PGSO] DAG.shouldOptForSize part. | Hiroshi Yamauchi | 2019-11-21 | 3 | -12/+14
  Summary: (Split off of D67120)

  SelectionDAG::shouldOptForSize changes for profile guided size optimization.

  Reviewers: davidxl
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70095
* [X86] Change legalization action for f128 fadd/fsub/fmul/fdiv from Custom to LibCall. | Craig Topper | 2019-11-21 | 1 | -12/+4
  The custom code just emits a libcall, but we can do the same with generic code. The only difference is that the generic code can form tail calls where the custom code couldn't. This is responsible for the test changes.

  This avoids needing to modify the Custom handling for strict fp.
* [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" | Tom Stellard | 2019-11-21 | 5 | -5/+5
  Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraries:

  1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded.

     With this change, llvm_map_components_to_libnames(LIB_NAMES "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON.

  2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined in lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future.

  I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options:
  - "" (No options)
  - -DLLVM_BUILD_LLVM_DYLIB=ON
  - -DLLVM_LINK_LLVM_DYLIB=ON
  - -DBUILD_SHARED_LIBS=ON
  - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
  - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

  Reviewers: beanz, smeenai, compnerd, phosek
  Reviewed By: beanz
  Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70179
* [X86] Fix i16->f128 sitofp to promote the i16 to i32 before trying to form a libcall. | Craig Topper | 2019-11-20 | 1 | -8/+9
  Previously one of the test cases added here gave an error.
* [X86] Fix f128->i16 fptosi to promote the i16 to i32 before trying to form a libcall. | Craig Topper | 2019-11-20 | 1 | -15/+16
  Previously one of the test cases added here gave an error.
* [X86] Mark vector STRICT_FP_ROUND as Legal instead of Custom. | Craig Topper | 2019-11-20 | 1 | -3/+9
  The Custom handler doesn't do anything for these nodes anyway. SelectionDAGISel won't mutate them if they are Legal or Custom. X86 has custom code for mutating them due to missing isel patterns. When the isel patterns are added, Legal will be the right answer. So go ahead and change it now since that's where we'll end up.
* [SelectionDAG][X86] Mutate strictFP nodes to non-strict in DoInstructionSelection when the node is marked Expand rather than when it is not Legal. | Craig Topper | 2019-11-20 | 1 | -0/+7
  This allows operations that are marked Custom, but have some type combinations that are legal, to get past this code.

  Add custom mutation code to X86's Select function for the nodes that don't have isel patterns yet.
* [musttail] Don't forward AL on Win64 | Reid Kleckner | 2019-11-19 | 1 | -2/+2
  AL is only used for varargs on SysV platforms. Don't forward it on Windows. This allows control flow guard to set up an extra hidden parameter in RAX, as described in PR44049.

  This also has the effect of freeing up RAX for use in virtual member pointer thunks, which may also be a nice little code size improvement on Win64.

  Fixes PR44049

  Reviewers: ajpaverd, efriedma, hans
  Differential Revision: https://reviews.llvm.org/D70413
* [LegalizeDAG][X86] Enable STRICT_FP_TO_SINT/UINT to be promoted | Craig Topper | 2019-11-19 | 1 | -4/+7
  Differential Revision: https://reviews.llvm.org/D70220
* [X86] Add custom type legalization and lowering for scalar STRICT_FP_TO_SINT/UINT | Craig Topper | 2019-11-19 | 2 | -30/+119
  This is a first pass at Custom lowering for these operations. I also updated some of the vector code where it was obviously easy and straightforward. More work is needed in follow-ups.

  This enables these operations to be handled with X87 where special rounding control adjustments are needed to perform a truncate.

  Still need to fix Promotion in the target independent code in LegalizeDAG. llrint/llround are split into a separate test file because we can't make a strict libcall properly yet either, and we need to do that when i64 isn't a legal type.

  This does not include any isel support, so we still rely on the mutation in SelectionDAGISel to remove the strict from this stuff later. The exception is the X87 stuff, which goes through custom nodes that already had chains.

  Differential Revision: https://reviews.llvm.org/D70214
* DAG: Add function context to isFMAFasterThanFMulAndFAdd | Matt Arsenault | 2019-11-19 | 2 | -3/+4
  AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget.

  AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.
* [X86][SSE] Remove XFormVExtractWithShuffleIntoLoad to prevent legalization infinite loops (PR43971) | Simon Pilgrim | 2019-11-19 | 1 | -122/+2
  As detailed in PR43971/D70267, the use of XFormVExtractWithShuffleIntoLoad causes issues where we end up in infinite loops of extract(targetshuffle(vecload)) -> extract(shuffle(vecload)) -> extract(vecload) -> extract(targetshuffle(vecload)). There are just too many legalization checks at every stage, so we can't guarantee that extract(shuffle(vecload)) -> scalarload can occur.

  At the moment we see a number of minor regressions as we don't fold extract(shuffle(vecload)) -> scalarload before legal ops; these can be addressed in future patches and by extending X86ISelLowering's combineExtractWithShuffle.
* [X86] Add a 'break;' to the end of the last case in a switch to avoid surprising the next person to add a case after this one. NFC | Craig Topper | 2019-11-18 | 1 | -0/+2
* [SVE][CodeGen] Scalable vector MVT size queries | Graham Hunter | 2019-11-18 | 1 | -4/+5
  * Implements scalable size queries for MVTs, split out from D53137.
  * Contains a fix for FindMemType to avoid using scalable vector type to contain non-scalable types.
  * Explicit casts for several places where implicit integer sign changes or promotion from 32 to 64 bits caused problems.
  * CodeGenDAGPatterns will treat scalable and non-scalable vector types as different.

  Reviewers: greened, cameron.mcinally, sdesmalen, rovka
  Reviewed By: rovka
  Differential Revision: https://reviews.llvm.org/D66871
* [WinEH] Fix the wrong alignment orientation during calculating EH frame. | Wang, Pengfei | 2019-11-15 | 1 | -1/+1
  Summary: This is a bug fix for further issues in PR43585.

  Reviewers: rnk, RKSimon, craig.topper, andrew.w.kaylor
  Subscribers: hiraditya, llvm-commits, annita.zhang
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D70224
* Sink all InitializePasses.h includes | Reid Kleckner | 2019-11-13 | 3 | -0/+3
  This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation.

  I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:

    recompiles  touches  affected_files  header
    342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
    314730      234      1345            llvm/include/llvm/InitializePasses.h
    307036      118      2602            llvm/include/llvm/ADT/APInt.h
    213049      59       3611            llvm/include/llvm/Support/MathExtras.h
    170422      47       3626            llvm/include/llvm/Support/Compiler.h
    162225      45       3605            llvm/include/llvm/ADT/Optional.h
    158319      63       2513            llvm/include/llvm/ADT/Triple.h
    140322      39       3598            llvm/include/llvm/ADT/StringRef.h
    137647      59       2333            llvm/include/llvm/Support/Error.h
    131619      73       1803            llvm/include/llvm/Support/FileSystem.h

  Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.

  Reviewers: bkramer, asbirlea, bollu, jdoerfert
  Differential Revision: https://reviews.llvm.org/D70211
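The sort key in the table above is the product of the two middle columns. A small sketch makes the arithmetic explicit; the `HeaderStats` struct and `recompiles` function are hypothetical names used only for this illustration, and the numbers are copied from the table.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record for one table row (names are illustrative only).
struct HeaderStats {
  const char *Header;          // path of the header file
  std::uint64_t Touches;       // commits that changed the header
  std::uint64_t AffectedFiles; // object files that depend on it
};

// Estimated recompiles = touches * affected files, as described in the
// commit message.
constexpr std::uint64_t recompiles(const HeaderStats &S) {
  return S.Touches * S.AffectedFiles;
}

// Usage sketch: reproduce two rows of the table.
inline void recompileExamples() {
  assert(recompiles(HeaderStats{"llvm/include/llvm/ADT/STLExtras.h", 95, 3604}) == 342380);
  assert(recompiles(HeaderStats{"llvm/include/llvm/InitializePasses.h", 234, 1345}) == 314730);
}
```

As a rough projection only: if the same 234 touches had each triggered about 550 recompiles instead of 1345, the product would drop to 128700.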
* [X86] Don't treat mxcsr as a register name when parsing MS inline assembly. | Craig Topper | 2019-11-13 | 1 | -2/+3
  No instruction takes mxcsr as an operand so we should always treat it as an identifier name.
* [X86] Don't set the operation action for i16 SINT_TO_FP to Promote just because SSE1 is enabled. | Craig Topper | 2019-11-13 | 1 | -3/+9
  Instead do custom promotion in the handler so that we can still allow i16 to be used with fp80. And f64 without sse2.
* [X86] Fix typo in comment. NFC | Craig Topper | 2019-11-13 | 1 | -1/+1
* [X86] Move all the FP_TO_XINT/XINT_TO_FP setOperationActions into the same !useSoftFloat block. Qualify all of the Promote actions for these with !useSoftFloat too. NFCI | Craig Topper | 2019-11-13 | 1 | -41/+28
  The Promote action doesn't apply until LegalizeDAG. By the time we get there, we would have already softened all the FP operations if useSoftFloat was true. So there wouldn't be any operation left to Promote.
* [X86][AVX] Add plausible schedule classes to MASKPAIR/VP2INTERSECT/VDPBF16PS instructions | Simon Pilgrim | 2019-11-13 | 1 | -20/+24
  These are really just placeholders that use approximately the right resources; once we have CPU scheduler models that support these instructions they will need revisiting.

  In the meantime this means that all instructions have a class of some kind, meaning models can be more easily flagged as complete.
* [X86] Remove setOperationAction for FP_TO_SINT v8i16. | Craig Topper | 2019-11-12 | 2 | -8/+3
  This is no longer needed after widening legalization as we custom legalize v8i8 ourselves.

  Added entries to the cost model, but bumped the cost slightly to account for the truncate shuffle that wasn't costed before.
* [X86] Don't consider v64i1 as a legal type unless v64i8 is also a legal type. | Craig Topper | 2019-11-12 | 1 | -25/+47
  This avoids some nasty issues with argument passing and lowering of arbitrary v64i8 shuffles.
* [X86] Only pass v64i8/v32i16 as v16i32 on non-avx512bw targets if the v16i32 type won't be split by prefer-vector-width=256 | Craig Topper | 2019-11-12 | 1 | -4/+4
  Otherwise just let the v64i8/v32i16 types be split to v32i8/v16i16. In reality this shouldn't happen because it means we have a 512-bit vector argument, but min-legal-vector-width says a value less than 512. But a 512-bit argument should have been factored into the preferred vector width.
* [X86] Update stale comment. NFC | Craig Topper | 2019-11-11 | 1 | -2/+2
* [X86] Remove setOperationAction lines that say to promote MVT::i1 | Craig Topper | 2019-11-11 | 1 | -6/+0
  MVT::i1 should be removed by type legalization before we reach any code that would act on the promote action.

  Mainly to avoid replicating this for strict FP versions of these operations.
* [X86] Remove some else branches after checking for !useSoftFloat() that set operations to Expand. | Craig Topper | 2019-11-11 | 1 | -9/+0
  If we're using soft floats, then these operations should be softened during type legalization. They'll never get to LegalizeVectorOps or LegalizeDAG so they don't need to be Expanded there.
* Use MCRegister in copyPhysReg | Matt Arsenault | 2019-11-11 | 2 | -3/+3
* [X86] Handle MO_ConstantPoolIndex in X86AsmPrinter::PrintOperand | Craig Topper | 2019-11-09 | 1 | -0/+1
  Fixes PR43952
* [AArch64][X86] Don't assume __powidf2 is available on Windows. | Eli Friedman | 2019-11-08 | 1 | -0/+6
  We had some code for this for 32-bit ARM, but this doesn't really need to be in target-specific code; generalize it.

  (I think this started showing up recently because we added an optimization that converts pow to powi.)

  Differential Revision: https://reviews.llvm.org/D69013
* Reland: [TII] Use optional destination and source pair as a return value; NFC | Djordje Todorovic | 2019-11-08 | 2 | -13/+9
  Refactor usage of the isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return an optional machine operand pair of destination and source registers.

  Patch by Nikola Prica

  Differential Revision: https://reviews.llvm.org/D69622
* X86FrameLowering - fix bool to unsigned cast static analyzer warnings. NFCI. | Simon Pilgrim | 2019-11-07 | 1 | -7/+7
* X86CondBrFolding - remove non-existent fixBranchProb function. NFC. | Simon Pilgrim | 2019-11-07 | 1 | -2/+0
* [X86] Remove unused variable. NFC | Craig Topper | 2019-11-06 | 1 | -1/+0
* [X86] Remove dead code from combineStore. | Craig Topper | 2019-11-06 | 1 | -44/+10
  Leftovers from before we switched to widening legalization.

  Fixes PR43919.